The patched maestro-p (resubmit loop + getScreenTail dump) IS now running. Confirmed: today's first_byte_timeout errors in cue.db carry the new `last screen at timeout (ANSI-stripped tail)` block. So the fix is live AND we finally have ground truth. It changes the root cause.
All three of today's Pedsidian agent cues still failed at 121s (chain-1 07:00, chain-9 11:00, chain-8 17:00), BUT the screen tails show claude is alive and working the whole time:
- 11:00 + 17:00 tails are pure spinner animation: `Pouncing…` / `Recombobulating…` with a rising counter (`1m 0s`, `50`, `1`, `2`, `3`...). That is claude's working spinner. The turn STARTED.
- 07:00 tail shows a RESUMED session re-rendering a large prior conversation (`What are we working on?`, `Mulling… (22s · ↓ 1.6k tokens)`, reminder text from an earlier session).
So the resubmit fix worked: the prompt is now being submitted and claude begins the turn. The failure is no longer "prompt swallowed / never started." It is now: claude is healthily thinking past 120s, but maestro-p declares first_byte_timeout anyway.
markFirstEntrySeen() is called ONLY from handleEntry ← the JsonlTailer "entry" event (lines ~4685, 4786-4787, 4833, 4877). So "first byte" = "a JSONL transcript entry was written." But claude can be demonstrably alive for >120s (spinner animating, token counter rising) BEFORE it writes its first transcript entry, especially on these heavy morning prompts (full news analysis + web search + extended thinking) and especially on the --resume path where it first reloads a big prior conversation. maestro-p kills a working session because it is measuring the wrong signal.
- Treat TUI liveness as first-byte, not just JSONL entries. The spinner frames (`Pouncing…` / `Recombobulating…` / `Mulling…`) and the rising token counter in `rollingBuffer` are proof of life. If the screen is changing / the spinner is advancing, the turn started, so clear `firstByteTimer`. Only time out on TRUE stall (no screen change AND no JSONL entry for N seconds). This is the real fix.
- The `--resume` path is a latency trap. It reloads a large prior conversation before the new turn produces output, eating the budget. Consider a fresh session per scheduled cue, or exclude resume-reload time from the first-byte budget.
- Interim, low-risk: raise the first-byte budget for these agents well past 120s (e.g. `--first-byte-timeout 300`). The morning workload legitimately needs minutes to first transcript entry. This alone would likely make today's runs pass.
- Possible harm from resubmit: `RESUBMIT_INTERVAL_MS = 8000` keeps pressing Enter every 8s. Once claude is already thinking, those Enters may queue empty submissions / interrupts. Gate the resubmit loop to stop as soon as ANY TUI activity (spinner motion), not only a JSONL entry, is detected.
- Every failed spawn logs `SecCodeCheckValidity ... Code=-67034` (codesign). `codesign --verify` on the app now reports modified/added files in `app.asar.unpacked`, so the bundle signature is broken after the manual maestro-p/asar swap. Likely benign noise, but a clean re-sign would remove it and rule it out.
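
Rough sketch of the liveness-based timeout and the resubmit gate proposed above. This is NOT maestro-p's code: `LivenessTracker`, `stripAnsi`, `onScreenTail`, `onJsonlEntry`, `isStalled` and `shouldResubmit` are hypothetical names, and it assumes the screen tail can be polled as a string (e.g. from whatever backs getScreenTail). The idea: any change in the ANSI-stripped tail, or any JSONL entry, resets the stall clock, and the Enter-resubmit loop stops once either signal has fired.

```typescript
// Hypothetical sketch: none of these names exist in maestro-p.

// Matches CSI escape sequences (colours, cursor moves, line clears).
const ANSI_PATTERN = /\x1b\[[0-9;?]*[ -\/]*[@-~]/g;

/** Strip CSI sequences so pure styling/cursor traffic is not counted as a screen change. */
function stripAnsi(text: string): string {
  return text.replace(ANSI_PATTERN, "");
}

class LivenessTracker {
  private readonly stallMs: number;
  private lastTail: string | null = null; // null until the baseline tail is recorded
  private lastActivityAt: number;
  private activitySeen = false;

  constructor(stallMs: number, startedAt: number) {
    this.stallMs = stallMs;
    this.lastActivityAt = startedAt;
  }

  /**
   * Feed the latest screen tail. The first call only records a baseline
   * (the idle prompt), so the initial paint is not mistaken for a started turn.
   * Later calls count as activity only if the visible text changed.
   */
  onScreenTail(rawTail: string, now: number): void {
    const tail = stripAnsi(rawTail);
    if (this.lastTail === null) {
      this.lastTail = tail;
      return;
    }
    if (tail !== this.lastTail) {
      this.lastTail = tail;
      this.markActivity(now);
    }
  }

  /** A JSONL transcript entry always counts as activity. */
  onJsonlEntry(now: number): void {
    this.markActivity(now);
  }

  /** True only when neither signal has fired for stallMs (the "TRUE stall" case). */
  isStalled(now: number): boolean {
    return now - this.lastActivityAt >= this.stallMs;
  }

  /** Keep pressing Enter only while the turn has shown no sign of life. */
  shouldResubmit(): boolean {
    return !this.activitySeen;
  }

  private markActivity(now: number): void {
    this.activitySeen = true;
    this.lastActivityAt = now;
  }
}
```

Wiring, under the same assumptions: poll the tail on an interval and call `onScreenTail(tail, Date.now())`, call `onJsonlEntry(Date.now())` from the JSONL entry handler, fire the timeout only when `isStalled(Date.now())` is true, and skip the 8s Enter when `shouldResubmit()` is false. Caveat: any text change counts as life here, so a stricter version might require a spinner-shaped line before trusting it.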