Created
May 1, 2026 23:23
TSHandler first-PTS anchor fix
M2TS Ingest First-PTS Anchor: Change Report
Original issue
SRT/M2TS ingest produced a startup A/V skew: RTMP subscribers saw the first video frame at wire timestamp ~0 ms, while the first audio frame arrived at ~9685 ms. The 9.685 s gap then persisted for the lifetime of the stream: silent video for ~10 s, with audio always trailing on the wire.
Trace evidence (~/Downloads/red5.log):
  init pts 1421 127920    # video PID 256 set firstPTS = 127920 (1421 ms)
  firstAudioPts: 9685     # audio PID 257 first ts = audioPts - firstPTS = 9685 ms
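The trace mixes two units: 127920 is a raw PTS in 90 kHz ticks (the MPEG-TS PTS clock rate), while 1421 and 9685 are milliseconds. The sketch below shows the conversion. The class and method names are hypothetical, and the audio PTS is back-computed from the logged offset as an assumption; it does not appear in the trace itself.

```java
// Hypothetical helper, not part of TSHandler: converts between 90 kHz PTS ticks and ms.
final class PtsUnits {
    static final long TICKS_PER_MS = 90; // MPEG-TS PTS runs on a 90 kHz clock

    static long ticksToMs(long ticks) {
        return ticks / TICKS_PER_MS; // integer division: 127920 / 90 truncates to 1421
    }

    static long msToTicks(long ms) {
        return ms * TICKS_PER_MS;
    }

    public static void main(String[] args) {
        long videoFirstPts = 127_920L; // from the trace
        long audioOffsetMs = 9_685L;   // from the trace
        // Assumption: the audio PTS implied by the logged offset (not itself logged).
        long impliedAudioPts = videoFirstPts + msToTicks(audioOffsetMs);
        System.out.println(ticksToMs(videoFirstPts)); // 1421
        System.out.println(impliedAudioPts);          // 999570
    }
}
```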
Root cause
TSHandler shared a single firstPTS field, set by whichever PID's first PES reached its per-PID worker first. Each subsequent PID's first wire timestamp was computed as max(0, pts - firstPTS). Because the encoder delivered the first audio PES with a PTS roughly 9.685 s later than the first video PES (a common pattern: AAC priming, audio capture lag, separate encoder threads), the audio PID started with a large positive offset that never cleared.
TSHandler.java:1173-1180 (old initPts): first-arrival-wins, no min reconciliation.
TSHandler.java:1372, 1531: per-PID first-frame math against the racy shared firstPTS.
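To make the order dependence concrete, here is a minimal single-threaded model of first-arrival-wins. It is not the TSHandler source: the class name, the plain long field, and the PTS values are all hypothetical, and calling the method in two different orders stands in for the worker arrival race.

```java
// Hypothetical model of the legacy behavior; names and values are illustrative only.
final class LegacyFirstArrivalWins {
    private static final long UNSET = Long.MAX_VALUE;
    private long firstPTS = UNSET; // shared anchor: whoever calls first sets it

    // First caller wins, regardless of whether its PTS is the smallest.
    long firstWireTimestampMs(long pts) {
        if (firstPTS == UNSET) {
            firstPTS = pts;
        }
        return Math.max(0, pts - firstPTS) / 90; // 90 kHz ticks -> ms
    }

    public static void main(String[] args) {
        long earlyPts = 90_000L; // PID A, hypothetical
        long latePts = 93_600L;  // PID B, 40 ms later at the source

        LegacyFirstArrivalWins aFirst = new LegacyFirstArrivalWins();
        System.out.println(aFirst.firstWireTimestampMs(earlyPts)); // 0
        System.out.println(aFirst.firstWireTimestampMs(latePts));  // 40

        LegacyFirstArrivalWins bFirst = new LegacyFirstArrivalWins();
        System.out.println(bFirst.firstWireTimestampMs(latePts));  // 0
        System.out.println(bFirst.firstWireTimestampMs(earlyPts)); // 0 (clamped; the 40 ms offset is lost)
    }
}
```

The same two inputs yield different first timestamps depending only on which PID is processed first, which is the property the min-anchor removes.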
Fix applied
Replaced first-arrival-wins with a min-anchor preroll gate at the chunk-worker boundary:
1. Per-PID PTS sniff: when a chunk-worker pulls a psi=1 chunk for a PID that has not yet seen its first PES, sniffPesPts reads the PTS from a ByteBuffer.duplicate() (so the original buffer's position is not mutated) and stores it in state.firstPesPts.
2. Held warmup: until the anchor is set, chunks accumulate in state.warmupHeld (per-PID) instead of being dispatched. The held queue is capped at 4096 entries, with a forced-anchor escape.
3. Anchor decision: tryAnchor fires when either:
   - every registered PID (video ∪ audio ∪ meta) has produced a first PES, or
   - the preroll window (firstPts.preroll.ms, default 250 ms) has elapsed since the first PID produced its first PES.
   It then publishes firstPTS = min(firstPesPts across PIDs).
4. Drain: on the next worker iteration after the anchor is set, each PID drains its warmupHeld through the existing processChunk path. The unmodified per-PID first-frame branch (state.ptsTime = max(0, pts - firstPTS)) now sees the minimum firstPTS, so each PID's first wire timestamp is the real source-side offset relative to the PID with the earliest first PES: typically a few tens of ms, not seconds.
5. Idle-path watchdog: when a worker's queue is empty, it still calls tryAnchor, so a PID that briefly stops producing PES does not strand its peers.
6. Fallback: initPts is retained for the case where the warmup sniff fails to parse a malformed PES header; it logs a WARN if it ever fires.
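The anchor decision in steps 1, 3, and 4 can be sketched roughly as follows. This is a simplified model under stated assumptions, not the TSHandler source: MinAnchorGate, PidState, and onFirstPes are hypothetical names, time is passed in explicitly so the example is deterministic, and the warmup hold, drain, and overflow escape are omitted.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of the min-anchor preroll gate.
final class MinAnchorGate {
    static final long UNSET = Long.MAX_VALUE;

    static final class PidState {
        final AtomicLong firstPesPts = new AtomicLong(UNSET);
    }

    private final List<PidState> pids;
    private final long prerollNanos;
    private final AtomicLong firstPTS = new AtomicLong(UNSET);
    private final AtomicLong firstPesArrivalNanos = new AtomicLong(-1L);

    MinAnchorGate(List<PidState> pids, long prerollMs) {
        this.pids = pids;
        this.prerollNanos = prerollMs * 1_000_000L;
    }

    // Step 1: record a PID's first PES PTS (set-once) and start the preroll clock.
    void onFirstPes(PidState state, long pts, long nowNanos) {
        state.firstPesPts.compareAndSet(UNSET, pts);
        firstPesArrivalNanos.compareAndSet(-1L, nowNanos);
    }

    // Step 3: anchor when every PID has reported or the preroll window has elapsed.
    boolean tryAnchor(long nowNanos) {
        if (firstPTS.get() != UNSET) return true; // already anchored
        long started = firstPesArrivalNanos.get();
        if (started == -1L) return false;         // no PES seen on any PID yet
        long min = UNSET;
        boolean allSeen = true;
        for (PidState s : pids) {
            long p = s.firstPesPts.get();
            if (p == UNSET) allSeen = false; else min = Math.min(min, p);
        }
        boolean deadline = nowNanos - started >= prerollNanos;
        if (!allSeen && !deadline) return false;
        firstPTS.compareAndSet(UNSET, min);       // only one racing worker publishes
        return true;
    }

    // Step 4: the unchanged per-PID first-frame math, now against the min anchor.
    long firstWireTimestampMs(long pts) {
        return Math.max(0, pts - firstPTS.get()) / 90; // 90 kHz ticks -> ms
    }

    public static void main(String[] args) {
        PidState video = new PidState(), audio = new PidState();
        MinAnchorGate gate = new MinAnchorGate(List.of(video, audio), 250);
        gate.onFirstPes(audio, 93_600L, 0L);                    // later-PTS PID arrives first
        System.out.println(gate.tryAnchor(1_000_000L));         // false: video unseen, 1 ms < 250 ms
        gate.onFirstPes(video, 90_000L, 2_000_000L);
        System.out.println(gate.tryAnchor(2_000_000L));         // true: every PID has a first PES
        System.out.println(gate.firstWireTimestampMs(90_000L)); // 0
        System.out.println(gate.firstWireTimestampMs(93_600L)); // 40
    }
}
```

With a preroll of 0 the deadline condition in this model is true as soon as any PID reports, so the anchor degenerates to the first PES seen.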
Concurrency model
All lock-based synchronization has been removed. The fix is fully lock-free:
┌──────────────────────┬──────────────────────────────────────────────────────┬────────────────────────────────────────────────────┐
│ State                │ Mechanism                                            │ Set-once?                                          │
├──────────────────────┼──────────────────────────────────────────────────────┼────────────────────────────────────────────────────┤
│ firstPTS             │ AtomicLong, anchor via compareAndSet(MAX_VALUE, min) │ yes                                                │
├──────────────────────┼──────────────────────────────────────────────────────┼────────────────────────────────────────────────────┤
│ state.firstPesPts    │ AtomicLong, sniff via compareAndSet(MAX_VALUE, pts)  │ yes                                                │
├──────────────────────┼──────────────────────────────────────────────────────┼────────────────────────────────────────────────────┤
│ firstPesArrivalNanos │ AtomicLong, compareAndSet(-1L, now)                  │ yes (except deliberate back-date on overflow path) │
├──────────────────────┼──────────────────────────────────────────────────────┼────────────────────────────────────────────────────┤
│ state.warmupHeld     │ ConcurrentLinkedDeque                                │ n/a                                                │
└──────────────────────┴──────────────────────────────────────────────────────┴────────────────────────────────────────────────────┘
tryAnchor is idempotent and re-entrant: multiple workers can race, but only the winning CAS publishes. Snapshot consistency holds because every read is from a set-once field.
The steady-state hot path is one volatile load of firstPTS per chunk and a fast-path call to processChunk: no CAS, no fences, no contention.
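The set-once pattern from the table can be demonstrated in isolation. In this hypothetical demo (the class and method names are not from TSHandler), several threads race the same compareAndSet from the MAX_VALUE sentinel, and exactly one of them succeeds.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo of the set-once CAS pattern; not the TSHandler source.
final class SetOnceAnchorDemo {
    static final long UNSET = Long.MAX_VALUE;

    // Races `workers` threads to publish `candidate`; returns how many CAS calls won.
    static int raceToPublish(AtomicLong firstPTS, long candidate, int workers) {
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // line all workers up before racing
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (firstPTS.compareAndSet(UNSET, candidate)) {
                    winners.incrementAndGet();
                }
            });
            threads[i].start();
        }
        start.countDown();
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return winners.get();
    }

    public static void main(String[] args) {
        AtomicLong firstPTS = new AtomicLong(UNSET);
        System.out.println(raceToPublish(firstPTS, 90_000L, 8)); // 1
        System.out.println(firstPTS.get());                      // 90000
        // Hot path after anchoring: a single volatile read, no CAS.
        System.out.println(firstPTS.get() != UNSET);             // true
    }
}
```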
Expected result
- Targeted symptom (~9685 ms audio offset): eliminated. With firstPTS = min(firstPesVideo, firstPesAudio), audio's first wire timestamp falls to ~0 ms (or whatever true source-side offset exists, usually <100 ms).
- General SRT/M2TS ingest: any encoder-side first-PES skew between video, audio, and KLV/SCTE/ID3 metadata PIDs collapses to its true source-side delta instead of being inflated by the worker arrival race.
- Steady-state behavior: unchanged. After the anchor is set, the gate is a no-op. PTS-delta accumulation, rollover handling, discontinuity handling, CC reset, and PCR usage are untouched.
- Worst-case audio-only / video-only ingest: the anchor fires on the preroll deadline (~250 ms after the first PES), and no chunks are lost.
- Startup latency: up to ~250 ms of additional first-frame latency in the pathological case where one PID is slow to deliver. Most streams anchor immediately because both PIDs deliver their first PES within the same caching window (~40–100 ms), making the added latency negligible.
Configuration & rollback
- Tunable via the firstPts.preroll.ms property in the SRT/TSIngest plugin properties.
- Setting firstPts.preroll.ms=0 reverts to the legacy first-arrival-wins behavior (the anchor fires on the first PES from any PID, so the min-collapse becomes a no-op).
- No schema changes, no API changes, no protocol changes: the change is local to TSHandler.
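As an illustration of how such a property might be read, the sketch below parses firstPts.preroll.ms with java.util.Properties and falls back to 250 ms. Only the property name and the default come from this report; the PrerollConfig class, the clamping of negative values to 0, and the fallback on malformed input are assumptions, and the plugin's actual loading code is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical loader; only the key name and the 250 ms default come from the report.
final class PrerollConfig {
    static final String KEY = "firstPts.preroll.ms";
    static final long DEFAULT_MS = 250L;

    static long prerollMs(Properties props) {
        String raw = props.getProperty(KEY);
        if (raw == null) return DEFAULT_MS;
        try {
            return Math.max(0L, Long.parseLong(raw.trim())); // assumption: negatives clamp to 0
        } catch (NumberFormatException e) {
            return DEFAULT_MS; // assumption: malformed values fall back to the default
        }
    }

    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory reader
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(prerollMs(parse("")));                        // 250
        System.out.println(prerollMs(parse("firstPts.preroll.ms=0")));   // 0 (legacy behavior)
        System.out.println(prerollMs(parse("firstPts.preroll.ms=500"))); // 500
    }
}
```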
Recommended verification
1. Replay an SRT capture that produced the original 9685 ms skew. Confirm that the Anchored firstPTS=... log line appears and that the first dispatched audio timestamp is now near zero.
2. Smoke-test a "normal" SRT source to confirm steady-state behavior is unchanged (no regressions in delta accumulation or rollover).
3. Edge case: video-only and audio-only streams should publish without stalling; the anchor fires on the preroll deadline.
4. Edge case: a stream that drops audio mid-warmup (slow producer). Confirm that the idle-path tryAnchor fires at the deadline and video proceeds.
5. Watch for any firstPTS fallback set ... WARN. It indicates the sniff missed a PES; investigate the source's PES framing.
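When investigating PES framing for step 5, it can help to decode a suspect header by hand. The sketch below reads the 33-bit PTS from the standard MPEG-2 PES header layout on a duplicated buffer. It is a hypothetical stand-in for sniffPesPts, whose real signature and error handling are not shown in this report; the NO_PTS sentinel and the pesHeaderWithPts test helper are likewise illustrative.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for sniffPesPts; assumes the buffer is positioned at a PES start code.
final class PesPtsSniffer {
    static final long NO_PTS = -1L;

    // Returns the 33-bit PTS, or NO_PTS if the header is too short, malformed, or carries no PTS.
    static long sniffPesPts(ByteBuffer chunk) {
        ByteBuffer b = chunk.duplicate(); // independent position; the caller's buffer is untouched
        if (b.remaining() < 14) return NO_PTS;
        int base = b.position();
        if (b.get(base) != 0 || b.get(base + 1) != 0 || b.get(base + 2) != 1) return NO_PTS; // start code prefix
        if ((b.get(base + 6) & 0xC0) != 0x80) return NO_PTS; // '10' marker of the optional PES header
        if ((b.get(base + 7) & 0x80) == 0) return NO_PTS;    // PTS_DTS_flags: no PTS present
        long p0 = b.get(base + 9) & 0xFF;
        long p1 = b.get(base + 10) & 0xFF;
        long p2 = b.get(base + 11) & 0xFF;
        long p3 = b.get(base + 12) & 0xFF;
        long p4 = b.get(base + 13) & 0xFF;
        // PTS[32..30], PTS[29..15], PTS[14..0], each group followed by a marker bit.
        return ((p0 & 0x0E) << 29) | (p1 << 22) | ((p2 & 0xFE) << 14) | (p3 << 7) | ((p4 & 0xFE) >> 1);
    }

    // Test helper: builds a minimal 14-byte video PES header carrying only a PTS.
    static byte[] pesHeaderWithPts(long pts) {
        return new byte[] {
            0, 0, 1, (byte) 0xE0,        // start code prefix, stream_id 0xE0 (video)
            0, 0,                        // PES_packet_length (0 is allowed for video in TS)
            (byte) 0x80, (byte) 0x80, 5, // flags: PTS only; PES_header_data_length = 5
            (byte) (0x21 | ((pts >> 29) & 0x0E)),
            (byte) (pts >> 22),
            (byte) (0x01 | ((pts >> 14) & 0xFE)),
            (byte) (pts >> 7),
            (byte) (0x01 | ((pts << 1) & 0xFE))
        };
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(pesHeaderWithPts(127_920L));
        System.out.println(sniffPesPts(buf));                        // 127920
        System.out.println(buf.position());                          // 0
        System.out.println(sniffPesPts(ByteBuffer.allocate(14)));    // -1 (no start code)
    }
}
```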