TSHandler first-PTS anchor fix
M2TS Ingest First-PTS Anchor — Change Report
Original issue
SRT/M2TS ingest produced a startup A/V skew: RTMP subscribers saw the first video frame at wire timestamp ~0 ms, while the first audio frame arrived at ~9685 ms. The 9.685 s gap then persisted for the lifetime of the stream: silent video for ~10 s, with audio always trailing on the wire.
Trace evidence (~/Downloads/red5.log):
init pts 1421 127920 # video PID 256 set firstPTS = 127920 (1421 ms)
firstAudioPts: 9685 # audio PID 257 first ts = audioPts - firstPTS = 9685 ms
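The millisecond figure in the trace follows from the 90 kHz clock that MPEG-TS uses for PTS values (90 ticks per millisecond). A minimal sketch of that conversion; the class and method names are illustrative and are not taken from TSHandler:

```java
// Illustrative helper (hypothetical names): convert 90 kHz PTS ticks to whole milliseconds.
final class PtsMath {
    // MPEG-TS PTS/DTS values tick at 90 kHz, i.e. 90 ticks per millisecond.
    static final long TICKS_PER_MS = 90L;

    static long ticksToMillis(long ptsTicks) {
        return ptsTicks / TICKS_PER_MS; // integer division truncates the fraction
    }
}
```

With the value from the log, PtsMath.ticksToMillis(127920) yields 1421, matching the "init pts 1421 127920" line.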
Root cause
TSHandler shared a single firstPTS field, set by whichever PID's first PES reached its per-PID worker first. Each subsequent PID's first wire timestamp was computed as max(0, pts - firstPTS). Because the encoder delivered the first audio PES with a PTS roughly 9.685 s later than the first video PES (a common pattern: AAC priming, audio capture lag, separate encoder threads), the audio PID started with a large positive offset that never cleared.
TSHandler.java:1173-1180 (old initPts): first-arrival-wins, no min reconciliation.
TSHandler.java:1372, 1531: per-PID first-frame math against the racy shared firstPTS.
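The legacy arithmetic can be reproduced in isolation. This is a simplified, single-threaded sketch and not the original initPts code: the class name and the -1 sentinel are assumptions, and the audio PTS of 999570 is back-computed from the logged 9685 ms offset (127920 + 9685 × 90) rather than read from the trace.

```java
// Simplified model of first-arrival-wins anchoring (hypothetical class, not TSHandler code).
final class LegacyAnchorDemo {
    private long firstPTS = -1L; // assumed "unset" sentinel for this sketch

    // Returns the first wire timestamp in ms for a PID whose first PES carries ptsTicks.
    long firstWireTimestampMs(long ptsTicks) {
        if (firstPTS < 0) {
            firstPTS = ptsTicks; // whichever PID calls first becomes the anchor
        }
        return Math.max(0L, (ptsTicks - firstPTS) / 90L); // 90 kHz ticks to ms
    }
}
```

Calling firstWireTimestampMs(127920) for video and then firstWireTimestampMs(999570) for audio returns 0 and 9685 respectively, which is the skew described above.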
Fix applied
Replaced first-arrival-wins with a min-anchor preroll gate at the chunk-worker boundary:
1. Per-PID PTS sniff — when a chunk-worker pulls a psi=1 chunk and the PID has not yet first-PES'd, sniffPesPts reads the PTS from a ByteBuffer.duplicate() (no mutation) and
stores it in state.firstPesPts.
2. Held warmup — until anchor is set, chunks accumulate in state.warmupHeld (per-PID) instead of being dispatched. Capped at 4096 entries with a forced-anchor escape.
3. Anchor decision — tryAnchor fires when either:
- every registered PID (video ∪ audio ∪ meta) has produced a first PES, or
- the preroll window (firstPts.preroll.ms, default 250 ms) has elapsed since the first PID first-PES'd.
It then publishes firstPTS = min(firstPesPts across PIDs).
4. Drain — on the next worker iteration after anchor, each PID drains its warmupHeld through the existing processChunk path. The unmodified per-PID first-frame branch
(state.ptsTime = max(0, pts - firstPTS)) now sees the minimum firstPTS, so each PID's first wire timestamp is the real source-side offset relative to the earliest-PES'd PID —
typically a few tens of ms, not seconds.
5. Idle-path watchdog — when a worker's queue is empty, it still calls tryAnchor so a PID that briefly stops producing PES does not strand its peers.
6. Fallback — initPts is retained for the case where the warmup sniff fails to parse a malformed PES header; it logs a WARN if it ever fires.
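Step 1 relies on reading the PTS without disturbing the chunk. The sketch below shows one way such a sniff could look, assuming the buffer's position is at the PES start code and using the standard PES header layout (PTS_DTS_flags in byte 7, a 33-bit PTS packed into bytes 9 to 13). The real sniffPesPts signature and its error handling are not shown in this report, so the names and the Long.MAX_VALUE "no PTS" return value here are assumptions.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a non-mutating PES PTS sniff; not the TSHandler implementation.
final class PesPtsSniffer {
    static final long NO_PTS = Long.MAX_VALUE; // assumed sentinel for "could not parse"

    static long sniffPesPts(ByteBuffer payload) {
        // duplicate() shares content but has its own position/limit, so the caller's
        // buffer state is never touched.
        ByteBuffer b = payload.duplicate();
        if (b.remaining() < 14) {
            return NO_PTS; // too short to hold a header with a PTS
        }
        int p = b.position();
        // packet_start_code_prefix must be 0x000001
        if (b.get(p) != 0 || b.get(p + 1) != 0 || b.get(p + 2) != 1) {
            return NO_PTS;
        }
        // PTS_DTS_flags are the top two bits of byte 7; bit value 0x2 means a PTS is present
        int ptsDtsFlags = (b.get(p + 7) & 0xC0) >> 6;
        if ((ptsDtsFlags & 0x2) == 0) {
            return NO_PTS;
        }
        long b0 = b.get(p + 9) & 0xFF;
        long b1 = b.get(p + 10) & 0xFF;
        long b2 = b.get(p + 11) & 0xFF;
        long b3 = b.get(p + 12) & 0xFF;
        long b4 = b.get(p + 13) & 0xFF;
        // 33-bit PTS: 3 bits, 15 bits, 15 bits, each group followed by a marker bit
        return (((b0 >> 1) & 0x07) << 30)
                | (b1 << 22)
                | (((b2 >> 1) & 0x7F) << 15)
                | (b3 << 7)
                | ((b4 >> 1) & 0x7F);
    }
}
```

Returning a sentinel instead of throwing keeps the warmup path cheap; a failed parse would then fall through to the initPts fallback described in step 6.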
Concurrency model
All lock-based synchronization was removed. The fix is fully lock-free:
┌──────────────────────┬──────────────────────────────────────────────────────┬────────────────────────────────────────────────────┐
│ State │ Mechanism │ Set-once? │
├──────────────────────┼──────────────────────────────────────────────────────┼────────────────────────────────────────────────────┤
│ firstPTS │ AtomicLong, anchor via compareAndSet(MAX_VALUE, min) │ yes │
├──────────────────────┼──────────────────────────────────────────────────────┼────────────────────────────────────────────────────┤
│ state.firstPesPts │ AtomicLong, sniff via compareAndSet(MAX_VALUE, pts) │ yes │
├──────────────────────┼──────────────────────────────────────────────────────┼────────────────────────────────────────────────────┤
│ firstPesArrivalNanos │ AtomicLong, compareAndSet(-1L, now) │ yes (except deliberate back-date on overflow path) │
├──────────────────────┼──────────────────────────────────────────────────────┼────────────────────────────────────────────────────┤
│ state.warmupHeld │ ConcurrentLinkedDeque │ n/a │
└──────────────────────┴──────────────────────────────────────────────────────┴────────────────────────────────────────────────────┘
tryAnchor is idempotent and re-entrant — multiple workers can race; only the winning CAS publishes. Snapshot consistency holds because every read is from a set-once field.
Steady-state hot path is one volatile load of firstPTS per chunk and a fast-path call to processChunk: no CAS, no explicit fences, no contention.
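The set-once CAS pattern from the table can be sketched as follows. This is an illustration under stated assumptions, not the TSHandler code: the class name, method signatures, and the explicit nowNanos parameter (used so the deadline is testable) are hypothetical, while the sentinels (Long.MAX_VALUE and -1L) follow the table.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a lock-free min-anchor decision.
final class MinAnchor {
    static final long UNSET = Long.MAX_VALUE;

    final AtomicLong firstPTS = new AtomicLong(UNSET);
    final AtomicLong firstPesArrivalNanos = new AtomicLong(-1L);
    private final long prerollNanos;

    MinAnchor(long prerollMs) {
        this.prerollNanos = prerollMs * 1_000_000L;
    }

    // Records a PID's first PES PTS once; later calls for the same PID are no-ops.
    void onFirstPes(AtomicLong pidFirstPesPts, long pts, long nowNanos) {
        if (pidFirstPesPts.compareAndSet(UNSET, pts)) {
            firstPesArrivalNanos.compareAndSet(-1L, nowNanos); // only the first PID wins
        }
    }

    // Safe to call from any worker at any time; returns true once an anchor is published.
    boolean tryAnchor(List<AtomicLong> perPidFirstPesPts, long nowNanos) {
        if (firstPTS.get() != UNSET) {
            return true; // already anchored
        }
        long min = UNSET;
        boolean allSeen = true;
        for (AtomicLong pidPts : perPidFirstPesPts) {
            long v = pidPts.get();
            if (v == UNSET) {
                allSeen = false;
            } else {
                min = Math.min(min, v);
            }
        }
        long started = firstPesArrivalNanos.get();
        boolean deadlinePassed = started != -1L && nowNanos - started >= prerollNanos;
        if (min == UNSET || !(allSeen || deadlinePassed)) {
            return false; // nothing to anchor on yet, or still inside the preroll window
        }
        firstPTS.compareAndSet(UNSET, min); // losers of the race leave the winner's value in place
        return true;
    }
}
```

Because every field moves away from its sentinel at most once in this sketch, a worker that loses the final compareAndSet simply observes the anchor the winner published.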
Expected result
- Targeted symptom (~9685 ms audio offset): eliminated. With firstPTS = min(firstPesVideo, firstPesAudio), audio's first wire timestamp falls to ~0 ms (or whatever true
source-side offset exists, usually <100 ms).
- General SRT/M2TS ingest: any encoder-side first-PES skew between video, audio, and KLV/SCTE/ID3 metadata PIDs collapses to its true source-side delta instead of being
inflated by the worker arrival race.
- Steady-state behavior: unchanged. After anchor, the fix is a no-op gate. PTS-delta accumulation, rollover handling, discontinuity, CC reset, and PCR usage are untouched.
- Worst-case audio-only / video-only ingest: anchor fires on the preroll deadline (~250 ms after the first PES), no chunks lost.
- Startup latency: ≤ ~250 ms additional first-frame latency in the pathological case where one PID is slow to deliver. Most streams anchor immediately when both PIDs first-PES
within the same caching window (~40–100 ms), making the added latency negligible.
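The "no chunks lost" and "no-op gate" claims depend on how held chunks are released. The sketch below models a per-PID gate that holds chunks until the shared anchor is set, then drains them in arrival order. It is a simplification: the class name, the offer method, and the way the cap forces an anchor (a direct CAS with a caller-supplied PTS) are assumptions, and the report's own overflow path back-dates firstPesArrivalNanos instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-PID warmup gate; C is whatever chunk type the worker handles.
final class WarmupGate<C> {
    static final long UNSET = Long.MAX_VALUE;

    private final AtomicLong firstPTS; // shared across PIDs
    private final int maxHeld;         // 4096 in the report
    private final ConcurrentLinkedDeque<C> warmupHeld = new ConcurrentLinkedDeque<>();
    private int heldCount;             // touched only by this PID's worker

    WarmupGate(AtomicLong firstPTS, int maxHeld) {
        this.firstPTS = firstPTS;
        this.maxHeld = maxHeld;
    }

    // Returns the chunks that are ready to be processed, oldest first.
    List<C> offer(C chunk, long forcedAnchorPts) {
        boolean anchored = firstPTS.get() != UNSET;
        if (!anchored) {
            warmupHeld.addLast(chunk);
            if (++heldCount < maxHeld) {
                return List.of(); // still warming up: hold everything
            }
            // Cap reached: force an anchor so the stream cannot stall (assumed mechanism).
            firstPTS.compareAndSet(UNSET, forcedAnchorPts);
        }
        List<C> ready = new ArrayList<>();
        C held;
        while ((held = warmupHeld.pollFirst()) != null) {
            ready.add(held); // drain held chunks in arrival order
        }
        heldCount = 0;
        if (anchored) {
            ready.add(chunk); // steady state: the chunk passes straight through
        }
        return ready;
    }
}
```

Once the deque is empty and the anchor is set, each call costs one volatile read plus an empty poll, which is the sense in which the gate becomes a no-op in this sketch.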
Configuration & rollback
- Tunable via plugin property firstPts.preroll.ms in the SRT/TSIngest plugin properties.
- Setting firstPts.preroll.ms=0 reverts behavior to legacy first-arrival-wins (anchor fires on the first PES from any PID; the min-collapse becomes a no-op).
- No schema changes, no API changes, no protocol changes — the change is local to TSHandler.
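Reading the tunable could look like the sketch below. Only the key firstPts.preroll.ms and the 250 ms default come from this report; the class name, the use of java.util.Properties, and the handling of malformed or negative values are assumptions about how a plugin might load it.

```java
import java.util.Properties;

// Hypothetical loader for the preroll window; not the plugin's actual configuration code.
final class PrerollConfig {
    static final String KEY = "firstPts.preroll.ms";
    static final long DEFAULT_MS = 250L;

    static long prerollMs(Properties props) {
        String raw = props.getProperty(KEY);
        if (raw == null) {
            return DEFAULT_MS; // property absent: use the documented default
        }
        try {
            // Assumption: negative values are clamped to 0 (legacy behavior).
            return Math.max(0L, Long.parseLong(raw.trim()));
        } catch (NumberFormatException e) {
            return DEFAULT_MS; // assumption: malformed values fall back to the default
        }
    }
}
```

A value of 0 makes the preroll deadline expire as soon as the first PES is seen, which corresponds to the legacy first-arrival-wins rollback described above.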
Recommended verification
1. Replay an SRT capture that produced the original 9685 ms skew. Confirm that the Anchored firstPTS=... log line appears and that the first dispatched audio timestamp is now near zero.
2. Smoke test on a "normal" SRT source to confirm steady-state behavior is unchanged (no regressions in delta accumulation or rollover).
3. Edge case: video-only and audio-only streams should publish without stalling — anchor fires on preroll deadline.
4. Edge case: a stream that drops audio mid-warmup (slow producer) — confirm idle-path tryAnchor fires the deadline and video proceeds.
5. Watch for any firstPTS fallback set ... WARN line. It indicates that the sniff missed a PES; investigate the source's PES framing.