Long-running RTSP consumers progressively lose audio on G.711/H.265 relay; camera-direct consumers unaffected #2303

@teajaysee

Summary

On budget H.265 + G.711 (PCMA/8000) cameras relayed by go2rtc over RTSP, a long-running RTSP consumer of the relayed stream progressively loses its audio: AAC/PCMA packets per 10 s of output slide from ~78 down to ~1 over several minutes and then stay at ~1 (effectively silent) for hours, while video is unaffected. A freshly-connected consumer on the same stream is fine; a consumer reading the camera directly (bypassing go2rtc) is also unaffected.
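For reference, the healthy ~78 figure is consistent with AAC framing at the source sample rate. This is a back-of-envelope check, assuming the recorder's AAC encoder keeps the 8 kHz input rate and emits the usual 1024-sample AAC-LC frames (one frame per packet); neither is confirmed above.

```python
# Assumptions (not measured in this report): AAC-LC with 1024 samples per frame,
# encoder output rate left at the camera's 8 kHz, one frame per packet.
SAMPLES_PER_AAC_FRAME = 1024
SAMPLE_RATE_HZ = 8000
SEGMENT_SECONDS = 10


def expected_audio_packets(sample_rate_hz=SAMPLE_RATE_HZ,
                           segment_seconds=SEGMENT_SECONDS,
                           samples_per_frame=SAMPLES_PER_AAC_FRAME):
    """Audio packets expected in one segment when audio is flowing normally."""
    return sample_rate_hz * segment_seconds / samples_per_frame


print(expected_audio_packets())  # 78.125
```

Under those assumptions a healthy 10 s segment carries about 78 audio packets, so a segment with ~1 packet holds roughly 0.13 s of audio.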

I haven't proven the internal cause yet (instrumented measurement in progress — see below), so I've kept this report to what's observable and reproducible and put the mechanism as a hypothesis. Happy to gather whatever would help.

Environment

  • go2rtc 1.9.10 (as bundled in Frigate 0.17.1), linux/amd64, Docker on a WSL2 host.
  • ~12 identical budget cams: 4K H.265 video + G.711 A-law (PCMA, 8 kHz mono) audio, rtsp://user:pass@ip:554/ch0_0.h265.
  • go2rtc stream is a pure relay (no transcode): cam_raw: rtsp://user:pass@ip:554/ch0_0.h265; consumers read rtsp://127.0.0.1:8554/cam_raw?video&audio.
  • Consumers are long-running ffmpeg -c:v copy -c:a aac -f segment recorders (Frigate's recorder + our own test recorders).

What's observed (reproducible)

  1. It's the long-running consumer, not the camera. While a recording consumer is stuck at ~1 audio pkt/segment, the go2rtc API shows the producer's PCMA receiver byte/packet counters still advancing normally, and a fresh short ffmpeg/ffprobe probe of the same relayed URL decodes real, audible audio (~48,000 samples in an 8 s probe, mean level ≈ −37 dB). So the relay is still receiving and can still serve audio; only the established consumer's audio has collapsed.

  2. Independent consumers collapse — and recover — in lockstep. Two separate ffmpeg recorders of the same relayed stream (different ffmpeg versions, different process ages) slid 78→1 within the same ~1–2 minute window, and on another occasion both returned to ~78 in the same minute without reconnecting. A per-consumer clock-drift explanation doesn't fit a simultaneous spontaneous recovery; it points at something shared on the relay/producer side.

  3. A camera-direct consumer is immune. The identical ffmpeg -c:v copy -c:a aac pipeline reading the camera's RTSP directly (not via go2rtc) ran 4 h through several of these windows with steady ~78 audio pkts/segment. A direct ffmpeg client re-bases RTP timing from the camera's RTCP Sender Reports; the go2rtc-relayed consumers do not appear to get that benefit.

  4. Frequency: across one 48 h span we logged 62 decay episodes across ~12 cameras, counting a segment as decayed when its audio-packet count is ≤5. Median episode duration was ~3.3 h (max ~15 h). So it's frequent and long-lived, not a rare blip.
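The episode counting in item 4 can be sketched roughly as below. This is a minimal illustration, not the script we actually ran: the function name, the 10 s segment length, and the sample counts are made up for the example; only the ≤5 threshold comes from the report.

```python
from statistics import median


def find_decay_episodes(counts, segment_seconds=10, threshold=5):
    """Return (start_index, duration_seconds) for each run of consecutive
    segments whose audio-packet count is <= threshold."""
    episodes = []
    run_start = None
    for i, count in enumerate(counts):
        if count <= threshold:
            if run_start is None:
                run_start = i  # a new decay run begins here
        elif run_start is not None:
            episodes.append((run_start, (i - run_start) * segment_seconds))
            run_start = None
    if run_start is not None:  # run still open at the end of the log
        episodes.append((run_start, (len(counts) - run_start) * segment_seconds))
    return episodes


# Made-up per-segment audio packet counts for one camera.
counts = [78, 78, 40, 5, 1, 1, 1, 78, 77, 1, 1, 78]
episodes = find_decay_episodes(counts)
print(episodes)                           # [(3, 40), (9, 20)]
print(median(d for _, d in episodes))     # 30.0
```

The slide itself (the 40 in the sample) is not counted as part of the episode here; only segments at or below the threshold are.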

Hypothesis (not yet proven — would value your read)

These cheap cameras occasionally emit an audio RTP timestamp discontinuity. A direct ffmpeg client absorbs it (RTCP SR re-basing); the relay appears to pass the broken audio/video timestamp relationship through to its already-attached consumers, whose muxers then progressively starve the audio once the A/V gap exceeds their interleave tolerance — while newly-attached consumers negotiate fresh timing and are fine.
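To make "timestamp discontinuity" concrete, here is a minimal sketch of how one could be flagged in a captured list of audio RTP timestamps. It assumes 20 ms PCMA packets (160 ticks at 8000 Hz) and a 1 s tolerance; both values are illustrative guesses rather than measurements from these cameras, and the sample timestamps are invented.

```python
RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit and wrap around


def signed_delta(prev_ts, cur_ts):
    """Signed difference cur - prev on the 32-bit RTP timestamp circle."""
    delta = (cur_ts - prev_ts) % RTP_TS_MODULUS
    if delta >= RTP_TS_MODULUS // 2:
        delta -= RTP_TS_MODULUS
    return delta


def find_discontinuities(timestamps, expected_step=160, tolerance=8000):
    """Return (index, delta) for every packet whose timestamp step differs
    from expected_step by more than tolerance ticks."""
    jumps = []
    for i in range(1, len(timestamps)):
        delta = signed_delta(timestamps[i - 1], timestamps[i])
        if abs(delta - expected_step) > tolerance:
            jumps.append((i, delta))
    return jumps


# Invented capture: a normal 32-bit wrap, then a 500000-tick jump.
captured = [4294967000, 4294967160, 24, 184, 500184, 500344]
print(find_discontinuities(captured))  # [(4, 500000)]
```

The wrap from 4294967160 to 24 is a normal 160-tick step and is not flagged; only the 500000-tick jump (62.5 s at 8000 Hz) is.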

From a read of the v1.9.10 source, the RTSP consumer path (pkg/rtsp/consumer.go packetWriter) copies packet.Timestamp straight through, and the RTSP server side doesn't appear to emit RTCP Sender Reports to consumers — so a consumer has nothing to re-base against when the producer's audio timeline jumps. I want to confirm this with direct instrumentation before claiming it.
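For context on what a consumer would re-base against: an RTCP Sender Report pairs a wall-clock (NTP) time with the RTP timestamp for the same instant, so a receiver can map each stream's RTP timestamps onto a common clock. The sketch below shows that general RFC 3550 mapping only. It is not ffmpeg's or go2rtc's code, and the class name and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SenderReport:
    ntp_seconds: float   # wall-clock time carried in the SR, as seconds
    rtp_timestamp: int   # RTP timestamp the sender pairs with that instant


def rtp_to_wallclock(rtp_ts, sr, clock_rate):
    """Map an RTP timestamp to wall-clock seconds using the latest SR."""
    delta = (rtp_ts - sr.rtp_timestamp) % (1 << 32)
    if delta >= 1 << 31:          # treat as a negative offset across the wrap
        delta -= 1 << 32
    return sr.ntp_seconds + delta / clock_rate


# Hypothetical reports: audio at 8000 Hz, video at 90000 Hz.
audio_sr = SenderReport(ntp_seconds=1000.0, rtp_timestamp=80000)
video_sr = SenderReport(ntp_seconds=1000.0, rtp_timestamp=900000)
print(rtp_to_wallclock(88000, audio_sr, 8000))     # 1001.0
print(rtp_to_wallclock(990000, video_sr, 90000))   # 1001.0

# After an audio timestamp jump, a later SR re-anchors the audio timeline.
audio_sr_after_jump = SenderReport(ntp_seconds=1002.0, rtp_timestamp=5000000)
print(rtp_to_wallclock(5008000, audio_sr_after_jump, 8000))  # 1003.0
```

Without Sender Reports a consumer only has the raw RTP timestamps, so a jump on one stream reads as a lasting A/V offset.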

What I'm doing / what would help

I'm building an instrumented v1.9.10 to log the actual RTP timestamps go2rtc receives from the camera vs. what it relays to each consumer, captured across a real decay window — I'll follow up here with that data. If there's a preferred place to add that logging, or if this is a known/duplicate area, I'd appreciate a pointer. If the fix is "the relay should re-base/regenerate timestamps (or emit RTCP SRs) per consumer," I'm happy to put up a PR once I've confirmed the mechanism.
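To show what I mean by re-basing per consumer, here is a rough sketch of the idea. It is written in Python for brevity (go2rtc is Go), the class name is hypothetical, and the 160-tick step and 1 s tolerance are assumptions; it is not a proposed patch.

```python
MOD = 1 << 32  # RTP timestamps are 32-bit


class TimestampRebaser:
    """Rewrites incoming RTP timestamps so the outgoing timeline stays
    continuous when the producer's timeline jumps."""

    def __init__(self, expected_step=160, tolerance=8000):
        self.expected_step = expected_step
        self.tolerance = tolerance
        self.offset = 0       # added to every incoming timestamp
        self.prev_in = None

    def rewrite(self, ts_in):
        if self.prev_in is not None:
            delta = (ts_in - self.prev_in) % MOD
            if delta >= MOD // 2:
                delta -= MOD
            if abs(delta - self.expected_step) > self.tolerance:
                # Absorb the jump: shift the offset so that the output
                # advances by exactly expected_step for this packet.
                self.offset = (self.offset - (delta - self.expected_step)) % MOD
        self.prev_in = ts_in
        return (ts_in + self.offset) % MOD


rebaser = TimestampRebaser()
relayed = [rebaser.rewrite(ts) for ts in [1000, 1160, 501160, 501320]]
print(relayed)  # [1000, 1160, 1320, 1480]
```

A real fix would also have to decide whether a jump is a genuine gap in the audio, and keep audio aligned with video; this only illustrates hiding the discontinuity from an attached consumer.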
