Long-running RTSP consumers progressively lose audio on G.711/H.265 relay; camera-direct consumers unaffected #2303

## Summary

On budget H.265 + G.711 (PCMA/8000) cameras relayed by go2rtc over RTSP, a **long-running** RTSP consumer of the relayed stream progressively loses its audio: the AAC/PCMA packet count per 10 s output segment slides from ~78 to ~1 over several minutes and then stays at ~1 (effectively silent) for hours, while **video is unaffected**. A freshly connected consumer on the same stream is fine; a consumer reading the **camera directly** (bypassing go2rtc) is also unaffected.
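For reference, the ~78 figure matches what a healthy segment should contain. A minimal sketch of the arithmetic, assuming (not stated above) that the AAC encoder emits 1024-sample frames and keeps the camera's 8 kHz sample rate:

```python
# Assumptions (not confirmed in this report): 1024-sample AAC frames,
# output sample rate equal to the 8 kHz PCMA input rate.
SAMPLE_RATE_HZ = 8000
AAC_FRAME_SAMPLES = 1024
SEGMENT_SECONDS = 10


def expected_audio_packets(segment_seconds: float,
                           sample_rate_hz: int = SAMPLE_RATE_HZ,
                           frame_samples: int = AAC_FRAME_SAMPLES) -> float:
    """Number of audio packets a fully populated segment would hold."""
    return segment_seconds * sample_rate_hz / frame_samples


healthy = expected_audio_packets(SEGMENT_SECONDS)
print(f"expected packets per {SEGMENT_SECONDS} s segment: {healthy:.3f}")
print(f"share of audio left at 1 packet/segment: {1 / healthy:.1%}")
```

Under those assumptions a segment holding a single packet carries only about one frame (128 ms) of audio out of 10 s.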

I haven't proven the internal cause yet (instrumented measurement in progress — see below), so I've kept this report to what's observable and reproducible and put the mechanism as a hypothesis. Happy to gather whatever would help.

## Environment

- go2rtc **1.9.10** (as bundled in Frigate 0.17.1), `linux/amd64`, Docker on a WSL2 host.
- ~12 identical budget cams: 4K **H.265** video + **G.711 A-law (PCMA, 8 kHz mono)** audio, `rtsp://user:pass@ip:554/ch0_0.h265`.
- go2rtc stream is a pure relay (no transcode): `cam_raw: rtsp://user:pass@ip:554/ch0_0.h265`; consumers read `rtsp://127.0.0.1:8554/cam_raw?video&audio`.
- Consumers are long-running `ffmpeg -c:v copy -c:a aac -f segment` recorders (Frigate's recorder + our own test recorders).

## What's observed (reproducible)

1. **It's the long-running consumer, not the camera.** While a recording consumer is stuck at ~1 audio packet per segment, the go2rtc API shows the producer's PCMA receiver byte/packet counters still advancing normally, and a fresh short `ffmpeg`/`ffprobe` probe of the same relayed URL decodes real, audible audio (~48,000 samples in an 8 s probe, mean level ≈ −37 dB). So the relay is still receiving and can still serve audio; only the established consumer's audio has collapsed.

2. **Independent consumers collapse — and recover — in lockstep.** Two separate `ffmpeg` recorders of the **same** relayed stream (different ffmpeg versions, different process ages) slid 78→1 within the same ~1–2 minute window, and on another occasion both returned to ~78 in the **same minute without reconnecting**. A per-consumer clock-drift explanation doesn't fit a simultaneous spontaneous recovery; it points at something shared on the relay/producer side.

3. **A camera-direct consumer is immune.** The identical `ffmpeg -c:v copy -c:a aac` pipeline reading the camera's RTSP **directly** (not via go2rtc) ran 4 h through several of these windows with steady ~78 audio pkts/segment. A direct ffmpeg client re-bases RTP timing from the camera's RTCP Sender Reports; the go2rtc-relayed consumers do not appear to get that benefit.

4. **Frequency:** across one 48 h span we logged **62 decay episodes** (defined as segment audio-packet count ≤ 5) across ~12 cameras, with a median duration of ~3.3 h (max ~15 h). So it's frequent and long-lived, not a rare blip.
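The episode counts above come from per-segment audio-packet counts. The exact delimiting rule isn't spelled out here, so the following is only one plausible sketch: a hypothetical `find_decay_episodes` helper that groups consecutive segments at or below the threshold into a single episode.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Episode:
    start_s: float
    end_s: float

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def find_decay_episodes(segments: Iterable[Tuple[float, int]],
                        threshold: int = 5,
                        segment_seconds: float = 10.0) -> List[Episode]:
    """Group consecutive low-audio segments into episodes.

    segments: (segment_start_time_s, audio_packet_count), sorted by time.
    """
    episodes: List[Episode] = []
    current: Optional[Episode] = None
    for start_s, count in segments:
        if count <= threshold:
            if current is None:
                current = Episode(start_s, start_s + segment_seconds)
            else:
                current.end_s = start_s + segment_seconds
        elif current is not None:
            episodes.append(current)
            current = None
    if current is not None:  # stream ended while still decayed
        episodes.append(current)
    return episodes


# Illustrative data only, not measurements from this report.
sample = [(0, 78), (10, 40), (20, 3), (30, 1), (40, 1), (50, 78), (60, 2)]
for ep in find_decay_episodes(sample):
    print(f"episode {ep.start_s:.0f}s-{ep.end_s:.0f}s ({ep.duration_s:.0f}s)")
```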

## Hypothesis (not yet proven — would value your read)

These cheap cameras occasionally emit an audio RTP timestamp discontinuity. A direct ffmpeg client absorbs it (RTCP SR re-basing); the relay appears to pass the broken audio/video timestamp relationship through to its **already-attached** consumers, whose muxers then progressively starve the audio once the A/V gap exceeds their interleave tolerance — while newly-attached consumers negotiate fresh timing and are fine.
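To make the hypothesis concrete: an RTCP Sender Report pairs an RTP timestamp with a wallclock time, so a receiver that maps packets through the latest SR sees a continuous timeline even when the raw RTP timestamps jump. The sketch below uses made-up numbers and assumes the camera sends a correct SR after the discontinuity; whether ffmpeg behaves exactly this way is part of what remains unproven.

```python
RTP_MOD = 1 << 32  # RTP timestamps are 32-bit and wrap around


def rtp_delta(ts: int, ref: int) -> int:
    """Signed difference ts - ref, tolerant of 32-bit wraparound."""
    d = (ts - ref) % RTP_MOD
    return d - RTP_MOD if d >= RTP_MOD // 2 else d


def wallclock_from_sr(rtp_ts: int, sr_rtp_ts: int,
                      sr_wallclock_s: float, clock_rate: int) -> float:
    """Map an RTP timestamp to wallclock seconds via a Sender Report."""
    return sr_wallclock_s + rtp_delta(rtp_ts, sr_rtp_ts) / clock_rate


CLOCK = 8000  # PCMA clock rate

# Hypothetical values: one packet before and one after a timestamp jump,
# each mapped through the Sender Report in force at that moment.
before = wallclock_from_sr(160_160, sr_rtp_ts=160_000,
                           sr_wallclock_s=1000.0, clock_rate=CLOCK)
after = wallclock_from_sr(8_200_160, sr_rtp_ts=8_200_000,
                          sr_wallclock_s=1005.0, clock_rate=CLOCK)

raw_gap_s = rtp_delta(8_200_160, 160_160) / CLOCK  # what passthrough exposes
sr_gap_s = after - before                          # what SR mapping yields
print(f"raw RTP gap: {raw_gap_s:.2f} s, SR-mapped gap: {sr_gap_s:.2f} s")
```

In this toy case the raw timestamps claim a gap of roughly 1005 s while the SR-mapped timeline advances by about 5 s, which is the kind of A/V mismatch a consumer without SRs would have no way to correct.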

From a read of the v1.9.10 source, the RTSP consumer path (`packetWriter` in `pkg/rtsp/consumer.go`) copies `packet.Timestamp` straight through, and the RTSP server side doesn't appear to emit RTCP Sender Reports to consumers. If so, a consumer has nothing to re-base against when the producer's audio timeline jumps. I want to confirm this with direct instrumentation before claiming it.
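If passthrough does turn out to be the cause, one possible shape for per-consumer re-basing is sketched below. It is written in Python for brevity and is not go2rtc code: `TimestampRebaser`, the 1 s jump threshold, and the 160-tick nominal step (20 ms of 8 kHz audio) are all assumptions for illustration. It also discards the real size of the gap, which may or may not be the right trade-off.

```python
RTP_MOD = 1 << 32  # RTP timestamps are 32-bit and wrap around


def signed_delta(ts: int, ref: int) -> int:
    """Signed difference ts - ref, tolerant of 32-bit wraparound."""
    d = (ts - ref) % RTP_MOD
    return d - RTP_MOD if d >= RTP_MOD // 2 else d


class TimestampRebaser:
    """Hypothetical helper: hides large input jumps from one consumer."""

    def __init__(self, clock_rate: int, max_step_s: float = 1.0,
                 nominal_step: int = 160) -> None:
        self.max_step = int(clock_rate * max_step_s)
        self.nominal_step = nominal_step
        self.offset = 0
        self.last_in = None

    def rebase(self, ts_in: int) -> int:
        if self.last_in is not None:
            step = signed_delta(ts_in, self.last_in)
            if abs(step) > self.max_step:
                # Discontinuity: adjust the offset so the output advances
                # by one nominal packet step instead of the full jump.
                self.offset = (self.offset + self.nominal_step - step) % RTP_MOD
        self.last_in = ts_in
        return (ts_in + self.offset) % RTP_MOD


rebaser = TimestampRebaser(clock_rate=8000)
incoming = [1000, 1160, 1320, 8_001_480, 8_001_640]  # jump after third packet
print([rebaser.rebase(ts) for ts in incoming])
```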

## What I'm doing / what would help

I'm building an instrumented v1.9.10 to log the actual RTP timestamps go2rtc receives from the camera vs. what it relays to each consumer, captured across a real decay window — I'll follow up here with that data. If there's a preferred place to add that logging, or if this is a known/duplicate area, I'd appreciate a pointer. If the fix is "the relay should re-base/regenerate timestamps (or emit RTCP SRs) per consumer," I'm happy to put up a PR once I've confirmed the mechanism.
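For the captured logs, a rough sketch of the kind of check that would flag a discontinuity: compare how far the RTP timestamp advanced against how much arrival time elapsed between consecutive packets. The log format (arrival time in seconds plus RTP timestamp), the `find_discontinuities` name, and the 0.5 s tolerance are assumptions, not the actual instrumentation output.

```python
from typing import List, Sequence, Tuple

RTP_MOD = 1 << 32  # RTP timestamps are 32-bit and wrap around


def signed_delta(ts: int, ref: int) -> int:
    """Signed difference ts - ref, tolerant of 32-bit wraparound."""
    d = (ts - ref) % RTP_MOD
    return d - RTP_MOD if d >= RTP_MOD // 2 else d


def find_discontinuities(samples: Sequence[Tuple[float, int]],
                         clock_rate: int,
                         tolerance_s: float = 0.5) -> List[Tuple[float, float]]:
    """Return (arrival_time_s, excess_s) for each suspicious step.

    samples: (arrival_time_s, rtp_timestamp) pairs in arrival order.
    excess_s: media time advanced minus wall time elapsed.
    """
    jumps = []
    for (t0, ts0), (t1, ts1) in zip(samples, samples[1:]):
        media_dt = signed_delta(ts1, ts0) / clock_rate
        wall_dt = t1 - t0
        if abs(media_dt - wall_dt) > tolerance_s:
            jumps.append((t1, media_dt - wall_dt))
    return jumps


# Illustrative log: 20 ms PCMA packets with a ~10 s timestamp jump.
log = [(0.00, 0), (0.02, 160), (0.04, 320), (0.06, 80_480), (0.08, 80_640)]
for when, excess in find_discontinuities(log, clock_rate=8000):
    print(f"t={when:.2f}s: RTP timeline moved {excess:+.2f} s vs arrival clock")
```

Running the same check on both the camera-side and consumer-side logs would show whether a jump is passed through unchanged.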

