Decoding and encoding RTSP 

## Overview
I'm using PyAV to read an RTSP stream and write it to disk (storing both audio and video), but I haven't found a reliable way to do this: most of my implementations produce output with audio and video pops. Below is my current test script.

``` python
import av
import cv2

video_path = "rtsp://wowzaec2demo.streamlock.net/vod/mp4:BigBuckBunny_115k.mov"
rtsp = av.open(video_path)

in_streams = [stream for stream in rtsp.streams] 
a_stream = rtsp.streams.audio[0]
v_stream = rtsp.streams.video[0]

output_container = av.open('live_stream.mp4', mode='w')
out_audio = output_container.add_stream(template=a_stream)
out_video = output_container.add_stream(template=v_stream)

audio_demux = iter(rtsp.demux(a_stream))
video_demux = iter(rtsp.demux(v_stream))

frame_template = av.VideoFrame(v_stream.width, v_stream.height, format="yuv420p")

counter = 0
first_packet = True
while True:
    try:
        counter += 1

        audio_packet = next(audio_demux)
        video_packet = next(video_demux)

        if first_packet:
            audio_packet.pts = 0
            audio_packet.dts = 0
            video_packet.pts = 0
            video_packet.dts = 0
            first_packet = False

        if audio_packet.dts is None or video_packet.dts is None:
            continue

        ##### Decode/Encode frame #####
        frame = video_packet.decode()
        if len(frame) != 0:
            frame = frame[0].to_ndarray(format='yuv420p').copy()
            frame = cv2.circle(frame, (500,500), 50, (0,0,255), -1) 
            frame = frame_template.from_ndarray(frame, format="yuv420p")
            video_packet = out_video.encode(frame)
        ###############################

        audio_packet.stream = out_audio
        video_packet.stream = out_video

        output_container.mux(audio_packet)
        output_container.mux(video_packet)

        print(counter)
        if counter > 200:
            raise StopIteration

    except StopIteration:
        print("[INFO] Success")
        output_container.close()
        break

    ##### Comment this handler for more information about the raised error #####
    except Exception as e:
        print(f"[ERROR] Demuxing error: {e}")
        break
```

## Expected behavior
I expected an MP4 file containing the whole video from the provided RTSP stream, with the same image and audio quality as the source.

## Actual behavior
If I remove the part of my code that decodes the video packet into a NumPy array and encodes it again, a [video file](https://imgur.com/a/SCGmUx4) is written to disk, but it contains audio and video pops. Otherwise, an error is raised when the frame rebuilt from the `np.ndarray` is encoded.

Traceback (raised when the frame decoding/encoding step is kept):
```console
  _ = out_video.encode(frame)
  File "av/stream.pyx", line 155, in av.stream.Stream.encode
  File "av/codec/context.pyx", line 476, in av.codec.context.CodecContext.encode
  File "av/codec/context.pyx", line 391, in av.codec.context.CodecContext._send_frame_and_recv
  File "av/error.pyx", line 336, in av.error.err_check
av.error.ValueError: [Errno 22] Invalid argument; last error log: [h264] co located POCs unavailable
Traceback (most recent call last):
  File "av/container/output.pyx", line 25, in av.container.output.close_output
TypeError: 'NoneType' object is not iterable
Exception ignored in: 'av.container.output.OutputContainer.__dealloc__'
Traceback (most recent call last):
  File "av/container/output.pyx", line 25, in av.container.output.close_output
TypeError: 'NoneType' object is not iterable
```

## Investigation
- Writing the RTSP stream to disk without decoding frames to `np.ndarray`: I tested each output stream (audio and video) in isolation, and in both cases the output is fine. The pops only appear when both streams are written to the same file.
- Decoding each frame to `np.ndarray`: I've tried checking the writer and frame dimensions, changing colorspaces, and pre-allocating a `VideoFrame` object to load the array into.
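
One possible explanation for the first observation, offered as an assumption rather than a confirmed diagnosis: if the two `demux` iterators pull from the same underlying packet reader, each one may discard packets that belong to the other stream while searching for its own. The toy model below does not use PyAV at all. The `(kind, index)` tuples, `filtered`, `alternate_from_shared_reader`, and `single_pass` are hypothetical stand-ins, meant only to show how alternating between two filters over one shared reader can lose items.

```python
# Toy model only: (kind, index) tuples stand in for packets; no PyAV involved.
PACKETS = [("v", 0), ("a", 0), ("a", 1), ("v", 1), ("a", 2),
           ("a", 3), ("v", 2), ("a", 4), ("a", 5), ("v", 3)]


def filtered(reader, kind):
    """Yield items of one kind; anything else read along the way is discarded."""
    for item in reader:
        if item[0] == kind:
            yield item


def alternate_from_shared_reader(packets):
    """Mimic the loop in the question: one audio item, then one video item."""
    reader = iter(packets)           # a single underlying reader
    audio = filtered(reader, "a")    # both filters consume the same reader
    video = filtered(reader, "v")
    written = []
    while True:
        try:
            written.append(next(audio))
            written.append(next(video))
        except StopIteration:
            return written


def single_pass(packets):
    """Read every item once, in arrival order, and keep both kinds."""
    return [item for item in iter(packets) if item[0] in ("a", "v")]


lossy = alternate_from_shared_reader(PACKETS)
complete = single_pass(PACKETS)
print(len(lossy), "of", len(PACKETS), "items survive with two filters")
print(len(complete), "of", len(PACKETS), "items survive with a single pass")
```

If `rtsp.demux(...)` behaves like `filtered` here, a single `rtsp.demux(a_stream, v_stream)` loop that dispatches on `packet.stream` would be the analogue of `single_pass`. That is an untested suggestion, and it does not address the encoding error.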

## Research

I have done the following:

- [X] Checked the [PyAV documentation](https://pyav.org/docs)
- [X] Searched on [Google](https://www.google.com/search?q=pyav+how+do+I+foo)
- [X] Searched on [Stack Overflow](https://stackoverflow.com/search?q=pyav)
- [X] Looked through [old GitHub issues](https://github.com/PyAV-Org/PyAV/issues?&q=is%3Aissue)
- [ ] Asked on [PyAV Gitter](https://gitter.im/PyAV-Org)
- [ ] ... and waited 72 hours for a response.

