AV1 support for HW decoding #1690

@legraphista

Overview

#1685 implemented hardware-accelerated decoding for CUDA devices. It works for H.264 and VP9, but not for AV1, even on devices that support AV1 hardware decoding.

Expected behavior

AV1 decoding is done on the GPU.

Actual behavior

Decoding is done in software, even though allow_software_fallback is False.

Traceback:
n/a

Investigation

sample media: av1.webm

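import av
import time

file = 'av1.webm'

# Request CUDA decoding and forbid falling back to a software decoder.
hwaccel = av.codec.hwaccel.HWAccel(device_type='cuda', allow_software_fallback=False)
container = av.open(file, hwaccel=hwaccel)

# Decode every frame of the first video stream and count the frames.
start_time = time.time()
frame_count = 0
for packet in container.demux(video=0):
    for _ in packet.decode():
        frame_count += 1

# Throughput of the decode loop only (opening the file is not timed).
hw_time = time.time() - start_time
hw_fps = frame_count / hw_time
container.close()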

print(f"Decoded with cuda in {hw_time:.2f}s ({hw_fps:.2f} fps).")
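
To compare this run against a software-only baseline, the timing loop can be factored into a small helper. This is only a sketch: measure_fps is a hypothetical name, not part of PyAV, and it times whatever iterable of frames it is given.

```python
import time
from typing import Iterable, Tuple


def measure_fps(frames: Iterable) -> Tuple[int, float, float]:
    """Consume an iterable of decoded frames and return (count, seconds, fps).

    Hypothetical helper for illustration; it is not part of PyAV.
    """
    start = time.perf_counter()
    count = 0
    for _ in frames:
        count += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very short inputs.
    fps = count / elapsed if elapsed > 0 else float("inf")
    return count, elapsed, fps
```

With PyAV, the iterable would be container.decode(video=0), called once on a container opened with hwaccel=hwaccel and once on a container opened without it. If both runs report similar throughput, that would be consistent with the AV1 stream being decoded in software in both cases.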

Sanity Check:
FFmpeg:

$ ffmpeg -c:v av1_cuvid -i av1.webm -f null -
...
frame=  300 fps=0.0 q=-0.0 Lsize=N/A time=00:00:10.00 bitrate=N/A speed=17.3x    
# 30 FPS * 17.3x realtime ~ 500FPS decode speed

Py Sample:

$ python test.py
Decoded with cuda in 3.18s (94.25 fps).
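
The comparison between the two runs can be made explicit by deriving FFmpeg's decode rate from its progress line (frames / duration * speed) and dividing by the PyAV figure. A minimal sketch; the function names are made up for illustration, and the regular expressions assume the progress-line layout shown above:

```python
import re


def parse_ffmpeg_progress(line: str) -> dict:
    """Extract frame count, media duration in seconds, and speed factor."""
    frames = int(re.search(r"frame=\s*(\d+)", line).group(1))
    h, m, s = re.search(r"time=(\d+):(\d+):(\d+(?:\.\d+)?)", line).groups()
    duration = int(h) * 3600 + int(m) * 60 + float(s)
    speed = float(re.search(r"speed=\s*([\d.]+)x", line).group(1))
    return {"frames": frames, "duration": duration, "speed": speed}


def decode_fps(line: str) -> float:
    """Effective decode rate: source frame rate times the realtime speed factor."""
    info = parse_ffmpeg_progress(line)
    source_fps = info["frames"] / info["duration"]
    return source_fps * info["speed"]


line = ("frame=  300 fps=0.0 q=-0.0 Lsize=N/A time=00:00:10.00 "
        "bitrate=N/A speed=17.3x")
ffmpeg_fps = decode_fps(line)  # 300 / 10.0 * 17.3, about 519 fps
pyav_fps = 94.25               # figure reported by the Python sample
print(f"FFmpeg: {ffmpeg_fps:.0f} fps, {ffmpeg_fps / pyav_fps:.1f}x the PyAV run")
# prints: FFmpeg: 519 fps, 5.5x the PyAV run
```

On these numbers, av1_cuvid in FFmpeg is roughly 5.5 times faster than the PyAV run on the same file.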

Reproduction

see above

Versions

PyAV v14.0.1
library configuration: --disable-static --enable-shared --libdir=/tmp/vendor/lib --prefix=/tmp/vendor --disable-alsa --disable-doc --disable-libtheora --disable-libfreetype --disable-libfontconfig --disable-libbluray --disable-libopenjpeg --disable-mediafoundation --enable-gmp --enable-gnutls --enable-libaom --enable-libdav1d --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopus --enable-libspeex --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libxcb --enable-libxml2 --enable-lzma --enable-zlib --enable-version3 --enable-libx264 --disable-libopenh264 --enable-libx265 --enable-gpl
library license: GPL version 3 or later
libavcodec     61. 19.100
libavdevice    61.  3.100
libavfilter    10.  4.100
libavformat    61.  7.100
libavutil      59. 39.100
libswresample   5.  3.100
libswscale      8.  3.100
  • I am/tried using the binary wheels
  • I compiled from source

Research

I have done the following:

Additional context

Tested on an RTX 4090; also reproduced on an L4 in GCP.
