Add more docs about how to build a wheel of vision with all video features #7250

Open
@wqh17101

Description

🚀 The feature

There are no docs showing how to build a wheel with all video features, including the video_reader (GPU decoder).

Motivation, pitch

I want to use the GPU to speed up video decoding,
and I see that torchvision supports a GPU video decoder.
I have the following questions:

  1. From https://github.com/pytorch/vision#video-backend, I understand that I need ffmpeg or PyAV to enable the video features. However, neither of them supports GPU decoding out of the box. What else do I need in order to use the GPU video decoder?
  2. There are no detailed docs showing how to build a wheel of vision with the GPU video decoder.
  3. After GPU decoding, where does the tensor live: in system memory or in GPU memory?
  4. What is the data flow of your video processing and inference? Is it:
     1. Decoding in GPU memory.
     2. Downloading to system memory.
     3. Uploading to GPU memory for inference.
     4. Downloading to system memory.
     5. Uploading to GPU memory for encoding (this step may not exist).

     or

     1. Decoding in GPU memory.
     2. Inference in GPU memory directly.
     3. Encoding in GPU memory (this step may not exist).

  5. Is there any way for video to work with this pipeline: (1) decode on the GPU and keep the frames in GPU memory; (2) run inference directly on the tensor in GPU memory, without downloading it to system memory and uploading it to GPU memory again?
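
To make the difference between the two data flows in question 4 concrete, here is a minimal toy model in plain Python. It does not call torchvision or any GPU API. The stage names and the `count_transfers` helper are hypothetical and only count how often data would cross between system memory and GPU memory in each flow, assuming every stage writes its output to the memory space listed.

```python
# Toy model only: each stage is (name, memory space where its output lives).
# Stage names and the helper below are hypothetical, not torchvision APIs.

ROUND_TRIP_FLOW = [
    ("decode", "gpu"),
    ("download decoded frames", "cpu"),
    ("upload for inference", "gpu"),
    ("download results", "cpu"),
    ("upload for encoding", "gpu"),
]

GPU_RESIDENT_FLOW = [
    ("decode", "gpu"),
    ("inference", "gpu"),
    ("encode", "gpu"),
]


def count_transfers(flow):
    """Count how often data crosses between system memory and GPU memory."""
    spaces = [space for _, space in flow]
    return sum(1 for prev, cur in zip(spaces, spaces[1:]) if prev != cur)


print(count_transfers(ROUND_TRIP_FLOW))    # 4
print(count_transfers(GPU_RESIDENT_FLOW))  # 0
```

Under these assumptions the first flow copies every frame across the bus four times, while the second never leaves GPU memory, which is why it matters whether the decoder returns tensors that already live on the GPU.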

I think the answers to these questions should be added to the docs.

Alternatives

No response

Additional context

No response
