Is this a new feature, an improvement, or a change to existing functionality?
Improvement
How would you describe the priority of this feature request?
Currently preventing usage
Please provide a clear description of the problem this feature solves
Currently, for video files, we only transcribe the audio.
We should also identify and extract key frames and run OCR-based text extraction on them.
Images of these frames should also be embeddable via .embed() and storable to disk via .store().
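To make the key-frame part concrete, here is a rough sketch of one possible selection heuristic: keep a frame whenever it differs enough from the last kept frame. This is only an illustration, not a proposed implementation. The function names `mean_abs_diff` and `select_key_frames`, the flat grayscale frame representation, and the default threshold are all assumptions made for the example. Video decoding, OCR, and the hookup to .embed() / .store() are left out.

```python
from typing import Iterable, List, Optional, Sequence


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Average per-pixel absolute difference between two grayscale frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(
    frames: Iterable[Sequence[int]], threshold: float = 20.0
) -> List[int]:
    """Return indices of frames that differ enough from the last key frame.

    Each frame is a flat sequence of 0-255 grayscale values (hypothetical
    representation; a real decoder would hand back image arrays instead).
    """
    key_indices: List[int] = []
    last_key: Optional[Sequence[int]] = None
    for index, frame in enumerate(frames):
        # The first frame is always kept; later frames only on a large change.
        if last_key is None or mean_abs_diff(frame, last_key) > threshold:
            key_indices.append(index)
            last_key = frame
    return key_indices


# Tiny synthetic "video": two near-identical dark frames, two near-identical
# bright frames, then a dark frame again.
frames = [[0] * 4, [1] * 4, [100] * 4, [101] * 4, [0] * 4]
print(select_key_frames(frames))  # [0, 2, 4]
```

Only the selected frames would then need to go through OCR and be exposed for embedding and storage, which keeps the cost well below processing every frame.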
Describe the feature, and optionally a solution or implementation and any alternatives
N/A
Additional context
No response