
[Feature Request]: video Embeddings : chromaDB also for videos #3533

Open
@Keerthivardhan1


Describe the problem

So far ChromaDB has been used only for text. With this feature it would also become a solution for videos,

enabling use cases such as "Chat With Video" / "Talk To Video".

Describe the proposed solution

A function that creates embeddings for videos.

Steps:

video ==> audio ==> text ==> nltk (sent_tokenize) ==> vectors (same steps as for text), creating a collection from the sentences in the video

Sample code:

from moviepy import VideoFileClip
import speech_recognition as sr

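def extract_audio_from_video(video_file_path, audio_file_path):
    '''
    desc : extracts audio from video_file_path and stores it in audio_file_path

    input :
        video_file_path : path to video file
        audio_file_path : path to audio file

    output : None (the audio is written to audio_file_path)
    '''
    video = VideoFileClip(video_file_path)
    try:
        # video.audio is None when the video has no audio track
        video.audio.write_audiofile(audio_file_path)
    finally:
        # release the file handles held by the clip
        video.close()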

def transcribe_audio_to_text(audio_file_path, text_file_path):
    '''
    desc : transcribes the audio to text

    input :
        audio_file_path : path to audio file
        text_file_path : path to text file to save the text

    output : the transcribed text
    '''
    recognizer = sr.Recognizer()
    # read the whole audio file into memory
    with sr.AudioFile(audio_file_path) as source:
        audio_data = recognizer.record(source)
    # send the audio to the Google Web Speech API (needs network access)
    text = recognizer.recognize_google(audio_data)
    with open(text_file_path, "w") as file:
        file.write(text)
    return text

video_file_path = "samples.mp4"
audio_file_path = "temp_audio.wav"
text_file_path = "text_of_video.txt"

extract_audio_from_video(video_file_path, audio_file_path)
transcription = transcribe_audio_to_text(audio_file_path , text_file_path)
print("Transcription:")
print(transcription)
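
The sample above stops at the text step. The remaining steps (sent_tokenize, then one collection entry per sentence) could look roughly like the sketch below. It is only an illustration: `split_sentences`, `index_transcript`, `index_video_transcript`, the id format and the collection name `video_transcripts` are hypothetical names, not part of ChromaDB. The regex fallback is an assumption added so the sketch still runs when NLTK or its tokenizer data is not installed.

```python
import re


def split_sentences(text):
    """Split a transcript into sentences.

    Uses nltk's sent_tokenize when NLTK and its tokenizer data are available;
    otherwise falls back to a simple punctuation-based split (assumption).
    """
    try:
        from nltk.tokenize import sent_tokenize
        sentences = sent_tokenize(text)
    except (ImportError, LookupError):
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s.strip() for s in sentences if s.strip()]


def index_transcript(collection, text, video_id):
    """Add one document per sentence to a Chroma-style collection.

    `collection` is any object with an add(ids=, documents=, metadatas=) method.
    Returns the number of sentences added.
    """
    sentences = split_sentences(text)
    if not sentences:
        return 0
    collection.add(
        ids=[f"{video_id}-{i}" for i in range(len(sentences))],
        documents=sentences,
        metadatas=[
            {"video_id": video_id, "sentence_index": i}
            for i in range(len(sentences))
        ],
    )
    return len(sentences)


def index_video_transcript(text, video_id, collection_name="video_transcripts"):
    """Hypothetical helper: store a transcript in an in-memory Chroma collection."""
    import chromadb  # imported lazily; requires `pip install chromadb`

    client = chromadb.Client()
    collection = client.get_or_create_collection(name=collection_name)
    index_transcript(collection, text, video_id)
    return collection
```

With this in place, `index_video_transcript(transcription, "samples")` would return a collection that can be searched with `collection.query(query_texts=["..."], n_results=3)`, relying on the collection's default embedding function. If the transcript comes back without punctuation, the splitter returns it as a single chunk, so another chunking rule may be needed in that case.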

I would also like to contribute to this feature.

Alternatives considered

No response

Importance

nice to have

Additional Information

I would like to work on it!

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)
