The app has a simple, single-page UI so that blind users can navigate it without difficulty.
The frontend gives explicit voice instructions at every step, so blind users can record videos easily.
The process used for caption generation is as follows:
- First, the audio and video streams are separated so each can be processed independently.
- Key frames are extracted with OpenCV: frames are converted to the LUV colour space, smoothed, and the frames that differ most from their neighbours are selected.
- Each key frame is captioned with Salesforce's image captioning model.
- Whisper transcribes the audio track.
- Gemini combines the frame captions and the transcript into a single summary.
The frontend uses BLoC state management to keep the user experience smooth.
Celery and Redis make backend processing asynchronous, so the app does not block while a video is being summarized.
The summary returned by Gemini is read aloud to the user with text-to-speech.
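The source does not show how the frame captions and the transcript are combined before being sent to Gemini; a minimal prompt-building sketch (the wording is assumed) might look like:

```python
def build_summary_prompt(frame_captions, transcript):
    """Merge key-frame captions and the Whisper transcript into one prompt.

    The prompt wording is illustrative; the app's actual prompt is not
    shown in the source.
    """
    caption_lines = "\n".join(f"- {c}" for c in frame_captions)
    return (
        "Summarize the following video for a blind listener.\n"
        "Key-frame captions:\n"
        f"{caption_lines}\n"
        "Audio transcript:\n"
        f"{transcript}"
    )
```

The resulting string would be sent to Gemini, and the model's response passed on to the text-to-speech step.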

- cd ./backend/video_summary_backend
- Create a virtual environment and install the dependencies listed in requirements.txt
- Start a Redis instance on port 6379
- Verify it is running with: redis-cli ping
- Start the Celery worker with: celery -A video_summary_backend worker --pool=solo -l info
- To set up the database, run: python manage.py makemigrations
- run python manage.py migrate
- run python manage.py runserver 0.0.0.0:8000
- To set up the keys in the .env file, refer to sample.env
- (Optional) To get the latest Flutter version, run: flutter upgrade
- Run: flutter pub get
- Run: flutter run