You can view a demonstration of the application here: https://youtu.be/FIFocNaa4mo
To run this yourself, you must have Docker installed. Then, in a terminal (e.g. cmd), run: docker-compose up. Startup takes a while because of the image size, which I aim to reduce in the future. The application runs entirely locally on your desktop, so depending on your input file, processing can take a while.
Alternatively, you can run this in developer mode:
- Clone the repo
- In /frontend, run npm install to install all the packages
- In /backend, create and activate a Python virtual environment, then run pip install -r requirements.txt
- In the same Python virtual environment, run pip install git+https://github.com/m-bain/whisperx.git
- Note: for WhisperX installation details, see https://github.com/m-bain/whisperX
- Once the npm and pip installs have finished, you can start the backend and frontend
- For the frontend, navigate to /frontend in cmd and run: npm run start
- For the backend, navigate to /backend and, within your Python venv, run: python manage.py runserver
- The apps should be exposed at localhost:3000 (frontend) and localhost:8000 (backend API)
- You can follow what's going on in the cmd windows where you started the frontend and backend servers
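To confirm that both servers came up on the ports listed above, a quick check can be sketched in Python. This is a minimal sketch, not part of the repo: the helper name is_port_open is hypothetical, and it only tests that something accepts TCP connections on each port, not that the app itself is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the setup steps: 3000 for the frontend, 8000 for the backend API.
    for name, port in (("frontend", 3000), ("backend", 8000)):
        status = "up" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} (localhost:{port}): {status}")
```

Run it from any terminal once both servers have been started; a "not reachable" line usually means that server is still starting or failed to launch.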
Use cases for this application:
- Text analysis for long-form audio
- Generate subtitles automatically for your own videos
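For the subtitle use case, the conversion from transcript segments to SRT can be sketched as below. This is an illustrative sketch, not the app's actual export code: it assumes each segment is a dict with start and end times in seconds and a text field, and the function names srt_timestamp and segments_to_srt are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments (assumed shape: start, end, text)."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 2.5, "end": 5.0, "text": " Welcome to the demo."},
    ]
    print(segments_to_srt(demo))
```

VTT output would follow the same pattern, with a WEBVTT header line and a period instead of a comma before the milliseconds.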
TODO:
- Let user download results in different formats (COMPLETE)
- TXT, CSV, SRT, VTT
- Let user input from a YouTube link
- Currently only accepts pre-downloaded files
- Add separate folder to store temporary uploaded files and remove them when done (COMPLETE)
- Currently stored in the same directory as manage.py in /backend
- Add search functionality
- Lexical search (COMPLETE)
- Semantic search
- (maybe using vector search methods and embeddings)
- Filtered search results should have two buttons: Download ALL (downloads all the data) and Download Filtered (downloads only the search results)
- Fix search so it keeps the original data in memory. Currently, subsequent searches and the reset button only operate on the filtered data, so the original data is lost.
- Improve styling and look of the app with SVG and more animations
- For mp4/YouTube uploads, display the video itself in place of the animated waveform used for audio-only inputs
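One possible shape for the search fix in the TODO list is to keep the original rows untouched and always filter from them, so that repeated searches and reset never lose data. The sketch below shows the idea in Python; the class name TranscriptSearch and the row shape (a dict with a text field) are assumptions for illustration, and the frontend would hold the equivalent in its own state.

```python
class TranscriptSearch:
    """Lexical search that never mutates the original rows (hypothetical sketch)."""

    def __init__(self, rows):
        self._original = list(rows)  # full data, kept for reset and Download ALL
        self.filtered = list(rows)   # current view shown to the user

    def search(self, query: str):
        # Always filter from the original rows, not from the previous result.
        needle = query.casefold()
        self.filtered = [
            row for row in self._original if needle in row["text"].casefold()
        ]
        return self.filtered

    def reset(self):
        self.filtered = list(self._original)
        return self.filtered

    def download_all(self):
        return list(self._original)

    def download_filtered(self):
        return list(self.filtered)


if __name__ == "__main__":
    rows = [{"text": "Hello world"}, {"text": "Goodbye world"}, {"text": "Hello again"}]
    state = TranscriptSearch(rows)
    print(len(state.search("hello")))    # matches from the full data
    print(len(state.search("goodbye")))  # still searches the full data
    print(len(state.reset()))            # original data restored
```

With this split, Download ALL reads from the original rows and Download Filtered reads from the current view, which matches the two buttons described in the list.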