Today is a FastAPI-based application that enables users to transcribe and store their diary entries through an online speech-to-text system.
- Speech-to-Text Transcription: Uses Whisper ASR for converting spoken words into text.
- Real-time Processing: Utilizes WebSocket connections to process audio streams in real time.
- Diary Saving & Retrieval: Stores diary entries with timestamps and enables users to retrieve past entries.
- Grammar Correction: Uses an LLM to correct grammatical mistakes in diary entries. You can use any open-source LLM you like.
- Privacy: Completely private to you. No external APIs are used.
- Whisper ASR: Automatic Speech Recognition model for transcriptions.
- vLLM: Integration for LLM-based text processing.
- FastAPI: API framework for building scalable applications.
- WebSockets: For real-time communication with the transcription backend.
Ensure you have the following installed:
- Python 3.12
- FFmpeg
- Whisper ASR dependencies
- Hugging Face API Token (for LLM access)
- cuDNN
- Docker and Docker Compose (for containerized setup)
- Make sure you have Docker and Docker Compose installed.
- Clone the repo: `git clone https://github.com/Mahmoud-ghareeb/today.git`
- Create a `.env` file for your Hugging Face token: `cp .env.example .env`
- Add your HF token to the `.env` file.
- Build and run the container: `docker compose up --build -d`
- The application will automatically start when you run `docker-compose up`. Access it at:
- Web Interface: http://localhost:8008
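After `docker compose up --build -d` returns, the server may still need some time before it responds. The following is a minimal sketch, not part of the repository, of a standard-library check that polls the web interface address listed above. The function names `is_up` and `wait_until_up` are hypothetical helpers introduced here for illustration.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` gets an HTTP response with status < 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 404 or 503).
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


def wait_until_up(url: str, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll `url` up to `attempts` times, sleeping `delay` seconds after each failure."""
    for _ in range(attempts):
        if is_up(url):
            return True
        time.sleep(delay)
    return False


# Example (assumes the default address from the Docker setup):
#   wait_until_up("http://localhost:8008")
```

If the check keeps failing, `docker compose logs` is usually the quickest way to see what the container is doing.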
- Create a conda environment: `conda create -n today python==3.12`, then activate it with `conda activate today`
- Install PyTorch: `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126`
- Install the system dependencies: `sudo apt install portaudio19-dev python3-pyaudio`
- Install the requirements: `pip install -r requirements.txt`
- Install cuDNN.
- Set up environment variables: make a copy of the `.env.example` file, rename it to `.env`, and fill in the required information.
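Before starting the server, it can help to confirm that `.env` actually contains the values you filled in. The sketch below is an illustration only: it parses simple `KEY=VALUE` lines with the standard library and reports which required keys are missing or empty. The key name `HF_TOKEN` in the usage comment is an assumption; check `.env.example` for the real variable names.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty in `env`."""
    return [key for key in required if not env.get(key)]


# Example (HF_TOKEN is a hypothetical key name):
#   env = parse_env(Path(".env").read_text())
#   print(missing_keys(env, ["HF_TOKEN"]))
```

This handles only the plain `KEY=VALUE` form; it does not implement multi-line values or variable expansion.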
To start the FastAPI server, run `python main.py`.
- `/asr`: Accepts audio streams and transcribes them in real time.
- `GET /`: Serves the static HTML interface.
- `POST /save`: Saves a diary entry.
- `GET /get`: Retrieves a diary entry by timestamp.
- `POST /diary`: Converts raw transcription into a diary format.
- `POST /correct_mistakes`: Corrects grammatical mistakes in the text.
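The HTTP endpoints above can also be called from a script. The following is a rough sketch using only the standard library. The request shapes are assumptions, not taken from the project: the JSON field `text`, the query parameter `timestamp`, the JSON response format, and the base URL (borrowed from the Docker setup) may all differ from what the server expects, so check `main.py` before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the server listens on the address given for the Docker setup.
BASE_URL = "http://localhost:8008"


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for an endpoint such as /save or /correct_mistakes."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_get(path: str, params: dict) -> urllib.request.Request:
    """Build a GET request with URL-encoded query parameters, e.g. for /get."""
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(f"{BASE_URL}{path}?{query}", method="GET")


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the body, assuming the server replies with JSON."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (field and parameter names are hypothetical):
#   send(build_post("/correct_mistakes", {"text": "i has a good day"}))
#   send(build_get("/get", {"timestamp": "2025-01-01"}))
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before anything reaches the server.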
- Open the application interface in a browser.
- Start speaking, and the app will transcribe in real time.
- Save the transcribed text as a diary entry.
- Retrieve and edit past entries as needed.
- Use the AI-powered grammar correction and diary formatting features.
- Correct Mistakes
- Diary Formatting
- Docker Support
- Dark mode
- Chat feature
- Support for multiple languages.
Contributions are welcome! Feel free to open an issue or submit a pull request.
For any questions or support, reach out to mahmoudghareeb11111@gmail.com.
