TODAY

Today is a FastAPI-based application that enables users to transcribe and store their diary entries through an online speech-to-text system.

Features

Speech-to-Text Transcription: Uses Whisper ASR for converting spoken words into text.
Real-time Processing: Utilizes WebSocket connections to process audio streams in real time.
Diary Saving & Retrieval: Stores diary entries with timestamps and enables users to retrieve past entries.
Grammar Correction: Uses an LLM to correct grammatical mistakes in diary entries. You can use any open-source LLM you like.
Privacy: Completely private to you. No external APIs are used.

Technologies Used

Whisper ASR: Automatic Speech Recognition model for transcriptions.
vLLM: Integration for LLM-based text processing.
FastAPI: API framework for building scalable applications.
WebSockets: For real-time communication with the transcription backend.

Installation

Prerequisites

Ensure you have the following installed:

Python 3.12
FFmpeg
Whisper ASR dependencies
Hugging Face API Token (for LLM access)
cuDNN
Docker and Docker Compose (for containerized setup)

Setup

Option 1: Docker Installation (Recommended)

Make sure you have Docker and Docker Compose installed.
Clone the repo and enter the project directory:

git clone https://github.com/Mahmoud-ghareeb/today.git
cd today

Create a .env file with your Hugging Face token:

cp .env.example .env

Add your Hugging Face token to the .env file.
Build and run the container:

docker compose up --build -d

The application starts automatically when you run docker compose up. Access it at:

Web Interface: http://localhost:8008
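If the application does not come up, one thing worth ruling out is a missing or empty token in .env. The following is a minimal sketch of such a check, not part of the project. The key name HF_TOKEN is an assumption; check .env.example in the repository for the variable name the application actually reads.

```python
from pathlib import Path

# Assumption: the token is stored under a key such as HF_TOKEN.
# Check .env.example in the repository for the real variable name.
TOKEN_KEY = "HF_TOKEN"


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and surrounding quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_token(env_text: str, key: str = TOKEN_KEY) -> bool:
    """Return True if the key is present with a non-empty value."""
    return bool(parse_env(env_text).get(key))


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found; copy .env.example to .env first.")
    elif has_token(env_path.read_text()):
        print(f"{TOKEN_KEY} is set.")
    else:
        print(f"{TOKEN_KEY} is missing or empty in .env.")
```

Run it from the project directory. It only inspects the file and never prints the token itself.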

Option 2: Local Installation

Create a conda environment:

conda create -n today python=3.12
conda activate today

Install PyTorch:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126

Install the system dependencies:

sudo apt install portaudio19-dev python3-pyaudio

Install the Python requirements:

pip install -r requirements.txt

Install cuDNN.
Set up environment variables: copy .env.example to .env and fill in the required information.
Start the FastAPI server:

python main.py
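For the local installation, it can be useful to confirm that PyTorch sees the GPU and cuDNN before starting the server. The sketch below is a standalone check, not part of the project. The function name gpu_report is made up for illustration, and the sketch falls back gracefully if PyTorch is not installed.

```python
def gpu_report() -> dict:
    """Collect basic facts about the local PyTorch, CUDA, and cuDNN setup."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "cudnn_available": torch.backends.cudnn.is_available(),
        # version() returns None when cuDNN cannot be found.
        "cudnn_version": torch.backends.cudnn.version(),
    }


if __name__ == "__main__":
    for name, value in gpu_report().items():
        print(f"{name}: {value}")
```

If cuda_available or cudnn_available is reported as False, the PyTorch build or the cuDNN installation is the first place to look.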

API Endpoints

WebSocket Endpoint

/asr: Accepts audio streams and transcribes them in real time.
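As a rough illustration of streaming to this endpoint, the sketch below splits audio bytes into fixed-size chunks and sends them as binary WebSocket frames using the third-party websockets package. Several details are assumptions, not documented behavior: the port (taken from the Docker setup), the chunk size, the audio encoding the server expects, and the format of the replies. The names iter_chunks and stream_file are hypothetical.

```python
import asyncio
from collections.abc import Iterator

# Assumptions: the server listens on the Docker port from this README and
# /asr accepts binary frames. Encoding, chunk size, and reply format are
# not documented here, so treat them as guesses.
ASR_URL = "ws://localhost:8008/asr"
CHUNK_SIZE = 4096  # bytes per frame; hypothetical value


def iter_chunks(data: bytes, size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(data), size):
        yield data[start:start + size]


async def stream_file(path: str, url: str = ASR_URL) -> None:
    """Send a file's bytes to the endpoint and print whatever comes back."""
    import websockets  # third-party: pip install websockets
    from websockets.exceptions import ConnectionClosed

    with open(path, "rb") as f:
        audio = f.read()
    async with websockets.connect(url) as ws:
        for chunk in iter_chunks(audio):
            await ws.send(chunk)
        # Print replies until the server is quiet for 5 s or closes the socket.
        try:
            while True:
                print(await asyncio.wait_for(ws.recv(), timeout=5))
        except (asyncio.TimeoutError, ConnectionClosed):
            pass
```

With the server running, something like asyncio.run(stream_file("sample.wav")) would exercise the endpoint; sample.wav is a placeholder file name.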

REST Endpoints

GET /: Serves the static HTML interface.
POST /save: Saves a diary entry.
GET /get: Retrieves a diary entry by timestamp.
POST /diary: Converts raw transcription into a diary format.
POST /correct_mistakes: Corrects grammatical mistakes in the text.
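The REST endpoints can be called from any HTTP client. The sketch below builds requests for /save and /get with the Python standard library. The request shapes are assumptions: this README does not document the JSON field names (text, timestamp) or the timestamp query parameter, so check the route definitions in main.py before relying on them. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8008"  # Docker port from this README


def save_request(text: str, timestamp: str, base: str = BASE_URL) -> Request:
    """Build POST /save. The JSON field names are hypothetical."""
    body = json.dumps({"text": text, "timestamp": timestamp}).encode("utf-8")
    return Request(
        f"{base}/save",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_request(timestamp: str, base: str = BASE_URL) -> Request:
    """Build GET /get. The `timestamp` query parameter is hypothetical."""
    query = urlencode({"timestamp": timestamp})
    return Request(f"{base}/get?{query}", method="GET")


def send(request: Request) -> str:
    """Send a request to a running server and return the response body."""
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it makes the payload easy to inspect. With the server running, send(save_request("Dear diary", "2025-01-01")) would perform the call.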

Usage

Open the application interface in a browser.
Start speaking, and the app will transcribe in real time.
Save the transcribed text as a diary entry.
Retrieve and edit past entries as needed.
Use the AI-powered grammar correction and diary formatting features.

Under Development

Correct Mistakes
Diary Formatting
Docker Support

Todos

Dark mode
Chat feature

Future Enhancements

Support for multiple languages.

Contributions

Contributions are welcome! Feel free to open an issue or submit a pull request.

Contact

For any questions or support, reach out to mahmoudghareeb11111@gmail.com.
