A scalable microservice architecture for audio transcription built with Python, Flask, and Kubernetes. Users upload audio files, which are automatically transcribed to text with the OpenAI Whisper model, and an email notification is sent when transcription completes.
flowchart TD
Client((User))
Gateway[Gateway Service]
Auth[Auth Service]
Transcription[Transcription Service]
Notification[Notification Service]
MongoDB[(MongoDB)]
MySQL[(MySQL)]
RabbitMQ[(RabbitMQ)]
Client -->|Upload/Download| Gateway
Gateway -->|Auth Request| Auth
Gateway -->|Send Audio| RabbitMQ
Auth -->|User Data| MySQL
RabbitMQ -->|Audio Task| Transcription
Transcription -->|Save Transcription| MongoDB
Transcription -->|Notify| RabbitMQ
Notification -->|Send Email| Client
Gateway -->|Get Transcription| MongoDB
RabbitMQ -->|Send Notify| Notification
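
The diagram shows two hops through RabbitMQ: Gateway to Transcription, and Transcription to Notification. As a rough illustration of what those two messages might carry, here is a minimal sketch using only the standard library. The queue names, class names, and field names are assumptions made for this example and are not taken from the project code.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical queue names, chosen for illustration only.
AUDIO_QUEUE = "audio"
NOTIFY_QUEUE = "notify"


@dataclass
class AudioTask:
    """Published by the gateway, consumed by the transcription service."""
    audio_id: str    # identifier of the stored audio file
    user_email: str  # address for the completion email


@dataclass
class NotifyTask:
    """Published by the transcription service, consumed by the notification service."""
    audio_id: str
    transcript_id: str  # identifier of the stored transcription
    user_email: str


def encode(message) -> bytes:
    """Serialize a message dataclass to UTF-8 JSON bytes for the broker."""
    return json.dumps(asdict(message)).encode("utf-8")


def decode_audio_task(body: bytes) -> AudioTask:
    """Rebuild an AudioTask from a message body."""
    return AudioTask(**json.loads(body.decode("utf-8")))


def decode_notify_task(body: bytes) -> NotifyTask:
    """Rebuild a NotifyTask from a message body."""
    return NotifyTask(**json.loads(body.decode("utf-8")))
```

Passing identifiers rather than raw audio bytes keeps the messages small; the audio itself would stay in storage and be fetched by the consumer.
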
This project implements a microservice architecture with the following components:
- Gateway Service - API gateway handling client requests, authentication, and file uploads/downloads
- Auth Service - User authentication and JWT token management
- Transcription Service - Audio-to-text transcription using the OpenAI Whisper model
- Notification Service - Email notifications for completed transcriptions
- MongoDB - Document storage for audio files and transcription results
- RabbitMQ - Message queue for asynchronous processing
- MySQL - User authentication data storage
- Kubernetes - Container orchestration and deployment
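
To make the Auth Service's role more concrete, the following is a minimal sketch of issuing and verifying an HS256-signed JWT using only the standard library. It is an illustration, not the project's implementation: the source does not say how tokens are built, a real service would more likely use an established JWT library, and `SECRET`, `issue_token`, and `verify_token` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"change-me"  # hypothetical signing key; a real deployment would load it from configuration


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(email: str, ttl_seconds: int = 3600, now: Optional[float] = None) -> str:
    """Return a signed token carrying the user's email and an expiry time."""
    now = time.time() if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": email, "exp": int(now) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is unexpired, else None."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(SECRET, signing_input, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Covers malformed structure, bad base64, and invalid JSON.
        return None
    if claims.get("exp", 0) < now:
        return None
    return claims
```

In this sketch the gateway would call something like `verify_token` (or forward the token to the Auth Service) before accepting an upload or download request.
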

A transcription request passes through the following steps:
- Upload: The user uploads an audio file through the Gateway API
- Queue: The audio file is queued in RabbitMQ for processing
- Transcribe: The Transcription Service processes the audio and generates text
- Store: The transcribed text is saved to MongoDB
- Notify: The user receives an email notification when transcription is complete
- Download: The user can download the transcription results via the API
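
The steps above can be sketched as a small in-process simulation. Everything here is a stand-in chosen for illustration: `queue.Queue` replaces RabbitMQ, plain dictionaries replace MongoDB, a list replaces the email sender, and `fake_transcribe` replaces the Whisper call. All function names are hypothetical.

```python
import queue

audio_queue = queue.Queue()   # stands in for the RabbitMQ audio queue
notify_queue = queue.Queue()  # stands in for the RabbitMQ notification queue
audio_store = {}              # stands in for MongoDB audio storage
transcripts = {}              # stands in for MongoDB transcription storage
outbox = []                   # stands in for sent emails


def fake_transcribe(audio: bytes) -> str:
    """Placeholder for the speech-to-text model; only reports the audio size."""
    return f"<transcript of {len(audio)} bytes>"


def upload(audio_id: str, audio: bytes, email: str) -> None:
    """Upload + Queue: store the audio and enqueue a task that references it."""
    audio_store[audio_id] = audio
    audio_queue.put({"audio_id": audio_id, "email": email})


def transcription_worker(transcribe=fake_transcribe) -> None:
    """Transcribe + Store: drain the audio queue, save text, request a notification."""
    while not audio_queue.empty():
        task = audio_queue.get()
        transcripts[task["audio_id"]] = transcribe(audio_store[task["audio_id"]])
        notify_queue.put(task)


def notification_worker() -> None:
    """Notify: drain the notification queue and record one email per task."""
    while not notify_queue.empty():
        task = notify_queue.get()
        outbox.append((task["email"], f"Transcription {task['audio_id']} is ready"))


def download(audio_id: str):
    """Download: return the stored transcript, or None if it is not ready yet."""
    return transcripts.get(audio_id)
```

Calling `upload(...)`, then `transcription_worker()` and `notification_worker()`, and finally `download(...)` walks one file through all six steps. Because the upload only enqueues work, `download` returns `None` until the transcription worker has run, which mirrors the asynchronous behavior the queue provides.
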