This project is an AI voice generation platform that combines three models: StyleTTS2 for text-to-speech, SEED-VC for voice cloning, and Make-An-Audio for audio generation. It pairs a Next.js frontend with a separate backend service for each model.
- Text-to-Speech with StyleTTS2
- Voice Cloning with SEED-VC
- Audio Generation with Make-An-Audio
- Modern Next.js Frontend with TypeScript
- Docker-based deployment
- GPU-accelerated AI models
- User authentication and management
- File storage with AWS S3
- Node.js 18+ and pnpm 9.12.1
- Docker and Docker Compose
- NVIDIA GPU with CUDA support
- AWS Account (for S3 storage)
- PostgreSQL database
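
The Node.js requirement above can be checked from a script before installing anything. The following is a minimal sketch, not code from this repository: `meetsMinimumNode` is a hypothetical helper, and it only inspects the major version number.

```typescript
// Hypothetical helper, not part of the repository.
// Returns true when a Node.js version string such as "18.17.0" or "v20.1.0"
// has a major version of at least `minMajor`.
function meetsMinimumNode(version: string, minMajor: number = 18): boolean {
  // parseInt stops at the first ".", so only the major number is read.
  const major = parseInt(version.replace(/^v/, ""), 10);
  // An unparsable string yields NaN, and NaN >= minMajor is false.
  return major >= minMajor;
}

// Usage inside a Node.js script:
//   if (!meetsMinimumNode(process.versions.node)) {
//     throw new Error("Node.js 18 or newer is required");
//   }
```
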
- Clone the repository:
git clone https://github.com/khaireddine-arbouch/11labs-clone.git
cd 11labs-clone
- Install frontend dependencies:
cd 11labs-clone-frontend
pnpm install
- Set up environment variables:
Create a .env file in the frontend directory with:
DATABASE_URL="postgresql://user:password@localhost:5432/dbname"
NEXTAUTH_SECRET="your-secret"
NEXTAUTH_URL="http://localhost:3000"
AWS_ACCESS_KEY_ID="your-aws-key"
AWS_SECRET_ACCESS_KEY="your-aws-secret"
AWS_REGION="your-aws-region"
AWS_BUCKET_NAME="your-bucket-name"
- Initialize the database:
pnpm db:generate
pnpm db:push
- Start the development server:
pnpm dev

The project uses Docker Compose to manage multiple services:
- StyleTTS2 API (Port 8000)
- SEED-VC API (Port 8001)
- Make-An-Audio API (Port 8002)
To start all services:
docker-compose up -d

.
├── 11labs-clone-frontend/ # Next.js frontend application
│ ├── src/ # Source code
│ ├── prisma/ # Database schema and migrations
│ └── public/ # Static assets
├── StyleTTS2/ # StyleTTS2 voice generation service
├── seed-vc/ # SEED-VC voice cloning service
├── Make-An-Audio/ # Audio generation service
└── docker-compose.yml # Docker services configuration
- Next.js 15.2.3
- React 19.0.0
- TypeScript
- TailwindCSS 4.0.15
- tRPC
- Prisma
- NextAuth.js
- Zustand for state management
- StyleTTS2 for text-to-speech
- SEED-VC for voice cloning
- Make-An-Audio for audio generation
- All services are containerized with Docker
- StyleTTS2 API endpoint: http://localhost:8000 (GPU-accelerated text-to-speech generation)
- SEED-VC API endpoint: http://localhost:8001 (voice cloning capabilities)
- Make-An-Audio API endpoint: http://localhost:8002 (audio generation and manipulation)
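
On the frontend side, the three base URLs can be kept in one typed map so that a mistyped service name fails at compile time. This is a minimal sketch, not code from this repository: `SERVICES`, `ServiceName`, and `serviceUrl` are hypothetical names, and the request paths each service exposes are not documented here, so any path passed in is an assumption.

```typescript
// Base URLs taken from the endpoint list above.
const SERVICES = {
  styleTTS2: "http://localhost:8000",
  seedVC: "http://localhost:8001",
  makeAnAudio: "http://localhost:8002",
} as const;

type ServiceName = keyof typeof SERVICES;

// Hypothetical helper: joins a service base URL and a request path.
// A missing leading slash on the path is added automatically.
function serviceUrl(service: ServiceName, path: string = "/"): string {
  const suffix = path.startsWith("/") ? path : `/${path}`;
  return `${SERVICES[service]}${suffix}`;
}
```

Calling `serviceUrl("seedVC")` returns the SEED-VC base URL with a trailing slash, while a name outside the map, such as `serviceUrl("tacotron")`, is rejected by the type checker.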
- All API endpoints are protected with authentication
- AWS S3 is used for secure file storage
- Environment variables for sensitive data
- HTTPS enforced in production
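
Because sensitive settings live in environment variables, a startup check can fail fast when one of them is missing. This is a minimal sketch, not the project's own validation: `REQUIRED_ENV` and `findMissingEnv` are hypothetical names, and the variable list is copied from the .env example in the setup steps.

```typescript
// Variable names copied from the .env example in the setup steps.
const REQUIRED_ENV = [
  "DATABASE_URL",
  "NEXTAUTH_SECRET",
  "NEXTAUTH_URL",
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_REGION",
  "AWS_BUCKET_NAME",
] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the names of required variables that are
// unset or blank, so the caller can report all of them at once.
function findMissingEnv(env: Env): string[] {
  return REQUIRED_ENV.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Usage at startup in a Node.js process:
//   const missing = findMissingEnv(process.env);
//   if (missing.length > 0) {
//     throw new Error(`Missing environment variables: ${missing.join(", ")}`);
//   }
```

Returning the full list, rather than throwing on the first gap, lets a new contributor fix the .env file in one pass.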
- Build the frontend:
cd 11labs-clone-frontend
pnpm build
- Start the Docker services:
docker-compose up -d
- The application will be available at:
- Frontend: http://localhost:3000
- StyleTTS2 API: http://localhost:8000
- SEED-VC API: http://localhost:8001
- Make-An-Audio API: http://localhost:8002
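
After startup, a short script can confirm that all four addresses answer. This is a minimal sketch under stated assumptions: `TARGETS`, `Fetcher`, and `unreachableTargets` are hypothetical names, any HTTP response (even an error status) is treated as "reachable", and no timeout is applied, so a caller who needs one should build it into the fetcher they pass in.

```typescript
interface Target {
  name: string;
  url: string;
}

// URLs taken from the list above.
const TARGETS: Target[] = [
  { name: "Frontend", url: "http://localhost:3000" },
  { name: "StyleTTS2 API", url: "http://localhost:8000" },
  { name: "SEED-VC API", url: "http://localhost:8001" },
  { name: "Make-An-Audio API", url: "http://localhost:8002" },
];

// Minimal shape the check needs from an HTTP client. The global fetch in
// Node.js 18+ satisfies it: (url) => fetch(url).
type Fetcher = (url: string) => Promise<{ status: number }>;

// Hypothetical helper: resolves to the names of targets whose request
// rejected (for example, connection refused). Requests run in parallel.
async function unreachableTargets(
  targets: Target[],
  fetcher: Fetcher,
): Promise<string[]> {
  const checks = await Promise.all(
    targets.map(async (target) => {
      try {
        await fetcher(target.url);
        return null; // got some HTTP response
      } catch {
        return target.name; // network-level failure
      }
    }),
  );
  return checks.filter((name): name is string => name !== null);
}

// Usage in a Node.js 18+ script:
//   unreachableTargets(TARGETS, (url) => fetch(url)).then((down) => {
//     console.log(down.length === 0 ? "All services reachable" : `Down: ${down.join(", ")}`);
//   });
```

Passing the fetcher in as a parameter keeps the check testable without real network access.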
- StyleTTS2: https://github.com/yl4579/StyleTTS2
- SEED-VC: https://github.com/Plachtaa/seed-vc
- Make-An-Audio: https://github.com/Text-to-Audio/Make-An-Audio