An AI-powered technical interview practice platform with real-time voice interaction
Quick Start • Documentation • Installation • Contributing
Practice technical interviews with AI-powered feedback and real-time voice interaction
Speak naturally while coding, just like a real interview
HT6 Interview Agent is an AI-powered platform designed to help developers practice technical interviews in a realistic environment. Using advanced AI agents powered by Google's Gemini model, the platform provides interactive coding challenges with voice-enabled communication, real-time feedback, and comprehensive performance analysis.
- AI-Powered Interview Agent: Interactive interviewer using Google Gemini 2.5 Flash
- Voice Recognition & TTS: Real-time speech-to-text and text-to-speech capabilities
- Multi-Language Code Editor: Support for Python, JavaScript, Java, and C++ with syntax highlighting
- Real-Time Timer: Track your interview performance with live timing
- Coding Problem Database: Curated collection of technical interview problems
- Performance Analysis: Detailed feedback and performance metrics
- WebSocket Integration: Real-time communication between frontend and backend
- Responsive Design: Modern, clean UI built with React and Tailwind CSS
- Clone the repository
    git clone https://github.com/eddywang4340/HT6-interview-agent.git
    cd HT6-interview-agent
- Set up environment variables
    # Create .env file in the backend directory
    echo "GEMINI_API_KEY=your_gemini_api_key_here" > backend/.env
- Start the backend
    cd backend
    pip install -r requirements.txt
    uvicorn app.main:app --reload
- Start the frontend (in a second terminal, from the repository root, since the backend server keeps running in the first)
    cd frontend/interview-agent-frontend
    npm install
    npm run dev
- Open your browser and navigate to http://localhost:5173
- Python 3.10+
- Node.js 18+
- npm or pnpm
- Google Gemini API key
    cd backend
    # Create virtual environment
    python -m venv venv
    source venv/bin/activate # On Windows: venv\Scripts\activate
    # Install dependencies
    pip install -r requirements.txt
    # Set up environment variables
    cp .env.example .env
    # Edit .env with your configuration

    cd frontend/interview-agent-frontend
    # Install dependencies
    npm install
    # or with pnpm
    pnpm install
    # Start development server
    npm run dev

Create a .env file in the backend directory:

    GEMINI_API_KEY=your_gemini_api_key_here
    DATABASE_URL=postgresql://username:password@localhost/dbname
    DEBUG=true

| Variable | Description | Required |
|---|---|---|
| GEMINI_API_KEY | Google Gemini API key for AI agent | Yes |
| DATABASE_URL | PostgreSQL connection string | Yes |
| DEBUG | Enable debug mode | No |
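
As an illustration of how the backend might validate these variables at startup, here is a minimal sketch. It is not taken from the project's code: `Settings` and `load_settings` are hypothetical names, and the sketch assumes the variables have already been placed in the process environment (for example by the shell or by a `.env` loader).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables in the table above."""
    gemini_api_key: str
    database_url: str
    debug: bool = False


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping, defaulting to os.environ."""
    env = os.environ if env is None else env
    # Fail fast when a variable marked "Required" is absent or empty.
    required = ("GEMINI_API_KEY", "DATABASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    # DEBUG is optional; treat a few common truthy spellings as True.
    debug = env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"}
    return Settings(env["GEMINI_API_KEY"], env["DATABASE_URL"], debug)
```

Failing at startup with the names of the missing variables is usually easier to debug than a connection error that surfaces later during an interview session.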
- Select Interview Settings: Choose difficulty level, programming language, and interview duration
- Start Interview: Begin with an AI interviewer that will guide you through the process
- Solve Problems: Write code in the integrated editor while discussing your approach
- Voice Interaction: Use voice commands to communicate naturally with the AI interviewer
- Get Feedback: Receive real-time feedback and suggestions from the AI agent
- Review Results: Analyze your performance and areas for improvement
HT6-interview-agent/
├── backend/                      # FastAPI backend
│   ├── app/
│   │   ├── agent/                # AI agents (interview & feedback)
│   │   ├── core/                 # Core configuration
│   │   ├── db/                   # Database models and connection
│   │   └── main.py               # FastAPI application entry point
│   └── requirements.txt          # Python dependencies
└── frontend/                     # React frontend
    └── interview-agent-frontend/
        ├── src/
        │   ├── components/       # React components
        │   ├── hooks/            # Custom React hooks
        │   ├── pages/            # Page components
        │   └── types/            # TypeScript type definitions
        └── package.json          # Node.js dependencies
- Interview Agent: Handles AI-powered interview interactions using Google Gemini
- Feedback Agent: Provides performance analysis and coding feedback
- TTS Service: Text-to-speech functionality for voice responses
- WebSocket Manager: Real-time communication between client and server
- Code Editor: Multi-language code editor with syntax highlighting
REST endpoints:
- GET /problems - Retrieve coding problems
- GET /problems/random - Get a random problem
- POST /interview/start - Start a new interview session
- POST /interview/submit - Submit code solution
- WebSocket /ws/{client_id} - Real-time communication

WebSocket events:
- interview_start - Begin interview session
- code_update - Update code in real-time
- voice_message - Send voice transcription
- ai_response - Receive AI agent response
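
To make the event list concrete, the sketch below shows one way such messages could be encoded and validated as JSON. The wire format is an assumption: this README does not specify it, so the `type`/`payload` envelope and the helper names `encode_event` and `decode_event` are hypothetical. Only the four event names come from the list above.

```python
import json

# Event names from the list above; the envelope shape is an assumption.
KNOWN_EVENTS = {"interview_start", "code_update", "voice_message", "ai_response"}


def encode_event(event_type: str, payload: dict) -> str:
    """Serialize an event as a JSON envelope with 'type' and 'payload' keys."""
    if event_type not in KNOWN_EVENTS:
        raise ValueError(f"unknown event type: {event_type}")
    return json.dumps({"type": event_type, "payload": payload})


def decode_event(raw: str) -> tuple[str, dict]:
    """Parse a JSON envelope and reject event types that are not recognized."""
    message = json.loads(raw)
    event_type = message.get("type")
    if event_type not in KNOWN_EVENTS:
        raise ValueError(f"unknown event type: {event_type}")
    return event_type, message.get("payload", {})
```

A client would send the string returned by `encode_event("code_update", {"code": "..."})` over the `/ws/{client_id}` connection, and the receiving side would call `decode_event` before dispatching on the event type.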
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
The Problem: Traditional interview prep focuses on pattern recognition and neglects the "meta-skills" that real technical interviews depend on: communication, strategic questioning, and hint extraction.
Our Solution: Diolex is a voice-first AI-powered simulator designed to train these essential meta-skills. Our AI interviewer:
- Watches code in real-time: Provides contextual feedback based on your approach.
- Teaches strategic questioning: Withholds information, prompting you to ask clarifying questions.
- Simulates authentic dynamics: Engages in follow-up questions, hint extraction, and edge-case discussions.
- Provides detailed analysis: Offers specific feedback on communication and problem-solving.
Frontend:
- React + TypeScript: For robust, type-safe components.
- Tailwind CSS: For rapid, responsive styling.
- CodeMirror 6: Provides a syntax-highlighted code editor.
- Custom WebSocket Hooks & Speech Recognition API: Enables real-time, bidirectional communication and continuous voice input.
- React Router: Manages seamless navigation.
Backend:
- FastAPI: A high-performance asynchronous API.
- WebSockets: For real-time voice and text communication.
- Custom Interview Agent: A structured, stateful AI with authentic interviewer persona.
- Kokoro TTS: For natural-sounding spoken feedback.
- Piston API: Secure sandboxed code execution.
- SQLAlchemy + PostgreSQL: For reliable data persistence.
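
For readers unfamiliar with Piston, the following sketch builds a request for its v2 `execute` endpoint using only the standard library. It is an illustration under stated assumptions, not the project's integration code: the public endpoint URL is an assumption (a self-hosted instance would use its own), `build_execute_request` is a hypothetical helper, and the request fields should be checked against the Piston documentation for the version in use.

```python
import json
import urllib.request

# Assumption: the public Piston v2 endpoint. A self-hosted instance differs.
PISTON_EXECUTE_URL = "https://emkc.org/api/v2/piston/execute"


def build_execute_request(
    language: str, version: str, source: str, stdin: str = ""
) -> urllib.request.Request:
    """Build (but do not send) a POST request asking Piston to run `source`."""
    body = {
        "language": language,
        "version": version,
        "files": [{"content": source}],
        "stdin": stdin,
    }
    return urllib.request.Request(
        PISTON_EXECUTE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(request, timeout=10)` and decoding the JSON response should yield the program's output under a `run` object, though that response shape is also worth confirming against the Piston documentation. The version string passed in (for example `"3.10.0"`) is a placeholder and must match a runtime installed on the target instance.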
AI & Voice Technology:
- Fine-tuned Outputs & Context-Aware Responses: Keeps conversations authentic and adaptive by grounding each reply in the candidate's code and the conversation history.
- Multi-modal Interaction: Supports both voice and text.
- Intelligent Hint Distribution: Provides strategic guidance without giving away answers.
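
The idea of grounding each reply in the candidate's code and conversation history can be sketched as follows. This is a simplified illustration, not the project's agent: `InterviewContext`, its fields, and the prompt wording are all hypothetical, and a real system would pass the resulting prompt to the model rather than return it.

```python
from dataclasses import dataclass, field


@dataclass
class InterviewContext:
    """Hypothetical per-session state: latest code plus recent dialogue."""
    language: str
    code_snapshot: str = ""
    history: list = field(default_factory=list)  # (speaker, text) pairs
    max_turns: int = 6  # how many recent turns to include in each prompt

    def update_code(self, code: str) -> None:
        """Replace the stored snapshot with the editor's current contents."""
        self.code_snapshot = code

    def record_reply(self, reply: str) -> None:
        """Store the interviewer's reply so later prompts can refer to it."""
        self.history.append(("interviewer", reply))

    def build_prompt(self, candidate_message: str) -> str:
        """Attach the code snapshot and recent turns to the new message."""
        self.history.append(("candidate", candidate_message))
        recent = self.history[-self.max_turns:]
        transcript = "\n".join(f"{speaker}: {text}" for speaker, text in recent)
        code = self.code_snapshot or "(empty)"
        return (
            f"Current {self.language} code:\n{code}\n\n"
            f"Recent conversation:\n{transcript}\n\n"
            "Respond as the interviewer."
        )
```

Capping the history at `max_turns` keeps the prompt size bounded during long sessions, at the cost of forgetting older turns.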
- Real-time Voice + Code Synchronization: We built a WebSocket message queue with prioritization and conflict resolution so that speech recognition, code editing, and AI responses stay in sync.
- Context-Aware AI Responses: Developed a dynamic context injection system that sends code snapshots with every message, allowing the AI to intelligently reference your live implementation.
- Authentic Interview Simulation: Achieved realistic AI behavior through extensive prompt engineering, multi-phase interview logic, information withholding strategies, and natural conversation flow patterns.
- Cross-browser Speech Recognition: Implemented robust fallback mechanisms, automatic restart logic, and graceful degradation to text-only mode to counter browser inconsistencies.
- Low-latency Voice Responses: Streamed TTS with chunk-based audio playback and WebSocket message prioritization to minimize delay for natural conversation flow.
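
A message queue with prioritization and simple conflict resolution, as described in the list above, might look like the sketch below. It is one possible design, not the project's implementation: the `OutboundQueue` class, the priority values, and the rule that a newer `code_update` supersedes older ones are all assumptions made for illustration.

```python
import heapq
import itertools

# Hypothetical priorities: a lower number is delivered first.
PRIORITY = {"ai_response": 0, "voice_message": 1, "code_update": 2}
DEFAULT_PRIORITY = 3


class OutboundQueue:
    """Priority queue that drops code updates superseded by a newer one."""

    def __init__(self):
        self._heap = []
        # Monotonic counter: preserves FIFO order within a priority level
        # and guarantees heap entries never compare their payloads.
        self._counter = itertools.count()
        self._latest_code_seq = None

    def push(self, event_type, payload):
        seq = next(self._counter)
        if event_type == "code_update":
            self._latest_code_seq = seq  # older code updates are now stale
        priority = PRIORITY.get(event_type, DEFAULT_PRIORITY)
        heapq.heappush(self._heap, (priority, seq, event_type, payload))

    def pop(self):
        """Return the next (event_type, payload), or None when empty."""
        while self._heap:
            _, seq, event_type, payload = heapq.heappop(self._heap)
            if event_type == "code_update" and seq != self._latest_code_seq:
                continue  # skip a snapshot that a later edit replaced
            return event_type, payload
        return None
```

With this ordering, spoken responses are never stuck behind a burst of editor keystrokes, and only the most recent code snapshot is delivered.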
Technical: Mastered WebSocket architecture, advanced speech API integration, AI prompt engineering for conversational AI, React performance optimization, and FastAPI async patterns.
Product: Understood the critical impact of authentic simulation, unique UX considerations for voice interfaces, and the importance of seamless transitions for users.
Startup: Validated our core hypothesis, discovered new use cases (e.g., explaining solutions), and recognized the scalability potential for different interview styles.
Diolex proves the power of AI to authentically simulate complex human interactions. Our vision is a comprehensive interview preparation platform that adapts to diverse company styles, skill levels, and formats, revolutionizing career readiness for developers.