An intelligent Q&A system built on a local large language model, with voice input, text input, multilingual translation, and AI-generated answers.
Voice Input 🎤
- Start/Stop recording buttons
- Real-time recording status
- Speech recognition results display
Text Input ⌨️
- Text input field
- Answer generation button
Translation 🌐
- Multi-language translation support
- Language selection dropdown
- Translation results display
AI Response 💭
- AI-powered answers based on voice/text input
- Intelligent Q&A results display
- Clean and modern user interface
- Real-time mouse tracking animations
- Responsive layout design
- Elegant transition effects
Frontend
- HTML5
- CSS3 (Modern layout, animations)
- JavaScript (Vanilla)
- Web Audio API (Voice recording)
Backend
- Python
- Flask (Web server)
- Ollama (Local large language model)
- Whisper (Speech recognition)
- Python 3.8+
- Ollama
- FFmpeg (for audio processing)
- A modern browser (Chrome, Firefox, or Safari)
- Clone the repository
git clone [repository-url]
cd [repository-name]
- Install Python dependencies
pip install -r requirements.txt
- Install and start Ollama
# Install Ollama (refer to official documentation)
# Pull required model
ollama pull deepseek-r1:1.5b
- Start the application
python app.py
- Access the application
Open http://localhost:5000 in your browser
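
Before starting the application, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of the project itself: the function name `check_prerequisites` is hypothetical, and it assumes that FFmpeg and Ollama are installed as the command-line executables `ffmpeg` and `ollama` on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)                    # minimum version stated in this README
REQUIRED_TOOLS = ("ffmpeg", "ollama")  # assumed executable names


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a dict mapping each prerequisite to True (found) or False."""
    version = tuple((version_info or sys.version_info)[:2])
    report = {"python": version >= MIN_PYTHON}
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Run it with the same interpreter you will use for `python app.py`; any line marked MISSING points to a prerequisite that still needs installing.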
Voice Input
- Click "Start Recording" to begin
- Speak your question
- Click "Stop Recording" to end
- The system automatically transcribes the speech and displays the text
- If the input is in English, a Chinese translation is generated automatically
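
The voice flow above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code in `app.py`: it assumes the backend uses the `openai-whisper` Python package, the function names `transcribe` and `needs_chinese_translation` are hypothetical, and `"base"` is only an example model size.

```python
def transcribe(audio_path, model_name="base"):
    """Transcribe an audio file with Whisper; returns (text, language_code)."""
    # Imported lazily so the helper below works without the package installed.
    import whisper  # assumption: provided by the openai-whisper package

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)  # decodes the audio through FFmpeg
    return result["text"].strip(), result["language"]


def needs_chinese_translation(language_code):
    """Mirror the rule above: only English input is auto-translated."""
    return language_code.lower().startswith("en")
```

Whisper reports the detected language as a short code such as `en`, which is what the second helper inspects.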
Text Input
- Enter your question in the text field
- Click "Generate Answer" to get a response
Translation
- Select the target language from the dropdown menu
- Click "Translate to" to translate
AI Response
- Once the recognized speech or the typed text is displayed, click "Generate Answer" to get the AI response
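
Behind the "Generate Answer" step, the backend has to hand the question to the local model. One way this might look is sketched below using only the standard library and Ollama's HTTP API on its default port. The helper names (`build_payload`, `strip_reasoning`, `ask`) are hypothetical, and stripping `<think>` blocks is an assumption about how the output of `deepseek-r1` would be cleaned up, not a description of what `app.py` does.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address
MODEL = "deepseek-r1:1.5b"  # the model pulled during installation


def build_payload(question, model=MODEL):
    """Build the JSON body for a single, non-streaming generation request."""
    return {"model": model, "prompt": question, "stream": False}


def strip_reasoning(text):
    """Drop <think>...</think> blocks that deepseek-r1 models may emit."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def ask(question, url=OLLAMA_URL, timeout=120):
    """Send the question to a running Ollama server and return the answer text."""
    data = json.dumps(build_payload(question)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return strip_reasoning(body["response"])
```

With Ollama running and the model pulled, `ask("What is Flask?")` should return the answer as plain text. Setting `"stream": False` keeps the sketch simple by asking for one complete JSON response instead of a stream of chunks.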
.
├── app.py # Backend main program
├── requirements.txt # Python dependencies
├── templates/
│ └── index.html # Frontend page
└── README.md # Project documentation
- Optimized user interface layout
- Separated input and output sections
- Improved translation functionality interaction
- Added real-time animation effects
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License. See the LICENSE file for details.
For any questions or suggestions, please reach out through:
- Email: [email protected]