TTS - text to speech
STT - speech to text
LLM - large language model
I decided to write my own "implementation" (if we can call it that) of the original rabbit r1 Android application. The idea is to provide the same (or almost the same) capabilities as the original APK, plus configurability. Let me explain. Any voice interaction with an LLM follows this sequence:
- Translate speech into text
- Pass text to LLM
- (Optionally) Translate the response text into speech
By configurability I mean the ability to choose a separate model for each of those steps. E.g. a) translate speech into text with ElevenLabs, b) use ChatGPT to process the text, c) voice the response from ChatGPT using Deepgram.
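As a rough illustration of that idea, here is a minimal sketch in plain Java of three swappable steps wired into one pipeline. The names `SpeechToText`, `LanguageModel`, `TextToSpeech`, `VoicePipeline` and `PipelineResult` are hypothetical and are not taken from the repository; the actual code may be organized differently.

```java
import java.util.Optional;

// Hypothetical step 1: recorded audio in, transcript out (e.g. ElevenLabs or Deepgram).
interface SpeechToText {
    String transcribe(byte[] audio);
}

// Hypothetical step 2: transcript in, model reply out (e.g. ChatGPT or ollama).
interface LanguageModel {
    String reply(String prompt);
}

// Hypothetical step 3 (optional): reply text in, synthesized audio out.
interface TextToSpeech {
    byte[] synthesize(String text);
}

// Everything one voice interaction produced.
final class PipelineResult {
    final String transcript;
    final String reply;
    final Optional<byte[]> audio; // empty when no text-to-speech step is configured

    PipelineResult(String transcript, String reply, Optional<byte[]> audio) {
        this.transcript = transcript;
        this.reply = reply;
        this.audio = audio;
    }
}

// Runs the steps in order; each provider can be replaced independently.
final class VoicePipeline {
    private final SpeechToText stt;
    private final LanguageModel llm;
    private final Optional<TextToSpeech> tts;

    // Pass null as ttsOrNull to skip the optional speech output.
    VoicePipeline(SpeechToText stt, LanguageModel llm, TextToSpeech ttsOrNull) {
        this.stt = stt;
        this.llm = llm;
        this.tts = Optional.ofNullable(ttsOrNull);
    }

    PipelineResult run(byte[] recordedAudio) {
        String transcript = stt.transcribe(recordedAudio);
        String reply = llm.reply(transcript);
        Optional<byte[]> audio = tts.map(engine -> engine.synthesize(reply));
        return new PipelineResult(transcript, reply, audio);
    }
}
```

Because each step is a single-method interface, a provider can be swapped by passing a different implementation (or a lambda in tests), for example `new VoicePipeline(audio -> "hello", text -> "echo: " + text, null)` for a text-only run.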
Basically, I am trying to build an open source walkie-talkie with AI, keeping the source code as readable as possible so that you can extend it for your own needs. For example, you could use the same application but replace the ChatGPT step with a local ollama instance. Or, more interestingly, point it at a self-hosted Python server that interacts with Trello using LangChain: you host a server that accepts text and does something with your tasks (creates a few, moves some, deletes others).
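To make the self-hosted variant more concrete, here is a small sketch of a client that posts the recognized text to your own server and returns whatever text the server answers with. The class name `SelfHostedBackend`, the endpoint URL, and the plain-text request/response contract are all assumptions made for illustration; your server may expect JSON or a different route.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical client for a self-hosted text endpoint (not part of the repository).
final class SelfHostedBackend {
    private final String endpoint; // e.g. "http://192.168.0.10:8000/ask" (assumed URL)

    SelfHostedBackend(String endpoint) {
        this.endpoint = endpoint;
    }

    // Sends the text as a UTF-8 plain-text POST body and returns the response body.
    String send(String text) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setConnectTimeout(5_000);
            conn.setReadTimeout(30_000);
            conn.setRequestProperty("Content-Type", "text/plain; charset=utf-8");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return new String(readAll(in), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }

    // Reads the stream to the end without relying on newer JDK helpers.
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }
}
```

The sketch uses `HttpURLConnection` because it is available both on Android and on a desktop JDK. On Android the call has to run off the main thread, and a cleartext `http://` endpoint may need to be allowed in the network security configuration.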
- Clone the repository
- Create (if it does not exist) a local.properties file in the repository root
- Add apiChatGptKey and apiEleventLabsKey to it. Example:
...
apiChatGptKey="sk-proj-aaaaa"
apiEleventLabsKey="sk_aaaa"
...
- Run the application on the device
- Write the "bone" application
- Write a minimal usable example
- STT - Add interaction with ElevenLabs
- STT - Add interaction with Deepgram
- LLM - Add interaction with ChatGPT
- LLM - Add interaction with ollama
- TTS - Add interaction with Deepgram
- TTS - Add interaction with ElevenLabs
- Add camera + text interaction
- Extend README
- Add the ability to configure the application using a QR code
- After a clean launch, start with the QR code reader to read the configuration
- Extend README
- Integrate with GitHub CI
- Implement tiles logic - for details check #1
YouTube playlist to track progress (short demos of the current state of the application) - https://www.youtube.com/playlist?list=PL7lKhwkfj7OKGUDMnX-fJg4hEy3f6kg5r