Spritely AI 🧚🏼‍♀️ ✨ 🎙️

Free humanity from the keyboard - Your AI-powered voice companion

Spritely AI is a powerful desktop application that enables real-time audio transcription with AI analysis. It combines local audio processing with cloud-based AI to provide a seamless voice-to-text experience.

🌟 Features

Desktop App

Real-time audio transcription using Deepgram's Nova-2 model
Multiple transcription modes:
- Direct field input (Cmd+Alt+L)
- AI-analyzed input (Cmd+Alt+K)
Speaker diarization support
System-wide keyboard shortcuts
Local audio processing for privacy
Automatic microphone selection and configuration
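As a rough illustration of what speaker diarization output can be turned into, the sketch below groups word-level results into speaker turns. It assumes each word arrives as a dict with `word` and `speaker` fields, which is the shape Deepgram returns per word when diarization is enabled. The function name `group_by_speaker` and the sample data are hypothetical and do not come from the Spritely codebase.

```python
from itertools import groupby


def group_by_speaker(words):
    """Collapse a flat list of word dicts into (speaker, text) turns."""
    turns = []
    # groupby only merges consecutive words that share a speaker label,
    # so a speaker who talks twice produces two separate turns.
    for speaker, run in groupby(words, key=lambda w: w.get("speaker", 0)):
        turns.append((speaker, " ".join(w["word"] for w in run)))
    return turns


# Hypothetical sample data, shaped like word-level diarization results.
sample_words = [
    {"word": "hello", "speaker": 0},
    {"word": "there", "speaker": 0},
    {"word": "hi", "speaker": 1},
    {"word": "again", "speaker": 0},
]

for speaker, text in group_by_speaker(sample_words):
    print(f"Speaker {speaker}: {text}")
```

Grouping consecutive words, rather than collecting everything per speaker, keeps the conversation in its original order.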

🗺️ Roadmap

Always-on listening mode with wake word detection. See local STT.
Create a Markdown file to serve as the database, and query it with an LLM instead of vector similarity search. Code example: text
Add meeting summaries to the Markdown file.
Polish the tkinter UI, i.e., for meeting summaries and transcription.
Add Greptile API to tools.

Bug Fixes

Spritely's spoken output cuts off before completing the LLM's entire response

🚀 Getting Started

Prerequisites

Python 3.12+
macOS (Windows support coming soon)
API keys for the following services:
- ElevenLabs
- Deepgram
- Groq
- Anthropic
You will need to grant the app keystroke (Accessibility) permissions so the keyboard shortcuts work

Desktop App Setup

Clone the repository:

git clone https://github.com/spritelyai/spritely-ai.git

Install dependencies:

cd spritely-ai
python3.12 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
touch .env
# add your API keys to .env

Run the app:

python main.py
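Before the first run it can help to confirm that `.env` actually contains a key for each service listed under Prerequisites. The following is a minimal sketch using only the standard library. The variable names in `REQUIRED_KEYS` are assumptions made for illustration; the names the app really reads may differ, so check the repository's `.env.example`.

```python
from pathlib import Path

# Hypothetical variable names: the names the app expects may differ.
REQUIRED_KEYS = [
    "ELEVENLABS_API_KEY",
    "DEEPGRAM_API_KEY",
    "GROQ_API_KEY",
    "ANTHROPIC_API_KEY",
]


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(text)
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Run it from the repository root, next to `.env`, and it prints the keys that still need values.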

🎯 Usage

Desktop Shortcuts

Cmd+Alt+K: Start/stop AI-analyzed transcription
Cmd+Alt+L: Start/stop direct field transcription
ESC: Stop current transcription
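The start/stop behavior of these shortcuts can be modeled as a small state machine. The sketch below is illustrative only: `TranscriptionController`, the mode names, and the `HOTKEYS` table are hypothetical and not taken from the Spritely source. It also assumes that pressing the other mode's shortcut while recording switches modes, which the README does not specify.

```python
class TranscriptionController:
    """Tracks which transcription mode, if any, is currently active."""

    MODES = ("ai", "direct")

    def __init__(self):
        self.active_mode = None  # None means nothing is recording

    def toggle(self, mode):
        """Start the given mode, or stop it if it is already running."""
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        if self.active_mode == mode:
            self.active_mode = None  # same shortcut pressed again: stop
        else:
            # Assumption: the other shortcut switches modes directly.
            self.active_mode = mode
        return self.active_mode

    def stop(self):
        """Stop whatever is running (the ESC behavior)."""
        self.active_mode = None
        return self.active_mode


# Shortcut strings written in pynput's GlobalHotKeys notation.
HOTKEYS = {
    "<cmd>+<alt>+k": "ai",
    "<cmd>+<alt>+l": "direct",
}

controller = TranscriptionController()
print(controller.toggle(HOTKEYS["<cmd>+<alt>+k"]))  # start AI-analyzed mode
print(controller.toggle(HOTKEYS["<cmd>+<alt>+k"]))  # same shortcut stops it
```

With pynput installed, `keyboard.GlobalHotKeys` accepts a dict that maps strings like these to callbacks, so each entry could call `controller.toggle(...)`. Whether the app wires its shortcuts this way is not stated in this README.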

Permissions

The app requires:

Microphone access
Accessibility permissions (for keyboard shortcuts)
Internet connection (for AI analysis)

🏗️ Architecture

Desktop App Components

Audio Capture: PyAudio
Transcription: Deepgram SDK
Voice Synthesis: Cartesia
Keyboard Control: pynput

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Fork the repository
Create your feature branch
Commit your changes
Push to the branch
Open a Pull Request

📝 License

This project is licensed under the AGPL-3.0 for non-commercial use.

Commercial Use: For commercial use or deployments requiring a setup fee, please contact us for a commercial license at michael@flowon.ai.

By using this software, you agree to the terms of the license.

🙏 Acknowledgments

Deepgram for real-time transcription
Cartesia for voice synthesis
All our contributors and supporters

