ESOPN - AI Sports Commentator Duo

Real-time AI commentary for coding sessions, delivered sports-broadcast style by two AI commentators.

Features

Dual Commentators: Alex (play-by-play) and Morgan (color commentary) with distinct personalities
Real-time Analysis: Uses Gemini Vision to understand what's happening on screen
Natural Dialogue: Powered by Gemini TTS for realistic two-speaker audio (free!)
Floating UI Controller: Pause, resume, or stop commentary with a simple floating window
Smart Change Detection: Only comments when the screen actually changes
Hotkey Toggle: Pause/resume commentary with Ctrl+Shift+P

Installation

# Clone the repo
git clone https://github.com/thefirebanks/esopn.git
cd esopn

# Install with uv (recommended)
uv sync

# Or with pip
pip install -e .

Quick Start

1. Set up your API key

# Get a free API key from https://aistudio.google.com/apikey
export ESOPN_GEMINI_API_KEY=your_key_here

2. Run with the UI controller (Recommended)

# Start commentary with floating UI controller
uv run esopn watch --ui

This launches a floating window with Pause, Resume, and Stop buttons. Switch to your code editor or terminal and watch the commentary roll in!

3. Alternative: Headless mode

# Run without UI (use Ctrl+Shift+P to pause, Ctrl+C to stop)
uv run esopn watch

Controls

| Action | Method |
| --- | --- |
| Pause/Resume | Click button in UI, or press `Ctrl+Shift+P` |
| Stop | Click "Stop & Exit" in UI, or press `Ctrl+C` |
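
Both the UI button and the hotkey have to flip the same pause state. The sketch below shows one way such shared state could be modeled with `threading.Event`. It is an illustration only: `PauseController` and its methods are hypothetical names, not taken from the ESOPN source.

```python
import threading
from typing import Optional


class PauseController:
    """Shared pause/stop state for a UI button, a hotkey handler, and a capture loop.

    Hypothetical helper for illustration; not part of the ESOPN codebase.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running = threading.Event()
        self._running.set()  # start unpaused
        self._stopped = threading.Event()

    def toggle_pause(self) -> bool:
        """Flip between paused and running; return True if now paused."""
        with self._lock:
            if self._running.is_set():
                self._running.clear()
            else:
                self._running.set()
            return not self._running.is_set()

    def stop(self) -> None:
        """Request shutdown and release any loop that is blocked while paused."""
        self._stopped.set()
        self._running.set()

    @property
    def paused(self) -> bool:
        return not self._running.is_set()

    @property
    def stopped(self) -> bool:
        return self._stopped.is_set()

    def wait_until_running(self, timeout: Optional[float] = None) -> bool:
        """Block while paused; return False once stop has been requested."""
        self._running.wait(timeout)
        return not self._stopped.is_set()
```

A capture loop could call `wait_until_running()` at the top of each iteration and exit when it returns `False`, while the hotkey and the button both call `toggle_pause()`.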

Commands

esopn watch         # Start commentary (recommended)
esopn watch --ui    # Start with floating UI controller
esopn run           # Start commentary (full options)
esopn test-capture  # Test screenshot capture
esopn test-tts      # Test TTS synthesis
esopn test-vision   # Test vision analysis
esopn info          # Check system/dependencies

Watch Options

esopn watch [OPTIONS]

Options:
  --ui                      Show floating UI controller window
  -i, --interval FLOAT      Seconds between screenshots
  -d, --device TEXT         TTS device (cuda, mps, cpu, auto)
  -M, --mode TEXT           Commentary mode (sports, wwe, freeman_mj)
  -v, --verbose             Enable verbose logging

How It Works

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Screenshot │ ──▶ │   Gemini    │ ──▶ │ Commentary  │ ──▶ │ Gemini TTS  │
│  (mss)      │     │   Vision    │     │  Generator  │     │  (free!)    │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
       │                                                            │
       │                     ┌─────────────┐                       │
       └────────────────────▶│   Speakers  │◀──────────────────────┘
                             └─────────────┘

1. Captures screenshots and detects when the screen changes (>5% difference)
2. Gemini Vision analyzes what's happening (code, terminal, action)
3. Commentary LLM generates sports-style dialogue between Alex & Morgan
4. Gemini TTS synthesizes natural two-speaker audio with distinct voices
5. Audio plays through your speakers in real-time
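
The change-detection step can be sketched in a few lines. This is a minimal illustration, assuming frames arrive as raw byte buffers of equal size and that "difference" means the fraction of byte positions that differ. The metric ESOPN actually uses is not documented here, and `changed_fraction` and `frames_worth_commenting` are hypothetical names.

```python
from typing import Iterable, Iterator, Optional

CHANGE_THRESHOLD = 0.05  # the 5% figure from the steps above


def changed_fraction(prev: bytes, curr: bytes) -> float:
    """Return the fraction of byte positions that differ between two frames."""
    if len(prev) != len(curr):
        # Different sizes (e.g. a resolution change) are treated as fully changed.
        return 1.0
    if not curr:
        return 0.0
    differing = sum(1 for a, b in zip(prev, curr) if a != b)
    return differing / len(curr)


def frames_worth_commenting(
    frames: Iterable[bytes], threshold: float = CHANGE_THRESHOLD
) -> Iterator[int]:
    """Yield indices of frames that differ enough from the last commented frame."""
    last: Optional[bytes] = None
    for index, frame in enumerate(frames):
        if last is None or changed_fraction(last, frame) > threshold:
            last = frame  # compare future frames against this one
            yield index
```

Comparing against the last commented frame, not the immediately preceding one, means slow, gradual edits still trigger commentary once they add up to more than the threshold.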

Requirements

Python: 3.10+
API Key: Google Gemini (free tier works great!)
macOS/Linux: For screen capture

macOS Permissions

For screen capture, grant permissions:

System Settings (System Preferences on older macOS) → Privacy & Security → Screen Recording
Add your terminal app (Terminal, iTerm2, etc.)

For active window capture, also grant:

System Settings (System Preferences on older macOS) → Privacy & Security → Accessibility
Add your terminal app

Commentator Personas

Alex (S1) - Play-by-Play

High-energy, describes what's happening
Calls out specific actions, file names, patterns
Uses conversational descriptions (not literal code reading)

Example: "New submit handler going in! Looks like they're setting up form validation!"

Morgan (S2) - Color Commentary

Analytical with energy and enthusiasm
Explains WHY the code matters
Provides technical insight

Example: "That's the Strategy pattern right there - makes it easy to swap algorithms later!"
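
The two personas and their speaker labels can be represented as plain data. The sketch below is illustrative only: the `Persona` class, `render_script`, and the `[S1]`/`[S2]` line format are assumptions for this example. Only the names, roles, and the S1/S2 labels come from the headings above.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Persona:
    name: str
    tag: str   # speaker label from the headings above: S1 or S2
    role: str


ALEX = Persona(name="Alex", tag="S1", role="play-by-play")
MORGAN = Persona(name="Morgan", tag="S2", role="color commentary")


def render_script(turns: Iterable[Tuple[Persona, str]]) -> str:
    """Join dialogue turns into one script, one '[tag] text' line per turn."""
    return "\n".join(f"[{persona.tag}] {text.strip()}" for persona, text in turns)
```

For example, `render_script([(ALEX, "New submit handler going in!"), (MORGAN, "Clean validation.")])` produces two lines, the first tagged `[S1]` and the second `[S2]`.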

Configuration

Environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| `ESOPN_GEMINI_API_KEY` | - | Google Gemini API key (required) |
| `ESOPN_CAPTURE_INTERVAL` | 3.0 | Seconds between screenshots |
| `ESOPN_TTS_PROVIDER` | gemini | TTS provider (gemini or elevenlabs) |

Or create a .env file:

ESOPN_GEMINI_API_KEY=your_key_here
ESOPN_CAPTURE_INTERVAL=3.0
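
For reference, here is a rough sketch of how these variables and their defaults could be read in Python. It is not ESOPN's actual loader: `Settings` and `load_settings` are hypothetical names, and the validation rules are assumptions based on the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    capture_interval: float = 3.0
    tts_provider: str = "gemini"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from ESOPN_* variables, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("ESOPN_GEMINI_API_KEY", "").strip()
    if not api_key:
        raise ValueError("ESOPN_GEMINI_API_KEY is required")
    provider = env.get("ESOPN_TTS_PROVIDER", "gemini").strip().lower()
    if provider not in ("gemini", "elevenlabs"):
        raise ValueError(f"Unsupported TTS provider: {provider!r}")
    return Settings(
        gemini_api_key=api_key,
        capture_interval=float(env.get("ESOPN_CAPTURE_INTERVAL", "3.0")),
        tts_provider=provider,
    )
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise without touching the real environment.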

Security & Privacy

Screenshots are sent to Gemini for analysis. Avoid capturing windows with secrets, credentials, or private data.
Do not pass API keys as CLI flags; prefer a .env file or exported environment variables.

License

MIT
