Transform your RGB LED matrix into a dynamic, controllable display
verpixeld is a comprehensive .NET 8 application designed to control large RGB LED matrix displays through an elegant, modern web interface. Originally developed for Raspberry Pi with HUB75 LED panels, verpixeld transforms your LED matrix into a powerful, programmable display system.
Whether you want to show the time, display weather information, run animations, stream your camera, play YouTube videos, or create collaborative pixel art – verpixeld provides a unified platform with:
- 🖥️ Beautiful Web Control Panel — Control everything from any device with a browser
- 🧩 Plugin Architecture — Extend functionality with custom content extensions
- 🎨 Real-time Visual Filters — Apply effects like blur, color correction, and more
- 📐 Flexible Layouts — Multi-canvas support with independent layers and opacity
- ⏰ Smart Scheduling — Automate content changes based on time
- 🌙 Night Mode — Automatic brightness adjustment for different times of day
- 📷 Camera Streaming — Stream live video from your phone to the display
- ✏️ Drawing Mode — Create pixel art directly on the display
- 🎬 Media Player — Full video/audio playback with YouTube, network shares, and local files
- ⭐ Favorites & History — Save and quickly replay media with remembered settings
- 🔊 Bluetooth Audio — Output audio to Bluetooth speakers via PulseAudio
- 📹 Camera Motion Alerts — Auto-switch to a camera feed when motion is detected
- 🖼️ Image & Video Upload — Upload photos or stream video clips from any device to the display
- 🤖 AI Art Generation — Generate images with Azure OpenAI or OpenAI, with image-to-image stylization, gallery storage, and scheduled auto-generation
- 🎙️ Voice Assistant — Hands-free voice commands with wake word detection, fast-path instant execution, intent classification, and spoken responses with audio ducking, powered by Azure Speech + OpenAI
- 🎵 Music Search & Radio — Search and play YouTube Music songs, start endless genre radio, or tune into internet radio stations (Radio Browser) by voice or through the web UI
- 📹 Voice Camera Control — Show or dismiss camera feeds (IP/RTSP alert camera or USB webcam) by voice command
| Feature | Description |
|---|---|
| Multi-Canvas System | Layer multiple content sources with independent z-ordering, opacity, and brightness |
| Layout Profiles | Pre-defined layouts: FullScreen, HeaderContent, ThreePanel, SplitView, Dashboard |
| Custom Overlays | Create positioned overlay canvases for notifications, clocks, etc. |
| Hot Reload | Change content and settings without restarting |
Extensions are dynamically loaded plugins that provide content for canvases:
- Clock displays (analog, digital, world time)
- Weather information
- RSS/News feeds
- Image slideshows
- Animations and visualizations
- Custom content via plugin API
Real-time post-processing filters applied to the entire display:
- Color Adjustments: Brightness, contrast, saturation, hue shift
- Effects: Blur, sharpen, pixelate, noise
- Artistic: Color tint, gradient overlay, vignette
- Corrections: Gamma, color temperature
Automated layout switching based on time with daily/weekly schedules, priorities, and manual override capability.
Automatic brightness management with configurable time ranges and gradual transitions.
Stream live video from any device camera to the display with configurable FPS and real-time downsampling.
Interactive drawing with freehand tools, shapes, color picker, and the ability to save/load drawings.
Full video and audio playback system powered by FFmpeg:
- Local Files — Play videos and audio from the device filesystem
- Network Streaming — Native SMB/CIFS support via FFmpeg libsmbclient (no mount required)
- YouTube — Stream YouTube videos via `yt-dlp` with automatic format selection
- Generic Streams — Play any HTTP/HTTPS/RTSP/RTMP stream URL directly (e.g. IP cameras)
- Audio-Only Mode — Efficient playback for MP3/FLAC/etc without video decoding overhead
- Bluetooth Audio — Output audio to Bluetooth speakers via PulseAudio
- A/V Sync Control — Real-time audio/video synchronization with configurable offset (±5 seconds)
- Configurable Video Scaling — Choose FFmpeg scale filter per stream (area, lanczos, bicubic, gauss, etc.)
- Hardware Acceleration — V4L2 M2M hardware decoding on Raspberry Pi for efficient video playback
- Seeking Support — Full seek support for local and network files
- Metadata Extraction — ID3 tags and container metadata (title, artist, album, etc.)
- Playlist Support — Queue management with shuffle, repeat, and auto-advance
- Pause/Resume — Signal-based pause using SIGSTOP/SIGCONT (Linux)
- Pre-buffering — Configurable frame buffering for smooth A/V sync on network streams
- Audio Visualizer — Real-time FFT-based audio visualization with multiple modes and color schemes
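The per-stream scale filter corresponds to FFmpeg's `scale` filter and its `flags` option. As a standalone illustration (the input file and the 128x64 target resolution are placeholders, not verpixeld's actual internal invocation):

```shell
# Downscale a clip to a 128x64 panel with the lanczos algorithm.
# "flags" selects the scaling algorithm (area, lanczos, bicubic, gauss, ...).
ffmpeg -i input.mp4 -vf "scale=128:64:flags=lanczos" -an output.mp4
```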
Save and replay your media with full context:
- Favorites — Save any currently playing media (video, audio, YouTube, network stream) with a custom name
- A/V Sync Remembered — Audio sync offset is saved per-favorite and re-applied on playback
- Scale Filter Remembered — The chosen video scaling algorithm is saved and restored
- Thumbnail Extraction — Automatic thumbnail generation for videos in favorites and history lists
- Recently Played History — Persistent list of recently played media with one-click replay
- Auto-Play — Sequential or shuffled playback through your entire favorites list with animated loading screens between tracks
Automatic camera feed display triggered by motion detection webhooks:
- Webhook Trigger — Simple `POST /api/alert/trigger` endpoint for any camera's HTTP action
- Auto-Display — Pauses current media playback and shows the camera stream on a high-priority overlay canvas
- Auto-Dismiss — Configurable timeout (5–120 seconds) with automatic return to normal
- Re-trigger Reset — Consecutive motion events reset the timeout timer
- Manual Dismiss — Dismiss button in the GUI or via API
- Resume Playback — Automatically resumes paused media when the alert ends
- Animated Connecting Screen — Surveillance-style animated overlay while the camera stream connects
- Double-Buffered Rendering — Decode pipeline decoupled from display for flicker-free camera feed
- RTSP Optimized — TCP transport, low-latency flags, and tuned probe settings for IP cameras
- Configurable Scale Filter — Choose the downscaling algorithm for the camera stream
- Persistent Config — Stream URL, timeout, and settings saved to disk
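Configuration can also be scripted against the API. A sketch of setting up the alert camera (the JSON field names are assumptions for illustration, so check the web UI or the `/api/alert/configure` handler for the exact schema):

```shell
# Point verpixeld at the camera stream (field names are assumptions)
curl -X POST http://<verpixeld-host>:5000/api/alert/configure \
  -H "Content-Type: application/json" \
  -d '{"streamUrl": "rtsp://user:pass@<camera-ip>:554/stream", "timeoutSeconds": 30}'

# Manually dismiss an active alert
curl -X POST http://<verpixeld-host>:5000/api/alert/dismiss
```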
Upload media directly from any device (phone, tablet, desktop) to the LED matrix:
- Photo Upload — Select or drag-drop images (JPG, PNG, GIF, WebP) to instantly display on the matrix
- Video Upload — Load video files, seek to any frame, and stream frames to the display at configurable FPS
- Drag & Drop — Full drag-and-drop support in the web interface
- Auto-Scaling — Images automatically scaled to the display resolution
- Uses Existing Pipeline — Leverages the existing `/api/draw/apply` endpoint; no new backend needed
Generate unique artwork for your LED matrix using AI image generation:
- Azure OpenAI (Default) — Supports DALL-E 3 and GPT Image models via Azure credits
- OpenAI (Alternative) — Direct OpenAI API support for DALL-E 3, GPT Image 1, GPT Image 1 Mini
- Text-to-Image — Describe what you want and the AI generates it, optimized for LED matrix display
- Image-to-Image — Upload a photo and have the AI stylize it (pixel art, watercolor, cyberpunk, etc.)
- Style Presets — Pixel Art, Retro 8-bit, Neon Synthwave, Abstract, Photograph, Watercolor, Oil Painting, Comic, Minimalist, Cyberpunk
- Quality Control — Low/Medium/High quality settings to balance speed and detail
- Generation History — Browse and re-apply past generations with one click
- Scheduled Auto-Generation — Configure prompts and intervals to auto-generate fresh artwork periodically
- Live Preview — See generated images before applying them to the display
- Gallery with Overlay Display — Save generated images to a gallery, browse thumbnails, and apply images to the display via an overlay canvas (z=250) that stays visible above running extensions until dismissed
- Gallery Slideshow — Auto-cycle through gallery images with configurable interval and shuffle/sequential order
- Persistent Configuration — API keys and schedule settings saved to disk
A full voice assistant that listens for a wake word, understands spoken commands in any language, and responds with actions and spoken audio:
- Wake Word Detection — Trigger with a custom keyword (Azure Custom Keyword `.table` model)
- Unified Keyword + STT Pipeline — Single audio stream for keyword detection and cloud speech-to-text, eliminating the gap between wake word and command recognition
- Follow-Up Listening — If you pause after the wake word ("Hey Pixel" ... "wie spät ist es?"), the system automatically listens for your follow-up command
- Fast-Path Instant Commands — Simple commands like "Stop", "Pause", "Leiser", "Kamera aus" execute instantly without LLM roundtrip (~0ms vs 1-3s)
- Intent Classification via LLM — Complex commands are routed through Azure OpenAI (GPT-4o/GPT-5) to classify intent and generate a natural-language response
- Text-to-Speech — Spoken responses via Azure TTS with configurable voice (German/English voices available)
- Audio Ducking — Music volume is automatically lowered during voice responses and restored afterward, so TTS is always clearly audible over background music (configurable volume level)
- Non-Blocking Feedback — After the assistant speaks, the response text stays visible on the display for a reading period while the listen loop resumes immediately, so you can speak a new command without waiting
- Smart Overlay Management — AI images, camera feeds, and feedback overlays are automatically dismissed when a new voice command is received
- Stale Audio Prevention — Audio capture is paused during command processing (LLM, image generation, TTS) and resumed with fresh audio when listening restarts, ensuring instant wake word detection
- Content Filter Resilience — When Azure's content filter blocks or drops the LLM response, the system falls back to local intent detection (recognizing German draw commands like "male", "zeichne" automatically)
- Push-to-Talk — Manual trigger via web UI button in addition to wake word
Supported voice commands:
| Command Type | Examples | Action |
|---|---|---|
| AI Image Generation | "Male einen Drachen", "Paint a sunset" | Generates and displays an AI image (auto-dismissed on next command) |
| Questions & Chat | "Wie spät ist es?", "Tell me a joke" | LLM answers, response spoken aloud |
| Media Control | "Pause", "Nächstes Lied", "Stop" | Controls media player playback (fast-path, instant) |
| Volume | "Lauter", "Leiser", "Ton aus" | Adjusts media volume (fast-path, instant) |
| Brightness | "Licht an", "Display aus", "Helligkeit auf 80" | Adjusts LED matrix brightness (fast-path for on/off) |
| Extension Switching | "Zeig die Uhr" | Switches active display extension |
| Music Search | "Spiele Bohemian Rhapsody", "Play something by Daft Punk" | Searches YouTube Music and plays the top result |
| Music Radio | "Spiele Trance Musik", "Spiele Jazz" | Starts endless genre radio — shuffled playback with auto-refill |
| Internet Radio | "Spiele Techno Radio", "Play jazz radio" | Searches and plays a live internet radio station |
| Show Camera | "Zeig mir die Kamera", "Zeig USB-Kamera" | Shows alert (IP/RTSP) or local USB camera on the display |
| Hide Camera | "Kamera aus", "Kamera stopp" | Dismisses any active camera feed (fast-path, instant) |
Search and play music from YouTube Music or internet radio — by voice or through the web interface:
- YouTube Music Integration — Search for songs, artists, or albums using the YouTubeMusicAPI (no API key required)
- Songs vs Music Videos — Toggle between audio tracks (with album art) and actual music videos
- Audio-Only Mode — Play songs without overlaying the display, keeping the current content visible (default for voice commands)
- Voice-Triggered — Say "Hey Pixel, spiele Bohemian Rhapsody von Queen" to search and play instantly
- Genre Radio — Say "Hey Pixel, spiele Trance Musik" to start endless genre playback with shuffled tracks and automatic queue refill
- Internet Radio — Search and play live internet radio stations by genre using the Radio Browser API (free, no API key required)
- Click-to-Play Results — Search results displayed as a list with title, artist, album, and duration
- yt-dlp Playback — Uses the existing media player pipeline (yt-dlp + FFmpeg) for reliable playback
- Error Handling — Restricted or unavailable videos show a clear error message in the UI and via voice
Live backend log streaming to the web interface:
- Real-time Log Viewer — All `Console.WriteLine` output captured and streamed to a dedicated Console tab
- Search & Filter — Filter logs by keyword in real-time
- Auto-scroll — Follows new output automatically with pause/resume control
- Ring Buffer — Memory-efficient circular buffer keeps recent log history
Comprehensive audio output management:
- PulseAudio Integration — Full control over audio routing and volume
- Bluetooth Discovery — Scan, pair, and connect Bluetooth speakers from the web interface
- Device Selection — Switch audio output between ALSA, PulseAudio sinks, and Bluetooth devices
- Volume Control — System-wide volume adjustment with mute toggle
- Real-time Updates — Server-Sent Events for instant UI feedback on volume/device changes
┌─────────────────────────────────────────────────────────────────┐
│ Web Control Panel │
│ (HTML/CSS/JavaScript) │
│ Tabs: Layouts│Schedule│Canvas│AI│Media│Effects│Voice│Console│ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ ASP.NET Core API │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Layout │ │ Media │ │ YouTube │ │Favorites │ │
│ │Endpoints │ │Endpoints │ │Endpoints │ │Endpoints │ │
│ ├──────────┤ ├──────────┤ ├──────────┤ ├──────────┤ │
│ │ Alert │ │ AI │ │ Audio │ │ Log │ │
│ │Endpoints │ │Endpoints │ │Endpoints │ │Endpoints │ │
│ ├──────────┤ ├──────────┤ └──────────┘ └──────────┘ │
│ │ Voice │ │ Music │ │
│ │Endpoints │ │Endpoints │ │
│ └──────────┘ └──────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Core Services │
│ ┌────────────────┐ ┌────────────────┐ ┌─────────────────────┐ │
│ │LayoutManager │ │ContentManager │ │ ScheduleManager │ │
│ ├────────────────┤ ├────────────────┤ ├─────────────────────┤ │
│ │MediaPlayerSvc │ │ AlertService │ │ FavoritesService │ │
│ ├────────────────┤ ├────────────────┤ ├─────────────────────┤ │
│ │AudioOutputSvc │ │AiImageService │ │ NetworkShareService │ │
│ ├────────────────┤ ├────────────────┤ ├─────────────────────┤ │
│ │ LogService │ │AiChatService │ │ MusicSearchService │ │
│ ├────────────────┤ ├────────────────┤ ├─────────────────────┤ │
│ │VoiceCommandSvc │ │RadioBrowserSvc │ │ │ │
│ └────────────────┘ └────────────────┘ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Canvas Management │
│ (Multi-layer composition, z-ordering & filters) │
│ │
│ z=100: Extensions z=200: Media z=250: AI/Gallery Overlay │
│ z=300: CameraAlert z=350: VoiceFeedback │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ RGB Matrix Renderer │
│ (Hardware abstraction layer) │
│ ┌─────────────┐ │
│ │ HUB75 LED │ │
│ │ Matrix │ │
│ └─────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ MediaPlayerService │
│ (Orchestrates video/audio playback) │
│ ┌───────────────┐ ┌──────────────────┐ ┌──────────────────┐ │
│ │ VideoPlayer │ │ AlsaAudioService │ │ NetworkShareSvc │ │
│ │ (FFmpeg) │ │ (System Volume) │ │ (SMB Credentials)│ │
│ └───────────────┘ └──────────────────┘ └──────────────────┘ │
│ ┌───────────────┐ ┌──────────────────┐ ┌──────────────────┐ │
│ │ YouTubeService│ │ FavoritesService │ │ Auto-Play Queue │ │
│ │ (yt-dlp) │ │ (JSON Persist) │ │ (Sequential/Shuf)│ │
│ └───────────────┘ └──────────────────┘ └──────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ AudioOutputService │
│ (PulseAudio/ALSA routing, Bluetooth management) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ AlertService │
│ (Camera motion alerts, independent canvas z=300) │
│ ┌───────────────┐ ┌──────────────────┐ ┌──────────────────┐ │
│ │ FFmpeg Decode │ │ Double-Buffered │ │ Auto-Dismiss │ │
│ │ (RTSP/HTTP) │ │ Display Loop │ │ Timer │ │
│ └───────────────┘ └──────────────────┘ └──────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Supported Protocols:
- `smb://` — SMB/CIFS network shares (requires FFmpeg with libsmbclient)
- `rtsp://` — RTSP camera streams (TCP transport, optimized for IP cameras)
- `http://` / `https://` — HTTP streams, HLS, HTTP-FLV
- `rtmp://` — RTMP streams
- YouTube URLs — via `yt-dlp` automatic format extraction
- Local filesystem paths
- Raspberry Pi 4 (recommended) or Pi 3
- HUB75 RGB LED Matrix panels
- .NET 8.0 Runtime
- Raspberry Pi OS (64-bit recommended)
- FFmpeg (with PulseAudio and libsmbclient support — see compilation guide)
- `yt-dlp` (optional, for YouTube streaming — `pip install yt-dlp`)
1. Clone the repository

   ```bash
   git clone https://github.com/Jan1503/verpixeld.git
   cd verpixeld
   ```

2. Build the application

   ```bash
   dotnet publish -c Release -r linux-arm64
   ```

3. Deploy to Raspberry Pi

   ```bash
   scp -r bin/Release/net8.0/linux-arm64/publish/* pi@raspberrypi:/home/pi/verpixeld/
   ```

4. Configure systemd service (see Raspberry Pi Setup Guide)

5. Access the web interface
   - HTTP: `http://<pi-ip>:5000`
   - HTTPS: `https://<pi-ip>:5001`
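The systemd step is covered in the Raspberry Pi Setup Guide; as a rough sketch (service name, paths, and user are assumptions, so adapt them to your deployment), a unit file might look like:

```ini
# /etc/systemd/system/verpixeld.service (illustrative sketch)
[Unit]
Description=verpixeld LED matrix controller
After=network-online.target

[Service]
WorkingDirectory=/home/pi/verpixeld
ExecStart=/home/pi/verpixeld/verpixeld
Restart=on-failure
# rpi-rgb-led-matrix typically needs root for GPIO access
User=root

[Install]
WantedBy=multi-user.target
```

Enable it with `sudo systemctl enable --now verpixeld`.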
verpixeld exposes a comprehensive REST API for integration with external systems.
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/api/media/status` | Full media player status (playback, position, metadata, alert state) |
| `POST` | `/api/media/play/(unknown)` | Play a local video file |
| `POST` | `/api/media/pause` | Toggle pause/resume |
| `POST` | `/api/media/stop` | Stop playback |
| `POST` | `/api/media/seek` | Seek to position |
| `POST` | `/api/media/scale-filter` | Set the video scaling algorithm |
| `GET` | `/api/media/scale-filters` | List available FFmpeg scale filters |
| `POST` | `/api/youtube/play` | Play a YouTube URL or generic stream |
| `GET` | `/api/favorites` | List all favorites |
| `POST` | `/api/favorites/add-current` | Save currently playing media as favorite |
| `POST` | `/api/favorites/{id}/play` | Play a saved favorite |
| `POST` | `/api/favorites/auto-play/start` | Start auto-play through favorites |
| `POST` | `/api/favorites/auto-play/stop` | Stop auto-play |
| `GET` | `/api/favorites/history` | Get recently played history |
| `POST` | `/api/alert/trigger` | Webhook: trigger camera motion alert |
| `POST` | `/api/alert/dismiss` | Dismiss active camera alert |
| `GET` | `/api/alert/status` | Get alert status and configuration |
| `POST` | `/api/alert/configure` | Configure camera stream URL and timeout |
| `POST` | `/api/ai/generate` | Generate an image from a text prompt |
| `POST` | `/api/ai/edit` | Image-to-image: stylize an uploaded photo |
| `POST` | `/api/ai/apply` | Apply a generated image to the display (overlay) |
| `POST` | `/api/ai/dismiss` | Dismiss the image overlay from the display |
| `GET` | `/api/ai/gallery` | List saved gallery images |
| `GET` | `/api/ai/gallery/(unknown)` | Get a gallery image as base64 |
| `DELETE` | `/api/ai/gallery/(unknown)` | Delete a gallery image |
| `GET` | `/api/ai/status` | AI provider status and configuration |
| `POST` | `/api/ai/configure` | Configure AI provider (Azure/OpenAI) |
| `POST` | `/api/ai/schedule` | Configure scheduled auto-generation |
| `GET` | `/api/ai/history` | Get generation history |
| `GET` | `/api/voice/status` | Voice assistant status, config, and last command info |
| `POST` | `/api/voice/configure` | Configure voice settings (speech key, TTS, language, etc.) |
| `POST` | `/api/voice/start` | Start voice listening |
| `POST` | `/api/voice/stop` | Stop voice listening |
| `POST` | `/api/voice/trigger` | Manual push-to-talk trigger |
| `POST` | `/api/music/search` | Search YouTube Music (songs or music videos) |
| `POST` | `/api/music/play` | Play a music search result or search-and-play by query |
| `GET` | `/api/audio/status` | Audio output and Bluetooth status |
| `GET` | `/api/logs/recent` | Get recent console log entries |
| `GET` | `/health` | Health check endpoint |
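A quick sketch of the API in use (the request body shape for `/api/youtube/play` is an assumption, so consult the endpoint's handler for the exact schema):

```shell
# Check player state
curl http://<verpixeld-host>:5000/api/media/status

# Start a YouTube video (body shape is an assumption)
curl -X POST http://<verpixeld-host>:5000/api/youtube/play \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}'

# Toggle pause, then stop
curl -X POST http://<verpixeld-host>:5000/api/media/pause
curl -X POST http://<verpixeld-host>:5000/api/media/stop
```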
The camera alert system is designed for easy integration with IP cameras. Configure your camera's motion detection to call:
```bash
curl -X POST http://<verpixeld-host>:5000/api/alert/trigger
```

No body, no authentication, no parameters needed. The endpoint returns `200 OK` immediately. Compatible with Reolink, Hikvision, Dahua, and any camera that supports HTTP webhook actions.
The voice assistant and AI art features require Azure cloud services. This section covers everything needed to get them running.
You need two Azure resources (both have generous free tiers):
| Resource | Used For | Free Tier |
|---|---|---|
| Azure OpenAI | Image generation + intent classification (chat) | Pay-as-you-go (see costs below) |
| Azure Speech Services | Speech-to-text, text-to-speech, wake word | 5 hours STT + 500K TTS chars/month free |
- Go to Azure Portal > Create a resource > search "Azure OpenAI"
- Select your subscription and resource group
- Choose a region (e.g. `swedencentral`, `eastus`) — check model availability
- Select pricing tier Standard S0
- Click Create
Once created, note:
- Endpoint: Found in Keys and Endpoint (e.g. `https://myresource.openai.azure.com/`)
- API Key: Found in Keys and Endpoint (Key 1 or Key 2)
You need two model deployments:
- Go to Azure AI Foundry or Azure Portal > your OpenAI resource > Model Deployments
- Click Create new deployment
- Select model: gpt-image-1 (recommended) or dall-e-3
- Name the deployment (e.g. `gpt-image-1`)
- Click Create
- Click Create new deployment again
- Select model — recommended options (best to cheapest):
- gpt-5 — Best quality, 75% cheaper input tokens than GPT-4o, 400K context
- gpt-5-mini — Great quality, very cheap ($0.25/M input tokens)
- gpt-4o — Proven reliable, widely available
- gpt-4o-mini — Budget option, adequate for intent classification
- Name the deployment (e.g. `gpt-5-mini` or `gpt-4o`)
- Click Create

Cost note: Each voice command makes one chat call (~100-300 tokens) for intent classification. With `gpt-5-mini` that's about $0.0001 per command. Even heavy use (100 commands/day) costs less than $1/month.
- Go to Azure Portal > Create a resource > search "Speech"
- Select Speech Services
- Choose your subscription, resource group, and region
- Select pricing tier Free F0 (5 hours STT + 500K TTS characters/month) or Standard S0
- Click Create
Once created, note:
- Key: Found in Keys and Endpoint (Key 1)
- Region: e.g. `westeurope`, `eastus`
A custom wake word (e.g. "Hey Pixel") allows hands-free activation:
- Go to Speech Studio > Custom Keyword
- Click Create new model
- Enter your wake word phrase (e.g. "Hey Pixel")
- Click Create and wait for training (~10 minutes)
- Download the `.table` model file
- Upload it via the verpixeld Voice Settings in the web UI
The voice assistant requires a USB microphone on the Raspberry Pi:
```bash
# Verify USB mic is detected
arecord -l

# Check PulseAudio sees it
pactl list sources short
```

The microphone source name (e.g. `alsa_input.usb-Lenovo_Lenovo_510_Camera-...`) will appear in the voice settings dropdown.
1. AI Art tab > Settings subtab:
   - Provider: Azure
   - Azure Endpoint: `https://yourresource.openai.azure.com/`
   - Azure API Key: your key
   - Image Deployment: `gpt-image-1` (from Step 2)
   - Chat Deployment: `gpt-5-mini` (from Step 2)
   - Click Save

2. AI Art tab > Voice subtab:
   - Azure Speech Key: your Speech key (from Step 3)
   - Azure Region: your Speech region (e.g. `westeurope`)
   - Speech Language: `de-DE` (German) or `en-US` (English)
   - Microphone: Select your USB mic from the dropdown
   - Voice Responses: Enabled
   - TTS Voice: Choose a voice (e.g. Conrad (DE, Male))
   - Audio Ducking: Enabled (lowers music volume during speech)
   - Duck Volume: 15% (how quiet the music gets during speech)
   - Upload the wake word `.table` file (optional, from Step 4)
   - Click Save Voice Settings
   - Click Start Listening
┌──────────────┐ ┌─────────────────────────────────┐
│ USB Mic │────▶│ Unified Keyword + STT Pipeline │
│ (parec) │ │ (single audio stream) │
│ persistent │ │ 1. On-device keyword detection │
└──────────────┘ │ 2. Cloud STT (same stream) │
└───────────────┬─────────────────┘
│ transcription
▼
┌─────────────────────┐
│ Fast-Path Matcher │──▶ instant execution
│ (local, no LLM) │ (stop, pause, etc.)
└─────────┬───────────┘
│ no match
▼
┌──────────────────┐
│ Azure OpenAI │
│ Chat (GPT-5) │
└────────┬─────────┘
│ JSON intent + response
▼
┌───────────────────┐
│ VoiceCommandRouter│
│ Intent Dispatch │
└─────────┬─────────┘
┌────────┬──────┬──────┼──────┬────────┬────────┬────────┐
▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼
┌───────┐┌──────┐┌─────┐┌─────┐┌──────┐┌───────┐┌──────┐┌───────┐
│ Image ││Media ││Q&A ││Brig-││Music ││Music ││Camera││Exten- │
│ Gen ││Ctrl ││ ││htns ││Search││Radio ││Show/ ││ sion │
└───────┘└──────┘└─────┘└─────┘└──────┘└───────┘│ Hide │└───────┘
└──────┘
│
▼
┌──────────────────┐
│ Azure TTS │────▶ Speakers
│ (paplay output) │
│ + Audio Ducking │
└──────────────────┘
verpixeld supports audio playback via ALSA or PulseAudio, with optional Bluetooth speaker support.
Quick Start: If you just want basic ALSA audio (no Bluetooth), verpixeld works out of the box — no extra setup needed.
For Bluetooth speaker support, you need to:
- Install PulseAudio with Bluetooth modules
- Configure PulseAudio in system-wide mode
- Set up D-Bus permissions for the `pulse` user
- Pair your Bluetooth speaker
- Compile FFmpeg with PulseAudio support
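The pairing step is done interactively with `bluetoothctl` (the MAC address below is a placeholder for your speaker's address):

```shell
bluetoothctl
# inside the bluetoothctl shell:
power on
agent on
scan on            # wait for your speaker's MAC address to appear
pair AA:BB:CC:DD:EE:FF
trust AA:BB:CC:DD:EE:FF
connect AA:BB:CC:DD:EE:FF
quit
```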
Detailed setup guides are in the docs/ folder:
| Guide | Description |
|---|---|
| Audio & Bluetooth Setup | Complete step-by-step guide for PulseAudio, Bluetooth pairing, system configuration, and troubleshooting |
| FFmpeg Compilation | Compiling FFmpeg with PulseAudio output and SMB network share support |
| Raspberry Pi Setup | General Pi setup: systemd service, HTTPS certificate, web-based reboot |
Once set up, verify your Bluetooth audio from the command line:
```bash
# Check Bluetooth is enabled
bluetoothctl show | grep "Powered:"

# Check PulseAudio sees the Bluetooth sink
pactl list short sinks | grep bluez

# Test audio output
paplay /usr/share/sounds/alsa/Front_Left.wav
```

verpixeld is built on the shoulders of giants. The following libraries make this project possible:
| Library | Purpose | License |
|---|---|---|
| .NET 8.0 | Runtime and base framework | MIT |
| ASP.NET Core | Web server and API framework | MIT |
| Library | Purpose | License |
|---|---|---|
| SkiaSharp | 2D graphics rendering, canvas operations | MIT |
| rpi-rgb-led-matrix | HUB75 LED matrix hardware driver | GPL-2.0-or-later |
| Resource | Purpose | License |
|---|---|---|
| BDF Fonts | Bitmap fonts for LED display text rendering | Various (Public Domain / MIT) |
| Tool / Library | Purpose | License |
|---|---|---|
| FFmpeg | Video/audio decoding, scaling, streaming, and audio output | LGPL/GPL |
| yt-dlp | YouTube URL extraction and format selection | Unlicense |
| YouTubeMusicAPI (NuGet) | YouTube Music search (songs, videos, albums) — no API key required | GPL-3.0 |
| Library | Purpose | License |
|---|---|---|
| Microsoft.CognitiveServices.Speech (NuGet) | Azure Speech SDK — wake word detection, speech-to-text, text-to-speech | MIT |
| Library | Purpose | License |
|---|---|---|
| Google Fonts (Orbitron, Rajdhani, JetBrains Mono) | UI typography | OFL |
Special thanks to:
- Henner Zeller for the incredible rpi-rgb-led-matrix library that makes LED matrix control possible on the Raspberry Pi
- The Mono Project for SkiaSharp, providing powerful cross-platform 2D graphics
- The .NET Team for the excellent .NET 8 runtime and ASP.NET Core framework
- IcySnex for YouTubeMusicAPI, enabling YouTube Music search without an API key
- The open-source community for countless tools, libraries, and inspiration
- All contributors who help improve this project
Portions of this application's code were generated with the assistance of AI tools. The AI was used as a coding assistant to help with:
- Code generation and refactoring
- Documentation writing
- UI/UX improvements
- Bug fixing and optimization
All AI-generated code has been reviewed and integrated by the project maintainer. The use of AI tools is intended to accelerate development while maintaining code quality and functionality.
This project is developed for educational and personal use.
Any resemblance to or inclusion of copyrighted material is entirely unintentional. If you believe any content in this project infringes on your copyright or intellectual property rights, please contact the maintainer immediately, and the material will be promptly reviewed and removed if necessary.
The developers make no claims to any third-party trademarks, logos, or copyrighted materials that may be referenced or inadvertently included. All product names, logos, and brands are property of their respective owners.
GNU General Public License v3.0 or later
Copyright (c) 2022-2026 Jan R. Wrage
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
This application uses several libraries licensed under the GNU GPL:
- rpi-rgb-led-matrix — GPL-2.0-or-later
- YouTubeMusicAPI — GPL-3.0
Because the YouTubeMusicAPI requires GPL-3.0 and rpi-rgb-led-matrix permits "GPL-2.0 or later", the combined work must be distributed under GPL-3.0-or-later to satisfy both licenses.
What this means for you:
- You can freely use, study, and modify this software
- You can distribute copies of this software
- You can distribute modified versions
- If you distribute this software (modified or not), you must:
- Make the source code available
- License your modifications under GPL-3.0 or later
- Include this license notice
For the full license text, see the LICENSE file or visit https://www.gnu.org/licenses/gpl-3.0.html
Made with ❤️ and lots of ☕
© 2022-2026 Jan R. Wrage


