Releases: NVIDIA-AI-IOT/live-vlm-webui
v0.4.0 - Cloud deployment, multi-session, debug payloads
Cloud deployment, debug payloads, and docs
Added
- Multi-session support: Per-session VLM state and WebSocket routing so multiple tabs/users get isolated streams and config (cloud-friendly).
- Cloud deployment config overrides: Environment variables `LIVE_VLM_API_BASE`, `LIVE_VLM_DEFAULT_MODEL`, and `LIVE_VLM_PROCESS_EVERY` override the default API base URL, default model, and frame processing interval (e.g. `LIVE_VLM_PROCESS_EVERY=150` for a ~5 s interval to reduce API quota; `LIVE_VLM_API_BASE=https://integrate.api.nvidia.com/v1` and `LIVE_VLM_DEFAULT_MODEL=google/gemma-3-4b-it` for NVIDIA API Catalog). Documented in the Docker guide with `-e` / `--env-file` examples.
- Debug payloads: In Settings (gear) → Debug: toggles for Show request payload and Show response payload. Collapsible request JSON (under the prompt) and response JSON (under the VLM result) for debugging how the image and prompt are sent and what the API returns.
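The override behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `FALLBACK_*` values, `load_overrides`, and `seconds_between_analyses` are hypothetical names, and the 30 fps figure is only inferred from the "150 for ~5 s" example.

```python
import os

# Hypothetical built-in defaults; the real defaults are not stated in these notes.
FALLBACK_API_BASE = "http://localhost:11434/v1"
FALLBACK_MODEL = "example-model"
FALLBACK_PROCESS_EVERY = 30


def load_overrides(env=None):
    """Return effective settings, preferring LIVE_VLM_* variables when set."""
    env = os.environ if env is None else env
    raw_every = env.get("LIVE_VLM_PROCESS_EVERY", "")
    try:
        process_every = int(raw_every) if raw_every else FALLBACK_PROCESS_EVERY
    except ValueError:
        # Ignore a non-numeric value rather than failing at startup.
        process_every = FALLBACK_PROCESS_EVERY
    if process_every < 1:
        process_every = FALLBACK_PROCESS_EVERY
    return {
        "api_base": env.get("LIVE_VLM_API_BASE") or FALLBACK_API_BASE,
        "model": env.get("LIVE_VLM_DEFAULT_MODEL") or FALLBACK_MODEL,
        "process_every": process_every,
    }


def seconds_between_analyses(process_every, fps=30.0):
    """Approximate interval, assuming process_every counts frames at `fps`."""
    return process_every / fps
```

With the NVIDIA API Catalog values from the bullet above, `load_overrides` would return that base URL and model, and `seconds_between_analyses(150)` gives 5.0 under the 30 fps assumption.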
Fixed
- Server startup: Resolved `UnboundLocalError` for `os` when using env overrides (removed redundant `import os` / `import sys` inside `main()`).
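For readers unfamiliar with this class of bug, here is a small reproduction of how a redundant function-level import can trigger `UnboundLocalError`. The functions `broken_main` and `fixed_main` are hypothetical; the actual code path in the server is not shown in these notes.

```python
import os


def broken_main(use_env_overrides):
    """An import inside the function makes `os` local to the whole body."""
    if use_env_overrides:
        # `os` is treated as a local name here, but it is not bound yet.
        _ = os.environ.get("LIVE_VLM_API_BASE")
    import os  # redundant: shadows the module-level import
    return os.getcwd()


def fixed_main(use_env_overrides):
    """Same logic, relying only on the module-level import."""
    if use_env_overrides:
        _ = os.environ.get("LIVE_VLM_API_BASE")
    return os.getcwd()
```

Calling `broken_main(True)` raises `UnboundLocalError`, while `broken_main(False)` happens to work, which is consistent with a failure that only appears when env overrides are used.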
Changed
- Server config: `server_config` now includes `process_every` so the UI shows the server default frame interval on connect.
- README: For Ollama on Jetson Thor, recommend upgrading to the latest Ollama instead of pinning to 0.12.9; link to troubleshooting for workarounds if needed.
v0.3.0 - UI upgrade and robotics prompts
Added
- Video overlay controls (play / stop):
- Big green PLAY button centered on video; animates to top-left and fades when streaming starts
- Small red STOP button in top-left while streaming (higher opacity for visibility)
- Sidebar start/stop replaced by overlay flow for cleaner UX
- Fullscreen mode: Toggle fullscreen on the video card with VLM output overlay; shrink and mirror buttons remain clickable (z-index fix)
- Robotics-oriented prompt preset: "Robot Navigation (Simple)" system prompt that describes the scene and outputs 5 navigation commands (`linear_x`, `angular_z`) with reasons, e.g. for bathroom-finding or similar tasks
Fixed
- Model initialization race condition: Auto-selected model is sent to server as soon as WebSocket connects so VLM processing starts without manually re-selecting the model
- MediaStreamError on stop: Track end when user stops is handled as normal shutdown (logged at DEBUG only, no error/traceback)
- Fullscreen controls: Shrink (minimize) and Mirror buttons stay above the VLM overlay and remain clickable in fullscreen
- Jetson Thor Docker (#14): `start_container.sh` now uses `--runtime=nvidia` instead of `--gpus all` on Jetson (Thor and Orin) so containers start correctly
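The flag selection in the Jetson Docker fix can be sketched as below. The real logic lives in the shell script `start_container.sh`; this Python version is only illustrative. The function names are hypothetical, and detecting Jetson via `/etc/nv_tegra_release` is an assumption rather than the script's documented method.

```python
import os


def is_jetson(marker="/etc/nv_tegra_release"):
    """Assumed heuristic: treat the presence of this file as 'running on Jetson'."""
    return os.path.exists(marker)


def docker_gpu_args(on_jetson):
    """Pick the GPU-related `docker run` arguments for the platform."""
    return ["--runtime=nvidia"] if on_jetson else ["--gpus", "all"]


def build_docker_command(image, on_jetson, port=8090):
    """Assemble a `docker run` argument list (not executed here)."""
    return [
        "docker", "run", "-d",
        *docker_gpu_args(on_jetson),
        "-p", f"{port}:{port}",
        image,
    ]
```

For example, `build_docker_command("ghcr.io/nvidia-ai-iot/live-vlm-webui:latest", is_jetson())` yields a command list that could be passed to `subprocess.run`.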
Changed
- WebRTC: Wait for ICE gathering to complete before sending offer (reduces stuck "checking" connections)
- Troubleshooting: New "WebRTC connection issues" section (ICE stuck, firewall, STUN, verification steps)
- Scripts: `start_server.sh` suggests `kill -9` when the port is in use
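The WebRTC change listed above (wait for ICE gathering to complete before sending the offer) follows a common pattern that can be sketched with plain `asyncio`. `FakePeerConnection` and its methods are hypothetical stand-ins, not the browser or any library API, and the timeout is an added assumption.

```python
import asyncio


class FakePeerConnection:
    """Hypothetical stand-in for a WebRTC peer connection, for illustration only."""

    def __init__(self):
        self.ice_gathering_state = "new"
        self._complete = asyncio.Event()

    async def gather_candidates(self, delay):
        """Simulate candidate gathering that finishes after `delay` seconds."""
        self.ice_gathering_state = "gathering"
        await asyncio.sleep(delay)
        self.ice_gathering_state = "complete"
        self._complete.set()

    async def wait_for_ice_complete(self, timeout):
        """Return True once gathering is complete, False if the timeout expires."""
        if self.ice_gathering_state == "complete":
            return True
        try:
            await asyncio.wait_for(self._complete.wait(), timeout)
            return True
        except asyncio.TimeoutError:
            return False


async def negotiate(gather_delay=0.01, timeout=1.0):
    """Return the gathering state at the moment the offer is sent, or None."""
    pc = FakePeerConnection()  # created inside the running event loop
    gathering = asyncio.create_task(pc.gather_candidates(gather_delay))
    ready = await pc.wait_for_ice_complete(timeout)
    state_when_sent = pc.ice_gathering_state if ready else None
    await gathering  # let the simulated gathering finish before returning
    return state_when_sent
```

Sending the offer only after gathering completes means it already carries every candidate, which fits the note that this reduces connections stuck in "checking".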
v0.2.1 - Version String Fix and Test Infrastructure Improvements
🐛 Bug Fixes
- Version string fix: `live-vlm-webui --version` now correctly displays 0.2.1 (was showing 0.1.1 in v0.2.0)
- Test infrastructure: Fixed pytest-asyncio event loop conflicts; all tests now pass reliably
📦 Installation
pip install --upgrade live-vlm-webui==0.2.1
Verify: `live-vlm-webui --version` should show 0.2.1
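As a complementary check, the installed package version can also be read from Python using the standard library. This is a sketch: `installed_version` and `version_matches` are hypothetical helpers, and they read the package metadata rather than running the CLI, so they do not replace the `--version` check above.

```python
from importlib import metadata


def installed_version(dist_name="live-vlm-webui"):
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_matches(expected, dist_name="live-vlm-webui"):
    """True only when the distribution is installed at exactly `expected`."""
    found = installed_version(dist_name)
    return found is not None and found == expected
```

After upgrading, `version_matches("0.2.1")` should return `True` in the environment where the package was installed.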
📚 Documentation
- Consolidated release documentation
- Added version verification steps to prevent future version mismatches
Screenshot: RTSP Feature Preview
Full changelog: CHANGELOG.md
v0.2.0 - RTSP IP Camera Support (Beta) + UI/UX Improvements
Release Notes for v0.2.0
🎉 What's New
New Beta Feature: RTSP IP Camera Support
- Stream video from RTSP IP cameras for continuous monitoring
- Switch between webcam and RTSP camera in UI
- Auto-reconnection on stream drops
- Support for H.264, H.265, and MJPEG codecs
- Complete setup guide: `docs/usage/rtsp-ip-cameras.md`
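The auto-reconnection behaviour listed above could look roughly like the following. This is a generic sketch, not the project's implementation: `read_with_reconnect` and `open_stream` are hypothetical, and both the exponential backoff and the attempt limit are assumptions that the notes do not confirm.

```python
import time


def read_with_reconnect(open_stream, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Yield frames from open_stream(), reopening the stream after a drop.

    `open_stream` is any callable returning an iterable of frames that raises
    ConnectionError when the stream drops.
    """
    failures = 0
    while True:
        try:
            for frame in open_stream():
                failures = 0  # a delivered frame means the connection is healthy
                yield frame
            return  # the stream ended cleanly
        except ConnectionError:
            failures += 1
            if failures >= max_attempts:
                raise  # give up after too many consecutive failures
            # Back off 1 s, 2 s, 4 s, ... capped at 30 s (assumed policy).
            sleep(min(base_delay * 2 ** (failures - 1), 30.0))
```

Passing `sleep` as a parameter keeps the loop testable without real delays.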
UI/UX Improvements
- 🎨 OS Dark/Light Mode Preference: Automatically detects and honors system theme preference
- 📝 Markdown Rendering: Render formatted markdown responses from VLMs (headers, lists, code blocks, tables)
- 📋 Copy to Clipboard: One-click copy of generation results with visual feedback
- 🐳 Docker Version Picker: Interactive version selection for Docker deployments
📦 Installation
New Installation:
pip install live-vlm-webui==0.2.0
Upgrading from Previous Version:
pip install --upgrade live-vlm-webui==0.2.0
Or simply:
pip install --upgrade live-vlm-webui
Docker Users:
# Use the versioned image
./scripts/start_container.sh --version 0.2.0
# Or use latest (will be 0.2.0 after release)
./scripts/start_container.sh
📚 Documentation
- RTSP setup guide: `docs/usage/rtsp-ip-cameras.md`
- Full changelog: See CHANGELOG.md
v0.1.1 - WSL2 Support and VLM Documentation
Bug Fixes and Documentation Improvements
🐛 Fixed
- WSL2 GPU monitoring resilience: Robust error handling for intermittent NVML issues
- Automatic retry logic with graduated thresholds
- Auto-recovery when GPU access restored
- WSL2 now fully supported with complete GPU monitoring
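One way to read "retry logic with graduated thresholds" and "auto-recovery" is a counter of consecutive failures that escalates status as it grows and resets on the first success. The sketch below illustrates that reading only. The class, the threshold values, the status names, and the use of `RuntimeError` as a stand-in for an NVML error are all assumptions.

```python
class GpuMonitorHealth:
    """Tracks consecutive GPU query failures and maps them to a status."""

    def __init__(self, warn_after=3, unavailable_after=10):
        self.warn_after = warn_after
        self.unavailable_after = unavailable_after
        self.consecutive_failures = 0

    def record(self, ok):
        """Update the failure count; a success resets it (auto-recovery)."""
        if ok:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.status()

    def status(self):
        n = self.consecutive_failures
        if n == 0:
            return "ok"
        if n < self.warn_after:
            return "retrying"  # transient hiccup: stay quiet and retry
        if n < self.unavailable_after:
            return "degraded"  # persistent: worth a warning
        return "unavailable"  # stop reporting GPU stats until a query succeeds


def poll(query, health):
    """Run one GPU query; RuntimeError stands in for an NVML error here."""
    try:
        value = query()
    except RuntimeError:
        return None, health.record(False)
    return value, health.record(True)
```

A monitoring loop would call `poll` on each tick and keep serving CPU/RAM stats while the GPU status is anything other than "ok".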
📚 Added
- Comprehensive VLM documentation: Complete model catalog with 16 verified NVIDIA API models
- Corrected vision capabilities for gemma3 and llava models
- Guidance on text-only vs vision-capable models
- Troubleshooting for common model selection issues
- Windows WSL usage guide: Full setup instructions for WSL2 environments
🧪 Tested On
- ✅ Windows WSL2 (Ubuntu 22.04) with NVIDIA RTX A3000 Laptop GPU
Installation
pip install --upgrade live-vlm-webui==0.1.1
See CHANGELOG.md for complete details.
v0.1.0 - First PyPI Release
This is the first public release of Live VLM WebUI - a real-time vision language model interface with WebRTC video streaming and live GPU monitoring.
✨ Key Features
Real-time Vision Analysis
- WebRTC video streaming with live VLM analysis overlay
- Support for multiple VLM backends: Ollama, vLLM, NVIDIA API Catalog, OpenAI API
- Configurable prompts, frame intervals, and model settings
Live System Monitoring
- GPU utilization and VRAM usage with sparkline charts
- CPU and RAM monitoring
- Inference latency tracking (last, average, total count)
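The latency figures listed above (last, average, total count) can be maintained with a few running values. This is a minimal sketch with a hypothetical `LatencyTracker` class. Whether the real UI averages over all inferences or over a recent window is not stated, so an all-time average is assumed here.

```python
class LatencyTracker:
    """Keeps the last latency, a running average, and the inference count."""

    def __init__(self):
        self.last_ms = None
        self.count = 0
        self._total_ms = 0.0

    def record(self, latency_ms):
        """Register one completed inference."""
        self.last_ms = float(latency_ms)
        self.count += 1
        self._total_ms += self.last_ms

    @property
    def average_ms(self):
        """All-time mean latency, or None before the first inference."""
        return self._total_ms / self.count if self.count else None
```

Storing only a sum and a count keeps memory use constant no matter how long the stream runs.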
Multi-Platform Support
- ✅ PC with NVIDIA GPUs
- ✅ NVIDIA DGX Spark
- ✅ NVIDIA Jetson (Orin, Thor)
- ✅ Apple Silicon Macs
- ✅ CPU-only fallback
Easy Deployment
- PyPI package: `pip install live-vlm-webui`
- Docker images for all platforms
- Automatic SSL certificate generation
📦 Installation
pip install live-vlm-webui
live-vlm-webui
Then open your browser to https://localhost:8090
Docker:
docker run -d --gpus all -p 8090:8090 \
  ghcr.io/nvidia-ai-iot/live-vlm-webui:latest
🧪 Tested Platforms
- ✅ x86_64 PC (Linux Ubuntu 22.04)
- ✅ NVIDIA DGX Spark (ARM64 SBSA)
- ✅ NVIDIA Jetson AGX Thor (JetPack 7.0)
- ✅ NVIDIA Jetson AGX Orin (JetPack 6.2)
- ✅ NVIDIA Jetson Orin Nano (JetPack 6.2)
- ✅ macOS (Apple Silicon M-series)
⚠️ Known Issues
- Jetson Thor + Ollama 0.12.10: GPU inference fails on JetPack 7.0
- Workaround: Use Ollama 0.12.9 or NVIDIA API Catalog
- See troubleshooting guide
📚 Documentation
- README - Quick start and usage
- CHANGELOG - Complete list of changes
- Troubleshooting Guide - Common issues and solutions
🙏 Acknowledgments
This is the initial release developed and tested across multiple NVIDIA platforms. Feedback and contributions are welcome!
Full Changelog: https://github.com/NVIDIA-AI-IOT/live-vlm-webui/blob/main/CHANGELOG.md