Simulator Camera Integration:
- Use laptop webcam to simulate MechDog's ESP32-S3 camera
- Live camera preview with start/stop/capture controls
- Auto-floats to top-right corner (compact overlay)
- Resize option: toggle between small (200px) and large (400px)
- POST /camera/capture endpoint to receive webcam frames
- GET /camera/capture endpoint to serve the last captured frame
- Perfect for testing Nebius VLM integration without hardware!
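As a rough illustration of the two camera endpoints, the sketch below builds (but does not send) an upload and a fetch request using only the Python standard library. The form field name `image`, the file name `frame.jpg`, and the helper names are assumptions made for this example; check the simulator source for the field name it actually expects.

```python
import uuid
import urllib.request


def build_capture_upload(base_url: str, jpeg_bytes: bytes,
                         field_name: str = "image") -> urllib.request.Request:
    """Build a multipart/form-data POST for /camera/capture.

    field_name is an assumption; the simulator may expect a different name.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        'filename="frame.jpg"\r\n'
        "Content-Type: image/jpeg\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/camera/capture",
        data=head + jpeg_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def build_capture_fetch(base_url: str) -> urllib.request.Request:
    """Build a GET request for the last captured frame."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/camera/capture", method="GET"
    )
```

With the simulator running, either request can be sent with `urllib.request.urlopen(request)`.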
Camera Features:
- Browser permissions handling with troubleshooting docs
- Chrome recommended (Safari has permission issues)
- Touch support for mobile testing
- Dock/float toggle for flexible viewing
Visual Command Feedback:
- Emojis spawn and float upward when commands execute
- Action emojis: 🪑 sit, 🧍 stand, 👋 wave, 🤝 shake, 💃 dance, 🤸 balance
- Movement emojis: ⬆️ forward, ⬇️ backward, ⬅️ left, ➡️ right, 🛑 stop
- Pop-in animation (starts small, grows to full size)
- White circular background for visibility
- Fully opaque (100% visible) with smooth fade-out
- Like live chat reactions!
New Test Scripts:
- bridge/demo/test_camera.sh - Test camera capture, validate JPEG, auto-open image
- bridge/demo/test_vision.sh - Test VLM integration (Nebius/Anthropic/mock)
- bridge/demo/test_vision_advanced.sh - Advanced vision-guided navigation demo (NEW!)
- bridge/demo/README.md - Comprehensive demo guide with examples
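The JPEG validation step can be approximated by checking the file's marker bytes. This is a minimal Python sketch of that idea, not the logic of test_camera.sh itself (which is a shell script whose checks are not shown here); the function name is hypothetical.

```python
def looks_like_jpeg(data: bytes) -> bool:
    """Return True if data starts with the JPEG SOI marker (FF D8)
    and ends with the EOI marker (FF D9).

    This is a cheap sanity check, not a full decode: some valid files
    carry trailing bytes after EOI and would be rejected here.
    """
    return (
        len(data) >= 4
        and data[:2] == b"\xff\xd8"
        and data[-2:] == b"\xff\xd9"
    )
```

An HTML error page or an empty response saved as `.jpg` fails this check immediately, which covers the most common capture failures.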
Documentation Improvements:
- docs/FAQ.md - Extensive FAQ covering setup, connectivity, camera, vision, hackathon tips
- docs/TROUBLESHOOTING.md - Detailed troubleshooting for all components
- WiFi vs Bluetooth connectivity guide in docs/ESP32_API.md
- Camera troubleshooting section (Safari vs Chrome permissions)
- Updated README.md with camera features and Nebius stretch goals
Compact Status Panel:
- Single-line layout: "POSITION 300, 300" instead of two lines
- Reduced padding and font sizes (~40% smaller)
- Right-aligned values for cleaner look
- More screen space for canvas
Camera UX:
- Auto-floats on page load (no need to click)
- Default size reduced to 200px (half previous size)
- Description text hidden in floating mode
- Perfect for vision testing while watching robot
Dependencies:
- Added multer for handling camera image uploads
- Added @types/multer for TypeScript support
API Updates:
- POST /camera/capture with multipart/form-data support
- Enhanced error messages for missing camera frames
- Helpful hints for simulator camera setup
Cache Management:
- Version bumped to v=12 for reliable browser updates
Camera Testing:
# 1. Start simulator
./scripts/start.sh
# 2. Open http://localhost:3000 in Chrome
# 3. Camera auto-floats, click "Start"
# 4. Click "Capture" to take snapshot
# 5. Test capture
bridge/demo/test_camera.sh localhost:3000
# 6. Test vision (with API key)
export ANTHROPIC_API_KEY=your-key
bridge/demo/test_vision.sh localhost:3000
# 7. Test advanced vision-guided navigation (NEW!)
./scripts/test.sh demo-vision --ip localhost:3000
VLM Scene Understanding (Goal #1):
- Camera capture working ✅
- Vision script ready (bridge/vision.py) ✅
- Demo scripts for testing ✅
- Documentation complete ✅
- Just need Nebius API key!
Vision-Guided Navigation (Goal #2):
- Advanced demo script with 4 scenarios ✅
- Scene understanding → Decision → Navigation loop → Celebration ✅
- Integrated into test.sh as demo-vision subcommand ✅
- Mock mode for testing without VLM API ✅
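The scene understanding → decision → navigation → celebration loop, run in mock mode, might look roughly like the sketch below. Everything here is hypothetical: the function names, the always-"forward" mock, and the choice of "dance" as the celebration are illustrative assumptions, not the contents of the demo script.

```python
from typing import Callable, List


def mock_vlm(frame: bytes) -> str:
    """Hypothetical stand-in for a VLM call: always reports a clear path."""
    return "forward"


def navigate(capture: Callable[[], bytes],
             decide: Callable[[bytes], str],
             send: Callable[[str], None],
             max_steps: int = 3) -> List[str]:
    """Capture a frame, ask the model for a command, send it, repeat.

    Stops early if the model answers "stop", then celebrates with "dance".
    Returns the list of commands that were sent.
    """
    issued: List[str] = []
    for _ in range(max_steps):
        frame = capture()          # scene understanding input
        command = decide(frame)    # decision
        if command == "stop":
            break
        send(command)              # navigation
        issued.append(command)
    send("dance")                  # celebration
    issued.append("dance")
    return issued
```

Passing the capture, decision, and send steps in as callables is what makes mock mode cheap: swapping `mock_vlm` for a real VLM client leaves the loop unchanged.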
Test Commands:
# Basic demo
./scripts/test.sh demo --ip localhost:3000
# Advanced vision-guided navigation
./scripts/test.sh demo-vision --ip localhost:3000
Next Steps:
- Test with real MechDog at venue
- Wire OpenClaw agent with skill
- Implement Nebius VLM integration
- Add real vision feedback in navigation loop
Python Bridge
- HTTP client for MechDog ESP32 WiFi API
- Support for movement commands (forward, backward, left, right, stop)
- Support for action commands (sit, stand, shake, wave, dance, balance)
- Camera capture integration
- Clean CLI with argparse interface
- Managed with uv for modern Python dependency management
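For orientation, an argparse interface covering the movement and action commands listed above could be laid out as follows. The program name, the subcommand layout, and the `--ip` default are assumptions for illustration; the bridge's real flags may differ.

```python
import argparse

MOVES = ["forward", "backward", "left", "right", "stop"]
ACTIONS = ["sit", "stand", "shake", "wave", "dance", "balance"]


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical CLI: `--ip HOST move DIRECTION` or `action NAME`."""
    parser = argparse.ArgumentParser(prog="mechdog-bridge")
    parser.add_argument("--ip", default="localhost:3000",
                        help="host[:port] of the robot or simulator")
    sub = parser.add_subparsers(dest="command", required=True)

    move = sub.add_parser("move", help="send a movement command")
    move.add_argument("direction", choices=MOVES)

    action = sub.add_parser("action", help="run a named action")
    action.add_argument("name", choices=ACTIONS)
    return parser
```

Using `choices` lets argparse reject unknown directions or actions before any request reaches the robot.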
OpenClaw TypeScript Skill
- Three primary tools: move, action, look
- Natural language control of MechDog quadruped robot
- Environment variable configuration for IP address
- Type-safe TypeScript implementation
- Ready for OpenClaw agent integration
Visual Simulator
- Beautiful web-based interface with real-time canvas visualization
- Express + WebSocket server for live updates
- Interactive drag-and-drop for robot positioning (mouse & touch support)
- Manual control buttons for quick testing
- Status dashboard (position, rotation, battery, last command)
- Drop-in replacement for real MechDog hardware
- Perfect for development without physical robot
Interactive Physics Simulation
- Colored balls (red, blue, green) for vision testing scenarios
- Realistic physics with friction (0.98) and bounce damping (0.7)
- Drag-and-drop for both robot and balls
- Wall collision detection
- Up to 3 balls simultaneously on canvas
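A single physics tick using the friction and bounce-damping constants above can be sketched in Python as follows. The simulator itself runs in the browser, so this only illustrates the arithmetic; the `Ball` fields, the default radius, and the update order are assumptions.

```python
from dataclasses import dataclass

FRICTION = 0.98        # velocity multiplier applied every tick
BOUNCE_DAMPING = 0.7   # fraction of speed kept after hitting a wall


@dataclass
class Ball:
    x: float
    y: float
    vx: float
    vy: float
    radius: float = 10.0  # assumed radius, in canvas pixels


def step(ball: Ball, width: float, height: float) -> None:
    """Advance the ball one tick inside a width x height canvas."""
    ball.vx *= FRICTION
    ball.vy *= FRICTION
    ball.x += ball.vx
    ball.y += ball.vy

    # Wall collisions: clamp the ball inside the canvas, then reflect
    # and damp the velocity component that hit the wall.
    if ball.x - ball.radius < 0:
        ball.x = ball.radius
        ball.vx = -ball.vx * BOUNCE_DAMPING
    elif ball.x + ball.radius > width:
        ball.x = width - ball.radius
        ball.vx = -ball.vx * BOUNCE_DAMPING
    if ball.y - ball.radius < 0:
        ball.y = ball.radius
        ball.vy = -ball.vy * BOUNCE_DAMPING
    elif ball.y + ball.radius > height:
        ball.y = height - ball.radius
        ball.vy = -ball.vy * BOUNCE_DAMPING
```

With friction at 0.98, a ball loses 2% of its speed per tick, and each wall bounce removes a further 30%.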
Responsive Design
- Optimized for half-screen viewing (hackathon demo setup)
- Responsive grid layout: 4 columns (>900px), 3 columns (600-900px), 2 columns (<600px)
- Mobile-friendly touch controls
- Cache-busting for reliable updates
Scripts Organization
- setup.sh - One-command installation
- scripts/start.sh - Start simulator in background
- scripts/stop.sh - Stop simulator gracefully
- scripts/status.sh - Check simulator status
- scripts/test.sh - Comprehensive test suite with multiple modes
- scripts/build.sh - Build and setup
- scripts/lint.sh - Code quality checks
Testing Modes
- check - Health check (no hardware needed)
- bridge - Test Python bridge with hardware/simulator
- skill - TypeScript type checking
- demo - Full demo sequence
- demo-vision - Advanced vision-guided navigation demo (NEW in v0.1.1)
- all - Run all tests
mechdog-skill/
├── bridge/ # Python HTTP client
├── skills/mechdog/ # OpenClaw TypeScript skill
├── simulator/ # Visual web simulator
├── scripts/ # Build/test/management scripts
├── docs/ # Documentation
└── setup.sh # One-command setup
Simulator provides full API compatibility with MechDog ESP32:
- POST /move - Movement control
- POST /action - Action execution
- POST /position - Manual positioning (simulator-only)
- GET /status - Status query
- GET /camera/capture - Camera interface (mocked)
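Because the simulator and the robot expose the same endpoints, a client only needs to change its base URL to switch between them. The sketch below builds requests without sending them. The JSON body fields (`direction`, `duration`, `action`) are assumptions chosen to match the field names in the WebSocket update message and may not match the real API.

```python
import json
import urllib.request


def build_command(base_url: str, endpoint: str,
                  payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the given endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_move(base_url: str, direction: str,
               duration_ms: int = 2000) -> urllib.request.Request:
    """Movement command; the body shape is an assumption."""
    return build_command(base_url, "/move",
                         {"direction": direction, "duration": duration_ms})


def build_action(base_url: str, name: str) -> urllib.request.Request:
    """Action command; the body shape is an assumption."""
    return build_command(base_url, "/action", {"action": name})
```

For example, `build_move("http://localhost:3000", "forward")` targets the simulator, and passing the robot's address instead targets hardware.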
Real-time updates via WebSocket:
{
"type": "move",
"direction": "forward",
"duration": 2000,
"state": {
"position": { "x": 450, "y": 300 },
"rotation": 45,
"action": null,
"battery": 98,
"lastCommand": "move forward"
}
}
All tests passing:
- ✓ Python bridge CLI working
- ✓ TypeScript type checking passing
- ✓ Simulator server operational
- ✓ Health checks passing
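A client consuming the WebSocket updates shown earlier could decode each message into a typed record. This sketch assumes every message carries a `state` object with the fields from that sample; the `RobotState` class and `parse_update` function are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobotState:
    x: float
    y: float
    rotation: float
    action: Optional[str]
    battery: int
    last_command: str


def parse_update(raw: str) -> RobotState:
    """Extract the robot state from one WebSocket update message."""
    state = json.loads(raw)["state"]
    return RobotState(
        x=state["position"]["x"],
        y=state["position"]["y"],
        rotation=state["rotation"],
        action=state["action"],      # JSON null becomes None
        battery=state["battery"],
        last_command=state["lastCommand"],
    )
```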
Python: requests (via uv)
Node.js: express, ws, body-parser, tsx
# Setup (one-time)
./setup.sh
# Start simulator
./scripts/start.sh
# Test with Python bridge
./scripts/test.sh bridge --ip localhost:3000
# Open browser to visualize
# http://localhost:3000
- Simulator uses port 3000 (configurable via SIMULATOR_PORT)
- Ball physics are simplified (2D only)
- Camera capture returns mock data in simulator
- Vision integration with Nebius GPU
- Multi-agent coordination
- Live video stream processing
- Test with real MechDog hardware at venue