Vision Cockpit AI is an interactive driving simulator where YOU are the controller. Instead of using a keyboard or mouse, you control the car using head gestures detected by your webcam.
The system uses advanced Computer Vision (SegFormer + Visual Odometry) to analyze the road in real-time, calculating Risk, Speed, and Traffic Safety just like a real self-driving car.
- ๐ฃ๏ธ Voice Control (New!):
- "Start Driving": Activates the car.
- "Stop": Stops the car.
- "Park": Initiates auto-parking.
- "Speed Up / Slow Down": Adjusts cruising speed.
- โ Gesture Toggles:
- Show Fist: Start (Toggle ON).
- Show Palm: Stop (Toggle OFF).
- ๐ค Autonomous Brain:
- The car handles steering, lane keeping, and obstacle avoidance automatically once started.
- Safety Timeout: If no face is seen for 60s, the car auto-parks.
- ๐๏ธ Pixel-Level Semantic Segmentation:
- The Goal: Colors the entire road as "drivable", sidewalk as "non-drivable", and sky as "background" in real-time.
- Implementation: Uses SegFormer (Cityscapes) for high-resolution scene understanding.
- ๐บ๏ธ 3D Lidar Map: Projects road features onto a real-time bird's-eye view map.
- ๐ก๏ธ ADAS Safety System:
- Collision Warning: Detects cars/trucks and warns of "High Risk".
- Reasoning Layer (VLM): Displays the AI's "Inner Monologue" (e.g., "Obstacle Interaction Likely").
- Pedestrian Intent: Monitors pose to predict crossing, flashing RED borders if intent is deduced.
- Comfort Score: Rates your driving smoothness.
- ๐ป Ghost Planning (Counterfactuals):
- Visualizes "What if?" scenarios by drawing predicted trajectories (Purple Arrows) for other vehicles.
- Simulates multi-agent interaction logic.
- ๐ง Neural Radiance Fields (NeRF) Memory:
- Builds a high-fidelity 3D "Neural Map" of the environment in real-time.
- Allows the car to "remember" past locations, creating a digital twin of the chaos it drove through.
- ๏ฟฝ systematic Evaluation (Science Layer):
- Stress Test Module: Automatically injects "Worst-Case Scenarios" (Sudden Glare, Surprise Obstacles) to test system robustness.
- Safety Metrics: Tracks Disengagement Rate (Human Panic Braking) and Safety Violations in real-time.
- ๐ True Uncertainty Quantification (Doubt Layer):
- Aleatoric vs Epistemic: Distinguishes between "Noisy Data" (Rain/Glare) and "Unknown Objects" (Model Ignorance).
- High Alert Mode: Automatically switches system mode from "AUTONOMOUS" to "HIGH ALERT" when uncertainty spikes > 40%.
- ๐ Absolute Vision Localization (Memory Layer):
- Semantic SLAM: Recognizes Landmarks (Traffic Lights, Signs) to find global coordinates without GPS.
- Status Lock: HUD updates to "GPS: VISUAL LOCK" when landmarks are confirmed.
- ๐ฅ Explainable AI (XAI) Heatmaps (Transparency Layer):
- Real-Time Saliency: Overlays a "Heatmap" glowing red on obstacles that caused the AI to brake.
- Transparency: Shows why the decision was made.
- ๏ฟฝ๐ก Precision Mapping (VO Refinement):
- Uses Semantic Segmentation to "mask out" moving vehicles.
- Ensures the 3D Map only tracks static world objects (Road, Buildings), preventing map drift.
- ๐ฆ๏ธ Dynamic Weather Engine: Automatically cycles through Sunny, Rainy, Night, and Mirror modes to keep the drive interesting.
- ๐ค Autonomous Fallback (New!):
- Auto-Pilot: If the driver's face is not detected, the system automatically takes control.
- Lane Keeping: Follows the center of the road using semantic segmentation.
- Safe Stopping: Slows down or stops if obstacles are detected, ensuring safety even without human input.
- OS: Windows 10/11, Linux, or macOS.
- Python: Version 3.8 or higher.
- Hardware:
- Webcam (Required for controls).
- Decent CPU (GPU recommended but optional).
Download this project folder to your local computer.
It is recommended to use a virtual environment. Open your terminal/command prompt in the project folder:
# Create virtual environment
python -m venv .venv
# Activate it (Windows)
.venv\Scripts\Activate.ps1
# Activate it (Mac/Linux)
source .venv/bin/activatepip install -r requirements.txt(Note: This creates the .venv folder and installs libraries like torch, opencv-python, and transformers.)
You have 3 ways to use the simulator:
This uses the included sample videos to demonstrate the AI capabilities.
cd src
python run_cockpit.py- Select [1] for Rainy Loop.
- Select [2] for Sunny Highway.
- Select [6] to Browse for your own video.
Want to simulate your own dashcam video? Just drag the file onto the terminal!
cd src
python demo_dual_stream.py "C:\Path\To\Your\Video.mp4"- The AI will automatically resize your video to fit its neural network.
Runs the default embedded video with all features active.
cd src
python demo_dual_stream.pyOnce the simulator starts:
- Sit comfortably in front of your webcam.
- Ensure your face is detected (Green box around face).
- To Start (Go): Show an OPEN PALM (Hand up, fingers spread). The car will speed up.
- To Stop: Show a FIST (Closed hand). The car will brake.
- To Turn: Lean your head Left or Right.
- Autonomous Fallback: If you hide your face or move out of frame, the AI takes over automatically.
HUD Indicators:
- RISK: High/Low (Based on traffic ahead).
- TTC: Time-To-Collision (Seconds).
- COMFORT: 0-100% (Don't drive jerkily!).
- ๐ฃ Pink/Purple Arrows: Ghost Planning. Shows where other cars might go.
- ๐ด Red Flashing Border: Pedestrian Intent. Warns that a person might cross.
- ๐ฅ Glowing Red Heatmap: XAI Saliency. The specific object causing the AI to brake.
- ๐ข Green/Purple Road: Semantic Brain. The AI "painting" the drivable area.
- ๐ต Blue/Red Point Cloud: NeRF Memory. The 3D "Digital Twin" of the road.
| Message / Keyword | Meaning | Layer |
|---|---|---|
| "GPS: VISUAL LOCK" | The AI recognized a Traffic Sign/Light and fixed its position without Satellites. | Memory |
| "GPS: DENIED" | No satellites or landmarks found; relying on Visual Odometry estimation. | Memory |
| "XAI: ATTENTION BLOCKED" | The car is stopping because of the object highlighted in RED. | Transparency |
| "STRESS TEST: GLARE" | The system is simulating a Blind Sun Glare event to test robustness. | Science |
| "STRESS TEST: SURPRISE" | A red block was injected to test reaction time. | Science |
| "UNCERTAINTY: High %" | The AI is "confused" (due to rain/glare/unknown objects). | Doubt |
| "MODE: HIGH ALERT" | The AI switched to Caution Mode because Uncertainty > 40%. | Doubt |
| "AI THOUGHT: ..." | The "Inner Monologue" explaining logic (e.g., "Obstacle Ahead -> Braking"). | Reasoning |
| "DISENGAGEMENTS: X" | Count of times you had to Panic Brake (Human Takeover). | Science |
- Model:
nvidia/segformer-b0-finetuned-cityscapes - Function: Breaks the image into semantic classes (Road, Car, Sky).
- Training: Trained on Cityscapes (German cities), but generalizes to any video.
- Algorithm: FAST Feature Detector + Lucas-Kanade Optical Flow.
- Function: Tracks ground pixels frame-to-frame to estimate specific speed and trajectory without GPS.
- Algorithm: Haar Cascades.
- Function: Calculates head pitch/yaw and Eye Aspect Ratio (EAR) to detect sleepiness.
- "Camera Not Found": Ensure no other app (Zoom/Teams) is using your webcam.
- "RuntimeError: Mat1 and Mat2 shapes...": This is fixed! The system now auto-resizes videos.
- Slow Performance: This is a heavy AI application. If it lags, try resizing your video to a smaller resolution (e.g., 640x480) before loading it.
- UI Cutoff/Overlap: The new dashboard is designed for 1600x900. Ensure your monitor resolution is high enough.
- Hardware Integration: Connect a Logitech G29 Racing Wheel for realistic steering.
- Lane Keeping Assist (LKA): Add logic to automatically steer the car to the center of the lane.
- Voice Control: Use speech recognition to change weather modes ("Alexa, make it sunny").
GitHub Repository: https://github.com/vishalsnghkush/Hack-4-Viksit-Bharat
- AI Model: HuggingFace & NVIDIA (SegFormer).
- Datasets: Cityscapes Dataset (Cordts et al.).
- Created For: Interactive Vision Speed Smoothing & Safety Project.
- Team / Collaborators:
- Aditi Malviya
- Ashi Dubey
- Anshika