
Edge Vision Detection System

Edge Vision is an offline-first computer-vision pipeline built for Raspberry Pi 5–class hardware. It captures frames from a live camera or video file, runs Ultralytics detectors (YOLOv8 or RT-DETR), and emits structured JSON to stdout. Optional GPIO alerts, frame capture, Docker packaging, CI, and regression tests are included so you can treat it like a production-ready service.


1. Features at a Glance

  • Dual detection backends: Ultralytics YOLOv8 (default) and RT-DETR for transformer-based accuracy
  • Live camera or file-based sources, with an automatic fallback to a demo clip
  • JSON logs enriched with confidence scores, per-frame video timestamp, and first-seen timing
  • Optional annotated frame export and GPIO LED alerts
  • Makefile automation, Docker image, GitHub Actions CI/CD, and pytest suite
  • Sample-video workflow for laptop development (5s/10s/20s/30s curated clips)

2. Directory Tour

  • src/edge_vision/ – detection pipelines, CLI, configuration, and utilities
  • config/default.yaml – all runtime settings (camera, model paths, logging, alerts)
  • requirements*.txt – runtime and dev dependencies
  • scripts/ – helper utilities (load_weights.py, generate_sample_clips.py)
  • data/ – generated samples (data/samples/) and annotated frames
  • models/ – expected location for Ultralytics weight files (yolov8n.pt, rtdetr-l.pt)
  • tests/ – pytest coverage for config, CLI, logging, and inference utilities

3. Prerequisites

  • Python 3.11+ with venv
  • curl (or equivalent) for downloading sample videos/weights
  • make (optional but recommended)
  • GPU not required; CPU inference is fine for the provided clips
  • (Optional) gpiozero and physical LED if you intend to exercise GPIO alerts on Raspberry Pi

4. Environment Setup

4.1 Quick setup with Makefile (recommended)

make setup          # create .venv/ and install runtime + dev dependencies
make demo-videos    # download raw sources (~180 MB) and build curated clips

This populates:

  • .venv/ – isolated Python environment
  • data/source_videos/ – Intel and Big Buck Bunny source footage
  • data/samples/ – 5s/10s/20s/30s trimmed clips tailored to the detection tasks

4.2 Manual setup (if you prefer explicit commands)

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
pip install -r requirements-dev.txt

mkdir -p models data/annotated
curl -L -o models/yolov8n.pt https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8n.pt
curl -L -o models/rtdetr-l.pt https://github.com/ultralytics/assets/releases/download/v8.1.0/rtdetr-l.pt

make demo-videos   # optional but handy on laptops/WSL

5. Running the Pipelines

All commands below assume the virtualenv is active (source .venv/bin/activate). When launching from a cloned repo, remember to expose the source package:

export PYTHONPATH=src

5.1 YOLOv8 (default backend)

# laptop/WSL with a demo clip
python -m edge_vision --headless --no-alerts --video data/samples/humans-5s.mp4

# machine with a real webcam (GUI window will open)
python -m edge_vision --no-alerts

5.2 RT-DETR backend

# use a clip
python -m edge_vision --backend rtdetr --headless --no-alerts --video data/samples/humans-5s.mp4

# use the camera with on-screen display
python -m edge_vision --backend rtdetr --no-alerts

5.3 Common CLI switches

  • --config PATH – use an alternate YAML config
  • --video PATH – override the video/camera source
  • --headless – disable any GUI windows (typical for servers or WSL)
  • --no-alerts – suppress GPIO outputs even if enabled in config
  • --log-level DEBUG – bump verbosity
  • --backend {yolov8,rtdetr} – choose model family at runtime
  • --camera-backend {opencv,pi} – select OpenCV capture or Picamera2 (Pi only)
  • --save-annotated – persist annotated frames to system.annotated_output_dir
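
These switches compose freely. For example, to run the RT-DETR backend headless over a sample clip while saving annotated frames and raising verbosity:

python -m edge_vision --backend rtdetr --headless --no-alerts --save-annotated --log-level DEBUG --video data/samples/humans-10s.mp4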

5.4 JSON log schema

Each detection frame prints a line similar to:

{
  "timestamp": "2025-10-30T06:55:37.562858+00:00",
  "objects_detected": ["person"],
  "count": 1,
  "confidence": [0.902],
  "video_time_sec": 4.0,
  "first_seen_video_sec": [2.5]
}
  • video_time_sec – playback position within the video/camera stream, in seconds
  • first_seen_video_sec – when each class was first observed in the clip
  • Fields are omitted when timestamps are unavailable (e.g., headless camera drivers without POS_MSEC support)
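
Because the records land on stdout, downstream tools can consume them directly. Below is a minimal consumer sketch in Python; it assumes logging.json_indent is configured so each record occupies a single line, and reuses the invocation from section 5.1:

import json
import os
import subprocess
import sys

# Run the detector headless and parse each JSON record it prints to stdout.
env = {**os.environ, "PYTHONPATH": "src"}
proc = subprocess.Popen(
    [sys.executable, "-m", "edge_vision", "--headless", "--no-alerts",
     "--video", "data/samples/humans-5s.mp4"],
    stdout=subprocess.PIPE, text=True, env=env,
)
for line in proc.stdout:
    line = line.strip()
    if not line.startswith("{"):
        continue  # skip any non-JSON log output
    record = json.loads(line)
    # video_time_sec may be absent (see the notes above)
    when = record.get("video_time_sec", "n/a")
    print(f"t={when}s count={record['count']} objects={record['objects_detected']}")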

5.5 Raspberry Pi 5 Camera Pipeline

Running on a Pi 5 with the official camera uses the dedicated Picamera2 pipeline:

  1. Install Picamera2 (Raspberry Pi OS Bullseye or newer):

    sudo apt update
    sudo apt install -y python3-picamera2
  2. Either update config/default.yaml:

    system:
      camera_backend: pi

    or launch with --camera-backend pi.

  3. Run YOLOv8 on the Pi camera:

    python -m edge_vision --camera-backend pi --no-alerts

    RT-DETR works the same way:

    python -m edge_vision --backend rtdetr --camera-backend pi --no-alerts

The Pi pipeline streams frames via Picamera2, still prints JSON logs with timings, and honours display/save/alert options. When the Pi camera is unavailable the application raises a clear error.
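
For reference, pulling frames from Picamera2 looks roughly like this (a minimal sketch, not the repo's actual capture code; the 640x480 size is illustrative):

from picamera2 import Picamera2

picam2 = Picamera2()
# Configure a video stream at a modest resolution, then start capturing.
picam2.configure(picam2.create_video_configuration(main={"size": (640, 480)}))
picam2.start()
try:
    frame = picam2.capture_array()  # NumPy array, ready for Ultralytics inference
finally:
    picam2.stop()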


6. Configuration Reference (config/default.yaml)

Key sections:

  • system
    • camera_index, video_source, demo_video, camera_backend
    • frame_width, frame_height
    • display, headless, save_annotated, annotated_output_dir
  • detection
    • backend (yolov8 or rtdetr)
    • model_path, yolov8_model_path, rtdetr_model_path
    • targets (labels to keep), confidence_threshold
  • logging
    • level, json_indent, utc
  • alerts
    • enabled, gpio_pin, active_high, flash_duration_ms
  • profiling (flag placeholder for future FPS logging)

All relative paths are resolved against the config file's location. Override any value via CLI flags, an alternate YAML file, or by editing this file directly.
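
As a concrete starting point, an override file might look like the following (values are illustrative, not the shipped defaults; depending on how the loader merges settings, you may prefer to copy config/default.yaml and edit it rather than supply a partial file):

system:
  video_source: data/samples/humans-5s.mp4
  camera_backend: opencv
  frame_width: 640
  frame_height: 480
  headless: true
  save_annotated: true
  annotated_output_dir: data/annotated
detection:
  backend: rtdetr
  rtdetr_model_path: models/rtdetr-l.pt
  targets: [person, bicycle, car]
  confidence_threshold: 0.5
logging:
  level: INFO
  utc: true
alerts:
  enabled: false

Launch it with python -m edge_vision --config my-config.yaml.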


7. Sample Media Workflow

  1. make demo-videos downloads:
    • face-demographics-walking.mp4 (people)
    • person-bicycle-car-detection.mp4 (mixed street)
    • BigBuckBunny.mp4 (animals)
  2. scripts/generate_sample_clips.py trims them to quick-to-run clips:
    • data/samples/humans-5s.mp4
    • data/samples/humans-10s.mp4
    • data/samples/humans-vehicles-20s.mp4
    • data/samples/animals-30s.mp4

These clips are ideal for headless testing on laptops/WSL where /dev/video0 might not exist. Update or extend the script if you want other durations or scenarios.
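
For a one-off clip at a custom duration, a stream-copy trim with ffmpeg is the quickest route (assuming ffmpeg is installed; the repo's script may use a different approach):

# cut a 15-second clip from the start of the people footage without re-encoding
ffmpeg -ss 0 -t 15 -i data/source_videos/face-demographics-walking.mp4 -c copy data/samples/humans-15s.mp4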


8. Tooling & Development

  • Tests: PYTHONPATH=src python -m pytest -q
  • Linting: make lint (Black + Flake8)
  • Full automation: make test
  • Docker image: make build then make docker-run (mounts /dev/video0, uses --headless by default)
  • GitHub Actions: CI runs formatting and tests; CD builds/pushes a Docker image to GHCR on main

Scripts worth noting:

  • scripts/load_weights.py – quick sanity check that the configured backend weights can be loaded (reads detection.backend)
  • scripts/generate_sample_clips.py – reproducible sample clip creation
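
Both run from the repo root with the package on the path. Assuming neither requires arguments (check each script's --help if in doubt):

PYTHONPATH=src python scripts/load_weights.py           # confirm the configured backend's weights load
PYTHONPATH=src python scripts/generate_sample_clips.py  # rebuild data/samples/ from the downloaded sources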

9. Troubleshooting Tips

  • “No module named edge_vision”: ensure PYTHONPATH=src or install the project as a package (pip install -e .)
  • Camera index errors: adjust system.camera_index, or use --video path/to/file.mp4
  • Missing Ultralytics modules: re-run pip install -r requirements.txt inside the virtualenv
  • GPIO warnings: expected on laptops; disable via --no-alerts
  • Timestamp missing: some drivers don’t expose CAP_PROP_POS_MSEC. Logs will omit video_time_sec in that case

10. Next Steps & Customization

  • Swap targets in the config to include additional classes
  • Point model_path to specialized YOLO weights (e.g., custom-trained models)
  • Extend the JSON payload (e.g., bounding boxes) within run_inference and emit_json_log (see the sketch after this list)
  • Add new backends by following the RT-DETR pattern (src/edge_vision/rtdetr_app.py)
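
As a sketch of the bounding-box extension mentioned above (hypothetical helper; the actual run_inference and emit_json_log signatures in the repo may differ):

# Hypothetical helper: collect boxes from an Ultralytics result into a
# JSON-ready payload; the real pipeline's function signatures may differ.
def build_payload(result, names, threshold=0.5):
    payload = {"objects_detected": [], "confidence": [], "boxes_xyxy": []}
    for box in result.boxes:
        conf = float(box.conf)
        if conf < threshold:
            continue
        payload["objects_detected"].append(names[int(box.cls)])
        payload["confidence"].append(round(conf, 3))
        payload["boxes_xyxy"].append([round(v, 1) for v in box.xyxy[0].tolist()])
    payload["count"] = len(payload["objects_detected"])
    return payload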

Happy hacking! If you notice missing assets or want to automate more, the Makefile and scripts directory are the best places to extend.
