Pedestrian Detection and Tracking with YOLO

A comprehensive system for detecting pedestrians in video footage and mapping their movement patterns using YOLO object detection and homography transformation.

Overview

This project combines computer vision techniques to:

  1. Extract frames from video footage
  2. Create spatial transformations using homography mapping
  3. Detect pedestrians using YOLO (You Only Look Once)
  4. Transform detection coordinates to top-down view
  5. Generate heatmaps showing pedestrian movement patterns

Features

  • Modular Architecture: Clean separation of concerns with dedicated modules for each task
  • Configurable Parameters: All settings managed through YAML configuration
  • YOLO Integration: Uses state-of-the-art YOLOv8 for accurate pedestrian detection
  • Homography Transformation: Maps camera perspective to satellite/top-down view
  • Outlier Detection: Statistical filtering of erroneous detections
  • Rich Visualizations: Scatter plots, heatmaps, and transformed views

Project Structure

pedestrian-detection-tracking-YOLO/
├── config/
│   └── config.yaml              # Configuration parameters
├── src/
│   ├── __init__.py
│   ├── utils.py                 # Utility functions
│   ├── video_processor.py       # Video frame extraction
│   ├── homography_mapper.py     # Homography transformation setup
│   └── detector.py              # YOLO detection and tracking
├── data/
│   ├── input/
│   │   ├── videos/              # Input video files
│   │   └── images/              # Reference images (video frame, satellite)
│   ├── output/
│   │   └── images/              # Output visualizations
│   ├── frames/                  # Extracted video frames
│   └── detections/              # Detection results (CSV)
├── models/                      # YOLO model weights
├── tests/                       # Unit tests
├── docs/                        # Additional documentation
├── README.md
└── requirements.txt

Installation

Prerequisites

  • Python 3.8 or higher
  • CUDA-compatible GPU (optional, for faster processing)

Setup

  1. Clone the repository:
git clone https://github.com/yourusername/pedestrian-detection-tracking-YOLO.git
cd pedestrian-detection-tracking-YOLO
  2. Install dependencies:
pip install -r requirements.txt
  3. Download YOLO weights:
# YOLOv8 models will be automatically downloaded on first run
# Or manually download and place in models/ directory

Usage

Step 1: Extract Frames from Video

Extract frames from your video file at specified intervals:

from src.video_processor import VideoFrameExtractor

extractor = VideoFrameExtractor()
extractor.extract_frames(
    video_path="data/input/videos/your_video.mp4",
    output_folder="data/frames",
    interval_sec=1  # Extract 1 frame per second
)

Or run directly:

python -m src.video_processor

Step 2: Create Homography Transformation Matrix

Set up the transformation between camera view and top-down view:

from src.homography_mapper import HomographyMapper

mapper = HomographyMapper()
matrix = mapper.create_homography_matrix(
    source_image_path="data/input/images/video_frame.jpg",
    dest_image_path="data/input/images/satellite_view.jpg"
)
mapper.save_matrix("data/homography_matrix.npy")

Or run interactively:

python -m src.homography_mapper

Instructions:

  • Click on 4 corresponding points in the source image (video frame)
  • Click on 4 matching points in the destination image (satellite/top-down view)
  • The transformation matrix will be calculated and saved

Step 3: Detect and Map Pedestrians

Run the complete detection and tracking pipeline:

import numpy as np
from src.detector import PedestrianTrackingPipeline

# Load homography matrix
matrix = np.load("data/homography_matrix.npy")

# Run pipeline
pipeline = PedestrianTrackingPipeline(matrix)
results = pipeline.run(images_folder="data/frames")

Or run directly:

python -m src.detector

Complete Workflow

# 1. Extract frames
python -m src.video_processor

# 2. Create homography transformation (interactive)
python -m src.homography_mapper

# 3. Run detection and tracking
python -m src.detector

Configuration

All parameters can be configured in config/config.yaml:

# Example configuration
paths:
  input_video: "data/input/videos/sample_video.mp4"
  yolo_weights: "models/yolov8l.pt"

video_processing:
  frame_interval_sec: 1

detection:
  model_size: "yolov8l"
  confidence_threshold: 0.5

visualization:
  heatmap:
    bins_x: 20
    bins_y: 35

Output

The pipeline generates several outputs:

  1. Extracted Frames: data/frames/frame_XXXX.png
  2. Homography Visualizations:
    • data/output/images/source_image_with_points.jpg
    • data/output/images/destination_image_with_points.jpg
    • data/output/images/transformed_image.jpg
  3. Detection Data: data/detections/detections_data.csv
  4. Visualizations:
    • data/output/images/center_points_original.jpg
    • data/output/images/transformed_points_scatter.jpg
    • data/output/images/pedestrian_heatmap.jpg
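At its core, the pedestrian heatmap is a 2D histogram of the transformed foot positions. A rough sketch with NumPy, using the bin counts from the example configuration and synthetic coordinates (in the project the points would come from `data/detections/detections_data.csv`):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic transformed (x, y) foot positions, for illustration only.
points = rng.uniform(low=[0, 0], high=[400, 700], size=(500, 2))

# bins=(20, 35) mirrors bins_x/bins_y in config.yaml
heat, x_edges, y_edges = np.histogram2d(points[:, 0], points[:, 1], bins=(20, 35))
# heat[i, j] counts detections falling in x-bin i and y-bin j;
# rendering it (e.g. with matplotlib's imshow) yields the heatmap image.
```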

Sample Data Structure

To use this project, prepare your data as follows:

data/
├── input/
│   ├── videos/
│   │   └── your_video.mp4          # Your video file
│   └── images/
│       ├── video_frame.jpg         # A sample frame from your video
│       └── satellite_view.jpg      # Top-down/satellite view of the area

Algorithm Details

Homography Transformation

Homography is a projective transformation that maps points from one plane to another. In this project:

  • Source: Camera perspective view (angled ~15 degrees)
  • Destination: Top-down satellite view
  • Method: Point correspondence (4+ points)

Pedestrian Detection

  • Model: YOLOv8 (various sizes available: n, s, m, l, x)
  • Target Class: Person (class 0 in COCO dataset)
  • Detection Point: Bottom center of bounding box (foot position)
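The "bottom center" foot point follows directly from the box coordinates. Assuming xyxy-format boxes (as YOLOv8 reports them), a sketch:

```python
def foot_point(box):
    """Bottom-center of an (x1, y1, x2, y2) bounding box -- the
    approximate ground-contact point fed into the homography mapping."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, y2)

# e.g. a person box spanning x=100..140, y=50..200
print(foot_point((100, 50, 140, 200)))  # -> (120.0, 200)
```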

Outlier Removal

Uses Z-score method to remove erroneous detections:

  • Calculate Z-scores for transformed coordinates
  • Remove points with Z-score > threshold (default: 3)
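A sketch of the Z-score filter with NumPy, using the default threshold of 3 (the function name is illustrative, not the module's actual API):

```python
import numpy as np

def remove_outliers(points, z_thresh=3.0):
    """Keep (x, y) points whose x and y Z-scores both fall within z_thresh."""
    points = np.asarray(points, dtype=float)
    z = np.abs((points - points.mean(axis=0)) / points.std(axis=0))
    return points[(z <= z_thresh).all(axis=1)]
```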

Advanced Usage

Custom Configuration

Create a custom configuration file:

from src.utils import load_config
from src.detector import PedestrianTrackingPipeline

config = load_config("path/to/custom_config.yaml")
pipeline = PedestrianTrackingPipeline(matrix, config=config)

Individual Components

Use components separately:

from src.detector import PedestrianDetector, HomographyTransformer, PedestrianVisualizer

# Detect only
detector = PedestrianDetector()
detections = detector.detect_batch("data/frames")

# Transform only
transformer = HomographyTransformer(matrix)
transformed = transformer.transform_points(detections)

# Visualize only
visualizer = PedestrianVisualizer()
visualizer.plot_heatmap(transformed)

Troubleshooting

Common Issues

  1. "Cannot load YOLO model"

    • Ensure ultralytics is installed: pip install ultralytics
    • Model weights will download automatically on first run
  2. "CUDA out of memory"

    • Use smaller YOLO model (yolov8n or yolov8s)
    • Set device: "cpu" in config.yaml
  3. "No homography matrix found"

    • Run Step 2 (homography_mapper.py) before Step 3
  4. Poor transformation quality

    • Select more accurate correspondence points
    • Ensure points are spread across the image
    • Use distinctive landmarks for point selection

Performance Tips

  • Use GPU acceleration (CUDA) for faster inference
  • Adjust frame extraction interval based on video content
  • Use appropriate YOLO model size (larger = more accurate but slower)
  • Process frames in batches for better efficiency

Citation

If you use this code in your research, please cite:

@software{pedestrian_detection_tracking,
  title={Pedestrian Detection and Tracking with YOLO},
  author={Your Name},
  year={2024},
  url={https://github.com/yourusername/pedestrian-detection-tracking-YOLO}
}

License

This project is provided as-is for archival purposes.

Acknowledgments

  • Ultralytics YOLO - YOLOv8 implementation
  • OpenCV - Computer vision operations
  • COCO Dataset - Pre-trained detection models

Contact

For questions or issues, please open an issue on GitHub.


Note: This is an archived repository. The code has been refactored and organized for clarity and maintainability.
