# Pedestrian Detection and Tracking with YOLO

A comprehensive system for detecting pedestrians in video footage and mapping their movement patterns using YOLO object detection and homography transformation.
This project combines computer vision techniques to:
- Extract frames from video footage
- Create spatial transformations using homography mapping
- Detect pedestrians using YOLO (You Only Look Once)
- Transform detection coordinates to top-down view
- Generate heatmaps showing pedestrian movement patterns
## Features

- Modular Architecture: Clean separation of concerns, with a dedicated module for each task
- Configurable Parameters: All settings managed through a YAML configuration file
- YOLO Integration: Uses state-of-the-art YOLOv8 for accurate pedestrian detection
- Homography Transformation: Maps the camera perspective to a satellite/top-down view
- Outlier Detection: Statistical filtering of erroneous detections
- Rich Visualizations: Scatter plots, heatmaps, and transformed views
## Project Structure

```
pedestrian-detection-tracking-YOLO/
├── config/
│   └── config.yaml            # Configuration parameters
├── src/
│   ├── __init__.py
│   ├── utils.py               # Utility functions
│   ├── video_processor.py     # Video frame extraction
│   ├── homography_mapper.py   # Homography transformation setup
│   └── detector.py            # YOLO detection and tracking
├── data/
│   ├── input/
│   │   ├── videos/            # Input video files
│   │   └── images/            # Reference images (video frame, satellite)
│   ├── output/
│   │   └── images/            # Output visualizations
│   ├── frames/                # Extracted video frames
│   └── detections/            # Detection results (CSV)
├── models/                    # YOLO model weights
├── tests/                     # Unit tests
├── docs/                      # Additional documentation
├── README.md
└── requirements.txt
```
## Requirements

- Python 3.8 or higher
- CUDA-compatible GPU (optional, for faster processing)
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/pedestrian-detection-tracking-YOLO.git
   cd pedestrian-detection-tracking-YOLO
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Download YOLO weights: YOLOv8 models are downloaded automatically on first run. Alternatively, download them manually and place them in the models/ directory.

## Usage

### Step 1: Extract Frames

Extract frames from your video file at specified intervals:
```python
from src.video_processor import VideoFrameExtractor

extractor = VideoFrameExtractor()
extractor.extract_frames(
    video_path="data/input/videos/your_video.mp4",
    output_folder="data/frames",
    interval_sec=1  # Extract one frame per second
)
```

Or run directly:

```bash
python -m src.video_processor
```

### Step 2: Create the Homography Mapping

Set up the transformation between the camera view and the top-down view:
```python
from src.homography_mapper import HomographyMapper

mapper = HomographyMapper()
matrix = mapper.create_homography_matrix(
    source_image_path="data/input/images/video_frame.jpg",
    dest_image_path="data/input/images/satellite_view.jpg"
)
mapper.save_matrix("data/homography_matrix.npy")
```

Or run interactively:

```bash
python -m src.homography_mapper
```

Instructions:
- Click on 4 corresponding points in the source image (video frame)
- Click on 4 matching points in the destination image (satellite/top-down view)
- The transformation matrix will be calculated and saved
### Step 3: Run Detection and Tracking

Run the complete detection and tracking pipeline:
```python
import numpy as np
from src.detector import PedestrianTrackingPipeline

# Load homography matrix
matrix = np.load("data/homography_matrix.npy")

# Run pipeline
pipeline = PedestrianTrackingPipeline(matrix)
results = pipeline.run(images_folder="data/frames")
```

Or run directly:

```bash
python -m src.detector
```

To run the full pipeline from the command line:

```bash
# 1. Extract frames
python -m src.video_processor

# 2. Create homography transformation (interactive)
python -m src.homography_mapper

# 3. Run detection and tracking
python -m src.detector
```

## Configuration

All parameters can be configured in config/config.yaml:
```yaml
# Example configuration
paths:
  input_video: "data/input/videos/sample_video.mp4"
  yolo_weights: "models/yolov8l.pt"

video_processing:
  frame_interval_sec: 1

detection:
  model_size: "yolov8l"
  confidence_threshold: 0.5

visualization:
  heatmap:
    bins_x: 20
    bins_y: 35
```

## Output

The pipeline generates several outputs:
- Extracted frames: `data/frames/frame_XXXX.png`
- Homography visualizations:
  - `data/output/images/source_image_with_points.jpg`
  - `data/output/images/destination_image_with_points.jpg`
  - `data/output/images/transformed_image.jpg`
- Detection data: `data/detections/detections_data.csv`
- Visualizations:
  - `data/output/images/center_points_original.jpg`
  - `data/output/images/transformed_points_scatter.jpg`
  - `data/output/images/pedestrian_heatmap.jpg`
## Data Preparation

To use this project, prepare your data as follows:

```
data/
├── input/
│   ├── videos/
│   │   └── your_video.mp4       # Your video file
│   └── images/
│       ├── video_frame.jpg      # A sample frame from your video
│       └── satellite_view.jpg   # Top-down/satellite view of the area
```
## How It Works

### Homography Transformation

Homography is a projective transformation that maps points from one plane to another. In this project:

- Source: Camera perspective view (angled ~15 degrees)
- Destination: Top-down satellite view
- Method: Point correspondence (4+ points)
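As a sketch of the underlying math (plain NumPy, not the project's actual code, which relies on OpenCV), the four-point case reduces to solving an 8×8 linear system for the homography entries, with the bottom-right entry fixed to 1:

```python
import numpy as np

def homography_from_points(src, dst):
    """Solve for the 3x3 homography mapping four source points to four
    destination points, via the direct linear system with h33 = 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), similarly for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def transform_point(H, pt):
    """Apply H to a 2D point, with the perspective division."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w

# Map a unit square onto a skewed quadrilateral
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(0, 0), (2, 0), (2.5, 1.5), (0.5, 1.0)]
H = homography_from_points(src, dst)
print(transform_point(H, (1, 1)))  # ≈ (2.5, 1.5)
```

With more than four correspondences the system becomes over-determined and is solved in a least-squares sense, which is what OpenCV's `cv2.findHomography` does.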
### Pedestrian Detection

- Model: YOLOv8 (various sizes available: n, s, m, l, x)
- Target Class: Person (class 0 in the COCO dataset)
- Detection Point: Bottom center of the bounding box (foot position)
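The foot-position convention can be sketched as follows (a hypothetical helper, not the project's API): given an `(x1, y1, x2, y2)` bounding box, the point mapped to the top-down view is the bottom center, since that is approximately where the pedestrian touches the ground plane.

```python
def foot_point(box):
    """Bottom center of an (x1, y1, x2, y2) bounding box --
    the approximate ground-contact point used for mapping."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, y2)

print(foot_point((100, 50, 140, 210)))  # -> (120.0, 210)
```

Using the box center instead would project a point roughly at torso height, which lands in the wrong place once transformed onto the ground plane.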
### Outlier Filtering

The pipeline uses the Z-score method to remove erroneous detections:

- Calculate Z-scores for the transformed coordinates
- Remove points with a Z-score above the threshold (default: 3)
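As a plain-NumPy sketch (the project's actual implementation may differ), the filter amounts to standardizing each coordinate and keeping only points whose x and y Z-scores are both below the threshold:

```python
import numpy as np

def filter_outliers(points, threshold=3.0):
    """Drop points whose x or y Z-score exceeds the threshold."""
    pts = np.asarray(points, dtype=float)
    z = np.abs((pts - pts.mean(axis=0)) / pts.std(axis=0))
    return pts[(z < threshold).all(axis=1)]

# 20 clustered points plus one far-away false detection
points = [(10 + i % 3, 10 + i % 5) for i in range(20)] + [(500, 500)]
print(len(filter_outliers(points)))  # -> 20 (the outlier is removed)
```

Note that the Z-score method needs a reasonable number of points to work: with very few samples, a single outlier inflates the standard deviation enough that its own Z-score stays below 3.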
## Advanced Usage

Use a custom configuration file:

```python
from src.utils import load_config
from src.detector import PedestrianTrackingPipeline

config = load_config("path/to/custom_config.yaml")
pipeline = PedestrianTrackingPipeline(matrix, config=config)
```

Use components separately:
```python
from src.detector import PedestrianDetector, HomographyTransformer, PedestrianVisualizer

# Detect only
detector = PedestrianDetector()
detections = detector.detect_batch("data/frames")

# Transform only
transformer = HomographyTransformer(matrix)
transformed = transformer.transform_points(detections)

# Visualize only
visualizer = PedestrianVisualizer()
visualizer.plot_heatmap(transformed)
```

## Troubleshooting
- "Cannot load YOLO model"
  - Ensure ultralytics is installed: `pip install ultralytics`
  - Model weights will download automatically on first run
- "CUDA out of memory"
  - Use a smaller YOLO model (yolov8n or yolov8s)
  - Set `device: "cpu"` in config.yaml
- "No homography matrix found"
  - Run Step 2 (homography_mapper.py) before Step 3
- Poor transformation quality
  - Select more accurate correspondence points
  - Ensure points are spread across the image
  - Use distinctive landmarks for point selection
## Performance Tips

- Use GPU acceleration (CUDA) for faster inference
- Adjust the frame extraction interval based on video content
- Choose an appropriate YOLO model size (larger models are more accurate but slower)
- Process frames in batches for better efficiency
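The batching tip can be as simple as chunking the frame list before handing it to the detector (a hypothetical helper, not part of the project's API):

```python
def batched(items, batch_size):
    """Yield successive fixed-size chunks from a list of frame paths."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

frames = [f"frame_{i:04d}.png" for i in range(10)]
for batch in batched(frames, 4):
    print(len(batch))  # -> 4, 4, 2
```

Feeding the model several frames at a time amortizes per-call overhead and keeps the GPU busier than one-frame-at-a-time inference.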
## Citation

If you use this code in your research, please cite:

```bibtex
@software{pedestrian_detection_tracking,
  title={Pedestrian Detection and Tracking with YOLO},
  author={Your Name},
  year={2024},
  url={https://github.com/yourusername/pedestrian-detection-tracking-YOLO}
}
```

## License

This project is provided as-is for archival purposes.
## Acknowledgments

- Ultralytics YOLO - YOLOv8 implementation
- OpenCV - Computer vision operations
- COCO Dataset - Pre-trained detection models
For questions or issues, please open an issue on GitHub.
Note: This is an archived repository. The code has been refactored and organized for clarity and maintainability.