This project implements a lightweight yet fully functional closed-loop system for autonomous driving perception and planning. It utilizes a monocular camera video stream, employs YOLOv8 for real-time object detection, combines Inverse Perspective Mapping (IPM) to map visual information to the vehicle coordinate system, and finally uses the Frenet Lattice Planner algorithm to plan smooth and safe obstacle avoidance trajectories in dynamic environments.
(Note: real-time planning during execution; the red curve is the dynamic obstacle-avoidance trajectory generated by the planner.)
- End-to-End Closed Loop: Implements a complete autonomous driving software stack from "Video Pixel Input" to "Control Trajectory Output".
- Deep Learning Perception: Integrates the YOLOv8 model, optimized for CPU inference, allowing smooth operation without expensive GPUs.
- Visual-Spatial Mapping: Utilizes Inverse Perspective Mapping (IPM) based on monocular vision to map 2D image detection boxes to 3D Bird's Eye View (BEV) coordinates.
- Dynamic Trajectory Planning:
  - Adopts the Frenet coordinate system ($s, d$) to decouple lateral and longitudinal motion.
  - Uses quintic polynomials to keep jerk (the derivative of acceleration) continuous, guaranteeing a smooth ride.
  - Implements cost-function-based multi-objective optimization, balancing safety, comfort, and efficiency.
- Engineering Implementation: Includes automated data-download pipelines, exception-handling mechanisms, and resource integrity checks.
To convert the pixel coordinate $(u, v)$ of a detection box's ground contact point into vehicle-frame coordinates $(x, y)$, a homography (perspective transformation) is applied:

$$\begin{bmatrix} x' \\ y' \\ w \end{bmatrix} = H \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}, \qquad x = \frac{x'}{w}, \quad y = \frac{y'}{w}$$

Where $H \in \mathbb{R}^{3 \times 3}$ is the perspective transformation matrix estimated from ground-plane point correspondences during IPM calibration.
The state of the vehicle is described using the lateral offset $d$ from the reference line and the longitudinal arc length $s$ along it:
- Lateral Movement ($d$): Planned using a Quintic Polynomial to satisfy boundary conditions (start/end position, velocity, acceleration):

$$d(t) = a_0 + a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + a_5 t^5$$

- Longitudinal Movement ($s$): Planned using a Quartic Polynomial for velocity-profile generation:

$$s(t) = b_0 + b_1 t + b_2 t^2 + b_3 t^3 + b_4 t^4$$
The planner samples multiple candidate trajectories and selects the optimal one by minimizing a total cost function of the form

$$C = w_j J_{jerk} + w_d \, d_{diff}^2 + w_v \, v_{diff}^2 + C_{collision}$$

- $J_{jerk}$: Comfort term (minimizing jerk).
- $d_{diff}$: Deviation from the reference lane.
- $v_{diff}$: Deviation from the target speed.
- $C_{collision}$: Hard constraint for obstacle avoidance.
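The selection step can be sketched as a weighted sum with collision as a hard constraint. The weights `w_j`, `w_d`, `w_v` and the candidate values below are illustrative, not tuned parameters from this project:

```python
import numpy as np

def trajectory_cost(jerk, d_end, d_ref, v_end, v_target, collides,
                    w_j=0.1, w_d=1.0, w_v=1.0):
    """Weighted trajectory cost; weights here are illustrative placeholders."""
    if collides:
        return float('inf')                 # C_collision as a hard constraint
    J_jerk = float(np.sum(np.square(jerk)))  # comfort: integrated squared jerk
    return (w_j * J_jerk
            + w_d * (d_end - d_ref) ** 2     # lane-keeping term
            + w_v * (v_end - v_target) ** 2)  # speed-tracking term

# Pick the minimum-cost candidate among sampled trajectories.
candidates = [
    dict(jerk=np.zeros(10), d_end=0.0, d_ref=0.0,
         v_end=10.0, v_target=10.0, collides=False),
    dict(jerk=np.zeros(10), d_end=2.0, d_ref=0.0,
         v_end=10.0, v_target=10.0, collides=False),
    dict(jerk=np.zeros(10), d_end=0.0, d_ref=0.0,
         v_end=10.0, v_target=10.0, collides=True),
]
best = min(candidates, key=lambda c: trajectory_cost(**c))
```

Returning `inf` for colliding candidates removes them from consideration without a separate filtering pass, which is a common way to encode the hard constraint.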
| Module | Solution | Description |
|---|---|---|
| Perception | YOLOv8-Nano (PyTorch) | Lightweight object detection to extract Obstacle Bounding Boxes. |
| Vision | OpenCV / Homography | Image processing and Coordinate Transformation (Pixel -> Meter). |
| Planning | Frenet Lattice Planner | Sampling-based local path planning (Apollo-like approach). |
| Math | Cubic Spline / Polynomials | Cubic Spline for reference lines, Polynomials for trajectory fitting. |
| Env | CPU Optimization | Multi-threading and inference optimization for non-GPU devices. |
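Building the smooth reference line from sparse waypoints (the "Math" row above) amounts to fitting cubic splines parameterized by arc length. A minimal sketch using SciPy's `CubicSpline`, with illustrative waypoint values:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Sparse reference-line waypoints in the vehicle frame (illustrative values).
wx = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
wy = np.array([0.0, 1.0, 2.5, 2.0, 0.5])

# Parameterize by cumulative chord length s, then fit x(s) and y(s)
# separately, so the reference line need not be a function of x.
ds = np.hypot(np.diff(wx), np.diff(wy))
s = np.concatenate(([0.0], np.cumsum(ds)))
sx, sy = CubicSpline(s, wx), CubicSpline(s, wy)

# Dense, smooth reference line plus heading (yaw) from the first derivatives.
s_fine = np.linspace(0.0, s[-1], 200)
x, y = sx(s_fine), sy(s_fine)
yaw = np.arctan2(sy(s_fine, 1), sx(s_fine, 1))
```

Parameterizing both coordinates by $s$ also gives the planner the arc-length axis it needs for the Frenet conversion.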
```
MonoVision-ADAS-Planner/
├── assets/               # Resource folder (auto-downloads test video and models)
├── src/                  # Core algorithm source code
│   ├── __init__.py
│   ├── cubic_spline.py   # Cubic spline interpolation: builds smooth reference lines
│   ├── polynomials.py    # Quintic/quartic polynomials: core math for trajectory generation
│   ├── frenet_optimal.py # Planner: trajectory sampling, collision check, cost evaluation
│   └── transform.py      # Vision transform: perspective matrix calculation
├── main.py               # Main entry: integrates the perception, localization, and planning loop
├── requirements.txt      # Dependency list
└── README.md             # Project documentation
```
