ai4ce/CAAPolicy

End-to-End Visual Autonomous Parking via Control-Aided Attention

Abstract

Precise parking requires an end-to-end system in which perception adaptively provides policy-relevant details, especially in critical areas where fine control decisions are essential. End-to-end learning offers a unified framework by directly mapping sensor inputs to control actions, but existing approaches lack effective synergy between perception and control. We find that transformer-based self-attention, when used alone, tends to produce unstable and temporally inconsistent spatial attention, which undermines the reliability of downstream policy decisions over time. Instead, we propose CAA-Policy, an end-to-end imitation learning system that allows the control signal to guide the learning of visual attention via a novel Control-Aided Attention (CAA) mechanism. For the first time, we train such an attention module in a self-supervised manner, using backpropagated gradients from the control outputs instead of from the training loss. This strategy encourages the attention to focus on visual features that induce high variance in action outputs, rather than merely minimizing the training loss, a shift we demonstrate leads to a more robust and generalizable policy. To further enhance stability, CAA-Policy integrates short-horizon waypoint prediction as an auxiliary task and introduces a separately trained motion prediction module to robustly track the target parking slot over time. Extensive experiments in the CARLA simulator show that CAA-Policy consistently surpasses both the end-to-end learning baseline and the modular BEV segmentation + hybrid A* pipeline, achieving superior accuracy, robustness, and interpretability. Code is released in this repository.

arXiv

Figure: distribution of the per-step deltas in x, y, and yaw, used for action tokenization.

Figure: waypoints illustration (waypoints img.png).

News

  • [2025/11/11] Initial code release.

Setup

Clone the repo, set up CARLA 0.9.11, and build the conda environment:

git clone https://github.com/qintonguav/e2e-parking-carla.git
cd e2e-parking-carla/
conda env create -f environment.yml
conda activate E2EParking
chmod +x setup_carla.sh
./setup_carla.sh

CUDA 11.7 is used by default; compatibility with CUDA 10.2 and 11.3 has also been validated. Launch the CARLA server:

./carla/CarlaUE4.sh -opengl
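On a resource-constrained or remote machine, the server can also be started with reduced rendering quality and a smaller window. These are standard CARLA/Unreal Engine launch options rather than flags specific to this project:

./carla/CarlaUE4.sh -opengl -quality-level=Low -windowed -ResX=800 -ResY=600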

Evaluation (Inference with pre-trained model)

For inference, run:

python3 carla_parking_eva.py

Main variables for the evaluation script:

Variable	Description	Default
--model_path	Path to the model checkpoint file	Required
--eva_epochs	Number of evaluation epochs	4
--eva_task_nums	Number of evaluation tasks	16
--eva_parking_nums	Number of parking attempts per slot	6
--eva_result_path	Path to save the evaluation results (CSV file)	Required
--shuffle_veh	Whether to shuffle static vehicles between tasks	True
--shuffle_weather	Whether to shuffle weather between tasks	False
--random_seed	Random seed for environment initialization	0

Evaluation metrics will be saved as CSV files at the location specified by --eva_result_path.
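For example, a full evaluation run with the two required arguments might look like the following; the checkpoint and output paths below are placeholders, not files shipped with the repository:

python3 carla_parking_eva.py \
    --model_path ./ckpt/caa_policy.ckpt \
    --eva_result_path ./eva_results/ \
    --eva_epochs 4 \
    --eva_parking_nums 6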

Dataset and Training

  • Training Data Generation: In a separate terminal (with the CARLA server running), generate training data:
git checkout new_data_generate
python3 carla_data_gen.py

Main variables for the data generation script:

Variable	Description	Default
--save_path	Path to save sensor data	./e2e_parking/
--task_num	Number of parking tasks	16
--shuffle_veh	Shuffle static vehicles between tasks	True
--shuffle_weather	Shuffle weather between tasks	False
--random_seed	Random seed; if 0, use current timestamp	0
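For example, to write a dataset to a custom directory with a fixed seed (the directory name is only an illustration):

python3 carla_data_gen.py \
    --save_path ./e2e_parking/demo_run/ \
    --task_num 16 \
    --random_seed 42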

Keyboard Controls
Key	Action
w / a / s / d	Throttle / left steer / right steer / hand brake
space	Brake
q	Reverse gear
Backspace	Reset current task
TAB	Switch camera view
Parking Success Conditions
  • Position error (vehicle center to slot center) < 0.5 meter
  • Orientation error < 0.5 degree
  • Conditions maintained for 60 consecutive frames
The target parking slot is marked with a red T. The task automatically switches when completed; any collisions reset the current task.

  • Training Script: Run training on a single GPU:

python pl_train.py

Configure training parameters such as data path, epochs, and checkpoint path in training.yaml. For multi-GPU training, modify pl_train.py by updating:

os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3,4,5,6,7'

num_gpus = 8
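For reference, below is a minimal sketch of how these two values are typically consumed in a PyTorch Lightning entry point. The Trainer arguments are an assumption about the surrounding code in pl_train.py (and depend on the installed Lightning version), not a verbatim excerpt:

import os
import pytorch_lightning as pl

# Expose all eight GPUs to this process, as in the snippet above.
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3,4,5,6,7'
num_gpus = 8

# Hypothetical use of num_gpus: one DDP process per visible GPU
# (argument names follow PyTorch Lightning >= 1.7).
trainer = pl.Trainer(accelerator='gpu', devices=num_gpus, strategy='ddp')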

Acknowledgements (TODO)
