XinyuWuu/SegFormer-BDD100K-Finetune


SegFormer Finetuning on BDD100K for Autonomous Driving: An End-to-End Deployment Pipeline with TensorRT INT8 Quantization

This project demonstrates a complete, end-to-end pipeline for developing and deploying a high-performance visual perception model for autonomous driving. A state-of-the-art SegFormer model is fine-tuned on the BDD100K dataset for semantic segmentation. Subsequently, the model is optimized using the NVIDIA TensorRT SDK, converting it from PyTorch to highly efficient FP32, FP16, and INT8 inference engines. This optimization process is benchmarked to quantify the significant gains in performance and reduction in resource usage, making the model suitable for real-time applications.

Key Features

  • Model: NVIDIA's SegFormer (nvidia/segformer-b5-finetuned-ade-640-640, ~85 million parameters).
  • Dataset: BDD100K, a large-scale, industry-standard autonomous driving dataset.
  • Training Strategies: Implements both full model fine-tuning and a more efficient head-only fine-tuning approach.
  • Frameworks: PyTorch for training, Hugging Face Transformers for model handling.
  • Optimization Stack: A production-grade workflow using ONNX as an intermediate representation and NVIDIA TensorRT for final optimization and inference.
  • Quantization: Advanced optimization using INT8 quantization with a custom calibrator to achieve maximum performance.
  • Benchmarking: Rigorous, end-to-end benchmarking of performance (latency, throughput), memory usage, and accuracy (mIoU) across all model versions.
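
For background on the INT8 path: TensorRT's INT8 mode maps tensors to 8-bit integers using a symmetric per-tensor scale, and the calibrator's job is to pick the dynamic range (amax) that the scale is derived from. Below is a minimal, dependency-free sketch of that arithmetic; the function names are illustrative, not taken from this project's code.

```python
def int8_quantize(values, amax):
    """Symmetric per-tensor INT8 quantization: map [-amax, amax] -> [-127, 127]."""
    scale = amax / 127.0
    return [max(-127, min(127, round(v / scale))) for v in values]

def int8_dequantize(qvalues, amax):
    """Recover approximate real values from INT8 codes."""
    scale = amax / 127.0
    return [q * scale for q in qvalues]

# Toy activations; in the real pipeline the calibrator supplies amax.
activations = [0.02, -1.5, 0.9, 3.1]
q = int8_quantize(activations, amax=3.1)
approx = int8_dequantize(q, amax=3.1)
```

The quantization error is bounded by one scale step (amax / 127), which is why a well-chosen calibration range matters: an amax far above typical activation magnitudes wastes most of the 8-bit range.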

Training and Fine-Tuning

Two fine-tuning strategies were evaluated: full fine-tuning of all model parameters, and head-only fine-tuning with the pre-trained encoder frozen. Head-only fine-tuning proved significantly more efficient, using about one third of the GPU memory (6 GB vs. 18 GB) and training 2.5x faster (7.3 vs. 3 it/s) than full fine-tuning on an NVIDIA RTX 5880 Ada Generation.

  • Gray Line: Full Fine-Tuning
  • Cyan Line: Head-Only Fine-Tuning

[Training curve panels: Training Loss, Validation Loss, Validation mIoU, Validation Accuracy]
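
In PyTorch, head-only fine-tuning typically amounts to setting requires_grad = False on every encoder parameter so the optimizer updates only the decode head. A dependency-free toy sketch of that effect (parameter names and values are illustrative, not taken from the project):

```python
# Toy parameter store; names mimic a SegFormer-like layout (illustrative only).
params = {"encoder.block1.w": 1.0, "encoder.block2.w": -0.5, "decode_head.w": 0.2}
grads  = {"encoder.block1.w": 0.3, "encoder.block2.w": 0.1, "decode_head.w": -0.4}

def sgd_step(params, grads, lr=0.1, freeze_prefix="encoder."):
    """Apply one SGD update, skipping every parameter under the frozen prefix."""
    return {
        name: (value if name.startswith(freeze_prefix) else value - lr * grads[name])
        for name, value in params.items()
    }

updated = sgd_step(params, grads)
# Encoder weights are untouched; only the decode head moves.
```

Because the frozen encoder needs no gradients or optimizer state, this is where the memory and speed savings above come from.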

Performance and Accuracy Benchmark Results

The models were benchmarked on an NVIDIA GeForce RTX 3060 12 GB. The results show clear latency, throughput, and memory advantages for the TensorRT engines, though the converted engines also show a substantial drop in mIoU relative to the PyTorch baseline.
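
The exact measurement code in scripts/benchmark.py is not reproduced here; a minimal harness of the kind that yields the latency and FPS columns below (batch size 1, so FPS = 1000 / latency in ms) could look like this:

```python
import time

def benchmark(infer, warmup=10, iters=100):
    """Time a single-sample inference callable; return mean latency (ms) and FPS."""
    for _ in range(warmup):            # let caches, clocks, and lazy init settle
        infer()
    start = time.perf_counter()
    for _ in range(iters):
        infer()
    elapsed = time.perf_counter() - start
    latency_ms = elapsed / iters * 1000.0
    return latency_ms, 1000.0 / latency_ms

# Stand-in for a real model call (sleeps ~5 ms per "inference").
lat, fps = benchmark(lambda: time.sleep(0.005))
```

Warm-up iterations matter for GPU benchmarks in particular, since the first calls pay for lazy initialization and clock ramp-up.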

Full Fine-Tuning (Epoch 5)

This model was selected for its peak validation mIoU during the full fine-tuning run.

| Metric           | PyTorch (FP32) | TensorRT (FP32) | TensorRT (FP16) | TensorRT (INT8) |
| ---------------- | -------------- | --------------- | --------------- | --------------- |
| Latency (ms)     | 107.80         | 69.62           | 29.29           | 26.86           |
| Throughput (FPS) | 9.28           | 14.36           | 34.14           | 37.23           |
| Memory (MB)      | 1362.36        | 127.84          | 128.34          | 128.34          |
| mIoU             | 0.53           | 0.06            | 0.06            | 0.07            |
| Mean Accuracy    | 0.63           | 0.12            | 0.13            | 0.13            |

Head-Only Fine-Tuning (Epoch 10)

This model was selected as the best-performing checkpoint from the head-only fine-tuning run.

| Metric           | PyTorch (FP32) | TensorRT (FP32) | TensorRT (FP16) | TensorRT (INT8) |
| ---------------- | -------------- | --------------- | --------------- | --------------- |
| Latency (ms)     | 107.72         | 69.36           | 29.43           | 26.92           |
| Throughput (FPS) | 9.28           | 14.42           | 33.97           | 37.14           |
| Memory (MB)      | 1362.36        | 127.84          | 128.34          | 128.34          |
| mIoU             | 0.51           | 0.07            | 0.07            | 0.07            |
| Mean Accuracy    | 0.59           | 0.11            | 0.12            | 0.12            |
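
For reference, the mIoU reported above is the mean over classes of intersection-over-union between predicted and ground-truth label maps. A self-contained sketch of the metric (toy flattened label maps; the project presumably aggregates this over the BDD100K validation set):

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes present in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:                      # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Toy flattened label maps with 3 classes.
pred   = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
```

Skipping classes absent from both maps keeps unused classes from distorting the mean.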

Setup and Usage

1. Environment Setup

This project uses Conda for environment management. Ensure you have an NVIDIA GPU with an appropriate driver, the CUDA Toolkit, and TensorRT installed.

# Create and activate the conda environment
conda create -n tensorrt_project python=3.12 -y
conda activate tensorrt_project

# Install other dependencies
pip install -r requirements.txt

# Download dataset
./download_BDD100K.sh

2. Running the Pipeline

The project is divided into several executable scripts located in the scripts/ directory. They should be run in the following order.

Step 1: Fine-Tune the Model

This script fine-tunes the pre-trained SegFormer model on a subset of the BDD100K dataset and saves the weights.

# For full fine-tuning
python scripts/finetune.py --output_path "models/segformer_bdd100k_finetuned.pth"

# For head-only fine-tuning
python scripts/finetune.py --freeze-encoder --output_path "models/segformer_bdd100k_finetuned_headonly.pth"

Step 2: Export to ONNX

This script converts the fine-tuned PyTorch model to the ONNX format.

# For full fine-tuned model
python scripts/onnx_export.py --finetuned_checkpoint "models/segformer_bdd100k_finetuned_epoch_005.pth" --onnx_output_path "models/segformer.onnx"

# For head-only model
python scripts/onnx_export.py --finetuned_checkpoint "models/segformer_bdd100k_finetuned_headonly_epoch_009.pth" --onnx_output_path "models/segformer_headonly.onnx"

Step 3: Build TensorRT Engines

This script builds the optimized FP32, FP16, and INT8 engines from the ONNX file.

# For full fine-tuned model
python scripts/build_engine.py

# For head-only model
python scripts/build_engine.py --headonly

Step 4: Run Benchmarking and Visualization

This final script runs a performance and accuracy comparison of all model versions and generates the visual result images.

# Benchmark the full fine-tuned model
python scripts/benchmark.py

# Benchmark the head-only model (modify SUFFIX in benchmark.py to "_headonly")
python scripts/benchmark.py

# Generate visualizations
python scripts/visualizatize.py
