A comprehensive two-stage person re-identification pipeline integrating YOLO person detection, TAO-trained ReID models (via Triton Inference Server), and BoxMOT tracking with full experimental logging.
```
┌─────────────┐    ┌──────────────────┐    ┌──────────────┐
│    YOLO     │───▶│   Triton ReID    │───▶│    BoxMOT    │
│  Detector   │    │ (TensorRT FP16)  │    │   Tracker    │
└─────────────┘    └──────────────────┘    └──────────────┘
       │                     │                     │
       └─────────────────────┴─────────────────────┘
                             │
                    ┌────────▼────────┐
                    │   Experiment    │
                    │     Logger      │
                    └─────────────────┘
```
- YOLO11n Person Detector - Detects persons and extracts crops
- TAO ReID Model - Extracts 256-dim embeddings via Triton Inference Server
- BoxMOT (BoTSORT) - Multi-object tracking with external ReID features
- Experimental Logger - Comprehensive logging for reproducibility
- ✅ FP16 TensorRT inference for TAO ReID model
- ✅ Dynamic batching (1-16) via Triton Inference Server
- ✅ Real-time multi-object tracking with ReID features
- ✅ Comprehensive experimental logging (detections, embeddings, tracks, metrics)
- ✅ Model versioning with SHA256 hashing
- ✅ Video visualization with track IDs
- ✅ Performance metrics (FPS, latency, GPU memory)
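The per-frame data flow through these components looks roughly like the sketch below. This is illustrative glue code, not the actual implementation in `src/pipeline.py`, and the Triton tensor names (`input`, `output`) are assumptions to verify against the deployed model's metadata:

```python
# Illustrative per-frame flow: detect -> crop -> embed (via Triton) -> track.
import cv2
import numpy as np
import tritonclient.http as httpclient
from ultralytics import YOLO

detector = YOLO("models/yolo11n.pt")
triton = httpclient.InferenceServerClient(url="localhost:8000")
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])

def embed(crops: np.ndarray) -> np.ndarray:
    """Send a batch of (N, 3, 256, 128) FP32 crops to Triton, get (N, 256) back."""
    inp = httpclient.InferInput("input", list(crops.shape), "FP32")
    inp.set_data_from_numpy(crops)
    return triton.infer("lttc_reid", inputs=[inp]).as_numpy("output")

frame = cv2.imread("data/videos/example_frame.jpg")
result = detector(frame, classes=[0], conf=0.5)[0]      # persons only
boxes = result.boxes.xyxy.cpu().numpy().astype(int)
crops = []
for x1, y1, x2, y2 in boxes:
    crop = cv2.resize(frame[y1:y2, x1:x2], (128, 256))  # cv2 expects (W, H)
    crop = (crop[:, :, ::-1] / 255.0 - MEAN) / STD      # BGR -> RGB, normalize
    crops.append(crop.transpose(2, 0, 1))               # HWC -> CHW
if crops:
    embeddings = embed(np.stack(crops).astype(np.float32))
```

The embeddings are then handed to BoxMOT as external ReID features (hence `with_reid: false` in the tracker config) rather than letting the tracker compute its own.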
```bash
# Activate the conda environment
conda activate tensorrt_blackwell

# Install dependencies
pip install -r requirements.txt

# Pull the Triton Docker container (if not already pulled)
# docker pull nvcr.io/nvidia/tritonserver:25.04-py3
```

```bash
# Convert ONNX to TensorRT engine
python scripts/export_to_tensorrt.py \
    --onnx models/lttc_0.1.4.49.onnx \
    --output triton_models/lttc_reid/1/model.plan \
    --config configs/reid_config.yaml
```
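Under the hood, the export amounts to building an FP16 engine with a 1/8/16 dynamic-batch optimization profile. The sketch below is a stand-in for `scripts/export_to_tensorrt.py`, assuming the TensorRT 10 Python API shipped in the 25.04 container:

```python
# Minimal FP16 + dynamic-batch engine build (TensorRT 10 API assumed).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)  # explicit batch is the default in TensorRT 10
parser = trt.OnnxParser(network, logger)
with open("models/lttc_0.1.4.49.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)            # FP16 precision

profile = builder.create_optimization_profile()  # dynamic batch: min 1, opt 8, max 16
name = network.get_input(0).name                 # input tensor name from the ONNX graph
profile.set_shape(name, (1, 3, 256, 128), (8, 3, 256, 128), (16, 3, 256, 128))
config.add_optimization_profile(profile)

engine = builder.build_serialized_network(network, config)
with open("triton_models/lttc_reid/1/model.plan", "wb") as f:
    f.write(engine)
```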
```bash
# Start Triton Inference Server in Docker
bash scripts/start_triton_server.sh
# Wait for the server to be ready (the script waits automatically)
```

```bash
# Validate all components
python scripts/validate_models.py
```

```bash
# Process a video
python main.py \
    --video data/videos/your_video.mp4 \
    --output data/outputs/result.mp4 \
    --experiment-name my_test_run
```

Project layout:

```
Reid_Inference_Pipeline_0.2/
├── configs/                      # Configuration files
│   ├── yolo_config.yaml          # YOLO detection settings
│   ├── reid_config.yaml          # ReID + Triton settings
│   ├── tracker_config.yaml       # BoxMOT tracker settings
│   └── pipeline_config.yaml      # Pipeline orchestration
│
├── models/                       # Model files
│   ├── yolo11n.pt                # YOLO model
│   ├── lttc_0.1.4.49.pth         # TAO ReID PyTorch checkpoint
│   ├── lttc_0.1.4.49.onnx        # TAO ReID ONNX export
│   └── lttc_0.1.4.49.engine      # Original TensorRT engine
│
├── triton_models/                # Triton model repository
│   └── lttc_reid/
│       ├── config.pbtxt          # Triton model config
│       └── 1/
│           └── model.plan        # TensorRT engine (FP16, dynamic batch)
│
├── src/                          # Source code
│   ├── detector.py               # YOLO wrapper
│   ├── reid_client.py            # Triton HTTP client
│   ├── tracker.py                # BoxMOT integration
│   ├── logger.py                 # Experimental logging
│   ├── pipeline.py               # Main orchestration
│   └── utils/                    # Utilities
│
├── scripts/                      # Helper scripts
│   ├── export_to_tensorrt.py     # ONNX → TensorRT converter
│   ├── setup_triton_model.py     # Triton model repo setup
│   ├── start_triton_server.sh    # Triton server launcher
│   └── validate_models.py        # Model validation
│
├── data/                         # Data directories
│   ├── videos/                   # Input videos
│   └── outputs/                  # Output visualizations
│
├── experiments/                  # Experiment logs
│   └── exp_YYYYMMDD_HHMMSS/      # Auto-generated per run
│       ├── config_snapshot.json
│       ├── detections.jsonl
│       ├── embeddings.jsonl
│       ├── tracks.jsonl
│       ├── metrics.jsonl
│       └── video_metadata.json
│
├── main.py                       # Main CLI entry point
├── requirements.txt              # Python dependencies
└── README.md                     # This file
```
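For orientation, `triton_models/lttc_reid/config.pbtxt` would look roughly like the sketch below (the `input`/`output` tensor names and the preferred batch sizes are assumptions; `scripts/setup_triton_model.py` generates the real file):

```protobuf
name: "lttc_reid"
platform: "tensorrt_plan"
max_batch_size: 16
input [
  { name: "input", data_type: TYPE_FP32, dims: [ 3, 256, 128 ] }
]
output [
  { name: "output", data_type: TYPE_FP32, dims: [ 256 ] }
]
dynamic_batching {
  preferred_batch_size: [ 8, 16 ]
}
```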
The pipeline's behavior is controlled by the files in `configs/`. Key excerpts:

`configs/yolo_config.yaml`:

```yaml
model:
  path: "models/yolo11n.pt"
  device: "cuda:0"

detection:
  conf_threshold: 0.5
  iou_threshold: 0.7
  classes: [0]  # Person only
  imgsz: 640
```

`configs/reid_config.yaml`:

```yaml
triton:
  server_url: "localhost:8000"
  model_name: "lttc_reid"
  model_version: "1"

model:
  input_shape: [256, 128]  # H x W
  embedding_dim: 256

preprocessing:
  mean: [0.485, 0.456, 0.406]
  std: [0.229, 0.224, 0.225]

tensorrt:
  min_batch: 1
  opt_batch: 8
  max_batch: 16
  precision: "fp16"
```

`configs/tracker_config.yaml`:

```yaml
botsort:
  track_buffer: 30
  appearance_thresh: 0.25
  with_reid: false  # Use external TAO embeddings
```

Common invocations:

```bash
# Process single video
python main.py --video test_video.mp4

# Use custom experiment name
python main.py --video test_video.mp4 --experiment-name exp_yolo11n_tao_fp16

# Process only first 100 frames
python main.py --video test_video.mp4 --max-frames 100

# Skip visualization output (faster)
python main.py --video test_video.mp4 --no-visualization
```

Each pipeline run creates a unique experiment directory with comprehensive logs:
- detections.jsonl - Per-frame YOLO detections
- embeddings.jsonl - ReID embeddings for each person
- tracks.jsonl - Tracking results with track IDs
- metrics.jsonl - Performance metrics (FPS, GPU memory, latency)
- config_snapshot.json - Complete configuration used
- model_versions.json - Model file SHA256 hashes
- video_metadata.json - Input video metadata
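Besides the jq one-liners below, the JSONL logs are easy to analyze programmatically. A minimal sketch (the `fps` field matches the jq example; other field names may differ):

```python
# Summarize mean FPS from the latest run's metrics.jsonl.
import json
from pathlib import Path

run_dir = sorted(Path("experiments").glob("exp_*"))[-1]  # latest run
with (run_dir / "metrics.jsonl").open() as f:
    fps = [json.loads(line)["fps"] for line in f if line.strip()]
print(f"{run_dir.name}: mean FPS = {sum(fps) / len(fps):.1f} ({len(fps)} samples)")
```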
```bash
# View detections
cat experiments/exp_*/detections.jsonl | jq '.'

# View tracking results
cat experiments/exp_*/tracks.jsonl | jq '.tracks'

# View performance metrics
cat experiments/exp_*/metrics.jsonl | jq '.fps'

# Get experiment summary
cat experiments/exp_*/config_snapshot.json | jq '.pipeline'
```

Typical performance:

- YOLO Detection: ~15-20 ms per frame
- ReID Inference (Triton): ~8-10 ms per batch of 8 crops
- Tracking: ~2-5 ms per frame
- Overall Throughput: >20 FPS on 640x640 video

To tune performance:

- Batch Size: adjust `opt_batch` in `reid_config.yaml`
- YOLO Input Size: change `imgsz` in `yolo_config.yaml`
- Tracking Buffer: modify `track_buffer` in `tracker_config.yaml`
If the Triton server misbehaves:

```bash
# Check Docker logs
docker logs triton-reid-server

# Verify model repository
python scripts/setup_triton_model.py --validate-only

# Restart server
docker stop triton-reid-server
bash scripts/start_triton_server.sh
```

If you run out of GPU memory:

- Reduce `max_batch` in `reid_config.yaml`
- Reduce the YOLO `imgsz` in `yolo_config.yaml`
- Close other GPU applications
If FPS is low:

- Enable FP16 in the YOLO config
- Increase the Triton batch size
- Use a smaller YOLO model (yolo11n vs. yolo11x)
If a model fails to load:

```bash
# Validate all models
python scripts/validate_models.py

# Re-generate TensorRT engine
python scripts/export_to_tensorrt.py
```

Required model files:

- YOLO11n: `models/yolo11n.pt` (already present)
- TAO ReID ONNX: `models/lttc_0.1.4.49.onnx` (already present)
- TensorRT Engine: `triton_models/lttc_reid/1/model.plan` (generated by the export script)
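A quick programmatic counterpart to `validate_models.py`, as a hedged sketch using the Triton HTTP client (it only checks server liveness, model readiness, and the reported output shape):

```python
# Confirm the deployed ReID model is ready and exposes a 256-dim output.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
assert client.is_server_live() and client.is_model_ready("lttc_reid", "1")
meta = client.get_model_metadata("lttc_reid", "1")
print(meta["outputs"])  # expect a shape ending in 256
```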
```bash
# Test YOLO detector
python src/detector.py

# Test ReID client (requires Triton running)
python src/reid_client.py

# Test tracker
python src/tracker.py

# Test logger
python src/logger.py
```

```bash
# Validate setup
python scripts/validate_models.py

# Test pipeline (no Triton)
python scripts/validate_models.py --skip-triton
```

If you use this pipeline in your research, please cite:
- YOLO: Ultralytics YOLOv11
- TAO Toolkit: NVIDIA TAO Toolkit
- Triton: NVIDIA Triton Inference Server
- BoxMOT: BoxMOT Multi-Object Tracking
This project is provided as-is for research and development purposes.
For issues or questions:
- Check the troubleshooting section above
- Validate the setup with `python scripts/validate_models.py`
- Review the experiment logs in the `experiments/` directory
- Initial implementation with Triton Inference Server
- FP16 TensorRT engine support
- BoxMOT tracking integration
- Comprehensive experimental logging