FOXHOUND — Asset Manifest

Paper

Title: Learning Motion Blur Robust Vision Transformers for Real-Time UAV Tracking
ArXiv: 2407.05383
URL: https://arxiv.org/abs/2407.05383
Authors: You Wu, Xucheng Wang, Dan Zeng, Hengzhou Ye, Xiaolan Xie, Qijun Zhao, Shuiwang Li

Status

ALMOST

Why not READY:

The public repo exists but lacks the implementation needed for a direct reproduction.
An audit of the shared dataset volume did not find the paper datasets locally.
Exact training recipe inherits from Aba-ViTrack and must still be recovered.

Pretrained Weights

Model	Size	Source	Path on Server	Status
`deit_tiny_patch16_224`	~5.8M params	timm / DeiT upstream	`/mnt/forge-data/models/deit_tiny_patch16_224.safetensors`	MISSING
`vit_tiny_patch16_224`	~5.7M params	timm / ViT upstream	`/mnt/forge-data/models/vit_tiny_patch16_224.safetensors`	MISSING
`t2t_vit_t_14`	lightweight ViT	T2T-ViT upstream	`/mnt/forge-data/models/t2t_vit_t_14.safetensors`	MISSING
`yolo26m.pt`	detector backbone	internal YOLO26 baseline	`/mnt/forge-data/models/yolo26/yolo26m.pt`	MISSING
`yolo26m-uav.pt`	detector adaptation	internal YOLO26 UAV fine-tune	`/mnt/forge-data/models/yolo26/yolo26m-uav.pt`	MISSING
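
The Status column above can be regenerated rather than maintained by hand. The following is a minimal sketch, assuming the server paths in the table are correct and that a weight counts as present when a regular file exists at its path. The helper name `audit_paths` and its `expect_dir` flag are hypothetical and not part of any FOXHOUND tooling.

```python
from pathlib import Path

# Paths copied verbatim from the Pretrained Weights table.
WEIGHT_PATHS = {
    "deit_tiny_patch16_224": "/mnt/forge-data/models/deit_tiny_patch16_224.safetensors",
    "vit_tiny_patch16_224": "/mnt/forge-data/models/vit_tiny_patch16_224.safetensors",
    "t2t_vit_t_14": "/mnt/forge-data/models/t2t_vit_t_14.safetensors",
    "yolo26m.pt": "/mnt/forge-data/models/yolo26/yolo26m.pt",
    "yolo26m-uav.pt": "/mnt/forge-data/models/yolo26/yolo26m-uav.pt",
}


def audit_paths(paths, expect_dir=False):
    """Map each asset name to "PRESENT" or "MISSING".

    Hypothetical helper: files are checked with is_file(), directories
    (for example dataset roots) with is_dir() when expect_dir is True.
    """
    def present(p):
        return Path(p).is_dir() if expect_dir else Path(p).is_file()

    return {name: "PRESENT" if present(p) else "MISSING" for name, p in paths.items()}


if __name__ == "__main__":
    for name, status in audit_paths(WEIGHT_PATHS).items():
        print(f"{name}\t{status}")
```

The same function could be pointed at the dataset directories by passing `expect_dir=True`. It only checks existence, so it says nothing about whether a file is complete or matches the upstream checkpoint.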

Datasets

Dataset	Size	Split	Source	Path	Status
`MegaUAV-1.8M`	~1.8M images	train/val/test	internal	`/Volumes/AIFlowDev/RobotFlowLabs/datasets/MegaUAV-1.8M`	MISSING
`UAV123`	benchmark	test	public benchmark	`/Volumes/AIFlowDev/RobotFlowLabs/datasets/UAV123`	MISSING
`UAV123@10fps`	derived benchmark	test	downsampled from `UAV123`	`/Volumes/AIFlowDev/RobotFlowLabs/datasets/UAV123_10fps`	MISSING
`VisDrone2018`	benchmark	test	public benchmark	`/Volumes/AIFlowDev/RobotFlowLabs/datasets/VisDrone2018`	MISSING
`UAVDT`	benchmark	test	public benchmark	`/Volumes/AIFlowDev/RobotFlowLabs/datasets/UAVDT`	MISSING
`DroneVehicle`	adaptation	train/val	public benchmark	`/Volumes/AIFlowDev/RobotFlowLabs/datasets/DroneVehicle`	MISSING
`SeaDronesSee`	adaptation	train/val	public benchmark	`/Volumes/AIFlowDev/RobotFlowLabs/datasets/SeaDronesSee`	MISSING
`UAVTrack112_L`	real-world eval	eval	paper real-world set	`/Volumes/AIFlowDev/RobotFlowLabs/datasets/UAVTrack112_L`	MISSING
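
The table lists `UAV123@10fps` as downsampled from `UAV123`. If only `UAV123` can be obtained, the derived split could be approximated as sketched below. This assumes the source sequences are recorded at 30 fps, so that keeping every third frame gives 10 fps, and that each frame has exactly one annotation. The function name `downsample_sequence` and the stride default are illustrative assumptions, and the officially released `UAV123@10fps` files should be preferred when available because their frame selection may differ.

```python
def downsample_sequence(frames, boxes, stride=3):
    """Keep every `stride`-th frame and its box, starting from the first frame.

    Assumption: stride=3 corresponds to 30 fps -> 10 fps.
    """
    if len(frames) != len(boxes):
        raise ValueError("frames and boxes must have the same length")
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return frames[::stride], boxes[::stride]


if __name__ == "__main__":
    # Illustrative file names and boxes, not taken from the dataset.
    frames = [f"{i:06d}.jpg" for i in range(1, 8)]
    boxes = [(10 + i, 20, 30, 40) for i in range(7)]
    kept_frames, kept_boxes = downsample_sequence(frames, boxes)
    print(kept_frames)  # ['000001.jpg', '000004.jpg', '000007.jpg']
```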

Hyperparameters From Paper

Param	Value	Paper Section
`template_size`	`128 x 128`	`§4.1`
`search_size`	`256 x 256`	`§4.1`
`loss_cls`	weighted focal loss	`§3.4`
`loss_box`	`L1 + GIoU`	`§3.4`
`eta_iou`	`2.0`	`Eq. 7`
`eta_l1`	`5.0`	`Eq. 7`
`rho`	`1e-4`	`Eq. 7`, `Table 5`
`gamma`	`1e3`	`Eq. 7`
`epsilon`	`0.05`	`§3.3`
`exit_threshold`	`0.95`	`§3.3`
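
To make the `loss_box` row concrete, here is a minimal pure-Python sketch of a weighted `L1 + GIoU` box loss using the `eta_iou` and `eta_l1` values from the table. It is not the paper's Eq. 7: the pairing of `eta_iou` with the GIoU term is read off the parameter names, the L1 term is assumed to be a mean over the four coordinates, boxes are assumed to be in `(x1, y1, x2, y2)` form, and the `rho` and `gamma` terms are left out because this manifest does not state how they enter the equation.

```python
def box_area(b):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes have area 0."""
    x1, y1, x2, y2 = b
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def giou(a, b):
    """Generalized IoU of two (x1, y1, x2, y2) boxes, in [-1, 1]."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    # Smallest axis-aligned box enclosing both inputs.
    hull = box_area((min(a[0], b[0]), min(a[1], b[1]),
                     max(a[2], b[2]), max(a[3], b[3])))
    if union <= 0.0 or hull <= 0.0:
        return 0.0  # both boxes degenerate; treated as no overlap signal
    return inter / union - (hull - union) / hull


def box_loss(pred, target, eta_iou=2.0, eta_l1=5.0):
    """Weighted sum of (1 - GIoU) and mean absolute coordinate error.

    Assumption: the weights follow the manifest table; the reduction
    (mean over coordinates) is a guess, not taken from the paper.
    """
    l1 = sum(abs(p - t) for p, t in zip(pred, target)) / 4.0
    return eta_iou * (1.0 - giou(pred, target)) + eta_l1 * l1


if __name__ == "__main__":
    print(box_loss((0.1, 0.1, 0.5, 0.5), (0.2, 0.2, 0.6, 0.6)))
```

A training implementation would use batched tensor operations instead. This version is only meant to show how the two weights combine.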

Hyperparameters Still Missing

Param	Status	Note
`optimizer`	MISSING	paper says training pipeline follows Aba-ViTrack
`batch_size`	MISSING	not restated in BDTrack text
`schedule`	MISSING	not restated in BDTrack text
`nenf`	MISSING	concept described, exact deployed value not exposed in paper text
`lambda_weight`	MISSING	equation includes lambda but text does not provide the final scalar
`tau`	MISSING	equation includes tau but final chosen constant is not stated in exposed paper text
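
One way to keep the missing values from being silently defaulted is to carry them in the training config as explicit `None` entries and refuse to start until they are filled. The sketch below assumes a plain dataclass. The class name `TrackerConfig` and its methods are hypothetical, and the field types for the missing entries are left open because the manifest does not state them.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class TrackerConfig:
    # Values stated in the paper, copied from the manifest.
    template_size: Tuple[int, int] = (128, 128)
    search_size: Tuple[int, int] = (256, 256)
    eta_iou: float = 2.0
    eta_l1: float = 5.0
    rho: float = 1e-4
    gamma: float = 1e3
    epsilon: float = 0.05
    exit_threshold: float = 0.95
    # Still missing: to be recovered from Aba-ViTrack or re-derived.
    optimizer: Optional[object] = None
    batch_size: Optional[object] = None
    schedule: Optional[object] = None
    nenf: Optional[object] = None
    lambda_weight: Optional[object] = None
    tau: Optional[object] = None

    def missing(self):
        """Names of fields that are still unset, in declaration order."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def require_complete(self):
        """Raise if any hyperparameter is still unset."""
        gaps = self.missing()
        if gaps:
            raise ValueError("unset hyperparameters: " + ", ".join(gaps))


if __name__ == "__main__":
    print(TrackerConfig().missing())
```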

Expected Metrics

Benchmark	Metric	Paper Value	Our Target
four-benchmark average	precision	`84.4`	`>= 83.0` bootstrap reproduction
four-benchmark average	success	`64.5`	`>= 63.0` bootstrap reproduction
`UAVDT`	precision	`84.1`	`>= 82.0`
`UAVDT`	success	`61.0`	`>= 59.0`
`VisDrone2018`	precision	`85.2`	`>= 83.0`
`VisDrone2018`	success	`64.3`	`>= 62.0`
`UAV123`	precision	`84.8`	`>= 83.0`
`UAV123`	success	`66.7`	`>= 65.0`
`UAV123@10fps`	precision	`83.5`	`>= 82.0`
`UAV123@10fps`	success	`65.9`	`>= 64.0`
efficiency	GPU FPS	`283.4`	`>= 240` on modern CUDA path
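
The table above can be turned into a small acceptance check. The sketch below first confirms that the four-benchmark averages are consistent with the per-benchmark paper values (mean precision 84.4, mean success 64.475, reported as 64.5), then reports which targets a set of measured results fails. The dictionary layout and function names are illustrative assumptions. The numbers are copied from the table.

```python
# Paper values per benchmark, copied from the Expected Metrics table.
PAPER = {
    ("UAVDT", "precision"): 84.1, ("UAVDT", "success"): 61.0,
    ("VisDrone2018", "precision"): 85.2, ("VisDrone2018", "success"): 64.3,
    ("UAV123", "precision"): 84.8, ("UAV123", "success"): 66.7,
    ("UAV123@10fps", "precision"): 83.5, ("UAV123@10fps", "success"): 65.9,
}

# "Our Target" column: minimum acceptable values.
TARGETS = {
    ("average", "precision"): 83.0, ("average", "success"): 63.0,
    ("UAVDT", "precision"): 82.0, ("UAVDT", "success"): 59.0,
    ("VisDrone2018", "precision"): 83.0, ("VisDrone2018", "success"): 62.0,
    ("UAV123", "precision"): 83.0, ("UAV123", "success"): 65.0,
    ("UAV123@10fps", "precision"): 82.0, ("UAV123@10fps", "success"): 64.0,
    ("efficiency", "gpu_fps"): 240.0,
}


def benchmark_mean(metric):
    """Unweighted mean of the paper values for one metric across benchmarks."""
    values = [v for (_, m), v in PAPER.items() if m == metric]
    return sum(values) / len(values)


def failed_targets(measured):
    """Keys whose measured value is below target; absent keys count as failed."""
    return [k for k, t in TARGETS.items() if measured.get(k, float("-inf")) < t]


if __name__ == "__main__":
    print(benchmark_mean("precision"), benchmark_mean("success"))
    print(failed_targets({("UAVDT", "precision"): 82.5}))
```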

Reproducibility Notes

Public GitHub repo currently lacks the implementation needed for a direct reproduction.
Paper implementation details partially inherit from Aba-ViTrack, so FOXHOUND must recover or re-derive missing training settings before Phase 2.
YOLO26 is an adaptation layer for ANIMA deployment, not part of the original BDTrack paper.