Skip to content

Latest commit

 

History

History
146 lines (123 loc) · 5.96 KB

File metadata and controls

146 lines (123 loc) · 5.96 KB

TERRIER: PRL-Track Integration PRD (Wave-10 WARDOG)

0. Document Control

1. Executive Summary

TERRIER ports the PRL-Track tracking architecture into ANIMA as a reproducible, deployment-oriented module for UAV defense use cases. The module keeps the paper's core components (AR, SR, HMG) while adapting data pipelines and inference integration for:

  1. ANIMA runtime and ROS2 stack.
  2. YOLO26-first detection bootstrap.
  3. Dual-compute execution (MLX on Apple Silicon and CUDA on RTX-class GPU).

2. Verification Verdict

2.1 Paper authenticity

  • ArXiv entry exists (2409.16652v1, published 2024-09-25).
  • IEEE IROS publication exists (10.1109/IROS58592.2024.10803050).
  • PDF in papers/2409.16652.pdf is consistent with arXiv abstract/method/results.

2.2 Reference implementation validity

  • Upstream repo exists and is active enough for reproduction bootstrapping:
    • stars: 52
    • forks: 11
    • latest push: 2024-09-29
  • CLI scripts resolve and parse args (tools/train.py --help, tools/test.py --help).
  • Important risk: upstream model hardcodes CUDA (.cuda() in ModelBuilder), blocking direct CPU/MLX execution without patching.

2.3 Dataset availability check

  • Shared dataset volume mounted: /Volumes/AIFlowDev/RobotFlowLabs/datasets/.
  • Present locally: shared/coco, wave10_staging/visdrone.
  • Missing in shared volume snapshot: GOT-10K, LaSOT, UAVDT, DroneVehicle, SeaDronesSee, 1.8M mega UAV (explicit path not found).

2.4 Metric plausibility and external signal

  • Paper-reported values are internally coherent (UAVTrack112/UAVTrack112_L/UAV123 + real-world FPS 42.6).
  • External citation signal exists:
    • Semantic Scholar citation count: 19 (as of 2026-04-10)
    • OpenAlex published article citation count: 11 (as of 2026-04-10)

2.5 Verdict

  • Status: VERIFIED WITH RISKS
  • Risks requiring explicit handling:
    1. Upstream CUDA-only assumptions.
    2. Incomplete local availability of training datasets.
    3. Upstream training recipe discrepancy between paper text (70 epochs) and repo config defaults (100 epochs).

3. Scope: Take / Skip / Adapt

3.1 Take

  • Progressive representation stack:
    • Appearance-aware regulator (AR)
    • Semantic-aware regulators (SR)
    • Hierarchical modeling generator (HMG)
  • Depthwise correlation based template-search fusion.
  • Multi-head localization/classification output structure.

3.2 Skip

  • Any component requiring direct dependence on non-portable CUDA-only code paths.
  • Non-UAV-defense-specific tooling not needed for ANIMA deployment.
  • Full-scale training run in this bootstrap phase.

3.3 Adapt

  • Detection front-end anchored on YOLO26.
  • Unified backend selector and parity checks for mlx|cuda|cpu.
  • Dataset abstraction for internal 1.8M UAV corpus + staged public datasets.

4. System Requirements

4.1 Functional

  • Load PRL-Track style model scaffold and run synthetic template/search forward pass.
  • Expose backend selection via CLI and config.
  • Support YOLO26 detector integration point.
  • Produce deterministic smoke-check output for CI.

4.2 Non-functional

  • Dual-compute API compatibility (mlx|cuda|cpu).
  • Reproducible synthetic tests.
  • Clear failure modes for missing weights/datasets.

5. Target Architecture (Initial)

  1. TinyBackbone extracts hierarchical features (5 scales).
  2. Depthwise cross-correlation fuses template/search feature maps.
  3. AR refines shallow representation.
  4. SR modules refine deeper semantic representation.
  5. HMG performs hierarchical cross-attention fusion.
  6. Heads output:
    • loc: [B,4,H,W]
    • cls1: [B,2,H,W]
    • cls2: [B,1,H,W]
  7. YOLO26 adapter is used as detector bootstrap before tracking handoff.

6. Data Plan

6.1 Paper reproduction data (upstream)

  • COCO, GOT-10K, LaSOT

6.2 WARDOG adaptation data (module-level)

  • 1.8M Mega UAV dataset (internal)
  • VisDrone
  • UAVDT
  • DroneVehicle
  • SeaDronesSee

6.3 Data gate rule

  • Do not force downloads in code paths by default.
  • Check mounted shared volume first.
  • Keep download scripts idempotent and non-destructive.

7. Implementation Phases

Phase A — Verification and planning (done in this pass)

  • Complete evidence-backed paper/repo/dataset verification.
  • Generate PRD suite and task backlog.

Phase B — Initial code scaffold (done in this pass)

  • Implement ANIMA package skeleton with AR/SR/HMG core and smoke tests.
  • Implement backend abstraction and parity utility.
  • Add YOLO26 integration interface.

Phase C — Reproduction track (next)

  • Reproduce upstream benchmark pipeline with canonical datasets.
  • Resolve epoch/config discrepancy by controlled experiments.
  • Match published benchmark metrics within tolerance band.

Phase D — WARDOG adaptation

  • Fine-tune with 1.8M UAV + public UAV datasets.
  • Evaluate defense-oriented scenarios and latency targets.
  • Integrate ROS2 + simulator hooks.

8. Acceptance Criteria (Bootstrap Completion)

  • Paper and repo verified with documented risks.
  • Full PRD/task suite generated (prds/, tasks/, ASSETS.md).
  • Initial PRL-Track architecture scaffold implemented in src/.
  • CLI supports --backend mlx|cuda|cpu|auto.
  • Synthetic unit tests pass locally.

9. Open Risks / CTO Escalations

  1. Upstream CUDA hard-binding requires patch/port for complete parity.
  2. Dataset readiness gap for full reproduction and adaptation.
  3. Real-world 42.6 FPS claim cannot be validated without edge hardware run.

10. Outputs of This Pass

  • PRD.md (this file) replaced from scaffold to verified execution plan.
  • ASSETS.md, prds/*, tasks/* generated.
  • Initial code scaffold and tests implemented.