This directory contains evaluator implementations for computing tracking quality metrics in the evaluation pipeline.
Each tracker evaluator implements the TrackerEvaluator abstract base class (see ../base/tracker_evaluator.py) to:
- Configure which metrics to compute
- Process tracker outputs and ground-truth data
- Compute industry-standard tracking quality metrics
- Export results and optional plots
Evaluators handle metric-library-specific details (TrackEval, py-motmetrics, etc.) while providing a unified interface to the evaluation pipeline.
## TrackEvalEvaluator

Purpose: Compute tracking quality metrics using the TrackEval library with custom 3D point tracking support.
Status: FULLY IMPLEMENTED - Computes real metrics from tracker outputs using TrackEval library with custom MotChallenge3DPoint dataset class.
Supported Metrics:
- HOTA metrics: HOTA, DetA, AssA, LocA, DetPr, DetRe, AssPr, AssRe
- CLEAR MOT metrics: MOTA, MOTP, MT, ML, Frag
- Identity metrics: IDF1, IDP, IDR
For full metric list, refer to the TrackEval documentation: https://pypi.org/project/trackeval/.
Key Features:
- 3D Point Tracking: Custom `MotChallenge3DPoint` class extends TrackEval's `MotChallenge2DBox` with:
  - Euclidean distance similarity (instead of IoU)
  - 3D position extraction (x, y, z from the translation field)
  - Configurable distance threshold (default: 2.0 meters for 0.5 similarity)
- Format Conversion: Automatic conversion from canonical JSON format to MOTChallenge CSV
- UUID Mapping: Consistent UUID-to-integer ID mapping for track identity preservation
- Timestamp Handling: Frame synchronization via FPS-based timestamp-to-frame conversion
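The Euclidean similarity can be illustrated with a minimal sketch. The linear falloff (reaching 0.5 similarity at the default 2.0 m threshold) and the helper name are assumptions for illustration; the actual mapping lives inside the `MotChallenge3DPoint` class.

```python
import numpy as np

def euclidean_similarity(tracker_pos, gt_pos, distance_threshold=2.0):
    """Map 3D Euclidean distance to a [0, 1] similarity score.

    Hypothetical helper: a linear falloff that yields similarity 0.5
    exactly at `distance_threshold` and 0.0 at twice that distance.
    """
    # Pairwise distances between (N, 3) tracker and (M, 3) GT positions
    dists = np.linalg.norm(tracker_pos[:, None, :] - gt_pos[None, :, :], axis=2)
    return np.clip(1.0 - dists / (2.0 * distance_threshold), 0.0, 1.0)
```

With the default threshold, a detection 2.0 m from its ground-truth point scores 0.5, and anything beyond 4.0 m scores 0.0.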
Usage Example:
```python
import sys
from pathlib import Path

# Make sibling packages importable when running as a script
sys.path.insert(0, str(Path(__file__).parent))

from evaluators.trackeval_evaluator import TrackEvalEvaluator
from datasets.metric_test_dataset import MetricTestDataset
from harnesses.scene_controller_harness import SceneControllerHarness

# Initialize dataset
dataset = MetricTestDataset("path/to/dataset")
dataset.set_cameras(["x1", "x2"]).set_camera_fps(30)

# Initialize and run harness
harness = SceneControllerHarness(container_image='scenescape-controller:latest')
harness.set_scene_config(dataset.get_scene_config())
harness.set_custom_config({'tracker_config_path': '/path/to/tracker-config.json'})
tracker_outputs = harness.process_inputs(dataset.get_inputs())

# Initialize evaluator
evaluator = TrackEvalEvaluator()

# Configure metrics
evaluator.configure_metrics(['HOTA', 'MOTA', 'IDF1'])
evaluator.set_output_folder(Path('/path/to/results'))

# Process and evaluate
evaluator.process_tracker_outputs(
    tracker_outputs=tracker_outputs,
    ground_truth=dataset.get_ground_truth()
)

# Get metrics
metrics = evaluator.evaluate_metrics()
print(f"HOTA: {metrics['HOTA']:.3f}")
print(f"MOTA: {metrics['MOTA']:.3f}")
print(f"IDF1: {metrics['IDF1']:.3f}")
```

Current Limitations:
- Fixed class name ("pedestrian") for all objects
- Single-sequence evaluation only
- No parallel processing support
- Limited configuration options for TrackEval parameters
Implementation: trackeval_evaluator.py
Tests: See tests/test_trackeval_evaluator.py for comprehensive test suite with 16 test cases covering configuration, processing, evaluation, and integration workflows.
## JitterEvaluator

Purpose: Evaluate tracker smoothness by measuring positional jitter in tracked object trajectories, and compare it against jitter already present in the ground-truth test data.
Status: FULLY IMPLEMENTED — Computes RMS jerk and acceleration variance from both tracker outputs and ground-truth tracks using numerical differentiation.
Supported Metrics:
| Metric | Source | Description |
|---|---|---|
| `rms_jerk` | Tracker output | RMS jerk across all tracker output tracks (m/s³) |
| `acceleration_variance` | Tracker output | Variance of acceleration magnitudes across all tracker output tracks ((m/s²)²) |
| `rms_jerk_gt` | Ground truth | Same as `rms_jerk`, computed on ground-truth tracks |
| `acceleration_variance_gt` | Ground truth | Same as `acceleration_variance`, computed on ground-truth tracks |
Comparing `rms_jerk` with `rms_jerk_gt` shows how much jitter the tracker adds on top of any jitter already present in the test data.
Algorithm:
All metrics are derived by applying three sequential layers of forward finite differences to 3D positions, accounting for variable time steps between frames:
- `rms_jerk` / `rms_jerk_gt`: $\sqrt{\frac{1}{N}\sum |j_i|^2}$ over all jerk samples from all tracks.
- `acceleration_variance` / `acceleration_variance_gt`: $\text{Var}(|a_i|)$ over all acceleration magnitude samples from all tracks.
Minimum track length: 3 points for acceleration, 4 points for jerk. Shorter tracks are skipped; if no eligible tracks exist, the metric returns 0.0.
For GT metrics, ground-truth frame numbers are converted to relative timestamps using the FPS derived from the tracker output.
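As a concrete sketch of the differentiation scheme, per-track RMS jerk can be computed with three layers of forward differences over actual time deltas. The helper name is hypothetical and the evaluator's internal implementation may differ in detail:

```python
import numpy as np

def rms_jerk(positions, timestamps):
    """RMS jerk (m/s³) for one track, via three layers of forward differences.

    Hypothetical sketch: velocity, acceleration, and jerk are each computed
    as forward finite differences using the actual time deltas between frames.
    """
    p = np.asarray(positions, dtype=float)   # shape (N, 3)
    t = np.asarray(timestamps, dtype=float)  # shape (N,)
    if len(p) < 4:                           # need 4 points for one jerk sample
        return 0.0
    # Layer 1 — velocity: first forward difference of position
    v = np.diff(p, axis=0) / np.diff(t)[:, None]
    # Layer 2 — acceleration: forward difference of velocity
    a = np.diff(v, axis=0) / np.diff(t[:-1])[:, None]
    # Layer 3 — jerk: forward difference of acceleration
    j = np.diff(a, axis=0) / np.diff(t[:-2])[:, None]
    mags = np.linalg.norm(j, axis=1)         # jerk magnitudes |j_i|
    return float(np.sqrt(np.mean(mags ** 2)))
```

A constant-velocity track yields zero jerk, while a cubic trajectory such as x = t³ yields a constant jerk of 6 m/s³ under unit-step forward differences.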
Key Features:
- Builds per-track position histories from canonical tracker output format.
- Parses MOTChallenge 3D CSV ground-truth file for GT metric computation.
- Supports variable frame rates — time deltas are computed from actual timestamps.
- Deduplicates frames with identical timestamps (mirrors `TrackEvalEvaluator` behaviour).
- Sorts each track's positions by timestamp before metric computation.
- Saves a plain-text `jitter_results.txt` summary to the configured output folder.
Usage Example:
```python
from pathlib import Path

from evaluators.jitter_evaluator import JitterEvaluator

evaluator = JitterEvaluator()
evaluator.configure_metrics(['rms_jerk', 'rms_jerk_gt', 'acceleration_variance', 'acceleration_variance_gt'])
evaluator.set_output_folder(Path('/path/to/results'))

# Pass ground_truth=None to skip GT metrics
evaluator.process_tracker_outputs(tracker_outputs, ground_truth=dataset.get_ground_truth())

metrics = evaluator.evaluate_metrics()
print(f"RMS Jerk (tracker): {metrics['rms_jerk']:.4f} m/s³")
print(f"RMS Jerk (GT): {metrics['rms_jerk_gt']:.4f} m/s³")
print(f"Tracker-added jitter: {metrics['rms_jerk'] - metrics['rms_jerk_gt']:.4f} m/s³")
```

Pipeline Configuration:
```yaml
evaluators:
  - class: evaluators.jitter_evaluator.JitterEvaluator
    config:
      metrics:
        [rms_jerk, rms_jerk_gt, acceleration_variance, acceleration_variance_gt]
```

Implementation: jitter_evaluator.py
Tests: See tests/test_jitter_evaluator.py.
## Adding a New Evaluator

To add support for a new metric computation library:
- Create evaluator class: Implement all abstract methods from the `TrackerEvaluator` base class (see ../base/tracker_evaluator.py)
- Integrate metric library: Wrap the external library (TrackEval, py-motmetrics, etc.) or implement custom code to compute metrics
- Handle formats: Convert canonical tracker outputs and ground truth to library-specific formats
- Support configuration: `configure_metrics()` to specify which metrics to compute; `set_output_folder()` to set where results and plots are saved
- Document requirements: Update this README with supported metrics and configuration options
- Create tests: Add tests validating metric computation and result export
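The steps above can be sketched as a minimal skeleton. The method names follow the workflow described in this README; the `MyLibraryEvaluator` name and stubbed internals are hypothetical, and subclassing of `TrackerEvaluator` is omitted to keep the sketch self-contained:

```python
from pathlib import Path
from typing import Dict, List


class MyLibraryEvaluator:
    """Hypothetical evaluator skeleton; a real one subclasses TrackerEvaluator."""

    def __init__(self):
        self._metrics: List[str] = []
        self._output_folder = None
        self._results: Dict[str, float] = {}

    def configure_metrics(self, metrics):
        # Record which metrics to compute
        self._metrics = list(metrics)
        return self  # fluent API: configuration methods return self

    def set_output_folder(self, folder):
        # Where results and optional plots are saved
        self._output_folder = Path(folder)
        return self

    def process_tracker_outputs(self, tracker_outputs, ground_truth=None):
        # Convert canonical tracker outputs / ground truth to the metric
        # library's native format here; stubbed out for this sketch.
        self._results = {name: 0.0 for name in self._metrics}
        return self

    def evaluate_metrics(self) -> Dict[str, float]:
        # Return one float per configured metric
        return dict(self._results)

    def reset(self):
        # Clear state so another tracker can be evaluated
        self._metrics, self._results = [], {}
        return self
```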
Metric computation workflow:
- Configure metrics via `configure_metrics(['HOTA', 'MOTA', ...])`
- Set the result output folder via `set_output_folder(Path('/results'))`
- Process data via `process_tracker_outputs(tracker_outputs, ground_truth)`
- Compute metrics via `evaluate_metrics()` → returns `Dict[str, float]`
- Reset state via `reset()` to evaluate another tracker
Method chaining:
All configuration methods return self for fluent API:
```python
metrics = (evaluator
           .configure_metrics(['HOTA', 'MOTA'])
           .set_output_folder(Path('/results'))
           .process_tracker_outputs(outputs, gt)
           .evaluate_metrics())
```

Ground-truth format: Evaluators receive ground truth in MOTChallenge 3D CSV format; see Canonical Data Formats.
- Provided by the dataset's `get_ground_truth()` method
See tracker-evaluation-pipeline.md for overall architecture and design decisions.