Merged
Changes from 23 commits
Commits
50 commits
3421046
Bump scenescape-controller to 2026.1.0-dev
dmytroye Apr 8, 2026
4b94a3d
ITEP-89828: Support multiple evaluators in pipeline engine
dmytroye Apr 8, 2026
5994ec7
Add dummy evaluator without metric calculations
dmytroye Apr 8, 2026
b0bcf74
Remove not yet existing evaluator
dmytroye Apr 8, 2026
c45492e
Revert trackeval_evaluator config
dmytroye Apr 8, 2026
16bc8fd
Remove not yet existing evaluator
dmytroye Apr 8, 2026
5f59525
Merge branch 'main' into ITEP-89828/multiple-evaluators-support
dmytroye Apr 8, 2026
f845d8e
Update tools/tracker/evaluation/pipeline_engine.py
dmytroye Apr 8, 2026
8825929
Apply code review suggestions
dmytroye Apr 8, 2026
ef9fa4c
Update tools/tracker/evaluation/pipeline_engine.py
dmytroye Apr 8, 2026
e800a2e
Update README.md
dmytroye Apr 8, 2026
ac76633
Each evaluator reads from the same list object without copying it
dmytroye Apr 8, 2026
9183dee
Prettier
dmytroye Apr 8, 2026
57a7aca
Merge branch 'ITEP-89828/multiple-evaluators-support' into ITEP-89467…
dmytroye Apr 8, 2026
14b1afe
Add metrics calculation implementation
dmytroye Apr 9, 2026
2999270
Update config
dmytroye Apr 9, 2026
13413a0
Merge branch 'main' into ITEP-89828/multiple-evaluators-support
dmytroye Apr 9, 2026
c808db9
Merge branch 'ITEP-89828/multiple-evaluators-support' into ITEP-89467…
dmytroye Apr 9, 2026
7583448
Add tests
dmytroye Apr 9, 2026
30a6879
Calculate jitter metrics for GT data
dmytroye Apr 9, 2026
33d065e
Apply suggestions from code review
dmytroye Apr 9, 2026
b4642ec
Add tests
dmytroye Apr 9, 2026
e0acf62
Add test coverage for GT metrics and FPS derivation; fix docstring
dmytroye Apr 9, 2026
8042952
Fix relative import in jitter_evaluator
dmytroye Apr 9, 2026
8f9ed15
Apply suggestions from code review
dmytroye Apr 9, 2026
289a8f8
Apply review suggestions
dmytroye Apr 9, 2026
28d4b49
Calculate ratio Tracker vs GT
dmytroye Apr 9, 2026
752d51a
Prettier
dmytroye Apr 9, 2026
e98604c
Update Readme
dmytroye Apr 9, 2026
691272b
Update config filename
dmytroye Apr 9, 2026
0872eae
Merge branch 'ITEP-89828/multiple-evaluators-support' of https://gith…
dmytroye Apr 9, 2026
e6b3b49
Merge branch 'main' into ITEP-89828/multiple-evaluators-support
dmytroye Apr 9, 2026
da2b4e7
Rename config
dmytroye Apr 9, 2026
cc8ffaa
Merge branch 'ITEP-89828/multiple-evaluators-support' into ITEP-89467…
dmytroye Apr 9, 2026
6e1d1c0
Merge config
dmytroye Apr 9, 2026
80095d6
Merge branch 'main' into ITEP-89828/multiple-evaluators-support
dmytroye Apr 10, 2026
558cc75
Merge branch 'ITEP-89828/multiple-evaluators-support' into ITEP-89467…
dmytroye Apr 10, 2026
94d58e9
Resolve conflicts
dmytroye Apr 10, 2026
2aabc9a
Merge branch 'ITEP-89467/add-jitter-metrics' of https://github.com/op…
dmytroye Apr 10, 2026
977af97
Merge branch 'main' into ITEP-89467/add-jitter-metrics
dmytroye Apr 10, 2026
623d2c0
Review suggestion
dmytroye Apr 10, 2026
78bf696
Rename config file
dmytroye Apr 10, 2026
2990c7b
Update Readmes
dmytroye Apr 10, 2026
721a00e
extend the table with diagnostic evaluator
dmytroye Apr 10, 2026
70bea5a
Prettier
dmytroye Apr 10, 2026
260fef6
Resolve conflicts in README.md
dmytroye Apr 10, 2026
238a681
Revert scene_controller_harness.py
dmytroye Apr 10, 2026
44e4f4f
README
dmytroye Apr 10, 2026
f82064c
Update Agents.md
dmytroye Apr 10, 2026
dc7cc16
Small change in init.py
dmytroye Apr 10, 2026
45 changes: 42 additions & 3 deletions tools/tracker/evaluation/README.md
Original file line number Diff line number Diff line change
@@ -72,6 +72,10 @@ evaluators:
- class: evaluators.trackeval_evaluator.TrackEvalEvaluator
config:
metrics: [HOTA, MOTA, IDF1]
- class: evaluators.jitter_evaluator.JitterEvaluator
config:
metrics:
[rms_jerk, rms_jerk_gt, acceleration_variance, acceleration_variance_gt]
```

Run the pipeline:
@@ -88,10 +92,38 @@ python -m pipeline_engine config.yaml
├── dataset/              # Dataset-specific caches or exports
├── harness/              # Harness logs or artifacts
└── evaluators/
    └── <evaluator-key>/  # One folder per evaluator
```

The `<evaluator-key>` is the evaluator class name (e.g., `TrackEvalEvaluator`). When two evaluators
share the same class name, an index suffix is appended to keep keys unique
(e.g., `TrackEvalEvaluator_0/`, `TrackEvalEvaluator_1/`).

Example with a single evaluator:

```
/tmp/tracker-evaluation/20260211_142530/evaluators/TrackEvalEvaluator/
```

Example with two evaluators of the same class:

```
/tmp/tracker-evaluation/20260211_142530/evaluators/TrackEvalEvaluator_0/
/tmp/tracker-evaluation/20260211_142530/evaluators/TrackEvalEvaluator_1/
```

**Multiple evaluators**: The `evaluators` list accepts any number of entries. Each evaluator runs
against the same tracker outputs independently:

```yaml
evaluators:
- class: evaluators.trackeval_evaluator.TrackEvalEvaluator
config:
metrics: [HOTA, MOTA]
- class: evaluators.trackeval_evaluator.TrackEvalEvaluator
config:
metrics: [IDF1]
```

## Directory Structure

@@ -123,6 +155,13 @@ evaluation/
1. Create a new file in `evaluators/` (e.g., `custom_evaluator.py`)
2. Implement the `TrackerEvaluator` ABC from `base/tracker_evaluator.py`
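A minimal skeleton of such a file might look like the following. This is a hypothetical sketch: in the real codebase the class would subclass the `TrackerEvaluator` ABC from `base/tracker_evaluator.py`, and the method names here are inferred from the usage examples in this README, so check the ABC for the authoritative signatures:

```python
from pathlib import Path

class CountingEvaluator:
    """Toy evaluator that reports how many tracker output records it saw."""

    def __init__(self):
        self._metrics = []
        self._outputs = []
        self._output_folder = None

    def configure_metrics(self, metrics):
        # Record which metrics the pipeline asked for.
        self._metrics = list(metrics)

    def set_output_folder(self, path: Path):
        self._output_folder = path
        return self

    def process_tracker_outputs(self, tracker_outputs, ground_truth=None):
        # Tracker outputs arrive in the canonical Tracker Output Format.
        self._outputs = list(tracker_outputs)

    def evaluate_metrics(self):
        # Toy metric: the number of tracker output records processed.
        return {"num_outputs": len(self._outputs)}
```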

### Available Evaluators

| Evaluator | Metrics | Description |
| -------------------- | ------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `TrackEvalEvaluator` | HOTA, MOTA, IDF1, and more | Industry-standard tracking accuracy metrics via the TrackEval library |
| `JitterEvaluator` | `rms_jerk`, `rms_jerk_gt`, `acceleration_variance`, `acceleration_variance_gt` | Trajectory smoothness metrics based on numerical differentiation of 3D positions; GT variants allow comparing tracker-added jitter against test-data jitter |

## Canonical Data Formats

The pipeline uses standardized data formats defined by JSON schemas to enable interoperability between components. All implementations must conform to these canonical formats.
@@ -244,7 +283,7 @@ pytest . -v -m integration
pytest tests/ -v # Integration tests
pytest datasets/tests/ -v # Dataset unit tests
pytest harnesses/tests/ -v # Harness unit tests
pytest evaluators/tests/ -v # Evaluators unit tests
```

**Run tests from a specific file**:
6 changes: 3 additions & 3 deletions tools/tracker/evaluation/base/tracker_harness.py
@@ -4,7 +4,7 @@
"""Base class for tracker harness implementations."""

from abc import ABC, abstractmethod
from typing import Dict, Any, Iterator, List
from pathlib import Path


@@ -71,7 +71,7 @@ def set_output_folder(self, path: Path) -> 'TrackerHarness':
pass

@abstractmethod
def process_inputs(self, inputs: Iterator[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Process input detections through the tracker synchronously.

This is the default (synchronous) mode. Processes all inputs and returns outputs.
@@ -82,7 +82,7 @@ def process_inputs(self, inputs: Iterator[Dict[str, Any]]) -> Iterator[Dict[str,
(see tools/tracker/evaluation/README.md#canonical-data-formats).

Returns:
List of tracker outputs in canonical Tracker Output Format.

Raises:
RuntimeError: If processing fails.
73 changes: 73 additions & 0 deletions tools/tracker/evaluation/evaluators/README.md
@@ -96,6 +96,79 @@ print(f"IDF1: {metrics['IDF1']:.3f}")

**Tests**: See [tests/test_trackeval_evaluator.py](tests/test_trackeval_evaluator.py) for a comprehensive test suite with 16 test cases covering configuration, processing, evaluation, and integration workflows.

### JitterEvaluator

**Purpose**: Evaluate tracker smoothness by measuring positional jitter in tracked object trajectories, and compare it against jitter already present in the ground-truth test data.

**Status**: **FULLY IMPLEMENTED**. Computes RMS jerk and acceleration variance from both tracker outputs and ground-truth tracks using numerical differentiation.

**Supported Metrics**:

| Metric | Source | Description |
| -------------------------- | -------------- | ---------------------------------------------------------------------------- |
| `rms_jerk` | Tracker output | RMS jerk across all tracker output tracks (m/sΒ³) |
| `acceleration_variance` | Tracker output | Variance of acceleration magnitudes across all tracker output tracks (m/sΒ²)Β² |
| `rms_jerk_gt` | Ground truth | Same as `rms_jerk` computed on ground-truth tracks |
| `acceleration_variance_gt` | Ground truth | Same as `acceleration_variance` computed on ground-truth tracks |

Comparing `rms_jerk` with `rms_jerk_gt` shows how much jitter the tracker
adds on top of any jitter already present in the test data.

**Algorithm**:

All metrics are derived by applying three sequential layers of forward finite differences to 3D positions, accounting for variable time steps between frames:

$$v_i = \frac{p_{i+1} - p_i}{\Delta t_i}, \quad a_i = \frac{v_{i+1} - v_i}{\Delta t_{v,i}}, \quad j_i = \frac{a_{i+1} - a_i}{\Delta t_{a,i}}$$

- **rms_jerk / rms_jerk_gt**: $\sqrt{\frac{1}{N}\sum |j_i|^2}$ over all jerk samples from all tracks.
- **acceleration_variance / acceleration_variance_gt**: $\text{Var}(|a_i|)$ over all acceleration magnitude samples from all tracks.

Minimum track length: 3 points for acceleration, 4 points for jerk. Shorter tracks are skipped; if no eligible tracks exist, the metric returns 0.0.

For GT metrics, ground-truth frame numbers are converted to relative timestamps using the FPS derived from the tracker output.
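Under the definitions above, the per-track computation can be sketched in NumPy. This is a minimal illustration, not the evaluator's actual implementation; in particular, using midpoint timestamps for the velocity and acceleration samples is an assumption about how the variable time steps are handled:

```python
import numpy as np

def jitter_metrics(timestamps, positions):
    """Compute (rms_jerk, acceleration_variance) for a single track of
    3D positions with possibly non-uniform timestamps, via forward
    finite differences. Tracks too short for a metric yield 0.0."""
    t = np.asarray(timestamps, dtype=float)
    p = np.asarray(positions, dtype=float)
    v = np.diff(p, axis=0) / np.diff(t)[:, None]        # velocity samples
    tv = (t[:-1] + t[1:]) / 2                           # assumed midpoint times for v
    a = np.diff(v, axis=0) / np.diff(tv)[:, None]       # acceleration samples
    ta = (tv[:-1] + tv[1:]) / 2                         # assumed midpoint times for a
    j = np.diff(a, axis=0) / np.diff(ta)[:, None]       # jerk samples
    a_mag = np.linalg.norm(a, axis=1)
    j_mag = np.linalg.norm(j, axis=1)
    rms_jerk = float(np.sqrt(np.mean(j_mag ** 2))) if len(j_mag) else 0.0
    accel_var = float(np.var(a_mag)) if len(a_mag) else 0.0
    return rms_jerk, accel_var
```

A perfectly smooth track (constant velocity) yields zero for both metrics, which is the baseline the real metrics compare against across all tracks.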

**Key Features**:

- Builds per-track position histories from canonical tracker output format.
- Parses MOTChallenge 3D CSV ground-truth file for GT metric computation.
- Supports variable frame rates; time deltas are computed from actual timestamps.
- Deduplicates frames with identical timestamps (mirrors `TrackEvalEvaluator` behaviour).
- Sorts each track's positions by timestamp before metric computation.
- Saves a plain-text `jitter_results.txt` summary to the configured output folder.

**Usage Example**:

```python
from pathlib import Path
from evaluators.jitter_evaluator import JitterEvaluator

evaluator = JitterEvaluator()
evaluator.configure_metrics(['rms_jerk', 'rms_jerk_gt', 'acceleration_variance', 'acceleration_variance_gt'])
evaluator.set_output_folder(Path('/path/to/results'))

# Pass ground_truth=None to skip GT metrics
evaluator.process_tracker_outputs(tracker_outputs, ground_truth=dataset.get_ground_truth())
metrics = evaluator.evaluate_metrics()

print(f"RMS Jerk (tracker): {metrics['rms_jerk']:.4f} m/sΒ³")
print(f"RMS Jerk (GT): {metrics['rms_jerk_gt']:.4f} m/sΒ³")
print(f"Tracker added jitter: {metrics['rms_jerk'] - metrics['rms_jerk_gt']:.4f} m/sΒ³")
```

**Pipeline Configuration**:

```yaml
evaluators:
- class: evaluators.jitter_evaluator.JitterEvaluator
config:
metrics:
[rms_jerk, rms_jerk_gt, acceleration_variance, acceleration_variance_gt]
```

**Implementation**: [jitter_evaluator.py](jitter_evaluator.py)

**Tests**: See [tests/test_jitter_evaluator.py](tests/test_jitter_evaluator.py).

## Adding New Evaluators

To add support for a new metric computation library:
3 changes: 2 additions & 1 deletion tools/tracker/evaluation/evaluators/__init__.py
@@ -3,5 +3,6 @@

"""Evaluator implementations for tracker evaluation."""
from .trackeval_evaluator import TrackEvalEvaluator
from .jitter_evaluator import JitterEvaluator

__all__ = ['TrackEvalEvaluator', 'JitterEvaluator']