Merged
50 commits
3421046
Bump scenescape-controller to 2026.1.0-dev
dmytroye Apr 8, 2026
4b94a3d
ITEP-89828: Support multiple evaluators in pipeline engine
dmytroye Apr 8, 2026
5994ec7
Add dummy evaluator without metric calculations
dmytroye Apr 8, 2026
b0bcf74
Remove not yet existing evaluator
dmytroye Apr 8, 2026
c45492e
Revert trackeval_evaluator config
dmytroye Apr 8, 2026
16bc8fd
Remove not yet existing evaluator
dmytroye Apr 8, 2026
5f59525
Merge branch 'main' into ITEP-89828/multiple-evaluators-support
dmytroye Apr 8, 2026
f845d8e
Update tools/tracker/evaluation/pipeline_engine.py
dmytroye Apr 8, 2026
8825929
Apply code review suggestions
dmytroye Apr 8, 2026
ef9fa4c
Update tools/tracker/evaluation/pipeline_engine.py
dmytroye Apr 8, 2026
e800a2e
Update README.md
dmytroye Apr 8, 2026
ac76633
Each evaluator reads from the same list object without copying it
dmytroye Apr 8, 2026
9183dee
Prettier
dmytroye Apr 8, 2026
57a7aca
Merge branch 'ITEP-89828/multiple-evaluators-support' into ITEP-89467…
dmytroye Apr 8, 2026
14b1afe
Add metrics calculation implementation
dmytroye Apr 9, 2026
2999270
Update config
dmytroye Apr 9, 2026
13413a0
Merge branch 'main' into ITEP-89828/multiple-evaluators-support
dmytroye Apr 9, 2026
c808db9
Merge branch 'ITEP-89828/multiple-evaluators-support' into ITEP-89467…
dmytroye Apr 9, 2026
7583448
Add tests
dmytroye Apr 9, 2026
30a6879
Calculate jitter metrics for GT data
dmytroye Apr 9, 2026
33d065e
Apply suggestions from code review
dmytroye Apr 9, 2026
b4642ec
Add tests
dmytroye Apr 9, 2026
e0acf62
Add test coverage for GT metrics and FPS derivation; fix docstring
dmytroye Apr 9, 2026
8042952
Fix relative import in jitter_evaluator
dmytroye Apr 9, 2026
8f9ed15
Apply suggestions from code review
dmytroye Apr 9, 2026
289a8f8
Apply review suggestions
dmytroye Apr 9, 2026
28d4b49
Calculate ratio Tracker vs GT
dmytroye Apr 9, 2026
752d51a
Prettier
dmytroye Apr 9, 2026
e98604c
Update Readme
dmytroye Apr 9, 2026
691272b
Update config filename
dmytroye Apr 9, 2026
0872eae
Merge branch 'ITEP-89828/multiple-evaluators-support' of https://gith…
dmytroye Apr 9, 2026
e6b3b49
Merge branch 'main' into ITEP-89828/multiple-evaluators-support
dmytroye Apr 9, 2026
da2b4e7
Rename config
dmytroye Apr 9, 2026
cc8ffaa
Merge branch 'ITEP-89828/multiple-evaluators-support' into ITEP-89467…
dmytroye Apr 9, 2026
6e1d1c0
Merge config
dmytroye Apr 9, 2026
80095d6
Merge branch 'main' into ITEP-89828/multiple-evaluators-support
dmytroye Apr 10, 2026
558cc75
Merge branch 'ITEP-89828/multiple-evaluators-support' into ITEP-89467…
dmytroye Apr 10, 2026
94d58e9
Resolve conflicts
dmytroye Apr 10, 2026
2aabc9a
Merge branch 'ITEP-89467/add-jitter-metrics' of https://github.com/op…
dmytroye Apr 10, 2026
977af97
Merge branch 'main' into ITEP-89467/add-jitter-metrics
dmytroye Apr 10, 2026
623d2c0
Review suggestion
dmytroye Apr 10, 2026
78bf696
Rename config file
dmytroye Apr 10, 2026
2990c7b
Update Readmes
dmytroye Apr 10, 2026
721a00e
extend the table with diagnostic evaluator
dmytroye Apr 10, 2026
70bea5a
Prettier
dmytroye Apr 10, 2026
260fef6
Resolve conflicts in README.md
dmytroye Apr 10, 2026
238a681
Revert scene_controller_harness.py
dmytroye Apr 10, 2026
44e4f4f
README
dmytroye Apr 10, 2026
f82064c
Update Agents.md
dmytroye Apr 10, 2026
dc7cc16
Small change in init.py
dmytroye Apr 10, 2026
8 changes: 8 additions & 0 deletions tools/tracker/evaluation/Agents.md
@@ -63,6 +63,12 @@ Check `harnesses/README.md` for more details
- **TrackEvalEvaluator**: `evaluators/trackeval_evaluator.py`
Wraps the TrackEval library, provides tracker output format conversion, and delivers state-of-the-art tracking metrics.

- **DiagnosticEvaluator**: `evaluators/diagnostic_evaluator.py`
Per-frame location and distance error analysis between bipartite-matched tracker output tracks and ground-truth tracks. Produces CSV and plot outputs alongside summary scalar metrics (`DIST_T_mean`, `LOC_T_X_mae`, `LOC_T_Y_mae`, `num_matches`).

- **JitterEvaluator**: `evaluators/jitter_evaluator.py`
Measures trajectory smoothness via RMS jerk and acceleration variance, computed from both tracker outputs and ground-truth tracks. Supports GT and ratio variants to isolate tracker-added jitter from dataset-inherent jitter.

Multiple evaluators can be configured in a single YAML pipeline; each runs independently against the same tracker outputs and writes results to its own subfolder under the run output directory.
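The dispatch described above could be sketched roughly as follows. This is a hypothetical illustration, not the pipeline engine's actual code; only `set_output_folder`, `process_tracker_outputs`, and `evaluate_metrics` come from the evaluator API documented below, while `run_evaluators` and the subfolder naming are assumptions:

```python
from pathlib import Path

def run_evaluators(evaluators, tracker_outputs, run_dir):
    """Run each configured evaluator independently against the same outputs."""
    results = {}
    for ev in evaluators:
        # Each evaluator writes into its own subfolder under the run directory
        out = Path(run_dir) / type(ev).__name__
        out.mkdir(parents=True, exist_ok=True)
        ev.set_output_folder(out)
        # The same tracker output list is shared, not copied, between evaluators
        ev.process_tracker_outputs(tracker_outputs)
        results[type(ev).__name__] = ev.evaluate_metrics()
    return results
```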

Check `evaluators/README.md` for more details
@@ -75,6 +81,8 @@ Check `evaluators/README.md` for more details
- Harness: [base/tracker_harness.py](base/tracker_harness.py)
- Evaluator: [base/tracker_evaluator.py](base/tracker_evaluator.py)
- **TrackEval adapter & helpers**: [evaluators/trackeval_evaluator.py](evaluators/trackeval_evaluator.py), [utils/format_converters/](./utils/format_converters.py).
- **Jitter adapter**: [evaluators/jitter_evaluator.py](evaluators/jitter_evaluator.py).
- **Diagnostic adapter**: [evaluators/diagnostic_evaluator.py](evaluators/diagnostic_evaluator.py).

## Guidelines for Adding New Component or Updating Existing One

24 changes: 23 additions & 1 deletion tools/tracker/evaluation/README.md
@@ -72,6 +72,20 @@ evaluators:
- class: evaluators.trackeval_evaluator.TrackEvalEvaluator
config:
metrics: [HOTA, MOTA, IDF1]
- class: evaluators.diagnostic_evaluator.DiagnosticEvaluator
config:
metrics: [LOC_T_X, LOC_T_Y, DIST_T]
- class: evaluators.jitter_evaluator.JitterEvaluator
config:
metrics:
[
rms_jerk,
rms_jerk_gt,
rms_jerk_ratio,
acceleration_variance,
acceleration_variance_gt,
acceleration_variance_ratio,
]
```

Run the pipeline:
@@ -128,6 +142,14 @@ evaluation/
1. Create a new file in `evaluators/` (e.g., `custom_evaluator.py`)
2. Implement the `TrackerEvaluator` ABC from `base/tracker_evaluator.py`

### Available Evaluators

| Evaluator | Metrics | Description |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `TrackEvalEvaluator` | HOTA, MOTA, IDF1, and more | Industry-standard tracking accuracy metrics via the TrackEval library |
| `DiagnosticEvaluator` | `LOC_T_X`, `LOC_T_Y`, `DIST_T` β†’ summary scalars: `DIST_T_mean`, `LOC_T_X_mae`, `LOC_T_Y_mae`, `num_matches` | Per-frame location and distance error between matched tracker output tracks and ground-truth tracks; uses bipartite (Hungarian) assignment over overlapping frames |
| `JitterEvaluator` | `rms_jerk`, `rms_jerk_gt`, `rms_jerk_ratio`, `acceleration_variance`, `acceleration_variance_gt`, `acceleration_variance_ratio` | Trajectory smoothness metrics based on numerical differentiation of 3D positions; GT and ratio variants allow comparing tracker-added jitter against test-data jitter |
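For illustration, the bipartite matching and summary scalars in the `DiagnosticEvaluator` row could be sketched like this. The sketch assumes SciPy is available and simplifies by matching tracks that cover the same frames; the function names are illustrative, not the evaluator's actual API:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_tracks(tracker_xy, gt_xy):
    """Hungarian assignment of tracker tracks to GT tracks by mean distance.

    tracker_xy, gt_xy: lists of (N, 2) position arrays; for simplicity each
    candidate pair is assumed to cover the same frames.
    """
    cost = np.array([[np.linalg.norm(t - g, axis=1).mean() for g in gt_xy]
                     for t in tracker_xy])
    rows, cols = linear_sum_assignment(cost)  # minimum-cost bipartite matching
    return list(zip(rows, cols))

def summary_metrics(tracker_xy, gt_xy, pairs):
    """Per-frame errors over matched pairs, reduced to the summary scalars."""
    errs = np.concatenate([tracker_xy[i] - gt_xy[j] for i, j in pairs])
    dists = np.linalg.norm(errs, axis=1)
    return {
        "DIST_T_mean": float(dists.mean()),
        "LOC_T_X_mae": float(np.abs(errs[:, 0]).mean()),
        "LOC_T_Y_mae": float(np.abs(errs[:, 1]).mean()),
        "num_matches": len(pairs),
    }
```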

## Canonical Data Formats

The pipeline uses standardized data formats defined by JSON schemas to enable interoperability between components. All implementations must conform to these canonical formats.
@@ -249,7 +271,7 @@ pytest . -v -m integration
pytest tests/ -v # Integration tests
pytest datasets/tests/ -v # Dataset unit tests
pytest harnesses/tests/ -v # Harness unit tests
pytest evaluators/tests/ -v # Evaluators unit tests
pytest evaluators/tests/ -v # Evaluators unit tests
```

**Run tests from a specific file**:
85 changes: 85 additions & 0 deletions tools/tracker/evaluation/evaluators/README.md
@@ -146,6 +146,91 @@ print(f"Matched pairs: {int(metrics['num_matches'])}")

**Tests**: See [tests/test_diagnostic_evaluator.py](tests/test_diagnostic_evaluator.py) for unit tests covering track matching, scalar metrics, CSV output, and reset workflows.

### JitterEvaluator

**Purpose**: Evaluate tracker smoothness by measuring positional jitter in tracked object trajectories, and compare it against jitter already present in the ground-truth test data.

**Status**: **FULLY IMPLEMENTED**. Computes RMS jerk and acceleration variance from both tracker outputs and ground-truth tracks using numerical differentiation.

**Supported Metrics**:

| Metric                        | Source         | Description                                                                                                      |
| ----------------------------- | -------------- | ---------------------------------------------------------------------------------------------------------------- |
| `rms_jerk`                    | Tracker output | RMS jerk across all tracker output tracks (m/s³)                                                                  |
| `acceleration_variance`       | Tracker output | Variance of acceleration magnitudes across all tracker output tracks ((m/s²)²)                                    |
| `rms_jerk_gt`                 | Ground truth   | Same as `rms_jerk`, computed on ground-truth tracks                                                               |
| `acceleration_variance_gt`    | Ground truth   | Same as `acceleration_variance`, computed on ground-truth tracks                                                  |
| `rms_jerk_ratio`              | Tracker / GT   | `rms_jerk` / `rms_jerk_gt`: tracker jitter relative to GT (1.0 = equal)                                           |
| `acceleration_variance_ratio` | Tracker / GT   | `acceleration_variance` / `acceleration_variance_gt`: tracker acceleration variance relative to GT (1.0 = equal)  |

Comparing `rms_jerk` with `rms_jerk_gt` shows how much jitter the tracker
adds on top of any jitter already present in the test data.

**Algorithm**:

All metrics are derived by applying three sequential layers of forward finite differences to 3D positions, accounting for variable time steps between frames:

$$v_i = \frac{p_{i+1} - p_i}{\Delta t_i}, \quad a_i = \frac{v_{i+1} - v_i}{\Delta t_{v,i}}, \quad j_i = \frac{a_{i+1} - a_i}{\Delta t_{a,i}}$$

- **rms_jerk / rms_jerk_gt**: $\sqrt{\frac{1}{N}\sum |j_i|^2}$ over all jerk samples from all tracks.
- **acceleration_variance / acceleration_variance_gt**: $\text{Var}(|a_i|)$ over all acceleration magnitude samples from all tracks.
- **rms_jerk_ratio / acceleration_variance_ratio**: tracker metric divided by the corresponding GT metric. Returns 0.0 when the GT denominator is zero. Values >1.0 indicate the tracker adds more jitter than is inherent in the ground truth.

Minimum track length: 3 points for acceleration, 4 points for jerk. Shorter tracks are skipped; if no eligible tracks exist, the metric returns 0.0.

For GT metrics, ground-truth frame numbers are converted to relative timestamps using the FPS derived from the tracker output.
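The layered differentiation above can be sketched numerically as follows. This is an illustrative sketch, not the evaluator's implementation; it assumes `(N, 3)` position arrays with per-sample timestamps, and the function names are hypothetical:

```python
import numpy as np

def _diff(values, times):
    """One layer of forward finite differences with variable time steps."""
    dt = np.diff(times)
    deriv = np.diff(values, axis=0) / dt[:, None]
    # midpoint timestamps feed the next differentiation layer
    return deriv, (times[:-1] + times[1:]) / 2.0

def track_jitter(positions, times):
    """RMS jerk and acceleration variance for one track (needs >= 4 points)."""
    vel, t_v = _diff(positions, times)   # velocity
    acc, t_a = _diff(vel, t_v)           # acceleration
    jerk, _ = _diff(acc, t_a)            # jerk
    rms_jerk = float(np.sqrt(np.mean(np.sum(jerk ** 2, axis=1))))
    acc_var = float(np.var(np.linalg.norm(acc, axis=1)))
    return rms_jerk, acc_var

def ratio(tracker_value, gt_value):
    """Tracker metric relative to GT; 0.0 when the GT denominator is zero."""
    return tracker_value / gt_value if gt_value else 0.0
```

A perfectly smooth constant-velocity track has zero acceleration and jerk, so both metrics come out as 0.0, matching the definitions above.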

**Key Features**:

- Builds per-track position histories from canonical tracker output format.
- Parses MOTChallenge 3D CSV ground-truth file for GT metric computation.
- Supports variable frame rates; time deltas are computed from actual timestamps.
- Deduplicates frames with identical timestamps (mirrors `TrackEvalEvaluator` behaviour).
- Sorts each track's positions by timestamp before metric computation.
- Saves a plain-text `jitter_results.txt` summary to the configured output folder.
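The sorting and deduplication bullets could look roughly like this (a sketch; the function name and the `(timestamp, position)` sample layout are assumptions, not the evaluator's internals):

```python
def prepare_track(samples):
    """Sort (timestamp, position) samples and drop duplicate timestamps.

    Keeps the first occurrence of each timestamp, mirroring the
    frame-deduplication behaviour described above.
    """
    seen = set()
    prepared = []
    for ts, pos in sorted(samples, key=lambda s: s[0]):
        if ts in seen:
            continue  # identical timestamp already seen: skip the duplicate
        seen.add(ts)
        prepared.append((ts, pos))
    return prepared
```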

**Usage Example**:

```python
from pathlib import Path
from evaluators.jitter_evaluator import JitterEvaluator

evaluator = JitterEvaluator()
evaluator.configure_metrics(['rms_jerk', 'rms_jerk_gt', 'rms_jerk_ratio',
'acceleration_variance', 'acceleration_variance_gt',
'acceleration_variance_ratio'])
evaluator.set_output_folder(Path('/path/to/results'))

# Pass ground_truth=None to skip GT metrics
evaluator.process_tracker_outputs(tracker_outputs, ground_truth=dataset.get_ground_truth())
metrics = evaluator.evaluate_metrics()

print(f"RMS Jerk (tracker): {metrics['rms_jerk']:.4f} m/s³")
print(f"RMS Jerk (GT): {metrics['rms_jerk_gt']:.4f} m/s³")
print(f"RMS Jerk ratio: {metrics['rms_jerk_ratio']:.4f} (1.0 = equal jitter)")
```

**Pipeline Configuration**:

```yaml
evaluators:
- class: evaluators.jitter_evaluator.JitterEvaluator
config:
metrics:
[
rms_jerk,
rms_jerk_gt,
rms_jerk_ratio,
acceleration_variance,
acceleration_variance_gt,
acceleration_variance_ratio,
]
```

**Implementation**: [jitter_evaluator.py](jitter_evaluator.py)

**Tests**: See [tests/test_jitter_evaluator.py](tests/test_jitter_evaluator.py).

## Adding New Evaluators

To add support for a new metric computation library:
3 changes: 2 additions & 1 deletion tools/tracker/evaluation/evaluators/__init__.py
@@ -4,5 +4,6 @@
"""Evaluator implementations for tracker evaluation."""
from .trackeval_evaluator import TrackEvalEvaluator
from .diagnostic_evaluator import DiagnosticEvaluator
from .jitter_evaluator import JitterEvaluator

__all__ = ['TrackEvalEvaluator', 'DiagnosticEvaluator']
__all__ = ['TrackEvalEvaluator', 'DiagnosticEvaluator', 'JitterEvaluator']