This repo now includes a full validation path for player, shuttle, and optional pose evaluation.
- benchmark frame extraction
- manual annotation workflow for player boxes and shuttle points
- model inference on labeled frames
- automatic metric calculation and report generation
- scripts/create_validation_manifest.py
- scripts/annotate_validation.py
- scripts/run_validation_inference.py
- scripts/bootstrap_validation_labels.py
- scripts/eval_validation.py
- scripts/render_validation_review.py
- scripts/run_benchmark_bundle.py
- scripts/select_validation_subset.py
- src/eval/metrics.py
- src/model_defaults.py
- validation/schema/frame_labels.schema.json
- validation/examples/frame_label.example.json
python scripts/create_validation_manifest.py \
--video badminton_sample.mp4 \
--out-dir validation/sample \
--every 30 \
--max-frames 20

Outputs:
- validation/sample/manifest.jsonl
- validation/sample/frames/*.jpg
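As a rough illustration of what `--every` and `--max-frames` imply, the sketch below computes which frame indices would be kept. It assumes sampling starts at frame 0, takes every N-th frame, and stops at the cap; the actual script may choose frames differently. The function name `sampled_frame_indices` is hypothetical and not part of the repo.

```python
def sampled_frame_indices(total_frames: int, every: int, max_frames: int) -> list[int]:
    """Return the indices kept when taking every `every`-th frame, capped at `max_frames`.

    Assumption: sampling starts at frame 0. The real script may differ.
    """
    if every <= 0:
        raise ValueError("every must be a positive integer")
    if max_frames < 0:
        raise ValueError("max_frames must not be negative")
    return list(range(0, total_frames, every))[:max_frames]


if __name__ == "__main__":
    # A 1000-frame clip sampled with --every 30 --max-frames 20
    print(sampled_frame_indices(1000, every=30, max_frames=20))
```

Under these assumptions, a short clip yields fewer than `--max-frames` frames, so the cap only matters for longer videos.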
python scripts/annotate_validation.py --manifest validation/sample/manifest.jsonl

Controls:
- 1: annotate P1 box by drag
- 2: annotate P2 box by drag
- s: shuttle mode, left click to place shuttle
- o: mark shuttle as occluded / not visible
- c: clear current mode annotation
- n: next frame
- p: previous frame
- w: save
- q: quit
Default output:
validation/sample/labels.jsonl
.venv/bin/python scripts/run_validation_inference.py \
--manifest validation/sample/labels.jsonl \
--with-pose

Default output:
validation/sample/predictions.jsonl
python scripts/eval_validation.py \
--labels validation/sample/labels.jsonl \
--predictions validation/sample/predictions.jsonl

Outputs:
- validation/sample/predictions.eval.json
- validation/sample/predictions.eval.md
If you want a faster review loop, first generate model predictions, then turn them into prefilled labels for later correction:
.venv/bin/python scripts/bootstrap_validation_labels.py \
--manifest validation/sample/manifest.jsonl \
--predictions validation/sample/predictions.jsonl

Output:
validation/sample/labels.bootstrap.jsonl
Then review that file inside the annotator by copying it to labels.jsonl or passing it as your working labels file.
You can also render a contact sheet for quick inspection:
.venv/bin/python scripts/render_validation_review.py \
--input validation/sample/predictions.jsonl

Outputs:
validation/sample/review/*.jpg
This is useful for fast regression review before doing detailed manual correction.
If you want a repeatable regression bundle from a video sample, run:
.venv/bin/python scripts/run_benchmark_bundle.py \
--video badminton_sample.mp4 \
--out-dir validation/bundles/fullcourt12 \
--every 45 \
--max-frames 24 \
--indices 7,8,9,10,11,12,13,14,20,21,22,23 \
--with-pose

Outputs include:
- sampled manifest
- predictions
- bootstrap labels
- review contact sheet
- bundle.summary.json
This is the fastest way to regenerate a benchmark package after tracker or model changes.
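To make the `--indices` flag more concrete, here is a minimal sketch of picking records out of a JSONL manifest by position. It assumes the indices are zero-based positions in the sampled manifest, which this README does not confirm. The helper names and the `frame` field in the demo data are hypothetical.

```python
import json


def parse_indices(spec: str) -> list[int]:
    """Turn a comma-separated string such as "0,1,4" into a list of ints."""
    return [int(token) for token in spec.split(",") if token.strip()]


def select_records(jsonl_lines: list[str], indices: list[int]) -> list[dict]:
    """Keep only the records at the given zero-based positions (assumed convention)."""
    records = [json.loads(line) for line in jsonl_lines if line.strip()]
    return [records[i] for i in indices]


if __name__ == "__main__":
    # Hypothetical manifest lines; the real manifest fields may differ.
    lines = [json.dumps({"frame": f"frame_{i:04d}.jpg"}) for i in range(8)]
    for record in select_records(lines, parse_indices("0,1,4,5,6")):
        print(record["frame"])
```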
If a sampled benchmark includes close-ups, logo cards, or other non-rally frames, you can carve out a cleaner subset:
python scripts/select_validation_subset.py \
--input validation/sample/manifest.jsonl \
--out validation/sample/manifest.rally.jsonl \
--indices 0,1,4,5,6

The repo now prefers the local shuttle-specialized weights when present:
models/weights/shuttle_best.pt
Player detection still defaults to generic yolo11n.pt unless you explicitly pass --player-model, because the local player_best.pt may underperform on some sample videos.
- visibility recall
- mean / median pixel error
- normalized error by image diagonal
- accuracy at 5px / 10px / 20px
- mean IoU for P1 and P2
- recall at IoU 0.5 and 0.75
- mean center error in pixels
- PCK@0.1
- PCK@0.2
Pose metrics are only computed when ground truth pose keypoints are present in the labels file.
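For readers unfamiliar with the metrics listed above, the following sketch shows one common way to compute them. It is not taken from `src/eval/metrics.py`. The function names, the `(x1, y1, x2, y2)` box layout, the use of `<=` at thresholds, and the caller-supplied PCK reference length are all assumptions made for illustration.

```python
import math
from statistics import mean, median


def point_error_stats(pred, gt, image_w, image_h, thresholds=(5, 10, 20)):
    """Pixel error statistics for matched (x, y) points, e.g. shuttle positions."""
    errors = [math.dist(p, g) for p, g in zip(pred, gt)]
    diagonal = math.hypot(image_w, image_h)
    stats = {
        "mean_px": mean(errors),
        "median_px": median(errors),
        "mean_norm_diag": mean(errors) / diagonal,
    }
    for t in thresholds:
        # Assumption: a point exactly at the threshold counts as correct.
        stats[f"acc@{t}px"] = sum(e <= t for e in errors) / len(errors)
    return stats


def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def pck(pred_kpts, gt_kpts, ref_length, alpha):
    """Share of keypoints within alpha * ref_length of ground truth.

    Assumption: the caller supplies ref_length (for example a box size);
    the repo's evaluator may normalize differently.
    """
    hits = [math.dist(p, g) <= alpha * ref_length for p, g in zip(pred_kpts, gt_kpts)]
    return sum(hits) / len(hits)
```

Recall at IoU 0.5 or 0.75 then follows as the fraction of ground truth boxes whose matched prediction reaches that IoU, and PCK@0.1 and PCK@0.2 correspond to `alpha=0.1` and `alpha=0.2`.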
- Ground truth labels use semantic player ids: P1, P2
- Predictions use tracker slots 1, 2, and the evaluator maps them to P1, P2
- This lets the current pipeline be evaluated without changing its runtime output format
- The framework is intended for repeated regression checks after model or tracker changes
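The slot-to-id mapping described in the notes above could look roughly like the sketch below. It assumes a fixed mapping (slot 1 to P1, slot 2 to P2) and a prediction structure keyed by slot. Both are illustrative guesses, and the real evaluator may match players differently, for example by box overlap.

```python
# Assumed fixed mapping from tracker slots to semantic player ids.
SLOT_TO_PLAYER = {1: "P1", 2: "P2"}


def relabel_players(pred_players: dict) -> dict:
    """Re-key predicted player entries from tracker slots to P1 / P2.

    Keys may be ints or numeric strings (JSON object keys are always strings).
    Slots without a known mapping are dropped.
    """
    relabeled = {}
    for slot, value in pred_players.items():
        player_id = SLOT_TO_PLAYER.get(int(slot))
        if player_id is not None:
            relabeled[player_id] = value
    return relabeled


if __name__ == "__main__":
    # Hypothetical prediction entry: slot -> box as [x1, y1, x2, y2]
    print(relabel_players({"1": [100, 200, 180, 420], "2": [600, 90, 660, 250]}))
```

Doing the translation inside the evaluator keeps the runtime output untouched, which matches the intent stated in the notes.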
Start by labeling shuttle + players on 100 to 300 frames across several hard scenes. That already gives a useful benchmark before investing in full pose keypoint annotation.