feat(cli): phentrieve config calibrate-thresholds subcommand #234

@berntpopp

Summary

Add phentrieve config calibrate-thresholds to help users tune quality_threshold and margin_threshold after switching the retrieval embedding model.

Why

The CLI profiles + adaptive re-chunking work landing under #28 / #171 / #148 ships threshold defaults calibrated specifically for BioLORD-class biomedical encoders:

  • chunk_retrieval_threshold = 0.7
  • aggregated_term_confidence = 0.75
  • adaptive_rechunking.quality_threshold = 0.55
  • adaptive_rechunking.margin_threshold = 0.03

Users who switch to a different retrieval_model see different similarity-score distributions, so the defaults above no longer fit and the thresholds need retuning. Today the user guide tells them to "retune your thresholds" without telling them how. This subcommand closes that gap.

Proposed Behavior

phentrieve config calibrate-thresholds
phentrieve config calibrate-thresholds --benchmark-fixture tests/data/benchmarks/german/tiny_v1.json
phentrieve config calibrate-thresholds --model FremyCompany/BioLORD-2023-M --output suggested-profile.yaml
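The three invocations above imply three optional flags. A minimal sketch of that option surface follows, using the standard-library argparse rather than Typer (the issue asks for a Typer command, so this is a dependency-free stand-in, not the proposed implementation). The help texts and the choice of None as "use the default" are assumptions.

```python
import argparse


def build_parser():
    """Option surface implied by the example invocations (argparse stand-in for Typer)."""
    parser = argparse.ArgumentParser(prog="phentrieve config calibrate-thresholds")
    # None means "use the configured retrieval model" (assumed convention).
    parser.add_argument("--model", default=None, help="retrieval model to calibrate")
    # None means "use the small bundled fixture" (assumed convention).
    parser.add_argument("--benchmark-fixture", default=None, help="path to a benchmark fixture")
    parser.add_argument("--output", default=None, help="file to receive the suggested profile")
    return parser


# Parse the third example invocation from an explicit argument list.
args = build_parser().parse_args(
    ["--model", "FremyCompany/BioLORD-2023-M", "--output", "suggested-profile.yaml"]
)
print(args.model, args.benchmark_fixture, args.output)
```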

The command:

  1. Loads the configured retrieval model (or the one passed via --model).
  2. Runs it against a fixture benchmark (default: a small bundled set; user can pass --benchmark-fixture).
  3. Computes per-encoder distributions of: top-1 similarity, top-1 minus top-2 margin, and aggregated-term confidence on hits and misses.
  4. Suggests calibrated values for each threshold, written to stdout (or to the file given by --output) as a YAML profile snippet that can be pasted into phentrieve.yaml.
  5. Reports the score distribution stats so the user understands where the suggestions came from.
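Steps 3 to 5 could be sketched roughly as below. The issue does not say how suggestions are derived, so the midpoint-between-medians rule here is a hypothetical placeholder, as are the record shape (top1, top2, hit), the profile name "calibrated", and the YAML nesting. The scores are made up for illustration and are not measurements from any model.

```python
from statistics import median, quantiles


def summarize(scores):
    """Summary statistics for one score distribution (needs at least two values)."""
    cuts = quantiles(scores, n=20, method="inclusive")  # 19 cut points: 5%, 10%, ..., 95%
    return {
        "min": min(scores),
        "p05": cuts[0],
        "median": median(scores),
        "p95": cuts[-1],
        "max": max(scores),
    }


def suggest_threshold(hit_scores, miss_scores):
    """Hypothetical rule: midpoint between the median hit and the median miss."""
    return round((median(hit_scores) + median(miss_scores)) / 2, 2)


def render_profile(profile_name, quality_threshold, margin_threshold):
    """Render a YAML snippet by hand; the nesting shown here is an assumed layout."""
    return (
        "profiles:\n"
        f"  {profile_name}:\n"
        "    adaptive_rechunking:\n"
        f"      quality_threshold: {quality_threshold}\n"
        f"      margin_threshold: {margin_threshold}\n"
    )


# Made-up scores for illustration only; they are not measurements from any model.
records = [
    {"top1": 0.82, "top2": 0.70, "hit": True},
    {"top1": 0.78, "top2": 0.71, "hit": True},
    {"top1": 0.74, "top2": 0.69, "hit": True},
    {"top1": 0.61, "top2": 0.60, "hit": False},
    {"top1": 0.58, "top2": 0.56, "hit": False},
    {"top1": 0.55, "top2": 0.54, "hit": False},
]

hits = [r for r in records if r["hit"]]
misses = [r for r in records if not r["hit"]]

# Top-1 similarity separates hits from misses; so does the top-1 minus top-2 margin.
quality = suggest_threshold([r["top1"] for r in hits], [r["top1"] for r in misses])
margin = suggest_threshold(
    [r["top1"] - r["top2"] for r in hits],
    [r["top1"] - r["top2"] for r in misses],
)

print(render_profile("calibrated", quality, margin))
print("top-1 similarity on hits:", summarize([r["top1"] for r in hits]))
```

Printing the summary next to the snippet mirrors step 5: the user sees both the suggested values and the distribution they were derived from. A real implementation might prefer a percentile of the hit distribution or a precision target over a plain midpoint.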

Acceptance Criteria

  • New phentrieve config calibrate-thresholds Typer command in phentrieve/cli/config_commands.py.
  • Outputs a YAML snippet ready to paste into the profiles: section of phentrieve.yaml.
  • Documented in docs/user-guide/configuration-profiles.md and docs/user-guide/adaptive-rechunking.md (cross-referenced).
  • Smoke test on the German tiny benchmark fixture.
  • Exit non-zero with a clear error if the configured model can't be loaded or the fixture is missing.
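The last criterion (non-zero exit with a clear error) might look like the sketch below. It uses plain functions and the standard library so it runs without Typer; in a Typer command the returned code would typically be raised via typer.Exit(code=...). CalibrationError, load_fixture, run, and the load_model callable are all hypothetical names, and the JSON fixture format is assumed from the .json extension in the example invocation.

```python
import json
import sys
from pathlib import Path


class CalibrationError(Exception):
    """User-facing failure that should end the command with a non-zero exit code."""


def load_fixture(path):
    """Read a JSON benchmark fixture, raising CalibrationError on any problem."""
    fixture = Path(path)
    if not fixture.is_file():
        raise CalibrationError(f"benchmark fixture not found: {fixture}")
    try:
        return json.loads(fixture.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        raise CalibrationError(f"benchmark fixture is not valid JSON: {fixture}") from exc


def run(fixture_path, load_model):
    """Return a process exit code: 0 on success, 1 after printing a clear error.

    load_model is a hypothetical callable that raises CalibrationError
    when the configured model cannot be loaded.
    """
    try:
        load_model()
        load_fixture(fixture_path)
    except CalibrationError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0


if __name__ == "__main__":
    # A missing fixture yields exit code 1 and a one-line message on stderr.
    sys.exit(run("tests/data/benchmarks/german/tiny_v1.json", lambda: None))
```

Funnelling both failure modes through one exception type keeps the error message format and the exit code in a single place, which also makes the smoke test straightforward to write.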

References

  • Spec A: .planning/specs/2026-04-25-cli-profiles-default-resolution-spec.md, "Future work" section
  • Spec B: .planning/specs/2026-04-25-adaptive-rechunking-spec.md, "Future work" section
  • Encoder calibration warning in docs/user-guide/adaptive-rechunking.md (added by Plan B)

Metadata

Labels: enhancement (New feature or request)
