Summary
Add phentrieve config calibrate-thresholds to help users tune quality_threshold and margin_threshold after switching the retrieval embedding model.
Why
The CLI profiles + adaptive re-chunking work landing under #28 / #171 / #148 ships threshold defaults calibrated specifically for BioLORD-class biomedical encoders:
chunk_retrieval_threshold = 0.7
aggregated_term_confidence = 0.75
adaptive_rechunking.quality_threshold = 0.55
adaptive_rechunking.margin_threshold = 0.03
Users who switch to a different retrieval_model see different similarity-score distributions, so these defaults no longer fit and need to be retuned. Today the user guide tells them to "retune your thresholds" without explaining how. This subcommand closes that gap.
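To make the problem concrete, here is a minimal sketch of how a fixed threshold behaves under a shifted score distribution. It assumes, purely for illustration, that re-chunking is triggered when a chunk's top-1 similarity falls below quality_threshold. The score lists are invented numbers, not measurements from any real encoder, and fraction_below is a hypothetical helper.

```python
# Illustrative only: the scores are invented, and the trigger rule
# (top-1 similarity < quality_threshold) is an assumption about how
# the threshold is applied.
QUALITY_THRESHOLD = 0.55  # default listed above


def fraction_below(scores, threshold):
    """Return the share of scores that fall strictly below the threshold."""
    return sum(1 for s in scores if s < threshold) / len(scores)


# Hypothetical top-1 similarities for the same queries under two encoders.
encoder_a_top1 = [0.82, 0.74, 0.61, 0.58, 0.49]
encoder_b_top1 = [0.52, 0.47, 0.41, 0.38, 0.33]  # lower, compressed range

print(fraction_below(encoder_a_top1, QUALITY_THRESHOLD))  # 0.2
print(fraction_below(encoder_b_top1, QUALITY_THRESHOLD))  # 1.0
```

With these made-up numbers, the same threshold flags one query in five for the first encoder and every query for the second, which is the kind of mismatch a calibration step would need to detect.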
Proposed Behavior
phentrieve config calibrate-thresholds
phentrieve config calibrate-thresholds --benchmark-fixture tests/data/benchmarks/german/tiny_v1.json
phentrieve config calibrate-thresholds --model FremyCompany/BioLORD-2023-M --output suggested-profile.yaml
The command:
- Loads the configured retrieval model (or the one passed via --model).
- Runs it against a fixture benchmark (default: a small bundled set; users can pass --benchmark-fixture).
- Computes per-encoder distributions of: top-1 similarity, top-1 minus top-2 margin, and aggregated-term confidence on hits and misses.
- Suggests calibrated values for each threshold, written to stdout as a YAML profile snippet that can be pasted into phentrieve.yaml.
- Reports the score distribution stats so the user understands where the suggestions came from.
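The calibration steps above could be sketched roughly as follows. This is not the specified algorithm: the midpoint-of-medians heuristic, the shape of the results records, and the helper names are all assumptions made for illustration, and the fixture numbers are invented. The nesting of the YAML keys is inferred from the dotted names adaptive_rechunking.quality_threshold and adaptive_rechunking.margin_threshold. Aggregated-term confidence is left out to keep the sketch short.

```python
import statistics

# Hypothetical fixture results: ranked similarity scores per query, plus
# whether the top-1 candidate was the correct term. Numbers are invented.
results = [
    {"scores": [0.81, 0.70, 0.66], "hit": True},
    {"scores": [0.77, 0.69, 0.60], "hit": True},
    {"scores": [0.73, 0.68, 0.59], "hit": True},
    {"scores": [0.52, 0.51, 0.47], "hit": False},
    {"scores": [0.48, 0.46, 0.45], "hit": False},
    {"scores": [0.44, 0.43, 0.40], "hit": False},
]


def top1(result):
    """Top-1 similarity of a single query."""
    return result["scores"][0]


def margin(result):
    """Top-1 minus top-2 similarity of a single query."""
    return result["scores"][0] - result["scores"][1]


def split_by_hit(results, metric):
    """Apply metric to each result and group the values into hits and misses."""
    hits = [metric(r) for r in results if r["hit"]]
    misses = [metric(r) for r in results if not r["hit"]]
    return hits, misses


def midpoint_of_medians(hits, misses):
    """Assumed heuristic: halfway between the median hit and median miss value."""
    return (statistics.median(hits) + statistics.median(misses)) / 2


quality = midpoint_of_medians(*split_by_hit(results, top1))
margin_t = midpoint_of_medians(*split_by_hit(results, margin))

# Emit a paste-ready snippet; the key nesting is an assumption.
snippet = (
    "adaptive_rechunking:\n"
    f"  quality_threshold: {quality:.3f}\n"
    f"  margin_threshold: {margin_t:.3f}\n"
)
print(snippet)
```

A real implementation would likely report fuller distribution statistics (for example quantiles for hits and misses) alongside the suggestions, so users can judge how well separated the two groups are before adopting the values.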
Acceptance Criteria
- phentrieve config calibrate-thresholds Typer command in phentrieve/cli/config_commands.py.
- Output can be pasted into the phentrieve.yaml profiles: section.
- Documented in docs/user-guide/configuration-profiles.md and docs/user-guide/adaptive-rechunking.md (cross-referenced).
References
- Spec A: .planning/specs/2026-04-25-cli-profiles-default-resolution-spec.md (Future work)
- Spec B: .planning/specs/2026-04-25-adaptive-rechunking-spec.md (Future work)
- Encoder calibration warning in docs/user-guide/adaptive-rechunking.md (added by Plan B)