Evaluate broader contributor-tier calibration beyond immediate mislabeling bugfix #78

## Summary

Follow-up from planning M042.

We should evaluate whether Kodiai's contributor expertise/tier system needs broader calibration beyond the immediate CrystalP mislabeling bugfix. The current M042 scope is narrower: make stored tiers truthful, make the review surface correct, keep cache and fallback paths consistent, and fix the real repro on xbmc/xbmc#28132.

This issue tracks the larger question we are intentionally leaving out of that bugfix milestone.

## Why this is separate

The immediate bug appears to be a correctness problem in how stored contributor tiers and review-time classification interact. That can likely be fixed without reopening the whole scoring model.

A broader calibration pass is different work:
- sampling a wider set of contributors across the repo
- checking whether score decay, weights, or percentile thresholds match reality
- deciding whether the current tier bands are still the right shape
- validating the model against more than one obvious bad output

That is useful, but it should not block the focused correctness fix.
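
To make the knobs in that list concrete, here is a minimal sketch of how weights, score decay, and percentile thresholds could combine into a tier. It is not Kodiai's actual implementation: the signal names come from this issue, but the weight values, the half-life, the cut points, the tier names, and every function name are assumptions chosen for illustration.

```python
from bisect import bisect_right

# Signal names are from this issue; the numeric values are assumed.
WEIGHTS = {"commit": 1.0, "pr_review": 2.0, "pr_authored": 3.0}
HALF_LIFE_DAYS = 180.0  # assumed: an event counts half as much every 180 days
# Assumed percentile cut points separating four tiers, lowest to highest.
PERCENTILE_CUTS = [0.25, 0.50, 0.80]
TIERS = ["tier_1", "tier_2", "tier_3", "tier_4"]  # placeholder names


def decayed_score(events):
    """Sum weighted events, halving each contribution per half-life.

    events: iterable of (signal, age_days) pairs.
    """
    return sum(
        WEIGHTS[signal] * 0.5 ** (age_days / HALF_LIFE_DAYS)
        for signal, age_days in events
    )


def percentile_rank(score, all_scores):
    """Fraction of contributors whose score is <= the given score."""
    ordered = sorted(all_scores)
    return bisect_right(ordered, score) / len(ordered)


def tier_for(score, all_scores):
    """Map a score to a tier by its percentile rank in the repo.

    A rank exactly on a cut point lands in the higher tier.
    """
    rank = percentile_rank(score, all_scores)
    return TIERS[bisect_right(PERCENTILE_CUTS, rank)]
```

A calibration pass would vary `WEIGHTS`, `HALF_LIFE_DAYS`, and `PERCENTILE_CUTS` and check the resulting tiers against sampled contributors. The M042 bugfix should not need to touch any of them.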

## Questions to answer

- Are the current expertise weights (`commit`, `pr_review`, `pr_authored`) still the right relative signals?
- Is percentile-based tiering the right mechanism for this repo's contributor distribution?
- Should tier recalculation happen on a schedule, on every meaningful update, or both?
- Do the current 4 stored tiers map well to the review-surface tone behavior?
- What real contributor samples should be used as calibration fixtures?
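
One way to turn the fixture question into something checkable is a small harness that compares human-judged tiers against whatever the model returns. This is a sketch under assumptions: `Fixture`, `find_mismatches`, the tier names, and the contributor logins are all hypothetical, and `classify` stands in for whichever lookup ends up being the source of truth.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Fixture:
    """A sampled contributor with the tier a human reviewer expects."""
    login: str
    expected_tier: str
    note: str = ""  # why this contributor was picked as a sample


def find_mismatches(
    fixtures: List[Fixture],
    classify: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Return (login, expected, actual) for every fixture the model gets wrong."""
    mismatches = []
    for fixture in fixtures:
        actual = classify(fixture.login)
        if actual != fixture.expected_tier:
            mismatches.append((fixture.login, fixture.expected_tier, actual))
    return mismatches


# Usage with placeholder data; real fixtures would come from sampled contributors.
fixtures = [
    Fixture("contributor_a", "tier_4", note="long-term maintainer"),
    Fixture("contributor_b", "tier_1", note="first-time contributor"),
]
stub_model = {"contributor_a": "tier_2", "contributor_b": "tier_1"}
print(find_mismatches(fixtures, stub_model.__getitem__))
```

An empty result on a reasonably wide sample would suggest the model is structurally sound; a cluster of mismatches in one direction would point at weights or thresholds rather than a one-off bug.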

## Out of scope for M042

- redesigning the contributor scoring model from scratch
- repo-wide threshold retuning unless required for the immediate correctness fix
- expanding the milestone from bugfix to product redesign

## Done looks like

- we have concrete sample contributors to test against
- we understand whether the current model is structurally sound or just operationally buggy
- if recalibration is needed, it is split into its own milestone/slice with proof criteria

