feat(locate): additive promotion-only cross-encoder blend (M1, FIR-987) #39
Open
troyjr4103 wants to merge 1 commit into
The legacy cross-encoder rerank overwrites the fused RRF score with the raw unbounded logit and re-sorts the top-K window purely by it. That is substitutive: it discards fusion corroboration and evicts multi-signal golds whenever the encoder prefers a single-signal filler. This is the verified cause of every prior rerank-lever regression (the gap is rank, not recall).
New KIN_LOCATE_RERANK_BLEND (default OFF) makes the rerank additive instead: sigmoid(logit) -> spread-normalize over the window -> confidence-gate (skip when spread < KIN_LOCATE_RERANK_MIN_SPREAD, default 0.10) -> ADD w*norm (KIN_LOCATE_RERANK_BLEND_WEIGHT default 0.04) to the fused score. The term is bounded and non-negative, so every candidate's score is monotonically non-decreasing: a corroborated gold's score can only move up, never below its fused floor. This rescues near-ties without evicting comfortable wins. The legacy overwrite is preserved as the else-branch (the A/B comparison arm).
Gated OFF; must clear the 27-task aggregate F1 gate (raw up, strict/line/symbol not regressed, n>=3) before any default flip.
Signed-off-by: Troy Fortin <troy@firelock.io>
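The pipeline described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code in locate.rs: the `BlendParams` struct, the `blend_rerank` name, the `(usize, f64)` candidate shape, and the reading of "spread-normalize" as min-max scaling over the window are all assumptions made for the example.

```rust
/// Hypothetical parameter bundle; the field names are illustrative.
#[derive(Clone, Copy)]
pub struct BlendParams {
    /// Stands in for KIN_LOCATE_RERANK_BLEND_WEIGHT (default 0.04).
    pub weight: f64,
    /// Stands in for KIN_LOCATE_RERANK_MIN_SPREAD (default 0.10).
    pub min_spread: f64,
}

/// Logistic squash: maps any real logit into (0, 1).
fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

/// Adds a bounded, non-negative bonus to each fused score in the window,
/// then re-sorts the window by the blended score (descending).
/// `window` holds (candidate id, fused score); `logits[i]` is the
/// cross-encoder logit for `window[i]`.
/// Returns false, leaving the window untouched, when the confidence gate skips.
pub fn blend_rerank(window: &mut [(usize, f64)], logits: &[f64], p: BlendParams) -> bool {
    assert_eq!(window.len(), logits.len());
    if window.is_empty() {
        return false;
    }
    let probs: Vec<f64> = logits.iter().map(|&l| sigmoid(l)).collect();
    let lo = probs.iter().copied().fold(f64::INFINITY, f64::min);
    let hi = probs.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let spread = hi - lo;
    // Confidence gate: a flat encoder response carries no usable signal.
    // The `<= 0.0` check also guards the division below.
    if spread < p.min_spread || spread <= 0.0 {
        return false;
    }
    for (cand, prob) in window.iter_mut().zip(&probs) {
        // Assumed min-max normalization: norm lies in [0, 1].
        let norm = (prob - lo) / spread;
        // Additive, never an overwrite: the bonus lies in [0, weight].
        cand.1 += p.weight * norm;
    }
    // Stable sort, so exact ties keep their fused order.
    window.sort_by(|a, b| b.1.total_cmp(&a.1));
    true
}
```

Under these assumptions the bonus never exceeds `weight`, so a candidate can pass a neighbor only when that neighbor's fused lead is smaller than `weight`, and no score ever drops below its fused value.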
Why every prior reranker regressed (code-verified)
locate.rs:2376-2389 does candidates[i].1 = score: it overwrites the fused RRF score with the raw, unbounded cross-encoder logit (no sigmoid; kin-infer emits logits[[b,0]] verbatim) and re-sorts the top-K window purely by it. This annihilates the fusion corroboration inside the window and operates on an incommensurable scale. With a 5/6-hit, avg-rank-~3.4 starting order, a substitutive re-sort can only shuffle slots: it evicts a multi-signal-corroborated gold whenever the encoder prefers a single-signal filler. The gap is rank, not recall, so the lever must promote without evicting.
The fix: additive, promotion-only, confidence-gated
New KIN_LOCATE_RERANK_BLEND (default OFF):
1. sigmoid(logit) → bounded (0,1) (kills the unbounded/negative-logit incoherence)
2. spread-normalize over the window
3. confidence gate: skip when spread < KIN_LOCATE_RERANK_MIN_SPREAD (default 0.10)
4. candidates[i].1 += w * norm (KIN_LOCATE_RERANK_BLEND_WEIGHT default 0.04, RRF-commensurate with cross_bonus/one rank-step), never overwrite
Because the term is bounded and non-negative, every score is monotonically non-decreasing: a corroborated gold's score can only move up, never below its fused floor. The induced reorder is a bounded perturbation of the fusion order (a candidate can only overtake a neighbor whose fused lead is < w), so strong wins are immovable and only near-ties get rescued. The legacy overwrite is preserved as the else branch, which serves as the A/B comparison arm.
Gate (do not merge-flip until cleared)
Default-OFF and inert until flagged. Must clear the 27-task aggregate F1 gate, n≥3 over the 3 suites (cli/sympy/nlohmann), with raw F1 up AND strict/line/symbol not regressed. Strict-F1 is the canary for over-promotion (how the de-cliff arm was rejected). Sweep KIN_LOCATE_RERANK_BLEND_WEIGHT ∈ {0.02, 0.04, 0.08} and accept the largest W that improves raw without regressing strict. Compiles clean (kin-cli + kin-daemon).
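The acceptance rule for the sweep can be written down as a small selection function. This is only a sketch of the rule as stated above: the `F1` struct and `pick_weight` are hypothetical names, and the function assumes each metric has already been aggregated over the 27 tasks and the n≥3 runs before it is passed in.

```rust
/// Hypothetical aggregate metrics for one arm of the sweep.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F1 {
    pub raw: f64,
    pub strict: f64,
    pub line: f64,
    pub symbol: f64,
}

/// Returns the largest swept weight whose raw F1 beats the baseline while
/// strict, line, and symbol F1 do not regress; None if no arm clears the gate.
pub fn pick_weight(baseline: F1, sweep: &[(f64, F1)]) -> Option<f64> {
    sweep
        .iter()
        .filter(|(_, m)| {
            m.raw > baseline.raw
                && m.strict >= baseline.strict
                && m.line >= baseline.line
                && m.symbol >= baseline.symbol
        })
        .map(|(w, _)| *w)
        // Keep the largest passing weight seen so far.
        .fold(None, |best: Option<f64>, w| Some(best.map_or(w, |b| b.max(w))))
}
```

Returning `None` when nothing passes corresponds to leaving the flag OFF. An arm that lifts raw F1 but lowers strict F1 is rejected, matching the over-promotion canary described above.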