
feat: add --length_normalize to correct short-document bias #191

Open
darkness8i8 wants to merge 3 commits into EleutherAI:main from darkness8i8:length-normalize

Conversation

@darkness8i8
Contributor

Summary

  • Adds --length_normalize flag to PreprocessConfig that scales each document's attribution score by sqrt(num_tokens) at score time
  • Corrects short-document bias under mean loss reduction without discarding gradient magnitude entirely (as --unit_normalize does)
  • Composes cleanly with existing flags (unit_normalize, preconditioning, attribute_tokens)

Motivation

Under mean loss reduction, shorter sequences produce larger-magnitude gradients, causing them to consistently rank highest in attribution scores. --unit_normalize (cosine similarity) overcorrects by removing magnitude entirely, which introduces its own bias toward short, focused documents. --length_normalize provides a middle ground: score * sqrt(num_tokens) partially compensates for the inverse-length dependence without erasing magnitude information.
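The inverse-length dependence can be illustrated numerically (this is an illustrative sketch, not code from the PR; the gradient dimension and the independence assumption on per-token gradients are simplifications). Under mean loss reduction the document gradient is the average of per-token gradients, so if those are roughly independent its norm shrinks like 1/sqrt(n), and multiplying by sqrt(n) roughly flattens the length dependence:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 512  # hypothetical gradient dimension

for n in (16, 256, 4096):  # document lengths in tokens
    token_grads = rng.normal(size=(n, dim))  # simulated per-token gradients
    doc_grad = token_grads.mean(axis=0)      # mean loss reduction
    raw = np.linalg.norm(doc_grad)           # raw attribution magnitude
    # raw decays ~ 1/sqrt(n); raw * sqrt(n) stays roughly constant
    print(f"n={n:5d}  raw={raw:.3f}  length-normalized={raw * np.sqrt(n):.3f}")
```

The raw column shrinks as documents get longer, while the length-normalized column stays roughly constant, which is the behavior the flag targets.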

Changes

| File | Change |
| --- | --- |
| bergson/config.py | Add `length_normalize: bool = False` to `PreprocessConfig` |
| bergson/score/scorer.py | Accept `length_normalize` + `num_token_grads`; apply `sqrt(n)` scaling in `__call__` |
| bergson/score/score.py | Compute token counts via `compute_num_token_grads()` and pass to `Scorer` |
| tests/test_score.py | Test correctness (`raw_score * sqrt(n)`) and validation (`ValueError` without token counts) |
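A minimal sketch of the scaling step described above (function name, signature, and use of numpy are assumptions for illustration; the PR's actual implementation lives in `bergson/score/scorer.py` and presumably operates on torch tensors):

```python
from typing import Optional

import numpy as np


def apply_length_normalize(
    scores: np.ndarray, num_token_grads: Optional[np.ndarray]
) -> np.ndarray:
    """Scale each document's raw attribution score by sqrt(num_tokens)."""
    if num_token_grads is None:
        # Mirrors the validation behavior described in the test plan:
        # length normalization is meaningless without per-document token counts.
        raise ValueError("length_normalize requires per-document token counts")
    return scores * np.sqrt(num_token_grads)
```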

Test plan

  • pytest tests/test_score.py::test_scorer_length_normalize — verifies scores equal raw_score * sqrt(num_tokens)
  • pytest tests/test_score.py::test_scorer_length_normalize_requires_token_counts — verifies ValueError when num_token_grads is missing
  • Existing tests pass (no changes to default behavior since length_normalize=False by default)
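The correctness property the first test checks reduces to a one-line identity (hedged sketch; the actual test code in tests/test_score.py may differ, and the values here are hypothetical):

```python
import math

raw_score = 0.5   # hypothetical raw attribution score
num_tokens = 64   # hypothetical document length

# length-normalized score is defined as raw_score * sqrt(num_tokens)
normalized = raw_score * math.sqrt(num_tokens)
assert normalized == 0.5 * 8.0  # sqrt(64) == 8
```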

🤖 Generated with Claude Code

darkness8i8 and others added 3 commits March 11, 2026 14:56
Under mean loss reduction, shorter sequences produce larger-magnitude
gradients, causing them to dominate attribution scores. The existing
--unit_normalize flag overcorrects by removing magnitude entirely
(cosine similarity). This adds a middle ground: --length_normalize
scales each document's score by sqrt(num_tokens), dampening the
short-doc advantage without erasing magnitude information.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>