Measure bracket survival on real PDF→markdown before committing the hard scope filter
Version: eyecite-ts@0.28.1
Summary
The scope-before-recency design treats parenthetical scope as a hard candidate filter, which assumes the bracket structure of (quoting …) parentheticals survives extraction. If ( / ) are frequently dropped or garbled by PDF→markdown, abstains would be driven by parse failure rather than true ambiguity, and a hard filter could hide correct antecedents.
This is the single highest-priority unknown before committing the hard filter.
Measure
- Bracket survival — on a real corpus sample, how often does an inner (quoting …) cite retain intact, balanced delimiters after extraction? Break down by source (native-text vs OCR'd PDF).
- Error split — of observed Id. / supra misattributions, what fraction are scope errors (wrong candidate set) vs intra-scope salience errors (right set, wrong rank)? This gates whether a learned ranker is ever needed beyond a thin deterministic scorer.
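One way the bracket-survival measurement could be sketched is below. Everything here is an assumption for illustration: the `Sample` shape, the `native` / `ocr` labels, and both function names are hypothetical and not part of eyecite-ts. The check only asks whether the first `(quoting` opener is eventually closed by a matching `)`, which is a rough proxy for "intact, balanced delimiters".

```typescript
// Hypothetical sample shape for the measurement; not an eyecite-ts type.
type Source = "native" | "ocr";

interface Sample {
  source: Source;
  // Extracted markdown span that should contain a "(quoting ...)" parenthetical.
  extracted: string;
}

// True when the first "(quoting" opener is closed by a matching ")".
function quotingParenIsBalanced(text: string): boolean {
  const start = text.search(/\(\s*quoting\b/i);
  if (start === -1) return false; // the opener itself was dropped or garbled
  let depth = 0;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (ch === "(") {
      depth++;
    } else if (ch === ")") {
      depth--;
      if (depth === 0) return true; // the quoting parenthetical closed cleanly
    }
  }
  return false; // ran out of text with the parenthetical still open
}

// Fraction of samples per source whose quoting parenthetical survived.
function survivalBySource(samples: Sample[]): Record<Source, number> {
  const totals: Record<Source, number> = { native: 0, ocr: 0 };
  const intact: Record<Source, number> = { native: 0, ocr: 0 };
  for (const s of samples) {
    totals[s.source]++;
    if (quotingParenIsBalanced(s.extracted)) intact[s.source]++;
  }
  return {
    native: totals.native ? intact.native / totals.native : NaN,
    ocr: totals.ocr ? intact.ocr / totals.ocr : NaN,
  };
}
```

A real run would need hand-labelled samples where the source PDF is known to contain a quoting parenthetical, so that a missing opener counts as a survival failure rather than a true negative.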
Outcome
Decides (a) hard vs. degrade-to-soft-on-balance-failure for the scope filter, and (b) whether to invest in stage-3 ranking at all.
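For reference, the degrade-to-soft option in (a) might look roughly like the sketch below. The `Candidate` shape, the `inScope` flag, and `filterByScope` are hypothetical names used only to illustrate the two modes; they do not describe the current eyecite-ts implementation.

```typescript
// Hypothetical candidate shape; not an eyecite-ts type.
interface Candidate {
  id: string;
  inScope: boolean; // whether the antecedent falls inside the parsed parenthetical scope
}

type FilterMode = "hard" | "soft";

// Hard filter when brackets parsed cleanly; otherwise keep every candidate
// and only prefer the in-scope ones, so a parse failure cannot hide an antecedent.
function filterByScope(
  candidates: Candidate[],
  bracketsBalanced: boolean,
): { mode: FilterMode; kept: Candidate[] } {
  if (bracketsBalanced) {
    return { mode: "hard", kept: candidates.filter((c) => c.inScope) };
  }
  // Stable sort: in-scope candidates first, original order otherwise preserved.
  const kept = [...candidates].sort(
    (a, b) => Number(b.inScope) - Number(a.inScope),
  );
  return { mode: "soft", kept };
}
```

If measured bracket survival turns out to be high for both source types, the soft branch would rarely fire and the plain hard filter may be enough.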
Related
- Design backbone: docs/research/2026-06-02-shortform-resolution-02-scope-and-binding.md, …-03-bracket-parsing-error-recovery.md.