Bug: Reporter with internal spaces not recognized ("U. S." vs "U.S.") #15

@medelman17

Problem

Reporter abbreviations with internal spaces (common in SCOTUS opinions) are not matched by case patterns.

"506 U.S. 534"    → case citation found ✓  (reporter: "U.S.")
"506 U. S. 534"   → case citation found ✗  (reporter: undefined)

The spaced form "U. S." is the style the Supreme Court itself uses when citing the United States Reports in its opinions, so it appears throughout SCOTUS text.

Root Cause

The supreme-court pattern hard-codes U\.S\. and therefore does not allow an optional space between "U." and "S.":

regex: /\b(\d+)\s+(U\.S\.|S\.\s?Ct\.|L\.\s?Ed\.(?:\s?2d)?)\s+(\d+)\b/g
//                  ^^^^^ no optional space between "U." and "S."

Note: the S. Ct. alternative in the same pattern already allows an optional space (S\.\s?Ct\.), so U.S. should too.
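
To make the root cause concrete, here is a minimal reproduction sketch. The regex is copied from the pattern quoted above; the helper `firstReporter` is hypothetical and exists only for this illustration, it is not part of the library's API.

```typescript
// Pattern copied verbatim from the issue text.
const currentPattern = /\b(\d+)\s+(U\.S\.|S\.\s?Ct\.|L\.\s?Ed\.(?:\s?2d)?)\s+(\d+)\b/g;

// Hypothetical helper: returns the reporter captured by the first match,
// or undefined when the text does not match at all.
function firstReporter(pattern: RegExp, text: string): string | undefined {
  // Build a fresh copy so the /g flag's lastIndex state cannot leak between calls.
  const fresh = new RegExp(pattern.source, pattern.flags);
  const match = fresh.exec(text);
  return match ? match[2] : undefined;
}

console.log(firstReporter(currentPattern, "506 U.S. 534"));    // "U.S."
console.log(firstReporter(currentPattern, "506 U. S. 534"));   // undefined
console.log(firstReporter(currentPattern, "113 S. Ct. 1178")); // "S. Ct."
```

The third call shows that the spaced form already works for S. Ct., which isolates the problem to the U\.S\. alternative.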

Fix

Change U\.S\. to U\.\s?S\. in the supreme-court pattern.
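
A sketch of the proposed change, assuming the rest of the pattern stays exactly as quoted in this issue. The names `fixedPattern` and `reporters` are illustrative only and do not come from the codebase.

```typescript
// Same pattern as in the issue, with U\.S\. replaced by U\.\s?S\.
const fixedPattern = /\b(\d+)\s+(U\.\s?S\.|S\.\s?Ct\.|L\.\s?Ed\.(?:\s?2d)?)\s+(\d+)\b/g;

// Hypothetical helper: collects the reporter string from every match in the text.
function reporters(text: string): string[] {
  // Fresh copy so repeated calls never start from a stale lastIndex.
  const pattern = new RegExp(fixedPattern.source, fixedPattern.flags);
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    found.push(match[2]);
  }
  return found;
}

console.log(reporters("See 506 U.S. 534 and 506 U. S. 534."));
// [ "U.S.", "U. S." ]
```

The captured reporter keeps its original spacing, so any downstream lookup keyed on "U.S." may still need to normalize "U. S." first. Whether such a lookup exists is an assumption here, not something this issue confirms.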

Also audit the other reporter patterns for the same class of issue (L. Ed., F. Supp., etc.).
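
One possible way to run that audit is to feed each pattern both the compact and the spaced form of every reporter it should cover and list the inputs that fail. The sketch below does this for the supreme-court pattern quoted in this issue; the helper `unmatchedSamples` and the sample strings are hypothetical, and the volume and page numbers are placeholders rather than checked citations.

```typescript
// Pattern copied verbatim from the issue text (before the fix).
const supremeCourtPattern = /\b(\d+)\s+(U\.S\.|S\.\s?Ct\.|L\.\s?Ed\.(?:\s?2d)?)\s+(\d+)\b/g;

// Placeholder inputs: compact and spaced spellings of each reporter.
const samples: string[] = [
  "506 U.S. 534",
  "506 U. S. 534",
  "113 S.Ct. 1178",
  "113 S. Ct. 1178",
  "122 L.Ed.2d 517",
  "122 L. Ed. 2d 517",
];

// Hypothetical helper: returns every input the pattern fails to match.
function unmatchedSamples(pattern: RegExp, inputs: string[]): string[] {
  return inputs.filter((text) => {
    // Fresh copy per input so the /g flag's lastIndex never carries over.
    const fresh = new RegExp(pattern.source, pattern.flags);
    return !fresh.test(text);
  });
}

console.log(unmatchedSamples(supremeCourtPattern, samples));
// [ "506 U. S. 534" ]
```

The same helper could be pointed at other patterns, such as the one covering F. Supp., once sample strings for those reporters are added.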

Upstream Reference

Python eyecite #147 — related reporter normalization issue.
