Bug: Year and court not extracted from parentheticals on full case citations #22

@medelman17

Description

Full case citations return year: undefined and court: undefined (or "N/A") even when the year and court are clearly present in a trailing parenthetical. The extraction captures volume, reporter, and page correctly but doesn't parse the (Court Year) parenthetical that follows.

Reproduction

import { extractCitations } from 'eyecite-ts'

const text = 'See Texas v. Johnson, 491 U.S. 397, 404 (1989).'
const citations = extractCitations(text)

console.log(citations[0])
// {
//   type: 'case',
//   volume: 491,
//   reporter: 'U.S.',
//   page: 397,
//   year: undefined,    // Expected: 1989
//   court: undefined,   // Expected: 'scotus' (inferred from U.S. Reports)
// }

Additional failing examples from SCOTUS text:

| Input | Expected Year | Actual Year | Expected Court | Actual Court |
| --- | --- | --- | --- | --- |
| 491 U.S. 397, 404 (1989) | 1989 | undefined | scotus | undefined |
| 418 U.S. 405, 409 (1974) | 1974 | undefined | scotus | undefined |
| 468 U.S. 288, 294 (1984) | 1984 | undefined | scotus | undefined |
| 391 U.S. 367, 376 (1968) | 1968 | undefined | scotus | undefined |

Context

The Python eyecite library extracts year and court from the trailing parenthetical as core metadata on FullCaseCitation objects. This is critical for:

  1. Disambiguation: when the same volume/reporter/page appears in multiple series, the year helps identify the correct case
  2. Temporal analysis: citation network analysis depends on knowing when cited cases were decided
  3. Reporter validation: the year can be cross-referenced against reporter date ranges to validate citation accuracy
  4. Court hierarchy: knowing which court decided the cited case matters for precedential weight analysis

Related Issues

Expected Behavior

The parenthetical following a case citation should be parsed for:

  • Year: 4-digit number, typically at the end: (1989), (9th Cir. 2020)
  • Court: Court abbreviation before the year: (9th Cir. 2020), (D. Mass. 2019), (S.D.N.Y. 2021)
    • For Supreme Court reporters (U.S., S. Ct., L. Ed.), court should be inferred as scotus even when not explicitly stated in the parenthetical
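The rules above could be sketched roughly as follows. This is only an illustration of the expected behavior, not the library's implementation: `parseParenthetical`, `ParentheticalInfo`, and `SCOTUS_REPORTERS` are hypothetical names that do not exist in eyecite-ts, the function assumes it receives the text immediately following the page number, and it returns the raw court abbreviation rather than a normalized court identifier.

```typescript
// Hypothetical helper for illustration; not part of the eyecite-ts API.
interface ParentheticalInfo {
  year?: number
  court?: string
}

// Reporters that imply the U.S. Supreme Court (list taken from the issue text).
const SCOTUS_REPORTERS = new Set(['U.S.', 'S. Ct.', 'L. Ed.'])

// Skips optional pin cites such as ", 404", then matches a parenthetical
// like "(1989)" or "(9th Cir. 2020)": optional court text, then a 4-digit year.
const PAREN_RE = /^(?:,\s*\d+(?:-\d+)?)*\s*\(([^()]*?)\s*(\d{4})\)/

function parseParenthetical(afterPage: string, reporter: string): ParentheticalInfo {
  const info: ParentheticalInfo = {}
  const match = PAREN_RE.exec(afterPage)
  if (match) {
    info.year = Number(match[2])
    // Raw abbreviation as written, e.g. "9th Cir."; empty when only a year is given.
    const court = (match[1] ?? '').trim()
    if (court) info.court = court
  }
  // Infer the court from the reporter when the parenthetical does not name one.
  if (!info.court && SCOTUS_REPORTERS.has(reporter)) {
    info.court = 'scotus'
  }
  return info
}
```

For the reproduction case, calling `parseParenthetical(', 404 (1989).', 'U.S.')` would yield a year of 1989 and a court of `scotus`, while `parseParenthetical(' (9th Cir. 2020)', 'F.3d')` would yield 2020 and the raw abbreviation `9th Cir.`. Mapping abbreviations to canonical court identifiers is left out of this sketch.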
