BUG: raise helpful errors for unsupported pyarrow read_csv usage by jbrockmendel · Pull Request #65862 · pandas-dev/pandas

jbrockmendel · 2026-06-11T22:21:25Z

Follow-up to the pyarrow-engine xfail cleanup (GH-65859): the remaining read_csv pyarrow xfails were cases where unsupported usage produced a cryptic exception or silently-wrong results. This PR makes them raise informative errors (or behave like the other engines where the fix was simple) and converts the corresponding xfails to assert those errors.

- list-valued header (MultiIndex columns, unsupported by pyarrow.csv) now raises ValueError "header argument must be an integer or None when using engine='pyarrow'" instead of TypeError "an integer is required"; a singleton header=[N] now behaves like header=N
- names together with an integer header was silently ignored; the header row is now discarded and names applied, matching other engines (including implicit-index handling when names is shorter than the file width)
- too many names (or a names/usecols length mismatch) now raises "Number of passed names did not match number of header fields in the file" like the python engine
- na_filter=False was silently ignored (NA filtering happened anyway); it now raises the standard unsupported-option ValueError
- tuple names now produce MultiIndex columns like other engines
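As a rough illustration of the header and names rules listed above, the checks could be sketched as follows. This is not the code in arrow_parser_wrapper.py: the helper names `normalize_pyarrow_header` and `check_names_width` are hypothetical, and only the two error messages are taken from this description.

```python
from typing import Optional, Sequence, Union

HeaderArg = Union[int, Sequence[int], None]


def normalize_pyarrow_header(header: HeaderArg) -> Optional[int]:
    """Hypothetical helper: reduce ``header`` to an int or None, or raise."""
    if header is None or isinstance(header, int):
        return header
    # A singleton list such as [2] is treated like the plain integer 2.
    if isinstance(header, (list, tuple)) and len(header) == 1:
        return header[0]
    # Anything else (e.g. [0, 1] for MultiIndex columns) is unsupported.
    raise ValueError(
        "header argument must be an integer or None when using engine='pyarrow'"
    )


def check_names_width(names: Sequence[object], num_fields: int) -> None:
    """Hypothetical helper: reject more names than the file has fields."""
    # Fewer names than fields is allowed (implicit-index handling).
    if len(names) > num_fields:
        raise ValueError(
            "Number of passed names did not match number of header fields "
            "in the file"
        )
```

Under these assumptions, `normalize_pyarrow_header([2])` returns 2, while `normalize_pyarrow_header([0, 1])` raises the ValueError quoted in the first bullet.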

~30 xfailed tests converted to the raise-assert pattern; 11 tests now pass outright and lost their xfails.
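For readers unfamiliar with the raise-assert pattern, here is a minimal, dependency-free sketch. The pandas test suite uses pytest.raises; this version uses the standard library's assertRaisesRegex so it runs on its own. `read_csv_stub` and `check_na_filter` are hypothetical stand-ins, and the message text is an assumption modelled on pandas' existing unsupported-option wording.

```python
import re
import unittest

# Assumed wording, modelled on pandas' existing unsupported-option message.
MSG = "The 'na_filter' option is not supported with the 'pyarrow' engine"


def read_csv_stub(engine: str, na_filter: bool = True) -> str:
    """Hypothetical stand-in for read_csv; only the error path is modelled."""
    if engine == "pyarrow" and not na_filter:
        raise ValueError(MSG)
    return "parsed"


def check_na_filter(engine: str) -> str:
    """Instead of an xfail, assert the pyarrow error and return early."""
    if engine == "pyarrow":
        case = unittest.TestCase()
        with case.assertRaisesRegex(ValueError, re.escape(MSG)):
            read_csv_stub(engine, na_filter=False)
        return "raised"
    # Other engines go on to exercise the real behaviour.
    assert read_csv_stub(engine, na_filter=False) == "parsed"
    return "parsed"
```

The point of the pattern is that the pyarrow branch now pins down the exact error, so a regression to a cryptic exception or silent acceptance fails the test rather than hiding behind an xfail.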

Note: touches the same areas of arrow_parser_wrapper.py as GH-65859, so whichever merges second will need minor conflict resolution.

🤖 Generated with Claude Code

The remaining pyarrow-engine read_csv xfails were cases where unsupported usage produced a cryptic exception or silently-wrong results:

- list-valued header (MultiIndex columns) now raises an informative ValueError instead of "TypeError: an integer is required"; header=[N] is treated as header=N
- names with an integer header was silently ignored; the header row is now discarded and names applied, matching other engines
- too many names now raises "Number of passed names did not match number of header fields in the file" like the python engine
- na_filter=False was silently ignored; it now raises the standard unsupported-option ValueError
- tuple names now produce MultiIndex columns like other engines

Convert the affected xfails to assert the errors; remove xfails from 11 tests that now pass.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

jbrockmendel added the Bug, Error Reporting (Incorrect or improved errors from pandas), IO CSV (read_csv, to_csv), and Arrow (pyarrow functionality) labels on Jun 11, 2026

jbrockmendel force-pushed the tst-pyarrow-csv-3 branch from d88bfe1 to dae5e52 on June 11, 2026 22:22

jbrockmendel wants to merge 1 commit into pandas-dev:main from jbrockmendel:tst-pyarrow-csv-3
