Multiindexed columns (schemas) are not serialisable #2182

@ConorForgie

Pandera schemas for pandas DataFrames with MultiIndex columns cannot be serialised to JSON:

import pandas as pd
import pandera.pandas as pa


# Create a simple schema with MultiIndex column names (tuples)
schema = pa.DataFrameSchema({
    ('level1', 'col_a'): pa.Column(float),
    ('level1', 'col_b'): pa.Column(float),
    ('level2', 'col_c'): pa.Column(int),
})

# Create a matching DataFrame with MultiIndex columns
df = pd.DataFrame({
    ('level1', 'col_a'): [1.0, 2.0, 3.0],
    ('level1', 'col_b'): [4.0, 5.0, 6.0],
    ('level2', 'col_c'): [7, 8, 9],
})

print('DataFrame columns:')
print(df.columns)

# Validation works fine
validated = schema.validate(df)

# But serialization to JSON fails
json_str = schema.to_json()

Fails with:

TypeError: keys must be str, int, float, bool or None, not tuple
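
For context, the same error can be reproduced with the standard library alone, which suggests the failure comes from `json` rejecting tuple dictionary keys rather than from anything specific to the DataFrame. The sketch below is not pandera's serialisation code: `columns`, `to_records` and `from_records` are hypothetical names, used only to illustrate one possible workaround, which is to store each column name as a JSON array inside a list of records instead of using it as an object key.

```python
import json

# Hypothetical stand-in for a column mapping keyed by MultiIndex tuples.
# This is NOT pandera's internal representation, just an illustration.
columns = {
    ("level1", "col_a"): {"dtype": "float"},
    ("level1", "col_b"): {"dtype": "float"},
    ("level2", "col_c"): {"dtype": "int"},
}

# json.dumps rejects tuple keys, which reproduces the reported TypeError.
try:
    json.dumps(columns)
    error_message = None
except TypeError as exc:
    error_message = str(exc)


def to_records(mapping):
    """Turn {name: spec} into a list of records so names need not be keys."""
    return [
        {"name": list(name) if isinstance(name, tuple) else name, "spec": spec}
        for name, spec in mapping.items()
    ]


def from_records(records):
    """Inverse of to_records: list-valued names become tuples again."""
    return {
        tuple(r["name"]) if isinstance(r["name"], list) else r["name"]: r["spec"]
        for r in records
    }


# Round trip: tuple names survive because they travel as JSON arrays.
json_str = json.dumps(to_records(columns))
restored = from_records(json.loads(json_str))
```

A record list keeps tuple names distinguishable from plain string names on the way back, which a simple string join of the levels would not guarantee. Whether this shape would suit pandera's existing JSON/YAML format is an open question.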

Labels: bug