pandas DataFrame.from_records has incomplete data type #2167

@mrichards42

Describe the bug

DataFrame.from_records should accept a list of dicts, but its data parameter is currently typed to accept only an ndarray, a list of tuples, a top-level dict, or a DataFrame.

For comparison, pandas-stubs uses the following type for the data parameter:

        data: (
            np_2darray
            | Sequence[SequenceNotStr]
            | Sequence[Mapping[str, Any]]
            | Mapping[str, Any]
            | Mapping[str, SequenceNotStr[Any]]
        ),

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of pandera.
  • (optional) I have confirmed this bug exists on the main branch of pandera.

Code Sample, a copy-pastable example

import pandera.pandas as pa
from pandera.typing.pandas import DataFrame


class MySchema(pa.DataFrameModel):
    a: float
    b: float


def records_to_my_df(records: list[dict[str, float]]) -> DataFrame[MySchema]:
    return DataFrame.from_records(MySchema, records)

$ mypy example.py
example.py:11: error: Argument 2 to "from_records" of "DataFrame" has incompatible type "list[dict[str, float]]"; expected "ndarray[Any, Any] | list[tuple[Any, ...]] | dict[Any, Any] | DataFrame"  [arg-type]
Found 1 error in 1 file (checked 1 source file)

Expected behavior

DataFrame.from_records should accept a list of dicts.
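
Until the annotation is widened, one possible workaround is to convert the records into the list-of-tuples shape that the current annotation already allows. The sketch below uses only the standard library. `dicts_to_tuples` is a hypothetical helper, not part of pandera or pandas, and the commented-out call at the end assumes that pandera's `from_records` forwards a `columns` keyword to pandas, which I have not verified.

```python
from collections.abc import Mapping, Sequence
from typing import Any


def dicts_to_tuples(
    records: Sequence[Mapping[str, Any]],
) -> tuple[list[str], list[tuple[Any, ...]]]:
    """Convert a list of dicts into (column names, list of row tuples)."""
    # Collect column names in first-seen order across all records.
    columns: list[str] = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    # Keys missing from a record are filled with None.
    rows = [tuple(record.get(col) for col in columns) for record in records]
    return columns, rows


columns, rows = dicts_to_tuples([{"a": 1.0, "b": 2.0}, {"a": 3.0, "b": 4.0}])

# Hypothetical follow-up, not run here (assumes `columns` is passed through):
# DataFrame.from_records(MySchema, rows, columns=columns)
```

This only sidesteps the type error at the call site. The underlying fix is still to add a sequence-of-mappings alternative to the annotation, as pandas-stubs does.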

Metadata

Labels: bug (Something isn't working)