Description
Describe the bug
My goal is to create a DataFrameSchema which has a Column of type List/Array with a defined inner type. This works fine if I simply use dtype=pl.List but then I am unsure how to define the inner type.
I could write dtype=pl.List(pl.Int64), which works and behaves normally when I call .validate(). However, dtype expects PolarsDtypeInputTypes, i.e. str | type | pl.datatypes.classes.DataTypeClass | None, and this trips up my linter.
Maybe I am doing this incorrectly, but I couldn't find any additional information on how to accomplish this, and I want to be able to use this feature without needing # type: ignore[arg-type] all over my code.
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandera.
- (optional) I have confirmed this bug exists on the main branch of pandera.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
import polars as pl
import pandera.polars as pa
column1 = pa.Column(
    name="column1",
    dtype=pl.Int64,
)
column2 = pa.Column(
    name="column2",
    dtype=pl.List(pl.Int64),  # Type Warning
)
schema = pa.DataFrameSchema(
    columns={
        column1.name: column1,
        column2.name: column2,
    }
)
df = pl.DataFrame(
    {
        column1.name: [1, 2],
        column2.name: [[1], [2]],
    }
)
schema.validate(df)
Expected behavior
It should be possible to create a List/Array Column and define its inner data type without a type-hint warning.