Invalid index type when extracting values from a multi-index data frame #1018
Open
Description
Describe the bug
When values are extracted from a data frame with a multi-index, a tuple key should be accepted, but the type checker rejects it.
To Reproduce
- Provide a minimal runnable pandas example that is not properly checked by the stubs.
from __future__ import annotations

from typing import TypeAlias

import pandas as pd

_KeyType: TypeAlias = str | list[str | bool] | slice
_MultiKeyType: TypeAlias = str | slice | tuple[_KeyType, ...] | list[str | bool | tuple[_KeyType, ...]]


def df_loc(df: pd.DataFrame, key: _MultiKeyType) -> pd.DataFrame:
    print(key)
    return df.loc[:, key]


if __name__ == "__main__":
    df_multi_columns = pd.DataFrame({
        ("A", "a"): [1, 2, 3],
        ("A", "b"): [4, 5, 6],
        ("B", "a"): [7, 8, 9],
        ("B", "b"): [10, 11, 12],
    })
    print(df_loc(df_multi_columns, "A"))
    print(df_loc(df_multi_columns, ("A", "a")))
    print(df_loc(df_multi_columns, ["A", "B"]))
    print(df_loc(df_multi_columns, pd.IndexSlice[:, "a"]))
    print(df_loc(df_multi_columns, [("A", "a"), ("B", "b")]))
- Indicate which type checker you are using (mypy or pyright).
mypy
- Show the error message received from that type checker while checking your example.
get_pandas_loc.py:11: error: Invalid index type "tuple[slice, slice | tuple[str | list[str | builtins.bool] | slice, ...] | list[str | builtins.bool | tuple[str | list[str | builtins.bool] | slice, ...]]]" for "_LocIndexerFrame"; expected type "slice | ndarray[Any, dtype[integer[Any]]] | Index[Any] | list[int] | Series[int] | <6 more items>" [index]
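For readers wondering why the reported index type starts with "tuple[slice, ...", here is a minimal sketch in plain Python of what a subscript such as [:, key] hands to __getitem__. The KeyRecorder class is a hypothetical stand-in written for this illustration. It is not part of pandas or pandas-stubs, and it says nothing about how the stubs actually declare their overloads.

```python
from __future__ import annotations

from typing import Any


class KeyRecorder:
    """Hypothetical stand-in for an indexer such as ``df.loc`` (not a pandas class)."""

    def __getitem__(self, key: Any) -> Any:
        # Return the key unchanged so the caller can inspect what was passed.
        return key


recorder = KeyRecorder()

# Subscripting with ``[:, key]`` passes a 2-tuple whose first item is a slice
# and whose second item is the column key, whatever form that key takes.
single = recorder[:, "A"]
nested = recorder[:, ("A", "a")]
listed = recorder[:, [("A", "a"), ("B", "b")]]

print(single)
print(nested)
print(listed)
```

So the type checker sees the whole expression as a tuple of a slice and the full _MultiKeyType union, which matches the shape shown in the error message above.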
Please complete the following information:
- OS: Windows
- OS Version [e.g. 22]: 11
- python version: 3.10.15
- version of type checker: mypy 2.1.2
- version of installed pandas-stubs: 2.2.3.241009
Additional context
- version of pandas: 2.2.3
- mypy option: strict=True