
Invalid index type when extracting values from a multi-index data frame #1018

Open
@kubotty

Description

Describe the bug
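When values are extracted from a data frame with a multi-index via `.loc`, a tuple key (or a list of tuples) is valid at runtime and should be accepted by the stubs, but the type checker rejects it as an invalid index type.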

To Reproduce

  1. Provide a minimal runnable pandas example that is not properly checked by the stubs.
from __future__ import annotations

from typing import TypeAlias
import pandas as pd

_KeyType: TypeAlias = str | list[str | bool] | slice
_MultiKeyType: TypeAlias = str | slice | tuple[_KeyType, ...] | list[str | bool | tuple[_KeyType, ...]]

def df_loc(df: pd.DataFrame, key: _MultiKeyType) -> pd.DataFrame:
    print(key)
    return df.loc[:, key]

if __name__=="__main__":
    df_multi_columns = pd.DataFrame({
        ("A", "a"): [1, 2, 3],
        ("A", "b"): [4, 5, 6],
        ("B", "a"): [7, 8, 9],
        ("B", "b"): [10, 11, 12]
    })
    print(df_loc(df_multi_columns, "A"))
    print(df_loc(df_multi_columns, ("A", "a")))
    print(df_loc(df_multi_columns, ["A", "B"]))
    print(df_loc(df_multi_columns, pd.IndexSlice[:, "a"]))
    print(df_loc(df_multi_columns, [("A", "a"), ("B", "b")]))
  2. Indicate which type checker you are using (mypy or pyright).
    mypy
  3. Show the error message received from that type checker while checking your example.
    get_pandas_loc.py:11: error: Invalid index type "tuple[slice, slice | tuple[str | list[str | builtins.bool] | slice, ...] | list[str | builtins.bool | tuple[str | list[str | builtins.bool] | slice, ...]]]" for "_LocIndexerFrame"; expected type "slice | ndarray[Any, dtype[integer[Any]]] | Index[Any] | list[int] | Series[int] | <6 more items>" [index]
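For reference, a minimal sketch that checks the same five keys at runtime only, to show that the report concerns the stubs and not pandas itself. The helper name `runtime_results` and the `KEYS` list are hypothetical and exist only for this illustration; the data frame is the one from the example above.

```python
from __future__ import annotations

from typing import Any

import pandas as pd

# Same frame as in the reproduction: two-level column MultiIndex.
df_multi_columns = pd.DataFrame({
    ("A", "a"): [1, 2, 3],
    ("A", "b"): [4, 5, 6],
    ("B", "a"): [7, 8, 9],
    ("B", "b"): [10, 11, 12],
})

# The five keys passed to df_loc in the reproduction (hypothetical list name).
KEYS: list[Any] = [
    "A",
    ("A", "a"),
    ["A", "B"],
    pd.IndexSlice[:, "a"],
    [("A", "a"), ("B", "b")],
]


def runtime_results(df: pd.DataFrame) -> list[Any]:
    """Apply each key with .loc and collect the results (hypothetical helper).

    Typed as Any on purpose: this only exercises runtime behaviour and
    sidesteps the stub check that the issue is about.
    """
    loc: Any = df.loc
    return [loc[:, key] for key in KEYS]


if __name__ == "__main__":
    for key, result in zip(KEYS, runtime_results(df_multi_columns)):
        # Report the key, the result type, and its shape.
        print(repr(key), type(result).__name__, result.shape)
```

Every key is accepted by pandas at runtime. One detail worth keeping in mind when annotating a wrapper like `df_loc`: a full tuple key such as `("A", "a")` selects a single column, so the result there is a `Series` and not a `DataFrame`.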

Please complete the following information:

  • OS: Windows
  • OS Version [e.g. 22]: 11
  • python version: 3.10.15
  • version of type checker: mypy 2.1.2
  • version of installed pandas-stubs: 2.2.3.241009

Additional context

  • version of pandas: 2.2.3
  • mypy option: strict=True

Metadata


    Labels

    Indexing: Related to indexing on series/frames, not to indexes themselves
