
SNOW-1797580: Integer columns contain NaN after filtering when using to_pandas in local testing #2598

Open
@frederiksteiner

Description

  1. What version of Python are you using?

    Python 3.11.8

  2. What operating system and processor architecture are you using?

    Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35

  3. What are the component versions in the environment (pip freeze)?

    snowflake-connector-python==3.12.3
    snowflake-snowpark-python==1.24.0

  4. What did you do?

if __name__ == "__main__":
    from snowflake.snowpark import Session
    import snowflake.snowpark.functions as spf

    # Local testing mode: the session runs without a Snowflake connection.
    conn_params = {
        "schema": "SCHEMA",
        "local_testing": True,
    }

    session = Session.builder.configs(conn_params).create()
    data = [
        [1, False],
        [1, False],
        [1, False],
        [2, True],
    ]
    schema = ["INT_COL", "BOOL_COL"]
    df = session.create_dataframe(data, schema)
    df = df.with_column("INT_COL", spf.cast("INT_COL", "int"))
    # Keep only the rows where BOOL_COL is True (here, the last row).
    filtered = df.filter(spf.col("BOOL_COL"))
    # The two results below are expected to hold the same data.
    pd_df = filtered.to_pandas()
    collected = filtered.collect()

  5. What did you expect to see?

    I expected pd_df to contain the same data as collected. Instead, INT_COL is NaN in the pandas DataFrame. I have already found the cause and will open a PR as soon as possible.
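
The expectation above (pd_df holds the same data as collected) can be written as a small check. This is only a sketch, not code from the Snowpark test suite: the helper name `rows_match` is hypothetical, and plain tuples and hand-built pandas DataFrames stand in for the real `collect()` and `to_pandas()` results so that it runs without Snowpark installed. It also assumes the expected rows contain no NULLs, so any missing value in the pandas frame counts as a mismatch.

```python
import pandas as pd


def rows_match(pd_df, rows):
    """Return True if pd_df holds the same values, in the same order, as rows.

    `rows` is any sequence of tuple-like records, such as the result of
    collect(). Hypothetical helper for illustration only.
    """
    # Compare plain tuples so dtype differences (int64 vs. int) do not matter.
    pandas_rows = list(pd_df.itertuples(index=False, name=None))
    expected_rows = [tuple(r) for r in rows]
    if len(pandas_rows) != len(expected_rows):
        return False
    for got, want in zip(pandas_rows, expected_rows):
        if len(got) != len(want):
            return False
        for g, w in zip(got, want):
            # Assumption: expected rows contain no NULLs, so NaN/None/NA
            # in the pandas frame is always treated as a mismatch.
            if pd.isna(g) or g != w:
                return False
    return True


# Stand-in data: the single row that passes the BOOL_COL filter.
collected_like = [(2, True)]
good = pd.DataFrame({"INT_COL": [2], "BOOL_COL": [True]})
bad = pd.DataFrame({"INT_COL": [float("nan")], "BOOL_COL": [True]})

print(rows_match(good, collected_like))  # True
print(rows_match(bad, collected_like))   # False
```

With the reproduction script, the equivalent call would be `rows_match(pd_df, collected)`, which should return True once the bug is fixed.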

Metadata

Labels

bug: Something isn't working
status-triage_done: Initial triage done, will be further handled by the driver team
