BUG: why does arrow only work on mac arm? #60714

Open
@wonb168

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

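from typing import Literal

import pandas as pd
import pyarrow as pa

pd.options.mode.copy_on_write = True

def _is_arrow_backed(df) -> bool:
    # pandas has no pd.api.types.is_dtype_backend, so check the column dtypes instead.
    return all(isinstance(dtype, pd.ArrowDtype) for dtype in df.dtypes)

def df_merge(
    left,
    right,
    how: Literal["left", "right", "inner", "outer", "cross"] = "inner",
    on=None,
    left_on=None,
    right_on=None,
    left_index: bool = False,
    right_index: bool = False,
    sort: bool = False,
    suffixes=("_x", "_y"),
    copy: bool = True,
    indicator: bool = False,
    validate=None,
):
    # Round-trip each frame through pyarrow before merging.
    # Note: to_pandas() without types_mapper=pd.ArrowDtype returns NumPy-backed columns.
    if not _is_arrow_backed(left):
        left = pa.Table.from_pandas(left).to_pandas()
    if not _is_arrow_backed(right):
        right = pa.Table.from_pandas(right).to_pandas()
    # The report cuts off here; the merge call below is an assumed completion.
    return pd.merge(
        left, right, how=how, on=on, left_on=left_on, right_on=right_on,
        left_index=left_index, right_index=right_index, sort=sort,
        suffixes=suffixes, copy=copy, indicator=indicator, validate=validate,
    )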

Issue Description

I have a Python project that uses pickle files and pandas 2.1.
On x86 CentOS 7 it takes 107 s to run,
but on a Mac M2 it needs only 71 s.
I then upgraded pandas to 2.2.3, set pd.options.mode.copy_on_write = True,
and edited the function df_merge (the most time-consuming function)
to convert the DataFrames to Arrow first.
After that, the run takes only 35 s on the Mac.
On x86 CentOS 7, however, the same code still takes 100 s. Why does Arrow not help there?

Expected Behavior

Using Arrow should make the run about twice as fast on x86 CentOS 7 as well, but it has no effect there.

Installed Versions

2.2.3
