[SPARK-46160][PYTHON] Add axis parameter to DataFrame.shift in pandas API on Spark #55545

Open
devin-petersohn wants to merge 2 commits into apache:master from devin-petersohn:devin/shift-axis-parameter

Conversation

@devin-petersohn
Contributor

What changes were proposed in this pull request?

Add an axis parameter to DataFrame.shift() in the pandas API on Spark. With axis=1, values are shifted across columns using a pandas_udf; small DataFrames take a fast path that follows the compute.shortcut_limit pattern.
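
For readers unfamiliar with the axis=1 semantics, here is a minimal pure-Python sketch of what shifting across columns does to a single row. It is not the code in this PR: `shift_row` is a hypothetical helper for illustration only, and `None` stands in for the missing-value marker that pandas would use.

```python
from typing import Any, List, Optional


def shift_row(row: List[Any], periods: int = 1, fill_value: Optional[Any] = None) -> List[Any]:
    """Shift one row's values across its columns (hypothetical illustration)."""
    n = len(row)
    if periods == 0 or n == 0:
        return list(row)
    if abs(periods) >= n:
        # Every original value is pushed out of the row.
        return [fill_value] * n
    if periods > 0:
        # Values move right; the leftmost columns are filled.
        return [fill_value] * periods + row[: n - periods]
    # Values move left; the rightmost columns are filled.
    return row[-periods:] + [fill_value] * (-periods)


rows = [[1, 2, 3], [4, 5, 6]]
shifted = [shift_row(r, periods=1) for r in rows]
print(shifted)  # [[None, 1, 2], [None, 4, 5]]
```

Each row is handled independently, which is why a row-wise operation like this lends itself to being applied batch by batch.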

Why are the changes needed?

pandas DataFrame.shift() supports axis=1 but the pandas API on Spark did not.

Does this PR introduce any user-facing change?

Yes. DataFrame.shift() now accepts axis (0, 1, 'index', 'columns'). Default behavior is unchanged.
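
As a rough sketch of how the four accepted values can map onto two axes, assuming a hypothetical `normalize_axis` helper (the PR may rely on an existing utility instead, and the error message here is illustrative):

```python
from typing import Union

# Accepted spellings, mapped to a canonical axis number.
_AXIS_ALIASES = {0: 0, "index": 0, 1: 1, "columns": 1}


def normalize_axis(axis: Union[int, str] = 0) -> int:
    """Return 0 or 1 for a valid axis, else raise ValueError (hypothetical helper)."""
    try:
        return _AXIS_ALIASES[axis]
    except (KeyError, TypeError):
        # KeyError: unknown value; TypeError: unhashable input such as a list.
        raise ValueError(f"No axis named {axis!r}") from None


print(normalize_axis("columns"))  # 1
print(normalize_axis())           # 0
```

Defaulting to 0 keeps existing calls to shift() on the row-wise path, consistent with the statement that default behavior is unchanged.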

How was this patch tested?

Added test_shift_axis covering axis=0/1, string aliases, various periods, fill_value, single-column, NaN values, multi-index columns, large dataset (UDF path), mixed types, empty DataFrame, and invalid axis.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-6)
