Release Python Polars 1.31.0-beta.1 · pola-rs/polars

💥 Breaking changes

Remove old streaming engine (#23103)

⚠️ Deprecations

Deprecate allow_missing_columns in scan_parquet in favor of missing_columns (#22784)

🚀 Performance improvements

Improve streaming groupby CSE (#23092)
Move row index materialization in post-apply to occur after slicing (#22995)
Add first_(true|false)_idx to BooleanChunked and use in bool arg_(min|max) (#22907)
Don't go through row encoding for most types on index_of (#22903)
Optimise low-level null scans and arg_max for bools (when chunked) (#22897)
Optimize multiscan performance (#22886)

✨ Enhancements

DataType expressions in Python (#23167)
Native implementation for Iceberg positional deletes (#23091)
Remove old streaming engine (#23103)
Basic implementation of DataTypeExpr in Rust DSL (#23049)
Add required: bool to ParquetFieldOverwrites (#23013)
Support serializing name.map_fields (#22997)
Support serializing Expr::RenameAlias (#22988)
Remove duplicate verbose logging from FetchedCredentialsCache (#22973)
Add keys column in finish_callback (#22968)
Add extra_columns parameter to scan_parquet (#22699)
Add CORR function to polars SQL (#22690)
Add per partition sort and finish callback to sinks (#22789)
Support descendingly-sorted values in search_sorted() (#22825)
Derive DSL schema (#22866)
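
As a rough illustration of the search_sorted() entry above (#22825): on descending-sorted values, the insertion index can be found with an ordinary binary search whose comparison is flipped. The helper below is a hypothetical pure-Python sketch of that idea. It is not Polars' implementation, and the name search_sorted_desc is invented for this example.

```python
from typing import Sequence


def search_sorted_desc(values: Sequence[float], target: float) -> int:
    """Return the leftmost index at which `target` can be inserted
    into `values` (sorted in descending order) while keeping the order.

    Hypothetical helper for illustration only; not part of Polars.
    """
    lo, hi = 0, len(values)
    while lo < hi:
        mid = (lo + hi) // 2
        # In a descending sequence, larger elements sit to the left,
        # so move right only while the midpoint is still greater than target.
        if values[mid] > target:
            lo = mid + 1
        else:
            hi = mid
    return lo


descending = [9, 7, 5, 3, 1]
indices = [search_sorted_desc(descending, t) for t in (10, 6, 5, 0)]
print(indices)
```

Inserting each target at its returned index leaves the list sorted in descending order.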

🐞 Bug fixes

Fix panic reading empty parquet with multiple boolean columns (#23159)
Raise ComputeError instead of panicking in truncate when mixing month/week/day/sub-daily units (#23176)
Materialize list.eval with unknown type (#23186)
Only set sorting flag for 1st column with PQ SortingColumns (#23184)
Typo in AExprBuilder (#23171)
Null return from var/std on scalar column (#23158)
Support Datetime broadcast in list.concat (#23137)
Ensure projection pushdown maintains right table schema (#22603)
Add Null dtype support to arg_sort_by (#23107)
Raise error by default on invalid CSV quotes (#22876)
Fix group_by mean and median returning all nulls for Decimal dtype (#23093)
Fix hive partition pruning not filtering out __HIVE_DEFAULT_PARTITION__ (#23074)
Fix AssertionError when using scan_delta() on AWS with storage_options (#23076)
Fix deadlock on collect(background=True) / collect_concurrently() (#23075)
Incorrect null count in rolling_min/max (#23073)
Preserve file:// in LazyFrame node traverser (#23072)
Respect column order in register_io_source schema (#23057)
Don't call unnest for objects implementing __arrow_c_array__ (#23069)
Incorrect output when using sort with group_by and cum_sum (#23001)
Implement owned arithmetic for Int128 (#23055)
Do not schema-match structs with different field counts (#23018)
Fix confusing error message on duplicate row_index (#23043)
Add include_nulls to Agg::Count CSE check (#23032)
View buffer exceeding 2^32 - 1 bytes in concatenate_view (#23017)
Fix incorrect result selecting pl.len() from scan_csv with skip_lines (#22949)
Allow for IO plugins with reordered columns in streaming (#22987)
Fix str.zfill being inconsistent with Python and pandas when the string contains a leading '+' (#22985)
Integer underflow in propagate_nulls (#22986)
Setting compat_level=0 for sink_ipc (#22960)
Narrow return type for DataType.is_, improve Pyright's type completeness from 69% to 95% (#22962)
Support arrow Decimal32 and Decimal64 types (#22954)
Guard against dictionaries being passed to projection keywords (#22928)
Update arrow format (#22941)
Fix filter pushdown to IO plugins (#22910)
Improve numeric stability rolling_mean<f32> (#22944)
Guard against invalid nested objects in 'map_elements' (#22932)
Allow subclasses in type equality checking (#22915)
Return early in pl.Expr.__array_ufunc__ when only single input (#22913)
Add inline implodes in type coercion (#22885)
Add {top, bottom}_k_by to Series (#22902)
Correct int_ranges to raise error on invalid inputs (#22894)
Don't silently overflow for temporal casts (#22901)
Fix error using write_csv with storage_options (#22881)
Schema resolution .over(mapping_strategy="join") with non-aggregations (#22875)
Ensure rename behaves the same as select (#22852)
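
For reference on the str.zfill fix (#22985), the snippet below shows how Python's built-in str.zfill treats a leading sign: the sign stays in front and the zeros are inserted after it. Per the entry, Polars' str.zfill is meant to agree with this behaviour. The snippet uses only the standard library and makes no claim about what Polars returned before the fix.

```python
# Python's built-in str.zfill keeps a leading '+' or '-' in front
# and pads with zeros after the sign, up to the requested width.
samples = ["5", "+5", "-5", "+123"]
padded = {s: s.zfill(5) for s in samples}

for original, result in padded.items():
    print(f"{original!r:>7} -> {result!r}")
```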

📖 Documentation

Document aggregations that return the identity when there are no non-null values, and suggest a workaround for those who want SQL-standard behaviour (#23143)
Fix reference to non-existent Expr.replace_all in replace_strict docs (#23144)
Fix typo on pandas comparison page (#23123)
Minor improvement to cum_count docstring example (#23099)
Add missing DataFrame.__setitem__ to API reference (#22938)
Add missing entry for LazyFrame __getitem__ (#22924)
Add missing top_k_by and bottom_k_by to Series reference (#22917)
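
To make the distinction in the aggregation documentation entry (#23143) concrete, here is a minimal pure-Python sketch contrasting an "identity" sum, which yields 0 when every value is null, with an SQL-style sum, which yields null in that case. Both function names are hypothetical and exist only for this illustration. The workaround actually recommended for Polars is in the linked documentation and is not reproduced here.

```python
from typing import Iterable, Optional


def sum_identity(values: Iterable[Optional[int]]) -> int:
    """Sum the non-null values; with none present, return the identity 0."""
    return sum(v for v in values if v is not None)


def sum_sql_style(values: Iterable[Optional[int]]) -> Optional[int]:
    """Sum the non-null values; with none present, return None (SQL NULL)."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


all_null = [None, None]
mixed = [1, None, 2]

print(sum_identity(all_null), sum_sql_style(all_null))
print(sum_identity(mixed), sum_sql_style(mixed))
```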

📦 Build system

Update pyo3 and numpy crates to version 0.25 (#22763)
Actually disable ir_serde by default (#23046)
Add a feature flag for serde_ignored (#22957)
Fix warnings, update DSL version and schema hash (#22953)

🛠️ Other improvements

Add more descriptive error message by replacing FixedSizeList with Array (#23168)
Connect Python assert_series_equal() to Rust back-end (#23141)
Refactor skip_batches to use AExprBuilder (#23147)
Use ir_serde instead of serde for IRFunctionExpr (#23148)
Separate FunctionExpr and IRFunctionExpr (#23140)
Remove AExpr::Alias (#23070)
Add components for Iceberg deletion file support (#23059)
Feature gate StructFunction::JsonEncode (#23060)
Propagate iceberg position delete information to IR (#23045)
Add environment variable to get Parquet decoding metrics (#23052)
Turn pl.cumulative_eval into its own AExpr (#22994)
Add make test-streaming (#23044)
Move scan parameter parsing for parquet to reusable function (#23019)
Prepare deltalake 1.0 (#22931)
Implement Hash and use SpecialEq for RenameAliasFn (#22989)
Turn list.eval into an AExpr (#22911)
Fix CI for latest pandas-stubs release (#22971)
Add a CI check for DSL schema changes (#22898)
Add schema parameters to expr.meta (#22906)
Update rust toolchain in nix flake (#22905)
Update toolchain (#22859)

Thank you to all our contributors for making this release possible!
@Athsus, @DahaoALG, @FabianWolff, @JakubValtar, @Kevin-Patyk, @MarcoGorelli, @SanjitBasker, @alexander-beedie, @bschoenmaeckers, @coastalwhite, @deanm0000, @dsprenkels, @eitsupi, @florian-klein, @i1oveMyse1f, @ion-elgreco, @itamarst, @kdn36, @kutal10, @mroeschke, @nameexhaustion, @nikaltipar, @orlp, @paskhaver, @ritchie46, @stijnherfst and @thomasfrederikhoeck
