Python Polars 1.3.0
🚀 Performance improvements
- Ensure metadata flags are maintained on vertical parallelization (#17804)
- Ensure only nodes that are not changed are cached in collapse optimizer (#17791)
- Use bitflags for OptState (#17788)
- Remove async directory auto-detection (#17779)
- Fix accidental quadratic horizontal concat (#17783)
- Batch parquet integer decoding (#17734)
- Use mmap-ed memory if possible in Parquet reader (#17725)
- Use bitflags for function options (#17723)
- Also set target features and tune cpu for CC (#17716)
- Introduce `MemReader` to file buffer in Parquet reader (#17712)
✨ Enhancements
- Expose binary_elementwise_into_string_amortized for plugin authors, recommend `apply_into_string_amortized` instead of `apply_to_buffer` (#17903)
- Expose allocator to capsule (#17817)
- Decompress in CSV / NDJSON scan (#17841)
- Ensure unique names in HConcat (#17884)
- Support authentication with HuggingFace login (#17881)
- Enable collection with gpu engine (#17550)
- Support "BY NAME" qualifier for SQL "INTERSECT" and "EXCEPT" set ops (#17835)
- Write data at table level in `write_excel` (#17757)
- Support PyCapsule Interface in DataFrame & Series constructors (#17693)
- Implement Arrow PyCapsule Interface for Series/DataFrame export (#17676)
- Raise informative error instead of panicking when passing invalid directives to `to_string` for Date dtype (#17670)
- Implement forward/backward fill for all types (#17861)
- Implement `is_in` operation on decimal type (#17832)
- Optimise `read_excel` when using "calamine" engine with the latest `fastexcel` (#17735)
- Support `hf://` in `read_(csv|ipc|ndjson)` functions (#17785)
- Allow literals in sort (#17780)
- Expose 'strict' argument to 'is_in' (#17776)
- Release the GIL in `collect_schema` (#17761)
- Cloud support for NDJSON (#17717)
- Support API token for scanning `hf://` (#17682)
🐞 Bug fixes
- Scanning '%' from cloud (#17890)
- Raise suitable error when invalid column passed to `get_column_index` (#17868)
- Respect `glob=False` for cloud reads (#17860)
- Properly write nest-nulled values in Parquet (#17845)
- Improve default `write_excel` int/float format when using a dark "table_style" (#17869)
- Fix `from_arrow` for struct type (#17839)
- Fix bool/string usage of "column_totals" parameter in `write_excel` (#17846)
- Infer decimal scales on mixed scale input (#17840)
- Don't ignore timezones in list of dicts constructor (#14211)
- Raise on unsupported fill strategy dtype (#17837)
- Properly write nested `NullArray` in Parquet (#17807)
- Check input type on list.to_struct (#17834)
- Fix right join schema (#17833)
- Simultaneous usage of `named_expr` and `schema` in `pl.struct` (#17768)
- Fix projection pushdown of literals without names (#17778)
- Don't expand HTTP paths (#17774)
- Check function input len at expansion (#17763)
- Don't panic in invalid agg_groups (#17762)
- Raise empty struct (#17736)
- Fix GC logic in `write_ipc` (#17752)
- Panic in pl.concat_list and list.concat on empty inputs (#17742)
- Fix out nullability for structs coming from arrow (#17738)
- Percent encode for Hugging Face paths (#17718)
📖 Documentation
- Update the join example input for Rust for consistency with the Python example (#17898)
- Improve filter documentation (#17755)
- Reword "how" param docstring entry for 'semi' and 'anti' `join` types for clarity (#17843)
- Mention `read_*` functions in Hugging Face section in user guide (#17799)
- Show return type for Series attributes in API reference (#17759)
- Add function with multiple arguments example to `Expr.map_batches` (#17789)
- Add Hugging Face section to user guide (#17721)
📦 Build system
- Update Rust toolchain to `nightly-2024-07-26` (#17891)
- Correctly reference released package in optional dependencies (#17691)
🛠️ Other improvements
- On Python release, trigger docs build after API reference build (#17904)
- Set `uv pip install` to verbose (#17901)
- Fix broken `typos` command in `make pre-commit` for py-polars folder (#17897)
- Remove HybridRLE iter / batch nested parquet decoding (#17889)
- Add version field for python IR (#17876)
- Pass through missing rolling and stringfunction information in pyir (#17702)
- Make better use of `typos` configuration features (#17800)
- Better deprecate message for _import_from_c (#17753)
- Rename Unit to Plain in Parquet reader (#17751)
- Unpin `setuptools` (#17726)
- Update CODEOWNERS (#17707)
Thank you to all our contributors for making this release possible!
@MarcoGorelli, @Object905, @SandroCasagrande, @alexander-beedie, @atigbadr, @coastalwhite, @deanm0000, @delsner, @dependabot, @dependabot[bot], @henryharbeck, @implicit-apparatus, @jparag, @knl, @kylebarron, @lukapeschke, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @ruihe774, @stinodego, @szepeviktor and @wence-