Changelog
55.0.0 (2025-04-08)
Breaking changes:
- Change Parquet API interaction to use u64 (support files larger than 4GB in WASM) #7371 [parquet] (kylebarron)
- Remove AsyncFileReader::get_metadata_with_options, add options to AsyncFileReader::get_metadata #7342 [parquet] (corwinjoy)
- Parquet: Support reading Parquet metadata via suffix range requests #7334 [parquet] (kylebarron)
- Upgrade object_store to 0.12.0 #7328 [parquet] (mbrobbel)
- Upgrade pyo3 to 0.24 #7324 [arrow] (mbrobbel)
- Reapply Box FlightError::tonic to reduce size (fixes nightly clippy) #7277 [arrow] [arrow-flight] (alamb)
- Improve parquet gzip compression performance using zlib-rs #7200 [parquet] (psvri)
- Fix: date_part to extract only the requested part (not the overall interval) #7189 [arrow] (delamarch3)
- chore: upgrade flatbuffer version to 25.2.10 #7134 [arrow] (tisonkun)
- Add hooks to json encoder to override default encoding or add support for unsupported types #7015 [arrow] (adriangb)
Implemented enhancements:
- Improve the performance of concat #7357 [arrow]
- Pushdown predictions to Parquet in-memory row group fetches #7348 [parquet]
- Improve CSV parsing errors: Print the row that makes CSV parsing fail #7344 [arrow]
- Support ColumnMetaData encoding_stats in Parquet Writing #7341 [parquet]
- Support writing Parquet with modular encryption #7327 [parquet]
- Parquet Use U64 Instead of Usize (wasm support for files greater than 4GB) #7238 [parquet]
- Support different TimeUnits and timezones when reading Timestamps from INT96 #7220 [parquet]
Fixed bugs:
- New clippy failures in code base with release of rustc 1.86 #7381 [parquet] [arrow]
- Fix bug in ParquetMetaDataReader and add test of suffix metadata reads with encryption #7372 [parquet] (etseidl)
Documentation updates:
- Improve documentation on ArrayData::offset #7385 [arrow] (alamb)
- Improve documentation for AsyncFileReader::get_metadata #7380 [parquet] (alamb)
- Improve documentation on implementing Parquet predicate pushdown #7370 [parquet] (alamb)
- Add documentation and examples for pretty printing, make pretty_format_columns_with_options pub #7346 [arrow] (alamb)
- Improve documentation on writing parquet, including multiple threads #7321 [parquet] (alamb)
Merged pull requests:
- chore: apply clippy suggestions newly introduced in rust 1.86 #7382 [parquet] [arrow] (westonpace)
- bench: add more {boolean, string, int} benchmarks for concat kernel #7376 [arrow] (rluvaton)
- Add more examples of using Parquet encryption #7374 [parquet] (adamreeve)
- Clean up ArrowReaderMetadata::load_async #7369 [parquet] (etseidl)
- bump pyo3 for RUSTSEC-2025-0020 #7368 [arrow] (onursatici)
- Test int96 Parquet file from Spark #7367 [parquet] (mbutrovich)
- fix: respect offset/length when converting ArrayData to StructArray #7366 [arrow] (westonpace)
- Print row, data present, expected type, and row number in error messages for arrow-csv #7361 [arrow] (psiayn)
- Use rust builtins for round_upto_multiple_of_64 and ceil #7358 [arrow] (psvri)
- Write parquet PageEncodingStats #7354 [parquet] (jhorstmann)
- Move sysinfo to dev-dependencies #7353 [parquet] (mbrobbel)
- chore(deps): update sysinfo requirement from 0.33.0 to 0.34.0 #7352 [parquet] (dependabot[bot])
- Add additional benchmarks for utf8view comparison kernels #7351 [arrow] (zhuqi-lucas)
- Upgrade to twox-hash 2.0 #7347 [parquet] (alamb)
- refactor: apply borrowed chunk reader to Sbbf::read_from_column_chunk #7345 [parquet] (ethe)
- Merge changelog and version from 54.3.1 into main #7340 [parquet] [arrow] (timsaucer)
- Remove object-store label from .asf.yaml #7339 (mbrobbel)
- Encapsulate encryption code more in readers #7337 [parquet] (alamb)
- Bump MSRV to 1.81 #7336 [parquet] [arrow] [arrow-flight] (mbrobbel)
- Add an option to show column type #7335 [arrow] (blaginin)
- Add missing type annotation #7326 [parquet] (mbrobbel)
- Minor: Improve parallel parquet encoding example #7323 [parquet] (alamb)
- feat: allow if expressions for fallbacks in downcast macro #7322 [arrow] (rluvaton)
- Minor: rename ParquetRecordBatchStream::reader to ParquetRecordBatchStream::reader_factory #7319 [parquet] (alamb)
- bugfix: correct offsets when serializing a list of fixed sized list and non-zero start offset #7318 [arrow] (timsaucer)
- Remove object_store references in Readme.md #7317 (alamb)
- Adopt MSRV policy #7314 (psvri)
- fix: correct array length validation error message #7313 [arrow] (wkalt)
- chore: remove trailing space in debug print #7311 [arrow] (xxchan)
- Improve concat performance, and add append_array for some array builder implementations #7309 [arrow] (rluvaton)
- feat: add append_buffer for NullBufferBuilder #7308 [arrow] (rluvaton)
- MINOR: fix incorrect method name in deprecate node #7306 [arrow] (waynexia)
- Allow retrieving Parquet decryption keys using the key metadata #7286 [parquet] (adamreeve)
- Support different TimeUnits and timezones when reading Timestamps from INT96 #7285 [parquet] (mbutrovich)
- Add Parquet Modular encryption support (write) #7111 [parquet] (rok)
* This Changelog was automatically generated by github_changelog_generator