arrow 55.2.0
Changelog
55.2.0 (2025-06-22)
Implemented enhancements:
- Do not populate nulls for `NullArray` for `MutableArrayData` #7725
- Implement `PartialEq` for RunArray #7691
- `interleave_views` is really slow #7688 [arrow]
- Add min max aggregates for FixedSizeBinary #7674 [arrow]
- Deliver pyarrow as a standalone crate #7668 [arrow]
- [Variant] Implement `VariantObject::field` and `VariantObject::fields` #7665 [parquet]
- [Variant] Implement read support for remaining primitive types #7630 [parquet]
- Fast and ergonomic method to add metadata to a `RecordBatch` #7628 [arrow]
- Add efficient way to change the keys of string dictionary builder #7610 [arrow]
- Support `add_nulls` on additional builder types #7605 [arrow]
- Add `into_inner` for `AsyncArrowWriter` #7603 [parquet]
- Optimize `PrimitiveBuilder::append_trusted_len_iter` #7591 [arrow]
- Benchmark for filter+concat and take+concat into even sized record batches #7589 [arrow]
- `max_statistics_truncate_length` is ignored when writing statistics to data page headers #7579 [parquet]
- Feature Request: Encoding in `parquet-rewrite` #7575 [parquet]
- Add a `strong_count` method to `Buffer` #7568 [arrow]
- Create version of LexicographicalComparator that compares fixed number of columns #7531 [arrow]
- parquet-show-bloom-filter should work with integer typed columns #7528 [parquet]
- Allow merging primitive dictionary values in concat and interleave kernels #7518 [arrow]
- Add efficient concatenation of StructArrays #7516 [arrow]
- Rename `flight-sql-experimental` to `flight-sql` #7498 [arrow] [arrow-flight]
- Consider moving from ryu to lexical-core for string formatting / casting floats to string. #7496
- Arithmetic kernels can be safer and faster #7494 [arrow]
- Speedup `filter_bytes` by precalculating capacity #7465 [arrow]
- [Variant]: Rust API to Create Variant Values #7424 [parquet] [arrow]
- [Variant] Rust API to Read Variant Values #7423 [arrow]
- Release arrow-rs / parquet Minor version `55.1.0` (May 2025) #7393 [parquet]
- Support create_random_array for Decimal data types #7343 [arrow]
- Truncate Parquet page data page statistics #7555 [parquet] (etseidl)
Fixed bugs:
- In arrow_json, Decoder::decode can panic if it encounters two high surrogates in a row. #7712
- FlightSQL "GetDbSchemas" and "GetTables" schemas do not fully match the protocol #7637 [arrow] [arrow-flight]
- Cannot read encrypted Parquet file if page index reading is enabled #7629 [parquet]
- `encoding_stats` not present in Parquet generated by `parquet-rewrite` #7616 [parquet]
- When writing parquet plaintext footer files `footer_signing_key_metadata` is not included, encryption algorithm is always written in footer #7599 [parquet]
- `new_null_array` panics when constructing a struct of a dictionary #7571
- Parquet derive fails to build when Result is aliased #7547
- Unable to read `Dictionary(u8, FixedSizeBinary(_))` using datafusion. #7545 [parquet]
- filter_record_batch panics with empty struct array. #7538 [arrow]
- Panic in `pretty_format` function when displaying DurationSecondsArray with `i64::MIN` / `i64::MAX` #7533 [arrow]
- Record API unable to parse TIME_MILLIS when encoded as INT32 #7510 [parquet]
- The `read_record_batch` func of the `RecordBatchDecoder` does not respect the `skip_validation` property #7508 [arrow]
- `arrow-55.1.0` breaks `filter_record_batch` #7500
- Files containing binary data with >=8_388_855 bytes per row written with `arrow-rs` can't be read with `pyarrow` #7489 [parquet]
- [Bug] Ingestion with Arrow Flight Sql panic when the input stream is empty or fallible #7329 [arrow] [arrow-flight]
- Ensure page encoding statistics are written to Parquet file #7643 [parquet] (etseidl)
Documentation updates:
- arrow_reader_row_filter benchmark doesn't capture page cache improvements #7460 [parquet] [arrow]
- chore: fix a typo in `ExtensionType::supports_data_type` docs #7682 [arrow] (mbrobbel)
- [Variant] Add variant docs and examples #7661 [parquet] (alamb)
- Minor: Add version to deprecation notice for `ParquetMetaDataReader::decode_footer` #7639 [parquet] (etseidl)
- Add references for defaults in `WriterPropertiesBuilder` #7558 [parquet] (etseidl)
- Clarify Docs: NullBuffer::len is in bits #7556 [arrow] (alamb)
- docs: fix typo for `Decimal128Array` #7525 [arrow] (burmecia)
- Minor: Add examples to ProjectionMask documentation #7523 [parquet] (alamb)
- Improve documentation for Parquet `WriterProperties` #7491 [parquet] (alamb)
Closed issues:
- [Variant] More efficient determination of String vs ShortString #7700
- [Variant] Improve API for iterating over values of a VariantList #7685 [parquet]
- [Variant] Consider validating variants on creation (rather than read) #7684 [parquet]
- Miri test_native_type_pow test failing #7641 [arrow]
- Improve performance of `coalesce` and `concat` for views #7615 [arrow]
- Bad min value in row group statistics in some special cases #7593
- Feature Request: BloomFilter Position Flexibility in `parquet-rewrite` #7552 [parquet]
Merged pull requests:
- arrow-array: Implement PartialEq for RunArray #7727 [arrow] (brancz)
- fix: Do not add null buffer for `NullArray` in MutableArrayData #7726 [arrow] (comphead)
- fix JSON decoder error checking for UTF16 / surrogate parsing panic #7721 [arrow] (nicklan)
- [Variant] Introduce new type over &str for ShortString #7718 [parquet] (friendlymatthew)
- Split out variant code into several new sub-modules #7717 [parquet] (scovich)
- Support write to buffer api for SerializedFileWriter #7714 [parquet] (zhuqi-lucas)
- Make variant iterators safely infallible #7704 [parquet] (scovich)
- Speedup `interleave_views` (4-7x faster) #7695 [arrow] (Dandandan)
- Define an "arrow-pyarrow" crate to implement the "pyarrow" feature. #7694 [arrow] (brunal)
- Document REE row format and add some more tests #7680 [arrow] (alamb)
- feat: add min max aggregate support for FixedSizeBinary #7675 [arrow] (alexwilcoxson-rel)
- arrow-data: Add REE support for `build_extend` and `build_extend_nulls` #7671 [arrow] (brancz)
- Remove `lazy_static` dependency #7669 [arrow] (Expyron)
- Finish implementing Variant::Object and Variant::List #7666 [parquet] (scovich)
- Add `RecordBatch::schema_metadata_mut` and `Field::metadata_mut` #7664 [arrow] (emilk)
- [Variant] Simplify creation of Variants from metadata and value #7663 [parquet] (alamb)
- chore: group prost dependabot updates #7659 (mbrobbel)
- Initial Builder API for Creating Variant Values #7653 [parquet] (PinkCrow007)
- Add `BatchCoalescer::push_filtered_batch` and docs #7652 [arrow] (alamb)
- Optimize coalesce kernel for StringView (10-50% faster) #7650 [arrow] (alamb)
- arrow-row: Add support for REE #7649 [arrow] (brancz)
- Use approximate comparisons for pow tests #7646 [arrow] (adamreeve)
- [Variant] Implement read support for remaining primitive types #7644 [parquet] (superserious-dev)
- Add `pretty_format_batches_with_schema` function #7642 [arrow] (lewiszlw)
- Deprecate old Parquet page index parsing functions #7640 [parquet] (etseidl)
- Update FlightSQL `GetDbSchemas` and `GetTables` schemas to fully match the protocol #7638 [arrow] [arrow-flight] (sgrebnov)
- Minor: Remove outdated FIXME from `ParquetMetaDataReader` #7635 [parquet] (etseidl)
- Fix the error info of `StructArray::try_new` #7634 [arrow] (xudong963)
- Fix reading encrypted Parquet pages when using the page index #7633 [parquet] (adamreeve)
- [Variant] Add commented out primitive test cases #7631 [parquet] (alamb)
- Improve `coalesce` kernel tests #7626 [arrow] (alamb)
- Revert "Revert "Improve `coalesce` and `concat` performance for views… #7625 [arrow] (Dandandan)
- Revert "Improve `coalesce` and `concat` performance for views (#7614)" #7623 [arrow] (Dandandan)
- Improve coalesce_kernel benchmark to capture inline vs non inline views #7619 [arrow] (alamb)
- Improve `coalesce` and `concat` performance for views #7614 [arrow] (Dandandan)
- feat: add constructor to help efficiently upgrade key for GenericBytesDictionaryBuilder #7611 [arrow] (albertlockett)
- feat: support append_nulls on additional builders #7606 [arrow] (albertlockett)
- feat: add AsyncArrowWriter::into_inner #7604 [parquet] (jpopesculian)
- Move variant interop test to Rust integration test #7602 [parquet] (alamb)
- Include footer key metadata when writing encrypted Parquet with a plaintext footer #7600 [parquet] (rok)
- Add `coalesce` kernel and `BatchCoalescer` for statefully combining selected b…atches: #7597 [arrow] (alamb)
- Add FixedSizeBinary to `take_kernel` benchmark #7592 [arrow] (alamb)
- Fix GenericBinaryArray docstring. #7588 [arrow] (brunal)
- fix: error reading multiple batches of `Dict(_, FixedSizeBinary(_))` #7585 [parquet] (albertlockett)
- Revert "Minor: remove filter code deprecated in 2023 (#7554)" #7583 [arrow] (alamb)
- Fixed a build warning: function never used. #7577 [parquet] (JigaoLuo)
- Adding Encoding argument in `parquet-rewrite` #7576 [parquet] (JigaoLuo)
- feat: add `row_group_is_[max/min]_value_exact` to StatisticsConverter #7574 [parquet] (CookiePieWw)
- [array] Remove unwrap checks from GenericByteArray::value_unchecked #7573 [arrow] (ctsk)
- [benches/row_format] fix typo in array lengths #7572 [arrow] (ctsk)
- Add a strong_count method to Buffer #7569 [arrow] (westonpace)
- Minor: Enable byte view for clickbench benchmark #7565 [parquet] (zhuqi-lucas)
- Optimize length calculation in row encoding for fixed-length columns #7564 [arrow] (ctsk)
- Use PR title and description for commit message #7563 (kou)
- Use apache/arrow-{go,java,js} in integration test #7561 (kou)
- Implement Array Decoding in arrow-avro #7559 [arrow] (jecsand838)
- Minor: remove filter code deprecated in 2023 #7554 [arrow] (alamb)
- fix: Correct docs for `WriterPropertiesBuilder::set_column_index_truncate_length` #7553 [parquet] (etseidl)
- Adding Bloom Filter Position argument in parquet-rewrite #7550 [parquet] (JigaoLuo)
- Fix `Result` name collision in parquet_derive #7548 (jspaezp)
- Fix: Converted feature flight-sql-experimental to flight-sql #7546 [arrow] [arrow-flight] (kunalsinghdadhwal)
- Fix CI on main due to logical conflict #7542 [arrow] (alamb)
- Fix `filter_record_batch` panics with empty struct array #7539 [arrow] (thorfour)
- [Variant] Initial API for reading Variant data and metadata #7535 (mkarbo)
- fix: Panic in pretty_format function when displaying DurationSecondsA… #7534 [arrow] (zhuqi-lucas)
- Create version of LexicographicalComparator that compares fixed number of columns (~ -15%) #7530 [arrow] (Dandandan)
- Make parquet-show-bloom-filter work with integer typed columns #7529 [parquet] (adamreeve)
- chore(deps): update criterion requirement from 0.5 to 0.6 #7527 [parquet] [arrow] (mbrobbel)
- Minor: Add a parquet row_filter test, reduce some test boiler plate #7522 [parquet] (alamb)
- Refactor `build_array_reader` into a struct #7521 [parquet] (alamb)
- arrow: add concat structs benchmark #7520 [arrow] (asubiotto)
- arrow-select: add support for merging primitive dictionary values #7519 [arrow] (asubiotto)
- arrow-select: add support for optimized concatenation of struct arrays #7517 [arrow] (asubiotto)
- Fix Clippy in CI for Rust 1.87 release #7514 [parquet] [arrow] [arrow-flight] (alamb)
- Simplify `ParquetRecordBatchReader::next` control logic #7512 [parquet] (alamb)
- Fix record API support for reading INT32 encoded TIME_MILLIS #7511 [parquet] (njaremko)
- RecordBatchDecoder: skip RecordBatch validation when `skip_validation` property is enabled #7509 [arrow] (nilskch)
- Introduce `ReadPlan` to encapsulate the calculation of what parquet rows to decode #7502 [parquet] (alamb)
- Update documentation for ParquetReader #7501 [parquet] (alamb)
- Improve `Field` docs, add missing `Field::set_*` methods #7497 [arrow] (alamb)
- Speed up arithmetic kernels, reduce `unsafe` usage #7493 [arrow] (Dandandan)
- Prevent FlightSQL server panics for `do_put` when stream is empty or 1st stream element is an Err #7492 [arrow] [arrow-flight] (superserious-dev)
- arrow-ipc: add `StreamDecoder::schema` #7488 [arrow] (lidavidm)
- arrow-select: Implement concat for `RunArray`s #7487 [arrow] (brancz)
- [Variant] Add (empty) `parquet-variant` crate, update `parquet-testing` pin #7485 (alamb)
- Improve error messages if schema hint mismatches with parquet schema #7481 [parquet] [arrow] (alamb)
- Add `arrow_reader_clickbench` benchmark #7470 [parquet] (alamb)
- Speedup `filter_bytes` -20-40%, `filter_native` low selectivity (-37%) #7463 [arrow] (Dandandan)
- Update arrow_reader_row_filter benchmark to reflect ClickBench distribution #7461 [parquet] (alamb)
- Add Map support to arrow-avro #7451 [arrow] (jecsand838)
- Support Utf8View for Avro #7434 [arrow] (kumarlokesh)
- Add support for creating random Decimal128 and Decimal256 arrays #7427 [arrow] (Weijun-H)
* This Changelog was automatically generated by github_changelog_generator