Releases: apache/arrow-rs
arrow 56.2.0
Changelog
56.2.0 (2025-09-19)
Implemented enhancements:
- [Variant] Add variant to arrow primitives for unsigned integers #8368
- [Variant] [Shredding] Support typed_access for `FixedSizeBinary` #8335
- [Variant] [Shredding] Support typed_access for `Utf8` and `BinaryView` #8333
- [Variant] [Shredding] Support typed_access for `Boolean` #8329
- Allow specifying projection in ParquetRecordBatchReader::try_new_with_row_groups #8326
- [Parquet] Expose predicates from RowFilter #8314
- [Variant] Use row-oriented builders in `cast_to_variant` #8310
- Use apache/arrow-dotnet for integration test #8294
- [Variant] Add `Variant::as_u*` #8283
- Add a way to modify WriterProperties #8273
- Don't truncate timestamps on display for Row #8265
- [Parquet] Add row group write with AsyncArrowWriter #8261
- [Parquet] Expose ArrowRowGroupWriter #8259
- [Parquet] Do not compress v2 data page when compress is bad quality #8256 [parquet]
- [Variant] Refactor `cast_to_variant` #8234
- [Variant]: Implement `DataType::Union` support for `cast_to_variant` kernel #8195
- [Variant]: Implement `DataType::Duration` support for `cast_to_variant` kernel #8194
- [Variant] Support typed access for numeric types in variant_get #8178
- [Parquet] Implement a "push style" API for decoding Parquet Metadata #8164
- [Variant] Support creating Variants with pre-existing Metadata #8152
- [Variant] Support Shredded Objects in `variant_get`: typed path access (STEP 1) #8150
- [Variant] Add `variant` feature to `parquet` crate #8132
- [Parquet] Concurrent writes with ArrowWriter.get_column_writers should parallelize across row groups #8115
- [Variant] Implement `VariantArray::value` for shredded variants #8091
- [Variant] Integration tests for reading parquet w/ Variants #8084
- [Variant]: Implement `DataType::Map` support for `cast_to_variant` kernel #8063
- [Variant]: Implement `DataType::List/LargeList` support for `cast_to_variant` kernel #8060
Fixed bugs:
- Casting floating point numbers fails for Decimal64 but works for other variants #8362
- [Variant] cast_to_variant conflates empty map with NULL #8289
- [Avro] Decoder flush panics for map whose value field contains metadata #8270
- Parquet: Avoid page size exceeds i32::MAX #8263 [parquet]
- [Avro] Decoder panics on flush when schema contains map whose value is non-nullable #8253
- Avro nullable field decode failure leads to panic upon decoder flush #8212
- Avro to arrow schema conversion fails when a field has a default type that is not string #8209
- parquet: No method named `to_ne_bytes` found for struct `bloom_filter::Block` for target `s390x-unknown-linux-gnu` #8207
- [Variant] cast_to_variant will panic on certain `Date64` or Timestamp values #8155
- Parquet: Avoid page-size overflows i32 #8264 [parquet] (mapleFU)
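The two page-size entries above (#8263, #8264) concern sizes that no longer fit in an `i32`, the width the Parquet page header uses for its size fields. A minimal sketch of that kind of guard, using only the standard library; `checked_page_size` and `PageTooLarge` are hypothetical names for illustration and are not the parquet crate's API:

```rust
use std::fmt;

/// Hypothetical error type for this sketch; not a type from the parquet crate.
#[derive(Debug, PartialEq)]
struct PageTooLarge {
    size: usize,
}

impl fmt::Display for PageTooLarge {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "page size {} exceeds i32::MAX ({})", self.size, i32::MAX)
    }
}

/// Convert a buffer length into the i32 stored in a page header,
/// returning an error instead of silently wrapping on overflow.
fn checked_page_size(size: usize) -> Result<i32, PageTooLarge> {
    i32::try_from(size).map_err(|_| PageTooLarge { size })
}

fn main() {
    // A small page converts without loss.
    assert_eq!(checked_page_size(1024), Ok(1024));

    // One byte past i32::MAX is rejected rather than wrapped to a negative value.
    let too_big = i32::MAX as usize + 1;
    match checked_page_size(too_big) {
        Ok(n) => println!("fits: {n}"),
        Err(e) => println!("error: {e}"),
    }
}
```

Returning a `Result` leaves the caller free to split the page or surface the error, whereas a plain `as i32` cast would truncate silently.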
Documentation updates:
- Update docstring comment for Writer::write() in writer.rs #8267 [arrow] (YKoustubhRao)
Closed issues:
- comfy-table release 7.2.0 breaks MSRV #8243
- [Variant] Add `Variant::as_f16` #8228
- Support appending raw bytes to variant objects and lists #8217
- `VariantArrayBuilder` uses `ParentState` for simpler rollbacks #8205
- Make `ObjectBuilder::finish` signature infallible #8184
- Improve performance of `i256` to `f64` #8013
Merged pull requests:
- [Variant] Support Variant to PrimitiveArrow for unsigned integer #8369 (klion26)
- [Variant] [Shredding] Support typed_access for Utf8 and BinaryView #8364 [parquet] (petern48)
- Fix casting floats to Decimal64 #8363 [arrow] (AdamGS)
- [Variant] Implement new VariantValueArrayBuilder #8360 (scovich)
- [Variant] Add constants for empty variant metadata #8359 (scovich)
- [Variant] Allow lossless casting from integer to floating point #8357 (scovich)
- [Variant] Minor code cleanups #8356 (scovich)
- [Variant] Remove unused metadata from variant ShreddingState #8355 (scovich)
- Adds Map & Enum support, round-trip & benchmark tests #8353 [arrow] (nathaniel-d-ef)
- [Variant] [Shredding] feat: Support typed_access for FixedSizeBinary #8352 (petern48)
- Add arrow-avro Reader support for Dense Union and Union resolution (Part 1) #8348 [arrow] (jecsand838)
- [Variant] feat: Support typed_access for Boolean #8346 (Weijun-H)
- [Variant] Make VariantToArrowRowBuilder an enum #8345 (scovich)
- [Variant] Rename VariantShreddingRowBuilder to VariantToArrowRowBuilder #8344 (scovich)
- [Variant] Add tests for variant_get requesting Some struct #8343 (scovich)
- [Variant] Add nullable arg to StructArrayBuilder::with_field #8342 (scovich)
- Minor: avoid an `Arc::clone` in CacheOptions for Parquet PredicateCache #8338 [parquet] (alamb)
- Fix `can_cast_types` for temporal to `Utf8View` #8328 [arrow] ([findepi](https://github.c...
arrow 56.1.0
Changelog
56.1.0 (2025-08-21)
Implemented enhancements:
- Implement cast and other operations on decimal32 and decimal64 #7815 #8204 [arrow]
- Speed up Parquet filter pushdown with predicate cache #8203 [parquet]
- Optionally read parquet page indexes #8070 [parquet]
- Parquet reader: add method for sync reader read bloom filter #8023 [parquet]
- [parquet] Support writing logically equivalent types to `ArrowWriter` #8012 [parquet]
- Improve StringArray(Utf8) sort performance #7847 [arrow]
- feat: arrow-ipc delta dictionary support #8001 [arrow] (JakeDern)
Fixed bugs:
- The Rustdocs are clean CI job is failing #8175
- [avro] Bug in resolving avro schema with named type #8045 [arrow]
- Doc test failure (test arrow-avro/src/lib.rs - reader) when verifying avro 56.0.0 RC1 release #8018 [arrow]
Documentation updates:
- arrow-row: Document dictionary handling #8168 [arrow] (alamb)
- Docs: Clarify that Array::value does not check for nulls #8065 [arrow] (alamb)
- docs: Fix a typo in README #8036 (EricccTaiwan)
- Add more comments to the internal parquet reader #7932 [parquet] (alamb)
Performance improvements:
- perf(arrow-ipc): avoid counting nulls in `RecordBatchDecoder` #8127 [arrow] (rluvaton)
- Use `Vec` directly in builders #7984 [arrow] (liamzwbao)
- Improve StringArray(Utf8) sort performance (~2-4x faster) #7860 [arrow] (zhuqi-lucas)
Closed issues:
- [Variant] Improve fuzz test for Variant #8199
- [Variant] Improve fuzz test for Variant #8198
- `VariantArrayBuilder` tracks starting offsets instead of (offset, len) pairs #8192
- Rework `ValueBuilder` API to work with `ParentState` for reliable nested rollbacks #8188
- [Variant] Rename `ValueBuffer` as `ValueBuilder` #8186
- [Variant] Refactor `ParentState` to track and rollback state on behalf of its owning builder #8182
- [Variant] `ObjectBuilder` should detect duplicates at insertion time, not at finish #8180
- [Variant] ObjectBuilder does not reliably check for duplicates #8170
- [Variant] Support `StringView` and `LargeString` in `batch_json_string_to_variant` #8145 [parquet]
- [Variant] Rename `batch_json_string_to_variant` and `batch_variant_to_json_string` json_to_variant #8144 [parquet]
- [avro] Use `tempfile` crate rather than custom temporary file generator in tests #8143 [arrow]
- [Avro] Use `Write` rather than `dyn Write` in Decoder #8142 [arrow]
- [Variant] Nested builder rollback is broken #8136
- [Variant] Add support for the remaining primitive types (timestamp_nanos/timestampntz_nanos/uuid) for parquet variant #8126
- Meta: Implement missing Arrow 56.0 lint rules - Sequential workflow #8121
- ARROW-012-015: Add linter rules for remaining Arrow 56.0 breaking changes #8120
- ARROW-010 & ARROW-011: Add linter rules for Parquet Statistics and Metadata API removals #8119
- ARROW-009: Add linter rules for IPC Dictionary API removals in Arrow 56.0 #8118
- ARROW-008: Add linter rule for SerializedPageReaderState usize→u64 breaking change #8117
- ARROW-007: Add linter rule for Schema.all_fields() removal in Arrow 56.0 #8116
- [Variant] Implement `ShreddingState::AllNull` variant #8088 [parquet]
- [Variant] Support Shredded Objects in `variant_get` #8083 [parquet]
- [Variant]: Implement `DataType::RunEndEncoded` support for `cast_to_variant` kernel #8064 [parquet]
- [Variant]: Implement `DataType::Dictionary` support for `cast_to_variant` kernel #8062 [parquet]
- [Variant]: Implement `DataType::Struct` support for `cast_to_variant` kernel #8061 [parquet]
- [Variant]: Implement `DataType::Decimal32/Decimal64/Decimal128/Decimal256` support for `cast_to_variant` kernel #8059 [parquet]
- [Variant]: Implement `DataType::Timestamp(..)` support for `cast_to_variant` kernel #8058 [parquet]
- [Variant]: Implement `DataType::Float16` support for `cast_to_variant` kernel #8057 [parquet]
- [Variant]: Implement `DataType::Interval` support for `cast_to_variant` kernel #8056 [parquet]
- [Variant]: Implement `DataType::Time32/Time64` support for `cast_to_variant` kernel #8055 [parquet]
- [Variant]: Implement `DataType::Date32 / DataType::Date64` support for `cast_to_variant` kernel #8054 [parquet]
- [Variant]: Implement `DataType::Null` support for `cast_to_variant` kernel #8053 [parquet]
- [Variant]: Implement `DataType::Boolean` support for `cast_to_variant` kernel #8052 [parquet]
- [Variant]: Implement `DataType::FixedSizeBinary` support for `cast_to_variant` kernel #8051 [parquet]
- [Va...
arrow 56.0.0
Changelog
56.0.0 (2025-07-29)
Breaking changes:
- arrow-schema: Remove dict_id from being required equal for merging #7968 [arrow] (brancz)
- [Parquet] Use `u64` for `SerializedPageReaderState.offset` & `remaining_bytes`, instead of `usize` #7918 [parquet] (JigaoLuo)
- Upgrade tonic dependencies to 0.13.0 version (try 2) #7839 [arrow] [arrow-flight] (alamb)
- Remove deprecated Arrow functions #7830 [arrow] [arrow-flight] (etseidl)
- Remove deprecated temporal functions #7813 [arrow] (etseidl)
- Remove functions from parquet crate deprecated in or before 54.0.0 #7811 [parquet] (etseidl)
- GH-7686: [Parquet] Fix int96 min/max stats #7687 [parquet] (rahulketch)
Implemented enhancements:
- [parquet] Relax type restriction to allow writing dictionary/native batches for same column #8004
- Support casting int64 to interval #7988 [arrow]
- [Variant] Add `ListBuilder::with_value` for convenience #7951 [parquet]
- [Variant] Add `ObjectBuilder::with_field` for convenience #7949 [parquet]
- [Variant] Impl PartialEq for VariantObject #7943 #7948
- [Variant] Offer `simdutf8` as an optional dependency when validating metadata #7902 [parquet] [arrow]
- [Variant] Avoid collecting offset iterator #7901 [parquet]
- [Variant] Remove superfluous check when validating monotonic offsets #7900 [parquet]
- [Variant] Avoid extra allocation in `ObjectBuilder` #7899 [parquet]
- [Variant][Compute] `variant_get` kernel #7893 [parquet]
- [Variant][Compute] Add batch processing for Variant-JSON String conversion #7883 [parquet]
- Support `MapArray` in lexsort #7881 [arrow]
- [Variant] Add testing for invalid variants (fuzz testing??) #7842 [parquet]
- [Variant] VariantMetadata, VariantList and VariantObject are too big for Copy #7831 [parquet]
- Allow choosing flate2 backend #7826 [parquet]
- [Variant] Tests for creating "large" `VariantObject`s #7821 [parquet]
- [Variant] Tests for creating "large" `VariantList`s #7820 [parquet]
- [Variant] Support VariantBuilder to write to buffers owned by the caller #7805 [parquet]
- [Variant] Move JSON related functionality to different crate. #7800 [parquet]
- [Variant] Add flag in `ObjectBuilder` to control validation behavior on duplicate field write #7777 [parquet]
- [Variant] make `serde_json` an optional dependency of `parquet-variant` #7775 [parquet]
- [coalesce] Implement specialized `BatchCoalescer::push_batch` for `PrimitiveArray` #7763 [arrow]
- Add sort_kernel benchmark for StringViewArray case #7758 [arrow]
- [Variant] Improved API for accessing Variant Objects and lists #7756 [parquet]
- Buildable reproducible release builds #7751
- Allow per-column parquet dictionary page size limit #7723 [parquet]
- [Variant] Test and implement efficient building for "large" Arrays #7699 [parquet]
- [Variant] Improve VariantBuilder when creating field name dictionaries / sorted dictionaries #7698 [parquet]
- [Variant] Add input validation in `VariantBuilder` #7697 [parquet]
- [Variant] Support Nested Data in `VariantBuilder` #7696 [parquet]
- Parquet: Incorrect min/max stats for int96 columns #7686 [parquet]
- Add `DictionaryArray::gc` method #7683 [arrow]
- [Variant] Add negative tests for reading invalid primitive variant values #7645 [parquet]
Fixed bugs:
- [Variant] Panic when appending nested objects to VariantBuilder #7907 [parquet]
- Panic when casting large Decimal256 to f64 due to unchecked `unwrap()` #7886 [arrow]
- Incorrect inlined string view comparison after "Add prefix compare for inlined" #7874 [parquet] [arrow]
- [Variant] `test_json_to_variant_object_very_large` takes over 20s #7872 [parquet]
- [Variant] If `ObjectBuilder::finalize` is not called, the resulting Variant object is malformed. #7863 [parquet]
- CSV error message has values transposed #7848 [arrow]
- Concating struct arrays with no fields unnecessarily errors #7828 [arrow]
- Clippy CI is failing on ...
arrow 55.2.0
Changelog
55.2.0 (2025-06-22)
Implemented enhancements:
- Do not populate nulls for `NullArray` for `MutableArrayData` #7725
- Implement `PartialEq` for RunArray #7691
- `interleave_views` is really slow #7688 [arrow]
- Add min max aggregates for FixedSizeBinary #7674 [arrow]
- Deliver pyarrow as a standalone crate #7668 [arrow]
- [Variant] Implement `VariantObject::field` and `VariantObject::fields` #7665 [parquet]
- [Variant] Implement read support for remaining primitive types #7630 [parquet]
- Fast and ergonomic method to add metadata to a `RecordBatch` #7628 [arrow]
- Add efficient way to change the keys of string dictionary builder #7610 [arrow]
- Support `add_nulls` on additional builder types #7605 [arrow]
- Add `into_inner` for `AsyncArrowWriter` #7603 [parquet]
- Optimize `PrimitiveBuilder::append_trusted_len_iter` #7591 [arrow]
- Benchmark for filter+concat and take+concat into even sized record batches #7589 [arrow]
- `max_statistics_truncate_length` is ignored when writing statistics to data page headers #7579 [parquet]
- Feature Request: Encoding in `parquet-rewrite` #7575 [parquet]
- Add a `strong_count` method to `Buffer` #7568 [arrow]
- Create version of LexicographicalComparator that compares fixed number of columns #7531 [arrow]
- parquet-show-bloom-filter should work with integer typed columns #7528 [parquet]
- Allow merging primitive dictionary values in concat and interleave kernels #7518 [arrow]
- Add efficient concatenation of StructArrays #7516 [arrow]
- Rename `flight-sql-experimental` to `flight-sql` #7498 [arrow] [arrow-flight]
- Consider moving from ryu to lexical-core for string formatting / casting floats to string. #7496
- Arithmetic kernels can be safer and faster #7494 [arrow]
- Speedup `filter_bytes` by precalculating capacity #7465 [arrow]
- [Variant]: Rust API to Create Variant Values #7424 [parquet] [arrow]
- [Variant] Rust API to Read Variant Values #7423 [arrow]
- Release arrow-rs / parquet Minor version `55.1.0` (May 2025) #7393 [parquet]
- Support create_random_array for Decimal data types #7343 [arrow]
- Truncate Parquet page data page statistics #7555 [parquet] (etseidl)
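The `filter_bytes` entry above (#7465) credits its speedup to precalculating capacity. The general idea can be sketched with the standard library alone; `gather_bytes` is a hypothetical function written for illustration and is not the arrow kernel itself:

```rust
/// Gather the selected byte slices into one contiguous buffer.
/// The total length is computed first so the output is allocated once.
fn gather_bytes(values: &[&[u8]], indices: &[usize]) -> Vec<u8> {
    // First pass: sum the lengths of the selected values.
    let capacity: usize = indices.iter().map(|&i| values[i].len()).sum();

    // Second pass: copy into a buffer that never needs to grow.
    let mut out = Vec::with_capacity(capacity);
    for &i in indices {
        out.extend_from_slice(values[i]);
    }
    out
}

fn main() {
    let values: [&[u8]; 3] = [b"arrow", b"parquet", b"rs"];
    let out = gather_bytes(&values, &[2, 0, 2]);
    assert_eq!(out, b"rsarrowrs");
    println!("{}", String::from_utf8_lossy(&out));
}
```

The extra pass over the indices is cheap compared with repeated reallocation and copying when the output grows incrementally.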
Fixed bugs:
- In arrow_json, Decoder::decode can panic if it encounters two high surrogates in a row. #7712
- FlightSQL "GetDbSchemas" and "GetTables" schemas do not fully match the protocol #7637 [arrow] [arrow-flight]
- Cannot read encrypted Parquet file if page index reading is enabled #7629 [parquet]
- `encoding_stats` not present in Parquet generated by `parquet-rewrite` #7616 [parquet]
- When writing parquet plaintext footer files `footer_signing_key_metadata` is not included, encryption algorithm is always written in footer #7599 [parquet]
- `new_null_array` panics when constructing a struct of a dictionary #7571
- Parquet derive fails to build when Result is aliased #7547
- Unable to read `Dictionary(u8, FixedSizeBinary(_))` using datafusion. #7545 [parquet]
- filter_record_batch panics with empty struct array. #7538 [arrow]
- Panic in `pretty_format` function when displaying DurationSecondsArray with `i64::MIN` / `i64::MAX` #7533 [arrow]
- Record API unable to parse TIME_MILLIS when encoded as INT32 #7510 [parquet]
- The `read_record_batch` func of the `RecordBatchDecoder` does not respect the `skip_validation` property #7508 [arrow]
- `arrow-55.1.0` breaks `filter_record_batch` #7500
- Files containing binary data with >=8_388_855 bytes per row written with `arrow-rs` can't be read with `pyarrow` #7489 [parquet]
- [Bug] Ingestion with Arrow Flight Sql panic when the input stream is empty or fallible #7329 [arrow] [arrow-flight]
- Ensure page encoding statistics are written to Parquet file #7643 [parquet] (etseidl)
Documentation updates:
- arrow_reader_row_filter benchmark doesn't capture page cache improvements #7460 [parquet] [arrow]
- chore: fix a typo in `ExtensionType::supports_data_type` docs #7682 [arrow] (mbrobbel)
- [Variant] Add variant docs and examples #7661 [parquet] (alamb)
- Minor: Add version to deprecation notice for `ParquetMetaDataReader::decode_foote...
arrow 55.1.0
Changelog
55.1.0 (2025-05-09)
Breaking changes:
- refactor!: do not default the struct array length to 0 in Struct::try_new #7247 [arrow] (westonpace)
Implemented enhancements:
- Add a way to get max `usize` from `OffsetSizeTrait` #7474 [arrow]
- Deterministic metadata encoding #7448 [arrow]
- Support Arrow type Dictionary with value FixedSizeBinary in Parquet #7445
- Parquet: Add ability to project rowid in parquet reader #7444
- Move parquet::file::metadata::reader::FooterTail to parquet::file::metadata so that it is public #7438 [parquet]
- Speedup take_bytes by precalculating capacity #7432 [arrow]
- Improve performance of interleave_primitive and interleave_bytes #7421 [arrow]
- Implement `Eq` and `Default` for `ScalarBuffer` #7411 [arrow]
- Add decryption support for column index and offset index #7390 [parquet]
- Support writing encrypted Parquet files with plaintext footers #7320 [parquet]
- Support Parquet key management tools #7256 [parquet]
- Verify footer tags when reading encrypted Parquet files with plaintext footers #7255 [parquet]
- StructArray::try_new behavior can be unexpected when there are no child arrays #7246 [arrow]
- Parquet performance: improve performance of reading int8/int16 #7097 [parquet]
Fixed bugs:
- StructArray::try_new validation incorrectly returns an error when `logical_nulls()` returns Some() && null_count == 0 #7435
- Reading empty DataPageV2 fails with `snappy: corrupt input (empty)` #7388 [parquet]
Documentation updates:
- Improve documentation and add examples for ArrowPredicateFn #7480 [parquet] (alamb)
- Document Arrow <--> Parquet schema conversion better #7479 [parquet] (alamb)
- Fix a typo in arrow/examples/README.md #7473 [arrow] (Mottl)
Closed issues:
- Refactor Parquet DecryptionPropertiesBuilder to fix use of unreachable #7476 [parquet]
- Implement `Eq` and `Default` for `OffsetBuffer` #7417 [arrow]
Merged pull requests:
- Add Parquet `arrow_reader` benchmarks for {u}int{8,16} columns #7484 [parquet] (alamb)
- fix: `rustdoc::unportable_markdown` was removed #7483 [arrow] [arrow-flight] (crepererum)
- Support round trip reading / writing Arrow `Duration` type to parquet #7482 [parquet] (Liyixin95)
- Add const MAX_OFFSET to OffsetSizeTrait #7478 [arrow] (thinkharderdev)
- Refactor Parquet DecryptionPropertiesBuilder #7477 [parquet] (adamreeve)
- Support parsing and display pretty for StructType #7469 [arrow] (goldmedal)
- chore(deps): update sysinfo requirement from 0.34.0 to 0.35.0 #7462 [parquet] (dependabot[bot])
- Verify footer tags when reading encrypted Parquet files with plaintext footers #7459 [parquet] (rok)
- Improve comments for avro #7449 [arrow] (kumarlokesh)
- feat: Support round trip reading/writing Arrow type `Dictionary(_, FixedSizeBinary(_))` to Parquet #7446 [parquet] (albertlockett)
- Fix out of bounds crash in RleValueDecoder #7441 [parquet] (apilloud)
- Make `FooterTail` public #7440 [parquet] (masonh22)
- Support writing encrypted Parquet files with plaintext footers #7439 [parquet] (rok)
- feat: deterministic metadata encoding #7437 [arrow] (timsaucer)
- Fix validation logic in `StructArray::try_new` to account for array.logical_nulls() returning Some() and null_count == 0 #7436 [arrow] (phillipleblanc)
- Minor: Fix typo in async_reader comment #7433 [parquet] (amoeba)
- feat: coerce fixed size binary to binary view #7431 [arrow] (chenkovsky)
- chore(deps): update brotli requirement from 7.0 to 8.0 #7430 [parquet] (dependabot[bot])
- Speedup take_bytes (-35% -69%) by precalculating capacity #7422 [arrow] (Dandandan)
- Improve performance of interleave_primitive (-15% - 45%) / interleave_bytes (-10-25%) #7420 [arrow] (Dandandan)
- Implement `Eq` and `Default` for `OffsetBuffer` #7418 [arrow] (kylebarron)
- Implement `Default` for `Buffer` & `ScalarBuffer` #7413 [arrow] ([emilk...
arrow 55.0.0
Changelog
55.0.0 (2025-04-08)
Breaking changes:
- Change Parquet API interaction to use `u64` (support files larger than 4GB in WASM) #7371 [parquet] (kylebarron)
- Remove `AsyncFileReader::get_metadata_with_options`, add `options` to `AsyncFileReader::get_metadata` #7342 [parquet] (corwinjoy)
- Parquet: Support reading Parquet metadata via suffix range requests #7334 [parquet] (kylebarron)
- Upgrade `object_store` to `0.12.0` #7328 [parquet] (mbrobbel)
- Upgrade `pyo3` to `0.24` #7324 [arrow] (mbrobbel)
- Reapply Box `FlightError::tonic` to reduce size (fixes nightly clippy) #7277 [arrow] [arrow-flight] (alamb)
- Improve parquet gzip compression performance using zlib-rs #7200 [parquet] (psvri)
- Fix: `date_part` to extract only the requested part (not the overall interval) #7189 [arrow] (delamarch3)
- chore: upgrade flatbuffer version to `25.2.10` #7134 [arrow] (tisonkun)
- Add hooks to json encoder to override default encoding or add support for unsupported types #7015 [arrow] (adriangb)
Implemented enhancements:
- Improve the performance of `concat` #7357 [arrow]
- Pushdown predictions to Parquet in-memory row group fetches #7348 [parquet]
- Improve CSV parsing errors: Print the row that makes csv parsing fails #7344 [arrow]
- Support ColumnMetaData `encoding_stats` in Parquet Writing #7341 [parquet]
- Support writing Parquet with modular encryption #7327 [parquet]
- Parquet Use U64 Instead of Usize (wasm support for files greater than 4GB) #7238 [parquet]
- Support different TimeUnits and timezones when reading Timestamps from INT96 #7220 [parquet]
Fixed bugs:
- New clippy failures in code base with release of rustc 1.86 #7381 [parquet] [arrow]
- Fix bug in `ParquetMetaDataReader` and add test of suffix metadata reads with encryption #7372 [parquet] (etseidl)
Documentation updates:
- Improve documentation on `ArrayData::offset` #7385 [arrow] (alamb)
- Improve documentation for `AsyncFileReader::get_metadata` #7380 [parquet] (alamb)
- Improve documentation on implementing Parquet predicate pushdown #7370 [parquet] (alamb)
- Add documentation and examples for pretty printing, make `pretty_format_columns_with_options` pub #7346 [arrow] (alamb)
- Improve documentation on writing parquet, including multiple threads #7321 [parquet] (alamb)
Merged pull requests:
- chore: apply clippy suggestions newly introduced in rust 1.86 #7382 [parquet] [arrow] (westonpace)
- bench: add more {boolean, string, int} benchmarks for concat kernel #7376 [arrow] (rluvaton)
- Add more examples of using Parquet encryption #7374 [parquet] (adamreeve)
- Clean up `ArrowReaderMetadata::load_async` #7369 [parquet] (etseidl)
- bump pyo3 for RUSTSEC-2025-0020 #7368 [arrow] (onursatici)
- Test int96 Parquet file from Spark #7367 [parquet] (mbutrovich)
- fix: respect offset/length when converting ArrayData to StructArray #7366 [arrow] (westonpace)
- Print row, data present, expected type, and row number in error messages for arrow-csv #7361 [arrow] (psiayn)
- Use rust builtins for round_upto_multiple_of_64 and ceil #7358 [arrow] (psvri)
- Write parquet PageEncodingStats #7354 [parquet] (jhorstmann)
- Move `sysinfo` to `dev-dependencies` #7353 [parquet] (mbrobbel)
- chore(deps): update sysinfo requirement from 0.33.0 to 0.34.0 #7352 [parquet] (dependabot[bot])
- Add additional benchmarks for utf8view comparison kernels #7351 [arrow] (zhuqi-lucas)
- Upgrade to twox-hash 2.0 #7347 [parquet] (alamb)
- refactor: apply borrowed chunk reader to Sbbf::read_from_column_chunk #7345 [parquet] (ethe)
- Merge changelog and version from 54.3.1 into main #7340 [parquet] [arrow] (timsaucer)
- Remove `object-store` label from `.asf.yaml` #7339 (mbrobbel)
...
arrow 54.3.1
Changelog
54.3.1 (2025-03-26)
Fixed bugs:
- Round trip encoding of list of fixed list fails when offset is not zero #7315
Merged pull requests:
- Add missing type annotation #7326 [parquet] (mbrobbel)
- bugfix: correct offsets when serializing a list of fixed sized list and non-zero start offset #7318 [arrow] (timsaucer)
* This Changelog was automatically generated by github_changelog_generator
arrow 54.3.0
Changelog
54.3.0 (2025-03-17)
Implemented enhancements:
- Using column chunk offset index in `InMemoryRowGroup::fetch` #7300
- Support reading parquet with modular encryption #7296 [parquet]
- Add example for how to read/write encrypted parquet files #7281 [parquet]
- Have writer return parsed `ParquetMetadata` #7254 [parquet]
- feat: Support Utf8View in JSON reader #7244 [arrow]
- StructBuilder should provide a way to get a &dyn ArrayBuilder of a field builder #7193 [arrow]
- Support div_wrapping/rem_wrapping for numeric arithmetic kernels #7158 [arrow]
- Improve RleDecoder performance #7195 [parquet] (Dandandan)
- Improve arrow-json deserialization performance by 30% #7157 [arrow] (mwylde)
- Add `with_skip_validation` flag to IPC `StreamReader`, `FileReader` and `FileDecoder` #7120 [arrow] (alamb)
Fixed bugs:
- Archery integration CI test is failing on main: error: package `half v2.5.0` cannot be built because it requires rustc 1.81 or newer, while the currently active rustc version is 1.77.2 #7291
- MSRV CI check is failing on main #7289
- Incorrect IPC schema encoding for multiple dictionaries #7058 [arrow] [arrow-flight]
Documentation updates:
- Add example for how to read encrypted parquet files #7283 [parquet] (rok)
- Update the relative path of the test data in docs #7221 (Ziy1-Tan)
- Minor: fix doc and remove unused code #7194 [arrow] (lewiszlw)
- doc: modify wrong comment #7190 [arrow] (YichiZhang0613)
- doc: fix IPC file reader/writer docs #7178 [arrow] (Jefffrey)
Merged pull requests:
- chore: require ffi feature in arrow-schema benchmark #7298 [arrow] (ethe)
- Fix archery integration test #7292 (alamb)
- Minor: run `test_decimal_list` again #7282 [parquet] (alamb)
- Move Parquet encryption tests into the arrow_reader integration tests #7279 [parquet] (adamreeve)
- Include license and notice files in published crates, part 2 #7275 [arrow] (ankane)
- feat: Support Utf8View in JSON reader #7263 [arrow] (zhuqi-lucas)
- feat: use `force_validate` feature flag when creating arrays #7241 [arrow] (rluvaton)
- fix: take on empty struct array returns empty array #7224 [arrow] (westonpace)
- fix: correct `bloom_filter_position` description #7223 [parquet] (romanz)
- Minor: Move `make_builder` into mod.rs #7218 (lewiszlw)
- Expose `field_builders` in `StructBuilder` #7217 [arrow] (lewiszlw)
- Minor: Fix json StructMode docs links #7215 [arrow] (gstvg)
- [main] Bump arrow version to 54.2.1 (#7207) #7212 (alamb)
- feat: add `downcast_integer_array` macro helper #7211 [arrow] (rluvaton)
- Remove zstd pin #7199 [parquet] (tustvold)
- fix: Use chrono's quarter() to avoid conflict #7198 [arrow] (yutannihilation)
- Fix some Clippy 1.85 warnings #7167 [parquet] [arrow] (mbrobbel)
- feat: add to concat different data types error message the data types #7166 [arrow] (rluvaton)
- Add Week ISO, Year ISO computation #7163 [arrow] (kosiew)
- fix: create_random_batch fails with timestamp types having a timezone #7162 [arrow] (niebayes)
- Avoid overflow of remainder #7159 [arrow] (wForget)
- fix: Data type inference for NaN, inf and -inf in csv files #7150 [arrow] (Mottl)
- Preserve null dictionary values in `interleave` and `concat` kernels #7144 [arrow] (kawadakk)
- Support casting `Date` to a time zone-specific timestamp #7141 [arrow] (friendlymatthew)
- Minor: Add doctest to ArrayDataBuilder::build_unchecked #7139 [arrow] (gstvg)
- arrow-ord: add support for nested types to `partition` #7131 [arrow] (asubiotto)
- Update prost-build requirement from =0.13.4 to =0.13.5 #7127 [arrow] [arrow-flight] (dependabot[bot])
- Avoid use of `flatbuffers...
object_store 0.12.0
Changelog
object_store_0.12.0 (2025-03-05)
Breaking changes:
- feat: add `Extensions` to object store `PutMultipartOpts` #7214 [object-store] (crepererum)
- feat: add `Extensions` to object store `PutOptions` #7213 [object-store] (crepererum)
- chore: enable conditional put by default for S3 #7181 [object-store] (meteorgan)
- feat: add `Extensions` to object store `GetOptions` #7170 [object-store] (crepererum)
- feat(object_store): Override DNS Resolution to Randomize IP Selection #7123 [object-store] (crepererum)
- Use `u64` range instead of `usize`, for better wasm32 support #6961 [object-store] (XiangpengHao)
- object_store: Add enabled-by-default "fs" feature #6636 [object-store] (Turbo87)
- Return `BoxStream` with `'static` lifetime from `ObjectStore::list` #6619 [object-store] (kylebarron)
- object_store: Migrate from snafu to thiserror #6266 [object-store] (Turbo87)
Implemented enhancements:
- Object Store: S3 IP address selection is biased #7117 [object-store]
- object_store: GCSObjectStore should derive Clone #7113 [object-store]
- Remove all RCs after release #7059 [object-store]
- LocalFileSystem::list_with_offset is very slow over network file system #7018 [object-store]
- Release object store `0.11.2` (non API breaking) Around Dec 15 2024 #6902 [object-store]
Fixed bugs:
- LocalFileSystem errors with satisfiable range request #6749 [object-store]
Merged pull requests:
- ObjectStore WASM32 Support #7226 [object-store] (tustvold)
- [main] Bump arrow version to 54.2.1 (#7207) #7212 (alamb)
- Decouple ObjectStore from Reqwest #7183 [object-store] (tustvold)
- object_store: Disable all compression formats in HTTP reqwest client #7143 [object-store] (kylewlacy)
- refactor: remove unused `async` from `InMemory::entry` #7133 [object-store] (crepererum)
- object_store/gcp: derive Clone for GoogleCloudStorage #7112 [object-store] (james-rms)
- Update version to 54.2.0 and add CHANGELOG #7110 (alamb)
- Remove all RCs after release #7060 [object-store] (kou)
- Update release schedule README.md #7053 (alamb)
- Create GitHub releases automatically on tagging #7042 (kou)
- Change Log On Successful S3 Copy / Multipart Upload to Debug #7033 [object-store] (diptanu)
- Prepare for `54.1.0` release #7031 (alamb)
- Add a custom implementation `LocalFileSystem::list_with_offset` #7019 [object-store] (corwinjoy)
- Improve docs for `AmazonS3Builder::from_env` #6977 [object-store] (kylebarron)
- Fix WASM CI for Rust 1.84 release #6963 (alamb)
- Update itertools requirement from 0.13.0 to 0.14.0 in /object_store #6925 [object-store] (dependabot[bot])
- Fix LocalFileSystem with range request that ends beyond end of file #6751 [object-store] (kylebarron)
arrow 54.2.1
Changelog
54.2.1 (2025-02-27)
Fixed bugs:
- Use chrono >= 0.4.34, < 0.4.40 to avoid breaking #7210