Releases: apache/arrow-rs

arrow 56.2.0

23 Sep 13:14
ae8e6c6

Changelog

56.2.0 (2025-09-19)

Full Changelog

Implemented enhancements:

  • [Variant] Add variant to arrow primitives for unsigned integers #8368
  • [Variant] [Shredding] Support typed_access for FixedSizeBinary #8335
  • [Variant] [Shredding] Support typed_access for Utf8 and BinaryView #8333
  • [Variant] [Shredding] Support typed_access for Boolean #8329
  • Allow specifying projection in ParquetRecordBatchReader::try_new_with_row_groups #8326
  • [Parquet] Expose predicates from RowFilter #8314
  • [Variant] Use row-oriented builders in cast_to_variant #8310
  • Use apache/arrow-dotnet for integration test #8294
  • [Variant] Add Variant::as_u* #8283
  • Add a way to modify WriterProperties #8273
  • Don't truncate timestamps on display for Row #8265
  • [Parquet] Add row group write with AsyncArrowWriter #8261
  • [Parquet] Expose ArrowRowGroupWriter #8259
  • [Parquet] Do not compress v2 data page when compression quality is poor #8256 [parquet]
  • [Variant] Refactor cast_to_variant #8234
  • [Variant]: Implement DataType::Union support for cast_to_variant kernel #8195
  • [Variant]: Implement DataType::Duration support for cast_to_variant kernel #8194
  • [Variant] Support typed access for numeric types in variant_get #8178
  • [Parquet] Implement a "push style" API for decoding Parquet Metadata #8164
  • [Variant] Support creating Variants with pre-existing Metadata #8152
  • [Variant] Support Shredded Objects in variant_get: typed path access (STEP 1) #8150
  • [Variant] Add variant feature to parquet crate #8132
  • [Parquet] Concurrent writes with ArrowWriter.get_column_writers should parallelize across row groups #8115
  • [Variant] Implement VariantArray::value for shredded variants #8091
  • [Variant] Integration tests for reading parquet w/ Variants #8084
  • [Variant]: Implement DataType::Map support for cast_to_variant kernel #8063
  • [Variant]: Implement DataType::List/LargeList support for cast_to_variant kernel #8060

Fixed bugs:

  • Casting floating point numbers fails for Decimal64 but works for other variants #8362
  • [Variant] cast_to_variant conflates empty map with NULL #8289
  • [Avro] Decoder flush panics for map whose value field contains metadata #8270
  • Parquet: Avoid page size exceeding i32::MAX #8263 [parquet]
  • [Avro] Decoder panics on flush when schema contains map whose value is non-nullable #8253
  • Avro nullable field decode failure leads to panic upon decoder flush #8212
  • Avro to arrow schema conversion fails when a field has a default type that is not string #8209
  • parquet: No method named to_ne_bytes found for struct bloom_filter::Block for target s390x-unknown-linux-gnu #8207
  • [Variant] cast_to_variant will panic on certain Date64 or Timestamp values #8155
  • Parquet: Avoid page-size overflows i32 #8264 [parquet] (mapleFU)

Documentation updates:

Closed issues:

  • comfy-table release 7.2.0 breaks MSRV #8243
  • [Variant] Add Variant::as_f16 #8228
  • Support appending raw bytes to variant objects and lists #8217
  • VariantArrayBuilder uses ParentState for simpler rollbacks #8205
  • Make ObjectBuilder::finish signature infallible #8184
  • Improve performance of i256 to f64 #8013

Merged pull requests:

  • [Variant] Support Variant to PrimitiveArrow for unsigned integer #8369 (klion26)
  • [Variant] [Shredding] Support typed_access for Utf8 and BinaryView #8364 [parquet] (petern48)
  • Fix casting floats to Decimal64 #8363 [arrow] (AdamGS)
  • [Variant] Implement new VariantValueArrayBuilder #8360 (scovich)
  • [Variant] Add constants for empty variant metadata #8359 (scovich)
  • [Variant] Allow lossless casting from integer to floating point #8357 (scovich)
  • [Variant] Minor code cleanups #8356 (scovich)
  • [Variant] Remove unused metadata from variant ShreddingState #8355 (scovich)
  • Adds Map & Enum support, round-trip & benchmark tests #8353 [arrow] (nathaniel-d-ef)
  • [Variant] [Shredding] feat: Support typed_access for FixedSizeBinary #8352 (petern48)
  • Add arrow-avro Reader support for Dense Union and Union resolution (Part 1) #8348 [arrow] (jecsand838)
  • [Variant] feat: Support typed_access for Boolean #8346 (Weijun-H)
  • [Variant] Make VariantToArrowRowBuilder an enum #8345 (scovich)
  • [Variant] Rename VariantShreddingRowBuilder to VariantToArrowRowBuilder #8344 (scovich)
  • [Variant] Add tests for variant_get requesting Some struct #8343 (scovich)
  • [Variant] Add nullable arg to StructArrayBuilder::with_field #8342 (scovich)
  • Minor: avoid an Arc::clone in CacheOptions for Parquet PredicateCache #8338 [parquet] (alamb)
  • Fix can_cast_types for temporal to Utf8View #8328 [arrow] ([findepi](https://github.c...

arrow 56.1.0

25 Aug 11:56
76b75ee

Changelog

56.1.0 (2025-08-21)

Full Changelog

Implemented enhancements:

  • Implement cast and other operations on decimal32 and decimal64 #7815 #8204 [arrow]
  • Speed up Parquet filter pushdown with predicate cache #8203 [parquet]
  • Optionally read parquet page indexes #8070 [parquet]
  • Parquet reader: add method for sync reader read bloom filter #8023 [parquet]
  • [parquet] Support writing logically equivalent types to ArrowWriter #8012 [parquet]
  • Improve StringArray(Utf8) sort performance #7847 [arrow]
  • feat: arrow-ipc delta dictionary support #8001 [arrow] (JakeDern)

Fixed bugs:

  • The Rustdocs are clean CI job is failing #8175
  • [avro] Bug in resolving avro schema with named type #8045 [arrow]
  • Doc test failure (test arrow-avro/src/lib.rs - reader) when verifying avro 56.0.0 RC1 release #8018 [arrow]

Documentation updates:

Performance improvements:

Closed issues:

  • [Variant] Improve fuzz test for Variant #8199
  • [Variant] Improve fuzz test for Variant #8198
  • VariantArrayBuilder tracks starting offsets instead of (offset, len) pairs #8192
  • Rework ValueBuilder API to work with ParentState for reliable nested rollbacks #8188
  • [Variant] Rename ValueBuffer as ValueBuilder #8186
  • [Variant] Refactor ParentState to track and rollback state on behalf of its owning builder #8182
  • [Variant] ObjectBuilder should detect duplicates at insertion time, not at finish #8180
  • [Variant] ObjectBuilder does not reliably check for duplicates #8170
  • [Variant] Support StringView and LargeString in `batch_json_string_to_variant` #8145 [parquet]
  • [Variant] Rename batch_json_string_to_variant and batch_variant_to_json_string json_to_variant #8144 [parquet]
  • [avro] Use tempfile crate rather than custom temporary file generator in tests #8143 [arrow]
  • [Avro] Use Write rather than dyn Write in Decoder #8142 [arrow]
  • [Variant] Nested builder rollback is broken #8136
  • [Variant] Add support for the remaining primitive types (timestamp_nanos/timestampntz_nanos/uuid) for parquet variant #8126
  • Meta: Implement missing Arrow 56.0 lint rules - Sequential workflow #8121
  • ARROW-012-015: Add linter rules for remaining Arrow 56.0 breaking changes #8120
  • ARROW-010 & ARROW-011: Add linter rules for Parquet Statistics and Metadata API removals #8119
  • ARROW-009: Add linter rules for IPC Dictionary API removals in Arrow 56.0 #8118
  • ARROW-008: Add linter rule for SerializedPageReaderState usize→u64 breaking change #8117
  • ARROW-007: Add linter rule for Schema.all_fields() removal in Arrow 56.0 #8116
  • [Variant] Implement ShreddingState::AllNull variant #8088 [parquet]
  • [Variant] Support Shredded Objects in variant_get #8083 [parquet]
  • [Variant]: Implement DataType::RunEndEncoded support for cast_to_variant kernel #8064 [parquet]
  • [Variant]: Implement DataType::Dictionary support for cast_to_variant kernel #8062 [parquet]
  • [Variant]: Implement DataType::Struct support for cast_to_variant kernel #8061 [parquet]
  • [Variant]: Implement DataType::Decimal32/Decimal64/Decimal128/Decimal256 support for cast_to_variant kernel #8059 [parquet]
  • [Variant]: Implement DataType::Timestamp(..) support for cast_to_variant kernel #8058 [parquet]
  • [Variant]: Implement DataType::Float16 support for cast_to_variant kernel #8057 [parquet]
  • [Variant]: Implement DataType::Interval support for cast_to_variant kernel #8056 [parquet]
  • [Variant]: Implement DataType::Time32/Time64 support for cast_to_variant kernel #8055 [parquet]
  • [Variant]: Implement DataType::Date32 / DataType::Date64 support for cast_to_variant kernel #8054 [parquet]
  • [Variant]: Implement DataType::Null support for cast_to_variant kernel #8053 [parquet]
  • [Variant]: Implement DataType::Boolean support for cast_to_variant kernel #8052 [parquet]
  • [Variant]: Implement DataType::FixedSizeBinary support for cast_to_variant kernel #8051 [parquet]
  • [Va...

arrow 56.0.0

01 Aug 19:26
876585c

Changelog

56.0.0 (2025-07-29)

Full Changelog

Breaking changes:

Implemented enhancements:

  • [parquet] Relax type restriction to allow writing dictionary/native batches for same column #8004
  • Support casting int64 to interval #7988 [arrow]
  • [Variant] Add ListBuilder::with_value for convenience #7951 [parquet]
  • [Variant] Add ObjectBuilder::with_field for convenience #7949 [parquet]
  • [Variant] Impl PartialEq for VariantObject #7943 #7948
  • [Variant] Offer simdutf8 as an optional dependency when validating metadata #7902 [parquet] [arrow]
  • [Variant] Avoid collecting offset iterator #7901 [parquet]
  • [Variant] Remove superfluous check when validating monotonic offsets #7900 [parquet]
  • [Variant] Avoid extra allocation in ObjectBuilder #7899 [parquet]
  • [Variant][Compute] variant_get kernel #7893 [parquet]
  • [Variant][Compute] Add batch processing for Variant-JSON String conversion #7883 [parquet]
  • Support MapArray in lexsort #7881 [arrow]
  • [Variant] Add testing for invalid variants (fuzz testing??) #7842 [parquet]
  • [Variant] VariantMetadata, VariantList and VariantObject are too big for Copy #7831 [parquet]
  • Allow choosing flate2 backend #7826 [parquet]
  • [Variant] Tests for creating "large" VariantObjects #7821 [parquet]
  • [Variant] Tests for creating "large" VariantLists #7820 [parquet]
  • [Variant] Support VariantBuilder to write to buffers owned by the caller #7805 [parquet]
  • [Variant] Move JSON related functionality to different crate. #7800 [parquet]
  • [Variant] Add flag in ObjectBuilder to control validation behavior on duplicate field write #7777 [parquet]
  • [Variant] make serde_json an optional dependency of parquet-variant #7775 [parquet]
  • [coalesce] Implement specialized BatchCoalescer::push_batch for PrimitiveArray #7763 [arrow]
  • Add sort_kernel benchmark for StringViewArray case #7758 [arrow]
  • [Variant] Improved API for accessing Variant Objects and lists #7756 [parquet]
  • Buildable reproducible release builds #7751
  • Allow per-column parquet dictionary page size limit #7723 [parquet]
  • [Variant] Test and implement efficient building for "large" Arrays #7699 [parquet]
  • [Variant] Improve VariantBuilder when creating field name dictionaries / sorted dictionaries #7698 [parquet]
  • [Variant] Add input validation in VariantBuilder #7697 [parquet]
  • [Variant] Support Nested Data in VariantBuilder #7696 [parquet]
  • Parquet: Incorrect min/max stats for int96 columns #7686 [parquet]
  • Add DictionaryArray::gc method #7683 [arrow]
  • [Variant] Add negative tests for reading invalid primitive variant values #7645 [parquet]

Fixed bugs:

  • [Variant] Panic when appending nested objects to VariantBuilder #7907 [parquet]
  • Panic when casting large Decimal256 to f64 due to unchecked unwrap() #7886 [arrow]
  • Incorrect inlined string view comparison after "Add prefix compare for inlined" #7874 [parquet] [arrow]
  • [Variant] test_json_to_variant_object_very_large takes over 20s #7872 [parquet]
  • [Variant] If ObjectBuilder::finalize is not called, the resulting Variant object is malformed. #7863 [parquet]
  • CSV error message has values transposed #7848 [arrow]
  • Concatenating struct arrays with no fields unnecessarily errors #7828 [arrow]
  • Clippy CI is failing on ...

arrow 55.2.0

22 Jun 13:10
25114c5

Changelog

55.2.0 (2025-06-22)

Full Changelog

Implemented enhancements:

  • Do not populate nulls for NullArray for MutableArrayData #7725
  • Implement PartialEq for RunArray #7691
  • interleave_views is really slow #7688 [arrow]
  • Add min max aggregates for FixedSizeBinary #7674 [arrow]
  • Deliver pyarrow as a standalone crate #7668 [arrow]
  • [Variant] Implement VariantObject::field and VariantObject::fields #7665 [parquet]
  • [Variant] Implement read support for remaining primitive types #7630 [parquet]
  • Fast and ergonomic method to add metadata to a RecordBatch #7628 [arrow]
  • Add efficient way to change the keys of string dictionary builder #7610 [arrow]
  • Support add_nulls on additional builder types #7605 [arrow]
  • Add into_inner for AsyncArrowWriter #7603 [parquet]
  • Optimize PrimitiveBuilder::append_trusted_len_iter #7591 [arrow]
  • Benchmark for filter+concat and take+concat into even sized record batches #7589 [arrow]
  • max_statistics_truncate_length is ignored when writing statistics to data page headers #7579 [parquet]
  • Feature Request: Encoding in parquet-rewrite #7575 [parquet]
  • Add a strong_count method to Buffer #7568 [arrow]
  • Create version of LexicographicalComparator that compares fixed number of columns #7531 [arrow]
  • parquet-show-bloom-filter should work with integer typed columns #7528 [parquet]
  • Allow merging primitive dictionary values in concat and interleave kernels #7518 [arrow]
  • Add efficient concatenation of StructArrays #7516 [arrow]
  • Rename flight-sql-experimental to flight-sql #7498 [arrow] [arrow-flight]
  • Consider moving from ryu to lexical-core for string formatting / casting floats to string. #7496
  • Arithmetic kernels can be safer and faster #7494 [arrow]
  • Speedup filter_bytes by precalculating capacity #7465 [arrow]
  • [Variant]: Rust API to Create Variant Values #7424 [parquet] [arrow]
  • [Variant] Rust API to Read Variant Values #7423 [arrow]
  • Release arrow-rs / parquet Minor version 55.1.0 (May 2025) #7393 [parquet]
  • Support create_random_array for Decimal data types #7343 [arrow]
  • Truncate Parquet page data page statistics #7555 [parquet] (etseidl)

Fixed bugs:

  • In arrow_json, Decoder::decode can panic if it encounters two high surrogates in a row. #7712
  • FlightSQL "GetDbSchemas" and "GetTables" schemas do not fully match the protocol #7637 [arrow] [arrow-flight]
  • Cannot read encrypted Parquet file if page index reading is enabled #7629 [parquet]
  • encoding_stats not present in Parquet generated by parquet-rewrite #7616 [parquet]
  • When writing parquet plaintext footer files, footer_signing_key_metadata is not included and the encryption algorithm is always written in the footer #7599 [parquet]
  • new_null_array panics when constructing a struct of a dictionary #7571
  • Parquet derive fails to build when Result is aliased #7547
  • Unable to read Dictionary(u8, FixedSizeBinary(_)) using datafusion. #7545 [parquet]
  • filter_record_batch panics with empty struct array. #7538 [arrow]
  • Panic in pretty_format function when displaying DurationSecondsArray with i64::MIN / i64::MAX #7533 [arrow]
  • Record API unable to parse TIME_MILLIS when encoded as INT32 #7510 [parquet]
  • The read_record_batch func of the RecordBatchDecoder does not respect the skip_validation property #7508 [arrow]
  • arrow-55.1.0 breaks filter_record_batch #7500
  • Files containing binary data with >=8_388_855 bytes per row written with arrow-rs can't be read with pyarrow #7489 [parquet]
  • [Bug] Ingestion with Arrow Flight Sql panics when the input stream is empty or fallible #7329 [arrow] [arrow-flight]
  • Ensure page encoding statistics are written to Parquet file #7643 [parquet] (etseidl)

Documentation updates:

  • arrow_reader_row_filter benchmark doesn't capture page cache improvements #7460 [parquet] [arrow]
  • chore: fix a typo in ExtensionType::supports_data_type docs #7682 [arrow] (mbrobbel)
  • [Variant] Add variant docs and examples #7661 [parquet] (alamb)
  • Minor: Add version to deprecation notice for `ParquetMetaDataReader::decode_foote...

arrow 55.1.0

09 May 20:08
822cba4

Changelog

55.1.0 (2025-05-09)

Full Changelog

Breaking changes:

  • refactor!: do not default the struct array length to 0 in Struct::try_new #7247 [arrow] (westonpace)

Implemented enhancements:

  • Add a way to get max usize from OffsetSizeTrait #7474 [arrow]
  • Deterministic metadata encoding #7448 [arrow]
  • Support Arrow type Dictionary with value FixedSizeBinary in Parquet #7445
  • Parquet: Add ability to project rowid in parquet reader #7444
  • Move parquet::file::metadata::reader::FooterTail to parquet::file::metadata so that it is public #7438 [parquet]
  • Speedup take_bytes by precalculating capacity #7432 [arrow]
  • Improve performance of interleave_primitive and interleave_bytes #7421 [arrow]
  • Implement Eq and Default for ScalarBuffer #7411 [arrow]
  • Add decryption support for column index and offset index #7390 [parquet]
  • Support writing encrypted Parquet files with plaintext footers #7320 [parquet]
  • Support Parquet key management tools #7256 [parquet]
  • Verify footer tags when reading encrypted Parquet files with plaintext footers #7255 [parquet]
  • StructArray::try_new behavior can be unexpected when there are no child arrays #7246 [arrow]
  • Parquet performance: improve performance of reading int8/int16 #7097 [parquet]

Fixed bugs:

  • StructArray::try_new validation incorrectly returns an error when logical_nulls() returns Some() && null_count == 0 #7435
  • Reading empty DataPageV2 fails with snappy: corrupt input (empty) #7388 [parquet]

Documentation updates:

Closed issues:

  • Refactor Parquet DecryptionPropertiesBuilder to fix use of unreachable #7476 [parquet]
  • Implement Eq and Default for OffsetBuffer #7417 [arrow]

Merged pull requests:


arrow 55.0.0

08 Apr 15:24
9322547

Changelog

55.0.0 (2025-04-08)

Full Changelog

Breaking changes:

Implemented enhancements:

  • Improve the performance of concat #7357 [arrow]
  • Pushdown predictions to Parquet in-memory row group fetches #7348 [parquet]
  • Improve CSV parsing errors: Print the row that makes CSV parsing fail #7344 [arrow]
  • Support ColumnMetaData encoding_stats in Parquet Writing #7341 [parquet]
  • Support writing Parquet with modular encryption #7327 [parquet]
  • Parquet Use U64 Instead of Usize (wasm support for files greater than 4GB) #7238 [parquet]
  • Support different TimeUnits and timezones when reading Timestamps from INT96 #7220 [parquet]

Fixed bugs:

  • New clippy failures in code base with release of rustc 1.86 #7381 [parquet] [arrow]
  • Fix bug in ParquetMetaDataReader and add test of suffix metadata reads with encryption #7372 [parquet] (etseidl)

Documentation updates:

  • Improve documentation on ArrayData::offset #7385 [arrow] (alamb)
  • Improve documentation for AsyncFileReader::get_metadata #7380 [parquet] (alamb)
  • Improve documentation on implementing Parquet predicate pushdown #7370 [parquet] (alamb)
  • Add documentation and examples for pretty printing, make pretty_format_columns_with_options pub #7346 [arrow] (alamb)
  • Improve documentation on writing parquet, including multiple threads #7321 [parquet] (alamb)

Merged pull requests:


arrow 54.3.1

26 Mar 16:02
e62b212

Changelog

54.3.1 (2025-03-26)

Full Changelog

Fixed bugs:

  • Round trip encoding of list of fixed list fails when offset is not zero #7315

Merged pull requests:

* This Changelog was automatically generated by github_changelog_generator

arrow 54.3.0

17 Mar 20:57
57942c4

Changelog

54.3.0 (2025-03-17)

Full Changelog

Implemented enhancements:

  • Using column chunk offset index in InMemoryRowGroup::fetch #7300
  • Support reading parquet with modular encryption #7296 [parquet]
  • Add example for how to read/write encrypted parquet files #7281 [parquet]
  • Have writer return parsed ParquetMetadata #7254 [parquet]
  • feat: Support Utf8View in JSON reader #7244 [arrow]
  • StructBuilder should provide a way to get a &dyn ArrayBuilder of a field builder #7193 [arrow]
  • Support div_wrapping/rem_wrapping for numeric arithmetic kernels #7158 [arrow]
  • Improve RleDecoder performance #7195 [parquet] (Dandandan)
  • Improve arrow-json deserialization performance by 30% #7157 [arrow] (mwylde)
  • Add with_skip_validation flag to IPC StreamReader, FileReader and FileDecoder #7120 [arrow] (alamb)

Fixed bugs:

  • Archery integration CI test is failing on main: error: package half v2.5.0 cannot be built because it requires rustc 1.81 or newer, while the currently active rustc version is 1.77.2 #7291
  • MSRV CI check is failing on main #7289
  • Incorrect IPC schema encoding for multiple dictionaries #7058 [arrow] [arrow-flight]

Documentation updates:

Merged pull requests:


object_store 0.12.0

03 Apr 10:59
3da5e0d

Changelog

object_store_0.12.0 (2025-03-05)

Full Changelog

Breaking changes:

Implemented enhancements:

Fixed bugs:

Merged pull requests:


arrow 54.2.1

27 Feb 12:07
3f56468

Changelog

54.2.1 (2025-02-27)

Full Changelog

Fixed bugs:

  • Use chrono >= 0.4.34, < 0.4.40 to avoid breaking #7210
