Releases: lance-format/lance

v4.0.0-beta.12

13 Mar 21:40

Pre-release

What's Changed

Bug Fixes 🐛

  • fix: disallowing stale credentials from directory namespace by @hamersaw in #6194

Full Changelog: v4.0.0-beta.11...v4.0.0-beta.12

v4.0.0-beta.11

13 Mar 20:37

Pre-release

What's Changed

New Features 🎉

  • feat: support vector indices in describe_indices filtering by @ndpvt-web in #6145

Bug Fixes 🐛

  • fix: preserve merge insert delete-by-source semantics by @Xuanwo in #6148
  • fix: handle nullable validity layers without def levels by @Xuanwo in #6187
  • fix: use to_arrow_reader in benchmark datagen by @Xuanwo in #6190
  • fix: memory_limit and num_workers params are not passed to index worker by @BubbleCal in #6197

Documentation 📚

  • docs: update the rules for data replacement conflicts to reflect reality by @westonpace in #6182

Performance Improvements 🚀

  • perf(inverted): reuse posting batch builder and merge tail partitions by @BubbleCal in #6191

Full Changelog: v4.0.0-beta.10...v4.0.0-beta.11

v3.0.0

13 Mar 15:11

What's Changed

Critical Fixes ‼️

  • fix: deduplicate row addresses in take to prevent panic by @wjones127 in #5881
  • fix: fts flat search drops rows when avg_doc_length < 1.0 by @wjones127 in #5897
  • fix: invalidate index fragment bitmaps after data replacement and stale merge by @wjones127 in #5929

New Features 🎉

  • feat: add RLE support for block by @yingjianwu98 in #4937
  • feat: compress complex all null by @yingjianwu98 in #4990
  • feat: support cleanup across branches by @majin1102 in #5009
  • feat: dictionary index always32 bits by @yingjianwu98 in #5011
  • feat: abort dictionary encode if not useful by @yingjianwu98 in #5055
  • feat(cdf): cdf support upsert for views by @zhangyue19921010 in #5369
  • feat(compaction): binary copy capability for compaction by @zhangyue19921010 in #5434
  • feat: expose use_scalar_index param in Java scanner by @xloya in #5487
  • feat(python): expose search_filter in scanner by @wojiaodoubao in #5506
  • feat: add alter column nullable to non-nullable support by @Xuanwo in #5589
  • feat: evolute all_null_layout to constant layout by @Xuanwo in #5641
  • feat(java): support creating IVF_RQ index by @majin1102 in #5648
  • feat(java): support building vector index distributively by @majin1102 in #5664
  • feat(rust): add datafusion catalog_provider through namespace by @majin1102 in #5686
  • feat: support List and Struct type for KeyValue in inserted_rows.rs by @wojiaodoubao in #5713
  • feat: support tencent cos by @ztorchan in #5740
  • feat: add Lance-HF docs to lance.org/integrations/huggingface/ by @prrao87 in #5748
  • feat(python): support namespace for tensorflow by @yuqi1129 in #5750
  • feat: add range to External blob by @wojiaodoubao in #5765
  • feat(java): support json extraction by scanning by @majin1102 in #5770
  • feat: introduce RowIdSet and RowIdMask by @yanghua in #5771
  • feat: expose blob handling APIs to python by @Xuanwo in #5790
  • feat: add blob handling support for fragment by @Xuanwo in #5801
  • feat: add plan/execute separation to FilteredReadExec by @LuQQiu in #5843
  • feat: add LSM scanner with point lookup and vector search support by @touch-of-grey in #5850
  • feat: add rename table implementations to REST namespaces by @bryanck in #5874
  • feat(python): expose enable_stable_row_ids in commit() by @fecet in #5908
  • feat: support aggregate in scanner by @jackye1995 in #5911
  • feat: spill page metadata to disk during IVF shuffle by @wkalt in #5921
  • feat: add third party licenses lists by @jackye1995 in #5922
  • feat(java): support session by @jackye1995 in #5931
  • feat: make geodatafusion/geoarrow optional via geo feature flag by @apoc in #5934
  • perf: create local writer for efficient local writes by @wkalt in #5939
  • feat: add python and java binding for aggregate by @jackye1995 in #5951
  • feat: add proto serialization for FilteredReadExec by @LuQQiu in #5954
  • feat: create an arrow-scalar crate utilizing arrow-row and arrow-data by @westonpace in #5955
  • feat: add progress monitoring via callbacks for inverted indexes by @vivek-bharathan in #5958
  • feat: add size to object store tracing by @wjones127 in #5962
  • feat: update minimum supported rust version from 1.88 to 1.91 by @westonpace in #5964
  • feat: add Dataset::with_object_store for request-scoped store overrides by @wkalt in #5966
  • feat: support namespace as external manifest store by @jackye1995 in #5968
  • feat: serialize storage options in table identifier proto by @LuQQiu in #5973
  • feat(core): add Levenshtein-based suggestions to not-found errors in schema by @HemantSudarshan in #5976
  • feat: add URI-based commit support to Java SDK by @hamersaw in #5978
  • fix: concurrent read and write to directory namespace by @jackye1995 in #5983
  • feat: add ability to pass custom headers to objectstore requests by @hamersaw in #5989
  • feat: add DeleteResult with num_deleted_rows by @wkalt in #6001
  • feat: introduce IncompatibleTransaction error by @wjones127 in #6003
  • feat: surface ambiguous merge insert error as InvalidInput by @wjones127 in #6048
  • feat(blob): distribute blob sidecar keys with reversed binary ids by @Xuanwo in #6060
  • feat: handle JSONB literals in Lance SQL planner by @wkalt in #6061
  • feat(java): expose Dataset.dropIndex method to drop specific index by @fangbo in #6065
  • feat(blob): map external blob URIs to multi-base base ids by @Xuanwo in #6066
  • feat: add env toggle for repetition index cache on read by @Xuanwo in #6069
  • feat(compaction): single reserve_fragment_ids after rewriting files by @hamersaw in #6072
  • feat: expose compaction binary copy configuration through python and java SDKs by @hamersaw in #6074
  • feat: mark 2.2 as stable and add 2.3 as the next file format version by @Xuanwo in #6088
  • feat: support prewarm for IVF-based ANN indices by @wjones127 in #6090

Bug Fixes 🐛

  • fix: ensure blob encoding work when using file reader directly by @rahil-c in #5193
  • fix: support system columns in dataset.take* operations by @hamersaw in #5722
  • fix: skip missing indices in compaction rewrite by @AndreaBozzo in #5739
  • fix(lance-linalg): check fp16kernels feature before arch-specific code by @durch in #5747
  • refactor: align blob behavior that write via file format version, read via layout by @Xuanwo in #5752
  • fix: fix deletion when using file-object-store:// by @cmccabe in #5760
  • fix: remove unreasonable nullable check for data types in hash_joiner during merge operation by @zhangyue19921010 in #5784
  • fix: allow unused_unsafe for __cpuid to support both stable and nightly by @jackye1995 in #5793
  • fix: set JUnit dependency as test scope by @bryanck in #5815
  • fix(java): transaction fatal bug in java transaction api by @wojiaodoubao in #5824
  • fix: fix remap so that it handles deletions correctly by @westonpace in #5828
  • fix: inconsistent transposed pq code and metadata when build ivf_pq index distributedly by @yanghua in #5834
  • fix: improve error messages in FixedSizeListArrayExt::convert_to_floating_point by @LuciferYang in #5836
  • fix(java): panic when reading CreateIndex transaction by @majin1102 in #5853
  • fi...

v4.0.0-beta.10

12 Mar 22:36

Pre-release

What's Changed

Breaking Changes 🛠

  • perf(inverted)!: reduce fts indexing time and memory by @BubbleCal in #6174

Bug Fixes 🐛

  • fix: avoid empty range reads for zero-length blobs by @Xuanwo in #6168

Performance Improvements 🚀

  • perf: remove shard content key sorting from distributed merge by @Xuanwo in #6179

Full Changelog: v4.0.0-beta.9...v4.0.0-beta.10

v4.0.0-beta.9

11 Mar 22:10

Pre-release

What's Changed

New Features 🎉

  • feat: clearer progress reporting for IVF by @wkalt in #6126

Bug Fixes 🐛

  • fix: replace fetch_arrow_table with to_arrow_table by @BubbleCal in #6146
  • fix: handle DataType::Null in adjust_child_validity to prevent panic by @wjones127 in #6160
  • fix: persist frag reuse index external file on local filesystem by @wjones127 in #6163

Full Changelog: v4.0.0-beta.8...v4.0.0-beta.9

v4.0.0-beta.8

10 Mar 02:57

Pre-release

What's Changed

New Features 🎉

  • feat: expose use_scalar_index param in Java scanner by @xloya in #5487
  • feat(compaction): add Python config for defer_index_remap by @zhangyue19921010 in #5691
  • feat(cleanup): add more metrics to RemovalStats by @zhangyue19921010 in #6025
  • feat(java): expose prefilter parameter to support vector search with fragments by @nyl3532016 in #6040
  • feat: handle JSONB literals in Lance SQL planner by @wkalt in #6061
  • feat(compaction): single reserve_fragment_ids after rewriting files by @hamersaw in #6072
  • feat(cleanup): support rate limiter for cleanup operation by @zhangyue19921010 in #6084
  • feat: add skip_transpose flag to vector index builders by @BubbleCal in #6114

Bug Fixes 🐛

  • fix: filter stale row IDs in TakeExec for FTS/vector after delete by @wkalt in #6042
  • fix(btree): include null pages in non-IsNull queries for correct thre… by @wkalt in #6043
  • fix: bitmap iterator exhaustion in mask_to_offset_ranges by @wkalt in #6046
  • fix: incorrect deletion masking in DatasetPreFilter by @cijiugechu in #6083
  • fix: compile error for err_express by @zhangyue19921010 in #6094
  • fix(python): crash when schema contains nested fixed_size_list or extension type by @erandagan in #6107
  • fix: dont sample if no vectors are needed by @westonpace in #6110
  • fix(index): preserve stable row-id entries during scalar index optimize by @acking-you in #6117
  • fix: disallow wrapping auto-detected fsst in other compression by @hamersaw in #6120
  • fix: pin substrait to 0.62.2 until DF supports 0.62.3 by @westonpace in #6121
  • fix: vector index type shown as unknown in describe_indices by @jackye1995 in #6122
  • fix: handle inverted index worker exits during dispatch by @BubbleCal in #6129

Documentation 📚

  • docs: update index.md to fix indexes to indices for uniformity by @wombatu-kun in #6113

Other Changes

  • refactor: overhaul AGENTS.md with PR review insights by @Xuanwo in #6103

Full Changelog: v4.0.0-beta.7...v4.0.0-beta.8

v4.0.0-beta.7

04 Mar 01:28

Pre-release

What's Changed

New Features 🎉

  • feat: mark 2.2 as stable and add 2.3 as the next file format version by @Xuanwo in #6088
  • feat: support prewarm for IVF-based ANN indices by @wjones127 in #6090

Full Changelog: v4.0.0-beta.6...v4.0.0-beta.7

v3.0.0-rc.3

04 Mar 22:49

Pre-release

What's Changed

Critical Fixes ‼️

  • fix: deduplicate row addresses in take to prevent panic by @wjones127 in #5881
  • fix: fts flat search drops rows when avg_doc_length < 1.0 by @wjones127 in #5897
  • fix: invalidate index fragment bitmaps after data replacement and stale merge by @wjones127 in #5929

New Features 🎉

  • feat: add RLE support for block by @yingjianwu98 in #4937
  • feat: compress complex all null by @yingjianwu98 in #4990
  • feat: support cleanup across branches by @majin1102 in #5009
  • feat: dictionary index always32 bits by @yingjianwu98 in #5011
  • feat: abort dictionary encode if not useful by @yingjianwu98 in #5055
  • feat(cdf): cdf support upsert for views by @zhangyue19921010 in #5369
  • feat(compaction): binary copy capability for compaction by @zhangyue19921010 in #5434
  • feat: expose use_scalar_index param in Java scanner by @xloya in #5487
  • feat(python): expose search_filter in scanner by @wojiaodoubao in #5506
  • feat: add alter column nullable to non-nullable support by @Xuanwo in #5589
  • feat: evolute all_null_layout to constant layout by @Xuanwo in #5641
  • feat(java): support creating IVF_RQ index by @majin1102 in #5648
  • feat(java): support building vector index distributively by @majin1102 in #5664
  • feat(rust): add datafusion catalog_provider through namespace by @majin1102 in #5686
  • feat: support List and Struct type for KeyValue in inserted_rows.rs by @wojiaodoubao in #5713
  • feat: support tencent cos by @ztorchan in #5740
  • feat: add Lance-HF docs to lance.org/integrations/huggingface/ by @prrao87 in #5748
  • feat(python): support namespace for tensorflow by @yuqi1129 in #5750
  • feat: add range to External blob by @wojiaodoubao in #5765
  • feat(java): support json extraction by scanning by @majin1102 in #5770
  • feat: introduce RowIdSet and RowIdMask by @yanghua in #5771
  • feat: expose blob handling APIs to python by @Xuanwo in #5790
  • feat: add blob handling support for fragment by @Xuanwo in #5801
  • feat: add plan/execute separation to FilteredReadExec by @LuQQiu in #5843
  • feat: add LSM scanner with point lookup and vector search support by @touch-of-grey in #5850
  • feat: add rename table implementations to REST namespaces by @bryanck in #5874
  • feat(python): expose enable_stable_row_ids in commit() by @fecet in #5908
  • feat: support aggregate in scanner by @jackye1995 in #5911
  • feat: spill page metadata to disk during IVF shuffle by @wkalt in #5921
  • feat: add third party licenses lists by @jackye1995 in #5922
  • feat(java): support session by @jackye1995 in #5931
  • feat: make geodatafusion/geoarrow optional via geo feature flag by @apoc in #5934
  • perf: create local writer for efficient local writes by @wkalt in #5939
  • feat: add python and java binding for aggregate by @jackye1995 in #5951
  • feat: add proto serialization for FilteredReadExec by @LuQQiu in #5954
  • feat: create an arrow-scalar crate utilizing arrow-row and arrow-data by @westonpace in #5955
  • feat: add progress monitoring via callbacks for inverted indexes by @vivek-bharathan in #5958
  • feat: add size to object store tracing by @wjones127 in #5962
  • feat: update minimum supported rust version from 1.88 to 1.91 by @westonpace in #5964
  • feat: add Dataset::with_object_store for request-scoped store overrides by @wkalt in #5966
  • feat: support namespace as external manifest store by @jackye1995 in #5968
  • feat: serialize storage options in table identifier proto by @LuQQiu in #5973
  • feat(core): add Levenshtein-based suggestions to not-found errors in schema by @HemantSudarshan in #5976
  • feat: add URI-based commit support to Java SDK by @hamersaw in #5978
  • fix: concurrent read and write to directory namespace by @jackye1995 in #5983
  • feat: add ability to pass custom headers to objectstore requests by @hamersaw in #5989
  • feat: add DeleteResult with num_deleted_rows by @wkalt in #6001
  • feat: introduce IncompatibleTransaction error by @wjones127 in #6003
  • feat: surface ambiguous merge insert error as InvalidInput by @wjones127 in #6048
  • feat(blob): distribute blob sidecar keys with reversed binary ids by @Xuanwo in #6060
  • feat: handle JSONB literals in Lance SQL planner by @wkalt in #6061
  • feat(java): expose Dataset.dropIndex method to drop specific index by @fangbo in #6065
  • feat(blob): map external blob URIs to multi-base base ids by @Xuanwo in #6066
  • feat: add env toggle for repetition index cache on read by @Xuanwo in #6069
  • feat(compaction): single reserve_fragment_ids after rewriting files by @hamersaw in #6072
  • feat: expose compaction binary copy configuration through python and java SDKs by @hamersaw in #6074
  • feat: mark 2.2 as stable and add 2.3 as the next file format version by @Xuanwo in #6088
  • feat: support prewarm for IVF-based ANN indices by @wjones127 in #6090

Bug Fixes 🐛

  • fix: ensure blob encoding work when using file reader directly by @rahil-c in #5193
  • fix: support system columns in dataset.take* operations by @hamersaw in #5722
  • fix: skip missing indices in compaction rewrite by @AndreaBozzo in #5739
  • fix(lance-linalg): check fp16kernels feature before arch-specific code by @durch in #5747
  • refactor: align blob behavior that write via file format version, read via layout by @Xuanwo in #5752
  • fix: fix deletion when using file-object-store:// by @cmccabe in #5760
  • fix: remove unreasonable nullable check for data types in hash_joiner during merge operation by @zhangyue19921010 in #5784
  • fix: allow unused_unsafe for __cpuid to support both stable and nightly by @jackye1995 in #5793
  • fix: set JUnit dependency as test scope by @bryanck in #5815
  • fix(java): transaction fatal bug in java transaction api by @wojiaodoubao in #5824
  • fix: fix remap so that it handles deletions correctly by @westonpace in #5828
  • fix: inconsistent transposed pq code and metadata when build ivf_pq index distributedly by @yanghua in #5834
  • fix: improve error messages in FixedSizeListArrayExt::convert_to_floating_point by @LuciferYang in #5836
  • fix(java): panic when reading CreateIndex transaction by @majin1102 in #5853...

v4.0.0-beta.6

03 Mar 18:44

Pre-release

What's Changed

New Features 🎉

  • feat(blob): distribute blob sidecar keys with reversed binary ids by @Xuanwo in #6060
  • feat(java): expose Dataset.dropIndex method to drop specific index by @fangbo in #6065
  • feat(blob): map external blob URIs to multi-base base ids by @Xuanwo in #6066
  • feat: add env toggle for repetition index cache on read by @Xuanwo in #6069
  • feat: expose compaction binary copy configuration through python and java SDKs by @hamersaw in #6074

Bug Fixes 🐛

  • fix(java): transaction fatal bug in java transaction api by @wojiaodoubao in #5824
  • fix: allowing headers for static configuration to be consistent by @hamersaw in #6045
  • fix(build): add Android aarch64 support to lance-linalg by @dardourimohamed in #6057
  • fix: make blob v2 reads base-aware in multi-base datasets by @Xuanwo in #6064
  • fix(lance-linalg): fix missing return value in u8x16::bit_and for non-x86_64/aarch64 targets by @cheungxi in #6068
  • fix: resolve Python lint failure on main by @Xuanwo in #6073
  • fix: restore main CI by formatting take_blob imports by @Xuanwo in #6082
  • fix: avoid thread pool contention between compression and write operations during FTS indexing by @BubbleCal in #6085

Documentation 📚

  • docs: clarify how to generate TPCH benchmark dataset locally by @Xuanwo in #6063

Performance Improvements 🚀

  • perf: add dict-values compression controls with lz4 default by @Xuanwo in #6059
  • perf: avoid frequent allocating when computing residual vectors by @BubbleCal in #6062
  • perf: add take_blob benchmark with cache_repetition_index matrix by @Xuanwo in #6067

Full Changelog: v4.0.0-beta.5...v4.0.0-beta.6

v4.0.0-beta.5

28 Feb 01:53

Pre-release

What's Changed

Full Changelog: v4.0.0-beta.4...v4.0.0-beta.5