Description
Environment
Delta-rs version: deltalake==0.25.0
Binding: Python 3.11?
Environment:
- Cloud provider: Azure
- OS: macOS
Bug
What happened:
While vacuuming the Delta table, the following error was raised:
PanicException: the length + offset of the sliced StructArray cannot exceed the existing length.
delta.vacuum(
    retention_hours=1,
    dry_run=False,
    enforce_retention_duration=False,
)
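Since the failure surfaces as a PanicException, it may help to note that PyO3 derives this class from BaseException, so a plain `except Exception` around the vacuum call does not catch it. Below is a minimal sketch of a guard for a maintenance job, not a fix for the underlying panic. The helper name `run_guarded` is hypothetical, and matching the exception by class name is an assumption made to avoid relying on where the binding exposes the class.

```python
import logging
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


def run_guarded(operation: Callable[[], T]) -> Optional[T]:
    """Run `operation`; log and swallow a Rust panic, re-raise anything else.

    Hypothetical helper, not part of deltalake.
    """
    try:
        return operation()
    except BaseException as exc:
        # PanicException subclasses BaseException, so `except Exception`
        # would miss it. Match by class name (an assumption) so this sketch
        # does not depend on an import path for the exception class.
        if type(exc).__name__ != "PanicException":
            raise
        logger.error("Rust panic during table maintenance: %s", exc)
        return None
```

With the call from this report, usage would look like `run_guarded(lambda: delta.vacuum(retention_hours=1, dry_run=False, enforce_retention_duration=False))`. A `None` result then means the vacuum panicked and the table should be inspected before retrying.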
What you expected to happen:
The program previously worked with deltalake==0.23.2. I expected the Delta table to be vacuumed after being optimized, without raising exceptions.
How to reproduce it:
Not quite sure yet.
More details:
INFO:main.py:Optimizing delta table.
INFO:main.py:Vacuuming delta table.
thread '<unnamed>' panicked at /Users/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-array-54.2.0/src/array/struct_array.rs:275:9:
the length + offset of the sliced StructArray cannot exceed the existing length
stack backtrace:
0: _rust_begin_unwind
1: core::panicking::panic_fmt
2: arrow_array::array::struct_array::StructArray::slice
3: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
4: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
5: arrow_array::array::struct_array::StructArray::slice
6: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
7: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
8: arrow_array::array::struct_array::StructArray::slice
9: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
10: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
11: arrow_array::array::struct_array::StructArray::slice
12: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
13: arrow_select::filter::filter_array
14: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
15: core::iter::adapters::try_process
16: arrow_select::filter::filter_record_batch
17: deltalake_core::kernel::snapshot::replay::LogReplayScanner::process_files_batch
18: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
19: <core::iter::adapters::chain::Chain<A,B> as core::iter::traits::iterator::Iterator>::try_fold
20: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
21: core::iter::adapters::try_process
22: deltalake_core::kernel::snapshot::EagerSnapshot::advance
23: deltalake_core::operations::transaction::PostCommit::run_post_commit_hook::{{closure}}
24: <deltalake_core::operations::transaction::PostCommit as core::future::into_future::IntoFuture>::into_future::{{closure}}
25: <deltalake_core::operations::transaction::PreCommit as core::future::into_future::IntoFuture>::into_future::{{closure}}
26: <deltalake_core::operations::vacuum::VacuumBuilder as core::future::into_future::IntoFuture>::into_future::{{closure}}
27: tokio::runtime::park::CachedParkThread::block_on
28: tokio::runtime::context::runtime::enter_runtime
29: tokio::runtime::runtime::Runtime::block_on
30: deltalake::RawDeltaTable::vacuum
31: deltalake::RawDeltaTable::__pymethod_vacuum__
32: pyo3::impl_::trampoline::trampoline
33: deltalake::<impl pyo3::impl_::pyclass::PyMethods<deltalake::RawDeltaTable> for pyo3::impl_::pyclass::PyClassImplCollector<deltalake::RawDeltaTable>>::py_methods::ITEMS::trampoline
34: _cfunction_call
35: __PyObject_MakeTpCall
36: __PyEval_EvalFrameDefault
37: __PyEval_Vector
38: _context_run
39: _cfunction_vectorcall_FASTCALL_KEYWORDS
40: _partial_vectorcall
41: __PyEval_EvalFrameDefault
42: __PyEval_Vector
43: __PyEval_EvalFrameDefault
44: __PyEval_Vector
45: _method_vectorcall
46: __PyEval_EvalFrameDefault
47: __PyEval_Vector
48: __PyObject_FastCallDictTstate
49: __PyObject_Call_Prepend
50: _slot_tp_call
51: __PyObject_Call
52: _thread_run
53: _pythread_wrapper
54: __pthread_deallocate