Error during vacuum: the length + offset of the sliced StructArray cannot exceed the existing length #3237

Open
@Menziess

Description

Environment

deltalake==0.25.0

Binding: Python 3.11?

Environment:

  • Azure:
  • Mac OS:
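
Since the binding version above is uncertain, a small script can collect the exact versions for the report. This is a minimal sketch using only the standard library; the function name `describe_environment` and the default package list (`deltalake`, `pyarrow`) are assumptions for illustration, not part of the original report.

```python
import platform
from importlib import metadata


def describe_environment(packages=("deltalake", "pyarrow")):
    """Return interpreter, OS, and package versions as a dict of strings."""
    info = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            # Reads the version recorded in the installed distribution metadata.
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it in the failing environment and again in the one pinned to deltalake==0.23.2 would give two outputs to compare.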

Bug

What happened:

While vacuuming the Delta table, this error was raised:

PanicException: the length + offset of the sliced StructArray cannot exceed the existing length.

The call that raised it:

delta.vacuum(
    retention_hours=1,
    dry_run=False,
    enforce_retention_duration=False
)

What you expected to happen:

The program worked with deltalake==0.23.2. I expected the Delta table to be vacuumed after being optimized, without raising an exception.

How to reproduce it:

Not quite sure yet.
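
As a starting point for narrowing this down, the optimize-then-vacuum sequence described above could be wrapped in one function and run against a copy of the affected table. This is only a sketch, not a confirmed reproduction: the helper name `optimize_then_vacuum` is hypothetical, and the use of `optimize.compact()` is an assumption, since the report does not say which optimize call was used. The vacuum arguments are the ones from the call shown earlier.

```python
from typing import Any, Callable


def optimize_then_vacuum(table: Any, log: Callable[[str], None] = print):
    """Run an optimize step followed by vacuum with the settings from this report.

    `table` is expected to behave like a deltalake.DeltaTable.
    """
    log("Optimizing delta table.")
    # Assumption: compaction; the report only says the table was "optimized".
    table.optimize.compact()
    log("Vacuuming delta table.")
    return table.vacuum(
        retention_hours=1,
        dry_run=False,
        enforce_retention_duration=False,
    )


# Example use (placeholder path, requires deltalake to be installed):
#   from deltalake import DeltaTable
#   optimize_then_vacuum(DeltaTable("path/to/copy/of/table"))
```

Running this under both deltalake==0.23.2 and deltalake==0.25.0 against the same table copy would show whether the sequence alone is enough to trigger the panic.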

More details:

INFO:main.py:Optimizing delta table.
INFO:main.py:Vacuuming delta table.
thread '<unnamed>' panicked at /Users/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-array-54.2.0/src/array/struct_array.rs:275:9:
the length + offset of the sliced StructArray cannot exceed the existing length
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: arrow_array::array::struct_array::StructArray::slice
   3: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
   4: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
   5: arrow_array::array::struct_array::StructArray::slice
   6: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
   7: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
   8: arrow_array::array::struct_array::StructArray::slice
   9: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
  10: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
  11: arrow_array::array::struct_array::StructArray::slice
  12: <arrow_array::array::struct_array::StructArray as arrow_array::array::Array>::slice
  13: arrow_select::filter::filter_array
  14: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
  15: core::iter::adapters::try_process
  16: arrow_select::filter::filter_record_batch
  17: deltalake_core::kernel::snapshot::replay::LogReplayScanner::process_files_batch
  18: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
  19: <core::iter::adapters::chain::Chain<A,B> as core::iter::traits::iterator::Iterator>::try_fold
  20: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
  21: core::iter::adapters::try_process
  22: deltalake_core::kernel::snapshot::EagerSnapshot::advance
  23: deltalake_core::operations::transaction::PostCommit::run_post_commit_hook::{{closure}}
  24: <deltalake_core::operations::transaction::PostCommit as core::future::into_future::IntoFuture>::into_future::{{closure}}
  25: <deltalake_core::operations::transaction::PreCommit as core::future::into_future::IntoFuture>::into_future::{{closure}}
  26: <deltalake_core::operations::vacuum::VacuumBuilder as core::future::into_future::IntoFuture>::into_future::{{closure}}
  27: tokio::runtime::park::CachedParkThread::block_on
  28: tokio::runtime::context::runtime::enter_runtime
  29: tokio::runtime::runtime::Runtime::block_on
  30: deltalake::RawDeltaTable::vacuum
  31: deltalake::RawDeltaTable::__pymethod_vacuum__
  32: pyo3::impl_::trampoline::trampoline
  33: deltalake::<impl pyo3::impl_::pyclass::PyMethods<deltalake::RawDeltaTable> for pyo3::impl_::pyclass::PyClassImplCollector<deltalake::RawDeltaTable>>::py_methods::ITEMS::trampoline
  34: _cfunction_call
  35: __PyObject_MakeTpCall
  36: __PyEval_EvalFrameDefault
  37: __PyEval_Vector
  38: _context_run
  39: _cfunction_vectorcall_FASTCALL_KEYWORDS
  40: _partial_vectorcall
  41: __PyEval_EvalFrameDefault
  42: __PyEval_Vector
  43: __PyEval_EvalFrameDefault
  44: __PyEval_Vector
  45: _method_vectorcall
  46: __PyEval_EvalFrameDefault
  47: __PyEval_Vector
  48: __PyObject_FastCallDictTstate
  49: __PyObject_Call_Prepend
  50: _slot_tp_call
  51: __PyObject_Call
  52: _thread_run
  53: _pythread_wrapper
  54: __pthread_deallocate
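
The panic message in the backtrace describes a bounds check on a slice request: the requested offset plus length must not exceed the array's existing length. Frames 2 through 16 show the check firing inside nested `StructArray::slice` calls made by `filter_record_batch` during log replay. The check can be modeled in plain Python as below; this is an illustrative model with hypothetical names (`check_slice`, `slice_rows`), not the arrow-rs implementation.

```python
def check_slice(existing_len: int, offset: int, length: int) -> None:
    """Raise if the requested window does not fit in the existing rows."""
    if offset + length > existing_len:
        raise ValueError(
            "the length + offset of the sliced StructArray "
            "cannot exceed the existing length"
        )


def slice_rows(rows: list, offset: int, length: int) -> list:
    """Return rows[offset:offset + length] after validating the bounds."""
    check_slice(len(rows), offset, length)
    return rows[offset:offset + length]


if __name__ == "__main__":
    print(slice_rows(["a", "b", "c", "d"], 1, 2))
    try:
        # Window of 2 rows starting at index 3 does not fit in 4 rows.
        slice_rows(["a", "b", "c", "d"], 3, 2)
    except ValueError as exc:
        print(f"rejected: {exc}")
```

In this model, a failure means the caller asked for a window computed against a longer array than the one it was applied to, which would be consistent with a parent struct and one of its child columns disagreeing about length. Whether that is what happens here is not established by the report.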

Metadata

Labels

  • bug: Something isn't working
  • mre-needed: Whether an MRE needs to be provided
