Fix decimal casts to primitive arrays #51
Merged
This is required to handle cases where we attempt to dict compress and a sample ends up being all null.

Signed-off-by: Robert Kruszewski <github@robertk.io>
1. Reset validity if it's "all true" instead of filling manually.
2. Set validity to true if parent array is constant (slicing won't affect that).
3. Count ones while copying to destination buffer.

Signed-off-by: Mikhail Kot <to@myrrc.dev>
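Points 1 and 3 above can be illustrated with a minimal sketch. It is only one reading of the commit message, not the code from the change: the `Validity` enum and the `copy_counting_valid` and `build_validity` helpers are hypothetical names, and the mask is modeled as a plain `Vec<bool>` rather than a packed bit buffer.

```rust
/// Hypothetical, simplified validity representation (not the Vortex type).
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Validity {
    /// Every element is valid; no mask needs to be stored.
    AllValid,
    /// One flag per element; `true` means the element is valid.
    Mask(Vec<bool>),
}

/// Appends `src` to `dst` and returns how many of the copied flags were `true`,
/// so no second pass over the data is needed to count them.
pub fn copy_counting_valid(src: &[bool], dst: &mut Vec<bool>) -> usize {
    dst.reserve(src.len());
    let mut valid = 0;
    for &flag in src {
        valid += usize::from(flag);
        dst.push(flag);
    }
    valid
}

/// Builds a validity from a source mask, collapsing to `AllValid` when the
/// count gathered during the copy shows that nothing is null.
pub fn build_validity(src: &[bool]) -> Validity {
    let mut mask = Vec::new();
    let valid = copy_counting_valid(src, &mut mask);
    if valid == src.len() {
        Validity::AllValid
    } else {
        Validity::Mask(mask)
    }
}
```

Under these assumptions, `build_validity(&[true, true])` yields `Validity::AllValid`, while any input containing a `false` keeps the copied mask.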
remove #[allow(....)
vortex-data#7388)

First, we never use nightly, so portable SIMD never gets enabled. Second, AVX-512 does not appear to bring any benefit beyond AVX2 for SIMD gather, per the GCC mailing list and the Intel GDS mailing list. AVX2 on modern Intel machines is comparable to Zen 5 machines, and on Zen 5 machines the AVX-512, AVX2 and scalar versions behave the same.

Signed-off-by: Robert Kruszewski <github@robertk.io>
…data#7374)

## Summary

Tracking issue: vortex-data#7297

Decomposes `TurboQuant` into:

```text
ScalarFnArray(L2Denorm, [
    ScalarFnArray(SorfTransform, [
        Extension<Vector>(
            FixedSizeListArray(
                DictArray(codes=Primitive<u8>, values=Primitive<f32>),
                padded_dim
            )
        )
    ]),
    norms
])
```

This makes the implementation more modular and turns the TurboQuant-specific pieces into reusable building blocks.

Also defines SORF sign generation with a frozen local `SplitMix64` implementation and cleans up related vector compute code.

## API Changes

- add `SorfTransform`
- remove the `TurboQuant` encoding/array type
- keep `TurboQuantScheme` as the compressor entry point
- simplify all scalar fn APIs, as we no longer need `ApproxOptions`

Note that `ApproxOptions` doesn't actually make sense here because the compute is exact; it is the encoding itself that is lossy. I am removing it until we figure out the exact semantics of a lossy encoding.

## Testing

More tests:

- TurboQuant roundtrip / structural coverage
- SORF roundtrip + norm preservation
- deterministic SplitMix64 / sign generation coverage
- readthrough behavior (when children are normalized) for vector similarity ops

---------

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
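For readers unfamiliar with SplitMix64, the following is a minimal sketch of the generator using its widely published constants, together with one possible way to turn its output into a deterministic sign vector. The `sign_vector` helper and its bit-to-sign mapping are assumptions made for illustration; the commit message does not say how the frozen implementation derives SORF signs.

```rust
/// SplitMix64 generator with the standard published constants.
#[derive(Clone, Debug)]
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    /// Advances the state by the golden-ratio increment and mixes it.
    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical helper: derives `n` signs (+1.0 or -1.0) from a seed by
/// consuming one bit per sign, 64 signs per generated word.
/// The mapping (bit set => -1.0) is an assumption, not taken from the PR.
pub fn sign_vector(seed: u64, n: usize) -> Vec<f32> {
    let mut rng = SplitMix64::new(seed);
    let mut signs = Vec::with_capacity(n);
    let mut word = 0u64;
    for i in 0..n {
        if i % 64 == 0 {
            word = rng.next_u64();
        }
        let bit = (word >> (i % 64)) & 1;
        signs.push(if bit == 1 { -1.0 } else { 1.0 });
    }
    signs
}
```

Because the generator has no external dependencies and fixed constants, the same seed always yields the same signs across platforms and releases, which is presumably the motivation for freezing a local copy rather than relying on a crate whose output could change between versions.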
## Summary

Tracking issue: vortex-data#7297

Adds a basic benchmarking setup for vector similarity. Right now it is just a bunch of random vectors.

Note that the numbers don't really mean anything right now, as we have yet to optimize anything (namely, I have not yet added the inner product / cosine similarity optimizations pushed through both the SORF transform and the dictionary for constant arrays).

In the future we will add proper benchmarking on real datasets (likely in `vortex-bench`), and we may also integrate https://github.com/zilliztech/vectordbbench.

## Testing

N/A

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
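As a reference point for what the benchmark measures, here is a minimal, unoptimized sketch of inner product and cosine similarity over `f32` slices. The function names are hypothetical and this is the plain textbook definition, not the Vortex compute kernels.

```rust
/// Inner (dot) product of two equal-length vectors.
pub fn inner_product(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) norm of a vector.
pub fn l2_norm(a: &[f32]) -> f32 {
    inner_product(a, a).sqrt()
}

/// Cosine similarity, or `None` when either vector has zero norm and the
/// similarity is therefore undefined.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    let denom = l2_norm(a) * l2_norm(b);
    if denom == 0.0 {
        None
    } else {
        Some(inner_product(a, b) / denom)
    }
}
```

When both inputs are already normalized to unit length, cosine similarity reduces to a single inner product, which is the kind of shortcut the optimizations mentioned above would exploit.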
…a#7398)

FieldName, FieldNames, and StructFields all wrap Arc types, but their derived PartialEq always dereferences through the Arc to compare contents. Since DTypes are frequently cloned (every slice, filter, and execute copies the DType), the cloned Arcs share pointers, which makes pointer equality a reliable fast path.

DataFusion ClickBench full-suite [apmc](https://github.com/0ax1/apmc) measurement for vortex, averaged over two runs:

- Cycles: -5.7% (861B → 812B), fewer total CPU cycles
- IPC: +5.8% (1.88 → 1.99), more instructions per cycle
- MAP_STALL: -5.5% (360B → 340B), fewer CPU stalls waiting on memory

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
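The fast path described above can be sketched as follows. `SharedNames` is a hypothetical stand-in for the wrapped types, not the actual definition of `FieldNames`; the point is only the shape of the comparison, which checks `Arc::ptr_eq` first and falls back to comparing contents.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arc-backed list of field names.
#[derive(Clone, Debug)]
pub struct SharedNames(Arc<[Arc<str>]>);

impl SharedNames {
    pub fn new(names: &[&str]) -> Self {
        Self(names.iter().map(|n| Arc::<str>::from(*n)).collect())
    }

    /// True when both values point at the same heap allocation.
    pub fn same_allocation(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.0, &other.0)
    }
}

impl PartialEq for SharedNames {
    fn eq(&self, other: &Self) -> bool {
        // Fast path: clones share the allocation, so equal pointers imply
        // equal contents. Otherwise fall back to an element-wise comparison.
        Arc::ptr_eq(&self.0, &other.0) || self.0 == other.0
    }
}

impl Eq for SharedNames {}
```

A clone of a `SharedNames` value compares equal without touching the names at all, while two independently built values with the same contents still compare equal through the slower fallback.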
This PR contains the following updates:

| Package | Change | [Age](https://docs.renovatebot.com/merge-confidence/) | [Confidence](https://docs.renovatebot.com/merge-confidence/) |
|---|---|---|---|
| [io.netty:netty-bom](https://netty.io/) ([source](https://redirect.github.com/netty/netty)) | `4.2.7.Final` → `4.2.12.Final` |  |  |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the [Dependency Dashboard](..vortex-data/issues/357) for more information.

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - Between 12:00 AM and 03:59 AM, only on Monday (`* 0-3 * * 1`)
- Automerge
  - At any time (no schedule defined)

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/). View the [repository job log](https://developer.mend.io/github/vortex-data/vortex).

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
This PR contains the following updates:

| Package | Change | [Age](https://docs.renovatebot.com/merge-confidence/) | [Confidence](https://docs.renovatebot.com/merge-confidence/) |
|---|---|---|---|
| [react](https://react.dev/) ([source](https://redirect.github.com/facebook/react/tree/HEAD/packages/react)) | [`19.2.4` → `19.2.5`](https://renovatebot.com/diffs/npm/react/19.2.4/19.2.5) |  |  |
| [react-dom](https://react.dev/) ([source](https://redirect.github.com/facebook/react/tree/HEAD/packages/react-dom)) | [`19.2.4` → `19.2.5`](https://renovatebot.com/diffs/npm/react-dom/19.2.4/19.2.5) |  |  |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the [Dependency Dashboard](..vortex-data/issues/357) for more information.

---

### Release Notes

<details>
<summary>facebook/react (react)</summary>

### [`v19.2.5`](https://redirect.github.com/facebook/react/releases/tag/v19.2.5): 19.2.5 (April 8th, 2026)

[Compare Source](https://redirect.github.com/facebook/react/compare/v19.2.4...v19.2.5)

##### React Server Components

- Add more cycle protections ([#36236](https://redirect.github.com/facebook/react/pull/36236) by [@eps1lon](https://redirect.github.com/eps1lon) and [@unstubbable](https://redirect.github.com/unstubbable))

</details>

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [js-sys](https://wasm-bindgen.github.io/wasm-bindgen/) ([source](https://redirect.github.com/wasm-bindgen/wasm-bindgen/tree/HEAD/crates/js-sys)) | dependencies | patch | `0.3.91` → `0.3.95` | | [wasm-bindgen](https://wasm-bindgen.github.io/wasm-bindgen) ([source](https://redirect.github.com/wasm-bindgen/wasm-bindgen)) | dependencies | patch | `0.2.114` → `0.2.118` | | [wasm-bindgen-futures](https://wasm-bindgen.github.io/wasm-bindgen/) ([source](https://redirect.github.com/wasm-bindgen/wasm-bindgen/tree/HEAD/crates/futures)) | workspace.dependencies | patch | `0.4.64` → `0.4.68` | | [web-sys](https://wasm-bindgen.github.io/wasm-bindgen/web-sys/index.html) ([source](https://redirect.github.com/wasm-bindgen/wasm-bindgen/tree/HEAD/crates/web-sys)) | dependencies | patch | `0.3.91` → `0.3.95` | --- > [!WARNING] > Some dependencies could not be looked up. Check the [Dependency Dashboard](..vortex-data/issues/357) for more information. --- ### Release Notes <details> <summary>wasm-bindgen/wasm-bindgen (wasm-bindgen)</summary> ### [`v0.2.118`](https://redirect.github.com/wasm-bindgen/wasm-bindgen/blob/HEAD/CHANGELOG.md#02118) [Compare Source](https://redirect.github.com/wasm-bindgen/wasm-bindgen/compare/0.2.117...0.2.118) ##### Added - Added `Error::stack_trace_limit()` and `Error::set_stack_trace_limit()` bindings to `js-sys` for the non-standard V8 `Error.stackTraceLimit` property. [#​5082](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5082) - Added support for multiple `#[wasm_bindgen(start)]` functions, which are chained together at initialization, as well as a new `#[wasm_bindgen(start, private)]` to register a start function without exporting it as a public export. 
[#​5081](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5081) - Reinitialization is no longer automatically applied when using `panic=unwind` and `--experimental-reset-state-function`, instead it is triggered by any use of the `handler::schedule_reinit()` function under `panic=unwind`, which is supported from within the `on_abort` handler for reinit workflows. Renamed `handler::reinit()` to `handler::schedule_reinit()` and removed the `set_on_reinit()` handler. The `__instance_terminated` address is now always a simple boolean (`0` = live, `1` = terminated). [#​5083](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5083) - `handler::schedule_reinit()` now works under `panic=abort` builds. Previously it was a no-op; it now sets the JS-side reinit flag and the next export call transparently creates a fresh `WebAssembly.Instance`. [#​5099](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5099) ##### Changed - MSRV bump from 1.71 to 1.76 for the CLI, and 1.82 to 1.86 for the API [#​5102](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5102) ##### Fixed - ES module `import` statements are now hoisted to the top of generated JS files, placed right after the `@ts-self-types` directive. This ensures valid ES module output since `import` declarations must precede other statements. [#​5103](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5103) - Fixed two CLI issues affecting WASM modules built by rustc 1.94+. First, a panic (`failed to find N in function table`) caused by lld emitting element segment offsets as `global.get $__table_base` or extended const expressions instead of plain `i32.const N` for large function tables; the fix adds a const-expression evaluator in `get_function_table_entry` and guards against integer underflow in multi-segment tables. 
Second, the descriptor interpreter now routes all global reads/writes through a single `globals` HashMap seeded from the module's own globals, and mirrors the module's actual linear memory rather than a fixed 32KB buffer, so the stack pointer's real value is valid without any override. This fixes panics like `failed to find 32752 in function table` caused by `GOT.func.internal.*` globals being misidentified as the stack pointer. [#​5076](https://redirect.github.com/wasm-bindgen/wasm-bindgen/issues/5076) [#​5080](https://redirect.github.com/wasm-bindgen/wasm-bindgen/issues/5080) [#​5093](https://redirect.github.com/wasm-bindgen/wasm-bindgen/issues/5093) [#​5095](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5095) ### [`v0.2.117`](https://redirect.github.com/wasm-bindgen/wasm-bindgen/blob/HEAD/CHANGELOG.md#02117) [Compare Source](https://redirect.github.com/wasm-bindgen/wasm-bindgen/compare/0.2.116...0.2.117) ##### Fixed - Fixed a regression introduced in [#​5026](https://redirect.github.com/wasm-bindgen/wasm-bindgen/issues/5026) where stable `web-sys` methods that accept a union type containing a `[WbgGeneric]` interface (e.g. `ImageBitmapSource`, which includes `VideoFrame`) incorrectly applied typed generics to all union expansions rather than only those whose argument type is itself `[WbgGeneric]`. In practice this caused `Window::create_image_bitmap_with_*` and the corresponding `WorkerGlobalScope` overloads to return `Promise<ImageBitmap>` instead of `Promise<JsValue>` for the stable (non-`VideoFrame`) call sites, breaking `JsFuture::from(promise).await?`. [#​5064](https://redirect.github.com/wasm-bindgen/wasm-bindgen/issues/5064) [#​5073](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5073) - Fixed handling logic for environment variable `WASM_BINDGEN_TEST_ADDRESS` in the test runner, when running tests in headless mode. 
[#​5087](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5087) ### [`v0.2.116`](https://redirect.github.com/wasm-bindgen/wasm-bindgen/blob/HEAD/CHANGELOG.md#02116) [Compare Source](https://redirect.github.com/wasm-bindgen/wasm-bindgen/compare/0.2.115...0.2.116) ##### Added - Added `js_sys::Float16Array` bindings, `DataView` float16 accessors using `f32`, and raw `[u16]` helper APIs for interoperability with binary16 representations such as `half::f16`. [#​5033](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5033) ##### Changed - Updated to Walrus 0.26.1 for deterministic type section ordering. [#​5069](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5069) - The `#[wasm_bindgen]` macro now emits `&mut (impl FnMut(...) + MaybeUnwindSafe)` / `&(impl Fn(...) + MaybeUnwindSafe)` for raw `&mut dyn FnMut` / `&dyn Fn` import arguments instead of a hidden generic parameter and where-clause. The generated signature is cleaner and the `MaybeUnwindSafe` bound is visible directly in the argument position. The ABI and wire format are unchanged. When building with `panic=unwind`, closures that capture non-`UnwindSafe` values (e.g. `&mut T`, `Cell<T>`) must wrap them in `AssertUnwindSafe` before capture; on all other targets `MaybeUnwindSafe` is a no-op blanket impl. [#​5056](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5056) ### [`v0.2.115`](https://redirect.github.com/wasm-bindgen/wasm-bindgen/blob/HEAD/CHANGELOG.md#02115) [Compare Source](https://redirect.github.com/wasm-bindgen/wasm-bindgen/compare/0.2.114...0.2.115) ##### Added - `console.debug/log/info/warn/error` output from user-spawned `Worker` and `SharedWorker` instances is now forwarded to the CLI test runner during headless browser tests, just like output from the main thread. Works for blob URL workers, module workers, URL-based workers (importScripts), nested workers, and shared workers (including logs emitted before the first port connection). 
Non-cloneable arguments are serialized via `String()` rather than crashing the worker. The `--nocapture` flag is respected. [#​5037](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5037) - `js_sys::Promise<T>` now implements `IntoFuture`, enabling direct `.await` on any JS promise without a wrapper type. The `wasm-bindgen-futures` implementation has been moved into `js-sys` behind an optional `futures` feature, which is activated automatically when `wasm-bindgen-futures` is a dependency. All existing `wasm_bindgen_futures::*` import paths continue to work unchanged via re-exports. `js_sys::futures` is also available directly for users who want `promise.await` without depending on `wasm-bindgen-futures`. [#​5049](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5049) - Added `--target emscripten` support, generating a `library_bindgen.js` file for consumption by Emscripten at link time. Includes support for futures, JS closures, and TypeScript output. A new Emscripten-specific test runner is also included, along with CI integration. [#​4443](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/4443) - Added `VideoFrame`, `VideoColorSpace`, and related WebCodecs dictionaries/enums to `web-sys`. [#​5008](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5008) - Added `wasm_bindgen::handler` module with `set_on_abort` and `set_on_reinit` hooks for `panic=unwind` builds. `set_on_abort` registers a callback invoked after the instance is terminated (hard abort, OOM, stack overflow). `set_on_reinit` registers a callback invoked after `reinit()` resets the WebAssembly instance via `--experimental-reset-state-function`. Handlers are stored as Wasm indirect-function-table indices so dispatch is safe even when linear memory is corrupt. ##### Changed - Replaced per-closure generic destructors with a single `__wbindgen_destroy_closure` export. 
[#​5019](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5019) - Refactored the headless browser test runner logging pipeline for dramatically improved performance (>400x faster on Chrome, >10x on Firefox, \~5x on Safari). Switched to incremental DOM scraping with `textContent.slice(offset)`, append-only output semantics, unified log capture across all log levels on failure, and browser-specific invisible-div optimizations (`display:none` for Chrome/Firefox, `visibility:hidden` for Safari). [#​4960](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/4960) - TTY-gated status/clear output in the test runner shell to avoid `\r` control-character artifacts in non-interactive (CI) environments. [#​4960](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/4960) - Added `bench_console_log_10mb` benchmark alongside the existing 1MB benchmark for the headless test runner. The main branch cannot complete this benchmark at any volume. [#​4960](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/4960) - Updated to Walrus 0.26 [#​5057](https://redirect.github.com/wasm-bindgen/walrus/pull/5057) ##### Fixed - Fixed argument order when calling multi-parameter functions in the `wasm-bindgen` interpreter by reversing the args collected from the stack. [#​5047](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5047) - Added support for per-operation `[WbgGeneric]` in WebIDL, restoring typed generic return types (e.g. `Promise<ImageBitmap>`) for `createImageBitmap` on `Window` and `WorkerGlobalScope` that were lost after the `VideoFrame` stabilization. [#​5026](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5026) - Fixed missing `#[cfg(feature = "...")]` gates on deprecated dictionary builder methods and getters for union-typed fields (e.g. `{Open,Save,Directory}FilePickerOptions::start_in()`), and fixed per-setter doc requirements to list each setter's own required features. 
[#​5039](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5039) - Fixed `JsOption::new()` to use `undefined` instead of `null`, to be compatible with `Option::None` and JS default parameters. [#​5023](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5023) - Fixed unsound `unsafe` transmutes in `JsOption<T>::wrap`, `as_option`, and `into_option` by replacing `transmute_copy` with `unchecked_into()`. Also tightened the `JsGeneric` trait bound and `JsOption<T>` impl block to require `T: JsGeneric` (which implies `JsCast`), preventing use with arbitrary non-JS types. [#​5030](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5030) - Fixed headless test runner emitting `\r` carriage-return sequences in non-TTY environments, which polluted captured logs in CI and complicated output-matching tests. [#​4960](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/4960) - Fixed headless test runner printing incomplete and out-of-order log output on test failures by merging all five log levels into a single unified output div. [#​4960](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/4960) - Fixed large test outputs (10MB+) causing oversized WebDriver responses that were either extremely slow or crashed completely, by switching to incremental streaming output collection. [#​4960](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/4960) - Fixed a duplciate wasm export in node ESM atomics, when compiled in debug mode [#​5028](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5028) - Fixed a type inference regression (`E0283: type annotations needed`) introduced in v0.2.109 where the stable `FromIterator` and `Extend` impls on `js_sys::Array` were changed from `A: AsRef<JsValue>` to `A: AsRef<T>`. Because `#[wasm_bindgen]` generates multiple `AsRef` impls per type, the compiler could not uniquely resolve `T`, breaking code like `Array::from_iter([my_wasm_value])` without explicit annotations. 
The stable impls are restored to `A: AsRef<JsValue>` (returning `Array<JsValue>`); the generic `A: AsRef<T>` forms remain available under `js_sys_unstable_apis`. [#5052](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5052) - Fixed `skip_typescript` not being respected when using `reexport`, causing TypeScript definitions to be incorrectly emitted for re-exported items marked with `#[wasm_bindgen(skip_typescript)]`. [#5051](https://redirect.github.com/wasm-bindgen/wasm-bindgen/pull/5051) ##### Removed </details>

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - Between 12:00 AM and 03:59 AM, only on Monday (`* 0-3 * * 1`)
- Automerge
  - At any time (no schedule defined)

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

👻 **Immortal**: This PR will be recreated if closed unmerged. Get [config help](https://redirect.github.com/renovatebot/renovate/discussions) if that's undesired.

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/). View the [repository job log](https://developer.mend.io/github/vortex-data/vortex).

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
## Summary

Bumps `rand` to `0.10.1`.

- https://rustsec.org/advisories/RUSTSEC-2026-0097
- rust-random/rand#1763

## Testing

N/A

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
This PR contains the following updates:

| Package | Type | Update | Change | [Age](https://docs.renovatebot.com/merge-confidence/) | [Confidence](https://docs.renovatebot.com/merge-confidence/) |
|---|---|---|---|---|---|
| [arc-swap](https://redirect.github.com/vorner/arc-swap) | workspace.dependencies | patch | `1.9.0` → `1.9.1` |  |  |
| [cc](https://redirect.github.com/rust-lang/cc-rs) | workspace.dependencies | patch | `1.2.57` → `1.2.60` |  |  |
| [custom-labels](https://polarsignals.com) ([source](https://redirect.github.com/polarsignals/custom-labels)) | workspace.dependencies | patch | `0.4.5` → `0.4.6` |  |  |
| [env_logger](https://redirect.github.com/rust-cli/env_logger) | workspace.dependencies | patch | `0.11.9` → `0.11.10` |  |  |
| [fsst-rs](https://redirect.github.com/spiraldb/fsst) | workspace.dependencies | patch | `0.5.9` → `0.5.10` |  |  |
| [inventory](https://redirect.github.com/dtolnay/inventory) | workspace.dependencies | patch | `0.3.22` → `0.3.24` |  |  |
| [object_store](https://redirect.github.com/apache/arrow-rs-object-store) | workspace.dependencies | patch | `0.13.1` → `0.13.2` |  |  |
| [prettier](https://prettier.io) ([source](https://redirect.github.com/prettier/prettier)) | devDependencies | patch | [`3.8.1` → `3.8.2`](https://renovatebot.com/diffs/npm/prettier/3.8.1/3.8.2) |  |  |
| [pyo3](https://redirect.github.com/pyo3/pyo3) | workspace.dependencies | patch | `0.28.2` → `0.28.3` |  |  |
| [rustc-hash](https://redirect.github.com/rust-lang/rustc-hash) | workspace.dependencies | patch | `2.1.1` → `2.1.2` |  |  |
| com.gradleup.shadow | plugin | patch | `9.4.0` → `9.4.1` |  |  |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the [Dependency Dashboard](..vortex-data/issues/357) for more information.
--- ### Release Notes <details> <summary>vorner/arc-swap (arc-swap)</summary> ### [`v1.9.1`](https://redirect.github.com/vorner/arc-swap/blob/HEAD/CHANGELOG.md#191) [Compare Source](https://redirect.github.com/vorner/arc-swap/compare/v1.9.0...v1.9.1) - One more SeqCst :-| ([#​204](https://redirect.github.com/vorner/arc-swap/issues/204)). </details> <details> <summary>rust-lang/cc-rs (cc)</summary> ### [`v1.2.60`](https://redirect.github.com/rust-lang/cc-rs/blob/HEAD/CHANGELOG.md#1260---2026-04-10) [Compare Source](https://redirect.github.com/rust-lang/cc-rs/compare/cc-v1.2.59...cc-v1.2.60) ##### Fixed - *(ar)* suppress warnings from `D` modifier probe ([#​1700](https://redirect.github.com/rust-lang/cc-rs/pull/1700)) ### [`v1.2.59`](https://redirect.github.com/rust-lang/cc-rs/blob/HEAD/CHANGELOG.md#1259---2026-04-03) [Compare Source](https://redirect.github.com/rust-lang/cc-rs/compare/cc-v1.2.58...cc-v1.2.59) ##### Fixed - *(ar)* deterministic archives with `D` modifier ([#​1697](https://redirect.github.com/rust-lang/cc-rs/pull/1697)) ##### Other - Regenerate target info ([#​1698](https://redirect.github.com/rust-lang/cc-rs/pull/1698)) - Fix target abi parsing for sanitiser targets ([#​1695](https://redirect.github.com/rust-lang/cc-rs/pull/1695)) ### [`v1.2.58`](https://redirect.github.com/rust-lang/cc-rs/blob/HEAD/CHANGELOG.md#1258---2026-03-27) [Compare Source](https://redirect.github.com/rust-lang/cc-rs/compare/cc-v1.2.57...cc-v1.2.58) ##### Other - Update Compile-time Requirements to add info about clang-cl.exe ([#​1693](https://redirect.github.com/rust-lang/cc-rs/pull/1693)) </details> <details> <summary>rust-cli/env_logger (env_logger)</summary> ### [`v0.11.10`](https://redirect.github.com/rust-cli/env_logger/blob/HEAD/CHANGELOG.md#01110---2026-03-23) [Compare Source](https://redirect.github.com/rust-cli/env_logger/compare/v0.11.9...v0.11.10) ##### Internal - Update dependencies </details> <details> <summary>spiraldb/fsst (fsst-rs)</summary> ### 
[`v0.5.10`](https://redirect.github.com/spiraldb/fsst/releases/tag/0.5.10) [Compare Source](https://redirect.github.com/spiraldb/fsst/compare/0.5.9...0.5.10) #### Changes - feat: prune low-value symbols from table on small inputs ([#​203](https://redirect.github.com/spiraldb/fsst/issues/203)) [@​CommanderStorm](https://redirect.github.com/CommanderStorm) - chore(deps): update codspeedhq/action digest to [`db35df7`](https://redirect.github.com/spiraldb/fsst/commit/db35df7) ([#​201](https://redirect.github.com/spiraldb/fsst/issues/201)) @​[renovate\[bot\]](https://redirect.github.com/apps/renovate) - chore(deps): update taiki-e/cache-cargo-install-action digest to [`511847d`](https://redirect.github.com/spiraldb/fsst/commit/511847d) ([#​198](https://redirect.github.com/spiraldb/fsst/issues/198)) @​[renovate\[bot\]](https://redirect.github.com/apps/renovate) - chore(deps): pin dtolnay/rust-toolchain action to [`3c5f7ea`](https://redirect.github.com/spiraldb/fsst/commit/3c5f7ea) ([#​197](https://redirect.github.com/spiraldb/fsst/issues/197)) @​[renovate\[bot\]](https://redirect.github.com/apps/renovate) - chore(deps): lock file maintenance ([#​196](https://redirect.github.com/spiraldb/fsst/issues/196)) @​[renovate\[bot\]](https://redirect.github.com/apps/renovate) - chore(deps): pin release-drafter/release-drafter action to [`139054a`](https://redirect.github.com/spiraldb/fsst/commit/139054a) ([#​194](https://redirect.github.com/spiraldb/fsst/issues/194)) @​[renovate\[bot\]](https://redirect.github.com/apps/renovate) - chore(deps): update taiki-e/cache-cargo-install-action digest to [`7824a3d`](https://redirect.github.com/spiraldb/fsst/commit/7824a3d) ([#​195](https://redirect.github.com/spiraldb/fsst/issues/195)) @​[renovate\[bot\]](https://redirect.github.com/apps/renovate) - Update commments in the decompress method ([#​193](https://redirect.github.com/spiraldb/fsst/issues/193)) [@​robert3005](https://redirect.github.com/robert3005) </details> <details> 
<summary>dtolnay/inventory (inventory)</summary> ### [`v0.3.24`](https://redirect.github.com/dtolnay/inventory/releases/tag/0.3.24) [Compare Source](https://redirect.github.com/dtolnay/inventory/compare/0.3.23...0.3.24) - Add support for VxWorks targets ([#​89](https://redirect.github.com/dtolnay/inventory/issues/89), thanks [@​elBoberido](https://redirect.github.com/elBoberido)) ### [`v0.3.23`](https://redirect.github.com/dtolnay/inventory/releases/tag/0.3.23) [Compare Source](https://redirect.github.com/dtolnay/inventory/compare/0.3.22...0.3.23) - Avoid triggering clippy::disallowed\_types in downstream projects that use Loom ([#​88](https://redirect.github.com/dtolnay/inventory/issues/88), thanks [@​elBoberido](https://redirect.github.com/elBoberido)) </details> <details> <summary>apache/arrow-rs-object-store (object_store)</summary> ### [`v0.13.2`](https://redirect.github.com/apache/arrow-rs-object-store/blob/HEAD/CHANGELOG.md#v0132-2026-03-19) [Compare Source](https://redirect.github.com/apache/arrow-rs-object-store/compare/v0.13.1...v0.13.2) [Full Changelog](https://redirect.github.com/apache/arrow-rs-object-store/compare/v0.12.5...v0.13.2) **Implemented enhancements:** - `Path::join(Self, &PathPart) -> Self` [#​665](https://redirect.github.com/apache/arrow-rs-object-store/issues/665) - Support for AWS Encryption Client encryption [#​647](https://redirect.github.com/apache/arrow-rs-object-store/issues/647) - `LocalFileSystem`: use `read_at` instead of seek + read [#​622](https://redirect.github.com/apache/arrow-rs-object-store/issues/622) - Avoid reading metadata for `LocalFileSystem::read_ranges` (and other methods) [#​614](https://redirect.github.com/apache/arrow-rs-object-store/issues/614) - expose `Inner` from `HttpRequestBody` [#​606](https://redirect.github.com/apache/arrow-rs-object-store/issues/606) - Release object store `0.13.1` (maintenance) - Target Jan 2026 [#​598](https://redirect.github.com/apache/arrow-rs-object-store/issues/598) - Support 
AWS\_ENDPOINT\_URL\_S3 in aws backend [#​589](https://redirect.github.com/apache/arrow-rs-object-store/issues/589) - Release object store `0.12.5` (maintenance) - Target Dec 2025 [#​582](https://redirect.github.com/apache/arrow-rs-object-store/issues/582) - Support upper-case configuration options in parse\_url\_opts [#​529](https://redirect.github.com/apache/arrow-rs-object-store/issues/529) - object\_store: Support `{az,abfs,abfss}://container@account.blob.{core.windows.net,fabric.microsoft.com}` URLs [#​430](https://redirect.github.com/apache/arrow-rs-object-store/issues/430) - Support `Transfer-Encoding: chunked` responses in HttpStore [#​340](https://redirect.github.com/apache/arrow-rs-object-store/issues/340) - Use reconstructed ListBlobs marker to provide list offset support in `MicrosoftAzure` store [#​461](https://redirect.github.com/apache/arrow-rs-object-store/issues/461) **Fixed bugs:** - Azure Fabric: Unsigned integer underflow when fetching token causes integer overflow panic [#​640](https://redirect.github.com/apache/arrow-rs-object-store/issues/640) - Error body missing for 5xx errors after retry exhausted [#​617](https://redirect.github.com/apache/arrow-rs-object-store/issues/617) - Heavy contention on credentials cache [#​541](https://redirect.github.com/apache/arrow-rs-object-store/issues/541) - AWS/S3 Default Headers are not considered for signature calculation [#​484](https://redirect.github.com/apache/arrow-rs-object-store/issues/484) - az:// \<container> not work as expected [#​443](https://redirect.github.com/apache/arrow-rs-object-store/issues/443) **Performance improvements:** - Preallocate single `Vec` in `get_ranges` for LocalFilesystem [#​634](https://redirect.github.com/apache/arrow-rs-object-store/issues/634) - Use platform specific `read_at` when available [#​628](https://redirect.github.com/apache/arrow-rs-object-store/pull/628) ([AdamGS](https://redirect.github.com/AdamGS)) - Avoid metadata lookup for `LocalFileSystem::read_ranges` 
and `chunked_stream` [#​621](https://redirect.github.com/apache/arrow-rs-object-store/pull/621) ([Dandandan](https://redirect.github.com/Dandandan)) **Closed issues:** - \[Security Alert] Exposed API key(s) detected: AWS Access Key [#​659](https://redirect.github.com/apache/arrow-rs-object-store/issues/659) - AWS S3 token expired on multi-threaded app with Arc usage [#​655](https://redirect.github.com/apache/arrow-rs-object-store/issues/655) - Emulator tests fail in CI due to an unsupported service version header in Azurite [#​626](https://redirect.github.com/apache/arrow-rs-object-store/issues/626) **Merged pull requests:** - Replace `Path::child` with `Path::join` [#​666](https://redirect.github.com/apache/arrow-rs-object-store/pull/666) ([Kinrany](https://redirect.github.com/Kinrany)) - Support --xa-s3 suffix for S3 Express One Zone bucket access points [#​663](https://redirect.github.com/apache/arrow-rs-object-store/pull/663) ([pdeva](https://redirect.github.com/pdeva)) - docs: clarify `Clone` behavior [#​656](https://redirect.github.com/apache/arrow-rs-object-store/pull/656) ([crepererum](https://redirect.github.com/crepererum)) - Implement Clone for local and memory stores [#​653](https://redirect.github.com/apache/arrow-rs-object-store/pull/653) ([DoumanAsh](https://redirect.github.com/DoumanAsh)) - Unify `from_env` behaviours [#​652](https://redirect.github.com/apache/arrow-rs-object-store/pull/652) ([miraclx](https://redirect.github.com/miraclx)) - docs: add examples to the aws docs where appropriate [#​651](https://redirect.github.com/apache/arrow-rs-object-store/pull/651) ([CommanderStorm](https://redirect.github.com/CommanderStorm)) - Switch TokenCache to RWLock [#​648](https://redirect.github.com/apache/arrow-rs-object-store/pull/648) ([tustvold](https://redirect.github.com/tustvold)) - Minimize futures dependency into relevant sub-crates [#​646](https://redirect.github.com/apache/arrow-rs-object-store/pull/646) 
([AdamGS](https://redirect.github.com/AdamGS)) - Clarify ShuffleResolver doc-comments [#​645](https://redirect.github.com/apache/arrow-rs-object-store/pull/645) ([jkosh44](https://redirect.github.com/jkosh44)) - Introduce a "tokio" to allow pulling a trait-only build [#​644](https://redirect.github.com/apache/arrow-rs-object-store/pull/644) ([AdamGS](https://redirect.github.com/AdamGS)) - fix(azure): fix integer overflow in Fabric token expiry check [#​641](https://redirect.github.com/apache/arrow-rs-object-store/pull/641) ([desmondcheongzx](https://redirect.github.com/desmondcheongzx)) - chore: upgrade to `rand` 0.10 [#​637](https://redirect.github.com/apache/arrow-rs-object-store/pull/637) ([crepererum](https://redirect.github.com/crepererum)) - fix(aws): Include default headers in signature calculation ([#​484](https://redirect.github.com/apache/arrow-rs-object-store/issues/484)) [#​636](https://redirect.github.com/apache/arrow-rs-object-store/pull/636) ([singhsaabir](https://redirect.github.com/singhsaabir)) - fix(azure): correct Microsoft Fabric blob endpoint domain [#​631](https://redirect.github.com/apache/arrow-rs-object-store/pull/631) ([kevinjqliu](https://redirect.github.com/kevinjqliu)) - Unblock emulator based tests [#​627](https://redirect.github.com/apache/arrow-rs-object-store/pull/627) ([AdamGS](https://redirect.github.com/AdamGS)) - Azure ADLS list\_with\_offset support [#​623](https://redirect.github.com/apache/arrow-rs-object-store/pull/623) ([omar](https://redirect.github.com/omar)) - Implement tests for range and partial content responses [#​619](https://redirect.github.com/apache/arrow-rs-object-store/pull/619) ([vitoordaz](https://redirect.github.com/vitoordaz)) - fix: missing 5xx error body when retry exhausted [#​618](https://redirect.github.com/apache/arrow-rs-object-store/pull/618) ([jackye1995](https://redirect.github.com/jackye1995)) - build(deps): update nix requirement from 0.30.0 to 0.31.1 
[#​616](https://redirect.github.com/apache/arrow-rs-object-store/pull/616) ([dependabot\[bot\]](https://redirect.github.com/apps/dependabot)) - Clarify behavior of `parse_url_opts` with regards to case sensitivity [#​613](https://redirect.github.com/apache/arrow-rs-object-store/pull/613) ([AdamGS](https://redirect.github.com/AdamGS)) - Fix logical format conflict [#​605](https://redirect.github.com/apache/arrow-rs-object-store/pull/605) ([tustvold](https://redirect.github.com/tustvold)) - Fix Azure URL parsing [#​604](https://redirect.github.com/apache/arrow-rs-object-store/pull/604) ([tustvold](https://redirect.github.com/tustvold)) - build(deps): update quick-xml requirement from 0.38.0 to 0.39.0 [#​602](https://redirect.github.com/apache/arrow-rs-object-store/pull/602) ([dependabot\[bot\]](https://redirect.github.com/apps/dependabot)) - Only read file metadata once in `LocalFileSystem::read_ranges` [#​595](https://redirect.github.com/apache/arrow-rs-object-store/pull/595) ([AdamGS](https://redirect.github.com/AdamGS)) - feat: Add support for AWS\_ENDPOINT\_URL\_S3 environment variable [#​590](https://redirect.github.com/apache/arrow-rs-object-store/pull/590) ([rajatgoel](https://redirect.github.com/rajatgoel)) - feat: impl MultipartStore for PrefixStore [#​587](https://redirect.github.com/apache/arrow-rs-object-store/pull/587) ([ddupg](https://redirect.github.com/ddupg)) - Implement typos-cli [#​570](https://redirect.github.com/apache/arrow-rs-object-store/pull/570) ([jayvdb](https://redirect.github.com/jayvdb)) - feat (azure): support for '.blob.core.windows.net' in "az://" scheme [#​431](https://redirect.github.com/apache/arrow-rs-object-store/pull/431) ([vladidobro](https://redirect.github.com/vladidobro)) \* *This Changelog was automatically generated by [github\_changelog\_generator](https://redirect.github.com/github-changelog-generator/github-changelog-generator)* </details> <details> <summary>prettier/prettier (prettier)</summary> ### 
[`v3.8.2`](https://redirect.github.com/prettier/prettier/compare/3.8.1...fbf300f9d89820364ddc9b2efa05b92b8c01b692) [Compare Source](https://redirect.github.com/prettier/prettier/compare/3.8.1...3.8.2) </details> <details> <summary>pyo3/pyo3 (pyo3)</summary> ### [`v0.28.3`](https://redirect.github.com/pyo3/pyo3/blob/HEAD/CHANGELOG.md#0283---2026-04-02) [Compare Source](https://redirect.github.com/pyo3/pyo3/compare/v0.28.2...v0.28.3) ##### Fixed - Fix compile error with `#[pyclass(get_all)]` on a type named `Probe`. [#​5837](https://redirect.github.com/PyO3/pyo3/pull/5837) - Fix compile error in debug builds related to `_Py_NegativeRefcount` with Python < 3.12. [#​5847](https://redirect.github.com/PyO3/pyo3/pull/5847) - Fix a race condition where `Python::attach` or `try_attach` could return before `site.py` had finished running. [#​5903](https://redirect.github.com/PyO3/pyo3/pull/5903) - Fix unsoundness in `PyBytesWriter::write_vectored` with Python 3.15 prerelease versions. [#​5907](https://redirect.github.com/PyO3/pyo3/pull/5907) - Fix deadlock in `.into_pyobject()` implementation for C-like `#[pyclass]` enums. [#​5928](https://redirect.github.com/PyO3/pyo3/pull/5928) </details> <details> <summary>rust-lang/rustc-hash (rustc-hash)</summary> ### [`v2.1.2`](https://redirect.github.com/rust-lang/rustc-hash/blob/HEAD/CHANGELOG.md#212) [Compare Source](https://redirect.github.com/rust-lang/rustc-hash/compare/v2.1.1...v2.1.2) - [Refactor byte hashing to remove unreachable panic](https://redirect.github.com/rust-lang/rustc-hash/pull/65) </details> --- ### Configuration 📅 **Schedule**: (UTC) - Branch creation - Between 12:00 AM and 03:59 AM, only on Monday (`* 0-3 * * 1`) - Automerge - At any time (no schedule defined) 🚦 **Automerge**: Enabled. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 👻 **Immortal**: This PR will be recreated if closed unmerged. 
Get [config help](https://redirect.github.com/renovatebot/renovate/discussions) if that's undesired. --- This PR was generated by [Mend Renovate](https://mend.io/renovate/). View the [repository job log](https://developer.mend.io/github/vortex-data/vortex). Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
The motivation is to compare an upcoming perf improvement. Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com> Co-authored-by: Adam Gutglick <adam@spiraldb.com>
…#7390) ## Summary Adds support for multiple globs in DuckDB, similar to Parquet. For example, in addition to `read_vortex('*.vortex')` you can now also write `read_vortex(['s3://bucket1/path/*.vortex', '/a/b/*.vortex'])`. Note that multiple file systems are supported in a single scan. This is implemented and wired through into the underlying Rust MultiFileDataSource. ## Testing We add new DuckDB e2e tests for multi-glob. Couldn't figure out a great way to test multiple file systems, so the tests cover only multiple globs. --------- Signed-off-by: Andrew Duffy <andrew@a10y.dev>
## Summary Tracking issue: vortex-data#7297 Adds some constant array optimizations to the tensor crate expressions `L2Norm`, `L2Denorm`, and `CosineSimilarity`. The remaining expressions `InnerProduct` and `SorfTransform` are a bit more complicated and deserve their own PR. ## Testing Adds more tests for these optimizations. --------- Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
Stacked on top of vortex-data#7394 ## Summary Tracking issue: vortex-data#7297 Implements the final pushdown / reduction rules needed to make the TurboQuant quantization scheme make sense. TODO ## Testing More tests --------- Signed-off-by: Connor Tsui <connor.tsui20@gmail.com> Signed-off-by: Claude <noreply@anthropic.com> Co-authored-by: Claude <noreply@anthropic.com>
1. Make the DuckDB all-invalid exporter a unit type: if we set the underlying vector's validity to "all false", DuckDB won't read the underlying values, so there is no need to fill them. 2. Exit early on exporter branches if validity is all false. This saves a ConstantArray creation + execute. Signed-off-by: Mikhail Kot <to@myrrc.dev>
…ortex-data#7410) Extend the `Arc::ptr_eq` fast-path from vortex-data#7398 to cover the remaining Arc-containing DType variants. List and FixedSizeList hold a bare `Arc<DType>` in the enum variant, so the shortcut is applied in `DType`'s manual `PartialEq` impl. `StructFields` already handles its own `Arc::ptr_eq` internally. The mismatch arms enumerate every variant in the first position so that adding a new DType variant produces a non-exhaustive match compile error. DuckDB StatPopGen full-suite [apmc](https://github.com/0ax1/apmc) measurement for vortex, averaged over two runs: - Cycles: -5.4% (5.97B → 5.65B) - Instructions: -15.3% (15.5B → 13.1B) - L1D_CACHE_MISS_LD: -0.8% (56.2M → 55.7M) - MAP_STALL: -0.2% (1.35B → 1.34B) Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
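The pointer-equality shortcut described above can be sketched on a toy type. This is a minimal illustration, not the Vortex `DType`: `ToyDType` and its variants are hypothetical names chosen for the example.

```rust
use std::sync::Arc;

/// Toy stand-in for a nested logical type (hypothetical; not the real Vortex `DType`).
#[derive(Debug)]
pub enum ToyDType {
    Bool,
    Int(u8),
    List(Arc<ToyDType>),
    FixedSizeList(Arc<ToyDType>, u32),
}

impl PartialEq for ToyDType {
    fn eq(&self, other: &Self) -> bool {
        use ToyDType::*;
        match (self, other) {
            (Bool, Bool) => true,
            (Int(a), Int(b)) => a == b,
            // Fast path: two handles to the same allocation are trivially equal,
            // so the recursive structural comparison is skipped.
            (List(a), List(b)) => Arc::ptr_eq(a, b) || a == b,
            (FixedSizeList(a, n), FixedSizeList(b, m)) => {
                n == m && (Arc::ptr_eq(a, b) || a == b)
            }
            // Mismatch arms name every variant in the first position, so adding a
            // new variant makes this match non-exhaustive and fails to compile.
            (Bool, _) | (Int(_), _) | (List(_), _) | (FixedSizeList(_, _), _) => false,
        }
    }
}
```

The structural comparison is still the fallback, so two separately allocated but identical element types compare equal; the shortcut only removes work when the `Arc` is shared.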
…rtex-data#6969)

### Summary

Closes: vortex-data#6040

Add a first-class IsNotNull scalar function, replacing the previous Not(IsNull(...)) composition pattern. This simplifies the expression tree and enables direct stat_falsification for zone map pruning.

Changes:

- New is_not_null.rs with ScalarFnVTable implementation, including stat_falsification using is_constant && null_count > 0 (with TODO for future RowCount stat)
- Updated all integration points: DataFusion, DuckDB, Python/Substrait to use is_not_null(...) directly
- Replaced the Not(IsNull(...)) fallback in erased.rs validity with IsNotNull
- Registered IsNotNull in ScalarFnSession and ExprBuiltins/ArrayBuiltins

### AI Assistance Disclosure

This PR was developed with AI assistance (Kiro). AI was used for code review, implementing stat_falsification, writing tests, and drafting the PR description. All output was reviewed and validated by the author.

### API Changes

New public APIs:

- vortex_array::expr::is_not_null(child): creates an IsNotNull expression
- Expression::is_not_null() / ArrayRef::is_not_null() via ExprBuiltins/ArrayBuiltins traits
- Python: vortex._lib.expr.is_not_null(child)

### Testing

9 unit tests covering: return dtype, child replacement, mixed/all-valid/all-invalid evaluation, struct field access, display formatting, null sensitivity, and stat falsification pruning expression generation.

---------

Signed-off-by: Xiaoxuan Li <xioxuan@amazon.com> Signed-off-by: Robert Kruszewski <github@robertk.io> Co-authored-by: Robert Kruszewski <github@robertk.io>
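To make the `is_constant && null_count > 0` rule concrete, here is a minimal sketch of the pruning check. `ZoneStats` and `is_not_null_falsified` are hypothetical names, not the Vortex stats API, and the sketch assumes that `is_constant` treats null as a value, so a constant zone containing one null must be entirely null.

```rust
/// Hypothetical per-zone statistics (illustrative field names, not the Vortex API).
#[derive(Debug, Clone, Copy)]
pub struct ZoneStats {
    /// True when every row in the zone holds the same value, nulls included.
    pub is_constant: bool,
    /// Number of null rows in the zone.
    pub null_count: u64,
}

/// Returns true when `is_not_null(column)` is provably false for every row of the
/// zone, which means the zone can be skipped without reading it.
pub fn is_not_null_falsified(stats: &ZoneStats) -> bool {
    // A constant zone with at least one null can only be all-null.
    stats.is_constant && stats.null_count > 0
}
```

A zone with some nulls but non-constant values is not pruned, because it may still contain valid rows. A row-count statistic would allow the more direct check `null_count == row_count`, which is presumably what the TODO in the PR refers to.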
## Summary

Tracking issue: vortex-data#7216

Makes the compressor types more robust (removes the possibility of invalid state), which additionally makes it easier to add tracing (draft at vortex-data#7385).

## API Changes

Changes some types:

```rust
/// Closure type for [`DeferredEstimate::Callback`].
///
/// The compressor calls this with the same arguments it would pass to sampling. The closure must
/// resolve directly to a terminal [`EstimateVerdict`].
#[rustfmt::skip]
pub type EstimateFn = dyn FnOnce(
        &CascadingCompressor,
        &mut ArrayAndStats,
        CompressorContext,
    ) -> VortexResult<EstimateVerdict>
    + Send
    + Sync;

/// The result of a [`Scheme`]'s compression ratio estimation.
///
/// This type is returned by [`Scheme::expected_compression_ratio`] to tell the compressor how
/// promising this scheme is for a given array without performing any expensive work.
///
/// [`CompressionEstimate::Verdict`] means the scheme already knows the terminal answer.
/// [`CompressionEstimate::Deferred`] means the compressor must do extra work before the scheme can
/// produce a terminal answer.
#[derive(Debug)]
pub enum CompressionEstimate {
    /// The scheme already knows the terminal estimation verdict.
    Verdict(EstimateVerdict),

    /// The compressor must perform deferred work to resolve the terminal estimation verdict.
    Deferred(DeferredEstimate),
}

/// The terminal answer to a compression estimate request.
#[derive(Debug)]
pub enum EstimateVerdict {
    /// Do not use this scheme for this array.
    Skip,

    /// Always use this scheme, as it is definitively the best choice.
    ///
    /// Some examples include constant detection, decimal byte parts, and temporal decomposition.
    ///
    /// The compressor will select this scheme immediately without evaluating further candidates.
    /// Schemes that return `AlwaysUse` must be mutually exclusive per canonical type (enforced by
    /// [`Scheme::matches`]), otherwise the winner depends silently on registration order.
    ///
    /// [`Scheme::matches`]: crate::scheme::Scheme::matches
    AlwaysUse,

    /// The estimated compression ratio. This must be greater than `1.0` to be considered by the
    /// compressor, otherwise it is worse than the canonical encoding.
    Ratio(f64),
}

/// Deferred work that can resolve to a terminal [`EstimateVerdict`].
pub enum DeferredEstimate {
    /// The scheme cannot cheaply estimate its ratio, so the compressor should compress a small
    /// sample to determine effectiveness.
    Sample,

    /// A fallible estimation requiring a custom expensive computation.
    ///
    /// Use this only when the scheme needs to perform trial encoding or other costly checks to
    /// determine its compression ratio. The callback returns an [`EstimateVerdict`] directly, so
    /// it cannot request more sampling or another deferred callback.
    Callback(Box<EstimateFn>),
}
```

This will also make some changes that we want to make in the future easier (tracing, better decision making for what things to try, etc).

## Testing

Some new tests

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
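The sketch below shows how a caller might collapse such an estimate into a terminal verdict. It deliberately uses simplified stand-ins: the enums mirror the shape of the types above, but the callback takes no arguments and is infallible, and `resolve`, `is_candidate`, and the `sample_ratio` hook are hypothetical names, not part of the Vortex compressor.

```rust
/// Simplified stand-in for the terminal verdict (same variants as in the PR).
#[derive(Debug, PartialEq)]
pub enum EstimateVerdict {
    Skip,
    AlwaysUse,
    Ratio(f64),
}

/// Simplified stand-in: the real callback takes compressor arguments and is fallible.
pub enum DeferredEstimate {
    Sample,
    Callback(Box<dyn FnOnce() -> EstimateVerdict>),
}

pub enum CompressionEstimate {
    Verdict(EstimateVerdict),
    Deferred(DeferredEstimate),
}

/// Resolves any estimate to a terminal verdict in a single step.
/// `sample_ratio` is a hypothetical hook standing in for compressing a small sample.
pub fn resolve(
    estimate: CompressionEstimate,
    sample_ratio: impl FnOnce() -> f64,
) -> EstimateVerdict {
    match estimate {
        CompressionEstimate::Verdict(verdict) => verdict,
        CompressionEstimate::Deferred(DeferredEstimate::Sample) => {
            EstimateVerdict::Ratio(sample_ratio())
        }
        // The callback can only return a verdict, so no further deferral is possible.
        CompressionEstimate::Deferred(DeferredEstimate::Callback(callback)) => callback(),
    }
}

/// A verdict is worth considering only if it is `AlwaysUse` or a ratio above 1.0.
pub fn is_candidate(verdict: &EstimateVerdict) -> bool {
    match verdict {
        EstimateVerdict::Skip => false,
        EstimateVerdict::AlwaysUse => true,
        EstimateVerdict::Ratio(ratio) => *ratio > 1.0,
    }
}
```

Because a deferred estimate can only produce an `EstimateVerdict`, resolution always terminates after one step, which is the kind of invalid state the type split rules out.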
## Summary Closes: #000 Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
These overrides were required at some point in our codebase to get getrandom v0.2 and v0.3 to coexist in the wasm setup. However, the ecosystem has moved on from getrandom v0.2 and the override situation is no longer as complicated. It seems we can remove wholesale these overrides, which we kept around for way too long. Fixes vortex-data#7382 Signed-off-by: Robert Kruszewski <github@robertk.io>
…tex-data#7401) Some of our scan profiles show that 10% of scan CPU time is spent in integer widening casts (nullable dictionary codes). This commit simplifies and optimizes primitive casts by hoisting much of the branching logic out of the hot loop. Specifically, it relies on values_fit_in to verify representability up front, so that the hot loop can avoid a potential validity and error check. Additionally, from_trusted_len_iter lets the destination BufferMut optimize the actual cast instead of using push_unchecked for each element.

Results from running the benchmark from vortex-data#7400 locally.

Before:

```
$ cargo bench -p vortex-array --bench cast_primitive
    Finished `bench` profile [optimized + debuginfo] target(s) in 0.16s
     Running benches/cast_primitive.rs (target/release/deps/cast_primitive-598823f32b8f3db0)
Timer precision: 41 ns
cast_primitive      fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ cast_u16_to_u32  384.2 µs      │ 491.6 µs      │ 395.1 µs      │ 397.3 µs      │ 100     │ 100
```

After:

```
$ cargo bench -p vortex-array --bench cast_primitive
    Finished `bench` profile [optimized + debuginfo] target(s) in 0.17s
     Running benches/cast_primitive.rs (target/release/deps/cast_primitive-598823f32b8f3db0)
Timer precision: 41 ns
cast_primitive      fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ cast_u16_to_u32  6.874 µs      │ 543.7 µs      │ 6.999 µs      │ 12.7 µs       │ 100     │ 100
```

Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
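The hoisting idea can be illustrated with plain slices. This is only a sketch of the technique under simplified assumptions: it uses `Vec` rather than Vortex buffers, ignores validity, and the up-front maximum check merely plays the role that `values_fit_in` has in the commit. The function names are hypothetical.

```rust
/// Narrowing cast with the representability check hoisted out of the loop.
/// Returns `None` if any value does not fit in `u16`.
pub fn cast_u32_to_u16(values: &[u32]) -> Option<Vec<u16>> {
    // One pass decides whether every value is representable in the target type.
    let max = values.iter().copied().max().unwrap_or(0);
    if max > u32::from(u16::MAX) {
        return None;
    }
    // The conversion loop now has no per-element error branch: `as` cannot
    // truncate here because of the check above.
    Some(values.iter().map(|&v| v as u16).collect())
}

/// Widening is always representable, so no check is needed at all.
pub fn cast_u16_to_u32(values: &[u16]) -> Vec<u32> {
    // The slice iterator reports an exact length, so `collect` can size the
    // destination once instead of growing it element by element.
    values.iter().map(|&v| u32::from(v)).collect()
}
```

Separating the check from the conversion costs an extra pass in the narrowing case, but it leaves a simple map over contiguous memory that the compiler has a much better chance of vectorizing.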
This PR contains the following updates: | Package | Change | [Age](https://docs.renovatebot.com/merge-confidence/) | [Confidence](https://docs.renovatebot.com/merge-confidence/) | |---|---|---|---| | [pytest](https://redirect.github.com/pytest-dev/pytest) ([changelog](https://docs.pytest.org/en/stable/changelog.html)) | `8.4.2` → `9.0.3` |  |  | --- > [!WARNING] > Some dependencies could not be looked up. Check the [Dependency Dashboard](..vortex-data/issues/357) for more information. ### GitHub Vulnerability Alerts #### [CVE-2025-71176](https://nvd.nist.gov/vuln/detail/CVE-2025-71176) pytest through 9.0.2 on UNIX relies on directories with the `/tmp/pytest-of-{user}` name pattern, which allows local users to cause a denial of service or possibly gain privileges. --- ### Release Notes <details> <summary>pytest-dev/pytest (pytest)</summary> ### [`v9.0.3`](https://redirect.github.com/pytest-dev/pytest/releases/tag/9.0.3) [Compare Source](https://redirect.github.com/pytest-dev/pytest/compare/9.0.2...9.0.3) ### pytest 9.0.3 (2026-04-07) #### Bug fixes - [#​12444](https://redirect.github.com/pytest-dev/pytest/issues/12444): Fixed `pytest.approx` which now correctly takes into account `~collections.abc.Mapping` keys order to compare them. - [#​13634](https://redirect.github.com/pytest-dev/pytest/issues/13634): Blocking a `conftest.py` file using the `-p no:` option is now explicitly disallowed. Previously this resulted in an internal assertion failure during plugin loading. Pytest now raises a clear `UsageError` explaining that conftest files are not plugins and cannot be disabled via `-p`. - [#​13734](https://redirect.github.com/pytest-dev/pytest/issues/13734): Fixed crash when a test raises an exceptiongroup with `__tracebackhide__ = True`. - [#​14195](https://redirect.github.com/pytest-dev/pytest/issues/14195): Fixed an issue where non-string messages passed to <span class="title-ref">unittest.TestCase.subTest()</span> were not printed. 
- [#​14343](https://redirect.github.com/pytest-dev/pytest/issues/14343): Fixed use of insecure temporary directory (CVE-2025-71176). #### Improved documentation - [#​13388](https://redirect.github.com/pytest-dev/pytest/issues/13388): Clarified documentation for `-p` vs `PYTEST_PLUGINS` plugin loading and fixed an incorrect `-p` example. - [#​13731](https://redirect.github.com/pytest-dev/pytest/issues/13731): Clarified that capture fixtures (e.g. `capsys` and `capfd`) take precedence over the `-s` / `--capture=no` command-line options in `Accessing captured output from a test function <accessing-captured-output>`. - [#​14088](https://redirect.github.com/pytest-dev/pytest/issues/14088): Clarified that the default `pytest_collection` hook sets `session.items` before it calls `pytest_collection_finish`, not after. - [#​14255](https://redirect.github.com/pytest-dev/pytest/issues/14255): TOML integer log levels must be quoted: Updating reference documentation. #### Contributor-facing changes - [#​12689](https://redirect.github.com/pytest-dev/pytest/issues/12689): The test reports are now published to Codecov from GitHub Actions. The test statistics is visible [on the web interface](https://app.codecov.io/gh/pytest-dev/pytest/tests). \-- by `aleguy02` ### [`v9.0.2`](https://redirect.github.com/pytest-dev/pytest/releases/tag/9.0.2) [Compare Source](https://redirect.github.com/pytest-dev/pytest/compare/9.0.1...9.0.2) ### pytest 9.0.2 (2025-12-06) #### Bug fixes - [#​13896](https://redirect.github.com/pytest-dev/pytest/issues/13896): The terminal progress feature added in pytest 9.0.0 has been disabled by default, except on Windows, due to compatibility issues with some terminal emulators. You may enable it again by passing `-p terminalprogress`. We may enable it by default again once compatibility improves in the future. Additionally, when the environment variable `TERM` is `dumb`, the escape codes are no longer emitted, even if the plugin is enabled. 
- [#​13904](https://redirect.github.com/pytest-dev/pytest/issues/13904): Fixed the TOML type of the `tmp_path_retention_count` settings in the API reference from number to string. - [#​13946](https://redirect.github.com/pytest-dev/pytest/issues/13946): The private `config.inicfg` attribute was changed in a breaking manner in pytest 9.0.0. Due to its usage in the ecosystem, it is now restored to working order using a compatibility shim. It will be deprecated in pytest 9.1 and removed in pytest 10. - [#​13965](https://redirect.github.com/pytest-dev/pytest/issues/13965): Fixed quadratic-time behavior when handling `unittest` subtests in Python 3.10. #### Improved documentation - [#​4492](https://redirect.github.com/pytest-dev/pytest/issues/4492): The API Reference now contains cross-reference-able documentation of `pytest's command-line flags <command-line-flags>`. ### [`v9.0.1`](https://redirect.github.com/pytest-dev/pytest/releases/tag/9.0.1) [Compare Source](https://redirect.github.com/pytest-dev/pytest/compare/9.0.0...9.0.1) ### pytest 9.0.1 (2025-11-12) #### Bug fixes - [#​13895](https://redirect.github.com/pytest-dev/pytest/issues/13895): Restore support for skipping tests via `raise unittest.SkipTest`. - [#​13896](https://redirect.github.com/pytest-dev/pytest/issues/13896): The terminal progress plugin added in pytest 9.0 is now automatically disabled when iTerm2 is detected, it generated desktop notifications instead of the desired functionality. - [#​13904](https://redirect.github.com/pytest-dev/pytest/issues/13904): Fixed the TOML type of the verbosity settings in the API reference from number to string. - [#​13910](https://redirect.github.com/pytest-dev/pytest/issues/13910): Fixed <span class="title-ref">UserWarning: Do not expect file\_or\_dir</span> on some earlier Python 3.12 and 3.13 point versions. 
#### Packaging updates and notes for downstreams - [#​13933](https://redirect.github.com/pytest-dev/pytest/issues/13933): The tox configuration has been adjusted to make sure the desired version string can be passed into its `package_env` through the `SETUPTOOLS_SCM_PRETEND_VERSION_FOR_PYTEST` environment variable as a part of the release process -- by `webknjaz`. #### Contributor-facing changes - [#​13891](https://redirect.github.com/pytest-dev/pytest/issues/13891), [#​13942](https://redirect.github.com/pytest-dev/pytest/issues/13942): The CI/CD part of the release automation is now capable of creating GitHub Releases without having a Git checkout on disk -- by `bluetech` and `webknjaz`. - [#​13933](https://redirect.github.com/pytest-dev/pytest/issues/13933): The tox configuration has been adjusted to make sure the desired version string can be passed into its `package_env` through the `SETUPTOOLS_SCM_PRETEND_VERSION_FOR_PYTEST` environment variable as a part of the release process -- by `webknjaz`. ### [`v9.0.0`](https://redirect.github.com/pytest-dev/pytest/releases/tag/9.0.0) [Compare Source](https://redirect.github.com/pytest-dev/pytest/compare/8.4.2...9.0.0) ### pytest 9.0.0 (2025-11-05) #### New features - [#​1367](https://redirect.github.com/pytest-dev/pytest/issues/1367): **Support for subtests** has been added. `subtests <subtests>` are an alternative to parametrization, useful in situations where the parametrization values are not all known at collection time. Example: ```python def contains_docstring(p: Path) -> bool: """Return True if the given Python file contains a top-level docstring.""" ... def test_py_files_contain_docstring(subtests: pytest.Subtests) -> None: for path in Path.cwd().glob("*.py"): with subtests.test(path=str(path)): assert contains_docstring(path) ``` Each assert failure or error is caught by the context manager and reported individually, giving a clear picture of all files that are missing a docstring. 
In addition, `unittest.TestCase.subTest` is now also supported. This feature was originally implemented as a separate plugin in [pytest-subtests](https://redirect.github.com/pytest-dev/pytest-subtests), but since then has been merged into the core. > \[!NOTE] > This feature is experimental and will likely evolve in future releases. By that we mean that we might change how subtests are reported on failure, but the functionality and how to use it are stable. - [#​13743](https://redirect.github.com/pytest-dev/pytest/issues/13743): Added support for **native TOML configuration files**. While pytest, since version 6, supports configuration in `pyproject.toml` files under `[tool.pytest.ini_options]`, it does so in an "INI compatibility mode", where all configuration values are treated as strings or list of strings. Now, pytest supports the native TOML data model. In `pyproject.toml`, the native TOML configuration is under the `[tool.pytest]` table. ```toml # pyproject.toml [tool.pytest] minversion = "9.0" addopts = ["-ra", "-q"] testpaths = [ "tests", "integration", ] ``` The `[tool.pytest.ini_options]` table remains supported, but both tables cannot be used at the same time. If you prefer to use a separate configuration file, or don't use `pyproject.toml`, you can use `pytest.toml` or `.pytest.toml`: ```toml # pytest.toml or .pytest.toml [pytest] minversion = "9.0" addopts = ["-ra", "-q"] testpaths = [ "tests", "integration", ] ``` The documentation now (sometimes) shows configuration snippets in both TOML and INI formats, in a tabbed interface. See `config file formats` for full details. - [#​13823](https://redirect.github.com/pytest-dev/pytest/issues/13823): Added a **"strict mode"** enabled by the `strict` configuration option. When set to `true`, the `strict` option currently enables - `strict_config` - `strict_markers` - `strict_parametrization_ids` - `strict_xfail` The individual strictness options can be explicitly set to override the global `strict` setting. 
The previously-deprecated `--strict` command-line flag now enables strict mode. If pytest adds new strictness options in the future, they will also be enabled in strict mode. Therefore, you should only enable strict mode if you use a pinned/locked version of pytest, or if you want to proactively adopt new strictness options as they are added. See `strict mode` for more details. - [#​13737](https://redirect.github.com/pytest-dev/pytest/issues/13737): Added the `strict_parametrization_ids` configuration option. When set, pytest emits an error if it detects non-unique parameter set IDs, rather than automatically making the IDs unique by adding <span class="title-ref">0</span>, <span class="title-ref">1</span>, ... to them. This can be particularly useful for catching unintended duplicates. - [#​13072](https://redirect.github.com/pytest-dev/pytest/issues/13072): Added support for displaying test session **progress in the terminal tab** using the [OSC 9;4;](https://conemu.github.io/en/AnsiEscapeCodes.html#ConEmu_specific_OSC) ANSI sequence. When pytest runs in a supported terminal emulator like ConEmu, Gnome Terminal, Ptyxis, Windows Terminal, Kitty or Ghostty, you'll see the progress in the terminal tab or window, allowing you to monitor pytest's progress at a glance. This feature is automatically enabled when running in a TTY. It is implemented as an internal plugin. If needed, it can be disabled as follows: - On a user level, using `-p no:terminalprogress` on the command line or via an environment variable `PYTEST_ADDOPTS='-p no:terminalprogress'`. - On a project configuration level, using `addopts = "-p no:terminalprogress"`. - [#​478](https://redirect.github.com/pytest-dev/pytest/issues/478): Support PEP420 (implicit namespace packages) as <span class="title-ref">--pyargs</span> target when `consider_namespace_packages` is <span class="title-ref">true</span> in the config. Previously, this option only impacted package imports, now it also impacts tests discovery. 
- [#​13678](https://redirect.github.com/pytest-dev/pytest/issues/13678): Added a new `faulthandler_exit_on_timeout` configuration option set to "false" by default to let <span class="title-ref">faulthandler</span> interrupt the <span class="title-ref">pytest</span> process after a timeout in case of deadlock. Previously, a <span class="title-ref">faulthandler</span> timeout would only dump the traceback of all threads to stderr, but would not interrupt the <span class="title-ref">pytest</span> process. \-- by `ogrisel`. - [#​13829](https://redirect.github.com/pytest-dev/pytest/issues/13829): Added support for configuration option aliases via the `aliases` parameter in `Parser.addini() <pytest.Parser.addini>`. Plugins can now register alternative names for configuration options, allowing for more flexibility in configuration naming and supporting backward compatibility when renaming options. The canonical name always takes precedence if both the canonical name and an alias are specified in the configuration file. #### Improvements in existing functionality - [#​13330](https://redirect.github.com/pytest-dev/pytest/issues/13330): Having pytest configuration spread over more than one file (for example having both a `pytest.ini` file and `pyproject.toml` with a `[tool.pytest.ini_options]` table) will now print a warning to make it clearer to the user that only one of them is actually used. \-- by `sgaist` - [#​13574](https://redirect.github.com/pytest-dev/pytest/issues/13574): The single argument `--version` no longer loads the entire plugin infrastructure, making it faster and more reliable when displaying only the pytest version. Passing `--version` twice (e.g., `pytest --version --version`) retains the original behavior, showing both the pytest version and plugin information. > \[!NOTE] > Since `--version` is now processed early, it only takes effect when passed directly via the command line. 
It will not work if set through other mechanisms, such as `PYTEST_ADDOPTS` or `addopts`. - [#​13823](https://redirect.github.com/pytest-dev/pytest/issues/13823): Added `strict_xfail` as an alias to the `xfail_strict` option, `strict_config` as an alias to the `--strict-config` flag, and `strict_markers` as an alias to the `--strict-markers` flag. This makes all strictness options consistently have configuration options with the prefix `strict_`. - [#​13700](https://redirect.github.com/pytest-dev/pytest/issues/13700): <span class="title-ref">--junitxml</span> no longer prints the <span class="title-ref">generated xml file</span> summary at the end of the pytest session when <span class="title-ref">--quiet</span> is given. - [#​13732](https://redirect.github.com/pytest-dev/pytest/issues/13732): Previously, when filtering warnings, pytest would fail if the filter referenced a class that could not be imported. Now, this only outputs a message indicating the problem. - [#​13859](https://redirect.github.com/pytest-dev/pytest/issues/13859): Clarify the error message for <span class="title-ref">pytest.raises()</span> when a regex <span class="title-ref">match</span> fails. - [#​13861](https://redirect.github.com/pytest-dev/pytest/issues/13861): Better sentence structure in a test's expected error message. Previously, the error message would be "expected exception must be \<expected>, but got \<actual>". Now, it is "Expected \<expected>, but got \<actual>". #### Removals and backward incompatible breaking changes - [#​12083](https://redirect.github.com/pytest-dev/pytest/issues/12083): Fixed a bug where an invocation such as <span class="title-ref">pytest a/ a/b</span> would cause only tests from <span class="title-ref">a/b</span> to run, and not other tests under <span class="title-ref">a/</span>. The fix entails a few breaking changes to how such overlapping arguments and duplicates are handled: 1. 
<span class="title-ref">pytest a/b a/</span> or <span class="title-ref">pytest a/ a/b</span> are equivalent to <span class="title-ref">pytest a</span>; if an argument overlaps another arguments, only the prefix remains. 2. <span class="title-ref">pytest x.py x.py</span> is equivalent to <span class="title-ref">pytest x.py</span>; previously such an invocation was taken as an explicit request to run the tests from the file twice. If you rely on these behaviors, consider using `--keep-duplicates <duplicate-paths>`, which retains its existing behavior (including the bug). - [#​13719](https://redirect.github.com/pytest-dev/pytest/issues/13719): Support for Python 3.9 is dropped following its end of life. - [#​13766](https://redirect.github.com/pytest-dev/pytest/issues/13766): Previously, pytest would assume it was running in a CI/CD environment if either of the environment variables <span class="title-ref">$CI</span> or <span class="title-ref">$BUILD\_NUMBER</span> was defined; now, CI mode is only activated if at least one of those variables is defined and set to a *non-empty* value. - [#​13779](https://redirect.github.com/pytest-dev/pytest/issues/13779): **PytestRemovedIn9Warning deprecation warnings are now errors by default.** Following our plan to remove deprecated features with as little disruption as possible, all warnings of type `PytestRemovedIn9Warning` now generate errors instead of warning messages by default. **The affected features will be effectively removed in pytest 9.1**, so please consult the `deprecations` section in the docs for directions on how to update existing code. In the pytest `9.0.X` series, it is possible to change the errors back into warnings as a stopgap measure by adding this to your `pytest.ini` file: ```ini [pytest] filterwarnings = ignore::pytest.PytestRemovedIn9Warning ``` But this will stop working when pytest `9.1` is released. **If you have concerns** about the removal of a specific feature, please add a comment to `13779`. 
#### Deprecations (removal in next major release)

- [#13807](https://redirect.github.com/pytest-dev/pytest/issues/13807): `monkeypatch.syspath_prepend()` now issues a deprecation warning when the prepended path contains legacy namespace packages (those using `pkg_resources.declare_namespace()`). Users should migrate to native namespace packages (PEP 420). See `monkeypatch-fixup-namespace-packages` for details.

#### Bug fixes

- [#13445](https://redirect.github.com/pytest-dev/pytest/issues/13445): Made the type annotations of `pytest.skip` and friends more spec-compliant so that they work across more type checkers.
- [#13537](https://redirect.github.com/pytest-dev/pytest/issues/13537): Fixed a bug in which an `ExceptionGroup` with only `Skipped` exceptions in teardown was not handled correctly and showed as an error.
- [#13598](https://redirect.github.com/pytest-dev/pytest/issues/13598): Fixed possible collection confusion on Windows when short paths and symlinks are involved.
- [#13716](https://redirect.github.com/pytest-dev/pytest/issues/13716): Fixed a bug where a nonsensical invocation like `pytest x.py[a]` (a file cannot be parametrized) was silently treated as `pytest x.py`. This is now a usage error.
- [#13722](https://redirect.github.com/pytest-dev/pytest/issues/13722): Fixed a misleading assertion failure message when using `pytest.approx` on mappings with differing lengths.
- [#13773](https://redirect.github.com/pytest-dev/pytest/issues/13773): Fixed the static fixture closure calculation to properly consider transitive dependencies requested by overridden fixtures.
- [#13816](https://redirect.github.com/pytest-dev/pytest/issues/13816): Fixed `pytest.approx`, which now returns a clearer error message when comparing mappings with different keys.
- [#13849](https://redirect.github.com/pytest-dev/pytest/issues/13849): Hidden `.pytest.ini` files are now picked up as the config file even if empty.
This was an inconsistency with non-hidden `pytest.ini`.
- [#13865](https://redirect.github.com/pytest-dev/pytest/issues/13865): Fixed `--show-capture` with `--tb=line`.
- [#13522](https://redirect.github.com/pytest-dev/pytest/issues/13522): Fixed `pytester` in subprocess mode ignoring all `pytester.plugins` except the first. Fixed `pytester` in subprocess mode silently ignoring non-str `pytester.plugins`. Now it errors instead. If you are affected by this, specify the plugin by name, or switch the affected tests to use `pytester.runpytest_inprocess` explicitly instead.

#### Packaging updates and notes for downstreams

- [#13791](https://redirect.github.com/pytest-dev/pytest/issues/13791): Minimum requirements on `iniconfig` and `packaging` were bumped to `1.0.1` and `22.0.0`, respectively.

#### Contributor-facing changes

- [#12244](https://redirect.github.com/pytest-dev/pytest/issues/12244): Fixed self-test failures when `TERM=dumb`.
- [#12474](https://redirect.github.com/pytest-dev/pytest/issues/12474): Added scheduled GitHub Action Workflow to run Sphinx linkchecks in repo documentation.
- [#13621](https://redirect.github.com/pytest-dev/pytest/issues/13621): pytest's own testsuite now handles the `lsof` command hanging (e.g. due to unreachable network filesystems), with the affected selftests being skipped after 10 seconds.
- [#13638](https://redirect.github.com/pytest-dev/pytest/issues/13638): Fixed deprecated `gh pr new` command in `scripts/prepare-release-pr.py`. The script now uses `gh pr create`, which is compatible with GitHub CLI v2.0+.
- [#​13695](https://redirect.github.com/pytest-dev/pytest/issues/13695): Flush <span class="title-ref">stdout</span> and <span class="title-ref">stderr</span> in <span class="title-ref">Pytester.run</span> to avoid truncated outputs in <span class="title-ref">test\_faulthandler.py::test\_timeout</span> on CI -- by `ogrisel`. - [#​13771](https://redirect.github.com/pytest-dev/pytest/issues/13771): Skip <span class="title-ref">test\_do\_not\_collect\_symlink\_siblings</span> on Windows environments without symlink support to avoid false negatives. - [#​13841](https://redirect.github.com/pytest-dev/pytest/issues/13841): `tox>=4` is now required when contributing to pytest. - [#​13625](https://redirect.github.com/pytest-dev/pytest/issues/13625): Added missing docstrings to `pytest_addoption()`, `pytest_configure()`, and `cacheshow()` functions in `cacheprovider.py`. #### Miscellaneous internal changes - [#​13830](https://redirect.github.com/pytest-dev/pytest/issues/13830): Configuration overrides (`-o`/`--override-ini`) are now processed during startup rather than during `config.getini() <pytest.Config.getini>`. </details> --- ### Configuration 📅 **Schedule**: (UTC) - Branch creation - "" - Automerge - At any time (no schedule defined) 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about these updates again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR was generated by [Mend Renovate](https://mend.io/renovate/). View the [repository job log](https://developer.mend.io/github/vortex-data/vortex). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xMTAuMiIsInVwZGF0ZWRJblZlciI6IjQzLjExMC4yIiwidGFyZ2V0QnJhbmNoIjoiZGV2ZWxvcCIsImxhYmVscyI6WyJjaGFuZ2Vsb2cvY2hvcmUiXX0=--> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…trategy (vortex-data#7422) ## Summary This allows users to e.g. disable UncompressedSizeInBytes computation, which may be expensive for deeply nested types. ## Testing Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
vortex-data#7423) ## Summary ListView sizes are checked on each element iteration. This can cause unnecessary calls to arrays_value_equal if most but not all sizes are equal. This commit pulls up some fast checks we can do before executing the expensive arrays_value_equal loop. Concretely, only execute the loop if all sizes are equal *and* non-zero *and* offsets are not all equal, since the inverse is trivially constant. [Profile link here, we're seeing that this check is expensive](https://pprof.me/39a57fd47bacf99f6669aa6ba5beed31/?profile_filters=p%3Ahide_libc%3AHide%2520libc%2Cp%3Ahide_tokio_frames%3AHide%2520Tokio%2520Frames%2Cp%3Ahide_rust_futures%3AHide%2520Rust%2520Futures%2520Infrastructure%2Cp%3Ahide_rust_panic_backtrace%3AHide%2520Rust%2520Panic%2520Backtrace%2520Infrastructure) Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
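The ordering of checks described in that commit can be sketched roughly as follows. This is only an illustration over plain slices, not the Vortex implementation: `list_view_is_constant` and its signature are hypothetical, and the element-wise slice comparison stands in for the `arrays_value_equal` loop.

```rust
/// Returns true if every list in the view holds the same values.
/// List `i` is `elements[offsets[i]..offsets[i] + sizes[i]]`.
/// Hypothetical helper for illustration; not part of the Vortex API.
fn list_view_is_constant(offsets: &[usize], sizes: &[usize], elements: &[i64]) -> bool {
    assert_eq!(offsets.len(), sizes.len());
    let Some((&first_size, rest_sizes)) = sizes.split_first() else {
        // Treating an empty view as constant is a choice made for this sketch.
        return true;
    };
    // Cheap check 1: any size mismatch means the lists cannot all be equal.
    if rest_sizes.iter().any(|&s| s != first_size) {
        return false;
    }
    // Cheap check 2: all lists are empty, so they are trivially equal.
    if first_size == 0 {
        return true;
    }
    // Cheap check 3: all lists point at the same slice of `elements`.
    let first_offset = offsets[0];
    if offsets.iter().all(|&o| o == first_offset) {
        return true;
    }
    // Only now pay for the expensive element-wise comparison.
    let first = &elements[first_offset..first_offset + first_size];
    offsets[1..]
        .iter()
        .all(|&o| &elements[o..o + first_size] == first)
}
```

The three cheap checks only read the offsets and sizes, so the element buffer is touched in the one remaining case where the answer is not already known.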
The pr will get pinged earlier and the author can choose to close it or remove the label if they want another 14 days Signed-off-by: Robert Kruszewski <github@robertk.io>
## Summary When deserializing an array encoded using the experimental patched encoding `ArrayPlugin`, reads can fail because of this assertion. `ArrayPlugin`s can/should be able to support multiple codecs. The alternative to this is to change ArrayPlugin to return a set of ArrayIds that it covers, rather than a single ID. --------- Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Create execution context outside of benchmark hot loop --------- Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
) Adds `ExecutionStep::AppendChild` and `execute_into_builder` to support iterative execution that appends child arrays directly into builders, avoiding intermediate materialization for chunked arrays. Increase the max iteration limit to 2^22 from 128 part of: vortex-data#7674 --------- Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Add file_index, file_row_number virtual columns. Add file-based filtering (range, selection) to ScanRequest. Add partition index method. Add late materialization support and row id columns support in duckdb. Attempt 1 was accidentally merged at vortex-data#7631 and reverted Signed-off-by: Mikhail Kot <to@myrrc.dev>
…#7710) Casting validity correctly advertises that it might do compute by requiring a context. We split the casting logic between CastReduce rules, which can handle all NonNullable -> Nullable and Nullable -> NonNullable casts if stats have been calculated, and CastKernels, which need to handle the full generality of casts (in practice, Nullable to NonNullable casts). --------- Signed-off-by: Robert Kruszewski <github@robertk.io>
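One way to picture the split between a metadata-only rule and a compute kernel is the sketch below. It is an assumption-laden illustration: the `Validity` enum, `cast_validity_reduce`, and `cast_validity_kernel` are hypothetical stand-ins, not the Vortex types or signatures, and the cached null count models "stats have been calculated".

```rust
/// Simplified stand-in for a validity representation (hypothetical).
#[derive(Clone, Debug, PartialEq)]
enum Validity {
    NonNullable,
    AllValid,
    AllInvalid,
    Array(Vec<bool>),
}

/// Metadata-only path: answers without scanning rows, or returns `None`
/// to signal that a kernel has to do the work.
fn cast_validity_reduce(
    v: &Validity,
    to_nullable: bool,
    cached_null_count: Option<usize>,
) -> Option<Result<Validity, String>> {
    match (v, to_nullable) {
        // Widening to nullable never needs to look at the data.
        (Validity::NonNullable, true) => Some(Ok(Validity::AllValid)),
        (_, true) => Some(Ok(v.clone())),
        // Narrowing is free when the representation already rules out nulls.
        (Validity::NonNullable | Validity::AllValid, false) => Some(Ok(Validity::NonNullable)),
        // Otherwise it is only decidable here if a null-count stat is cached.
        (_, false) => match cached_null_count {
            Some(0) => Some(Ok(Validity::NonNullable)),
            Some(n) => Some(Err(format!("cannot cast to non-nullable: {n} nulls"))),
            None => None,
        },
    }
}

/// Compute path: handles every case, scanning the mask when needed.
fn cast_validity_kernel(v: &Validity, len: usize, to_nullable: bool) -> Result<Validity, String> {
    if to_nullable {
        return Ok(match v {
            Validity::NonNullable => Validity::AllValid,
            other => other.clone(),
        });
    }
    let null_count = match v {
        Validity::NonNullable | Validity::AllValid => 0,
        Validity::AllInvalid => len,
        Validity::Array(bits) => bits.iter().filter(|&&b| !b).count(),
    };
    if null_count == 0 {
        Ok(Validity::NonNullable)
    } else {
        Err(format!("cannot cast to non-nullable: {null_count} nulls"))
    }
}
```

A caller would try the reduce function first and fall back to the kernel only when it returns `None`, which is where the need for an execution context comes from.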
## Summary - Closes vortex-data#7731 Turns out I didn't wire the expression convertor extension point correctly. This PR both fixes that AND adds a test to make sure this behavior is maintained. Signed-off-by: Adam Gutglick <adam@spiraldb.com>
## Summary Just renames the module and also the ID to `"vortex.tensor.fixed_shape_tensor"`. ## Testing N/A --------- Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
…7728) The names should append the public api signature. --------- Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Allow the fuzzer to create longer arrays to exercise more file_io logic
(e.g. zone_map partitions).
Adds a new trait `ArbitraryWith` and `ArbitraryArrayConfig` to express
this.
```rust
/// Trait for generating arbitrary values with a caller-provided configuration.
pub trait ArbitraryWith<'a, C>: Sized {
    /// Generate an arbitrary value using the provided configuration.
    fn arbitrary_with_config(u: &mut Unstructured<'a>, config: &C) -> Result<Self>;
}

/// Configuration for arbitrary array generation.
#[derive(Clone, Debug)]
pub struct ArbitraryArrayConfig {
    /// Fixed dtype, or `None` to generate one from [`Unstructured`].
    pub dtype: Option<DType>,
    /// Inclusive range for the total array length.
    pub len: RangeInclusive<usize>,
}
```
---------
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Signed-off-by: Mikhail Kot <to@myrrc.dev>
## Summary This might help with vortex-data#7699, but it might not. I made the changes though so here we go. Removes the `PartialOrd` implementation for `ScalarValue`, ensuring that we are only able to compare `Scalar`s which carry a `DType`. ## API Changes Removes the `PartialOrd` impl. ## Testing Existing tests should suffice since this just moves some logic around. Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
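The effect of that change can be illustrated with a small sketch in which ordering is defined only on a value that carries its dtype. The types here (`DType`, `ScalarValue`, `Scalar`) are simplified, hypothetical stand-ins rather than the Vortex definitions.

```rust
use std::cmp::Ordering;

/// Simplified stand-in for a logical type (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
enum DType {
    I64,
    F64,
}

/// Raw value with no type information. Deliberately has no `PartialOrd`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ScalarValue {
    Int(i64),
    Float(f64),
}

/// A value paired with its dtype; ordering is defined only at this level.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Scalar {
    dtype: DType,
    value: ScalarValue,
}

impl PartialOrd for Scalar {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        // Scalars of different dtypes are not comparable.
        if self.dtype != other.dtype {
            return None;
        }
        match (self.value, other.value) {
            (ScalarValue::Int(a), ScalarValue::Int(b)) => a.partial_cmp(&b),
            (ScalarValue::Float(a), ScalarValue::Float(b)) => a.partial_cmp(&b),
            _ => None,
        }
    }
}
```

With this shape, writing `a < b` on two bare `ScalarValue`s fails to compile, so every comparison has to go through a `Scalar` and therefore through a dtype check.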
I believe some of the reference implementations are very slow so we can't operate on bigger arrays Signed-off-by: Robert Kruszewski <github@robertk.io>
We don't use it anyway as the right way is to use MultiFileSession and get file footers out of it. Signed-off-by: Mikhail Kot <to@myrrc.dev>
- Remove noisy/synthetic benchmarks from codspeed CI:
- throughput_cuda (pure memory bandwidth, not Vortex logic)
- for_cuda (u8/u16 undersaturate the GPU)
- filter_cuda + zstd_cuda (entire NVIDIA kernels shard, 10-25% cross-run
swing)
- Only build benchmarks each shard needs
- Run only 100M element input sizes on codspeed
- Increase warm_up_time 1ns -> 500ms
- Consistent benchmark naming: cuda/{encoding}/{params}/{size} with size
always the last path segment
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Two small optimizations: - Skip the `is_invalid` vtable dispatch when `dtype` is non-nullable - Demote the dtype-equality post-check to debug assertion. It's an encoding-correctness invariant, not runtime input validation Signed-off-by: Baris Palaska <barispalaska@gmail.com>
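A rough sketch of both optimizations on a toy array type follows. The names `DType`, `Scalar`, `IntArray`, and `scalar_at` are hypothetical simplifications; the real code dispatches `is_invalid` through a vtable, which is modeled here as an ordinary method call.

```rust
/// Simplified stand-in for a dtype that only tracks nullability (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
struct DType {
    nullable: bool,
}

/// A scalar that carries its dtype; `None` models a null value.
#[derive(Debug, PartialEq)]
struct Scalar {
    dtype: DType,
    value: Option<i64>,
}

struct IntArray {
    dtype: DType,
    values: Vec<i64>,
    /// Only consulted when `dtype.nullable` is true.
    validity: Vec<bool>,
}

impl IntArray {
    /// Stands in for the comparatively expensive validity lookup.
    fn is_invalid(&self, i: usize) -> bool {
        !self.validity[i]
    }

    fn scalar_at(&self, i: usize) -> Scalar {
        // Optimization 1: a non-nullable dtype cannot contain nulls,
        // so the short-circuit skips the validity lookup entirely.
        let value = if self.dtype.nullable && self.is_invalid(i) {
            None
        } else {
            Some(self.values[i])
        };
        let scalar = Scalar { dtype: self.dtype, value };
        // Optimization 2: the dtype check guards an encoding invariant,
        // so it is compiled out of release builds.
        debug_assert_eq!(scalar.dtype, self.dtype, "encoding returned a mismatched dtype");
        scalar
    }
}
```

Because of the short-circuit, a non-nullable array in this sketch never reads its `validity` vector, which may even be left empty.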
Merges upstream vortex-data/vortex tag 0.70.0 into the spiceai-52 fork branch. Key conflict resolutions: - cast.rs: Preserved SpiceAI date->timestamp extension cast (PR #28) and integrated upstream extension->storage cast fallback (PR vortex-data#7469) - writer.rs: Kept SpiceAI get_opt<WriteStrategyBuilder> pattern with upstream ALLOWED_ENCODINGS for ArrayContext - strategy.rs: Added impl SessionVar for WriteStrategyBuilder (required by SpiceAI session-based write strategy configuration)
Merge upstream Vortex 0.70.0 into spiceai-52
Signed-off-by: "Luke Kim" <80174+lukekim@users.noreply.github.com>
Pull request overview
This PR fixes casting DecimalArray values into primitive numeric arrays (notably f64) so execution engines (e.g., DataFusion) can successfully coerce decimal columns into floating-point types during query execution.
Changes:
- Add a `DecimalArray` → `DType::Primitive` cast path in the decimal cast kernel.
- Apply decimal scale when converting decimal physical values into primitive numeric values.
- Preserve/cast validity nullability and propagate validity masks while filling invalid slots with default physical values.
- Add regression tests for `decimal(15,2)` → `f64`, including nullable input validity preservation.
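The core of the conversion listed above can be sketched over plain slices. This is a minimal illustration under stated assumptions, not the kernel from this PR: `decimal_to_f64` is a hypothetical name, the physical values are assumed to be `i128`, and the validity mask is modeled as an optional slice of booleans.

```rust
/// Converts decimal physical values to `f64` by applying the scale.
/// `validity[i] == false` marks a null slot; such slots are filled with
/// the default value (0.0) and the caller keeps the mask alongside.
/// Hypothetical helper for illustration only.
fn decimal_to_f64(values: &[i128], scale: i8, validity: Option<&[bool]>) -> Vec<f64> {
    // A decimal with scale `s` represents `physical / 10^s`.
    let divisor = 10f64.powi(i32::from(scale));
    values
        .iter()
        .enumerate()
        .map(|(i, &physical)| {
            let valid = validity.map_or(true, |mask| mask[i]);
            if valid {
                // `i128 as f64` rounds when the value exceeds 53 bits of precision.
                physical as f64 / divisor
            } else {
                f64::default()
            }
        })
        .collect()
}
```

For a `decimal(15,2)` column, a physical value of `12345` with scale `2` becomes approximately `123.45`. Filling null slots with a default keeps the output buffer fully initialized while the validity mask continues to mark those slots as null.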
sgrebnov
approved these changes
May 18, 2026
Signed-off-by: "Luke Kim" <80174+lukekim@users.noreply.github.com>
Signed-off-by: "Luke Kim" <80174+lukekim@users.noreply.github.com>
sgrebnov
approved these changes
May 18, 2026
lukekim added a commit to spiceai/spiceai that referenced this pull request May 19, 2026
lukekim added a commit to spiceai/spiceai that referenced this pull request May 20, 2026
* Vendor Vortex DataFusion for Cayenne * refactor: Improve code readability and consistency in Vortex modules * refactor: Improve documentation and code clarity in cache and exprs modules * refactor: Enhance documentation clarity and improve code consistency across multiple modules * refactor: Improve code clarity and consistency in Vortex modules * Expose Cayenne footer cache and CDC metrics (#10929) * Expose Cayenne footer cache and CDC metrics * Restore Cayenne target file size default * Move Cayenne footer cache to runtime params * refactor: streamline VortexConfig initialization and enhance footer cache drift logging * refactor: simplify VortexConfig initialization by using struct update syntax * refactor: Enhance error handling and return types in expression conversion functions * refactor: Improve code clarity and consistency in scalar conversion and access plan modules * refactor: Simplify scalar conversion logic and improve test error messages * refactor: Update column statistics initialization to use table schema field count * refactor: Improve code clarity by using `map_or_else` for optional values and enhancing documentation formatting * refactor: Enhance Clippy configuration for improved linting during tests * Benchmarks: add publish.workspace = true to chbench-driver Cargo.toml (#10935) * chore: Update Cargo.toml and README.md for repository details; enhance metadata documentation * refactor: Enhance dynamic filter handling and pushdown logic in VortexOpener * refactor: Improve logging format in CayenneAccelerator and enhance ProcessedProjection documentation * docs: Update comment for leftover_projection to improve clarity * test: Use named Vortex persistent snapshots * docs: Add DR-008 for vendoring vortex-datafusion * test(vortex): Decimal-to-f64 cast regression for spiceai/vortex#51 * feat(vortex): Auto-tune scan concurrency * fix(vortex): Improve row deletion adjustment and add tests for edge cases * perf(vortex): Project partition columns 
zero-copy * fix(vortex): Correct operator conversion test cases and improve precision handling in VortexFormat * refactor(vortex): Simplify distinct count calculation and add unit tests * refactor(vortex): Replace scan_listing_tables cache with scan_file_statistics for improved snapshot handling * feat(vortex): introduce ProjectionPushdown enum for scan behavior control and update related implementations * feat(deletion): add iterator method for PositionDeletionVector * fix(sink): use clone method for object_store in write_record_batch_stream_to_files * fix(sink): use Arc::clone for object_store in write_record_batch_stream_to_files * docs: clarify Vortex adapter ownership * fix(scalars): improve UTF-8 conversion handling in TryToDataFusion implementation * fix(docs): improve documentation clarity for Precision conversion and VortexSource methods * fix(docs): improve documentation formatting for Precision conversion and Vortex metrics * fix(docs): improve documentation clarity for Vortex metrics extraction and conversion functions * fix(docs): enhance documentation clarity for Vortex metrics extraction functions * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Sergei Grebnov <sergei.grebnov@gmail.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Fixes decimal array casts to primitive numeric arrays so queries that coerce decimal values to `f64` can execute.

Summary:
- `DecimalArray` cast path for primitive numeric target dtypes.
- `decimal(15,2)` to `f64`, including nullable input.

Verification:
- `cargo +nightly fmt --all`
- `RUSTC_WRAPPER= cargo test -q -p vortex-array decimal::compute::cast -- --nocapture`
- `RUSTC_WRAPPER= cargo build -q -p vortex-datafusion`
- `RUSTC_WRAPPER= cargo clippy -q --all-targets --all-features`