Vendor VSS source as files instead of a submodule (fix Windows build) #17

Merged: phillipleblanc merged 1 commit into spiceai-1.5.3 from phillip/vss-vendored-1.5.3 on Jun 3, 2026

@phillipleblanc

Summary

Replaces the extension/vss/upstream git submodule (→ duckdb-vss) with a committed copy of duckdb-vss's src/. Fixes the Windows build break introduced by the static-VSS work (#16).

Why

duckdb-vss has its own nested duckdb and extension-ci-tools submodules. Cargo recursively checks out all submodules of the duckdb-rs git dependency, so the vss submodule dragged in a full nested duckdb checkout — whose deeply-nested Swift example path exceeds Windows' MAX_PATH (260 chars):

failed to update submodule extension/vss/upstream
  failed to update submodule duckdb
  path too long: 'C:/Users/runneradmin/.cargo/git/checkouts/duckdb-rs-.../extension/vss/upstream/
    duckdb/tools/swift/duckdb-swift/Examples/SwiftUI/ExoplanetExplorer.xcodeproj/.../IDEWorkspaceChecks.plist'
  class=Filesystem (30)

Linux/macOS were unaffected, so it only surfaced on the Windows CI build. A vendored copy has no submodules for Cargo to recurse into.
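
The failure mode above can be checked ahead of time. The following is a minimal sketch, not part of this PR: `paths_over_limit` is a hypothetical helper, and it assumes the classic Win32 rule that MAX_PATH (260) includes the terminating NUL, so at most 259 visible characters fit. The exact check performed by Cargo's git backend may differ.

```python
from pathlib import PurePosixPath

# Classic Win32 limit; it counts the terminating NUL character.
WINDOWS_MAX_PATH = 260


def paths_over_limit(checkout_root, relative_paths, limit=WINDOWS_MAX_PATH):
    """Return the relative paths whose full path would not fit within `limit`.

    Hypothetical helper for illustration: `checkout_root` stands in for the
    directory Cargo checks the git dependency out into.
    """
    too_long = []
    for rel in relative_paths:
        full = str(PurePosixPath(checkout_root) / rel)
        # One extra character is reserved for the trailing NUL.
        if len(full) + 1 > limit:
            too_long.append(rel)
    return too_long
```

Feeding it the file list of a candidate dependency tree (for example, the output of `git ls-files --recurse-submodules`) together with a representative Windows checkout prefix would flag deeply nested paths before they reach the Windows CI build.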

Change

  • Remove the extension/vss/upstream submodule and .gitmodules.
  • Add extension/vss/src/ — a committed copy of duckdb/duckdb-vss@b833341c's src/ (vss_extension.cpp, hnsw/*.cpp, and the header-only usearch/fp16/simsimd headers + their LICENSEs). Same ABI-matched commit as before.
  • vss_config.py: paths extension/vss/upstream/src → extension/vss/src.
  • SKILL.md: documents the vendored approach and why it must stay vendored (the Windows MAX_PATH reason); upgrade procedure now re-vendors the source instead of bumping a submodule.
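
The re-vendoring step mentioned in the last bullet could look roughly like the sketch below. This is an assumption-laden illustration, not the procedure documented in SKILL.md (which is not shown here): `revendor_vss` is a hypothetical name, and it assumes a local checkout of duckdb-vss at the desired commit already exists, with its sources under `src/`.

```python
import shutil
from pathlib import Path


def revendor_vss(upstream_checkout, repo_root):
    """Replace extension/vss/src with a fresh copy of the upstream src/ tree.

    Hypothetical helper: `upstream_checkout` is a local duckdb-vss checkout
    at the pinned commit; `repo_root` is the root of this repository.
    """
    source = Path(upstream_checkout) / "src"
    target = Path(repo_root) / "extension" / "vss" / "src"
    if not source.is_dir():
        raise FileNotFoundError(f"no src/ directory in {upstream_checkout}")
    # Remove the old copy first so files deleted upstream do not linger.
    if target.exists():
        shutil.rmtree(target)
    # Copy only src/; the nested duckdb checkout sits outside it and is never copied.
    shutil.copytree(source, target)
    # Return the vendored file list so it can be reviewed or diffed.
    return sorted(
        p.relative_to(target).as_posix() for p in target.rglob("*") if p.is_file()
    )
```

Copying only `src/` is the point of the design: nothing outside that directory, including the nested submodules, ends up in the tree.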

The generated duckdb.tar.gz content is unchanged (same vss source files); only how they're stored in the tree changes. This is step 1 of the re-chain (duckdb-rs → table-providers → spiceai duckdb#11107 follow).

duckdb-vss has its own nested `duckdb`/`extension-ci-tools` git submodules. Cargo recursively
checks out submodules of the duckdb-rs git dependency, so the vss *submodule* dragged in a full
nested duckdb checkout whose deeply-nested Swift example paths exceed Windows' MAX_PATH (260),
breaking the Windows build (`path too long ... class=Filesystem`).

Replace the extension/vss/upstream submodule with a committed copy of duckdb-vss@b833341c's
src/ (vss_extension.cpp + hnsw/*.cpp + header-only usearch/fp16/simsimd headers + LICENSEs).
No submodule => nothing for Cargo to recurse into => Windows builds. Update vss_config.py paths
(upstream/src -> src) and SKILL.md (vendored copy, not a submodule).
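
A guard against reintroducing the submodule could be sketched as follows. This is an illustrative assumption, not something this commit adds: `submodule_paths` and `submodules_under` are hypothetical helpers that scan the text of a `.gitmodules` file for `path = ...` entries.

```python
import re

# Matches a `path = some/dir` line, with or without leading indentation.
_PATH_LINE = re.compile(r"^\s*path\s*=\s*(.+?)\s*$")


def submodule_paths(gitmodules_text):
    """Extract every `path = ...` value from the text of a .gitmodules file."""
    paths = []
    for line in gitmodules_text.splitlines():
        match = _PATH_LINE.match(line)
        if match:
            paths.append(match.group(1))
    return paths


def submodules_under(gitmodules_text, prefix):
    """Return the submodule paths equal to or nested below `prefix`."""
    prefix = prefix.rstrip("/")
    return [
        p
        for p in submodule_paths(gitmodules_text)
        if p == prefix or p.startswith(prefix + "/")
    ]
```

A CI step could read `.gitmodules` (treating a missing file as empty text) and fail if `submodules_under(text, "extension/vss")` returns anything.
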
Copilot AI review requested due to automatic review settings June 3, 2026 08:48

Copilot AI left a comment

Pull request overview

This pull request removes the extension/vss/upstream git submodule and instead vendors the DuckDB VSS (HNSW) extension source directly under extension/vss/src/ to avoid Cargo recursively fetching nested submodules that break Windows builds due to MAX_PATH limitations.

Changes:

  • Replace the VSS upstream submodule with a committed copy of duckdb-vss’s src/ tree (including header-only deps).
  • Update extension/vss/vss_config.py to point to the vendored extension/vss/src paths.
  • Update extension/vss/SKILL.md to document why vendoring is required and how to upgrade.

Reviewed changes

Copilot reviewed 38 out of 39 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| .gitmodules | Removes VSS submodule entry. |
| extension/vss/vss_config.py | Updates packaging paths to use vendored extension/vss/src. |
| extension/vss/SKILL.md | Documents the vendored approach + upgrade procedure. |
| extension/vss/src/vss_extension.cpp | Vendored VSS extension entrypoint/registration. |
| extension/vss/src/hnsw/CMakeLists.txt | Vendored build-source list for HNSW sources. |
| extension/vss/src/hnsw/hnsw_index.cpp | Vendored HNSW index implementation (usearch-backed). |
| extension/vss/src/hnsw/hnsw_index_macros.cpp | Vendored SQL macros registration for VSS helpers. |
| extension/vss/src/hnsw/hnsw_index_physical_create.cpp | Vendored physical operator to build HNSW indexes. |
| extension/vss/src/hnsw/hnsw_index_physical_create.hpp | Vendored header for physical create-index operator. |
| extension/vss/src/hnsw/hnsw_index_plan.cpp | Vendored index planning for CREATE INDEX (HNSW). |
| extension/vss/src/hnsw/hnsw_index_pragmas.cpp | Vendored pragmas (compact/info) for HNSW indexes. |
| extension/vss/src/hnsw/hnsw_index_scan.cpp | Vendored index scan table function implementation. |
| extension/vss/src/hnsw/hnsw_optimize_expr.cpp | Vendored expression optimizer rules for distance funcs. |
| extension/vss/src/hnsw/hnsw_optimize_join.cpp | Vendored join optimizer for HNSW-based join rewrite. |
| extension/vss/src/hnsw/hnsw_optimize_scan.cpp | Vendored TopN→index-scan optimizer rewrite. |
| extension/vss/src/hnsw/hnsw_optimize_topk.cpp | Vendored min_by→index-scan/top-k optimizer rewrite. |
| extension/vss/src/hnsw/hnsw_topk_operator.cpp | Vendored stub TopK operator registration hook. |
| extension/vss/src/include/aggregate_function_matcher.hpp | Vendored matcher helper header for optimizer rules. |
| extension/vss/src/include/vss_extension.hpp | Vendored extension class declaration. |
| extension/vss/src/include/hnsw/hnsw.hpp | Vendored HNSW module registration header. |
| extension/vss/src/include/hnsw/hnsw_index.hpp | Vendored HNSW index type header. |
| extension/vss/src/include/hnsw/hnsw_index_scan.hpp | Vendored HNSW index scan bind/function header. |
| extension/vss/src/include/fp16/LICENSE | Vendored fp16 license. |
| extension/vss/src/include/fp16/bitcasts.h | Vendored fp16 header-only dependency. |
| extension/vss/src/include/fp16/fp16.h | Vendored fp16 header-only dependency. |
| extension/vss/src/include/usearch/LICENSE | Vendored usearch license. |
| extension/vss/src/include/usearch/duckdb_usearch.hpp | Vendored DuckDB wrapper for usearch config macros. |
| extension/vss/src/include/usearch/index.hpp | Vendored usearch header-only dependency. |
| extension/vss/src/include/usearch/index_dense.hpp | Vendored usearch header-only dependency. |
| extension/vss/src/include/usearch/index_plugins.hpp | Vendored usearch header-only dependency. |
| extension/vss/src/include/simsimd/LICENSE | Vendored simsimd license. |
| extension/vss/src/include/simsimd/types.h | Vendored simsimd header-only dependency. |
| extension/vss/src/include/simsimd/binary.h | Vendored simsimd header-only dependency. |
| extension/vss/src/include/simsimd/geospatial.h | Vendored simsimd header-only dependency. |
| extension/vss/src/include/simsimd/probability.h | Vendored simsimd header-only dependency. |
| extension/vss/src/include/simsimd/dot.h | Vendored simsimd header-only dependency. |
| extension/vss/src/include/simsimd/simsimd.h | Vendored simsimd header-only dependency. |
| extension/vss/src/include/simsimd/spatial.h | Vendored simsimd header-only dependency. |

Comment thread extension/vss/src/hnsw/hnsw_index_macros.cpp
phillipleblanc merged commit 01dbffd into spiceai-1.5.3 on Jun 3, 2026
88 of 98 checks passed
phillipleblanc deleted the phillip/vss-vendored-1.5.3 branch on June 3, 2026 09:55
phillipleblanc added a commit to spiceai/duckdb-rs that referenced this pull request on Jun 3, 2026:
Point the duckdb-sources submodule at spiceai/duckdb#17 (01dbffdd), which vendors the vss
extension as committed files (extension/vss/src) instead of a git submodule. Cargo recursively
checks out submodules of the duckdb-rs git dependency, so the vss submodule pulled in duckdb-vss's
own nested `duckdb` checkout, whose deep Swift example path exceeded Windows' MAX_PATH and broke
the Windows build. A vendored copy has nothing for Cargo to recurse into.

Regenerated duckdb.tar.gz (vss sources now under extension/vss/src; DUCKDB_VERSION still v1.5.3;
vss still statically linked). Updated SKILL.md for the vendored layout.