robfrank
Collaborator

This pull request introduces comprehensive support for the JVector vector index within ArcadeDB, including dependency management, SQL parser enhancements, and a full end-to-end test suite. The changes enable users to create and manage JVector indexes via SQL, configure vector index parameters, and validate vector data workflows through automated tests.

JVector Index Support and Configuration:

  • Added the jvector dependency (the jvector.version property and the Maven dependency) to engine/pom.xml to enable vector indexing capabilities.
  • Enhanced CreateIndexStatement.java to recognize and validate the JVECTOR index type in SQL statements, including configuration options for dimensions, similarity function, max connections, and beam width.
  • Implemented logic in CreateIndexStatement.java to build JVector indexes using user-provided or default configuration, and added a callback for indexing progress reporting.
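
As a rough illustration of the "user-provided or default configuration" idea in the list above, the following is a minimal sketch and not the code from this pull request. The names `JVectorIndexConfig` and `Similarity`, the option keys, and the default values are all assumptions made for the example. The actual defaults and SQL option names live in CreateIndexStatement.java and are not shown in this conversation.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical enum for illustration; the constant names follow common vector similarity choices.
enum Similarity { EUCLIDEAN, DOT_PRODUCT, COSINE }

// Hypothetical holder for the four options named in the PR description.
record JVectorIndexConfig(int dimensions, Similarity similarity, int maxConnections, int beamWidth) {

  // Assumed defaults, chosen only for this sketch; the PR's real defaults may differ.
  static final Similarity DEFAULT_SIMILARITY = Similarity.COSINE;
  static final int DEFAULT_MAX_CONNECTIONS = 16;
  static final int DEFAULT_BEAM_WIDTH = 100;

  // Compact constructor: reject values that cannot describe a usable index.
  JVectorIndexConfig {
    if (dimensions <= 0) throw new IllegalArgumentException("dimensions must be positive");
    if (maxConnections <= 0) throw new IllegalArgumentException("maxConnections must be positive");
    if (beamWidth <= 0) throw new IllegalArgumentException("beamWidth must be positive");
  }

  // Builds a config from user-provided options, falling back to the assumed defaults.
  // The option keys used here are hypothetical.
  static JVectorIndexConfig fromOptions(Map<String, String> options) {
    if (!options.containsKey("dimensions"))
      throw new IllegalArgumentException("dimensions is required");
    int dimensions = Integer.parseInt(options.get("dimensions"));
    Similarity similarity = options.containsKey("similarity")
        ? Similarity.valueOf(options.get("similarity").toUpperCase(Locale.ROOT))
        : DEFAULT_SIMILARITY;
    int maxConnections = options.containsKey("maxConnections")
        ? Integer.parseInt(options.get("maxConnections"))
        : DEFAULT_MAX_CONNECTIONS;
    int beamWidth = options.containsKey("beamWidth")
        ? Integer.parseInt(options.get("beamWidth"))
        : DEFAULT_BEAM_WIDTH;
    return new JVectorIndexConfig(dimensions, similarity, maxConnections, beamWidth);
  }
}
```

For example, `JVectorIndexConfig.fromOptions(Map.of("dimensions", "128"))` would fill in the assumed defaults for everything except the dimension count, while a non-positive dimension is rejected up front.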

End-to-End Testing:

  • Added a new comprehensive E2E test class JVectorSqlE2ETest.java covering schema creation, vector data insertion, retrieval, update, deletion, batch operations, and index management for JVector indexes.

Codebase Maintenance:

  • Updated imports in CreateIndexStatement.java to support JVector integration and improve code clarity.

robfrank self-assigned this Sep 16, 2025
robfrank linked an issue Sep 16, 2025 that may be closed by this pull request
4 tasks
robfrank added this to the 25.9.1 milestone Sep 16, 2025
robfrank added the enhancement (New feature or request) and experimental labels Sep 16, 2025

codacy-production bot commented Sep 16, 2025

Coverage summary from Codacy

See diff coverage on Codacy

| Coverage variation | Diff coverage |
| --- | --- |
| -0.23% | 41.11% |

Coverage variation details

| Commit | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor commit (25e480f) | 73027 | 46288 | 63.38% |
| Head commit (58fa112) | 73845 (+818) | 46637 (+349) | 63.16% (-0.23%) |

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

| Scope | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#2530) | 1951 | 802 | 41.11% |

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%
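
The two formulas above can be checked against the reported line counts with a few lines of arithmetic. This is only a sketch: `CoverageMath` is a hypothetical helper written for this example, not part of Codacy or ArcadeDB, and the inputs are the numbers from this report.

```java
import java.util.Locale;

// Hypothetical helper that reproduces the coverage arithmetic from the report.
final class CoverageMath {

  // Percentage of covered lines out of coverable lines.
  static double coverage(long covered, long coverable) {
    return 100.0 * covered / coverable;
  }

  // Coverage variation: head coverage minus common ancestor coverage, in percentage points.
  static double variation(double headCoverage, double ancestorCoverage) {
    return headCoverage - ancestorCoverage;
  }

  public static void main(String[] args) {
    double ancestor = coverage(46288, 73027); // common ancestor commit (25e480f)
    double head = coverage(46637, 73845);     // head commit (58fa112)
    double diff = coverage(802, 1951);        // lines added or modified by the pull request
    System.out.printf(Locale.ROOT, "ancestor=%.2f%% head=%.2f%% variation=%.2f%% diff=%.2f%%%n",
        ancestor, head, variation(head, ancestor), diff);
    // prints: ancestor=63.38% head=63.16% variation=-0.23% diff=41.11%
  }
}
```

Note that the variation is computed from the unrounded percentages. Subtracting the rounded values (63.16 - 63.38) would give -0.22, which is why the reported -0.23% does not match a naive subtraction of the two displayed figures.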

@gramian
Collaborator

gramian commented Sep 20, 2025

What is the plan for the different distance functions? Will there be a replacement?

@robfrank
Collaborator Author

> What is the plan for the different distance functions? Will there be a replacement?

I will reimport or rewrite them. I've just removed them for now to have less code around to manage.

The next step, with @lvca, is to make some decisions about persistence on disk.

robfrank force-pushed the feat/2529-add-jvector-index branch 3 times, most recently from 31811c1 to 8e7b207 on September 27, 2025 18:40
robfrank force-pushed the feat/2529-add-jvector-index branch 2 times, most recently from bb45c69 to e37afed on October 3, 2025 08:24
robfrank modified the milestones: 25.9.1, 25.10.1 on Oct 7, 2025
robfrank force-pushed the feat/2529-add-jvector-index branch from e37afed to 58fa112 on October 10, 2025 17:59
Development

Successfully merging this pull request may close these issues.

Implement new vector index using JVector library