@hgryoo (Member) commented Dec 23, 2025

http://jira.cubrid.org/browse/CUBVEC-147

Purpose

In CUBVEC-138, the disk-based HNSW vector index was initially implemented by storing
graph node metadata (neighbor links) and vector data in separate pages.

While this design simplified the initial implementation, it introduced noticeable performance overhead during search due to:

  • Additional I/O required to access vector pages after loading graph nodes
  • Increased page buffer fix/unfix operations
  • Poor cache locality during graph traversal
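The fix/unfix overhead can be illustrated with a rough counting model. This is only a sketch under stated assumptions: `PageBufferCounter`, `fixes_separate_layout`, and `fixes_colocated_layout` are hypothetical names and not part of CUBRID, and the model assumes the worst case in which no two visited nodes share a page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a page buffer: it only counts fix (pin) calls.
struct PageBufferCounter {
    std::size_t fix_count = 0;
    void fix(std::uint32_t /*page_id*/) { ++fix_count; }
};

// Separate layout (assumed worst case): visiting a node fixes its graph page
// to read the neighbor links, then a second page to read the vector.
std::size_t fixes_separate_layout(const std::vector<std::uint32_t>& visited_nodes) {
    PageBufferCounter pb;
    for (std::uint32_t node : visited_nodes) {
        pb.fix(node);             // page holding the neighbor links
        pb.fix(node + 1000000u);  // distinct page holding the vector (illustrative id)
    }
    return pb.fix_count;
}

// Co-located layout: one fix gives access to both the links and the vector.
std::size_t fixes_colocated_layout(const std::vector<std::uint32_t>& visited_nodes) {
    PageBufferCounter pb;
    for (std::uint32_t node : visited_nodes) {
        pb.fix(node);  // single page holding links and vector
    }
    return pb.fix_count;
}
```

Under these assumptions the separate layout costs two fixes per visited node and the co-located layout costs one. Actual savings depend on how many nodes fit in a page and on buffer hit rates.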

Implementation

This PR changes the storage layout so that each node's vector data is stored together with its HNSW node metadata in the same page (or block).

Changes:

  • struct node_t: Co-locate graph node information and its corresponding vector payload
  • get_vector_by_slot_id ()
    • Eliminate extra page accesses for vector retrieval during search
    • Reduce page buffer operations and improve cache locality
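A minimal sketch of what a co-located layout might look like follows. Only the names `node_t` and `get_vector_by_slot_id` come from this PR; the field names, the fixed fan-out `kMaxNeighbors`, the dimension `kDim`, the `page_t` type, the function signature, and the `squared_l2` helper are assumptions made for illustration and do not reflect the actual CUBRID definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kMaxNeighbors = 4;  // assumed fixed fan-out, illustration only
constexpr std::size_t kDim = 3;           // assumed vector dimension, illustration only

// Hypothetical co-located record: neighbor links and the vector payload
// live in the same slot, so one page access reaches both.
struct node_t {
    std::uint32_t neighbor_count;
    std::uint32_t neighbors[kMaxNeighbors];
    float vector[kDim];
};

// Hypothetical in-memory stand-in for a fixed page that holds node slots.
struct page_t {
    std::vector<node_t> slots;
};

// Assumed signature: returns the vector stored in the given slot of an
// already-fixed page, or nullptr if the slot id is out of range.
const float* get_vector_by_slot_id(const page_t& page, std::size_t slot_id) {
    if (slot_id >= page.slots.size()) {
        return nullptr;
    }
    return page.slots[slot_id].vector;
}

// Squared Euclidean distance, used here only to show the vector being
// consumed during search without touching another page.
float squared_l2(const float* a, const float* b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}
```

In this sketch a search step that has already fixed the node's page can compute the distance directly from the returned pointer, which is the effect the changes above aim for.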

Remarks

N/A

@hgryoo hgryoo requested a review from a team as a code owner December 23, 2025 02:22
@hgryoo hgryoo requested review from YeunjunLee, hornetmj, mhoh3963 and vimkim and removed request for a team December 23, 2025 02:22
@hgryoo hgryoo self-assigned this Dec 23, 2025