optimize index selection #23213
Merge Queue Status
✅ The pull request has been merged. This pull request spent 8 seconds in the queue, with no time running CI.
User description
What type of PR is this?
Which issue(s) this PR fixes:
issue #https://github.com/matrixorigin/MO-Cloud/issues/6723
What this PR does / why we need it:
Enhanced SQL parser to support new vector query syntax.
Refactored IVFFlat and HNSW index application logic for better query optimization.
Added comprehensive test coverage for vector IVF mode functionality.
PR Type
Enhancement, Tests
Description
- Refactored IVFFlat and HNSW index application logic by extracting context preparation into dedicated methods (`prepareIvfIndexContext`, `prepareHnswIndexContext`) for improved code organization and reusability
- Separated `RankOption` from `Limit` in the AST structure (`Select` struct) to better represent SQL semantics with a dedicated `RankOption` field, and updated the parser grammar accordingly
- Introduced `vectorSortContext` struct to encapsulate vector sort query context information, improving parameter passing and code clarity
- Implemented special index guard system with methods (`prepareSpecialIndexGuards`, `resetSpecialIndexGuards`, `collectSpecialIndexGuards`) to protect scan nodes from inappropriate index application
- Enhanced parser to support "force" mode in addition to "pre" and "post" modes for vector index execution control
- Updated `QueryBuilder` to integrate guard management and `RankOption` separation with new fields (`protectedScans`, `projectSpecialGuards`)
- Added comprehensive test coverage for vector IVF mode functionality, including basic operations, advanced edge cases, and complex query scenarios with multiple distance metrics and filter combinations
- Removed legacy IVF test cases and consolidated them into the new comprehensive test suite
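
The separation of `RankOption` from `Limit` described above can be illustrated with a reduced AST. This is a minimal sketch, not the code in `pkg/sql/parsers/tree/select.go`: the field types, the `String` helper, the `Table` field, and the exact `with option '...'` rendering are assumptions made for illustration. Only the names `Select`, `Limit`, `RankOption`, `Option`, and the "by rank" clause come from the description.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Limit keeps only row-count information in this sketch.
type Limit struct {
	Count  int
	Offset int
}

// RankOption holds the key/value options of a "by rank" clause.
// The map value type is an assumption.
type RankOption struct {
	Option map[string]string
}

// Select is a heavily reduced stand-in for the real AST node.
type Select struct {
	Table      string // hypothetical field, only here to render something
	Limit      *Limit
	RankOption *RankOption
}

// Format writes the limit clause without any rank handling.
func (l *Limit) Format(sb *strings.Builder) {
	fmt.Fprintf(sb, " limit %d", l.Count)
	if l.Offset > 0 {
		fmt.Fprintf(sb, " offset %d", l.Offset)
	}
}

// Format writes the "by rank" clause; keys are sorted so output is stable.
func (r *RankOption) Format(sb *strings.Builder) {
	sb.WriteString(" by rank")
	if len(r.Option) == 0 {
		return
	}
	keys := make([]string, 0, len(r.Option))
	for k := range r.Option {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+r.Option[k])
	}
	// The "with option" spelling is an assumed rendering.
	sb.WriteString(" with option '" + strings.Join(pairs, ",") + "'")
}

// String renders the statement, emitting Limit and RankOption independently.
func (s *Select) String() string {
	var sb strings.Builder
	sb.WriteString("select * from " + s.Table)
	if s.Limit != nil {
		s.Limit.Format(&sb)
	}
	if s.RankOption != nil {
		s.RankOption.Format(&sb)
	}
	return sb.String()
}

func main() {
	s := &Select{
		Table:      "t",
		Limit:      &Limit{Count: 10},
		RankOption: &RankOption{Option: map[string]string{"mode": "pre"}},
	}
	fmt.Println(s.String())
}
```

Keeping the two nodes separate means a plain `limit` never has to carry rank fields, and code that only cares about row counts can ignore `RankOption` entirely.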
File Walkthrough
8 files
apply_indices_ivfflat.go
Refactor IVFFlat index optimization with context extraction
pkg/sql/plan/apply_indices_ivfflat.go
- … preparation into `prepareIvfIndexContext` method
- … `applyIndicesForSortUsingIvfflat` to accept `vectorSortContext` instead of individual node parameters
- … secondary scans when `mode=pre`
- … `buildPkExprFromNode`, `getColName`, `rebindScanNode`, `replaceColRefTag`, `canApplyRegularIndex`, and `colRefsWithin` for better code organization
apply_indices.go
Add special index guard system for scan protection
pkg/sql/plan/apply_indices.go
- … `specialIndexKind` and `specialIndexGuard` types to track protected scan nodes
- … `prepareSpecialIndexGuards`, `resetSpecialIndexGuards`, `collectSpecialIndexGuards`, `registerProjectGuard`, `clearProjectGuard`, `isScanProtected`
- … `detectFullTextGuard` and `detectVectorGuard` methods to identify scans protected by special indexes
- … `collectVectorIndexes` helper method to extract vector indexes from scan nodes
- … `applyIndicesForFilters` and `applyIndicesForJoins` to skip protected scans
- … `vectorSortContext` and new helper methods
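
The guard idea in this file (count how many special-index guards cover a scan, and have the regular index passes skip any scan with a non-zero count) can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the `guardTable` type, the constant names, the `int32` node IDs, and `regularIndexCandidates` are hypothetical. Only `specialIndexKind`, `specialIndexGuard`, `protectedScans`, `projectSpecialGuards`, `registerProjectGuard`, `clearProjectGuard`, and `isScanProtected` are names taken from the walkthrough, and their real signatures may differ.

```go
package main

import "fmt"

// specialIndexKind names the kind of special index guarding a scan.
type specialIndexKind int

const (
	kindFullText specialIndexKind = iota // hypothetical constant names
	kindVector
)

// specialIndexGuard records which scan a project node protects, and why.
type specialIndexGuard struct {
	kind   specialIndexKind
	scanID int32
}

// guardTable is a hypothetical stand-in for the two QueryBuilder fields.
type guardTable struct {
	protectedScans       map[int32]int               // scan ID -> number of guards
	projectSpecialGuards map[int32]specialIndexGuard // project ID -> its guard
}

func newGuardTable() *guardTable {
	return &guardTable{
		protectedScans:       map[int32]int{},
		projectSpecialGuards: map[int32]specialIndexGuard{},
	}
}

// registerProjectGuard attaches a guard to a project node, replacing any
// earlier guard for that node so counts stay consistent.
func (g *guardTable) registerProjectGuard(projectID int32, guard specialIndexGuard) {
	g.clearProjectGuard(projectID)
	g.projectSpecialGuards[projectID] = guard
	g.protectedScans[guard.scanID]++
}

// clearProjectGuard removes a project's guard and decrements the scan count.
func (g *guardTable) clearProjectGuard(projectID int32) {
	guard, ok := g.projectSpecialGuards[projectID]
	if !ok {
		return
	}
	delete(g.projectSpecialGuards, projectID)
	g.protectedScans[guard.scanID]--
	if g.protectedScans[guard.scanID] <= 0 {
		delete(g.protectedScans, guard.scanID)
	}
}

// isScanProtected reports whether at least one guard covers the scan.
func (g *guardTable) isScanProtected(scanID int32) bool {
	return g.protectedScans[scanID] > 0
}

// regularIndexCandidates returns the scans a regular-index pass may rewrite.
func (g *guardTable) regularIndexCandidates(scanIDs []int32) []int32 {
	var out []int32
	for _, id := range scanIDs {
		if !g.isScanProtected(id) {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	g := newGuardTable()
	g.registerProjectGuard(100, specialIndexGuard{kind: kindVector, scanID: 1})
	fmt.Println(g.regularIndexCandidates([]int32{1, 2})) // scan 1 is skipped
	g.clearProjectGuard(100)
	fmt.Println(g.regularIndexCandidates([]int32{1, 2})) // both scans eligible
}
```

A count rather than a boolean lets several project nodes guard the same scan without one of them unprotecting it early.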
query_builder.go
Integrate RankOption separation and guard management
pkg/sql/plan/query_builder.go
- … `protectedScans` and `projectSpecialGuards` fields to `QueryBuilder` struct initialization
- … `buildUnion` method signature to accept `astRankOption` parameter separately from `astLimit`
- … `bindLimit` to accept both `astLimit` and `astRankOption` parameters
- … `parseRankOption` to support "force" mode in addition to "pre" and "post" modes
- … `applyIndices` in `createQuery` method
apply_indices_hnsw.go
Refactor HNSW index optimization with context extraction
pkg/sql/plan/apply_indices_hnsw.go
- … preparation into `prepareHnswIndexContext` method
- … `applyIndicesForSortUsingHnsw` to accept `vectorSortContext` instead of individual node parameters
- … context structure
select.go
Separate RankOption from Limit in AST structure
pkg/sql/parsers/tree/select.go
- … `RankOption` field to `Select` struct
- … `RankOption.Format` method for SQL formatting with "by rank" clause
- … `ByRank`, `Option`, and `Mode` fields from `Limit` struct
- … `RankOption` struct with `Option` map field
- … `Select.Format` to output `RankOption` separately from `Limit`
apply_indices_vector.go
Introduce vectorSortContext for query context management
pkg/sql/plan/apply_indices_vector.go
- … `vectorSortContext` struct to encapsulate vector sort query context information
- … `buildVectorSortContext` method to construct context from projection node
- … `pickVectorLimit` helper function to extract limit and rank option from query nodes
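
To make the role of a context struct concrete, here is a small sketch of bundling the nodes of a vector sort query into one value instead of passing them individually. Everything about the plan representation is assumed: the `node` type, its `limit` and `rankMode` fields, the Project → Sort → Scan shape check, and the rule that the first node carrying a limit wins are illustrative choices. Only the names `vectorSortContext`, `buildVectorSortContext`, and `pickVectorLimit` come from the walkthrough.

```go
package main

import "fmt"

type nodeKind int

const (
	kindProject nodeKind = iota
	kindSort
	kindScan
)

// node is a hypothetical, heavily simplified plan node.
type node struct {
	kind     nodeKind
	child    *node
	limit    int    // 0 means "no limit on this node"
	rankMode string // e.g. "pre", "post", "force"; empty if unset
}

// vectorSortContext bundles what the index rewrites need in one value.
type vectorSortContext struct {
	projNode *node
	sortNode *node
	scanNode *node
	limit    int
	rankMode string
}

// pickVectorLimit returns the limit and rank mode of the first node that
// carries a limit, or zero values if none does.
func pickVectorLimit(nodes ...*node) (int, string) {
	for _, n := range nodes {
		if n != nil && n.limit > 0 {
			return n.limit, n.rankMode
		}
	}
	return 0, ""
}

// buildVectorSortContext returns nil unless the plan under proj has the
// Project -> Sort -> Scan shape and some node carries a limit.
func buildVectorSortContext(proj *node) *vectorSortContext {
	if proj == nil || proj.kind != kindProject {
		return nil
	}
	sortNode := proj.child
	if sortNode == nil || sortNode.kind != kindSort {
		return nil
	}
	scanNode := sortNode.child
	if scanNode == nil || scanNode.kind != kindScan {
		return nil
	}
	limit, mode := pickVectorLimit(proj, sortNode, scanNode)
	if limit == 0 {
		return nil
	}
	return &vectorSortContext{
		projNode: proj,
		sortNode: sortNode,
		scanNode: scanNode,
		limit:    limit,
		rankMode: mode,
	}
}

func main() {
	scan := &node{kind: kindScan}
	sortNode := &node{kind: kindSort, child: scan, limit: 10, rankMode: "pre"}
	proj := &node{kind: kindProject, child: sortNode}

	if ctx := buildVectorSortContext(proj); ctx != nil {
		fmt.Println(ctx.limit, ctx.rankMode)
	}
}
```

With a single context value, an IVFFlat-specific and an HNSW-specific rewrite can share the same validation step and differ only in how they use the result.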
types.go
Add guard tracking fields to QueryBuilder
pkg/sql/plan/types.go
- … `protectedScans` map field to track protected scan node counts
- … `projectSpecialGuards` map field to store special index guards per project node
mysql_sql.y
Refactor parser grammar to separate rank option
pkg/sql/parsers/dialect/mysql/mysql_sql.y
- … `rankOption` type declaration to parser union
- … `limit_rank_suffix` to `rank_opt` for cleaner separation
- … `select_no_parens` production rules to include `rank_opt` as separate token after `limit_opt`
- … `limit_clause` rules to remove rank option handling
- … `limit_rank_suffix` rule to create `RankOption` struct instead of `Limit` struct
6 files
mysql_sql_test.go
Update tests for RankOption separation from Limit
pkg/sql/parsers/dialect/mysql/mysql_sql_test.go
- … `selectStmt.RankOption` instead of `selectStmt.Limit.ByRank`
- … `selectStmt.RankOption.Option` instead of `selectStmt.Limit.Option`
- … `TestLimitByRank` to reflect the new AST structure
query_builder_test.go
Update tests for RankOption parameter and force mode
pkg/sql/plan/query_builder_test.go
- … `TestQueryBuilder_bindLimit` to pass `nil` for new `astRankOption` parameter
- … `valid mode force` to verify "force" mode support in `parseRankOption`
- … or 'force'"
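
The mode validation these tests exercise can be sketched as a small function that accepts "pre", "post", and "force" and rejects anything else. This is not the real `parseRankOption`: the function name `parseRankMode`, its signature, the case-insensitive matching, the handling of a missing key, and the error wording are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseRankMode is a hypothetical helper that validates the "mode" option.
// It returns an empty mode when the option is absent.
func parseRankMode(option map[string]string) (string, error) {
	raw, ok := option["mode"]
	if !ok {
		return "", nil
	}
	// Trimming and lower-casing are assumptions, not confirmed behavior.
	mode := strings.ToLower(strings.TrimSpace(raw))
	switch mode {
	case "pre", "post", "force":
		return mode, nil
	default:
		// The error wording here is illustrative only.
		return "", fmt.Errorf("mode must be 'pre', 'post', or 'force', got %q", raw)
	}
}

func main() {
	cases := []map[string]string{
		{"mode": "pre"},
		{"mode": "post"},
		{"mode": "force"},
		{"mode": "fast"},
		{},
	}
	for _, c := range cases {
		mode, err := parseRankMode(c)
		fmt.Printf("input=%v mode=%q err=%v\n", c, mode, err)
	}
}
```

A table of inputs like the one in `main` maps directly onto a table-driven test, with one row per accepted mode and one for the rejection path.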
vector_ivf_mode.result
Add comprehensive vector IVF mode test coverage
test/distributed/cases/vector/vector_ivf_mode.result
- … three modes: "pre", "post", and "force"
- … combinations
- … filter conditions
vector_ivf_mode.sql
Comprehensive IVF vector index mode test suite
test/distributed/cases/vector/vector_ivf_mode.sql
- … mode with 242 lines of SQL test cases
- … `mode=pre` (pre-filtering), `mode=post` (post-filtering), and `mode=force` (disable index)
- … `l2_distance`, `cosine_distance`), various LIMIT sizes, OFFSET clauses, and complex WHERE conditions
- … composite indexes to validate query optimization
vector_ivf_mode_advanced.sql
Advanced IVF mode edge cases and complex query tests
test/distributed/cases/vector/vector_ivf_mode_advanced.sql
- … cases and complex scenarios
- … patterns, IN clauses, and NULL value handling
- … `l2_distance`, `cosine_distance`, `inner_product`) with various over-fetch factors
- … tests OFFSET with complex filters
vector_ivf_mode_advanced.result
Expected test results for advanced IVF mode tests
test/distributed/cases/vector/vector_ivf_mode_advanced.result
- … cases
- … various test scenarios
- … `op_type` error message
- … `mode=pre`, `mode=post`, and `mode=force` execution paths
3 files