optimize index selection #23215
Merge Queue Status
✅ The pull request has been merged. It spent 56 minutes 37 seconds in the queue, including 56 minutes 22 seconds running CI.
User description
What type of PR is this?
Which issue(s) this PR fixes:
issue #https://github.com/matrixorigin/MO-Cloud/issues/6723
What this PR does / why we need it:
Enhanced SQL parser to support new vector query syntax.
Refactored IVFFlat and HNSW index application logic for better query optimization.
Added comprehensive test coverage for vector IVF mode functionality.
PR Type
Enhancement, Tests
Description
- Refactored SQL parser to separate `RankOption` from `Limit` clause in AST structure, enabling independent handling of rank modes ("pre", "post", "force")
- Enhanced IVFFlat and HNSW index application logic by extracting context preparation into dedicated functions (`prepareIvfIndexContext`, `prepareHnswIndexContext`, `buildVectorSortContext`)
- Implemented special index guard mechanism to protect scan nodes from inappropriate regular index application when vector indexes are present
- Added `vectorSortContext` struct to encapsulate vector sort query context information for improved index optimization tracking
- Modified grammar rules in `mysql_sql.y` to parse `RankOption` independently from `Limit` clause
- Updated `QueryBuilder` to track protected scans and special index guards per project node
- Added comprehensive test coverage for IVFFlat vector index functionality including basic, advanced, and edge case scenarios with multiple distance metrics and filter combinations
- Migrated test cases from legacy `ivf/` directory structure to new `vector/vector_ivf_mode*` test suite
Diagram Walkthrough
File Walkthrough
8 files
apply_indices_ivfflat.go
Refactor IVFFlat index application with context extraction
pkg/sql/plan/apply_indices_ivfflat.go
- preparation into `prepareIvfIndexContext` function
- `applyIndicesForSortUsingIvfflat` to accept `vectorSortContext` instead of individual node parameters
- `colRefCnt` and `idxColMap` parameters for better index optimization tracking
- `buildPkExprFromNode`, `getColName`, `rebindScanNode`, `replaceColRefTag`, `canApplyRegularIndex`, and `colRefsWithin` for improved node manipulation and validation
- `pushdownEnabled` is true
apply_indices.go
Add special index guard mechanism for scan protection
pkg/sql/plan/apply_indices.go
- `specialIndexKind` enum and `specialIndexGuard` struct to track protected scan nodes
- `prepareSpecialIndexGuards`, `resetSpecialIndexGuards`, `collectSpecialIndexGuards`, and related guard management functions
- `detectFullTextGuard` and `detectVectorGuard` functions to identify scans protected by special indexes
- `collectVectorIndexes` helper to gather vector indexes from scan nodes
- `applyIndicesForFilters` and `applyIndicesForJoins` to prevent applying regular indexes to protected scans
- `buildVectorSortContext` and pass additional parameters
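The guard idea described above can be sketched in a few lines of Go. This is a minimal illustration, not the PR's code: `scanNode`, `detectGuard`, `collectGuards`, and `regularIndexAllowed` are hypothetical names, and the sketch assumes a scan counts as protected as soon as its table carries a vector or full-text index, which may be broader than the condition the PR actually checks.

```go
package main

import "fmt"

// specialIndexKind mirrors the enum name from the PR; the values are assumed.
type specialIndexKind int

const (
	specialIndexNone specialIndexKind = iota
	specialIndexFullText
	specialIndexVector
)

// scanNode is a hypothetical, simplified scan node: an ID plus the
// algorithm names of the indexes defined on the scanned table.
type scanNode struct {
	id        int32
	indexAlgo []string // e.g. "btree", "ivfflat", "hnsw", "fulltext"
}

// detectGuard returns the first special index kind found on the scan,
// or specialIndexNone when only regular indexes are present.
func detectGuard(n scanNode) specialIndexKind {
	for _, algo := range n.indexAlgo {
		switch algo {
		case "ivfflat", "hnsw":
			return specialIndexVector
		case "fulltext":
			return specialIndexFullText
		}
	}
	return specialIndexNone
}

// collectGuards records every protected scan ID with its guard kind.
func collectGuards(scans []scanNode) map[int32]specialIndexKind {
	guards := make(map[int32]specialIndexKind)
	for _, s := range scans {
		if k := detectGuard(s); k != specialIndexNone {
			guards[s.id] = k
		}
	}
	return guards
}

// regularIndexAllowed is false for any scan that a guard protects.
func regularIndexAllowed(guards map[int32]specialIndexKind, scanID int32) bool {
	_, protected := guards[scanID]
	return !protected
}

func main() {
	scans := []scanNode{
		{id: 1, indexAlgo: []string{"btree"}},
		{id: 2, indexAlgo: []string{"btree", "ivfflat"}},
	}
	guards := collectGuards(scans)
	for _, s := range scans {
		fmt.Printf("scan %d: regular index allowed = %v\n", s.id, regularIndexAllowed(guards, s.id))
	}
}
```

Keeping the guards in a map keyed by scan ID makes the check in the filter and join paths a single lookup, and clearing the map after index application restores the unguarded state.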
query_builder.go
Separate RankOption from Limit and add guard preparation
pkg/sql/plan/query_builder.go
- `protectedScans` and `projectSpecialGuards` fields to `QueryBuilder` struct initialization
- `buildUnion` function signature to accept `astRankOption` parameter separately from `astLimit`
- `bindSelect` to extract and handle `RankOption` independently from `Limit` clause
- `bindLimit` function to accept both `astLimit` and `astRankOption` parameters
- `parseRankOption` to support "force" mode in addition to "pre" and "post" modes
- `prepareSpecialIndexGuards` and `resetSpecialIndexGuards` around `applyIndices` invocation
apply_indices_hnsw.go
Refactor HNSW index application with context extraction
pkg/sql/plan/apply_indices_hnsw.go
- `prepareHnswIndexContext` function
- `applyIndicesForSortUsingHnsw` to accept `vectorSortContext` instead of individual node parameters
- `RankOption` to disable vector index usage
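How a rank mode could gate vector index usage is sketched below. This is an illustration under stated assumptions, not the PR's `parseRankOption`: the names `rankMode`, `parseMode`, and `useVectorIndex` are hypothetical, the option map is assumed to be `map[string]string`, and case-insensitive matching is an added assumption. The only behavior taken from this page is that "pre", "post", and "force" are the supported modes and that "force" disables the index.

```go
package main

import (
	"fmt"
	"strings"
)

// rankMode is a hypothetical enum for the modes named in the PR.
type rankMode int

const (
	rankModeDefault rankMode = iota // no mode given
	rankModePre                     // pre-filtering
	rankModePost                    // post-filtering
	rankModeForce                   // disable the vector index
)

// parseMode reads the "mode" key from an option map and rejects
// any value other than pre, post, or force.
func parseMode(opts map[string]string) (rankMode, error) {
	raw, ok := opts["mode"]
	if !ok {
		return rankModeDefault, nil
	}
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "pre":
		return rankModePre, nil
	case "post":
		return rankModePost, nil
	case "force":
		return rankModeForce, nil
	default:
		return rankModeDefault, fmt.Errorf("unsupported rank mode %q", raw)
	}
}

// useVectorIndex reports whether the planner may pick a vector index.
func useVectorIndex(m rankMode) bool {
	return m != rankModeForce
}

func main() {
	for _, raw := range []string{"pre", "post", "force"} {
		m, err := parseMode(map[string]string{"mode": raw})
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("mode=%s -> use vector index: %v\n", raw, useVectorIndex(m))
	}
}
```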
select.go
Separate RankOption from Limit in AST structure
pkg/sql/parsers/tree/select.go
- `RankOption` field to `Select` struct
- `ByRank`, `Option`, and `Mode` fields from `Limit` struct
- `RankOption` struct with `Option` map field
- `Format` method for `RankOption` to handle SQL formatting
- `Select.Format` to output `RankOption` separately from `Limit`
apply_indices_vector.go
Add vector sort context extraction utilities
pkg/sql/plan/apply_indices_vector.go
- `vectorSortContext` struct to encapsulate vector sort query context information
- `buildVectorSortContext` function to construct context from projection node
- `pickVectorLimit` helper function to extract limit and rank option from query nodes
types.go
Add guard tracking fields to QueryBuilder
pkg/sql/plan/types.go
- `protectedScans` map field to track protected scan node counts
- `projectSpecialGuards` map field to store special index guards per project node
mysql_sql.y
Refactor grammar to separate rank option from limit
pkg/sql/parsers/dialect/mysql/mysql_sql.y
- `rankOption` type declaration to yacc grammar
- `limit_rank_suffix` rule to `rank_opt` for clarity
- `select_no_parens` production rules to include `rank_opt` as separate clause after `limit_opt`
- `RankOption` independently from `Limit` clause
- `limit_clause` rules by removing rank suffix handling
6 files
mysql_sql_test.go
Update parser tests for RankOption separation
pkg/sql/parsers/dialect/mysql/mysql_sql_test.go
- `selectStmt.RankOption` instead of `selectStmt.Limit.ByRank`
- `RankOption.Option` instead of `Limit.Option`
- `TestLimitByRank` to reflect the new AST structure
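The AST change these tests exercise can be pictured with a small sketch: the rank option sits on the select statement next to the limit instead of inside it. The types below are simplified stand-ins rather than the real `tree` package definitions; `FormatTail`, the `Count` field, and the `by rank with option` keyword spelling in the output are assumptions for illustration, since this page does not show the grammar.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Limit is a simplified stand-in that no longer carries rank fields.
type Limit struct {
	Count int64
}

// RankOption holds the parsed key/value options, e.g. {"mode": "pre"}.
type RankOption struct {
	Option map[string]string
}

// Select carries Limit and RankOption as independent, optional fields.
type Select struct {
	Limit      *Limit
	RankOption *RankOption
}

// Format renders the rank option with keys sorted for stable output.
// The keyword spelling is an assumption, not taken from the PR.
func (r *RankOption) Format() string {
	keys := make([]string, 0, len(r.Option))
	for k := range r.Option {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+r.Option[k])
	}
	return "by rank with option '" + strings.Join(pairs, ",") + "'"
}

// FormatTail renders the limit and rank clauses, each only when present.
func (s *Select) FormatTail() string {
	var parts []string
	if s.Limit != nil {
		parts = append(parts, fmt.Sprintf("limit %d", s.Limit.Count))
	}
	if s.RankOption != nil {
		parts = append(parts, s.RankOption.Format())
	}
	return strings.Join(parts, " ")
}

func main() {
	stmt := &Select{
		Limit:      &Limit{Count: 10},
		RankOption: &RankOption{Option: map[string]string{"mode": "pre"}},
	}
	// A test reads the option from stmt.RankOption, not from stmt.Limit.
	fmt.Println(stmt.RankOption.Option["mode"])
	fmt.Println(stmt.FormatTail())
}
```

Because the two fields are independent, a statement with a plain limit leaves `RankOption` nil and formats exactly as it did before the split.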
query_builder_test.go
Update query builder tests for RankOption parameter
pkg/sql/plan/query_builder_test.go
- `TestQueryBuilder_bindLimit` to pass `nil` for new `astRankOption` parameter
- `TestParseRankOption` option
vector_ivf_mode.result
Add comprehensive vector IVF mode test coverage
test/distributed/cases/vector/vector_ivf_mode.result
- rank modes ("pre", "post", "force")
- indexes, and composite indexes
- filter conditions
vector_ivf_mode.sql
Comprehensive IVF vector index mode test coverage
test/distributed/cases/vector/vector_ivf_mode.sql
- mode with 242 lines of SQL test cases
- `mode=pre` (pre-filtering), `mode=post` (post-filtering), and `mode=force` (disable index)
- `l2_distance`, `cosine_distance`), various LIMIT sizes, and over-fetch factor scenarios
- indexes on multiple tables
vector_ivf_mode_advanced.sql
Advanced IVF mode edge cases and complex query testing
test/distributed/cases/vector/vector_ivf_mode_advanced.sql
- complex scenarios
- patterns, and NULL value handling
- `l2_distance`, `cosine_distance`, `inner_product`) with various filter combinations
- WHERE clauses
vector_ivf_mode_advanced.result
Expected results for advanced IVF vector index tests
test/distributed/cases/vector/vector_ivf_mode_advanced.result
- cases
- and distance metric combinations
- `op_type` error for `vector_inner_product`)
- (`mode=pre`, `mode=post`, `mode=force`)
3 files