Skip to content

Conversation

@aptend
Copy link
Contributor

@aptend aptend commented Dec 3, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue https://github.com/matrixorigin/MO-Cloud/issues/6708

What this PR does / why we need it:

This PR fixes a column pruning failure issue caused by applying associative law optimization rules. When the query optimizer applies associative law transformations to reorder JOIN operations, the OnList conditions were not properly migrated to match the new JOIN structure, leading to remapping errors during column reference resolution.

Problem

When applying associative law rules (Rule 1, Rule 2, Rule 3) to transform JOIN structures like A*(B*C) to (A*B)*C, the OnList conditions that reference columns from moved tables were not migrated to the correct JOIN nodes. This caused column remapping failures because:

  1. The JOIN structure was changed, but OnList conditions remained in their original positions
  2. During column remapping phase, references to columns from moved tables could not be found in the expected context
  3. This resulted in errors like "Column remapping failed: cannot find column reference"

Solution

The fix adds proper OnList condition migration logic in all three associative law rules:

  1. Added migrateOnListConditions function: Moves conditions involving specified table tags from source node to destination node
  2. Added checkExprInvolvesTags function: Checks if an expression references any of the specified table tags
  3. Updated Rule 1, Rule 2, Rule 3: Each rule now properly migrates OnList conditions when transforming JOIN structures:
    • Rule 1: Migrates conditions involving table C to the outer join
    • Rule 2: Migrates conditions involving table A to the outer join
    • Rule 3: Migrates conditions involving B and C to their correct positions

Testing

Added a comprehensive test case TestAssociativeLawRemapping that:

  • Creates a scenario where associative law Rule 1 would be triggered
  • Verifies that the query executes successfully without remapping errors
  • Tests the exact scenario from the issue report

PR Type

Bug fix


Description

  • Fix column pruning failure when applying associative law JOIN reordering rules

  • Add OnList condition migration logic to properly handle JOIN transformations

  • Implement helper functions to detect and migrate conditions involving specific table tags

  • Add optimization history tracking for debugging remap errors

  • Include comprehensive test case for associative law remapping scenarios


Diagram Walkthrough

flowchart LR
  A["JOIN Transformation<br/>A*(B*C) to (A*B)*C"] --> B["Detect Conditions<br/>Involving Moved Tables"]
  B --> C["Migrate OnList<br/>Conditions to Correct JOINs"]
  C --> D["Recalculate Stats<br/>and Remap Columns"]
  D --> E["Query Executes<br/>Successfully"]
Loading

File Walkthrough

Relevant files
Bug fix
associative_law.go
Implement OnList condition migration for associative law rules

pkg/sql/plan/associative_law.go

  • Add checkExprInvolvesTags() function to detect if expressions
    reference specific table tags
  • Add migrateOnListConditions() function to move JOIN conditions between
    nodes based on table involvement
  • Add helper functions getTableNameOrLabel() and formatStatsInfo() for
    debugging
  • Update Rule 1, Rule 2, and Rule 3 to migrate OnList conditions when
    transforming JOIN structures
  • Add optimization history logging for each rule application with table
    names and stats
+122/-1 
Enhancement
types.go
Add optimization history tracking field                                   

pkg/sql/plan/types.go

  • Add optimizationHistory field to QueryBuilder struct to track
    optimization steps
  • Initialize as empty slice for recording key optimization
    transformations
+4/-0     
query_builder.go
Include optimization history in remap error messages         

pkg/sql/plan/query_builder.go

  • Initialize optimizationHistory slice in NewQueryBuilder()
  • Enhance buildRemapErrorMessage() to include optimization history in
    error messages
  • Display optimization steps taken during query planning for debugging
+12/-0   
join_order.go
Add join order optimization history logging                           

pkg/sql/plan/join_order.go

  • Add optimization history logging in determineJoinOrder() function
  • Record initial state with gathered leaves and conditions
  • Record final state showing nodeID changes and remaining conditions
+13/-0   
pushdown.go
Add filter pushdown optimization history logging                 

pkg/sql/plan/pushdown.go

  • Add optimization history logging in pushdownFilters() function
  • Record before/middle/after states for different node types (JOIN,
    UNION, TABLE_SCAN)
  • Track filter pushdown transformations for debugging
+23/-0   
sql_executor_context.go
Add test stats cache support for testing                                 

pkg/sql/compile/sql_executor_context.go

  • Add import for logutil package
  • Add test environment detection in Stats() method using context value
  • Check stats cache for test environment and return cached stats if
    available
  • Log when using cached stats for testing purposes
+12/-0   
sql_executor.go
Support injecting stats cache from context                             

pkg/sql/compile/sql_executor.go

  • Modify getCompileContext() to check for test stats cache in context
  • Inject stats cache into compiler context if provided in context
  • Enable test environment to provide pre-configured statistics
+6/-1     
Tests
associative_law_remap_test.go
Add comprehensive associative law remapping test                 

pkg/vm/engine/test/associative_law_remap_test.go

  • Create comprehensive test case TestAssociativeLawRemapping() for
    associative law transformations
  • Set up test database with three tables (connector_job, task, file)
    with primary key constraints
  • Create shared stats cache to control table statistics and trigger Rule
    1 transformation
  • Execute complex multi-table JOIN query that previously caused
    remapping errors
  • Verify query executes successfully without remapping failures
+162/-0 

@qodo-code-review
Copy link

qodo-code-review bot commented Dec 3, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Debug logging: New logging/optimization history appears added for debugging but does not implement audit
trails for critical actions, and it is unclear whether any user-facing critical actions
require auditing here.

Referred Code
nodeID, conds = builder.pushdownFilters(nodeID, conds, true)
if len(conds) > 0 {
	nodeID = builder.appendNode(&plan.Node{
		NodeType:   plan.Node_FILTER,
		Children:   []int32{nodeID},
		FilterList: conds,
	}, nil)
}
// Record after determineJoinOrder
if nodeID != originalNodeID {
	builder.optimizationHistory = append(builder.optimizationHistory,
		fmt.Sprintf("determineJoinOrder:after (nodeID: %d -> %d, remainingConds: %d)", originalNodeID, nodeID, len(conds)))
} else {
	builder.optimizationHistory = append(builder.optimizationHistory,
		fmt.Sprintf("determineJoinOrder:after (nodeID: %d, no change, remainingConds: %d)", nodeID, len(conds)))
}
return nodeID

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
Detailed context: The enhanced remap error message includes optimization history and potentially internal
plan details which may leak internal information if surfaced to end users; visibility
scope is unclear from the diff.

Referred Code
	}
	if exprStr != "" {
		sb.WriteString("🔍 Related Expression:\n")
		sb.WriteString(fmt.Sprintf("   %s\n", exprStr))
		sb.WriteString("\n")
	}
}

// Optimization history
if len(builder.optimizationHistory) > 0 {
	sb.WriteString("🔧 Optimization History:\n")
	for _, hist := range builder.optimizationHistory {
		sb.WriteString(fmt.Sprintf("   - %s\n", hist))
	}
	sb.WriteString("\n")
} else {
	sb.WriteString("🔧 Optimization History: (no associatelaw applied)\n\n")
}

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Info logging: Added logutil.Infof emits table names and IDs when using cached stats; confirm this log is
internal-only and does not include sensitive data or appear at inappropriate verbosity in
production.

Referred Code
tableID := table.GetTableID(ctx)
if statsWrapper := c.statsCache.GetStatsInfo(tableID, false); statsWrapper != nil && statsWrapper.Stats != nil {
	logutil.Infof("use test env cached stats for table %s (tableID=%d)", tableName, tableID)
	return statsWrapper.Stats, nil
}

Learn more about managing compliance generic rules or creating your own custom rules

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link

qodo-code-review bot commented Dec 3, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Ensure thread-safe cache initialization

Ensure statsCache is always initialized in getCompileContext by creating a new
plan.NewStatsCache() if the test cache is not present in the context, thus
preventing a potential race condition.

pkg/sql/compile/sql_executor.go [187-204]

 func (s *sqlExecutor) getCompileContext(
 	ctx context.Context,
 	proc *process.Process,
 	db string,
 	lower int64) *compilerContext {
 	cc := &compilerContext{
 		ctx:       ctx,
 		defaultDB: db,
 		engine:    s.eng,
 		proc:      proc,
 		lower:     lower,
 	}
 	// For testing: check if a stats cache is provided in context
 	if statsCache, ok := ctx.Value("test_stats_cache").(*plan.StatsCache); ok {
 		cc.statsCache = statsCache
+	} else {
+		cc.statsCache = plan.NewStatsCache()
 	}
 	return cc
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

__

Why: The suggestion provides a concrete and correct fix for a potential data race by ensuring statsCache is always initialized in the constructor, improving thread safety and robustness.

Medium
Prevent race condition during initialization

To prevent a potential race condition, move the initialization of c.statsCache
to the compilerContext constructor instead of performing a non-thread-safe lazy
initialization within the Stats method.

pkg/sql/compile/sql_executor_context.go [157-160]

-	if c.statsCache == nil {
-		c.statsCache = plan.NewStatsCache()
-	}
 	c.statsCache.SetStatsInfo(table.GetTableID(ctx), statsInfo)

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a potential data race in the lazy initialization of c.statsCache, which is a valid correctness concern for concurrent execution.

Medium
High-level
Make optimization history logging conditional

The unconditional logging to optimizationHistory in key optimizer functions can
cause performance overhead. This debugging feature should be made conditional,
activated only when needed, for example, via a session variable or debug flag.

Examples:

pkg/sql/plan/associative_law.go [117-121]
	statsInfo := fmt.Sprintf("rule1: A=%s(stats:%s) B=%s(stats:%s) C=%s(stats:%s)",
		tableNameA, formatStatsInfo(builder.qry.Nodes[node.Children[0]].Stats),
		tableNameB, formatStatsInfo(NodeB.Stats),
		tableNameC, formatStatsInfo(NodeC.Stats))
	builder.optimizationHistory = append(builder.optimizationHistory, statsInfo)
pkg/sql/plan/pushdown.go [28-30]
	// Record before pushdownFilters
	builder.optimizationHistory = append(builder.optimizationHistory,
		fmt.Sprintf("pushdownFilters:before (nodeID: %d, nodeType: %s, filters: %d)", nodeID, builder.qry.Nodes[nodeID].NodeType, len(filters)))

Solution Walkthrough:

Before:

// in pkg/sql/plan/associative_law.go
func (builder *QueryBuilder) applyAssociativeLawRule1(nodeID int32) int32 {
    // ...
    // Migrate OnList conditions...
    // ...

    // Record table names and stats after migration
    tableNameA := builder.getTableNameOrLabel(...)
    tableNameB := builder.getTableNameOrLabel(...)
    tableNameC := builder.getTableNameOrLabel(...)
    statsInfo := fmt.Sprintf("rule1: A=%s(...) B=%s(...) C=%s(...)", ...)
    builder.optimizationHistory = append(builder.optimizationHistory, statsInfo)

    // ...
    return rightChild.NodeId
}

After:

// in pkg/sql/plan/associative_law.go
func (builder *QueryBuilder) applyAssociativeLawRule1(nodeID int32) int32 {
    // ...
    // Migrate OnList conditions...
    // ...

    if builder.shouldRecordOptimizationHistory() { // e.g., check a debug flag
        // Record table names and stats after migration
        tableNameA := builder.getTableNameOrLabel(...)
        tableNameB := builder.getTableNameOrLabel(...)
        tableNameC := builder.getTableNameOrLabel(...)
        statsInfo := fmt.Sprintf("rule1: A=%s(...) B=%s(...) C=%s(...)", ...)
        builder.optimizationHistory = append(builder.optimizationHistory, statsInfo)
    }

    // ...
    return rightChild.NodeId
}
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a potential performance overhead in the query optimizer due to unconditional debug logging and proposes a valid conditional approach to mitigate it.

Medium
General
Optimize slice handling during migration

Optimize migrateOnListConditions by using an in-place filtering technique to
modify src.OnList, which avoids unnecessary slice allocations when migrating
conditions.

pkg/sql/plan/associative_law.go [47-58]

 // migrateOnListConditions moves conditions involving specified tags from src to dst
 func migrateOnListConditions(src *plan.Node, dst *plan.Node, tagsMap map[int32]bool) {
-	var kept []*plan.Expr
+	n := 0
 	for _, cond := range src.OnList {
 		if checkExprInvolvesTags(cond, tagsMap) {
 			dst.OnList = append(dst.OnList, cond)
 		} else {
-			kept = append(kept, cond)
+			src.OnList[n] = cond
+			n++
 		}
 	}
-	src.OnList = kept
+	// Truncate the original slice to remove moved elements
+	src.OnList = src.OnList[:n]
 }
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why: The suggestion proposes a valid in-place slice filtering optimization that reduces memory allocations, which is a good performance and efficiency improvement.

Low
  • Update

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Something isn't working Review effort 3/5 size/M Denotes a PR that changes [100,499] lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants