Skip to content

expression: avoid mutating folded child metadata#68451

Open
hawkingrei wants to merge 6 commits into
pingcap:masterfrom
hawkingrei:issue-68053-foldconstant-metadata
Open

expression: avoid mutating folded child metadata#68451
hawkingrei wants to merge 6 commits into
pingcap:masterfrom
hawkingrei:issue-68053-foldconstant-metadata

Conversation

@hawkingrei
Copy link
Copy Markdown
Member

@hawkingrei hawkingrei commented May 18, 2026

What problem does this PR solve?

Issue Number: close #68053

Problem Summary:

Constant folding can return an existing child expression from a folded parent expression. FoldConstant then copies the parent expression's coercibility, charset, collation, and repertoire onto the returned expression.

For a mixed-type CASE expression folded to a child column, this mutates the shared column RetType from utf8mb4 to binary. Later expressions in the same statement can then observe the polluted metadata and evaluate string functions such as REVERSE() with binary semantics.

What changed and how does it work?

When FoldConstant returns a different top-level expression from the original expression, clone the returned Column, CorrelatedColumn, or ScalarFunction return type before applying the parent metadata.

For cloned scalar functions, keep the scalar function return type and the cloned builtin signature return type aligned.

This keeps the narrow folded-result metadata override while avoiding mutation of shared child expressions.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Fix an issue that constant folding could mutate shared child expression metadata and make later string expressions use binary charset unexpectedly.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed return type metadata preservation during constant-folding optimization of conditional expressions (IF, IFNULL, CASE)
    • Improved charset and collation attribute handling in expression optimization
  • Tests

    • Added regression test coverage for charset/collation scenarios with string functions
    • Enhanced test cases for expression cloning behavior and return type preservation

Review Change Stack

@ti-chi-bot ti-chi-bot Bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/planner SIG: Planner size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 18, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: dbbd9c5d-781e-4b6e-9b23-a46a9f05a0f6

📥 Commits

Reviewing files that changed from the base of the PR and between febacee and e4b414d.

📒 Files selected for processing (1)
  • tests/integrationtest/t/explain_generate_column_substitute.test
💤 Files with no reviewable changes (1)
  • tests/integrationtest/t/explain_generate_column_substitute.test

📝 Walkthrough

Walkthrough

This PR fixes a charset-leakage bug where CASE/IF/IFNULL expressions with BINARY branches incorrectly propagate binary charset to unrelated SELECT expressions. Changes clone return-type pointers during constant folding and expression cloning to isolate type metadata mutations.

Changes

Charset Preservation in Conditional Folding

Layer / File(s) Summary
Clone helper for folded branches
pkg/expression/constant_fold.go
Adds cloneFoldedBranchWithRetType to clone folded branch expressions while preserving their RetType metadata, refreshing ScalarFunction return types from their intrinsic Function.getRetTp().
Integrate helper into fold handlers
pkg/expression/constant_fold.go
IF, IFNULL, and CASE constant-fold handlers now wrap folded THEN/ELSE/branch results via the clone helper so return-type metadata is retained before parent expression type propagates.
Isolate RetType in cloning
pkg/expression/builtin.go, pkg/expression/scalar_function.go
baseBuiltinFunc.cloneFrom and ScalarFunction.Clone() now deep-copy RetType pointers instead of reusing them, ensuring clones have independent type metadata.
Test coverage and directives
pkg/expression/constant_test.go, pkg/planner/core/integration_test.go, tests/integrationtest/t/explain_generate_column_substitute.test
Adds charset/collation regression assertions in TestConstantFolding for CASE/Lower/If folding behavior; integration test validates CONCAT/REVERSE output format in issue 68053 scenario; test regex directives updated for expected-output normalization.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • pingcap/tidb#68127: Addresses RetType metadata leaks in NULLIF rewrite by cloning branch return types similarly.
  • pingcap/tidb#66652: Fixes the same RetType/FieldType pointer-aliasing problem by deep-copying return types in expression construction to prevent shared mutable type state.
  • pingcap/tidb#67301: Extends the same integration test function TestIssue66619 with additional regression cases for unrelated issues.

Suggested reviewers

  • AilinKid
  • qw4990
  • winoros

🐰 A charset leaks through the fold,
But clones stand tall, both bright and bold,
Now BINARY's chill can't touch the rest,
Each type stays pure—a fresh-cloned quest!
UTF8 smiles, no hex display,
Metadata flows the right-cloned way! 🎄

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly describes the main change: preventing mutation of metadata in folded child expressions during constant folding.
Description check ✅ Passed The description provides issue number, problem summary, detailed explanation of changes, test coverage, and release notes following the template.
Linked Issues check ✅ Passed The PR directly addresses issue #68053 by implementing cloning of return types before applying parent metadata, preventing mutation of shared child expression metadata.
Out of Scope Changes check ✅ Passed All changes are directly related to fixing the constant folding metadata mutation bug: modifications to cloning logic in builtin.go, scalar_function.go, new cloning function in constant_fold.go, related tests in constant_test.go and integration_test.go, and test file regex fix.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@hawkingrei hawkingrei added the AI-Correction Bugfix by AI label May 18, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 86.20690% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.4859%. Comparing base (70ff16c) to head (e4b414d).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #68451        +/-   ##
================================================
- Coverage   77.2762%   76.4859%   -0.7904%     
================================================
  Files          2010       1992        -18     
  Lines        555477     557671      +2194     
================================================
- Hits         429252     426540      -2712     
- Misses       125305     131087      +5782     
+ Partials        920         44       -876     
Flag Coverage Δ
integration 41.5677% <86.2069%> (+1.7737%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 60.4888% <ø> (ø)
parser ∅ <ø> (∅)
br 49.9725% <ø> (-13.0354%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@hawkingrei
Copy link
Copy Markdown
Member Author

/test check-dev

@tiprow
Copy link
Copy Markdown

tiprow Bot commented May 18, 2026

@hawkingrei: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test fast_test_tiprow
/test tidb_parser_test

Use /test all to run all jobs.

Details

In response to this:

/test check-dev

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@hawkingrei hawkingrei force-pushed the issue-68053-foldconstant-metadata branch from ecde50b to 3388c64 Compare May 18, 2026 02:31
@hawkingrei
Copy link
Copy Markdown
Member Author

/retest

@ti-chi-bot ti-chi-bot Bot removed the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 18, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented May 18, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign qw4990, xuhuaiyu for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 18, 2026
@hawkingrei
Copy link
Copy Markdown
Member Author

/test idc-jenkins-ci-tidb/mysql-test

@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented May 18, 2026

@hawkingrei: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test build
/test check-dev
/test check-dev2
/test mysql-test
/test pull-br-integration-test
/test pull-build-next-gen
/test pull-integration-e2e-test
/test pull-integration-realcluster-test-next-gen
/test pull-lightning-integration-test
/test pull-mysql-client-test
/test pull-mysql-client-test-next-gen
/test pull-unit-test-ddlv1
/test pull-unit-test-next-gen
/test unit-test

The following commands are available to trigger optional jobs:

/test pingcap/tidb/canary_ghpr_unit_test
/test pull-br-integration-test-next-gen
/test pull-check-deps
/test pull-common-test
/test pull-e2e-test
/test pull-error-log-review
/test pull-integration-common-test
/test pull-integration-copr-test
/test pull-integration-ddl-test
/test pull-integration-ddl-test-next-gen
/test pull-integration-e2e-test-next-gen
/test pull-integration-jdbc-test
/test pull-integration-mysql-test
/test pull-integration-nodejs-test
/test pull-integration-python-orm-test
/test pull-mysql-test-next-gen
/test pull-sqllogic-test
/test pull-tiflash-integration-test

Use /test all to run the following jobs that were automatically triggered:

pingcap/tidb/ghpr_build
pingcap/tidb/ghpr_check
pingcap/tidb/ghpr_check2
pingcap/tidb/ghpr_mysql_test
pingcap/tidb/ghpr_unit_test
pingcap/tidb/pull_build_next_gen
pingcap/tidb/pull_integration_e2e_test
pingcap/tidb/pull_integration_realcluster_test_next_gen
pingcap/tidb/pull_mysql_client_test
pingcap/tidb/pull_mysql_client_test_next_gen
pingcap/tidb/pull_unit_test_next_gen
pull-check-deps
pull-error-log-review
Details

In response to this:

/test idc-jenkins-ci-tidb/mysql-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@tiprow
Copy link
Copy Markdown

tiprow Bot commented May 18, 2026

@hawkingrei: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test fast_test_tiprow
/test tidb_parser_test

Use /test all to run all jobs.

Details

In response to this:

/test idc-jenkins-ci-tidb/mysql-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@hawkingrei
Copy link
Copy Markdown
Member Author

/test mysql-test

@tiprow
Copy link
Copy Markdown

tiprow Bot commented May 18, 2026

@hawkingrei: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test fast_test_tiprow
/test tidb_parser_test

Use /test all to run all jobs.

Details

In response to this:

/test mysql-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@hawkingrei
Copy link
Copy Markdown
Member Author

/retest-required

@hawkingrei
Copy link
Copy Markdown
Member Author

/test mysql-test

@tiprow
Copy link
Copy Markdown

tiprow Bot commented May 18, 2026

@hawkingrei: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test fast_test_tiprow
/test tidb_parser_test

Use /test all to run all jobs.

Details

In response to this:

/test mysql-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented May 18, 2026

@hawkingrei: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
idc-jenkins-ci-tidb/mysql-test e4b414d link true /test mysql-test

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI-Correction Bugfix by AI release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/planner SIG: Planner size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BINARY expression in WHERE predicate leaks binary charset to SELECT list, causing hex display instead of readable string

1 participant