try to remove redundant alias in expression rewriter and select #20867

Open
buraksenn wants to merge 2 commits into apache:main from
buraksenn:try-to-remove-redundant-alias

@buraksenn
Contributor

Which issue does this PR close?

Does not close an issue.

Rationale for this change

In #20780 (comment), @alamb asked whether we can simplify the redundant alias count(*) AS count(*) to plain count(*), so I gave it a go.

I'm not sure about the implications at the moment, so it would be great to get input on this PR.

What changes are included in this PR?

Main changes are in:

  • order_by.rs: match only top-level expressions instead of recursively searching sub-expressions (otherwise we may match the wrong expressions)
  • select.rs: strip the alias before comparing; otherwise the existing alias is not used at all
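
To make the two bullets above concrete, here is a minimal sketch of the idea on a toy expression tree. It is not the code from this PR and does not use DataFusion's types: the `Expr` enum and the helpers `strip_alias`, `remove_redundant_alias`, and `find_top_level_match` are hypothetical names invented for illustration only.

```rust
// Toy expression tree for illustration; NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    CountStar,
    Alias(Box<Expr>, String),
    Add(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Name the expression would carry in the output schema (toy rule).
    fn display_name(&self) -> String {
        match self {
            Expr::Column(name) => name.clone(),
            Expr::CountStar => "count(*)".to_string(),
            Expr::Alias(_, name) => name.clone(),
            Expr::Add(l, r) => format!("{} + {}", l.display_name(), r.display_name()),
        }
    }
}

/// Remove only the outermost alias, if there is one.
fn strip_alias(expr: &Expr) -> &Expr {
    match expr {
        Expr::Alias(inner, _) => inner.as_ref(),
        other => other,
    }
}

/// Drop an alias that merely repeats the name the inner expression
/// already has, e.g. `count(*) AS count(*)` becomes `count(*)`.
fn remove_redundant_alias(expr: Expr) -> Expr {
    match expr {
        Expr::Alias(inner, name) if inner.display_name() == name => *inner,
        other => other,
    }
}

/// Top-level-only match: compare whole select items (ignoring their outer
/// alias) against the needle, without recursing into sub-expressions.
fn find_top_level_match<'a>(needle: &Expr, select_items: &'a [Expr]) -> Option<&'a Expr> {
    select_items
        .iter()
        .find(|item| strip_alias(item) == strip_alias(needle))
}
```

In this sketch, looking up count(*) among the select items finds count(*) AS n because the alias is stripped before comparing, while looking up column a does not match a + count(*) because the search never descends into sub-expressions.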

Are these changes tested?

I've added some tests for aliases. Existing tests and plan outputs changed as well, as you can see in the PR.

Are there any user-facing changes?

Plans will change, but I'm not sure whether that has any impact.

@github-actions bot added the sql, logical-expr, core, and sqllogictest labels on Mar 11, 2026
@buraksenn changed the title from "try to remove redundant alias and projection in expression rewriter and select" to "try to remove redundant alias in expression rewriter and select" on Mar 11, 2026
@buraksenn
Contributor Author

run benchmarks

@alamb-ghbot

🤖 Hi @buraksenn, thanks for the request (#20867 (comment)). scrape_comments.py only responds to whitelisted users. Allowed users: Dandandan, Jefffrey, Omega359, adriangb, alamb, comphead, etseidl, gabotechs, geoffreyclaude, klion26, rluvaton, xudong963, zhuqi-lucas.

@buraksenn
Contributor Author

I think the clickbench plan changes are OK, but I tried to run the benchmarks to make sure there are no regressions.

@alamb
Contributor

alamb commented Mar 11, 2026

run benchmarks

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing try-to-remove-redundant-alias (66ad5c2) to da05287 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and try-to-remove-redundant-alias
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ try-to-remove-redundant-alias ┃    Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0 │  2179.24 ms │                    2140.75 ms │ no change │
│ QQuery 1 │   846.97 ms │                     871.77 ms │ no change │
│ QQuery 2 │  1679.04 ms │                    1669.15 ms │ no change │
│ QQuery 3 │  1020.87 ms │                    1011.77 ms │ no change │
│ QQuery 4 │  2119.86 ms │                    2097.23 ms │ no change │
│ QQuery 5 │ 26363.21 ms │                   25969.03 ms │ no change │
│ QQuery 6 │  3441.45 ms │                    3457.07 ms │ no change │
│ QQuery 7 │  2420.65 ms │                    2451.58 ms │ no change │
└──────────┴─────────────┴───────────────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 40071.29ms │
│ Total Time (try-to-remove-redundant-alias)   │ 39668.35ms │
│ Average Time (HEAD)                          │  5008.91ms │
│ Average Time (try-to-remove-redundant-alias) │  4958.54ms │
│ Queries Faster                               │          0 │
│ Queries Slower                               │          0 │
│ Queries with No Change                       │          8 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ try-to-remove-redundant-alias ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     2.56 ms │                       2.57 ms │     no change │
│ QQuery 1  │    47.98 ms │                      47.58 ms │     no change │
│ QQuery 2  │   150.55 ms │                     148.94 ms │     no change │
│ QQuery 3  │   156.51 ms │                     158.98 ms │     no change │
│ QQuery 4  │   946.44 ms │                     973.82 ms │     no change │
│ QQuery 5  │  1159.11 ms │                    1197.97 ms │     no change │
│ QQuery 6  │     6.43 ms │                       6.35 ms │     no change │
│ QQuery 7  │    52.32 ms │                      52.68 ms │     no change │
│ QQuery 8  │  1329.11 ms │                    1346.30 ms │     no change │
│ QQuery 9  │  1730.86 ms │                    1721.32 ms │     no change │
│ QQuery 10 │   318.36 ms │                     326.18 ms │     no change │
│ QQuery 11 │   363.51 ms │                     370.73 ms │     no change │
│ QQuery 12 │  1104.11 ms │                    1134.70 ms │     no change │
│ QQuery 13 │  1807.41 ms │                    1767.04 ms │     no change │
│ QQuery 14 │  1113.93 ms │                    1133.79 ms │     no change │
│ QQuery 15 │  1098.39 ms │                    1133.95 ms │     no change │
│ QQuery 16 │  2273.85 ms │                    2329.89 ms │     no change │
│ QQuery 17 │  2280.48 ms │                    2338.03 ms │     no change │
│ QQuery 18 │  4590.51 ms │                    4421.44 ms │     no change │
│ QQuery 19 │   119.18 ms │                     121.07 ms │     no change │
│ QQuery 20 │  1688.04 ms │                    1701.67 ms │     no change │
│ QQuery 21 │  1972.20 ms │                    1952.55 ms │     no change │
│ QQuery 22 │  3352.14 ms │                    3347.70 ms │     no change │
│ QQuery 23 │ 11524.76 ms │                   10838.23 ms │ +1.06x faster │
│ QQuery 24 │   183.11 ms │                     186.97 ms │     no change │
│ QQuery 25 │   409.10 ms │                     404.15 ms │     no change │
│ QQuery 26 │   181.48 ms │                     187.80 ms │     no change │
│ QQuery 27 │  2513.69 ms │                    2541.51 ms │     no change │
│ QQuery 28 │ 24147.62 ms │                   24056.03 ms │     no change │
│ QQuery 29 │   995.60 ms │                     977.38 ms │     no change │
│ QQuery 30 │  1169.44 ms │                    1156.68 ms │     no change │
│ QQuery 31 │  1223.32 ms │                    1245.04 ms │     no change │
│ QQuery 32 │  4327.14 ms │                    4561.65 ms │  1.05x slower │
│ QQuery 33 │  5283.15 ms │                    5121.73 ms │     no change │
│ QQuery 34 │  5877.27 ms │                    5497.92 ms │ +1.07x faster │
│ QQuery 35 │  1092.49 ms │                    1087.38 ms │     no change │
│ QQuery 36 │   180.21 ms │                     179.45 ms │     no change │
│ QQuery 37 │    71.11 ms │                      68.65 ms │     no change │
│ QQuery 38 │   105.98 ms │                     107.02 ms │     no change │
│ QQuery 39 │   327.60 ms │                     318.78 ms │     no change │
│ QQuery 40 │    39.53 ms │                      36.94 ms │ +1.07x faster │
│ QQuery 41 │    33.18 ms │                      33.33 ms │     no change │
│ QQuery 42 │    28.78 ms │                      30.20 ms │     no change │
└───────────┴─────────────┴───────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 87378.51ms │
│ Total Time (try-to-remove-redundant-alias)   │ 86372.10ms │
│ Average Time (HEAD)                          │  2032.06ms │
│ Average Time (try-to-remove-redundant-alias) │  2008.65ms │
│ Queries Faster                               │          3 │
│ Queries Slower                               │          1 │
│ Queries with No Change                       │         39 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ try-to-remove-redundant-alias ┃    Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 1  │ 103.99 ms │                     103.51 ms │ no change │
│ QQuery 2  │  30.83 ms │                      30.11 ms │ no change │
│ QQuery 3  │  34.51 ms │                      35.31 ms │ no change │
│ QQuery 4  │  29.14 ms │                      29.51 ms │ no change │
│ QQuery 5  │  79.12 ms │                      78.53 ms │ no change │
│ QQuery 6  │  20.29 ms │                      20.23 ms │ no change │
│ QQuery 7  │ 144.43 ms │                     141.25 ms │ no change │
│ QQuery 8  │  38.09 ms │                      36.52 ms │ no change │
│ QQuery 9  │  92.29 ms │                      95.35 ms │ no change │
│ QQuery 10 │  62.05 ms │                      62.92 ms │ no change │
│ QQuery 11 │  18.05 ms │                      18.50 ms │ no change │
│ QQuery 12 │  54.52 ms │                      54.46 ms │ no change │
│ QQuery 13 │  45.10 ms │                      46.90 ms │ no change │
│ QQuery 14 │  14.11 ms │                      13.96 ms │ no change │
│ QQuery 15 │  29.06 ms │                      28.87 ms │ no change │
│ QQuery 16 │  26.89 ms │                      26.61 ms │ no change │
│ QQuery 17 │ 136.81 ms │                     139.67 ms │ no change │
│ QQuery 18 │ 262.74 ms │                     263.52 ms │ no change │
│ QQuery 19 │  43.04 ms │                      43.14 ms │ no change │
│ QQuery 20 │  53.96 ms │                      53.91 ms │ no change │
│ QQuery 21 │ 185.83 ms │                     185.09 ms │ no change │
│ QQuery 22 │  21.75 ms │                      21.89 ms │ no change │
└───────────┴───────────┴───────────────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 1526.58ms │
│ Total Time (try-to-remove-redundant-alias)   │ 1529.75ms │
│ Average Time (HEAD)                          │   69.39ms │
│ Average Time (try-to-remove-redundant-alias) │   69.53ms │
│ Queries Faster                               │         0 │
│ Queries Slower                               │         0 │
│ Queries with No Change                       │        22 │
│ Queries with Failure                         │         0 │
└──────────────────────────────────────────────┴───────────┘
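
As a reading aid for the Change column, the sketch below reproduces the labels in the tables above from the two timings. The 5% noise band is an assumption inferred from the reported rows (for example, Query 42 at about 1.049x is "no change" while Query 32 at about 1.054x is "1.05x slower"); `classify_change` and `NOISE_THRESHOLD` are hypothetical names, not the benchmark script's actual code.

```rust
/// Assumed noise band: ratios within +/-5% of 1.0 count as "no change".
const NOISE_THRESHOLD: f64 = 0.05;

/// Label a query by the ratio of branch time to baseline (HEAD) time.
fn classify_change(baseline_ms: f64, branch_ms: f64) -> String {
    let ratio = branch_ms / baseline_ms;
    if ratio >= 1.0 - NOISE_THRESHOLD && ratio <= 1.0 + NOISE_THRESHOLD {
        "no change".to_string()
    } else if ratio < 1.0 {
        // Branch is faster: report the speedup as baseline / branch.
        format!("+{:.2}x faster", 1.0 / ratio)
    } else {
        // Branch is slower: report the slowdown as branch / baseline.
        format!("{:.2}x slower", ratio)
    }
}
```

For instance, clickbench_partitioned Query 23 (11524.76 ms vs 10838.23 ms) has a ratio of about 0.94, which falls outside the assumed band and is reported as a speedup.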

@buraksenn
Contributor Author

There is no notable change, so can we say this only removes the redundant alias and has no bad side effects?

@alamb
Contributor

alamb commented Mar 11, 2026

There is no notable change, so can we say this only removes the redundant alias and has no bad side effects?

Yes that is my conclusion as well

Contributor

@alamb alamb left a comment

Looks good to me -- thank you @buraksenn
