[Improve](Variant) only merge_schema when sync_tablets or scan in… #48570

Open
wants to merge 1 commit into base: master

@eldenmoon (Member) commented Mar 3, 2025

… cloud mode

  1. refactor some options
  2. set merge_schema only when sync_tablets or scan

This reduces the cost of merge_schema, typically in the MOW model when a variant type has a large number of subcolumns.
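The idea in the description can be illustrated with a small model. This is only a sketch under stated assumptions, not the code from this PR: the `Rowset` and `Tablet` classes, the `add_rowsets` method, and the `merge_schema` keyword argument are all hypothetical names chosen for illustration. It shows why deferring the merge to the sync/scan path helps: many writes append rowsets cheaply, and the union of subcolumns is computed once when a reader needs it.

```python
from dataclasses import dataclass, field


@dataclass
class Rowset:
    # Hypothetical model: the variant subcolumn names this rowset carries.
    subcolumns: frozenset


@dataclass
class Tablet:
    # Hypothetical stand-in for a tablet; not the real Doris class.
    rowsets: list = field(default_factory=list)
    merged_subcolumns: set = field(default_factory=set)
    merge_calls: int = 0

    def add_rowsets(self, new_rowsets, merge_schema=False):
        """Append rowsets; merge schemas only when the caller asks for it."""
        self.rowsets.extend(new_rowsets)
        if merge_schema:
            self._merge_schema()

    def _merge_schema(self):
        # Cost grows with the number of rowsets and subcolumns,
        # so it is worth skipping on paths that do not read the schema.
        self.merge_calls += 1
        merged = set()
        for rowset in self.rowsets:
            merged |= rowset.subcolumns
        self.merged_subcolumns = merged


def demo():
    tablet = Tablet()
    # Write path: 100 small loads, none of them requests a merge.
    for i in range(100):
        tablet.add_rowsets([Rowset(frozenset({f"v.k{i}"}))])
    # Read path: one sync before a scan requests the merge.
    tablet.add_rowsets([], merge_schema=True)
    return tablet.merge_calls, len(tablet.merged_subcolumns)
```

In this toy model `demo()` performs a single merge for 100 loads, instead of one merge per load.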

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas (Contributor) commented Mar 3, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@eldenmoon
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31732 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit de9296699bfa467b740ff4d1a2c4de955187982b, data reload: false

------ Round 1 ----------------------------------
q1	17615	5160	5072	5072
q2	2061	308	182	182
q3	10380	1351	720	720
q4	10209	1068	518	518
q5	7494	2384	2428	2384
q6	191	172	135	135
q7	903	736	616	616
q8	9298	1304	1053	1053
q9	4981	4821	4655	4655
q10	6827	2318	1901	1901
q11	472	277	257	257
q12	350	357	215	215
q13	17746	3676	3129	3129
q14	236	238	221	221
q15	522	484	482	482
q16	615	605	612	605
q17	571	868	348	348
q18	6675	6235	6185	6185
q19	1200	954	558	558
q20	324	337	201	201
q21	2875	2147	1987	1987
q22	379	338	308	308
Total cold run time: 101924 ms
Total hot run time: 31732 ms
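
The report does not label its columns. Reading the Round 1 rows above, they appear to be: query, cold run (ms), first hot run (ms), second hot run (ms), and the faster of the two hot runs. That interpretation is an assumption, but the short check below (the `totals` helper is a hypothetical name) reproduces both reported totals from the rows under it.

```python
ROUND1 = """\
q1 17615 5160 5072 5072
q2 2061 308 182 182
q3 10380 1351 720 720
q4 10209 1068 518 518
q5 7494 2384 2428 2384
q6 191 172 135 135
q7 903 736 616 616
q8 9298 1304 1053 1053
q9 4981 4821 4655 4655
q10 6827 2318 1901 1901
q11 472 277 257 257
q12 350 357 215 215
q13 17746 3676 3129 3129
q14 236 238 221 221
q15 522 484 482 482
q16 615 605 612 605
q17 571 868 348 348
q18 6675 6235 6185 6185
q19 1200 954 558 558
q20 324 337 201 201
q21 2875 2147 1987 1987
q22 379 338 308 308
"""


def totals(table):
    """Return (total cold ms, total hot ms) for a whitespace-separated table.

    Assumes each row is: name, cold, hot1, hot2, min(hot1, hot2).
    """
    cold_total = 0
    hot_total = 0
    for line in table.strip().splitlines():
        name, cold, hot1, hot2, best = line.split()
        if int(best) != min(int(hot1), int(hot2)):
            raise ValueError(f"{name}: last column is not min of hot runs")
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


print(totals(ROUND1))  # (101924, 31732), matching the reported totals
```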

----- Round 2, with runtime_filter_mode=off -----
q1	5135	5301	5106	5106
q2	244	331	226	226
q3	2180	2691	2324	2324
q4	1416	1849	1365	1365
q5	4286	4113	4178	4113
q6	209	169	122	122
q7	1895	1816	1664	1664
q8	2626	2634	2588	2588
q9	7354	7311	7213	7213
q10	3011	3186	2798	2798
q11	567	520	493	493
q12	694	756	630	630
q13	3556	4019	3267	3267
q14	268	285	277	277
q15	519	447	462	447
q16	644	716	646	646
q17	1148	1579	1408	1408
q18	7523	7372	7354	7354
q19	799	871	1077	871
q20	1964	2026	1861	1861
q21	5360	4926	4856	4856
q22	642	616	516	516
Total cold run time: 52040 ms
Total hot run time: 50145 ms

@doris-robot

TPC-DS: Total hot run time: 183977 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit de9296699bfa467b740ff4d1a2c4de955187982b, data reload: false

query1	978	397	386	386
query2	6522	1990	1926	1926
query3	6810	211	208	208
query4	26426	23915	22860	22860
query5	4805	658	486	486
query6	305	194	190	190
query7	4602	505	301	301
query8	294	253	247	247
query9	8683	2552	2535	2535
query10	500	318	289	289
query11	15383	15106	14884	14884
query12	176	105	101	101
query13	1643	519	397	397
query14	10197	7197	6517	6517
query15	206	194	182	182
query16	7638	626	496	496
query17	1155	713	543	543
query18	1966	391	301	301
query19	196	179	164	164
query20	120	116	111	111
query21	212	117	98	98
query22	4110	4241	4049	4049
query23	34025	32893	32900	32893
query24	7687	2388	2445	2388
query25	520	465	389	389
query26	1270	273	156	156
query27	2140	508	321	321
query28	3978	2384	2369	2369
query29	700	541	412	412
query30	241	191	157	157
query31	944	862	746	746
query32	74	64	67	64
query33	558	362	293	293
query34	793	851	518	518
query35	808	826	725	725
query36	949	986	890	890
query37	119	102	74	74
query38	4374	4188	4072	4072
query39	1454	1399	1376	1376
query40	207	119	103	103
query41	54	51	50	50
query42	122	104	102	102
query43	522	511	501	501
query44	1301	802	813	802
query45	176	181	165	165
query46	877	1049	662	662
query47	1746	1787	1724	1724
query48	379	421	327	327
query49	810	518	449	449
query50	705	746	424	424
query51	4162	4179	4054	4054
query52	108	109	103	103
query53	236	264	190	190
query54	510	526	435	435
query55	83	82	83	82
query56	294	307	265	265
query57	1148	1129	1062	1062
query58	249	233	240	233
query59	2688	2887	2632	2632
query60	278	277	251	251
query61	118	118	119	118
query62	815	725	682	682
query63	234	224	195	195
query64	4210	1007	653	653
query65	3265	3138	3235	3138
query66	1064	405	323	323
query67	15812	15475	15401	15401
query68	8027	886	503	503
query69	463	302	289	289
query70	1179	1131	1107	1107
query71	463	291	294	291
query72	5482	3581	3731	3581
query73	746	702	358	358
query74	9163	9071	9084	9071
query75	3781	3138	2713	2713
query76	3689	1204	777	777
query77	792	377	285	285
query78	9960	10076	9272	9272
query79	2227	844	591	591
query80	617	589	441	441
query81	510	287	244	244
query82	491	124	107	107
query83	185	180	156	156
query84	244	92	74	74
query85	807	361	305	305
query86	417	295	288	288
query87	4495	4451	4293	4293
query88	3907	2242	2203	2203
query89	400	325	283	283
query90	1968	203	197	197
query91	133	138	112	112
query92	72	65	59	59
query93	1866	1068	560	560
query94	629	426	311	311
query95	357	269	273	269
query96	485	560	268	268
query97	3304	3414	3271	3271
query98	240	203	201	201
query99	1361	1437	1257	1257
Total cold run time: 274194 ms
Total hot run time: 183977 ms

@doris-robot

ClickBench: Total hot run time: 30.88 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit de9296699bfa467b740ff4d1a2c4de955187982b, data reload: false

query1	0.04	0.04	0.03
query2	0.07	0.04	0.04
query3	0.23	0.07	0.06
query4	1.62	0.11	0.10
query5	0.55	0.55	0.55
query6	1.19	0.72	0.71
query7	0.02	0.02	0.02
query8	0.04	0.04	0.03
query9	0.57	0.55	0.52
query10	0.57	0.58	0.57
query11	0.16	0.10	0.10
query12	0.15	0.11	0.11
query13	0.61	0.60	0.60
query14	2.70	2.86	2.69
query15	0.93	0.86	0.84
query16	0.37	0.38	0.38
query17	1.00	1.04	1.03
query18	0.21	0.20	0.19
query19	1.85	1.81	2.02
query20	0.02	0.01	0.01
query21	15.36	0.90	0.56
query22	0.76	1.20	0.71
query23	14.91	1.42	0.62
query24	6.61	2.22	0.85
query25	0.52	0.25	0.09
query26	0.43	0.16	0.15
query27	0.05	0.04	0.04
query28	9.63	0.88	0.44
query29	12.53	3.92	3.26
query30	0.24	0.09	0.07
query31	2.81	0.61	0.38
query32	3.23	0.56	0.46
query33	3.01	2.99	3.03
query34	15.78	5.33	4.52
query35	4.57	4.55	4.59
query36	0.67	0.50	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.15	0.13
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 104.51 s
Total hot run time: 30.88 s

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/51) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 45.44% (12127/26689)
Line Coverage 34.97% (102314/292567)
Region Coverage 34.13% (52391/153518)
Branch Coverage 29.82% (26486/88834)

… cloud mode

1. refactor some options
2. set merge_schema only when `sync_tablets` or scan
@eldenmoon
Member Author

run buildall

github-actions bot (Contributor) commented Mar 4, 2025

PR approved by at least one committer and no changes requested.

@github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Mar 4, 2025

github-actions bot (Contributor) commented Mar 4, 2025

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 31923 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0ed7158dd43d0cb86e2fb41b92256745253b0478, data reload: false

------ Round 1 ----------------------------------
q1	17677	5169	5134	5134
q2	2073	297	168	168
q3	10562	1307	768	768
q4	10265	1062	535	535
q5	8203	2429	2384	2384
q6	190	168	134	134
q7	945	755	630	630
q8	9319	1410	1099	1099
q9	4895	4682	4667	4667
q10	6815	2423	1907	1907
q11	480	286	259	259
q12	347	359	220	220
q13	17775	3694	3158	3158
q14	246	240	213	213
q15	538	467	465	465
q16	646	628	590	590
q17	609	870	347	347
q18	6700	6251	6178	6178
q19	1544	986	571	571
q20	324	322	203	203
q21	2891	2169	1993	1993
q22	364	339	300	300
Total cold run time: 103408 ms
Total hot run time: 31923 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5136	5118	5141	5118
q2	244	330	228	228
q3	2149	2699	2355	2355
q4	1483	1853	1354	1354
q5	4307	4153	4200	4153
q6	216	166	125	125
q7	1906	1892	1765	1765
q8	2660	2740	2591	2591
q9	7333	7183	7252	7183
q10	3034	3264	2816	2816
q11	580	519	492	492
q12	719	789	637	637
q13	3353	3983	3345	3345
q14	282	296	265	265
q15	509	456	474	456
q16	636	679	637	637
q17	1165	1604	1368	1368
q18	7619	7625	7317	7317
q19	835	810	873	810
q20	1985	2068	1892	1892
q21	5573	5115	4733	4733
q22	611	610	529	529
Total cold run time: 52335 ms
Total hot run time: 50169 ms

@doris-robot

TPC-DS: Total hot run time: 190041 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0ed7158dd43d0cb86e2fb41b92256745253b0478, data reload: false

query1	1318	941	953	941
query2	6217	1932	1900	1900
query3	10989	4447	4528	4447
query4	54195	25712	23027	23027
query5	5339	529	505	505
query6	355	200	218	200
query7	5022	505	292	292
query8	324	258	240	240
query9	6194	2567	2559	2559
query10	432	302	246	246
query11	15253	15327	14997	14997
query12	157	110	104	104
query13	1120	527	395	395
query14	10731	6423	7133	6423
query15	232	211	184	184
query16	7106	648	480	480
query17	1112	729	595	595
query18	1537	423	322	322
query19	225	201	188	188
query20	132	127	164	127
query21	218	123	105	105
query22	4228	4581	4270	4270
query23	33835	33156	33625	33156
query24	5780	2467	2445	2445
query25	460	473	395	395
query26	672	281	163	163
query27	1718	508	328	328
query28	2901	2482	2447	2447
query29	595	557	433	433
query30	226	199	162	162
query31	878	886	809	809
query32	77	63	63	63
query33	458	376	308	308
query34	789	899	521	521
query35	808	835	760	760
query36	950	1015	909	909
query37	116	101	71	71
query38	4314	4142	4112	4112
query39	1515	1424	1430	1424
query40	211	117	127	117
query41	60	56	50	50
query42	123	104	105	104
query43	507	509	502	502
query44	1309	854	822	822
query45	181	175	165	165
query46	897	1076	666	666
query47	1808	1853	1807	1807
query48	386	422	304	304
query49	691	527	397	397
query50	730	774	425	425
query51	4329	4305	4205	4205
query52	122	107	96	96
query53	236	258	193	193
query54	490	497	427	427
query55	84	82	81	81
query56	275	272	263	263
query57	1143	1180	1135	1135
query58	253	256	246	246
query59	2694	2924	2688	2688
query60	286	271	261	261
query61	120	123	120	120
query62	775	768	701	701
query63	236	198	205	198
query64	1505	1056	679	679
query65	3300	3242	3270	3242
query66	735	410	299	299
query67	16086	15447	15232	15232
query68	7077	902	509	509
query69	550	294	308	294
query70	1198	1121	1117	1117
query71	512	293	265	265
query72	5950	3568	3692	3568
query73	1394	732	359	359
query74	8943	9131	8709	8709
query75	3689	3183	2705	2705
query76	4261	1200	765	765
query77	611	364	290	290
query78	10041	10025	9275	9275
query79	3366	846	586	586
query80	721	531	459	459
query81	517	273	243	243
query82	681	129	98	98
query83	269	173	154	154
query84	278	99	68	68
query85	791	417	297	297
query86	411	307	281	281
query87	4423	4471	4517	4471
query88	3537	2234	2204	2204
query89	428	335	286	286
query90	1879	200	191	191
query91	137	140	114	114
query92	77	67	60	60
query93	2148	1049	574	574
query94	656	416	300	300
query95	355	277	260	260
query96	488	577	265	265
query97	3318	3403	3317	3317
query98	216	208	209	208
query99	1438	1393	1271	1271
Total cold run time: 298460 ms
Total hot run time: 190041 ms

@doris-robot

ClickBench: Total hot run time: 31 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0ed7158dd43d0cb86e2fb41b92256745253b0478, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.04	0.03
query3	0.23	0.07	0.06
query4	1.62	0.10	0.10
query5	0.55	0.56	0.56
query6	1.18	0.72	0.73
query7	0.03	0.02	0.01
query8	0.05	0.04	0.03
query9	0.59	0.53	0.52
query10	0.57	0.57	0.58
query11	0.16	0.11	0.10
query12	0.15	0.12	0.12
query13	0.62	0.60	0.60
query14	2.71	2.67	2.69
query15	0.92	0.85	0.84
query16	0.39	0.39	0.38
query17	1.03	1.01	1.02
query18	0.21	0.20	0.20
query19	1.96	1.79	1.95
query20	0.01	0.02	0.01
query21	15.36	0.95	0.53
query22	0.75	1.19	0.66
query23	15.00	1.40	0.62
query24	7.05	1.57	0.97
query25	0.46	0.20	0.12
query26	0.56	0.17	0.14
query27	0.05	0.06	0.05
query28	9.66	0.90	0.42
query29	12.54	4.04	3.32
query30	0.25	0.09	0.06
query31	2.81	0.60	0.39
query32	3.22	0.55	0.48
query33	3.04	3.01	3.04
query34	15.75	5.12	4.54
query35	4.46	4.52	4.52
query36	0.68	0.51	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.18	0.14	0.14
query41	0.08	0.03	0.03
query42	0.04	0.02	0.02
query43	0.04	0.02	0.03
Total cold run time: 105.24 s
Total hot run time: 31 s

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/49) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 45.84% (12241/26704)
Line Coverage 35.35% (103522/292837)
Region Coverage 34.51% (53032/153659)
Branch Coverage 30.22% (26871/88918)

@eldenmoon added the p0_b and usercase (Important user case type label) labels Mar 5, 2025

@csun5285 (Contributor) left a comment

LGTM

Labels: approved, dev/3.0.x, p0_b, reviewed, usercase

5 participants