[Enhancement](paimon) Support native read of Paimon top-level schema change tables #48723
Conversation
Thank you for your contribution to Apache Doris. Please clearly describe your PR.
run buildall

TeamCity cloud ut coverage result:

TPC-H: Total hot run time: 32681 ms

TPC-DS: Total hot run time: 192362 ms

ClickBench: Total hot run time: 31.4 s

BE UT Coverage Report: Increment line coverage / Increment coverage report
run buildall

run buildall

TeamCity cloud ut coverage result:

TPC-H: Total hot run time: 32633 ms

TPC-DS: Total hot run time: 192050 ms

ClickBench: Total hot run time: 31.36 s
run buildall

TeamCity cloud ut coverage result:

TPC-H: Total hot run time: 32645 ms

TPC-DS: Total hot run time: 184942 ms

ClickBench: Total hot run time: 30.6 s

run buildall

TeamCity cloud ut coverage result:

BE UT Coverage Report: Increment line coverage / Increment coverage report
@@ -38,6 +38,134 @@ PaimonReader::PaimonReader(std::unique_ptr<GenericReader> file_format_reader,
    ADD_CHILD_TIMER(_profile, "DeleteFileReadTime", paimon_profile);
}

/**
sql:
No need for this comment.
public class PaimonSchemaCacheValue extends SchemaCacheValue {

    private List<Column> partitionColumns;

    public PaimonSchemaCacheValue(List<Column> schema, List<Column> partitionColumns) {

    private TableSchema tableSchema;
Looks like we only use this TableSchema to build columnIdToName, no need to store it.
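For reference, a minimal sketch of the reviewer's suggestion, assuming Paimon's TableSchema.fields() with the DataField.id()/name() accessors; the class and method names below are illustrative, not the PR's actual code:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.paimon.schema.TableSchema;
import org.apache.paimon.types.DataField;

// Sketch: build the column-id -> column-name map once in the constructor
// instead of keeping the whole TableSchema alive in the cache value.
public class PaimonSchemaCacheValueSketch {
    private final Map<Integer, String> columnIdToName = new HashMap<>();

    public PaimonSchemaCacheValueSketch(TableSchema tableSchema) {
        List<DataField> fields = tableSchema.fields();
        for (DataField field : fields) {
            columnIdToName.put(field.id(), field.name());
        }
    }

    public Map<Integer, String> getColumnIdToName() {
        return columnIdToName;
    }
}
```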
@@ -154,6 +155,11 @@ protected Optional<String> getSerializedTable() {
        return Optional.of(serializedTable);
    }

    Map<Long, String> getSchemaInfo(Long schemaId) {
Suggested change:
-    Map<Long, String> getSchemaInfo(Long schemaId) {
+    private Map<Long, String> getSchemaInfo(Long schemaId) {
@@ -1273,7 +1273,7 @@ Status FileScanner::_init_expr_ctxes() {
        if (slot_info.is_file_slot) {
            _file_slot_descs.emplace_back(it->second);
            _file_col_names.push_back(it->second->col_name());
-           if (it->second->col_unique_id() > 0) {
+           if (it->second->col_unique_id() >= 0) {
Is this a bug?
No. Iceberg's field unique IDs start from 1, while Paimon/Hudi field unique IDs start from 0.
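For illustration only, a tiny sketch of why the guard was relaxed from > 0 to >= 0; the IDs below are assumptions taken from the comment above, not values read from the libraries:

```java
// Paimon/Hudi assign the first top-level column unique ID 0 while Iceberg
// starts at 1, so a strict "> 0" check silently drops the first Paimon/Hudi
// column's ID mapping.
public class FieldIdGuardSketch {
    static boolean hasUniqueId(int colUniqueId) {
        return colUniqueId >= 0; // previously "> 0"
    }

    public static void main(String[] args) {
        System.out.println(hasUniqueId(0));  // true: first Paimon/Hudi column
        System.out.println(hasUniqueId(1));  // true: first Iceberg column
        System.out.println(hasUniqueId(-1)); // false: no unique ID assigned
    }
}
```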
run buildall

TeamCity cloud ut coverage result:

TPC-H: Total hot run time: 32806 ms

TPC-DS: Total hot run time: 192295 ms

ClickBench: Total hot run time: 31.35 s
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
…able. (#49051) ### What problem does this PR solve? Similar to pr #48723 Problem Summary: 1. Supports native reader reading tables after the top-level schema of hudi is changed, but does not support tables after the internal schema of struct is changed. change internal schema of struct schema(not support, will support in the next PR). 2. Unify the logic of iceberg/paimon/hudi native reader to handle schema change's table.
…ge table. (apache#48723) ### What problem does this PR solve? Problem Summary: Supports native reader reading tables after the top-level schema of paimon is changed, but does not support tables after the internal schema of struct is changed. change top-level schema(support): ```sql --spark sql ALTER TABLE table_name ADD COLUMNS (c1 INT,c2 STRING); ALTER TABLE table_name RENAME COLUMN c0 TO c1; ALTER TABLE table_name DROP COLUMNS (c1, c2); ALTER TABLE table_name ADD COLUMN c INT FIRST; ALTER TABLE table_name ADD COLUMN c INT AFTER b; ALTER TABLE table_name ALTER COLUMN col_a FIRST; ALTER TABLE table_name ALTER COLUMN col_a AFTER col_b; ``` change internal schema of struct schema(not support, will support in the next PR): ```sql --spark sql ALTER TABLE table_name ADD COLUMN v.value.f3 STRING; ALTER TABLE table_name RENAME COLUMN v.f1 to f100; ALTER TABLE table_name DROP COLUMN v.value.f3 ; ALTER TABLE table_name ALTER COLUMN v.col_a FIRST; ```
What problem does this PR solve?
Problem Summary:
Supports the native reader reading Paimon tables whose top-level schema has been changed; tables whose struct-internal schema has been changed are not yet supported.
Changing the top-level schema (supported), e.g. the Spark SQL below:
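```sql
-- spark sql
ALTER TABLE table_name ADD COLUMNS (c1 INT, c2 STRING);
ALTER TABLE table_name RENAME COLUMN c0 TO c1;
ALTER TABLE table_name DROP COLUMNS (c1, c2);
ALTER TABLE table_name ADD COLUMN c INT FIRST;
ALTER TABLE table_name ADD COLUMN c INT AFTER b;
ALTER TABLE table_name ALTER COLUMN col_a FIRST;
ALTER TABLE table_name ALTER COLUMN col_a AFTER col_b;
```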
Changing the internal schema of a struct (not supported yet; will be supported in the next PR), e.g. the Spark SQL below:
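```sql
-- spark sql
ALTER TABLE table_name ADD COLUMN v.value.f3 STRING;
ALTER TABLE table_name RENAME COLUMN v.f1 TO f100;
ALTER TABLE table_name DROP COLUMN v.value.f3;
ALTER TABLE table_name ALTER COLUMN v.col_a FIRST;
```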
Release note
Supports the native reader reading Paimon tables whose top-level schema has been changed.
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merges this PR)