[feat](binlog) Add config to control whether enable persistent connec… #48761

Open
wants to merge 1 commit into
base: master

w41ter
Contributor

@w41ter w41ter commented Mar 6, 2025

…tion during ingesting

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

The previous PR #48467 added a thread_local HTTP connection to avoid reconnect latency. However, under mixed workloads (stream load running alongside binlog download), the persistent connection is blocked by stream loading, which causes a latency spike.

The ingest latency with a persistent connection:
[Figure: Ingest latency with thread_local persistent connection]

The ingest latency without persistent connection:

[Figure: Ingest latency without thread_local persistent connection]

This PR creates a persistent connection for each ingest task and adds a config, enable_ingest_binlog_with_persistent_connection, to control whether the persistent connection is used. The config is disabled by default; users who need to download binlogs over long distances can enable it to reduce sync latency.
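
To illustrate the idea of a task-scoped connection gated by a config flag, here is a minimal, self-contained sketch. It is not the code from this PR: FakeHttpClient, Config, IngestTask, and count_connections are hypothetical names invented for illustration, and the assumption that a disabled flag means "reconnect for every download" is inferred from the description above rather than taken from the diff. Only the flag name enable_ingest_binlog_with_persistent_connection and its disabled default come from the PR text.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an HTTP client. Constructing one represents
// opening a new connection, so the constructor bumps a shared counter.
struct FakeHttpClient {
    explicit FakeHttpClient(int* connect_counter) { ++*connect_counter; }
    // Pretends to fetch a file and returns the number of bytes "downloaded".
    std::size_t download(const std::string& url) const { return url.size(); }
};

// Mirrors the flag named in the PR description; false matches the stated default.
struct Config {
    bool enable_ingest_binlog_with_persistent_connection = false;
};

// One ingest task downloads several files. With the flag on, the task owns a
// single client for its whole lifetime; with the flag off (an assumption in
// this sketch), every download opens a fresh connection.
class IngestTask {
public:
    IngestTask(const Config& config, int* connect_counter)
            : _config(config), _connect_counter(connect_counter) {}

    std::size_t run(const std::vector<std::string>& urls) {
        std::size_t total = 0;
        for (const auto& url : urls) {
            if (_config.enable_ingest_binlog_with_persistent_connection) {
                // Lazily connect once, then reuse the connection for later files.
                if (!_client) {
                    _client = std::make_unique<FakeHttpClient>(_connect_counter);
                }
                total += _client->download(url);
            } else {
                // Short-lived connection: pay the connect cost on every file.
                FakeHttpClient one_shot(_connect_counter);
                total += one_shot.download(url);
            }
        }
        return total;
    }

private:
    Config _config;
    int* _connect_counter;
    // Owned by the task, not thread_local, so it is never shared with
    // unrelated work scheduled on the same thread.
    std::unique_ptr<FakeHttpClient> _client;
};

// Runs one task over the given URLs and returns how many connections it opened.
int count_connections(bool persistent, const std::vector<std::string>& urls) {
    Config config;
    config.enable_ingest_binlog_with_persistent_connection = persistent;
    int connects = 0;
    IngestTask task(config, &connects);
    task.run(urls);
    return connects;
}
```

In this sketch, a task that downloads three files opens three connections with the flag off and one with the flag on. Tying the connection to the task rather than to the thread is the design point the description emphasizes: the connection's lifetime ends with the task instead of lingering on a worker thread.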

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@w41ter
Contributor Author

w41ter commented Mar 7, 2025

run buildall

@w41ter w41ter requested a review from dataroaring March 7, 2025 08:28
Contributor

github-actions bot commented Mar 7, 2025

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 32838 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 201e1bebd25112e02a5544e1a94ee010ac941b97, data reload: false

------ Round 1 ----------------------------------
q1	17584	5203	5153	5153
q2	2050	307	162	162
q3	10398	1340	726	726
q4	10230	1076	542	542
q5	7544	2466	2387	2387
q6	196	165	133	133
q7	998	757	608	608
q8	9313	1333	1146	1146
q9	5060	4647	4799	4647
q10	6849	2325	1892	1892
q11	480	279	261	261
q12	356	359	223	223
q13	17786	3704	3159	3159
q14	240	235	206	206
q15	547	489	481	481
q16	615	636	591	591
q17	588	882	355	355
q18	7090	6468	6372	6372
q19	1297	955	548	548
q20	327	341	191	191
q21	3022	2334	2071	2071
q22	1067	1018	984	984
Total cold run time: 103637 ms
Total hot run time: 32838 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5251	5171	5174	5171
q2	243	342	238	238
q3	2130	2683	2316	2316
q4	1432	1843	1363	1363
q5	4249	4154	4183	4154
q6	205	165	124	124
q7	1973	1962	1815	1815
q8	2663	2685	2672	2672
q9	7271	7330	6920	6920
q10	3051	3228	2763	2763
q11	605	521	497	497
q12	672	774	606	606
q13	3559	3952	3297	3297
q14	288	325	270	270
q15	521	480	476	476
q16	661	678	657	657
q17	1196	1645	1374	1374
q18	8012	7763	7439	7439
q19	873	889	1032	889
q20	1989	2023	1894	1894
q21	5483	5043	4973	4973
q22	1103	1087	1021	1021
Total cold run time: 53430 ms
Total hot run time: 50929 ms

@doris-robot

TPC-DS: Total hot run time: 191703 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 201e1bebd25112e02a5544e1a94ee010ac941b97, data reload: false

query1	1374	1021	995	995
query2	6357	1915	1908	1908
query3	11050	4436	4405	4405
query4	54501	26136	23180	23180
query5	5234	544	517	517
query6	374	202	191	191
query7	5017	522	299	299
query8	332	253	242	242
query9	6538	2647	2678	2647
query10	425	325	257	257
query11	15508	15156	14963	14963
query12	157	111	110	110
query13	1130	529	380	380
query14	11048	7251	6400	6400
query15	199	189	200	189
query16	7156	673	476	476
query17	1097	727	619	619
query18	1658	409	333	333
query19	214	205	182	182
query20	132	127	123	123
query21	214	128	111	111
query22	4439	4387	4322	4322
query23	33990	33360	33308	33308
query24	5778	2522	2596	2522
query25	491	463	419	419
query26	711	278	159	159
query27	1897	507	373	373
query28	2939	2445	2467	2445
query29	622	561	471	471
query30	270	220	194	194
query31	913	875	805	805
query32	80	65	68	65
query33	466	355	328	328
query34	790	891	517	517
query35	843	843	759	759
query36	957	1006	905	905
query37	123	106	75	75
query38	4226	4206	4149	4149
query39	1511	1457	1418	1418
query40	210	126	108	108
query41	68	53	50	50
query42	125	110	106	106
query43	507	526	493	493
query44	1414	833	832	832
query45	186	183	166	166
query46	910	1077	669	669
query47	1825	1858	1757	1757
query48	395	430	311	311
query49	710	562	424	424
query50	745	782	439	439
query51	4356	4301	4353	4301
query52	112	109	101	101
query53	251	276	192	192
query54	493	503	431	431
query55	89	82	80	80
query56	272	274	266	266
query57	1207	1237	1152	1152
query58	251	247	247	247
query59	2673	2869	2530	2530
query60	290	277	262	262
query61	123	118	116	116
query62	721	715	677	677
query63	233	193	191	191
query64	1762	1024	677	677
query65	4445	4363	4324	4324
query66	723	406	350	350
query67	15768	15518	15178	15178
query68	7002	895	510	510
query69	540	299	288	288
query70	1225	1107	1118	1107
query71	499	307	274	274
query72	5810	3829	3498	3498
query73	1372	755	341	341
query74	9199	9123	8788	8788
query75	3721	3137	2710	2710
query76	4156	1203	790	790
query77	610	417	300	300
query78	10051	10187	9339	9339
query79	2842	831	600	600
query80	677	540	443	443
query81	485	251	220	220
query82	685	127	97	97
query83	176	178	161	161
query84	280	91	74	74
query85	786	360	316	316
query86	402	317	314	314
query87	4500	4450	4300	4300
query88	3543	2292	2275	2275
query89	416	321	289	289
query90	1843	221	218	218
query91	147	156	111	111
query92	74	64	59	59
query93	2268	1068	573	573
query94	672	399	291	291
query95	352	271	268	268
query96	490	569	283	283
query97	3300	3381	3246	3246
query98	258	202	202	202
query99	1364	1379	1227	1227
Total cold run time: 301370 ms
Total hot run time: 191703 ms

@doris-robot

ClickBench: Total hot run time: 31.05 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 201e1bebd25112e02a5544e1a94ee010ac941b97, data reload: false

query1	0.04	0.04	0.03
query2	0.07	0.03	0.04
query3	0.23	0.06	0.06
query4	1.63	0.11	0.10
query5	0.55	0.54	0.55
query6	1.19	0.72	0.71
query7	0.03	0.01	0.02
query8	0.04	0.03	0.04
query9	0.59	0.52	0.52
query10	0.57	0.62	0.58
query11	0.16	0.11	0.11
query12	0.15	0.11	0.11
query13	0.62	0.60	0.60
query14	2.68	2.71	2.80
query15	0.92	0.86	0.84
query16	0.39	0.39	0.38
query17	1.02	1.05	1.04
query18	0.21	0.20	0.20
query19	1.86	1.82	1.96
query20	0.01	0.02	0.01
query21	15.36	0.91	0.54
query22	0.75	1.20	0.77
query23	14.88	1.38	0.68
query24	7.24	1.42	0.95
query25	0.47	0.14	0.17
query26	0.65	0.16	0.14
query27	0.05	0.06	0.05
query28	9.95	0.91	0.43
query29	12.62	3.95	3.29
query30	0.25	0.09	0.07
query31	2.82	0.59	0.38
query32	3.22	0.54	0.47
query33	3.02	2.98	3.09
query34	15.92	5.11	4.46
query35	4.51	4.49	4.49
query36	0.67	0.49	0.48
query37	0.09	0.06	0.06
query38	0.05	0.04	0.03
query39	0.03	0.02	0.03
query40	0.16	0.12	0.12
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 105.82 s
Total hot run time: 31.05 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/6) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 46.86% (12533/26744)
Line Coverage 36.45% (106845/293115)
Region Coverage 35.48% (54556/153757)
Branch Coverage 30.83% (27432/88972)

4 participants