[chore](compaction) remove single replica compaction #63771

Merged
eldenmoon merged 1 commit into apache:master from csun5285:remove-single-replica-compaction
Jun 3, 2026
@csun5285
Contributor

@csun5285 csun5285 commented May 28, 2026

Remove the single replica compaction (SRC) feature end-to-end across BE, FE and regression tests.

Doc: apache/doris-website#3870

Why remove it

The main reason is correctness risk in peer selection. A follower replica had to pick a peer holding a "proper" version (_find_rowset_to_fetch) and fetch its compacted result, based on replica info that was only refreshed periodically. Because replicas progress through versions independently and this "leader" selection ran against a stale, time-sensitive view of the cluster, the choice of which peer to fetch from, and which version, was racy and could select a peer whose state no longer matched, leading to subtle inconsistencies.
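
The race described above can be illustrated with a small, hypothetical model. This is not Doris code: `ReplicaInfo`, `pick_peer`, and the version numbers are invented for illustration. It only shows that a choice made against a periodically refreshed snapshot can name a peer whose live state has already moved on.

```python
from dataclasses import dataclass


@dataclass
class ReplicaInfo:
    """Hypothetical view of one replica: its id and the highest version it holds."""
    replica_id: int
    max_version: int


def pick_peer(snapshot, wanted_version):
    """Return the id of the first peer whose snapshot entry holds the wanted version."""
    for info in snapshot:
        if info.max_version == wanted_version:
            return info.replica_id
    return None


# Live cluster state: what each peer really holds right now.
live = {1: ReplicaInfo(1, 10), 2: ReplicaInfo(2, 10)}

# Periodic refresh: the follower copies the live state into a snapshot.
snapshot = [ReplicaInfo(r.replica_id, r.max_version) for r in live.values()]

# Before the next refresh, peer 1 advances on its own.
live[1].max_version = 12

# The follower still decides from the stale snapshot.
chosen = pick_peer(snapshot, wanted_version=10)
stale = live[chosen].max_version != 10
print(chosen, stale)
```

In this sketch the follower picks peer 1 because the snapshot still reports version 10, even though the live peer has since moved to version 12.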

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@csun5285 csun5285 force-pushed the remove-single-replica-compaction branch 2 times, most recently from 61db3dd to d97a724, May 28, 2026 03:33
@csun5285
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 50.00% (4/8) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.96% (20946/38819)
Line Coverage 37.52% (198477/528930)
Region Coverage 33.87% (155538/459154)
Branch Coverage 34.81% (67695/194451)

@hello-stephen
Contributor

TPC-H: Total hot run time: 31818 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d97a724cb17833f5b11a182bf652af8ab6017ace, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17718	4049	4014	4014
q2	q3	10995	1443	812	812
q4	4770	478	346	346
q5	10374	2330	2126	2126
q6	361	188	142	142
q7	968	774	643	643
q8	9574	1732	1717	1717
q9	7119	5022	5006	5006
q10	6504	2193	1877	1877
q11	431	275	245	245
q12	696	443	303	303
q13	18158	3547	2741	2741
q14	268	255	236	236
q15	q16	834	780	711	711
q17	1009	964	870	870
q18	7467	5825	6458	5825
q19	1226	1303	1059	1059
q20	546	412	279	279
q21	5938	2727	2560	2560
q22	445	367	306	306
Total cold run time: 105401 ms
Total hot run time: 31818 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4802	4759	4769	4759
q2	q3	4888	5260	4590	4590
q4	2140	2198	1410	1410
q5	4813	4773	4726	4726
q6	237	181	136	136
q7	1888	1773	1560	1560
q8	2356	1981	1952	1952
q9	7456	7472	7418	7418
q10	4740	4648	4199	4199
q11	545	386	356	356
q12	740	745	535	535
q13	3093	3423	2811	2811
q14	268	277	272	272
q15	q16	691	698	613	613
q17	1277	1267	1250	1250
q18	7417	6720	6824	6720
q19	1117	1071	1132	1071
q20	2226	2228	1937	1937
q21	5346	4659	4509	4509
q22	519	484	407	407
Total cold run time: 56559 ms
Total hot run time: 51231 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 172532 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d97a724cb17833f5b11a182bf652af8ab6017ace, data reload: false

query5	4306	662	534	534
query6	330	216	214	214
query7	4265	575	301	301
query8	332	231	222	222
query9	8813	4113	4115	4113
query10	459	340	303	303
query11	5777	2412	2260	2260
query12	181	131	124	124
query13	1321	582	430	430
query14	6130	5435	5131	5131
query14_1	4434	4440	4436	4436
query15	218	206	182	182
query16	1047	436	395	395
query17	1150	733	580	580
query18	2458	474	342	342
query19	209	203	162	162
query20	146	133	129	129
query21	218	135	117	117
query22	13663	13553	13347	13347
query23	17442	16545	16334	16334
query23_1	16268	16233	16335	16233
query24	7455	1773	1340	1340
query24_1	1350	1345	1337	1337
query25	580	507	443	443
query26	1333	339	175	175
query27	2701	558	340	340
query28	4468	2034	2019	2019
query29	1017	659	522	522
query30	324	240	199	199
query31	1128	1083	959	959
query32	91	80	78	78
query33	562	360	311	311
query34	1194	1148	650	650
query35	782	795	707	707
query36	1402	1386	1229	1229
query37	163	106	96	96
query38	3198	3129	3066	3066
query39	938	922	896	896
query39_1	888	874	880	874
query40	232	151	133	133
query41	71	70	69	69
query42	115	112	110	110
query43	329	344	304	304
query44	
query45	224	214	203	203
query46	1096	1233	715	715
query47	2394	2375	2260	2260
query48	418	439	283	283
query49	660	506	398	398
query50	982	354	251	251
query51	4313	4369	4200	4200
query52	105	113	96	96
query53	265	290	208	208
query54	326	293	276	276
query55	96	94	91	91
query56	343	354	329	329
query57	1468	1414	1362	1362
query58	311	292	278	278
query59	1625	1658	1479	1479
query60	340	345	329	329
query61	206	152	157	152
query62	701	657	589	589
query63	246	204	201	201
query64	2427	800	654	654
query65	
query66	1710	472	360	360
query67	29838	29598	29605	29598
query68	
query69	456	353	308	308
query70	1060	1016	995	995
query71	307	282	270	270
query72	3006	2708	2488	2488
query73	872	752	436	436
query74	5112	4924	4774	4774
query75	2707	2592	2263	2263
query76	2323	1152	784	784
query77	417	427	341	341
query78	12400	12314	11818	11818
query79	1507	1011	782	782
query80	651	564	454	454
query81	455	280	240	240
query82	1348	168	119	119
query83	358	275	244	244
query84	258	137	112	112
query85	889	563	508	508
query86	407	339	302	302
query87	3447	3388	3256	3256
query88	3607	2767	2739	2739
query89	449	391	343	343
query90	1982	191	201	191
query91	210	176	142	142
query92	81	73	76	73
query93	1606	1452	854	854
query94	552	369	320	320
query95	679	397	347	347
query96	1079	765	354	354
query97	2732	2718	2603	2603
query98	240	232	223	223
query99	1197	1157	1009	1009
Total cold run time: 255145 ms
Total hot run time: 172532 ms

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 50.00% (4/8) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.94% (28112/38018)
Line Coverage 57.92% (305599/527580)
Region Coverage 55.28% (256273/463573)
Branch Coverage 56.70% (110658/195179)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 1.41% (1/71) 🎉
Increment coverage report
Complete coverage report

@csun5285 csun5285 force-pushed the remove-single-replica-compaction branch 2 times, most recently from 5a02820 to 65422ce, May 28, 2026 08:03
@csun5285
Contributor Author

run buildall

@csun5285 csun5285 closed this May 28, 2026
@csun5285 csun5285 reopened this May 28, 2026
@hello-stephen
Contributor

TPC-H: Total hot run time: 31501 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 65422ce2763f89d61d75e28977cac42cdd67a739, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17775	4035	4057	4035
q2	q3	10700	1414	832	832
q4	4698	474	351	351
q5	7560	2253	2137	2137
q6	270	183	139	139
q7	980	759	662	662
q8	9448	1730	1562	1562
q9	5657	4990	4951	4951
q10	6478	2196	1874	1874
q11	437	280	247	247
q12	696	430	294	294
q13	18199	3435	2803	2803
q14	272	259	240	240
q15	q16	828	779	711	711
q17	939	958	997	958
q18	7029	5712	5554	5554
q19	1224	1238	1143	1143
q20	523	422	279	279
q21	5724	2595	2423	2423
q22	446	355	306	306
Total cold run time: 99883 ms
Total hot run time: 31501 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4440	4367	4332	4332
q2	q3	4558	4983	4340	4340
q4	2107	2204	1394	1394
q5	4510	4317	4631	4317
q6	270	217	166	166
q7	2155	2006	1683	1683
q8	2551	2332	2231	2231
q9	8276	7899	8094	7899
q10	4807	4737	4353	4353
q11	616	467	418	418
q12	766	769	576	576
q13	3334	3641	2929	2929
q14	299	296	262	262
q15	q16	723	739	682	682
q17	1518	1357	1305	1305
q18	7952	7489	7392	7392
q19	1198	1093	1124	1093
q20	2208	2217	1952	1952
q21	5274	4567	4423	4423
q22	522	467	403	403
Total cold run time: 58084 ms
Total hot run time: 52150 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171768 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 65422ce2763f89d61d75e28977cac42cdd67a739, data reload: false

query5	4309	672	525	525
query6	343	243	210	210
query7	4214	580	308	308
query8	323	237	225	225
query9	8861	4069	4097	4069
query10	437	350	320	320
query11	5803	2626	2259	2259
query12	190	129	132	129
query13	1299	582	436	436
query14	6138	5490	5128	5128
query14_1	4481	4481	4437	4437
query15	211	207	192	192
query16	1028	455	464	455
query17	1147	715	587	587
query18	2632	472	353	353
query19	212	210	161	161
query20	139	138	134	134
query21	227	139	117	117
query22	13703	13657	13407	13407
query23	17362	16551	16247	16247
query23_1	16265	16325	16286	16286
query24	7500	1764	1331	1331
query24_1	1332	1326	1343	1326
query25	594	498	456	456
query26	1328	344	179	179
query27	2673	603	359	359
query28	4425	2039	2027	2027
query29	1055	658	538	538
query30	308	240	202	202
query31	1137	1094	959	959
query32	97	87	78	78
query33	577	372	309	309
query34	1190	1170	671	671
query35	798	826	705	705
query36	1378	1382	1262	1262
query37	163	110	92	92
query38	3219	3143	3114	3114
query39	939	925	898	898
query39_1	879	907	877	877
query40	243	157	129	129
query41	78	80	70	70
query42	117	115	113	113
query43	341	346	305	305
query44	
query45	218	205	205	205
query46	1065	1212	716	716
query47	2325	2368	2259	2259
query48	431	430	303	303
query49	662	510	413	413
query50	1030	361	267	267
query51	4365	4358	4278	4278
query52	111	109	99	99
query53	268	283	202	202
query54	333	291	270	270
query55	98	94	88	88
query56	330	346	318	318
query57	1443	1434	1296	1296
query58	324	297	278	278
query59	1609	1712	1466	1466
query60	345	334	334	334
query61	191	203	156	156
query62	705	663	600	600
query63	245	207	210	207
query64	2374	807	643	643
query65	
query66	1692	501	369	369
query67	29808	29662	29505	29505
query68	
query69	462	344	307	307
query70	1077	1068	1013	1013
query71	304	277	273	273
query72	3004	2638	2458	2458
query73	856	747	448	448
query74	5139	4919	4834	4834
query75	2683	2620	2264	2264
query76	2263	1126	789	789
query77	408	411	338	338
query78	12500	12445	11891	11891
query79	1459	1088	779	779
query80	639	536	456	456
query81	448	288	246	246
query82	1381	157	127	127
query83	362	278	250	250
query84	265	139	112	112
query85	900	535	458	458
query86	400	340	323	323
query87	3416	3374	3295	3295
query88	3622	2735	2770	2735
query89	449	399	344	344
query90	1990	191	193	191
query91	180	167	144	144
query92	83	82	75	75
query93	1511	1452	865	865
query94	557	345	330	330
query95	686	480	349	349
query96	1062	777	352	352
query97	2733	2747	2628	2628
query98	237	234	236	234
query99	1155	1149	1032	1032
Total cold run time: 254861 ms
Total hot run time: 171768 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 50.00% (4/8) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.99% (20960/38819)
Line Coverage 37.59% (198830/528941)
Region Coverage 33.91% (155733/459205)
Branch Coverage 34.86% (67791/194471)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 50.00% (4/8) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.93% (28108/38018)
Line Coverage 57.86% (305287/527591)
Region Coverage 54.96% (254806/463624)
Branch Coverage 56.51% (110302/195199)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 5.88% (1/17) 🎉
Increment coverage report
Complete coverage report

Contributor

@github-actions github-actions Bot left a comment

Review opinion: no blocking issues found.

Critical checkpoint conclusions:

  • Goal and proof: The PR removes single replica compaction support and its table/schema/task plumbing, while cleaning affected regression cases. The code changes consistently stop accepting/propagating the removed property and remove BE execution paths. Existing unrelated tests were updated to remove the obsolete property; I did not run the suite in this runner.
  • Scope: The implementation is focused on deleting the feature and its references. The broad regression-test edits are mechanical removals of the same property.
  • Concurrency: The removed BE background thread and manual compaction path reduce concurrency surface. Remaining compaction locks and manual base/cumulative/full compaction paths keep the existing lock model; I did not find a new lock-order or shared-state hazard.
  • Lifecycle/static initialization: No new static/global lifecycle dependency was introduced. Removed single-replica compaction thread lifecycle appears fully detached from StorageEngine/OlapServer startup and shutdown.
  • Configuration: Removed BE configs and FE table property handling for the deleted feature. No new config was added.
  • Compatibility: Thrift field ids are left unused and protobuf field 22 is reserved, so old serialized fields can be ignored without reusing ids. FE replay of table property maps can still rebuild other properties; the removed property is no longer materialized or emitted by SHOW CREATE.
  • Parallel paths: Create replica, restore/recover replica, rollup/schema-change tasks, partition property updates, report handling, tablet meta, and cloud schema PB conversion were all updated consistently for the removed field.
  • Special checks: The MOW-specific rejection for enable_single_replica_compaction was removed together with the property parser, so the property is now treated as unsupported instead of as a supported-but-invalid combination.
  • Test coverage: Dedicated single-replica compaction tests are removed with the feature; remaining affected regression files were adjusted. No additional user focus points were provided.
  • Test results: Not run locally in this review environment.
  • Observability: Removed metrics and HTTP remote-compaction behavior are consistent with feature removal; remaining compaction status paths retain existing observability.
  • Transaction/persistence/data correctness: No transaction visibility, delete-bitmap, or rowset versioning behavior is added. Removing the compaction mode does not alter normal base/cumulative/full compaction correctness paths.
  • FE-BE variables: Removed field propagation is reflected across FE task construction, thrift schema, BE task handling, and tablet schema conversion.
  • Performance: Removing the single-replica scheduler and config reduces background work; I did not find a new hot-path regression in the remaining code.
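
The compatibility point above (field ids left unused or reserved so old serialized data can be ignored) can be sketched with a toy decoder. This is a hypothetical model, not the Doris protobuf or Thrift code: the field names and ids 1 and 2 are invented, and only the reserved id 22 comes from the review text.

```python
# Hypothetical schema: field ids 1 and 2 are invented for illustration.
KNOWN_FIELDS = {1: "tablet_id", 2: "schema_hash"}
# Id 22 stays reserved so that no future field can reuse it.
RESERVED_IDS = {22}


def decode(pairs):
    """Decode (field_id, value) pairs, skipping reserved or unknown ids."""
    out = {}
    for field_id, value in pairs:
        name = KNOWN_FIELDS.get(field_id)
        if name is not None:
            out[name] = value
    return out


def can_define(field_id):
    """A new field may not take a reserved id or an id already in use."""
    return field_id not in RESERVED_IDS and field_id not in KNOWN_FIELDS


# A message written by an older version that still set the removed field.
old_message = [(1, 10001), (22, True), (2, 12345)]
print(decode(old_message))
```

The old value under id 22 is dropped on read, and `can_define(22)` stays false, so a later field cannot be confused with data written before the removal.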

User focus response: No additional review focus was provided.

yiguolei previously approved these changes Jun 1, 2026
@yiguolei
Contributor

yiguolei commented Jun 1, 2026

skip check_coverage

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label Jun 1, 2026
@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

PR approved by anyone and no changes requested.

@gavinchou
Contributor

Write down the reason for deletion, and delete the corresponding doc as well.

gavinchou previously approved these changes Jun 2, 2026
Remove the single replica compaction (SRC) feature end-to-end across BE,
FE, protocol definitions, and regression tests. SRC let one replica run
compaction and others fetch the result; this drops the code path along
with its tablet property, thread pool, metrics, HTTP `remote=true` knob,
and FE plumbing.

- BE: delete `SingleReplicaCompaction` class and unit tests; drop the
  SRC thread pool, replica-info refresher, tablet hooks, schema field,
  metrics, and config (`max_single_replica_compaction_threads`,
  `update_replica_infos_interval_seconds`).
- FE: drop `enable_single_replica_compaction` property handling
  (PropertyAnalyzer, TableProperty, OlapTable, Env, InternalCatalog,
  SchemaChangeHandler, ModifyTablePropertiesOp) and the corresponding
  CreateReplicaTask / UpdateTabletMetaInfoTask fields.
- Proto/Thrift: mark the removed fields as `reserved` /
  `// deprecated` to preserve wire compatibility.
- Tests: delete SRC-only suites and strip the property from ~130
  regression-test groovy/.out files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@csun5285 csun5285 dismissed stale reviews from gavinchou and yiguolei via dfc334d June 2, 2026 08:01
@csun5285 csun5285 force-pushed the remove-single-replica-compaction branch from 65422ce to dfc334d, June 2, 2026 08:01
@csun5285
Contributor Author

csun5285 commented Jun 2, 2026

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.34% (1906/2433)
Line Coverage 64.76% (33977/52468)
Region Coverage 65.27% (17514/26833)
Branch Coverage 53.93% (9293/17230)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 57.14% (4/7) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 74.02% (28260/38178)
Line Coverage 58.03% (307765/530362)
Region Coverage 54.88% (257984/470062)
Branch Coverage 56.29% (111825/198674)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 0.25% (1/407) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

TPC-H: Total hot run time: 29656 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dfc334da97b616f9ecc7c8da41ffaa4d3ca22c15, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17773	4136	4152	4136
q2	q3	10780	1439	840	840
q4	4684	478	347	347
q5	7614	886	610	610
q6	186	173	136	136
q7	793	872	645	645
q8	9366	1561	1736	1561
q9	5766	4517	4474	4474
q10	6779	1858	1523	1523
q11	441	264	260	260
q12	627	435	293	293
q13	18125	3399	2848	2848
q14	273	260	236	236
q15	q16	800	779	710	710
q17	974	996	1013	996
q18	7062	5795	5538	5538
q19	1143	1303	1135	1135
q20	500	402	266	266
q21	5559	2866	2786	2786
q22	463	383	316	316
Total cold run time: 99708 ms
Total hot run time: 29656 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5197	4833	4834	4833
q2	q3	4947	5298	4644	4644
q4	2114	2205	1399	1399
q5	4841	4986	4689	4689
q6	240	177	127	127
q7	1825	1809	1564	1564
q8	2557	2149	2129	2129
q9	8030	7691	7479	7479
q10	4779	4672	4239	4239
q11	540	396	353	353
q12	737	749	521	521
q13	2995	3379	2833	2833
q14	280	280	262	262
q15	q16	681	696	603	603
q17	1298	1258	1260	1258
q18	7428	6787	6917	6787
q19	1115	1128	1090	1090
q20	2236	2229	1952	1952
q21	5317	4600	4448	4448
q22	528	491	433	433
Total cold run time: 57685 ms
Total hot run time: 51643 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 170047 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dfc334da97b616f9ecc7c8da41ffaa4d3ca22c15, data reload: false

query5	4342	640	479	479
query6	456	202	184	184
query7	4900	549	291	291
query8	359	221	214	214
query9	8790	4059	4045	4045
query10	457	319	263	263
query11	5865	2351	2163	2163
query12	159	109	102	102
query13	1271	589	438	438
query14	6411	5404	5058	5058
query14_1	4422	4415	4405	4405
query15	217	197	178	178
query16	1010	469	478	469
query17	1154	730	613	613
query18	2450	503	372	372
query19	201	191	143	143
query20	108	106	102	102
query21	210	135	118	118
query22	13693	13538	13511	13511
query23	17443	16472	16221	16221
query23_1	16322	16322	16381	16322
query24	7548	1771	1318	1318
query24_1	1304	1311	1287	1287
query25	557	436	382	382
query26	1328	350	187	187
query27	2661	579	344	344
query28	4437	2039	2036	2036
query29	1076	603	481	481
query30	303	232	201	201
query31	1112	1076	969	969
query32	100	62	60	60
query33	521	318	250	250
query34	1167	1124	670	670
query35	768	770	693	693
query36	1367	1364	1223	1223
query37	152	104	95	95
query38	3215	3150	3095	3095
query39	933	923	904	904
query39_1	873	878	873	873
query40	225	125	104	104
query41	66	65	61	61
query42	94	95	94	94
query43	327	338	295	295
query44	
query45	195	186	182	182
query46	1076	1190	753	753
query47	2355	2396	2186	2186
query48	387	408	292	292
query49	625	493	359	359
query50	951	360	256	256
query51	4486	4308	4211	4211
query52	90	90	78	78
query53	255	276	194	194
query54	264	222	215	215
query55	85	82	72	72
query56	275	233	251	233
query57	1451	1418	1300	1300
query58	249	228	229	228
query59	1661	1679	1492	1492
query60	302	268	244	244
query61	182	181	177	177
query62	696	656	594	594
query63	243	190	191	190
query64	2617	852	669	669
query65	
query66	1861	477	362	362
query67	29840	29748	29655	29655
query68	
query69	481	304	264	264
query70	994	962	916	916
query71	298	224	196	196
query72	3024	2783	2430	2430
query73	860	831	430	430
query74	5141	4991	4767	4767
query75	2685	2618	2224	2224
query76	2325	1164	763	763
query77	354	394	293	293
query78	12550	12386	11876	11876
query79	1455	1025	801	801
query80	824	465	395	395
query81	488	276	240	240
query82	575	157	121	121
query83	351	277	251	251
query84	267	143	113	113
query85	957	525	445	445
query86	407	304	298	298
query87	3392	3331	3143	3143
query88	3632	2760	2737	2737
query89	436	378	327	327
query90	1877	189	188	188
query91	175	165	138	138
query92	66	65	59	59
query93	1572	1531	830	830
query94	620	358	332	332
query95	679	470	350	350
query96	1079	783	337	337
query97	2675	2713	2559	2559
query98	210	208	201	201
query99	1155	1177	1053	1053
Total cold run time: 252414 ms
Total hot run time: 170047 ms

@hello-stephen
Contributor

/review

Contributor

@github-actions github-actions Bot left a comment

I found one behavior regression that should be fixed before merging.

Critical checkpoint conclusions:

  • Goal/test: The PR removes single-replica compaction and associated table/schema/protocol plumbing. The removal is broadly implemented and many obsolete regression cases/properties are updated, but the manual HTTP compaction endpoint still accepts the old remote request semantics silently, and there is no replacement negative test for that.
  • Scope/focus: Most changes are direct removals and are focused on the feature deletion.
  • Concurrency: No new compaction/thread lifecycle path was added; removed background thread and thread pool cleanup are consistent. No new lock-order issue found.
  • Lifecycle/static initialization: No new static/global lifecycle dependency found.
  • Configuration: Removed BE configs are no longer consumed; Doris BE config loading ignores unknown file keys, so I did not find a startup blocker from stale config entries.
  • Compatibility/storage/protocol: Thrift/protobuf field numbers are left deprecated/reserved and old values are ignored, which is consistent with removing the feature.
  • Parallel paths: FE create/alter/create-replica/update-meta paths and BE tablet schema/task paths were checked; no other active single-replica-compaction path found.
  • Tests: Obsolete positive tests were removed and expected outputs updated, but the endpoint compatibility/negative behavior for remote=true is missing.
  • Observability/performance: No new logging/metric/performance issue found beyond the endpoint behavior below.
  • Transaction/persistence/data correctness: No transaction or visible-version correctness issue found in the reviewed removal. Existing persisted table/proto metadata for the removed flag is ignored.
  • Other issues: See inline comment.
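
The endpoint concern above (a request that still passes `remote=true` being accepted silently) can be sketched as a small parameter check. This is a hypothetical illustration in Python, not the code in compaction_action.cpp and not necessarily the fix the inline comment asks for: the function name, status codes, and message text are assumptions.

```python
def check_compaction_params(params):
    """Return (status, message) for a hypothetical manual compaction request.

    Assumption: `remote` is the removed knob. A request that still sets it to
    true is rejected instead of being run silently as a local compaction.
    """
    remote = params.get("remote", "false").strip().lower()
    if remote == "true":
        return 400, "remote compaction is no longer supported"
    return 200, "ok"


print(check_compaction_params({"tablet_id": "10001", "remote": "true"}))
print(check_compaction_params({"tablet_id": "10001"}))
```

A negative test for the removed behavior would then assert the rejection path, which is the coverage the review notes as missing.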

User focus points: No additional user-provided review focus was present.

Comment thread be/src/service/http/action/compaction_action.cpp
@eldenmoon eldenmoon merged commit e9206af into apache:master Jun 3, 2026
34 of 35 checks passed
csun5285 added a commit to csun5285/doris that referenced this pull request Jun 3, 2026
Remove the single replica compaction (SRC) feature end-to-end across BE,
FE and regression tests.

Doc: apache/doris-website#3870

 ### Why remove it

The main reason is **correctness risk in peer selection**. A follower
replica had to pick a peer holding a "proper" version
(`_find_rowset_to_fetch`) and fetch its compacted result, based on
replica info that was only refreshed periodically. Because replicas
progress through versions independently and this "leader" selection ran
against a stale, time-sensitive view of the cluster, the choice of which
peer to fetch from, and which version, was racy and could select a peer
whose state no longer matched, leading to subtle inconsistencies.
Labels

approved Indicates a PR has been approved by one committer. reviewed
