Merged
6 changes: 0 additions & 6 deletions docs/admin-manual/config/be-config.md
Original file line number Diff line number Diff line change
@@ -629,12 +629,6 @@ BaseCompaction:546859:
* Description: select the time interval in seconds for rowset to be compacted.
* Default value: 86400

#### `max_single_replica_compaction_threads`

* Type: int32
* Description: The maximum number of threads in the single replica compaction thread pool. -1 means one thread per disk.
* Default value: -1

#### `update_replica_infos_interval_seconds`

* Description: Minimum interval, in seconds, between updates of peer replica information
1 change: 0 additions & 1 deletion docs/admin-manual/data-admin/ccr/feature.md
@@ -98,7 +98,6 @@ Renaming is not supported on either the upstream or the downstream. If you do re
| auto_bucket | Yes | - | SQL | |
| group_commit series | Yes | - | SQL | |
| enable_unique_key_merge_on_write | Yes | - | SQL | |
| enable_single_replica_compaction | Yes | - | SQL | |
| disable_auto_compaction | Yes | - | SQL | |
| compaction_policy | Yes | - | SQL | |
| time_series_compaction series | Yes | - | SQL | |
12 changes: 0 additions & 12 deletions docs/admin-manual/trouble-shooting/compaction.md
@@ -51,17 +51,6 @@ Situations where segment compaction is not recommended:

Refer to this [link](https://github.com/apache/doris/pull/12866) for more information about implementation and test results.

## Single replica compaction

By default, each replica of a tablet runs compaction independently, and each one consumes CPU and IO resources. When single replica compaction is enabled, only one replica performs the compaction, and the other replicas then pull the compacted files from it. The merge CPU cost is therefore paid once instead of N times, where N is the number of replicas.

Single replica compaction is specified in the table's PROPERTIES via the parameter `enable_single_replica_compaction`, which is false by default (disabled). To enable it, set the parameter to true.

This parameter can be specified when creating the table or modified later using:
```sql
ALTER TABLE table_name SET("enable_single_replica_compaction" = "true");
```

## Compaction strategy

The compaction strategy determines when and which small files are merged into larger files. Doris currently offers two compaction strategies, specified by the `compaction_policy` parameter in the table properties.
@@ -98,4 +87,3 @@ Compaction runs in the background and consumes CPU and IO resources. The resourc
The number of concurrent compaction threads is configured in the BE configuration file, including the following parameters:
- `max_base_compaction_threads`: Number of base compaction threads, default is 4.
- `max_cumu_compaction_threads`: Number of cumulative compaction threads, default is -1, which means 1 thread per disk.
- `max_single_replica_compaction_threads`: Number of threads for fetching data files during single replica compaction, default is 10.
1 change: 0 additions & 1 deletion docs/connection-integration/data-integration/beats.md
@@ -311,7 +311,6 @@ DISTRIBUTED BY RANDOM BUCKETS 10
PROPERTIES (
"replication_num" = "1",
"compaction_policy" = "time_series",
"enable_single_replica_compaction" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.time_unit" = "DAY",
1 change: 0 additions & 1 deletion docs/connection-integration/data-integration/fluentbit.md
@@ -345,7 +345,6 @@ DISTRIBUTED BY RANDOM BUCKETS 10
PROPERTIES (
"replication_num" = "1",
"compaction_policy" = "time_series",
"enable_single_replica_compaction" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.time_unit" = "DAY",
1 change: 0 additions & 1 deletion docs/connection-integration/data-integration/logstash.md
@@ -399,7 +399,6 @@ DISTRIBUTED BY RANDOM BUCKETS 10
PROPERTIES (
"replication_num" = "1",
"compaction_policy" = "time_series",
"enable_single_replica_compaction" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.time_unit" = "DAY",
Original file line number Diff line number Diff line change
@@ -322,7 +322,6 @@ PROPERTIES (
"replication_num" = "1",
"inverted_index_storage_format" = "v2",
"compaction_policy" = "time_series",
"enable_single_replica_compaction" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.time_unit" = "DAY",
1 change: 0 additions & 1 deletion docs/connection-integration/data-integration/vector.md
@@ -341,7 +341,6 @@ PROPERTIES (
"replication_num" = "1",
"inverted_index_storage_format" = "v2",
"compaction_policy" = "time_series",
"enable_single_replica_compaction" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.time_unit" = "DAY",
3 changes: 0 additions & 3 deletions docs/key-features/data-compaction.mdx
@@ -55,8 +55,6 @@ Apache Doris compaction scores every tablet by version pressure, picks rowsets p
4. **Promote to base.** When a cumulative output grows past `compaction_promotion_size_mbytes` (1 GB by default), it's eligible for base compaction the next round.
5. **Replace and clean up.** The output rowset replaces its inputs in tablet metadata. Old files stick around for a grace window in case a query is still reading them, then get deleted.

If you have multiple replicas, you can let one replica do the merge and have the others copy the result over the network (`enable_single_replica_compaction`). That cuts CPU usage roughly in proportion to the replica count.

## Quick start {#quick-start}

```sql
@@ -95,7 +93,6 @@ Apache Doris compaction runs on every table by default; the real choice is which
- Any table that takes frequent loads. The defaults already work; the question is just whether to switch policies.
- Append-only logs and metrics. `time_series` reduces write amplification because each rowset participates in compaction once, not repeatedly across size tiers.
- Wide tables (dozens of columns or more). Keep `enable_vertical_compaction = true` so big merges don't blow up memory.
- Multi-replica clusters where compaction CPU is a bottleneck. Turn on `enable_single_replica_compaction`.

**Not a good fit**

Original file line number Diff line number Diff line change
@@ -203,8 +203,7 @@ CREATE TABLE `test_array_index` (
"is_being_synced" = "false",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
"disable_auto_compaction" = "false"
);
-- Query example
SELECT id, inventors FROM test_array_index WHERE array_contains(inventors, 'x') ORDER BY id;
Original file line number Diff line number Diff line change
@@ -361,7 +361,6 @@ The functionality of creating synchronized materialized views with rollup is lim
| enable_unique_key_merge_on_write | Whether the Unique table uses the Merge-on-Write implementation. This property was default disabled before version 2.1 and default enabled from version 2.1 onwards. |
| light_schema_change | Whether to use the Light Schema Change optimization. If set to `true`, adding and dropping value columns can be completed faster and synchronously. This feature is enabled by default in versions 2.0.0 and later. |
| disable_auto_compaction | Whether to disable automatic compaction for this table. If this property is set to `true`, the background automatic compaction process will skip all tablets of this table. |
| enable_single_replica_compaction | Whether to enable single-replica compaction for this table. If this property is set to `true`, only one replica of all replicas of the table's tablets will perform the actual compaction action, and other replicas will pull the compacted rowset from that replica. |
| enable_duplicate_without_keys_by_default | When set to `true`, if no Unique, Aggregate, or Duplicate is specified when creating a table, a Duplicate model table without sort columns and prefix indexes will be created by default. |
| skip_write_index_on_load | Whether to enable not writing indexes during data import for this table. If this property is set to `true`, indexes will not be written during data import (currently only effective for inverted indexes), but will be delayed until compaction. This can avoid the CPU and IO resource consumption of writing indexes repeatedly during the first write and compaction, improving the performance of high-throughput imports. |
| compaction_policy | Configures the compaction merge policy for this table. Only `time_series` and `size_based` are supported. `time_series`: when the disk volume of rowsets accumulates to a certain size, version merging is performed. The merged rowset is promoted directly to the base compaction phase, which effectively reduces the write amplification of compaction under continuous imports. This policy uses parameters prefixed with `time_series_compaction` to adjust how compaction runs. |
3 changes: 1 addition & 2 deletions docs/table-design/data-partitioning/basic-concepts.mdx
@@ -355,8 +355,7 @@ A reasonable partitioning and bucketing design needs to balance query performanc
| | "is_being_synced" = "false", |
| | "storage_format" = "V2", |
| | "light_schema_change" = "true", |
| | "disable_auto_compaction" = "false", |
| | "enable_single_replica_compaction" = "false" |
| | "disable_auto_compaction" = "false" |
| | ); |
+-------------------+---------------------------------------------------------------------------------------------------------+
```
Original file line number Diff line number Diff line change
@@ -634,12 +634,6 @@ BaseCompaction:546859:
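* Description: The time interval, in seconds, for selecting rowsets to compact.
* Default value: 86400

#### `max_single_replica_compaction_threads`

* Type: int32
* Description: The maximum number of threads in the Single Replica Compaction thread pool. -1 means one thread per disk.
* Default value: -1

#### `update_replica_infos_interval_seconds`

* Type: int32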
Original file line number Diff line number Diff line change
@@ -99,7 +99,6 @@ CCR tasks do not synchronize operations that modify database properties.
| auto_bucket | Yes | - | SQL | |
| group_commit series | Yes | - | SQL | |
| enable_unique_key_merge_on_write | Yes | - | SQL | |
| enable_single_replica_compaction | Yes | - | SQL | |
| disable_auto_compaction | Yes | - | SQL | |
| compaction_policy | Yes | - | SQL | |
| time_series_compaction series | Yes | - | SQL | |
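Original file line number Diff line number Diff line change
@@ -66,14 +66,6 @@ Segment compaction has the following characteristics:
Refer to [this link](https://github.com/apache/doris/pull/12866) for the implementation and test results of segment compaction.


## Single replica compaction

By default, compaction for multiple replicas runs independently, and each replica consumes CPU and IO resources. When single replica compaction is enabled, one replica performs the compaction and the other replicas pull the compacted files from it, so the CPU cost is paid only once, saving the CPU cost of the other N - 1 replicas (N is the number of replicas).

Single replica compaction is specified in the table's PROPERTIES through the parameter `enable_single_replica_compaction`. It defaults to false (disabled); set it to true to enable it.

This parameter can be specified when creating the table, or modified with `ALTER TABLE table_name SET("enable_single_replica_compaction" = "true")`.

## Compaction strategy

The compaction strategy determines when and which small files are merged into larger files. Doris currently offers two compaction strategies, specified by the `compaction_policy` parameter in the table properties.
@@ -109,5 +101,4 @@ Compaction runs in the background and consumes CPU and IO resources, which can be controlled through co
The number of concurrent compaction threads is configured in the BE configuration file, including the following parameters:
- `max_base_compaction_threads`: Number of base compaction threads, default is 4.
- `max_cumu_compaction_threads`: Number of cumulative compaction threads, default is -1, which means 1 thread per disk.
- `max_single_replica_compaction_threads`: Number of threads for fetching data files during single replica compaction, default is 10.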

Original file line number Diff line number Diff line change
@@ -311,7 +311,6 @@ DISTRIBUTED BY RANDOM BUCKETS 10
PROPERTIES (
"replication_num" = "1",
"compaction_policy" = "time_series",
"enable_single_replica_compaction" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.time_unit" = "DAY",
Original file line number Diff line number Diff line change
@@ -345,7 +345,6 @@ DISTRIBUTED BY RANDOM BUCKETS 10
PROPERTIES (
"replication_num" = "1",
"compaction_policy" = "time_series",
"enable_single_replica_compaction" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.time_unit" = "DAY",
Original file line number Diff line number Diff line change
@@ -399,7 +399,6 @@ DISTRIBUTED BY RANDOM BUCKETS 10
PROPERTIES (
"replication_num" = "1",
"compaction_policy" = "time_series",
"enable_single_replica_compaction" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.time_unit" = "DAY",
Original file line number Diff line number Diff line change
@@ -322,7 +322,6 @@ PROPERTIES (
"replication_num" = "1",
"inverted_index_storage_format" = "v2",
"compaction_policy" = "time_series",
"enable_single_replica_compaction" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.time_unit" = "DAY",
Original file line number Diff line number Diff line change
@@ -341,7 +341,6 @@ PROPERTIES (
"replication_num" = "1",
"inverted_index_storage_format" = "v2",
"compaction_policy" = "time_series",
"enable_single_replica_compaction" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.time_unit" = "DAY",
Original file line number Diff line number Diff line change
@@ -203,8 +203,7 @@ CREATE TABLE `test_array_index` (
"is_being_synced" = "false",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
"disable_auto_compaction" = "false"
);
-- Query example
SELECT id, inventors FROM test_array_index WHERE array_contains(inventors, 'x') ORDER BY id;
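Original file line number Diff line number Diff line change
@@ -362,7 +362,6 @@ The synchronized materialized views that rollup can create are limited in functionality. It is no longer recommended.
| enable_unique_key_merge_on_write | Whether the Unique table uses the Merge-on-Write implementation. This property was disabled by default before version 2.1 and is enabled by default from version 2.1 onwards. |
| light_schema_change | Whether to use the Light Schema Change optimization. If set to `true`, adding and dropping value columns can be completed faster and synchronously. This feature is enabled by default in version 2.0.0 and later. |
| disable_auto_compaction | Whether to disable automatic compaction for this table. If this property is set to `true`, the background automatic compaction process skips all tablets of this table. |
| enable_single_replica_compaction | Whether to enable single-replica compaction for this table. If this property is set to `true`, only one of the replicas of each tablet performs the actual compaction, and the other replicas pull the compacted rowset from that replica. |
| enable_duplicate_without_keys_by_default | When set to `true`, if no Unique, Aggregate, or Duplicate model is specified when creating a table, a Duplicate model table without sort columns and prefix indexes is created by default. |
| skip_write_index_on_load | Whether to skip writing indexes during data import for this table. If this property is set to `true`, indexes are not written during data import (currently only effective for inverted indexes) and are instead written later during compaction. This avoids spending CPU and IO on writing indexes twice, once at first write and once at compaction, and improves the performance of high-throughput imports. |
| compaction_policy | Configures the compaction merge policy for this table. Only `time_series` and `size_based` are supported. `time_series`: when the disk volume of rowsets accumulates to a certain size, version merging is performed. The merged rowset is promoted directly to the base compaction phase, which effectively reduces the write amplification of compaction under continuous imports in time-series scenarios. This policy uses parameters prefixed with `time_series_compaction` to adjust how compaction runs. |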
Original file line number Diff line number Diff line change
@@ -355,8 +355,7 @@ PROPERTIES
| | "is_being_synced" = "false", |
| | "storage_format" = "V2", |
| | "light_schema_change" = "true", |
| | "disable_auto_compaction" = "false", |
| | "enable_single_replica_compaction" = "false" |
| | "disable_auto_compaction" = "false" |
| | ); |
+-------------------+---------------------------------------------------------------------------------------------------------+
```
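
Reviewer note: every hunk above deletes a mention of `enable_single_replica_compaction` or `max_single_replica_compaction_threads`. To confirm that no reference was missed, a check along the following lines could be run against a checkout of the docs. This is only a sketch and not part of the change itself: the helper name `find_leftover_references`, the default file suffixes, and the `docs` root used in the usage note are assumptions made for illustration.

```python
from pathlib import Path

# Identifiers whose documentation this change removes (taken from the hunks above).
REMOVED_KEYS = (
    "enable_single_replica_compaction",
    "max_single_replica_compaction_threads",
)


def find_leftover_references(root, keys=REMOVED_KEYS, suffixes=(".md", ".mdx")):
    """Return (path, line_number, key) for each remaining mention under root.

    Hypothetical helper: only files whose suffix is in `suffixes` are scanned,
    and matching is a plain substring test on each line.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            for key in keys:
                if key in line:
                    hits.append((str(path), number, key))
    return hits
```

Calling `find_leftover_references("docs")` from the repository root would be expected to return an empty list once the removal is complete; any tuple it returns points at a file and line that still mentions one of the removed names.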