Add fault-tolerance

ilyario · ilyario · commit ba1210c21502 · 2026-03-06T13:08:20.000+08:00
diff --git a/docs/en/connectors.md b/docs/en/connectors.md
@@ -8,6 +8,7 @@ DataFlow Operator supports various connectors for data sources and sinks. Each c
 |-----------|--------|------|----------|
 | Kafka | ✅ | ✅ | Consumer groups, TLS, SASL, Avro, Schema Registry |
 | PostgreSQL | ✅ | ✅ | SQL queries, batch inserts, auto-create tables, UPSERT mode |
+| postgresFull | ✅ | ✅ | Full DB sync: schema (tables, views, indexes, functions) + data |
 | Trino | ✅ | ✅ | SQL queries, Keycloak OAuth2 authentication, batch inserts |
 | ClickHouse | ✅ | ✅ | Polling, batch inserts, auto-create MergeTree tables |
 | Nessie | ✅ | ✅ | Iceberg tables via Nessie catalog, branches, Basic/Bearer auth, polling, batch appends |
@@ -54,6 +55,9 @@ All connectors support secret references for the following fields:
 - `connectionStringSecretRef` - connection string
 - `tableSecretRef` - table name
 
+#### postgresFull
+- `connectionStringSecretRef` - connection string (source and sink)
+
 #### ClickHouse
 - `connectionStringSecretRef` - connection string
 - `tableSecretRef` - table name
@@ -454,6 +458,66 @@ sink:
 - **UPSERT Mode**: Updates existing records on conflict (PRIMARY KEY or `conflictKey`)
 - **Soft Delete**: When `softDeleteColumn` is set and message has `metadata.operation=delete`, performs `UPDATE ... SET deleted_at = NOW()` instead of physical DELETE
 
+## postgresFull
+
+The postgresFull connector performs **full database synchronization** from a source PostgreSQL to a target PostgreSQL. It replicates schema (tables, views, materialized views, indexes, sequences, triggers, functions) and optionally data.
+
+### Features
+
+- **Schema sync**: Schemas, tables, views, materialized views, indexes, sequences, triggers, functions
+- **Data sync**: Optional data copy (schema_only or schema_and_data)
+- **ExcludeObjects**: Filter out object types (view, matview, function, trigger, index, sequence)
+- **Databases filter**: Sync specific objects in `schema.object` format
+- **ConnectionStringSecretRef**: Use Kubernetes secrets for credentials
+
+### Example
+
+```yaml
+apiVersion: dataflow.dataflow.io/v1
+kind: DataFlow
+metadata:
+  name: postgres-full-sync
+spec:
+  source:
+    type: postgresFull
+    postgresFull:
+      connectionString: "postgres://user:pass@source-pg:5432/db?sslmode=disable"
+      syncMode: full
+      dataMode: schema_and_data   # or schema_only
+      # databases: ["public.users", "analytics.mv_report"]  # optional filter
+      # excludeObjects: ["view", "function"]               # optional exclude
+  sink:
+    type: postgresFull
+    postgresFull:
+      connectionString: "postgres://user:pass@target-pg:5432/db?sslmode=disable"
+```
+
+### Source Options
+
+| Option | Description |
+|--------|-------------|
+| connectionString | PostgreSQL connection string (required, or use connectionStringSecretRef) |
+| syncMode | `full` (default) or `incremental` |
+| dataMode | `schema_only` or `schema_and_data` (default) |
+| databases | List of `schema.object` to sync; empty = all |
+| excludeObjects | Exclude types: view, matview, function, trigger, index, sequence |
+| syncUsers | Sync roles (CREATE ROLE, without passwords) |
+| syncGrants | Sync grants (GRANT on objects) |
+
+### Sink Options
+
+| Option | Description |
+|--------|-------------|
+| connectionString | Target PostgreSQL connection string |
+| dropTarget | Drop objects on target before applying (use with caution) |
+
+### Requirements
+
+- PostgreSQL 12+ on source and target
+- Source: SELECT on pg_catalog, USAGE on schemas, SELECT on tables
+- Target: CREATE, INSERT, and DDL privileges
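+- For data sync into tables with foreign keys: the connector sets `session_replication_role = replica` during inserts, which keeps FK triggers from firing. Changing this parameter normally requires superuser privileges on the target.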
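+
+As a further illustration, the options above can be combined for a schema-only sync of selected objects. This is a sketch that uses only the options listed in the tables above; the resource name, object names, and connection strings are placeholders, and the boolean form of `syncUsers`, `syncGrants`, and `dropTarget` is an assumption:
+
+```yaml
+apiVersion: dataflow.dataflow.io/v1
+kind: DataFlow
+metadata:
+  name: postgres-schema-only-sync   # placeholder name
+spec:
+  source:
+    type: postgresFull
+    postgresFull:
+      connectionString: "postgres://user:pass@source-pg:5432/db?sslmode=disable"
+      dataMode: schema_only                          # copy schema objects only, no rows
+      databases: ["public.users", "public.orders"]   # placeholder objects
+      excludeObjects: ["trigger", "function"]
+      syncUsers: false    # assumed boolean
+      syncGrants: false   # assumed boolean
+  sink:
+    type: postgresFull
+    postgresFull:
+      connectionString: "postgres://user:pass@target-pg:5432/db?sslmode=disable"
+      dropTarget: false   # assumed boolean; true would drop target objects first
+```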
+
 ## ClickHouse
 
 The ClickHouse connector supports reading from and writing to ClickHouse tables. It supports polling for incremental reads, custom SQL queries, batch inserts, and auto-creation of MergeTree tables.
diff --git a/docs/en/fault-tolerance.md b/docs/en/fault-tolerance.md
@@ -0,0 +1,122 @@
+# Fault Tolerance and Data Consistency
+
+DataFlow Operator processes messages with **at-least-once** delivery semantics. When the processor pod crashes or restarts, some messages may be re-read and written again. This document explains the behavior, risks of data desynchronization, and how to configure idempotent sinks to prevent duplicates.
+
+## Delivery Semantics
+
+- **At-least-once**: Each message is delivered at least once. Duplicates are possible on processor restart or crash.
+- **Exactly-once**: Not supported natively. Use idempotent sinks to achieve effectively-once semantics.
+
+## Source Behavior on Restart
+
+| Source | State storage | On restart |
+|--------|---------------|------------|
+| **Kafka** | Consumer group (Kafka) | Resumes from last committed offset. No duplicates if offset was committed after sink write. |
+| **PostgreSQL** | In-memory (lastReadChangeTime) | State lost. Re-reads from beginning. Duplicates or gaps possible. |
+| **ClickHouse** | In-memory (lastReadID, lastReadTime) | State lost. Re-reads from beginning. Duplicates possible. |
+| **Trino** | In-memory (lastReadID) | State lost. Re-reads from beginning. Duplicates possible. |
+
+### Kafka Source
+
+The Kafka consumer commits offset **only after** the message is successfully written to the sink (via `msg.Ack()`). If the processor crashes:
+
+- **Before sink write**: Offset not committed. On restart, message is re-read. No duplicate in sink.
+- **After sink write, before Ack**: Data may be in sink, offset not committed. On restart, re-read → duplicate in sink.
+- **After Ack**: Offset committed. On restart, resume from next message. No duplicate.
+
+### Polling Sources (PostgreSQL, ClickHouse, Trino)
+
+Read position (lastReadID, lastReadChangeTime) is stored **only in memory**. On pod crash:
+
+- State is lost.
+- On restart, the source re-reads from the beginning (or from a wrong position).
+- **Duplicates** or **gaps** are possible depending on when the crash occurred.
+
+!!! warning "Idempotent sink required"
+    For polling sources, always configure an **idempotent sink** (UPSERT, ReplacingMergeTree) to handle duplicates safely.
+
+## Batch Sink Behavior
+
+PostgreSQL, ClickHouse, and Trino sinks write in batches. The flow is:
+
+1. Accumulate messages in batch
+2. Execute `Commit` (transaction)
+3. Call `Ack()` for each message (commits Kafka offset, if applicable)
+
+If the processor crashes **between Commit and the last Ack**:
+
+- Data is already in the sink
+- Kafka offset may not be committed
+- On restart: re-read from Kafka → **duplicate writes to sink**
+
+!!! tip "Reduce duplicate window"
+    Use a smaller `batchSize` to reduce the number of messages at risk of duplication on crash.
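+
+A sketch of a sink tuned for a smaller duplicate window. The placement of `batchSize` and `batchFlushIntervalSeconds` under the connector block and the values shown are assumptions; check the connector reference for the exact schema:
+
+```yaml
+sink:
+  type: postgresql
+  postgresql:
+    connectionString: "postgres://..."
+    table: output_table
+    batchSize: 100                 # assumed placement; fewer messages per Commit
+    batchFlushIntervalSeconds: 1   # assumed placement; flush partial batches sooner
+    upsertMode: true               # replayed rows update existing rows instead of duplicating
+```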
+
+## Idempotent Sink Configuration
+
+### PostgreSQL Sink
+
+Enable UPSERT mode so that duplicate inserts update existing rows instead of failing:
+
+```yaml
+sink:
+  type: postgresql
+  postgresql:
+    connectionString: "postgres://..."
+    table: output_table
+    upsertMode: true
+    conflictKey: ["id"]  # Optional; defaults to PRIMARY KEY
+```
+
+Requires the table to have a PRIMARY KEY or UNIQUE constraint on the conflict columns.
+
+### ClickHouse Sink
+
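+Use the `ReplacingMergeTree` engine, which deduplicates rows that share the same `ORDER BY` key and keeps the row with the highest value in the version column. Deduplication happens during background merges, so queries can still see duplicates until a merge runs unless they use `FINAL`: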
+
+```sql
+CREATE TABLE output_table (
+  id UInt64,
+  data String,
+  created_at DateTime DEFAULT now()
+) ENGINE = ReplacingMergeTree(created_at)
+ORDER BY id;
+```
+
+Alternatively, set `autoCreateTable: true` and `rawMode: false` to let the connector create the table and infer column types. Auto-created tables use the plain MergeTree engine, so when duplicates matter, create the table manually with `ReplacingMergeTree(version_column)` and `ORDER BY` on the deduplication key.
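+
+A sink pointing at the manually created table might look like the sketch below. The `clickhouse` type name, the connection string format, and the field placement are assumptions modeled on the PostgreSQL example:
+
+```yaml
+sink:
+  type: clickhouse            # assumed type name
+  clickhouse:
+    connectionString: "clickhouse://..."   # placeholder
+    table: output_table       # the ReplacingMergeTree table from the SQL above
+    autoCreateTable: false    # table already exists with the desired engine
+```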
+
+### Kafka Sink
+
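+The Kafka producer uses `RequiredAcks = WaitForAll` and `Producer.Idempotent = true` for durability and to prevent duplicate messages on retry. This covers duplicates caused by producer retries, but not duplicates caused by the processor re-reading source messages after a restart. Downstream consumers should therefore still handle potential duplicates (e.g., by idempotent processing or deduplication by key) to get effectively-once results end to end.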
+
+## Best Practices
+
+1. **Use idempotent sinks** for PostgreSQL (UPSERT) and ClickHouse (ReplacingMergeTree) when using polling sources or when duplicates are possible.
+2. **Kafka source**: Consumer group stores offset; at-least-once is preserved. Idempotent sink recommended for batch sinks.
+3. **batchSize**: Smaller batches reduce the duplicate window on crash. Balance with throughput.
+4. **batchFlushIntervalSeconds**: Shorter intervals flush more frequently, reducing in-flight data at risk.
+5. **Error sink**: Configure `spec.errors` to capture failed messages for replay or analysis.
+
+## Graceful Shutdown
+
+On SIGTERM (e.g., pod eviction, node drain):
+
+1. The processor receives the signal and cancels the context.
+2. Sinks flush in-flight batches before exiting.
+3. `PreStop: sleep 5` gives time for the load balancer to stop routing traffic.
+
+Ensure `terminationGracePeriodSeconds` is sufficient for large batches to flush (default: 600 seconds).
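+
+In plain Kubernetes terms, these settings correspond to the Pod spec fields sketched below. Whether and where the operator exposes them on the DataFlow resource is not covered here, and the container name is a placeholder:
+
+```yaml
+# Fragment of a Pod spec (standard Kubernetes fields)
+spec:
+  terminationGracePeriodSeconds: 600   # total shutdown budget, including the preStop hook, before SIGKILL
+  containers:
+    - name: processor                  # placeholder container name
+      lifecycle:
+        preStop:
+          exec:
+            command: ["sleep", "5"]    # runs before SIGTERM is sent to the container
+```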
+
+## Checkpoint Persistence (Future)
+
+Persisting source checkpoint (lastReadID, lastReadChangeTime) to external storage (ConfigMap or sink table) would allow polling sources to resume from the last committed position after a processor restart, reducing duplicates. This is planned for a future release. Until then, use idempotent sinks to handle duplicates safely.
+
+## Summary Checklist
+
+| Scenario | Recommendation |
+|----------|-----------------|
+| PostgreSQL sink | Enable `upsertMode: true` with PRIMARY KEY or `conflictKey` |
+| ClickHouse sink | Use `ReplacingMergeTree` with `ORDER BY` on deduplication key |
+| Kafka source | Consumer group persists offset; idempotent sink recommended |
+| Polling sources | **Always** use idempotent sink; state is lost on crash |
+| batchSize | Consider smaller values to reduce duplicate window |
diff --git a/docs/en/index.md b/docs/en/index.md
@@ -132,6 +132,7 @@ See [Metrics](metrics.md) for more details.
 - [Transformations](transformations.md) — message transformations
 - [Examples](examples.md) — practical examples
 - [Errors](errors.md) — error handling and error sink
+- [Fault Tolerance](fault-tolerance.md) — at-least-once semantics, idempotent sinks, data consistency
 - [Metrics](metrics.md) — Prometheus metrics
 - [Development](development.md) — developer guide
 
diff --git a/docs/ru/connectors.md b/docs/ru/connectors.md
@@ -8,6 +8,7 @@ DataFlow Operator поддерживает различные коннектор
 |-----------|----------|----------|-------------|
 | Kafka | ✅ | ✅ | Consumer groups, TLS, SASL, Avro, Schema Registry |
 | PostgreSQL | ✅ | ✅ | SQL запросы, батч-вставки, автосоздание таблиц, UPSERT режим |
+| postgresFull | ✅ | ✅ | Полная синхронизация БД: схема (таблицы, view, индексы, функции) + данные |
 | Trino | ✅ | ✅ | SQL запросы, аутентификация Keycloak OAuth2, батч-вставки |
 | ClickHouse | ✅ | ✅ | Опрос таблиц, батч-вставки, автосоздание MergeTree таблиц |
 | Nessie | ✅ | ✅ | Таблицы Iceberg через каталог Nessie, ветки, Basic/Bearer auth, опрос, батч-дозапись |
@@ -372,6 +373,66 @@ sink:
 
 **Важно:** Для работы UPSERT таблица должна иметь PRIMARY KEY или UNIQUE constraint на указанном `conflictKey`.
 
+## postgresFull
+
+Коннектор postgresFull выполняет **полную синхронизацию базы данных** из PostgreSQL-источника в PostgreSQL-приёмник. Реплицирует схему (таблицы, представления, материализованные представления, индексы, последовательности, триггеры, функции) и опционально данные.
+
+### Особенности
+
+- **Синхронизация схемы**: схемы, таблицы, views, materialized views, индексы, последовательности, триггеры, функции
+- **Синхронизация данных**: опционально (schema_only или schema_and_data)
+- **ExcludeObjects**: исключение типов объектов (view, matview, function, trigger, index, sequence)
+- **Фильтр databases**: синхронизация конкретных объектов в формате `schema.object`
+- **ConnectionStringSecretRef**: использование Kubernetes secrets для учётных данных
+
+### Пример
+
+```yaml
+apiVersion: dataflow.dataflow.io/v1
+kind: DataFlow
+metadata:
+  name: postgres-full-sync
+spec:
+  source:
+    type: postgresFull
+    postgresFull:
+      connectionString: "postgres://user:pass@source-pg:5432/db?sslmode=disable"
+      syncMode: full
+      dataMode: schema_and_data   # или schema_only
+      # databases: ["public.users", "analytics.mv_report"]  # опциональный фильтр
+      # excludeObjects: ["view", "function"]                 # опциональное исключение
+  sink:
+    type: postgresFull
+    postgresFull:
+      connectionString: "postgres://user:pass@target-pg:5432/db?sslmode=disable"
+```
+
+### Опции источника
+
+| Опция | Описание |
+|-------|----------|
+| connectionString | Строка подключения PostgreSQL (обязательно, или connectionStringSecretRef) |
+| syncMode | `full` (по умолчанию) или `incremental` |
+| dataMode | `schema_only` или `schema_and_data` (по умолчанию) |
+| databases | Список `schema.object` для синхронизации; пусто = все |
+| excludeObjects | Исключить типы: view, matview, function, trigger, index, sequence |
+| syncUsers | Синхронизировать роли (CREATE ROLE, без паролей) |
+| syncGrants | Синхронизировать права (GRANT на объекты) |
+
+### Опции приёмника
+
+| Опция | Описание |
+|-------|----------|
+| connectionString | Строка подключения к целевой PostgreSQL |
+| dropTarget | Удалить объекты на target перед применением (осторожно) |
+
+### Требования
+
+- PostgreSQL 12+ на source и target
+- Source: SELECT на pg_catalog, USAGE на схемах, SELECT на таблицах
+- Target: CREATE, INSERT и права на DDL
+- Для синхронизации данных с FK: используется `session_replication_role = replica` при вставке
+
 ## ClickHouse
 
 ClickHouse коннектор поддерживает чтение из таблиц и запись в таблицы ClickHouse. Поддерживает периодический опрос для инкрементального чтения, кастомные SQL запросы, батч-вставки и автосоздание MergeTree таблиц.
diff --git a/docs/ru/fault-tolerance.md b/docs/ru/fault-tolerance.md
diff --git a/mkdocs.yml b/mkdocs.yml