tsdb-operator

A Kubernetes operator that manages the full lifecycle of Prometheus clusters: provisioning, scaling, high availability, scheduled backups to S3-compatible storage, and an audit log of operator actions.

What it solves

Running Prometheus at scale means repeatedly wiring up the same primitives — StatefulSets, PVCs, headless services, health checks, snapshotting, off-cluster backups, and a record of who changed what. tsdb-operator turns that into a single declarative CRD (PrometheusCluster) and a small control plane.

Concretely, pain point → mechanism:

| Pain point | Without the operator | How tsdb-operator handles it |
| --- | --- | --- |
| One cluster = 4 resources to wire up | StatefulSet + PVC + Service + ConfigMap; easy to forget one on change | One `PrometheusCluster` spec; reconciler drives all dependents |
| PVCs don't survive zone/cluster loss | DR is a manual `rsync` | Cron → admin snapshot → S3; `tsdb-ctl restore` pulls it back |
| Replicas don't self-heal | Prometheus has no leader; a stuck pod stays stuck | HA loop probes `/-/ready`, deletes the pod so the StatefulSet rebuilds it |
| "Who changed what, when?" | K8s Events have a TTL; history disappears | Postgres `audit_log` with retention + pruner |
| No cross-namespace view | Clusters scattered across namespaces, listed by hand | `PrometheusClusterSet` (cluster-scoped), selects by label |
| Bad spec only fails at reconcile time | `replicas: 0`, bad cron, empty URL break at runtime | Validating webhook rejects at `kubectl apply` time |
| Custom scrape = hand-edit ConfigMap | One typo and the config reload fails | `spec.additionalScrapeConfigs`, merged by the operator |

Out of scope: long-term storage and cross-cluster query — that's Thanos / VictoriaMetrics territory. See docs/COMPARISON.en.md.

Features

Cluster lifecycle

PrometheusCluster CRD → StatefulSet + headless Service + ConfigMap + PVC
Finalizer-based cleanup; phase reporting (Provisioning / Active / Scaling / Failed)
Scale, image upgrade, retention change reconciled via full pod-template diff

High availability

Periodic /-/ready probes across replicas
Unhealthy pods are deleted so the StatefulSet recreates them; LastFailoverTime and K8s Events record each failover

Backup & restore

Cron-driven Prometheus admin snapshot → tar streamed via SPDY exec → S3 multipart upload
On-pod snapshot-dir cleanup after successful upload
tsdb-ctl CLI: list / restore against any S3-compatible endpoint (MinIO, AWS, etc.)
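
As a rough illustration of the first step in that pipeline, the sketch below calls Prometheus's admin snapshot endpoint (`POST /api/v1/admin/tsdb/snapshot`, available when Prometheus runs with `--web.enable-admin-api`) and derives the directory that would then be archived. The function names `takeSnapshot`, `parseSnapshotName`, and `snapshotDir` are hypothetical, the snapshot name is made up, and a local test server stands in for Prometheus; the exec streaming and S3 upload are not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"path"
)

// snapshotResponse mirrors the JSON envelope Prometheus returns from the
// admin snapshot endpoint.
type snapshotResponse struct {
	Status string `json:"status"`
	Data   struct {
		Name string `json:"name"`
	} `json:"data"`
	Error string `json:"error"`
}

// parseSnapshotName extracts the snapshot directory name from a response body.
func parseSnapshotName(body []byte) (string, error) {
	var r snapshotResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", fmt.Errorf("decode snapshot response: %w", err)
	}
	if r.Status != "success" || r.Data.Name == "" {
		return "", fmt.Errorf("snapshot failed: status=%q error=%q", r.Status, r.Error)
	}
	return r.Data.Name, nil
}

// takeSnapshot asks Prometheus to create a TSDB snapshot and returns its name.
func takeSnapshot(client *http.Client, baseURL string) (string, error) {
	resp, err := client.Post(baseURL+"/api/v1/admin/tsdb/snapshot", "", nil)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return parseSnapshotName(body)
}

// snapshotDir returns where Prometheus writes the snapshot under its data dir.
func snapshotDir(dataDir, name string) string {
	return path.Join(dataDir, "snapshots", name)
}

func main() {
	// Fake Prometheus that answers the snapshot call with a canned success body.
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != "/api/v1/admin/tsdb/snapshot" {
			http.NotFound(w, r)
			return
		}
		fmt.Fprint(w, `{"status":"success","data":{"name":"20240101T000000Z-0123456789abcdef"}}`)
	}))
	defer fake.Close()

	name, err := takeSnapshot(fake.Client(), fake.URL)
	if err != nil {
		panic(err)
	}
	fmt.Println("directory to archive:", snapshotDir("/prometheus", name))
}
```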

Thanos sidecar (opt-in)

spec.thanos.enabled: true attaches a sidecar sharing the /prometheus volume
Automatic --enable-feature=expand-external-labels + per-pod replica label
Object-store config via Secret reference

Remote write

spec.remoteWrite renders into prometheus.yml with basicAuth / bearerToken Secret refs

Multi-namespace aggregation

PrometheusClusterSet (cluster-scoped CRD) selects clusters by label across namespaces
Status reports member count, per-phase histogram, and member list
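
To make the status shape concrete, here is a small Go sketch of label-based selection followed by the member count, per-phase histogram, and member list. It assumes plain equality matching on labels (similar in spirit to Kubernetes `matchLabels`); the real selector schema is not shown in this README, and `member`, `matches`, and `summarize` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// member is a hypothetical, flattened view of one PrometheusCluster.
type member struct {
	Namespace string
	Name      string
	Phase     string
	Labels    map[string]string
}

// matches reports whether every key/value pair in selector is present in labels.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// summarize selects members by label and returns the count, a phase -> count
// histogram, and a sorted list of "namespace/name" entries.
func summarize(selector map[string]string, all []member) (int, map[string]int, []string) {
	byPhase := make(map[string]int)
	var names []string
	for _, m := range all {
		if !matches(selector, m.Labels) {
			continue
		}
		byPhase[m.Phase]++
		names = append(names, m.Namespace+"/"+m.Name)
	}
	sort.Strings(names)
	return len(names), byPhase, names
}

func main() {
	clusters := []member{
		{Namespace: "team-a", Name: "demo", Phase: "Active", Labels: map[string]string{"env": "prod"}},
		{Namespace: "team-b", Name: "edge", Phase: "Scaling", Labels: map[string]string{"env": "prod"}},
		{Namespace: "team-c", Name: "lab", Phase: "Active", Labels: map[string]string{"env": "dev"}},
	}
	count, byPhase, names := summarize(map[string]string{"env": "prod"}, clusters)
	fmt.Println(count, byPhase, names)
}
```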

Admission validation (opt-in)

Validating webhook rejects bad spec.replicas, missing backup.bucket, bad cron, empty remoteWrite[].url at kubectl apply time
cert-manager-backed TLS via Helm values
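
The checks listed above can be sketched in plain Go, independent of any webhook framework. This is an assumption-laden illustration: the struct types are simplified stand-ins for the real CRD types, requiring a bucket and schedule only when backup is enabled is a guess, and the cron check merely counts five fields where a real webhook would more likely parse the expression with a cron library.

```go
package main

import (
	"fmt"
	"strings"
)

// Simplified, hypothetical mirrors of the PrometheusCluster spec fields.
type backupSpec struct {
	Enabled  bool
	Bucket   string
	Schedule string
}

type remoteWriteSpec struct {
	URL string
}

type clusterSpec struct {
	Replicas    int32
	Backup      backupSpec
	RemoteWrite []remoteWriteSpec
}

// validate returns one message per problem; an empty slice means the spec passes.
func validate(spec clusterSpec) []string {
	var problems []string
	if spec.Replicas < 1 {
		problems = append(problems, "spec.replicas must be at least 1")
	}
	if spec.Backup.Enabled {
		if strings.TrimSpace(spec.Backup.Bucket) == "" {
			problems = append(problems, "spec.backup.bucket is required when backup is enabled")
		}
		// Field-count check only; it does not validate the individual cron fields.
		if len(strings.Fields(spec.Backup.Schedule)) != 5 {
			problems = append(problems, "spec.backup.schedule must be a 5-field cron expression")
		}
	}
	for i, rw := range spec.RemoteWrite {
		if strings.TrimSpace(rw.URL) == "" {
			problems = append(problems, fmt.Sprintf("spec.remoteWrite[%d].url must not be empty", i))
		}
	}
	return problems
}

func main() {
	good := clusterSpec{
		Replicas: 2,
		Backup:   backupSpec{Enabled: true, Bucket: "tsdb-backups", Schedule: "0 */6 * * *"},
	}
	bad := clusterSpec{
		Replicas:    0,
		Backup:      backupSpec{Enabled: true, Schedule: "every six hours"},
		RemoteWrite: []remoteWriteSpec{{URL: ""}},
	}
	fmt.Println("good:", validate(good))
	for _, p := range validate(bad) {
		fmt.Println("bad:", p)
	}
}
```

Collecting every problem instead of stopping at the first lets a single rejected `kubectl apply` report all the fields that need fixing.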

Audit log (opt-in)

PostgreSQL-backed audit_log records every cluster mutation and backup event
Retention policy (--audit-retention-days) with periodic pruner

REST API (opt-in)

gin-based: /api/clusters, /api/clustersets, /api/clusters/:name/{backup,audit}
cert-manager TLS supported

Observability

Prometheus metrics: tsdb_operator_{cluster_phase,backup_total,failover_total,audit_*}
Grafana dashboard at grafana/dashboards/tsdb-operator.json

Packaging

Helm chart at charts/tsdb-operator/ with feature flags for every subsystem
envtest + kind e2e; every release verified against a real kind cluster

Why snapshot to S3 when PVCs exist? See docs/BACKUPS.en.md (Chinese version available).

How to restore from a backup: docs/RESTORE.md (Chinese version available).

Migrating to/from prometheus-operator: docs/MIGRATION.md (Chinese version available).

Cluster-scoped aggregation across namespaces: docs/CLUSTERSET.md (Chinese version available).

How does this compare to Thanos and VictoriaMetrics? See docs/COMPARISON.en.md (Chinese version available).

Broader TSDB landscape (Prometheus ecosystem + general-purpose TSDBs): docs/TSDB-LANDSCAPE.en.md (Chinese version available).

Mainland China observability landscape (Nightingale / DeepFlow / ARMS / domestic TSDBs): docs/CHINA-LANDSCAPE.en.md (Chinese version available).

Prometheus TSDB internals that informed this operator's design: docs/TSDB-INTERNALS.en.md (Chinese version available).

Adding custom scrape configs without hand-editing the ConfigMap: docs/SCRAPE-CONFIGS.md (Chinese version available).

Plan for the v1.0 API freeze: docs/V1-PREP.md (Chinese version available).

Architecture

┌───────────────────────────────────────────────────────────────┐
│                         tsdb-operator                         │
│                                                               │
│  ┌────────────────────────┐   ┌───────────────────────────┐   │
│  │ PrometheusCluster      │   │ HA health checker         │   │
│  │ reconciler             │──▶│ (probe /-/ready, failover)│   │
│  │ (StatefulSet + SVC)    │   └───────────────────────────┘   │
│  └────────────┬───────────┘                                   │
│               │                                               │
│               ▼                                               │
│  ┌────────────────────────┐   ┌───────────────────────────┐   │
│  │ Backup scheduler       │──▶│ S3 / MinIO                │   │
│  │ (cron, admin snapshot) │   └───────────────────────────┘   │
│  └────────────────────────┘                                   │
│                                                               │
│  ┌────────────────────────┐   ┌───────────────────────────┐   │
│  │ REST API (gin)         │──▶│ Audit log (PostgreSQL)    │   │
│  └────────────────────────┘   └───────────────────────────┘   │
└───────────────────────────────────────────────────────────────┘

Quick start

# 1. Install the operator via Helm
helm install tsdb-operator ./charts/tsdb-operator -n tsdb-operator --create-namespace

# 2. Create a PrometheusCluster
kubectl apply -f config/samples/observability_v1_prometheuscluster.yaml

# 3. Watch it come up
kubectl get prometheuscluster -w

CRD example

See config/samples/observability_v1_prometheuscluster.yaml.

apiVersion: observability.merlionos.org/v1
kind: PrometheusCluster
metadata:
  name: demo
spec:
  replicas: 2
  retention: 15d
  storage:
    size: 20Gi
  backup:
    enabled: true
    bucket: tsdb-backups
    schedule: "0 */6 * * *"

REST API

| Method | Path | Description |
| --- | --- | --- |
| GET | `/api/clusters` | List PrometheusCluster resources |
| POST | `/api/clusters` | Create a cluster |
| GET | `/api/clusters/:name` | Get cluster + status |
| DELETE | `/api/clusters/:name` | Delete cluster |
| POST | `/api/clusters/:name/backup` | Trigger manual backup |
| GET | `/api/clusters/:name/audit` | Query audit log |

Set X-Operator: <user> on write requests to record the actor in the audit log.
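
For example, a client could trigger a manual backup and identify the actor as in the following Go sketch. The base URL, port, cluster name, and the `newBackupRequest` helper are hypothetical; only the path and the `X-Operator` header come from the table and note above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newBackupRequest builds POST /api/clusters/:name/backup and records the
// caller in the X-Operator header so the audit log can attribute the action.
func newBackupRequest(baseURL, cluster, actor string) (*http.Request, error) {
	endpoint := baseURL + "/api/clusters/" + url.PathEscape(cluster) + "/backup"
	req, err := http.NewRequest(http.MethodPost, endpoint, nil)
	if err != nil {
		return nil, fmt.Errorf("build backup request: %w", err)
	}
	req.Header.Set("X-Operator", actor)
	return req, nil
}

func main() {
	// The address below is a placeholder for wherever the REST API is exposed.
	req, err := newBackupRequest("http://localhost:8080", "demo", "alice")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String(), "X-Operator:", req.Header.Get("X-Operator"))
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```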

Development

# Local dependencies (Postgres for audit, MinIO for backup, Grafana)
docker compose up -d

# Regenerate CRDs / deepcopy after changing api/v1 types
make generate manifests

# Unit + envtest
make test

# End-to-end on kind
make test-e2e

Layout

api/v1/                        CRD types
internal/controller/           PrometheusCluster reconciler
internal/ha/                   Health checking + failover
internal/backup/               Snapshot + S3 upload scheduler
internal/audit/                PostgreSQL audit log
pkg/api/                       gin HTTP server
config/                        kustomize manifests (kubebuilder)
grafana/dashboards/            Operator dashboard JSON

Roadmap

See ROADMAP.md (Chinese version available).

Architecture Decision Records

See docs/adr/ for the rationale behind key choices.

The Observability Epic (trilogy)

A long-form essay tracing observability from ancient beacon towers to modern eBPF — and why Chinese and Western observability ecosystems diverged along lines already written three thousand years ago.

Book I · The Flame and the Eyes (Chinese version available). 771 BCE → 1858 CE: Eastern Qintianjian vs. Western Nightingale
Book II · Iron and Lightning (Chinese version available). 1837 → 2019: from Morse to OpenTelemetry
Book III · The Divergence and the Echo (Chinese version available). 2013 → today: the modern Sino-American divergence, and where tsdb-operator sits

License

Apache 2.0
