Single-connector multi-cluster: tiered `tools:` config for tool exposure #136

## Goal

Today, reaching N ClickHouse clusters from claude.ai means registering N MCP connectors — one per cluster, because claude.ai binds one connector to one URL and runs one OAuth flow per connector. We want **a single MCP connector that fronts several (2–5, mostly 2–3) clusters that already share OAuth**.

Since claude.ai can't wildcard URLs, the cluster identity has to travel **in-band**: either as a tool argument or baked into the tool itself. URL-path routing (`/mcp/{cluster}`, #132/#134) needs one URL, and therefore one connector, per cluster, so it structurally cannot provide a single connector. This is a new exposure mode alongside it, not a replacement.

Scope for this issue:
- One deployment fronting clusters that **already share** issuer / audience / signing_secret (no auth consolidation in scope).
- Fixed, small cluster set (2–5).
- No distributed / cross-cluster queries — one cluster per call.
- Writes are already confirmed by the agent, so no extra write-gating is needed here.

## What we already have

- `multicluster` URL-path routing: `Host: clickhouse-{cluster}.demo.svc.cluster.local` template, cluster from URL, `ClusterAllowlist`.
- **Static tools**: `execute_query` (read), `write_query` (write) — fixed input schema `{query, settings, …}`, cluster-independent.
- **Dynamic tools**: discovered by `view_regexp` (reads) / `table_regexp` (writes); names are `prefix + discovered`. **Input schema is view-derived**: parsed from the parameters of parameterized views (`{id:UInt64}` via `parseViewParams`) and read from `system.columns` for write tables. So the *same* regexp can match *different* views with *different* schemas on different clusters.
- **Lazy discovery** (`EnsureDynamicTools`) + **catalog cache keyed `(bearer, cluster)`**.
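
As a rough illustration of the last bullet, lazy discovery with a catalog keyed by `(bearer, cluster)` could look like the sketch below. It is written in Python for brevity and is not the server's code: `CatalogCache` and the `discover` callback are hypothetical names, and expiry is left out entirely.

```python
from typing import Callable


class CatalogCache:
    """Hypothetical sketch: discover a user's tools on first use, then reuse them."""

    def __init__(self, discover: Callable[[str, str], list[str]]):
        # `discover(bearer, cluster)` stands in for the real regexp discovery.
        self._discover = discover
        self._entries: dict[tuple[str, str], list[str]] = {}

    def get(self, bearer: str, cluster: str) -> list[str]:
        key = (bearer, cluster)
        if key not in self._entries:
            # Cache miss: run discovery once for this bearer on this cluster.
            self._entries[key] = self._discover(bearer, cluster)
        return self._entries[key]
```

The key includes the bearer because what a regexp matches depends on what that user can see on that cluster.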

## Proposed design: tiered `tools:` placement

Reuse the existing `ToolDefinition` struct everywhere; **where** the `tools:` block lives determines the cluster binding:

| Tier | Placement | Cluster binding | `cluster` arg? | Best for |
|---|---|---|---|---|
| 1 | `server.tools` | the one configured CH | none | single-cluster / legacy (unchanged) |
| 2 | `multicluster.tools` | chosen at call time, `enum` = section names | **added to all** | generic `execute_query` / `write_query` |
| 3 | `multicluster.clusters[].tools` | fixed by the section | none | curated per-cluster tools |

```yaml
clickhouse:
  host: clickhouse-{cluster}.demo.svc.cluster.local   # template; default for every section

multicluster:
  enabled: true

  tools:                          # tier 2 — cluster arg auto-added (enum: [otel, antalya])
    - type: read
      name: execute_query         # → one tool: {query, settings, cluster}
    - type: write
      name: write_query           # → {query, limit, settings, cluster}

  clusters:
    - name: otel                  # tier 3 — cluster baked in, no arg
      tools:
        - type: read
          view_regexp: "^mcp_.*"
          prefix: "otel_"         # admin-authored → otel_<view>
    - name: antalya
      host: clickhouse-antalya.demo.svc.cluster.local   # explicit override when template doesn't fit
      tools:
        - type: write
          table_regexp: "^events_.*"
          prefix: "antalya_"
          mode: insert
```

### Binding / placement rules

- **Generic, cluster-independent tools** (`execute_query`/`write_query`, fixed schema) → **tier 2**. One def, `cluster` enum arg over the section names. Cluster count doesn't grow the tool list.
- **Regexp / dynamic tools** (view-derived schema) → **tier 3**, bound to one cluster so the derived schema is unambiguous. They **cannot** be collapsed under a tier-2 cluster arg, because one regexp matches differently-shaped views per cluster.
- **Admin-authored `prefix`** per section disambiguates discovered names (`otel_`, `antalya_`). This is the operator's choice, *not* the server auto-deriving prefixes from cluster names.
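
To make the tier-2 rule concrete, here is a minimal sketch of adding the `cluster` enum to a fixed input schema. It assumes tool input schemas are plain JSON Schema objects; `add_cluster_arg` is a hypothetical helper, and marking `cluster` as required is an assumption this issue does not settle.

```python
import copy


def add_cluster_arg(input_schema: dict, section_names: list[str]) -> dict:
    """Return a copy of a fixed input schema with a `cluster` enum argument added."""
    schema = copy.deepcopy(input_schema)  # leave the shared tool definition untouched
    props = schema.setdefault("properties", {})
    if "cluster" in props:
        raise ValueError("tool already defines a `cluster` argument")
    props["cluster"] = {
        "type": "string",
        "enum": list(section_names),  # enum = configured section names
        "description": "Target cluster for this call",
    }
    # Assumption: the argument is required (no default cluster).
    schema.setdefault("required", []).append("cluster")
    return schema


# Illustrative fixed schema for `execute_query`; the real one has more fields.
execute_query_schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "settings": {"type": "object"},
    },
    "required": ["query"],
}

tier2_schema = add_cluster_arg(execute_query_schema, ["otel", "antalya"])
```

Adding a third section only lengthens the enum; the number of tier-2 tools stays the same.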

### Host template

Keep `{cluster}` templating (`clickhouse-{cluster}.demo.svc.cluster.local`) as the default — works for single-cluster and as the multi-cluster default; a section may override `host` (and other sparse CH overrides: port, TLS, database) when it doesn't fit.
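
A minimal sketch of that resolution order, assuming a section carries an optional `host` override (`ClusterSection` and `resolve_host` are hypothetical names, and only `host` is modeled here):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterSection:
    name: str
    host: Optional[str] = None  # sparse override; None means "use the template"


def resolve_host(template: str, section: ClusterSection) -> str:
    """Prefer the section's explicit host, else fill `{cluster}` in the template."""
    if section.host:
        return section.host
    return template.replace("{cluster}", section.name)


template = "clickhouse-{cluster}.demo.svc.cluster.local"
otel_host = resolve_host(template, ClusterSection("otel"))
antalya_host = resolve_host(
    template,
    ClusterSection("antalya", host="clickhouse-antalya.demo.svc.cluster.local"),
)
```

A template without a `{cluster}` placeholder is returned as-is, which covers the single-cluster case.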

### Reuse vs new

- **Reuse**: `ToolDefinition`, regexp discovery, lazy discovery + `(bearer, cluster)` cache, all handlers.
- **New**: cluster sections in config; the **cross-cluster union** that assembles one connector's `tools/list` from the per-`(bearer, cluster)` discovered sets. The section names also become the allowlist + the tier-2 `cluster` enum (making `ClusterAllowlist` redundant for this mode).
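
Using section names as the allowlist means a tier-2 call can be validated and routed in one step. A hedged sketch, where `pick_cluster` is a hypothetical helper and the error type is arbitrary:

```python
def pick_cluster(arguments: dict, section_names: list[str]) -> tuple[str, dict]:
    """Split `cluster` from a tier-2 call's arguments and check it against section names."""
    args = dict(arguments)  # do not mutate the caller's dict
    cluster = args.pop("cluster", None)
    if cluster not in section_names:
        # Section names double as the allowlist for this mode.
        raise ValueError(f"unknown cluster {cluster!r}; expected one of {section_names}")
    return cluster, args


cluster, rest = pick_cluster(
    {"query": "SELECT 1", "cluster": "otel"}, ["otel", "antalya"]
)
```

The remaining arguments can then go to the existing handler unchanged, with the connection chosen by `cluster`.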

### Drift & collisions

- **Drift** (configured/regexp tool whose view/table is missing for this user on this cluster) → silently **omit** from the list.
- **Name collisions are runtime and per-user.** Final names only materialize at discovery, since a regexp matches whatever exists for that bearer on that cluster, so collisions surface at the cross-cluster union step.
  - On collision, **expose no tool with that name**: drop *all* contenders rather than silently routing to the wrong table/cluster.
  - **Log once per cache miss**, with tier/cluster/source for each contender, so the admin can fix the prefix manually.
  - Load-time validation still catches cheap static issues (two tier-2 names equal, malformed prefix, bad regexp), but final-name uniqueness is a runtime concern.
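
The union step with the drop-all collision rule can be sketched as follows. This is an illustration under assumptions, not the planned implementation: `Candidate` and `union_tools` are hypothetical names, and the input is taken to be the tools already discovered for one bearer across all sections.

```python
import logging
from collections import defaultdict
from dataclasses import dataclass

log = logging.getLogger("tools_union")


@dataclass(frozen=True)
class Candidate:
    name: str     # final tool name (static name, or prefix + discovered)
    tier: int     # 2 = multicluster.tools, 3 = clusters[].tools
    cluster: str  # "" for tier 2, where the cluster is chosen per call
    source: str   # view, table, or static definition the tool came from


def union_tools(candidates: list[Candidate]) -> dict[str, Candidate]:
    """Merge per-cluster tool sets; a name claimed more than once is dropped entirely."""
    by_name = defaultdict(list)
    for c in candidates:
        by_name[c.name].append(c)

    exposed = {}
    for name, contenders in by_name.items():
        if len(contenders) > 1:
            # Never guess a winner: drop every contender and tell the admin why.
            log.warning(
                "tool name collision on %r, dropping all: %s",
                name,
                [(c.tier, c.cluster, c.source) for c in contenders],
            )
            continue
        exposed[name] = contenders[0]
    return exposed
```

Drift needs no special handling here: a view or table that is missing for this user never produces a candidate, so it is simply absent from the result.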

## Decisions already made in discussion

- Single connector, shared-OAuth clusters only; no auth consolidation.
- Tiered `tools:` placement (above) chosen over: per-call cluster arg on everything, session-pinning (`select_cluster`), and server auto-prefixing — all rejected.
- Regexp discovery kept as-is, now usable inside a section.
- Homogeneous repetition across sections is acceptable; no templating, **no tier-2 subset scoping**.
- Collision → drop all colliding names + log per cache miss; admin resolves manually.

## Open questions for broader discussion

1. Should this single-connector mode and the existing `/mcp/{cluster}` URL routing coexist indefinitely, or is one a migration target?
2. Any concern with `tools/list` fan-out across all configured clusters on the first request per bearer (one-time, then cached)? What should happen when a cluster is unreachable at that moment: omit its tools and retry on the next cache cycle?
3. Tool-count / context budget in claude.ai & ChatGPT with the union of all sections' tools — practical ceiling for our 2–5 cluster target?
4. Naming convention guidance for `prefix` to minimize accidental collisions.
5. Anything that breaks the assumption that all fronted clusters share issuer/audience/signing_secret?

