Tailnet Lock (TKA): build AUM-sync RPC + replayer, ship verify-and-log before enforce #7

@GeiserX

Status

TKA verification is fully wired and unit-tested (ts_tka::Authority::node_key_authorized, the fail-closed tka_admits chokepoint in the peer tracker, per-peer key_signature parsed off the netmap, TkaStatus{head,disabled} from MapResponse). Enforcement is inert (tka_authority = None) because the trusted-key Authority cannot yet be built — MapResponse carries only the AUM head and the per-peer signature, never the trusted keys.

What's missing (the acquisition side)

  1. AUM-chain replayer: NOT in ts_tka (no Aum struct, no chain replay; Authority is only constructible via from_state). Must fold AddKey/RemoveKey/UpdateKey/Checkpoint AUMs into a trusted-key State with deterministic fork resolution (Go tka.computeActiveChain/pickNextAUM).
  2. /machine/tka/sync Noise RPC client: NOT present. Minimal viable: GET /machine/tka/bootstrap (genesis AUM + DisablementSecret) → Bootstrap() → one GET /machine/tka/sync/offer → Inform(MissingAUMs). Skip /sync/send (read-only client) and incremental sync (re-bootstrap on TKAHead change).
  3. Authority delivery seam: push the built Authority into the peer tracker (via the Env bus, mirroring StateUpdate), with a re-evaluation sweep over the existing peer_db on install.
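
A minimal sketch of the fold described in item 1, for orientation only. Every type here (`Aum`, `State`, `Key`, `KeyId`, `ReplayError`) is hypothetical and does not exist in ts_tka today; the sketch assumes a single linear chain that has already been ordered, and it deliberately omits signature verification, parent-hash checks, and fork resolution, which the real replayer needs.

```rust
use std::collections::BTreeMap;

/// Hypothetical key identifier (stand-in for raw public-key bytes).
pub type KeyId = Vec<u8>;

/// Hypothetical trusted key; only the vote weight is modelled here.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Key {
    pub votes: u32,
}

/// Hypothetical AUM payloads, one variant per kind named in item 1.
#[derive(Clone, Debug)]
pub enum Aum {
    AddKey { id: KeyId, key: Key },
    RemoveKey { id: KeyId },
    UpdateKey { id: KeyId, votes: u32 },
    Checkpoint { state: State },
}

/// Trusted-key state produced by replaying a chain.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct State {
    pub keys: BTreeMap<KeyId, Key>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ReplayError {
    DuplicateKey,
    UnknownKey,
}

impl State {
    /// Apply one AUM and return the successor state, or an error if the
    /// update is inconsistent with the current key set.
    pub fn apply(mut self, aum: &Aum) -> Result<State, ReplayError> {
        match aum {
            Aum::AddKey { id, key } => {
                if self.keys.contains_key(id) {
                    return Err(ReplayError::DuplicateKey);
                }
                self.keys.insert(id.clone(), key.clone());
            }
            Aum::RemoveKey { id } => {
                self.keys.remove(id).ok_or(ReplayError::UnknownKey)?;
            }
            Aum::UpdateKey { id, votes } => {
                let key = self.keys.get_mut(id).ok_or(ReplayError::UnknownKey)?;
                key.votes = *votes;
            }
            // A checkpoint replaces the whole state with its snapshot.
            Aum::Checkpoint { state } => return Ok(state.clone()),
        }
        Ok(self)
    }
}

/// Fold an already-ordered chain into a State, stopping at the first error.
pub fn replay(chain: &[Aum]) -> Result<State, ReplayError> {
    chain
        .iter()
        .try_fold(State::default(), |state, aum| state.apply(aum))
}
```

Using a `BTreeMap` keeps iteration order deterministic, which may help when comparing a replayed state against Go fixtures.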

Primary sources: Go tka/{aum,tka,sync,sig,state}.go, ipn/ipnlocal/tailnet-lock.go (RPC drivers), tailcfg/tka.go.

Sharpest risk: byte-exact CBOR (CTAP2) + BLAKE2s hashing must match Go. Any divergence changes Hash/SigHash, so every signature check fails. Cross-validate against Go fixtures in tka/aum_test.go first.

Estimated 2–3 weeks.

Rollout decision (from research panel — security + critic, independent)

Ship verify-and-LOG first, not fail-closed enforcement. Rationale:

  • In the current both-ends-owned model (operator runs Headscale AND the exit nodes), the threat TKA defends against (a compromised control plane injecting malicious node keys) is out of the trust model, because the operator is the control plane. Enforcement adds little security here.
  • ts_tka crypto is unaudited (TS_RS_EXPERIMENT-gated). Fail-closed enforcement on unaudited CBOR/verification risks a self-inflicted connectivity outage (a replay/verify bug rejects legitimate peers) while guarding an out-of-model threat — worse than honest no-enforcement.
  • Verify-and-log (run node_key_authorized, log authorized/unauthorized/unsigned + a "would-reject" metric, but always admit) detects a control-plane compromise with zero outage risk and seasons the unaudited crypto on real traffic before it ever gates connectivity.

Promote to fail-closed enforcement only after (a) ts_tka is audited / cross-validated against Go vectors, AND (b) the deployment model widens to untrusted control (e.g. open-source users on hosted Tailscale SaaS), where TKA is genuinely load-bearing.
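
A rough sketch of how the observe/enforce split could look at the chokepoint. `Mode`, `Verdict`, `Metrics`, and the `admits` signature are assumptions made for illustration and are not the existing tka_admits API; logging uses `eprintln!` only to keep the sketch dependency-free.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Outcome of checking a peer's key signature (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Verdict {
    Authorized,
    Unauthorized,
    Unsigned,
}

/// Rollout mode (hypothetical): Observe always admits, Enforce gates.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Mode {
    Observe,
    Enforce,
}

/// Hypothetical metrics holder for the "would-reject" counter.
#[derive(Default)]
pub struct Metrics {
    pub would_reject: AtomicU64,
}

/// Decide whether to admit a peer. In Observe mode the verdict is logged
/// and counted but never blocks connectivity.
pub fn admits(mode: Mode, verdict: Verdict, metrics: &Metrics) -> bool {
    let authorized = verdict == Verdict::Authorized;
    if !authorized {
        metrics.would_reject.fetch_add(1, Ordering::Relaxed);
        eprintln!("tka: would reject peer: {verdict:?} (mode {mode:?})");
    }
    match mode {
        Mode::Observe => true,
        Mode::Enforce => authorized,
    }
}
```

Keeping the mode as an explicit parameter means promotion to enforcement is a one-line configuration change, and the counter measures the same thing in both modes.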

Acceptance for the first PR

  • AUM CBOR (de)serialization validated against Go tka/aum_test.go + sig_test.go vectors (do this before any network code).
  • /machine/tka/bootstrap + /sync/offer RPC clients over the existing Noise transport.
  • Authority delivered to the peer tracker; observe-only (tka_admits logs the verdict + increments a "would-reject" counter, still admits).
  • Fail-open on sync RPC failure (match Go: admit until a verified Authority exists), documented as an intentional divergence from the egress fail-closed posture.
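
The last two acceptance points could be sketched as follows. This is an illustration under stated assumptions: `Verifier`, `AllowList`, and `PeerTracker` are hypothetical stand-ins (the real tracker, its peer_db layout, and the node_key_authorized signature may differ). A `None` verdict models the fail-open case where no verified Authority exists yet, and installing an Authority triggers a sweep over peers already known.

```rust
use std::collections::HashMap;

/// Hypothetical abstraction over the trusted-key Authority.
pub trait Verifier {
    fn authorized(&self, node_key: &str, signature: Option<&[u8]>) -> bool;
}

/// Toy verifier for illustration: a signature is accepted only if it
/// appears verbatim in the list. Real verification is cryptographic.
pub struct AllowList(pub Vec<Vec<u8>>);

impl Verifier for AllowList {
    fn authorized(&self, _node_key: &str, signature: Option<&[u8]>) -> bool {
        signature.map_or(false, |sig| self.0.iter().any(|s| s.as_slice() == sig))
    }
}

/// Hypothetical peer tracker: node key -> optional key signature.
pub struct PeerTracker<V> {
    authority: Option<V>,
    peer_db: HashMap<String, Option<Vec<u8>>>,
}

impl<V: Verifier> PeerTracker<V> {
    pub fn new() -> Self {
        PeerTracker { authority: None, peer_db: HashMap::new() }
    }

    pub fn upsert_peer(&mut self, node_key: &str, signature: Option<Vec<u8>>) {
        self.peer_db.insert(node_key.to_string(), signature);
    }

    /// None means "no verified Authority yet": the caller admits (fail-open).
    /// Some(false) is a would-reject verdict to log and count.
    pub fn verdict(&self, node_key: &str) -> Option<bool> {
        let authority = self.authority.as_ref()?;
        let signature = self.peer_db.get(node_key).and_then(|s| s.as_deref());
        Some(authority.authorized(node_key, signature))
    }

    /// Install a freshly built Authority and re-evaluate every known peer.
    /// Returns the sorted node keys that would be rejected under enforcement.
    pub fn install_authority(&mut self, authority: V) -> Vec<String> {
        let mut would_reject: Vec<String> = self
            .peer_db
            .iter()
            .filter(|(key, sig)| !authority.authorized(key, sig.as_deref()))
            .map(|(key, _)| key.clone())
            .collect();
        would_reject.sort();
        self.authority = Some(authority);
        would_reject
    }
}
```

Returning the would-reject list from the install step, rather than dropping peers inside it, keeps the sweep observe-only: the caller decides whether to log, count, or (later) enforce.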
