Persist generated WireGuard keys as Kubernetes Secrets to prevent key drift on reconcile retries #498

Draft
Copilot wants to merge 2 commits into main from copilot/persist-wireguard-keys-for-full-mesh-pods


Copilot AI commented Mar 25, 2026

When full-mesh mode is enabled and WireGuard keypairs are auto-generated (no annotations pre-set), each reconcile retry generates a new keypair. The wg0 interface already loaded on the shadow pod then diverges from the client snippet written in the same pass.
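The drift can be reproduced in isolation. The following is a minimal sketch, assuming WireGuard keys are 32-byte Curve25519 keys encoded in base64 and that Go 1.20+ (`crypto/ecdh`) is available. The helper names `newKeypair` and `publicFromPrivate` are hypothetical and are not taken from the InterLink code base.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"encoding/base64"
)

// keypair holds base64-encoded Curve25519 keys.
type keypair struct {
	Private string
	Public  string
}

// newKeypair generates a fresh random keypair (hypothetical helper).
// Every call returns different key material, which is what a reconcile
// retry does when nothing has been persisted.
func newKeypair() (keypair, error) {
	priv, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return keypair{}, err
	}
	return keypair{
		Private: base64.StdEncoding.EncodeToString(priv.Bytes()),
		Public:  base64.StdEncoding.EncodeToString(priv.PublicKey().Bytes()),
	}, nil
}

// publicFromPrivate re-derives the public key from a stored private key.
// The derivation is deterministic, so a persisted private key always
// yields the same public key on later retries.
func publicFromPrivate(privB64 string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(privB64)
	if err != nil {
		return "", err
	}
	priv, err := ecdh.X25519().NewPrivateKey(raw)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(priv.PublicKey().Bytes()), nil
}
```

Calling `newKeypair` twice models two reconcile passes without persistence: the second public key no longer matches the one the live interface loaded. Calling `publicFromPrivate` on a stored private key models a retry that reads persisted material and gets the same public key back.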

Changes

  • pkg/virtualkubelet/mesh.go — new ensureWGKeysSecret:

    • Gets or creates a <resourceBaseName>-wg-keys Secret in the wstunnel namespace storing server-private-key, client-private-key, and client-public-key
    • On first call: creates the Secret with freshly-generated keys
    • On subsequent calls (retries): returns the stored keys unchanged, discarding any newly-generated values
    • Handles concurrent creation races via AlreadyExists re-fetch
    • Validates stored fields are non-empty before returning
  • pkg/virtualkubelet/virtualkubelet.go, createDummyPod:

    • After key generation, calls ensureWGKeysSecret; the returned values replace the generated ones so all downstream rendering (shadow pod config, client snippet) uses the same stable keypair
    • cleanupWstunnelResources now also deletes the <name>-wg-keys Secret on pod deletion

    // After generation or annotation-read, stabilise keys via Secret:
    serverPriv, generatedClientPriv, clientPub, err = p.ensureWGKeysSecret(
        ctx, wstunnelNS, resourceBaseName,
        serverPriv, generatedClientPriv, clientPub,
    )

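The get-or-create behaviour described above can be sketched without a cluster. This is a simplified model, not the PR's implementation: `secretStore`, `wgKeys`, `ensureKeys`, and the two sentinel errors are hypothetical stand-ins for the Kubernetes Secrets API and its NotFound / AlreadyExists errors. Only the `-wg-keys` suffix and the three stored fields come from the description.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Stand-ins for the API server's NotFound and AlreadyExists errors
// (assumption: in-memory model only).
var (
	errNotFound      = errors.New("not found")
	errAlreadyExists = errors.New("already exists")
)

// wgKeys mirrors the three fields stored in the Secret.
type wgKeys struct {
	ServerPrivate, ClientPrivate, ClientPublic string
}

// secretStore is a hypothetical in-memory replacement for the Secrets API.
type secretStore struct {
	mu   sync.Mutex
	data map[string]wgKeys
}

func (s *secretStore) get(name string) (wgKeys, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	k, ok := s.data[name]
	if !ok {
		return wgKeys{}, errNotFound
	}
	return k, nil
}

func (s *secretStore) create(name string, k wgKeys) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.data[name]; ok {
		return errAlreadyExists
	}
	if s.data == nil {
		s.data = make(map[string]wgKeys)
	}
	s.data[name] = k
	return nil
}

// validate rejects entries with any empty field.
func validate(name string, k wgKeys) (wgKeys, error) {
	if k.ServerPrivate == "" || k.ClientPrivate == "" || k.ClientPublic == "" {
		return wgKeys{}, fmt.Errorf("secret %q has empty key fields", name)
	}
	return k, nil
}

// ensureKeys returns the stored keys when they exist and otherwise
// stores the candidate keys. Stored keys always win over the candidate.
func ensureKeys(s *secretStore, baseName string, candidate wgKeys) (wgKeys, error) {
	name := baseName + "-wg-keys"
	stored, err := s.get(name)
	if err == nil {
		return validate(name, stored) // retry path: discard candidate
	}
	if !errors.Is(err, errNotFound) {
		return wgKeys{}, err
	}
	if _, err := validate(name, candidate); err != nil {
		return wgKeys{}, err
	}
	err = s.create(name, candidate)
	if errors.Is(err, errAlreadyExists) {
		// Another reconcile created it first: re-fetch and use its keys.
		stored, err = s.get(name)
		if err != nil {
			return wgKeys{}, err
		}
		return validate(name, stored)
	}
	if err != nil {
		return wgKeys{}, err
	}
	return candidate, nil
}
```

In this model, a second call to `ensureKeys` for the same base name returns the first call's keys regardless of the candidate passed in. That is the property that keeps the shadow pod config and the client snippet aligned across retries.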
Original prompt

This section details the original issue you should resolve

<issue_title>Persist generated WireGuard keys for full-mesh pods to avoid shadow/client config drift on retries</issue_title>
<issue_description>When full-mesh offloading is enabled, InterLink generates WireGuard key material during shadow pod creation if the offloaded pod does not already carry interlink.eu/wg-private-key and interlink.eu/wg-peer-public-key.

At the moment, those generated keys are used to render:

  • the shadow pod WireGuard config
  • the client-side WireGuard snippet/pre-exec script

but they are not persisted back onto the offloaded pod.

This becomes a problem when the creation/reconcile flow runs more than once for the same pod. A later retry can generate a different keypair and update annotations/config content, while the already running shadow wg0 interface still has the previously loaded keys. The result is WireGuard handshake failure due to client/shadow key drift.

Observed symptom:

  • interlink.eu/wireguard-client-snippet contains one server public key
  • /etc/wireguard/wg0.conf may contain another, or be updated later
  • wg show wg0 on the live shadow pod can still expose the original loaded keypair
  • ping/handshake to 10.7.0.1 fails intermittently

Why this is tricky:

  • the setup often works on the first reconcile
  • failures appear only when retries/requeues happen during creation, so the issue looks random

Proposed fix:

  • persist generated interlink.eu/wg-private-key and interlink.eu/wg-peer-public-key annotations onto the offloaded pod immediately after generation
  • reuse those persisted values on subsequent retries
  • optionally, consider forcing a shadow pod rollout when WG config changes, to avoid live/runtime drift from mounted config updates

This should make the full-mesh WireGuard setup deterministic across retries and prevent shadow/client key mismatches.</issue_description>

<agent_instructions>make a proposal persisting keys as secrets</agent_instructions>

Comments on the Issue (you are @copilot in this section)




netlify bot commented Mar 25, 2026

Deploy Preview for interlink-dev canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 346603a |
| 🔍 Latest deploy log | https://app.netlify.com/projects/interlink-dev/deploys/69c3e4b8af9f4d0008dd05db |

… drift on retries

When full-mesh offloading is enabled, generated WireGuard keypairs are now
stored in a Kubernetes Secret (`<name>-wg-keys`) in the wstunnel namespace.
Subsequent reconcile runs/retries read back the same keys instead of
generating new ones, preventing shadow/client WireGuard config drift.

- Add wgKeysSecretName() and ensureWGKeysSecret() in mesh.go
- Update createDummyPod() to call ensureWGKeysSecret() after key generation
- Update cleanupWstunnelResources() to delete the WG keys Secret on pod deletion

Co-authored-by: dciangot <4144326+dciangot@users.noreply.github.com>
Agent-Logs-Url: https://github.com/interlink-hq/interLink/sessions/515150a4-c5b6-47e4-b657-b4602bf0cc89
Copilot AI changed the title from "[WIP] Persist generated WireGuard keys for full-mesh pods to avoid config drift" to "Persist generated WireGuard keys as Kubernetes Secrets to prevent key drift on reconcile retries" on Mar 25, 2026
Copilot AI requested a review from dciangot March 25, 2026 13:36
Development

Successfully merging this pull request may close these issues.

Persist generated WireGuard keys for full-mesh pods to avoid shadow/client config drift on retries
