infra(cdn): put Cloudflare in front of Cloud Run for AI Bot Activity + edge TTFB #415

Description

@julianken

Context

detached-node.dev serves directly from Google Cloud Run (us-west1) with no CDN. Two consequences:

  1. No AI Bot Activity visibility in Microsoft Clarity. Clarity's Bot Activity dashboard (which tracks GPTBot, ClaudeBot, PerplexityBot, CCBot, etc., per-page and per-frequency) requires a connected CDN — Fastly, CloudFront, or Cloudflare. Currently invisible: we cannot see which AI crawlers are hitting which pages or how often. See https://learn.microsoft.com/en-us/clarity/ai-visibility/bot-activity-overview.
  2. Global TTFB is suboptimal. Cloud Run runs only in us-west1, so EU/APAC visitors pay a full intercontinental round trip on every uncached request.

Plan and cost. The Clarity ↔ Cloudflare integration uses Cloudflare LogPush, which Microsoft's own cost-considerations doc notes is "typically available on paid plans and might have usage-based pricing" (https://learn.microsoft.com/en-us/clarity/ai-visibility/cost-considerations-bot-activity-integrations). LogPush to HTTPS destinations requires the Cloudflare Pro plan (~$25/month) or higher. The Free tier covers the proxy, TTFB, and DDoS half, but not the AI Bot Activity unlock. This issue assumes Pro and budgets accordingly.

Acceptance criteria

  • Cloudflare account with a zone for detached-node.dev on Pro plan (or higher if needed for LogPush destination)
  • DNS migration to Cloudflare nameservers (required — Cloudflare must be authoritative for the zone; current Vercel nameservers cannot proxy through Cloudflare):
    • Mirror all existing Vercel DNS records at Cloudflare before flipping NS: A/AAAA for apex + any subdomains, MX records, all TXT records (SPF, DKIM, DMARC, domain-verification, sitemap-verification, any other vendor-verification TXTs)
    • Verify mirrored records with dig @<cloudflare-ns> <record> before NS change
    • Flip nameservers at the registrar to Cloudflare; confirm propagation
  • Cloudflare proxy enabled (orange cloud) for the apex
  • SSL/TLS mode set to Full (strict) to maintain end-to-end TLS to Cloud Run
  • Cloud Run origin lockdown (in scope — protects the integrity of the Bot Activity numbers; the *.run.app hostname is discoverable via certificate transparency logs, so bots that find it can bypass Cloudflare entirely):
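    • Restrict the Cloud Run service to Cloudflare IP ranges (https://www.cloudflare.com/ips/). Cloud Run's ingress setting does not accept an IP allowlist by itself, so this means either an external load balancer with a Cloud Armor policy in front of the service or an in-app middleware check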
    • Verify: curl -v https://<service>-<hash>-uw.a.run.app/ from a non-Cloudflare IP returns 403 or times out
    • Verify: curl -sI https://detached-node.dev/ (via Cloudflare) still returns 200
  • CDN connected to Microsoft Clarity per https://learn.microsoft.com/en-us/clarity/ai-visibility/bot-activity-overview — verify Bot Activity dashboard populates within 48 hours of LogPush activation
  • Budget AC: total monthly Cloudflare spend ≤ $30/mo (Pro is ~$25/mo; LogPush usage adds a small variable). If projected cost exceeds the ceiling, pause and re-decide.
  • Production verification:
    • curl -sI https://detached-node.dev/ shows server: cloudflare and a cf-ray header
    • curl https://detached-node.dev/sitemap.xml returns the sitemap intact
    • curl https://detached-node.dev/robots.txt returns robots.txt intact
    • Multiple routes render correctly in a browser
  • Document the Cloudflare account/zone ID, plan, LogPush configuration, and origin-lockdown method in docs/deployment.md
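
The "mirror all existing DNS records before flipping NS" criterion above can be checked mechanically rather than by eye. The following is a minimal sketch, not a finished tool: it assumes the records from both sides have already been collected (for example by pasting `dig` answers) into `(name, type, value)` tuples, and the function names `index_records` and `diff_records` are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Record = Tuple[str, str, str]  # (name, type, value), e.g. ("detached-node.dev", "TXT", "v=spf1 -all")


def index_records(records: Iterable[Record]) -> Dict[Tuple[str, str], Set[str]]:
    """Group record values by (name, type), normalizing case and the trailing dot on names."""
    index: Dict[Tuple[str, str], Set[str]] = {}
    for name, rtype, value in records:
        key = (name.lower().rstrip("."), rtype.upper())
        # Values are only whitespace-trimmed: TXT payloads can be case-sensitive.
        index.setdefault(key, set()).add(value.strip())
    return index


def diff_records(
    source: Iterable[Record], mirror: Iterable[Record]
) -> Tuple[List[Record], List[Record]]:
    """Return (missing, unexpected).

    missing:    records present at the current DNS host but absent from the mirror.
    unexpected: records present in the mirror but not at the current DNS host.
    """
    src, dst = index_records(source), index_records(mirror)
    missing = sorted(
        (name, rtype, value)
        for (name, rtype), values in src.items()
        for value in values - dst.get((name, rtype), set())
    )
    unexpected = sorted(
        (name, rtype, value)
        for (name, rtype), values in dst.items()
        for value in values - src.get((name, rtype), set())
    )
    return missing, unexpected
```

An empty `missing` list is the signal that it is safe to proceed to the NS flip. One caveat: once a record is proxied (orange cloud), Cloudflare's nameservers answer A/AAAA queries with Cloudflare edge addresses rather than the origin, so compare those records while they are still DNS-only or check them in the dashboard instead.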

Out of scope

  • Cloudflare Workers, R2, D1 — proxy + DNS + LogPush only
  • Page Rules / Transform Rules tuning — baseline only; defer optimization
  • WAF rules beyond Pro-tier defaults — defer
  • Plan upgrades beyond Pro unless LogPush destination requires it (revisit if so)

Plain-English explanation

  1. What: Put Cloudflare in front of the site so AI bot traffic becomes visible in Clarity's Bot Activity dashboard and global users get faster page loads. Also lock down the Cloud Run origin so bots can't bypass the CDN.
  2. Why now: Bot Activity is the biggest blind spot in AI-discovery measurement — without a CDN, we cannot see which AI crawlers visit. Without origin lockdown, the numbers we do see would be incomplete by an unknown margin.
  3. Cost: ~$25–30/mo on Cloudflare Pro for the LogPush feature that powers the Clarity integration. Free tier is insufficient per Microsoft's own docs.
  4. Risk: Medium during the DNS migration window (brief but visible), mitigated by mirroring all records at Cloudflare before flipping NS. The Cloudflare proxy is expected to add <50ms to first byte on cache misses while improving p99 for visitors far from us-west1. Worst case: revert by disabling the orange-cloud proxy and flipping NS back to Vercel.
  5. Validation: Bot Activity dashboard populates within 48 hours of LogPush activation. cf-ray header present in production responses. Non-Cloudflare requests to the *.run.app origin return 403/timeout.
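
The header part of the validation step can be scripted so it is repeatable after every DNS or proxy change. This is a small sketch under stated assumptions: `check_edge_headers` and `fetch_headers` are hypothetical helper names, and the only checks made are the two the acceptance criteria name (`server: cloudflare` and a non-empty `cf-ray`).

```python
import urllib.request
from typing import Dict, List, Mapping


def check_edge_headers(headers: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the response came through Cloudflare."""
    # HTTP header names are case-insensitive, so compare on lowercased keys.
    lowered = {key.lower(): value for key, value in headers.items()}
    problems: List[str] = []
    server = lowered.get("server", "").strip().lower()
    if server != "cloudflare":
        problems.append(f"server header is {server!r}, expected 'cloudflare'")
    if not lowered.get("cf-ray", "").strip():
        problems.append("cf-ray header is missing or empty")
    return problems


def fetch_headers(url: str, timeout: float = 10.0) -> Dict[str, str]:
    """Issue a HEAD request (like `curl -sI`) and return the response headers.

    Note: urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())
```

Calling `check_edge_headers(fetch_headers("https://detached-node.dev/"))` after the cutover should return an empty list; any entries in the list describe which of the two checks failed.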

Note: This is primarily an ops task — DNS migration + dashboard setup + Clarity integration + origin lockdown — with minimal code changes. An implementer subagent may help with the docs/deployment.md update and any Cloud Run middleware for IP filtering; the DNS/dashboard steps are Julian-direct.
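
For the in-app middleware option mentioned in the note, the core check is small. The sketch below is illustrative only and makes several assumptions: the CIDR list is a short example subset (the real list should be loaded from https://www.cloudflare.com/ips/ and refreshed periodically), the function names are hypothetical, and it assumes the platform appends the connecting peer as the last `X-Forwarded-For` entry, which should be confirmed against Cloud Run's actual behavior before relying on it.

```python
import ipaddress
from typing import Iterable, List, Optional, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# ASSUMPTION: illustrative subset only. Load the full, current list from
# https://www.cloudflare.com/ips/ in a real deployment.
EXAMPLE_CLOUDFLARE_CIDRS = [
    "173.245.48.0/20",
    "104.16.0.0/13",
    "172.64.0.0/13",
    "2606:4700::/32",
]


def parse_networks(cidrs: Iterable[str]) -> List[Network]:
    """Parse CIDR strings into network objects, skipping blank lines."""
    return [ipaddress.ip_network(cidr.strip()) for cidr in cidrs if cidr.strip()]


def connecting_ip(x_forwarded_for: Optional[str]) -> Optional[str]:
    """Pick the entry treated as the connecting peer.

    ASSUMPTION: the last entry is appended by the platform and cannot be
    spoofed by the client; earlier entries are client-supplied.
    """
    if not x_forwarded_for:
        return None
    parts = [part.strip() for part in x_forwarded_for.split(",") if part.strip()]
    return parts[-1] if parts else None


def is_from_cloudflare(ip_text: Optional[str], networks: Iterable[Network]) -> bool:
    """True only if ip_text parses as an IP address inside one of the networks."""
    if not ip_text:
        return False
    try:
        address = ipaddress.ip_address(ip_text)
    except ValueError:
        return False  # malformed header value: fail closed
    return any(address in network for network in networks)


def status_for(x_forwarded_for: Optional[str], networks: Iterable[Network]) -> int:
    """HTTP status the middleware would answer with: 200 to pass through, 403 to block."""
    return 200 if is_from_cloudflare(connecting_ip(x_forwarded_for), networks) else 403
```

The check fails closed: a missing or malformed header yields 403, which matches the acceptance criterion that non-Cloudflare requests to the *.run.app origin are rejected. Wiring `status_for` into the actual framework's middleware hook is left out because the issue does not name the framework.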

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), needs:human (Bot review iteration cap hit; human attention required), status:todo (Ready to start)
