perf: drop AVIF + raise Cloud Run CPU/mem to fix 5.4s home LCP #442

Merged
mergify[bot] merged 1 commit into main from perf/440-drop-avif-cpu
Jun 1, 2026

@julianken

Owner

Diagrams

N/A — config-only change (two lines: next.config.ts image formats + deploy.yml Cloud Run flags); no architecture or data-flow change to illustrate.

Summary

Hotfix for a production incident: the home-page LCP is 5,406 ms (Google "poor" is > 4,000 ms). TTFB is fine (129 ms, prerender cache HIT); the slow element is a featured-post card hero image (PostCard → ThemeAwareHero) optimized on demand by /_next/image. The trace showed a 4,723 ms load with only 12 ms of download — so ~4.7 s is pure server-side encode wait on a --cpu=1 --memory=512Mi, CPU-throttled Cloud Run instance with an ephemeral optimizer cache and no CDN. images.formats lists AVIF first, the most CPU-expensive encode, which also starved sibling immutable JS chunks.
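
The trace arithmetic above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `ResourceTiming` shape that carries only the two figures quoted from the trace; it is not the DevTools data model.

```typescript
// Hypothetical shape: only the two figures quoted from the trace.
interface ResourceTiming {
  loadMs: number;     // total time from request start to response end
  downloadMs: number; // time spent receiving bytes
}

// Time not spent downloading, i.e. waiting on the server (here: the encode).
function serverWaitMs(t: ResourceTiming): number {
  return t.loadMs - t.downloadMs;
}

const heroImage: ResourceTiming = { loadMs: 4723, downloadMs: 12 };
const waitMs = serverWaitMs(heroImage);      // 4711
const waitShare = waitMs / heroImage.loadMs; // fraction of the load spent waiting

console.log(`${waitMs} ms waiting (${(waitShare * 100).toFixed(1)}% of load)`);
```

Almost the entire load is wait rather than transfer, which is why the fix targets encode cost and not byte size.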

Two config changes:

  1. Drop AVIF. In next.config.ts, images.formats: ['image/avif', 'image/webp'] → ['image/webp']. Sources are already WebP; WebP encodes ~2–4× faster on 1 vCPU, and the larger WebP bytes are irrelevant when download was 12 ms.
  2. Give the origin CPU + memory. In deploy.yml, Cloud Run flags: --cpu=1 --memory=512Mi → --cpu=2 --memory=1Gi. CPU throttling is left on (it still grants full CPU during the request, when the encode runs); --no-cpu-throttling is deliberately not added (it only adds 24/7 instance billing for no encode-latency benefit).
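
To illustrate why the order of images.formats matters: the Next.js docs state that when the request's Accept header matches more than one configured format, the first match in the array is used. The sketch below is a simplified model of that rule, not Next.js's actual implementation; `pickFormat` is a hypothetical helper, and the Accept string is an assumed example of an AVIF-capable browser.

```typescript
// Mirrors the ImageFormat union quoted from Next.js's image-config.ts.
type ImageFormat = 'image/avif' | 'image/webp';

// Simplified model (assumption): return the first configured format that the
// Accept header lists explicitly. Ignores q-values and wildcards.
function pickFormat(formats: ImageFormat[], accept: string): ImageFormat | undefined {
  const accepted = accept.split(',').map((part) => (part.split(';')[0] ?? '').trim());
  return formats.find((format) => accepted.includes(format));
}

// Assumed example header; real browsers vary.
const accept = 'image/avif,image/webp,image/apng,image/*,*/*;q=0.8';

const before: ImageFormat[] = ['image/avif', 'image/webp']; // old images.formats
const after: ImageFormat[] = ['image/webp'];                // new images.formats

console.log(pickFormat(before, accept)); // image/avif
console.log(pickFormat(after, accept));  // image/webp
```

Under this model, the same browser that previously triggered an AVIF encode now gets WebP, with no client-side change.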

Non-goals (unchanged here): no CDN (durable edge-cache tracked in #415), no --no-cpu-throttling, and no JSX/component edits — the home LCP image already sets fetchPriority="high"; this PR does not add priority/preload. All other deploy flags (--allow-unauthenticated, --min-instances=1, --max-instances=3, --port=8080, --timeout=60s) are untouched.

Closes #440

Screenshots

N/A — not UI. This is a build-config + deploy-flag change with no markup difference.

Test plan

Local gates run from the worktree:

  • pnpm lint: PASS (0 errors; 19 pre-existing warnings, none from this change).
  • pnpm typecheck (tsc --noEmit) — PASS.
  • pnpm test:unit (vitest run) — PASS (710/710 tests, 49 files).
  • pnpm build (next build) — PASS (exit 0; 67/67 static pages generated, full route table including /). Required NEXT_PUBLIC_SERVER_URL was supplied locally to satisfy the build-phase env guard (src/lib/env/required-env.ts), exactly as CI injects it.
  • pnpm test:e2e (playwright test) — DEFERRED-CI: globalSetup runs pnpm seed:test, which needs a Postgres DATABASE_URL not available in this sandbox. The four required E2E Shard x/4 checks (CI + Mergify-enforced) run on this PR. This is a config-only change with no JSX/behavioral edits, so E2E-exercised markup is unaffected.

Post-merge verification (to record here after deploy): re-run a Chrome DevTools trace of /; the LCP element is the first featured-post card image (PostCard → ThemeAwareHero). Expect LCP well under the prior ~5.4 s (target < ~2.5 s); note before/after.
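
The thresholds referenced in this description can be written down as a small check. This is a sketch assuming the standard Core Web Vitals LCP buckets (good at or under 2,500 ms, poor above 4,000 ms); `classifyLcp` is a hypothetical helper, not part of this repo.

```typescript
type LcpRating = 'good' | 'needs-improvement' | 'poor';

// Bucket an LCP measurement (milliseconds) into the Core Web Vitals ratings.
function classifyLcp(lcpMs: number): LcpRating {
  if (lcpMs <= 2500) return 'good';
  if (lcpMs <= 4000) return 'needs-improvement';
  return 'poor';
}

console.log(classifyLcp(5406)); // poor (the value measured before this change)
console.log(classifyLcp(2500)); // good (the upper edge of the target)
```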

Plan reference

Out of plan — prod incident: home-page LCP 5.4s hotfix; durable CDN follow-up #415.

🤖 Generated with Claude Code

Production home-page LCP is 5.4s. The LCP element is a featured-post card
hero image optimized on demand by /_next/image on a single-vCPU,
CPU-throttled Cloud Run instance with an ephemeral optimizer cache and no
CDN. The trace showed ~4.7s of pure server-side wait with only ~12ms of
download, so the cost is encode time, not bytes.

AVIF is listed first in images.formats and is the most CPU-expensive
encode. Sources are already WebP and WebP encodes ~2-4x faster on one
vCPU, so dropping AVIF removes the dominant per-request work; the larger
WebP bytes are irrelevant when download was 12ms. Raising the origin to
2 vCPU / 1Gi gives the optimizer headroom so a single encode no longer
pegs the CPU and stalls sibling immutable JS chunks.

CPU throttling is intentionally left on: throttling-on still grants full
CPU during a request (when the encode runs), whereas --no-cpu-throttling
only adds 24/7 instance billing for no encode-latency benefit. The
durable edge-cache (CDN) fix is tracked separately in #415.

Closes #440

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@julianken marked this pull request as ready for review on June 1, 2026 17:26

@julianken-bot left a comment

Collaborator

Verdict: APPROVE (code is correct; one non-code SUGGESTION on merge-readiness)

Clean two-line config hotfix. The diff matches issue #440's acceptance criteria exactly, the one load-bearing API value is the literal Next.js framework default (context7-verified), and no consumer-side coupling breaks from dropping AVIF.

Verification ledger (commands run this turn)

  • gh pr view 442 → 2 files, +2/-2, draft, mergeStateStatus: BLOCKED, head 13def5e (stable across two checks).
  • gh pr diff 442 → exactly two lines: deploy.yml flags + next.config.ts images.formats.
  • Read next.config.ts + full deploy.yml deploy step → new flag string is --cpu=2 --memory=1Gi; --allow-unauthenticated --min-instances=1 --max-instances=3 --port=8080 --timeout=60s unchanged; --no-cpu-throttling absent. grep for cpu-throttling across .github/workflows/ confirms no override anywhere.
  • context7 /vercel/next.js image-config.ts → imageConfigDefault.formats = ['image/webp'], type ImageFormat = 'image/avif' | 'image/webp'. The new value is the v16 default and a valid ImageFormat[], so there is no type/runtime breakage. (Installed: Next 16.2.6, sharp 0.34.5.)
  • statusCheckRollup → ESLint/TypeScript/Vitest/Next.js Build/Analyze Bundle/CodeQL all SUCCESS; all four E2E shards SKIPPED (gated on draft != true, e2e-tests.yml:29).
  • grep tests/ e2e/ src/ for AVIF / images.formats coupling → none beyond the (now-removed-in-#441) blur-placeholder concern; orthogonal. git log on both files → changed lines predate this PR (R7 clean).
  • R15: 0 mermaid blocks → skipped. R16: no UI source → skipped. R13 fired (.github/workflows/**): T4/T6/T7 clear; shadow-mode, non-verdict-affecting.

Findings (1)

SUGGESTION — PR body claims the E2E shards "run on this PR"; they're SKIPPED while the PR is a draft. No code change needed — marking the PR ready triggers the required shards so they actually run before merge.

Specific praise (not filler)

images.formats: ['image/webp'] lands exactly on the Next.js 16 default — the lowest-risk value — and not adding --no-cpu-throttling is correct: throttling-on still grants full CPU during request processing (when the encode runs), capturing the latency win without always-on billing.

Bottom line

APPROVE on the merits — code correct and verified against current docs and the linked issue. Marking ready so the required E2E shards run before merge.

@julianken-bot (opus, fresh context)

@julianken

Owner Author

@Mergifyio queue

@mergify

mergify Bot commented Jun 1, 2026

Contributor

Merge Queue Status

  • 🟠 Waiting for queue conditions
  • ⏳ Enter queue
  • ⏳ Run checks
  • ⏳ Merge
Required conditions to enter a queue
  • -closed [📌 queue requirement]
  • -conflict [📌 queue requirement]
  • -draft [📌 queue requirement]
  • any of [📌 queue -> configuration change requirements]:
    • -mergify-configuration-changed
    • check-success = Configuration changed
  • any of [🔀 queue conditions]:
    • all of [📌 queue conditions of queue rule default]:
      • #approved-reviews-by >= 1
      • #approved-reviews-by >= 1 [🛡 GitHub branch protection]
      • -conflict
      • -draft
      • base = main
      • check-success = Analyze Bundle
      • check-success = CodeQL Analysis
      • check-success = E2E Shard 1/4
      • check-success = E2E Shard 2/4
      • check-success = E2E Shard 3/4
      • check-success = E2E Shard 4/4
      • check-success = ESLint
      • check-success = Next.js Build
      • check-success = TypeScript
      • check-success = Vitest
      • github-review-decision = APPROVED [🛡 GitHub branch protection]
      • any of [🛡 GitHub branch protection]:
        • check-success = ESLint
        • check-neutral = ESLint
        • check-skipped = ESLint
      • any of [🛡 GitHub branch protection]:
        • check-success = TypeScript
        • check-neutral = TypeScript
        • check-skipped = TypeScript
      • any of [🛡 GitHub branch protection]:
        • check-success = Vitest
        • check-neutral = Vitest
        • check-skipped = Vitest
      • any of [🛡 GitHub branch protection]:
        • check-success = Next.js Build
        • check-neutral = Next.js Build
        • check-skipped = Next.js Build
      • any of [🛡 GitHub branch protection]:
        • check-success = Analyze Bundle
        • check-neutral = Analyze Bundle
        • check-skipped = Analyze Bundle
      • any of [🛡 GitHub branch protection]:
        • check-success = CodeQL Analysis
        • check-neutral = CodeQL Analysis
        • check-skipped = CodeQL Analysis
      • any of [🛡 GitHub branch protection]:
        • check-success = E2E Shard 1/4
        • check-neutral = E2E Shard 1/4
        • check-skipped = E2E Shard 1/4
      • any of [🛡 GitHub branch protection]:
        • check-success = E2E Shard 2/4
        • check-neutral = E2E Shard 2/4
        • check-skipped = E2E Shard 2/4
      • any of [🛡 GitHub branch protection]:
        • check-success = E2E Shard 3/4
        • check-neutral = E2E Shard 3/4
        • check-skipped = E2E Shard 3/4
      • any of [🛡 GitHub branch protection]:
        • check-success = E2E Shard 4/4
        • check-neutral = E2E Shard 4/4
        • check-skipped = E2E Shard 4/4

@mergify added the queued label Jun 1, 2026
@mergify

mergify Bot commented Jun 1, 2026

Contributor

Merge Queue Status

  • Entered queue · 2026-06-01 17:30 UTC · Rule: default
  • Checks skipped · PR is already up-to-date
  • Merged · 2026-06-01 17:31 UTC · at 13def5ecad025e0a3837778602619efdc43a017d · squash

This pull request spent 49 seconds in the queue, including 7 seconds running CI.

Required conditions to merge
  • #approved-reviews-by >= 1 [🛡 GitHub branch protection]
  • github-review-decision = APPROVED [🛡 GitHub branch protection]
  • any of [🛡 GitHub branch protection]:
    • check-success = ESLint
    • check-neutral = ESLint
    • check-skipped = ESLint
  • any of [🛡 GitHub branch protection]:
    • check-success = TypeScript
    • check-neutral = TypeScript
    • check-skipped = TypeScript
  • any of [🛡 GitHub branch protection]:
    • check-success = Vitest
    • check-neutral = Vitest
    • check-skipped = Vitest
  • any of [🛡 GitHub branch protection]:
    • check-success = Next.js Build
    • check-neutral = Next.js Build
    • check-skipped = Next.js Build
  • any of [🛡 GitHub branch protection]:
    • check-success = Analyze Bundle
    • check-neutral = Analyze Bundle
    • check-skipped = Analyze Bundle
  • any of [🛡 GitHub branch protection]:
    • check-success = CodeQL Analysis
    • check-neutral = CodeQL Analysis
    • check-skipped = CodeQL Analysis
  • any of [🛡 GitHub branch protection]:
    • check-success = E2E Shard 1/4
    • check-neutral = E2E Shard 1/4
    • check-skipped = E2E Shard 1/4
  • any of [🛡 GitHub branch protection]:
    • check-success = E2E Shard 2/4
    • check-neutral = E2E Shard 2/4
    • check-skipped = E2E Shard 2/4
  • any of [🛡 GitHub branch protection]:
    • check-success = E2E Shard 3/4
    • check-neutral = E2E Shard 3/4
    • check-skipped = E2E Shard 3/4
  • any of [🛡 GitHub branch protection]:
    • check-success = E2E Shard 4/4
    • check-neutral = E2E Shard 4/4
    • check-skipped = E2E Shard 4/4

@mergify merged commit 321c86a into main Jun 1, 2026
15 checks passed
@mergify deleted the perf/440-drop-avif-cpu branch June 1, 2026 17:31
@mergify removed the queued label Jun 1, 2026
@julianken

Owner Author

Post-deploy verification ✅

Re-traced / on prod after the Cloud Run deploy (succeeded 17:31, run 26771000341):

| Metric | Before | After |
| --- | --- | --- |
| LCP | 5,406 ms | 339 ms (≈16×) |
| Hero image load duration | 4,723 ms | 2 ms (warm) |
| TTFB | 129 ms | 72 ms |
| CLS | 0.00 | 0.00 |

Hero image content-type flipped image/avif → image/webp (confirmed via curl with an AVIF-accepting Accept header). The AVIF-encode-on-1-vCPU bottleneck is gone: a cold x-nextjs-cache: MISS now re-encodes WebP on 2 vCPU quickly, and warm hits are ~2 ms.

Durable edge-cache (so the first global request per asset also skips the re-encode) remains tracked in #415.
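
The content-type check described above could be scripted along these lines. This is a sketch under assumptions: `imageUrl` builds the `/_next/image` query (`url`, `w`, `q`) for a placeholder origin and source path, the width of 1200 and quality of 75 are assumed to be allowed by the app's image config, and `summarize` reads only the two headers mentioned in this comment.

```typescript
// Build an optimizer URL. The origin and source path passed in are placeholders.
function imageUrl(origin: string, src: string, width: number, quality = 75): string {
  const params = new URLSearchParams({ url: src, w: String(width), q: String(quality) });
  return `${origin}/_next/image?${params.toString()}`;
}

interface ImageCheck {
  contentType: string | null; // negotiated output format
  cache: string | null;       // value of x-nextjs-cache, e.g. HIT or MISS
}

// Pull out the two response headers of interest.
function summarize(headers: Headers): ImageCheck {
  return {
    contentType: headers.get('content-type'),
    cache: headers.get('x-nextjs-cache'),
  };
}

// Request the image while advertising AVIF support, as the curl check did.
async function checkHero(origin: string, src: string): Promise<ImageCheck> {
  const res = await fetch(imageUrl(origin, src, 1200), {
    headers: { Accept: 'image/avif,image/webp,*/*' },
  });
  return summarize(res.headers);
}
```

Calling `checkHero('https://example.com', '/images/hero.webp')` against a deployment (both arguments are placeholders) would be expected to report `image/webp` after this change, even though the request advertises AVIF.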

Development

Successfully merging this pull request may close these issues.

perf: fix 5.4s home LCP from on-demand image optimization

2 participants