refactor(test): derive test_seo sample from prerender + harden deploy smoke test #37

Merged
miquelmatoses merged 1 commit into main from refactor/test-seo-dynamic-sampling on May 28, 2026

Conversation

@miquelmatoses
Collaborator

Summary

Two reliability follow-ups from the 2026-05-26 audit.

1. test_seo.py sample derived from the prerender (no hardcoded slugs)

The audit found that SAMPLED_ROUTES contained three blog slugs that no longer existed. CI never caught it because these tests are skipped (via skipif) when dist/ is absent, and the CI backend job does not run build:full. PR #36 swapped the dead slugs by hand, but the fragile pattern remained.

Now the sample is derived from the prerendered dist/ at import time:

def _discover_routes() -> list[str]:
    if not _has_prerendered():
        return []
    routes = [r for r in TOP_LEVEL_CANDIDATES if _html_path(r).is_file()]
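    for lang in LANGS:
        blog_dir = DIST / "blog" if lang == "" else DIST / lang / "blog"
        if not blog_dir.is_dir():
            continue  # language not prerendered: skip it rather than let iterdir() raise
        slugs = sorted(p.name for p in blog_dir.iterdir()
                       if p.is_dir() and (p / "index.html").is_file())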
        if slugs:
            prefix = "blog" if lang == "" else f"{lang}/blog"
            routes.append(f"{prefix}/{slugs[0]}")
    return routes
  • Deterministic (alphabetical, first per category) so it fails identically for everyone.
  • Degrades gracefully: empty when dist/ is not built (skipif still fires), individual top-level pages skipped if missing instead of a hard FileNotFoundError.
  • Coverage 7 -> 13 routes (one blog article per language present).

Robustness gate (verified locally): removing the currently-sampled article from dist/ keeps the suite green, because the sample re-derives to the next existing article.

$ mv dist/blog/anonymity-...-why-it-matters /tmp/   # simulate vanished slug
$ pytest -q api/tests/test_seo.py
57 passed   # sample auto-advanced to big-five-personality-across-cultures-...
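
The re-derivation behind this gate can be reproduced in isolation. The following is a minimal sketch, not the code in test_seo.py: it assumes a `<route>/index.html` layout, and `LANGS`, `TOP_LEVEL_CANDIDATES`, `html_path`, `discover_routes`, and `make_page` are hypothetical stand-ins with made-up values. It builds a throwaway dist/, removes the sampled article, and derives the sample again.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-ins for the module-level constants in test_seo.py.
LANGS = ["", "ca"]
TOP_LEVEL_CANDIDATES = ["", "about"]


def html_path(dist: Path, route: str) -> Path:
    """Map a route to its prerendered file (assumed layout: <route>/index.html)."""
    return dist / route / "index.html" if route else dist / "index.html"


def discover_routes(dist: Path) -> list[str]:
    """Existing top-level pages plus the alphabetically first article per language."""
    if not dist.is_dir():
        return []  # dist/ not built: empty sample
    routes = [r for r in TOP_LEVEL_CANDIDATES if html_path(dist, r).is_file()]
    for lang in LANGS:
        blog_dir = dist / lang / "blog" if lang else dist / "blog"
        if not blog_dir.is_dir():
            continue  # this language has no prerendered blog
        slugs = sorted(p.name for p in blog_dir.iterdir()
                       if p.is_dir() and (p / "index.html").is_file())
        if slugs:
            prefix = f"{lang}/blog" if lang else "blog"
            routes.append(f"{prefix}/{slugs[0]}")
    return routes


def make_page(dist: Path, route: str) -> None:
    """Create an empty prerendered page for the given route."""
    page = html_path(dist, route)
    page.parent.mkdir(parents=True, exist_ok=True)
    page.write_text("<html></html>", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    dist = Path(tmp) / "dist"
    for route in ["", "blog/alpha", "blog/beta", "ca/blog/gamma"]:
        make_page(dist, route)
    before = discover_routes(dist)
    # Simulate a vanished slug: the sampled article loses its index.html.
    (dist / "blog" / "alpha" / "index.html").unlink()
    after = discover_routes(dist)
    unbuilt = discover_routes(Path(tmp) / "no-dist")

print(before)   # ['', 'blog/alpha', 'ca/blog/gamma']
print(after)    # ['', 'blog/beta', 'ca/blog/gamma']
print(unbuilt)  # []
```

The missing `about` page is skipped rather than raising, and the sample advances from `alpha` to `beta` once `alpha` disappears, which is the same behavior the shell session above exercises against the real dist/.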

2. Drift inventory (FASE 2)

| Location | Finding | Action |
| --- | --- | --- |
| api/tests/test_seo.py | 3 dead hardcoded blog slugs | Fixed: dynamic derivation |
| api/jobs/pagespeed_ingest.py SEED_URLS | 2 dead fallback blog slugs (PSI would measure 404 pages) | Fixed: replaced with live articles + rot-risk comment |
| scripts/update_blog_article_{2,3}.py | body cross-link to /blog/big-five-vs-disc-vs-belbin (dead) | Documented, not fixed: run-once historical generators; editing them does not change already-published content. This is content debt (a broken internal link in published articles), tracked separately, not test fragility |
| api/seo_mcp/, src/ | no hardcoded article slugs found | none |
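
A sweep of this kind could be partly automated. The sketch below is illustrative only and is not part of this PR: `BLOG_LINK`, `live_slugs`, and `find_dead_slugs` are hypothetical names, the `/blog/<slug>` URL shape is an assumption, and slugs are checked only against the default-language blog directory. It reports `/blog/<slug>` references in source files that have no matching prerendered article.

```python
import re
import tempfile
from pathlib import Path

# Assumed URL shape: /blog/<slug> with lowercase, hyphen-separated slugs.
BLOG_LINK = re.compile(r"/blog/([a-z0-9]+(?:-[a-z0-9]+)*)")


def live_slugs(dist: Path) -> set[str]:
    """Slugs that have a prerendered index.html under dist/blog/."""
    blog_dir = dist / "blog"
    if not blog_dir.is_dir():
        return set()
    return {p.name for p in blog_dir.iterdir() if (p / "index.html").is_file()}


def find_dead_slugs(sources: list[Path], dist: Path) -> dict[str, list[str]]:
    """Map each source file name to the referenced slugs missing from dist/."""
    live = live_slugs(dist)
    report: dict[str, list[str]] = {}
    for src in sources:
        referenced = set(BLOG_LINK.findall(src.read_text(encoding="utf-8")))
        dead = sorted(referenced - live)
        if dead:
            report[src.name] = dead
    return report


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    article = root / "dist" / "blog" / "alive-article"
    article.mkdir(parents=True)
    (article / "index.html").write_text("<html></html>", encoding="utf-8")
    seeds = root / "seeds.py"
    seeds.write_text(
        'SEED_URLS = ["https://example.test/blog/alive-article",\n'
        '             "https://example.test/blog/renamed-article"]\n',
        encoding="utf-8",
    )
    report = find_dead_slugs([seeds], root / "dist")

print(report)  # {'seeds.py': ['renamed-article']}
```

A check like this only makes sense where dist/ is fully built, for the same reason the test_seo sample is empty when it is not.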

3. Backend deploy smoke test hardened

PR #36's backend deploy went green on the server (Caddy validated, service active, api.cercol.team/blog 200, crawl_logs flowing), but the Action failed: the smoke test waited only ~10s (5 x 2s), while uvicorn took ~31s to boot two workers on the load-saturated shared VPS (the YELLOW 6 audit finding). The failure was a false negative. The window is now 20 x 3s, i.e. up to 60s.
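
The arithmetic can be checked with a small simulation. This is a sketch of the polling logic only, written in Python for illustration; the real smoke test lives in the deploy workflow and may be implemented differently. `wait_until_healthy` and `FakeClock` are hypothetical names, and the ~31s boot time is taken from the incident described above.

```python
import time
from typing import Callable


def wait_until_healthy(probe: Callable[[], bool], attempts: int, interval: float,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll probe() up to `attempts` times, sleeping `interval` after each failure."""
    for _ in range(attempts):
        if probe():
            return True
        sleep(interval)
    return False


class FakeClock:
    """Deterministic clock so the example runs instantly."""

    def __init__(self) -> None:
        self.now = 0.0

    def sleep(self, seconds: float) -> None:
        self.now += seconds


BOOT_SECONDS = 31.0  # observed uvicorn cold start, per the PR description

old_clock = FakeClock()
old_ok = wait_until_healthy(lambda: old_clock.now >= BOOT_SECONDS,
                            attempts=5, interval=2.0, sleep=old_clock.sleep)

new_clock = FakeClock()
new_ok = wait_until_healthy(lambda: new_clock.now >= BOOT_SECONDS,
                            attempts=20, interval=3.0, sleep=new_clock.sleep)

print(old_ok, old_clock.now)  # False 10.0
print(new_ok, new_clock.now)  # True 33.0
```

Under these assumptions the old window gives up after 10s, while the new one succeeds on the probe at 33s and would still report failure after 60s if the service never came up.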

Test plan

  • pytest -q api/tests/test_seo.py (full dist) — 57 passed
  • Robustness gate — remove sampled article, still 57 passed
  • pytest -q api/ — 176 passed
  • vitest run — 204 passed
  • ruff — covered by CI (not installed locally)
  • CI frontend job runs build:full + test_seo against the complete dist (authoritative gate)

No production behavior change. Test + CI-config only; the auto-triggered backend deploy should now pass its (longer) smoke test.

🤖 Generated with Claude Code

…smoke test

test_seo.py no longer hardcodes blog slugs that rot when content is
renamed (the audit found three dead slugs that CI never caught because it
skips when dist/ is absent). SAMPLED_ROUTES is now derived from the
prerendered dist/ at import time: every existing top-level page plus the
first blog article alphabetically for each language present. The
selection is deterministic and degrades gracefully (empty when dist/ is
not built, individual pages skipped if missing), so a vanished article
can never break the suite again. Coverage went from 7 to 13 routes (one
article per language). Robustness gate: removing the sampled article from
dist/ keeps the suite green because the sample re-derives from what
remains.

FASE 2 drift sweep also fixed pagespeed_ingest.py SEED_URLS, whose two
fallback blog slugs were dead (PSI would have measured 404 pages);
replaced with live articles and a comment on the rot risk. Remaining
known drift documented in the PR (historical update_blog_article_*.py
scripts cross-link a non-existent slug; that is content debt, not test
fragility, and editing the run-once scripts would not change published
content).

Also hardens the backend deploy smoke test: the 2026-05-28 deploy of
PR #36 went green on the server (Caddy valid, service active, /blog 200)
but the Action failed because the smoke test only waited ~10s while
uvicorn took ~31s to boot two workers on the load-saturated shared VPS.
The window is now 20 x 3s = up to 60s, which tolerates the cold start
without masking a real outage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@miquelmatoses miquelmatoses merged commit 3c477a5 into main May 28, 2026
10 of 11 checks passed
@miquelmatoses miquelmatoses deleted the refactor/test-seo-dynamic-sampling branch May 28, 2026 20:46