fix(audit): restore Caddy access log, Decimal 500, og:title per page#36

Merged
miquelmatoses merged 1 commit into main from fix/audit-2026-05-26
May 28, 2026

Conversation

@miquelmatoses
Collaborator

Summary

Fixes the three reds and supporting yellows from the 2026-05-26 read-only audit. Single branch, all changes covered by tests.

Reds

  • RED 1 — crawl_logs pipeline frozen. The api.cercol.team Caddy snippet lost its log directive during the 2026-05-20 conf.d migration, silently freezing /var/log/caddy/cercol_api_access.log (last write 2026-05-11) and starving crawl_log_parser. Restored the JSON access-log block with a bounded native roller. Validated on the server in /tmp (never touching /etc/caddy); installed by the normal backend deploy via caddy validate + reload.
  • RED 2 — HTTP 500 on /admin/results. asyncpg returns decimal.Decimal for NUMERIC columns; mixing with the float norm mean/sd raised unsupported operand type(s) for -: 'decimal.Decimal' and 'float'. Cast to float at the arithmetic site in _scores_to_zscores. +2 regression tests.
  • RED 3 — generic og:title on top-level pages. usePageMeta never mutated og:*/twitter:*, so /about/, /science/, etc. shipped the home's generic og:title. Now mutates the existing tags via setAttribute (count stays at one), defaults to title/description, restores on unmount. +2 regression guards in test_seo.py.
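The RED 2 failure mode and the cast can be reproduced in isolation. This is a minimal sketch, not the project's `_scores_to_zscores`: the function name `scores_to_zscores`, the `NORMS` table, and the `dim_a` key are hypothetical stand-ins, and the `Decimal` value only imitates what asyncpg hands back for a NUMERIC column.

```python
from decimal import Decimal

# Hypothetical norm table: key -> (mean, sd), both plain floats.
NORMS = {"dim_a": (3.2, 0.6)}


def scores_to_zscores(scores, norms=NORMS):
    """Convert raw scores to z-scores, tolerating Decimal inputs."""
    zscores = {}
    for key, raw in scores.items():
        mean, sd = norms[key]
        # Decimal - float raises TypeError, so cast at the arithmetic site.
        zscores[key] = (float(raw) - mean) / sd
    return zscores


# Imitates a row value returned for a NUMERIC column.
row = {"dim_a": Decimal("3.80")}
z = scores_to_zscores(row)

# Without the cast, the same subtraction fails.
try:
    Decimal("3.80") - 3.2
    uncast_failed = False
except TypeError:
    uncast_failed = True
```

Casting where the arithmetic happens, rather than where the row is fetched, keeps the fix local and leaves the stored precision untouched for any other consumer of the row.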

Yellows

Test plan

  • pytest api/tests/ — 149 passed
  • pytest api/tests/test_seo.py against fresh build:full dist — 30 passed (incl. og:title guards)
  • vitest run — 204 passed
  • Caddy snippet caddy validate on server /tmp — Valid configuration
  • Post-merge: verify crawl_logs gets fresh rows after deploy + a fresh curl to api.cercol.team
  • Post-merge: confirm /about/ /science/ /instruments/ /roles/ ship page-specific og:title
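A guard for the og:title condition might look like the sketch below. It is an illustration under assumptions, not the code in `test_seo.py`: the helper names and the sample HTML are invented here, and it checks only the two properties the summary describes (exactly one `og:title` per page, and a value that differs from the home page's).

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect the attributes of every <meta> tag in a document."""

    def __init__(self):
        super().__init__()
        self.metas = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also reach this handler.
        if tag == "meta":
            self.metas.append(dict(attrs))


def og_values(html, prop):
    """Return the content of every <meta property=prop> tag, in order."""
    parser = MetaCollector()
    parser.feed(html)
    return [m.get("content") for m in parser.metas if m.get("property") == prop]


def has_page_specific_og_title(page_html, home_html):
    """True when the page has exactly one og:title and it differs from home's."""
    page = og_values(page_html, "og:title")
    home = og_values(home_html, "og:title")
    return len(page) == 1 and len(home) == 1 and page[0] != home[0]


# Invented sample documents, for illustration only.
HOME = '<head><meta property="og:title" content="Home"></head>'
ABOUT = '<head><meta property="og:title" content="About" /></head>'
STALE = '<head><meta property="og:title" content="Home"></head>'
```

Checking the count as well as the value catches a regression where a fix appends a second tag instead of mutating the existing one.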

🤖 Generated with Claude Code

…er page

Three reds and supporting yellows from the 2026-05-26 read-only audit:

- RED 1 (crawl_logs frozen): the api.cercol.team Caddy snippet lost its
  `log` directive in the 2026-05-20 conf.d migration, freezing
  /var/log/caddy/cercol_api_access.log since 2026-05-11 and breaking the
  crawl_log_parser ingest. Restore the JSON access-log block with a
  bounded native roller. Validated on the server in /tmp (not touching
  /etc/caddy).
- RED 2 (HTTP 500 on /admin/results): asyncpg returns decimal.Decimal for
  NUMERIC columns; mixing it with the float norm mean/sd raised a
  TypeError. Cast the raw score to float at the arithmetic site in
  _scores_to_zscores. Two regression tests added.
- RED 3 (generic og:title on top-level pages): usePageMeta never mutated
  og:*/twitter:*, so /about/, /science/, etc. shipped the home's generic
  og:title. Mutate the existing tags via setAttribute (count stays at
  one), default to title/description, restore on unmount. Two regression
  guards added in test_seo.py.

Yellows:
- YELLOW 4: the test_seo.py SAMPLED_ROUTES referenced three blog slugs
  that no longer exist (only surfaced with a full prerender; CI skips
  these without build:full). Replaced with real, existing articles.
- YELLOW 5: docs and cron files referenced cercol-sa.json; the real key
  file is cercol-seo-ingest.json. Corrected everywhere.
- YELLOW 6/7: VPS load ~16 on 2 cores (TopQuaranta/postgres-driven, not
  cercol) and mobile LCP ~6s documented as ROADMAP Phase 17.9; neither is
  a quick win.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@miquelmatoses miquelmatoses merged commit 65f74a3 into main May 28, 2026
7 checks passed
@miquelmatoses miquelmatoses deleted the fix/audit-2026-05-26 branch May 28, 2026 18:36
miquelmatoses added a commit that referenced this pull request May 28, 2026
…smoke test (#37)

test_seo.py no longer hardcodes blog slugs that rot when content is
renamed (the audit found three dead slugs that CI never caught because it
skips when dist/ is absent). SAMPLED_ROUTES is now derived from the
prerendered dist/ at import time: every existing top-level page plus the
first blog article alphabetically for each language present. The
selection is deterministic and degrades gracefully (empty when dist/ is
not built, individual pages skipped if missing), so a vanished article
can never break the suite again. Coverage went from 7 to 13 routes (one
article per language). Robustness gate: removing the sampled article from
dist/ keeps the suite green because the sample re-derives from what
remains.
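
The derivation described above can be sketched as follows. The directory layout (`dist/<page>/index.html` for top-level pages, `dist/blog/<lang>/<slug>/index.html` for articles), the `TOP_LEVEL` list, and the function name are assumptions made for illustration; the real layout and selection code may differ.

```python
from pathlib import Path

# Assumed top-level pages; the actual list lives in the test suite.
TOP_LEVEL = ["about", "science", "instruments", "roles"]


def derive_sampled_routes(dist, top_level=TOP_LEVEL):
    """Build a deterministic route sample from whatever dist/ contains."""
    dist = Path(dist)
    if not dist.is_dir():
        return []  # dist/ not built: empty sample, nothing to fail on

    # Keep only the top-level pages that were actually prerendered.
    routes = [f"/{page}/" for page in top_level
              if (dist / page / "index.html").is_file()]

    blog = dist / "blog"
    if blog.is_dir():
        for lang_dir in sorted(d for d in blog.iterdir() if d.is_dir()):
            slugs = sorted(
                s.name for s in lang_dir.iterdir()
                if s.is_dir() and (s / "index.html").is_file()
            )
            if slugs:
                # First article alphabetically for this language.
                routes.append(f"/blog/{lang_dir.name}/{slugs[0]}/")
    return routes
```

Because the sample is recomputed from the directory contents, deleting the sampled article only shifts the choice to the next slug rather than leaving a dead route in the list.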

The Phase 2 drift sweep also fixed pagespeed_ingest.py SEED_URLS, whose two
fallback blog slugs were dead (PSI would have measured 404 pages);
replaced with live articles and a comment on the rot risk. Remaining
known drift documented in the PR (historical update_blog_article_*.py
scripts cross-link a non-existent slug; that is content debt, not test
fragility, and editing the run-once scripts would not change published
content).

Also hardens the backend deploy smoke test: the 2026-05-28 deploy of
PR #36 went green on the server (Caddy valid, service active, /blog 200)
but the Action failed because the smoke test only waited ~10s while
uvicorn took ~31s to boot two workers on the load-saturated shared VPS.
The window is now 20 x 3s = up to 60s, which tolerates the cold start
without masking a real outage.
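
A bounded polling loop of this shape might look like the sketch below. It is written in Python for illustration only; the deploy Action's actual smoke test is not shown in this PR, and the function names, the injectable `fetch` and `sleep` parameters, and the 5-second request timeout are assumptions.

```python
import time
import urllib.request


def _http_status(url):
    """Return the HTTP status for url; non-2xx responses raise HTTPError."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status


def wait_until_healthy(url, attempts=20, delay=3.0, fetch=_http_status,
                       sleep=time.sleep):
    """Poll url until it answers 200; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        try:
            if fetch(url) == 200:
                return attempt
        except OSError:
            # URLError, HTTPError, and refused connections all derive from
            # OSError; treat them as "not up yet" and keep polling.
            pass
        if attempt < attempts:
            sleep(delay)  # no pointless sleep after the final attempt
    return None
```

With the defaults the loop gives up after 20 attempts spaced 3 seconds apart, so a slow cold start passes while a service that never comes up still fails the deploy within about a minute.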

Co-authored-by: miquelmatoses <miquelmatoses@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>