fix(audit): restore Caddy access log, Decimal 500, og:title per page #36
Merged
…er page

Three reds and supporting yellows from the 2026-05-26 read-only audit:

- RED 1 (crawl_logs frozen): the api.cercol.team Caddy snippet lost its `log` directive in the 2026-05-20 conf.d migration, freezing /var/log/caddy/cercol_api_access.log since 2026-05-11 and breaking the crawl_log_parser ingest. Restore the JSON access-log block with a bounded native roller. Validated on the server in /tmp (not touching /etc/caddy).
- RED 2 (HTTP 500 on /admin/results): asyncpg returns decimal.Decimal for NUMERIC columns; mixing it with the float norm mean/sd raised a TypeError. Cast the raw score to float at the arithmetic site in _scores_to_zscores. Two regression tests added.
- RED 3 (generic og:title on top-level pages): usePageMeta never mutated og:*/twitter:*, so /about/, /science/, etc. shipped the home's generic og:title. Mutate the existing tags via setAttribute (count stays at one), default to title/description, restore on unmount. Two regression guards added in test_seo.py.

Yellows:

- YELLOW 4: the test_seo.py SAMPLED_ROUTES referenced three blog slugs that no longer exist (only surfaced with a full prerender; CI skips these without build:full). Replaced with real, existing articles.
- YELLOW 5: docs and cron files referenced cercol-sa.json; the real key file is cercol-seo-ingest.json. Corrected everywhere.
- YELLOW 6/7: VPS load ~16 on 2 cores (TopQuaranta/postgres-driven, not cercol) and mobile LCP ~6s documented as ROADMAP Phase 17.9; neither is a quick win.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
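The RED 2 failure can be reproduced without a database. The following is a minimal sketch, assuming a simplified `scores_to_zscores` helper (hypothetical; the real `_scores_to_zscores` signature is not shown in this PR) and made-up norm values:

```python
from decimal import Decimal


def scores_to_zscores(raw_scores, norm_mean, norm_sd):
    """Hypothetical stand-in for _scores_to_zscores.

    raw_scores may hold decimal.Decimal values (what asyncpg returns for
    NUMERIC columns); norm_mean and norm_sd are plain floats.
    """
    # Cast at the arithmetic site so Decimal never meets float.
    return [(float(score) - norm_mean) / norm_sd for score in raw_scores]


# Illustrative values only, not real norms.
raw = [Decimal("3.50"), Decimal("4.25")]

error_message = None
try:
    raw[0] - 3.0  # the pre-fix arithmetic: Decimal minus float
except TypeError as exc:
    error_message = str(exc)

zscores = scores_to_zscores(raw, norm_mean=3.0, norm_sd=0.5)
```

Casting at the point of arithmetic, rather than in the query layer, keeps the fix local to the one function that mixes the two numeric types.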
6 tasks
miquelmatoses added a commit that referenced this pull request on May 28, 2026
…smoke test (#37)

test_seo.py no longer hardcodes blog slugs that rot when content is renamed (the audit found three dead slugs that CI never caught because it skips when dist/ is absent). SAMPLED_ROUTES is now derived from the prerendered dist/ at import time: every existing top-level page plus the first blog article alphabetically for each language present. The selection is deterministic and degrades gracefully (empty when dist/ is not built, individual pages skipped if missing), so a vanished article can never break the suite again. Coverage went from 7 to 13 routes (one article per language). Robustness gate: removing the sampled article from dist/ keeps the suite green because the sample re-derives from what remains.

FASE 2 drift sweep also fixed pagespeed_ingest.py SEED_URLS, whose two fallback blog slugs were dead (PSI would have measured 404 pages); replaced with live articles and a comment on the rot risk. Remaining known drift documented in the PR (historical update_blog_article_*.py scripts cross-link a non-existent slug; that is content debt, not test fragility, and editing the run-once scripts would not change published content).

Also hardens the backend deploy smoke test: the 2026-05-28 deploy of PR #36 went green on the server (Caddy valid, service active, /blog 200) but the Action failed because the smoke test only waited ~10s while uvicorn took ~31s to boot two workers on the load-saturated shared VPS. The window is now 20 x 3s = up to 60s, which tolerates the cold start without masking a real outage.

Co-authored-by: miquelmatoses <miquelmatoses@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
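The SAMPLED_ROUTES derivation described in the commit message above could look roughly like this. It is a sketch only: the `dist/` layout (`<page>/index.html` and `<lang>/blog/<slug>/index.html`), the `TOP_LEVEL_PAGES` list, and the `derive_sampled_routes` name are assumptions, not taken from test_seo.py.

```python
from __future__ import annotations

from pathlib import Path

# Assumed top-level pages; the real list lives in test_seo.py.
TOP_LEVEL_PAGES = ["about", "science", "instruments", "roles"]


def derive_sampled_routes(dist: Path) -> list[str]:
    """Build the sample from whatever the prerender actually produced."""
    if not dist.is_dir():
        return []  # dist/ not built: empty sample, nothing to parametrize

    routes = []
    for page in TOP_LEVEL_PAGES:
        # Individual pages are skipped when missing instead of failing.
        if (dist / page / "index.html").is_file():
            routes.append(f"/{page}/")

    # Assumed layout: dist/<lang>/blog/<slug>/index.html
    for lang_dir in sorted(p for p in dist.iterdir() if p.is_dir()):
        blog = lang_dir / "blog"
        if not blog.is_dir():
            continue
        slugs = sorted(
            p.name for p in blog.iterdir() if (p / "index.html").is_file()
        )
        if slugs:
            # First article alphabetically keeps the choice deterministic.
            routes.append(f"/{lang_dir.name}/blog/{slugs[0]}/")
    return routes
```

Because the sample is recomputed from the directory contents, deleting the sampled article simply promotes the next slug in alphabetical order, which is the behaviour the robustness gate checks.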
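The widened smoke-test window from the commit message above amounts to a bounded polling loop. The sketch below is written under assumptions: the `wait_until_up` and `http_ok` names and the use of `urllib` are illustrative, and the actual deploy Action is not shown in this PR.

```python
import time
import urllib.request


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True when the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False


def wait_until_up(probe, attempts: int = 20, delay: float = 3.0,
                  sleep=time.sleep) -> bool:
    """Call probe() up to `attempts` times, sleeping `delay` s after each miss.

    With the defaults the worst case is 20 x 3 s = 60 s of waiting, which
    covers a ~31 s cold start but still fails on a real outage.
    """
    for _ in range(attempts):
        if probe():
            return True
        sleep(delay)
    return False
```

A deploy step might then call something like `wait_until_up(lambda: http_ok(url))` with the service's health URL and fail the job when it returns `False`. Passing `sleep` as a parameter lets a test substitute a fake and run instantly.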
Summary
Fixes the three reds and supporting yellows from the 2026-05-26 read-only audit. Single branch, all changes covered by tests.
Reds
- crawl_logs frozen. The `api.cercol.team` Caddy snippet lost its `log` directive during the 2026-05-20 conf.d migration, silently freezing `/var/log/caddy/cercol_api_access.log` (last write 2026-05-11) and starving `crawl_log_parser`. Restored the JSON access-log block with a bounded native roller. Validated on the server in `/tmp` (never touching `/etc/caddy`); installed by the normal backend deploy via `caddy validate` + reload.
- HTTP 500 on `/admin/results`. asyncpg returns `decimal.Decimal` for NUMERIC columns; mixing it with the float norm `mean`/`sd` raised `unsupported operand type(s) for -: 'decimal.Decimal' and 'float'`. Cast to `float` at the arithmetic site in `_scores_to_zscores`. +2 regression tests.
- Generic `og:title` on top-level pages. `usePageMeta` never mutated `og:*`/`twitter:*`, so `/about/`, `/science/`, etc. shipped the home's generic og:title. Now mutates the existing tags via `setAttribute` (count stays at one), defaults to title/description, restores on unmount. +2 regression guards in `test_seo.py`.

Yellows
- `test_seo.py` `SAMPLED_ROUTES` pointed at three blog slugs that no longer exist (only fails under a full prerender; CI skips without `build:full`). Replaced with real articles.
- Docs and cron files referenced `cercol-sa.json`; the real key is `cercol-seo-ingest.json`. Corrected everywhere.

Test plan
- `pytest api/tests/`: 149 passed
- `pytest api/tests/test_seo.py` against fresh `build:full` dist: 30 passed (incl. og:title guards)
- `vitest run`: 204 passed
- `caddy validate` on server `/tmp`: Valid configuration
- `crawl_logs` gets fresh rows after deploy + a fresh curl to api.cercol.team
- `/about/` `/science/` `/instruments/` `/roles/` ship page-specific og:title

🤖 Generated with Claude Code
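A guard along the lines of the og:title checks in the test plan could be sketched as follows. The `og_titles` helper, the sample HTML, and the generic title string are invented for illustration; the real guards in `test_seo.py` read the prerendered pages and are not reproduced here.

```python
from __future__ import annotations

from html.parser import HTMLParser


class _OgTitleCollector(HTMLParser):
    """Collect the content of every <meta property="og:title"> tag."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and attr_map.get("property") == "og:title":
            self.titles.append(attr_map.get("content") or "")


def og_titles(html: str) -> list[str]:
    """Return all og:title values found in an HTML document."""
    parser = _OgTitleCollector()
    parser.feed(html)
    return parser.titles


# Hypothetical values for illustration only.
GENERIC_TITLE = "Example home title"
about_html = (
    "<html><head><title>About</title>"
    '<meta property="og:title" content="About"></head></html>'
)

titles = og_titles(about_html)
assert len(titles) == 1            # tag mutated in place, not duplicated
assert titles[0] != GENERIC_TITLE  # page-specific, not the home default
```

Checking the count as well as the value catches the case where a fix appends a second tag instead of updating the existing one.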