Merged
10 changes: 10 additions & 0 deletions CHANGELOG.md
Expand Up @@ -8,6 +8,16 @@ Every version listed here must correspond to a slice in [`PLAN.md`](./PLAN.md) w

---

## [0.9.4] — 2026-05-28

### Changed
- **Database connection pool size is now configurable** via the `DB_POOL_SIZE` and `DB_MAX_OVERFLOW` environment variables (defaults unchanged at 5 each), so it can be tuned in production without a redeploy. Telemetry showed no connection-pool pressure at current scale, so this ships the capability without changing the running defaults.

### Fixed
- **Search button no longer sticks on a spinner after pressing browser Back.** Returning to the landing page from a report could leave the analyze button spinning and its input disabled. The v0.9.3 attempt fixed the wrong mechanism; this is the real fix.

---

## [0.9.3] — 2026-05-28

### Added
19 changes: 15 additions & 4 deletions PLAN.md
Expand Up @@ -42,7 +42,7 @@
| **v0.9.1** | `/me/analyses` N+1 fix + Layer A cache schema version | ✅ shipped |
| **v0.9.2** | Rate limiting (IP + user) on `/analyze` + `/narrative` | ✅ shipped |
| **v0.9.3** | Deletable `/me` history + back-nav loading fix + creator flair | ✅ shipped |
| **v0.9.4** | DB pool tune (5→10, 5→20) after PostHog baseline | pending |
| **v0.9.4** | DB pool size env-tunable + real back-nav spinner fix | ✅ shipped |
| **v0.9.5** | `/security-review` pass + load test to 100 RPS | pending |
| **v0.9.6** | Privacy policy + terms (legal docs) | pending |
| **v1.0.0** | Public launch | pending |
Expand Down Expand Up @@ -630,11 +630,22 @@ The narrative-mode CHECK constraint was a third drift in the same family — the

---

## v0.9.4 — DB pool tune (deferred)
## v0.9.4 — DB pool size env-tunable + back-nav spinner fix (shipped 2026-05-28)

**Goal:** Raise `pool_size=5, max_overflow=5` to `pool_size=10, max_overflow=20` once PostHog/Sentry baseline confirms the symptom in v0.8.0-shipped RUM data. Mind Neon's pooled-host (PgBouncer) connection caps when sizing.
**Goal:** Make the SQLAlchemy engine's `pool_size` / `max_overflow` configurable via `DB_POOL_SIZE` / `DB_MAX_OVERFLOW`, keeping the 5/5 defaults. Plus a genuine fix for the landing-page search spinner sticking on browser-back (the v0.9.3 attempt fixed the wrong mechanism).

**Exit criteria:** TBD when the slice begins.
**Why not the planned 10/20 bump:** Direct telemetry on 2026-05-28 (Neon `max_connections=112`, ~1 live app connection; Vercel 0% error rate; Sentry clean) showed no pool-exhaustion symptom. A blind bump to 30 connections/instance would also risk the 105-usable ceiling under multi-instance Fluid Compute. So the slice ships tunability instead of a default change; flip the env var if RUM ever shows the symptom.
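
The ceiling arithmetic behind this decision can be sketched as below. This is an illustrative calculation, not code from the repo: `max_instances` is a hypothetical helper, the 112 and 7 figures are the ones quoted above, and it assumes direct (non-pooled) connections with every instance's pool fully saturated.

```python
def max_instances(pool_size: int, max_overflow: int, usable: int = 112 - 7) -> int:
    """Largest instance count n with (pool_size + max_overflow) * n < usable."""
    per_instance = pool_size + max_overflow
    if per_instance <= 0:
        raise ValueError("pool_size + max_overflow must be positive")
    # Strict inequality: n * per_instance has to stay below `usable`.
    return (usable - 1) // per_instance


print(max_instances(5, 5))    # current defaults -> 10
print(max_instances(10, 20))  # the originally planned bump -> 3
```

Under those assumptions the planned 10/20 setting leaves room for only three saturated instances, while the 5/5 defaults leave room for ten.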

**Back-nav spinner root cause:** Cache Components (`cacheComponents: true`, shipped v0.8.6) keeps the landing page mounted in a hidden React `<Activity>` on navigation instead of unmounting it, so the manual `isLoading` `useState` was preserved and reappeared as a stuck spinner on browser-back. Fixed by switching `search-bar.tsx` to `useTransition` (pending state derived from the live navigation, idle on return by construction). The v0.9.3 `pageshow` listener was inert (same-document soft-nav never fires it) and its test was a false positive.

**Design spec:** [`docs/superpowers/specs/2026-05-28-v0.9.4-db-pool-tunable-design.md`](./docs/superpowers/specs/2026-05-28-v0.9.4-db-pool-tunable-design.md).
**Sub-plan:** [`docs/superpowers/plans/2026-05-28-v0.9.4-db-pool-tunable.md`](./docs/superpowers/plans/2026-05-28-v0.9.4-db-pool-tunable.md).

**Exit criteria:**
- [x] `DB_POOL_SIZE` / `DB_MAX_OVERFLOW` settings (default 5); `_build_engine` reads them via the module reference.
- [x] 2 new non-DB tests (defaults + override) pass; backend suite 281 → 283.
- [x] `search-bar.tsx` uses `useTransition`; inert `pageshow` effect removed; bfcache test replaced with normalize/validation/nav coverage (frontend vitest 51 → 54).
- [x] Docs ritual + version bump to 0.9.4; tag + release.

---

4 changes: 2 additions & 2 deletions README.md
Expand Up @@ -43,7 +43,7 @@ Engineering insight first. AI flavor second. Scoring is deterministic and explai

## Status

Pre-alpha. Latest shipped release is **v0.9.3** (deletable `/me` history with undo, a fix for the search spinner sticking after browser-back, and a golden "creator" scorecard for the project's creator account). Live at https://skill-issue-tau.vercel.app — GitHub OAuth sign-in, Neon Postgres persistence, `/me` history, opt-in `/share/[slug]` public links. The AI narrative layer (Roast + Mentor) runs on **Groq** (`llama-3.3-70b-versatile`). v0.7.0 added Upstash Redis caching (warm `/analyze` ≤ 200 ms); v0.7.2 prod-certified the perf budget (CLS 0.080 → **0** structurally, perf 90 → 94, LCP 2,804 → 2,773 ms); v0.8.0 shipped Sentry (FE+BE), PostHog (events + web vitals), structlog JSON logging, on-voice 404, and a full axe a11y pass; v0.8.1 ships the nightly cron with bearer auth; v0.8.2 pairs it with the manual force-refresh button on `/me`; v0.8.3 hotfixes the empty-repo crash; v0.8.4 fixes the silent narrative misattribution; v0.8.5 closes the post-deploy-Sentry loop with a pre-merge CI gate; v0.8.6 closes v0.7.1's deferred share-page caching; v0.8.7 modernizes project config; v0.9.0 opens Beta hardening with bounded GH fan-out; v0.9.1 closes the /me N+1 + adds per-namespace Report cache versioning; v0.9.2 adds rate limiting (per-IP for anonymous, higher per-user caps for signed-in) on `/analyze` and `/narrative`; v0.9.3 adds deletable `/me` history with undo, fixes the back-nav search spinner, and gilds the creator's scorecard. **v0.9.4 — DB pool tune** is next (after PostHog baseline). See [`CHANGELOG.md`](./CHANGELOG.md) for shipped slices, [`PLAN.md`](./PLAN.md) for the full roadmap, and [`docs/PROGRESS_LOG.md`](./docs/PROGRESS_LOG.md) for the most recent session handoff.
Pre-alpha. Latest shipped release is **v0.9.4** (the database connection pool size is now tunable via environment variables without a redeploy, and the search spinner that could stick after pressing browser Back is genuinely fixed). v0.9.3 before it added deletable `/me` history with undo, a golden "creator" scorecard for the project's creator account, and a first (incomplete) attempt at the back-nav spinner fix. Live at https://skill-issue-tau.vercel.app — GitHub OAuth sign-in, Neon Postgres persistence, `/me` history, opt-in `/share/[slug]` public links. The AI narrative layer (Roast + Mentor) runs on **Groq** (`llama-3.3-70b-versatile`). v0.7.0 added Upstash Redis caching (warm `/analyze` ≤ 200 ms); v0.7.2 prod-certified the perf budget (CLS 0.080 → **0** structurally, perf 90 → 94, LCP 2,804 → 2,773 ms); v0.8.0 shipped Sentry (FE+BE), PostHog (events + web vitals), structlog JSON logging, on-voice 404, and a full axe a11y pass; v0.8.1 ships the nightly cron with bearer auth; v0.8.2 pairs it with the manual force-refresh button on `/me`; v0.8.3 hotfixes the empty-repo crash; v0.8.4 fixes the silent narrative misattribution; v0.8.5 closes the post-deploy-Sentry loop with a pre-merge CI gate; v0.8.6 closes v0.7.1's deferred share-page caching; v0.8.7 modernizes project config; v0.9.0 opens Beta hardening with bounded GH fan-out; v0.9.1 closes the /me N+1 + adds per-namespace Report cache versioning; v0.9.2 adds rate limiting (per-IP for anonymous, higher per-user caps for signed-in) on `/analyze` and `/narrative`; v0.9.3 adds deletable `/me` history with undo, attempts the back-nav search-spinner fix, and gilds the creator's scorecard. v0.9.4 makes the DB connection pool size env-tunable (defaults unchanged — RUM showed no pool exhaustion) and lands the real back-nav spinner fix (the v0.9.3 attempt addressed the wrong mechanism). **v0.9.5 — security review + load test** is next. 
See [`CHANGELOG.md`](./CHANGELOG.md) for shipped slices, [`PLAN.md`](./PLAN.md) for the full roadmap, and [`docs/PROGRESS_LOG.md`](./docs/PROGRESS_LOG.md) for the most recent session handoff.

---

Expand Down Expand Up @@ -76,7 +76,7 @@ cp .env.example .env # then edit .env and add your GITHUB_TOKEN and OPENA
uv run uvicorn app.main:app --reload --port 8000
```

Verify: `curl http://localhost:8000/health` → `{"status":"ok","version":"0.9.3","db":"up"|"down","cache":"up"|"down"|"unconfigured"}`. The `db` field reports DB reachability when `DATABASE_URL` is configured; the `cache` field reports Upstash reachability (`unconfigured` when `UPSTASH_REDIS_REST_URL` isn't set — perfectly fine for local dev, the in-process fallback covers it).
Verify: `curl http://localhost:8000/health` → `{"status":"ok","version":"0.9.4","db":"up"|"down","cache":"up"|"down"|"unconfigured"}`. The `db` field reports DB reachability when `DATABASE_URL` is configured; the `cache` field reports Upstash reachability (`unconfigured` when `UPSTASH_REDIS_REST_URL` isn't set — perfectly fine for local dev, the in-process fallback covers it).
Hit the analyzer: `curl http://localhost:8000/analyze/octocat`.

### Frontend (`:3000`)
10 changes: 10 additions & 0 deletions backend/.env.example
Expand Up @@ -95,3 +95,13 @@ UPSTASH_REDIS_REST_TOKEN=
# throttled by mistake); narrative + signed-in limits stay active.
# Generate: python -c "import secrets; print(secrets.token_hex(32))"
# INTERNAL_PROXY_SECRET=

# ── v0.9.4: DB connection pool sizing ─────────────────────────────────
# SQLAlchemy async engine pool. Optional — defaults shown match the
# previous hardcoded values, so leaving these unset changes nothing.
# Raise only if RUM shows QueuePool timeouts. Ceiling ~105 usable
# connections on the current Neon compute (112 - 7 reserved). On the
# PgBouncer pooler that's buffered by multiplexing; on a direct
# connection keep (pool_size + max_overflow) x peak_instances < 105.
# DB_POOL_SIZE=5
# DB_MAX_OVERFLOW=5
8 changes: 4 additions & 4 deletions backend/app/db/engine.py
Expand Up @@ -7,7 +7,7 @@
     create_async_engine,
 )

-from app.settings import settings
+import app.settings as settings_module


def _normalize_async_url(url: str) -> tuple[str, bool]:
Expand Down Expand Up @@ -81,15 +81,15 @@ def _build_engine(url: str) -> AsyncEngine:
     return create_async_engine(
         normalized,
         connect_args=connect_args,
-        pool_size=5,
-        max_overflow=5,
+        pool_size=settings_module.settings.db_pool_size,
+        max_overflow=settings_module.settings.db_max_overflow,
         pool_pre_ping=True,
         pool_recycle=1800,
     )


 engine: AsyncEngine = _build_engine(
-    settings.database_url
+    settings_module.settings.database_url
     or "postgresql+asyncpg://placeholder:placeholder@localhost:5432/placeholder"
 )

13 changes: 12 additions & 1 deletion backend/app/settings.py
Expand Up @@ -2,7 +2,7 @@

from pydantic_settings import BaseSettings, SettingsConfigDict

-VERSION = "0.9.3"
+VERSION = "0.9.4"


class Settings(BaseSettings):
Expand Down Expand Up @@ -97,5 +97,16 @@ class Settings(BaseSettings):
     # collapse into one Vercel-infra-IP bucket); narrative + user limits stay on.
     internal_proxy_secret: str | None = None
+
+    # v0.9.4 — DB connection pool sizing
+    # SQLAlchemy async engine pool. Defaults match the pre-v0.9.4 hardcoded
+    # values, so a deploy with neither env var set is byte-identical to before.
+    # Raise via env only when RUM shows pool exhaustion (QueuePool timeouts).
+    # Ceiling: ~105 usable Postgres connections (112 max_connections - 7
+    # reserved on the current ~0.25 CU Neon compute). On the PgBouncer pooler
+    # this is buffered by multiplexing; on a direct connection keep
+    # (pool_size + max_overflow) x peak_instances < 105.
+    db_pool_size: int = 5
+    db_max_overflow: int = 5


 settings = Settings()
2 changes: 1 addition & 1 deletion backend/pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "skill-issue-backend"
-version = "0.9.3"
+version = "0.9.4"
description = "Skill Issue backend — FastAPI service that ingests a GitHub profile and returns a deterministic engineering report."
readme = "README.md"
authors = [
33 changes: 33 additions & 0 deletions backend/tests/db/test_engine_pool.py
@@ -0,0 +1,33 @@
from __future__ import annotations

from unittest.mock import patch

import app.settings as settings_module
from app.db.engine import _build_engine
from app.settings import Settings

_URL = "postgresql+asyncpg://u:p@localhost:5432/db"


def test_engine_pool_uses_settings_defaults():
    """With no env override, the builder passes the 5/5 defaults through."""
    with patch("app.db.engine.create_async_engine") as mock_create:
        _build_engine(_URL)
    kwargs = mock_create.call_args.kwargs
    assert kwargs["pool_size"] == 5
    assert kwargs["max_overflow"] == 5


def test_engine_pool_reads_env_override(monkeypatch):
    """Monkeypatching the live settings object flows into the built engine —
    proves _build_engine reads via the module reference, not an import-time bind."""
    monkeypatch.setattr(
        settings_module,
        "settings",
        Settings(db_pool_size=10, db_max_overflow=20),
    )
    with patch("app.db.engine.create_async_engine") as mock_create:
        _build_engine(_URL)
    kwargs = mock_create.call_args.kwargs
    assert kwargs["pool_size"] == 10
    assert kwargs["max_overflow"] == 20
2 changes: 1 addition & 1 deletion backend/uv.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 4 additions & 0 deletions docs/DEPLOY.md
Expand Up @@ -76,6 +76,10 @@ In Vercel → **Settings** → **Environment Variables**, add (Production + Prev
| `ANALYZE_USER_PER_HOUR` | Signed-in per-user `/analyze` cap. Default `60`. Backend env only. (v0.9.2+) | — |
| `NARRATIVE_ANON_PER_IP_PER_HOUR` | Anonymous per-IP `/narrative` cap. Default `30`. Backend env only. (v0.9.2+) | — |
| `NARRATIVE_USER_PER_HOUR` | Signed-in per-user `/narrative` cap. Default `90`. Backend env only. (v0.9.2+) | — |
| `DB_POOL_SIZE` | SQLAlchemy pool size per Fluid Compute instance. Default `5`. Backend env only. Raise only on confirmed pool exhaustion. (v0.9.4+) | — |
| `DB_MAX_OVERFLOW` | Extra connections beyond `DB_POOL_SIZE` under burst. Default `5`. Backend env only. (v0.9.4+) | — |

> **DB pool ceiling.** The Neon compute exposes ~105 usable connections (`max_connections` 112 minus 7 `superuser_reserved_connections` on the current ~0.25 CU compute). The app connects through the PgBouncer pooler (`statement_cache_size=0`), which multiplexes many client connections onto few server ones, so the ceiling is heavily buffered. If the app is ever switched to a direct connection, keep `(DB_POOL_SIZE + DB_MAX_OVERFLOW) × peak_instances < 105`.
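
Before raising either variable, the rule above could be checked with something like the following. It is a standard-library sketch only: the backend itself reads these values through its pydantic `Settings` class, and both `read_pool_env` and `within_direct_ceiling` are hypothetical names introduced here for illustration.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def read_pool_env(env: Mapping[str, str] | None = None) -> tuple[int, int]:
    """Read DB_POOL_SIZE / DB_MAX_OVERFLOW, falling back to the documented default of 5."""
    env = os.environ if env is None else env
    pool_size = int(env.get("DB_POOL_SIZE", "5"))
    max_overflow = int(env.get("DB_MAX_OVERFLOW", "5"))
    return pool_size, max_overflow


def within_direct_ceiling(
    pool_size: int, max_overflow: int, peak_instances: int, usable: int = 105
) -> bool:
    """True when (pool_size + max_overflow) * peak_instances stays under the ceiling."""
    return (pool_size + max_overflow) * peak_instances < usable


# Example: the values from the abandoned 10/20 plan at four concurrent instances.
pool_size, max_overflow = read_pool_env({"DB_POOL_SIZE": "10", "DB_MAX_OVERFLOW": "20"})
print(within_direct_ceiling(pool_size, max_overflow, peak_instances=4))  # False (120 >= 105)
```

The `peak_instances` figure has to come from observed Fluid Compute concurrency; nothing in this sketch measures it.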

### 5. Run the initial Alembic migration

27 changes: 27 additions & 0 deletions docs/PROGRESS_LOG.md
Expand Up @@ -19,6 +19,33 @@ Format:

---

## 2026-05-28 — Claude (Opus 4.7) — v0.9.4 shipped (DB pool size env-tunable + real back-nav spinner fix)

**Slice:** v0.9.4. Two changes: the planned DB-pool work, plus a genuine fix for the back-nav search spinner that v0.9.3 only *appeared* to fix.

**Done:**
- **DB pool:** Added `DB_POOL_SIZE` / `DB_MAX_OVERFLOW` settings (default 5); `_build_engine` reads them via the `settings_module` module reference; 2 new non-DB tests (defaults + override) asserting the kwargs passed to `create_async_engine`; backend suite 281 → 283. Docs ritual across CHANGELOG/PLAN/DEPLOY/.env.example/README + version literals + uv.lock.
- **Back-nav spinner (real fix):** `search-bar.tsx` now uses `useTransition` for the pending state instead of a manual `isLoading` `useState`. Removed the inert v0.9.3 `pageshow` effect. Replaced the false-positive bfcache test with normalize/validation/navigation coverage (search-bar tests 1 → 4; frontend vitest 51 → 54).

**Decisions:**
- **DB pool — ship tunability, NOT the planned 10/20 bump.** Evidence gathered 2026-05-28 via Neon SQL + Vercel logs + Sentry: `max_connections=112`, `superuser_reserved_connections=7` → 105 usable; live app footprint ~1 connection (`neondb_owner`); Vercel 0% error rate; Sentry clean. No pool-exhaustion symptom exists, and a blind bump to 30 conns/instance would risk the 105 ceiling under multi-instance Fluid Compute. Defaults stay 5/5 → byte-identical runtime; flip env var if RUM ever shows the symptom. Module-reference read chosen for test monkeypatch propagation (v0.8.1/v0.9.0 lesson).
- **Back-nav — `useTransition`, not an effect reset.** A mount effect that resets `isLoading` would trip the `react-hooks/set-state-in-effect` lint gate. `useTransition`'s `isPending` is derived from the live navigation, so on browser-back (which never invokes this page's `startTransition`) it's idle by construction — no preserved-state to get stuck. Folded into v0.9.4 (unshipped branch) at the user's request rather than renumbering.
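
The module-reference decision can be illustrated in isolation. The sketch below uses a throwaway in-memory module as a stand-in for `app.settings` (the `fake_settings` name and both helper functions are hypothetical) and mimics by plain assignment what `monkeypatch.setattr(settings_module, "settings", ...)` does in the new test.

```python
import types

# Stand-in for app.settings: a module exposing a `settings` object.
settings_module = types.ModuleType("fake_settings")
settings_module.settings = types.SimpleNamespace(db_pool_size=5)

# Import-time bind, equivalent to `from fake_settings import settings`:
# this name keeps pointing at the object that existed when it was bound.
settings = settings_module.settings


def pool_size_bound() -> int:
    return settings.db_pool_size


def pool_size_via_module() -> int:
    # The attribute is looked up on the module on every call.
    return settings_module.settings.db_pool_size


# Rebind the module attribute, as a test monkeypatch would.
settings_module.settings = types.SimpleNamespace(db_pool_size=10)

print(pool_size_bound())       # 5: the stale import-time object
print(pool_size_via_module())  # 10: sees the replacement
```

Only the second accessor observes the replaced object, which is why `_build_engine` reads `settings_module.settings` rather than a name imported with `from app.settings import settings`.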

**Learned / surprises:**
- **Cache Components (`cacheComponents: true`, shipped v0.8.6) was the real culprit.** With it enabled, the App Router keeps the previous route mounted in a hidden React `<Activity>` instead of unmounting it — so a manual loading `useState` is *preserved* and reappears as a stuck spinner on browser-back. v0.9.3 misdiagnosed this as bfcache; its `pageshow` listener never fires on same-document soft-nav, and its unit test was a false positive (mocked the router, fired a synthetic `pageshow`). Memo: UI behavior that depends on Activity hide/show is not reproducible in happy-dom — verify in a real browser.

**Verified:**
- Backend ruff clean + pytest 283 passed/69 skipped. Frontend lint + tsc clean, vitest 54 passed, `next build` clean (PPR routes intact).
- **Back-nav fix live behavior: pending user confirmation in-browser** (Activity show/hide can't be exercised headlessly; static checks pass).

**Blocked / open:**
- Live browser confirmation of the spinner fix + push/PR/tag/release pending controller+user (this entry written pre-ship).

**Next:**
- v0.9.5 — security review + load test.

---

## 2026-05-28 — Claude (Opus 4.7) — post-v0.9.3 fix-forward (creator glow removed; deploy unblocked)

**Slice:** post-v0.9.3, no version bump (fix-forward on `main`, matching the v0.8.0 next.config precedent).