pika-dev: implement issue agent service and deploy hardening #420

Draft
justinmoon wants to merge 3 commits into master from pika-dev-main

justinmoon (Collaborator) commented Mar 4, 2026

An attempt at a small background bot that implements issues on its own. This should probably move to my personal infra.

Notes: codex resume 019cb713-68d5-7f30-b723-dc4dfa4cf784 in /Users/justin/code/pika/worktrees/pika-dev

justinmoon and others added 3 commits March 3, 2026 22:22
Three root causes were making generation tasks get stuck:

1. Stale `generating` artifacts after service restart had no recovery
   logic — add `recover_stale_generating()` called on startup.

2. Claude CLI invoked without `-p` and without `--tools ""`, so the
   model used tools, hit `--max-turns 1`, and returned `error_max_turns`
   with no `result` field — make `result` optional via `#[serde(default)]`,
   detect the subtype, and disable tools.

3. `extract_json_payload` only handled output starting with triple
   backticks, but the model often prefixes with prose — search for
   fenced blocks and bare JSON objects anywhere in the output.

Also adds structured poll/worker logging and better error diagnostics
that include stdout/result prefixes in error messages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
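
The third fix above (searching for fenced blocks and bare JSON objects anywhere in the output) can be sketched roughly as follows. This is a hypothetical TypeScript analogue, not the code from the commit, which appears to be Rust (it mentions `#[serde(default)]`). The names `extractJsonPayload`, `tryParseObject`, and `findBalancedEnd` are assumptions made for illustration.

```typescript
// Hypothetical sketch: find a JSON object in model output that may be
// wrapped in prose. Not the actual `extract_json_payload` from the commit.

// Three backticks, built by concatenation so this example can sit inside a fenced block.
const FENCE = "`" + "`" + "`";

// Parse `text` and return it only if it is a JSON object (not an array or scalar).
function tryParseObject(text: string): Record<string, unknown> | null {
  try {
    const value: unknown = JSON.parse(text);
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      return value as Record<string, unknown>;
    }
  } catch {
    // Not valid JSON; the caller moves on to the next candidate.
  }
  return null;
}

// Return the index of the "}" that closes the "{" at `start`, or -1.
// Braces inside JSON string literals are ignored.
function findBalancedEnd(text: string, start: number): number {
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}

function extractJsonPayload(output: string): Record<string, unknown> | null {
  // 1. Prefer fenced blocks anywhere in the output, with or without a "json" tag.
  const fenced = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "g");
  let match: RegExpExecArray | null;
  while ((match = fenced.exec(output)) !== null) {
    const parsed = tryParseObject(match[1].trim());
    if (parsed !== null) return parsed;
  }
  // 2. Fall back to a bare object: try each "{" until one parses.
  for (let start = output.indexOf("{"); start !== -1; start = output.indexOf("{", start + 1)) {
    const end = findBalancedEnd(output, start);
    if (end === -1) continue;
    const parsed = tryParseObject(output.slice(start, end + 1));
    if (parsed !== null) return parsed;
  }
  return null;
}
```

Trying every candidate rather than only the first keeps prose such as "use {braces}" from masking a valid object later in the output. A caller would treat a `null` return as a generation failure and include a prefix of the raw output in the error message, in line with the diagnostics described in the commit.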

coderabbitai bot commented Mar 4, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

justinmoon (Collaborator, Author) commented

Agent review:

Issues found

  1. Duplicate system prompt (runner.ts:133-135) — The initial prompt concatenates
    buildSystemPrompt + buildIssuePrompt, but buildSystemPrompt is already passed to
    createSession as systemPrompt at line 87. The agent gets the system prompt twice —
    once via the SDK's system prompt mechanism and once in the user message. This is
    wasteful but not broken (agents handle redundancy fine). Should probably just send
    buildIssuePrompt as the prompt.

  2. No worktree cleanup on completion (runner.ts:132-176) — After a session
    completes or fails, the worktree stays on disk. The spec mentions "prune old
    worktrees after PR is merged or session is stale (>7 days)" as Phase 4, but
    there's no cleanup at all currently. Over time this will accumulate disk usage.

  3. git add -A in worktree (git.ts:130) — The agent worktree could accumulate junk
    files (node_modules from test runs, build artifacts, etc.) and git add -A stages
    everything. A .gitignore in the repo handles most of this, but it's worth noting.

  4. CSS served via readFileSync on every request (server.ts:37-41) — Not a real
    problem at this scale, but the CSS is re-read from disk on every hit. Could cache
    it at startup. Truly doesn't matter for now.

  5. Dashboard event detail rendering (templates/session.ts:79) — payload.payload is
    a JSON string from the DB but rendered raw via textContent. Works, but the raw
    JSON blob isn't very human-readable in the UI. Future polish item.

  6. No pagination — Dashboard lists are capped at 50/100 but there's no actual
    pagination UI. Fine for now with max 5 concurrent sessions.
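
For item 2, the cleanup rule quoted from the spec (prune after the PR is merged or the session is stale for more than 7 days) could be expressed as a small selection function. This is a minimal sketch under stated assumptions: `WorktreeInfo`, its fields, and `selectWorktreesToPrune` are hypothetical names, and how the service actually tracks merge state and last activity is not shown in this PR.

```typescript
// Hypothetical record describing one agent worktree; field names are assumptions.
interface WorktreeInfo {
  path: string;
  prMerged: boolean;       // the PR opened from this worktree has been merged
  sessionActive: boolean;  // an agent session is still running in it
  lastActivityMs: number;  // last session activity, as epoch milliseconds
}

// 7 days, the staleness threshold quoted from the spec.
const STALE_AFTER_MS = 7 * 24 * 60 * 60 * 1000;

// Return the paths that are safe to prune: never an active session,
// and only worktrees whose PR merged or that have been idle for more than 7 days.
function selectWorktreesToPrune(worktrees: WorktreeInfo[], nowMs: number): string[] {
  return worktrees
    .filter((wt) => !wt.sessionActive)
    .filter((wt) => wt.prMerged || nowMs - wt.lastActivityMs > STALE_AFTER_MS)
    .map((wt) => wt.path);
}
```

Keeping the decision separate from the deletion makes the rule easy to unit test. The deletion itself could then run `git worktree remove <path>` for each selected path, followed by `git worktree prune`.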

Spec alignment

The implementation is ~98% faithful to the spec. The only material gap is the
NixOS deployment module (infra/nix/modules/pika-dev.nix) and justfile recipes,
which are deployment concerns rather than application code. Everything else from
the spec is implemented and working.

Lines: ~3,050 (spec target was <2,000). The overshoot is from thorough tests
(1,094 lines) and the DB layer being more verbose than estimated. Core application
logic is well within budget.

Bottom line

Ship it. The duplicate system prompt is the only thing I'd fix before deploying —
everything else is polish for later iterations.
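
A minimal sketch of the duplicate system prompt fix, assuming shapes for the functions named in the review. The bodies and signatures of `buildSystemPrompt`, `buildIssuePrompt`, and `createSession` below are stand-ins, and `Issue`, `Session`, and `startIssueSession` are hypothetical; the real SDK call in runner.ts is not shown in this PR.

```typescript
// Hypothetical types; the real issue and session shapes are not shown in the review.
interface Issue {
  number: number;
  title: string;
  body: string;
}

interface Session {
  systemPrompt: string;
  userMessages: string[];
}

// Stand-ins for the functions named in the review.
function buildSystemPrompt(): string {
  return "You are an agent that implements GitHub issues in a git worktree.";
}

function buildIssuePrompt(issue: Issue): string {
  return `Implement issue #${issue.number}: ${issue.title}\n\n${issue.body}`;
}

function createSession(options: { systemPrompt: string }): Session {
  return { systemPrompt: options.systemPrompt, userMessages: [] };
}

function startIssueSession(issue: Issue): Session {
  // The system prompt goes in once, through the session's own mechanism.
  const session = createSession({ systemPrompt: buildSystemPrompt() });
  // Previously the first message was buildSystemPrompt() + buildIssuePrompt(issue);
  // sending only the issue prompt avoids the duplication.
  session.userMessages.push(buildIssuePrompt(issue));
  return session;
}
```

With this shape the system prompt text appears only in `session.systemPrompt`, and the first user message carries only the issue details.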
