feat: add stagehand-export skill by aq17 · Pull Request #105 · browserbase/skills

aq17 · 2026-05-13T00:55:37Z

Summary

Adds /stagehand-export — translates a graduated /autobrowse task into a deterministic Stagehand TypeScript script.
Mines the last passing trace.json for XPath/CSS selectors → cached Action descriptors. Falls back to observe() for ARIA-ref clicks. Auto-generates a Zod schema for extract() from task.md's Output block.
Verified end-to-end on the sdml-grants task: autobrowse iter 1 ($0.36, 7 turns, 75s) → stagehand-export → npx tsx replay returned the same 7 grants with success: true.

Why

/autobrowse produces a strategy.md that another Claude session can replay step by step, but every replay still pays per-step inference. This collapses the loop into a deterministic script that makes a single LLM call (just extract()) and that you can ship with bb functions deploy, schedule with cron, or invoke from non-Claude code.

Files added

skills/stagehand-export/SKILL.md — entry point + workflow doc
skills/stagehand-export/scripts/export.mjs — generator (~620 lines, stdlib only, no deps)
skills/stagehand-export/references/command-mapping.md — browse CLI → Stagehand translation table

Test plan

Run node skills/stagehand-export/scripts/export.mjs --task <name> --workspace ./autobrowse --no-verify against a graduated autobrowse task; inspect the generated .ts
Run without --no-verify to execute the script and confirm exit 0 + success: true JSON
Confirm cached Action count and observe() fallback count match the trace's command shape
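
For the last check, a tally of the generated ops by kind is one way to compare against the trace. This is a minimal sketch that assumes each op carries a `kind` of `"act"` (cached Action) or `"observe_act"` (observe() fallback), as the review comments on this PR suggest; `tallyOps` and the sample ops are hypothetical and not part of export.mjs.

```javascript
// Hypothetical helper: count cached Actions vs observe() fallbacks in an ops list.
function tallyOps(ops) {
  const counts = { act: 0, observe_act: 0, other: 0 };
  for (const op of ops) {
    if (op.kind === "act" || op.kind === "observe_act") counts[op.kind] += 1;
    else counts.other += 1; // goto, extract, and anything else
  }
  return counts;
}

// Made-up ops for illustration only.
const sampleOps = [
  { kind: "goto" },
  { kind: "act", method: "click", selector: "#search" },
  { kind: "observe_act", method: "click" },
  { kind: "act", method: "fill", selector: "input[name=q]" },
];
console.log(tallyOps(sampleOps));
```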

🤖 Generated with Claude Code

Note

Medium Risk
Adds a new code generator that parses trace/markdown inputs and runs npm install + executes generated scripts; failures or heuristic mis-parsing could produce incorrect automation or unexpected local side effects during verification.

Overview
Adds a new stagehand-export skill that exports a graduated /autobrowse task into a deterministic Stagehand TypeScript script, mining the most recent passing trace.json for stable XPath/CSS selectors and emitting cached stagehand.act(...) calls with observe() fallbacks for ARIA refs.

The generator (scripts/export.mjs) also infers a Zod OutputSchema from task.md’s ## Output JSON block, scaffolds package.json/tsconfig.json, writes a selectors.cache.json sidecar, and can optionally verify by running npm install and executing the generated script, reporting pass/fail and logs.
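
The schema inference step can be illustrated with a minimal sketch. It assumes the `## Output` section of task.md holds one fenced JSON example and that each JSON value maps to a Zod expression by its runtime type. The names `extractOutputExample` and `zodSourceFor` are hypothetical and are not taken from export.mjs, which may use different heuristics.

```javascript
// Built at runtime so this example does not contain a literal code fence.
const FENCE = "`".repeat(3);

// Hypothetical helper: parse the first fenced JSON block after a "## Output" heading.
function extractOutputExample(taskMd) {
  const section = taskMd.split(/^## Output\s*$/m)[1];
  if (!section) return null;
  const fenced = new RegExp(FENCE + "(?:json)?\\s*\\n([\\s\\S]*?)\\n" + FENCE);
  const match = section.match(fenced);
  return match ? JSON.parse(match[1]) : null;
}

// Hypothetical helper: emit Zod source text for an example JSON value.
function zodSourceFor(value) {
  if (value === null) return "z.null()";
  if (Array.isArray(value)) {
    // The first element stands in for every item; empty arrays stay unknown.
    const item = value.length > 0 ? zodSourceFor(value[0]) : "z.unknown()";
    return `z.array(${item})`;
  }
  switch (typeof value) {
    case "string": return "z.string()";
    case "number": return "z.number()";
    case "boolean": return "z.boolean()";
    case "object": {
      const fields = Object.entries(value)
        .map(([key, v]) => `${JSON.stringify(key)}: ${zodSourceFor(v)}`)
        .join(", ");
      return `z.object({ ${fields} })`;
    }
    default: return "z.unknown()";
  }
}

// Made-up task.md content for illustration only.
const taskMd = [
  "# Task",
  "## Output",
  FENCE + "json",
  '{"success": true, "grants": [{"name": "x", "amount": 1}]}',
  FENCE,
].join("\n");
const schemaSource = zodSourceFor(extractOutputExample(taskMd));
console.log(schemaSource);
```

Inferring from a single example cannot tell optional fields from required ones, so a schema produced this way is best treated as a starting point to review.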

Reviewed by Cursor Bugbot for commit a1ccf2d. Bugbot is set up for automated code reviews on this repo.

Translate a graduated /autobrowse task into a deterministic Stagehand TypeScript script. The autobrowse loop converges on a working workflow but every replay still pays per-step LLM inference. stagehand-export collapses that into a single .ts file by mining the last passing trace.json for the XPath/CSS selectors that worked, baking them in as cached Action descriptors, and falling back to observe() for ARIA-ref clicks. The Zod schema for extract() is auto-generated from task.md's Output block. Verified end-to-end on the sdml-grants task: autobrowse iter 1 ($0.36, 7 turns, 75s) -> stagehand-export -> npx tsx replay returned the same 7 grants with success: true. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 4 potential issues.

cursor · 2026-05-13T01:04:10Z

+  // The script prints OutputSchema JSON to stdout as the last block.
+  const stdout = run.stdout ?? "";
+  const lastBrace = stdout.lastIndexOf("{");
+  if (lastBrace >= 0) parsedOutput = JSON.parse(stdout.slice(lastBrace));


Verification JSON parsing fails for nested output objects

High Severity

stdout.lastIndexOf("{") finds the opening brace of the last nested object in pretty-printed JSON, not the root object. For any output with nested objects or arrays of objects (e.g., the 7-grants result from the PR description), JSON.parse(stdout.slice(lastBrace)) parses a substring starting mid-structure with trailing ] and } characters, which is invalid JSON. The parse throws, parsedOutput stays null, and verification incorrectly reports failure. Using stdout.indexOf("{") or JSON.parse(stdout.trim()) would correctly target the root object.
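
To make the failure concrete, here is a minimal sketch of a more tolerant parser. It assumes the script's JSON is the last thing written to stdout and that log lines may precede it; `parseTrailingJson` and the sample output are hypothetical. It scans candidate `{` positions from the left and returns the first slice that parses cleanly to the end of the output, which is the root object.

```javascript
// Hypothetical helper: return the JSON object that ends stdout, or null if none parses.
function parseTrailingJson(stdout) {
  const text = stdout.trimEnd();
  // Try each "{" from left to right. A slice that starts inside a nested object
  // fails because of the trailing "]" and "}" characters, so the first slice
  // that parses through to the end is the outermost object.
  for (let i = text.indexOf("{"); i >= 0; i = text.indexOf("{", i + 1)) {
    try {
      return JSON.parse(text.slice(i));
    } catch {
      // Not the start of a complete trailing document; keep scanning.
    }
  }
  return null;
}

// Made-up stdout: one log line containing a brace, then pretty-printed JSON.
const sampleStdout =
  "launching browser {env: LOCAL}\n" +
  JSON.stringify({ success: true, grants: [{ id: 1 }, { id: 2 }] }, null, 2) +
  "\n";

// The lastIndexOf approach from the diff lands on the last nested object and throws.
let naiveFailed = false;
try {
  JSON.parse(sampleStdout.slice(sampleStdout.lastIndexOf("{")));
} catch {
  naiveFailed = true;
}

const parsedOutput = parseTrailingJson(sampleStdout);
console.log(naiveFailed, parsedOutput.grants.length);
```

Unlike a bare `stdout.indexOf("{")`, this also survives a brace in an earlier log line, at the cost of retrying the parse once per candidate position.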

cursor · 2026-05-13T01:04:10Z

+const trace = JSON.parse(fs.readFileSync(tracePath, "utf-8"));
+const taskMd = fs.readFileSync(taskFile, "utf-8");
+const strategyMd = fs.readFileSync(strategyFile, "utf-8");
+const summaryMd = fs.readFileSync(summaryPath, "utf-8");


Unused summaryMd read crashes on missing file

Medium Severity

summaryMd is read via fs.readFileSync(summaryPath) but never referenced afterward—it's dead code. Worse, when --run is forced, the isPassing() check is skipped, so there's no guarantee summary.md exists. If the forced run directory lacks a summary.md, this line throws an uncaught ENOENT error and crashes the script before any generation happens.
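
Dropping the unused read would avoid the crash outright. If summary.md turns out to be needed later, a tolerant read is one option. The sketch below is an assumption, not code from export.mjs: `readOptional` is a hypothetical helper that takes the reader as a parameter (in the generator that would be `fs.readFileSync`), and a stand-in reader is used here so the example runs without touching disk.

```javascript
// Hypothetical helper. `read` is any synchronous reader shaped like
// fs.readFileSync(path, encoding).
function readOptional(read, filePath) {
  try {
    return read(filePath, "utf-8");
  } catch (err) {
    if (err && err.code === "ENOENT") return null; // a missing file is tolerated
    throw err; // permission errors and the like still surface
  }
}

// Stand-in reader that mimics the ENOENT error fs.readFileSync raises.
const fakeFiles = { "run-001/trace.json": "{}" };
function fakeRead(filePath) {
  if (!(filePath in fakeFiles)) {
    const err = new Error(`ENOENT: no such file or directory, open '${filePath}'`);
    throw Object.assign(err, { code: "ENOENT" });
  }
  return fakeFiles[filePath];
}

const traceText = readOptional(fakeRead, "run-001/trace.json");
const summaryText = readOptional(fakeRead, "run-001/summary.md");
console.log(traceText, summaryText);
```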

cursor · 2026-05-13T01:04:10Z

+        const selector = args[0];
+        const value = args.slice(1).join(" ");
+        ops.push({ kind: "act", method: "selectOptionFromDropdown", selector, arguments: [value], ...base });
+        break;


select case skips ref classification unlike click/fill

Medium Severity

The select case unconditionally pushes a kind: "act" op with the raw selector, without calling classifySelector first. Unlike the click and fill cases—which check for ARIA refs and route them to observe_act—a browse select [0-58] "value" command would emit a cached Action with an ephemeral ARIA ref as the selector, producing a generated script line that can't replay deterministically.
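
One possible shape for the fix is sketched below. The real `classifySelector` is not shown in this diff, so the version here is an assumption: it treats bracketed refs such as `[0-58]` as ARIA refs and everything else as a stable selector. The fields on the `observe_act` op and the `translateSelect` wrapper are likewise hypothetical.

```javascript
// Assumed behaviour: ARIA refs look like "[0-58]"; anything else is treated
// as a stable XPath/CSS selector. The real classifySelector may differ.
function classifySelector(selector) {
  return /^\[\d+-\d+\]$/.test(selector) ? "aria_ref" : "stable";
}

// Hypothetical wrapper around the select case from the diff.
function translateSelect(args, base = {}) {
  const selector = args[0];
  const value = args.slice(1).join(" ");
  if (classifySelector(selector) === "aria_ref") {
    // Ephemeral ref: leave element resolution to observe() at replay time
    // instead of baking the ref into a cached Action.
    return { kind: "observe_act", method: "selectOptionFromDropdown", arguments: [value], ...base };
  }
  return { kind: "act", method: "selectOptionFromDropdown", selector, arguments: [value], ...base };
}

const fromRef = translateSelect(["[0-58]", "New", "York"]);
const fromCss = translateSelect(["#state", "New York"]);
console.log(fromRef.kind, fromCss.kind);
```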

cursor · 2026-05-13T01:04:10Z

+const STAGEHAND_ENV = usesBrowserbase ? "BROWSERBASE" : "LOCAL";
+
+// First goto URL
+const firstGoto = ops.find((o) => o.kind === "goto");


Several computed values are assigned but never used

Low Severity

firstGoto is computed on line 373 but never read or referenced anywhere in the script — it appears to be leftover from an earlier design (perhaps to emit the initial URL separately). Similarly, lastReasoning and lastTurn on lines 248–249 are declared and never read, having been superseded by the per-turn turnReasoning logic. These dead stores add confusion about whether functionality is missing.

Additional Locations (1)

skills/stagehand-export/scripts/export.mjs#L247-L249

aq17 wants to merge 1 commit into main from add-stagehand-export-skill