6 changes: 6 additions & 0 deletions skills/autobrowse/.env.example
@@ -1,3 +1,9 @@
ANTHROPIC_API_KEY=sk-ant-...
# Optional: use OpenAI-compatible Chat Completions instead of Anthropic.
# AUTOBROWSE_PROVIDER=openai
# AUTOBROWSE_MODEL=gpt-4.1
# OPENAI_API_KEY=sk-...
# OPENAI_BASE_URL=https://api.openai.com/v1
# For OpenRouter/LiteLLM/etc, point OPENAI_BASE_URL at that provider's /v1 endpoint.
BROWSERBASE_API_KEY=bb_live_...
BROWSERBASE_PROJECT_ID=your-project-id
21 changes: 19 additions & 2 deletions skills/autobrowse/README.md
@@ -13,7 +13,7 @@ The output is a `skill.md` — a site-specific playbook any agent can follow. On
- Node.js 18+
- [Claude Code](https://claude.ai/code)
- `browse` CLI: `npm install -g @browserbasehq/browse-cli`
- `ANTHROPIC_API_KEY` in your environment
- `ANTHROPIC_API_KEY` in your environment, or `AUTOBROWSE_PROVIDER=openai` with `OPENAI_API_KEY`
- For bot-protected sites: `BROWSERBASE_API_KEY` + `BROWSERBASE_PROJECT_ID`

## Setup
@@ -25,6 +25,23 @@ npm install
cp .env.example .env # fill in your API keys
```

By default, the inner agent uses Anthropic. To use an OpenAI-compatible provider instead:

```bash
AUTOBROWSE_PROVIDER=openai \
OPENAI_API_KEY=sk-... \
node scripts/evaluate.mjs --task my-portal --model gpt-4.1
```

For OpenRouter, LiteLLM, or another Chat Completions-compatible gateway, set `OPENAI_BASE_URL`:

```bash
AUTOBROWSE_PROVIDER=openai \
OPENAI_API_KEY=sk-or-... \
OPENAI_BASE_URL=https://openrouter.ai/api/v1 \
node scripts/evaluate.mjs --task my-portal --model anthropic/claude-sonnet-4.5
```

## Your project structure

Create this in your working directory before running `/autobrowse`:
@@ -80,7 +97,7 @@ Inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch)
outer agent (Claude Code + /autobrowse skill)
└── reads trace → improves strategy.md → repeats

inner agent (scripts/evaluate.mjs → Anthropic API)
inner agent (scripts/evaluate.mjs → Anthropic API or OpenAI-compatible Chat Completions)
└── browse open → snapshot → click → snapshot → ...
└── writes traces/ with summary, full trace, screenshots
```
16 changes: 13 additions & 3 deletions skills/autobrowse/REFERENCE.md
@@ -10,14 +10,22 @@ node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs --task <name> [options]
|------|---------|-------------|
| `--task <name>` | required | Task name — matches `tasks/<name>/` directory |
| `--env local\|remote` | `local` | Browser environment |
| `--model <model>` | `claude-sonnet-4-6` | Claude model for the inner agent |
| `--provider anthropic\|openai` | `anthropic` | Model provider for the inner agent |
| `--model <model>` | provider-specific | Model for the inner agent (`claude-sonnet-4-6` for Anthropic, `gpt-4.1` for OpenAI-compatible) |
| `--run-number N` | auto-increment | Force a specific run number |
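
The flag table implies a precedence between `--provider`/`--model`, the `AUTOBROWSE_*` variables, and the provider defaults. The following is a minimal sketch of one way that resolution could work; it is not taken from `scripts/evaluate.mjs`. The function name `resolveInnerAgent` is hypothetical, and the rule that an explicit flag beats the environment variable is an assumption.

```javascript
// Hypothetical sketch, not the actual evaluate.mjs logic.
// Provider defaults as listed in the --model row of the flag table.
const DEFAULT_MODELS = {
  anthropic: "claude-sonnet-4-6",
  openai: "gpt-4.1",
};

// flags: parsed CLI options, e.g. { provider: "openai", model: "gpt-4.1-mini" }
// env:   an object shaped like process.env
function resolveInnerAgent(flags = {}, env = {}) {
  // Assumption: flag > environment variable > built-in default.
  // `||` is used so that an empty string counts as "not set".
  const provider = flags.provider || env.AUTOBROWSE_PROVIDER || "anthropic";
  if (!Object.hasOwn(DEFAULT_MODELS, provider)) {
    throw new Error(`Unknown provider "${provider}"; expected anthropic or openai`);
  }
  const model = flags.model || env.AUTOBROWSE_MODEL || DEFAULT_MODELS[provider];
  return { provider, model };
}
```

Under these assumptions, `resolveInnerAgent({}, { AUTOBROWSE_PROVIDER: "openai" })` selects the `openai` provider with the `gpt-4.1` default, while passing `--model` overrides `AUTOBROWSE_MODEL`.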

## Environment variables

| Variable | Required | Description |
|----------|----------|-------------|
| `ANTHROPIC_API_KEY` | Yes | Claude API key |
| `AUTOBROWSE_PROVIDER` | No | `anthropic` or `openai`; same as `--provider` |
| `AUTOBROWSE_MODEL` | No | Default model when `--model` is omitted |
| `ANTHROPIC_API_KEY` | Anthropic only | Claude API key |
| `OPENAI_API_KEY` | OpenAI-compatible only | API key for OpenAI, OpenRouter, LiteLLM, etc. |
| `OPENAI_BASE_URL` | No | OpenAI-compatible `/v1` base URL; defaults to `https://api.openai.com/v1` |
| `OPENAI_ORGANIZATION` | No | Optional OpenAI organization header |
| `OPENAI_SITE_URL` | No | Optional `HTTP-Referer` header for gateways such as OpenRouter |
| `OPENAI_APP_NAME` | No | Optional `X-Title` header for gateways such as OpenRouter |
| `BROWSERBASE_API_KEY` | Remote only | Browserbase API key |
| `BROWSERBASE_PROJECT_ID` | Remote only | Browserbase project ID |
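
To illustrate how the `OPENAI_*` variables above could map onto a Chat Completions request, here is a rough sketch. The function name `buildChatCompletionsRequest` is hypothetical and the code is not copied from `scripts/evaluate.mjs`. It assumes the standard `/chat/completions` path under the `/v1` base URL, the `OpenAI-Organization` header used by OpenAI, and the `HTTP-Referer` and `X-Title` headers named in the table.

```javascript
// Hypothetical sketch of turning the documented environment variables
// into a request URL and headers; not the actual evaluate.mjs code.
function buildChatCompletionsRequest(env = {}) {
  if (!env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required when the provider is openai");
  }

  // Fall back to the documented default and drop trailing slashes so that
  // ".../v1" and ".../v1/" produce the same URL.
  const baseUrl = (env.OPENAI_BASE_URL || "https://api.openai.com/v1").replace(/\/+$/, "");

  const headers = {
    "Content-Type": "application/json",
    Authorization: `Bearer ${env.OPENAI_API_KEY}`,
  };

  // Optional headers are only sent when the matching variable is set.
  if (env.OPENAI_ORGANIZATION) headers["OpenAI-Organization"] = env.OPENAI_ORGANIZATION;
  if (env.OPENAI_SITE_URL) headers["HTTP-Referer"] = env.OPENAI_SITE_URL;
  if (env.OPENAI_APP_NAME) headers["X-Title"] = env.OPENAI_APP_NAME;

  return { url: `${baseUrl}/chat/completions`, headers };
}
```

The returned object could then be passed to `fetch(url, { method: "POST", headers, body })` with a JSON body; the body format is outside the scope of this sketch.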

@@ -29,7 +37,7 @@ Each run writes to `traces/<task>/run-NNN/`:
|------|-------------|
| `summary.md` | Duration, cost, turn-by-turn decision log, final output |
| `trace.json` | Full tool call log — every command and response |
| `messages.json` | Raw Anthropic API message history |
| `messages.json` | Raw normalized message history |
| `screenshots/` | Visual captures saved during the run |

`traces/<task>/latest` is a symlink to the most recent run.
@@ -41,6 +49,8 @@ Each run writes to `traces/<task>/run-NNN/`:
| `claude-sonnet-4-6` | $$ | Default — good balance of speed and accuracy |
| `claude-opus-4-6` | $$$$ | Hardest tasks, complex multi-step workflows |
| `claude-haiku-4-5-20251001` | $ | Simple tasks, high-volume iteration |
| `gpt-4.1` | $$ | Default for OpenAI-compatible mode |
| `gpt-4.1-mini` | $ | Lower-cost OpenAI-compatible iteration |

## Skill lifecycle

7 changes: 5 additions & 2 deletions skills/autobrowse/SKILL.md
@@ -2,7 +2,7 @@
name: autobrowse
description: Self-improving browser automation via the auto-research loop. Iteratively runs a browsing task, reads the trace, and improves the navigation skill (strategy.md) until it reliably passes. Supports parallel runs across multiple tasks using sub-agents. Use when you want to build or improve browser automation skills for specific website tasks.
license: See LICENSE.txt
compatibility: "Requires Node.js 18+, browse CLI, and ANTHROPIC_API_KEY. Run from the autobrowse app directory."
compatibility: "Requires Node.js 18+, browse CLI, and either ANTHROPIC_API_KEY or AUTOBROWSE_PROVIDER=openai with OPENAI_API_KEY. Run from the autobrowse app directory."
allowed-tools: Bash Read Write Edit Glob Grep Agent
metadata:
author: browserbase
@@ -96,14 +96,17 @@ Check that `./autobrowse/tasks/<task>/task.md` exists (scaffold it from the temp

### Requirements

- `ANTHROPIC_API_KEY` must be in the environment (or in a `.env` file in CWD — `evaluate.mjs` auto-loads it). If missing, the harness prints a clear error and exits; don't hunt for keys in other paths.
- By default, `ANTHROPIC_API_KEY` must be in the environment (or in a `.env` file in CWD — `evaluate.mjs` auto-loads it). If missing, the harness prints a clear error and exits; don't hunt for keys in other paths.
- To use OpenAI-compatible Chat Completions instead, set `AUTOBROWSE_PROVIDER=openai` or pass `--provider openai`, then set `OPENAI_API_KEY`. For OpenRouter, LiteLLM, or another compatible gateway, also set `OPENAI_BASE_URL` to that provider's `/v1` endpoint.
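
The requirements above amount to a per-provider key check before the run starts. A minimal sketch of such a check follows. The helper name `missingKeys` and its option names are hypothetical, and the real harness may validate differently. The Browserbase keys are treated as required only for `--env remote`, as the environment variable table in `REFERENCE.md` states.

```javascript
// Hypothetical pre-flight check; not the actual evaluate.mjs implementation.
// Returns the names of required environment variables that are unset or empty.
function missingKeys({ provider = "anthropic", browserEnv = "local" } = {}, env = {}) {
  // Exactly one model API key is required, depending on the provider.
  const required = [provider === "openai" ? "OPENAI_API_KEY" : "ANTHROPIC_API_KEY"];

  // Browserbase credentials are only needed for remote browser sessions.
  if (browserEnv === "remote") {
    required.push("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID");
  }

  return required.filter((name) => !env[name]);
}
```

A caller could pass `process.env` as the second argument and, if the returned array is non-empty, print the missing names and exit with a non-zero status.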

### Run the inner agent

```bash
node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs --task <task-name> --workspace ./autobrowse
# or for bot-protected sites:
node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs --task <task-name> --workspace ./autobrowse --env remote
# or with an OpenAI-compatible provider:
AUTOBROWSE_PROVIDER=openai OPENAI_API_KEY=sk-... node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs --task <task-name> --workspace ./autobrowse --model gpt-4.1
```

This runs the browser session and writes a full trace to `./autobrowse/traces/<task>/latest/`.
2 changes: 1 addition & 1 deletion skills/autobrowse/package.json
@@ -1,7 +1,7 @@
{
"name": "autobrowse",
"version": "0.1.0",
"description": "Self-improving browser agent via skill learning — autoresearch pattern + Browse CLI + Anthropic API",
"description": "Self-improving browser agent via skill learning — autoresearch pattern + Browse CLI + Anthropic/OpenAI-compatible APIs",
"type": "module",
"scripts": {
"evaluate": "node scripts/evaluate.mjs"