Gbrow

Not just a browser. A workstation for your AI agent.

A full-featured headless browser for OpenClaw agents, built on Playwright and Bun. Instead of taking screenshots and sending them to expensive vision models, Gbrow reads pages through the browser's accessibility tree, which is fast, free, and more reliable.

The Problem

Most browser tools for AI agents take screenshots, upload them to GPT-4o or Claude, and wait for a description. That works, but it's slow (3-10 seconds per page), costs money (~$0.01 per read), and breaks when API keys expire.

How Gbrow Does It Differently

Gbrow uses Playwright's ariaSnapshot(), which exposes the page's accessibility tree: the same structured data that screen readers rely on. Instead of a picture of the page, you get a clean text tree:

@e1 [heading] "Welcome to Example" [level=1]
@e2 [link] "Get Started"
@e3 [button] "Sign in"
@e4 [textbox] "Search"

Each element gets a ref (@e1, @e2, etc.) that you can click, fill, or inspect directly. No vision model, no external API calls, no per-read cost.
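
To make the ref format concrete, here is a minimal Python sketch that parses snapshot lines like the sample above into a lookup table. The regular expression and the `parse_snapshot` name are assumptions based only on that sample, not Gbrow's own parser; real snapshots may include indentation or attributes this sketch does not handle.

```python
import re

# Assumed line shape, taken from the sample above:
#   @e1 [heading] "Welcome to Example" [level=1]
LINE_RE = re.compile(r'^\s*(@e\d+)\s+\[(\w+)\]\s+"(.*?)"(?:\s+\[(.*)\])?\s*$')


def parse_snapshot(text):
    """Return {ref: {"role", "name", "attrs"}} for every line that matches."""
    elements = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines this sketch does not understand
        ref, role, name, attrs = match.groups()
        elements[ref] = {"role": role, "name": name, "attrs": attrs or ""}
    return elements


sample = '''@e1 [heading] "Welcome to Example" [level=1]
@e2 [link] "Get Started"
@e3 [button] "Sign in"
@e4 [textbox] "Search"'''

refs = parse_snapshot(sample)
buttons = [ref for ref, el in refs.items() if el["role"] == "button"]
print(buttons)  # ['@e3']
```

An agent could use a table like this to pick a ref by role and name before issuing a click or fill command.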

Install

Via ClawHub (recommended):

clawhub install gbrow

Via Git:

cd ~/.openclaw/workspace/skills
git clone https://github.com/ashish797/Gbrow.git
cd Gbrow && bash setup.sh

Either way, the setup installs Bun (if needed), pulls dependencies, and installs Chromium. It takes about 30 seconds.

Usage

Start the server:

bun run src/server.ts

Then send commands over HTTP:

PORT=$(python3 -c "import json; print(json.load(open('.gstack/browse.json'))['port'])")
TOKEN=$(python3 -c "import json; print(json.load(open('.gstack/browse.json'))['token'])")

# Navigate
curl -s -X POST "http://127.0.0.1:${PORT}/command" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"command":"goto","args":["https://news.ycombinator.com"]}'

# Read the page
curl -s -X POST "http://127.0.0.1:${PORT}/command" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"command":"snapshot","args":["-i"]}'

# Click an element
curl -s -X POST "http://127.0.0.1:${PORT}/command" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"command":"click","args":["@e3"]}'
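
The same calls can be wrapped in a small Python helper so an agent does not have to shell out to curl. This is a sketch using only the standard library; the function names are hypothetical, and the response is returned as raw text because its format is not documented here.

```python
import json
import urllib.request


def load_connection(path=".gstack/browse.json"):
    """Read the port and token that the server wrote to its state file."""
    with open(path) as fh:
        state = json.load(fh)
    return state["port"], state["token"]


def build_request(port, token, command, args=()):
    """Build (but do not send) the POST request for one command."""
    body = json.dumps({"command": command, "args": list(args)}).encode("utf-8")
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/command",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(port, token, command, args=(), timeout=30):
    """Send one command and return the raw response body as text."""
    request = build_request(port, token, command, args)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Typical use, with the server already running:
#   port, token = load_connection()
#   send(port, token, "goto", ["https://news.ycombinator.com"])
#   print(send(port, token, "snapshot", ["-i"]))
```

Keeping `build_request` separate from `send` makes the payload easy to inspect or log without touching the network.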

Commands

Navigation

Command	Description
`goto <url>`	Navigate to URL
`back` / `forward` / `reload`	History navigation
`url`	Current URL

Reading

Command	Description
`snapshot [-i|-c|-d N]`	Accessibility tree with element refs
`text`	Cleaned page text
`html [selector]`	Raw HTML
`links`	All links as "text -> href"
`forms`	Form fields as JSON

Interaction

Command	Description
`click <ref>`	Click element by ref (e.g. `@e3`)
`fill <ref> <text>`	Fill an input field
`type <ref> <text>`	Type with keyboard events
`select <ref> <value>`	Select dropdown value
`press <key>`	Press a key (Enter, Tab, etc.)
`scroll <direction>`	Scroll the page

Inspection

Command	Description
`js <expr>`	Run JavaScript on the page
`css <sel> <prop>`	Get computed CSS value
`attrs <ref>`	Element attributes as JSON
`is <prop> <ref>`	Check state (visible, enabled, etc.)

Tabs

Command	Description
`tabs`	List open tabs
`tab N`	Switch to tab N
`newtab`	Open new tab
`closetab`	Close current tab

Visual

Command	Description
`screenshot`	Take screenshot
`pdf`	Save page as PDF
`responsive <w> <h>`	Set viewport size

Snapshot Flags

Flag	What it does
`-i`	Interactive elements only (buttons, links, inputs)
`-c`	Compact — remove empty structural nodes
`-d N`	Limit tree depth to N levels
`-s <sel>`	Scope to a CSS selector
`-D`	Diff against previous snapshot
`-a`	Annotated screenshot with ref overlays
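
As an illustration of how these flags might be assembled programmatically, the sketch below maps keyword options to an args list. It assumes that each flag and its value travel as separate array elements (as in the `["-i"]` example under Usage) and that flags can be combined; neither is confirmed here, and `snapshot_args` is a hypothetical helper.

```python
import json


def snapshot_args(interactive=False, compact=False, depth=None,
                  selector=None, diff=False, annotate=False):
    """Translate keyword options into the flag list from the table above."""
    args = []
    if interactive:
        args.append("-i")
    if compact:
        args.append("-c")
    if depth is not None:
        args += ["-d", str(depth)]  # assumed: value passed as its own element
    if selector is not None:
        args += ["-s", selector]
    if diff:
        args.append("-D")
    if annotate:
        args.append("-a")
    return args


payload = {
    "command": "snapshot",
    "args": snapshot_args(interactive=True, depth=3, selector="main"),
}
print(json.dumps(payload))
# {"command": "snapshot", "args": ["-i", "-d", "3", "-s", "main"]}
```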

How It Works

Your Agent  ---HTTP--->  Gbrow Server  --->  Chromium (headless)
                                    |
                                    v
                          Accessibility Tree
                          (structured text + refs)

1. The agent sends a command (goto, snapshot, click, etc.).
2. The Gbrow server receives it and runs it through Playwright.
3. For reading, it uses ariaSnapshot(), not screenshots.
4. The result is structured text with clickable refs.
5. The agent can click refs, fill forms, and navigate, all without vision models.
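
The flow above can be sketched as a small agent-side routine. The code below is illustrative only: it assumes the snapshot comes back as plain text in the `@eN [role] "name"` shape shown earlier, and it takes any `send(command, args)` callable so the logic can be exercised with a stand-in instead of a running server. `find_ref` and `click_by_name` are hypothetical names.

```python
import re


def find_ref(snapshot, role, name):
    """Return the first ref whose role and accessible name match, else None."""
    pattern = re.compile(
        r'(@e\d+)\s+\[' + re.escape(role) + r'\]\s+"' + re.escape(name) + r'"'
    )
    match = pattern.search(snapshot)
    return match.group(1) if match else None


def click_by_name(send, url, role, name):
    """goto, snapshot, locate the ref, then click it."""
    send("goto", [url])
    snapshot = send("snapshot", ["-i"])
    ref = find_ref(snapshot, role, name)
    if ref is None:
        raise LookupError(f'no {role} named "{name}" in snapshot')
    send("click", [ref])
    return ref


# Stand-in transport that records calls and returns a canned snapshot.
calls = []


def fake_send(command, args):
    calls.append((command, args))
    if command == "snapshot":
        return '@e2 [link] "Get Started"\n@e3 [button] "Sign in"'
    return "ok"


print(click_by_name(fake_send, "https://example.com", "button", "Sign in"))  # @e3
```

Because the agent works with refs and names, no CSS selector or screenshot is involved at any step.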

Why Not Just Use Playwright Directly?

You can. But Gbrow gives you:

Persistent server — browser stays alive between commands
Auth token — only authorized callers can use it
Tab management — open, switch, close tabs
Ref system — structured interaction without CSS selectors
Auto-shutdown — kills itself after 30 minutes of inactivity
Docker-friendly — handles sandboxing issues automatically
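
To illustrate the auto-shutdown idea, here is a generic idle-timer sketch in Python. It is not Gbrow's implementation (the server itself runs on Bun); the `IdleWatchdog` class is hypothetical and only shows the usual pattern of resetting a timestamp on every command and checking it against a 30-minute limit.

```python
import time


class IdleWatchdog:
    """Tracks the last activity time and reports when the idle limit has passed."""

    def __init__(self, limit_seconds=30 * 60, clock=time.monotonic):
        self.limit_seconds = limit_seconds
        self.clock = clock
        self.last_activity = clock()

    def touch(self):
        """Call on every incoming command to reset the idle timer."""
        self.last_activity = self.clock()

    def expired(self):
        """True once the idle limit has elapsed since the last command."""
        return self.clock() - self.last_activity >= self.limit_seconds


# Demo with a fake clock so no real waiting is needed.
now = [0.0]
watchdog = IdleWatchdog(clock=lambda: now[0])

now[0] = 29 * 60
print(watchdog.expired())  # False

watchdog.touch()           # a command arrives at minute 29
now[0] += 30 * 60
print(watchdog.expired())  # True
```

For an agent, the practical consequence is that a long pause between commands may require starting the server again.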

Comparison

Feature	Gbrow	Vision-based tools	Raw Playwright
Page reading	Accessibility tree	Screenshot + GPT-4o	Manual extraction
Cost per page	Free	~$0.01	Free
Speed	< 100ms	3-10s	Varies
API key needed	No	Yes	No
Click method	`@ref`	CSS selector	CSS selector
Tab management	Built-in	No	Manual
Persistent server	Yes	No	No
OpenClaw integration	Yes	Varies	No
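
As a rough worked example of what the cost and speed rows imply at scale, the arithmetic below applies the table's own figures to 1,000 page reads. The page count is an arbitrary illustration, and the Gbrow time uses the 100 ms upper bound.

```python
PAGES = 1_000

vision_cost_usd = PAGES * 0.01             # ~$0.01 per read
vision_time_s = (PAGES * 3, PAGES * 10)    # 3-10 s per page
gbrow_time_s = PAGES * 0.1                 # upper bound: < 100 ms per page

print(f"vision cost: ${vision_cost_usd:.2f}")
print(f"vision time: {vision_time_s[0] / 60:.0f}-{vision_time_s[1] / 60:.0f} min")
print(f"gbrow time:  under {gbrow_time_s / 60:.1f} min")
# vision cost: $10.00
# vision time: 50-167 min
# gbrow time:  under 1.7 min
```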

Docker

Gbrow works in Docker out of the box. The setup.sh script handles Chromium sandboxing automatically.

If you're running manually in Docker, set chromiumSandbox: false in the browser launch options.

Credits

Built on gstack by Garry Tan. Adapted for OpenClaw under the MIT license.

License

MIT
