From e446338dd8c1fe6908246522719349df359656db Mon Sep 17 00:00:00 2001
From: Shrey Pandya
Date: Mon, 27 Apr 2026 16:45:57 -0700
Subject: [PATCH] docs(browser): update browse cli 0.6 docs

---
 skills/browser/REFERENCE.md         | 54 ++++++++++++++++++++++++++++-
 skills/browser/SKILL.md             | 26 +++++++++++++-
 skills/browserbase-cli/REFERENCE.md |  5 +++
 3 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/skills/browser/REFERENCE.md b/skills/browser/REFERENCE.md
index 1508a80..1cb2844 100644
--- a/skills/browser/REFERENCE.md
+++ b/skills/browser/REFERENCE.md
@@ -13,6 +13,7 @@ Technical reference for the `browse` CLI tool.
   - [JavaScript Evaluation](#javascript-evaluation)
   - [Viewport](#viewport)
   - [Network Capture](#network-capture)
+  - [CDP Event Streaming](#cdp-event-streaming)
 - [Configuration](#configuration)
   - [Global Flags](#global-flags)
   - [Environment Variables](#environment-variables)
@@ -101,7 +102,7 @@ browse screenshot --full-page # capture entire scrollable page
 
 #### `get [selector]`
 
-Get page properties. Available properties: `url`, `title`, `text`, `html`, `value`, `box`, `visible`, `checked`.
+Get page properties. Available properties: `url`, `title`, `text`, `html`, `markdown`, `value`, `box`, `visible`, `checked`.
 
 ```bash
 browse get url # current URL
@@ -109,6 +110,8 @@ browse get title # page title
 browse get text "body" # all visible text (selector required)
 browse get text ".product-info" # text within a CSS selector
 browse get html "#main" # inner HTML of an element
+browse get markdown # markdown from body
+browse get markdown ".article" # markdown from a specific element
 browse get value "#email-input" # value of a form field
 browse get box "#header" # bounding box (centroid coordinates)
 browse get visible ".modal" # check if element is visible
@@ -182,6 +185,15 @@ browse select "#country" "United States"
 browse select "#tags" "javascript" "typescript" # multi-select
 ```
 
+#### `upload `
+
+Upload one or more files to an `` element. Works in both local and remote sessions.
+
+```bash
+browse upload "input[type=file]" ./avatar.png
+browse upload "#documents" ./contract.pdf ./appendix.pdf
+```
+
 #### `press `
 
 Press a keyboard key or key combination.
@@ -387,6 +399,23 @@ browse network clear
 
 ---
 
+### CDP Event Streaming
+
+#### `cdp `
+
+Attach to a CDP target and stream Chrome DevTools Protocol events as newline-delimited JSON. Accepts a WebSocket URL or a bare port number.
+
+```bash
+browse cdp 9222
+browse cdp ws://localhost:9222/devtools/browser/...
+browse cdp 9222 --domain Network Page # enable selected domains
+browse cdp 9222 --pretty # human-readable output
+```
+
+Default domains are `Network`, `Console`, `Runtime`, `Log`, and `Page`. When the `Page` domain is enabled, lifecycle events are also enabled so consumers receive milestones such as `DOMContentLoaded`, `load`, `firstPaint`, and `networkIdle`.
+
+---
+
 ## Configuration
 
 ### Global Flags
@@ -417,6 +446,29 @@ Load a Browserbase context to persist browser state (cookies, localStorage, sess
 
 Save browser state changes back to the Browserbase context when the session ends. Must be used with `--context-id`.
 
+#### Browserbase session flags
+
+These global flags configure Browserbase session creation and are only supported in remote mode. Run `browse env remote` first, then place flags before the command:
+
+```bash
+browse env remote
+browse --proxies --advanced-stealth --solve-captchas open https://example.com
+browse --block-ads --region us-west-2 --session-timeout 300 open https://example.com
+```
+
+| Flag | Effect |
+|------|--------|
+| `--proxies` | Enable Browserbase proxies |
+| `--advanced-stealth` | Enable advanced stealth mode |
+| `--solve-captchas` | Enable automatic CAPTCHA solving |
+| `--no-solve-captchas` | Disable automatic CAPTCHA solving |
+| `--block-ads` | Enable ad blocking |
+| `--region ` | Set session region: `us-west-2`, `us-east-1`, `eu-central-1`, or `ap-southeast-1` |
+| `--keep-alive` | Keep the session alive after disconnection |
+| `--session-timeout ` | Set Browserbase session timeout in seconds |
+
+Changing these flags for a running session restarts the daemon so the new Browserbase session is created with the requested parameters. The resolved params appear in `browse status`.
+
 ### Environment Variables
 
 | Variable | Required | Description |
diff --git a/skills/browser/SKILL.md b/skills/browser/SKILL.md
index 22c20bf..ccf81e2 100644
--- a/skills/browser/SKILL.md
+++ b/skills/browser/SKILL.md
@@ -42,6 +42,7 @@ The CLI supports explicit per-session environment overrides. If you do nothing,
 - `browse env remote` switches the current session to Browserbase
 - Without a local override, Browserbase is also the default when `BROWSERBASE_API_KEY` is set
 - Provides: anti-bot stealth, automatic CAPTCHA solving, residential proxies, session persistence
+- Use remote session flags when needed: `--proxies`, `--advanced-stealth`, `--solve-captchas`, `--no-solve-captchas`, `--block-ads`, `--region `, `--keep-alive`, `--session-timeout `
 - **Use remote mode when:** the target site has bot detection, CAPTCHAs, IP rate limiting, Cloudflare protection, or requires geo-specific access
 - Get credentials at https://browserbase.com/settings
 
@@ -73,6 +74,7 @@ browse screenshot [path] # Take visual screenshot (slow, uses vi
 browse get url # Get current URL
 browse get title # Get page title
 browse get text # Get text content (use "body" for all text)
+browse get markdown [selector] # Convert page or element HTML to markdown
 browse get html # Get HTML content of element
 browse get value # Get form field value
 ```
@@ -85,6 +87,7 @@ browse click # Click element by ref from snapshot (e
 browse type # Type text into focused element
 browse fill # Fill input and press Enter
 browse select # Select dropdown option(s)
+browse upload # Upload file(s) to file inputs
 browse press # Press key (Enter, Tab, Escape, Cmd+A, etc.)
 browse drag # Drag from one point to another
 browse scroll # Scroll at coordinates
@@ -108,6 +111,24 @@ browse tab_switch # Switch to tab by index
 browse tab_close [index] # Close tab
 ```
 
+### Remote session options
+Add these global flags before the command when using remote mode. They apply to the Browserbase session and restart the daemon if the active session was created with different params.
+
+```bash
+browse env remote
+browse --proxies --advanced-stealth --solve-captchas open https://example.com
+browse --block-ads --region us-west-2 --session-timeout 300 open https://example.com
+```
+
+### CDP event streaming
+Use `browse cdp` when you need low-level Chrome DevTools Protocol events from a CDP target.
+
+```bash
+browse cdp 9222 # Stream default domains as NDJSON
+browse cdp 9222 --domain Network Page # Enable selected domains
+browse cdp 9222 --pretty # Human-readable event stream
+```
+
 ### Typical workflow
 If the environment matters, set it first with `browse env local`, `browse env local --auto-connect`, or `browse env remote`.
 
@@ -148,13 +169,15 @@ browse stop
 3. **Use `browse snapshot`** to check page state — it's fast and gives you element refs
 4. **Only screenshot when visual context is needed** (layout checks, images, debugging)
 5. **Use refs from snapshot** to click/interact — e.g., `browse click @0-5`
-6. **`browse stop`** when done to clean up the browser session and clear the env override
+6. **Use remote flags intentionally**: add `--proxies`, `--advanced-stealth`, `--solve-captchas`, `--region`, or related flags before the command when protected sites need Browserbase session configuration
+7. **`browse stop`** when done to clean up the browser session and clear the env override
 
 ## Troubleshooting
 
 - **"No active page"**: Run `browse stop`, then check `browse status`. If it still says running, kill the zombie daemon with `pkill -f "browse.*daemon"`, then retry `browse open`
 - **Chrome not found**: Install Chrome, use `browse env local --auto-connect` if you already have a debuggable Chrome running, or switch to `browse env remote`
 - **Action fails**: Run `browse snapshot` to see available elements and their refs
+- **Session flags fail**: Remote session flags only work in remote mode. Run `browse env remote` first, then put flags before the command, such as `browse --proxies open https://example.com`
 - **Browserbase fails**: Verify API key is set
 
 ## Switching to Remote Mode
@@ -167,6 +190,7 @@ Don't switch for simple sites (docs, wikis, public APIs, localhost).
 browse env local # clean isolated local browser
 browse env local --auto-connect # reuse existing Chrome state
 browse env remote # switch to Browserbase
+browse --proxies --advanced-stealth --solve-captchas open https://example.com
 ```
 
 Overrides are scoped per session and stay in effect until you switch again or run `browse stop`. After `browse stop`, the next start falls back to env-var-based auto detection. Use `browse status` to inspect the resolved local strategy while the daemon is running.
diff --git a/skills/browserbase-cli/REFERENCE.md b/skills/browserbase-cli/REFERENCE.md
index 7496142..e85a8b2 100644
--- a/skills/browserbase-cli/REFERENCE.md
+++ b/skills/browserbase-cli/REFERENCE.md
@@ -234,10 +234,15 @@ bb browse env local --auto-connect
 bb browse env remote
 bb browse status
 bb browse open https://example.com
+bb browse --proxies --advanced-stealth --solve-captchas open https://example.com
+bb browse upload "input[type=file]" ./file.pdf
+bb browse cdp 9222 --domain Network Page
 ```
 
 `bb browse` mirrors the standalone `browse` binary exactly. For local work, `bb browse env local` starts a clean isolated browser by default. Use `bb browse env local --auto-connect` only when you need the agent to reuse an existing local Chrome session, cookies, or login state.
 
+Remote Browserbase session flags are passed through as global browse flags. Run `bb browse env remote` first, then put flags such as `--proxies`, `--advanced-stealth`, `--solve-captchas`, `--block-ads`, `--region `, `--keep-alive`, or `--session-timeout ` before the subcommand.
+
 If `browse` is not installed, the CLI will prompt you to install it:
 
 ```bash