Firecrawl CLI (Python)


A Python-based command-line interface for Firecrawl with full enterprise proxy support.

Why This Project?

The official Firecrawl Node.js CLI has known issues with enterprise proxy configurations. This Python CLI is a drop-in replacement that properly respects HTTP_PROXY, HTTPS_PROXY, and NO_PROXY environment variables.

The Problem with Node CLI

Enterprise environments often require proxy configuration for external HTTP requests. While Node.js supports proxies, many CLI tools built on Node don't properly handle proxy environment variables, requiring additional tools like proxychains or custom configuration.

Our Solution

This Python CLI uses httpx with built-in proxy support that automatically detects and uses standard proxy environment variables:

export HTTPS_PROXY="http://proxy.company.com:8080"
firecrawl scrape https://example.com  # Works seamlessly behind proxy
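Because detection relies on the standard environment variables, it can be sanity-checked from Python's standard library, which follows the same convention httpx honours; a minimal sketch:

```python
import os
from urllib.request import getproxies

# Simulate an enterprise environment with the standard variable
os.environ["HTTPS_PROXY"] = "http://proxy.company.com:8080"

# getproxies() reads HTTP_PROXY/HTTPS_PROXY (and their lowercase
# variants) from the environment -- the same convention httpx uses.
proxies = getproxies()
print(proxies["https"])  # http://proxy.company.com:8080
```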

Feature Comparison

Feature               Node CLI     Python CLI
Web Scraping          Yes          Yes
Website Crawling      Yes          Yes
URL Mapping           Yes          Yes
Web Search            Yes          Yes
Batch Operations      Yes          Yes
HTTP_PROXY support    Unreliable   Yes
HTTPS_PROXY support   Unreliable   Yes
NO_PROXY support      Unreliable   Yes
Proxy Authentication  Unreliable   Yes

Features

  • Enterprise Proxy Support: Respects HTTP_PROXY, HTTPS_PROXY, and NO_PROXY environment variables
  • Multiple Output Formats: Markdown, HTML, JSON, screenshots, links, and more
  • Web Scraping: Extract clean data from any URL
  • Website Crawling: Recursively crawl and discover pages
  • URL Mapping: Fast website URL discovery
  • Web Search: Search and optionally scrape results
  • Batch Operations: Process multiple URLs at once
  • uvx Compatible: Run without installation using uvx

Installation

Method 1: Using uvx (Recommended for agents)

# Run without installing
uvx firecrawl-cli scrape https://example.com

# With specific version
uvx firecrawl-cli@1.0.0 scrape https://example.com

Method 2: Using uv

uv tool install firecrawl-cli

Method 3: Using pip

pip install firecrawl-cli

Quick Start

1. Set up authentication

# Interactive login
firecrawl login

# Or set environment variable
export FIRECRAWL_API_KEY="fc-xxxxx"

2. Scrape a URL

# Quick scrape (default: markdown)
firecrawl scrape https://example.com

# Multiple formats
firecrawl scrape https://example.com --format markdown,html,links

# Save to file
firecrawl scrape https://example.com --output result.md

Enterprise Proxy Configuration

This CLI properly supports enterprise proxy settings through environment variables:

Environment Variables

# Set proxy for all HTTP requests
export HTTP_PROXY="http://proxy.company.com:8080"
export HTTPS_PROXY="https://proxy.company.com:8080"

# With authentication
export HTTPS_PROXY="http://user:pass@proxy.company.com:8080"

# Bypass proxy for certain hosts
export NO_PROXY="localhost,127.0.0.1,internal.company.com"
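NO_PROXY host matching can be checked against the standard library's `proxy_bypass_environment`, which applies the usual exact-host and suffix-matching rules (it is an internal helper, so treat this as illustrative rather than a stable API):

```python
import os
from urllib.request import proxy_bypass_environment

os.environ["NO_PROXY"] = "localhost,127.0.0.1,internal.company.com"

# Hosts listed in NO_PROXY (or ending with a listed suffix) bypass the proxy
print(bool(proxy_bypass_environment("internal.company.com")))  # True
print(bool(proxy_bypass_environment("example.com")))           # False
```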

Priority Order

Proxy configuration is resolved in the following priority order:

  1. Command-line --proxy option
  2. Environment variables (HTTPS_PROXY, HTTP_PROXY)
  3. Configuration file settings
  4. System defaults
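The chain above can be sketched as a small resolver; `cli_proxy` and `config_proxy` are hypothetical names used for illustration, not the CLI's actual internals:

```python
import os
from typing import Optional

def resolve_proxy(cli_proxy: Optional[str] = None,
                  config_proxy: Optional[str] = None) -> Optional[str]:
    """Return the proxy URL to use, honouring the documented priority order."""
    # 1. Command-line --proxy option wins outright
    if cli_proxy:
        return cli_proxy
    # 2. Environment variables (HTTPS_PROXY preferred over HTTP_PROXY)
    for var in ("HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy"):
        if os.environ.get(var):
            return os.environ[var]
    # 3. Configuration file settings
    if config_proxy:
        return config_proxy
    # 4. System default: no explicit proxy
    return None
```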

Verify Proxy Settings

firecrawl status

This will show whether a proxy is configured and its URL (with credentials masked).
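Credential masking of this kind can be sketched with the standard library's URL tools; `mask_proxy_url` is an illustrative helper, not the CLI's actual code:

```python
from urllib.parse import urlsplit, urlunsplit

def mask_proxy_url(url: str) -> str:
    """Replace any user:pass in a proxy URL with *** for safe display."""
    parts = urlsplit(url)
    if parts.username is None:
        return url  # no credentials to hide
    netloc = f"***:***@{parts.hostname or ''}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

print(mask_proxy_url("http://user:pass@proxy.company.com:8080"))
# http://***:***@proxy.company.com:8080
```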

Commands

scrape

Extract content from a URL in various formats.

firecrawl scrape <URL> [OPTIONS]

Options:
  --format, -f         Output format(s): markdown, html, rawHtml, links,
                       images, screenshot, summary, json, branding
  --only-main-content  Extract only main content (default: true)
  --wait-for          Wait time in milliseconds
  --screenshot        Take a screenshot
  --max-age           Maximum age of cached content in ms
  --output, -o        Output file path
  --json              Output as JSON
  --pretty            Pretty print JSON

Examples:

# Basic scrape
firecrawl scrape https://example.com

# HTML output
firecrawl scrape https://example.com --format html

# Multiple formats with JSON output
firecrawl scrape https://example.com --format markdown,links --json --pretty

# Screenshot
firecrawl scrape https://example.com --format screenshot --output screenshot.png

crawl

Recursively crawl a website.

firecrawl crawl <URL> [OPTIONS]

Options:
  --wait              Wait for crawl to complete
  --limit             Maximum pages to crawl
  --max-depth         Maximum crawl depth
  --exclude-paths     Comma-separated paths to exclude
  --include-paths     Comma-separated paths to include
  --sitemap           Sitemap handling: include, skip
  --output, -o        Output file path
  --pretty            Pretty print JSON

Examples:

# Crawl with limit
firecrawl crawl https://example.com --limit 10

# Wait for completion
firecrawl crawl https://example.com --wait --limit 100

# Check crawl status
firecrawl crawl JOB_ID --status

map

Discover URLs on a website.

firecrawl map <URL> [OPTIONS]

Options:
  --limit             Maximum URLs to discover
  --search            Search query to filter URLs
  --sitemap           Sitemap handling: only, include, skip
  --include-subdomains  Include subdomains in the results
  --output, -o        Output file path

Examples:

firecrawl map https://example.com
firecrawl map https://example.com --limit 100 --search "blog"

search

Search the web with optional result scraping.

firecrawl search <QUERY> [OPTIONS]

Options:
  --limit             Maximum results (default: 5)
  --sources           Comma-separated: web, images, news
  --categories        Comma-separated: github, research, pdf
  --location          Geo-targeting location
  --country           ISO country code (default: US)
  --scrape            Enable scraping of results
  --scrape-formats    Formats for scraped content

Examples:

firecrawl search "python web scraping"
firecrawl search "firecrawl" --limit 10 --scrape

batch

Scrape multiple URLs at once.

firecrawl batch <URL>... [OPTIONS]

Options:
  --format, -f        Output format (default: markdown)

Examples:

firecrawl batch https://example.com https://example2.com
firecrawl batch https://site1.com https://site2.com --format html

Authentication Commands

# Login and save credentials
firecrawl login

# Login with API key directly
firecrawl login --api-key fc-xxxxx

# View configuration
firecrawl config --view

# Logout (clear credentials)
firecrawl logout

# Check status
firecrawl status

Configuration

Configuration is stored in platform-specific locations:

  • Linux: ~/.config/firecrawl/config.json
  • macOS: ~/Library/Application Support/firecrawl/config.json
  • Windows: %APPDATA%\firecrawl\config.json
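Resolving that location can be sketched with the standard library (the real CLI may use a helper library for this; treat the sketch as an approximation):

```python
import os
import sys
from pathlib import Path

def config_path() -> Path:
    """Return the platform-specific config file location listed above."""
    if sys.platform == "win32":
        base = Path(os.environ.get("APPDATA", str(Path.home() / "AppData" / "Roaming")))
    elif sys.platform == "darwin":
        base = Path.home() / "Library" / "Application Support"
    else:
        # Linux and other POSIX systems: ~/.config is the XDG default
        base = Path(os.environ.get("XDG_CONFIG_HOME", str(Path.home() / ".config")))
    return base / "firecrawl" / "config.json"
```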

Environment Variables

Variable           Description
FIRECRAWL_API_KEY  Your Firecrawl API key
FIRECRAWL_API_URL  Custom API URL (optional)
HTTP_PROXY         HTTP proxy URL
HTTPS_PROXY        HTTPS proxy URL
NO_PROXY           Comma-separated hosts to bypass proxy

Output Formats

Scrape Formats

  • markdown - Clean markdown text (default)
  • html - Clean HTML
  • rawHtml - Raw HTML without processing
  • links - Extracted links from the page
  • images - Extracted image URLs
  • screenshot - Base64-encoded screenshot
  • summary - Page summary
  • json - Structured data extraction
  • branding - Brand identity information
  • changeTracking - Content change tracking

Output Modes

# Pretty output (default, human-readable)
firecrawl scrape https://example.com

# JSON output (for programmatic use)
firecrawl scrape https://example.com --json

# Pretty JSON output
firecrawl scrape https://example.com --json --pretty

# Save to file
firecrawl scrape https://example.com --output result.json
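The --json mode is intended for piping into other tools; a minimal consumer, using a made-up sample payload (the real field names may differ from this guess, so check the output of your installed version):

```python
import json

# Hypothetical sample of `firecrawl scrape <URL> --json` output;
# the actual schema may differ.
sample = '{"success": true, "data": {"markdown": "# Example Domain"}}'

doc = json.loads(sample)
if doc.get("success"):
    print(doc["data"]["markdown"])  # # Example Domain
```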

Exit Codes

Code  Meaning
0     Success
1     General error
2     Invalid arguments
3     Authentication error
4     API error
5     Network error
130   Interrupted (Ctrl+C)
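A wrapper script can branch on these codes; a sketch using subprocess (it assumes `firecrawl` is on PATH when `run_scrape` is actually called):

```python
import subprocess
import sys

# Meanings taken from the exit-code table above
EXIT_MEANINGS = {
    0: "success",
    1: "general error",
    2: "invalid arguments",
    3: "authentication error",
    4: "API error",
    5: "network error",
    130: "interrupted (Ctrl+C)",
}

def run_scrape(url: str) -> int:
    """Run a scrape and report the documented meaning of its exit code."""
    proc = subprocess.run(["firecrawl", "scrape", url])
    meaning = EXIT_MEANINGS.get(proc.returncode, "unknown")
    print(f"exit {proc.returncode}: {meaning}", file=sys.stderr)
    return proc.returncode
```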

Development

Setup

# Clone repository
git clone https://github.com/socamalo/firecrawl-cli-python.git
cd firecrawl-cli-python

# Install with uv
uv sync --dev

# Run in development mode
uv run firecrawl scrape https://example.com

Testing

# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov=firecrawl_cli

Linting

# Format code
uv run ruff format .

# Check linting
uv run ruff check .

# Type checking
uv run mypy src/firecrawl_cli

License

MIT License - see LICENSE for details.

Contributing

Contributions are welcome! Please read our Contributing Guide for details.
