Weblens reads the web and turns pages into readable Markdown.
Built for the terminal, scripts, automation, MCP tools, and AI workflows.
Weblens is a read tool for the web.
Use it when you want:
- the useful text from a page
- Markdown instead of DOM noise
- a fast HTTP path first
- optional Chromium rendering when a page really needs it
- one tool that works in the CLI, Python, and MCP
What it does:
- Fetches a URL with browser-like HTTP defaults
- Extracts readable content as Markdown
- Supports cookies, headers, proxies, and timeouts
- Supports JavaScript rendering with a local Chromium-based browser
- Supports bounded render scrolling for lazy-loaded pages
- Supports ordered concurrent batch fetching with `fetch_many()`
- Supports validator-based caching with `PageCache`
- Supports benchmarking for extraction and throughput
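
To illustrate what validator-based caching means in general, here is a minimal sketch. It is not the `PageCache` implementation, whose interface is not shown here. The `ValidatorCache` class and the `http_get(url, headers)` callable (returning `(status, headers, body)`) are hypothetical names used only for this example.

```python
class ValidatorCache:
    """Hypothetical in-memory cache that revalidates entries using ETag validators."""

    def __init__(self):
        self._entries = {}  # url -> (etag, body)

    def get(self, url, http_get):
        # http_get is an assumed callable: (url, headers) -> (status, headers, body).
        headers = {}
        cached = self._entries.get(url)
        if cached is not None:
            # Ask the server to send the body only if it changed.
            headers["If-None-Match"] = cached[0]
        status, response_headers, body = http_get(url, headers)
        if status == 304 and cached is not None:
            # Not modified: reuse the stored body instead of downloading it again.
            return cached[1]
        # This sketch assumes the header key is spelled exactly "ETag".
        etag = response_headers.get("ETag")
        if etag:
            self._entries[url] = (etag, body)
        return body
```

The point of the design is that a repeat fetch costs one small conditional request rather than a full download when the page has not changed.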
Install from this repo:
python3 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e .
Optional extras:
pip install -e ".[speed]"
pip install -e ".[render]"
pip install -e ".[bench-compare]"
- `speed`: enables `selectolax` for a faster primary extractor
- `render`: enables JavaScript rendering with `nodriver`
- `bench-compare`: enables extra benchmark backends like BeautifulSoup
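
To check which extras are usable in the current environment, a small sketch like the following may help. It only tests whether the packages named above can be imported. The `available_extras` helper is a hypothetical name, and `bs4` (the import name of BeautifulSoup) stands in for `bench-compare`, which may pull in other backends as well.

```python
from importlib.util import find_spec

# Extra name -> importable module it is expected to provide (assumed mapping).
EXTRA_MODULES = {
    "speed": "selectolax",
    "render": "nodriver",
    "bench-compare": "bs4",
}


def available_extras(modules=EXTRA_MODULES):
    """Return {extra: True/False} depending on whether its module is importable."""
    return {extra: find_spec(module) is not None for extra, module in modules.items()}


if __name__ == "__main__":
    for extra, present in available_extras().items():
        print(f"{extra}: {'installed' if present else 'missing'}")
```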
Basic CLI use:
weblens https://www.wikipedia.org/
Basic Python use:
import weblens
markdown = weblens.fetch("https://www.wikipedia.org/")
print(markdown)
Render a JS-heavy page:
weblens https://www.spotify.com/ \
--render-js \
--browser-executable "$WEBLENS_CHROME"
Load more of a lazy page:
weblens https://example.com/feed \
--render-js \
--render-scroll 3 \
--browser-executable "$WEBLENS_CHROME"
Scroll to the bottom of a long finite page:
weblens https://example.com/long-page \
--render-js \
--render-scroll-to-bottom \
--browser-executable "$WEBLENS_CHROME"
Python render example:
import weblens
markdown = weblens.fetch(
"https://www.spotify.com/",
render_js=True,
browser_executable="/path/to/chrome",
render_wait=5.0,
render_scroll_to_bottom=True,
)
Use `--render-js` when:
- the page is a JavaScript app
- the fast path returns `JavaScriptRequiredError`
- the page works in a real browser session but not in raw HTML
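
The "fast path first, render only when needed" rule above could be sketched in Python roughly as follows. This assumes `JavaScriptRequiredError` is raised as an exception; where it is importable from is not shown here, so the fetch function and the exception class are passed in as arguments. The `fetch_with_render_fallback` helper is a hypothetical name.

```python
def fetch_with_render_fallback(url, fetch, js_required_error, browser_executable):
    """Try the plain HTTP path, then retry with rendering if JavaScript is required.

    fetch is expected to behave like weblens.fetch; js_required_error is the
    exception class signalling that the page needs JavaScript (assumed to be raised).
    """
    try:
        # Fast path: plain HTTP fetch, no browser involved.
        return fetch(url)
    except js_required_error:
        # Slow path: retry once with Chromium rendering enabled.
        return fetch(url, render_js=True, browser_executable=browser_executable)
```

With Weblens installed, this might be called with `weblens.fetch` as `fetch` and the library's `JavaScriptRequiredError` class as `js_required_error`.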
Weblens does not bundle Chromium. Point it at a local Chromium-based browser:
export WEBLENS_CHROME=/path/to/chrome
weblens https://open.spotify.com/ --render-js --browser-executable "$WEBLENS_CHROME"
Weblens is especially good for:
- saving articles, docs, and product pages as readable Markdown
- feeding clean web content into LLMs
- building terminal and Python automation around web reading
- using one MCP tool to read the internet
- reading JS-heavy pages only when needed
- batch text collection across many URLs
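
For batch collection, the idea of ordered concurrent fetching can be sketched with the standard library. This is not how `fetch_many()` is implemented, and its signature is not shown here. The `fetch_all_ordered` helper is a hypothetical stand-in that takes any single-URL fetch function.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all_ordered(urls, fetch, max_workers=4):
    """Fetch URLs concurrently and return (url, markdown, error) in input order."""

    def safe_fetch(url):
        # Capture failures per URL so one bad page does not abort the batch.
        try:
            return url, fetch(url), None
        except Exception as exc:
            return url, None, exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which request finishes first.
        return list(pool.map(safe_fetch, urls))
```

With Weblens installed, passing `weblens.fetch` as `fetch` would give one result per URL, in the same order as the input list.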
Local extractor benchmarks:
weblens-bench
weblens-bench --iterations 20 --warmup 5
Live benchmarks:
weblens-bench --live-only \
--url https://www.wikipedia.org/ \
--url https://docs.python.org/3/
Comparison benchmarks:
weblens-bench --compare-only
weblens-bench --live-only --compare --url https://www.wikipedia.org/
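
As a rough illustration of what warmup and iteration counts mean in a benchmark, here is a minimal timing sketch. It is not the `weblens-bench` implementation. The `time_callable` helper is a hypothetical name, and it assumes `iterations` is at least 1.

```python
import statistics
import time


def time_callable(func, iterations=20, warmup=5):
    """Run func warmup times untimed, then time it for the given iterations."""
    for _ in range(warmup):
        func()  # warm caches and lazy imports; these runs are not measured

    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)

    return {
        "iterations": iterations,
        "mean_s": statistics.mean(samples),
        "min_s": min(samples),
    }
```

Discarding the warmup runs keeps one-time setup costs out of the reported numbers.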