This project scrapes https://mcpservers.org/all?sort=newest&page=1, paginates through the full listing, follows each server detail page, and writes one JSON object per server to output/mcpservers.jsonl.
Each JSONL record includes:
- `title`
- `summary`
- `github_url`
- `repo_title`
- `content_text`
- `content_html`
- `_source_url`
- `_listing_url`
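As a quick sanity check on the output, the records can be parsed and compared against the field list above. This is a minimal sketch using only the standard library; the helper names `parse_jsonl` and `missing_fields` are hypothetical and not part of this project.

```python
import json

# Field names taken from the record description above.
EXPECTED_FIELDS = (
    "title",
    "summary",
    "github_url",
    "repo_title",
    "content_text",
    "content_html",
    "_source_url",
    "_listing_url",
)


def parse_jsonl(lines):
    """Yield one dict per non-empty line; `lines` is any iterable of strings."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def missing_fields(record):
    """Return the expected field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]
```

An open file handle is an iterable of lines, so `parse_jsonl(open("output/mcpservers.jsonl", encoding="utf-8"))` should work once the scraper has produced its output.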
    uv sync
    uv run mcpservers-scraper

You can also run it directly with:

    uv run main.py

The crawler uses silkworm with:
- pagination from the `Next` button on the `/all` listing
- detail-page follows for every `/servers/...` link
- JSONL output in `output/mcpservers.jsonl`
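The link-discovery step described above can be illustrated without any crawler framework. The sketch below is not silkworm's API and not this project's code: it uses only the Python standard library, and it assumes that the `Next` button is an `<a>` element whose visible text is exactly `Next` and that detail links are `<a>` elements whose `href` starts with `/servers/`. The names `ListingParser` and `parse_listing` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://mcpservers.org"


class ListingParser(HTMLParser):
    """Collect detail-page links and the Next link from one listing page."""

    def __init__(self):
        super().__init__()
        self.detail_links = []   # absolute URLs, de-duplicated, in page order
        self.next_link = None    # absolute URL of the next listing page, if any
        self._href = None        # href of the <a> currently being read
        self._text = []          # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or self._href is None:
            return
        href, text = self._href, "".join(self._text).strip()
        self._href = None
        if href.startswith("/servers/"):
            url = urljoin(BASE_URL, href)
            if url not in self.detail_links:
                self.detail_links.append(url)
        elif text == "Next":  # assumption: the button's label is exactly "Next"
            self.next_link = urljoin(BASE_URL, href)


def parse_listing(html):
    """Return (detail_links, next_link) for one listing page's HTML."""
    parser = ListingParser()
    parser.feed(html)
    return parser.detail_links, parser.next_link
```

A crawl loop would call `parse_listing` on each listing page, fetch every returned detail URL, and continue with `next_link` until it is `None`.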