0xMassi/webclaw
by Various
Fast, local-first web content extraction for LLMs. Scrape, crawl, extract structured data — all from Rust. CLI, REST API, and MCP server.
MCP
Added 1 June 2026
Overview
A Rust-based tool for fast, local-first web content extraction. It provides a CLI, REST API, and MCP server for scraping, crawling, and extracting structured data. Designed to feed content directly into LLM workflows.
Best for
Developers needing fast, local web content extraction for LLM pipelines
Use cases
- Extracting structured data from web pages for LLM training datasets
- Crawling websites to build custom knowledge bases
- Integrating web content into AI pipelines via REST API or MCP server
Notes
1,269 stars on GitHub. Last updated 2026-05-31. Licensed AGPL-3.0.
Pros
- Fast performance due to Rust implementation
- Local-first design keeps extraction and extracted data on your own machine
- Multiple interfaces (CLI, API, MCP) for flexible integration
Cons
- Requires the Rust toolchain to build from source
- Limited community and documentation compared to established scraping tools
- Primarily optimized for LLM use cases, not general-purpose web scraping
Indexed from awesome-mcp-servers-punkpeye and enriched against its public facts.