linxule/mineru-mcp
by Various
MCP server for the MinerU document parsing API. Parse PDFs, images, DOCX, and PPTX with OCR (109 languages), batch processing (200 docs), page ranges, and local file upload.
MCP
Added 1 June 2026
Overview
An MCP server that wraps the MinerU document parsing API. It extracts text from PDFs, images, DOCX, and PPTX using OCR supporting 109 languages, with batch processing up to 200 documents, configurable page ranges, and local file upload.
Best for
Developers who need a programmable OCR server with multi-format, multi-language support for AI document pipelines
Use cases
- Extracting text from scanned PDFs and images with OCR in multiple languages
- Batch processing up to 200 documents for large-scale text extraction
- Integrating document parsing into MCP-compatible AI workflows
Notes
5 stars on GitHub. Last updated 2026-05-07.
Pros
- Supports 109 languages for broad OCR coverage
- Handles multiple document formats (PDF, image, DOCX, PPTX) in one tool
- Allows batch processing and page range selection for flexible extraction
Cons
- Depends on the external MinerU API, which may impose costs or rate limits
- Batch size is capped at 200 documents, so larger jobs must be split into several batches
- Implemented only in JavaScript, which narrows the options for teams working in other language ecosystems
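One way to work within the 200-document cap is to split a larger job into consecutive batches before submitting them. The sketch below is only an illustration: the helper name chunk_documents, the constant MAX_BATCH_SIZE, and the placeholder file names are assumptions made for this example and are not part of the server's interface. It shows the splitting step only and does not call the MinerU API.

```python
from typing import Iterator, List, Sequence

# Batch cap taken from this entry; the constant name is hypothetical.
MAX_BATCH_SIZE = 200


def chunk_documents(
    paths: Sequence[str], size: int = MAX_BATCH_SIZE
) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


# Hypothetical input: 450 placeholder file names.
documents = [f"doc_{i:04d}.pdf" for i in range(450)]
batches = list(chunk_documents(documents))
print([len(batch) for batch in batches])  # [200, 200, 50]
```

Each resulting batch could then be submitted as its own request, ideally with a pause between submissions if the API enforces rate limits.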
Indexed from awesome-mcp-servers-punkpeye and enriched against its public facts.