Enterprise DNA

linxule/mineru-mcp

by Various

📇 ☁️ - MCP server for MinerU document parsing API. Parse PDFs, images, DOCX, and PPTX with OCR (109 languages), batch processing (200 docs), page ranges, and local file upload.

Added 1 June 2026

Overview

An MCP server that wraps the MinerU document parsing API. It parses PDFs, images, DOCX, and PPTX files, with OCR in 109 languages, batch processing of up to 200 documents, configurable page ranges, and local file upload.

Best for

Developers who need a programmable OCR server with multi-format, multi-language support for AI document pipelines

Use cases

  • Extracting text from scanned PDFs and images with OCR in multiple languages
  • Batch processing up to 200 documents for large-scale text extraction
  • Integrating document parsing into MCP-compatible AI workflows

Notes

5 stars on GitHub. Last updated 2026-05-07.

Pros

  • Supports 109 languages for broad OCR coverage
  • Handles multiple document formats (PDF, image, DOCX, PPTX) in one tool
  • Allows batch processing and page range selection for flexible extraction

Cons

  • Depends on an external API (MinerU), which may impose costs or rate limits
  • Batch size capped at 200 documents, limiting very large jobs
  • Only available as a JavaScript project, limiting language ecosystem choice
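
For jobs larger than the 200-document cap, one common workaround is to split the input into several batches on the client side. The following is a minimal sketch under stated assumptions, not part of mineru-mcp or the MinerU API: the helper `chunkIntoBatches`, the constant `MAX_BATCH_SIZE`, and the placeholder file names are all hypothetical, and only the figure of 200 comes from the listing above.

```typescript
// Hypothetical helper: not part of mineru-mcp or the MinerU API.
// The cap of 200 documents per batch is taken from the listing above.
const MAX_BATCH_SIZE = 200;

// Split a list of items into consecutive batches of at most `batchSize` items.
function chunkIntoBatches<T>(
  items: readonly T[],
  batchSize: number = MAX_BATCH_SIZE,
): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    // slice() stops at the end of the array, so the last batch may be shorter.
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// Example: 450 placeholder file names become batches of 200, 200, and 50.
const files: string[] = Array.from({ length: 450 }, (_, i) => `doc-${i + 1}.pdf`);
const batches: string[][] = chunkIntoBatches(files);
console.log(batches.map((b) => b.length).join(", ")); // prints "200, 200, 50"
```

Each resulting batch could then be submitted as a separate request. How requests are actually submitted, and whether the service applies rate limits between them, depends on the MinerU API and is not shown here.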

Indexed from awesome-mcp-servers-punkpeye and enriched against its public facts.
