NameetP/pdfmux
by Various
PDF extraction that checks its own work. #2 reading order accuracy — zero AI, zero GPU, zero cost.
MCP
Added 1 June 2026
Overview
A PDF extraction tool that checks its own output. It takes a rule-based approach that needs no AI model or GPU, and the project describes itself as #2 for reading order accuracy at zero cost.
Best for
Developers needing lightweight, verifiable PDF text extraction without AI costs
Use cases
- Extracting text from PDFs while preserving logical reading order
- Building reliable document processing pipelines without AI dependencies
- Validating extraction quality automatically for downstream tasks
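The validation use case above can be illustrated with a small, generic heuristic. This is a minimal sketch under stated assumptions, not pdfmux's actual algorithm or API: the names `check_extraction` and `QualityReport`, the thresholds, and the scoring weights are all hypothetical and chosen only for illustration. It takes text that some extractor has already produced and flags common signs of a bad extraction.

```python
from dataclasses import dataclass


@dataclass
class QualityReport:
    score: float          # 0.0 (unusable) to 1.0 (no problems detected)
    reasons: list[str]    # human-readable explanation for each deduction


def check_extraction(text: str, min_chars: int = 50) -> QualityReport:
    """Hypothetical heuristic check on already-extracted text.

    Illustrative only: thresholds and weights are assumptions,
    not values taken from pdfmux.
    """
    reasons: list[str] = []
    score = 1.0

    stripped = text.strip()
    if len(stripped) < min_chars:
        # Very short output often means a scanned page with no text layer.
        reasons.append("too little text")
        score -= 0.5

    if stripped:
        # U+FFFD marks characters the decoder could not map.
        bad = stripped.count("\ufffd")
        if bad / len(stripped) > 0.01:
            reasons.append("many replacement characters")
            score -= 0.3

        # Share of letters among non-whitespace characters.
        non_ws = [c for c in stripped if not c.isspace()]
        alpha = sum(c.isalpha() for c in non_ws)
        if non_ws and alpha / len(non_ws) < 0.5:
            reasons.append("low share of letters")
            score -= 0.2

    return QualityReport(score=max(score, 0.0), reasons=reasons)
```

A pipeline could call such a check after each page and route low-scoring pages to a fallback extractor or to manual review, instead of passing them downstream unnoticed.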
Notes
66 stars on GitHub. Last updated 2026-05-22. Licensed MIT.
Pros
- No AI or GPU required, runs on any machine
- Self-checking mechanism verifies the extracted output
- Free and open source with a simple Python interface
Cons
- Limited to reading order extraction, no layout or table parsing
- Small community (66 stars on GitHub), so documentation and third-party support may be limited
- May struggle with complex PDF layouts or scanned documents
Indexed from awesome-mcp-servers-punkpeye and enriched against its public facts.