Enterprise DNA
sifter-ai/sifter

by Various

Added 1 June 2026

Overview

Sifter is an open-source, developer-first document extraction engine. It uses TypeScript to convert unstructured documents such as invoices, contracts, receipts, and reports into a structured, queryable database. It is designed to be self-hosted and to integrate into developer workflows.

Best for

Developers needing an open-source document extraction engine to build structured databases from unstructured files

Use cases

  • Extracting structured data from invoices and receipts
  • Parsing contracts into queryable records for analysis
  • Building a searchable database from reports and documents
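To make "structured, queryable" concrete, here is a minimal sketch of the kind of record an invoice might become and a query over it. The InvoiceRecord shape, the sample rows, and the totalsByVendor helper are all hypothetical illustrations; none of them are taken from Sifter's actual API or schema.

```typescript
// Hypothetical record shape for one extracted invoice (not Sifter's real schema).
interface InvoiceRecord {
  vendor: string;
  invoiceNumber: string;
  issuedOn: string; // ISO 8601 date string
  totalCents: number; // integer cents, to avoid floating-point rounding
}

// Hand-written sample rows standing in for the output of an extraction step.
const records: InvoiceRecord[] = [
  { vendor: "Acme Ltd", invoiceNumber: "INV-001", issuedOn: "2026-04-02", totalCents: 125000 },
  { vendor: "Globex", invoiceNumber: "INV-002", issuedOn: "2026-04-15", totalCents: 48050 },
  { vendor: "Acme Ltd", invoiceNumber: "INV-003", issuedOn: "2026-05-01", totalCents: 30000 },
];

// Sum invoice totals per vendor: a question that is easy to ask of
// structured rows and hard to ask of raw PDFs or scans.
function totalsByVendor(rows: InvoiceRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    totals.set(row.vendor, (totals.get(row.vendor) ?? 0) + row.totalCents);
  }
  return totals;
}

const totals = totalsByVendor(records);
```

Once documents are reduced to rows like these, the same data can be filtered, joined, or loaded into any database the team already uses.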

Notes

44 stars on GitHub. Last updated 2026-05-29. Licensed MIT.

Pros

  • Open-source and self-hostable, giving full control over data
  • Developer-first design with a TypeScript API for easy integration
  • Turns unstructured documents into structured, queryable data

Cons

  • Small community (44 stars) may limit support and contributions
  • May require significant setup for complex or varied document types
  • Not optimized for real-time extraction at scale without additional tuning

Indexed from awesome-mcp-servers-punkpeye and enriched against its public facts.
