Use case
Build a Document Intelligence Agent
Extract structured, queryable data from PDFs and unstructured documents so the agent can reason over it, store it, and hand off clean records downstream.
The actual problem is that your documents do not agree on a format. One vendor sends a two-column PDF invoice. Another sends a scanned image. A third sends a Word doc with merged cells. Engineers building document pipelines spend most of their time writing one-off parsers for each new format, then debugging the silent failures when a table shifts by one column. The people who need this stack are ops teams at law firms, finance teams processing supplier invoices, and SaaS teams whose customers upload anything. What makes it hard is not the extraction itself but getting the model to return a reliably typed object every time, not a plausible-looking string that breaks downstream on Wednesday.
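To make "reliably typed" concrete, here is a minimal, library-free sketch of the idea: parse the model's reply into a typed record, and if validation fails, ask again with the error message as feedback. The `InvoiceRecord` schema, its field names, and the `ask_model` callable are all hypothetical stand-ins, not part of any tool in this stack.

```python
import json
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, InvalidOperation
from typing import Callable, Optional


@dataclass(frozen=True)
class InvoiceRecord:
    """Hypothetical target schema; the field names are illustrative only."""
    vendor: str
    invoice_date: date
    total: Decimal


def parse_invoice(raw: str) -> InvoiceRecord:
    """Turn a model reply (expected to be a JSON object) into a typed record.

    Raises ValueError with a readable message if anything is missing or malformed.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    try:
        vendor = data["vendor"]
        invoice_date = date.fromisoformat(data["invoice_date"])
        total = Decimal(str(data["total"]))
    except KeyError as exc:
        raise ValueError(f"missing field: {exc}") from exc
    except (TypeError, ValueError, InvalidOperation) as exc:
        raise ValueError(f"bad field value: {exc}") from exc
    if not isinstance(vendor, str) or not vendor.strip():
        raise ValueError("vendor must be a non-empty string")
    if not total.is_finite() or total < 0:
        raise ValueError("total must be a finite, non-negative number")
    return InvoiceRecord(vendor.strip(), invoice_date, total)


def extract_with_retries(
    ask_model: Callable[[Optional[str]], str],
    max_retries: int = 2,
) -> InvoiceRecord:
    """Call the model, validate the reply, and re-ask with the error on failure.

    `ask_model` is a hypothetical callable: it receives the previous validation
    error (or None on the first attempt) and returns the model's raw reply.
    """
    feedback: Optional[str] = None
    for _ in range(max_retries + 1):
        raw = ask_model(feedback)
        try:
            return parse_invoice(raw)
        except ValueError as exc:
            feedback = str(exc)
    raise ValueError(f"no valid record after {max_retries + 1} attempts: {feedback}")


if __name__ == "__main__":
    # A fake model: the first reply has a non-ISO date, the second one is clean.
    replies = iter([
        '{"vendor": "Acme", "invoice_date": "03/04/2024", "total": "1200"}',
        '{"vendor": "Acme", "invoice_date": "2024-04-03", "total": "1200.50"}',
    ])
    print(extract_with_retries(lambda feedback: next(replies)))
```

The point of the sketch is that a failure surfaces as an exception at extraction time rather than as a bad row discovered later. In practice a validation library would likely replace the hand-written checks.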
The stack
Each pick is a real entry on the index. Click any one for the full detail page.
- 1A Agents Driver
Claude Code
by Anthropic
Why this: Headless Claude Code processes incoming documents in batch via a cron job or file-watch trigger without any UI wrapper. Skills encode the extraction contract per document type, so the agent runs the same typed pipeline whether the input is an invoice PDF or a scanned W-9.
Full entry
- 2M MCP PDF surface
AryanBV/pdf-toolkit-mcp
by Various
Why this: Twenty-two tools covering read, render, and transform. For scanned pages the vision rendering path is what prevents the pipeline from silently returning empty text on image-heavy PDFs. Zero native dependencies keeps the deployment footprint small.
Full entry
- 3M MCP Format normaliser
microsoft/markitdown
by Microsoft
Why this: Not every document is a PDF. This tool converts Word docs, Excel sheets, and image files into clean Markdown before the extraction step, so the agent sees a consistent text format regardless of what the user uploaded. With roughly 138k stars it is widely used, which suggests the conversions have been exercised on plenty of weird edge cases.
Full entry
- 4O OSS Structured output
Instructor
by Jason Liu (community)
Why this: Instructor patches the Anthropic client so every extraction call returns a validated Pydantic model instead of a raw string. When a response fails validation, Instructor re-asks the model with the validation errors, up to the retry limit you configure. For a pipeline that processes hundreds of documents a day, that retry logic is what prevents silent data loss at row 247.
Full entry
- 5O OSS Storage + retrieval
pgvector
by Community
Why this: Extracted records land in Postgres with embeddings stored via pgvector. This means the agent can do both exact-match queries (give me all invoices from vendor X) and semantic queries (find contracts that mention termination for convenience) against the same table without standing up a separate vector database.
Full entry
- 6M MCP Query interface
Postgres MCP Server
by Model Context Protocol (reference)
Why this: Once the extracted data is in Postgres the rest of your stack needs to query it. The reference Postgres MCP server gives any downstream agent read access to the schema and extracted records without anyone copy-pasting a connection string or opening a DB GUI.
Full entry
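The "exact-match and semantic queries against the same table" idea from the pgvector pick can be sketched roughly as follows. The `documents` table, its columns, and the toy embedding dimension are assumptions made for illustration. The `%s` placeholders assume a psycopg-style DB-API driver, and `find_similar` accepts any cursor object with `execute` and `fetchall`. The `vector` type and the `<=>` cosine-distance operator come from pgvector itself.

```python
import math
from typing import Any, List, Sequence

# Toy dimension for illustration; it must match whatever embedding model you use.
EMBEDDING_DIM = 3

# Hypothetical schema: structured fields and the embedding live in one table.
SCHEMA_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id        bigserial PRIMARY KEY,
    vendor    text NOT NULL,
    doc_type  text NOT NULL,
    body      text NOT NULL,
    embedding vector({EMBEDDING_DIM})
);
"""

# Exact match: ordinary SQL over the extracted fields.
EXACT_MATCH_SQL = """
SELECT id, body
FROM documents
WHERE vendor = %s AND doc_type = 'invoice';
"""

# Semantic match: order by cosine distance (pgvector's <=> operator).
SEMANTIC_SQL = """
SELECT id, body, embedding <=> %s::vector AS cosine_distance
FROM documents
WHERE doc_type = 'contract'
ORDER BY cosine_distance
LIMIT %s;
"""


def to_vector_literal(embedding: Sequence[float], dim: int = EMBEDDING_DIM) -> str:
    """Format an embedding as pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    if len(embedding) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(embedding)}")
    values = [float(x) for x in embedding]
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding values must be finite")
    return "[" + ",".join(repr(v) for v in values) + "]"


def find_similar(cursor: Any, query_embedding: Sequence[float], limit: int = 5) -> List[Any]:
    """Run the semantic query on a DB-API cursor and return the rows."""
    cursor.execute(SEMANTIC_SQL, (to_vector_literal(query_embedding), limit))
    return cursor.fetchall()
```

Passing the embedding as a text literal and casting with `::vector` avoids needing a driver-specific type adapter, at the cost of a string round-trip. A dedicated adapter may be preferable in a real deployment.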
Get this running with Enterprise DNA.
Enterprise DNA connects this stack to an operating layer that most document pipelines are missing. Each extraction job runs as a work item in OPM so you know which documents processed, which failed, and who owns the follow-up. Extracted records that need a human decision land in the inbox rather than a Slack message nobody finds. Secrets for the Postgres connection and Anthropic key live in Infisical and get pulled at runtime, so neither ends up in a .env file on a developer's laptop. The CRM record for each customer is the downstream destination for the structured fields the agent extracts, closing the loop from document upload to live account data without a manual copy-paste step.
Alternative stacks
Different angles on the same outcome.
Build a research agent
If the documents are public web sources rather than uploaded files, the research agent stack replaces the extraction pipeline with search MCPs and a delivery layer.
See the alternative
Build a personal email assistant
Email attachments are often the delivery mechanism for the same PDFs and invoices. Combining the email assistant with this extraction stack routes attachments straight into the structured data pipeline.
See the alternative
Build a code review bot
If the documents in question are pull requests and changelogs rather than PDFs, the code review bot stack is the closer fit.
See the alternative