microsoft/markitdown
by Various
Python tool for converting files and office documents to Markdown.
MCP
Added 1 June 2026
Overview
Python tool that converts files and Office documents into Markdown format. Handles multiple input types including PDFs, Word docs, Excel sheets, and images, outputting clean Markdown suitable for further processing or storage.
Best for
Developers building document pipeline tools or migrating content to Markdown-based systems
Use cases
- Converting legacy Word documents to Markdown for documentation systems
- Batch processing spreadsheets into structured Markdown tables
- Extracting text from PDFs while preserving basic formatting
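The batch use cases above can be sketched roughly as follows. This is a minimal illustration, not the project's own tooling: `convert_tree`, `markitdown_converter`, and the `SUFFIXES` set are hypothetical names chosen for this example. It assumes the `markitdown` package is installed (for example via `pip install 'markitdown[all]'`) and relies only on its `MarkItDown().convert(...)` call and the `text_content` attribute of the result. The converter is passed in as a plain function, so the directory walk can be tried without the package.

```python
from pathlib import Path
from typing import Callable

# Extensions to pick up; an illustrative choice, not an exhaustive list.
SUFFIXES = {".docx", ".xlsx", ".pdf"}


def markitdown_converter() -> Callable[[Path], str]:
    """Return a path -> Markdown function backed by the markitdown package."""
    # Imported lazily so the directory walk below runs without the package.
    from markitdown import MarkItDown

    md = MarkItDown()
    return lambda path: md.convert(str(path)).text_content


def convert_tree(src: Path, dst: Path, convert: Callable[[Path], str]) -> list[Path]:
    """Convert every matching file under src, mirroring the layout under dst."""
    written = []
    for path in sorted(src.rglob("*")):
        # Skip directories and any file type not listed in SUFFIXES.
        if not path.is_file() or path.suffix.lower() not in SUFFIXES:
            continue
        target = dst / path.relative_to(src).with_suffix(".md")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(convert(path), encoding="utf-8")
        written.append(target)
    return written
```

A call such as `convert_tree(Path("legacy_docs"), Path("markdown_out"), markitdown_converter())` would then write one `.md` file per source document; both directory names are placeholders. As the cons below note, conversion quality depends on the source format, so the output is worth spot-checking.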
Notes
138,078 stars on GitHub. Last updated 2026-05-26. Licensed MIT.
Pros
- Supports diverse file formats including Office suite documents
- High community adoption with 138k+ GitHub stars
- Handles images and extracts text from visual content
Cons
- Python-only; requires a Python runtime and environment setup
- Conversion quality varies by source format complexity
- Limited control over output formatting rules
Indexed from awesome-mcp-servers-punkpeye and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
0xMassi/webclaw
Various
Fast, local-first web content extraction for LLMs. Scrape, crawl, extract structured data — all from Rust. CLI, REST API, and MCP server.
agenticdecks/deckrun-mcp
Various
MCP server for Deckrun — generate presentation PDFs, videos, and audio from Markdown
ailenshen/apple-notes-mcp
Various
Read and write Apple Notes, with Apple Notes native formatting support
AIMLPM/markcrawl
Various
Fast Python web crawler for RAG and AI ingestion. Extracts clean Markdown from any site for LLMs and vector stores.
aparajithn/agent-scraper-mcp
Various
Web scraping MCP server for AI agents — screenshots, content extraction, structured scraping
arthurpanhku/DocSentinel
Various
MCP server for an AI cybersecurity agent: automates assessment of documents, questionnaires, and reports. Multi-format parsing, RAG knowledge base, risks, compliance gaps, remediations
AryanBV/pdf-toolkit-mcp
Various
Write-capable PDF toolkit for any MCP client: 22 tools to read, create, render, encrypt, and transform PDFs. Vision rendering for scans, form-preserving merge and split, AES-256, z
bch1212/agentfetch-mcp
Various
MCP server for fetching web URLs with token estimation, caching, and intelligent routing. Built for AI agents.
calclavia/mcp-obsidian
Various
📇 🏠 - This is a connector to allow Claude Desktop (or any MCP client) to read and search any directory containing Markdown notes (such as an Obsidian vault).
caol64/wenyan-mcp
Various
Wenyan MCP Server lets AI automatically format Markdown articles and publish them to WeChat Official Accounts.
danielkennedy1/pdf-tools-mcp
Various
🐍 - PDF download, view & manipulation utilities.
dodopayments/contextmcp
Various
Self-hosted MCP server for your documentation
drolosoft/go-docs-mcp
Various
📄🐹⚡ Go MCP server for multi-format document access — PDF, TXT, MD, DOCX, CSV, images. Install and Go.
epicsagas/alcove
Various
Alcove is an MCP server that gives AI coding agents on-demand access to your private project docs — BM25 + vector hybrid search for precision retrieval, tree-sitter code indexing s
Erodenn/fetch-guard
Various
Fetch URLs and return clean, LLM-ready markdown with metadata and layered prompt injection defense. Configurable timeouts, word limits, JS rendering, and link extraction. All-in-on
exa-labs/exa-mcp-server
Various
Exa MCP for web search and web crawling!
exoticknight/mcp-file-merger
Various
MCP server for merging multiple files into one
FacundoLucci/plsreadme
Various
FacundoLucci/plsreadme — indexed from awesome-mcp-servers-punkpeye
Harry-027/JotDown
Various
An MCP Server in Rust for creating Notion pages & mdBooks with LLMs 🦀
isaacphi/mcp-gdrive
Various
Model Context Protocol (MCP) Server for reading from Google Drive and editing Google Sheets
jinzcdev/markmap-mcp-server
Various
An MCP server for converting Markdown to interactive mind maps with export support (PNG/JPG/SVG).
johannesbrandenburger/typst-mcp
Various
Typst MCP Server is an MCP (Model Context Protocol) implementation that helps AI models interact with Typst, a markup-based typesetting system. The server provides tools for conver
kc23go/anybrowse
Various
Web scraping MCP server for AI agents. Real Chrome, 84% success rate. 10 free calls/day, no signup.
kehvinbehvin/json-mcp-filter
Various
JSON MCP server to filter only relevant data for your LLM
madhan-g-p/DevDocs-MCP
Various
Documentation Authority for AI Agents based upon Devdocs
MarceauSolutions/md-to-pdf-mcp
Various
Convert Markdown to professional PDFs with customizable themes - MCP server for Claude Desktop
mark3labs/mcp-filesystem-server
Various
Go server implementing Model Context Protocol (MCP) for filesystem operations.
MobileReality/mdma
Various
Interactive documents from Markdown. Extends MD with forms, approvals, webhooks, and more — built for next gen apps
pskill9/website-downloader
Various
MCP server to download entire websites
Retio-ai/pagemap
Various
🐍 🏠 - Compresses ~100K-token HTML into 2-5K-token structured maps while preserving every actionable element. AI agents can read and interact with any web page at 97% fewer tokens
SecurityRonin/docx-mcp
Various
MCP server for reading and editing Word (.docx) documents with track changes, comments, footnotes, and structural validation
sifter-ai/sifter
Various
Sifter is an open-source, developer-first document extraction engine that turns unstructured documents — invoices, contracts, receipts, reports — into a structured, queryable datab
UnMarkdown/mcp-server
Various
MCP server for the Unmarkdown API: Convert markdown, manage documents, publish pages
vezlo/src-to-kb
Various
Convert source code to an LLM-ready knowledge base
Zacccck/Claude-MCP-Read-Email-Attachments
Various
Local MCP server for Claude — read and parse Outlook email attachments via Microsoft Graph API
adeu
ai.adeu
docx ↔ LLM translator. Projects .docx to Markdown for editing. Projects edits back to OOXML as tracked changes. Python and Node.js implementations.
just-every/mcp-read-website-fast
Various
Quickly reads webpages and converts to markdown for fast, token efficient web scraping
kimwwk/repocrunch
Various
Analyze GitHub repos into structured JSON. No AI, fully deterministic.
lfnovo/content-core
Various
Extract what matters from any media source
linxule/mineru-mcp
Various
📇 ☁️ - MCP server for MinerU document parsing API. Parse PDFs, images, DOCX, and PPTX with OCR (109 languages), batch processing (200 docs), page ranges, and local file upload. 73
NameetP/pdfmux
Various
PDF extraction that checks its own work. #2 reading order accuracy — zero AI, zero GPU, zero cost.
opendatalab/MinerU-Ecosystem
Various
opendatalab/MinerU-Ecosystem — indexed from awesome-mcp-servers-punkpeye
talonicdev/talonic-mcp
Various
Official Talonic MCP server. Lets AI agents extract structured data from any document via the Model Context Protocol.
zcaceres/markdownify-mcp
Various
A Model Context Protocol server for converting almost anything to Markdown