Whisper
by OpenAI
Robust speech recognition via large-scale weak supervision. [#opensource](https://github.com/openai/whisper)
Added 1 June 2026
Overview
Whisper is an open-source speech recognition system trained on a large dataset of weakly supervised audio-text pairs. It transcribes audio into text across multiple languages and handles various accents, background noise, and technical jargon.
Best for
Developers needing a free, multilingual speech-to-text solution for offline or privacy-sensitive applications.
Use cases
- Transcribing meeting recordings or podcasts into searchable text
- Building voice-controlled applications or virtual assistants
- Generating subtitles or captions for video content
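For the subtitle use case, a minimal sketch of turning timed segments into SRT text might look like the following. It assumes segments shaped like the `segments` list returned by the `openai-whisper` package's `transcribe()` (dicts with `start`, `end`, and `text`). The helper names `srt_timestamp`, `segments_to_srt`, and `transcribe_to_srt` are hypothetical and not part of Whisper itself.

```python
from typing import Iterable, Mapping


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Render segments (each with 'start', 'end', 'text') as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str, model_name: str = "base") -> str:
    """Transcribe a file with Whisper and return SRT text.

    Assumes the openai-whisper package and ffmpeg are installed. The import
    is lazy so the formatting helpers above work without them.
    """
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return segments_to_srt(result["segments"])
```

Keeping the formatting separate from the transcription call means the same helpers can be reused with any backend that produces segments in this shape.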
Notes
Pros
- Supports 99+ languages, with accuracy that varies by language
- Free and open-source with no usage limits
- Works offline after model download
Cons
- Larger models need significant GPU memory (the project README lists roughly 10 GB of VRAM for the large model)
- Slower than some proprietary alternatives on consumer hardware
- Works on 30-second windows, so very long audio may need chunking to transcribe reliably
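The chunking mentioned in the last item can be sketched as a function that splits a recording into overlapping time spans. This is an illustration only: the `openai-whisper` package's own `transcribe()` already slides a 30-second window over long inputs, so manual splitting mainly matters for custom pipelines (for example, to bound memory use or process chunks in parallel). The function name `chunk_spans` and the 5-second default overlap are assumptions, and stitching the overlapping transcripts back together is left out.

```python
from typing import List, Tuple


def chunk_spans(
    duration: float, chunk_len: float = 30.0, overlap: float = 5.0
) -> List[Tuple[float, float]]:
    """Return (start, end) spans in seconds that cover [0, duration].

    Consecutive spans share `overlap` seconds so that words cut at a
    boundary appear whole in at least one chunk.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must be in [0, chunk_len)")

    spans = []
    step = chunk_len - overlap
    start = 0.0
    while start < duration:
        end = min(start + chunk_len, duration)
        spans.append((start, end))
        if end >= duration:
            break  # the final span reached the end of the audio
        start += step
    return spans
```

For a 70-second recording with the defaults, this yields three spans starting at 0, 25, and 50 seconds.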
Indexed from awesome-generative-ai and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
eviscerations/whisper-windows-mcp
Various
Windows-native MCP server for local audio transcription — GPU accelerated via Vulkan, works with Claude Desktop
JuhongPark/mcp-server-pronunciation
Various
Local MCP voice coach with English pronunciation, grammar, and fluency feedback.
samson-art/transcriptor-mcp
Various
An MCP server (stdio + HTTP/SSE) that fetches video transcripts/subtitles via yt-dlp, with pagination for large responses. Supports YouTube, Twitter/X, Instagram, TikTok, Twitch, V
transcribe-app/mcp-transcribe
Various
Add transcription tools to your AI-powered assistants.
AudioGPT
Community
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Off Grid
Community
The Swiss Army Knife of Offline AI. Chat, Speak, and Generate Images - Privacy First, Zero Internet. Download an LLM and use it on your mobile device. No data ever leaves your phon
OpenDAN
Community
OpenDAN is an open-source Personal AI OS, which consolidates various AI modules in one place for your personal use.
Pipecat
Community
Open-source framework for voice and multimodal conversational AI
whisper-ctranslate2
Community
Whisper command-line client based on CTranslate2, compatible with the original OpenAI client.
Fireflies
Fireflies.ai
AI meeting assistant. Records, transcribes, summarises, and pipes the output to your stack.
Granola
Granola
AI notepad for meetings. Take your own notes, Granola enhances them after the call with the audio context.
Loopin AI
Various
loopinhq.com
Otter.ai
Various
Otter AI Meeting Agent supports real-time transcription, live chat, automated summaries, insights, and action items.
PyGPT
Various
PyGPT is an open‑source desktop AI assistant for Windows, macOS and Linux. Chat, agents, web search, run Python, TTS/STT, plugins, long‑term memory.
Read AI
Various
Read AI, the fastest growing AI meeting assistant, ever, delivers real-time transcription, smart summaries, and enables AI search and discovery across all your content including
Screenpipe
Various
YC (S26) | AI that knows what you've seen, said, or heard. Records everything you do, say, hear 24/7, local, private, secure
Teleprompter
Various
An on-device AI for your meetings that listens to you and makes charismatic quote suggestions.
Vibe Transcribe
Various
Local-first transcription for audio and video with AI summaries, multilingual support, and privacy-focused processing.
Wispr Flow
Various
Flow makes writing quick and clear with seamless voice dictation. It is the fastest, smartest way to type with your voice.