Enterprise DNA
P Apps and SaaS Productivity low

Whisper

by OpenAI

Robust speech recognition via large-scale weak supervision. [#opensource](https://github.com/openai/whisper)

Added 1 June 2026

Overview

Whisper is an open-source speech recognition system trained on a large dataset of weakly supervised audio-text pairs. It transcribes audio into text across multiple languages and handles various accents, background noise, and technical jargon.
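As a rough illustration of the transcription step described above, the sketch below uses the Python package from the linked repository (`whisper.load_model` and `model.transcribe`). It assumes the `openai-whisper` package and `ffmpeg` are installed; the helper name `transcribe_file` and the default `"base"` model are illustrative choices, not something this entry prescribes.

```python
from pathlib import Path


def transcribe_file(audio_path: str, model_name: str = "base") -> str:
    """Return the transcript of one audio file as plain text.

    `transcribe_file` is a hypothetical helper name used for this sketch.
    """
    path = Path(audio_path)
    if not path.is_file():
        raise FileNotFoundError(f"No audio file at {path}")

    # Imported lazily so this module still loads where Whisper is not installed.
    import whisper  # provided by the `openai-whisper` package

    # Weights are downloaded on first use, then read from the local cache.
    model = whisper.load_model(model_name)
    result = model.transcribe(str(path))
    return result["text"].strip()
```

Calling `transcribe_file("meeting.mp3")` would return the transcript as a single string; the language is detected automatically unless one is passed to `transcribe`.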

Best for
Developers needing a free, multilingual speech-to-text solution for offline or privacy-sensitive applications.

Use cases

  • Transcribing meeting recordings or podcasts into searchable text
  • Building voice-controlled applications or virtual assistants
  • Generating subtitles or captions for video content
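For the subtitle use case, one possible approach is to convert timed segments into the SubRip (.srt) format. The sketch below assumes each segment is a dict with `start`, `end` (seconds) and `text` keys, which is the shape of the entries in the `segments` list returned by Whisper's Python `transcribe` call; the function names are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render timed segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)


# Example with hand-written segments standing in for real model output.
demo = [
    {"start": 0.0, "end": 2.0, "text": " Hello."},
    {"start": 2.0, "end": 3.5, "text": " World."},
]
print(segments_to_srt(demo))
```

The same function can be fed `result["segments"]` from a real transcription and the output written to a `.srt` file next to the video.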

Notes

Pros

  • Supports 99+ languages, with accuracy that varies by language
  • Free and open-source (MIT license), with no usage limits when run locally
  • Works offline after model download

Cons

  • Larger models need substantial GPU memory (roughly 10 GB of VRAM for the largest, per the project README)
  • Slower than some proprietary alternatives on consumer hardware
  • Processes audio in 30-second windows, so very long recordings may need chunking to stay accurate
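The chunking mentioned in the last point can be sketched as a small planning step: compute overlapping time spans, then transcribe each span separately and stitch the text together. The 30-second chunk length and 5-second overlap below are assumptions chosen for illustration, and `chunk_spans` is a hypothetical helper; Whisper's reference `transcribe` function already slides its own window over long audio, so manual splitting is mainly useful for bounding memory or running chunks in parallel.

```python
def chunk_spans(total_s: float, chunk_s: float = 30.0, overlap_s: float = 5.0):
    """Return (start, end) spans in seconds covering [0, total_s].

    Consecutive spans overlap by `overlap_s` so that words cut at a
    boundary appear whole in at least one chunk.
    """
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")

    spans = []
    start = 0.0
    step = chunk_s - overlap_s
    while start < total_s:
        end = min(start + chunk_s, total_s)
        spans.append((start, end))
        if end >= total_s:
            break  # the final span reached the end of the audio
        start += step
    return spans


# A 70-second recording split into 30-second chunks with 5 seconds of overlap.
print(chunk_spans(70))  # [(0.0, 30.0), (25.0, 55.0), (50.0, 70)]
```

Each span can then be cut from the source audio (for example with `ffmpeg`) and passed to the model; duplicated words in the overlap regions need to be merged afterwards, which this sketch does not attempt.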

Indexed from awesome-generative-ai and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Used by 19 entries
M MCP Dev low

eviscerations/whisper-windows-mcp

Various

Windows-native MCP server for local audio transcription — GPU accelerated via Vulkan, works with Claude Desktop

★ 0 updated 11d ago
M MCP Dev low

JuhongPark/mcp-server-pronunciation

Various

Local MCP voice coach with English pronunciation, grammar, and fluency feedback.

★ 0 updated 10d ago
M MCP Dev low

samson-art/transcriptor-mcp

Various

An MCP server (stdio + HTTP/SSE) that fetches video transcripts/subtitles via yt-dlp, with pagination for large responses. Supports YouTube, Twitter/X, Instagram, TikTok, Twitch, V

★ 10 updated 2d ago
M MCP Dev low

transcribe-app/mcp-transcribe

Various

Add transcription tools to your AI-powered assistants.

★ 6 updated 2mo ago
O OSS Orchestration medium

AudioGPT

Community

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

★ 10,179 updated 1y ago
O OSS Obs medium

Off Grid

Community

The Swiss Army Knife of Offline AI. Chat, Speak, and Generate Images - Privacy First, Zero Internet. Download an LLM and use it on your mobile device. No data ever leaves your phon

★ 2,335 updated 5d ago
O OSS Orchestration medium

OpenDAN

Community

OpenDAN is an open-source Personal AI OS, which consolidates various AI modules in one place for your personal use.

★ 2,032 updated 2mo ago
O OSS Orchestration medium

Pipecat

Community

Open Source framework for voice and multimodal conversational AI

★ 12,588 updated 2d ago
O OSS Obs medium

whisper-ctranslate2

Community

Whisper command line client compatible with original OpenAI client based on CTranslate2.

★ 1,309 updated 3mo ago
P Apps Productivity one click

Fireflies

Fireflies.ai

AI meeting assistant. Records, transcribes, summarises, and pipes the output to your stack.

P Apps Productivity one click

Granola

Granola

AI notepad for meetings. Take your own notes, Granola enhances them after the call with the audio context.

P Apps Productivity low

Loopin AI

Various

loopinhq.com

P Apps Productivity low

Otter.ai

Various

Otter AI Meeting Agent supports real-time transcription, live chat, automated summaries, insights, and action items.

P Apps Productivity low

PyGPT

Various

PyGPT is an open‑source desktop AI assistant for Windows, macOS and Linux. Chat, agents, web search, run Python, TTS/STT, plugins, long‑term memory.

P Apps Productivity low

Read AI

Various

Read AI, the fastest growing AI meeting assistant, ever, delivers real-time transcription, smart summaries, and enables AI search and discovery across all your content including

P Apps Productivity low

Screenpipe

Various

YC (S26) | AI that knows what you've seen, said, or heard. Records everything you do, say, hear 24/7, local, private, secure

★ 19,049 updated 2d ago
P Apps Productivity low

Teleprompter

Various

An on-device AI for your meetings that listens to you and makes charismatic quote suggestions.

★ 335 updated 3y ago
P Apps Productivity low

Vibe Transcribe

Various

Local-first transcription for audio and video with AI summaries, multilingual support, and privacy-focused processing.

P Apps Productivity low

Wispr Flow

Various

Flow makes writing quick and clear with seamless voice dictation. It is the fastest, smartest way to type with your voice.