Krisp VIVA 2.0 Fixes Voice AI's Production Problem

Krisp launched VIVA 2.0 on May 6: voice infrastructure for AI agents that fixes the noise, turn-taking, and interruption problems that kill voice AI demos.

Enterprise DNA | via BusinessWire
Enterprise DNA News

Voice AI usage grew 9x in 2025. And yet most voice agents still fail in the same predictable ways the moment they leave a controlled demo environment.

Krisp launched VIVA 2.0 on May 6, 2026, at Twilio Signal in San Francisco, directly targeting that gap. VIVA (Voice Infrastructure for Voice Agents) is the layer that sits underneath voice agents, IVRs, and conversational AI systems and handles the messy real-world audio problems that demo-room conditions never expose.

The Problem VIVA Solves

Background noise pushes speech-to-text word error rates from around 5% in clean conditions to over 30% in the real environments where voice agents operate. Offices, retail floors, homes, and warehouses are loud in ways that break the assumptions built into most voice AI pipelines.
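To make those percentages concrete: word error rate (WER) is conventionally the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The Python sketch below is a minimal illustration of that calculation. The function name and the sample sentences are invented for this example and are not part of any Krisp or speech-to-text vendor API.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts of the same ten-word request.
reference = "cancel my order and refund the payment to my card"
clean_transcript = "cancel my order and refund the payment to my card"
noisy_transcript = "cancel my daughter and refund payment to my car"

clean_wer = word_error_rate(reference, clean_transcript)
noisy_wer = word_error_rate(reference, noisy_transcript)
print(f"clean: {clean_wer:.0%}, noisy: {noisy_wer:.0%}")  # prints clean: 0%, noisy: 30%
```

In the noisy transcript, two substituted words and one dropped word out of ten are enough to reach 30%, and each of those errors changes what the caller appears to be asking for.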

Beyond noise, voice agents have consistently struggled with two behavioral problems: knowing when a user has actually finished speaking (turn-taking), and handling interruptions gracefully rather than talking over the user or freezing mid-sentence.

VIVA 2.0 introduces a new generation of small, real-time models built specifically for these problems. They predict when users finish speaking, classify interruptions, and read perceptual signals including whether audio is synthetic, the speaker’s gender, and accent characteristics. The result is a voice agent that responds the way people do in a real conversation, rather than one tuned for clean audio files.

Why the Scale Here Matters

Krisp is not a startup working with experimental traffic. The company’s platform is deployed on over 200 million devices and processes more than 80 billion minutes of voice conversations per month. The VIVA SDK specifically handles more than 12 billion minutes of voice AI agent traffic annually and is embedded in over 130 voice AI products.

When Krisp says VIVA 2.0 sets a new benchmark for how voice agents handle audio in production, it is drawing on production data from one of the largest voice processing operations in the world. That is not a claim most voice AI vendors can make.

What This Signals for Enterprise Voice AI

A few things worth noting for businesses evaluating voice AI:

The 9x growth in voice agent usage is real, and it is creating infrastructure demand. Krisp’s launch validates that the voice AI market has moved past early adopters into genuine enterprise deployment. The problems VIVA solves were not worth solving until usage reached the scale where they became blockers.

Infrastructure quality now separates good voice AI from bad. The models powering voice agents (the LLMs, the TTS engines, the STT systems) have gotten good enough that they are no longer the limiting factor. The bottleneck is the infrastructure around them — noise handling, turn-taking, latency. VIVA 2.0 is a direct response to that shift.

Demos are not deployments. Every business evaluating voice AI should test it in the actual environment where it will run, with the actual background noise, interruption patterns, and audio conditions that environment produces. A voice agent that passes a conference room demo can still fail badly on a busy phone line or a retail floor.
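One way to approximate that kind of test before a pilot is to mix recorded background noise into clean test utterances at a controlled signal-to-noise ratio (SNR) and re-run the agent on the result. The Python sketch below shows only the mixing step, under stated assumptions: audio is represented as plain lists of float samples, the sine tone and random noise are stand-ins for real recordings, and the function names are hypothetical. It is not how Krisp or any particular vendor evaluates its models.

```python
import math
import random


def mean_power(samples):
    """Average squared amplitude of a list of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mix has the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    noise_power = mean_power(noise)
    if noise_power == 0:
        raise ValueError("noise is silent; cannot scale it to a target SNR")

    # SNR(dB) = 10 * log10(speech_power / noise_power), solved for the
    # noise power that yields the requested ratio.
    target_noise_power = mean_power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


# Stand-in signals: a 220 Hz tone for speech, Gaussian noise for the room.
sample_rate = 16000
speech = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

mixed = mix_at_snr(speech, noise, snr_db=10)

# Check the result: recover the scaled noise and measure the achieved SNR.
residual = [m - s for m, s in zip(mixed, speech)]
measured_snr_db = 10 * math.log10(mean_power(speech) / mean_power(residual))
print(f"measured SNR: {measured_snr_db:.2f} dB")
```

Running the same utterances at several SNR levels, using noise recorded in the actual deployment environment, gives a rough picture of how quickly transcription and turn-taking degrade as conditions get louder.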

What This Means for Business

The voice AI market is maturing in real time. A year ago, building a useful voice agent meant accepting significant limitations around audio quality and conversational flow. The infrastructure stack is catching up to the ambition.

For businesses that have been watching voice AI with interest but skepticism, the production-readiness story is getting meaningfully better. VIVA 2.0 is one part of that — not the whole answer, but an important signal that the industry is taking real-world deployment seriously rather than optimizing for demos.

The companies that move now on voice AI, with the right infrastructure and the right partner, will be significantly ahead of those that wait for “perfect.” The infrastructure is good enough to deliver real results today, and the gap between now and “perfect” will keep closing.

For enterprises exploring voice AI as part of their operations — customer intake, internal reporting, admin automation — the deployment bar has meaningfully dropped in 2026. The tools now exist to make it work.