Deepgram Launches Flux Multilingual

Deepgram today announced the general availability (GA) of Flux Multilingual, expanding its conversational speech recognition model beyond English to support 10 languages, with the ability to automatically detect, understand, and switch languages dynamically within a single conversation in real time. Developers, enterprises, and product teams building voice agents now have access to the first real-time conversational speech recognition model, delivering accurate turn-taking, interruption handling, low latency, and natural human-like conversations at global scale.
Traditional automatic speech recognition (ASR) is designed for transcription. Flux introduced a new approach, conversational speech recognition (CSR), built from the ground up to understand dialogue flow and enable real-time interaction. Flux has rapidly become foundational infrastructure for real-time voice agents, powering production systems that developers trust to deliver fast, natural conversational experiences with best-in-class accuracy in turn detection and speech recognition. Prior to today’s release, extending these experiences across multiple languages required stitching together multilingual transcription models, language detection, and routing logic, introducing latency, complexity, and brittle user experiences. Flux Multilingual replaces that complexity with a single model and API, making it possible to build conversational voice agents across 10 languages without re-architecting systems or sacrificing performance.
With native support for turn-taking, interruptions, and code-switching within a single interaction, voice applications remain fluid, responsive, and natural regardless of language or region. Flux Multilingual delivers monolingual-grade accuracy across languages. Developers can guide the model with language hints or let it auto-detect, adapting in real time even mid-conversation.
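As a rough sketch of how language hints versus auto-detection might be expressed in a streaming request, the snippet below builds a connection URL with optional hint parameters. The endpoint and parameter names (`language`, `language_hints`) are illustrative assumptions, not Deepgram's documented API:

```python
from urllib.parse import urlencode

def build_stream_url(model="flux-general", language=None, language_hints=None):
    """Build a hypothetical streaming URL for a CSR session.

    language=None leaves the model in automatic language detection;
    language_hints biases detection toward the listed languages.
    All parameter names here are illustrative assumptions.
    """
    params = {"model": model, "encoding": "linear16", "sample_rate": 16000}
    if language:
        params["language"] = language  # pin a single language
    if language_hints:
        params["language_hints"] = ",".join(language_hints)  # bias auto-detect
    return "wss://api.example.com/v1/listen?" + urlencode(params)

# Auto-detect across supported languages:
auto_url = build_stream_url()

# Hint that the caller will likely mix Spanish and English:
hinted_url = build_stream_url(language_hints=["es", "en"])
```

Either way, the model adapts at runtime; the hint only biases detection rather than locking the session to one language.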
Flux Multilingual Capabilities
Supported Languages
English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch
Ultra-low latency conversational speech recognition, now global
Flux Multilingual is built for understanding and interaction, not just transcription. It uses model-based turn detection, not simple silence detection, to deliver accurate end-of-turn decisions in under 400 milliseconds, keeping conversations fluid and responsive across languages.
Monolingual-grade accuracy with real-time language control
Flux Multilingual delivers monolingual-grade accuracy across languages, with flexible real-time control through language hints or automatic detection, native code-switching, and dynamic adaptation as conversations evolve.
Build and scale global voice agents with one model
Flux Multilingual supports 10 languages in a single conversational model, enabling teams to build and deploy voice agents globally with one integration. One model, ten languages, one API, with no additional infrastructure or model orchestration required.
Key Features
- Native turn detection and interruption handling for natural dialogue flow
- Low-latency streaming transcription for real-time responsiveness
- Automatic language detection and language hint support for accuracy control
- Mid-session configurability for dynamic language adaptation
- Native code-switching within a single conversation
- Fully compatible with existing Flux API integrations
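The mid-session configurability listed above implies some form of in-band control message that retargets the session without reopening the connection. A hypothetical sketch of such a message; the message type and field names are assumptions for illustration only:

```python
import json

def update_language_config(language_hints):
    """Build a hypothetical in-band control message that retargets
    language detection mid-session, without reopening the socket.
    The message type and field names are illustrative assumptions.
    """
    return json.dumps({
        "type": "UpdateConfig",
        "language_hints": language_hints,
    })

# Re-bias detection toward French and German mid-conversation:
msg = update_language_config(["fr", "de"])
```

In a real integration, a message like this would be sent over the same open streaming connection that carries audio.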
Flux Multilingual is now generally available. As part of the launch, Deepgram is offering a limited-time promotional rate on streaming speech-to-text, including the Flux Multilingual and Nova-3 models.
Flux Multilingual is available via Deepgram’s Cloud API or as a self-hosted deployment, with support for EU endpoints, SDKs, and seamless integration into voice agent architectures. Developers can get started today at deepgram.com or try Flux Multilingual directly in the Deepgram Playground.
This entry was posted on April 29, 2026 at 8:36 am and is filed under Commentary with tags Deepgram.