Deepgram today announced the general availability (GA) of Flux Multilingual, expanding its conversational speech recognition model beyond English to support 10 languages, with the ability to automatically detect, understand, and switch languages dynamically within a single conversation in real time. Developers, enterprises, and product teams building voice agents now have access to the first real-time conversational speech recognition model, delivering accurate turn-taking, interruption handling, low latency, and natural human-like conversations at global scale.
Traditional automatic speech recognition (ASR) is designed for transcription. Flux introduced a new approach, conversational speech recognition (CSR), built from the ground up to understand dialogue flow and enable real-time interaction. Flux has rapidly become foundational infrastructure for real-time voice agents, powering production systems that developers trust to deliver fast, natural conversational experiences with best-in-class accuracy in turn detection and speech recognition. Prior to today’s release, extending these experiences across multiple languages required stitching together multilingual transcription models, language detection, and routing logic, introducing latency, complexity, and brittle user experiences. Flux Multilingual replaces that complexity with a single model and API, making it possible to build conversational voice agents across 10 languages without re-architecting systems or sacrificing performance.
With native support for turn-taking, interruptions, and code-switching within a single interaction, voice applications remain fluid, responsive, and natural regardless of language or region. Flux Multilingual delivers monolingual-grade accuracy across languages. Developers can guide the model with language hints or let it auto-detect, adapting in real time even mid-conversation.
Flux Multilingual Capabilities
Supported Languages
English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch
Ultra-low latency conversational speech recognition, now global
Flux Multilingual is built for understanding and interaction, not just transcription. It uses model-based turn detection, not simple silence detection, to deliver accurate end-of-turn decisions in under 400 milliseconds, keeping conversations fluid and responsive across languages.
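For contrast, here is a minimal sketch of the "simple silence detection" approach mentioned above. It is not Deepgram's implementation and does not call any Deepgram API. The class name, frame size, energy threshold, and timeout are all assumptions chosen for illustration. It shows why a fixed silence timeout is a blunt tool: the turn cannot end sooner than the timeout, and a long mid-sentence pause is indistinguishable from a finished turn.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SilenceEndpointer:
    """Naive end-of-turn detector (hypothetical): fires after a fixed run of quiet frames."""

    frame_ms: int = 20               # duration of one audio frame (assumed)
    energy_threshold: float = 0.01   # frames below this energy count as silence (assumed)
    silence_timeout_ms: int = 700    # quiet time required before ending the turn (assumed)

    def end_of_turn_times(self, energies: Iterable[float]) -> List[int]:
        """Return the timestamps (in ms) at which an end of turn would be declared."""
        events: List[int] = []
        silent_ms = 0
        speaking = False
        for i, energy in enumerate(energies):
            if energy >= self.energy_threshold:
                # Any loud frame restarts the silence timer.
                speaking = True
                silent_ms = 0
            elif speaking:
                silent_ms += self.frame_ms
                if silent_ms >= self.silence_timeout_ms:
                    # Timestamp is the end of the current frame.
                    events.append((i + 1) * self.frame_ms)
                    speaking = False
                    silent_ms = 0
        return events


if __name__ == "__main__":
    endpointer = SilenceEndpointer()
    # 200 ms of speech followed by 1 s of silence.
    print(endpointer.end_of_turn_times([0.5] * 10 + [0.0] * 50))
    # Speech, an 800 ms thinking pause, then more speech: the pause is
    # treated as an end of turn even though the speaker was not finished.
    print(endpointer.end_of_turn_times([0.5] * 5 + [0.0] * 40 + [0.5] * 5 + [0.0] * 40))
```

In the first case the end of turn is declared a full 700 ms after speech stops. Shortening the timeout reduces that delay but makes the false cut-off in the second case more likely, which is the trade-off that model-based turn detection is meant to avoid.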
Monolingual-grade accuracy with real-time language control
Flux Multilingual delivers monolingual-grade accuracy across languages, with flexible real-time control through language hints or automatic detection, native code-switching, and dynamic adaptation as conversations evolve.
Build and scale global voice agents with one model
Flux Multilingual supports 10 languages in a single conversational model, enabling teams to build and deploy voice agents globally with one integration. One model, ten languages, one API, with no additional infrastructure or model orchestration required.
Key Features
- Native turn detection and interruption handling for natural dialogue flow
- Low-latency streaming transcription for real-time responsiveness
- Automatic language detection and language hint support for accuracy control
- Mid-session configurability for dynamic language adaptation
- Native code-switching within a single conversation
- Fully compatible with existing Flux API integrations
Flux Multilingual is now generally available (GA). As part of the launch, Deepgram is offering a limited-time promotional rate on streaming speech-to-text, including Flux Multilingual and Nova-3 models.
Flux Multilingual is available via Deepgram’s Cloud API or as a self-hosted deployment, with support for EU endpoints, SDKs, and seamless integration into voice agent architectures. Developers can get started today at deepgram.com or try Flux Multilingual directly in the Deepgram Playground.
Deepgram Delivers Private Voice AI to Regulated Industries with On-Premises Deployments Powered by Fortanix Confidential AI and NVIDIA Confidential Computing
Posted in Commentary with tags Deepgram on June 1, 2026 by itnerd
Deepgram and Fortanix today announced a partnership that will enable enterprises to run voice AI in their own environment on their own terms while ensuring their most sensitive data is securely protected. Under the terms of the agreement, Deepgram can use Fortanix Confidential AI and NVIDIA Confidential Computing to add a layer of advanced security to self-hosted environments, so that its proprietary model weights, built on business-critical intellectual property, can be deployed while remaining protected against model theft or inappropriate use. With this announcement, Deepgram and Fortanix continue to raise the bar for model-in-use protection in the most security-sensitive on-prem environments, enabling increased voice AI adoption in highly regulated industries.
For enterprises, especially those in highly regulated industries, security requirements continue to tighten. Organizations handling patient conversations, financial transactions, or classified information increasingly require that sensitive audio and AI model weights remain protected not only at rest and in transit, but also during active processing in their own environments. This level of protection enables organizations to build highly secure real-time voice applications without sacrificing performance.
The on-premises solution runs Deepgram’s voice AI models with Fortanix Confidential AI on NVIDIA Confidential Computing-enabled GPUs, creating a hardware-isolated environment where both audio data and model weights remain encrypted and protected throughout active use. NVIDIA GPUs with Confidential Computing enable AI workloads to process sensitive data inside a trusted execution environment — a capability traditional infrastructure cannot provide. By bringing together best-in-class voice AI models, hardware-rooted isolation, and a jointly engineered, pre-integrated stack, the partnership delivers a level of in-use data protection that, until now, has not been practical to deploy at enterprise scale.
The Deepgram, Fortanix, and NVIDIA solution opens the door to a variety of security-sensitive on-premises voice AI applications: private, on-prem voice agents handling sensitive customer and patient interactions; enterprise-wide transcription layers that capture every call, meeting, and internal conversation for analytics, compliance, and search; and voice-enabled IT, operations, and service desk applications running entirely inside an organization’s secure perimeter. For regulated enterprises, this turns voice into a production-ready interface without sacrificing the real-time performance the experience demands.
Deepgram’s voice AI models deliver real-time voice understanding and generation with the accuracy, consistency, and low latency that enterprise use demands. Designed for any environment, including those with the highest confidentiality and regulatory needs, Deepgram’s models bring voice AI to enterprise organizations across virtually every industry vertical, including those with sensitive, regulated use cases that have historically been out of reach.
Fortanix Confidential AI protects data and AI model weights while they’re actively running. It builds on NVIDIA GPUs with Confidential Computing to create Trusted Execution Environments (TEEs) that isolate the AI workload from the underlying infrastructure and OS. Data and AI models run inside the TEE, encrypted in memory and inaccessible to the host operating system or even privileged administrators. As a result, regulated organizations can unlock AI innovation with trust, security, and sovereignty at the core, while meeting HIPAA, GDPR, and national data residency requirements.
To learn more, contact Deepgram at partners@deepgram.com.