Deepgram Launches Flux Multilingual

Deepgram today announced the general availability (GA) of Flux Multilingual, expanding its conversational speech recognition model beyond English to support 10 languages, with the ability to automatically detect, understand, and switch languages dynamically within a single conversation in real time. Developers, enterprises, and product teams building voice agents now have access to the first multilingual real-time conversational speech recognition model, delivering accurate turn-taking, interruption handling, low latency, and natural, human-like conversations at global scale.

Traditional automatic speech recognition (ASR) is designed for transcription. Flux introduced a new approach, conversational speech recognition (CSR), built from the ground up to understand dialogue flow and enable real-time interaction. Flux has rapidly become foundational infrastructure for real-time voice agents, powering production systems that developers trust to deliver fast, natural conversational experiences with best-in-class accuracy in turn detection and speech recognition. Prior to today’s release, extending these experiences across multiple languages required stitching together multilingual transcription models, language detection, and routing logic, introducing latency, complexity, and brittle user experiences. Flux Multilingual replaces that complexity with a single model and API, making it possible to build conversational voice agents across 10 languages without re-architecting systems or sacrificing performance.

With native support for turn-taking, interruptions, and code-switching within a single interaction, voice applications remain fluid, responsive, and natural regardless of language or region. Flux Multilingual delivers monolingual-grade accuracy across languages. Developers can guide the model with language hints or let it auto-detect, adapting in real time even mid-conversation.

Flux Multilingual Capabilities

Supported Languages

English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch

Ultra-low latency conversational speech recognition, now global

Flux Multilingual is built for understanding and interaction, not just transcription. It uses model-based turn detection, not simple silence detection, to deliver accurate end-of-turn decisions in under 400 milliseconds, keeping conversations fluid and responsive across languages.

Monolingual-grade accuracy with real-time language control

Flux Multilingual delivers monolingual-grade accuracy across languages, with flexible real-time control through language hints or automatic detection, native code-switching, and dynamic adaptation as conversations evolve.

Build and scale global voice agents with one model

Flux Multilingual supports 10 languages in a single conversational model, enabling teams to build and deploy voice agents globally with one integration. One model, ten languages, one API, with no additional infrastructure or model orchestration required.

Key Features

  • Native turn detection and interruption handling for natural dialogue flow
  • Low-latency streaming transcription for real-time responsiveness
  • Automatic language detection and language hint support for accuracy control 
  • Mid-session configurability for dynamic language adaptation
  • Native code-switching within a single conversation
  • Fully compatible with existing Flux API integrations 
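The turn-detection and streaming features above imply a client loop that distinguishes partial updates from end-of-turn events. The sketch below shows one way to structure that dispatch; the JSON shape and event names (`"Update"`, `"EndOfTurn"`) are illustrative placeholders, not Deepgram's documented schema.

```python
import json

def handle_message(raw: str, on_partial, on_turn_end) -> None:
    """Route a raw JSON message to a partial-result or end-of-turn callback."""
    msg = json.loads(raw)
    if msg.get("type") == "EndOfTurn":
        on_turn_end(msg.get("transcript", ""))
    else:
        on_partial(msg.get("transcript", ""))

# Usage: collect finalized turns from a fake stream of messages.
turns = []
for raw in (
    '{"type": "Update", "transcript": "hola"}',
    '{"type": "EndOfTurn", "transcript": "hola, \\u00bfc\\u00f3mo est\\u00e1s?"}',
):
    handle_message(raw, on_partial=lambda t: None, on_turn_end=turns.append)

print(turns)  # ['hola, ¿cómo estás?']
```

In a real voice agent the end-of-turn callback is where the agent would begin formulating its reply, which is why sub-400 ms end-of-turn decisions matter.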

Flux Multilingual is now generally available. As part of the launch, Deepgram is offering a limited-time promotional rate on streaming speech-to-text, including the Flux Multilingual and Nova-3 models.

Flux Multilingual is available via Deepgram’s Cloud API or as a self-hosted deployment, with support for EU endpoints, SDKs, and seamless integration into voice agent architectures. Developers can get started today at deepgram.com or try Flux Multilingual directly in the Deepgram Playground.

Discover more from The IT Nerd