Deepgram Launches Streaming Speech, Text, and Voice Agents on Amazon SageMaker AI

Deepgram today announced native integration with Amazon SageMaker AI, delivering streaming, real-time speech-to-text (STT), text-to-speech (TTS), and the Voice Agent API as Amazon SageMaker AI real-time endpoints, with no custom pipelines or orchestration required. Teams can now build, deploy, and scale voice-powered applications inside their existing AWS workflows while maintaining the security and compliance benefits of their AWS environment.

Native streaming via Amazon SageMaker endpoints means no workarounds or hoops to jump through: just clean, real-time inference through the SageMaker API. The integration enables sub-second latency and enterprise-grade reliability for high-scale use cases such as contact centers, trading floors, and live analytics.

Built to run on AWS, the solution supports streaming responses via InvokeEndpointWithResponseStream and keeps data within AWS. Customers can deploy Deepgram in their Amazon Virtual Private Cloud (Amazon VPC) or as a managed service, aligning with stringent data residency and compliance requirements.
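For developers, the streaming path described above maps onto the SageMaker Runtime API's `InvokeEndpointWithResponseStream` operation. The sketch below shows roughly what consuming such a stream looks like with boto3; the endpoint name and request payload are illustrative assumptions, not Deepgram's documented request format.

```python
import json


def stream_transcript(events):
    """Collect text chunks from a SageMaker response event stream.

    `events` is an iterable of event dicts as yielded by the response
    Body of invoke_endpoint_with_response_stream; streamed output
    arrives as PayloadPart events whose Bytes field holds a fragment.
    """
    parts = []
    for event in events:
        chunk = event.get("PayloadPart", {}).get("Bytes", b"")
        if chunk:
            parts.append(chunk.decode("utf-8"))
    return "".join(parts)


def transcribe(endpoint_name, payload):
    """Invoke a streaming SageMaker endpoint (hypothetical Deepgram STT).

    Requires AWS credentials and a deployed endpoint. The endpoint name
    and payload shape are placeholders for illustration only.
    """
    import boto3  # AWS SDK for Python; assumed installed and configured

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint_with_response_stream(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    # The returned Body is an event stream that yields chunks as the
    # model produces them, rather than one response at the end.
    return stream_transcript(response["Body"])
```

Because the response is consumed incrementally, an application can surface partial transcripts (or begin TTS playback) while the endpoint is still generating, which is what makes the sub-second-latency use cases above practical.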

The integration is also backed by a strong relationship with AWS. Deepgram is an AWS Generative AI Competency Partner and has signed a multi-year Strategic Collaboration Agreement (SCA) with AWS to accelerate enterprise adoption.

The integration is available to customers building on AWS, with live demonstrations planned at AWS re:Invent in Las Vegas, December 1–5, 2025, at Deepgram booth #690. Learn more about the AWS partnership and technical implementation on Deepgram’s AWS partner page, and read the AWS blog: “Introducing bidirectional streaming for real-time inference on Amazon SageMaker AI.”
