
⚡ Deepgram Aura: Ultra-Fast TTS

Industry-leading speed with ~75ms latency. Purpose-built for real-time phone calls, live chat, and interactive applications where every millisecond matters.

Quick Setup

Step 1: Get API Key

  1. Visit Deepgram Console and create an account
  2. Navigate to API Keys in the dashboard
  3. Create a new API key with TTS permissions
  4. Copy your API key

Step 2: Configure in Burki

  1. Go to the AI Configuration → TTS tab
  2. Select Deepgram as provider
  3. Paste your API key in the TTS API Key field

Step 3: Select Aura Model

Choose Aura-2 for best quality or Aura for proven stability

Available Models

🚀 Aura-2

  • ~75ms latency
  • Next-generation model with improved quality
  • Best for: Production phone calls, live applications
  • Status: Latest and recommended

⚖️ Aura

  • ~85ms latency
  • Proven stable model with consistent quality
  • Best for: Stable production environments
  • Status: Battle-tested, reliable

Available Voices

Deepgram Aura Voices

Asteria

  • Warm and expressive
  • Perfect for customer service and support
  • Voice ID: aura-asteria-en

Luna

  • Soft and melodic
  • Great for gentle, calming interactions
  • Voice ID: aura-luna-en

Stella

  • Bright and clear
  • Excellent for announcements and alerts
  • Voice ID: aura-stella-en

Athena

  • Intelligent and clear
  • Professional and articulate
  • Voice ID: aura-athena-en

Orion

  • Deep and authoritative
  • Perfect for business and professional use
  • Voice ID: aura-orion-en

Helios

  • Bright and energetic
  • Great for upbeat, engaging content
  • Voice ID: aura-helios-en

Perseus

  • Strong and heroic (Aura-2 only)
  • Commanding presence for leadership content
  • Voice ID: aura-2-perseus-en

Apollo

  • Musical and expressive (Aura-2 only)
  • Rich, versatile voice for varied content
  • Voice ID: aura-2-apollo-en

| Voice | Gender | Model | Voice ID | Best For |
| --- | --- | --- | --- | --- |
| Asteria | Female | Aura/Aura-2 | aura-asteria-en | Customer service |
| Thalia | Female | Aura-2 | aura-2-thalia-en | Professional calls |
| Luna | Female | Aura/Aura-2 | aura-luna-en | Gentle interactions |
| Stella | Female | Aura/Aura-2 | aura-stella-en | Clear announcements |
| Athena | Female | Aura/Aura-2 | aura-athena-en | Business calls |
| Hera | Female | Aura/Aura-2 | aura-hera-en | Authoritative voice |
| Orion | Male | Aura/Aura-2 | aura-orion-en | Professional use |
| Helios | Male | Aura/Aura-2 | aura-helios-en | Energetic content |
| Perseus | Male | Aura-2 | aura-2-perseus-en | Leadership content |
| Apollo | Male | Aura-2 | aura-2-apollo-en | Versatile applications |

Phone Call Optimization

📞 Twilio Integration

Deepgram Aura is specifically optimized for phone systems with built-in Twilio compatibility.

Audio Format Settings

Two encodings are available: µ-law and Linear PCM. The µ-law settings below are recommended for Twilio:
{
  "encoding": "mulaw",
  "sample_rate": 8000
}
  • Format: G.711 µ-law
  • Sample Rate: 8kHz
  • Best for: Phone calls, VoIP systems
  • Quality: Optimized for voice clarity over networks
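
The settings above imply a simple byte budget: at 8 kHz with one byte per sample, 20 ms of audio is 160 bytes. The following is a minimal sketch, assuming you forward the audio yourself to a Twilio Media Streams WebSocket, which accepts base64-encoded µ-law payloads in `media` events. The function names and the 20 ms frame size are illustrative choices, not something Burki or Deepgram prescribes.

```python
import base64
import json

SAMPLE_RATE = 8000      # Hz, matches "sample_rate": 8000
BYTES_PER_SAMPLE = 1    # G.711 mu-law stores each sample in 8 bits
FRAME_MS = 20           # illustrative frame duration (assumption)
FRAME_BYTES = SAMPLE_RATE * BYTES_PER_SAMPLE * FRAME_MS // 1000  # 160 bytes


def split_frames(audio: bytes, frame_bytes: int = FRAME_BYTES) -> list[bytes]:
    """Split raw mu-law audio into fixed-size frames; the last may be shorter."""
    return [audio[i:i + frame_bytes] for i in range(0, len(audio), frame_bytes)]


def twilio_media_message(stream_sid: str, frame: bytes) -> str:
    """Wrap one audio frame in a Twilio Media Streams 'media' event."""
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": base64.b64encode(frame).decode("ascii")},
    })
```

Each string returned by `twilio_media_message` would be sent as a text frame on the Twilio stream; `stream_sid` comes from the `start` event Twilio sends when the stream opens.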

Performance Metrics

⚡ Latency

  • ~75ms end-to-end
  • From text input to first audio chunk
  • 3x faster than most competitors
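
To check the latency figure in your own environment, you can time how long the first audio chunk takes to arrive. This is a rough sketch: `time_to_first_chunk` and `fake_stream` are hypothetical helpers, and the timer starts when you begin consuming the stream, so call it immediately after sending your text. Network distance to the API will add to whatever number you see.

```python
import asyncio
import time
from typing import AsyncIterator, Tuple


async def time_to_first_chunk(chunks: AsyncIterator[bytes]) -> Tuple[float, bytes]:
    """Return (seconds elapsed, first chunk) for an async audio stream."""
    start = time.perf_counter()
    async for chunk in chunks:
        return time.perf_counter() - start, chunk
    raise RuntimeError("stream ended before any audio arrived")


async def fake_stream(delay: float = 0.05) -> AsyncIterator[bytes]:
    """Stand-in for a real TTS stream (hypothetical, for local testing)."""
    await asyncio.sleep(delay)
    yield b"\x00" * 160


if __name__ == "__main__":
    elapsed, first = asyncio.run(time_to_first_chunk(fake_stream()))
    print(f"first chunk of {len(first)} bytes after {elapsed * 1000:.1f} ms")
```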

🎯 Reliability

  • 99.9% uptime
  • Enterprise-grade reliability
  • Production-ready stability

📊 Throughput

  • High concurrency
  • Scales automatically with demand
  • No rate limit bottlenecks

API Integration

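import asyncio
import json

import websockets


async def stream_tts(text="Hello from Deepgram Aura!"):
    """Yield raw audio chunks for `text` as they arrive."""
    uri = "wss://api.deepgram.com/v1/speak?model=aura-asteria-en&encoding=mulaw&sample_rate=8000"

    headers = {
        "Authorization": "Token YOUR_DEEPGRAM_API_KEY"
    }

    # websockets >= 14 takes `additional_headers`; older releases call it `extra_headers`
    async with websockets.connect(uri, additional_headers=headers) as websocket:
        # Send text, then flush so the server synthesizes what it has buffered
        await websocket.send(json.dumps({"type": "Speak", "text": text}))
        await websocket.send(json.dumps({"type": "Flush"}))

        # Audio arrives as binary frames; control messages arrive as JSON text
        async for message in websocket:
            if isinstance(message, bytes):
                yield message
            elif json.loads(message).get("type") == "Flushed":
                break  # all audio for this text has been delivered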

Configuration Examples

Optimal Settings for Phone Systems
{
  "model": "aura-2-asteria-en",
  "encoding": "mulaw",
  "sample_rate": 8000,
  "text": "Thank you for calling. How can I help you today?"
}
  • Ultra-low latency for real-time conversation
  • Phone-compatible audio format
  • Clear, professional voice
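
One way to apply these settings is to split them into connection parameters and the text to speak. The sketch below assumes the model, encoding, and sample rate travel as query parameters on the `wss://api.deepgram.com/v1/speak` URL, as in the API Integration example, while the text is sent separately once the socket is open. `build_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "wss://api.deepgram.com/v1/speak"  # endpoint from the API Integration example


def build_request(settings: dict) -> tuple[str, str]:
    """Return (WebSocket URL, text to speak) for a flat settings dict."""
    # Everything except the text becomes a query parameter on the URL.
    params = {key: value for key, value in settings.items() if key != "text"}
    return f"{BASE_URL}?{urlencode(params)}", settings["text"]


phone_settings = {
    "model": "aura-2-asteria-en",
    "encoding": "mulaw",
    "sample_rate": 8000,
    "text": "Thank you for calling. How can I help you today?",
}
url, text = build_request(phone_settings)
```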

Best Practices

🚀 Optimization Tips

Maximize Deepgram’s Speed Advantage
Send text in optimal chunks for best latency
  • Ideal chunk size: 20-50 words
  • Avoid: Sending entire paragraphs at once
  • Benefit: Faster time to first audio
def chunk_text(text, max_words=30):
    words = text.split()
    chunks = []
    current_chunk = []
    
    for word in words:
        current_chunk.append(word)
        if len(current_chunk) >= max_words:
            chunks.append(' '.join(current_chunk))
            current_chunk = []
    
    if current_chunk:
        chunks.append(' '.join(current_chunk))
    
    return chunks
Keep WebSocket connections alive for multiple requests
  • Pattern: One connection per conversation
  • Benefit: Eliminates connection overhead
  • Implementation: Reuse WebSocket for entire call session
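
The one-connection-per-conversation pattern can be sketched as a small wrapper that opens the socket on first use and keeps it until the call ends. `CallSession` is a hypothetical class, not part of Burki or the Deepgram SDK; it takes any zero-argument callable that returns an awaitable connection, for example `lambda: websockets.connect(uri)`.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class CallSession:
    """Lazily opens one connection and reuses it for every utterance."""

    def __init__(self, connect: Callable[[], Awaitable[Any]]):
        self._connect = connect          # e.g. lambda: websockets.connect(uri)
        self._ws: Optional[Any] = None

    async def _connection(self) -> Any:
        if self._ws is None:             # first use: pay the handshake cost once
            self._ws = await self._connect()
        return self._ws

    async def send(self, payload: str) -> None:
        """Send one message over the shared connection."""
        ws = await self._connection()
        await ws.send(payload)

    async def close(self) -> None:
        """Close the connection at the end of the call, if one was opened."""
        if self._ws is not None:
            await self._ws.close()
            self._ws = None
```

Create one `CallSession` when the call starts, call `send` for each chunk of text, and call `close` when the caller hangs up.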
Implement robust error handling for production
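import asyncio
import logging

import websockets

logger = logging.getLogger(__name__)

async def robust_tts_stream():
    max_retries = 3
    retry_count = 0

    while retry_count < max_retries:
        try:
            # Your TTS streaming code here
            break
        except websockets.exceptions.ConnectionClosed:
            retry_count += 1
            logger.warning("Connection closed, retry %d/%d", retry_count, max_retries)
            await asyncio.sleep(1)  # Brief pause before retry
        except Exception as e:
            logger.error(f"TTS error: {e}")
            break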

Pricing

💰 Simple, Predictable Pricing

Pay-per-character with volume discounts. No hidden fees or subscription tiers.

| Usage Tier | Price per 1,000 Characters | Best For |
| --- | --- | --- |
| First 10M chars/month | $0.0135 | Small to medium businesses |
| Next 90M chars/month | $0.0108 | Growing applications |
| Next 400M chars/month | $0.0081 | Enterprise usage |
| 500M+ chars/month | Custom pricing | Large-scale deployments |
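
A quick way to estimate a monthly bill from these tiers is to charge each block of characters at its own rate. This sketch assumes the listed prices apply per 1,000 characters, which is how Deepgram typically quotes TTS rates; `chars_per_unit` is exposed so you can change that. `estimate_monthly_cost` is a hypothetical helper, and you should confirm current rates before relying on the result.

```python
from decimal import Decimal

# (characters in tier, price per unit) copied from the pricing tiers listed here.
TIERS = [
    (10_000_000, Decimal("0.0135")),
    (90_000_000, Decimal("0.0108")),
    (400_000_000, Decimal("0.0081")),
]


def estimate_monthly_cost(chars: int, chars_per_unit: int = 1000) -> Decimal:
    """Graduated cost in dollars; raises beyond the published tiers."""
    if chars < 0:
        raise ValueError("chars must be non-negative")
    remaining, total = chars, Decimal("0")
    for tier_size, price in TIERS:
        used = min(remaining, tier_size)      # characters billed at this tier's rate
        total += Decimal(used) / chars_per_unit * price
        remaining -= used
        if remaining == 0:
            return total
    raise ValueError("above 500M characters/month pricing is custom")
```

Under that assumption, 20 million characters in a month would be billed as 10M at the first rate plus 10M at the second.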

Language Support

🇺🇸 English Optimizations

Specialized for English-language applications
  • Native English training data
  • Optimized for American English pronunciation
  • Best-in-class quality for English content
  • Perfect for US-based business applications

Troubleshooting

Problem: Response time slower than expected
Solutions:
  • Verify you’re using Aura-2 model
  • Check network connection quality
  • Ensure µ-law encoding for phone calls
  • Monitor concurrent connection limits

Problem: Distorted or unclear audio
Solutions:
  • Use correct encoding (µ-law for phones, linear16 for high quality)
  • Verify sample rate matches your playback system
  • Check API key permissions include TTS
  • Test with shorter text chunks

Problem: WebSocket connection terminates unexpectedly
Solutions:
  • Implement connection keepalive pings
  • Add automatic reconnection logic
  • Monitor connection health
  • Use exponential backoff for retries
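
Reconnection with exponential backoff can be sketched as follows. `with_backoff` and `backoff_delays` are hypothetical helpers, and the base delay, cap, and jitter are illustrative values. Note that `websockets.exceptions.ConnectionClosed` is not a subclass of the built-in `ConnectionError`, so pass it explicitly through `retry_on` when wrapping WebSocket code.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Delays of base * 2**attempt seconds, capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


async def with_backoff(
    op: Callable[[], Awaitable[T]],
    retries: int = 4,
    retry_on: tuple[type[BaseException], ...] = (ConnectionError,),
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Run `op`, retrying on `retry_on` with exponentially growing pauses."""
    for delay in backoff_delays(retries):
        try:
            return await op()
        except retry_on:
            # Jitter spreads out reconnects from many calls failing together.
            await sleep(delay + random.uniform(0, delay / 10))
    return await op()  # final attempt; any error propagates to the caller
```

With the defaults, the pauses are roughly 0.5, 1, 2, and 4 seconds before the final attempt.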

Migration from Other Providers

From ElevenLabs
Key Differences:
  • Roughly 3x lower latency (~75ms vs ~250ms)
  • English-only vs multilingual
  • WebSocket-only vs REST+WebSocket
  • Different voice ID format
Migration Tips:
  • Map ElevenLabs voices to Deepgram equivalents
  • Update API calls to WebSocket format
  • Adjust audio encoding for your use case

⚡ Ready for Ultra-Fast TTS?

Configure Deepgram Aura in your assistant settings and experience the speed difference in your phone calls!