🎙️ Give Your Assistant a Voice

Transform text into natural, human-like speech with our integrated TTS providers. Each provider offers unique advantages for different use cases.

Quick Provider Comparison

⚡ ElevenLabs

Premium Quality & Customization
70+ languages, voice cloning, advanced controls
Best for: High-quality customer interactions

🚀 Deepgram Aura

Ultra-Low Latency
3x faster than competitors, phone-optimized
Best for: Real-time conversations

🎯 Cartesia Sonic 3

Multilingual Excellence
42 languages, voice cloning, low latency
Best for: Multilingual agents, global deployments

☁️ Azure Speech

Enterprise Scale
500+ neural voices, 100+ languages
Best for: Enterprise, Azure ecosystem

🎭 Inworld.ai

AI-Powered Emotions
Multilingual, emotional markup, voice cloning
Best for: Expressive, contextual responses

🎙️ Resemble AI

Custom Voice Creation
WebSocket streaming, personalized voices
Best for: Brand-specific voice identity

🧠 OpenAI TTS

Instruction-Aware Voices
tts-1, tts-1-hd, and gpt-4o-mini-tts
Best for: OpenAI-native stacks and voice instructions

🧩 Additional Providers

Kokoro, Uplift, Murf, Soniox
Self-hosted and specialized provider options
Best for: Custom deployments and specialized voices

Feature Matrix

| Provider | Latency | Languages | Voice Cloning | Streaming | Best For |
|---|---|---|---|---|---|
| ElevenLabs | Vendor-reported low latency | 70+ | ✅ Advanced | WebSocket | Premium quality |
| Deepgram | Vendor-reported very low latency | English | | WebSocket | Speed & phone calls |
| Cartesia | Vendor-reported low latency | 42 | ✅ Yes | WebSocket | Multilingual |
| Azure | Provider/config dependent | 100+ | | HTTP | Microsoft ecosystem |
| Inworld | Provider/config dependent | 11 | ✅ Zero-shot | HTTP/WS | Emotional expression |
| Resemble | Provider/config dependent | English | ✅ Custom | WebSocket | Brand voices |
| OpenAI | Provider/config dependent | English-first | | HTTP streaming | OpenAI-native stacks |
| Kokoro | Self-hosted dependent | Model-dependent | | HTTP | Self-hosted/local deployments |
| Uplift | Provider-dependent | Specialized | | Streaming service | Uplift voices |
| Murf | Provider-dependent | Multi-language | | HTTP | Style/rate/pitch controls |
| Soniox | Provider-dependent | Model-dependent | | Streaming service | Soniox voice stack |

New to TTS? Start with ElevenLabs for the best balance of quality and features, Deepgram if speed is your priority, or Cartesia for multilingual support.
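
To make the comparison easier to act on, the feature matrix can be treated as data and filtered by requirement. The sketch below is illustrative only: the `TtsProvider` class, the `shortlist` helper, and the `cloning_marked` field are hypothetical names, not part of Burki. Only the first seven matrix rows are encoded, and a provider counts as supporting cloning only where the matrix shows a checkmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TtsProvider:
    name: str
    languages: str
    cloning_marked: bool  # True only where the matrix shows a checkmark
    streaming: str


# Values copied from the feature matrix; nothing here is queried from Burki.
MATRIX = [
    TtsProvider("ElevenLabs", "70+", True, "WebSocket"),
    TtsProvider("Deepgram", "English", False, "WebSocket"),
    TtsProvider("Cartesia", "42", True, "WebSocket"),
    TtsProvider("Azure", "100+", False, "HTTP"),
    TtsProvider("Inworld", "11", True, "HTTP/WS"),
    TtsProvider("Resemble", "English", True, "WebSocket"),
    TtsProvider("OpenAI", "English-first", False, "HTTP streaming"),
]


def shortlist(needs_cloning: bool = False, needs_websocket: bool = False) -> list[str]:
    """Return provider names whose matrix row satisfies the requested needs."""
    names = []
    for provider in MATRIX:
        if needs_cloning and not provider.cloning_marked:
            continue
        # "HTTP/WS" counts as WebSocket-capable; "HTTP streaming" does not.
        transports = provider.streaming.split("/")
        if needs_websocket and not ("WebSocket" in transports or "WS" in transports):
            continue
        names.append(provider.name)
    return names


candidates = shortlist(needs_cloning=True, needs_websocket=True)
```

A shortlist like this is only a starting point. The latency column is vendor-reported or configuration-dependent, so it is left out of the filter on purpose.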

Setup Overview

All providers follow the same basic setup pattern:
1. Get API Credentials: Sign up with your chosen provider and obtain API keys.
2. Configure in Burki: Add your credentials in the assistant’s AI Configuration → TTS tab.
3. Select Voice & Model: Choose from available voices and models for your use case.
4. Fine-tune Settings: Adjust speed, stability, and other provider-specific options.
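
For teams that script their configuration, the four steps above can be sketched as a small payload builder. This is a minimal sketch under stated assumptions: `build_tts_settings` and the `api_key_env` parameter are hypothetical, and the environment variable name is whatever you choose locally. The field names `provider`, `voice_id`, `model_id`, `language`, and `speed` come from the provider options listed on this page. The credential itself is not placed in the payload, because this page only documents entering it in the TTS tab.

```python
import os


def build_tts_settings(provider, voice_id, model_id, *, api_key_env=None, **tuning):
    """Assemble a tts_settings payload following the four setup steps."""
    # Step 1: if a credential is expected, confirm it is available locally.
    if api_key_env is not None and not os.environ.get(api_key_env):
        raise RuntimeError(f"Set {api_key_env} before configuring {provider}")

    # Steps 2-3: provider plus the chosen voice and model.
    settings = {"provider": provider, "voice_id": voice_id, "model_id": model_id}

    # Step 4: provider-specific tuning options (for example speed or language).
    settings.update(tuning)
    return {"tts_settings": settings}


# Kokoro needs no API key by default, so api_key_env is omitted here.
payload = build_tts_settings("kokoro", "af_heart", "kokoro", language="en", speed=1.0)
```

Checking the credential first means a missing key fails early instead of surfacing later as a provider error during a call.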

Provider Deep Dives

🎭 ElevenLabs - Premium Voice Quality

Latest Models: Flash v2.5 (75ms), v3 (70+ languages), Turbo v2.5
Key Features: Advanced voice controls, multilingual support, custom voice creation
Perfect For: Customer service, content creation, multilingual applications
Complete ElevenLabs Guide

⚡ Deepgram Aura - Ultra-Fast TTS

Latest Models: Aura-2 (next-gen), Aura (proven)
Key Features: Industry-leading speed, phone optimization, µ-law encoding
Perfect For: Real-time phone calls, live chat, interactive applications
Complete Deepgram Guide

🎯 Cartesia Sonic 3 - Multilingual Excellence

Latest Models: sonic-3 (latest), sonic-3-2025-10-27 (stable)
Key Features: 42 languages, voice cloning from ~5 sec of audio, low-latency WebSocket
Perfect For: Global deployments, multilingual voice agents
Complete Cartesia Guide

☁️ Azure Speech - Enterprise Scale

Latest Models: Neural (high-quality), Standard (basic)
Key Features: 500+ voices, 100+ languages, SSML support, Microsoft integration
Perfect For: Enterprise applications, Azure ecosystem users
Complete Azure Guide

🎪 Inworld.ai - AI-Powered Expression

Latest Models: inworld-tts-1, inworld-tts-1-max, inworld-tts-1.5-max, inworld-tts-1.5-mini
Key Features: Emotional markup, context awareness, 11 languages
Perfect For: Gaming, entertainment, emotional customer support
Complete Inworld Guide

🎙️ Resemble AI - Custom Brand Voices

Key Features: WebSocket streaming, unlimited voice creation, business plans
Perfect For: Brand consistency, personalized experiences, enterprise
Complete Resemble Guide

🧠 OpenAI TTS - Instruction-Aware Speech

Latest Models: tts-1, tts-1-hd, gpt-4o-mini-tts
Key Features: Built-in OpenAI voices, speed control, instruction support on GPT-4o mini TTS models
Perfect For: Teams already using OpenAI keys and model-specific voice instructions
Complete OpenAI TTS Guide

Additional Supported Providers

The backend also supports these TTS providers through tts_settings.provider:

| Provider | Provider Key | Important Options |
|---|---|---|
| Kokoro | kokoro | voice_id, model_id, language, speed, kokoro_base_url; no API key required by default |
| Uplift | uplift | voice_id, model_id, language, output_format |
| Murf | murf | voice_id, model_id, style, rate, pitch, region, optional variation, locale, pronunciation_dictionary |
| Soniox | soniox | voice_id, model_id, language |

```json
{
  "tts_settings": {
    "provider": "kokoro",
    "voice_id": "af_heart",
    "model_id": "kokoro",
    "language": "en",
    "provider_config": {
      "kokoro_base_url": "http://localhost:8880"
    }
  }
}
```

AWS Polly and Google TTS appear only as future placeholders in backend enum comments and are not registered in the active TTS factory. Treat them as unsupported until they are wired in.
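
A configuration can be checked against the provider keys and options listed above before it is saved. The sketch below is a local convenience, not Burki's own validation: `OPTIONS_BY_PROVIDER` and `check_tts_settings` are hypothetical names, and the sketch assumes options may appear either at the top level of `tts_settings` or inside `provider_config`, as in the Kokoro example. It covers only the four additional providers, because the provider keys for the others are not listed on this page.

```python
# Option names copied from the "Additional Supported Providers" table.
OPTIONS_BY_PROVIDER = {
    "kokoro": {"voice_id", "model_id", "language", "speed", "kokoro_base_url"},
    "uplift": {"voice_id", "model_id", "language", "output_format"},
    "murf": {
        "voice_id", "model_id", "style", "rate", "pitch", "region",
        "variation", "locale", "pronunciation_dictionary",
    },
    "soniox": {"voice_id", "model_id", "language"},
}


def check_tts_settings(config: dict) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    settings = config.get("tts_settings", {})
    provider = settings.get("provider")
    if provider not in OPTIONS_BY_PROVIDER:
        return [f"unknown provider key: {provider!r}"]

    # Collect option names from the top level and from provider_config.
    names = set(settings) - {"provider", "provider_config"}
    names |= set(settings.get("provider_config", {}))

    unexpected = sorted(names - OPTIONS_BY_PROVIDER[provider])
    return [f"option not listed for {provider}: {name}" for name in unexpected]


kokoro_example = {
    "tts_settings": {
        "provider": "kokoro",
        "voice_id": "af_heart",
        "model_id": "kokoro",
        "language": "en",
        "provider_config": {"kokoro_base_url": "http://localhost:8880"},
    }
}
problems = check_tts_settings(kokoro_example)
```

The check reports an option that is not in the table rather than rejecting it outright, since the table lists "important" options and may not be exhaustive.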

Advanced Topics

🎛️ Voice Tuning

Master stability, similarity, and style controls across all providers

🔧 Troubleshooting

Common issues and solutions with step-by-step fixes

📈 Best Practices

Performance optimization, cost reduction, and production tips

🔗 See Also

Configuration: Learn how to configure TTS in your AI Configuration settings.
Integration: Understand how TTS fits into the overall Architecture of Burki Voice AI.
Call Management: Discover how TTS works with Call Management features.

Quick Start Guide

Recommended Setup:
  • Provider: ElevenLabs or Deepgram
  • Model: Flash v2.5 or Aura-2
  • Voice: Professional (Rachel, Asteria)
  • Settings: Stability 0.5, Speaker Boost ON

API Rate Limits: Each provider has different rate limits and pricing models. Check the individual provider pages for detailed pricing information.

🚀 Ready to Get Started?

Choose your provider and dive into the detailed setup guides, or check out our Best Practices for optimization tips.