🔧 TTS Troubleshooting Guide
Solve common TTS issues quickly with provider-specific solutions and general troubleshooting strategies.
Quick Diagnosis
🩺 Identify Your Issue
Start here to quickly identify the type of problem you’re experiencing.
Symptoms: TTS request completes but no audio is produced
Quick Checks:
- ✅ API key is valid and has TTS permissions
- ✅ Voice ID exists and is spelled correctly
- ✅ Audio format is supported by your system
- ✅ Network connectivity is stable
No Audio Output
ElevenLabs - No Audio
Common Causes & Solutions:
Voice ID Issues:
- Verify voice ID is correct (case-sensitive)
- Ensure voice is available on your plan
- Try with default voice: 21m00Tcm4TlvDq8ikWAM (Rachel)
Model Compatibility:
- Some voices don’t work with all models
- Try with eleven_turbo_v2_5 for broad compatibility
- Check ElevenLabs compatibility matrix
API Key Issues:
- Verify API key has TTS permissions
- Check key isn’t expired or revoked
- Test with a simple curl request first
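The smoke test suggested above can also be scripted. The following is a minimal sketch, assuming the ElevenLabs REST endpoint is https://api.elevenlabs.io/v1/text-to-speech/{voice_id} and that the key is sent in an xi-api-key header. The helper names build_elevenlabs_request and describe_response are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Assumed endpoint layout for the ElevenLabs REST text-to-speech API.
ELEVENLABS_TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_elevenlabs_request(api_key, text,
                             voice_id="21m00Tcm4TlvDq8ikWAM",  # Rachel
                             model_id="eleven_turbo_v2_5"):
    """Build (but do not send) a minimal TTS request for a smoke test."""
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        ELEVENLABS_TTS_URL.format(voice_id=voice_id),
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def describe_response(status, body):
    """Turn an HTTP status and response body into a short diagnosis."""
    if status == 200 and body:
        return "ok: received %d bytes of audio" % len(body)
    if status == 200:
        return "empty audio: check voice ID and model compatibility"
    if status in (401, 403):
        return "auth problem: check API key and plan permissions"
    return "unexpected status %d" % status
```

To run the check, pass the request to urllib.request.urlopen and feed the status and body to describe_response. Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so catch it and pass its code attribute instead.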
Deepgram - No Audio
Common Causes & Solutions:
WebSocket Connection:
Audio Format Issues:
- Ensure your system supports the requested format
- Try µ-law for phone systems: encoding=mulaw&sample_rate=8000
- Use linear16 for web: encoding=linear16&sample_rate=24000
Voice and Model Names:
- Use correct voice format: aura-asteria-en, not asteria
- Verify model exists: aura-2 vs aura
- Check Deepgram voice list
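The format and naming hints above can be combined into one small helper. This is a sketch only: it assumes the Deepgram speak endpoint lives at https://api.deepgram.com/v1/speak and accepts model, encoding, and sample_rate as query parameters. The function build_speak_url, the preset table, and the "aura-" prefix check are illustrative assumptions, not part of any SDK.

```python
from urllib.parse import urlencode

# Assumed base URL for Deepgram's text-to-speech endpoint.
DEEPGRAM_SPEAK_URL = "https://api.deepgram.com/v1/speak"

# Presets taken from the audio format hints in this section.
FORMAT_PRESETS = {
    "phone": {"encoding": "mulaw", "sample_rate": 8000},
    "web": {"encoding": "linear16", "sample_rate": 24000},
}


def build_speak_url(model, target="web"):
    """Return a speak URL for the given voice model and playback target."""
    if not model.startswith("aura-"):
        # Bare names such as "asteria" are not full model identifiers.
        raise ValueError(
            "expected a full model name like 'aura-asteria-en', got %r" % model
        )
    params = {"model": model}
    params.update(FORMAT_PRESETS[target])
    return DEEPGRAM_SPEAK_URL + "?" + urlencode(params)
```

Rejecting short names early turns a silent "no audio" result into an immediate, readable error.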
Inworld - No Audio
Common Causes & Solutions:
Bearer Token:
Language/Voice Compatibility:
- Verify voice supports selected language
- Check language code format: en, not english
- Use language matrix
Model Selection:
- Try inworld-tts-1 before inworld-tts-1-max
- Ensure model supports your voice
- Check model compatibility
Resemble - No Audio
Common Causes & Solutions:
WebSocket Requirements:
- Business plan required for WebSocket streaming
- Check plan status in Resemble dashboard
- Fall back to REST API if needed
Voice Training Status:
- Ensure custom voice training is complete
- Check voice status in Resemble dashboard
- Wait for training completion before using
Audio Quality Issues
Robotic or Distorted Voice
ElevenLabs Solutions:
Inworld Solutions:
- Reduce emotional markup intensity
- Try different voice with your content
- Switch from TTS-1-Max to TTS-1 for stability
Deepgram Solutions:
- Use Aura-2 instead of original Aura
- Ensure proper audio encoding for your system
- Check sample rate matches playback system
General Solutions:
- Test with shorter text samples
- Remove special characters from input text
- Verify network stability during generation
Pronunciation Issues
Text Preprocessing:
Provider-Specific:
- ElevenLabs: Use SSML for pronunciation control
- Inworld: Leverage phonetic variations in training
- Deepgram: English-optimized, fewer pronunciation issues
- Resemble: Train custom voice with problematic words
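Text preprocessing is one provider-independent way to address pronunciation problems. The sketch below expands a few abbreviations before the text is sent to any TTS provider. The replacement table and the function name preprocess_for_tts are hypothetical examples; a real table would be built from the terms your own content mispronounces.

```python
import re

# Hypothetical replacement table; extend it with terms from your own domain.
REPLACEMENTS = {
    "Dr.": "Doctor",
    "e.g.": "for example",
    "API": "A P I",
    "vs.": "versus",
}


def preprocess_for_tts(text, replacements=REPLACEMENTS):
    """Expand abbreviations, then collapse repeated whitespace."""
    for short, spoken in replacements.items():
        # Lookarounds keep "API" from matching inside a word like "RAPID".
        pattern = r"(?<!\w)" + re.escape(short) + r"(?!\w)"
        text = re.sub(pattern, spoken, text)
    return re.sub(r"\s+", " ", text).strip()
```

Spelling an acronym as separated letters is a common workaround when a voice reads it as a word.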
Inconsistent Quality
Stability Optimization:
- ElevenLabs: Increase stability to 0.6-0.7
- Inworld: Use TTS-1 instead of TTS-1-Max
- Resemble: Retrain voice with more consistent samples
Connection Consistency:
- Use WebSocket connections for streaming providers
- Implement connection keepalive
- Add retry logic for failed chunks
- Monitor network latency and jitter
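For the ElevenLabs stability suggestion above, the value is normally passed in the request body. The following sketch assumes the body accepts a voice_settings object with stability and similarity_boost fields in the range 0 to 1. The default similarity_boost of 0.75 and the helper names are illustrative assumptions.

```python
def clamp_unit(value):
    """Clamp a setting to the 0.0-1.0 range."""
    return max(0.0, min(1.0, float(value)))


def build_tts_body(text, stability=0.65, similarity_boost=0.75,
                   model_id="eleven_turbo_v2_5"):
    """Build a request body with stability in the suggested 0.6-0.7 band."""
    return {
        "text": text,
        "model_id": model_id,
        "voice_settings": {
            "stability": clamp_unit(stability),
            # Assumed default; tune per voice.
            "similarity_boost": clamp_unit(similarity_boost),
        },
    }
```

Clamping guards against out-of-range values, which can otherwise be rejected or produce unpredictable output.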
Latency Problems
⚡ Speed Optimization
Optimize TTS response times across all providers.
ElevenLabs Latency Optimization
Model Selection:
Best Practices:
- Use Flash v2.5 for phone calls (~75ms)
- Keep text chunks under 100 characters
- Avoid complex punctuation and formatting
- Use WebSocket streaming for real-time apps
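Keeping chunks under 100 characters is easier when the split happens at sentence and word boundaries rather than mid-word. This is one possible sketch. The function names are hypothetical, and a single word longer than the limit is kept whole rather than cut.

```python
import re


def _split_long(sentence, max_chars):
    """Break an over-long sentence at word boundaries."""
    line, out = "", []
    for word in sentence.split():
        candidate = (line + " " + word).strip()
        if len(candidate) <= max_chars or not line:
            line = candidate
        else:
            out.append(line)
            line = word
    if line:
        out.append(line)
    return out


def chunk_text(text, max_chars=100):
    """Split text into chunks of at most max_chars, preferring sentence breaks."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        for piece in _split_long(sentence, max_chars):
            candidate = (current + " " + piece).strip()
            if len(candidate) <= max_chars:
                current = candidate  # still fits, keep accumulating
            else:
                if current:
                    chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own request or streamed message, so audio for the first sentence starts before the rest is generated.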
Deepgram Speed Maximization
Optimal Configuration:
Speed Tips:
- Already one of the fastest providers (~75ms)
- Use µ-law encoding for phone systems
- Keep WebSocket connections alive
- Send text in 20-50 word chunks
General Latency Solutions
Text Optimization:
Connection Optimization:
- Reuse connections where possible
- Implement connection pooling
- Use regional endpoints when available
- Monitor and retry failed requests quickly
API and Authentication Errors
401 Unauthorized
403 Forbidden
Common Causes:
- Plan limitations (voice access, features)
- Usage quota exceeded
- Geographic restrictions
Solutions:
- Check plan features and upgrade if needed
- Verify voice is available on your plan
- Review usage dashboard for quota limits
- Contact provider support for restrictions
429 Rate Limited
Rate Limit Solutions:
Prevention:
- Implement proper rate limiting in your code
- Use connection pooling and queuing
- Distribute requests across time
- Consider upgrading to higher tier plans
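When a 429 does occur, retrying with exponential backoff spreads requests across time instead of hammering the API. The sketch below is provider-agnostic. The RateLimited exception and call_with_backoff function are hypothetical names, and your own request function is assumed to raise RateLimited on a 429, optionally carrying the server's Retry-After value in seconds.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller's request function on an HTTP 429 response."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds, if the server sent a hint


def call_with_backoff(request_fn, max_attempts=5, base_delay=0.5,
                      max_delay=8.0, sleep=time.sleep):
    """Call request_fn, retrying on RateLimited with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            if err.retry_after is not None:
                # Never retry sooner than the server asked for.
                delay = max(delay, err.retry_after)
            # Jitter keeps parallel workers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 4))
```

The sleep parameter is injectable so the retry schedule can be tested without actually waiting.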
Provider-Specific Issues
Voice Cloning Problems
Training Issues:
- Upload 1-25 minutes of clear audio
- Use consistent speaker and environment
- Include diverse sentence types
- Wait for full training completion
Usage Issues:
- Use correct voice ID from dashboard
- Ensure plan supports voice cloning
- Try different similarity_boost values
- Check voice model compatibility
Multilingual Issues
Language Detection:
- Explicitly set language parameter
- Use models that support target language
- Test with native speakers
- Avoid mixing languages in single request
Emergency Troubleshooting
🚨 When Everything Breaks
Quick recovery strategies for critical TTS failures.
Fallback Strategy Implementation
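One way to implement a fallback strategy is to try providers in priority order and return the first usable result. This is a minimal sketch, not a definitive implementation. It assumes each provider is wrapped in a callable that takes text and returns audio bytes, and the name synthesize_with_fallback is hypothetical.

```python
def synthesize_with_fallback(text, providers):
    """Try each (name, synthesize) pair in order; return the first non-empty audio.

    providers is a list of (name, callable) tuples, highest priority first.
    """
    errors = []
    for name, synthesize in providers:
        try:
            audio = synthesize(text)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append("%s: %s" % (name, exc))
            continue
        if audio:
            return name, audio
        # A successful call with no audio counts as a failure too.
        errors.append("%s: empty audio" % name)
    raise RuntimeError("all TTS providers failed: " + "; ".join(errors))
```

Returning the provider name alongside the audio makes it easy to log how often the fallback path is actually used.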
Health Check Implementation
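A health check can be as simple as synthesizing a short probe phrase and measuring how long it takes. The sketch below assumes the provider is wrapped in a callable that takes text and returns audio bytes. The function name check_provider, the probe text, and the 2-second latency threshold are illustrative assumptions.

```python
import time


def check_provider(name, synthesize, probe_text="Health check.",
                   max_latency=2.0, clock=time.monotonic):
    """Run one probe request and report whether the provider looks healthy."""
    start = clock()
    try:
        audio = synthesize(probe_text)
    except Exception as exc:  # any failure marks the provider unhealthy
        return {"provider": name, "healthy": False,
                "latency": None, "detail": str(exc)}
    latency = clock() - start
    if not audio:
        detail = "empty audio"
    elif latency > max_latency:
        detail = "slow response"
    else:
        detail = "ok"
    return {"provider": name, "healthy": detail == "ok",
            "latency": latency, "detail": detail}
```

Running this on a schedule for each provider gives an early signal before callers notice missing audio.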
Getting Help
📞 Support Resources
When you need additional help beyond this troubleshooting guide.
Provider Support
Direct Provider Support:
Community Resources
Community Help:
💡 Still Having Issues?
If this guide didn’t solve your problem, check our Best Practices guide or reach out to our community for help!