Overview
Burki Voice AI's voice cloning feature enables you to:
- Upload Voice Samples: Upload high-quality audio recordings to create voice models
- Multi-Provider Support: Use ElevenLabs, Resemble AI, and other providers that support voice cloning
- Instant Voice Creation: Generate cloned voices ready for immediate use
- Voice Management: Organize, test, and manage your custom voices
- Usage Analytics: Track voice usage for billing and optimization
Voice Sample Upload
Upload audio samples with validation and processing
AI Voice Training
Provider-powered voice training with quality optimization
Usage Analytics
Track synthesis usage and voice performance
Easy Integration
Seamless integration with existing assistant configurations
Supported Providers
ElevenLabs
- Instant Voice Cloning: Create voices from single audio samples
- High Quality: Professional-grade voice synthesis
- Multiple Languages: Support for 29+ languages
- Quick Processing: Voices ready in seconds
Resemble AI
- Professional Training: Advanced voice training algorithms
- Custom Models: Highly personalized voice characteristics
- Unlimited Voices: Create as many voices as needed
- Enterprise Features: Advanced customization options
Future Providers
- Inworld AI: Coming soon with emotional voice cloning
- OpenAI: Voice cloning capabilities when available
Voice Sample Requirements
Audio Quality Guidelines
File Format Requirements
Supported Formats:
- MP3 (recommended)
- WAV (highest quality)
- FLAC (lossless)
- M4A/AAC
- OGG
Technical Requirements:
- Sample Rate: 22kHz or higher
- Bit Rate: 128kbps minimum
- Channels: Mono preferred, stereo acceptable
- File Size: Maximum 50MB
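The limits above can be checked locally before a file is uploaded. The following is a minimal sketch, not Burki's own validator: the function name `check_sample_file` is hypothetical, "50MB" is assumed to mean 50 × 1024 × 1024 bytes, and "22kHz" is read literally as 22,000 Hz. The bit rate check is optional because it mainly applies to lossy formats such as MP3.

```python
from pathlib import Path

# Limits copied from the requirements listed above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".aac", ".ogg"}
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024  # assumption: binary megabytes
MIN_SAMPLE_RATE_HZ = 22_000             # assumption: "22kHz" read literally
MIN_BIT_RATE_KBPS = 128


def check_sample_file(filename, size_bytes, sample_rate_hz, bit_rate_kbps=None):
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file is larger than 50MB")
    if sample_rate_hz < MIN_SAMPLE_RATE_HZ:
        problems.append("sample rate is below 22kHz")
    # Bit rate is only checked when the caller knows it (lossy formats).
    if bit_rate_kbps is not None and bit_rate_kbps < MIN_BIT_RATE_KBPS:
        problems.append("bit rate is below 128kbps")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller show every issue at once.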
Recording Guidelines
Duration Requirements:
- Minimum: 10 seconds of clear speech
- Recommended: 30-60 seconds for better quality
- Maximum: 10 minutes (longer samples may not improve quality)
Quality Requirements:
- Clear Speech: No background noise or music
- Natural Tone: Conversational, not monotone
- Consistent Volume: Steady audio levels throughout
- Single Speaker: Only the target voice in the recording
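For WAV recordings, the duration limits above can be checked with Python's standard `wave` module. This is a sketch under assumptions: the helper names and the labels it returns (`too_short`, `recommended`, and so on) are illustrative only, and other formats would need a different decoder.

```python
import wave

# Duration limits from the guidelines above, in seconds.
MIN_SECONDS = 10
MAX_SECONDS = 10 * 60
RECOMMENDED_RANGE = (30, 60)


def wav_duration_seconds(wav_file):
    """Length of a WAV recording; accepts a path string or a binary file object."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def classify_duration(seconds):
    """Map a duration onto the guideline it falls under (labels are hypothetical)."""
    if seconds < MIN_SECONDS:
        return "too_short"
    if seconds > MAX_SECONDS:
        return "too_long"
    if RECOMMENDED_RANGE[0] <= seconds <= RECOMMENDED_RANGE[1]:
        return "recommended"
    return "acceptable"
```

A typical call would be `classify_duration(wav_duration_seconds("sample.wav"))`, where `sample.wav` stands in for your own recording.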
Quality Tips
For Best Results:
- Environment: Record in a quiet room with soft furnishings
- Microphone: Use a quality microphone 6-12 inches from mouth
- Content: Read varied sentences with different emotions
- Consistency: Maintain the same speaking style throughout
- Format: Save in WAV format for highest quality
Getting Started
Step 1: Upload Voice Sample
Navigate to your assistant's configuration and open the Voice Cloning section:
- Upload Audio File: Drag and drop or click to select your audio file
- Add Metadata: Provide a name, description, and tags
- Validation: System automatically validates audio quality
- Processing: File is uploaded and prepared for cloning
Example Upload
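A minimal sketch of preparing the name, description, and tags from the metadata step before a sample is submitted. The class `VoiceSampleMetadata` and the field names it emits are assumptions made for illustration; they are not taken from the Burki API.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceSampleMetadata:
    """Hypothetical container for the metadata entered during upload."""
    name: str
    description: str = ""
    tags: list[str] = field(default_factory=list)

    def normalized_tags(self) -> list[str]:
        # Lowercase, trim, and de-duplicate while keeping the original order.
        seen: list[str] = []
        for tag in self.tags:
            cleaned = tag.strip().lower()
            if cleaned and cleaned not in seen:
                seen.append(cleaned)
        return seen

    def to_form_fields(self) -> dict[str, str]:
        # Field names here are placeholders, not documented API parameters.
        if not self.name.strip():
            raise ValueError("a voice sample needs a name")
        return {
            "name": self.name.strip(),
            "description": self.description.strip(),
            "tags": ",".join(self.normalized_tags()),
        }
```

Normalizing tags at this point keeps later filtering in the voice library predictable.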
Step 2: Create Cloned Voice
Once your sample is uploaded, create a cloned voice:
- Select Provider: Choose ElevenLabs or Resemble AI
- Configure Options: Set voice name, language, and quality settings
- Initiate Cloning: Start the voice training process
- Monitor Progress: Track cloning status in real-time
Example Voice Creation
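Monitoring progress usually comes down to polling until the cloning job reaches a final state. The sketch below shows that loop in isolation. The status names `ready` and `failed` are hypothetical, and `fetch_status` is a placeholder for whatever call returns the current status in your setup.

```python
import time

# Hypothetical status names; substitute the values your provider reports.
TERMINAL_STATES = {"ready", "failed"}


def wait_for_clone(fetch_status, poll_interval_s=2.0, timeout_s=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it returns a terminal state or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"cloning still '{status}' after {timeout_s} seconds")
        sleep(poll_interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.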
Step 3: Use Cloned Voice
Once processing is complete, assign the voice to your assistant:
- Voice Selection: Choose from your cloned voices
- Testing: Preview the voice with sample text
- Assignment: Set as the assistant's default voice
- Go Live: Start using the voice in live calls
Voice Management
Voice Library
Organizing Voices
Voice Categories:
- Brand Voices: Official company voices
- Character Voices: Specific personas or characters
- Language Variants: Same voice in different languages
- Seasonal/Campaign: Temporary or promotional voices
Tagging Guidelines:
- Use consistent tags for easy filtering
- Include language, gender, style descriptors
- Add use case tags (customer service, sales, etc.)
Voice Analytics
Usage Tracking:
- Synthesis Count: Number of times voice was used
- Duration Metrics: Total audio generated
- Cost Tracking: Provider usage and billing
- Performance: Quality scores and user feedback
Analytics Insights:
- Most/least used voices
- Cost per synthesis by provider
- Quality trends over time
- User preference patterns
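If usage records are exported, the metrics above can be recomputed with a few lines of aggregation. This is a sketch under assumptions: the `SynthesisRecord` fields are hypothetical and do not reflect the export format of the analytics dashboard.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SynthesisRecord:
    """One synthesis event; the field names are illustrative only."""
    voice_name: str
    provider: str
    audio_seconds: float
    cost_usd: float


def summarize_usage(records):
    per_voice = defaultdict(lambda: {"count": 0, "audio_seconds": 0.0, "cost_usd": 0.0})
    per_provider = defaultdict(lambda: {"count": 0, "cost_usd": 0.0})
    for record in records:
        voice = per_voice[record.voice_name]
        voice["count"] += 1
        voice["audio_seconds"] += record.audio_seconds
        voice["cost_usd"] += record.cost_usd
        provider = per_provider[record.provider]
        provider["count"] += 1
        provider["cost_usd"] += record.cost_usd
    # Rank voices by synthesis count, most used first.
    ranked = sorted(per_voice, key=lambda name: per_voice[name]["count"], reverse=True)
    return {
        "per_voice": dict(per_voice),
        "cost_per_synthesis": {
            name: totals["cost_usd"] / totals["count"]
            for name, totals in per_provider.items()
        },
        "most_used": ranked[0] if ranked else None,
        "least_used": ranked[-1] if ranked else None,
    }
```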
Voice Testing
Test your cloned voices before deployment:
- Text-to-Speech Preview: Enter sample text to hear the voice
- Quality Assessment: Evaluate clarity, naturalness, and accuracy
- Comparison Testing: Compare with original samples and other voices
- A/B Testing: Test different voices with real users
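For A/B testing, each user should hear the same voice on every call so that feedback can be attributed to one variant. One common way to do this, sketched here as an assumption rather than a built-in Burki feature, is to hash the user identifier into a bucket. The function name and the `experiment` label are hypothetical.

```python
import hashlib


def assign_voice(user_id, voices, experiment="voice-ab"):
    """Deterministically pick one of the candidate voices for a user."""
    if not voices:
        raise ValueError("at least one voice is required")
    # Hashing experiment + user keeps assignments stable across calls
    # and independent between experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(voices)
    return voices[index]
```

Changing the `experiment` label reshuffles users, which is useful when a new test starts.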
API Integration
Upload Voice Sample
Create Cloned Voice
List Cloned Voices
Best Practices
Recording Quality
Professional Recording Setup
Equipment Recommendations:
- Microphone: USB condenser microphone (for example, the Audio-Technica AT2020USB+)
- Environment: Quiet room with minimal echo
- Software: Audacity, GarageBand, or professional DAW
- Monitoring: Use headphones to monitor audio quality
Recording Tips:
- Consistent Distance: Maintain 6-12 inches from microphone
- Proper Levels: Keep audio peaks between -12dB and -6dB
- Room Treatment: Use blankets or acoustic foam to reduce echo
- Multiple Takes: Record several versions and choose the best
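The peak-level guideline can be verified on a finished take. The sketch below measures the peak of a 16-bit PCM WAV file and checks it against the -12dB to -6dB window, assuming those figures refer to dB relative to full scale (dBFS). The function names are illustrative.

```python
import array
import math
import sys
import wave


def peak_dbfs(wav_file):
    """Peak level of a 16-bit PCM WAV file in dB relative to full scale."""
    with wave.open(wav_file, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    peak = max((abs(sample) for sample in samples), default=0)
    if peak == 0:
        return float("-inf")  # digital silence has no measurable peak
    return 20 * math.log10(peak / 32768)


def peaks_in_target_range(level_db, low=-12.0, high=-6.0):
    """True when the measured peak sits inside the recommended window."""
    return low <= level_db <= high
```

A take that peaks above the window risks clipping on louder passages, while one far below it leaves the voice closer to the noise floor.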
Content Selection
Ideal Voice Sample Content:
- Varied Sentences: Different sentence structures and lengths
- Emotional Range: Include slight variations in tone
- Natural Speech: Conversational delivery rather than a read-aloud tone
- Complete Thoughts: Full sentences with natural pauses
Avoid:
- Background noise or music
- Multiple speakers
- Heavy accents (unless desired)
- Monotone or robotic delivery
- Incomplete sentences or stuttering
Voice Management
- Naming Convention: Use descriptive, consistent names
- Version Control: Keep track of voice iterations and improvements
- Usage Documentation: Document which voices work best for different scenarios
- Regular Testing: Periodically test voice quality and user satisfaction
- Cost Monitoring: Track usage and costs across different providers
Security and Privacy
- Consent: Always obtain explicit consent before using someone's voice
- Data Protection: Store voice samples securely and follow GDPR/CCPA requirements
- Access Control: Limit who can create and manage cloned voices
- Audit Trail: Keep logs of voice creation and usage
- Retention Policy: Define how long voice samples and models are stored
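An audit trail and a retention policy can both start from very small building blocks. The following sketch shows one possible shape for an audit record and a retention check. The field names and the idea of expressing retention in days are assumptions, not Burki settings.

```python
from datetime import datetime, timedelta, timezone


def audit_entry(actor, action, voice_name, when=None):
    """Build one audit-log record; the field names are illustrative."""
    when = when or datetime.now(timezone.utc)
    return {
        "timestamp": when.isoformat(),
        "actor": actor,
        "action": action,
        "voice": voice_name,
    }


def is_past_retention(created_at, retention_days, now=None):
    """True when a sample or model is older than the retention window."""
    now = now or datetime.now(timezone.utc)
    return now - created_at > timedelta(days=retention_days)
```

Using timezone-aware UTC timestamps throughout avoids ambiguity when logs from several regions are compared.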
Troubleshooting
Common Issues
Upload Problems
File Upload Fails:
- Check that the file format is supported (MP3, WAV, FLAC, M4A, OGG)
- Ensure file size is under 50MB
- Verify audio duration is between 10 seconds and 10 minutes
- Check internet connection stability
Poor Audio Quality:
- Use a higher sample rate (22kHz+) and bit rate (128kbps+)
- Remove background noise using audio editing software
- Re-record in a quieter environment
- Check microphone positioning and levels
Voice Creation Issues
Cloning Process Fails:
- Verify provider API credentials are valid
- Check account balance with voice cloning provider
- Ensure voice sample meets provider requirements
- Contact provider support for specific error messages
Poor Voice Quality:
- Use higher quality source audio
- Try a different provider (ElevenLabs or Resemble AI)
- Experiment with quality enhancement settings
- Consider recording new samples with better equipment
Performance Issues
Slow Processing:
- Provider processing times vary (ElevenLabs: seconds, Resemble: minutes)
- Check provider status pages for service issues
- Large files take longer to process
- Peak usage times may cause delays
High Costs:
- Monitor usage through analytics dashboard
- Set usage limits and alerts
- Compare provider pricing for your use case
- Optimize voice selection for cost efficiency
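Usage limits and alerts reduce to comparing spend against a threshold. A minimal sketch follows, assuming a monthly budget in US dollars and a warning at 80% of it. Both the function name and the default ratio are illustrative choices.

```python
def budget_status(spent_usd, limit_usd, warn_ratio=0.8):
    """Classify current spend against a budget: 'ok', 'warning', or 'exceeded'."""
    if limit_usd <= 0:
        raise ValueError("limit_usd must be positive")
    if spent_usd >= limit_usd:
        return "exceeded"
    if spent_usd >= warn_ratio * limit_usd:
        return "warning"
    return "ok"
```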
Provider Comparison
| Feature | ElevenLabs | Resemble AI | Coming Soon |
|---|---|---|---|
| Processing Time | Seconds | Minutes | Varies |
| Quality | Excellent | Excellent | TBD |
| Languages | 29+ | English+ | TBD |
| Cost Model | Per character | Per synthesis | TBD |
| Sample Requirements | 30s+ | 60s+ | TBD |
| Instant Preview | β | β | TBD |
| Emotional Control | Basic | Advanced | TBD |
| Enterprise Features | Limited | Full | TBD |
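Part of the comparison can be encoded as data so that a provider is chosen programmatically. The sketch below captures only the sample requirement, processing time, and cost model rows from the table. The dictionary layout and function name are assumptions.

```python
# Values transcribed from the comparison table above.
PROVIDERS = {
    "ElevenLabs": {"min_sample_s": 30, "processing": "seconds", "cost_model": "per character"},
    "Resemble AI": {"min_sample_s": 60, "processing": "minutes", "cost_model": "per synthesis"},
}


def providers_for_sample(duration_s):
    """List the providers whose sample requirement a recording of this length meets."""
    return [
        name for name, spec in PROVIDERS.items()
        if duration_s >= spec["min_sample_s"]
    ]
```

For example, a 45-second recording meets only the ElevenLabs requirement, while a 90-second recording meets both.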
Use Cases
Customer Service
- Consistent Brand Voice: Maintain brand identity across all interactions
- Multilingual Support: Create voices in different languages for global support
- Personality Matching: Match voice characteristics to brand personality
Sales and Marketing
- Campaign Voices: Create specific voices for marketing campaigns
- Regional Variants: Adapt voices for different geographical markets
- Seasonal Adjustments: Modify voice characteristics for holidays or events
Entertainment and Media
- Character Voices: Create unique voices for virtual characters
- Narrator Voices: Professional voices for content narration
- Interactive Experiences: Engaging voices for games and interactive media
Enterprise Applications
- Executive Voices: Clone executive voices for consistent communication
- Training Systems: Consistent voices for e-learning and training
- Brand Ambassadors: Virtual representatives with authentic brand voices
Getting Help
Documentation
Complete TTS provider documentation
Voice Tuning
Advanced voice configuration guide
Community Support
Get help from the community
Technical Support
Contact our support team