
🎯 Resemble AI: Custom Brand Voices

Create custom voices that carry your brand's unique sound; the number of voices available depends on your plan. WebSocket streaming delivers low-latency audio for real-time applications such as phone calls.

Quick Setup

1. Get API Credentials

  1. Visit Resemble AI and create an account
  2. Navigate to Settings → API Keys
  3. Generate an API key with TTS permissions
  4. Copy your API Key and Project UUID

2. Create Custom Voice

  1. Go to Voices in your Resemble dashboard
  2. Click Create Voice and upload voice samples
  3. Wait for training completion (~30 minutes)
  4. Copy the generated Voice UUID

3. Configure in Burki

  1. Go to AI Configuration → TTS tab
  2. Select Resemble AI as provider
  3. Enter your API Key, Project UUID, and Voice UUID
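
If you also use these three values in your own scripts, it helps to keep them out of source code. The following is a minimal sketch, assuming hypothetical environment variable names (`RESEMBLE_API_KEY`, `RESEMBLE_PROJECT_UUID`, `RESEMBLE_VOICE_UUID`) that are not defined by Resemble AI or Burki:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ResembleCredentials:
    api_key: str
    project_uuid: str
    voice_uuid: str


def load_credentials(env=None):
    """Read the three settings; the variable names are hypothetical."""
    env = os.environ if env is None else env
    names = {
        "api_key": "RESEMBLE_API_KEY",
        "project_uuid": "RESEMBLE_PROJECT_UUID",
        "voice_uuid": "RESEMBLE_VOICE_UUID",
    }
    # Strip whitespace so a stray newline from copy/paste is not kept
    values = {field: env.get(var, "").strip() for field, var in names.items()}
    missing = [names[field] for field, value in values.items() if not value]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return ResembleCredentials(**values)
```

Failing early with the list of missing names makes a half-configured setup easier to spot than a rejected request later on.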

Voice Creation Process

🎙️ Build Your Brand Voice

Resemble AI specializes in creating custom voices that match your brand personality and requirements.

Voice Training Steps

  • Voice Samples
Upload Requirements:
  • Duration: 3-10 minutes of clean audio
  • Format: WAV or MP3, 22kHz+ sample rate
  • Content: Read diverse sentences for best results
  • Quality: Clear speech, minimal background noise
Example training script:
"Hello, my name is [Speaker Name]. I work at [Company Name] 
as a customer service representative. Today I'll be reading 
various sentences to help train an AI voice model. 
The weather is beautiful today with clear blue skies. 
Our company provides excellent customer support services."
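
The duration and sample-rate requirements above can be checked before uploading. This is a rough sketch using only the standard-library `wave` module, so it covers uncompressed WAV files and not MP3. It assumes that "22kHz+" means 22,050 Hz or higher; the function names are illustrative.

```python
import wave

MIN_SECONDS = 3 * 60      # lower bound of "3-10 minutes"
MAX_SECONDS = 10 * 60     # upper bound of "3-10 minutes"
MIN_SAMPLE_RATE = 22050   # assumption: "22kHz+" read as 22,050 Hz or higher


def check_properties(duration_seconds, sample_rate):
    """Return a list of problems; an empty list means the sample qualifies."""
    problems = []
    if duration_seconds < MIN_SECONDS:
        problems.append("too short")
    elif duration_seconds > MAX_SECONDS:
        problems.append("too long")
    if sample_rate < MIN_SAMPLE_RATE:
        problems.append("sample rate too low")
    return problems


def check_wav_file(path):
    """Read the WAV header and apply the checks above."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        duration = wav.getnframes() / sample_rate
    return check_properties(duration, sample_rate)
```

Background noise and speech clarity still need a listening pass; this only catches the mechanical requirements.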

Available Models

🔧 Synthesis Models

Resemble AI focuses on custom voice synthesis rather than multiple models.

Default Synthesis Model

~300ms latency
High-quality neural synthesis optimized for custom voices
Features:
  • Custom voice support
  • WebSocket streaming
  • Phone-compatible formats
  • Twilio integration ready
Best for: Brand-specific applications, personalized experiences
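
The ~300ms figure is worth confirming on your own network path. Below is a small sketch for measuring time to first audio chunk from any async stream. `fake_stream` is a hypothetical stand-in for a real TTS stream and is only there to make the example runnable.

```python
import asyncio
import time


async def time_to_first_chunk(chunks):
    """Return seconds elapsed until the async iterator yields its first item."""
    start = time.perf_counter()
    async for _ in chunks:
        return time.perf_counter() - start
    return None  # the stream ended without producing any audio


async def fake_stream(delay):
    """Stand-in for a TTS stream: waits, then yields one chunk."""
    await asyncio.sleep(delay)
    yield b"\x00" * 160


if __name__ == "__main__":
    latency = asyncio.run(time_to_first_chunk(fake_stream(0.05)))
    print(f"first chunk after {latency * 1000:.0f} ms")
```

With a real stream, start the timer just before sending the text so the measurement includes the synthesis request.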

WebSocket Streaming

⚡ Real-Time Streaming

WebSocket streaming enables real-time TTS for live applications like phone calls and interactive experiences.

Streaming Setup

import asyncio
import websockets
import json
import base64

async def stream_resemble_tts():
    uri = "wss://websocket.cluster.resemble.ai/stream"
    
    headers = {
        "Authorization": "Bearer YOUR_RESEMBLE_API_KEY"
    }
    
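    # Note: websockets 14+ names this keyword `additional_headers`;
    # older releases use `extra_headers` as written here.
    async with websockets.connect(uri, extra_headers=headers) as websocket: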
        # Initialize streaming session
        init_message = {
            "type": "initialize",
            "project_uuid": "YOUR_PROJECT_UUID",
            "voice_uuid": "YOUR_VOICE_UUID",
            "sample_rate": 8000,
            "precision": "MULAW",
            "output_format": "wav"
        }
        
        await websocket.send(json.dumps(init_message))
        
        # Send text for synthesis
        text_message = {
            "type": "text",
            "text": "Hello from Resemble AI streaming!",
            "request_id": "unique_request_id_123"
        }
        
        await websocket.send(json.dumps(text_message))
        
        # Receive audio chunks
        async for message in websocket:
            data = json.loads(message)
            
            if data.get("type") == "audio":
                audio_content = data.get("audio_content")
                if audio_content:
                    audio_data = base64.b64decode(audio_content)
                    # Process audio chunk
                    yield audio_data

Audio Format Configuration
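
The streaming example above requests 8 kHz mu-law audio, the format used on telephone lines and by Twilio media streams. One way to keep format choices in one place is a small preset table, sketched below. The field names mirror the initialize message shown earlier. The preset names and the "web" values are assumptions for illustration, not documented Resemble AI defaults.

```python
# "phone" copies the values from the streaming example; "web" is an
# illustrative assumption and should be checked against the provider docs.
AUDIO_PRESETS = {
    "phone": {"sample_rate": 8000, "precision": "MULAW", "output_format": "wav"},
    "web": {"sample_rate": 22050, "precision": "PCM_16", "output_format": "wav"},
}


def build_init_message(project_uuid, voice_uuid, preset="phone"):
    """Build the initialize payload for the chosen audio preset."""
    if preset not in AUDIO_PRESETS:
        raise ValueError(f"Unknown preset: {preset!r}")
    message = {
        "type": "initialize",
        "project_uuid": project_uuid,
        "voice_uuid": voice_uuid,
    }
    message.update(AUDIO_PRESETS[preset])
    return message
```

For example, `build_init_message("YOUR_PROJECT_UUID", "YOUR_VOICE_UUID")` produces the same payload as the phone-oriented message in the streaming example.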

Custom Voice Management

🎛️ Voice Library Management

Organize and manage your custom voices for different use cases and brand requirements.

Voice Categories

Use Case: Customer service, sales, brand communication
Characteristics:
  • Professional and approachable tone
  • Consistent with brand personality
  • Clear pronunciation and pacing
  • Suitable for extended conversations
Training Tips:
  • Use your actual customer service representatives
  • Record in professional setting
  • Include common business phrases and terminology
  • Test with actual customer scripts
Use Case: Gaming, entertainment, interactive media
Characteristics:
  • Distinctive personality traits
  • Appropriate for character backstory
  • Emotionally expressive range
  • Memorable and engaging
Training Tips:
  • Work with voice actors who understand the character
  • Include emotional range in training samples
  • Record character-appropriate content
  • Test with actual dialogue scripts
Use Case: E-learning, audiobooks, documentation
Characteristics:
  • Clear and educational tone
  • Good pacing for comprehension
  • Neutral but engaging delivery
  • Suitable for long-form content
Training Tips:
  • Use experienced narrators or educators
  • Include varied sentence structures
  • Practice with actual educational content
  • Focus on clarity and comprehension

Voice UUID Management

# Example voice management system
VOICE_LIBRARY = {
    "customer_service_female": "uuid-1234-5678-abcd",
    "customer_service_male": "uuid-2345-6789-bcde",
    "ceo_announcements": "uuid-3456-7890-cdef",
    "technical_support": "uuid-4567-8901-defa",
    "marketing_spokesperson": "uuid-5678-9012-efab"
}

def get_voice_for_context(context_type):
    """Select appropriate voice based on interaction context"""
    voice_mapping = {
        "support": VOICE_LIBRARY["customer_service_female"],
        "sales": VOICE_LIBRARY["marketing_spokesperson"],
        "technical": VOICE_LIBRARY["technical_support"],
        "announcements": VOICE_LIBRARY["ceo_announcements"]
    }
    
    return voice_mapping.get(context_type, VOICE_LIBRARY["customer_service_female"])

Integration Examples

  • Customer Service Bot
import asyncio

class CustomerServiceTTS:
    def __init__(self):
        self.voice_uuid = "customer-service-voice-uuid"
        self.project_uuid = "your-project-uuid"

    async def handle_customer_inquiry(self, customer_message, inquiry_type):
        # Choose the response text (and its intended tone) from the inquiry type
        if inquiry_type == "complaint":
            # Empathetic tone
            text = "I understand your concern and I'm here to help resolve this issue for you."
        elif inquiry_type == "sales":
            # Enthusiastic tone
            text = "I'd be happy to tell you more about that product!"
        else:
            # Professional tone
            text = "Thank you for contacting us. How may I assist you today?"

        # Stream TTS response
        async for audio_chunk in self.stream_response(text):
            yield audio_chunk

    async def stream_response(self, text):
        # Placeholder for the streaming implementation: connect to the
        # WebSocket endpoint and yield audio chunks for `text`.
        # The unreachable `yield` keeps this an async generator so the
        # `async for` above works before the real code is written.
        return
        yield
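
Because `handle_customer_inquiry` is an async generator, callers consume it with `async for`. Here is a minimal sketch of a helper that gathers the chunks into one buffer, for example to save a response for later playback. `collect_audio` and `demo_chunks` are hypothetical names, and `demo_chunks` stands in for the real generator.

```python
import asyncio


async def collect_audio(chunks, max_bytes=None):
    """Concatenate chunks from an async iterator into one bytes object."""
    buffer = bytearray()
    async for chunk in chunks:
        buffer.extend(chunk)
        # Optional safety limit so a runaway stream cannot grow unbounded
        if max_bytes is not None and len(buffer) >= max_bytes:
            break
    return bytes(buffer)


async def demo_chunks():
    """Stand-in for handle_customer_inquiry(); yields two small chunks."""
    yield b"abc"
    yield b"def"


if __name__ == "__main__":
    audio = asyncio.run(collect_audio(demo_chunks()))
    print(len(audio), "bytes collected")
```

For live calls, forward each chunk as it arrives instead of collecting, since buffering the whole response gives up the latency benefit of streaming.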

Pricing Structure

💰 Custom Voice Pricing

Resemble AI pricing is based on usage and plan features. WebSocket streaming requires the Business plan or higher.
Plan        Monthly Cost  Characters Included  WebSocket Streaming  Custom Voices
Basic       $29           200,000              ❌                   3 voices
Pro         $89           800,000              ❌                   10 voices
Business    $199          2,000,000            ✅                   25 voices
Enterprise  Custom        Custom               ✅                   Unlimited
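
To see which plan a given workload falls into, the table can be turned into a small lookup. This sketch copies the figures from the table above and ignores overage pricing, which this page does not state. The function name is illustrative.

```python
# Figures copied from the pricing table; Enterprise is left out of the
# list because its price and limits are custom.
PLANS = [
    {"name": "Basic", "monthly_cost": 29, "characters": 200_000, "streaming": False},
    {"name": "Pro", "monthly_cost": 89, "characters": 800_000, "streaming": False},
    {"name": "Business", "monthly_cost": 199, "characters": 2_000_000, "streaming": True},
]


def cheapest_plan(monthly_characters, needs_streaming=False):
    """Return the cheapest listed plan that fits, else "Enterprise"."""
    for plan in sorted(PLANS, key=lambda p: p["monthly_cost"]):
        if needs_streaming and not plan["streaming"]:
            continue
        if monthly_characters <= plan["characters"]:
            return plan["name"]
    return "Enterprise"
```

Note that any real-time use case lands on Business or above regardless of volume, because streaming is not available on the lower plans.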

Cost Optimization Tips

  • Voice Reuse: Create versatile voices that work across multiple use cases
  • Batch Processing: Use REST API for non-real-time applications to save costs
  • Smart Caching: Cache frequently used phrases to reduce API calls
  • Context-Aware Selection: Use different voices only when necessary for user experience
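
The caching tip can be sketched as an in-memory store keyed by voice and text. The `synthesize` callable is supplied by the caller and is a hypothetical stand-in for whatever function actually requests audio; nothing here is part of the Resemble AI API.

```python
import hashlib


class PhraseCache:
    """In-memory cache of synthesized audio keyed by voice and text."""

    def __init__(self, synthesize):
        # synthesize(voice_uuid, text) -> bytes is provided by the caller
        self._synthesize = synthesize
        self._store = {}

    @staticmethod
    def _key(voice_uuid, text):
        # Collapse whitespace so trivially different inputs share an entry
        normalized = " ".join(text.split())
        return hashlib.sha256(f"{voice_uuid}:{normalized}".encode("utf-8")).hexdigest()

    def get_audio(self, voice_uuid, text):
        key = self._key(voice_uuid, text)
        if key not in self._store:
            self._store[key] = self._synthesize(voice_uuid, text)
        return self._store[key]
```

Greetings, hold messages, and other fixed phrases benefit most. Responses that include customer-specific details rarely repeat and are not worth caching.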

Quality Assurance

🎯 Voice Quality Testing

Ensure your custom voices meet production standards with systematic testing approaches.

Testing Framework

  1. Initial Voice Validation: Test basic voice quality with standard phrases
  2. Domain-Specific Testing: Test with actual content from your application domain
  3. Edge Case Testing: Test with numbers, abbreviations, and special cases
  4. User Acceptance Testing: Get feedback from actual users or stakeholders
  5. Production Monitoring: Monitor voice quality in real applications

Common Quality Issues

Issue: Custom voice mispronounces specific words
Solutions:
  • Include problematic words in training data
  • Use phonetic spelling in TTS requests
  • Create pronunciation guide for domain-specific terms
  • Retrain voice with additional samples if needed
Example Fix:
import re

# Phonetic corrections for common issues
PRONUNCIATION_FIXES = {
    "API": "A P I",
    "HTTP": "H T T P",
    "OAuth": "O Auth",
    "UUID": "U U I D"
}

def apply_pronunciation_fixes(text):
    for term, phonetic in PRONUNCIATION_FIXES.items():
        # Match whole words only, so "API" inside "RAPID" is left alone
        text = re.sub(rf"\b{re.escape(term)}\b", phonetic, text)
    return text
Issue: Voice sounds monotone or lacks expression
Solutions:
  • Include more emotional range in training samples
  • Use varied sentence types during training
  • Consider retraining with more expressive speaker
  • Test with TTS-specific emotional markup if available
Training Improvement:
Training script should include:
- Questions: "How can I help you today?"
- Excitement: "That's fantastic news!"
- Concern: "I'm sorry to hear about that."
- Professional: "Let me check that for you."

Troubleshooting

Problem: Cannot establish WebSocket connection
Solutions:
  • Verify Business plan subscription
  • Check API key permissions for streaming
  • Confirm project UUID is correct
  • Test connection with WebSocket debugging tools
  • Check firewall settings for WebSocket traffic
Problem: Custom voice UUID returns error
Solutions:
  • Verify voice training is completed
  • Check voice UUID spelling in configuration
  • Confirm voice is associated with correct project
  • Contact support if voice disappeared after training
Problem: Generated audio has artifacts or poor quality
Solutions:
  • Adjust audio format settings (sample rate, precision)
  • Test with different output formats
  • Check if voice training data was high quality
  • Consider retraining voice with better samples
  • Verify network stability for streaming

Migration Guide

  • From Standard TTS Providers
Migration Benefits:
  • Custom brand voice consistency
  • WebSocket streaming for real-time apps
  • Custom voice creation, with voice limits set by your plan
  • Professional voice quality control
Migration Steps:
  1. Voice Planning: Decide what custom voices you need
  2. Training Data: Collect high-quality voice samples
  3. Voice Creation: Train your custom voices
  4. Testing: Validate voice quality and performance
  5. Integration: Update API calls to use custom voice UUIDs
  6. Monitoring: Implement quality monitoring

🎯 Ready to Create Your Brand Voice?

Set up Resemble AI in your assistant configuration and start building custom voices that represent your brand perfectly!