☁️ Azure Speech: Enterprise Scale

Microsoft’s neural STT service with broad language and regional variant support. It integrates with the wider Azure ecosystem and offers phrase lists for term boosting. A good fit for organizations that already use Microsoft services or need broad language coverage.

Quick Setup

Step 1: Create Azure Speech Resource

  1. Follow the Azure AI Speech resource quickstart
  2. Create a new Speech resource
  3. Select your subscription, resource group, and region
  4. Note your Key and Region from the resource’s Keys and Endpoint page

Step 2: Configure in Burki

  1. Go to AI Configuration → STT tab
  2. Select Azure as the provider
  3. Enter your Subscription Key and Region (e.g., eastus, westus2)

Step 3: Choose Model & Language

Select your preferred model and language from the dropdowns.

Free Tier: Azure offers 5 hours of audio per month free for speech-to-text. See Azure Pricing for details.

Available Models

🎯 Standard

General Purpose: Balanced accuracy and performance for most use cases.
Languages: 30+
Best for: General transcription

⚡ Enhanced

Improved Accuracy: Better recognition for major languages.
Languages: 18
Best for: High-accuracy needs

🧠 Neural

Highest Quality: State-of-the-art neural recognition.
Languages: 16
Best for: Premium applications

Recommendation: Start with Standard for broad language support, or use Neural for English-focused applications requiring the highest accuracy.

Language Support

Azure Speech STT supports 100+ languages and regional variants. Here are the most commonly used:

| Language | Code | Standard | Enhanced | Neural |
| --- | --- | --- | --- | --- |
| English (US) | en-US | ✅ | ✅ | ✅ |
| English (UK) | en-GB | ✅ | ✅ | ✅ |
| English (Australia) | en-AU | ✅ | ✅ | ✅ |
| English (Canada) | en-CA | ✅ | ✅ | ✅ |
| English (India) | en-IN | ✅ | ✅ | ✅ |
| Spanish (Spain) | es-ES | ✅ | ✅ | ✅ |
| Spanish (Mexico) | es-MX | ✅ | ✅ | ✅ |
| French (France) | fr-FR | ✅ | ✅ | ✅ |
| French (Canada) | fr-CA | ✅ | ✅ | ✅ |
| German | de-DE | ✅ | ✅ | ✅ |
| Italian | it-IT | ✅ | ✅ | ✅ |
| Portuguese (Brazil) | pt-BR | ✅ | ✅ | ✅ |
| Portuguese (Portugal) | pt-PT | ✅ | ✅ | ✅ |
| Japanese | ja-JP | ✅ | ✅ | ✅ |
| Korean | ko-KR | ✅ | ✅ | ✅ |
| Chinese (Mandarin) | zh-CN | ✅ | ✅ | ✅ |
| Chinese (Hong Kong) | zh-HK | ✅ | ✅ | – |
| Chinese (Taiwan) | zh-TW | ✅ | ✅ | – |
| Arabic (Saudi Arabia) | ar-SA | ✅ | – | – |
| Hindi | hi-IN | ✅ | – | – |
| Dutch | nl-NL | ✅ | – | – |
| Russian | ru-RU | ✅ | – | – |
| Swedish | sv-SE | ✅ | – | – |
| Danish | da-DK | ✅ | – | – |
| Norwegian | no-NO | ✅ | – | – |
| Finnish | fi-FI | ✅ | – | – |
| Polish | pl-PL | ✅ | – | – |
| Turkish | tr-TR | ✅ | – | – |
| Hebrew | he-IL | ✅ | – | – |
| Thai | th-TH | ✅ | – | – |

100+ More Languages: Azure supports many additional languages and regional variants. Visit the Azure Language Support page for the complete list.
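
The support matrix above can be turned into a small lookup that picks the highest-tier model listed for a language. This is a minimal Python sketch, not part of Burki: the helper name `best_model_for` is hypothetical, the model strings `"enhanced"` and `"neural"` are assumed by analogy with the documented `"standard"` value, and the sets cover only the languages in the table, not Azure's full catalog.

```python
# Language codes copied from the support table above (not Azure's full catalog).
NEURAL = {
    "en-US", "en-GB", "en-AU", "en-CA", "en-IN", "es-ES", "es-MX", "fr-FR",
    "fr-CA", "de-DE", "it-IT", "pt-BR", "pt-PT", "ja-JP", "ko-KR", "zh-CN",
}
ENHANCED = NEURAL | {"zh-HK", "zh-TW"}
STANDARD = ENHANCED | {
    "ar-SA", "hi-IN", "nl-NL", "ru-RU", "sv-SE", "da-DK",
    "no-NO", "fi-FI", "pl-PL", "tr-TR", "he-IL", "th-TH",
}


def best_model_for(language: str) -> str:
    """Return the highest-tier model the table lists for a language code.

    Hypothetical helper; the "enhanced" and "neural" strings are assumptions.
    """
    if language in NEURAL:
        return "neural"
    if language in ENHANCED:
        return "enhanced"
    if language in STANDARD:
        return "standard"
    raise ValueError(f"{language!r} is not in the table above")
```

For a language outside the table, check the Azure Language Support page rather than relying on this lookup.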

Configuration Options

Basic Configuration

{
  "stt_settings": {
    "provider": "azure",
    "model": "standard",
    "language": "en-US"
  }
}

Full Configuration

{
  "stt_settings": {
    "provider": "azure",
    "model": "standard",
    "language": "en-US",
    "punctuate": true,
    "interim_results": true,
    "smart_format": true,
    "endpointing": 10,
    "utterance_end_ms": 1000,
    "vad_events": true,
    "keyterms": ["Burki", "AI assistant", "customer support"]
  }
}

Per-Assistant Azure Credentials

You can configure Azure credentials per-assistant instead of using environment variables:
{
  "stt_settings": {
    "provider": "azure",
    "azure_config": {
      "subscription_key": "your_subscription_key",
      "region": "eastus"
    }
  }
}
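
If you generate this payload from a script, the credentials can be read from the environment instead of being hard-coded. The following is a minimal sketch under stated assumptions: `AZURE_SPEECH_KEY` and `AZURE_SPEECH_REGION` are illustrative variable names (this page does not say which names Burki reads), and `azure_stt_settings` is a hypothetical helper.

```python
import os


def azure_stt_settings(env=os.environ) -> dict:
    """Build a per-assistant stt_settings payload from environment variables.

    AZURE_SPEECH_KEY and AZURE_SPEECH_REGION are assumed names, not ones
    documented by Burki.
    """
    key = env.get("AZURE_SPEECH_KEY", "").strip()
    region = env.get("AZURE_SPEECH_REGION", "").strip().lower()
    if not key or not region:
        raise ValueError("AZURE_SPEECH_KEY and AZURE_SPEECH_REGION must both be set")
    return {
        "stt_settings": {
            "provider": "azure",
            "azure_config": {"subscription_key": key, "region": region},
        }
    }
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.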

Phrase Lists (Keyterms)

Phrase lists boost recognition of specific terms. They work well for company names, product names, and industry terminology.

In AI Configuration → STT → Keywords/Keyterms, enter terms separated by commas:

Burki, AI assistant, voice platform, customer success

Best Practice: Add your company name, product names, and any domain-specific terminology to phrase lists for improved recognition accuracy.
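
When the same terms are supplied through the JSON configuration, the comma-separated text has to become a `keyterms` array. A minimal sketch of that conversion follows; `parse_keyterms` is a hypothetical helper, and the trimming and case-insensitive de-duplication are assumptions, since this page does not describe how Burki itself parses the field.

```python
def parse_keyterms(raw: str) -> list[str]:
    """Split a comma-separated string into a clean list of key terms.

    Trims whitespace, drops empty entries, and removes case-insensitive
    duplicates while keeping the first spelling and the original order.
    """
    seen = set()
    terms = []
    for part in raw.split(","):
        term = part.strip()
        key = term.casefold()
        if term and key not in seen:
            seen.add(key)
            terms.append(term)
    return terms
```

The returned list can be placed directly under `"keyterms"` in `stt_settings`.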

Key Features

🎯 Phrase Lists

Term Boosting: Boost recognition of specific words and phrases for your domain

👥 Speaker Diarization

Speaker Identification: Distinguish between multiple speakers in a conversation

🔊 Multi-Channel

Stereo Support: Process audio with separate channels for each participant

⚡ Real-Time

Low Latency: Real-time transcription with interim results

Timing Controls

Azure Speech STT supports timing controls to optimize speech detection:

endpointing

What it does: How long to wait after detecting silence before considering speech has ended.
Default: 10ms (minimal endpointing)
Range: 10ms - 2000ms

When to Adjust:
  • Lower (10-100ms): For fast talkers or quick interactions
  • Higher (500-1000ms): For elderly callers or complex topics
  • Much higher (1500ms+): For people with speech difficulties
{
  "stt_settings": {
    "endpointing": 500
  }
}

utterance_end_ms

What it does: Maximum time to wait for a complete utterance before triggering end-of-speech.
Default: 1000ms
Range: 500ms - 5000ms

When to Adjust:
  • Lower (500-800ms): For short, quick interactions
  • Higher (1500-3000ms): For detailed conversations
{
  "stt_settings": {
    "utterance_end_ms": 1500
  }
}

vad_events

What it does: Enables Voice Activity Detection for enhanced speech detection.
Default: Enabled

Benefits:
  • Better speech detection in noisy environments
  • Backup mechanism when normal detection fails
  • Essential for background noise scenarios
{
  "stt_settings": {
    "vad_events": true
  }
}
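
The documented ranges and defaults for the three timing settings can be checked before a configuration is saved. This is a minimal sketch, assuming out-of-range values should be rejected rather than clamped (this page does not say how Burki treats them); `validate_timing` is a hypothetical helper.

```python
# Ranges and defaults as documented in the Timing Controls section.
ENDPOINTING_RANGE_MS = (10, 2000)
UTTERANCE_END_RANGE_MS = (500, 5000)


def validate_timing(settings: dict) -> dict:
    """Fill in documented defaults and raise ValueError on invalid values."""
    endpointing = settings.get("endpointing", 10)
    utterance_end_ms = settings.get("utterance_end_ms", 1000)
    vad_events = settings.get("vad_events", True)

    checks = (
        ("endpointing", endpointing, ENDPOINTING_RANGE_MS),
        ("utterance_end_ms", utterance_end_ms, UTTERANCE_END_RANGE_MS),
    )
    for name, value, (low, high) in checks:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            raise ValueError(
                f"{name} must be an integer between {low} and {high} ms, got {value!r}"
            )
    if not isinstance(vad_events, bool):
        raise ValueError("vad_events must be true or false")

    return {
        "endpointing": endpointing,
        "utterance_end_ms": utterance_end_ms,
        "vad_events": vad_events,
    }
```

Calling `validate_timing({"endpointing": 500})` returns the setting alongside the default `utterance_end_ms` of 1000 and `vad_events` of `True`.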

Provider Comparison

| Feature | Azure Speech | Deepgram |
| --- | --- | --- |
| Languages | 100+ | 30+ |
| Latency | ~200ms | ~100ms |
| Term Boosting | Phrase Lists | Keywords/Keyterms |
| Diarization | ✅ | ✅ |
| Custom Models | ✅ (Custom Speech) | Limited |
| Real-Time | ✅ | ✅ |
| Multi-Channel | ✅ | ✅ |
| Best For | Enterprise, Multi-language | Speed, Phone calls |

When to Choose Azure: Broad language support, Microsoft ecosystem integration, enterprise features, or custom speech models.

When to Choose Deepgram: Ultra-low latency, phone call optimization, or Nova-3 keyterms for English.

Regional Selection

Latency Optimization: Choose the Azure region closest to your deployment for optimal latency.

| Region | Location | Best For |
| --- | --- | --- |
| eastus | East US | North America (East) |
| eastus2 | East US 2 | North America (East) - Backup |
| westus2 | West US 2 | North America (West) |
| westeurope | Netherlands | Europe |
| northeurope | Ireland | Europe - Backup |
| southeastasia | Singapore | Asia-Pacific |
| australiaeast | Australia East | Australia/Oceania |
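
The region table can be kept as a lookup that returns a primary region followed by its backup, where one is listed. This is an illustrative sketch only: the geography labels used as dictionary keys are made up for this example and are not Azure or Burki identifiers, and `candidate_regions` is a hypothetical helper.

```python
# Primary region first, backup second, as listed in the table above.
# The keys are illustrative labels, not Azure or Burki identifiers.
REGIONS_BY_GEOGRAPHY = {
    "north-america-east": ["eastus", "eastus2"],
    "north-america-west": ["westus2"],
    "europe": ["westeurope", "northeurope"],
    "asia-pacific": ["southeastasia"],
    "australia": ["australiaeast"],
}


def candidate_regions(geography: str) -> list[str]:
    """Return region codes to try, in order, for a deployment geography."""
    try:
        return list(REGIONS_BY_GEOGRAPHY[geography])
    except KeyError:
        raise ValueError(f"unknown geography {geography!r}") from None
```

A deployment in Europe would try `westeurope` first and fall back to `northeurope`.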

Pricing Overview

| Tier | Hours/Month | Price |
| --- | --- | --- |
| Free | 5 hours | $0 |
| Standard | Pay-as-you-go | $1 per audio hour |

Enterprise: Contact Azure for custom pricing on high-volume usage and reserved capacity. Custom Speech model training has separate pricing.
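
A rough monthly estimate follows directly from the figures above. The sketch below treats the Free and Standard tiers as separate plans and does not deduct the free hours from a Standard bill, which is an assumption; confirm the billing rules and current rates on the Azure Pricing page. The function names are hypothetical.

```python
FREE_TIER_HOURS = 5.0        # free audio hours per month, from the table above
STANDARD_USD_PER_HOUR = 1.0  # pay-as-you-go rate, from the table above


def fits_free_tier(audio_hours: float) -> bool:
    """True if the month's audio stays within the free allowance."""
    return 0 <= audio_hours <= FREE_TIER_HOURS


def standard_cost_usd(audio_hours: float) -> float:
    """Estimated Standard-tier cost for a month, with no free hours deducted."""
    if audio_hours < 0:
        raise ValueError("audio_hours must not be negative")
    return round(audio_hours * STANDARD_USD_PER_HOUR, 2)


# Example: 120 calls per day, 4 minutes each, over 30 days.
hours = 120 * 4 * 30 / 60  # 240 audio hours
estimate = standard_cost_usd(hours)  # 240.0 USD under the assumptions above
```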

Common Issues & Solutions

Problem: API returns 401 Unauthorized

Solutions:
  • Verify your Azure Speech Key is correct in Settings → Provider Keys
  • Ensure the key is from your Speech resource (not another Azure service)
  • Check that the region matches your Speech resource’s region
  • Verify your Azure subscription is active
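
To check a key and region pair outside Burki, you can request a short-lived token from Azure's token endpoint. This is a minimal sketch using only the Python standard library; it assumes the public-cloud regional endpoint pattern `https://<region>.api.cognitive.microsoft.com/sts/v1.0/issueToken` (sovereign clouds and custom endpoints differ), and the function names are hypothetical.

```python
import urllib.error
import urllib.request


def build_token_request(subscription_key: str, region: str) -> urllib.request.Request:
    """Build the POST request that asks Azure for a short-lived access token."""
    url = f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the endpoint expects a POST
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
        method="POST",
    )


def credentials_valid(subscription_key: str, region: str, timeout: float = 10.0) -> bool:
    """Return True if a token is issued, False on 401/403.

    Other HTTP errors and network failures (for example an unknown region
    host) are raised to the caller.
    """
    request = build_token_request(subscription_key, region)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            return False
        raise
```

A `False` result with a key you trust usually points to a region mismatch, since a Speech key is only valid in its own resource's region.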

Problem: Connection fails or returns errors

Solutions:
  • Double-check the region code (e.g., eastus not east-us)
  • Ensure the region is available for Speech services
  • Try a different region if experiencing connectivity issues
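
Region codes are lowercase with no spaces or hyphens, so a small normalizer can catch inputs such as `east-us` before they reach the configuration. This is a sketch only: `normalize_region` is a hypothetical helper that strips separators and checks the shape of the code, and it does not confirm that the region exists or offers Speech services.

```python
import re

# Lowercase letters followed by optional letters or digits, e.g. "eastus2".
_REGION_RE = re.compile(r"^[a-z]+[a-z0-9]*$")


def normalize_region(raw: str) -> str:
    """Turn inputs like "East US 2" or "east-us" into "eastus2" / "eastus"."""
    region = re.sub(r"[\s_-]+", "", raw.strip().lower())
    if not _REGION_RE.match(region):
        raise ValueError(f"{raw!r} does not look like an Azure region code")
    return region
```

Display names do not always map to codes this way (the Netherlands location in the region table is `westeurope`), so copy the code from the resource's Keys and Endpoint page when in doubt.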

Problem: Wrong language transcribed

Solutions:
  • Explicitly set the language code in configuration
  • Use the correct regional variant (e.g., es-ES vs es-MX)
  • Ensure your model supports the selected language

Problem: Transcription quality is low

Solutions:
  • Add domain-specific terms to phrase lists
  • Try a different model (Enhanced or Neural for supported languages)
  • Ensure audio quality is good (minimal background noise)
  • Enable audio denoising in Burki settings

Problem: Azure Speech SDK import fails

Solution:
pip install azure-cognitiveservices-speech
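
To confirm the package is visible to the Python interpreter you are actually running, you can probe for its import path. A minimal sketch; `speech_sdk_available` is a hypothetical helper, and `azure.cognitiveservices.speech` is the module that the package above installs.

```python
import importlib.util


def speech_sdk_available() -> bool:
    """Return True if the Azure Speech SDK can be imported in this environment."""
    try:
        return importlib.util.find_spec("azure.cognitiveservices.speech") is not None
    except ModuleNotFoundError:
        # Raised when a parent package such as `azure` is missing entirely.
        return False
```

If this returns `False` right after a successful install, the install most likely went into a different virtual environment or interpreter.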

See Also

⚡ Need Speed?

Deepgram - Ultra-low ~100ms latency, optimized for phone calls

🔧 Timing Controls

Advanced Settings - Fine-tune speech detection timing

📞 Call Management

Conversation Flow - Configure interruption and timeout settings

🚀 Ready to Use Azure Speech STT?

Head back to your assistant configuration and set up Azure Speech for managed speech-to-text.