Azure Speech: Enterprise Scale
Microsoft's neural STT service offers broad language and regional variant support, integrates with the Azure ecosystem, and provides phrase lists for term boosting. It is a good fit for organizations that already use Microsoft services or need wide language coverage.
Quick Setup
Create Azure Speech Resource
- Follow the Azure AI Speech resource quickstart
- Create a new Speech resource
- Select your subscription, resource group, and region
- Note your Key and Region from the resource's Keys and Endpoint page
Configure in Burki
- Go to AI Configuration → STT tab
- Select Azure as the provider
- Enter your Subscription Key and Region (e.g., `eastus`, `westus2`)
Free Tier: Azure offers 5 hours of audio per month free for speech-to-text. See Azure Pricing for details.
Available Models
Standard
General Purpose: balanced accuracy and performance for most use cases
Languages: 30+
Best for: General transcription
Enhanced
Improved Accuracy: better recognition for major languages
Languages: 18
Best for: High-accuracy needs
Neural
Highest Quality: state-of-the-art neural recognition
Languages: 16
Best for: Premium applications
Recommendation: Start with Standard for broad language support, or use Neural for English-focused applications requiring the highest accuracy.
Language Support
Azure Speech STT supports 100+ languages and regional variants. Here are the most commonly used:
Full Language List
| Language | Code |
|---|---|
| English (US) | en-US |
| English (UK) | en-GB |
| English (Australia) | en-AU |
| English (Canada) | en-CA |
| English (India) | en-IN |
| Spanish (Spain) | es-ES |
| Spanish (Mexico) | es-MX |
| French (France) | fr-FR |
| French (Canada) | fr-CA |
| German | de-DE |
| Italian | it-IT |
| Portuguese (Brazil) | pt-BR |
| Portuguese (Portugal) | pt-PT |
| Japanese | ja-JP |
| Korean | ko-KR |
| Chinese (Mandarin) | zh-CN |
| Chinese (Hong Kong) | zh-HK |
| Chinese (Taiwan) | zh-TW |
| Arabic (Saudi Arabia) | ar-SA |
| Hindi | hi-IN |
| Dutch | nl-NL |
| Russian | ru-RU |
| Swedish | sv-SE |
| Danish | da-DK |
| Norwegian | no-NO |
| Finnish | fi-FI |
| Polish | pl-PL |
| Turkish | tr-TR |
| Hebrew | he-IL |
| Thai | th-TH |
Model availability differs by language: Enhanced covers 18 languages and Neural covers 16, so confirm that your chosen model supports the language you select.
100+ More Languages: Azure supports many additional languages and regional variants. Visit the Azure Language Support page for the complete list.
Configuration Options
Basic Configuration
Full Configuration
Per-Assistant Azure Credentials
You can configure Azure credentials per assistant instead of using environment variables.
Phrase Lists (Keyterms)
Phrase lists boost recognition of specific terms, which makes them well suited to company names, product names, and industry terminology.
- Dashboard
- API
In AI Configuration → STT → Keywords/Keyterms, enter terms separated by commas.
Best Practice: Add your company name, product names, and any domain-specific terminology to phrase lists for improved recognition accuracy.
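If you maintain keyterms outside the dashboard, it can help to clean up the comma-separated string before saving it. The sketch below is illustrative only: `parse_keyterms` is a hypothetical helper (not part of Burki or the Azure SDK), and the sample terms are made up.

```python
def parse_keyterms(raw: str) -> list[str]:
    """Split a comma-separated keyterm string into a clean, de-duplicated list."""
    seen = set()
    terms = []
    for part in raw.split(","):
        term = " ".join(part.split())  # trim and collapse inner whitespace
        key = term.casefold()          # compare case-insensitively
        if term and key not in seen:
            seen.add(key)
            terms.append(term)         # keep the first spelling encountered
    return terms
```

Calling `parse_keyterms("Burki, Azure Speech,  burki , ,Nova-3")` drops the empty entry and the repeated term, leaving three keyterms in their original order.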
Key Features
Phrase Lists
Term Boosting: boost recognition of specific words and phrases for your domain
Speaker Diarization
Speaker Identification: distinguish between multiple speakers in a conversation
Multi-Channel
Stereo Support: process audio with separate channels for each participant
Real-Time
Low Latency: real-time transcription with interim results
Timing Controls
Azure Speech STT supports timing controls to optimize speech detection:
Endpointing (Silence Threshold)
What it does: How long to wait after detecting silence before treating speech as ended.
Default: 10ms (minimal endpointing)
Range: 10ms - 2000ms
When to Adjust:
- Lower (10-100ms): For fast talkers or quick interactions
- Higher (500-1000ms): For elderly callers or complex topics
- Much higher (1500ms+): For people with speech difficulties
Utterance End Timeout
What it does: Maximum time to wait for a complete utterance before triggering end-of-speech.
Default: 1000ms
Range: 500ms - 5000ms
When to Adjust:
- Lower (500-800ms): For short, quick interactions
- Higher (1500-3000ms): For detailed conversations
VAD Events
What it does: Enables Voice Activity Detection for enhanced speech detection.
Default: Enabled
Benefits:
- Better speech detection in noisy environments
- Backup mechanism when normal detection fails
- Essential for background noise scenarios
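The documented defaults and ranges can be captured in a small settings object that clamps out-of-range values before they are saved. This is a sketch under assumptions: the class and field names (`TimingSettings`, `endpointing_ms`, `utterance_end_ms`, `vad_events`) are hypothetical and may not match Burki's actual configuration keys. Only the numeric defaults and ranges come from the descriptions above.

```python
from dataclasses import dataclass

# Ranges as documented above, in milliseconds.
ENDPOINTING_RANGE_MS = (10, 2000)
UTTERANCE_END_RANGE_MS = (500, 5000)


def clamp(value: int, bounds: tuple[int, int]) -> int:
    """Pull a value back inside an inclusive (low, high) range."""
    low, high = bounds
    return max(low, min(high, value))


@dataclass(frozen=True)
class TimingSettings:
    # Hypothetical field names; defaults mirror the documented defaults.
    endpointing_ms: int = 10
    utterance_end_ms: int = 1000
    vad_events: bool = True

    def normalized(self) -> "TimingSettings":
        """Return a copy with both timeouts clamped to their documented ranges."""
        return TimingSettings(
            endpointing_ms=clamp(self.endpointing_ms, ENDPOINTING_RANGE_MS),
            utterance_end_ms=clamp(self.utterance_end_ms, UTTERANCE_END_RANGE_MS),
            vad_events=self.vad_events,
        )
```

Clamping rather than rejecting is a design choice for this sketch. If you would prefer a typo such as `50000` to surface as an error, raise an exception instead.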
Provider Comparison
| Feature | Azure Speech | Deepgram |
|---|---|---|
| Languages | 100+ | 30+ |
| Latency | ~200ms | ~100ms |
| Term Boosting | Phrase Lists | Keywords/Keyterms |
| Diarization | Yes | Yes |
| Custom Models | Yes (Custom Speech) | Limited |
| Real-Time | Yes | Yes |
| Multi-Channel | Yes | Yes |
| Best For | Enterprise, Multi-language | Speed, Phone calls |
When to Choose Azure: Broad language support, Microsoft ecosystem integration, enterprise features, or custom speech models.
When to Choose Deepgram: Ultra-low latency, phone call optimization, or Nova-3 keyterms for English.
Regional Selection
Latency Optimization: Choose the Azure region closest to your deployment for optimal latency.
| Region | Location | Best For |
|---|---|---|
| `eastus` | East US | North America (East) |
| `eastus2` | East US 2 | North America (East) - Backup |
| `westus2` | West US 2 | North America (West) |
| `westeurope` | Netherlands | Europe |
| `northeurope` | Ireland | Europe - Backup |
| `southeastasia` | Singapore | Asia-Pacific |
| `australiaeast` | Australia East | Australia/Oceania |
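If it is unclear which region is closest to your deployment, a rough comparison can be run from the deployment host. The sketch below times a TCP handshake to each region's speech-to-text hostname and picks the fastest. Connection time is only a proxy for network distance, not a measurement of transcription latency, and the function names are illustrative rather than part of Burki.

```python
import socket
import time
from typing import Callable, Iterable


def tcp_connect_ms(region: str, timeout: float = 3.0) -> float:
    """Time one TCP handshake to the region's speech-to-text host, in milliseconds."""
    host = f"{region}.stt.speech.microsoft.com"
    start = time.perf_counter()
    with socket.create_connection((host, 443), timeout=timeout):
        pass  # connection opened successfully; close it immediately
    return (time.perf_counter() - start) * 1000.0


def fastest_region(
    regions: Iterable[str],
    probe: Callable[[str], float] = tcp_connect_ms,
) -> str:
    """Return the region with the lowest probe time, skipping unreachable ones."""
    timings = {}
    for region in regions:
        try:
            timings[region] = probe(region)
        except OSError:
            continue  # DNS failure, timeout, or refused connection
    if not timings:
        raise RuntimeError("no region could be reached")
    return min(timings, key=timings.get)
```

For example, `fastest_region(["eastus", "westus2", "westeurope"])` run on your server returns whichever of the three answered quickest. A single sample can be noisy, so repeat the probe a few times before committing to a region.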
Pricing Overview
| Tier | Hours/Month | Price |
|---|---|---|
| Free | 5 hours | $0 |
| Standard | Pay-as-you-go | $1 per audio hour |
Enterprise: Contact Azure for custom pricing on high-volume usage and reserved capacity. Custom Speech model training has separate pricing.
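For budgeting, the two tiers in the table reduce to simple arithmetic. The sketch below assumes the tiers are separate (the free allowance is not subtracted from Standard usage) and uses only the figures shown in the table. Actual Azure billing may differ by region and feature, so treat the result as an estimate.

```python
FREE_HOURS_PER_MONTH = 5.0    # Free tier allowance from the table above
STANDARD_USD_PER_HOUR = 1.0   # Standard pay-as-you-go rate from the table above


def fits_free_tier(audio_hours: float) -> bool:
    """True if a month's audio volume stays within the free allowance."""
    return 0 <= audio_hours <= FREE_HOURS_PER_MONTH


def standard_monthly_cost(audio_hours: float) -> float:
    """Estimated Standard-tier cost in USD for a month of audio."""
    if audio_hours < 0:
        raise ValueError("audio_hours must be non-negative")
    return round(audio_hours * STANDARD_USD_PER_HOUR, 2)
```

As a worked example, 20,000 call minutes per month is about 333.33 audio hours, so `standard_monthly_cost(20000 / 60)` estimates roughly $333.33.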
Common Issues & Solutions
Authentication Failed
Problem: API returns 401 Unauthorized
Solutions:
- Verify your Azure Speech Key is correct in Settings → Provider Keys
- Ensure the key is from your Speech resource (not another Azure service)
- Check that the region matches your Speech resource's region
- Verify your Azure subscription is active
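To rule out Burki itself, the key and region pair can be checked directly against Azure's token endpoint, which accepts a POST carrying the `Ocp-Apim-Subscription-Key` header and returns a short-lived token when the pair is valid. This is a minimal sketch using only the standard library. The function names are illustrative, and the check treats 401 and 403 responses as "key or region rejected".

```python
import urllib.error
import urllib.request


def build_token_request(key: str, region: str) -> urllib.request.Request:
    """Build the POST request for the regional token-issuing endpoint."""
    url = f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the key travels in the header
        method="POST",
        headers={"Ocp-Apim-Subscription-Key": key},
    )


def key_is_valid(key: str, region: str, timeout: float = 10.0) -> bool:
    """Return True if Azure issues a token for this key and region."""
    request = build_token_request(key, region)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            return False  # wrong key, wrong region, or inactive subscription
        raise             # anything else is unexpected; surface it
```

If `key_is_valid` returns `False` for the region you configured but `True` for another, the key belongs to a Speech resource in that other region.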
Region Mismatch
Problem: Connection fails or returns errors
Solutions:
- Double-check the region code (e.g., `eastus`, not `east-us`)
- Ensure the region is available for Speech services
- Try a different region if experiencing connectivity issues
Language Detection Issues
Problem: Wrong language transcribed
Solutions:
- Explicitly set the language code in configuration
- Use the correct regional variant (e.g., `es-ES` vs `es-MX`)
- Ensure your model supports the selected language
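Language codes follow the language-REGION pattern used in the table earlier (lowercase language, uppercase region). A small normalizer can catch malformed values such as `en_us` or `spanish` before they are saved. This sketch only validates the shape of the code: it does not confirm that Azure or a given model supports the locale, and it rejects longer variants with a third segment. `normalize_locale` is a hypothetical helper.

```python
import re

# Two- or three-letter language, hyphen, two-letter region (e.g. en-US, es-MX).
LOCALE_PATTERN = re.compile(r"^[a-z]{2,3}-[A-Z]{2}$")


def normalize_locale(code: str) -> str:
    """Normalize separators and casing; raise ValueError if the shape is wrong."""
    parts = code.strip().replace("_", "-").split("-")
    if len(parts) != 2:
        raise ValueError(f"expected language-REGION, got {code!r}")
    locale = f"{parts[0].lower()}-{parts[1].upper()}"
    if not LOCALE_PATTERN.match(locale):
        raise ValueError(f"expected language-REGION, got {code!r}")
    return locale
```

For instance, `normalize_locale("es_mx")` yields `es-MX`, while `normalize_locale("spanish")` raises `ValueError`.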
Poor Recognition Accuracy
Problem: Transcription quality is low
Solutions:
- Add domain-specific terms to phrase lists
- Try a different model (Enhanced or Neural for supported languages)
- Ensure audio quality is good (minimal background noise)
- Enable audio denoising in Burki settings
SDK Not Installed
Problem: Azure Speech SDK import fails
Solution: Install the Azure Speech SDK in the environment that runs Burki.
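Assuming the failing import is Microsoft's Python Speech SDK (published on PyPI as `azure-cognitiveservices-speech` and imported as `azure.cognitiveservices.speech`), the check below reports whether the module can be found in the current environment without importing it fully. The helper names are illustrative.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the named module can be located in this environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


def speech_sdk_installed() -> bool:
    """Check for the Azure Speech SDK's Python module."""
    return module_available("azure.cognitiveservices.speech")
```

If `speech_sdk_installed()` returns `False`, installing the package with `pip install azure-cognitiveservices-speech` in the same environment that runs Burki would be the usual fix.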
Best Practices
See Also
Need Speed?
Deepgram - Ultra-low ~100ms latency, optimized for phone calls
Timing Controls
Advanced Settings - Fine-tune speech detection timing
Call Management
Conversation Flow - Configure interruption and timeout settings
Additional Resources
- Azure Speech setup: Azure AI Speech speech-to-text quickstart
- Language Support: Azure STT Language List
- Documentation: Azure Speech Service Docs
- Pricing: Azure Speech Pricing
Ready to Use Azure Speech STT?
Head back to your assistant configuration and set up Azure Speech for managed speech-to-text.