Play 04
Call Center Voice AI
High · ✅ Ready
Voice-enabled customer service with real-time STT→LLM→TTS streaming.
Build an AI agent that answers phone calls. Azure Communication Services handles the call, the Speech service transcribes the caller's audio to text, GPT-4o interprets the intent and generates a response, and TTS speaks it back, with every stage streaming in real time. Includes escalation triggers, PII redaction, and call-recording consent flows.
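The flow described above can be sketched as a chain of generators. This is a minimal illustration, not the play's actual implementation: `fake_stt`, `fake_llm`, and `fake_tts` are hypothetical stand-ins for the Speech and Azure OpenAI calls, and no Azure SDK is used. It shows one common way to keep latency low: buffer streamed LLM tokens only until a sentence boundary, then hand each finished sentence to TTS while the rest of the reply is still being generated.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def fake_stt(audio_chunks: Iterable[bytes]) -> Iterator[str]:
    """Hypothetical streaming STT: each audio chunk 'decodes' to text."""
    for chunk in audio_chunks:
        yield chunk.decode("utf-8")


def fake_llm(utterance: str) -> Iterator[str]:
    """Hypothetical streaming LLM: yields the reply token by token."""
    reply = f"You said: {utterance}. How else can I help?"
    for token in re.findall(r"\S+\s*", reply):
        yield token


def fake_tts(sentence: str) -> bytes:
    """Hypothetical TTS: returns 'audio' bytes for one sentence."""
    return sentence.encode("utf-8")


def sentences(tokens: Iterable[str]) -> Iterator[str]:
    """Group streamed tokens into sentences so TTS can start early."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a complete sentence.
        for complete in parts[:-1]:
            yield complete.strip()
        buffer = parts[-1]
    if buffer.strip():
        yield buffer.strip()


def handle_turn(audio_chunks: Iterable[bytes]) -> Iterator[bytes]:
    """One caller turn: STT -> LLM -> TTS, emitting audio per sentence."""
    utterance = " ".join(fake_stt(audio_chunks)).strip()
    for sentence in sentences(fake_llm(utterance)):
        yield fake_tts(sentence)
```

Calling `list(handle_turn([b"reset my", b"password"]))` yields two audio chunks, one per sentence of the reply. In a real deployment each stand-in would presumably be replaced by the corresponding service client, and the sentence splitter would need to handle abbreviations and numbers.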
Architecture Pattern
STT→LLM→TTS streaming pipeline, intent detection, escalation
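As a rough illustration of the intent detection and escalation parts of this pattern, the sketch below uses plain keyword matching. The keyword lists, the `Turn` type, and the `max_failed_attempts` threshold are all assumptions made for the example; the play itself may well rely on the LLM or a trained classifier instead.

```python
from dataclasses import dataclass

# Hypothetical keyword lists. "human_agent" comes first so an explicit
# request for a person wins over any other match.
INTENT_KEYWORDS = {
    "human_agent": ("agent", "representative", "human"),
    "billing": ("invoice", "charge", "refund"),
    "password_reset": ("password", "locked out"),
}


@dataclass
class Turn:
    text: str                 # transcribed caller utterance
    failed_attempts: int = 0  # turns so far that the bot could not resolve


def detect_intent(text: str) -> str:
    """Return the first intent whose keyword appears in the text."""
    lowered = text.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return "unknown"


def should_escalate(turn: Turn, max_failed_attempts: int = 2) -> bool:
    """Escalate on an explicit request, or after repeated unknown intents."""
    intent = detect_intent(turn.text)
    if intent == "human_agent":
        return True
    return intent == "unknown" and turn.failed_attempts >= max_failed_attempts
```

For example, `should_escalate(Turn("Let me talk to a human"))` returns `True`, while a recognized billing question is handled by the bot regardless of earlier failures.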
Azure Services
- Communication Services
- AI Speech (STT + TTS)
- Azure OpenAI (gpt-4o)
- Container Apps
- Content Safety
DevKit (.github Agentic OS)
- agent.md — voice agent personality
- instructions.md — call flow, escalation rules
- mcp/index.js — voice validation tools
- plugins/ — speech handler, call routing, agent orchestrator
TuneKit (AI Config)
- config/openai.json — model selection, temperature for voice
- config/guardrails.json — PII redaction, consent, profanity filter
- config/agents.json — escalation triggers, hold music
- config/model-comparison.json — GPT-4o vs GPT-4o-mini latency
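To make the PII redaction guardrail concrete, here is a minimal regex-based sketch that masks emails, card-like numbers, and phone numbers in a transcript before it is logged or sent to the model. The patterns and the `redact` helper are illustrative assumptions, not the contents of `config/guardrails.json`, and regexes alone will miss many forms of PII such as names and addresses.

```python
import re

# Hypothetical patterns, applied in order so card numbers are not
# mislabelled as phone numbers.
PII_PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    ("CARD", re.compile(r"\b\d(?:[ -]?\d){12,15}\b")),
    # Optional +, then at least 9 digits mixed with common separators.
    ("PHONE", re.compile(r"\+?\d[\d\s().-]{7,}\d")),
]


def redact(text: str) -> str:
    """Replace each PII match with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text
```

With these patterns, `redact("Card 4111 1111 1111 1111 expires soon")` returns `"Card [CARD] expires soon"`. A production system would more likely combine such rules with a dedicated PII detection service.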
Tuning Parameters
- Speech config (language, speed)
- Grounding prompts
- Fallback chains
- Response latency targets
- Audio encoding
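Two of these parameters, fallback chains and response latency targets, interact closely, so a small sketch may help. It assumes each model is wrapped in an async callable and uses `asyncio.wait_for` to enforce the latency target. The responder functions, the `CANNED_REPLY` text, and the 50 ms target are placeholders invented for the example; the model names are used only as labels.

```python
import asyncio

# Hypothetical last-resort reply when every responder fails.
CANNED_REPLY = "Sorry, I'm having trouble. Let me connect you to an agent."


async def respond_with_fallback(prompt, chain, latency_target_s):
    """Try each (name, responder) in order; skip any that is slow or fails."""
    for name, responder in chain:
        try:
            reply = await asyncio.wait_for(
                responder(prompt), timeout=latency_target_s
            )
            return name, reply
        except Exception:
            # Covers the timeout raised by wait_for as well as
            # errors raised by the responder itself.
            continue
    return "canned", CANNED_REPLY


async def slow_primary(prompt):
    """Placeholder for a larger model that misses the latency target."""
    await asyncio.sleep(0.5)
    return "primary: " + prompt


async def fast_secondary(prompt):
    """Placeholder for a smaller, faster model."""
    return "secondary: " + prompt


if __name__ == "__main__":
    chain = [("gpt-4o", slow_primary), ("gpt-4o-mini", fast_secondary)]
    print(asyncio.run(respond_with_fallback("hi", chain, 0.05)))
```

Because `wait_for` cancels the slow call when the timeout expires, the caller never waits longer than roughly the latency target per responder in the chain.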
Estimated Cost
- Dev/Test: $200–400/mo
- Production: $2.5K–10K/mo