Engineering12 min readFeb 28, 2026

How We Achieved Sub-800ms Voice AI Latency for Indian Accents

Getting voice AI to feel natural requires sub-800ms end-to-end latency. Here is the exact technical stack and optimisations that got us there — and what we tried that did not work.

Priya Nair

Co-founder & CTO

📞

Why 800ms?

Research shows callers hang up after 2 seconds of silence. With 800ms latency, you have 1,200ms of buffer — enough for a natural pause without sounding broken.

Getting to 800ms end-to-end (from end of user speech to start of AI voice playback) required optimising every step of the pipeline.

Our Pipeline

Twilio WebSocket (mulaw 8kHz)
    → Deepgram Nova-2 streaming STT (~160ms)
    → Turn detection (500ms silence threshold)
    → LLM streaming (Claude Haiku ~180ms to first token)
    → ElevenLabs streaming TTS (~60ms to first byte)
    → Twilio audio playback

The Key Optimisations

1. Parallel processing

Don't wait for LLM to finish — start TTS as soon as the first sentence segment arrives. This saved ~400ms.

2. Streaming everything

Deepgram streams transcripts. We send partial transcripts to the LLM after 200ms of inactivity. The LLM starts generating before the user finishes speaking.

3. Prompt caching

LiteLLM prompt caching for the system prompt reduces LLM latency by ~40%.

4. Regional STT for Indian accents

Deepgram Nova-2 accuracy for Indian English: 84%. Sarvam AI for Hindi: 91%. We route by detected language.

What Didn't Work

✓OpenAI Whisper: Too slow (800ms+ just for STT)
✓ElevenLabs Flash v2: Artifacts on Indian English
✓Groq: Fast but quality inconsistent at high load

Current Benchmark

P50 latency: 720ms. P95: 980ms. P99: 1,340ms.

The P99 cases are network issues on 2G/Edge connections. We now detect poor connections and switch to a lighter TTS voice.

EngineeringVoice AITechnicalLatency

Priya Nair

Co-founder & CTO

Writing about AI automation, India SMBs, and building products that work for the next billion users.

Prompt Engineering for Hinglish: A Practical Guide

AI SDR vs Human SDR: 3-Month Experiment for an Indian IT Services Firm

Ready to try it for your business?

7-day free trial. No credit card. Setup in 30 minutes.

Start Free Trial