Why Navan Should Use LiveKit
A comprehensive proposal covering the LiveKit open-source framework, recommended architecture, and the advantages of LiveKit Cloud for hosting your voice AI infrastructure
Navan Voice Agent Current Architecture
Twilio SIP Trunk + Azure-hosted Voice Agent with OpenAI LLM
Inbound Call
User dials in via PSTN; Twilio receives the call and converts it to SIP
SIP Trunking
Twilio SIP Trunk bridges to Azure via WebSockets
Agent Processing
Voice Agent on Azure handles STT, TTS, and conversation flow
LLM Response
OpenAI generates intelligent responses, which are streamed back to the user
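To make the call path above easier to reason about, here is a minimal Python sketch that writes the four stages down as data. The `Hop` type, the `CALL_PATH` list, and the helper functions are hypothetical names introduced only for illustration, and the transport label on the last hop ("streaming API") is an assumption rather than something stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    stage: str      # stage name as listed above
    owner: str      # vendor or platform that operates this hop
    transport: str  # how the call reaches this hop (last entry is assumed)


# The four stages of the current architecture, in call order.
CALL_PATH = [
    Hop("Inbound Call", "Twilio", "PSTN"),
    Hop("SIP Trunking", "Twilio", "SIP"),
    Hop("Agent Processing", "Azure", "WebSocket"),
    Hop("LLM Response", "OpenAI", "streaming API"),
]


def describe(path):
    """Render the call path as a single readable line."""
    return " -> ".join(f"{h.stage} [{h.owner}, {h.transport}]" for h in path)


def vendor_boundaries(path):
    """Count how many times the call crosses from one owner to another."""
    return sum(1 for a, b in zip(path, path[1:]) if a.owner != b.owner)


if __name__ == "__main__":
    print(describe(CALL_PATH))
    print("vendor boundaries crossed:", vendor_boundaries(CALL_PATH))
```

Counting owner changes is only a rough proxy, but it shows where integration code has to exist today: at each point where the call moves from one platform to the next.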
Disadvantages of This Architecture
Limited to OpenAI Realtime API Capabilities
Every new feature request becomes an architecture crisis
Building on an Incomplete and Unstable Foundation
OpenAI's Realtime API only does one thing; everything else requires workarounds
PM wants to add screen sharing to support calls? Can't do it.
Product wants an AI avatar for branding? Impossible.
Customer success needs conversation recording with custom metadata? Not supported.
Sales wants multi-agent handoffs for specialized expertise? Rewrite everything.
You're constantly telling stakeholders 'no' because the API only supports voice-in, voice-out. Each workaround means duct-taping external services together, creating technical debt and maintenance nightmares.
Each Workaround Creates More Problems
- Separate recording infrastructure (Recall, MeetKay, custom builds)
- Bolt-on analytics that can't see inside the conversation flow
- Hacked interruption handling that feels janky to users
- Zero ability to inject visual context or tools mid-conversation
- Manual session management code that's fragile and hard to test
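To give a sense of what hand-written session management and interruption handling involve, here is a deliberately small Python sketch of a turn-taking state machine. It is an illustration under assumptions, not a description of Navan's code: the `State` values, the event names, and the `Session` class are all hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()
    THINKING = auto()
    SPEAKING = auto()
    ENDED = auto()


# Allowed transitions; any other (state, event) pair is treated as a bug.
TRANSITIONS = {
    (State.LISTENING, "user_stopped"): State.THINKING,
    (State.THINKING, "reply_ready"): State.SPEAKING,
    (State.SPEAKING, "playback_done"): State.LISTENING,
    (State.SPEAKING, "user_started"): State.LISTENING,  # barge-in
}


class Session:
    """Tracks one call's turn-taking state (hypothetical, for illustration)."""

    def __init__(self):
        self.state = State.LISTENING
        self.interrupted_replies = 0

    def handle(self, event):
        # A hangup ends the session from any state.
        if event == "hangup":
            self.state = State.ENDED
            return self.state
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {event!r} while {self.state.name}")
        if key == (State.SPEAKING, "user_started"):
            # The caller talked over the agent: the current reply is dropped.
            self.interrupted_replies += 1
        self.state = TRANSITIONS[key]
        return self.state
```

Even this toy version needs explicit handling for barge-in and for events that arrive in the wrong state. A production version would also have to cover timeouts, audio buffering, and reconnects, which is where the fragility described above tends to come from.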
Building Your Own Framework for Voice Agents vs. Leveraging Open Source
A homegrown framework gets no upstream updates for voice features: no improved turn detection, no improved barge-in detection, and no new model or plugin support unless you build it yourself
Building Voice AI Without an Open-Source Framework
Every feature you need, you must build and maintain yourself
The Result
- Months of engineering time to build core features
- No community contributions or shared improvements
- Every new AI model requires custom integration
- Security patches are your responsibility