Voice Agent Resilience

Overview

The voice agent uses LiveKit's native FallbackAdapter for automatic provider failover across all three provider types (LLM, TTS, STT). If a primary provider fails, the adapter transparently switches to the backup. It marks the failed provider unhealthy, routes subsequent requests to the backup, periodically health-checks the primary, and resumes using it when recovered.
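The failover cycle described above can be illustrated with a toy model. This is a simplified sketch, not LiveKit's implementation: `Provider`, `ToyFallback`, and `retry_after` are hypothetical names, and the periodic health check is approximated by a cool-down after which the primary is simply tried again.

```python
import time
from typing import Callable, List, Optional, Tuple


class Provider:
    """Hypothetical wrapper; `call` stands in for an LLM/TTS/STT request."""

    def __init__(self, name: str, call: Callable[[str], str]) -> None:
        self.name = name
        self.call = call
        self.healthy = True
        self.failed_at: Optional[float] = None


class ToyFallback:
    """Tries providers in priority order, skipping those marked unhealthy."""

    def __init__(
        self,
        providers: List[Provider],
        retry_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.providers = providers
        self.retry_after = retry_after  # assumed cool-down, in seconds
        self.clock = clock

    def _recheck(self, provider: Provider) -> None:
        # Stand-in for a health check: once the cool-down has elapsed,
        # allow the provider to be tried again.
        if not provider.healthy and provider.failed_at is not None:
            if self.clock() - provider.failed_at >= self.retry_after:
                provider.healthy = True

    def request(self, payload: str) -> Tuple[str, str]:
        for provider in self.providers:
            self._recheck(provider)
            if not provider.healthy:
                continue
            try:
                return provider.name, provider.call(payload)
            except Exception:
                # Mark the provider unhealthy and fall through to the next one.
                provider.healthy = False
                provider.failed_at = self.clock()
        raise RuntimeError("all providers failed")
```

In this model a failed primary is skipped until the cool-down passes, after which the next request goes to it first again, which mirrors the "resume when recovered" behavior at a high level.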

Additionally, the entrypoint has crash recovery that speaks an apology message to the caller before disconnecting.

Provider Fallback Architecture

Provider Fallback Chains

LLM (per-agent FallbackAdapter)

Each agent class constructs its own llm.FallbackAdapter with the correct provider priority. This restores the cross-fallback design from the Cartesia LINE SDK (LlmConfig.fallbacks) that was lost during the LiveKit migration.

| Agent | Primary | Fallback |
| --- | --- | --- |
| CoordinatorAgent | Gemini 2.5 Flash | Claude Haiku 4.5 |
| CareAgent | Claude Haiku 4.5 | Gemini 2.5 Flash |
| SalesAgent | Gemini 2.5 Flash | Claude Haiku 4.5 |
| DemoAgent | Gemini 2.5 Flash | Claude Haiku 4.5 |
| DemoRouterAgent | Gemini 2.5 Flash | Claude Haiku 4.5 |

Set in verticals/church/agents.py (_coordinator_llm(), _care_llm()) and verticals/sales/agents.py (_sales_llm()). Passed to super().__init__(llm=...).
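The per-agent wiring might look roughly like the sketch below. To keep it runnable without the LiveKit SDK, `StubFallbackAdapter` and `StubAgent` are hypothetical stand-ins for `llm.FallbackAdapter` and the agent base class, and providers are represented by their names rather than real client objects. The actual factory bodies in the repository may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StubFallbackAdapter:
    """Stand-in for llm.FallbackAdapter: providers in priority order."""

    providers: List[str]


class StubAgent:
    """Stand-in for the agent base class, which accepts an `llm` argument."""

    def __init__(self, llm: StubFallbackAdapter) -> None:
        self.llm = llm


def _coordinator_llm() -> StubFallbackAdapter:
    # Gemini first, Claude as backup.
    return StubFallbackAdapter(["Gemini 2.5 Flash", "Claude Haiku 4.5"])


def _care_llm() -> StubFallbackAdapter:
    # CareAgent reverses the order: Claude first, Gemini as backup.
    return StubFallbackAdapter(["Claude Haiku 4.5", "Gemini 2.5 Flash"])


class CoordinatorAgent(StubAgent):
    def __init__(self) -> None:
        super().__init__(llm=_coordinator_llm())


class CareAgent(StubAgent):
    def __init__(self) -> None:
        super().__init__(llm=_care_llm())
```

Building a fresh adapter in each factory call keeps the provider order local to the agent class, so changing one agent's priority does not affect the others.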

TTS (session-level + per-agent override)

| Scope | Primary | Fallback |
| --- | --- | --- |
| Session (Coordinator, Sales) | Cartesia Sonic 3 (church voice) | Google TTS |
| CareAgent override | Cartesia Sonic 3 (care voice) | Google TTS |

Set in main.py (_run_call()) for session-level, and verticals/church/agents.py (_care_tts()) for CareAgent.

When CareAgent becomes active via handoff, its per-agent TTS FallbackAdapter replaces the session-level one. This preserves the voice-switching design, in which CareAgent speaks in a voice of a different gender from the session voice.
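The override rule amounts to "use the agent's TTS chain if it has one, otherwise the session's". A minimal sketch of that resolution, assuming hypothetical names (`effective_tts`, `AGENT_TTS`) and using plain tuples of provider names in place of real TTS adapters:

```python
from typing import Dict, Optional, Tuple

TtsChain = Tuple[str, str]  # (primary, fallback)

SESSION_TTS: TtsChain = ("Cartesia Sonic 3 (church voice)", "Google TTS")

# Only agents that define their own TTS appear here (hypothetical registry).
AGENT_TTS: Dict[str, TtsChain] = {
    "CareAgent": ("Cartesia Sonic 3 (care voice)", "Google TTS"),
}


def effective_tts(agent_name: str) -> TtsChain:
    """Return the agent's own chain if it defines one, else the session chain."""
    override: Optional[TtsChain] = AGENT_TTS.get(agent_name)
    return override if override is not None else SESSION_TTS
```

Under this rule, handing off from CoordinatorAgent to CareAgent changes the primary voice while the fallback provider stays the same.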

STT (session-level only)

| Primary | Fallback |
| --- | --- |
| Deepgram Nova-3 | Google STT |

Set in main.py (_run_call()). No agent overrides STT.

Crash Recovery

If any unhandled exception occurs during call setup or the main call pipeline, the entrypoint catches it and:

  1. Logs the full exception with traceback
  2. Creates a minimal TTS-only AgentSession (no LLM/STT needed)
  3. Speaks: "We're sorry, we're experiencing technical difficulties right now. Please try calling back in a few minutes, or contact the church directly. We apologize for the inconvenience."
  4. Waits 1 second for TTS to finish playing
  5. Deletes the LiveKit room (cleanly disconnects the call)

If the apology TTS itself fails (e.g., Cartesia is completely down), the handler logs the error and still deletes the room. The caller hears silence and is then disconnected, but no room is left stuck.

Implemented in main.py: _speak_error_and_hangup() and the try/except wrapper around _run_call().
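The control flow of the wrapper can be sketched as follows. This is an illustration under assumptions, not the code in main.py: the `speak`, `delete_room`, and `run_call` callables are hypothetical injection points standing in for the TTS-only session, the LiveKit room deletion, and `_run_call()`, and the standard `logging` module is used here in place of Loguru so the block has no third-party dependencies.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger("voice-agent")

APOLOGY = (
    "We're sorry, we're experiencing technical difficulties right now. "
    "Please try calling back in a few minutes, or contact the church directly. "
    "We apologize for the inconvenience."
)


async def speak_error_and_hangup(
    speak: Callable[[str], Awaitable[None]],
    delete_room: Callable[[], Awaitable[None]],
    drain_seconds: float = 1.0,
) -> None:
    """Speak the apology, give playback time to finish, then always delete the room."""
    try:
        await speak(APOLOGY)
        await asyncio.sleep(drain_seconds)
    except Exception:
        # TTS is down too: the caller gets silence, but cleanup still runs.
        logger.exception("apology TTS failed; disconnecting without audio")
    finally:
        await delete_room()


async def entrypoint(
    run_call: Callable[[], Awaitable[None]],
    speak: Callable[[str], Awaitable[None]],
    delete_room: Callable[[], Awaitable[None]],
    drain_seconds: float = 1.0,
) -> None:
    try:
        await run_call()
    except Exception:
        logger.exception("unhandled error during call setup or pipeline")
        await speak_error_and_hangup(speak, delete_room, drain_seconds)
```

Putting the room deletion in a `finally` clause is what guarantees the "no stuck room" property even when the apology cannot be spoken.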

Post-Call Classification Fallback

Separate from the live conversation fallbacks, post-call classification has its own chain:

  1. Gemini 2.5 Flash (primary)
  2. Gemini 2.0 Flash (older model, same provider)
  3. Keyword-based heuristic (no LLM — detects crisis, pastoral, prayer patterns)

Implemented in session.py (classify_call()).
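The three-step chain above can be sketched as a loop over model callables with a keyword heuristic as the last resort. The keyword lists, the label strings, and the "general" default are assumptions made for illustration; the real patterns and signature in session.py are not shown in this document.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical keyword patterns, checked in order of severity.
KEYWORDS: Dict[str, List[str]] = {
    "crisis": ["suicide", "hurt myself", "emergency"],
    "pastoral": ["pastor", "counseling", "funeral"],
    "prayer": ["pray", "prayer request"],
}


def keyword_classify(transcript: str) -> str:
    """Last-resort classifier that needs no LLM."""
    text = transcript.lower()
    for label, words in KEYWORDS.items():
        if any(word in text for word in words):
            return label
    return "general"  # assumed default label


def classify_call(
    transcript: str,
    models: Sequence[Callable[[str], str]],
) -> str:
    """Try each model in order; fall back to keywords if all of them fail."""
    for model in models:
        try:
            return model(transcript)
        except Exception:
            continue  # try the next model in the chain
    return keyword_classify(transcript)
```

Here `models` would hold two callables, one per Gemini model, in priority order. Because the heuristic never calls an external service, the chain always produces some label.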

What Is NOT Covered

  • No redundant telephony path. If Telnyx/Twilio or LiveKit's SIP gateway is down, calls fail at the carrier level (busy signal or "not in service"). This is outside application control.
  • No geographic failover. The agent runs in us-east only. LiveKit Cloud handles infrastructure redundancy within that region.
  • No automatic caller callback. If a call fails, the system does not attempt to call the person back.

Monitoring

| Mechanism | What it detects | Frequency |
| --- | --- | --- |
| Voice health cron (/api/cron/voice-health) | Missing trunks, stuck rooms, schema drift | Every 15 min |
| LiveKit Cloud dashboard | Agent online/offline, room count | Real-time |
| Loguru logging | All provider errors, fallback activations | Real-time (LiveKit logs) |
| FallbackAdapter internal | Provider health state (healthy/unhealthy) | Automatic |

History

  • Pre-March 2026: Cartesia LINE SDK had LlmConfig(fallbacks=[...]) for LLM failover only
  • March 26, 2026: LiveKit migration — fallbacks lost (single provider per service)
  • April 1, 2026: Crash recovery added (_speak_error_and_hangup)
  • April 1, 2026: Full fallback chains restored via LiveKit FallbackAdapter (LLM + TTS + STT)