Voice Agent Architecture
Multi-Tenant Design
The voice agent is a single deployed instance that serves every church customer. There is no per-church container, no per-church deployment, and no per-church scaling decision. When a call arrives, the agent dynamically loads that church's configuration from Supabase and builds a church-specific agent on the fly.
This design was chosen because:
- Solo founder -- no DevOps capacity for per-tenant infrastructure
- LiveKit Cloud handles the SIP gateway and room management; Railway hosts the Python agent worker
- Per-church customization lives entirely in database configuration, not code
Call Routing: SIP Trunk Phone Number
Every inbound call arrives via the LiveKit Cloud SIP gateway. The Twilio SIP trunk forwards the call to LiveKit Cloud, which dispatches a job to the Railway agent worker. The agent worker reads the dialed number from JobContext.room.sip (sip.trunkPhoneNumber) to determine which church (if any) the call belongs to. The routing follows a three-tier resolution:
Tier 1: resolve_route(to_number)
Maps the inbound Twilio To number to an (agent_type, church_id) tuple. Four outcomes:
| Resolution | agent_type | church_id | What happens next |
|---|---|---|---|
| Toll-free number matches TOLL_FREE_NUMBER | "sales" | None | Build Sales Agent |
| Number is in DEMO_NUMBERS set | "demo_router" | None | Build Demo Router Agent |
| Number is in PHONE_REGISTRY dict | "church" | UUID or None | Build Church Coordinator |
| Number not found anywhere | "church" | None | Fall through to DB lookup |
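The table above can be sketched as a small pure function. This is a minimal sketch rather than the actual session.py code: the phone numbers and the church UUID are placeholder values, and the order of the checks is an assumption based on the table.

```python
from typing import Optional, Tuple

# Placeholder values for illustration only; the real numbers are redacted.
TOLL_FREE_NUMBER = "+18005550100"
DEMO_NUMBERS = {"+15555550101", "+15555550102"}
PHONE_REGISTRY = {
    "+15555550103": "00000000-0000-0000-0000-000000000001",  # assigned church
    "+15555550104": None,                                    # spare, unassigned
}


def resolve_route(to_number: str) -> Tuple[str, Optional[str]]:
    """Map the dialed number to (agent_type, church_id)."""
    if to_number == TOLL_FREE_NUMBER:
        return ("sales", None)
    if to_number in DEMO_NUMBERS:
        return ("demo_router", None)
    # A registry hit returns the known church ID (possibly None). A miss also
    # returns ("church", None), so the caller falls through to the DB lookup.
    return ("church", PHONE_REGISTRY.get(to_number))
```

Returning ("church", None) for unknown numbers keeps the function free of database access, which leaves the slower lookup to a later step.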
Tier 2: Agent Type Determines Builder
Based on the agent_type from Tier 1, get_agent() calls the appropriate builder:
agent_type == "sales" --> build_sales_agent()
agent_type == "demo_router" --> build_demo_router_agent()
agent_type == "church" --> build_coordinator_agent()
Tier 3: Per-Church Data Loading
For church calls, the system loads a complete church configuration object before building the agent. This happens in parallel where possible:
await asyncio.gather(
    fetch_session_rag(supabase, church_id, denomination, church_name),
    load_product_knowledge(supabase),
    load_inline_faqs(supabase, church_id),
    load_repeat_caller_history(supabase, caller_phone, church_id),
)
The assembled context includes:
- RAG context -- church-specific knowledge base hits + theological content from unified_rag_content
- Product knowledge -- runtime FAQ pairs from product_knowledge table (shared across all calls)
- Inline FAQs -- church-specific Q&A pairs from church_knowledge_base (injected directly, separate from RAG vector search)
- Repeat caller history -- privacy-gated summaries of the caller's last 5 calls within 90 days
- Datetime context -- current date/time in the church's timezone with relative references ("This Sunday means March 30, 2026")
All context blocks are concatenated and injected into the agent's system prompt via agent.history.add_entry(rag_context, role="system").
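The parallel load and the concatenation step might look roughly like the sketch below. The loader bodies are stubs with simplified signatures (no supabase argument), and the safe() helper and build_context() function are hypothetical names introduced only for this illustration.

```python
import asyncio


async def load_product_knowledge() -> str:
    # Stub standing in for the real Supabase-backed loader.
    return "PRODUCT KNOWLEDGE: ..."


async def load_inline_faqs(church_id: str) -> str:
    # Stub that simulates a failing query to show the non-fatal path.
    raise RuntimeError("simulated Supabase error")


async def safe(coro) -> str:
    """Await a loader and turn any exception into an empty block."""
    try:
        return await coro
    except Exception:
        return ""


async def build_context(church_id: str) -> str:
    # Run the loaders concurrently; each one is individually non-fatal.
    blocks = await asyncio.gather(
        safe(load_product_knowledge()),
        safe(load_inline_faqs(church_id)),
    )
    # Drop empty blocks, then join the rest into one system-prompt string.
    return "\n\n".join(block for block in blocks if block)
```

Because every loader is wrapped individually, one failed query removes only its own block from the prompt instead of cancelling the whole gather.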
Phone Registry
PHONE_REGISTRY is a static Python dict in session.py that maps every Twilio number the platform owns to a church ID (or None for unassigned/sales numbers).
# Simplified structure (actual phone numbers redacted):
PHONE_REGISTRY = {
    "+1XXXXXXXXXX": None,  # Toll-free -> Sales
    "+1XXXXXXXXXX": None,  # Spare, unassigned
    "+1XXXXXXXXXX": None,  # Cartesia agent number
}
DEMO_NUMBERS = {
    "+1XXXXXXXXXX",  # US demo line
    "+1XXXXXXXXXX",  # CA demo line
    "+1XXXXXXXXXX",  # Cartesia agent demo line
}
Resolution priority:
1. Toll-free match (TOLL_FREE_NUMBER constant) --> Sales Agent
2. Demo number match (DEMO_NUMBERS set) --> Demo Router Agent
3. Registry match (PHONE_REGISTRY dict) --> Church Agent with known church_id
4. DB lookup (lookup_church_by_phone()) --> queries church_voice_agents.twilio_phone_number
5. Fallback --> Sales Agent (caller always reaches someone)
The DB lookup at step 4 supports churches whose numbers were provisioned after the last code deploy. The result is cached for 5 minutes.
Caching Strategy
The voice agent uses a simple in-memory TTL cache (session._cache) based on time.monotonic(). No external cache (Redis, Memcached) is used, because each Railway agent worker runs as a single process.
| Cache Key Pattern | TTL | What It Caches |
|---|---|---|
| pk:all | 15 minutes | Formatted product knowledge FAQ block |
| phone:{to_number} | 5 minutes | Church ID resolved from DB phone lookup |
| faq:{church_id} | 5 minutes | Formatted church-specific FAQ pairs |
| church:{church_id} | 5 minutes | Complete church configuration dict |
The cache also supports a stale-while-revalidate pattern via cache_get_stale(). If Supabase fails during a church data reload, the expired cached value is served rather than dropping the call. This ensures degraded-but-functional behavior during database outages.
Note: The Railway agent worker runs a single process per container. If Railway scales to multiple workers, each worker maintains its own independent in-memory cache — there is no shared cache across workers.
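A TTL cache with a stale read path can be sketched in a few lines. Only cache_get_stale() is named in this document; cache_set(), cache_get(), and the tuple layout of _cache are assumptions made for the sketch.

```python
import time
from typing import Any, Dict, Optional, Tuple

# key -> (expires_at, value); expiry is measured with time.monotonic()
_cache: Dict[str, Tuple[float, Any]] = {}


def cache_set(key: str, value: Any, ttl_seconds: float) -> None:
    _cache[key] = (time.monotonic() + ttl_seconds, value)


def cache_get(key: str) -> Optional[Any]:
    """Return the value only while it is still fresh."""
    entry = _cache.get(key)
    if entry is None or time.monotonic() >= entry[0]:
        return None
    return entry[1]


def cache_get_stale(key: str) -> Optional[Any]:
    """Return the value even if expired, for use when a reload fails."""
    entry = _cache.get(key)
    return None if entry is None else entry[1]
```

Expired entries are deliberately left in the dict instead of being evicted on read, since eviction would remove the value the stale path depends on. time.monotonic() is used because it cannot jump backwards when the system clock is adjusted.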
Non-Fatal Error Handling
Every Supabase call in the voice agent is wrapped in try/except. The design principle is: a database error should never drop a call. Specific fallbacks:
| Failure | Fallback Behavior |
|---|---|
| load_church_data() Supabase error | Serve stale cache if available; otherwise return None (routes to Sales) |
| lookup_church_by_phone() error | Return None (routes to Sales) |
| insert_call_log() error | Log warning, call proceeds without a log record |
| increment_call_count() error | Log warning, church gets a free call rather than a dropped call |
| load_product_knowledge() error | Return empty string, agent proceeds without product knowledge |
| load_inline_faqs() error | Return empty string, agent proceeds without FAQ context |
| load_repeat_caller_history() error | Return empty string, agent proceeds without caller history |
| update_call_log_end() error | Log error, call has already completed |
| Unknown inbound number, no DB match | Route to Sales Agent |
| Church over call limit | Return None, route to Sales Agent |
| Care Agent build fails | Log warning, Coordinator handles all topics including pastoral |
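One way to apply the "log and return a fallback" rule uniformly is a decorator. The sketch below is an illustration of the principle, not the agent's actual code: the non_fatal name, the logger name, and the stub loader body are all hypothetical.

```python
import asyncio
import functools
import logging

logger = logging.getLogger("voice_agent")


def non_fatal(default):
    """Decorator: log any exception and return `default` instead of raising."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            try:
                return await func(*args, **kwargs)
            except Exception as exc:
                logger.warning("%s failed: %s", func.__name__, exc)
                return default
        return wrapper
    return decorator


@non_fatal(default="")
async def load_inline_faqs(church_id: str) -> str:
    # Stub body that simulates a database outage.
    raise ConnectionError("simulated Supabase outage")


# The call completes with the fallback value instead of raising.
result = asyncio.run(load_inline_faqs("church-1"))
```

The default is chosen per function, which matches the table: loaders fall back to an empty string, while lookups fall back to None so the call routes to Sales.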
TurnProcessor Pipeline
The TurnProcessor wraps the LLM agent and intercepts every event before and after it reaches the LLM. It is the core per-turn processing pipeline.
For each UserTextSent event (transcribed caller speech), the pipeline runs:
1. Cancel pending farewell timer (caller spoke again)
2. "Are you there?" reassurance
   - If session.is_processing AND caller said "are you there?" / "hello" / etc.
   - Yield "Yes, I'm here! Just one more moment."
   - Return (do not forward to LLM, do not cancel pending work)
3. Moderation checks (BEFORE noise filtering -- safety must never be silently dropped)
   a. Threat detection (check_threat)
      - Hardcoded response: "This call is being recorded..."
      - Immediately end call
      - Fire-and-forget: email + SMS alerts to church + support
   b. Crisis detection (check_crisis)
      - Inject 988 Lifeline directive into LLM context
      - Set session.crisis_detected = True
      - Do NOT end call -- let caller decide
   c. Abuse detection (check_abuse)
      - 1st offense: inject "caller used inappropriate language" context
      - 2nd offense: hardcoded "I'm going to end this call now" + end call
4. Noise filtering (AFTER moderation -- only if moderation didn't fire)
   - Pure noise ("um", "uh", "hmm"): silently dropped
   - Pure backchannels ("uh huh", "mm hmm"): silently dropped
   - Context-dependent ("okay", "yeah", "sure"): dropped if agent didn't ask a question
   - Floor takes ("wait", "stop", "actually"): always pass through
   - Farewell/gratitude ("thanks", "bye"): always pass through
5. Per-turn RAG (500ms hard timeout, skipped on moderation events)
   - Generate embedding of caller message
   - Search church knowledge base (stricter threshold than session init)
   - Skip if caller message < 10 characters
   - If Supabase is slow, RAG is dropped for this turn
6. Combine contexts and delegate to LlmAgent.process()
   - Inject tool filler phrase before first tool call ("Let me check on that.")
   - Skip filler for end_call and demo_agent tools
7. Auto-hangup on mutual farewell
   - Both agent AND caller must have said goodbye
   - 4-second grace period (let farewell audio finish)
   - Cancelled if caller speaks again during grace period
   - DISABLED during crisis mode -- caller controls when to end
Non-text events (CallStarted, etc.) pass through directly to the agent. CallEnded triggers call log finalization and async classification.
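The noise-filtering rules in step 4 can be illustrated with a small classifier. This is a sketch under stated assumptions: the word sets contain only the examples listed above, matching is exact after simple normalization, and should_forward() is a hypothetical name.

```python
PURE_NOISE = {"um", "uh", "hmm"}
BACKCHANNELS = {"uh huh", "mm hmm"}
CONTEXT_DEPENDENT = {"okay", "yeah", "sure"}
FLOOR_TAKES = {"wait", "stop", "actually"}
FAREWELLS = {"thanks", "bye"}


def should_forward(text: str, agent_asked_question: bool) -> bool:
    """Return True if the utterance should reach the LLM."""
    normalized = text.lower().strip(" .,!?")
    if normalized in FLOOR_TAKES or normalized in FAREWELLS:
        return True  # always pass through
    if normalized in PURE_NOISE or normalized in BACKCHANNELS:
        return False  # silently dropped
    if normalized in CONTEXT_DEPENDENT:
        return agent_asked_question  # only meaningful as an answer
    return True  # anything else is treated as real speech
```

Checking floor takes and farewells first means a caller who says "wait" or "bye" is never filtered, whatever the agent's last turn was.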
LLM Configuration
Each agent type has specific LLM settings:
| Setting | Coordinator | Care | Sales/Demo |
|---|---|---|---|
| Model | gemini/gemini-2.5-flash | anthropic/claude-haiku-4-5-20251001 | gemini/gemini-2.5-flash |
| Fallbacks | [claude-haiku-4-5] | [gemini-2.5-flash] | [claude-haiku-4-5] |
| Temperature | 0.7 | 0.4 (more controlled for sensitive topics) | 0.7 (0.5 for Demo Router) |
| Timeout | 15 seconds | 15 seconds | 15 seconds |
| Retries | 1 | 1 | 1 |
| Max tool iterations | 5 | 5 | 5 (3 for Demo Router) |
The cross-fallback pattern (Coordinator uses Haiku as fallback, Care uses Gemini as fallback) ensures that if either provider has an outage, calls still complete.
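The effect of a fallback list can be shown with a provider-agnostic sketch. The real agent passes fallbacks through its LLM configuration; complete_with_fallbacks() and fake_call() below are hypothetical helpers, and the full fallback model string is an assumption taken from the Care column of the table.

```python
from typing import Callable, Sequence


def complete_with_fallbacks(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> str:
    """Try each model in order and return the first successful reply."""
    last_error = None
    for model in models:
        try:
            return call_model(model, prompt)
        except Exception as exc:  # provider outage, timeout, etc.
            last_error = exc
    raise RuntimeError("all models failed") from last_error


def fake_call(model: str, prompt: str) -> str:
    # Stand-in for a real provider call; simulates a Gemini outage.
    if model.startswith("gemini/"):
        raise TimeoutError("simulated provider outage")
    return f"{model} answered: {prompt}"


COORDINATOR_MODELS = [
    "gemini/gemini-2.5-flash",
    "anthropic/claude-haiku-4-5-20251001",
]
reply = complete_with_fallbacks("hello", COORDINATOR_MODELS, fake_call)
```

With the primary provider failing, the reply comes from the second model in the list, which is the behavior the cross-fallback configuration relies on.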
Session State
Each call maintains a session dict that persists for the duration of the call:
session = {
    "abuse_count": 0,           # Incremented on each abuse detection
    "church_id": "uuid",        # Resolved church or SALES_SENTINEL
    "call_id": "call_sid",      # Twilio/Cartesia call SID
    "caller_phone": "+1...",    # Caller's phone number
    "church_data": {...},       # Full church config dict (or {} for sales)
    "is_processing": False,     # True while LLM/tool call is in-flight
    "crisis_detected": False,   # True after crisis pattern match (disables auto-hangup)
    "start_time": 1234.5,       # time.monotonic() at call start
    "duration": 0,              # Computed each turn: monotonic() - start_time
    "farewell_pending": False,  # True during farewell grace period
    "tool_results": None,       # Legacy field, no longer written to DB
}
The TurnProcessor also injects session data into the TurnEnv object that tools receive, including supabase, church_id, church_data, caller_phone, caller_email, PCO credentials, Cal.com credentials, and church timezone.
Call Log Lifecycle
1. Insert (insert_call_log) -- called immediately when get_agent() resolves the call. Creates a voice_call_logs row with status: "in_progress", the call SID, church ID, caller number, and called number.
2. Increment (increment_call_count) -- increments calls_this_month on church_voice_agents for call-limit tracking. Non-fatal.
3. Update at end (update_call_log_end) -- triggered by CallEnded event. Writes: status: "completed", duration_seconds, transcript (JSONB array of all events), and summary (initially empty, populated by classification).
4. Async classification (_generate_call_classification) -- after the transcript is saved, Gemini Flash parses the conversation into 7 structured fields:
| Field | DB Column | Example Values |
|---|---|---|
| Summary | summary | "Caller asked about Sunday service times and was given 9:00 AM and 11:00 AM options." |
| Sentiment | caller_sentiment | -1.0 to 1.0 (float) |
| Topics | call_topics | ["service_times", "directions"] (JSONB array) |
| Category | category | service_info, prayer_request, visitor, crisis, pastoral_care, etc. |
| Urgency | urgency | low, normal, urgent, pastoral_emergency |
| Follow-up needed | follow_up_needed | true / false |
| Suggested assignee | suggested_assignee | pastor, office_admin, prayer_team, care_team, none |
Classification is fire-and-forget. If it fails, the call log still has the transcript and duration. The admin dashboard uses these fields for filtering, triage, and assignment.
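Because the classification comes from an LLM, the reply is worth validating before it is written to the database. The sketch below assumes the model returns a JSON object keyed by the DB column names; the CallClassification dataclass, the parse_classification() helper, and the clamping and default rules are illustrative assumptions, not the actual implementation.

```python
import json
from dataclasses import dataclass
from typing import List

URGENCY_LEVELS = {"low", "normal", "urgent", "pastoral_emergency"}


@dataclass
class CallClassification:
    summary: str
    caller_sentiment: float
    call_topics: List[str]
    category: str
    urgency: str
    follow_up_needed: bool
    suggested_assignee: str


def parse_classification(raw_json: str) -> CallClassification:
    """Parse the model's JSON reply and clamp or default out-of-range values."""
    data = json.loads(raw_json)
    # Keep sentiment inside the documented -1.0 to 1.0 range.
    sentiment = max(-1.0, min(1.0, float(data.get("caller_sentiment", 0.0))))
    urgency = data.get("urgency", "normal")
    return CallClassification(
        summary=str(data.get("summary", "")),
        caller_sentiment=sentiment,
        call_topics=list(data.get("call_topics", [])),
        category=str(data.get("category", "")),
        urgency=urgency if urgency in URGENCY_LEVELS else "normal",
        follow_up_needed=bool(data.get("follow_up_needed", False)),
        suggested_assignee=str(data.get("suggested_assignee", "none")),
    )
```

A malformed reply raises from json.loads(), which fits the fire-and-forget design: the caller can log the error and leave the transcript and duration untouched.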
Agent Builder Pattern
All agents are built via builder functions that return a configured LlmAgent. The pattern:
def build_coordinator_agent(church: dict, rag_context: str = "") -> LlmAgent:
    # 1. Assemble tool list based on church config and feature flags
    tools = [send_sms_link, end_call]
    if church.get("address"):
        tools.append(send_directions_link)
    if church.get("giving_enabled") and church.get("giving_url"):
        tools.append(send_giving_link)
    # ... more conditional tools ...

    # 2. Add tier-gated handoffs (Care Agent)
    tier = church.get("plan", "starter")
    if "care" in TIER_AGENTS[tier]:
        care_agent = build_care_agent(church, rag_context)
        # Remaining handoff arguments are elided in this simplified excerpt
        tools.append(agent_as_handoff(care_agent, name="transfer_to_care"))

    # 3. Build LlmAgent with church-specific prompt
    # (church_name is not defined in this simplified excerpt)
    agent = LlmAgent(
        model=_MODEL,
        api_key=_api_key(),
        tools=tools,
        config=LlmConfig(
            system_prompt=build_coordinator_prompt(church),
            introduction=f"Thank you for calling {church_name}...",
            temperature=0.7,
            fallbacks=_FALLBACKS,
        ),
    )

    # 4. Inject RAG context as system message
    if rag_context:
        agent.history.add_entry(rag_context, role="system")
    return agent
Feature flags from the church config dict control which tools are available:
- visitor_intake_enabled -- capture_visitor_contact tool
- address -- send_directions_link tool
- events -- register_for_event tool
- cal_enabled + cal_event_type_id -- check_availability + book_appointment tools
- pco_enabled + pco_app_id + pco_secret -- Planning Center tools (service times, events, staff)
- giving_enabled + (giving_url or etransfer_email) -- send_giving_link tool
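The flag-to-tool mapping above can be expressed as a function that returns tool names, which is easy to test without constructing an agent. This is a sketch: enabled_tool_names() is a hypothetical helper, tools are represented as strings, and the Planning Center tools are omitted because their names are not listed here.

```python
from typing import Any, Dict, List


def enabled_tool_names(church: Dict[str, Any]) -> List[str]:
    """Return the names of the tools a church's config switches on."""
    tools = ["send_sms_link", "end_call"]  # always available
    if church.get("visitor_intake_enabled"):
        tools.append("capture_visitor_contact")
    if church.get("address"):
        tools.append("send_directions_link")
    if church.get("events"):
        tools.append("register_for_event")
    if church.get("cal_enabled") and church.get("cal_event_type_id"):
        tools += ["check_availability", "book_appointment"]
    if church.get("giving_enabled") and (
        church.get("giving_url") or church.get("etransfer_email")
    ):
        tools.append("send_giving_link")
    return tools
```

Using church.get() throughout means a missing key behaves like a disabled flag, so a sparse config still produces a working agent with the two base tools.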
Sales and Demo Agent Architecture
Beyond church calls, the voice agent serves three additional agent types:
Sales Agent
Handles the toll-free line. Has access to:
- search_churches -- look up churches in the PewSearch directory
- schedule_demo -- book a demo call
- capture_support -- log support requests
- send_sms_link -- send URLs via SMS
- Demo handoffs -- one per configured demo church, with voice swapping
Voice gender is randomly assigned 50/50 per call (Carson male / Brooke female). Demo handoffs use the opposite gender voice via UpdateCallConfig so the caller hears a distinct voice change.
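The voice selection can be sketched as follows. The voice names come from the paragraph above; pick_voices() and the injectable random generator are assumptions added so the behavior is reproducible in tests.

```python
import random
from typing import Optional, Tuple

VOICES = ("Carson", "Brooke")  # male / female, as named above


def pick_voices(rng: Optional[random.Random] = None) -> Tuple[str, str]:
    """Return (sales_voice, demo_voice); the demo voice is always the other one."""
    chooser = rng or random
    sales_voice = chooser.choice(VOICES)
    demo_voice = VOICES[1] if sales_voice == VOICES[0] else VOICES[0]
    return sales_voice, demo_voice
```

Deriving the demo voice from the sales voice, instead of drawing it independently, guarantees the caller hears a change of voice at the handoff.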
Demo Router Agent
Handles demo phone lines. A lightweight router that greets the caller, offers a choice of demo churches (Protestant or Catholic), and transfers to the selected Demo Agent. No sales knowledge, no product info. If the caller asks about pricing, it directs them to the toll-free number.
Demo Agent
A leaf agent (no handoffs) that role-plays as a specific church's receptionist. All tools are no-op mocks (acknowledged but not persisted to DB) except send_directions_link, which sends a real SMS so the prospect can see the feature in action. Church facts are loaded from the database, with a hardcoded fallback (DEMO_CHURCH_FALLBACK_FACTS) if the DB load fails.
RAG Architecture
The voice agent uses two tiers of RAG:
Session-Init RAG (fetch_session_rag)
Runs once when the call starts. Generates an embedding from a broad seed query ("Tell me about {church_name}, services, events, programs, ministries") and searches two sources in parallel:
- Church knowledge base (search_church_knowledge RPC) -- church-specific FAQs and uploaded document chunks. 8 results, 0.35 similarity threshold.
- Unified RAG content (search_unified_rag_content RPC) -- theological content filtered by the church's denomination (mapped to a theological lens ID). 5 results, 0.35 threshold.
Results are formatted into labeled blocks and injected into the agent's system prompt.
Per-Turn RAG (fetch_turn_rag)
Runs on every caller utterance (unless moderation fired or message is < 10 characters). Searches church knowledge base only (not theological) with a stricter 0.4 threshold and a hard 500ms timeout. If Supabase is slow, RAG is silently skipped for that turn -- the call never blocks waiting for search results.
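The length check and hard timeout can be sketched with asyncio.wait_for. The signature below is an assumption (the real fetch_turn_rag is not shown in this document), and the search parameter and the fast_search/slow_search stubs are hypothetical stand-ins for the embedding and Supabase search calls.

```python
import asyncio
from typing import Awaitable, Callable

MIN_QUERY_CHARS = 10
TURN_RAG_TIMEOUT_S = 0.5


async def fetch_turn_rag(
    message: str,
    search: Callable[[str], Awaitable[str]],
) -> str:
    """Return RAG context for one turn, or "" if skipped or too slow."""
    if len(message) < MIN_QUERY_CHARS:
        return ""
    try:
        return await asyncio.wait_for(search(message), timeout=TURN_RAG_TIMEOUT_S)
    except asyncio.TimeoutError:
        return ""  # search was too slow; proceed without RAG this turn


async def fast_search(query: str) -> str:
    return f"hit for: {query}"


async def slow_search(query: str) -> str:
    await asyncio.sleep(2)  # longer than the 500ms budget
    return "too late"
```

asyncio.wait_for cancels the pending search when the timeout expires, so a slow query does not keep running in the background after the turn has moved on.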
Church Data Loading
load_church_data() in supabase_church.py assembles a complete church configuration by joining three Supabase queries:
- church_voice_agents joined with churches and premium_churches -- identity, address, denomination, voice config, feature toggles, integrations, pastor info, weekly content, custom hours/staff/ministries/events
- organization_settings -- agent configuration (personality, enabled agents)
- premium_churches (standalone query) -- plan tier and call-limit fields
The result is cached for 5 minutes with stale-serve fallback on error. Call-limit enforcement happens at load time: if calls_this_month >= calls_limit, the function returns None and the call routes to the Sales Agent.
Deployment
The voice agent deploys via git push to the main branch. Railway auto-deploys the Python agent worker from GitHub. The agent worker then registers with LiveKit Cloud automatically over its own WebSocket connection, so no separate cartesia connect or cartesia deploy step is needed.
# Push to main — Railway auto-deploys
git push origin main
Environment variables (Supabase keys, API keys, LiveKit credentials, Twilio credentials) are configured in the Railway service environment, not in .env files shipped with the code. See runbooks/deployment/deploy-voice-agent.md for the full deploy procedure.
Legacy Code
The churchwiseai-web/voice-agent-line/ directory is the legacy Cartesia LINE SDK implementation. It has been replaced by churchwiseai-web/voice-agent-livekit/. Do not modify voice-agent-line/ — it exists as reference code only. The even older Node.js agent at churchwiseai-web/voice-agent/ is also fully legacy and must not be modified.