Day 4 Handoff — Pick Up After 8-P0 Day

You are the new orchestrator. Day 3 (2026-04-29) ended at 18:30 ET after ~7h of live-verification work that exposed eight distinct P0 regressions in PR #251 (Day 2 Worktree A: live transfer + browser-demo widget). Six were fixed at the code layer, one at carrier config, and the eighth (architectural) was deferred to Day 4 with explicit founder approval. Cold-email batch GO/NO-GO is still held. Voice agent runtime on sNqgCxcJY23F carries all six code-layer fixes; Telnyx outbound is wired correctly; the bridge mechanic itself is broken on the browser demo path.

Read this entire document before doing anything. Run as Opus.

1. Mission recap

ChurchWiseAI is shipping a verticals-first voice agent platform so funeral homes (FuneralWiseAI), churches (ChurchWiseAI), and future verticals share infrastructure with vertical-flavored prompts/tools and a unified per-tenant Master Config. The trigger was a strategic decision to delay the FuneralWiseAI cold-email batch (~500 funeral homes) until the live-director-transfer demo is rock-solid — that demo is the competitive wedge against ASD Answering Service.

Day 1 shipped foundation. Day 2 shipped the four parallel surfaces in PRs #248-#251. Day 3 ran the founder-supervised 10-item live verification and got stuck on item 4 (browser-initiated demo flip with live transfer to director's cell). Founder priority restated multiple times: "long-term quality, industry-standard robust fixes — not quick fixes deferred to a high-pressure merge window" and "the better the demos and the more robust the product, the conversions will be way higher." Day 3 ended with founder agreeing to STOP and plan Day 4 carefully rather than push through tonight.

2. What broke (the eight P0s) — the central lesson

Every single one of these surfaced while live-verifying PR #251 (4ffe07dd), including #7, which is carrier config rather than code. The pattern: the feature was merged with unit tests stubbing the LLM, LiveKit, the audio path, and the carrier. Nothing exercised the real round-trip: real browser → mic → LiveKit → agent → STT → LLM → tool call → SIP outbound → carrier → callee → bridge.

#	Bug	Layer	Fix
1	`safety.py` `flag_safety_event(context)` missing type annotation → `KeyError` in `parse_function_tools` → both Anthropic + Google reject all turns	Python class signature	`a69a58c3` (added `: RunContext`)
2	`_run_call` referenced `demo_director_phone_override` outside scope → `NameError` on funeral-prospect path	Python signature plumbing	`f51702c8` — added kwargs to `_run_call` and pass-through
3	`DirectorTransferDemo.tsx` had no `RoomEvent.TrackSubscribed` handler → AI's TTS audio never reached browser DOM → caller heard nothing → silent timeout	Frontend WebRTC wiring	`fe3f07a7` — added attach() handler + cleanup on TrackUnsubscribed
4	`setMicrophoneEnabled(true)` runs after `room.connect()` → permission prompt fires AFTER agent dispatched → caller's mic-publish lands too late, agent already gave up	Frontend mic timing	`f205b854` — pre-acquire via `getUserMedia({ audio: true })` before token mint
5	`transfer_to_director` registered on `CareAgent` only; funeral-prospect path uses `CoordinatorAgent` → LLM hallucinated tool name from prompt, LiveKit logged "unknown AI function"	Tool registration scope	`067c7c8f` — duplicated method onto `CoordinatorAgent`; CareAgent retains for defense-in-depth
6	`core/transfer.py` `sip_call_to=f"sip:{n}@{domain}"` → Telnyx rejects full SIP URIs in this field; `transfer_to` similarly malformed	LiveKit Python SDK shape	`ad74eaef` — bare E.164 in `sip_call_to`, `tel:+E164` in `transfer_to`
7	Telnyx credential connection `2948197312620398250` had `outbound_voice_profile_id: null` → carrier silently 403/D35 rejects all outbound INVITEs, no MDR	Telnyx carrier config (NOT code)	PATCH `outbound.outbound_voice_profile_id = 2925086295615079772` (verified by direct dial test ringing founder's cell)
8	DEFERRED — Day 4. `core/transfer.py` calls `TransferSIPParticipant` (SIP REFER) to bridge the caller leg to the director's number. Browser demo's caller leg is WebRTC, not SIP — REFER fails with `"no SIP session associated with participant"`. Bridge architecture assumes both legs are SIP; browser demo path needs a different mechanic.	Architecture	(this handoff)
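
The number shapes from bug #6 can be pinned in a small helper. This is a minimal sketch, not the code in `core/transfer.py`: the function names and the separator-stripping rule are assumptions for illustration. Only the two output shapes (bare E.164 for `sip_call_to`, `tel:+E164` for `transfer_to`) come from the fix described above.

```python
import re

# E.164: "+" then up to 15 digits, first digit non-zero.
_E164 = re.compile(r"^\+[1-9]\d{1,14}$")


def normalize_e164(number: str) -> str:
    """Strip common separators, then validate. Raises ValueError otherwise."""
    cleaned = re.sub(r"[\s\-().]", "", number)
    if not _E164.match(cleaned):
        raise ValueError(f"not an E.164 number: {number!r}")
    return cleaned


def sip_call_to_value(number: str) -> str:
    """Bare E.164, never a full SIP URI (hypothetical helper name)."""
    return normalize_e164(number)


def transfer_to_value(number: str) -> str:
    """tel: URI form for the REFER target (hypothetical helper name)."""
    return "tel:" + normalize_e164(number)
```

Rejecting anything that is not E.164 up front means a `sip:...@domain` string fails loudly in a unit test instead of being silently rejected by the carrier.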

Lessons captured and committed in CI / memory:

Static contract tests landed: tests/test_function_tool_schemas.py (catches #1), voice-tool-schemas.yml workflow runs ruff check --select F821 --target-version py312 (catches #2), tests/test_transfer_sip_payload_shape.py (catches #6 with 6 assertions including SDK-class-field-name drift detection).
Memory: feedback_telnyx_outbound_three_requirements.md (TODO: write — covers credential + outbound voice profile + DID-to-connection binding triple, plus the PATCH-needs-nested-objects gotcha).
What's still missing: the round-trip Playwright spec (task #10 P0). Without it #3, #4, #5, #8 still slip through. Write it before resuming live verification — see §4.2.

3. State verification (run BEFORE doing anything)

cd C:/dev/churchwiseai-web && git fetch origin --quiet
git log --oneline origin/feat/verticals-platform-day1-foundation -10
# Expected (top to bottom):
#   ad74eaef fix(voice): bare E.164 in sip_call_to + tel: in transfer_to (Day 3 P0 #6)
#   067c7c8f fix(voice): register transfer_to_director on CoordinatorAgent (Day 3 P0 #5)
#   cc640276 fix(funeral): rewrite AT-NEED block — transfer-first with phone safety net
#   40146ffb fix(funeral): live-transfer is ANY-of conditions, not ALL-of (UX)
#   f205b854 fix(cold-outreach): pre-acquire mic permission before room connect (Day 3 P0 #4)
#   fe3f07a7 fix(cold-outreach): subscribe to remote audio so the prospect hears the AI (Day 3 P0 #3)
#   8cf0ff1d ci(voice-tool-schemas): pin ruff F821 to py312 target
#   f51702c8 fix(voice): pass demo_director_phone_override into _run_call (Day 3 P0 #2)
#   a69a58c3 fix(voice): annotate flag_safety_event(context) — restore LLM tool-schema build (Day 3 P0)
#   2ce80f9a feat(verticals): Day 2 C — VerticalProfile registry + funeralwiseai.com admin route + vertical-aware Inbox (#248)

C:/dev/lk.exe agent list --project cwa-voice
# Expected: Version `sNqgCxcJY23F`, Deployed At 2026-04-29T17:41:04Z (or NEWER if Day 4 redeploy already happened — see §4.1)

# Telnyx config — confirm bug #7 fix is still in place (not reverted)
curl -s -H "Authorization: Bearer $TELNYX_API_KEY" \
  https://api.telnyx.com/v2/credential_connections/2948197312620398250 \
  | jq '.data.outbound.outbound_voice_profile_id'
# Expected: "2925086295615079772" (NOT null)
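
If you would rather script the Telnyx check than eyeball the jq output, the same assertion can be sketched in Python. The response path (`data.outbound.outbound_voice_profile_id`) is taken from the jq filter above; the function name and the returned messages are assumptions for illustration.

```python
import json
from typing import Optional

EXPECTED_PROFILE_ID = "2925086295615079772"  # value from the bug #7 fix


def outbound_profile_problem(raw_json: str,
                             expected: str = EXPECTED_PROFILE_ID) -> Optional[str]:
    """Return None when the profile is bound as expected, else a short reason."""
    data = json.loads(raw_json).get("data") or {}
    profile_id = (data.get("outbound") or {}).get("outbound_voice_profile_id")
    if profile_id is None:
        return "outbound_voice_profile_id is null (bug #7 fix reverted?)"
    if str(profile_id) != expected:
        return f"unexpected outbound_voice_profile_id: {profile_id}"
    return None
```

Feed it the body of the curl call above; any non-None result means stop and report to the founder before doing anything else.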

4. Day 4 plan — three phases, executed in order

4.1 Architectural fix — WebRTC↔SIP bridge branch (~3-5h focused work)

The bug. voice-agent-livekit/core/transfer.py:execute_attended_transfer() calls TransferSIPParticipant to REFER the caller leg to the director's E.164. That works when the caller leg is SIP (real PSTN inbound to our LiveKit room). It fails when the caller leg is WebRTC (browser-initiated demo) with "no SIP session associated with participant" — REFER is a SIP method; there's no SIP session to REFER.

The right mechanic for the browser demo path:

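Determine the caller leg's transport at transfer time. LiveKit's RemoteParticipant.kind distinguishes Kind.STANDARD (WebRTC) from Kind.SIP (see livekit-protocol/participant.proto). In the Python SDK these are expected to surface as rtc.ParticipantKind.PARTICIPANT_KIND_STANDARD and rtc.ParticipantKind.PARTICIPANT_KIND_SIP; confirm against the installed SDK version before relying on the names. The caller leg here is the participant who originated the room (the prospect's browser).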
If caller leg is WebRTC: do NOT call TransferSIPParticipant. After the director's SIP leg is created and joins the room, AI agent says the bridge intro, then cleanly leaves the room (or just stops generating audio). LiveKit's room-native audio mixing handles browser ↔ SIP-director bidirectionally — no transfer needed. Side benefit: removes the echo the founder hit on Day 3 (AI was talking on both legs simultaneously because the browser AND the SIP director were both subscribed to AI's published audio track).
If caller leg is SIP (production PSTN inbound): keep the existing REFER path. Both legs are SIP, REFER works, AI's leg disconnects, two phones talking.
Write the test. The Playwright round-trip in §4.2 must include this assertion: after transfer_to_director fires, assert that within 30s either (a) TransferSIPParticipant was called and succeeded for SIP-caller scenarios, OR (b) for WebRTC-caller scenarios, both the browser participant AND a new SIP participant are visible in the room AND the AI agent participant has stopped publishing audio. Either path should result in the founder's cell + the browser hearing each other (verifiable via DB row in voice_call_logs.transcript showing user turns from BOTH legs).
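
The branch described above can be sketched as pure decision logic, which keeps it unit-testable without LiveKit. Everything named here is hypothetical: `CallerKind` is a stand-in for the SDK's participant kind, and the step names are labels for illustration, not functions in `core/transfer.py`.

```python
from enum import Enum
from typing import List


class CallerKind(Enum):
    """Hypothetical stand-in for LiveKit's participant kind."""
    STANDARD = "standard"  # WebRTC caller (browser demo)
    SIP = "sip"            # PSTN inbound caller


class BridgeStrategy(Enum):
    SIP_REFER = "sip_refer"      # existing TransferSIPParticipant path
    ROOM_NATIVE = "room_native"  # director's SIP leg joins the room, AI exits


def choose_bridge_strategy(caller_kind: CallerKind) -> BridgeStrategy:
    """REFER only when the caller leg is SIP; anything else mixes in-room."""
    if caller_kind is CallerKind.SIP:
        return BridgeStrategy.SIP_REFER
    return BridgeStrategy.ROOM_NATIVE


def plan_transfer_steps(caller_kind: CallerKind) -> List[str]:
    """Ordered step labels; wait message and bridge intro apply to both branches."""
    steps = ["play_wait_message", "dial_director_sip_leg", "speak_bridge_intro"]
    if choose_bridge_strategy(caller_kind) is BridgeStrategy.SIP_REFER:
        steps.append("transfer_sip_participant")
    else:
        steps.append("stop_agent_audio_or_leave_room")
    return steps
```

Defaulting every non-SIP kind to the room-native path is a design choice in this sketch: it guarantees REFER is never attempted against a leg that has no SIP session.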

Files in scope:

voice-agent-livekit/core/transfer.py — execute_attended_transfer is where the branch goes. Around line 580-620 (the TransferSIPParticipant call site). Add a kwarg caller_kind: ParticipantKind and branch.
voice-agent-livekit/verticals/church/agents.py — transfer_to_director on both CoordinatorAgent (line 600+) and CareAgent (line 353) needs to look up self.session.room.remote_participants for the caller leg and pass its kind into execute_attended_transfer. Caller identity is the participant who is NOT the director-transfer-* identity.
Tests in voice-agent-livekit/tests/ — add test_transfer_browser_branch.py covering the WebRTC-caller path (mocked) and test_transfer_sip_branch.py covering the PSTN-caller path (mocked, asserts REFER still fires).
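
The caller-leg lookup in `transfer_to_director` could look roughly like the sketch below. `ParticipantView` and the `"agent"` kind label are assumptions that stand in for the real participant objects; only the `director-transfer-` identity prefix comes from this handoff.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

DIRECTOR_IDENTITY_PREFIX = "director-transfer-"  # prefix named in this handoff


@dataclass(frozen=True)
class ParticipantView:
    """Hypothetical minimal view of a remote participant."""
    identity: str
    kind: str  # assumed labels: "standard", "sip", "agent"


def find_caller_leg(participants: Iterable[ParticipantView]) -> Optional[ParticipantView]:
    """Return the single participant that is neither an agent nor the director leg.

    Returns None when zero or several candidates remain, so the caller can
    refuse to branch on an ambiguous room instead of guessing.
    """
    candidates = [
        p for p in participants
        if p.kind != "agent" and not p.identity.startswith(DIRECTOR_IDENTITY_PREFIX)
    ]
    return candidates[0] if len(candidates) == 1 else None
```

In the agent this would be fed from `self.session.room.remote_participants` (assumed here to be a mapping keyed by identity, so pass its values), and the result's kind passed into `execute_attended_transfer`.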

Constraints:

Don't touch core/transfer.py's crisis gate. It applies regardless of transport.
The wait_message_template and bridge_intro_template paths apply to BOTH branches — caller hears wait message while director's SIP leg is dialing, then bridge intro spoken by AI before AI exits.
Echo elimination: as soon as the SIP director leg is established and the bridge intro finishes, the AI's published audio track must mute or the agent should leave the room. Choose one based on what LiveKit Agents v1.5 supports cleanly.

LiveKit Python docs to read first:

https://docs.livekit.io/agents/build/handoff/ — agent handoff patterns
https://docs.livekit.io/sip/transfer-cold/ — REFER semantics (we're already on canonical tel:+E164 form per bug #6 fix)
https://docs.livekit.io/sip/api/ — CreateSIPParticipantRequest, TransferSIPParticipantRequest — the second only works when source is SIP
https://docs.livekit.io/agents/voice/ — audio publishing controls (set_microphone_enabled, etc., or how to mute the agent's local TTS output)
Search for "WebRTC to SIP bridge" and "leave room agents" in livekit/agents on GitHub

Hard rule: voice agent deploy gated on founder approval AND founder presence to test live calls. Do NOT run lk agent deploy autonomously. Push to foundation and ask.

4.2 Round-trip Playwright spec (P0, gate for resuming live tests)

Build BEFORE running items 4-10 of live verification. Day 3 task #10 covers this. Spec needs to:

Load https://churchwiseai-web-git-feat-verticals-platf-116e57-church-wise-ai.vercel.app/s/walker-mortuary-daughenb-e6c1 in Playwright Chromium with launch flags --use-fake-ui-for-media-stream, --use-fake-device-for-media-stream, --use-file-for-fake-audio-capture=fixtures/test-utterance.wav. The WAV should contain "I need to speak with the funeral director, my grandmother just passed at home, my name is Test User, my number is +12268830000."
Pass permissions: ['microphone'] on context creation so getUserMedia auto-grants.
Wait for "Try the live director" button → click → wait for stage transitions: connecting → active → director-dialing → bridged.
Assert /api/livekit/token returned 200 with { token, wsUrl } (already loosely covered by the demo working at all, but pin it).
Assert RoomEvent.TrackSubscribed fired for an audio track AND a hidden <audio> element appeared in DOM (catches bug #3 regression).
Assert mic permission was granted BEFORE /api/livekit/token POST (catches bug #4 regression). Hard to assert from Playwright cleanly — settle for: assert the prompt did not race the call by checking that audio frames flowed within 3s of the click.
Assert voice_call_logs row was created within 5s with the expected slug and vertical='church' (funeral path uses church Coordinator with override — that's a known data quirk, don't change it here).
Within 60s, assert voice_call_logs.transcript for that call contains BOTH role='assistant' AND role='user' entries (catches a bug #4 regression, where the mic never actually publishes and the AI cannot hear the caller).
Within 60s, assert at least one tool call to transfer_to_director was logged in agent logs OR in a new voice_tool_calls table (TODO: this table doesn't exist yet — either grep agent logs via lk agent logs post-hoc, or add a tool-call audit table as part of this work).
Within 90s, for the WebRTC-caller path, assert that the legs were NOT bridged via REFER (with the bug #8 fix, TransferSIPParticipant must not be called here). Instead, assert there's a SIP participant in the room AND the AI agent's audio track has been muted or the agent has left. (This is the assertion that would have caught bug #8 at PR time.)
Cleanup afterAll: delete the test's demo_dial_log row, voice_call_logs row, and any voice_callback_requests/crisis_events rows the spec created.
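
The time-boxed assertions above share one shape: poll until a predicate holds or a deadline passes. The spec itself will be TypeScript; the sketch below uses Python only to pin the logic. The transcript shape (a list of entries with a `role` key) is an assumption to verify against the real `voice_call_logs.transcript` column before use.

```python
import time
from typing import Callable, Iterable, Mapping, Sequence


def has_both_roles(transcript: Iterable[Mapping[str, str]]) -> bool:
    """True when the transcript has at least one assistant and one user turn."""
    roles = {turn.get("role") for turn in transcript}
    return {"assistant", "user"} <= roles


def stages_in_order(observed: Sequence[str], expected: Sequence[str]) -> bool:
    """True when expected appears within observed as an ordered subsequence."""
    remaining = iter(observed)
    return all(stage in remaining for stage in expected)


def wait_until(predicate: Callable[[], bool], timeout_s: float,
               interval_s: float = 1.0) -> bool:
    """Poll predicate until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, the 60s transcript check becomes `wait_until(lambda: has_both_roles(fetch_transcript()), 60)`, where `fetch_transcript` is a hypothetical DB read.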

File: churchwiseai-web/e2e/cold-outreach-director-transfer.spec.ts. Use the existing Playwright config and webServer pattern.

CI integration: add the spec path to a new workflow .github/workflows/cold-outreach-director-transfer.yml, triggered on PRs touching src/components/cold-outreach/**, src/app/api/livekit/token/**, voice-agent-livekit/core/transfer.py, voice-agent-livekit/verticals/church/agents.py. Run against the deployed Vercel preview URL (NOT localhost — caller environment must be a real browser hitting a real LiveKit Cloud agent).

This is the test that catches bugs #3, #4, #5, #8 simultaneously, plus future ones in the same class. The Day 3 §6 plan said "1 failure = patch + reverify, 2+ failures = STOP"; we still hit 8 because there was no integration gate to catch the failures before live verification.

4.3 Resume live verification items 4-10 (founder-supervised)

Once §4.1 + §4.2 are landed and the Playwright spec passes against the foundation Vercel preview alias, redeploy voice agent (founder-approved) and run the items per Day 3 §6:

Item 4 retry: browser demo flip with founder's cell as director. Should now bridge cleanly (no echo, AI exits after intro, browser ↔ cell connected).
Items 2 + 3: PSTN live transfer happy path + timeout fallback. Needs founder to source a second phone. On Day 3 these were collapsed into item 4 (option 2 in that discussion); they can now run separately, or be skipped if item 4 covers the same code paths. Note that after §4.1 the REFER path is only exercised when the caller leg is SIP, so item 4 alone does not cover it.
Item 5: voice crisis test — call demo line, say "I want to end my life," assert 988 routing + crisis_events row + NO SMS to notification_phone.
Item 6: regression on all 4 customer lines.
Items 7 + 8: funeral chatbot operational handoff + crisis on funeralwiseai.com. Note: funeralwiseai.com points at production main, NOT foundation. Either run against the foundation Vercel preview alias (chatbot endpoint is /api/chatbot/stream — works on any deployment) OR merge foundation→main first.
Item 9: church chatbot regression at churchwiseai.com.
Item 10: demo church chatbot at /care/churchwiseai-demo.

4.4 GO/NO-GO on cold-email batch (founder-owned, post-§4.3)

If 10/10 verification items GREEN AND vet acceptance test deferred-or-passed AND knowledge sync clean: founder approves PR foundation→main, squash-merges, triggers FuneralWiseAI cold-email batch.

5. Founder priorities — DO NOT VIOLATE

"Long-term quality, industry-standard robust fixes — not quick fixes deferred to a high-pressure merge window." "The better the demos and the more robust the product, the conversions will be way higher so it's worth spending a few days on it to get it right." "Verify external-account state from authoritative sources, not symptoms." "30k feet with drill-down to real evidence."

If a bug surfaces during Day 4 testing that requires another architectural pivot, STOP and plan Day 5. Do not push "just one more hotfix" past 6pm ET. Pattern from Day 3: each hotfix uncovered the next layer down because the test gates didn't exist.

6. Hard rules (CLAUDE.md + memory — DO NOT VIOLATE)

NEVER touch the inbound trunk ST_Xa3Bp9aixRFP (locked).
NEVER use lk agent update-secrets --overwrite.
NEVER use lk agent restart; only lk agent deploy recovers stuck agents.
NEVER deploy the voice agent without explicit founder approval AND founder presence.
NEVER write junk test data to production Supabase. Use demo church UUID 00000000-0000-4000-a000-000000000001.
NEVER force-push.
NEVER merge a worktree's own PR or autonomously merge feat/verticals-platform-day1-foundation to main.
Crisis events NEVER attempt a SIP bridge (defense-in-depth in core/transfer.py regardless of branch).
Verify DB columns before use via information_schema.columns.
Verify external-account state from authoritative sources (LiveKit dashboard / Telnyx API / official docs), not symptoms.
Long-term quality > deferred fixes. If a fix can land in-PR, it should land in-PR.

7. Open follow-ups inherited from Day 3 (file as you go)

Task	Priority	Notes
#7 — `demo_dial_log` counts FAILED handshakes	P1	Insert moves from token-mint to participant-join callback. Founder hit this 5x today during failed dials.
#8 — Hydration mismatch in `ServiceBusinessTemplate`	P2	`new Date().getFullYear()` in `'use client'` template; FA-176 nowIso pattern fix.
#9 — Self-dial loop detection in `core/transfer.py`	P1	Block dial when target_number resolves to one of our own DIDs. Funeral directors heavily use call forwarding.
#11 — `_silence_watchdog` guard against closed session	P3	RuntimeError race in `church/agents.py:548`. Cosmetic.
#12 — AI re-prompt for valid phone when "the one I called from"	P3	Today Bobby's "the one I called in on" → AI captured `1197654590` invalid.
Voice-health cron extension (config validation)	P1	Assert outbound_voice_profile_id bound + active. From bug #7 lessons.
Daily outbound-trunk-cert cron	P2	Test dial to a Telnyx echo number daily, assert participant joins room. Keep dialing OFF the founder's cell.
Runbook entry for outbound-trunk first-dial certification	P1	Append to `knowledge/runbooks/voice-provisioning.md`.
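
For follow-up #9, the core comparison could be sketched as below. This is an illustration under stated assumptions, not the planned implementation: the function names are hypothetical, and treating a 10-digit number as NANP (prepending country code 1) is an assumption. It only catches a direct dial to one of our DIDs, not a loop created downstream by the director's own call forwarding.

```python
import re
from typing import Iterable


def _canonical_digits(number: str) -> str:
    """Digits only; a bare 10-digit number is assumed to be NANP (+1)."""
    digits = re.sub(r"\D", "", number)
    if len(digits) == 10:
        digits = "1" + digits
    return digits


def is_self_dial(target_number: str, own_dids: Iterable[str]) -> bool:
    """True when target_number matches one of our own DIDs after normalizing."""
    target = _canonical_digits(target_number)
    if not target:
        return False
    return any(target == _canonical_digits(did) for did in own_dids)
```

The check belongs before the director's SIP leg is dialed, so a blocked dial can fall through to the existing timeout fallback.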

8. References

Day 3 handoff: C:/dev/knowledge/specs/day2-verticals-platform/06-DAY3-HANDOFF.md
Day 3 in-session decisions: C:/dev/DECISION_LOG.md 2026-04-29 entry (write today as part of session close).
Memory files (existing): feedback_robustness_over_velocity.md, feedback_livekit_recovery_lk_deploy_only.md, feedback_lk_overwrite_flag_destroys_secrets.md, feedback_verify_external_account_state.md, project_livekit_cloud_plan_ship.md, user_30k_with_drilldown.md, feedback_ship_with_trust.md.
New memory files (TODO write 2026-04-29 EOD): feedback_telnyx_outbound_three_requirements.md, feedback_round_trip_test_before_merge.md.
Voice agent code (PROD): voice-agent-livekit/ — all under foundation branch.
Frontend (browser demo): src/components/cold-outreach/DirectorTransferDemo.tsx + src/app/s/[slug]/page.tsx.
LiveKit lk CLI: C:/dev/lk.exe.
Telnyx API key: C:/dev/knowledge/.env TELNYX_API_KEY (do NOT log to transcript).

9. First actions when you start Day 4

Read this entire document.
Read C:/dev/CLAUDE.md (root) + C:/dev/churchwiseai-web/CLAUDE.md.
Read the new + existing memory files in §8.
Run §3 state verification.
Greet the founder. Confirm what state you observed. Ask whether to start with §4.1 (architectural fix) or §4.2 (Playwright spec) — there's no dependency between them, both can run in parallel via subagent if founder wants speed.
DO NOT proactively run the voice agent deploy.
While waiting on founder, you CAN: write the new memory files in §8 if not already saved, update voice-provisioning.md runbook with the bug #7 lesson + outbound-trunk certification step, prep the cold-outreach-director-transfer.spec.ts skeleton.

10. What success looks like (Day 4 done)

Voice agent on a NEW version (not sNqgCxcJY23F) carrying the WebRTC↔SIP branch fix, deployed 2026-04-30 or later.
Playwright cold-outreach-director-transfer.spec.ts lands and runs green against the foundation Vercel preview alias.
10/10 live verification items GREEN.
Founder approves PR feat/verticals-platform-day1-foundation → main, squash-merges.
Founder triggers FuneralWiseAI cold-email batch.
Decision Log updated with Day 3 + Day 4 narrative.

You're cleared to start. Plan carefully — that was the founder's parting instruction.
