Expected Output Specification Methodology

The Problem This Solves

The knowledge system documents what the system DOES (features, flows, APIs). But no document defines what the customer SEES and EXPERIENCES. This gap means:

  • Agents build features that technically work but show wrong content for specific tiers
  • Tests verify code paths but not customer experience
  • Bugs like "Starter customer sees ElevenLabs voice picker" ship because no spec said "this should be hidden"
  • No one can answer "what does a Starter Chat customer's first week look like, step by step?"
  • The full setup journey (getting a product configured and working) is undocumented

The Three Layers

Layer 1: Feature Matrix (EXISTS — features.yaml, tier-restrictions.md)
"Starter gets 12 tools, 2 agents, no embed widget"

Layer 2: Process Flows (EXISTS — onboarding.md, checkout-flow.md, etc.)
"POST /api/onboard creates premium_churches row, sends email, syncs MailerLite"

Layer 3: Expected Outputs (MISSING — this is what we're building)
The COMPLETE customer journey, documented visually:
- Every way a customer discovers the product
- Every screen they see from discovery → signup → email → dashboard → setup → working product
- Every email they receive and when
- What "success" looks like at the end — the product is live and working as expected
- What each tier SEES vs what's hidden, at every step

How to Build an Expected Output Spec

Phase 1: Enumerate User States

For each product, list every possible customer state:

| State | Plan | Channel | Status | Key Flags |
| --- | --- | --- | --- | --- |
| Starter Chat (preview) | starter | chat | preview | chatbot_enabled, care_enabled |
| Starter Chat (active) | starter | chat | active | chatbot_enabled, care_enabled |
| Starter Voice (active) | starter | voice | active | has voice_agent row |
| Starter Both (active) | starter | both | active | chatbot + voice |
| Pro Chat (active) | pro | chat | active | all chatbot features |
| Pro Both (active) | pro | both | active | all features |
| Suite Chat (active) | suite | chat | active | everything except voice |
| Suite Both (active) | suite | both | active | everything |
| Pro Website | pro_website | chat | active | restricted chatbot, PewSearch template |
| Trial expired | any | any | preview (expired) | chatbot should be offline |
| Cancelled | any | any | cancelled | dashboard accessible, chatbot offline |
| Past due | any | any | past_due | grace period behavior |
| Free (PewSearch claim) | free | - | preview | basic chatbot only |
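
The state list can also be kept as typed data so that specs and tests iterate over the same rows. The following is a minimal sketch: the type names, field names, and the `statesForPlan` helper are assumptions made for illustration and are not taken from the codebase. Only the row values come from the table, with the Free row's empty channel written as `'none'`.

```typescript
// Hypothetical typed mirror of the state table. Type and field names are
// illustrative; only the row values come from the table.
type PlanId = 'free' | 'starter' | 'pro' | 'suite' | 'pro_website' | 'any';
type ChannelId = 'chat' | 'voice' | 'both' | 'any' | 'none';
type AccountStatus = 'preview' | 'preview (expired)' | 'active' | 'cancelled' | 'past_due';

interface CustomerState {
  label: string;
  plan: PlanId;
  channel: ChannelId;
  status: AccountStatus;
}

const CUSTOMER_STATES: CustomerState[] = [
  { label: 'Starter Chat (preview)', plan: 'starter', channel: 'chat', status: 'preview' },
  { label: 'Starter Chat (active)', plan: 'starter', channel: 'chat', status: 'active' },
  { label: 'Starter Voice (active)', plan: 'starter', channel: 'voice', status: 'active' },
  { label: 'Starter Both (active)', plan: 'starter', channel: 'both', status: 'active' },
  { label: 'Pro Chat (active)', plan: 'pro', channel: 'chat', status: 'active' },
  { label: 'Pro Both (active)', plan: 'pro', channel: 'both', status: 'active' },
  { label: 'Suite Chat (active)', plan: 'suite', channel: 'chat', status: 'active' },
  { label: 'Suite Both (active)', plan: 'suite', channel: 'both', status: 'active' },
  { label: 'Pro Website', plan: 'pro_website', channel: 'chat', status: 'active' },
  { label: 'Trial expired', plan: 'any', channel: 'any', status: 'preview (expired)' },
  { label: 'Cancelled', plan: 'any', channel: 'any', status: 'cancelled' },
  { label: 'Past due', plan: 'any', channel: 'any', status: 'past_due' },
  { label: 'Free (PewSearch claim)', plan: 'free', channel: 'none', status: 'preview' },
];

// States a given plan can be in, counting the plan-agnostic ("any") rows.
function statesForPlan(plan: PlanId): CustomerState[] {
  return CUSTOMER_STATES.filter((s) => s.plan === plan || s.plan === 'any');
}
```

For example, `statesForPlan('starter')` returns the four Starter rows plus the three plan-agnostic rows (trial expired, cancelled, past due), which is the set a Starter spec would need to cover.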

Phase 2: Enumerate Touchpoints

For each state, walk through EVERY customer touchpoint as a sequential journey — the order a real customer would experience them.

A. Discovery Paths — How They Find This Product

Document EVERY way a customer could arrive at this product. Each path is a mini-journey:

  1. Google search → which page do they land on?
  2. PewSearch directory → banner/CTA they see → where it takes them
  3. Denomination landing page (e.g., /ai-for/baptist) → CTA → destination
  4. Blog post → CTA → destination
  5. Pricing page → which plan card → what CTA text
  6. Homepage → which section → what CTA
  7. Chatbot/voice product page → CTA → destination
  8. Peer referral / direct URL → where they land
  9. Facebook/Instagram ad → landing page
  10. PewSearch admin banner (existing PewSearch customer) → upsell CTA

For each path: screenshot the entry page, the CTA, and where it leads.

B. Pre-Purchase Journey (sequential)

  11. Marketing/landing page they see (exact copy, images, CTA placement)
  12. Onboard form Step 1: Search for church (what results look like, "already claimed" state)
  13. Onboard form Step 2: Select/create church (what fields, what's pre-filled)
  14. Onboard form Step 3: Contact info (what fields, plan pre-selected?)
  15. Stripe checkout page (trial badge? amount shown? promo code field?)
  16. Checkout success/confirmation page (what do they see immediately after paying?)

C. Email Journey (in order of receipt)

  17. Pre-checkout welcome email (Resend): subject, copy, CTA
  18. Post-checkout welcome email with magic link (Resend): subject, copy, link text
  19. Stripe receipt email: what it shows
  20. Lifecycle Email System (Resend + cron), Day 0 welcome + starter kit (immediate): subject, copy, what it encourages them to do
  21. Lifecycle Email System (Resend + cron), Day 2 setup nudge: subject, copy
  22. Lifecycle Email System (Resend + cron), Day 7 activation check: subject, copy
  23. Lifecycle Email System (Resend + cron), Day 13 trial reminder: subject, copy
  24. Notification emails (prayer request received, visitor contact, callback request, care escalation)

D. First Login & Dashboard Discovery (sequential)

  25. Magic link click → what page loads
  26. Dashboard header (plan badge, status badge, View Page link, upgrade button)
  27. Tab navigation (which tabs visible, which order)
  28. Overview tab, first impression (stats cards at zero, getting started prompts, share links, upsell cards)
  29. Getting Started checklist: what tasks appear, in what order, progress indication

E. Setup Journey — Getting the Product Working (sequential)

This is the most critical section. Walk through every step a customer takes to get their product configured and live:

  30. Training tab > Church Knowledge: adding church description, service times, staff info
  31. Training tab > This Week: adding upcoming events and announcements
  32. Training tab > FAQs: adding custom Q&A (locked for some tiers?)
  33. Training tab > Theology: selecting denomination/tradition
  34. Training tab > Agents: configuring agent personalities (which agents visible? voice picker?)
  35. Training tab > Safety: reviewing crisis protocols, notification settings
  36. Training tab > Simulator: testing the chatbot (what does the test interface look like?)
  37. Training tab > Training Progress: checklist showing setup completion
  38. Settings tab > Church Profile: uploading logo, editing name/description
  39. Settings tab > Hours: adding service times and office hours
  40. Settings tab > Notifications: configuring who receives alerts (crisis protocol visible?)
  41. Settings tab > Integrations: connecting Cal.com, Planning Center (which locked by tier?)
  42. Settings tab > Team: inviting team members (what roles available?)

F. Public-Facing Product Pages (what the church's visitors see)

  43. Hosted chat page /chat/[slug]: layout, branding, input field
  44. Care hub /care/[slug]: layout, agent cards, subscribe option
  45. Care subscribe /care/[slug]/subscribe: email signup form
  46. Agent-specific chat /care/[slug]/[agent]: direct agent access
  47. Embed widget on church website (if applicable): positioning, branding badge, mobile behavior
  48. Pro Website vanity page (if applicable): hero, sections, chatbot widget placement

G. Ongoing Dashboard Use

  49. Calls tab (visible? call history, transcripts, tools used)
  50. Requests tab (prayer requests, callbacks, visitor contacts, all in one view)
  51. Care tab (care_enabled? broadcast messaging, member list)
  52. Social tab (connected accounts, scheduled posts, or locked?)
  53. Upgrade tab (current plan display, comparison table, upgrade buttons)
  54. Analytics (if applicable: chat volume, response quality, tool usage)

H. Lifecycle Events

  55. Approaching usage limit (what warning banner/email?)
  56. Usage limit reached (what message, what's disabled?)
  57. Trial expiring, Day 10 (email + dashboard banner)
  58. Trial expired, Day 15 (chatbot offline? dashboard access? upgrade CTA?)
  59. Payment failed (email, dashboard state, grace period?)
  60. Cancellation (email, dashboard state, data retention, chatbot offline?)
  61. Upgrade (what changes immediately? new tabs appear? features unlock?)
  62. Downgrade (what gets locked? existing data preserved? upgrade CTAs appear?)
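
Touchpoints are numbered globally across categories A through H, so a small lookup helps when a spec or interview note refers to "Touchpoint N". The sketch below derives each category's number range from the category sizes listed above; the type and function names are hypothetical. With these sizes the Setup Journey (E) occupies 30 to 42, consistent with F starting at 43, and the total is 62.

```typescript
// Category sizes counted from the Phase 2 lists (A through H).
interface TouchpointCategory {
  letter: string;
  title: string;
  count: number;
}

const TOUCHPOINT_CATEGORIES: TouchpointCategory[] = [
  { letter: 'A', title: 'Discovery Paths', count: 10 },
  { letter: 'B', title: 'Pre-Purchase Journey', count: 6 },
  { letter: 'C', title: 'Email Journey', count: 8 },
  { letter: 'D', title: 'First Login & Dashboard Discovery', count: 5 },
  { letter: 'E', title: 'Setup Journey', count: 13 },
  { letter: 'F', title: 'Public-Facing Product Pages', count: 6 },
  { letter: 'G', title: 'Ongoing Dashboard Use', count: 6 },
  { letter: 'H', title: 'Lifecycle Events', count: 8 },
];

interface TouchpointRange {
  letter: string;
  first: number;
  last: number;
}

// Assign consecutive global numbers to each category, starting at 1.
function touchpointRanges(categories: TouchpointCategory[]): TouchpointRange[] {
  const ranges: TouchpointRange[] = [];
  let next = 1;
  for (const c of categories) {
    ranges.push({ letter: c.letter, first: next, last: next + c.count - 1 });
    next += c.count;
  }
  return ranges;
}

// Category letter for a global touchpoint number, or undefined if out of range.
function categoryOfTouchpoint(n: number): string | undefined {
  for (const r of touchpointRanges(TOUCHPOINT_CATEGORIES)) {
    if (n >= r.first && n <= r.last) return r.letter;
  }
  return undefined;
}
```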

Phase 3: Define Expected Output for Each Touchpoint

For each touchpoint, the agent pre-populates a draft by reading the code and production site, then the founder confirms or corrects. This is NOT open-ended — it's "here's what I see, is this right?"

Template:

## [Touchpoint Name]
State: [which user states this applies to]
Page/Component: [URL or component name]
Screenshot: [path to screenshot in knowledge/acceptance/screenshots/]

### Should See:
- [exact element, text, link, or behavior]

### Should NOT See:
- [exact element that must be hidden/absent]

### Conditional:
- IF [condition]: show [X]
- IF [condition]: hide [Y]

### Links:
- [button/link name] → [exact URL it should go to]

### Copy:
- [exact text that should appear, especially for tier-specific messaging]

### Success Criteria:
- [what "done right" looks like for this touchpoint — the customer's expectation]
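
If specs are ever parsed into structured data, the template maps naturally onto a typed object, and a draft can be checked for empty sections before the founder interview. This is a sketch only: the `TouchpointSpec` shape and the choice of which sections count as required are assumptions, not rules stated by this methodology (apart from the screenshot requirement in Key Principles).

```typescript
// Hypothetical shape mirroring the template headings; not an existing type in the codebase.
interface TouchpointSpec {
  name: string;
  states: string[];
  pageOrComponent: string;
  screenshot: string;
  shouldSee: string[];
  shouldNotSee: string[];
  conditional: string[];
  links: string[];
  copy: string[];
  successCriteria: string[];
}

// Assumption: a draft is ready for review only when these sections are filled in.
function missingSections(spec: TouchpointSpec): string[] {
  const missing: string[] = [];
  if (spec.states.length === 0) missing.push('State');
  if (spec.pageOrComponent.trim() === '') missing.push('Page/Component');
  if (spec.screenshot.trim() === '') missing.push('Screenshot');
  if (spec.shouldSee.length + spec.shouldNotSee.length === 0) {
    missing.push('Should See / Should NOT See');
  }
  if (spec.successCriteria.length === 0) missing.push('Success Criteria');
  return missing;
}

// Example draft: visibility rule captured, but no screenshot or success criteria yet.
const callsTabDraft: TouchpointSpec = {
  name: 'Calls tab',
  states: ['Starter Chat (active)'],
  pageOrComponent: 'Dashboard tab navigation',
  screenshot: '',
  shouldSee: [],
  shouldNotSee: ['Calls tab (no voice agent)'],
  conditional: [],
  links: [],
  copy: [],
  successCriteria: [],
};
```

Calling `missingSections(callsTabDraft)` reports the screenshot and success criteria as still missing, which is the kind of gap the completeness pass is meant to catch.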

Discovery Path Template:

## Discovery Path: [Name]
Entry point: [Google ad / PewSearch banner / denomination page / etc.]
Landing URL: [exact URL they arrive at]

### Journey:
1. [What they see first] — Screenshot: [path]
2. [What CTA they click] — text: "[exact button text]"
3. [Where it takes them] — Screenshot: [path]
4. [Next action] → leads to Touchpoint [N] (onboard form)

Setup Step Template:

## Setup Step [N]: [Name]
Tab: [Training > Agents / Settings > Hours / etc.]
Page/Component: [component name]
Screenshot (before): [empty/default state]
Screenshot (after): [configured state]

### What the customer does:
- [step-by-step actions they take]

### What they see when done:
- [confirmation message, updated UI, progress indicator]

### Success Criteria:
- [what "working correctly" looks like from the customer's perspective]

Phase 4: Build Tests from Specs

Each expected output becomes one or more Playwright assertions:

// From spec: "Calls tab: MUST NOT be visible (no voice agent)"
await expect(page.getByRole('tab', { name: 'Calls' })).not.toBeVisible();

// From spec: "Header shows 'View Chat Page' linking to churchwiseai.com/chat/[slug]"
const viewPageLink = page.getByRole('link', { name: 'View Chat Page' });
await expect(viewPageLink).toBeVisible();
await expect(viewPageLink).toHaveAttribute('href', /churchwiseai\.com\/chat\//);
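
When a spec holds many "Should See" and "Should NOT See" entries, the assertions can be generated rather than hand-written. The sketch below turns a visibility rule into one line of Playwright test source; `VisibilityRule` and `toAssertion` are hypothetical names, and the only Playwright calls it emits are the ones already used above (`getByRole`, `toBeVisible`, `not.toBeVisible`).

```typescript
// Hypothetical structured form of a "Should See" / "Should NOT See" entry.
interface VisibilityRule {
  role: 'tab' | 'link' | 'button';
  name: string;
  visible: boolean;
}

// Render one rule as a line of Playwright test source.
// JSON.stringify is used so names containing quotes are escaped safely.
function toAssertion(rule: VisibilityRule): string {
  const locator =
    'page.getByRole(' + JSON.stringify(rule.role) + ', { name: ' + JSON.stringify(rule.name) + ' })';
  const matcher = rule.visible ? 'toBeVisible()' : 'not.toBeVisible()';
  return 'await expect(' + locator + ').' + matcher + ';';
}

// Rules corresponding to the two spec lines quoted above.
const starterChatRules: VisibilityRule[] = [
  { role: 'tab', name: 'Calls', visible: false },
  { role: 'link', name: 'View Chat Page', visible: true },
];

const generatedLines: string[] = starterChatRules.map(toAssertion);
```

The generated lines still need to be placed inside a test that signs in and opens the right dashboard; that setup is outside what the spec describes and is not sketched here.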

Phase 5: Maintain

When ANY code change affects a touchpoint:

  1. Update the expected output spec FIRST
  2. Update the E2E test to match
  3. Then change the code
  4. Run the test to verify
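
The ordering above could be backed by a pre-merge check over the list of changed files. The following is a rough sketch under stated assumptions: spec files are the `.md` files directly under knowledge/acceptance/, E2E tests are files ending in `.spec.ts` or `.test.ts`, and every other file outside knowledge/ counts as code. It cannot tell whether a code change actually affects a touchpoint, so it will over-report for changes that touch no customer-facing surface.

```typescript
interface ChangeSummary {
  specChanged: boolean;
  testChanged: boolean;
  codeChanged: boolean;
}

// Classify changed paths. The path conventions for tests and code are assumptions.
function summarizeChanges(paths: string[]): ChangeSummary {
  const summary: ChangeSummary = { specChanged: false, testChanged: false, codeChanged: false };
  for (const p of paths) {
    if (/^knowledge\/acceptance\/[^/]+\.md$/.test(p)) {
      summary.specChanged = true;
    } else if (/\.(spec|test)\.ts$/.test(p)) {
      summary.testChanged = true;
    } else if (!/^knowledge\//.test(p)) {
      summary.codeChanged = true;
    }
  }
  return summary;
}

// Report which maintenance steps appear to have been skipped.
function maintenanceViolations(paths: string[]): string[] {
  const s = summarizeChanges(paths);
  const problems: string[] = [];
  if (s.codeChanged && !s.specChanged) problems.push('code changed without a spec update');
  if (s.specChanged && !s.testChanged) problems.push('spec changed without an E2E test update');
  return problems;
}
```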

File Structure

Expected output specs live in knowledge/acceptance/:

knowledge/acceptance/
  README.md
  starter-chat.md ← First one we build
  starter-voice.md
  starter-both.md
  pro-chat.md
  pro-both.md
  suite-chat.md
  suite-both.md
  pro-website.md
  free-claim.md
  trial-expired.md
  cancelled.md
  screenshots/ ← Visual documentation
    starter-chat/
      discovery-*.png
      onboard-*.png
      checkout-*.png
      email-*.png
      dashboard-*.png
      setup-*.png
      public-*.png
    pro-both/
      ...

Each spec file covers 62 touchpoints across 8 categories (A-H), each with screenshots, and follows the Phase 3 templates above.

The Interview Process

Building a spec is a 3-stage process: research, interview, verification.

Stage 1: Agent Research (subagents, no founder needed)

Before the interview, agents pre-populate the entire spec by reading code and production pages:

  1. Discovery researcher — Visit every marketing page, landing page, denomination page, pricing page. Document every CTA that leads to this product. Screenshot each discovery path.
  2. Dashboard researcher — Read all dashboard components for this tier. For each tab/sub-tab, document what's visible and what's hidden based on tier-config.ts, tool-config.ts, agent-type-config.ts. Screenshot the production dashboard.
  3. Journey researcher — Walk the full journey: onboard form → checkout → email templates → first dashboard load → setup steps. Screenshot each step.
  4. Completeness validator — Cross-check all 62 touchpoints are accounted for. Flag any gaps. Verify every property (churchwiseai.com, pewsearch.com, etc.) is covered.

Output: A DRAFT spec with every touchpoint pre-filled and screenshots captured.
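
The completeness validator's core check is simple enough to express directly. Below is a minimal sketch, assuming the draft has been parsed into entries that carry a global touchpoint number and an optional screenshot path; the `DraftEntry` shape and `findGaps` name are hypothetical.

```typescript
// Hypothetical parsed form of one draft touchpoint.
interface DraftEntry {
  id: number; // global touchpoint number, 1 through 62
  screenshot?: string; // path under knowledge/acceptance/screenshots/, if captured
}

const TOTAL_TOUCHPOINTS = 62;

interface GapReport {
  missing: number[]; // touchpoints with no entry at all
  noScreenshot: number[]; // touchpoints present but without a screenshot
}

// Cross-check a draft against the full 1..62 touchpoint range.
function findGaps(draft: DraftEntry[]): GapReport {
  const seen: { [id: number]: DraftEntry } = {};
  for (const entry of draft) {
    seen[entry.id] = entry;
  }
  const report: GapReport = { missing: [], noScreenshot: [] };
  for (let id = 1; id <= TOTAL_TOUCHPOINTS; id++) {
    const entry: DraftEntry | undefined = seen[id];
    if (!entry) {
      report.missing.push(id);
    } else if (!entry.screenshot) {
      report.noScreenshot.push(id);
    }
  }
  return report;
}
```

A draft is ready for the founder interview when both lists come back empty.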

Stage 2: Founder Interview (rapid yes/no/tweak)

Walk through the draft spec with the founder. For each touchpoint:

  1. Present what was found: "Touchpoint 34 (Training tab, Agents): I see Care Agent (expandable), Coordinator (expandable), Discipleship (locked), Stewardship (locked). No voice picker. Correct?"
  2. Founder says: yes / no, change X / also add Y
  3. Capture corrections immediately in the spec

This turns a 90-minute open-ended interview into ~30 minutes of confirmation.

Stage 3: Test Generation & Verification

  1. Generate Playwright E2E test from the approved spec
  2. Run the test against production
  3. Fix any failures — the spec is right, the code is wrong
  4. Commit both spec and test together

Screenshot Convention

Screenshots live in knowledge/acceptance/screenshots/[tier]/:

knowledge/acceptance/screenshots/
  starter-chat/
    discovery-pricing-page.png
    onboard-step1-search.png
    onboard-step2-contact.png
    checkout-stripe.png
    email-welcome.png
    dashboard-overview-first-visit.png
    dashboard-training-agents.png
    setup-hours-before.png
    setup-hours-after.png
    chat-page-public.png
    ...
  pro-both/
    ...

Screenshots should be captured from production at the time of spec creation and updated whenever the spec is updated.
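
A small helper can keep screenshot paths consistent between specs and capture scripts. This sketch builds a path from a tier directory and a screenshot name; the lowercase kebab-case rule is an assumption inferred from the example filenames, not a stated convention, and `screenshotPath` is a hypothetical name.

```typescript
const SCREENSHOT_ROOT = 'knowledge/acceptance/screenshots';

// Build the path for a screenshot, rejecting names that break the
// (assumed) lowercase kebab-case pattern seen in the examples.
function screenshotPath(tier: string, shotName: string): string {
  const kebab = /^[a-z0-9]+(-[a-z0-9]+)*$/;
  if (!kebab.test(tier) || !kebab.test(shotName)) {
    throw new Error('tier and screenshot name must be lowercase kebab-case');
  }
  return SCREENSHOT_ROOT + '/' + tier + '/' + shotName + '.png';
}
```

For instance, `screenshotPath('starter-chat', 'setup-hours-before')` yields the path used for the "before" state of the Hours setup step.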

Key Principles

  • The FOUNDER defines expected outputs, not agents — but agents propose drafts from code research
  • Expected outputs are the source of truth for BOTH code and tests
  • If code doesn't match the spec, the code is wrong (not the spec)
  • If a spec is missing, no agent should build the feature until the spec exists
  • Specs are living documents — update them BEFORE changing code
  • Every touchpoint should have a screenshot — visual documentation is not optional
  • The full journey matters — not just individual screens, but the sequential experience from discovery to working product