Enforce knowledge↔code sync via a CI gate on every code PR

Status

DECIDED

Context

The knowledge/ repo is meant to be the single source of truth for every customer-facing flow, architecture decision, and operational runbook. The chatbot and voice agent both read from it at runtime, and docs.churchwiseai.com renders it publicly. Staying accurate therefore matters in three ways at once: live AI product behavior, investor / hire-facing documentation, and the founder's own institutional memory.

CLAUDE.md rule #16 already instructed agents to update a knowledge doc in the same commit that modifies any file listed in that doc's code-files: frontmatter, but enforcement was voluntary. In practice, drift accumulated silently: a PR would edit src/app/api/stripe/webhook/route.ts without touching the 16 knowledge docs that reference it, and the next agent to read those docs would be working from a stale model of the system. By the time the drift surfaced (usually via a failed pnpm derive --check or a confused pastor-facing FAQ answer), tracing it back to the originating PR was expensive.

Existing enforcement solved the opposite direction: churchwiseai-web/.github/workflows/derive-on-push.yml fires on knowledge/data/*.yaml changes and regenerates code artifacts (pricing.ts, PricingGrid.tsx, etc.). That's YAML→code. The code→KB direction had no gate.

Decision

Add a knowledge-sync-gate.yml workflow to every code repo (churchwiseai-web, pewsearch, sermon-illustrations). On every PR to the deploy branch:

  1. Clone ChurchWiseAI/knowledge@master.
  2. Run the new knowledge/scripts/changed-files-to-docs.ts primitive, piping the PR's git diff --name-only output to stdin and passing the repo name as a prefix (e.g. --repo-prefix=churchwiseai-web).
  3. If zero knowledge docs match, the gate passes silently.
  4. If one or more docs match, the gate requires ONE of:
    • Label knowledge-sync-updated — the agent attests that the paired knowledge PR is in flight or already merged.
    • Label knowledge-sync-override + a PR comment whose body starts with reason: — deliberate bypass, restricted to refactors, renames, and dead-code removal that provably cannot alter behavior.
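
The pass/fail rule in steps 3 and 4 can be sketched as a pure function. This is a minimal illustration, not the workflow's actual code: the GateInput shape, the evaluateGate name, and the strict startsWith check (no trimming of leading whitespace) are assumptions. Only the two label names and the reason: prefix come from the decision above.

```typescript
// Hypothetical input shape; the real workflow's data sources are not shown in this record.
type GateInput = {
  matchedDocs: string[];   // knowledge docs whose code-files: entries matched the diff
  labels: string[];        // labels currently on the PR
  commentBodies: string[]; // bodies of the PR's comments
};

type GateResult = { pass: boolean; reason: string };

const UPDATED_LABEL = "knowledge-sync-updated";
const OVERRIDE_LABEL = "knowledge-sync-override";

function evaluateGate(input: GateInput): GateResult {
  // Step 3: nothing matched, so the gate passes silently.
  if (input.matchedDocs.length === 0) {
    return { pass: true, reason: "no knowledge docs reference the changed files" };
  }
  // Step 4, option 1: attestation label.
  if (input.labels.includes(UPDATED_LABEL)) {
    return { pass: true, reason: "attested: paired knowledge PR in flight or merged" };
  }
  // Step 4, option 2: override label plus a comment that starts with "reason:".
  if (input.labels.includes(OVERRIDE_LABEL)) {
    const hasReason = input.commentBodies.some((body) => body.startsWith("reason:"));
    return hasReason
      ? { pass: true, reason: "override with a stated reason" }
      : { pass: false, reason: "override label needs a comment starting with reason:" };
  }
  return {
    pass: false,
    reason: `${input.matchedDocs.length} knowledge doc(s) matched and no sync label is set`,
  };
}
```

Keeping the rule free of API calls means it can be unit-tested without a live PR; fetching labels and comments would sit in a thin wrapper around it.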

The gate reads from master, so the correct workflow for an agent touching gated files is:

knowledge PR edits the affected doc(s) → merge to master
→ open code PR → CI gate sees the matching doc is now fresh
→ add knowledge-sync-updated label → merge code PR

The primitive is the reusable building block. It emits structured JSON and a human-readable text format, accepts a repo-prefix flag, and handles exact-path, directory-prefix (bare), and directory-prefix (trailing-slash) match semantics uniformly.
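
The three match semantics can be illustrated with a short sketch. It assumes that code-files: entries are stored with the repo name as their first path segment and that the repo prefix is joined to each changed path with a slash; the KnowledgeDoc shape and both function names are hypothetical, not the primitive's real interface.

```typescript
// Hypothetical shape: only the code-files: list itself is named in the decision record.
interface KnowledgeDoc {
  path: string;        // path of the knowledge doc
  codeFiles: string[]; // entries from its code-files: frontmatter
}

// True when a code-files: entry covers a changed path.
function entryMatches(entry: string, changedPath: string): boolean {
  // Exact-path match.
  if (entry === changedPath) return true;
  // Directory-prefix match: normalise bare and trailing-slash entries to the
  // same form so "src/app" cannot accidentally match "src/application/...".
  const dirPrefix = entry.endsWith("/") ? entry : `${entry}/`;
  return changedPath.startsWith(dirPrefix);
}

// Returns the paths of all docs that reference at least one changed file.
function docsForChangedFiles(
  docs: KnowledgeDoc[],
  changedPaths: string[],
  repoPrefix: string,
): string[] {
  // Assumption: the prefix and the repo-relative path are joined with "/".
  const prefixed = changedPaths.map((p) => `${repoPrefix}/${p}`);
  return docs
    .filter((doc) =>
      doc.codeFiles.some((entry) => prefixed.some((p) => entryMatches(entry, p))),
    )
    .map((doc) => doc.path);
}
```

Normalising every directory entry to a trailing slash before comparing is what lets the bare and trailing-slash forms behave identically.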

Additionally, knowledge/.github/workflows/trigger-docs-rebuild.yml is split into drift-guard → trigger jobs. The Vercel webhook only fires if pnpm derive --check passes. Drift never publishes to docs.churchwiseai.com again.

A nightly freshness-nag.yml cron complements the PR-time gate: it regenerates FRESHNESS_REPORT.md and opens or updates a single rolling issue listing docs whose last-verified date is more than 60 days old. The gate catches drift; the nag catches neglect.
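
The 60-day staleness check can be sketched as follows. The DocMeta shape and the staleDocs name are hypothetical, last-verified is assumed to be an ISO date string, and treating undated or unparseable values as stale is an assumption rather than confirmed behavior of the cron.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical shape: lastVerified mirrors the last-verified frontmatter field,
// assumed here to be an ISO date such as "2026-01-31".
type DocMeta = { path: string; lastVerified: string };

function staleDocs(docs: DocMeta[], now: Date, maxAgeDays = 60): DocMeta[] {
  return docs.filter((doc) => {
    const verifiedMs = Date.parse(doc.lastVerified);
    // Assumption: a missing or unparseable date counts as stale.
    if (Number.isNaN(verifiedMs)) return true;
    const ageDays = (now.getTime() - verifiedMs) / MS_PER_DAY;
    // Strictly greater than the threshold, matching "> 60 days".
    return ageDays > maxAgeDays;
  });
}
```

Passing now as a parameter rather than reading the clock inside the function keeps the check deterministic under test.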

Rationale

Why hard-block and not warn:

  • Voluntary enforcement failed for three months. The 16 undated docs, the pre-decouple Pro Website references, and the silently-broken processes/email-triage.md → chief-of-staff/SKILL.md pointer are all evidence that agents don't reliably self-police.
  • The override label is cheap (one click + one comment). It isn't a barrier for legitimate refactors — it's a forcing function to think about whether the refactor touched behavior.
  • The critical-path gate from 2026-04-14 already established the exact pattern (label-based override with reason). Agents know the shape; adoption cost is low.

Why attestation, not verification:

  • The "correct" verification would assert that a paired commit on knowledge/master touches the matched doc's path. That's achievable (parse the gh API) but brittle — when should the assertion fire? Before the code PR opens? When labeled? What if the knowledge edit happens after labeling?
  • The derive --check drift-guard on the rebuild webhook catches dishonest attestations at publish time (mismatched YAML/code verified targets will fail). Combined with the freshness nag, the feedback loop is tight enough that attestation + trust is better than over-engineered verification for v1.

Consequences

  • Good:
    • Drift stops accumulating silently. Every PR surfaces the affected docs, and the label makes the choice explicit (update, override, or rollback).
    • docs.churchwiseai.com stays internally consistent. The drift-guard ensures a drifted knowledge/ never renders as "current" on the public site.
    • The rolling freshness issue gives the founder a standing queue of docs to sweep during triage time, independent of agent PR activity.
  • Bad:
    • Two-PR dance for every non-trivial code change. Feels heavier; costs about 30 seconds per PR in practice.
    • KNOWLEDGE_REPO_TOKEN secret is another thing to rotate (FA-083).
    • The gate adds ~30s to CI on code PRs (clone + install + query).
    • Wrong override reasons are undetectable by CI. Relies on social enforcement from PR review.
  • Reversible? Yes — delete the three workflow files + revert the decision log + revert rule #16. The primitive and the drift-guard are keepers regardless.

Alternatives considered

  • Warning comment, no block — low friction but the drift history above proves warnings don't work without enforcement.
  • Companion-PR verification via GitHub API — more honest but brittle to timing. Deferred to a future iteration if attestation drift becomes a real problem.
  • Git pre-commit hooks — can't be enforced across multiple machines and agent environments. CI is the only universal checkpoint.
  • Submodule knowledge/ into each code repo — would make the PRs atomic but every code PR would see every knowledge commit, bloating blame, PR scope, and review cost. Rejected.

References

  • Primitive: knowledge/scripts/changed-files-to-docs.ts
  • Drift-guard workflow: knowledge/.github/workflows/trigger-docs-rebuild.yml
  • Freshness cron: knowledge/.github/workflows/freshness-nag.yml
  • Code-repo gates: .github/workflows/knowledge-sync-gate.yml in each of churchwiseai-web, pewsearch, sermon-illustrations.
  • Related process: processes/agent-quality-principles.md
  • Related decisions: 2026-04-14-critical-path-gate
  • Founder action: FA-083 (token setup + merge order)
  • Plan: C:/Users/johnm/.claude/plans/docs-churchwiseai-com-is-way-stale-sharded-hummingbird.md