Skip to main content

HNSW Vector Index on unified_rag_content.embedding

Decision

Add an HNSW ANN (Approximate Nearest Neighbor) index to public.unified_rag_content(embedding) using vector_cosine_ops.

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_unified_rag_content_embedding
ON public.unified_rag_content
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Context

The unified_rag_content table holds ~327K rows and is the backing store for both the SermonWise AI RAG pipeline and the ChurchWiseAI chatbot RAG retrieval path. The DBA-audit subagent (2026-05-05) confirmed that no ANN vector index existed — every cosine similarity search without a pericope ILIKE pre-filter was executing an O(N) exact KNN scan across all 327K embedding vectors.

At this scale, exact KNN scans take approximately 600–700ms p90. This is a silent performance drain: queries "work" but are an order of magnitude slower than they need to be, and the cost compounds as the table grows.

Why HNSW over IVFFlat

CriterionHNSWIVFFlat
Training step requiredNoYes (CREATE INDEX requires a pre-built centroid list via VACUUM ANALYZE)
Recall @ ef_search=40~99%~95–97%
Sensitivity to bulk insertsLow (inserts re-wire graph incrementally)High (bulk inserts degrade recall until VACUUM+ANALYZE re-clusters)
Build time for ~327K rows~5–15 minSimilar
Memory at query timeHigher (ef_search parameter trades RAM for recall)Lower

HNSW is the better choice for this workload because:

  1. No training step — index can be created live with CONCURRENTLY
  2. Better recall at the same query latency budget
  3. unified_rag_content receives continuous incremental inserts (new church KB entries, sermon illustrations); IVFFlat recall degrades between VACUUM cycles, HNSW does not
  4. pgvector HNSW support is stable as of pgvector ≥ 0.5.0

Build Parameters

ParameterValueRationale
m16Connections per node. Standard for 300K-1M vectors. Higher = better recall, more RAM
ef_construction64Build-time search depth. Standard for ~300K vectors. Higher = better graph quality, slower build
ef_search(default 40)Can be tuned per-query via SET hnsw.ef_search = N if recall vs. latency tradeoff needs adjustment

Expected Impact

MetricBeforeAfter
RAG query p90 (unfiltered cosine search)~650ms<50ms
RAG query p90 (with pericope pre-filter)Already using btree on pericope; unchanged
Table write throughputUnchanged (index maintained incrementally)
Table read throughput during buildUnchanged (CONCURRENTLY guarantee)

Application Notes

Critical: CREATE INDEX CONCURRENTLY cannot run inside a transaction. The Supabase SQL Editor wraps all statements in a transaction by default — do NOT use the SQL Editor unless you uncheck "Wrap in transaction."

Preferred application method:

psql "postgresql://postgres.wrwkszmobuhvcfjipasi:<DB_PASSWORD>@aws-0-<region>.pooler.supabase.com:5432/postgres" \
-f churchwiseai-web/migrations/2026-05-05_unified_rag_hnsw_index.sql

The DB password is available in the Supabase dashboard under Project Settings → Database → Connection string.

Alternative: Supabase dashboard → SQL Editor → uncheck "Wrap in transaction" checkbox → paste the CREATE INDEX CONCURRENTLY ... statement only.

Monitor build progress:

SELECT * FROM pg_stat_progress_create_index;

Post-Build Verification

After the index appears in pg_indexes, verify the query planner uses it:

-- Get a real probe vector first:
SELECT embedding FROM unified_rag_content WHERE embedding IS NOT NULL LIMIT 1;

-- Then run EXPLAIN ANALYZE with that vector:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, embedding <=> '<probe-vector>'::vector AS dist
FROM unified_rag_content
ORDER BY dist
LIMIT 6;
-- Expected: "Index Scan using idx_unified_rag_content_embedding"
-- Expected runtime: <50ms

Migration File

churchwiseai-web/migrations/2026-05-05_unified_rag_hnsw_index.sql