72 vitest files, 1241 test cases (up from 30 at session start: 1211 added across ~76 helper modules). 100% pass rate, ~7.7 s runtime. Coverage spans 26 features + 3 infra layers. 14 cross-feature dedupes complete. STRUCTURAL refactor: discover gridNodes Steps 1-5 landed (5 byte-identical adapters lifted, -252 source LoC, +26 tests). 26 worker subagents ran across 15 swarm waves. Final auditor verdicts: SHIP-READY, no regressions, and no transient HEAD breaks made it into published history. Remaining open work needs human decisions: Mark's sign-off on the discoverCompany filter divergence for gridNodes Step 6, and Charles/Robert's review of PR #343, which is mechanically MERGEABLE/CLEAN with build PASS / lint PASS / 0 skipped tests. Separately, the L65→L73 perf wave removed ~5 MB from the first-paint critical path, added crash safety (ErrorBoundary 1→4), and landed 5 React perf fixes + 7 latent bug fixes (see perf-wins.html).
Component-render tests in this codebase would be expensive to write and maintain: the AE Renderer mounts via OpenUI Lang strings (not direct JSX), MCQ chips trigger DOM events that wire through ref-based handlers, and the dispatch flow is multi-step async. Capturing all of this in vitest with happy-dom is possible but slow + fragile. The faster + more truthful gate is agent-browser screenshot against the live canvas.
✓ pnpm exec tsc --noEmit          # typecheck (catches shape mismatches)
✓ pnpm exec biome lint            # static analysis (catches deps, a11y, complexity)
✓ pnpm test                       # vitest (catches logic regressions in helpers + hooks)
✓ agent-browser screenshot        # live canvas dogfood (catches render + interaction regressions)
Each gate catches a different class of bug. Vitest is the cheap one: it runs in 1.13 s with no flake risk. Canvas dogfood is the expensive one: it needs the dev server, WorkOS auth, and a real browser, but it catches the render and interaction regressions that vitest can't see.
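The cheap-to-expensive ordering of the gates can be sketched as a small fail-fast runner. This is an illustration only: the report does not say the gates are chained or run in this order, and `Gate`, `Exec`, and `runGates` are hypothetical names. The command strings are copied from the gate list; `agent-browser screenshot` likely needs further arguments that are not shown here.

```typescript
// A gate is a label plus the command line to run (commands taken from the gate list).
type Gate = { name: string; command: string };

const gates: Gate[] = [
  { name: "typecheck", command: "pnpm exec tsc --noEmit" },
  { name: "lint", command: "pnpm exec biome lint" },
  { name: "unit", command: "pnpm test" },
  { name: "canvas dogfood", command: "agent-browser screenshot" },
];

// `exec` is injected so the ordering logic can be exercised without spawning
// real processes. It returns the command's exit code (0 means pass).
type Exec = (command: string) => number;

// Run gates in order and stop at the first failure, so the expensive
// browser gate only runs once the cheap gates are green (assumed policy).
function runGates(
  list: Gate[],
  exec: Exec,
): { passed: string[]; failed: string | null } {
  const passed: string[] = [];
  for (const gate of list) {
    if (exec(gate.command) !== 0) {
      return { passed, failed: gate.name };
    }
    passed.push(gate.name);
  }
  return { passed, failed: null };
}
```

In a real script, `exec` could wrap `spawnSync` from `node:child_process` and return its `status`, treating `null` as a failure.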
Each of these gets caught by the canvas dogfood gate today, but a vitest case would catch it 1000× faster on every commit.
Ranked by leverage (line-coverage gain × regression risk):
Six helper modules were extracted and tested in the same session in which the gap was documented (or surfaced). Three came from the original next-steps queue (sanitizeAskBackReasoning, parseConversationId, networkErrorMessage), plus three bonus extractions during the inline-helper sweep: adaptContactCard suite (event-renderer.tsx), format-helpers x4 (chat-shell.tsx), contact-card-utils x4 (components.tsx). One queue item, the openui05DispatchFn validator, is wrapped in createServerFn, so it is harder to extract without a server-runtime mock; deferred. Item 3 (eventsToSse extraction) was moot.
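To illustrate why extracted helpers are cheap to cover, here is a minimal sketch of a pure helper with a table of cases. The real parseConversationId signature and URL shape are not shown in this report, so `parseConversationIdSketch`, the `/c/<id>` route shape, and the case table are all assumptions made for the example.

```typescript
// Hypothetical stand-in for an extracted helper; NOT the real parseConversationId.
// Assumed route shape: /c/<id>, where <id> is letters, digits, "_" or "-".
function parseConversationIdSketch(pathname: string): string | null {
  const match = /^\/c\/([A-Za-z0-9_-]+)\/?$/.exec(pathname);
  return match?.[1] ?? null;
}

// Table of [input, expected] pairs. Because the helper is pure, each case
// needs no DOM, no dev server, and no auth.
const conversationIdCases: Array<[string, string | null]> = [
  ["/c/abc123", "abc123"],
  ["/c/abc123/", "abc123"], // a trailing slash is tolerated
  ["/c/", null], // missing id
  ["/settings", null], // unrelated route
  ["/c/abc/extra", null], // extra path segment is rejected
];
```

In vitest, a table like this could be passed to `it.each(conversationIdCases)` with a single `expect(parseConversationIdSketch(input)).toBe(expected)` assertion per row.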