The reference implementation of the Anything Engine. Classify → Embed → Graph Query → Synthesize → Crayon card stream. Live on the sandbox today.
/classify (8400). Groq Llama 3.3 70B classifies the intent in <300ms and returns {class, confidence, reasoning}. If confidence < 0.75, the dispatcher surfaces a confirmation card before routing.

// POST /api:UgP1h6uR/classify
{ "query": "find seed VCs for AI infrastructure, $3M round" }
// Response
{ "class": "find_investors", "confidence": 0.96, "count": 1,
"reasoning": "Explicit fundraising context, stage+size mentioned." }
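The confidence gate described above can be sketched as follows. This is illustrative, not the production dispatcher: the card template name `confirm_intent_card` and the return shape are assumptions; only the 0.75 threshold and the `{class, confidence, reasoning}` shape come from this doc.

```typescript
// Sketch of the post-classify confidence gate (threshold from the doc: 0.75).
// The confirmation-card payload shape is illustrative, not the production schema.
interface ClassifyResult {
  class: string;
  confidence: number;
  reasoning: string;
}

type RouteDecision =
  | { kind: "dispatch"; toolClass: string }
  | { kind: "confirm"; card: { template_name: string; data: ClassifyResult } };

function routeClassification(result: ClassifyResult, threshold = 0.75): RouteDecision {
  if (result.confidence < threshold) {
    // Below the floor: surface a confirmation card instead of routing.
    return {
      kind: "confirm",
      card: { template_name: "confirm_intent_card", data: result },
    };
  }
  return { kind: "dispatch", toolClass: result.class };
}
```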
text-embedding-3-small (1536 dimensions). The embedding vector is used for semantic similarity matching against investor profiles. OpenAI embeddings are unit-normalized, so cosine similarity equals the dot product — no normalization step needed.

The graph query matches VC_Firm and Angel labels, traverses portfolio and co-investment edges, and scores against the query vector. A score filter of < 0.85 removes low-confidence matches. The investor → company relationship is indirect: (Investor)-[:INVESTED_IN]->(Funding_Round)<-[:RAISED]-(Company) — the second hop is required to get actual company names (not funding round IDs).

// Cypher pattern (simplified)
MATCH (i:VC_Firm)-[:INVESTED_IN]->(fr:Funding_Round)<-[:RAISED]-(co:Company)
WHERE fr.stage IN ['Seed', 'Pre-Seed']
AND any(tag IN co.tags WHERE tag IN ['AI', 'Infrastructure', 'ML'])
WITH i, co, fr,
vecf32(i.thesis_embedding) <-> vecf32($query_vec) AS score
WHERE score < 0.85
RETURN i.name, i.partner, co.name, fr.amount, score
ORDER BY score ASC
LIMIT 20
Note the vecf32() wrapper on both sides of the similarity operator — always wrap embedding vectors.
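Because the embeddings are unit-normalized, the similarity math reduces to a plain dot product — a quick sketch of why the normalization step can be skipped (the `<->` operator in the Cypher above returns a distance, lower = closer, hence the `ORDER BY score ASC`):

```typescript
// Cosine similarity for unit-normalized vectors reduces to the dot product.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Generic cosine similarity, for comparison — equals dot() whenever both
// inputs have L2 norm 1, which is why no normalization step is needed here.
function cosine(a: number[], b: number[]): number {
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot(a, b) / (norm(a) * norm(b));
}
```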
// Per-investor synthesis output
{
"master_person_id": 1847,
"name": "Kai Nguyen",
"firm": "Gradient Ventures",
"fit_score": 0.91,
"why": "Gradient led two AI-infra seed rounds in Q3 2024 (Layer and Synth.AI). Kai's public writing focuses on the infrastructure-application stack bottleneck — exactly the problem you're solving.",
"draft_subject": "AI infra seed — Orbiter intro via [mutual]",
"stage_match": true,
"sector_match": true,
"check_range": "$1M–$5M"
}
thread.get_user_context provides prior context so vague follow-ups ("show me more like the last one") still classify and dispatch correctly. mem_used: true appears in the response when memory influenced the result.

The current sandbox uses FalkorDB as the interim graph database: 11,948 Entity nodes, 1,353 Funding_Round nodes, 21 edge types, 30K+ edges. The AlloyDB migration preserves the same logical graph model but adds ScaNN vector indexes for sub-10ms hybrid queries.
| Label | Count (approx) | Key Properties | Role in find_investors |
|---|---|---|---|
| VC_Firm | ~2,400 | name, thesis_embedding, check_min, check_max, stage | Primary match target |
| Angel | ~800 | name, thesis_embedding, sectors, stage_preference | Secondary match target |
| Funding_Round | 1,353 | amount, stage, date, company_id | Portfolio traversal hop |
| Company | ~5,200 | name, sector, tags, founded | 2nd-hop for portfolio company names |
| Person | ~8,000 | name, title, firm_id, bio_embedding | Partner-level contact for outreach |
The AlloyDB migration adds 6 ScaNN vector columns per investor. A single SQL query combines hard filters (stage, check_size range) with semantic similarity across all 6 dimensions simultaneously — no multi-step pipeline needed.
-- AlloyDB ScaNN pattern (pending migration)
SELECT
i.id, i.name, i.firm, i.partner_name,
(i.sector_embedding <=> $sector_vec) * 0.3 +
(i.stage_embedding <=> $stage_vec) * 0.25 +
(i.check_embedding <=> $check_vec) * 0.2 +
(i.geo_embedding <=> $geo_vec) * 0.15 +
(i.signal_embedding <=> $signal_vec) * 0.05 +
(i.founder_embedding <=> $founder_vec) * 0.05 AS composite_score
FROM investors i
WHERE i.check_min <= $ask AND i.check_max >= $ask
AND i.stage @> ARRAY[$stage]
ORDER BY composite_score ASC
LIMIT 20
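The six weights in the SQL above sum to 1.0, so the composite stays on the same scale as the individual distances. A minimal application-side sketch of the same weighting (the weight names are shortened from the column names; lower composite = better match):

```typescript
// Weighted composite of the six per-dimension distances from the SQL above.
// Weights mirror the query: sector 0.3, stage 0.25, check 0.2, geo 0.15,
// signal 0.05, founder 0.05 — summing to 1.0.
const WEIGHTS = {
  sector: 0.3,
  stage: 0.25,
  check: 0.2,
  geo: 0.15,
  signal: 0.05,
  founder: 0.05,
} as const;

type Distances = Record<keyof typeof WEIGHTS, number>;

function compositeScore(d: Distances): number {
  return (Object.keys(WEIGHTS) as (keyof typeof WEIGHTS)[]).reduce(
    (sum, k) => sum + d[k] * WEIGHTS[k],
    0,
  );
}
```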
Zep Cloud was chosen over Mem0 and Cognee for its plug-and-play integration, SOC 2 compliance, temporal memory (facts expire appropriately), and a Graphiti escape hatch for graph-structured memory if needed. The free tier handles current scale.
On a vague follow-up query ("show me more like those"), mem_used flips to true and the response quality matches a full-context first query. Verified in sandbox testing — no degradation on turn 2.
// Every dispatch ingest (Xano endpoint, post-synthesis)
{
"thread_id": "thread_abc123",
"user_id": 15,
"facts": [
{ "type": "query", "value": "find seed VCs for AI infrastructure" },
{ "type": "classification", "value": "find_investors" },
{ "type": "entities", "value": ["Gradient Ventures", "Kai Nguyen", "Layer", "Synth.AI"] },
{ "type": "context", "value": { "stage": "Seed", "sector": "AI Infrastructure", "ask": 3000000 } }
]
}
// Turn 2 retrieval (pre-classify)
GET /zep/threads/{thread_id}/context
→ { "prior_class": "find_investors", "prior_entities": [...], "prior_context": {...} }
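The turn-2 flow can be sketched as a pre-classify enrichment step. The context shape mirrors the GET /zep/threads/{thread_id}/context response above; the vagueness heuristic and the way prior context is folded into the query are illustrative, not the production logic:

```typescript
// Sketch: merge prior Zep context into a vague follow-up before /classify.
interface PriorContext {
  prior_class: string;
  prior_entities: string[];
  prior_context: Record<string, unknown>;
}

function enrichQuery(
  rawQuery: string,
  prior: PriorContext | null,
): { query: string; mem_used: boolean } {
  // Heuristic: short deictic follow-ups ("more like those") lean on memory.
  const vague = /\b(those|that one|the last one|more like)\b/i.test(rawQuery);
  if (prior && vague) {
    return {
      query: `${rawQuery} [prior intent: ${prior.prior_class}; entities: ${prior.prior_entities.join(", ")}]`,
      mem_used: true,
    };
  }
  return { query: rawQuery, mem_used: false };
}
```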
The Crayon SDK renders server responses as rich interactive cards rather than plain text. Each investor result streams in as a structured card template. The frontend uses @crayonai/react-core with custom templates registered per card type.
SSE stream from Xano → Next.js route handler → Crayon SDK. Each data: event carries a partial card payload. Cards render progressively as tokens arrive — no blank loading state.
Xano response includes template_name: "contact_card". The SDK maps this to the registered React component. The sandbox uses a subset of the full copilot template registry: contact_card, scanning_card, error_card.
// SSE data payload per investor
{
"template_name": "contact_card",
"data": {
"master_person_id": 1847,
"name": "Kai Nguyen",
"title": "General Partner",
"firm": "Gradient Ventures",
"avatar_url": "https://...",
"fit_score": 0.91,
"why": "Gradient led two AI-infra seed rounds...",
"tags": ["AI Infrastructure", "Seed", "$1M–$5M"],
"draft_subject": "AI infra seed — Orbiter intro via [mutual]",
"draft_opening": "Hi Kai, [mutual] suggested I reach out..."
}
}
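Consuming the stream client-side amounts to splitting on `data:` events and parsing each payload into the card shape shown above. This sketch covers only the parse step — actual rendering goes through the Crayon template registry, and this is not the SDK's internal implementation:

```typescript
// Sketch: turn raw SSE text into card payloads of the shape shown above.
interface CardEvent {
  template_name: string;
  data: Record<string, unknown>;
}

function parseSseChunk(chunk: string): CardEvent[] {
  const cards: CardEvent[] = [];
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data:")) continue; // skip blank lines and comments
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") continue;
    try {
      cards.push(JSON.parse(payload) as CardEvent);
    } catch {
      // Partial JSON mid-stream: a production parser buffers across chunks.
    }
  }
  return cards;
}
```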
Users can upload pitch decks and company documents to enrich the investor-matching context. The text is extracted server-side and injected into the embedding and synthesis steps.
| Format | Max Size | Extraction Method | Status |
|---|---|---|---|
| .pdf | 25MB | Server-side PDF parse | LIVE |
| .doc / .docx | 25MB | Mammoth.js extraction | LIVE |
| .txt | 5MB | Raw text | LIVE |
| .pptx | 25MB | Slide text extraction | LIVE |
POST /api:UgP1h6uR/file-upload — accepts multipart form data, returns extracted text and a pitch_context_id for reference in subsequent dispatch calls. Text is truncated to 8K tokens before embedding.
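The 8K-token truncation step could be approximated as below. The production pipeline presumably uses a real tokenizer; this sketch uses the common ~4-characters-per-token heuristic as a stand-in:

```typescript
// Sketch: truncate extracted text to roughly 8K tokens before embedding,
// using the ~4 chars/token heuristic (an approximation, not a tokenizer).
function truncateToTokenBudget(text: string, maxTokens = 8000, charsPerToken = 4): string {
  const maxChars = maxTokens * charsPerToken;
  if (text.length <= maxChars) return text;
  // Cut on a whitespace boundary so a word isn't split mid-token.
  const cut = text.slice(0, maxChars);
  const lastSpace = cut.lastIndexOf(" ");
  return lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
}
```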
Each Anything Engine tool has a minimum context floor. When the floor is not met, the dispatcher surfaces a context-gap card asking the user to provide what's missing before running the query.
selectedEvent must be set — a specific calendar event must be selected before the prep pipeline will run; an event-picker card surfaces if it is missing. selectedPerson must be set — a person must be selected from the network before the leverage loop tool activates.

The /classify and /dispatch endpoints don't change during the migration. Only the internal Xano function that executes the data query is swapped; the UI, SSE streaming, and Crayon card schema are unaffected.
| Layer | Technology | Role |
|---|---|---|
| UI | Next.js 14 App Router | Frontend + thin BFF route handlers. Zero business logic. |
| Auth | WorkOS AuthKit | OAuth, session management, user identity. |
| Generative UI | CrayonChat SDK (@crayonai/react-core) | SSE streaming → contact card templates. |
| Orchestration | Xano (API Group 1270) | All pipeline logic: classify, embed, query, synthesize. |
| Classifier | Groq Llama 3.3 70B | Intent classification at <300ms, temp 0.1. |
| Embeddings | OpenRouter text-embedding-3-small | 1536-dim vectors for query and investor profiles. |
| Graph DB (interim) | FalkorDB | Cypher multi-hop for investor graph traversal. |
| Graph DB (target) | AlloyDB + ScaNN | 6-dim hybrid queries. Pending migration. |
| Synthesis | Groq + Claude Opus 4 | Per-investor rationale and outreach drafts. |
| Memory | Zep Cloud | Thread context, mem_used on turn 2. |
| Hosting | Vercel | Auto-deploys on push to main. roboulos-projects/orbiter-sandbox. |
Migrate from FalkorDB to AlloyDB. Add 6-dim ScaNN indexes per investor. Jog builds BigQuery → AlloyDB delta sync. Target: same week as Mark's go-ahead.
find_investors is the reference pattern. Port to: find_talent (done), find_customers (done), research_person (done). Next: find_partners, find_advisors, find_co_investors following the same Cypher → Groq → Crayon pattern.
14 class names must be locked across UI labels, classifier prompt, dispatcher, and Mintlify docs simultaneously. One source of truth, no drift between layers.
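One way to enforce a single source of truth is a shared constant that the UI, classifier prompt builder, and dispatcher all import. Only the classes named elsewhere in this doc are listed below — the full set is 14, and the remaining names are elided rather than guessed:

```typescript
// Sketch: one exported constant as the single source of truth for class
// names. UI labels, the classifier prompt, and the dispatcher import from
// this module, so a rename happens in exactly one place.
export const TOOL_CLASSES = [
  "find_investors",
  "find_talent",
  "find_customers",
  "research_person",
  "find_partners",
  "find_advisors",
  "find_co_investors",
  // ...remaining classes (14 total)
] as const;

export type ToolClass = (typeof TOOL_CLASSES)[number];

// Runtime guard for untyped boundaries (e.g. the classifier's raw output).
export function isToolClass(s: string): s is ToolClass {
  return (TOOL_CLASSES as readonly string[]).includes(s);
}
```

Any string literal that isn't in TOOL_CLASSES then fails to typecheck wherever ToolClass is required, which catches drift at build time rather than in production.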
Once AlloyDB is live and 4+ tools are solid, the Anything Engine dispatch endpoint gets wired into the main Orbiter copilot. The sandbox validates the pattern before the port.