How "the chat" evolved across projects: the three architectures we ended up
with, their pros and cons, and which one to reach for per project.
TL;DR — three working approaches, each right for a different target:
- Static JSON index inside a Docusaurus/Vercel site (MAI) — zero infra, ship today.
- Postgres + pgvector hosted RAG service (Neon) — real semantic search at scale, multi‑user.
- Self‑contained WordPress plugin with vectors in the site DB (Grekai Chat) — installs on any WP, no extra infra.
The timeline
Each stage solved the limitation of the previous one, but none replaced the
others — they target different deployments, so all three remain valid choices.
Stage 1 — MAI on Docusaurus (static index)
The docs assistant ("MAI") shipped inside the Docusaurus 3.8 docs site on Vercel.
- A pre-build step compiles all markdown into a single docs-index.json, bundled with the serverless function (via includeFiles in vercel.json).
- Retrieval runs in the function, in memory over that index; the grounded context is sent to the chosen provider.
- Keys are the visitor's own, encrypted client-side (per-device key in IndexedDB).
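
The build-then-retrieve flow above can be sketched in a few lines. This is a minimal illustration in Python, not MAI's actual code: the paragraph-level splitting, the term-overlap scoring, and the names `build_index` and `retrieve` are all assumptions made for the example.

```python
import json
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict[str, str]) -> str:
    """Pre-build step: split each markdown page into paragraphs, serialize as JSON."""
    entries = []
    for path, markdown in pages.items():
        for para in markdown.split("\n\n"):
            para = para.strip()
            if para:
                entries.append({"path": path, "text": para})
    return json.dumps(entries)


def retrieve(index_json: str, question: str, k: int = 2) -> list[dict]:
    """Request time: load the whole index and rank entries by shared-term count."""
    entries = json.loads(index_json)
    q_terms = Counter(tokenize(question))
    scored = []
    for entry in entries:
        terms = Counter(tokenize(entry["text"]))
        score = sum(min(q_terms[t], terms[t]) for t in q_terms)
        if score > 0:
            scored.append((score, entry))
    # Sort on the score only, so the entry dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:k]]


# Hypothetical pages standing in for the docs folder.
pages = {
    "docs/install.md": "# Install\n\nRun the installer and restart the server.",
    "docs/keys.md": "# API keys\n\nPaste your provider API key in the settings panel.",
}
index_json = build_index(pages)
hits = retrieve(index_json, "Where do I paste my API key?")
```

The whole index is parsed on every call, which is exactly the property that keeps this approach simple and limits it to small corpora.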
Pros
- Zero database, zero extra infra — just the site + a function.
- Cheap, fast to ship, and the index is versioned with the docs (rebuilds on deploy).
- Multi-provider, no server-side secrets.
Cons
- The index is rebuilt on every deploy and loaded whole into memory.
- Retrieval quality is bounded (no true ANN vector index unless precomputed).
- Doesn't scale past a few MB of content; no per-user data, history, or auth.
Stage 2 — Neon Postgres + pgvector (hosted RAG service)
To get real semantic search, scale, and multi-user accounts, the next step was a
dedicated RAG service backed by Postgres + pgvector (Neon in prod, an embedded
pgserver locally).
- Vectors live in langchain_pg_embedding (vector(1536)), searched with a real vector index (ANN), fast even on large corpora.
- Accounts: app_users (scrypt password hashes), user_secrets (per-user API keys, AES-256-GCM, decrypted only server-side), user_data (per-user JSON).
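
For illustration, a nearest-neighbour query against that table might look like the sketch below. The `<=>` cosine-distance operator, the `hnsw` index type, and `vector_cosine_ops` come from pgvector. The column names (`document`, `embedding`), the index name, the psycopg-style named placeholders, and the helper functions are assumptions for the example, not the service's actual code.

```python
def to_vector_literal(values: list[float]) -> str:
    """Format floats in pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


# One-time setup: an approximate (HNSW) index for cosine distance.
# The index name is hypothetical.
CREATE_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS idx_embedding_hnsw
ON langchain_pg_embedding
USING hnsw (embedding vector_cosine_ops)
"""

# Per query: order by cosine distance to the query vector (smaller is closer).
# Column names are assumptions about the table layout.
SEARCH_SQL = """
SELECT document, embedding <=> %(query)s::vector AS distance
FROM langchain_pg_embedding
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s
"""


def search_params(query_embedding: list[float], k: int = 5) -> dict:
    """Build the parameter dict to pass alongside SEARCH_SQL."""
    return {"query": to_vector_literal(query_embedding), "k": k}


params = search_params([0.5, 0.5, 0.5], k=2)
```

With a driver such as psycopg, `SEARCH_SQL` and `params` would be passed together to `cursor.execute`; the query embedding must have the same dimension as the column (1536 here).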
Pros
- Real semantic search that scales to large/growing content.
- Multi-tenant: many users, each with their own encrypted key and data.
- A queryable, durable database; clean separation of auth / vectors / data.
Cons
- Needs a hosted Postgres (Neon) and a service to run it — more moving parts.
- Ops + cost; overkill for a single small site.
- Not something you "install" — it's infrastructure you operate.
Stage 3 — Grekai Chat (WordPress plugin, vectors in the WP DB)
To make the chat a drop-in product for any client's WordPress site, the vectors
moved into the WordPress database itself — no external DB, no service to run.
- On activation the plugin creates two tables: wp_gk_chat_chunks (each passage with its embedding stored as JSON) and wp_gk_chat_leads.
- Embeddings via Gemini gemini-embedding-001; retrieval is brute-force cosine in PHP over the chunks.
- The owner brings their own key (encrypted in wp_options); everything stays in the site's own DB. Adds lead capture, RTL, anti-abuse, i18n.
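
Brute-force cosine retrieval is short enough to show in full. The plugin does this in PHP; the sketch below uses Python only to illustrate the technique, and the row shape (passage text plus a JSON-encoded embedding) and the function names are assumptions.

```python
import json
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], rows: list[tuple[str, str]], k: int = 3):
    """Score every row against the query and return the k best (score, text) pairs.

    rows: (text, embedding_json) pairs, as they might be read from the chunks table.
    """
    scored = [(cosine(query_vec, json.loads(emb)), text) for text, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy 3-dimensional embeddings; real ones have hundreds or thousands of dimensions.
rows = [
    ("Opening hours", json.dumps([1.0, 0.0, 0.0])),
    ("Shipping policy", json.dumps([0.0, 1.0, 0.0])),
    ("Holiday hours", json.dumps([0.9, 0.1, 0.0])),
]
results = top_k([1.0, 0.0, 0.0], rows, k=2)
```

Every query touches every stored chunk, so cost grows linearly with the number of passages. That is the trade-off behind the scale limit listed in the cons.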
Pros
- Installs on any WordPress, zero extra setup — no separate DB or infra.
- Data stays in the site owner's DB; GPL, distributable (a real plugin).
- Built-in lead generation + funnel; owner controls cost via their own key.
Cons
- Brute-force cosine doesn't scale to tens of thousands of chunks (great for small/medium sites, up to a few thousand passages).
- The WP DB is not a vector DB — no ANN index; embeddings re-run on content change.
- Host/runtime constraints (shared hosting, PHP limits).
Why not pgvector in the plugin? The product goal is "works on any WordPress
with no setup." Requiring an external Postgres would break that and push cost/ops
onto every site owner. Most WP sites are small to medium, so brute-force cosine is fast
enough — and the vector store is a single swappable class
(includes/class-vector-store.php), so a site that outgrows it can point at
pgvector or an external vector DB without touching the rest of the plugin.
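
The swappable-store idea can be sketched as an interface with one in-memory implementation. The real class is PHP and its methods are not shown in this article, so the `VectorStore` protocol, `BruteForceStore`, and `answer_context` below are hypothetical names chosen for illustration.

```python
import math
from typing import Protocol


class VectorStore(Protocol):
    """The only surface the rest of the app is allowed to depend on."""

    def add(self, text: str, embedding: list[float]) -> None: ...

    def search(self, query: list[float], k: int) -> list[str]: ...


class BruteForceStore:
    """In-memory store that scores every row on each search."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []

    def add(self, text: str, embedding: list[float]) -> None:
        self._rows.append((text, embedding))

    def search(self, query: list[float], k: int) -> list[str]:
        def cos(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norms if norms else 0.0

        ranked = sorted(self._rows, key=lambda row: cos(query, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def answer_context(store: VectorStore, query_embedding: list[float], k: int = 2) -> str:
    """Caller code: knows only the protocol, not which backend is behind it."""
    return "\n".join(store.search(query_embedding, k))


store = BruteForceStore()
store.add("Refunds take 5 days.", [1.0, 0.0])
store.add("We ship worldwide.", [0.0, 1.0])
context = answer_context(store, [0.0, 1.0], k=1)
```

A pgvector-backed class exposing the same two methods could be passed to `answer_context` unchanged, which is the point of keeping the store behind one seam.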
Side-by-side
| Dimension | Stage 1 · Docusaurus static index | Stage 2 · Neon + pgvector | Stage 3 · WP plugin (in-DB vectors) |
|---|---|---|---|
| Where vectors live | JSON file bundled with the function | Postgres pgvector | wp_gk_chat_chunks (JSON) |
| Retrieval | in-memory over the JSON | ANN / vector index | brute-force cosine (PHP) |
| Hosting | Vercel (static + functions) | Neon + a running service | any WordPress host |
| Auth / multi-user | none (visitor's own key) | app_users + user_secrets | WP admin (one owner key) |
| Key storage | browser IndexedDB (encrypted) | user_secrets (AES-GCM) | wp_options (AES-256) |
| Scale | small (a few MB) | large / growing | small–medium (≈ thousands of chunks) |
| Infra / ops | ~none | Postgres + service | ~none (uses the WP DB) |
| Setup effort | low | high | low |
| Distributable? | no — it is the site | no — it's a service | yes — a plugin |
| Best for | docs you own on a static site | SaaS / large multi-tenant | a client's WordPress site |
Which approach per project?
- Your own docs/marketing site, content rarely changes → Stage 1. Don't build a vector DB you don't need yet.
- A multi-tenant SaaS, big knowledge bases, per-user keys/history → Stage 2.
- You want to hand a working chat to any WordPress site, no ops → Stage 3.
Lessons learned
- Start simple to validate. The static index proved the chat + UX before any DB.
- Keep the vector store swappable. One class behind an interface lets you upgrade retrieval (JSON → pgvector → managed vector DB) without rewriting the app.
- Match the architecture to the *deployment target* — static site vs SaaS vs distributable plugin — not to what's fashionable. A plugin that needs an external Postgres isn't a plugin anyone will install.
- Brute-force cosine lasts longer than people expect. For a few thousand passages it's milliseconds; reach for ANN/pgvector only when the numbers demand it.
- Encrypt keys at every layer and never send a server-side key to the browser: IndexedDB (client), user_secrets (server), wp_options (plugin) all do this differently for the same reason.
- Bring-your-own-key keeps you out of the data-processor business. The visitor's content goes straight to the provider the owner configured; you never hold it.
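
To put rough numbers on the brute-force lesson above, here is a back-of-envelope count of the work per query. The chunk count and embedding dimension are illustrative assumptions, not measurements from the plugin, and the byte figure covers raw 64-bit floats only, not the overhead of any particular runtime.

```python
def brute_force_cost(n_chunks: int, dims: int) -> dict:
    """Estimate per-query work for brute-force cosine over n_chunks embeddings."""
    # One dot product plus one squared norm per chunk, each costing `dims` multiply-adds.
    multiply_adds = 2 * n_chunks * dims
    # Raw size of the decoded vectors as 64-bit floats.
    float_bytes = n_chunks * dims * 8
    return {"multiply_adds": multiply_adds, "float_bytes": float_bytes}


# Assumed workload: 3,000 passages with 768-dimensional embeddings.
small = brute_force_cost(3_000, 768)

# Ten times more passages means ten times more work: the cost is linear.
large = brute_force_cost(30_000, 768)
```

For the assumed workload this comes to about 4.6 million multiply-adds and roughly 18 MB of raw floats per query. Both figures grow linearly with the number of passages, which is why the approach stops being comfortable somewhere in the tens of thousands of chunks.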






