DEV Community

Kiell Tampubolon

I gave Claude a memory of everything I browse — here's the architecture

Claude can read my files, my terminal, even my screen. But it had no idea what I read in my browser yesterday.

That gap bugged me enough to build BraveMCP: a local-first "second brain" that gives Claude Desktop access to my browsing history, bookmarks, highlights, and notes through the Model Context Protocol (MCP). The database stays on my machine: no cloud storage, no tracking.

This is the technical write-up: the architecture, the one constraint that shaped the whole design, and the bugs that cost me the most time.

The constraint that shaped everything

MCP servers talk to Claude Desktop over stdio: a JSON-RPC stream on stdin/stdout. A browser extension lives in a sandbox and cannot write to another process's stdin. The simplest channel it has out of the browser is an outbound HTTP request.

So the two halves of the system physically cannot talk to each other directly. That single fact drove the entire design.
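To make the stdio side concrete, here is a minimal sketch of newline-delimited JSON-RPC framing, which is how the MCP stdio transport separates messages. The helper names and the arguments passed to search_memory are assumptions for illustration, not BraveMCP's actual code or schema.

```javascript
// Sketch of newline-delimited JSON-RPC framing. The search_memory
// arguments below are an assumed shape, not the tool's real schema.
function encodeMessage(message) {
  // One message per line; JSON.stringify never emits a raw newline.
  return JSON.stringify(message) + "\n";
}

function decodeLines(chunk) {
  // Every non-empty line must be a complete JSON document.
  return chunk
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "search_memory", arguments: { query: "MCP security" } },
};

const wire = encodeMessage(request);
const [decoded] = decodeLines(wire);
console.error(decoded.method); // diagnostics go to stderr, never stdout
```

A reader like decodeLines throws a SyntaxError on any line that is not JSON, which is why nothing except protocol messages may be written to stdout.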

[Figure: BraveMCP architecture]

The fix is a small HTTP bridge: an Express server running on port 3747, inside the same process as the MCP server. The extension POSTs browsing events to it; the MCP server reads from the shared database when Claude calls a tool.
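The bridge can be sketched in a few lines. This version is an illustration under stated assumptions: it uses Node's built-in http module instead of Express so it runs without dependencies, an array stands in for the shared SQLite database, and the /events route and the recordEvent and startBridge names are hypothetical.

```javascript
// Sketch only: node:http stands in for Express, and an array stands in
// for the shared SQLite database. All names here are hypothetical.
const http = require("node:http");

const events = []; // stand-in for the shared database

// Validate and store one browsing event posted by the extension.
function recordEvent(event, store = events) {
  if (!event || typeof event.url !== "string") {
    return { status: 400, body: { error: "url is required" } };
  }
  store.push({ url: event.url, title: event.title ?? "", ts: Date.now() });
  return { status: 200, body: { ok: true } };
}

// Start the bridge inside the same process as the MCP server.
function startBridge(port = 3747) {
  const server = http.createServer((req, res) => {
    if (req.method !== "POST" || req.url !== "/events") {
      res.writeHead(404).end();
      return;
    }
    let raw = "";
    req.on("data", (chunk) => (raw += chunk));
    req.on("end", () => {
      let result;
      try {
        result = recordEvent(JSON.parse(raw));
      } catch {
        result = { status: 400, body: { error: "invalid JSON" } };
      }
      res.writeHead(result.status, { "Content-Type": "application/json" });
      res.end(JSON.stringify(result.body));
    });
  });
  // Log to stderr: stdout is reserved for the MCP JSON-RPC stream.
  server.listen(port, "127.0.0.1", () => console.error(`bridge on :${port}`));
  return server;
}
```

Binding to 127.0.0.1 keeps the bridge reachable only from the local machine, which fits the local-first goal.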

The storage layer: hybrid search

Keyword search and semantic search each miss things the other catches. So BraveMCP runs both and merges them.

  • SQLite with FTS5 for fast BM25 keyword ranking over titles, summaries, notes, and highlights.
  • ChromaDB for cosine vector similarity, so "MCP security" still finds a page titled "Claude agent hardening."
// Merge keyword + vector hits; boost items that appear in both
const merged = new Map();
for (const m of chromaMatches) merged.set(m.id, { ...m, source: "semantic" });
for (const m of ftsMatches) {
  const existing = merged.get(m.id);
  if (existing) existing.relevance *= 1.1; // appears in both -> boost
  else merged.set(m.id, { ...m, source: "keyword" });
}

If ChromaDB is not running, the server degrades to FTS5-only instead of failing. Local-first means it has to work with whatever services you actually have up.
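One way to get that degradation is to wrap only the vector lookup in a try/catch and let the merge run on whatever came back. The sketch below assumes two injected functions, searchFts and searchChroma, which are hypothetical stand-ins for the real SQLite and ChromaDB queries.

```javascript
// Sketch: searchFts and searchChroma are hypothetical injected functions.
async function hybridSearch(query, { searchFts, searchChroma }) {
  const ftsMatches = searchFts(query); // local keyword lookup
  let chromaMatches = [];
  try {
    chromaMatches = await searchChroma(query);
  } catch (err) {
    // Vector store unreachable: note it on stderr and continue keyword-only.
    console.error(`ChromaDB unavailable, using FTS5 only: ${err.message}`);
  }

  // Same merge as above: items found by both searches get a 10% boost.
  const merged = new Map();
  for (const m of chromaMatches) merged.set(m.id, { ...m, source: "semantic" });
  for (const m of ftsMatches) {
    const existing = merged.get(m.id);
    if (existing) existing.relevance *= 1.1;
    else merged.set(m.id, { ...m, source: "keyword" });
  }
  return [...merged.values()].sort((a, b) => b.relevance - a.relevance);
}
```

With the vector store down, every result simply carries source "keyword"; callers do not need a separate code path.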

The AI pipeline, and why the fallback matters

When a page is captured, BraveMCP generates a summary and an embedding. It tries Ollama first (fully local: llama3.2 for summaries, nomic-embed-text for embeddings), then falls back to the Anthropic API.
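The try-local-first order can be expressed as a small provider chain. This is a sketch, not the project's code: summarizeWithFallback is a hypothetical helper, and each provider's run function is assumed to wrap the real Ollama or Anthropic call and to throw when that service is unavailable.

```javascript
// Sketch: each provider is { name, run }, where run(text) returns a
// Promise of a summary string and rejects if the service is down.
async function summarizeWithFallback(text, providers) {
  for (const { name, run } of providers) {
    try {
      return { provider: name, summary: await run(text) };
    } catch (err) {
      // Keep diagnostics off stdout; move on to the next provider.
      console.error(`${name} failed: ${err.message}`);
    }
  }
  // Nothing answered: the caller decides what to do without an LLM.
  return { provider: "none", summary: null };
}
```

Listing the local provider first and the hosted one second gives the order described above. The null result is the case that needs care.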

But here is the trap I walked into. The first version, when no LLM was available, returned canned strings:

// before: this ignores the actual data entirely
return `Synthesis on "${topic}": Relies on the gathered browser research database.`;

That is useless. It says the same thing no matter what you searched. So I rewrote every fallback to be extractive: to build a real summary from the actual data, grouping matching pages by domain with real snippets pulled from SQLite. With no LLM at all, asking for a topic synthesis now returns the genuine sources. Different input produces different output. The "AI" tools stay useful even when there is no AI running.
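A minimal sketch of an extractive fallback along those lines follows. The row shape (url, title, summary) and the extractiveSynthesis name are assumptions for illustration; the real rows come from SQLite and may look different.

```javascript
// Sketch: `pages` mimics rows from SQLite; the field names are assumptions.
function extractiveSynthesis(topic, pages, snippetLength = 120) {
  if (pages.length === 0) return `No saved pages match "${topic}".`;

  // Group matching pages by hostname.
  const byDomain = new Map();
  for (const page of pages) {
    const domain = new URL(page.url).hostname;
    if (!byDomain.has(domain)) byDomain.set(domain, []);
    byDomain.get(domain).push(page);
  }

  // Emit one heading per domain with a real snippet for every page.
  const lines = [`Sources on "${topic}" (${pages.length} pages):`];
  for (const [domain, group] of byDomain) {
    lines.push(`${domain}:`);
    for (const page of group) {
      const snippet = (page.summary || page.title).slice(0, snippetLength);
      lines.push(`  - ${page.title}: ${snippet}`);
    }
  }
  return lines.join("\n");
}
```

Because the output is built only from stored rows, two different topics can never produce the same text unless they match the same pages.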

Recovering forgotten pages

The tool I use most is find_forgotten_content. You give it a vague description and it does hybrid search, then re-ranks with time decay and a visit-count boost:

// Older pages lose score, but the decay is floored at 0.5
const timeDecay  = Math.max(0.5, Math.exp(-0.01 * daysElapsed));
// Guard against visitCount = 0, where Math.log would return -Infinity
const visitBoost = 1 + 0.2 * Math.log(Math.max(1, visitCount));
const adjusted   = Math.min(0.99, relevance * timeDecay * visitBoost);

At equal base relevance, a page you opened three times last week beats one you glanced at once today. That matches how memory actually feels.
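Plugging numbers into the formula shows the effect. The sketch below wraps the three lines in a hypothetical adjustScore function and assumes both pages start with the same base relevance of 0.6; the clamp on visitCount is an added safeguard against Math.log(0).

```javascript
// Sketch: adjustScore is a hypothetical wrapper around the formula above.
function adjustScore(relevance, daysElapsed, visitCount) {
  const timeDecay = Math.max(0.5, Math.exp(-0.01 * daysElapsed));
  // Clamp so a page with zero recorded visits does not hit Math.log(0).
  const visitBoost = 1 + 0.2 * Math.log(Math.max(1, visitCount));
  return Math.min(0.99, relevance * timeDecay * visitBoost);
}

const lastWeek = adjustScore(0.6, 7, 3); // three visits, seven days ago
const today = adjustScore(0.6, 0, 1);    // one visit, today
console.error(lastWeek.toFixed(2), today.toFixed(2)); // 0.68 0.60
```

The 0.5 floor takes over once exp(-0.01 * days) drops below one half, which happens after about 69 days, so old pages lose at most half their score.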

[Figure: Before and after BraveMCP]

Two bugs that cost me hours

1. dotenv v17 broke the entire protocol. MCP communicates over stdout. dotenv v17 prints a status line to stdout by default. That one line corrupted the JSON-RPC channel and Claude Desktop refused to connect with a cryptic Unexpected token error. The fix was pinning dotenv@16. Two hours on a single log line.
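Beyond pinning the version, a defensive guard can keep stray console.log calls from any dependency off stdout. This is a sketch of one possible approach, not the fix described above; protectStdout is a hypothetical helper. dotenv's README also documents a quiet option for suppressing that message, so it is worth checking against the version you install.

```javascript
// Sketch: send console.log output to stderr so that only the MCP
// transport ever writes to stdout. Call this before loading anything else.
function protectStdout() {
  const originalLog = console.log;
  console.log = (...args) => console.error(...args);
  // Return a function that undoes the redirect, mainly for tests.
  return () => {
    console.log = originalLog;
  };
}

const restoreLog = protectStdout();
console.log("this line now lands on stderr");
restoreLog();
```

This does not catch code that calls process.stdout.write directly, so it complements pinning or configuring the noisy dependency rather than replacing it.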

2. The dual-process state problem. Claude Desktop and my dev client each spawn their own copy of the MCP server. Only the instance that grabs port 3747 receives extension data. The other had empty in-memory state, so tab tools returned nothing. The fix: stop treating in-memory state as the source of truth and fall back to SQLite, which both processes share.
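The two halves of that fix can be sketched as follows. Both helpers are hypothetical: startBridgeOrFollow shows one way an instance might keep running when the port is already taken, and getOpenTabs shows the read path preferring the shared database whenever in-memory state is empty. The loadTabsFromDb callback stands in for a real SQLite query.

```javascript
const http = require("node:http");

// Try to own the bridge port. If another instance already holds it,
// report the "reader" role instead of crashing. Hypothetical helper.
function startBridgeOrFollow(port, onRole) {
  const server = http.createServer((req, res) => res.writeHead(204).end());
  server.once("error", (err) => {
    if (err.code === "EADDRINUSE") onRole("reader");
    else throw err;
  });
  server.listen(port, "127.0.0.1", () => onRole("owner"));
  return server;
}

// In-memory tabs are only a cache. An instance that never received
// extension data falls back to the database that both processes share.
function getOpenTabs(inMemoryTabs, loadTabsFromDb) {
  if (inMemoryTabs.length > 0) return inMemoryTabs;
  return loadTabsFromDb();
}
```

With this split, it no longer matters which process wins the port: the owner writes to SQLite, and every instance can answer tool calls from it.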

What's in the box

  • A Manifest V3 extension (tab sync, bookmarks, context-menu highlights)
  • An MCP server exposing 13 tools (search_memory, find_forgotten_content, summarize_research_topic, generate_weekly_digest, suggest_tab_cleanup, and more)
  • SQLite + ChromaDB hybrid search
  • A test suite on Node's built-in runner, wired into CI

It is open source, MIT licensed: https://github.com/glatinone/BraveMCP

If you are building on MCP, the stdio-vs-HTTP bridge pattern is the part worth stealing. What would you want your AI to remember?

Top comments (6)

Adam - The Developer

i wouldn't feed Claude my browsing history, no. too risky.

Kiell Tampubolon

You are absolutely right to point that out. That is a crucial distinction.

While BraveMCP keeps your browsing history, bookmarks, and notes stored entirely locally on your machine (no cloud database), the final step of answering your question does involve sending data to the cloud if you use Claude Desktop (which runs on Anthropic's servers).

Here is the simplified flow:

  1. Local Only: BraveMCP searches your local database and finds the relevant snippet or summary.
  2. To the Cloud: That specific snippet is sent to Anthropic so the Claude AI can read it and generate an answer for you.

So, the entire history never leaves your computer, but the specific piece of information you asked about does get sent to Anthropic for processing.

The Solution for 100% Privacy:
If you want zero data to leave your machine, the architecture supports using a local model (like llama3.2 via Ollama) instead of Claude. In that mode, both the search and the AI processing happen entirely on your computer with no internet connection required after the initial download.

Great catch, it's important to be clear about where the data goes!

Mustafa ERBAY

This is a great example of how system boundaries often matter more than the business logic itself.

The browser extension, HTTP bridge, MCP server, vector database, and local storage are all relatively straightforward components. The interesting part is how the constraints between them drive the architecture.

The dotenv story is also a perfect reminder that protocols are fragile. One unexpected stdout message was enough to break the entire integration. I’ve seen similar incidents in production systems where a harmless log line silently corrupted a machine-to-machine contract.

The most valuable lesson here isn’t memory retrieval. It’s respecting system boundaries and designing graceful degradation when dependencies disappear.

Kiell Tampubolon

Totally agree. The dotenv story is a classic reminder that protocols are unforgiving. It’s fascinating how a single line of output can corrupt a JSON-RPC stream. Designing for graceful degradation when the AI layer drops out is definitely the smartest move in this architecture.

Joaquinriosheredia

The dotenv stdout corruption issue is a great catch.
I hit the exact same problem building a Spring Boot MCP server —
any output to stdout corrupts the JSON-RPC stream silently.
The fix on the Java side is routing all logs to stderr via logback:
logging.pattern.console= (empty in application.yml)
One empty line in config, hours saved debugging cryptic parse errors.

Rishabh Dev Singh

Is it normal? How can Claude read my browsing history? It's too risky.