Claude can read my files, my terminal, even my screen. But it had no idea what I read in my browser yesterday.
That gap bugged me enough to build BraveMCP: a local-first "second brain" that gives Claude Desktop access to my browsing history, bookmarks, highlights, and notes through the Model Context Protocol (MCP). Everything stays on my machine. No cloud, no tracking.
This is the technical write-up: the architecture, the one constraint that shaped the whole design, and the bugs that cost me the most time.
The constraint that shaped everything
MCP servers talk to Claude Desktop over stdio, a JSON-RPC stream on stdin/stdout. A browser extension lives in a sandbox and cannot attach to that stream. What it can do easily is make outbound HTTP requests.
So the two halves of the system physically cannot talk to each other directly. That single fact drove the entire design.
The fix is a small HTTP bridge: an Express server running on port 3747, inside the same process as the MCP server. The extension POSTs browsing events to it; the MCP server reads from the shared database when Claude calls a tool.
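A minimal sketch of such a bridge, using Node's built-in `http` module instead of Express so it runs without dependencies. The `/events` path, the `{ type, url, title }` payload shape, and the `saveEvent` callback are assumptions for illustration, not BraveMCP's actual API.

```javascript
const http = require("node:http");

// Validate one POSTed event. The { type, url, title } shape is hypothetical.
function parseEvent(body) {
  let event;
  try {
    event = JSON.parse(body);
  } catch {
    return null; // not valid JSON
  }
  if (!event || typeof event.type !== "string" || typeof event.url !== "string") {
    return null;
  }
  return { type: event.type, url: event.url, title: String(event.title ?? "") };
}

// saveEvent stands in for the write to the shared database.
function createBridge(saveEvent) {
  return http.createServer((req, res) => {
    if (req.method !== "POST" || req.url !== "/events") {
      res.writeHead(404).end();
      return;
    }
    let body = "";
    req.on("data", (chunk) => { body += chunk; });
    req.on("end", () => {
      const event = parseEvent(body);
      if (!event) {
        res.writeHead(400).end();
        return;
      }
      saveEvent(event);
      res.writeHead(204).end();
    });
  });
}
```

Calling `createBridge(saveEvent).listen(3747, "127.0.0.1")` would start it on the loopback interface only, so nothing outside the machine can reach it. Any logging inside the handler should go to stderr, since stdout belongs to the JSON-RPC stream.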
The storage layer: hybrid search
Keyword search and semantic search each miss things the other catches. So BraveMCP runs both and merges them.
- SQLite with FTS5 for fast BM25 keyword ranking over titles, summaries, notes, and highlights.
- ChromaDB for cosine vector similarity, so "MCP security" still finds a page titled "Claude agent hardening."
// Merge keyword + vector hits; boost items that appear in both
const merged = new Map();
for (const m of chromaMatches) merged.set(m.id, { ...m, source: "semantic" });
for (const m of ftsMatches) {
  const existing = merged.get(m.id);
  if (existing) existing.relevance *= 1.1; // appears in both -> boost
  else merged.set(m.id, { ...m, source: "keyword" });
}
If ChromaDB is not running, the server degrades to FTS5-only instead of failing. Local-first means it has to work with whatever services you actually have up.
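One way to sketch that degradation: wrap only the vector query in a `try`/`catch` and merge whatever comes back. The `searchKeyword` and `searchVector` functions are hypothetical stand-ins for the FTS5 and ChromaDB queries, each assumed to resolve to an array of `{ id, relevance }` objects.

```javascript
// searchKeyword / searchVector are injected, hypothetical async query functions.
async function hybridSearch(query, { searchKeyword, searchVector }) {
  const ftsMatches = await searchKeyword(query);

  let chromaMatches = [];
  try {
    chromaMatches = await searchVector(query);
  } catch (err) {
    // stderr only: stdout carries the JSON-RPC stream
    console.error(`vector search unavailable, keyword results only: ${err.message}`);
  }

  // Same merge as above: semantic hits first, then boost or add keyword hits
  const merged = new Map();
  for (const m of chromaMatches) merged.set(m.id, { ...m, source: "semantic" });
  for (const m of ftsMatches) {
    const existing = merged.get(m.id);
    if (existing) existing.relevance *= 1.1;
    else merged.set(m.id, { ...m, source: "keyword" });
  }
  return [...merged.values()].sort((a, b) => b.relevance - a.relevance);
}
```

With the vector store down, `chromaMatches` stays empty and the merge loop simply returns the keyword hits, so callers never need a separate code path.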
The AI pipeline, and why the fallback matters
When a page is captured, BraveMCP generates a summary and an embedding. It tries Ollama first (fully local: llama3.2 for summaries, nomic-embed-text for embeddings), then falls back to the Anthropic API.
But here is the trap I walked into. The first version, when no LLM was available, returned canned strings:
// before: this ignores the actual data entirely
return `Synthesis on "${topic}": Relies on the gathered browser research database.`;
That is useless. It says the same thing no matter what you searched. So I rewrote every fallback to be extractive: to build a real summary from the actual data, grouping matching pages by domain with real snippets pulled from SQLite. With no LLM at all, asking for a topic synthesis now returns the genuine sources. Different input produces different output. The "AI" tools stay useful even when there is no AI running.
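A rough sketch of what an extractive fallback along those lines could look like. The `{ url, title, snippet }` row shape and the output format are assumptions for illustration, not BraveMCP's actual schema.

```javascript
// pages: rows already matched by search, assumed to look like { url, title, snippet }.
function extractiveSynthesis(topic, pages) {
  if (pages.length === 0) return `No saved pages match "${topic}".`;

  // Group the matching pages by hostname, keeping first-seen order
  const byDomain = new Map();
  for (const page of pages) {
    const domain = new URL(page.url).hostname; // throws on a malformed URL
    if (!byDomain.has(domain)) byDomain.set(domain, []);
    byDomain.get(domain).push(page);
  }

  const lines = [`Sources on "${topic}" (${pages.length} pages, ${byDomain.size} domains):`];
  for (const [domain, group] of byDomain) {
    lines.push(`${domain}:`);
    for (const page of group) lines.push(`  - ${page.title}: ${page.snippet}`);
  }
  return lines.join("\n");
}
```

Every line of the output is built from the stored rows, so two different topics can only produce the same text if they matched the same pages.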
Recovering forgotten pages
The tool I use most is find_forgotten_content. You give it a vague description and it does hybrid search, then re-ranks with time decay and a visit-count boost:
const timeDecay = Math.max(0.5, Math.exp(-0.01 * daysElapsed)); // decay is floored at 0.5
const visitBoost = 1 + 0.2 * Math.log(Math.max(1, visitCount)); // clamp: Math.log(0) is -Infinity
const adjusted = Math.min(0.99, relevance * timeDecay * visitBoost); // cap below 1
A page you opened three times last week beats one you glanced at once today. That matches how memory actually feels.
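To check that claim with numbers, here is the same formula wrapped in a function, with a guard so a visit count of zero cannot turn the boost into negative infinity. The function name `rerank` and the sample relevance of 0.8 are made up for the example.

```javascript
function rerank(relevance, daysElapsed, visitCount) {
  const timeDecay = Math.max(0.5, Math.exp(-0.01 * daysElapsed));
  const visitBoost = 1 + 0.2 * Math.log(Math.max(1, visitCount));
  return Math.min(0.99, relevance * timeDecay * visitBoost);
}

// Same base relevance, different history
const lastWeek = rerank(0.8, 7, 3); // ≈ 0.91: slight decay, but three visits
const today = rerank(0.8, 0, 1);    // 0.8: no decay, no boost
```

The decay factor reaches its 0.5 floor after roughly 69 days (ln 2 / 0.01), so older pages lose at most half their score to age, and the 0.99 cap keeps heavily visited pages from exceeding a perfect match.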
Two bugs that cost me hours
1. dotenv v17 broke the entire protocol. MCP communicates over stdout. dotenv v17 prints a status line to stdout by default. That one line corrupted the JSON-RPC channel and Claude Desktop refused to connect with a cryptic Unexpected token error. The fix was pinning dotenv@16. Two hours on a single log line.
2. The dual-process state problem. Claude Desktop and my dev client each spawn their own copy of the MCP server. Only the instance that grabs port 3747 receives extension data. The other had empty in-memory state, so tab tools returned nothing. The fix: stop treating in-memory state as the source of truth and fall back to SQLite, which both processes share.
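Both fixes can be sketched in a few lines. The first function is a general guard, assuming the offending library logs through `console.log`: installed before any other module loads, it sends that output to stderr instead. dotenv v17 also documents a `quiet` option for `config()` that suppresses its message, which may be an alternative to pinning. The second function shows the state fallback; `loadTabsFromDb` is a hypothetical reader for the shared SQLite file.

```javascript
// Route console.log to stderr so stray logs cannot reach the JSON-RPC stream.
// Returns a function that restores the original console.log.
function routeConsoleLogToStderr() {
  const original = console.log;
  console.log = (...args) => console.error(...args);
  return () => { console.log = original; };
}

// Prefer this process's in-memory tabs; if it never received extension data,
// read from the database that both server processes share.
function getOpenTabs(memoryTabs, loadTabsFromDb) {
  return memoryTabs.length > 0 ? memoryTabs : loadTabsFromDb();
}
```

The guard does not catch code that writes to `process.stdout` directly, so it is a safety net, not a guarantee.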
What's in the box
- A Manifest V3 extension (tab sync, bookmarks, context-menu highlights)
- An MCP server exposing 13 tools (search_memory, find_forgotten_content, summarize_research_topic, generate_weekly_digest, suggest_tab_cleanup, and more)
- SQLite + ChromaDB hybrid search
- A test suite on Node's built-in runner, wired into CI
It is open source, MIT licensed: https://github.com/glatinone/BraveMCP
If you are building on MCP, the stdio-vs-HTTP bridge pattern is the part worth stealing. What would you want your AI to remember?


Top comments (6)
i wouldn't feed Claude my browsing history, no. too risky.
You are absolutely right to point that out. That is a crucial distinction.
While BraveMCP keeps your browsing history, bookmarks, and notes stored entirely locally on your machine (no cloud database), the final step of answering your question does involve sending data to the cloud if you use Claude Desktop: the app runs locally, but the model behind it runs on Anthropic's servers.
So, the entire history never leaves your computer, but the specific results returned for your question do get sent to Anthropic for processing.
The Solution for 100% Privacy:
If you want zero data to leave your machine, the architecture supports using a local model (like llama3.2 via Ollama) instead of Claude. In that mode, both the search and the AI processing happen entirely on your computer with no internet connection required after the initial download.
Great catch, it's important to be clear about where the data goes!
This is a great example of how system boundaries often matter more than the business logic itself.
The browser extension, HTTP bridge, MCP server, vector database, and local storage are all relatively straightforward components. The interesting part is how the constraints between them drive the architecture.
The dotenv story is also a perfect reminder that protocols are fragile. One unexpected stdout message was enough to break the entire integration. I’ve seen similar incidents in production systems where a harmless log line silently corrupted a machine-to-machine contract.
The most valuable lesson here isn’t memory retrieval. It’s respecting system boundaries and designing graceful degradation when dependencies disappear.
Totally agree. The dotenv story is a classic reminder that protocols are unforgiving. It’s fascinating how a single line of output can corrupt a JSON-RPC stream. Designing for graceful degradation when the AI layer drops out is definitely the smartest move in this architecture.
The dotenv stdout corruption issue is a great catch.
I hit the exact same problem building a Spring Boot MCP server —
any output to stdout corrupts the JSON-RPC stream silently.
The fix on the Java side is routing all logs to stderr via logback:
logging.pattern.console= (empty in application.yml)
One empty line in config, hours saved debugging cryptic parse errors.
Is it normal? How can Claude read my browsing history? It's too risky.