Introduction
"A vector database remembers what was said. OpenMemory remembers what it meant, when it happened, how it felt, and why it matters."
This is article #100 in the "One Open Source Project a Day" series. Today's project is OpenMemory — a self-hosted cognitive memory engine for LLM applications and AI agents.
LLMs are stateless by design. Most "memory" solutions are really RAG pipelines in disguise: chunk text, embed it in a vector store, retrieve by similarity. They don't understand the type of memory (is this a fact, an event, a skill, or a feeling?), don't track time (was this true last month?), don't model importance (is this more relevant than that?), and don't maintain associations (these two things are related).
OpenMemory's thesis: AI agents deserve an actual memory system, not a vector database with "memory" in the marketing copy.
What You'll Learn
- The five-sector memory model: episodic, semantic, procedural, emotional, reflective — what each means and how they decay at different rates
- HMD v2 architecture: how Hierarchical Memory Decomposition works
- Waypoint association graph: a graph that keeps only the single strongest link per memory, and the composite scoring formula
- Temporal knowledge graph: valid_from/valid_to and fact evolution
- The fundamental difference from RAG and vector databases
- Three operating modes: embedded SDK, standalone server, MCP interface
Prerequisites
- Basic understanding of LLM agents
- Familiarity with LangChain, CrewAI, or similar agent frameworks
- Basic understanding of vector embeddings and cosine similarity
Project Background
Overview
OpenMemory is an open-source cognitive memory engine built on HMD (Hierarchical Memory Decomposition) v2 architecture. It provides persistent, structured memory for LLM applications and AI agents.
It's not a vector database wrapper. It's not a replacement for a cloud memory API. The design philosophy is: memory is not a database — it's a dynamic system with decay, reinforcement, association, and temporal dimensions.
The project is maintained by CaviraOSS and ships a Python SDK, Node.js SDK, REST API server, VS Code extension, and a native MCP server.
Author / Team
- Organization: CaviraOSS
- Primary language: TypeScript/Node.js (server), Python (SDK)
- License: Apache 2.0
- VS Code Extension: marketplace.visualstudio.com
Project Stats
- 📄 License: Apache 2.0
- 🐍 PyPI: openmemory-py
- 📦 npm: openmemory-js
- 🧩 Integrations: LangChain, CrewAI, AutoGen, Streamlit, MCP, VS Code
Features
What It Does
Traditional RAG memory (vector DB):
"User is allergic to peanuts, prefers coding at night, feels productive"
→ one embedding vector
→ retrieved by similarity
→ no structure, no time, no importance weighting
OpenMemory cognitive memory:
"User prefers coding at night, feels productive"
→ semantic sector: "coding preference" (slow decay)
→ emotional sector: "feels productive" (fastest decay)
→ episodic sector: "time: night" (fast decay)
→ Three sector vectors → mean vector → Waypoint link to related memories
→ Composite score = 0.6×similarity + 0.2×salience + 0.1×recency + 0.1×waypoint weight
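The composite score above is a plain weighted sum, so it is easy to sketch. The snippet below is a minimal illustration, assuming all four inputs are already normalized to the 0-1 range; the `Candidate` class and the sample values are hypothetical and not part of the OpenMemory SDK.

```python
from dataclasses import dataclass

# Weights taken from the composite score formula quoted in the article.
W_SIMILARITY, W_SALIENCE, W_RECENCY, W_WAYPOINT = 0.6, 0.2, 0.1, 0.1


@dataclass
class Candidate:
    """Hypothetical container for one retrieved memory (illustration only)."""
    memory_id: str
    similarity: float  # cosine similarity to the query, assumed 0..1
    salience: float    # importance after decay, assumed 0..1
    recency: float     # assumed to be pre-normalized to 0..1
    waypoint: float    # weight of the waypoint link that surfaced it, 0..1


def composite_score(c: Candidate) -> float:
    """Weighted sum of the four signals."""
    return (W_SIMILARITY * c.similarity
            + W_SALIENCE * c.salience
            + W_RECENCY * c.recency
            + W_WAYPOINT * c.waypoint)


def rank(candidates):
    """Order candidates from highest to lowest composite score."""
    return sorted(candidates, key=composite_score, reverse=True)


candidates = [
    Candidate("close_but_stale", similarity=0.90, salience=0.10, recency=0.10, waypoint=0.0),
    Candidate("linked_and_salient", similarity=0.70, salience=0.90, recency=0.80, waypoint=0.85),
]
ranked = rank(candidates)
for c in ranked:
    print(c.memory_id, round(composite_score(c), 3))
```

With these made-up inputs, the memory that is less similar but salient, recent, and well linked outranks the closer but stale one, which is the behavior the weighting is meant to produce.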
Use Cases
- Long-term conversation assistants: Remember user preferences, habits, and history across sessions without repeating context
- Agent framework memory layer: Shared long-term memory store for CrewAI, AutoGen, LangGraph agents
- Knowledge worker tools: Ingest GitHub, Notion, Google Drive content; agents can ask "what was the design decision from last week?"
- Coding assistants: Persist code preferences, project context, tech stack choices across sessions
- Emotion-aware applications: Emotional sector stores sentiment separately, preventing it from polluting factual memory retrieval
Quick Start
Python SDK (local SQLite, zero config):
pip install openmemory-py
import asyncio
from openmemory.client import Memory

async def main():
    mem = Memory()
    # Add memories
    await mem.add("user is allergic to peanuts", user_id="user123")
    await mem.add("user prefers coding at night", user_id="user123")
    # Query memories (composite score ranking)
    results = await mem.search("what dietary restrictions does the user have?", user_id="user123")
    # Reinforce a memory (boost salience)
    await mem.reinforce("memory_id")
    # Delete a memory
    await mem.delete("memory_id")

# The SDK calls are coroutines, so a script needs an event loop to run them
asyncio.run(main())
Node.js SDK:
npm install openmemory-js
import { Memory } from "openmemory-js"
const mem = new Memory()
await mem.add("user prefers TypeScript over Python", { user_id: "u1" })
const results = await mem.search("language preference", { user_id: "u1" })
LangChain integration:
from openmemory.integrations.langchain import OpenMemoryChatMessageHistory
history = OpenMemoryChatMessageHistory(memory=mem, user_id="u1")
# Drop-in replacement for LangChain's ConversationBufferMemory
OpenAI interceptor pattern:
from openai import OpenAI
from openmemory.client import Memory

mem = Memory()
client = mem.openai.register(OpenAI(), user_id="u1")
# All subsequent chat.completions.create calls automatically store/retrieve memory
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What language should I use?"}]
)
Ingesting external data sources:
# Ingest a GitHub repository
github = mem.source("github")
await github.connect(token="ghp_...")
await github.ingest_all(repo="owner/repo")
# Ingest Notion pages
notion = mem.source("notion")
await notion.connect(token="secret_...")
await notion.ingest_all(database_id="xxx")
Available connectors: github, notion, google_drive, google_sheets, google_slides, onedrive, web_crawler
MCP integration (Claude Code / Cursor):
# Claude Code
claude mcp add --transport http openmemory http://localhost:8080/mcp
// Cursor .mcp.json
{
  "mcpServers": {
    "openmemory": {
      "type": "http",
      "url": "http://localhost:8080/mcp"
    }
  }
}
Available MCP tools: openmemory_query, openmemory_store, openmemory_list, openmemory_get, openmemory_reinforce
The Five Memory Sectors
| Sector | Meaning | Decay Rate | Weight |
|---|---|---|---|
| episodic | Events and experiences (what happened, when) | 0.015 (fast) | 1.2 |
| semantic | Facts and knowledge (user preferences, domain knowledge) | 0.005 (slow) | 1.0 |
| procedural | Skills and workflows (how to do something) | 0.008 (medium) | 1.1 |
| emotional | Feelings and attitudes (how something felt) | 0.020 (fastest) | 1.3 |
| reflective | Meta-cognition and insights (what was realized) | 0.001 (near-permanent) | 0.8 |
Decay formula: salience × e^(-decay_lambda × days_since_last_seen)
Decay runs every 24 hours. Waypoint links with weight < 0.05 are pruned every 7 days.
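To make the decay rates concrete, here is a small sketch that applies the formula above to each sector and derives the corresponding half-life (ln 2 / decay_lambda). The rates come from the table; the function names are illustrative and are not part of the OpenMemory SDK.

```python
import math

# Per-sector decay rates (decay_lambda, per day) from the table above.
SECTOR_DECAY = {
    "episodic": 0.015,
    "semantic": 0.005,
    "procedural": 0.008,
    "emotional": 0.020,
    "reflective": 0.001,
}


def decayed_salience(salience: float, sector: str, days_since_last_seen: float) -> float:
    """salience * e^(-decay_lambda * days_since_last_seen)."""
    return salience * math.exp(-SECTOR_DECAY[sector] * days_since_last_seen)


def half_life_days(sector: str) -> float:
    """Days until salience halves if the memory is never seen again."""
    return math.log(2) / SECTOR_DECAY[sector]


for sector in SECTOR_DECAY:
    after_30 = decayed_salience(1.0, sector, 30)
    print(f"{sector:<11} after 30 days: {after_30:.3f}  "
          f"half-life: {half_life_days(sector):.0f} days")
```

Under these rates an untouched emotional memory loses half its salience in roughly 35 days, while a reflective one takes about 693 days, which is why the table calls it near-permanent.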
Deep Dive
HMD v2 Architecture
Input content
↓
Sector Classifier
├── Identifies primary sector + additional sectors
└── Based on keyword patterns + context
↓
Multi-Sector Embedding
├── Independent embedding vector per relevant sector
├── Providers: OpenAI / Gemini / AWS / Ollama / local / synthetic
└── Compute mean vector across all sector vectors
↓
Storage (SQLite / Postgres)
├── memories table: content + metadata + salience + decay parameters
├── vectors table: one vector per memory × per sector
└── waypoints table: single strongest associative link per memory
↓
Query: Composite Scoring
score = 0.6×similarity + 0.2×salience + 0.1×recency + 0.1×waypoint weight
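The first two stages of the pipeline above can be sketched in a few lines. This is only an illustration under stated assumptions: the keyword lists are invented (the article does not show the real classifier's patterns), and `toy_embed` is a deterministic stand-in for an embedding provider, not a real model.

```python
import hashlib

# Hypothetical keyword patterns, invented for this example.
SECTOR_KEYWORDS = {
    "episodic": ("yesterday", "last week", "at night", "today"),
    "semantic": ("prefers", "is allergic", "always"),
    "procedural": ("how to", "step by step", "workflow"),
    "emotional": ("feels", "happy", "frustrated", "productive"),
    "reflective": ("realized", "learned that", "in hindsight"),
}


def classify(text: str) -> list:
    """Return matching sectors, strongest match first; fall back to semantic."""
    lowered = text.lower()
    hits = {s: sum(k in lowered for k in kws) for s, kws in SECTOR_KEYWORDS.items()}
    matched = [s for s, n in sorted(hits.items(), key=lambda kv: -kv[1]) if n > 0]
    return matched or ["semantic"]


def toy_embed(text: str, sector: str, dim: int = 8) -> list:
    """Deterministic placeholder vector derived from a hash (not a real embedding)."""
    digest = hashlib.sha256(f"{sector}:{text}".encode()).digest()
    return [b / 255.0 for b in digest[:dim]]


def mean_vector(vectors: list) -> list:
    """Element-wise mean across the per-sector vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


text = "User prefers coding at night, feels productive"
sectors = classify(text)                       # first entry acts as the primary sector
sector_vectors = {s: toy_embed(text, s) for s in sectors}
mean_vec = mean_vector(list(sector_vectors.values()))
print(sectors, len(mean_vec))
```

The point of keeping one vector per sector plus a mean vector is that queries can target a single sector, while the mean vector gives one compact representation for waypoint matching.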
The Waypoint Association Graph
This is one of the key architectural choices that distinguishes OpenMemory from vector databases:
Memory A ──0.85──▶ Memory B
(only the single strongest link is kept)
When a memory is added, the system finds the single most similar existing memory (cosine similarity ≥ 0.75) and creates a one-way Waypoint link. Cross-sector links are bidirectional.
At query time, after top-K vector retrieval, a 1-hop graph traversal expands the results to include memories linked via Waypoints. Each time a Waypoint is traversed, its weight increases by 0.05 (capped at 1.0), so memories that are frequently recalled together develop stronger links over time.
The practical effect: a query that isn't semantically close to a memory can still surface it if a linked memory scores high. This creates emergent associative recall rather than pure similarity search.
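The linking and expansion rules described above can be sketched as a small in-memory graph. This is a simplified model that uses the thresholds quoted in the article; the `WaypointGraph` class is hypothetical, and it leaves out the bidirectional cross-sector links and the periodic pruning.

```python
import math

LINK_THRESHOLD = 0.75  # minimum cosine similarity to create a link
REINFORCE_STEP = 0.05  # weight gained each time a link is traversed
MAX_WEIGHT = 1.0


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class WaypointGraph:
    """Hypothetical in-memory model: at most one outgoing link per memory."""

    def __init__(self):
        self.mean_vecs = {}  # memory_id -> mean vector
        self.links = {}      # src_id -> (dst_id, weight)

    def add(self, memory_id, mean_vec):
        # Find the single most similar existing memory, if any.
        best = max(
            ((cosine(mean_vec, vec), other) for other, vec in self.mean_vecs.items()),
            default=None,
        )
        if best is not None and best[0] >= LINK_THRESHOLD:
            self.links[memory_id] = (best[1], best[0])
        self.mean_vecs[memory_id] = mean_vec

    def expand(self, hits):
        """1-hop expansion of top-K hits; reinforces every traversed link."""
        results = list(hits)
        for memory_id in hits:
            if memory_id in self.links:
                dst, weight = self.links[memory_id]
                self.links[memory_id] = (dst, min(MAX_WEIGHT, weight + REINFORCE_STEP))
                if dst not in results:
                    results.append(dst)
        return results


graph = WaypointGraph()
graph.add("a", [1.0, 0.0])
graph.add("b", [0.8, 0.6])   # cosine with "a" is 0.8, so b -> a is created
graph.add("c", [0.0, 1.0])   # best match is 0.6, below the threshold: no link
expanded = graph.expand(["b"])
print(expanded, graph.links)
```

Here a query that only retrieves `b` also surfaces `a` through the link, and the traversal nudges the link weight from about 0.80 to about 0.85.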
Temporal Knowledge Graph
OpenMemory treats time as a first-class dimension:
# Add a fact in 2021
POST /api/temporal/fact
{
"subject": "CompanyX",
"predicate": "has_CEO",
"object": "Alice",
"valid_from": "2021-01-01"
}
# Update in 2024
POST /api/temporal/fact
{
"subject": "CompanyX",
"predicate": "has_CEO",
"object": "Bob",
"valid_from": "2024-04-10"
}
# Alice's tenure is automatically closed (valid_to = 2024-04-09)
Supported operations:
- valid_from/valid_to truth windows
- Point-in-time queries ("who was CEO of CompanyX in late 2022?")
- Change detection (when did a fact flip?)
- Entity timeline reconstruction
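The window-closing behavior in the CompanyX example can be modeled with a few lines of Python. This is a sketch of the idea only, not OpenMemory's implementation: `TemporalStore` is a hypothetical class, and it assumes facts for a given subject and predicate arrive in chronological order.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    object: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the fact is still true


class TemporalStore:
    """Hypothetical in-memory store illustrating valid_from/valid_to windows."""

    def __init__(self):
        self.facts = []

    def add_fact(self, subject, predicate, obj, valid_from):
        # Close any open window for the same subject/predicate the day before.
        for fact in self.facts:
            if (fact.subject, fact.predicate) == (subject, predicate) and fact.valid_to is None:
                fact.valid_to = valid_from - timedelta(days=1)
        self.facts.append(Fact(subject, predicate, obj, valid_from))

    def as_of(self, subject, predicate, when):
        """Point-in-time query: which object was valid on the given date?"""
        for fact in self.facts:
            if ((fact.subject, fact.predicate) == (subject, predicate)
                    and fact.valid_from <= when
                    and (fact.valid_to is None or when <= fact.valid_to)):
                return fact.object
        return None

    def timeline(self, subject, predicate):
        """Entity timeline: all windows for one subject/predicate, oldest first."""
        matching = [f for f in self.facts if (f.subject, f.predicate) == (subject, predicate)]
        return sorted(matching, key=lambda f: f.valid_from)


store = TemporalStore()
store.add_fact("CompanyX", "has_CEO", "Alice", date(2021, 1, 1))
store.add_fact("CompanyX", "has_CEO", "Bob", date(2024, 4, 10))

print(store.as_of("CompanyX", "has_CEO", date(2022, 11, 15)))  # late 2022
print(store.as_of("CompanyX", "has_CEO", date(2024, 4, 10)))
```

Change detection falls out of the same structure: every `valid_from` in the timeline after the first marks a date on which the fact flipped.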
Performance Data
At 100k memories (SQLite with WAL mode):
| Operation | Latency |
|---|---|
| Add memory | 80-120 ms (depends on embedding provider) |
| Single-sector query | 110-130 ms |
| Multi-sector fusion (2-3 sectors) | 150-200 ms |
| Waypoint expansion (per hop) | +30-50 ms |
| Decay process (background) | ~10 sec (every 24 hours) |
Storage estimates:
- Single memory: ~4-6 KB (including all sector vectors)
- 100k memories: ~500 MB
- 1M memories: ~5 GB
vs. SaaS Alternatives
| Dimension | OpenMemory | Supermemory | OpenAI Memory |
|---|---|---|---|
| Hosting | Self-hosted | Cloud only | Cloud only |
| Query latency | 110-130 ms | 350-400 ms | ~300 ms |
| Cost per 1M tokens | ~$0.30-0.40 | ~$2.50+ | ~$3.00+ |
| Explainability | Fully transparent (Waypoint trace) | Black-box | Black-box |
| Local embeddings | Yes (Ollama, local models) | No | No |
| Data ownership | 100% yours | Vendor-held | OpenAI-held |
The cost difference comes from running local embeddings (Ollama, BGE, E5) instead of API-billed embedding calls, and from no cloud infrastructure markup.
Migrating from Other Systems
OpenMemory ships a migration tool to import existing memory data:
cd migrate
# Migrate from Mem0
python -m migrate --from mem0 --api-key MEM0_KEY --verify
# Migrate from Zep
python -m migrate --from zep --api-key ZEP_KEY --verify
# Migrate from Supermemory
python -m migrate --from supermemory --api-key SM_KEY --verify
Database Schema
The core SQLite schema makes the architecture transparent:
-- A memory record with salience and decay parameters
CREATE TABLE memories (
id TEXT PRIMARY KEY,
content TEXT NOT NULL,
primary_sector TEXT NOT NULL,
salience REAL, -- 0-1 importance score
decay_lambda REAL, -- sector-specific decay rate
last_seen_at INTEGER, -- for decay calculation
mean_vec BLOB -- mean vector for waypoint matching
);
-- One vector per memory per sector
CREATE TABLE vectors (
id TEXT NOT NULL,
sector TEXT NOT NULL,
v BLOB NOT NULL, -- float32 vector
PRIMARY KEY (id, sector)
);
-- Single strongest associative link per memory
CREATE TABLE waypoints (
src_id TEXT PRIMARY KEY,
dst_id TEXT NOT NULL,
weight REAL NOT NULL -- 0-1 link strength
);
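To see how the three tables fit together, the sketch below creates the schema in an in-memory SQLite database and writes one memory, its sector vectors, and a waypoint. The little-endian float32 packing and the Unix-timestamp `last_seen_at` are assumptions made for the example; the article only says the blobs hold float32 vectors.

```python
import sqlite3
import struct

SCHEMA = """
CREATE TABLE memories (
    id TEXT PRIMARY KEY, content TEXT NOT NULL, primary_sector TEXT NOT NULL,
    salience REAL, decay_lambda REAL, last_seen_at INTEGER, mean_vec BLOB
);
CREATE TABLE vectors (
    id TEXT NOT NULL, sector TEXT NOT NULL, v BLOB NOT NULL,
    PRIMARY KEY (id, sector)
);
CREATE TABLE waypoints (
    src_id TEXT PRIMARY KEY, dst_id TEXT NOT NULL, weight REAL NOT NULL
);
"""


def pack_vector(values):
    """Serialize floats as little-endian float32 (byte order is an assumption)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

sector_vectors = {"semantic": [0.5, 0.25], "emotional": [0.25, 0.75]}
mean_vec = [0.375, 0.5]  # element-wise mean of the two vectors above

conn.execute(
    "INSERT INTO memories VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("m1", "user prefers coding at night", "semantic", 0.8, 0.005,
     1700000000, pack_vector(mean_vec)),
)
conn.executemany(
    "INSERT INTO vectors (id, sector, v) VALUES (?, ?, ?)",
    [("m1", sector, pack_vector(vec)) for sector, vec in sector_vectors.items()],
)

# src_id is the primary key, so a stronger link replaces the old one.
conn.execute("INSERT INTO waypoints VALUES (?, ?, ?)", ("m1", "m0", 0.8))
conn.execute("INSERT OR REPLACE INTO waypoints VALUES (?, ?, ?)", ("m1", "m2", 0.9))

rows = conn.execute(
    "SELECT sector, v FROM vectors WHERE id = ? ORDER BY sector", ("m1",)
).fetchall()
stored = {sector: unpack_vector(blob) for sector, blob in rows}
link = conn.execute(
    "SELECT dst_id, weight FROM waypoints WHERE src_id = ?", ("m1",)
).fetchone()
link_count = conn.execute("SELECT COUNT(*) FROM waypoints").fetchone()[0]
print(stored, link, link_count)
```

Making `src_id` the primary key of `waypoints` is what enforces the single-strongest-link rule at the storage level: a memory cannot hold two outgoing links at once.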
Resources
Official Links
- 🌟 GitHub: CaviraOSS/OpenMemory
- 📦 PyPI: openmemory-py
- 📦 npm: openmemory-js
- 🔌 VS Code Extension: openmemory-vscode
- 📄 Architecture docs: ARCHITECTURE.md (in repo)
Summary
OpenMemory's contribution is treating "memory" as a serious engineering problem rather than a marketing term.
When most developers talk about "AI memory," they mean retrieval-augmented generation — store vectors, retrieve by similarity. OpenMemory's position is that this is search, not memory. A real memory system needs to know whether something is a fact or an emotion, whether it's recent or historical, whether it's still true today, and what else it relates to.
The five-sector model, Waypoint graph, temporal knowledge graph, and composite scoring aren't complexity for its own sake — they map to distinct dimensions of how human memory actually works.
For developers building agent applications that need cross-session memory, OpenMemory is one of the most architecturally coherent open-source options available today: self-hosted, local-first, framework-agnostic, explainable, and performance-predictable. Three lines of code to integrate, a full server mode when you need it.