WonderLab

Open Source Project #100: OpenMemory — A Real Cognitive Memory Engine for AI Agents

Introduction

"A vector database remembers what was said. OpenMemory remembers what it meant, when it happened, how it felt, and why it matters."

This is article #100 in the "One Open Source Project a Day" series. Today's project is OpenMemory — a self-hosted cognitive memory engine for LLM applications and AI agents.

LLMs are stateless by design. Most "memory" solutions are really RAG pipelines in disguise: chunk text, embed it in a vector store, retrieve by similarity. They don't understand the type of memory (is this a fact, an event, a skill, or a feeling?), don't track time (was this true last month?), don't model importance (is this more relevant than that?), and don't maintain associations (these two things are related).

OpenMemory's thesis: AI agents deserve an actual memory system, not a vector database with "memory" in the marketing copy.

What You'll Learn

  • The five-sector memory model: episodic, semantic, procedural, emotional, reflective — what each means and how they decay at different rates
  • HMD v2 architecture: how Hierarchical Memory Decomposition works
  • Waypoint association graph: a graph that keeps only the single strongest link per memory, plus the composite scoring formula
  • Temporal knowledge graph: valid_from / valid_to and fact evolution
  • The fundamental difference from RAG and vector databases
  • Three operating modes: embedded SDK, standalone server, MCP interface

Prerequisites

  • Basic understanding of LLM agents
  • Familiarity with LangChain, CrewAI, or similar agent frameworks
  • Basic understanding of vector embeddings and cosine similarity

Project Background

Overview

OpenMemory is an open-source cognitive memory engine built on the HMD (Hierarchical Memory Decomposition) v2 architecture. It provides persistent, structured memory for LLM applications and AI agents.

It's not a vector database wrapper. It's not a replacement for a cloud memory API. The design philosophy is: memory is not a database — it's a dynamic system with decay, reinforcement, association, and temporal dimensions.

The project is maintained by CaviraOSS and ships a Python SDK, Node.js SDK, REST API server, VS Code extension, and a native MCP server.

Author / Team

  • Organization: CaviraOSS
  • Primary language: TypeScript/Node.js (server), Python (SDK)
  • License: Apache 2.0
  • VS Code Extension: marketplace.visualstudio.com

Project Stats

  • 📄 License: Apache 2.0
  • 🐍 PyPI: openmemory-py
  • 📦 npm: openmemory-js
  • 🧩 Integrations: LangChain, CrewAI, AutoGen, Streamlit, MCP, VS Code

Features

What It Does

Traditional RAG memory (vector DB):
"User is allergic to peanuts, prefers coding at night, feels productive"
    → one embedding vector
    → retrieved by similarity
    → no structure, no time, no importance weighting

OpenMemory cognitive memory:
"User prefers coding at night, feels productive"
    → semantic sector: "coding preference" (slow decay)
    → emotional sector: "feels productive" (fastest decay)
    → episodic sector: "time: night" (fast decay)
    → Three sector vectors → mean vector → Waypoint link to related memories
    → Composite score = 0.6×similarity + 0.2×salience + 0.1×recency + 0.1×waypoint weight

Use Cases

  1. Long-term conversation assistants: Remember user preferences, habits, and history across sessions without repeating context
  2. Agent framework memory layer: Shared long-term memory store for CrewAI, AutoGen, LangGraph agents
  3. Knowledge worker tools: Ingest GitHub, Notion, Google Drive content; agents can ask "what was the design decision from last week?"
  4. Coding assistants: Persist code preferences, project context, tech stack choices across sessions
  5. Emotion-aware applications: Emotional sector stores sentiment separately, preventing it from polluting factual memory retrieval

Quick Start

Python SDK (local SQLite, zero config):

pip install openmemory-py
Then call the SDK from an async entry point:

import asyncio

from openmemory.client import Memory


async def main():
    mem = Memory()

    # Add memories
    await mem.add("user is allergic to peanuts", user_id="user123")
    await mem.add("user prefers coding at night", user_id="user123")

    # Query memories (ranked by composite score)
    results = await mem.search("what dietary restrictions does the user have?", user_id="user123")
    print(results)

    # Reinforce a memory (boost salience); "memory_id" is a placeholder
    await mem.reinforce("memory_id")

    # Delete a memory
    await mem.delete("memory_id")


asyncio.run(main())

Node.js SDK:

npm install openmemory-js
import { Memory } from "openmemory-js"

const mem = new Memory()
await mem.add("user prefers TypeScript over Python", { user_id: "u1" })
const results = await mem.search("language preference", { user_id: "u1" })

LangChain integration:

from openmemory.integrations.langchain import OpenMemoryChatMessageHistory

history = OpenMemoryChatMessageHistory(memory=mem, user_id="u1")
# Drop-in replacement for LangChain's ConversationBufferMemory

OpenAI interceptor pattern:

from openai import OpenAI
from openmemory.client import Memory

mem = Memory()
client = mem.openai.register(OpenAI(), user_id="u1")
# All subsequent chat.completions.create calls automatically store/retrieve memory
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What language should I use?"}]
)

Ingesting external data sources:

# Run these calls inside an async function (for example via asyncio.run)
# Ingest a GitHub repository
github = mem.source("github")
await github.connect(token="ghp_...")
await github.ingest_all(repo="owner/repo")

# Ingest Notion pages
notion = mem.source("notion")
await notion.connect(token="secret_...")
await notion.ingest_all(database_id="xxx")

Available connectors: github, notion, google_drive, google_sheets, google_slides, onedrive, web_crawler

MCP integration (Claude Code / Cursor):

# Claude Code
claude mcp add --transport http openmemory http://localhost:8080/mcp
// Cursor .mcp.json
{
  "mcpServers": {
    "openmemory": {
      "type": "http",
      "url": "http://localhost:8080/mcp"
    }
  }
}

Available MCP tools: openmemory_query, openmemory_store, openmemory_list, openmemory_get, openmemory_reinforce

The Five Memory Sectors

| Sector | Meaning | Decay rate | Weight |
|---|---|---|---|
| episodic | Events and experiences (what happened, when) | 0.015 (fast) | 1.2 |
| semantic | Facts and knowledge (user preferences, domain knowledge) | 0.005 (slow) | 1.0 |
| procedural | Skills and workflows (how to do something) | 0.008 (medium) | 1.1 |
| emotional | Feelings and attitudes (how something felt) | 0.020 (fastest) | 1.3 |
| reflective | Meta-cognition and insights (what was realized) | 0.001 (near-permanent) | 0.8 |

Decay formula: salience × e^(-decay_lambda × days_since_last_seen)

Decay runs every 24 hours. Waypoint links with weight < 0.05 are pruned every 7 days.
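The decay formula and the pruning rule can be sketched in a few lines of Python. This is a minimal illustration rather than OpenMemory's own code: the names `decayed_salience`, `half_life_days`, and `prune_links` are hypothetical, and the rates are assumed to be per-day values because the formula is expressed in days_since_last_seen.

```python
import math

# Decay rates copied from the sector table; assumed to be per-day values.
DECAY_LAMBDA = {
    "episodic": 0.015,
    "semantic": 0.005,
    "procedural": 0.008,
    "emotional": 0.020,
    "reflective": 0.001,
}

PRUNE_THRESHOLD = 0.05  # Waypoint links weaker than this are dropped


def decayed_salience(salience: float, sector: str, days_since_last_seen: float) -> float:
    """salience * e^(-decay_lambda * days_since_last_seen)."""
    return salience * math.exp(-DECAY_LAMBDA[sector] * days_since_last_seen)


def half_life_days(sector: str) -> float:
    """Days until salience halves if the memory is never seen again."""
    return math.log(2) / DECAY_LAMBDA[sector]


def prune_links(links: dict) -> dict:
    """Keep only links (src -> (dst, weight)) with weight >= PRUNE_THRESHOLD."""
    return {src: (dst, w) for src, (dst, w) in links.items() if w >= PRUNE_THRESHOLD}


if __name__ == "__main__":
    for sector in DECAY_LAMBDA:
        print(f"{sector:<11} half-life: {half_life_days(sector):6.1f} days, "
              f"salience after 30 days: {decayed_salience(1.0, sector, 30):.3f}")
```

Under these assumptions, an untouched emotional memory loses half its salience in roughly 35 days, while a reflective one takes about 693 days, which is what "near-permanent" means in practice.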


Deep Dive

HMD v2 Architecture

Input content
    ↓
Sector Classifier
    ├── Identifies primary sector + additional sectors
    └── Based on keyword patterns + context
    ↓
Multi-Sector Embedding
    ├── Independent embedding vector per relevant sector
    ├── Providers: OpenAI / Gemini / AWS / Ollama / local / synthetic
    └── Compute mean vector across all sector vectors
    ↓
Storage (SQLite / Postgres)
    ├── memories table: content + metadata + salience + decay parameters
    ├── vectors table: one vector per memory × per sector
    └── waypoints table: single strongest associative link per memory
    ↓
Query: Composite Scoring
    score = 0.6×similarity + 0.2×salience + 0.1×recency + 0.1×waypoint weight
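The last two pipeline stages, mean-vector computation and composite scoring, can be sketched as follows. This is an illustrative sketch, not the project's implementation: the `Candidate` class and the helper names are hypothetical, and all four scoring inputs are assumed to be normalized to the 0-1 range (the article does not say how recency is normalized).

```python
from dataclasses import dataclass

# Weights taken from the composite scoring formula in the article.
WEIGHTS = {"similarity": 0.6, "salience": 0.2, "recency": 0.1, "waypoint": 0.1}


def mean_vector(sector_vectors):
    """Element-wise mean of equally sized sector vectors."""
    n = len(sector_vectors)
    return [sum(column) / n for column in zip(*sector_vectors)]


@dataclass
class Candidate:
    memory_id: str
    similarity: float       # cosine similarity to the query
    salience: float         # importance after decay
    recency: float          # assumed normalized: 1.0 = just seen
    waypoint_weight: float  # strength of the link that surfaced it, else 0


def composite_score(c: Candidate) -> float:
    return (WEIGHTS["similarity"] * c.similarity
            + WEIGHTS["salience"] * c.salience
            + WEIGHTS["recency"] * c.recency
            + WEIGHTS["waypoint"] * c.waypoint_weight)


def rank(candidates):
    """Highest composite score first."""
    return sorted(candidates, key=composite_score, reverse=True)


if __name__ == "__main__":
    stale = Candidate("stale", similarity=0.80, salience=0.2, recency=0.1, waypoint_weight=0.0)
    fresh = Candidate("fresh", similarity=0.70, salience=0.9, recency=0.9, waypoint_weight=0.8)
    for c in rank([stale, fresh]):
        print(c.memory_id, round(composite_score(c), 2))
```

In this example the slightly less similar memory wins because it is salient, recent, and linked, which is exactly the behavior pure similarity ranking cannot produce.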

The Waypoint Association Graph

This is one of the key architectural choices that distinguishes OpenMemory from vector databases:

Memory A ──0.85──▶ Memory B
(only the single strongest link is kept)

When a memory is added, the system finds the single most similar existing memory (cosine similarity ≥ 0.75) and creates a one-way Waypoint link. Cross-sector links are bidirectional.

At query time, after top-K vector retrieval, a 1-hop graph traversal expands results to include memories linked via Waypoints. Every time a Waypoint is traversed, its weight increases by 0.05 (capped at 1.0), so memories that are frequently recalled together develop stronger links over time.

The practical effect: a query that isn't semantically close to a memory can still surface it if a linked memory scores high. This creates emergent associative recall rather than pure similarity search.
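The link-creation and expansion rules described above can be sketched with a small in-memory graph. This is a toy model under stated assumptions, not OpenMemory's code: the `WaypointGraph` class is hypothetical, the initial link weight is assumed to equal the cosine similarity, and bidirectional cross-sector links are left out for brevity.

```python
import math

LINK_THRESHOLD = 0.75   # minimum cosine similarity to create a link
REINFORCE_STEP = 0.05   # weight gained each time a link is traversed


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class WaypointGraph:
    def __init__(self):
        self.vectors = {}  # memory id -> mean vector
        self.links = {}    # src id -> (dst id, weight); one link per memory

    def add(self, mem_id, vec):
        """Store a memory and link it to its single most similar predecessor."""
        best_id, best_sim = None, 0.0
        for other_id, other_vec in self.vectors.items():
            sim = cosine(vec, other_vec)
            if sim > best_sim:
                best_id, best_sim = other_id, sim
        self.vectors[mem_id] = vec
        if best_id is not None and best_sim >= LINK_THRESHOLD:
            # Assumption: the link starts at the similarity value.
            self.links[mem_id] = (best_id, best_sim)

    def expand(self, hit_ids):
        """1-hop expansion of top-K hits; traversed links are reinforced."""
        expanded = list(hit_ids)
        for src in hit_ids:
            link = self.links.get(src)
            if link is None:
                continue
            dst, weight = link
            self.links[src] = (dst, min(1.0, weight + REINFORCE_STEP))
            if dst not in expanded:
                expanded.append(dst)
        return expanded


if __name__ == "__main__":
    g = WaypointGraph()
    g.add("a", [1.0, 0.0])
    g.add("b", [0.9, 0.1])   # close to "a", so b -> a is created
    g.add("c", [0.0, 1.0])   # below the threshold for both, no link
    print(g.expand(["b"]), g.links)
```

Because `links` is keyed by the source memory, each memory can hold at most one outgoing link, which mirrors the "single strongest link" rule.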

Temporal Knowledge Graph

OpenMemory treats time as a first-class dimension:

# Add a fact in 2021
POST /api/temporal/fact
{
  "subject": "CompanyX",
  "predicate": "has_CEO",
  "object": "Alice",
  "valid_from": "2021-01-01"
}

# Update in 2024
POST /api/temporal/fact
{
  "subject": "CompanyX",
  "predicate": "has_CEO",
  "object": "Bob",
  "valid_from": "2024-04-10"
}
# Alice's tenure is automatically closed (valid_to = 2024-04-09)

Supported operations:

  • valid_from / valid_to truth windows
  • Point-in-time queries ("who was CEO of CompanyX in late 2022?")
  • Change detection (when did a fact flip?)
  • Entity timeline reconstruction
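The truth-window behavior can be illustrated with a small in-memory model. It is a sketch under assumptions, not the server's implementation: `TemporalStore`, `add_fact`, and `as_of` are hypothetical names, and closing the previous fact one day before the new `valid_from` simply mirrors the CompanyX example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still true"


class TemporalStore:
    def __init__(self):
        self.facts = []

    def add_fact(self, subject, predicate, obj, valid_from):
        """Add a fact and close any open fact for the same subject/predicate."""
        for f in self.facts:
            if (f.subject, f.predicate) == (subject, predicate) and f.valid_to is None:
                f.valid_to = valid_from - timedelta(days=1)
        self.facts.append(Fact(subject, predicate, obj, valid_from))

    def as_of(self, subject, predicate, when):
        """Point-in-time query: the object that was valid on the given date."""
        for f in self.facts:
            if (f.subject, f.predicate) != (subject, predicate):
                continue
            if f.valid_from <= when and (f.valid_to is None or when <= f.valid_to):
                return f.obj
        return None

    def timeline(self, subject, predicate):
        """Entity timeline: (object, valid_from, valid_to) in chronological order."""
        rows = [f for f in self.facts if (f.subject, f.predicate) == (subject, predicate)]
        return [(f.obj, f.valid_from, f.valid_to) for f in sorted(rows, key=lambda f: f.valid_from)]


if __name__ == "__main__":
    store = TemporalStore()
    store.add_fact("CompanyX", "has_CEO", "Alice", date(2021, 1, 1))
    store.add_fact("CompanyX", "has_CEO", "Bob", date(2024, 4, 10))
    print(store.as_of("CompanyX", "has_CEO", date(2022, 11, 15)))
    print(store.timeline("CompanyX", "has_CEO"))
```

The point-in-time query for late 2022 returns Alice, and the timeline shows her window closing on 2024-04-09, the day before Bob's begins.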

Performance Data

At 100k memories (SQLite with WAL mode):

| Operation | Latency |
|---|---|
| Add memory | 80-120 ms (depends on embedding provider) |
| Single-sector query | 110-130 ms |
| Multi-sector fusion (2-3 sectors) | 150-200 ms |
| Waypoint expansion (per hop) | +30-50 ms |
| Decay process (background) | ~10 sec (every 24 hours) |

Storage estimates:

  • Single memory: ~4-6 KB (including all sector vectors)
  • 100k memories: ~500 MB
  • 1M memories: ~5 GB

vs. SaaS Alternatives

| Dimension | OpenMemory | Supermemory | OpenAI Memory |
|---|---|---|---|
| Hosting | Self-hosted | Cloud only | Cloud only |
| Query latency | 110-130 ms | 350-400 ms | ~300 ms |
| Cost per 1M tokens | ~$0.30-0.40 | ~$2.50+ | ~$3.00+ |
| Explainability | Fully transparent (Waypoint trace) | Black-box | Black-box |
| Local embeddings | Yes (Ollama, local models) | No | No |
| Data ownership | 100% yours | Vendor-held | OpenAI-held |

The cost difference comes from running local embeddings (Ollama, BGE, E5) instead of API-billed embedding calls, and from avoiding cloud infrastructure markup.

Migrating from Other Systems

OpenMemory ships a migration tool to import existing memory data:

cd migrate

# Migrate from Mem0
python -m migrate --from mem0 --api-key MEM0_KEY --verify

# Migrate from Zep
python -m migrate --from zep --api-key ZEP_KEY --verify

# Migrate from Supermemory
python -m migrate --from supermemory --api-key SM_KEY --verify

Database Schema

The core SQLite schema makes the architecture transparent:

-- A memory record with salience and decay parameters
CREATE TABLE memories (
  id TEXT PRIMARY KEY,
  content TEXT NOT NULL,
  primary_sector TEXT NOT NULL,
  salience REAL,           -- 0-1 importance score
  decay_lambda REAL,       -- sector-specific decay rate
  last_seen_at INTEGER,    -- for decay calculation
  mean_vec BLOB            -- mean vector for waypoint matching
);

-- One vector per memory per sector
CREATE TABLE vectors (
  id TEXT NOT NULL,
  sector TEXT NOT NULL,
  v BLOB NOT NULL,         -- float32 vector
  PRIMARY KEY (id, sector)
);

-- Single strongest associative link per memory
CREATE TABLE waypoints (
  src_id TEXT PRIMARY KEY,
  dst_id TEXT NOT NULL,
  weight REAL NOT NULL     -- 0-1 link strength
);
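To see how the three tables fit together, the schema can be loaded into an in-memory SQLite database with Python's standard `sqlite3` module. This is a sketch for exploration only: the sample rows are made up, and packing vectors as little-endian float32 is an assumption (the schema only says "float32 vector").

```python
import sqlite3
import struct
import time

SCHEMA = """
CREATE TABLE memories (
  id TEXT PRIMARY KEY,
  content TEXT NOT NULL,
  primary_sector TEXT NOT NULL,
  salience REAL,
  decay_lambda REAL,
  last_seen_at INTEGER,
  mean_vec BLOB
);
CREATE TABLE vectors (
  id TEXT NOT NULL,
  sector TEXT NOT NULL,
  v BLOB NOT NULL,
  PRIMARY KEY (id, sector)
);
CREATE TABLE waypoints (
  src_id TEXT PRIMARY KEY,
  dst_id TEXT NOT NULL,
  weight REAL NOT NULL
);
"""


def pack_vec(values):
    """Serialize floats as little-endian float32 (byte order is an assumption)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vec(blob):
    """Inverse of pack_vec: 4 bytes per float32 value."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

now = int(time.time())
insert_memory = "INSERT INTO memories VALUES (?, ?, ?, ?, ?, ?, ?)"
conn.execute(insert_memory, ("m1", "user prefers coding at night", "semantic",
                             0.7, 0.005, now, pack_vec([0.5, 0.25])))
conn.execute(insert_memory, ("m2", "user feels productive at night", "emotional",
                             0.6, 0.020, now, pack_vec([0.5, 0.5])))
conn.execute("INSERT INTO vectors VALUES (?, ?, ?)", ("m1", "semantic", pack_vec([0.5, 0.25])))
conn.execute("INSERT INTO waypoints VALUES (?, ?, ?)", ("m2", "m1", 0.85))

# Follow m2's single outgoing Waypoint to the memory it points at.
linked = conn.execute(
    "SELECT m.id, m.content, w.weight FROM waypoints w "
    "JOIN memories m ON m.id = w.dst_id WHERE w.src_id = ?",
    ("m2",),
).fetchone()
print(linked)
```

Note that `src_id` is the primary key of `waypoints`, so the database itself enforces the one-outgoing-link-per-memory rule: a second insert for the same source would fail with an integrity error.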


Summary

OpenMemory's contribution is treating "memory" as a serious engineering problem rather than a marketing term.

When most developers talk about "AI memory," they mean retrieval-augmented generation — store vectors, retrieve by similarity. OpenMemory's position is that this is search, not memory. A real memory system needs to know whether something is a fact or an emotion, whether it's recent or historical, whether it's still true today, and what else it relates to.

The five-sector model, Waypoint graph, temporal knowledge graph, and composite scoring aren't complexity for its own sake — they map to distinct dimensions of how human memory actually works.

For developers building agent applications that need cross-session memory, OpenMemory is one of the most architecturally coherent open-source options available today: self-hosted, local-first, framework-agnostic, explainable, and performance-predictable. Three lines of code to integrate, a full server mode when you need it.


