Overview — What is SharedMemory?

SharedMemory is the persistent memory layer for AI agents. It lets any AI — whether it's a Claude MCP tool, a custom agent, or a CLI script — remember facts, learn from documents, and share knowledge across tools and teammates.

With SharedMemory, developers get production-ready infrastructure for:

Agent memory — Auto-extracted knowledge graphs from conversations and documents
Guard system — Intelligent validation that prevents duplicates, conflicts, and noise
Hybrid retrieval — Vector search + graph traversal for high-recall context retrieval
Shared volumes — Isolated, shareable memory spaces with real-time sync

How does it work? (at a glance)

✏️ Write

Agents write facts, documents, and conversations to SharedMemory. Each entry goes through an intelligent guard pipeline that checks for duplicates, conflicts, and quality.

🧠 Extract

The knowledge extraction engine automatically builds a structured knowledge graph — entities, facts, relationships, and summaries — from raw text.

🔍 Search

Hybrid retrieval combines vector semantic search (Qdrant) with knowledge graph traversal (Neo4j) to find the most relevant context for any query.
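One way to picture how two result lists might be combined is rank fusion. The sketch below uses reciprocal rank fusion (RRF), a generic technique chosen here for illustration; the source does not say which merging strategy SharedMemory uses. The `RankedHit` shape, the `fuseRankings` function, and the sample hits are all hypothetical, and no Qdrant or Neo4j client is called.

```typescript
// Hypothetical result shape; not taken from the SharedMemory SDK.
interface RankedHit {
  id: string;   // identifier of a memory or entity
  text: string; // snippet returned to the caller
}

/**
 * Merge ranked lists with reciprocal rank fusion (RRF):
 * each list contributes 1 / (k + rank) to a hit's score, with rank starting at 1.
 */
function fuseRankings(lists: RankedHit[][], k = 60): RankedHit[] {
  const scores = new Map<string, { hit: RankedHit; score: number }>();
  for (const list of lists) {
    list.forEach((hit, index) => {
      const contribution = 1 / (k + index + 1);
      const existing = scores.get(hit.id);
      if (existing) {
        existing.score += contribution; // hit found by more than one retriever
      } else {
        scores.set(hit.id, { hit, score: contribution });
      }
    });
  }
  return Array.from(scores.values())
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.hit);
}

// Stand-ins for a vector search result list and a graph traversal result list.
const vectorHits: RankedHit[] = [
  { id: "m1", text: "John built the ProjectAlpha frontend in React" },
  { id: "m2", text: "React is a UI library" },
];
const graphHits: RankedHit[] = [
  { id: "m3", text: "John WORKED_AT Google" },
  { id: "m1", text: "John built the ProjectAlpha frontend in React" },
];

const fused = fuseRankings([vectorHits, graphHits]);
console.log(fused.map((hit) => hit.id));
```

A hit that appears in both lists accumulates score from each, so it tends to rise above hits found by only one retriever.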

👥 Share

Volumes are isolated memory spaces: private volumes for personal use, shared volumes for team collaboration. Changes sync in real time over WebSocket.
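The isolation rule can be sketched as a simple access check. Everything below is an assumption made for illustration: the `MemoryVolume` fields, the owner/member model, and the function names are hypothetical and not taken from the SharedMemory API. Real-time sync is not modeled.

```typescript
type VolumeVisibility = "private" | "shared";

// Hypothetical volume record; field names are illustrative only.
interface MemoryVolume {
  id: string;
  visibility: VolumeVisibility;
  ownerId: string;
  memberIds: string[]; // only consulted for shared volumes
}

interface StoredMemory {
  volumeId: string;
  text: string;
}

// The owner can always read; other users need a shared volume and membership.
function canRead(volume: MemoryVolume, userId: string): boolean {
  if (volume.ownerId === userId) return true;
  return volume.visibility === "shared" && volume.memberIds.includes(userId);
}

// Restrict a set of memories to the volumes a user is allowed to read.
function visibleMemories(
  memories: StoredMemory[],
  volumes: MemoryVolume[],
  userId: string,
): StoredMemory[] {
  const readable = new Set(
    volumes.filter((volume) => canRead(volume, userId)).map((volume) => volume.id),
  );
  return memories.filter((memory) => readable.has(memory.volumeId));
}

// Sample data for the sketch.
const sampleVolumes: MemoryVolume[] = [
  { id: "alice-notes", visibility: "private", ownerId: "alice", memberIds: [] },
  { id: "team-kb", visibility: "shared", ownerId: "alice", memberIds: ["bob"] },
];
const sampleMemories: StoredMemory[] = [
  { volumeId: "alice-notes", text: "Draft ideas for the Q3 roadmap" },
  { volumeId: "team-kb", text: "ProjectAlpha uses React" },
];

const bobView = visibleMemories(sampleMemories, sampleVolumes, "bob");
const aliceView = visibleMemories(sampleMemories, sampleVolumes, "alice");
console.log(bobView.length, aliceView.length);
```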

SharedMemory is context engineering

Unlike simple RAG systems that just chunk and embed documents, SharedMemory builds a living knowledge graph from your data:

Entities — People, projects, technologies, organizations extracted from your content
Facts — Specific, attributed pieces of knowledge linked to entities
Relationships — How entities connect (e.g., John WORKED_AT Google, React USED_IN ProjectAlpha)
Summaries — Auto-generated, evolving summaries for each entity
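As a rough mental model, the four element types above could be represented with shapes like the following. This is a sketch under assumptions: the interface names, fields, and the `describeEntity` helper are hypothetical and do not reflect SharedMemory's actual schema. The sample relationships reuse the examples from the list above; the source label and summary text are made-up sample values.

```typescript
type EntityKind = "person" | "project" | "technology" | "organization";

// Hypothetical shapes; the real schema may differ.
interface GraphEntity {
  id: string;
  kind: EntityKind;
  label: string;
  summary: string; // evolving summary text, empty until generated
}

interface GraphFact {
  entityId: string;  // entity the fact is linked to
  statement: string;
  source: string;    // attribution, e.g. a conversation or document id
}

interface GraphRelationship {
  fromId: string;
  type: string; // e.g. WORKED_AT, USED_IN
  toId: string;
}

interface KnowledgeGraph {
  entities: GraphEntity[];
  facts: GraphFact[];
  relationships: GraphRelationship[];
}

const graph: KnowledgeGraph = {
  entities: [
    { id: "john", kind: "person", label: "John", summary: "Worked at Google." },
    { id: "google", kind: "organization", label: "Google", summary: "" },
    { id: "react", kind: "technology", label: "React", summary: "" },
    { id: "project-alpha", kind: "project", label: "ProjectAlpha", summary: "" },
  ],
  facts: [
    { entityId: "john", statement: "John worked at Google", source: "sample-conversation" },
  ],
  relationships: [
    { fromId: "john", type: "WORKED_AT", toId: "google" },
    { fromId: "react", type: "USED_IN", toId: "project-alpha" },
  ],
};

// Render every relationship that touches an entity as "A TYPE B".
function describeEntity(g: KnowledgeGraph, entityId: string): string[] {
  const labelOf = (id: string): string =>
    g.entities.find((entity) => entity.id === id)?.label ?? id;
  return g.relationships
    .filter((rel) => rel.fromId === entityId || rel.toId === entityId)
    .map((rel) => `${labelOf(rel.fromId)} ${rel.type} ${labelOf(rel.toId)}`);
}

console.log(describeEntity(graph, "john"));
```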

This means when you ask "What does John know about React?", SharedMemory doesn't just do keyword matching — it traverses the graph from John → skills → React and synthesizes a real answer.
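The traversal idea can be illustrated with a breadth-first search over relationship edges. This is a minimal in-memory sketch, not SharedMemory's query engine (which, per the text above, relies on Neo4j). The `GraphEdge` shape, the `findPath` function, and the `HAS_SKILL` relation name are assumptions introduced for the example.

```typescript
// Hypothetical directed edge between two entity names.
interface GraphEdge {
  from: string;
  relation: string;
  to: string;
}

/**
 * Breadth-first search from start to goal, following edge direction.
 * Returns the list of visited entity names on a shortest path, or null.
 */
function findPath(edges: GraphEdge[], start: string, goal: string): string[] | null {
  // Maps each discovered node to the node it was reached from.
  const previous = new Map<string, string | null>([[start, null]]);
  const queue: string[] = [start];

  while (queue.length > 0) {
    const current = queue.shift() as string;
    if (current === goal) {
      // Walk the predecessor chain back to the start.
      const path: string[] = [];
      let step: string | null = current;
      while (step !== null) {
        path.unshift(step);
        step = previous.get(step) ?? null;
      }
      return path;
    }
    for (const edge of edges) {
      if (edge.from === current && !previous.has(edge.to)) {
        previous.set(edge.to, current);
        queue.push(edge.to);
      }
    }
  }
  return null;
}

const sampleEdges: GraphEdge[] = [
  { from: "John", relation: "HAS_SKILL", to: "React" }, // hypothetical relation name
  { from: "John", relation: "WORKED_AT", to: "Google" },
  { from: "React", relation: "USED_IN", to: "ProjectAlpha" },
];

console.log(findPath(sampleEdges, "John", "ProjectAlpha"));
```

A keyword match on "John" and "ProjectAlpha" would find nothing connecting them in a single sentence, while the traversal links them through React.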

Tip: All knowledge extraction is automatic. You don't define entities, tag relationships, or clean up stale data. Just write memories and search naturally.

Architecture

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  SDK / CLI  │────▶│   Agent API  │────▶│  Guard LLM  │
│  MCP Server │     │  (Express)   │     │  (Approve/  │
│  Dashboard  │     │              │     │   Reject)   │
└─────────────┘     └──────┬───────┘     └──────┬──────┘
                           │                     │
                    ┌──────▼───────┐      ┌──────▼──────┐
                    │  PostgreSQL  │      │  BullMQ     │
                    │  (Source of  │      │  (Redis)    │
                    │   Truth)     │      │  Job Queue  │
                    └──────────────┘      └──────┬──────┘
                                                 │
                    ┌────────────────────────────▼┐
                    │     Knowledge Pipeline       │
                    │  ┌────────┐  ┌───────────┐  │
                    │  │ Qdrant │  │   Neo4j   │  │
                    │  │(Vector)│  │  (Graph)  │  │
                    │  └────────┘  └───────────┘  │
                    └──────────────────────────────┘

Ingestion and Extraction

When data enters SharedMemory:

Classification — Is this worth indexing? The message classifier detects factual content vs. casual conversation.
Guard Check — The tiered guard system (similarity check → fast LLM → full LLM) validates the entry against existing knowledge.
Embedding — Text is embedded using all-MiniLM-L6-v2 for semantic search.
Knowledge Extraction — LLM extracts entities, facts, and relationships from the text.
Graph Indexing — Entities and relationships are upserted into Neo4j with deduplication.
Summary Generation — Entity summaries are auto-generated and updated as new facts arrive.
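The tiered guard in the Guard Check step (similarity check → fast LLM → full LLM) can be sketched as a chain in which each tier either decides or escalates to the next. The sketch below is an assumption-heavy illustration: the thresholds, type names, and the two LLM tiers (replaced here by stub functions) are hypothetical, and a real implementation would call models asynchronously.

```typescript
type GuardVerdict = "approve" | "reject" | "escalate";

interface Candidate {
  text: string;
  embedding: number[];
}

// A tier either settles the question or passes it on.
interface GuardTier {
  label: string;
  check: (candidate: Candidate) => GuardVerdict;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Tier 1: cheap vector comparison. Thresholds are illustrative guesses.
function similarityTier(existing: number[][], duplicateAt = 0.95, novelBelow = 0.5): GuardTier {
  return {
    label: "similarity",
    check: (candidate) => {
      const best = existing.reduce(
        (max, vector) => Math.max(max, cosineSimilarity(candidate.embedding, vector)),
        -1,
      );
      if (best >= duplicateAt) return "reject"; // near-duplicate of stored knowledge
      if (best < novelBelow) return "approve";  // clearly new information
      return "escalate";                        // ambiguous, ask a model
    },
  };
}

// Tiers 2 and 3: stubs standing in for fast and full LLM calls.
const fastModelTier: GuardTier = { label: "fast-llm", check: () => "escalate" };
const fullModelTier: GuardTier = { label: "full-llm", check: () => "approve" };

function runGuard(
  tiers: GuardTier[],
  candidate: Candidate,
): { verdict: "approve" | "reject"; decidedBy: string } {
  for (const tier of tiers) {
    const verdict = tier.check(candidate);
    if (verdict !== "escalate") return { verdict, decidedBy: tier.label };
  }
  // Assumption: reject if no tier reaches a decision.
  return { verdict: "reject", decidedBy: "default" };
}

const guardTiers = [similarityTier([[1, 0, 0]]), fastModelTier, fullModelTier];
console.log(runGuard(guardTiers, { text: "repeat", embedding: [1, 0, 0] }));
console.log(runGuard(guardTiers, { text: "related", embedding: [1, 1, 0] }));
```

Ordering the tiers from cheapest to most expensive means most entries never reach the full model, which is the usual motivation for a tiered design.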

Info: See How it Works for the full pipeline breakdown with diagrams.

Memory API — Learned user context

Every query to SharedMemory is enriched with:

Vector search results from Qdrant (semantic similarity)
Knowledge graph traversal from Neo4j (entity relationships)
Document sources with chunk-level attribution
Conversation history for multi-turn context

This context can be injected into any LLM prompt to build agents that truly remember.
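As an illustration of that injection step, the sketch below assembles the four kinds of context into a single prompt string. The `RetrievedContext` shape, the section titles, and `buildPrompt` are hypothetical; the actual response format of the Memory API is not described here, and the sample values are made up.

```typescript
// Hypothetical container for the four context types listed above.
interface RetrievedContext {
  vectorMatches: string[];
  graphFacts: string[];
  documentChunks: { source: string; chunk: number; text: string }[];
  conversation: { role: "user" | "assistant"; content: string }[];
}

// Render one titled bullet section, or nothing when there are no lines.
function section(title: string, lines: string[]): string {
  if (lines.length === 0) return "";
  return `${title}:\n${lines.map((line) => `- ${line}`).join("\n")}\n\n`;
}

function buildPrompt(context: RetrievedContext, question: string): string {
  return (
    section("Relevant memories", context.vectorMatches) +
    section("Known relationships", context.graphFacts) +
    section(
      "Document excerpts",
      // Keep chunk-level attribution next to each excerpt.
      context.documentChunks.map((doc) => `${doc.text} [${doc.source}#${doc.chunk}]`),
    ) +
    section(
      "Conversation so far",
      context.conversation.map((turn) => `${turn.role}: ${turn.content}`),
    ) +
    `Question: ${question}`
  );
}

// Sample values for the sketch.
const sampleContext: RetrievedContext = {
  vectorMatches: ["John built the ProjectAlpha frontend"],
  graphFacts: ["John WORKED_AT Google", "React USED_IN ProjectAlpha"],
  documentChunks: [{ source: "sample-doc", chunk: 2, text: "ProjectAlpha is written in React" }],
  conversation: [{ role: "user", content: "Tell me about John" }],
};

const promptText = buildPrompt(sampleContext, "What does John know about React?");
console.log(promptText);
```

The resulting string can be passed as the system or user message to whichever LLM the agent uses.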

Next steps

▶️ Quickstart

Make your first API call in under 5 minutes.

⚙️ How it Works

Understand the full knowledge pipeline architecture.

📖 API Reference

Explore all endpoints with examples and schemas.

📦 SDKs & MCP

TypeScript SDK, CLI, and MCP Server for Claude and Cursor.
