Overview — What is SharedMemory?
SharedMemory is the persistent memory layer for AI agents. It lets any AI — whether it's a Claude MCP tool, a custom agent, or a CLI script — remember facts, learn from documents, and share knowledge across tools and teammates.
With SharedMemory, developers get production-ready infrastructure for:
- Agent memory — Auto-extracted knowledge graphs from conversations and documents
- Guard system — Intelligent validation that prevents duplicates, conflicts, and noise
- Hybrid retrieval — Vector search + graph traversal, combining semantic similarity with entity relationships
- Shared volumes — Isolated, shareable memory spaces with real-time sync
How does it work? (at a glance)
Write
Agents write facts, documents, and conversations to SharedMemory. Each entry goes through an intelligent guard pipeline that checks for duplicates, conflicts, and quality.
Extract
The knowledge extraction engine automatically builds a structured knowledge graph (entities, facts, relationships, and summaries) from raw text.
Search
Hybrid retrieval combines vector semantic search (Qdrant) with knowledge graph traversal (Neo4j) to find the most relevant context for any query.
Share
Volumes are isolated memory spaces: private volumes for personal use, shared volumes for team collaboration. Changes sync in real time over WebSocket.
SharedMemory is context engineering
Unlike simple RAG systems that just chunk and embed documents, SharedMemory builds a living knowledge graph from your data:
- Entities — People, projects, technologies, organizations extracted from your content
- Facts — Specific, attributed pieces of knowledge linked to entities
- Relationships — How entities connect (e.g., John WORKED_AT Google, React USED_IN ProjectAlpha)
- Summaries — Auto-generated, evolving summaries for each entity
This means that when you ask "What does John know about React?", SharedMemory does more than match keywords: it traverses the graph from John → skills → React and synthesizes an answer from what it finds along the way.
All knowledge extraction is automatic. You don't define entities, tag relationships, or clean up stale data. Just write memories and search naturally.
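To make the graph idea concrete, here is a minimal in-memory sketch of entities, relationships, and facts, with a breadth-first traversal that gathers the facts along the path between two entities. Everything in it is hypothetical illustration: the Graph class, the HAS_SKILL relationship, and the sample facts are assumptions, not SharedMemory's actual schema or API (the real graph lives in Neo4j).

```python
from collections import defaultdict, deque


class Graph:
    """Toy knowledge graph: entities, directed typed edges, and attributed facts."""

    def __init__(self):
        self.edges = defaultdict(list)  # entity -> [(relation, target entity)]
        self.facts = defaultdict(list)  # entity -> [fact text]

    def relate(self, source, relation, target):
        self.edges[source].append((relation, target))

    def add_fact(self, entity, text):
        self.facts[entity].append(text)

    def path(self, start, goal):
        """Breadth-first search; returns the shortest list of hops, or None if unreachable."""
        pending = deque([(start, [])])
        seen = {start}
        while pending:
            node, hops = pending.popleft()
            if node == goal:
                return hops
            for relation, target in self.edges[node]:
                if target not in seen:
                    seen.add(target)
                    pending.append((target, hops + [(node, relation, target)]))
        return None

    def context(self, start, goal):
        """Collect the facts attached to every entity on the path from start to goal."""
        hops = self.path(start, goal)
        if hops is None:
            return []
        entities = [start] + [target for _, _, target in hops]
        return [fact for entity in entities for fact in self.facts[entity]]


# Sample data, invented for this sketch.
graph = Graph()
graph.relate("John", "WORKED_AT", "Google")
graph.relate("John", "HAS_SKILL", "React")  # hypothetical relationship name
graph.relate("React", "USED_IN", "ProjectAlpha")
graph.add_fact("John", "John maintains the ProjectAlpha frontend.")
graph.add_fact("React", "React is used in ProjectAlpha.")

print(graph.path("John", "React"))
print(graph.context("John", "React"))
```

The facts returned by context() are the kind of material a retrieval layer could hand to an LLM to compose an answer; a keyword match on "React" alone would miss the facts attached to John.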
Architecture
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  SDK / CLI  │────▶│  Agent API   │────▶│  Guard LLM  │
│  MCP Server │     │  (Express)   │     │  (Approve/  │
│  Dashboard  │     │              │     │   Reject)   │
└─────────────┘     └──────┬───────┘     └──────┬──────┘
                           │                    │
                    ┌──────▼───────┐     ┌──────▼──────┐
                    │  PostgreSQL  │     │   BullMQ    │
                    │  (Source of  │     │   (Redis)   │
                    │   Truth)     │     │  Job Queue  │
                    └──────────────┘     └──────┬──────┘
                                                │
                   ┌────────────────────────────▼─┐
                   │      Knowledge Pipeline      │
                   │  ┌────────┐  ┌───────────┐   │
                   │  │ Qdrant │  │   Neo4j   │   │
                   │  │(Vector)│  │  (Graph)  │   │
                   │  └────────┘  └───────────┘   │
                   └──────────────────────────────┘
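The write path in the diagram can be read as: the API asks the guard for a decision, persists approved entries, and hands them to a queue that a background pipeline drains into the vector and graph stores. The sketch below mimics that flow with in-process stand-ins. All names (Entry, handle_write, run_pipeline) are hypothetical, plain lists replace PostgreSQL, Qdrant, and Neo4j, queue.Queue replaces BullMQ, and the ordering of the guard decision relative to the PostgreSQL write is an assumption rather than something the diagram states.

```python
import queue
from dataclasses import dataclass


@dataclass
class Entry:
    volume_id: str
    text: str


def handle_write(entry, guard, store, jobs):
    """Ask the guard, then persist and enqueue the entry if it was approved."""
    if not guard(entry):
        return "rejected"
    store.append(entry)  # stand-in for the PostgreSQL source of truth
    jobs.put(entry)      # stand-in for a BullMQ job backed by Redis
    return "queued"


def run_pipeline(jobs, vector_index, graph_index):
    """Drain the queue, indexing each entry in both stores; returns the number processed."""
    processed = 0
    while True:
        try:
            entry = jobs.get_nowait()
        except queue.Empty:
            return processed
        vector_index.append(entry.text)  # stand-in for a Qdrant upsert
        graph_index.append(entry.text)   # stand-in for a Neo4j upsert
        processed += 1
```

Putting a queue between the API and the pipeline means a write can be acknowledged once the entry is stored, while the slower embedding and extraction work happens in the background.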
Ingestion and Extraction
When data enters SharedMemory:
- Classification — Is this worth indexing? The message classifier detects factual content vs. casual conversation.
- Guard Check — The tiered guard system (similarity check → fast LLM → full LLM) validates the entry against existing knowledge.
- Embedding — Text is embedded using all-MiniLM-L6-v2 for semantic search.
- Knowledge Extraction — LLM extracts entities, facts, and relationships from the text.
- Graph Indexing — Entities and relationships are upserted into Neo4j with deduplication.
- Summary Generation — Entity summaries are auto-generated and updated as new facts arrive.
See How it Works for the full pipeline breakdown with diagrams.
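The tiered guard check described above (similarity check → fast LLM → full LLM) can be sketched as a chain in which each tier runs only when the cheaper one cannot decide. This is an illustration under stated assumptions, not SharedMemory's implementation: the thresholds are invented, difflib string similarity stands in for embedding similarity, and fast_llm and full_llm are hypothetical callables supplied by the caller.

```python
from difflib import SequenceMatcher

DUPLICATE_THRESHOLD = 0.95  # hypothetical: at or above this, reject as a duplicate
NOVEL_THRESHOLD = 0.50      # hypothetical: below this, approve without an LLM call


def max_similarity(text, existing):
    """Highest string similarity against stored entries (stand-in for vector similarity)."""
    return max(
        (SequenceMatcher(None, text.lower(), other.lower()).ratio() for other in existing),
        default=0.0,
    )


def guard(text, existing, fast_llm, full_llm):
    """Return (decision, tier), escalating only when a tier cannot decide."""
    score = max_similarity(text, existing)
    if score >= DUPLICATE_THRESHOLD:
        return "reject", "similarity"
    if score < NOVEL_THRESHOLD:
        return "approve", "similarity"
    # Similar but not identical: possibly an update or a conflict, so ask a model.
    verdict = fast_llm(text, existing)  # expected: "approve", "reject", or "unsure"
    if verdict in ("approve", "reject"):
        return verdict, "fast_llm"
    return full_llm(text, existing), "full_llm"
```

The point of the tiering is cost: exact duplicates and clearly new entries never reach a model, and the full model is consulted only for the ambiguous remainder.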
Memory API — Learned user context
Every query to SharedMemory is enriched with:
- Vector search results from Qdrant (semantic similarity)
- Knowledge graph traversal from Neo4j (entity relationships)
- Document sources with chunk-level attribution
- Conversation history for multi-turn context
This context can be injected into any LLM prompt to build agents that truly remember.
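As a rough sketch of how such context might be assembled and injected, the example below merges hits from several sources, keeps the best-scoring copy of each text, and renders the result above the user's question. The Hit type, the scoring scale, and the prompt layout are assumptions made for illustration; they do not describe the actual response format of the Memory API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    text: str
    source: str   # e.g. "vector", "graph", "document", "conversation"
    score: float  # assumed relevance in [0, 1]; how it is computed is out of scope


def merge_hits(*hit_lists, limit=5):
    """Deduplicate by text, keep the best score per text, return the top `limit` hits."""
    best = {}
    for hits in hit_lists:
        for hit in hits:
            current = best.get(hit.text)
            if current is None or hit.score > current.score:
                best[hit.text] = hit
    return sorted(best.values(), key=lambda h: h.score, reverse=True)[:limit]


def build_prompt(question, hits):
    """Render the retrieved context above the user's question."""
    lines = [f"[{hit.source}] {hit.text}" for hit in hits]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


# Sample hits, invented for this sketch.
vector_hits = [Hit("John built the React dashboard.", "vector", 0.82)]
graph_hits = [
    Hit("John HAS_SKILL React", "graph", 0.90),
    Hit("John built the React dashboard.", "graph", 0.60),
]
merged = merge_hits(vector_hits, graph_hits)
print(build_prompt("What does John know about React?", merged))
```

Tagging each line with its source keeps the attribution visible to the model, which matters when document chunks and graph facts disagree.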