Memory Pipeline

Every piece of data that enters SharedMemory passes through a multi-stage pipeline.

Pipeline Stages

Input → Classify → Embed → Guard → Index → Extract → Broadcast

1. Classification

The message classifier determines whether the input contains indexable knowledge; casual conversation is skipped.

2. Embedding

Text is embedded using all-MiniLM-L6-v2 (384 dimensions). Embeddings are cached in Redis for 30 days.
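The caching behavior can be sketched as follows. The `KvStore` interface, the `getEmbedding` wrapper, and the cache-key scheme are illustrative assumptions, not the real API; the actual pipeline caches in Redis with a 30-day TTL.

```typescript
// Hypothetical cache wrapper illustrating the 30-day embedding cache.
const EMBED_TTL_SECONDS = 30 * 24 * 60 * 60; // 30 days

interface KvStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

async function getEmbedding(
  text: string,
  store: KvStore,
  embed: (t: string) => Promise<number[]>, // e.g. all-MiniLM-L6-v2 → 384 dims
): Promise<number[]> {
  const key = `embed:minilm-l6-v2:${text}`; // assumed key scheme: model + input
  const cached = await store.get(key);
  if (cached) return JSON.parse(cached); // cache hit: skip the model call
  const vector = await embed(text);
  await store.set(key, JSON.stringify(vector), EMBED_TTL_SECONDS);
  return vector;
}
```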

3. Guard System

Score Range   Action        Reason
< 0.55        Auto-approve  Low similarity — clearly new information
0.55 – 0.97   LLM Guard     Ambiguous — LLM evaluates conflict
> 0.97        Auto-reject   Near-duplicate detected
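The thresholds above can be transcribed directly into a routing function; the action names used here are illustrative.

```typescript
// Guard routing per the similarity thresholds in the table above.
type GuardAction = "auto-approve" | "llm-guard" | "auto-reject";

function routeBySimilarity(score: number): GuardAction {
  if (score < 0.55) return "auto-approve"; // clearly new information
  if (score > 0.97) return "auto-reject";  // near-duplicate detected
  return "llm-guard";                      // ambiguous: LLM evaluates conflict
}
```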

4. Indexing

Approved entries are indexed across three stores: Qdrant (vector), Neo4j (graph), Postgres (record).
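A sketch of the three-store fan-out, using hypothetical client interfaces as stand-ins for the real Qdrant, Neo4j, and Postgres drivers.

```typescript
// Sketch: concurrent writes to the three stores named above.
interface MemoryEntry { id: string; text: string; vector: number[]; }

interface Stores {
  vector: { upsert(e: MemoryEntry): Promise<void> };   // Qdrant
  graph: { mergeNode(e: MemoryEntry): Promise<void> }; // Neo4j
  record: { insert(e: MemoryEntry): Promise<void> };   // Postgres
}

async function indexEntry(entry: MemoryEntry, stores: Stores): Promise<void> {
  // All three writes are issued concurrently; a failure in any store rejects.
  await Promise.all([
    stores.vector.upsert(entry),
    stores.graph.mergeNode(entry),
    stores.record.insert(entry),
  ]);
}
```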

5. Knowledge Extraction

This stage extracts entities, facts, and relationships from the entry, and generates or updates entity summaries.

6. Broadcasting

Real-time WebSocket notifications inform connected clients.
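A minimal sketch of the broadcast step; the client interface and the event shape are assumptions (the real system pushes over WebSockets).

```typescript
// Hypothetical broadcast helper: serialize once, send to every client.
interface Client { send(data: string): void; }

function broadcast(clients: Iterable<Client>, event: string, payload: unknown): void {
  const message = JSON.stringify({ event, payload });
  for (const client of clients) client.send(message);
}
```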

Async Processing (BullMQ)

Background memory ingestion uses BullMQ backed by Redis:

  • Jobs persist across API restarts and pod cycling
  • 3 retry attempts with exponential backoff
  • Concurrency: 5 workers for propose queue, 3 for guard queue
  • Failed jobs retained for debugging (up to 500)
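The bullets above map roughly onto standard BullMQ options as follows; the queue names, Redis connection details, and base backoff delay are assumptions, not the actual configuration.

```typescript
// Sketch of the retry, retention, and concurrency settings described above.
import { Queue, Worker } from "bullmq";

const connection = { host: "localhost", port: 6379 }; // Redis backing store

const proposeQueue = new Queue("memory-propose", {
  connection,
  defaultJobOptions: {
    attempts: 3,                                   // 3 retry attempts
    backoff: { type: "exponential", delay: 1000 }, // exponential backoff
    removeOnFail: { count: 500 },                  // keep last 500 failed jobs
  },
});

// Concurrency: 5 workers for the propose queue, 3 for the guard queue.
new Worker("memory-propose", async () => { /* classify + embed */ },
  { connection, concurrency: 5 });
new Worker("memory-guard", async () => { /* LLM conflict check */ },
  { connection, concurrency: 3 });
```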

Document Ingestion

  1. Parse — PDF, DOCX, TXT, MD, CSV, JSON extracted to text
  2. Chunk — Split into overlapping chunks
  3. Store — Chunks saved to Postgres
  4. Embed + Index — Each chunk embedded and indexed in Qdrant
  5. Extract — Knowledge extraction runs on full document text
  6. Graph Link — Document node linked to extracted entities in Neo4j
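Step 2 (overlapping chunks) can be sketched as below; the chunk size and overlap are illustrative values, not the pipeline's actual configuration.

```typescript
// Split text into fixed-size chunks where each chunk repeats the last
// `overlap` characters of the previous one.
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```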

Supported Formats

Format      Extension  Max Size
PDF         .pdf       20 MB
Word        .docx      20 MB
Plain Text  .txt       20 MB
Markdown    .md        20 MB
CSV         .csv       20 MB
JSON        .json      20 MB
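A sketch of upload validation against this table; the function name and error strings are illustrative.

```typescript
// Validate an upload against the supported extensions and 20 MB limit above.
const MAX_SIZE_BYTES = 20 * 1024 * 1024; // 20 MB, same for all formats
const SUPPORTED_EXTENSIONS = new Set([".pdf", ".docx", ".txt", ".md", ".csv", ".json"]);

function validateUpload(filename: string, sizeBytes: number): string | null {
  const dot = filename.lastIndexOf(".");
  const ext = dot === -1 ? "" : filename.slice(dot).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.has(ext)) return `unsupported format: ${ext || "none"}`;
  if (sizeBytes > MAX_SIZE_BYTES) return "file exceeds 20 MB limit";
  return null; // valid upload
}
```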