The 7-Layer Memory Stack

Clawpy uses a layered memory model so agents can combine short-term execution context with durable, searchable knowledge.

This page defines the seven layers as an architectural model for operators and builders. During the rebuild, implementation details may shift, but the layer responsibilities and control boundaries remain the reference model.

Layer Overview

| Layer | Responsibility | Typical Role |
| --- | --- | --- |
| Layer A — Session State | Active conversation and immediate working context | In-turn reasoning and tool continuity |
| Layer B — Event Ledger | Structured records of decisions, actions, validations, and outcomes | Auditing and post-task analysis |
| Layer C — Semantic Retrieval | Similarity search over prior context | Fuzzy recall across related tasks |
| Layer D — Capture & Recall Orchestration | Decides what to retain and what to inject for the next turn | Memory relevance control |
| Layer E — Cross-Session Context | Durable context reused across runs and workflows | Longitudinal continuity |
| Layer F — Task Relationship Memory | Links related tasks and execution history | Causal/sequence-aware recall |
| Layer G — Canonical Knowledge | Stable, operator-important facts and policies | High-trust ground truth |

Why this matters: memory quality is not only about storage volume. It also depends on retrieval quality, governance, and traceability.

Layer A — Session State

Layer A captures active state for the current execution window: immediate chat context, in-flight intent, and current-step constraints.

Why this matters: agents need fast, low-friction context for local decisions without overloading durable memory.
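
As an illustration of the idea, the sketch below models session state as a bounded window of recent turns plus the current intent and constraints. The class name `SessionState` and its fields are hypothetical and chosen for this example; they are not taken from Clawpy's code.

```python
from collections import deque


class SessionState:
    """Hypothetical Layer A container (illustrative, not Clawpy's API)."""

    def __init__(self, max_turns: int = 4) -> None:
        self.intent = ""                      # in-flight intent for this run
        self.constraints: list[str] = []      # current-step constraints
        # A bounded deque drops the oldest turn once the window is full,
        # which keeps the working context small and cheap to read.
        self._turns: deque = deque(maxlen=max_turns)

    def add_turn(self, role: str, text: str) -> None:
        self._turns.append((role, text))

    def context(self) -> list[tuple[str, str]]:
        """Return the turns currently inside the window, oldest first."""
        return list(self._turns)
```

Because the window is bounded, anything worth keeping beyond it has to be handed to a more durable layer rather than accumulating here.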

Layer B — Event Ledger

Layer B stores structured operational events (for example: validation outcomes, memory operations, and governance events) in queryable form.

Why this matters: reliable learning and safety controls require an auditable record, not only free-form text.
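
One way to picture a queryable event ledger is a small SQLite table of typed events. The schema, the function names, and the event kinds below are assumptions made for this sketch, not Clawpy's actual storage format.

```python
import json
import sqlite3
import time


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical event ledger."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ts REAL NOT NULL, kind TEXT NOT NULL, "
        "task_id TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    return conn


def record_event(conn: sqlite3.Connection, kind: str, task_id: str, payload: dict) -> int:
    """Append one structured event and return its row id."""
    cur = conn.execute(
        "INSERT INTO events (ts, kind, task_id, payload) VALUES (?, ?, ?, ?)",
        (time.time(), kind, task_id, json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid


def events_for_task(conn: sqlite3.Connection, task_id: str, kind=None) -> list[dict]:
    """Return a task's events in insertion order, optionally filtered by kind."""
    sql = "SELECT kind, payload FROM events WHERE task_id = ?"
    params = [task_id]
    if kind is not None:
        sql += " AND kind = ?"
        params.append(kind)
    sql += " ORDER BY id"
    return [{"kind": k, "payload": json.loads(p)} for k, p in conn.execute(sql, params)]
```

Storing the event kind and task id as columns, rather than inside free-form text, is what makes audit queries such as "all validation outcomes for this task" straightforward.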

Layer C — Semantic Retrieval

Layer C supports semantic lookup so agents can recover relevant prior knowledge even when exact wording differs.

Why this matters: many production tasks depend on related precedent, not exact keyword matches.
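
Semantic lookup is commonly built on vector similarity. The sketch below ranks stored items by cosine similarity to a query vector. It assumes the vectors come from some embedding model, which is out of scope here, and the function names are illustrative rather than Clawpy's.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: list[tuple[str, list[float]]], k: int = 2):
    """Return the k (score, text) pairs most similar to the query, best first.

    `index` is a list of (text, vector) pairs; the vectors are assumed to be
    embeddings produced elsewhere.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

With toy vectors, a query close to "rotate API keys" would also surface "renew TLS certificate" if their vectors point in a similar direction, even though the two strings share no keywords.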

Layer D — Capture & Recall Orchestration

Layer D coordinates what gets captured from runtime behavior and what gets injected into future prompts.

Why this matters: ungoverned recall increases noise. Orchestration keeps memory useful and bounded.
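
A minimal sketch of the two decisions this layer makes, under stated assumptions: capture is filtered by event kind, and injection greedily picks the highest-scoring candidates that fit a size budget. The `Candidate` type, the kind names, the threshold, and the budget are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    score: float  # relevance estimate in [0, 1], assumed to come from retrieval
    cost: int     # approximate size, for example a token count


def should_capture(event_kind: str, keep_kinds=frozenset({"decision", "validation"})) -> bool:
    """Capture policy: retain only event kinds judged worth remembering."""
    return event_kind in keep_kinds


def select_for_injection(candidates, budget: int, min_score: float = 0.5):
    """Pick high-scoring candidates, best first, without exceeding the budget."""
    chosen, used = [], 0
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if cand.score < min_score:
            break  # everything after this point scores lower still
        if used + cand.cost <= budget:
            chosen.append(cand)
            used += cand.cost
    return chosen
```

The greedy pass is a deliberate simplification; it bounds what gets injected but does not guarantee the best possible use of the budget.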

Layer E — Cross-Session Context

Layer E preserves and retrieves context across sessions so agents can continue work without re-deriving every prior decision.

Why this matters: long-horizon workflows fail when each run starts from near-zero state.
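
To make the idea concrete, the sketch below persists per-workflow context to a JSON file so that a later session can load it. The class name, the file layout, and the use of JSON are assumptions for illustration; the source does not specify how Clawpy stores cross-session context.

```python
import json
from pathlib import Path


class CrossSessionStore:
    """Hypothetical durable store keyed by workflow id (not Clawpy's API)."""

    def __init__(self, path) -> None:
        self.path = Path(path)

    def _read_all(self) -> dict:
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def load(self, workflow_id: str) -> dict:
        """Return saved context for a workflow, or an empty dict if none exists."""
        return self._read_all().get(workflow_id, {})

    def save(self, workflow_id: str, context: dict) -> None:
        """Merge new context into what is already stored for the workflow."""
        data = self._read_all()
        data.setdefault(workflow_id, {}).update(context)
        self.path.write_text(json.dumps(data, indent=2), encoding="utf-8")
```

A new session would call `load` at startup and `save` whenever it makes a decision that later runs should not have to re-derive.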

Layer F — Task Relationship Memory

Layer F models relationships among tasks and outcomes to improve recall relevance beyond pure similarity.

Why this matters: execution history often has dependency chains that simple nearest-neighbor recall misses.
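
Dependency chains can be sketched as a directed graph in which recall walks upstream from a task to the tasks it depended on. `TaskGraph` and its methods are hypothetical names for this example, and the task labels are made up.

```python
from collections import defaultdict, deque


class TaskGraph:
    """Hypothetical task-relationship index (illustrative only)."""

    def __init__(self) -> None:
        self._parents = defaultdict(set)  # task -> tasks it depends on

    def link(self, task: str, depends_on: str) -> None:
        self._parents[task].add(depends_on)

    def ancestors(self, task: str) -> list[str]:
        """Return upstream tasks breadth-first, nearest dependencies first."""
        seen, order, queue = {task}, [], deque([task])
        while queue:
            current = queue.popleft()
            # Sorting makes the traversal order deterministic.
            for parent in sorted(self._parents.get(current, ())):
                if parent not in seen:
                    seen.add(parent)
                    order.append(parent)
                    queue.append(parent)
        return order
```

For a chain where "deploy" depends on "build" and "build" depends on "lint", `ancestors("deploy")` reaches "lint" even if the text of the lint task is not similar to the deploy task at all.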

Layer G — Canonical Knowledge

Layer G holds stable, high-confidence facts such as enduring project rules, preferences, and canonical references.

Why this matters: operators need a trustworthy memory layer that resists drift during long-lived projects.
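
One possible way to resist drift is to require a named approver for every write and to expose only a read-only view to agents. The sketch below assumes that policy; the class, the method names, and the approval rule are illustrative and not confirmed by the source.

```python
from types import MappingProxyType


class CanonicalStore:
    """Hypothetical Layer G store with gated writes (not Clawpy's API)."""

    def __init__(self) -> None:
        self._facts: dict = {}

    def set_fact(self, key: str, value, approved_by: str) -> None:
        """Record a fact together with who approved it."""
        if not approved_by:
            raise PermissionError("canonical facts require a named approver")
        # Tuples are immutable, so a stored fact cannot be edited in place.
        self._facts[key] = (value, approved_by)

    def view(self):
        """Return a read-only mapping of key -> (value, approved_by)."""
        return MappingProxyType(self._facts)
```

Attempting to assign through `view()` raises `TypeError`, so the only write path is `set_fact`, where the approval check lives.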

How the Layers Work Together

A typical retrieval cycle combines layers rather than using a single source:

  1. Session context (Layer A) establishes immediate intent.
  2. The structured and semantic layers (Layers B and C) retrieve relevant precedent.
  3. Orchestration (Layer D) ranks and filters recalled context.
  4. Canonical knowledge (Layer G) anchors high-confidence facts.

The result is memory that is both flexible (semantic recall) and governable (structured and canonical controls).
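
The retrieval cycle above can be sketched as a single assembly function. This is a simplified illustration: `assemble_context`, its parameters, and the (score, text) hit format are assumptions for the example, and a real pipeline would likely involve more layers and richer ranking.

```python
def assemble_context(intent, structured_hits, semantic_hits, canonical_facts, limit=3):
    """Combine layer outputs into one context dict.

    `structured_hits` and `semantic_hits` are lists of (score, text) pairs;
    `canonical_facts` is a list of strings. All shapes are hypothetical.
    """
    # Step 2: pool precedent from both layers, keeping the best score per text.
    pooled = {}
    for score, text in list(structured_hits) + list(semantic_hits):
        pooled[text] = max(score, pooled.get(text, 0.0))

    # Step 3: rank by score and keep only the top `limit` items.
    ranked = sorted(pooled.items(), key=lambda item: item[1], reverse=True)[:limit]

    # Steps 1 and 4: the intent frames the request, and canonical facts are
    # passed through unranked so they are never displaced by recalled items.
    return {
        "intent": intent,
        "canonical": list(canonical_facts),
        "recalled": [text for text, _ in ranked],
    }
```

Keeping canonical facts in their own field, outside the ranked list, reflects the point that high-trust knowledge anchors the result rather than competing with fuzzy recall.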