The 7-Layer Memory Stack

Clawpy implements a biologically-inspired, stratified memory architecture that gives every agent seven distinct layers of knowledge — from volatile session context to permanently-pinned canonical facts. This is not a single database with different labels. Each layer is a separate subsystem with its own storage engine, access pattern, and retention policy.

No other agentic platform implements memory at this depth.


Architecture Overview

┌────────────────────────────────────────────────────────────┐
│  Layer G — PARA Canonical Knowledge      (Permanent)       │
│  Storage: JSON + Markdown │ Module: para_manager.py        │
├────────────────────────────────────────────────────────────┤
│  Layer F — Temporal Knowledge Graph      (Persistent)      │
│  Storage: JSON DAG       │ Module: task_memory_graph.py    │
├────────────────────────────────────────────────────────────┤
│  Layer E — Cross-Session Context Engine  (Configurable)    │
│  Storage: ChromaDB       │ Module: chroma_context_engine.py│
├────────────────────────────────────────────────────────────┤
│  Layer D — Auto-Capture / Auto-Recall    (Automatic)       │
│  Storage: SQLite + RAM   │ Module: auto_capture + recall   │
├────────────────────────────────────────────────────────────┤
│  Layer C — Semantic Vector Search        (Configurable)    │
│  Storage: ChromaDB       │ Module: layer3_vector.py        │
├────────────────────────────────────────────────────────────┤
│  Layer B — Structured Event Ledger       (Configurable)    │
│  Storage: SQLite         │ Module: layer2_sqlite.py        │
├────────────────────────────────────────────────────────────┤
│  Layer A — Markdown Flat Files           (Session)         │
│  Storage: Filesystem     │ Module: layer1_markdown.py      │
└────────────────────────────────────────────────────────────┘

Layer A — Markdown Flat Files

Module: memory/layer1_markdown.py
Storage: Filesystem (.md files)
Retention: Session-scoped

The simplest layer. Each agent workspace contains flat Markdown files that serve as mutable state:

  • soul.md — The agent's identity, personality, and operating constraints. Read-only during execution, writable by the Wisdom Cascade teaching cycle.
  • memory.md — Accumulated daily extraction summaries. Appended nightly by the Memory Extractor.
  • heartbeat.md — Pending scheduled tasks. Read by the Heartbeat Protocol on every pulse.
  • comms.json — Communication configuration including reports_to for hierarchy resolution.
  • tools.json — Tool permissions, archetype classification, and model routing.

Layer A is the only layer that is directly editable by operators through the dashboard.
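The flat-file layout above can be sketched as a simple loader. This is a minimal illustration, not the real module: the file names come from the list above, but the return shape and missing-file defaults are assumptions.

```python
from pathlib import Path
import json

def load_workspace_state(workspace: Path) -> dict:
    """Load Layer A flat-file state for an agent workspace (illustrative)."""
    state = {}
    # Markdown state files: identity, accumulated memory, scheduled tasks
    for name in ("soul.md", "memory.md", "heartbeat.md"):
        path = workspace / name
        state[name.removesuffix(".md")] = path.read_text() if path.exists() else ""
    # JSON configuration files: hierarchy and tool permissions
    for name in ("comms.json", "tools.json"):
        path = workspace / name
        state[name.removesuffix(".json")] = json.loads(path.read_text()) if path.exists() else {}
    return state
```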


Layer B — Structured Event Ledger

Module: memory/layer2_sqlite.py (35,386 bytes)
Storage: SQLite
Retention: Configurable per workspace

Every significant runtime event is recorded as a structured row in SQLite — tool calls, validation runs, learning records, budget crossings, and memory events. This is Clawpy's audit trail and the raw material for the Self-Learning pipeline.

Key tables:

| Table            | Purpose                                            | Consumed by                |
|------------------|----------------------------------------------------|----------------------------|
| learning_records | Outcomes from validation, errors, budget incidents | Adaptation Engine          |
| validation_runs  | Every validate → heal → retry cycle                | Performance analytics      |
| memory_events    | PARA promotions, nightly extractions               | Memory lifecycle tracking  |
| cost_entries     | Per-request token costs and billing data           | Budget enforcement         |

The event ledger is the input to the Nightly Memory Extraction process, which synthesises durable facts from the raw event stream.
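A structured event ledger of this kind reduces to a small amount of SQLite. The sketch below is illustrative: the table name learning_records matches the docs, but the columns shown here are assumptions — the real layer2_sqlite.py schema is richer.

```python
import sqlite3

def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a minimal event ledger with one illustrative table."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS learning_records (
            id      INTEGER PRIMARY KEY,
            ts      REAL NOT NULL,   -- event timestamp
            kind    TEXT NOT NULL,   -- e.g. validation | error | budget
            payload TEXT NOT NULL    -- JSON blob consumed downstream
        )
    """)
    return conn

def record_event(conn: sqlite3.Connection, kind: str, payload: str, ts: float) -> None:
    """Append one structured event row to the ledger."""
    conn.execute(
        "INSERT INTO learning_records (ts, kind, payload) VALUES (?, ?, ?)",
        (ts, kind, payload),
    )
```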


Layer C — Semantic Vector Search

Module: memory/layer3_vector.py (12,244 bytes)
Storage: ChromaDB (L2 distance, HNSW index)
Retention: Configurable via retention tiers

Embeds text chunks into a vector space for similarity-based retrieval. This is the workhorse for "fuzzy recall" — when an agent needs to find relevant prior context without knowing the exact terms.

Uses an L2 distance threshold (MAX_DISTANCE = 1.5) to filter irrelevant results. Memories whose distance exceeds this threshold are discarded, preventing hallucinated context from being injected.
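The cutoff can be expressed as a one-line post-filter over query results. Only MAX_DISTANCE = 1.5 comes from the docs; pairing documents with precomputed L2 distances (as ChromaDB query results provide) is the assumed input shape.

```python
MAX_DISTANCE = 1.5  # L2 distance cutoff from the docs

def filter_by_distance(documents: list, distances: list, max_distance: float = MAX_DISTANCE) -> list:
    """Keep only results whose L2 distance is within the threshold.

    Lower distance = more similar, so anything above the cutoff is dropped.
    """
    return [doc for doc, dist in zip(documents, distances) if dist <= max_distance]
```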


Layer D — Auto-Capture & Auto-Recall Pipeline

Modules: memory/auto_capture.py (9,698 bytes) + memory/auto_recall.py (40,829 bytes)
Storage: In-memory buffers + SQLite
Retention: Automatic — the system decides what to capture

This is the pattern-based extraction layer that sits between raw conversation and permanent memory. It operates in two phases:

  1. Auto-Capture — Scans assistant messages for extractable patterns: decisions made, facts stated, corrections applied. These are tagged and stored without human intervention.

  2. Auto-Recall — Before every LLM call, queries all available memory layers and injects relevant context into the system prompt. This is the layer that decides what the agent remembers at any given moment.

The Auto-Recall module is over 40 KB of code because it manages the complex orchestration of querying Layers B, C, E, and G simultaneously, deduplicating results, and fitting them within the token budget.
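The merge-dedupe-budget step can be sketched as follows. This is a hedged simplification: the 4-characters-per-token estimate and the list-of-lists input shape are assumptions, not the real auto_recall.py logic.

```python
def assemble_recall(layer_results: list[list[str]], token_budget: int) -> list[str]:
    """Merge memory candidates from several layers, dedupe, fit a token budget."""
    seen, merged = set(), []
    for results in layer_results:      # e.g. Layers B, C, E, G in priority order
        for text in results:
            if text not in seen:       # drop duplicates found by multiple layers
                seen.add(text)
                merged.append(text)
    selected, used = [], 0
    for text in merged:
        cost = max(1, len(text) // 4)  # rough token estimate (assumption)
        if used + cost > token_budget:
            break                      # budget exhausted; remaining candidates dropped
        selected.append(text)
        used += cost
    return selected
```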


Layer E — Cross-Session Context Engine

Module: core/chroma_context_engine.py (12,262 bytes)
Storage: ChromaDB (clawpy_agent_memory collection)
Retention: Configurable

While Layer C handles within-session similarity search, Layer E provides cross-session, cross-task semantic recall. When an agent starts a new task, Layer E retrieves relevant memories from all previous tasks — even those belonging to different agents in the same workspace.

Key capabilities:

  • Task summary storage — After task completion, a compressed summary is stored as a durable memory entry.
  • Cloud backup — If Supabase is configured, task summaries are pushed to the cloud for cross-device recall.
  • Relevance filtering — Only memories within the L2 distance threshold (1.5) are injected, preventing noise.
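A task-summary entry destined for the clawpy_agent_memory collection might look like the record below. The field names and id scheme are purely illustrative; only the collection's purpose (compressed, durable task summaries) comes from the docs.

```python
import time

def make_task_summary(task_id: str, agent: str, summary: str) -> dict:
    """Build an illustrative cross-session task-summary memory entry."""
    return {
        "id": f"task_summary_{task_id}",          # id scheme is an assumption
        "document": summary,                       # compressed task summary text
        "metadata": {
            "agent": agent,                        # enables cross-agent recall
            "task_id": task_id,
            "stored_at": time.time(),
        },
    }
```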

Layer F — Temporal Knowledge Graph

Module: core/task_memory_graph.py (15,906 bytes)
Storage: JSON adjacency list (data/task_graph.json)
Retention: Persistent (never deleted, only soft-invalidated)

This is where Clawpy's memory becomes structural. Layer F tracks not just what happened, but how tasks relate to each other using a directed graph:

Node Model

TaskNode {
  task_id:     string     // Unique identifier
  title:       string     // Human-readable task name
  timestamp:   float      // Creation/completion time
  success:     bool|null  // Outcome (null = pending)
  description: string     // Task context
}

Edge Types

| Relationship | Meaning                                          |
|--------------|--------------------------------------------------|
| blocks       | Task A's failure directly blocked Task B         |
| related_to   | Tasks share domain, components, or codebase area |
| derived_from | Task B was spawned / split from Task A           |
| similar_to   | Auto-linked by semantic similarity threshold     |

Time-Decay Weighted Recall

When recalling past tasks, Layer F re-ranks ChromaDB results using a composite score that blends semantic similarity with recency:

combined_score = λ × semantic_score + (1 − λ) × recency_score

where:
  semantic_score = 1 − chroma_distance    (0 to 1, higher = more similar)
  recency_score  = exp(−age_days / τ)     (exponential decay)
  λ = SEMANTIC_WEIGHT  = 0.7              (configurable)
  τ = HALF_LIFE_DAYS   = 14              (memory half-life)

This means a task from yesterday with moderate similarity will score higher than a highly-similar task from six months ago — mimicking how human memory prioritises recent experience.
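The scoring formula above translates directly into code. The constants (λ = 0.7, τ = 14) come from the docs; the example inputs in the test are invented to show the recency effect.

```python
import math

SEMANTIC_WEIGHT = 0.7   # λ, from the docs (configurable)
HALF_LIFE_DAYS = 14.0   # τ, from the docs

def combined_score(chroma_distance: float, age_days: float) -> float:
    """Blend semantic similarity with exponential time decay."""
    semantic = 1.0 - chroma_distance            # higher = more similar
    recency = math.exp(-age_days / HALF_LIFE_DAYS)
    return SEMANTIC_WEIGHT * semantic + (1 - SEMANTIC_WEIGHT) * recency
```

With these weights, a moderately similar task from yesterday (distance 0.4, age 1 day) outranks a highly similar task from six months ago (distance 0.1, age 180 days), exactly the behaviour described above.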

1-Hop Neighbor Injection

Before returning recall results, Layer F walks the graph and injects 1-hop neighbors — tasks that are structurally connected to the recalled tasks but weren't found by vector search. This catches causal chains ("the deployment failed because the preceding build task failed") that pure similarity search would miss.
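The 1-hop walk is a straightforward expansion over the adjacency list. The edge shape {task_id: [neighbor_id, ...]} is an assumption for the sketch; the point is that neighbors are appended only if vector search did not already surface them.

```python
def inject_neighbors(recalled: list[str], edges: dict[str, list[str]]) -> list[str]:
    """Append 1-hop graph neighbors of recalled tasks, skipping duplicates."""
    result = list(recalled)
    seen = set(recalled)
    for task_id in recalled:
        for neighbor in edges.get(task_id, []):
            if neighbor not in seen:   # structurally connected, missed by vector search
                seen.add(neighbor)
                result.append(neighbor)
    return result
```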


Layer G — PARA Canonical Knowledge

Module: memory/para_manager.py (9,924 bytes)
Storage: JSON items + Markdown summaries
Retention: Permanent (facts are never deleted, only superseded)

The PARA system (Projects, Areas, Resources, Archives) stores canonical, immutable facts — the highest-fidelity knowledge an agent possesses:

  • Projects — Enduring project facts (e.g., "Clawpy uses FastAPI on port 8000")
  • Areas — Ongoing responsibilities and operating preferences
  • Resources — Reusable references and technical knowledge
  • Archives — Completed work preserved for future reference

Each fact is an atomic ParaFact object with metadata:

ParaFact {
  id:             "fact_a1b2c3d4"
  fact:           "Owner prefers Claude for code review"
  category:       "preference"
  source:         "memory_extractor:ceo"
  status:         "active" | "superseded"
  access_count:   7
  last_accessed:  "2026-04-19T22:00:00Z"
  superseded_by:  null | "fact_e5f6g7h8"
}

Facts are never deleted — when a fact becomes outdated, it is superseded by a new fact, preserving the full correction history. This creates an auditable knowledge provenance chain.

Nightly PARA Promotion

The Memory Extractor runs nightly and uses an LLM to evaluate whether any daily synthesis output is durable enough to promote into PARA:

Daily Ledger → Nightly Synthesis → LLM Evaluation → PARA Promotion
                                    ↓
                              Validation Loop
                         (cost-capped, retry-aware)

The promotion process is governed by the Validation Loop with a configurable cost budget (default: 12 cents) and maximum fact count (default: 8 per cycle).
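The two caps compose into a simple gate. The defaults (12 cents, 8 facts per cycle) come from the docs; representing candidates as (fact_text, evaluation_cost_usd) pairs is an assumption for the sketch.

```python
def promote_facts(candidates, cost_cap_usd: float = 0.12, max_facts: int = 8):
    """Promote candidate facts until either the cost cap or fact limit is hit."""
    promoted, spent = [], 0.0
    for fact, cost in candidates:
        if len(promoted) >= max_facts or spent + cost > cost_cap_usd:
            break   # stop on whichever budget is exhausted first
        promoted.append(fact)
        spent += cost
    return promoted, spent
```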


How the Layers Interact

A typical agent recall involves multiple layers working together:

  1. Auto-Recall (Layer D) orchestrates the query
  2. Vector Search (Layer C) finds semantically similar chunks
  3. Context Engine (Layer E) adds cross-session results
  4. Knowledge Graph (Layer F) re-ranks with time-decay and injects graph neighbors
  5. PARA (Layer G) provides canonical facts as ground truth
  6. Token budget constraints determine how many results survive into the prompt

This multi-layer approach means an agent can simultaneously draw on recent conversation context (Layer A), historical event data (Layer B), semantically similar past work (Layer C+E), causally related tasks (Layer F), and canonical project knowledge (Layer G) — all within a single LLM call.