PARA Canonical Knowledge
PARA (Projects, Areas, Resources, Archives) is Clawpy's highest-fidelity knowledge layer — Layer G of the memory stack. While other layers store ephemeral context and decaying memories, PARA stores canonical, immutable facts that define ground truth for every agent in the system.
Facts in PARA are never deleted. They can only be superseded by newer facts, creating an auditable correction history that preserves knowledge provenance.
Structure
The PARA system organises knowledge into four directories, each containing named entities:
life/
├── projects/
│   ├── clawpy/
│   │   ├── summary.md    ← Human-readable overview
│   │   └── items.json    ← Atomic facts (source of truth)
│   └── auspice-8/
│       ├── summary.md
│       └── items.json
├── areas/
│   ├── security/
│   └── infrastructure/
├── resources/
│   ├── api-patterns/
│   └── deployment-guides/
└── archives/
    └── completed-migration/
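The layout above can be navigated with a few lines of path handling. The following is a minimal sketch, not Clawpy's actual code: the helper names `entity_files` and `list_entities` and the constant `PARA_CATEGORIES` are hypothetical, and the root directory is assumed to be passed in by the caller.

```python
from pathlib import Path

# The four PARA directories shown in the tree above.
PARA_CATEGORIES = ("projects", "areas", "resources", "archives")


def entity_files(root: Path, category: str, entity: str) -> tuple[Path, Path]:
    """Return the (summary.md, items.json) paths for one entity."""
    if category not in PARA_CATEGORIES:
        raise ValueError(f"unknown PARA category: {category!r}")
    entity_dir = root / category / entity
    return entity_dir / "summary.md", entity_dir / "items.json"


def list_entities(root: Path) -> dict[str, list[str]]:
    """Map each category to the sorted names of its entity directories."""
    entities: dict[str, list[str]] = {}
    for category in PARA_CATEGORIES:
        category_dir = root / category
        if category_dir.is_dir():
            entities[category] = sorted(
                p.name for p in category_dir.iterdir() if p.is_dir()
            )
        else:
            entities[category] = []  # Category directory not created yet
    return entities
```

Calling `list_entities(Path("life"))` on the tree above would return the entity names grouped by category, with an empty list for any category directory that does not exist.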
Category Definitions
| Category | Purpose | Example Entities |
|---|---|---|
| Projects | Active, enduring project knowledge | clawpy, auspice-8, trading-machine |
| Areas | Ongoing responsibilities and operating preferences | security, code-style, infrastructure |
| Resources | Reusable references and technical patterns | api-patterns, deployment-guides |
| Archives | Completed work preserved for reference | completed-migration, legacy-api |
Atomic Facts
Every piece of knowledge in PARA is stored as an atomic ParaFact — a self-contained statement with full metadata:
{
  "id": "fact_a1b2c3d4",
  "fact": "Clawpy uses FastAPI on port 8000",
  "category": "technical",
  "timestamp": "2026-04-19T22:00:00Z",
  "source": "memory_extractor:cto",
  "status": "active",
  "access_count": 7,
  "last_accessed": "2026-04-19T21:30:00Z",
  "superseded_by": null
}
Fields
| Field | Type | Purpose |
|---|---|---|
| id | string | Unique identifier (fact_ prefix + 8-char hex) |
| fact | string | The atomic knowledge statement |
| category | string | Classification: technical, preference, deadline, etc. |
| source | string | Origin: user_stated, memory_extractor:ceo, correction |
| status | active / superseded | Whether this fact is current |
| access_count | int | How many times this fact has been retrieved |
| superseded_by | string / null | ID of the newer fact that replaced this one |
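
For readers who prefer types to tables, the record can be modelled as a dataclass. This is an illustrative sketch only: the class layout, the `new_fact_id` helper, and the defaults are assumptions based on the JSON example, not Clawpy's actual definition. Note that `isoformat()` emits a `+00:00` offset rather than the `Z` suffix shown in the example.

```python
import secrets
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _utc_now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def new_fact_id() -> str:
    """Build an id in the documented shape: 'fact_' plus 8 hex characters."""
    return "fact_" + secrets.token_hex(4)  # 4 random bytes -> 8 hex chars


@dataclass
class ParaFact:
    fact: str
    category: str
    source: str
    id: str = field(default_factory=new_fact_id)
    timestamp: str = field(default_factory=_utc_now)
    status: str = "active"                 # "active" or "superseded"
    access_count: int = 0
    last_accessed: Optional[str] = None    # Assumed null until first retrieval
    superseded_by: Optional[str] = None


example = ParaFact(
    fact="Clawpy uses FastAPI on port 8000",
    category="technical",
    source="user_stated",
)
record = asdict(example)  # Plain dict, ready for json.dumps
```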
Supersession — Never Delete, Always Correct
When a fact becomes outdated, PARA does not delete it. Instead, the old fact is superseded by a new one:
Before:
fact_001 → "FastAPI runs on port 3000" [status: active]
After correction:
fact_001 → "FastAPI runs on port 3000" [status: superseded, superseded_by: fact_002]
fact_002 → "FastAPI runs on port 8000" [status: active, source: correction]
This creates an immutable audit trail of knowledge evolution. If you need to understand why an agent believes what it believes, the supersession chain tells the complete story.
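The correction above can be sketched as two small functions over a list of fact dicts. This is a hedged illustration under the assumption that facts are held in memory as dicts with the fields shown earlier; `supersede` and `supersession_chain` are hypothetical names, not Clawpy's API.

```python
from typing import Optional


def supersede(facts: list[dict], old_id: str, new_fact: dict) -> list[dict]:
    """Mark old_id as superseded by new_fact and append the new fact.

    Nothing is removed: the old record stays in the list with a pointer
    to its replacement.
    """
    old = next((f for f in facts if f["id"] == old_id), None)
    if old is None:
        raise KeyError(f"no fact with id {old_id!r}")
    if old["status"] != "active":
        raise ValueError(f"{old_id} is already superseded")
    old["status"] = "superseded"
    old["superseded_by"] = new_fact["id"]
    facts.append(new_fact)
    return facts


def supersession_chain(facts: list[dict], start_id: str) -> list[str]:
    """Follow superseded_by pointers from start_id to the current fact."""
    by_id = {f["id"]: f for f in facts}
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = start_id
    # Stop at the active fact, at a dangling pointer, or if a cycle appears.
    while current is not None and current in by_id and current not in seen:
        seen.add(current)
        chain.append(current)
        current = by_id[current].get("superseded_by")
    return chain
```

Walking the chain from the oldest fact answers the "why does the agent believe this" question: each hop is one recorded correction.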
Nightly PARA Promotion
Facts don't arrive in PARA by magic. They are promoted from the nightly memory synthesis through an LLM-driven evaluation pipeline:
Raw Ledger (24h)
       │
       ▼
Memory Synthesis (LLM + Validation Loop)
       │
       ▼
Daily Summary → Write to memory.md + daily note
       │
       ▼
PARA Promotion Evaluation (LLM + Validation Loop)
       │
       ▼
  promote: true/false?
       │
       ├── true  → Add facts to PARA entity
       │           Merge summary into summary.md
       │           Record telemetry in Layer B
       │
       └── false → No action (not durable enough)
Promotion Criteria
The LLM evaluator is prompted to assess 30-day value: would this fact still be relevant a month from now? If so, it assigns the promotion to one of the four categories:
- Projects — for enduring project facts
- Areas — for ongoing responsibilities and operating preferences
- Resources — for reusable references or technical knowledge
- Archives — only for clearly completed work
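
Because the evaluator is an LLM, its output is worth checking before anything is written to PARA. The sketch below shows one way such a check might look. The decision shape (`promote`, `category`, `entity`, `facts`) and the function name `validate_decision` are assumptions made for illustration; the source does not specify the evaluator's output format.

```python
VALID_CATEGORIES = {"projects", "areas", "resources", "archives"}


def validate_decision(decision: dict) -> list[str]:
    """Return a list of problems with an evaluator decision; empty means usable.

    The decision shape is hypothetical and only illustrates the checks.
    """
    problems: list[str] = []
    if not isinstance(decision.get("promote"), bool):
        problems.append("promote must be true or false")
        return problems
    if not decision["promote"]:
        return problems  # Nothing else is required when not promoting
    if decision.get("category") not in VALID_CATEGORIES:
        problems.append(
            "category must be one of: " + ", ".join(sorted(VALID_CATEGORIES))
        )
    if not str(decision.get("entity") or "").strip():
        problems.append("entity must be a non-empty name")
    facts = decision.get("facts")
    if (
        not isinstance(facts, list)
        or not facts
        or not all(isinstance(f, str) and f.strip() for f in facts)
    ):
        problems.append("facts must be a non-empty list of non-empty strings")
    return problems
```

A non-empty result could be fed back to the evaluator as a retry prompt, which is the kind of loop a validation step usually drives.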
Cost Controls
The promotion process is governed by the Validation Loop with strict budgets:
| Parameter | Default | Environment Variable |
|---|---|---|
| Max retries | 2 | CLAWPY_PARA_PROMOTION_MAX_RETRIES |
| Max cost | 12 cents | CLAWPY_PARA_PROMOTION_BUDGET_CENTS |
| Max latency | 12 seconds | CLAWPY_PARA_PROMOTION_MAX_MS |
| Max facts per cycle | 8 | configurable via Adaptation Store |
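
The environment variables in the table could be read as follows. This is a sketch under stated assumptions: `PromotionBudget` and `_env_int` are hypothetical names, the latency variable is assumed to be in milliseconds (so the 12-second default becomes 12000), and invalid values are assumed to fall back to the default rather than raise.

```python
import os
from dataclasses import dataclass


def _env_int(name: str, default: int) -> int:
    """Read a non-negative integer from the environment, else use default."""
    raw = os.environ.get(name, "").strip()
    if not raw:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default  # Assumed policy: ignore unparseable values
    return value if value >= 0 else default


@dataclass(frozen=True)
class PromotionBudget:
    max_retries: int
    budget_cents: int
    max_ms: int

    @classmethod
    def from_env(cls) -> "PromotionBudget":
        return cls(
            max_retries=_env_int("CLAWPY_PARA_PROMOTION_MAX_RETRIES", 2),
            budget_cents=_env_int("CLAWPY_PARA_PROMOTION_BUDGET_CENTS", 12),
            # 12 seconds, expressed in milliseconds (unit assumed from the name).
            max_ms=_env_int("CLAWPY_PARA_PROMOTION_MAX_MS", 12_000),
        )
```

The max-facts-per-cycle limit is left out on purpose, since the table says it is configured through the Adaptation Store rather than the environment.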
Deduplication
Before adding a new fact, the extractor checks existing facts for duplicates:
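# Normalise existing fact texts so the comparison ignores case and surrounding whitespace.
existing_fact_texts = {
    str(item.get("fact", "")).strip().lower()
    for item in self.para.get_active_facts(category, entity)
}
for fact in new_facts:
    normalized = fact.strip().lower()  # Same normalisation as the existing facts
    if normalized in existing_fact_texts:
        continue  # Skip duplicate
    existing_fact_texts.add(normalized)  # Also catch repeats within this batch
    stored_fact_ids.append(
        self.para.add_fact(category, entity, fact, source=...)
    )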
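The comparison is a case-insensitive exact match, so it stops the nightly extraction from adding "Clawpy uses FastAPI" fifty times, but it will not catch paraphrases of the same fact.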
Access Tracking
Every time a PARA fact is retrieved (via keyword search, hybrid search, or auto-recall), its access_count is incremented and last_accessed is updated:
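from datetime import datetime, timezone

def record_access(self, para_category, entity, fact_id):
    item = self._find_fact(para_category, entity, fact_id)  # hypothetical lookup helper; the original omits this step
    item["access_count"] = item.get("access_count", 0) + 1
    item["last_accessed"] = datetime.now(timezone.utc).isoformat()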
This metadata enables future features like relevance-based pruning of the Archives category and analytics on which facts are most valuable to agent reasoning.
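As an illustration of what that metadata makes possible, the sketch below lists facts that have not been retrieved recently, least-used first. It is speculative: the function name `stale_facts`, the 90-day default, and the rule that never-accessed facts count as stale are all assumptions, and nothing here deletes a fact.

```python
from datetime import datetime, timedelta, timezone


def _parse(ts: str) -> datetime:
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def stale_facts(
    facts: list[dict], now: datetime, max_idle_days: int = 90
) -> list[dict]:
    """Return facts not accessed within max_idle_days, least-used first."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for fact in facts:
        last = fact.get("last_accessed")
        # A fact that was never accessed counts as stale (assumed policy).
        if last is None or _parse(last) < cutoff:
            stale.append(fact)
    return sorted(stale, key=lambda f: f.get("access_count", 0))


# Example call with a timezone-aware "now":
candidates = stale_facts(
    [{"id": "fact_a1b2c3d4", "access_count": 7,
      "last_accessed": "2026-04-19T21:30:00Z"}],
    now=datetime(2026, 8, 1, tzinfo=timezone.utc),
)
```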
Integration With Other Layers
PARA doesn't operate in isolation. It feeds into and receives from multiple memory subsystems:
| Direction | Layer | Relationship |
|---|---|---|
| ← Input | Layer B (Event Ledger) | Raw events that trigger nightly extraction |
| ← Input | Memory Synthesis | Daily summaries that feed promotion evaluation |
| → Output | Auto-Recall (Layer D) | PARA facts are injected into agent prompts |
| → Output | Hybrid Search | PARA facts participate in keyword + vector search |
| ↔ Bidirectional | Adaptation Engine | Validator policies can tune max facts per cycle |