PARA Canonical Knowledge

PARA (Projects, Areas, Resources, Archives) is Clawpy's highest-fidelity knowledge layer — Layer G of the memory stack. While other layers store ephemeral context and decaying memories, PARA stores canonical, immutable facts that define ground truth for every agent in the system.

Facts in PARA are never deleted. They can only be superseded by newer facts, creating an auditable correction history that preserves knowledge provenance.

Structure

The PARA system organises knowledge into four directories, each containing named entities:

life/
├── projects/
│   ├── clawpy/
│   │   ├── summary.md      ← Human-readable overview
│   │   └── items.json      ← Atomic facts (source of truth)
│   └── auspice-8/
│       ├── summary.md
│       └── items.json
├── areas/
│   ├── security/
│   └── infrastructure/
├── resources/
│   ├── api-patterns/
│   └── deployment-guides/
└── archives/
    └── completed-migration/
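
The layout above maps directly onto filesystem paths. The following is a minimal sketch of reading one entity's facts, assuming `items.json` holds a JSON array of fact objects. The helper names `entity_dir` and `load_items` are hypothetical and are not part of Clawpy's documented API.

```python
import json
from pathlib import Path

# The four top-level PARA directories shown in the tree above.
PARA_CATEGORIES = ("projects", "areas", "resources", "archives")


def entity_dir(root: Path, category: str, entity: str) -> Path:
    """Return the directory holding one entity's summary.md and items.json."""
    if category not in PARA_CATEGORIES:
        raise ValueError(f"unknown PARA category: {category!r}")
    return root / category / entity


def load_items(root: Path, category: str, entity: str) -> list:
    """Read an entity's atomic facts; a missing items.json yields no facts."""
    items_path = entity_dir(root, category, entity) / "items.json"
    if not items_path.exists():
        return []
    # Assumption: items.json is a JSON array of fact objects.
    return json.loads(items_path.read_text(encoding="utf-8"))
```

With the tree above, `load_items(Path("life"), "projects", "clawpy")` would read `life/projects/clawpy/items.json`.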

Category Definitions

Category	Purpose	Example Entities
Projects	Active, enduring project knowledge	`clawpy`, `auspice-8`, `trading-machine`
Areas	Ongoing responsibilities and operating preferences	`security`, `code-style`, `infrastructure`
Resources	Reusable references and technical patterns	`api-patterns`, `deployment-guides`
Archives	Completed work preserved for reference	`completed-migration`, `legacy-api`

Atomic Facts

Every piece of knowledge in PARA is stored as an atomic ParaFact — a self-contained statement with full metadata:

{
  "id": "fact_a1b2c3d4",
  "fact": "Clawpy uses FastAPI on port 8000",
  "category": "technical",
  "timestamp": "2026-04-19T22:00:00Z",
  "source": "memory_extractor:cto",
  "status": "active",
  "access_count": 7,
  "last_accessed": "2026-04-19T22:30:00Z",
  "superseded_by": null
}

Fields

Field	Type	Purpose
`id`	string	Unique identifier (`fact_` prefix + 8-char hex)
`fact`	string	The atomic knowledge statement
`category`	string	Classification: `technical`, `preference`, `deadline`, etc.
`timestamp`	string	ISO 8601 UTC timestamp of when the fact was recorded
`source`	string	Origin: `user_stated`, `memory_extractor:ceo`, `correction`
`status`	`active` / `superseded`	Whether this fact is current
`access_count`	int	How many times this fact has been retrieved
`last_accessed`	string	ISO 8601 UTC timestamp of the most recent retrieval
`superseded_by`	string / null	ID of the newer fact that replaced this one
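
As an illustration of the record shape, the fields can be modelled as a Python dataclass. This is a sketch under assumptions, not Clawpy's actual class: the defaults for a brand-new fact (`access_count` of 0, `last_accessed` unset) and the helpers `new_fact_id`, `utc_now`, and `to_item` are hypothetical.

```python
import secrets
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def new_fact_id() -> str:
    """Build an id in the documented shape: 'fact_' plus 8 hex characters."""
    return "fact_" + secrets.token_hex(4)  # 4 random bytes -> 8 hex chars


def utc_now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class ParaFact:
    fact: str
    category: str
    source: str
    id: str = field(default_factory=new_fact_id)
    timestamp: str = field(default_factory=utc_now)
    status: str = "active"
    access_count: int = 0                 # assumed default for a new fact
    last_accessed: Optional[str] = None   # assumed unset until first retrieval
    superseded_by: Optional[str] = None

    def to_item(self) -> dict:
        """Return a plain dict suitable for serialising into items.json."""
        return asdict(self)
```

Note that `isoformat()` renders UTC as `+00:00` rather than the `Z` suffix shown in the JSON example; both are valid ISO 8601.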

Supersession — Never Delete, Always Correct

When a fact becomes outdated, PARA does not delete it. Instead, the old fact is superseded by a new one:

Before:
  fact_001 → "FastAPI runs on port 3000"  [status: active]

After correction:
  fact_001 → "FastAPI runs on port 3000"  [status: superseded, superseded_by: fact_002]
  fact_002 → "FastAPI runs on port 8000"  [status: active, source: correction]

This creates an immutable audit trail of knowledge evolution. If you need to understand why an agent believes what it believes, the supersession chain tells the complete story.
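
The correction above can be sketched over a plain list of fact dicts. The function names `supersede` and `supersession_chain` are hypothetical, chosen for illustration; the sketch only assumes the `id`, `status`, and `superseded_by` fields described earlier.

```python
from typing import Optional


def supersede(items: list, old_id: str, new_fact: dict) -> None:
    """Mark old_id as superseded by new_fact and append new_fact; nothing is removed."""
    old = next((item for item in items if item["id"] == old_id), None)
    if old is None:
        raise KeyError(f"no fact with id {old_id!r}")
    if old["status"] != "active":
        raise ValueError(f"{old_id} is already superseded")
    old["status"] = "superseded"
    old["superseded_by"] = new_fact["id"]
    items.append(new_fact)


def supersession_chain(items: list, start_id: str) -> list:
    """Follow superseded_by links from start_id to the currently active fact."""
    by_id = {item["id"]: item for item in items}
    chain, seen = [], set()
    current: Optional[str] = start_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against an accidental cycle
        chain.append(current)
        current = by_id[current].get("superseded_by")
    return chain


items = [{"id": "fact_001", "fact": "FastAPI runs on port 3000",
          "status": "active", "superseded_by": None}]
supersede(items, "fact_001", {"id": "fact_002", "fact": "FastAPI runs on port 8000",
                              "status": "active", "source": "correction",
                              "superseded_by": None})
print(supersession_chain(items, "fact_001"))  # ['fact_001', 'fact_002']
```

Walking the chain from any historical fact ends at the fact that is active today, which is what makes the correction history auditable.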

Nightly PARA Promotion

Facts don't arrive in PARA by magic. They are promoted from the nightly memory synthesis through an LLM-driven evaluation pipeline:

Raw Ledger (24h)
     │
     ▼
Memory Synthesis (LLM + Validation Loop)
     │
     ▼
Daily Summary → Write to memory.md + daily note
     │
     ▼
PARA Promotion Evaluation (LLM + Validation Loop)
     │
     ▼
  promote: true/false?
     │
     ├── true → Add facts to PARA entity
     │          Merge summary into summary.md
     │          Record telemetry in Layer B
     │
     └── false → No action (not durable enough)

Promotion Criteria

The LLM evaluator is prompted to assess 30-day value: would this fact still be relevant a month from now? When the answer is yes, it assigns the promoted facts to one of the four PARA categories:

Projects — for enduring project facts
Areas — for ongoing responsibilities and operating preferences
Resources — for reusable references or technical knowledge
Archives — only for clearly completed work
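
Because the evaluator's answer drives a write to canonical storage, it is worth checking before acting on it. The sketch below assumes a hypothetical output shape (`promote`, `category`, `entity`, `facts`); the source does not specify the evaluator's actual schema, and `parse_promotion_decision` is an illustrative name.

```python
from typing import Optional

PARA_CATEGORIES = {"projects", "areas", "resources", "archives"}


def parse_promotion_decision(raw: dict) -> Optional[dict]:
    """Return a cleaned decision, or None when nothing should be promoted."""
    if raw.get("promote") is not True:
        return None  # promote: false -> no action
    category = str(raw.get("category", "")).strip().lower()
    if category not in PARA_CATEGORIES:
        raise ValueError(f"evaluator returned unknown category: {category!r}")
    entity = str(raw.get("entity", "")).strip()
    # Keep only non-empty string facts, trimmed of surrounding whitespace.
    facts = [f.strip() for f in raw.get("facts", []) if isinstance(f, str) and f.strip()]
    if not entity or not facts:
        raise ValueError("a promotion needs an entity and at least one fact")
    return {"category": category, "entity": entity, "facts": facts}
```

Rejecting an unknown category outright, rather than guessing a fallback, keeps a malformed evaluation from landing in the wrong directory.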

Cost Controls

The promotion process is governed by the Validation Loop with strict budgets:

Parameter	Default	Environment Variable
Max retries	2	`CLAWPY_PARA_PROMOTION_MAX_RETRIES`
Max cost	12 cents	`CLAWPY_PARA_PROMOTION_BUDGET_CENTS`
Max latency	12 seconds	`CLAWPY_PARA_PROMOTION_MAX_MS`
Max facts per cycle	8	configurable via Adaptation Store
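
A minimal sketch of resolving these budgets from the environment follows. It assumes that `CLAWPY_PARA_PROMOTION_MAX_MS` is expressed in milliseconds (so the 12-second default becomes 12000) and that all values are non-negative integers. `PromotionBudget` and `load_budget` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class PromotionBudget:
    max_retries: int = 2
    budget_cents: int = 12
    max_ms: int = 12_000            # assumption: the variable is in milliseconds
    max_facts_per_cycle: int = 8    # set via the Adaptation Store, not the environment


def _int_env(env: Mapping[str, str], name: str, default: int) -> int:
    """Read a non-negative integer from env, falling back to default when unset."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)
    if value < 0:
        raise ValueError(f"{name} must be non-negative, got {value}")
    return value


def load_budget(env: Mapping[str, str] = os.environ) -> PromotionBudget:
    """Build the promotion budget from environment overrides plus documented defaults."""
    return PromotionBudget(
        max_retries=_int_env(env, "CLAWPY_PARA_PROMOTION_MAX_RETRIES", 2),
        budget_cents=_int_env(env, "CLAWPY_PARA_PROMOTION_BUDGET_CENTS", 12),
        max_ms=_int_env(env, "CLAWPY_PARA_PROMOTION_MAX_MS", 12_000),
    )
```

Passing the environment as a parameter makes the defaults easy to check in isolation, for example `load_budget({})`.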

Deduplication

Before adding a new fact, the extractor checks existing facts for duplicates:

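# Normalise active facts once so the comparison ignores case and surrounding whitespace.
existing_fact_texts = {
    str(item.get("fact")).strip().lower()
    for item in self.para.get_active_facts(category, entity)
}

for fact in new_facts:
    normalised = fact.strip().lower()  # Apply the same normalisation as above
    if normalised in existing_fact_texts:
        continue  # Skip duplicate
    existing_fact_texts.add(normalised)  # Also catch repeats within this batch
    stored_fact_ids.append(
        self.para.add_fact(category, entity, fact, source=...)
    )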

This prevents the nightly extraction from adding "Clawpy uses FastAPI" fifty times.

Access Tracking

Every time a PARA fact is retrieved (via keyword search, hybrid search, or auto-recall), its access_count is incremented and last_accessed is updated:

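from datetime import datetime, timezone

def record_access(self, para_category, entity, fact_id):
    # `item` is the fact dict matching fact_id (lookup and save omitted in this excerpt).
    item["access_count"] = item.get("access_count", 0) + 1
    item["last_accessed"] = datetime.now(timezone.utc).isoformat()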

This metadata enables future features like relevance-based pruning of the Archives category and analytics on which facts are most valuable to agent reasoning.
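
As one example of such analytics, the sketch below ranks an entity's active facts by how often they have been retrieved. The function `most_accessed` is hypothetical and assumes only the `status` and `access_count` fields described earlier.

```python
def most_accessed(items: list, limit: int = 5) -> list:
    """Return up to `limit` active facts, most frequently retrieved first."""
    active = [item for item in items if item.get("status") == "active"]
    # Facts that were never retrieved may lack access_count; treat them as 0.
    return sorted(active, key=lambda item: item.get("access_count", 0), reverse=True)[:limit]
```

Superseded facts are excluded because their retrieval counts describe knowledge that is no longer current.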

Integration With Other Layers

PARA doesn't operate in isolation. It feeds into and receives from multiple memory subsystems:

Direction	Layer	Relationship
← Input	Layer B (Event Ledger)	Raw events that trigger nightly extraction
← Input	Memory Synthesis	Daily summaries that feed promotion evaluation
→ Output	Layer D (Auto-Recall)	PARA facts are injected into agent prompts
→ Output	Hybrid Search	PARA facts participate in keyword + vector search
↔ Bidirectional	Adaptation Engine	Validator policies can tune max facts per cycle