Validation Loop

The Validation Loop is Clawpy's self-healing execution framework. Every LLM output that matters — memory synthesis, PARA promotions, introspection evaluations, research summaries, swarm blueprints — passes through a validate → heal → retry cycle that ensures quality while enforcing strict cost and latency budgets.


Core Pattern

          ┌───────────────┐
          │ LLM generates │
          │   candidate   │
          └───────┬───────┘
                  │
          ┌───────▼───────┐
          │   Validate    │ ← Structural checks (JSON shape, required fields)
          └───────┬───────┘
                  │
         pass ────┴──── fail
          │               │
          ▼       ┌───────▼───────┐
      ✅ Accept   │     Heal      │ ← LLM repairs its output
                  │   (+ retry    │
                  │    counter)   │
                  └───────┬───────┘
                          │
                  ┌───────▼───────┐
                  │  Re-validate  │
                  └───────┬───────┘
                          │
                 pass ────┴──── fail (loop back to Heal)
                  │               │
                  ▼        max retries or
              ✅ Accept    cost ceiling hit
                                  │
                          ┌───────▼───────┐
                          │   ❌ Reject   │
                          └───────────────┘
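The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not Clawpy's actual API: `generate`, `validate`, and `heal` are hypothetical callables that each report the token cost they incurred.

```python
def run_validation_loop(generate, validate, heal, max_retries=3,
                        max_cost_cents=20):
    """Sketch of the validate → heal → retry cycle (illustrative names)."""
    output, cost_cents = generate()
    for attempt in range(max_retries + 1):
        passed, feedback = validate(output)
        if passed:
            return output  # ✅ Accept
        if attempt == max_retries or cost_cents >= max_cost_cents:
            raise RuntimeError(f"Rejected: {feedback}")  # ❌ Reject
        # Heal: ask the LLM to repair its own output, tracking cost
        output, heal_cost = heal(output, feedback, attempt + 1)
        cost_cents += heal_cost
    raise RuntimeError("Rejected: retries exhausted")
```

The key property is that every path out of the loop is either an accept or an explicit reject; there is no way to return unvalidated output.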

Validators

The Validation Loop supports multiple validator types:

RuleBasedValidator

Applies structural rules to raw LLM output. Each rule is a named function that returns a (passed: bool, message: str) tuple:

rules = {
    "has_title": lambda s: (len(s) > 0, "Title required"),
    "max_length": lambda s: (len(s) < 1200, "Too long"),
    "no_empty_lines": lambda s: ("\n\n\n" not in s, "Triple blank lines"),
}
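The validator then applies every rule and aggregates the failure messages into validation feedback. A minimal sketch (the `run_rules` helper is hypothetical, not the real class):

```python
def run_rules(rules, output):
    """Apply each rule to the raw output; collect failure messages."""
    failures = []
    for name, check in rules.items():
        passed, message = check(output)
        if not passed:
            failures.append(f"{name}: {message}")
    return (len(failures) == 0, "; ".join(failures))
```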

JsonObjectValidator

For LLM outputs that must be valid JSON objects. Automatically:

  1. Strips markdown code fences (```json ... ```)
  2. Attempts JSON parsing
  3. Runs per-field structural rules
  4. Returns parsed object in metadata on success
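Those four steps could look roughly like this. It is an illustrative sketch only; the real validator's signature and field-rule shape may differ:

```python
import json
import re

def validate_json_object(raw, field_rules=None):
    """Sketch of JsonObjectValidator: strip fences, parse, check fields."""
    # 1. Strip a markdown code fence if present (```json ... ```)
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    # 2. Attempt JSON parsing
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as e:
        return False, f"Invalid JSON: {e}", None
    if not isinstance(obj, dict):
        return False, "Expected a JSON object", None
    # 3. Run per-field structural rules
    for field, check in (field_rules or {}).items():
        passed, message = check(obj.get(field))
        if not passed:
            return False, message, None
    # 4. Return the parsed object on success
    return True, "", obj
```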

JsonArrayValidator

Same as JsonObjectValidator but expects a JSON array. Used for list-type outputs.

HttpRequestValidator

Validates HTTP request specifications before execution. Checks URL format, allowed methods, required headers, and response expectations.
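In spirit, those checks might look like the sketch below. The spec dictionary shape and the allowed-method set are assumptions for illustration, not Clawpy's actual request format:

```python
from urllib.parse import urlparse

ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE"}  # illustrative set

def validate_http_request(spec):
    """Sketch of pre-execution checks on an HTTP request spec."""
    # URL format: require an http(s) scheme and a host
    url = urlparse(spec.get("url", ""))
    if url.scheme not in ("http", "https") or not url.netloc:
        return False, "Malformed URL"
    # Allowed methods
    if spec.get("method", "").upper() not in ALLOWED_METHODS:
        return False, "Method not allowed"
    # Required headers must be present
    missing = [h for h in spec.get("required_headers", [])
               if h not in spec.get("headers", {})]
    if missing:
        return False, f"Missing headers: {missing}"
    return True, ""
```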


Healing

When validation fails, the Validation Loop sends the original output plus the validation feedback back to the LLM for repair:

System: "Your previous response did not pass validation."
User:
  - Previous response: [the failing output]
  - Validation feedback: "learnings must contain non-empty strings"
  - Repair attempt 1 of 3.

Each healing attempt costs tokens, which are tracked and counted against the run's cost budget.
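Assembling that repair prompt is mechanical; a sketch of the message construction (the `build_heal_messages` helper and chat-message shape are assumptions):

```python
def build_heal_messages(previous, feedback, attempt, max_retries):
    """Build the repair prompt described above as chat messages."""
    return [
        {"role": "system",
         "content": "Your previous response did not pass validation."},
        {"role": "user",
         "content": (f"Previous response: {previous}\n"
                     f"Validation feedback: {feedback}\n"
                     f"Repair attempt {attempt} of {max_retries}.")},
    ]
```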


Cost Governance

Every validation run operates under strict budget constraints:

Parameter        Description                                     Source
max_retries      Max heal attempts before rejection              Per-run configuration
cost_cents       Running cost total for this run                 Accumulates across retries
max_cost_cents   Cost ceiling; run is rejected if exceeded       Per-run configuration
max_latency_ms   Latency ceiling; run is rejected if exceeded    Per-run configuration

Cost Ceiling Example

context = ValidationContext(
    run_kind="para_promotion",
    max_retries=2,
    cost_cents=0,
    max_cost_cents=12,    # 12 cents max for this promotion
    max_latency_ms=12000, # 12 seconds max
)

If the cumulative cost of the initial generation + healings exceeds 12 cents, the run is rejected regardless of output quality. This prevents runaway retry loops from burning budget.
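The ceiling check itself is simple. A sketch mirroring the fields above (the `ValidationContext` dataclass shown here and the `over_budget` helper are illustrative, not the real implementation):

```python
from dataclasses import dataclass

@dataclass
class ValidationContext:
    """Mirrors the configuration fields shown above (illustrative)."""
    run_kind: str
    max_retries: int
    cost_cents: int       # running total, accumulates across retries
    max_cost_cents: int
    max_latency_ms: int

def over_budget(ctx: ValidationContext, elapsed_ms: int) -> bool:
    """True once the run must be rejected regardless of output quality."""
    return (ctx.cost_cents > ctx.max_cost_cents
            or elapsed_ms > ctx.max_latency_ms)
```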


Run Kinds

The Validation Loop is used across the system for different purposes:

Run Kind                   Purpose                                 Typical Budget
tdd                        Test-driven development repair cycles   20 cents, 3 retries
memory_synthesis           Nightly memory distillation             20 cents, 2 retries
para_promotion             PARA knowledge promotion evaluation     12 cents, 2 retries
introspection_evaluation   Agent self-evaluation                   8 cents, 3 retries
research_summary           Research output formatting              15 cents, 2 retries
blueprint_draft            Swarm org chart generation              25 cents, 3 retries
auto_reply_evaluation      Auto-reply quality check                5 cents, 2 retries
workflow_verification      Flow execution validation               10 cents, 2 retries

Telemetry

Every validation run is recorded in the event ledger (Layer B):

{
  "run_kind": "para_promotion",
  "status": "passed",
  "attempts": 2,
  "total_cost_cents": 8,
  "latency_ms": 4200,
  "validator_name": "json_object",
  "agent_id": "memory_extractor",
  "feedback": "Passed after 1 heal"
}

This telemetry feeds the Adaptation Engine, which can learn from recurring validation failures and adjust system behaviour. For example, if research_summary runs consistently fail the first attempt, the system may propose a prompt fragment improvement to reduce first-attempt failure rates.


Why This Matters

Without the Validation Loop, an LLM hallucination in memory synthesis could corrupt the agent's knowledge base. A malformed PARA promotion could inject broken facts. A poorly formatted introspection evaluation could crash the skill creation pipeline.

The Validation Loop provides mechanical guarantees that LLM outputs meet structural requirements, while the cost governance ensures that quality enforcement doesn't become more expensive than the work itself.