Validation Loop

The Validation Loop is Clawpy's self-healing execution framework. Every LLM output that matters — memory synthesis, PARA promotions, introspection evaluations, research summaries, swarm blueprints — passes through a validate → heal → retry cycle that ensures quality while enforcing strict cost and latency budgets.


Core Pattern

          ┌───────────────┐
          │ LLM generates │
          │   candidate   │
          └───────┬───────┘
                  │
          ┌───────▼───────┐
          │   Validate    │ ← Structural checks (JSON shape, required fields)
          └───────┬───────┘
                  │
         pass ────┴──── fail
          │               │
          ▼       ┌───────▼───────┐
      ✅ Accept   │     Heal      │ ← LLM repairs its output
                  │   (+ retry    │
                  │    counter)   │
                  └───────┬───────┘
                          │
                  ┌───────▼───────┐
                  │  Re-validate  │
                  └───────┬───────┘
                          │
                 pass ────┴──── fail (loop back to Heal)
                  │               │
                  ▼        max retries or
              ✅ Accept    cost ceiling hit
                                  │
                          ┌───────▼───────┐
                          │   ❌ Reject   │
                          └───────────────┘
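The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not Clawpy's actual API: `generate`, `validate`, and `heal` are hypothetical callables that each report the token cost they incurred.

```python
def run_validation_loop(generate, validate, heal, max_retries=3,
                        max_cost_cents=20):
    """Sketch of the validate → heal → retry cycle (illustrative names)."""
    output, cost_cents = generate()
    for attempt in range(max_retries + 1):
        passed, feedback = validate(output)
        if passed:
            return output  # ✅ Accept
        if attempt == max_retries or cost_cents >= max_cost_cents:
            raise RuntimeError(f"Rejected: {feedback}")  # ❌ Reject
        # Heal: ask the LLM to repair its own output, tracking cost
        output, heal_cost = heal(output, feedback, attempt + 1)
        cost_cents += heal_cost
    raise RuntimeError("Rejected: retries exhausted")
```

The key property is that every path out of the loop is either an accept or an explicit reject; there is no way to return unvalidated output.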

Validators

The Validation Loop supports multiple validator types:

RuleBasedValidator

Applies structural rules to raw LLM output. Each rule is a named function that returns a (passed: bool, message: str) tuple:

rules = {
    "has_title": lambda s: (len(s) > 0, "Title required"),
    "max_length": lambda s: (len(s) < 1200, "Too long"),
    "no_empty_lines": lambda s: ("\n\n\n" not in s, "Triple blank lines"),
}
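The validator then applies every rule and aggregates the failure messages into validation feedback. A minimal sketch (the `run_rules` helper is hypothetical, not the real class):

```python
def run_rules(rules, output):
    """Apply each rule to the raw output; collect failure messages."""
    failures = []
    for name, check in rules.items():
        passed, message = check(output)
        if not passed:
            failures.append(f"{name}: {message}")
    return (len(failures) == 0, "; ".join(failures))
```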

JsonObjectValidator

For LLM outputs that must be valid JSON objects. Automatically:

  1. Strips markdown code fences (```json ... ```)
  2. Attempts JSON parsing
  3. Runs per-field structural rules
  4. Returns parsed object in metadata on success
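Those four steps could look roughly like this. It is an illustrative sketch only; the real validator's signature and field-rule shape may differ:

```python
import json
import re

def validate_json_object(raw, field_rules=None):
    """Sketch of JsonObjectValidator: strip fences, parse, check fields."""
    # 1. Strip a markdown code fence if present (```json ... ```)
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    # 2. Attempt JSON parsing
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as e:
        return False, f"Invalid JSON: {e}", None
    if not isinstance(obj, dict):
        return False, "Expected a JSON object", None
    # 3. Run per-field structural rules
    for field, check in (field_rules or {}).items():
        passed, message = check(obj.get(field))
        if not passed:
            return False, message, None
    # 4. Return the parsed object on success
    return True, "", obj
```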

JsonArrayValidator

Same as JsonObjectValidator but expects a JSON array. Used for list-type outputs.

HttpRequestValidator

Validates HTTP request specifications before execution. Checks URL format, allowed methods, required headers, and response expectations.
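In spirit, those checks might look like the sketch below. The spec dictionary shape and the allowed-method set are assumptions for illustration, not Clawpy's actual request format:

```python
from urllib.parse import urlparse

ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE"}  # illustrative set

def validate_http_request(spec):
    """Sketch of pre-execution checks on an HTTP request spec."""
    # URL format: require an http(s) scheme and a host
    url = urlparse(spec.get("url", ""))
    if url.scheme not in ("http", "https") or not url.netloc:
        return False, "Malformed URL"
    # Allowed methods
    if spec.get("method", "").upper() not in ALLOWED_METHODS:
        return False, "Method not allowed"
    # Required headers must be present
    missing = [h for h in spec.get("required_headers", [])
               if h not in spec.get("headers", {})]
    if missing:
        return False, f"Missing headers: {missing}"
    return True, ""
```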


Healing

When validation fails, the Validation Loop sends the original output plus the validation feedback back to the LLM for repair:

System: "Your previous response did not pass validation."
User:
  - Previous response: [the failing output]
  - Validation feedback: "learnings must contain non-empty strings"
  - Repair attempt 1 of 3.

Each healing attempt costs tokens, which are tracked and counted against the run's cost budget.
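Assembling that repair prompt is mechanical; a sketch of the message construction (the `build_heal_messages` helper and chat-message shape are assumptions):

```python
def build_heal_messages(previous, feedback, attempt, max_retries):
    """Build the repair prompt described above as chat messages."""
    return [
        {"role": "system",
         "content": "Your previous response did not pass validation."},
        {"role": "user",
         "content": (f"Previous response: {previous}\n"
                     f"Validation feedback: {feedback}\n"
                     f"Repair attempt {attempt} of {max_retries}.")},
    ]
```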


Cost Governance

Every validation run operates under strict budget constraints:

Parameter        Description                                     Source
max_retries      Max heal attempts before rejection              Per-run configuration
cost_cents       Running cost total for this run                 Accumulates across retries
max_cost_cents   Cost ceiling; run is rejected if exceeded       Per-run configuration
max_latency_ms   Latency ceiling; run is rejected if exceeded    Per-run configuration

Cost Ceiling Example

context = ValidationContext(
    run_kind="para_promotion",
    max_retries=2,
    cost_cents=0,
    max_cost_cents=12,    # 12 cents max for this promotion
    max_latency_ms=12000, # 12 seconds max
)

If the cumulative cost of the initial generation + healings exceeds 12 cents, the run is rejected regardless of output quality. This prevents runaway retry loops from burning budget.
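The ceiling check itself is simple. A sketch mirroring the fields above (the `ValidationContext` dataclass shown here and the `over_budget` helper are illustrative, not the real implementation):

```python
from dataclasses import dataclass

@dataclass
class ValidationContext:
    """Mirrors the configuration fields shown above (illustrative)."""
    run_kind: str
    max_retries: int
    cost_cents: int       # running total, accumulates across retries
    max_cost_cents: int
    max_latency_ms: int

def over_budget(ctx: ValidationContext, elapsed_ms: int) -> bool:
    """True once the run must be rejected regardless of output quality."""
    return (ctx.cost_cents > ctx.max_cost_cents
            or elapsed_ms > ctx.max_latency_ms)
```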


Run Kinds

The Validation Loop is used across the system for different purposes:

Run Kind                   Purpose                                 Typical Budget
tdd                        Test-driven development repair cycles   20 cents, 3 retries
memory_synthesis           Nightly memory distillation             20 cents, 2 retries
para_promotion             PARA knowledge promotion evaluation     12 cents, 2 retries
introspection_evaluation   Agent self-evaluation                   8 cents, 3 retries
research_summary           Research output formatting              15 cents, 2 retries
blueprint_draft            Swarm org chart generation              25 cents, 3 retries
auto_reply_evaluation      Auto-reply quality check                5 cents, 2 retries
workflow_verification      Flow execution validation               10 cents, 2 retries

Telemetry

Every validation run is recorded in the event ledger (Layer B):

{
  "run_kind": "para_promotion",
  "status": "passed",
  "attempts": 2,
  "total_cost_cents": 8,
  "latency_ms": 4200,
  "validator_name": "json_object",
  "agent_id": "memory_extractor",
  "feedback": "Passed after 1 heal"
}

This telemetry feeds the Adaptation Engine, which can learn from recurring validation failures and adjust system behaviour. For example, if research_summary runs consistently fail the first attempt, the system may propose a prompt fragment improvement to reduce first-attempt failure rates.


Why This Matters

Without the Validation Loop, an LLM hallucination in memory synthesis could corrupt the agent's knowledge base. A malformed PARA promotion could inject broken facts. A poorly formatted introspection evaluation could crash the skill creation pipeline.

The Validation Loop provides mechanical guarantees that LLM outputs meet structural requirements, while the cost governance ensures that quality enforcement doesn't become more expensive than the work itself.