Validation Loop
The Validation Loop is Clawpy's self-healing execution framework. Every LLM output that matters — memory synthesis, PARA promotions, introspection evaluations, research summaries, swarm blueprints — passes through a validate → heal → retry cycle that ensures quality while enforcing strict cost and latency budgets.
Core Pattern
```
┌───────────────┐
│ LLM generates │
│   candidate   │
└───────┬───────┘
        │
┌───────▼───────┐
│   Validate    │ ← Structural checks (JSON shape, required fields)
└───────┬───────┘
        │
   pass │ fail
  ┌─────┴─────────────────┐
  ▼                       ▼
✅ Accept         ┌───────────────┐
  ▲               │     Heal      │ ← LLM repairs its output
  │               │   (+ retry    │
  │               │    counter)   │
  │               └───────┬───────┘
  │                       │
  │               ┌───────▼───────┐
  │               │  Re-validate  │
  │               └───────┬───────┘
  │                       │
  └── pass ───────────────┤ fail → loop back to Heal
                          │
                 max retries or cost
                    ceiling hit
                          │
                  ┌───────▼───────┐
                  │   ❌ Reject   │
                  └───────────────┘
```
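The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not the Clawpy implementation; `validation_loop` and its callback signatures are assumptions:

```python
# Hypothetical sketch of the validate → heal → retry cycle (not Clawpy's API).
def validation_loop(generate, validate, heal, max_retries=3):
    """Generate once, then heal and re-validate until pass or retries run out."""
    candidate = generate()
    for attempt in range(max_retries + 1):
        passed, feedback = validate(candidate)
        if passed:
            return candidate                   # ✅ Accept
        if attempt == max_retries:
            break                              # budget exhausted
        candidate = heal(candidate, feedback)  # LLM repairs its output
    raise ValueError("Rejected: max retries exhausted")
```

In practice each iteration would also accumulate cost and latency against the run's budget, as described under Cost Governance below.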
Validators
The Validation Loop supports multiple validator types:
RuleBasedValidator
Applies structural rules to raw LLM output. Each rule is a function that returns `(passed: bool, message: str)`:
```python
rules = {
    "has_title": lambda s: (len(s) > 0, "Title required"),
    "max_length": lambda s: (len(s) < 1200, "Too long"),
    "no_empty_lines": lambda s: ("\n\n\n" not in s, "Triple blank lines"),
}
```
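A minimal sketch of how such a validator might apply these rules; `run_rules` is an illustrative name, not the Clawpy API:

```python
# Illustrative rule-based validator: apply every rule, collect failure messages.
def run_rules(rules, output):
    """Return (passed, feedback) after running all rules against raw output."""
    failures = []
    for name, check in rules.items():
        ok, message = check(output)
        if not ok:
            failures.append(f"{name}: {message}")
    return (not failures, "; ".join(failures) or "All rules passed")
```

Collecting every failure message (rather than stopping at the first) gives the healing step richer feedback in a single retry.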
JsonObjectValidator
For LLM outputs that must be valid JSON objects. Automatically:
- Strips markdown code fences (```` ```json ... ``` ````)
- Attempts JSON parsing
- Runs per-field structural rules
- Returns parsed object in metadata on success
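A compact sketch of the fence-stripping and parsing steps; the function name and return shape are assumptions for illustration:

```python
import json
import re

# Illustrative JSON-object validation: strip ```json fences, parse, check shape.
def validate_json_object(raw):
    """Return (passed, feedback, parsed_object_or_None)."""
    text = re.sub(r"^```(?:json)?\s*|```\s*$", "", raw.strip())
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as e:
        return False, f"Invalid JSON: {e}", None
    if not isinstance(obj, dict):
        return False, "Expected a JSON object", None
    return True, "OK", obj  # parsed object available as metadata on success
```

Per-field structural rules would then run against the parsed `dict` before the result is accepted.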
JsonArrayValidator
Same as JsonObjectValidator but expects a JSON array. Used for list-type outputs.
HttpRequestValidator
Validates HTTP request specifications before execution. Checks URL format, allowed methods, required headers, and response expectations.
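A sketch of the URL and method checks, assuming a plain-`dict` request spec; the field names and allowed-method set are illustrative, not Clawpy's actual schema:

```python
from urllib.parse import urlparse

# Assumed method allowlist for illustration.
ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE"}

def validate_http_spec(spec):
    """Check URL format and method before the request is ever executed."""
    parsed = urlparse(spec.get("url", ""))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False, "Malformed URL"
    if spec.get("method", "").upper() not in ALLOWED_METHODS:
        return False, f"Method not allowed: {spec.get('method')}"
    return True, "OK"
```

Header and response-expectation checks would follow the same pattern: each is a cheap structural test that runs before any network call is made.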
Healing
When validation fails, the Validation Loop sends the original output plus the validation feedback back to the LLM for repair:
```
System: "Your previous response did not pass validation."
User:
- Previous response: [the failing output]
- Validation feedback: "learnings must contain non-empty strings"
- Repair attempt 1 of 3.
```
Each healing attempt costs tokens, which are tracked and counted against the run's cost budget.
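The repair prompt above might be assembled like this; the wording and message shape are illustrative, not Clawpy's exact prompt:

```python
# Hypothetical sketch of building a healing prompt from validation feedback.
def build_heal_messages(previous, feedback, attempt, max_retries):
    """Return chat messages asking the LLM to repair its failing output."""
    return [
        {"role": "system",
         "content": "Your previous response did not pass validation."},
        {"role": "user",
         "content": (f"Previous response:\n{previous}\n\n"
                     f"Validation feedback: {feedback}\n"
                     f"Repair attempt {attempt} of {max_retries}.")},
    ]
```

Including the attempt counter in the prompt lets the model know how much budget remains, and the same feedback string is what gets recorded in telemetry.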
Cost Governance
Every validation run operates under strict budget constraints:
| Parameter | Description | Set By |
|---|---|---|
| `max_retries` | Max heal attempts before rejection | Per-run configuration |
| `cost_cents` | Running cost total for this run | Accumulates across retries |
| `max_cost_cents` | Cost ceiling; run is rejected if exceeded | Per-run configuration |
| `max_latency_ms` | Latency ceiling; run is rejected if exceeded | Per-run configuration |
Cost Ceiling Example
```python
context = ValidationContext(
    run_kind="para_promotion",
    max_retries=2,
    cost_cents=0,
    max_cost_cents=12,     # 12 cents max for this promotion
    max_latency_ms=12000,  # 12 seconds max
)
```
If the cumulative cost of the initial generation + healings exceeds 12 cents, the run is rejected regardless of output quality. This prevents runaway retry loops from burning budget.
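The ceiling checks themselves reduce to a few comparisons. A minimal sketch, with plain parameters mirroring the `ValidationContext` fields above (the function is illustrative, not Clawpy's API):

```python
# Minimal sketch of budget enforcement: either ceiling crossed means rejection.
def within_budget(cost_cents, max_cost_cents, elapsed_ms, max_latency_ms):
    """Return (ok, reason); the loop stops retrying as soon as ok is False."""
    if cost_cents > max_cost_cents:
        return False, "Cost ceiling exceeded"
    if elapsed_ms > max_latency_ms:
        return False, "Latency ceiling exceeded"
    return True, "OK"
```

Running this check before every heal attempt is what guarantees a runaway retry loop can never spend past the ceiling.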
Run Kinds
The Validation Loop is used across the system for different purposes:
| Run Kind | Purpose | Typical Budget |
|---|---|---|
| `tdd` | Test-driven development repair cycles | 20 cents, 3 retries |
| `memory_synthesis` | Nightly memory distillation | 20 cents, 2 retries |
| `para_promotion` | PARA knowledge promotion evaluation | 12 cents, 2 retries |
| `introspection_evaluation` | Agent self-evaluation | 8 cents, 3 retries |
| `research_summary` | Research output formatting | 15 cents, 2 retries |
| `blueprint_draft` | Swarm org chart generation | 25 cents, 3 retries |
| `auto_reply_evaluation` | Auto-reply quality check | 5 cents, 2 retries |
| `workflow_verification` | Flow execution validation | 10 cents, 2 retries |
Telemetry
Every validation run is recorded in the event ledger (Layer B):
```json
{
  "run_kind": "para_promotion",
  "status": "passed",
  "attempts": 2,
  "total_cost_cents": 8,
  "latency_ms": 4200,
  "validator_name": "json_object",
  "agent_id": "memory_extractor",
  "feedback": "Passed after 1 heal"
}
```
This telemetry feeds the Adaptation Engine, which can learn from recurring validation failures and adjust system behaviour. For example, if `research_summary` runs consistently fail on the first attempt, the system may propose a prompt-fragment improvement to reduce first-attempt failure rates.
Why This Matters
Without the Validation Loop, an LLM hallucination in memory synthesis could corrupt the agent's knowledge base. A malformed PARA promotion could inject broken facts. A poorly formatted introspection evaluation could crash the skill creation pipeline.
The Validation Loop provides mechanical guarantees that LLM outputs meet structural requirements, while the cost governance ensures that quality enforcement doesn't become more expensive than the work itself.