Flow Sequence Detector

The Flow Sequence Detector (flow_sequence_detector.py, 15,628 bytes) watches for repeated multi-step tool-call sequences and automatically proposes deterministic flow definitions that replace expensive LLM reasoning with cheap, predictable execution.

This is automated workflow extraction — the system observes its own behaviour and creates shortcuts.


Why Flow Offloading Matters

Consider an agent that repeatedly performs this 4-step sequence:

1. file_read(config.yaml)
2. validate_yaml(content)
3. file_write(config.yaml, fixed_content)
4. bash_execute(restart_service)

Each time, the LLM reasons through the same steps, consuming tokens. After detecting 3+ repetitions, the Flow Sequence Detector proposes converting this into a deterministic flow that runs without LLM involvement:

Flow: "Auto: file_read → validate_yaml → file_write → bash_execute"
  Step 1: file_read     → output → Step 2
  Step 2: validate_yaml → output → Step 3
  Step 3: file_write    → output → Step 4
  Step 4: bash_execute  → done

Estimated savings: 95% of the LLM cost per execution.


Detection Algorithm

Step 1: Fetch Learning Records

Scan recent tool_sequence learning records from the event ledger:

records = db.list_learning_records(limit=200)
filtered = [r for r in records if r["outcome_kind"] == "tool_sequence"]

Step 2: Extract Tool Sequences

Normalise each record's evidence into a tuple of tool names:

evidence = record.get("evidence", {})
seq = tuple(evidence.get("tool_sequence", []))
# → ("file_read", "validate_yaml", "file_write", "bash_execute")

Step 3: Find Repeated Patterns

Two matching strategies:

Exact match — identical sequences across 3+ distinct task executions:

exact_counts[("file_read", "validate_yaml", "file_write")] = [record_1, record_2, record_3]
# → 3 occurrences → candidate!

Subsequence match — contiguous 3+ tool subsequences that appear within longer sequences:

# Full sequence: ("search", "file_read", "validate", "file_write", "test")
# Subsequence match: ("file_read", "validate", "file_write") — found in 4 records

Subsequences are only promoted if they're not already covered by a longer exact match (prevents duplicate candidates).
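The matching step above can be sketched as follows. This is a minimal illustration under stated assumptions — the function and helper names are illustrative, not taken from the source file:

```python
from collections import defaultdict

def find_repeated_patterns(sequences, min_len=3, min_occurrences=3):
    """Count exact sequences and their contiguous subsequences.

    `sequences` is a list of tool-name tuples, one per task execution.
    Returns (exact, subs), each mapping a tuple to the record indices
    in which it was observed.
    """
    exact_counts = defaultdict(list)
    sub_counts = defaultdict(set)

    for idx, seq in enumerate(sequences):
        if len(seq) >= min_len:
            exact_counts[seq].append(idx)
        # Slide windows of every length >= min_len (but shorter than
        # the full sequence) to collect contiguous subsequences.
        for length in range(min_len, len(seq)):
            for start in range(len(seq) - length + 1):
                sub_counts[seq[start:start + length]].add(idx)

    exact = {s: ids for s, ids in exact_counts.items()
             if len(ids) >= min_occurrences}
    # Only promote subsequences not already covered by an exact match,
    # preventing duplicate candidates.
    subs = {s: ids for s, ids in sub_counts.items()
            if len(ids) >= min_occurrences
            and not any(_contains(e, s) for e in exact)}
    return exact, subs

def _contains(longer, shorter):
    """True if `shorter` appears contiguously inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))
```

A sequence observed identically three times becomes an exact candidate; a 3-tool window recurring inside differing longer sequences becomes a subsequence candidate only when no exact match already covers it.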

Step 4: Build Candidates

Each detected pattern becomes a flow candidate:

{
  "tool_sequence": ["file_read", "validate_yaml", "file_write", "bash_execute"],
  "occurrence_count": 5,
  "match_type": "exact",
  "estimated_token_savings": 42.75,
  "avg_cost_per_execution": 9.0,
  "dedupe_key": "flow_offload:file_read→validate_yaml→file_write→bash_execute",
  "proposed_flow": {
    "name": "Auto: file_read → validate_yaml → file_write → bash_execute",
    "description": "Auto-generated from 5 observed repetitions...",
    "steps": [
      {
        "id": "step_1",
        "name": "File Read",
        "skill_key": "file_read",
        "depends_on": [],
        "retry_max": 2,
        "timeout_seconds": 120
      },
      {
        "id": "step_2",
        "name": "Validate Yaml",
        "skill_key": "validate_yaml",
        "depends_on": ["step_1"],
        "retry_max": 2,
        "timeout_seconds": 120
      }
    ],
    "tags": ["auto-generated", "flow-offload"]
  }
}

Token Savings Calculation

avg_cost = mean([record.cost_cents for record in occurrences])
estimated_savings = avg_cost * occurrence_count * 0.95

The 95% figure reflects that deterministic flow execution costs ~5% of LLM-reasoned execution (mainly the trigger detection cost).
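As a runnable sketch of the calculation (the `cost_cents` field name and dict-shaped records are assumptions for illustration):

```python
from statistics import mean

OFFLOAD_EFFICIENCY = 0.95  # deterministic execution skips ~95% of LLM cost

def estimate_savings(occurrences):
    """Estimate total cost saved across the observed repetitions.

    Each record is assumed to carry a `cost_cents` value.
    """
    avg_cost = mean(r["cost_cents"] for r in occurrences)
    return avg_cost * len(occurrences) * OFFLOAD_EFFICIENCY
```

With five occurrences averaging 9.0 cost units each, this yields the 42.75 figure shown in the candidate above.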


Flow Step Generation

Each tool in the detected sequence becomes a FlowStep:

{
    "id": "step_1",
    "name": "File Read",           # Tool name → title case
    "skill_key": "file_read",      # Original tool name
    "depends_on": [],              # First step has no deps
    "output_key": "step_1",        # Output reference for next step
    "retry_max": 2,
    "retry_backoff": 1.0,
    "timeout_seconds": 120,
    "on_failure": "stop",
    "input_map": {
        "input": "{{_trigger.input}}"  # First step gets trigger input
    }
}

Steps are linearly chained (step_1 → step_2 → step_3) since that's the observed execution order.
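The generation rules can be sketched like this. Field defaults mirror the documented values; how non-first steps receive their input is not specified in the source, so the `input_map` for later steps here is an assumption:

```python
def build_flow_steps(tool_sequence):
    """Turn an observed tool sequence into linearly chained flow steps."""
    steps = []
    for i, tool in enumerate(tool_sequence):
        step_id = f"step_{i + 1}"
        steps.append({
            "id": step_id,
            "name": tool.replace("_", " ").title(),  # file_read -> "File Read"
            "skill_key": tool,
            "depends_on": [] if i == 0 else [f"step_{i}"],
            "output_key": step_id,
            "retry_max": 2,
            "retry_backoff": 1.0,
            "timeout_seconds": 120,
            "on_failure": "stop",
            # First step reads the trigger input; wiring later steps to the
            # previous step's output is assumed, not documented.
            "input_map": {
                "input": "{{_trigger.input}}" if i == 0
                else f"{{{{step_{i}.output}}}}",
            },
        })
    return steps
```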


Integration With Adaptation Engine

Flow candidates are submitted as flow_offload adaptation candidates:

payload = {
    "title": "Auto-Flow: file_read → validate_yaml → ...",
    "source_kind": "tool_sequence",
    "occurrence_count": 5,
    "estimated_token_savings": 42.75,
    "promotion_effect": {
        "effect_type": "flow_offload_registration",
        "proposed_flow": { ... },
        "agent_prompt_hint": "A deterministic flow 'Auto: ...' is available. Use flow_run() instead of reasoning through these steps manually.",
        "reason": "Detected 5 repetitions of a 4-step sequence. Offloading saves ~42 cost units per run."
    }
}

Upon promotion, the agent's system prompt is updated with a hint to use the flow:

"A deterministic flow 'Auto: file_read → validate_yaml → file_write → bash_execute' is available for this pattern. Use flow_run('Auto: file_read → ...') instead of reasoning through these steps manually."

This closes the loop: the system detects its own repetitive behaviour, creates an optimised shortcut, and teaches itself to use it.


Configuration

Parameter            Default  Purpose
min_sequence_length  3        Minimum steps to consider as a pattern
min_occurrences      3        Minimum repetitions before proposing a flow
lookback_limit       200      How many learning records to scan
max_candidates       5        Maximum candidates per detection run
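These defaults can be captured in a small config object. A hedged sketch — the class name is illustrative, not taken from the source:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DetectorConfig:
    """Tunables for the Flow Sequence Detector."""
    min_sequence_length: int = 3   # minimum steps to count as a pattern
    min_occurrences: int = 3       # repetitions required before proposing
    lookback_limit: int = 200      # learning records scanned per run
    max_candidates: int = 5        # cap on candidates per detection run
```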