Budget Service

The Budget Service (budget_service.py, 22,831 bytes) enforces token spending limits at every level of the hierarchy: per-agent, per-SwarmSpace (division), per-workspace, and globally. When a budget is exhausted, the offending agent is automatically paused until the budget resets or an operator intervenes.


Budget Hierarchy

Global Budget
  └── Workspace Budget
        └── SwarmSpace Budget
              └── Agent Budget

Each level is independently enforceable. An agent can be within its own budget and still be blocked because its workspace has reached the workspace-level limit.
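
The independent enforcement described above can be illustrated with a minimal sketch. The level names, the dictionaries, and the function first_exhausted_level are hypothetical and only illustrate the idea; they are not taken from budget_service.py.

```python
from typing import Dict, Optional

# Hypothetical level names, ordered from broadest to narrowest scope.
LEVELS = ("global", "workspace", "swarmspace", "agent")


def first_exhausted_level(
    spent_cents: Dict[str, int],
    limit_cents: Dict[str, int],
) -> Optional[str]:
    """Return the broadest level whose budget is exhausted, or None.

    A limit of 0 (or a missing limit) is treated as "no limit set".
    """
    for level in LEVELS:
        limit = limit_cents.get(level, 0)
        if limit > 0 and spent_cents.get(level, 0) >= limit:
            return level
    return None


# The agent has spent 300 of its 1000 cents, but its workspace is over budget.
blocked_by = first_exhausted_level(
    spent_cents={"global": 40_000, "workspace": 5_200, "swarmspace": 900, "agent": 300},
    limit_cents={"global": 100_000, "workspace": 5_000, "swarmspace": 2_000, "agent": 1_000},
)
print(blocked_by)  # workspace
```

Checking from the broadest level down means the reported reason is the widest-reaching limit, which is usually the one an operator needs to act on first.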


Cost Event Tracking

Every LLM call generates a cost event recorded in cost_events:

INSERT INTO cost_events
  (id, agent_id, workspace, swarmspace, model,
   input_tokens, output_tokens, cost_cents, run_id, metadata_json)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

Each event captures:

Field          Purpose
agent_id       Which agent made the call
workspace      Workspace isolation boundary
swarmspace     Division-level accounting
model          Which LLM model was used
input_tokens   Prompt token count
output_tokens  Completion token count
cost_cents     Calculated cost in cents
run_id         Links to a specific task execution
metadata_json  Additional context (tool name, validation run ID, etc.)
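
As a rough illustration of how such an event might be written, here is a sketch using Python's standard sqlite3 module, whose placeholder style matches the `?` markers in the INSERT above. Whether the real storage layer is SQLite is an assumption, as are the column types, the UUID-based id, and the record_cost_event helper.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical schema: column names come from the INSERT above, types are guesses.
conn.execute(
    """
    CREATE TABLE cost_events (
        id TEXT PRIMARY KEY,
        agent_id TEXT, workspace TEXT, swarmspace TEXT, model TEXT,
        input_tokens INTEGER, output_tokens INTEGER, cost_cents INTEGER,
        run_id TEXT, metadata_json TEXT
    )
    """
)


def record_cost_event(conn, agent_id, workspace, swarmspace, model,
                      input_tokens, output_tokens, cost_cents, run_id, metadata):
    """Insert one cost event and return its generated id."""
    event_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO cost_events "
        "(id, agent_id, workspace, swarmspace, model, "
        " input_tokens, output_tokens, cost_cents, run_id, metadata_json) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (event_id, agent_id, workspace, swarmspace, model,
         input_tokens, output_tokens, cost_cents, run_id, json.dumps(metadata)),
    )
    conn.commit()
    return event_id


event_id = record_cost_event(
    conn, "agent-7", "acme", "research", "example-model",
    1200, 350, 4, "run-42", {"tool": "web_search"},
)
total = conn.execute(
    "SELECT SUM(cost_cents) FROM cost_events WHERE agent_id = ?", ("agent-7",)
).fetchone()[0]
print(total)  # 4
```

Storing cost_cents as an integer keeps budget sums exact, which matters when totals are compared against a hard limit.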

Budget Policy Enforcement

Pre-Flight Check

Before every LLM call, the system runs can_agent_run():

1. Query total cost_cents for this agent in the current month window
2. Look up the agent's budget limit
3. Classify status: ok | warning | hard_stop
4. If hard_stop → Reject the request, emit budget_incident event
5. If warning → Log warning, allow execution, alert operator
6. If ok → Proceed normally
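
The six steps above can be sketched as follows. Only the name can_agent_run, the three status values, and the event names come from this page; the signature, the injected lookup callables, and the emit callback are assumptions made so the example runs on its own.

```python
from typing import Callable, Dict, List, Tuple


def _budget_status(observed: int, limit: int, warn_pct: int) -> str:
    if limit <= 0:
        return "ok"
    if observed >= limit:
        return "hard_stop"
    if observed >= int((limit * warn_pct) / 100):
        return "warning"
    return "ok"


def can_agent_run(
    agent_id: str,
    month_spend_cents: Callable[[str], int],   # hypothetical lookup for step 1
    budget_limit_cents: Callable[[str], int],  # hypothetical lookup for step 2
    emit: Callable[[str, Dict], None],         # hypothetical event emitter
    warn_pct: int = 80,
) -> Tuple[bool, str]:
    """Return (allowed, status) for one agent's pre-flight check."""
    observed = month_spend_cents(agent_id)
    limit = budget_limit_cents(agent_id)
    status = _budget_status(observed, limit, warn_pct)
    payload = {"agent_id": agent_id, "observed": observed, "limit": limit}
    if status == "hard_stop":
        emit("budget_incident", payload)  # step 4: reject the request
        return False, status
    if status == "warning":
        emit("budget_warning", payload)   # step 5: warn but allow
    return True, status


events: List[Tuple[str, Dict]] = []
spend = {"a1": 100, "a2": 850, "a3": 1000}
for agent in ("a1", "a2", "a3"):
    allowed, status = can_agent_run(
        agent,
        month_spend_cents=spend.__getitem__,
        budget_limit_cents=lambda _agent: 1000,
        emit=lambda name, data: events.append((name, data)),
    )
    print(agent, allowed, status)
# a1 True ok
# a2 True warning
# a3 False hard_stop
```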

Status Classification

def _budget_status(observed, limit, warn_pct):
    if limit <= 0:
        return "ok"          # No limit set → unlimited
    if observed >= limit:
        return "hard_stop"   # Budget exhausted
    warn_threshold = int((limit * warn_pct) / 100)
    if observed >= warn_threshold:
        return "warning"     # Approaching limit
    return "ok"
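
A few worked values make the boundaries concrete. The function is repeated verbatim so the snippet runs on its own; the inputs are illustrative.

```python
def _budget_status(observed, limit, warn_pct):
    if limit <= 0:
        return "ok"          # No limit set → unlimited
    if observed >= limit:
        return "hard_stop"   # Budget exhausted
    warn_threshold = int((limit * warn_pct) / 100)
    if observed >= warn_threshold:
        return "warning"     # Approaching limit
    return "ok"


cases = [
    (0, 0, 80),        # no limit configured
    (799, 1000, 80),   # just below the warning threshold of 800
    (800, 1000, 80),   # exactly at the warning threshold
    (999, 1000, 80),   # one cent below the limit
    (1000, 1000, 80),  # exactly at the limit
    (0, 1, 80),        # tiny limit: threshold truncates to int(0.8) == 0
]
results = [_budget_status(*case) for case in cases]
print(results)
# ['ok', 'ok', 'warning', 'warning', 'hard_stop', 'warning']
```

The last case shows a side effect of the integer truncation: with a very small limit the warning threshold rounds down to 0, so the status is "warning" even before anything has been spent.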

Soft Warning (Default: 80%)

When an agent reaches 80% of its budget, a budget_warning event is emitted via the Plugin Event Bus. The operator sees a yellow alert in the dashboard. The agent continues operating.

Hard Stop (100%)

When an agent reaches 100% of its budget, a budget_incident event is emitted and the agent is automatically paused. The incident is recorded as a learning record for the Adaptation Engine.


Month Window Accounting

Budgets reset on the first day of each UTC calendar month:

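from datetime import datetime, timezone

def _month_window():
    """Return (start, end) of the current UTC calendar month."""
    now = datetime.now(timezone.utc)
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if now.month == 12:
        # December rolls over to January of the following year
        end = start.replace(year=now.year + 1, month=1)
    else:
        end = start.replace(month=now.month + 1)
    return start, end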

All cost queries are scoped to the current month window, meaning a new month starts with a clean budget.
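
To see how the window behaves at a year boundary, the sketch below uses a variant that accepts the current time as a parameter so the result is reproducible. The month_window and in_window names are hypothetical, and treating the end of the window as exclusive is an assumption; the page does not say how the real queries compare timestamps.

```python
from datetime import datetime, timezone


def month_window(now):
    """Same logic as _month_window(), but with an injectable clock."""
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if now.month == 12:
        end = start.replace(year=now.year + 1, month=1)
    else:
        end = start.replace(month=now.month + 1)
    return start, end


def in_window(ts, window):
    """Assumed half-open interval: start <= ts < end."""
    start, end = window
    return start <= ts < end


dec = month_window(datetime(2024, 12, 31, 23, 59, 59, tzinfo=timezone.utc))
print(dec[0].isoformat(), dec[1].isoformat())
# 2024-12-01T00:00:00+00:00 2025-01-01T00:00:00+00:00

# An event stamped exactly at midnight on 1 January belongs to the next month.
print(in_window(dec[1], dec))  # False
```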


Workspace Isolation

Cost events are enriched with workspace metadata through workspace_isolation.py:

from core.workspace_isolation import enrich_cost_event_workspace
enriched = enrich_cost_event_workspace(agent_id, {
    "workspace": workspace,
    "swarmspace": swarmspace,
})

This ensures multi-tenant deployments correctly attribute costs to the right tenant, even when agents share infrastructure.


Plugin Integration

The Budget Service fires events through the Plugin Event Bus:

Event                Trigger                  Data
cost_event_recorded  Every LLM call           Full cost event details
budget_warning       80% threshold crossed    Agent ID, observed cost, limit
budget_incident      100% threshold reached   Agent ID, observed cost, limit
agent_paused         Auto-pause on hard stop  Agent ID, reason

Plugins can subscribe to these events for custom alerting, Slack notifications, or external billing integration.
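
A plugin that reacts to budget incidents might look roughly like the sketch below. The EventBus class, its subscribe and emit methods, and the payload keys are hypothetical stand-ins; the real Plugin Event Bus API is not documented on this page.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict], None]


class EventBus:
    """Hypothetical stand-in for the Plugin Event Bus."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, data: Dict) -> None:
        # Events with no subscribers are silently dropped.
        for handler in list(self._handlers.get(event, [])):
            handler(data)


alerts: List[str] = []


def on_incident(data: Dict) -> None:
    """Example handler: format a message that could be sent to a chat channel."""
    alerts.append(
        f"{data['agent_id']} hit its budget: {data['observed']}/{data['limit']} cents"
    )


bus = EventBus()
bus.subscribe("budget_incident", on_incident)
bus.emit("budget_warning", {"agent_id": "agent-7", "observed": 800, "limit": 1000})
bus.emit("budget_incident", {"agent_id": "agent-7", "observed": 1000, "limit": 1000})
print(alerts)
# ['agent-7 hit its budget: 1000/1000 cents']
```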


CostDB Infrastructure

The underlying cost_db.py (94,277 bytes) provides the storage layer:

  • 14+ tables tracking costs, events, learning records, validation runs, and adaptation candidates
  • Automatic schema migration — new tables are created on first use
  • Singleton pattern — one CostDB instance per process for connection reuse
  • Thread-safe — concurrent agents can record costs without locks
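
The singleton and create-on-first-use ideas can be sketched as below. This is not the real cost_db.py API: the class body, the instance and ensure_table methods, and the use of SQLite are all assumptions. The sketch also uses an internal lock so that callers need none, which is only one possible way to achieve thread safety; the page does not say how CostDB does it.

```python
import sqlite3
import threading


class CostDB:
    """Hypothetical sketch of a per-process singleton storage layer."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self, path=":memory:"):
        # check_same_thread=False lets one connection be shared across threads;
        # every access below is serialized through self._lock.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()

    @classmethod
    def instance(cls):
        """Return the single shared CostDB, creating it on first call."""
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def ensure_table(self, name, columns_sql):
        """Create the table if it does not exist yet (trusted identifiers only)."""
        with self._lock:
            self._conn.execute(f"CREATE TABLE IF NOT EXISTS {name} ({columns_sql})")
            self._conn.commit()

    def table_names(self):
        with self._lock:
            rows = self._conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            ).fetchall()
        return [row[0] for row in rows]


db = CostDB.instance()
db.ensure_table("cost_events", "id TEXT PRIMARY KEY, cost_cents INTEGER")
db.ensure_table("cost_events", "id TEXT PRIMARY KEY, cost_cents INTEGER")  # no-op
print(db is CostDB.instance(), db.table_names())
# True ['cost_events']
```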