Agent Architecture — Separation of Duties

Clawpy's agent architecture is built on a principle that none of the frameworks compared below implement: two permanent, purpose-built points of contact for the operator, each with completely separate memory, tools, permissions, and responsibilities. This is more than a naming convention: it is enforced in code through org_policy.py, with reserved IDs that cannot be overridden, reassigned, or merged.


The Two Points of Contact

Alfred — The Butler (Special Ops)

Source: core/alfred_tools.py (887 lines), core/alfred_memory.py (1,084 lines)

Alfred is the operator's day-to-day assistant. You talk to Alfred when you want to ask questions, search the web, browse pages, read files, run scripts, create reports, or recall past conversations.

What Alfred can do:

  • Web search (provider chain: Brave → Google → DuckDuckGo)
  • Web browsing (provider chain: Firecrawl → trafilatura → raw httpx)
  • File read/write (workspace-jailed, path traversal blocked)
  • Script execution (Python/Bash with Guardian Scanner pre-check)
  • Dashboard queries (live system data — agent status, billing, heartbeat)
  • Markdown report creation
  • Memory recall and save (long-term, cross-session)
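
The provider chains listed above (for example Brave → Google → DuckDuckGo) can be pictured as an ordered fallback loop. The following is a minimal sketch, not the code in core/alfred_tools.py: the function name run_provider_chain, the stub providers, and the rule that an empty result also triggers fallback are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical provider signature: takes a query, returns a list of results or raises.
Provider = Callable[[str], list]


def run_provider_chain(
    query: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[Optional[str], list]:
    """Try each provider in order and return the first non-empty result."""
    for name, provider in providers:
        try:
            results = provider(query)
        except Exception:
            # A failing provider is skipped and the next one in the chain is tried.
            continue
        if results:
            return name, results
    # Every provider failed or returned nothing.
    return None, []


# Stub providers standing in for Brave, Google and DuckDuckGo (illustration only).
def brave_stub(query: str) -> list:
    raise RuntimeError("quota exceeded")


def google_stub(query: str) -> list:
    return []


def duckduckgo_stub(query: str) -> list:
    return [f"result for {query}"]


SEARCH_CHAIN = [
    ("brave", brave_stub),
    ("google", google_stub),
    ("duckduckgo", duckduckgo_stub),
]
```

Calling run_provider_chain("react tutorials", SEARCH_CHAIN) with these stubs falls through the first two providers and returns the DuckDuckGo stub's result along with the provider name, which makes it easy to log which link in the chain actually answered.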

What Alfred cannot do:

  • Spawn agents
  • Modify the organisational structure
  • Change autonomy settings
  • Assign budgets
  • Delegate strategic tasks

Alfred's memory is completely isolated in dedicated ChromaDB collections (alfred_conversations, alfred_facts, alfred_decisions). Entries persist for 3 years, and each recall adds a 20% life extension bonus, so frequently used facts live even longer.
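
To make the retention arithmetic concrete, here is a small sketch. It assumes the 20% bonus is additive and computed on the 3-year base lifetime; the document does not say whether the real implementation in core/alfred_memory.py compounds the bonus or caps it, and the names below are hypothetical.

```python
from datetime import datetime, timedelta

BASE_RETENTION = timedelta(days=3 * 365)  # "3 years", approximated as 1,095 days
RECALL_BONUS_PERCENT = 20  # life extension per recall, from the text


def expiry_after_recalls(created_at: datetime, recalls: int) -> datetime:
    """Assumed additive policy: each recall adds 20% of the base lifetime."""
    extension = BASE_RETENTION * (RECALL_BONUS_PERCENT * recalls) / 100
    return created_at + BASE_RETENTION + extension
```

Under this assumption one recall adds 219 days (20% of 1,095), and five recalls double the base lifetime.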

Lucius Fox — The CEO (Alpha Command)

Source: core/ceo_memory.py (813 lines), core/org_policy.py

Lucius is the strategic decision-maker. He manages the swarm — spawning agents, designing organisational blueprints, delegating work, setting budgets, and reviewing outcomes.

What Lucius can do:

  • Spawn and terminate agents
  • Design and modify org blueprints
  • Assign budget caps and autonomy presets
  • Delegate strategic initiatives to divisions
  • Review agent performance and outcomes
  • Execute real-world side-effects (external action authority)

What Lucius cannot do:

  • Perform day-to-day operational tasks (that's Alfred's job)
  • Act as a general-purpose assistant

Lucius's memory is also completely isolated (ceo_conversations, ceo_facts, ceo_decisions). It runs decision extraction more frequently (every 3 exchanges versus Alfred's 5) because strategic conversations produce decisions more often.
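
The different extraction cadences can be sketched as a simple counter check. The intervals come from the text; the dictionary, the function name, and the use of "ceo" as Lucius's agent ID in this lookup are assumptions for illustration.

```python
# Intervals taken from the text; the names are hypothetical.
DECISION_EXTRACTION_INTERVAL = {"alfred": 5, "ceo": 3}


def should_extract_decisions(agent_id: str, exchange_count: int) -> bool:
    """Return True when the running exchange count hits the agent's interval."""
    interval = DECISION_EXTRACTION_INTERVAL[agent_id]
    return exchange_count > 0 and exchange_count % interval == 0
```

Over 15 exchanges this fires five times for Lucius and three times for Alfred.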


Why Separate? The Architectural Reasoning

1. Context Window Purity

Every LLM has a finite context window. If a single agent handles both "search the web for React tutorials" and "redesign the org chart and deploy 12 new agents," the context gets polluted. Operational noise crowds out strategic reasoning.

By separating the two:

  • Alfred's context stays focused on operator tasks, research, and conversations
  • Lucius's context stays focused on organisational decisions, delegation, and agent management

2. Memory Isolation

Alfred remembers your preferences, your past conversations, your research. Lucius remembers strategic decisions, delegation outcomes, agent performance. These are fundamentally different types of knowledge — mixing them degrades recall quality.
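
One way to picture the isolation is an ownership check in front of every collection access. This sketch uses the collection names given earlier in this document but deliberately avoids the ChromaDB client; check_collection_access and MEMORY_COLLECTIONS are hypothetical names, not part of Clawpy's API.

```python
# Collection names are from this document; the mapping itself is an illustration.
MEMORY_COLLECTIONS = {
    "alfred": frozenset({"alfred_conversations", "alfred_facts", "alfred_decisions"}),
    "ceo": frozenset({"ceo_conversations", "ceo_facts", "ceo_decisions"}),
}


def check_collection_access(agent_id: str, collection: str) -> str:
    """Return the collection name if the agent owns it, otherwise refuse."""
    allowed = MEMORY_COLLECTIONS.get(agent_id, frozenset())
    if collection not in allowed:
        raise PermissionError(f"{agent_id!r} may not access {collection!r}")
    return collection
```

With a guard like this, a request from "ceo" for alfred_facts raises PermissionError instead of silently mixing the two knowledge bases.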

3. Security Scoping

From org_policy.py:

# Alpha command identities can execute real-world side-effects.
PRIMARY_EXTERNAL_ACTION_AGENT_IDS = frozenset({"ceo", "ceo-assistant"})
OPTIONAL_EXTERNAL_ACTION_AGENT_IDS = frozenset({"alfred"})

# Reserved IDs that cannot be user-created as normal workspace agents.
RESERVED_WORKSPACE_AGENT_IDS = frozenset(
    set(IMMUTABLE_WORKSPACE_AGENT_IDS) | set(SPECIAL_OPS_AGENT_IDS)
)

Only Alpha agents (Lucius, Ms Pepper) have primary external action authority. Alfred has optional external action authority that must be explicitly enabled. No worker agent can ever elevate to these reserved IDs.
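
The rule in the paragraph above can be expressed as a small check. The two frozensets are copied from the excerpt; the function can_execute_external_action and its optional_enabled flag are assumptions about how the opt-in might be passed, not the actual interface of org_policy.py.

```python
PRIMARY_EXTERNAL_ACTION_AGENT_IDS = frozenset({"ceo", "ceo-assistant"})
OPTIONAL_EXTERNAL_ACTION_AGENT_IDS = frozenset({"alfred"})


def can_execute_external_action(agent_id: str, optional_enabled: bool = False) -> bool:
    """Primary IDs are always allowed; optional IDs only when explicitly enabled."""
    if agent_id in PRIMARY_EXTERNAL_ACTION_AGENT_IDS:
        return True
    if agent_id in OPTIONAL_EXTERNAL_ACTION_AGENT_IDS:
        return optional_enabled
    # Any other agent, including every worker, is denied.
    return False
```

Note that the flag has no effect on worker agents: enabling optional authority widens access for Alfred only.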

4. Immutability

Both Alfred and Lucius are immutable workspace roles:

  • Their IDs are reserved — no user-created agent can use them
  • They cannot be assigned as division heads or members
  • They are never paused/activated by swarmspace switching
  • They exist outside the organisational hierarchy — they're infrastructure, not employees
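
The first two rules above amount to rejecting reserved IDs at creation time and at division assignment. In the sketch below the contents of IMMUTABLE_WORKSPACE_AGENT_IDS and SPECIAL_OPS_AGENT_IDS are placeholders, since this document does not show them, and both validator functions are hypothetical.

```python
# Placeholder contents: the real sets in org_policy.py are not shown in this document.
IMMUTABLE_WORKSPACE_AGENT_IDS = frozenset({"ceo", "ceo-assistant"})
SPECIAL_OPS_AGENT_IDS = frozenset({"alfred", "guardian"})
RESERVED_WORKSPACE_AGENT_IDS = IMMUTABLE_WORKSPACE_AGENT_IDS | SPECIAL_OPS_AGENT_IDS


def validate_new_agent_id(agent_id: str) -> str:
    """Normalise a requested ID and refuse anything that collides with a reserved one."""
    normalised = agent_id.strip().lower()
    if not normalised:
        raise ValueError("agent ID must not be empty")
    if normalised in RESERVED_WORKSPACE_AGENT_IDS:
        raise ValueError(f"{normalised!r} is a reserved agent ID")
    return normalised


def validate_division_member(agent_id: str) -> str:
    """Reserved roles sit outside the hierarchy, so they cannot join a division."""
    if agent_id.strip().lower() in RESERVED_WORKSPACE_AGENT_IDS:
        raise ValueError(f"{agent_id!r} cannot be a division head or member")
    return agent_id
```

Normalising before the comparison matters: without it, an ID such as " Alfred " could slip past a case-sensitive check.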

The Supporting Cast

Ms Pepper — CEO Assistant

Source: core/org_policy.py, line 17

Ms Pepper is Lucius's assistant, with Alpha-level authority for external actions. The role handles delegation overflow and administrative tasks that don't require CEO-level strategic reasoning.

Guardian

Source: core/guardian_scanner.py

Guardian is part of the Special Ops layer alongside Alfred. It performs two-tier security scanning (regex + LLM) on all inputs.
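
A two-tier scan of this kind is often structured as a cheap pattern pass followed by a model-based review. The sketch below is an illustration only: the patterns are invented examples, scan_input is a hypothetical name, the LLM tier is a plain callable you would supply, and running tier 2 on everything that passes tier 1 is an assumption rather than documented behaviour of core/guardian_scanner.py.

```python
import re
from typing import Callable, Optional

# Hypothetical tier-1 patterns; the real rule set is not shown in this document.
BLOCK_PATTERNS = [
    re.compile(r"rm\s+-rf\s+/", re.IGNORECASE),
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
]


def scan_input(text: str, llm_judge: Optional[Callable[[str], bool]] = None) -> str:
    """Return "blocked" or "allowed" for a piece of input."""
    # Tier 1: cheap regex screening catches known-bad patterns immediately.
    if any(pattern.search(text) for pattern in BLOCK_PATTERNS):
        return "blocked"
    # Tier 2: optional LLM review; the callable returns True when the text is unsafe.
    if llm_judge is not None and llm_judge(text):
        return "blocked"
    return "allowed"
```

Putting the regex tier first means obviously malicious input never costs an LLM call.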


The Archetype Classifier

Source: core/archetype_classifier.py (532 lines)

Every agent in the system — whether spawned by a blueprint, created by Lucius, or provisioned by a marketplace kit — is automatically classified into one of 16+ archetypes using a three-stage classifier:

Stage 1: Deterministic Rules

Built-in agent ID → archetype mappings for known roles:

Agent ID | Archetype | Tier
--- | --- | ---
ceo, ceo-assistant | ceo_senior | Alpha
alfred, butler | assistant_butler | Special Ops
cto | builder_senior | Executive
cfo, coo, cmo | director_exec | Executive
lead-dev | builderb_senior | Senior
frontend-dev | builderf_worker | Worker
backend-dev | builderb_worker | Worker
researcher | allother_worker | Worker
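
The Stage 1 mappings above behave like a plain dictionary lookup. This is a sketch of that idea using the rows from the table; BUILTIN_ARCHETYPES and classify_builtin are hypothetical names, and returning None to signal "fall through to Stage 2" is an assumption.

```python
from typing import Optional, Tuple

# Rows copied from the table above: agent ID -> (archetype, tier).
BUILTIN_ARCHETYPES = {
    "ceo": ("ceo_senior", "Alpha"),
    "ceo-assistant": ("ceo_senior", "Alpha"),
    "alfred": ("assistant_butler", "Special Ops"),
    "butler": ("assistant_butler", "Special Ops"),
    "cto": ("builder_senior", "Executive"),
    "cfo": ("director_exec", "Executive"),
    "coo": ("director_exec", "Executive"),
    "cmo": ("director_exec", "Executive"),
    "lead-dev": ("builderb_senior", "Senior"),
    "frontend-dev": ("builderf_worker", "Worker"),
    "backend-dev": ("builderb_worker", "Worker"),
    "researcher": ("allother_worker", "Worker"),
}


def classify_builtin(agent_id: str) -> Optional[Tuple[str, str]]:
    """Return (archetype, tier) for a known ID, or None so Stage 2 can take over."""
    return BUILTIN_ARCHETYPES.get(agent_id.strip().lower())
```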

Stage 2: Regex Scoring

For agents that don't match built-in IDs, the classifier scores against 30+ patterns across:

  • Executive titles (CEO, CFO, CTO, etc.)
  • Domain signals (frontend, backend, marketing, research, operations)
  • Seniority signals (junior, lead, manager, director, principal)
  • Reporting structure (who reports to whom)
  • Swarm/workspace name hints
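
Regex scoring of this kind can be sketched as a list of weighted patterns whose matches accumulate per archetype. The four rules and their weights below are invented for illustration; the real classifier uses 30+ patterns that are not reproduced in this document, and score_archetypes is a hypothetical name.

```python
import re
from collections import defaultdict
from typing import Dict

# (pattern, archetype, weight): illustrative values only, not Clawpy's real rules.
SCORING_RULES = [
    (re.compile(r"\b(frontend|react|css)\b", re.IGNORECASE), "builderf_worker", 3.0),
    (re.compile(r"\b(backend|api|database)\b", re.IGNORECASE), "builderb_worker", 3.0),
    (re.compile(r"\b(lead|principal)\b", re.IGNORECASE), "builderb_senior", 2.5),
    (re.compile(r"\b(research|analyst)\b", re.IGNORECASE), "allother_worker", 3.0),
]


def score_archetypes(description: str) -> Dict[str, float]:
    """Add each matching rule's weight to its archetype and return the totals."""
    scores: Dict[str, float] = defaultdict(float)
    for pattern, archetype, weight in SCORING_RULES:
        # Each rule contributes at most once, however many times it matches.
        if pattern.search(description):
            scores[archetype] += weight
    return dict(scores)
```

A description such as "Lead backend engineer for the API" scores on two archetypes at once, which is exactly the kind of close call the next stage exists to resolve.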

Stage 3: AI Adjudication

For ambiguous cases (score < 5.0 or margin < 1.0), a cheap LLM call adjudicates. This uses GPT-5-nano by default — fast and near-zero cost.
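
The ambiguity test can be written down directly from the two thresholds. This sketch assumes "margin" means the gap between the best and second-best score and treats an empty score set as ambiguous; both readings are assumptions, as are the names.

```python
from typing import Dict

MIN_SCORE = 5.0   # threshold from the text
MIN_MARGIN = 1.0  # threshold from the text


def needs_adjudication(scores: Dict[str, float]) -> bool:
    """Return True when the regex stage is not confident enough to decide alone."""
    ranked = sorted(scores.values(), reverse=True)
    if not ranked:
        # Nothing matched at all, so there is no winner to trust.
        return True
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    return top < MIN_SCORE or (top - runner_up) < MIN_MARGIN
```

Only when this returns True would the LLM call be made, which keeps the common, clear-cut cases free of model cost.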

Why this matters: The archetype determines the model tier, token budget, spawning authority, and tool access. A ceo_senior gets an expensive frontier model. A builderb_junior gets a cheap, fast model. The process is automatic and auditable, and it is fully deterministic whenever Stage 1 or Stage 2 settles the classification without an LLM call.
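
As a rough picture of how an archetype could drive these settings, the sketch below maps archetypes to a small profile record. Every value in it (tier labels, token budgets, spawn flags) is a placeholder chosen for illustration, and falling back to the cheapest profile for unknown archetypes is an assumption; none of this is taken from Clawpy's configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchetypeProfile:
    model_tier: str
    token_budget: int
    can_spawn: bool


# Placeholder values for illustration; the real tiers and budgets are not given here.
ARCHETYPE_PROFILES = {
    "ceo_senior": ArchetypeProfile("frontier", 200_000, True),
    "builderb_senior": ArchetypeProfile("mid", 100_000, True),
    "builderb_junior": ArchetypeProfile("fast", 20_000, False),
}
DEFAULT_PROFILE = ArchetypeProfile("fast", 20_000, False)


def profile_for(archetype: str) -> ArchetypeProfile:
    """Look up the profile for an archetype, defaulting to the cheapest one."""
    return ARCHETYPE_PROFILES.get(archetype, DEFAULT_PROFILE)
```

Keeping the mapping in one table is what makes the routing auditable: a reviewer can see every archetype's entitlements in a single place.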


Competitor Comparison — Agent Architecture

Capability | Clawpy | OpenClaw | Hermes | Agent Zero | Paperclip
--- | --- | --- | --- | --- | ---
Points of contact | ✅ 2 (Alfred + Lucius) | ❌ 1 (single agent) | ❌ 1 (single agent) | ❌ 1 (Agent 0) | ⚠️ 1 (CEO agent, user = "Board")
Contact purpose separation | ✅ Butler vs CEO | ❌ N/A | ❌ N/A | ❌ N/A | ⚠️ Board oversight only
Isolated memory per contact | ✅ Separate ChromaDB | ❌ N/A | ❌ N/A | ❌ N/A | ❌ Shared
Reserved immutable agent IDs | ✅ Code-enforced | ❌ None | ❌ None | ❌ None | ⚠️ CEO role
Automatic archetype classification | ✅ 3-stage (rules + regex + AI) | ❌ Manual roles | ❌ None | ❌ None | ⚠️ User-defined roles
Tiered model routing by archetype | ✅ Automatic per-tier | ⚠️ Manual per-agent | ⚠️ Single model | ❌ Single model | ⚠️ Per-agent config
Hierarchical delegation | ✅ 5-tier corporate | ⚠️ Basic | ❌ Single agent | ⚠️ call_subordinate | ✅ Org chart
Escalation chains | ✅ Heartbeat-monitored | ❌ None | ❌ None | ❌ None | ⚠️ Reporting only
Spawning authority control | ✅ Only leads can spawn | ⚠️ Any agent | ❌ N/A (single agent) | ✅ Agent 0 spawns | ✅ CEO spawns
External action authority | ✅ Scoped (Primary + Optional) | ❌ All or nothing | ❌ All have access | ❌ All have access | ⚠️ Budget-gated
Agent cannot self-elevate | ✅ Reserved ID enforcement | ❌ Not enforced | ❌ N/A | ❌ Not enforced | ⚠️ Role-based
Budget enforcement per agent | ✅ Auto-pause at limit | ⚠️ Guidance only | ❌ None | ❌ None | ✅ Auto-throttle
Dashboard GUI | ✅ Full visual | ⚠️ Logs | ❌ CLI only | ⚠️ Web UI (basic) | ✅ React dashboard
Stall detection + auto-kill | ✅ Heartbeat monitor | ❌ None | ❌ None | ❌ None | ⚠️ Heartbeat (scheduled)

How Each Competitor Handles the User Interface

OpenClaw — You talk to one agent. That agent does everything — answers questions, runs commands, manages files, browses the web. There is no strategic delegation layer. The agent can manage sub-processes, but there's no separation between "ask a question" and "redesign the organisation." Memory is shared across all interactions.

Hermes — You talk to one agent through TUI, messaging gateway, or API. The agent handles everything from chat to code execution to skill learning. It's sophisticated (40+ tools, persistent memory, self-improvement), but there's no separation of concerns. Your casual question about the weather shares context with your request to refactor the codebase.

Agent Zero — You talk to Agent 0, which can spawn subordinate agents for specific tasks. But Agent 0 is still a single point of contact — it handles both "research this topic" and "delegate this to a team." Subordinates are disposable — they don't maintain identity or isolated memory.

Paperclip — The closest to Clawpy's model. The user acts as the "Board of Directors" and interacts primarily with a CEO agent. However, Paperclip is an orchestration-only layer — it doesn't provide the agent runtime itself (it wraps Claude Code, OpenClaw, etc.). It has strong governance (budgets, audit trails, heartbeats) but the CEO and worker agents don't have purpose-built separation of memory, tools, or security scoping. Paperclip also lacks an Alfred-equivalent butler for day-to-day assistance.

Clawpy — Two permanent, purpose-built points of contact with isolated memory, scoped permissions, and fundamentally different toolsets. Alfred handles your daily work. Lucius handles strategy. They never interfere with each other. Both are immutable infrastructure that cannot be overridden, merged, or reassigned.

The Real-World Analogy

Think of it like a real company:

  • Alfred is your executive assistant. You ask him to find information, draft documents, check schedules, run reports. He knows your preferences and your history.
  • Lucius is your CEO. You tell him your strategic vision. He hires the team, assigns the work, monitors the budget, and reports outcomes.

You would never ask your CEO to search Google for restaurant recommendations. You would never ask your assistant to redesign the company org chart. They have different jobs, different knowledge, and different authority. Clawpy enforces this at the architecture level.