Temporal Knowledge Graph

While vector search (Layers C and E) answers "what past work is similar to this task?", the Temporal Knowledge Graph answers a fundamentally different question: "what past work is causally connected to this task?"

Layer F is a lightweight directed graph stored as JSON. It tracks task relationships, applies time-decay scoring, and injects graph neighbors into recall results — surfacing causal chains that pure similarity search cannot find.


Why a Graph?

Consider this scenario: a deployment task fails. A vector search for "deployment failure" might surface similar past failures — useful, but incomplete. The knowledge graph adds structural context:

Task: "Fix API authentication"
  ├── blocks → "Deploy v2.4 to staging"
  ├── related_to → "Refactor OAuth flow"  
  └── derived_from → "Security audit findings"

By walking 1-hop graph neighbors, Clawpy discovers that the authentication fix was derived from a security audit and is blocking a deployment — context that vector similarity alone cannot provide.
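The 1-hop walk described above can be sketched with a plain adjacency index. This is a minimal illustrative sketch, not Clawpy's actual storage: the edge tuples, task IDs, and `neighbors` helper are all hypothetical stand-ins.

```python
from collections import defaultdict

# Hypothetical edges mirroring the scenario above: (source, target, relationship).
edges = [
    ("task_auth", "task_deploy", "blocks"),
    ("task_auth", "task_oauth", "related_to"),
    ("task_audit", "task_auth", "derived_from"),
]

# Index edges in both directions so a walk from any node sees
# incoming as well as outgoing links.
adjacency = defaultdict(list)
for source, target, rel in edges:
    adjacency[source].append((target, rel))
    adjacency[target].append((source, rel))

def neighbors(task_id):
    """All tasks one hop away from task_id, with the connecting relationship."""
    return adjacency[task_id]

# Walking from the authentication fix surfaces the deployment it blocks
# and the audit it was derived from.
auth_context = neighbors("task_auth")
```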


Data Model

Nodes

Every completed task becomes a node in the graph:

TaskNode {
  task_id:      string       // Unique identifier (e.g., "task_a1b2c3")
  title:        string       // "Fix API authentication"
  timestamp:    float        // time.time() when created/completed
  success:      bool | null  // true = passed, false = failed, null = pending
  description:  string       // Optional context or notes
}

Edges

Directed edges encode four types of relationships:

Relationship    Direction   Meaning
blocks          A → B       Task A's failure directly prevented Task B from proceeding
related_to      A ↔ B       Tasks share domain, components, or codebase area
derived_from    B → A       Task B was spawned or split from Task A
similar_to      A ↔ B       Auto-linked when ChromaDB similarity exceeds threshold

Edges are deduplicated by (source, target, relationship) — the same relationship between two tasks is never recorded twice.
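The dedup rule can be sketched with a seen-set keyed on the `(source, target, relationship)` triple. The `add_edge` helper and the set-based approach are assumptions for illustration, not Clawpy's actual code.

```python
def add_edge(edges, seen, source, target, relationship):
    """Record a directed edge unless the same triple already exists."""
    key = (source, target, relationship)
    if key in seen:
        return False  # duplicate: this exact relationship is already recorded
    seen.add(key)
    edges.append({"source_id": source, "target_id": target, "relationship": relationship})
    return True

edges, seen = [], set()
add_edge(edges, seen, "task_a", "task_b", "blocks")
added_again = add_edge(edges, seen, "task_a", "task_b", "blocks")  # rejected
```

Note that the triple includes the relationship type, so `task_a blocks task_b` and `task_a related_to task_b` can coexist.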


Time-Decay Weighted Recall

The graph's most powerful feature is its time-weighted re-ranking of ChromaDB results. Raw vector search returns results ranked by semantic distance, but this ignores temporal relevance. The knowledge graph fixes this with a composite scoring formula:

combined_score = λ × semantic_score + (1 − λ) × recency_score

Where:

Variable         Formula               Default
semantic_score   1 − chroma_distance   range 0.0 – 1.0
recency_score    exp(−age_days / τ)    exponential decay
λ (lambda)       SEMANTIC_WEIGHT       0.7
τ (tau)          HALF_LIFE_DAYS        14 days

How Recency Scoring Works

The exponential decay function means:

Age                  Recency Score   Interpretation
0 days (today)       1.000           Full recency weight
7 days (1 week)      0.607           ~61% recency weight
14 days (τ)          0.368           ~37% — one e-folding time
28 days (1 month)    0.135           ~14% recency weight
90 days (3 months)   0.002           ~0.2% — nearly forgotten

Note that despite the constant's name, τ is an e-folding time rather than a true half-life: the score reaches 1/e ≈ 0.368 at 14 days, and drops to 0.5 earlier, at τ · ln 2 ≈ 9.7 days.

With λ = 0.7, semantic similarity dominates the score, but recency acts as a tiebreaker: when two past tasks are equally similar to the current task, the more recent one ranks higher.
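The composite formula can be sketched as a small helper. The constant names `SEMANTIC_WEIGHT` and `HALF_LIFE_DAYS` come from the defaults above; the function name `combined_score` is illustrative.

```python
import math

SEMANTIC_WEIGHT = 0.7   # λ: weight on semantic similarity
HALF_LIFE_DAYS = 14.0   # τ: decay time constant, in days

def combined_score(chroma_distance, age_days):
    """combined = λ × semantic + (1 − λ) × recency."""
    semantic = 1.0 - chroma_distance            # invert distance into similarity
    recency = math.exp(-age_days / HALF_LIFE_DAYS)
    return SEMANTIC_WEIGHT * semantic + (1.0 - SEMANTIC_WEIGHT) * recency
```

For a hit at distance 0.4 that is 3 days old, this yields ≈ 0.662; at distance 0.3 and 45 days old, ≈ 0.502.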

Worked Example

Two recalled tasks for the query "Fix SSL certificate rotation":

Task A: "Configured Let's Encrypt auto-renewal" — 3 days ago, distance 0.4

semantic = 1 − 0.4 = 0.6
recency  = exp(−3 / 14) = 0.807
combined = 0.7 × 0.6 + 0.3 × 0.807 = 0.420 + 0.242 = 0.662

Task B: "Set up TLS termination on load balancer" — 45 days ago, distance 0.3

semantic = 1 − 0.3 = 0.7
recency  = exp(−45 / 14) = 0.040
combined = 0.7 × 0.7 + 0.3 × 0.040 = 0.490 + 0.012 = 0.502

Result: Task A (0.662) ranks above Task B (0.502) despite Task B being more semantically similar, because Task A is much more recent.


1-Hop Neighbor Injection

After scoring direct ChromaDB hits, the graph walks one hop in each direction and injects connected tasks that weren't found by vector search:

for neighbor in graph.neighbors(task_id):
    if neighbor.task_id in already_seen:
        continue
    # Penalize indirect results: halve the semantic score
    combined = λ × (semantic_score × 0.5) + (1 − λ) × recency_score
    results.append(neighbor_with_reduced_score)

Indirect neighbors receive a 50% penalty on their semantic component to avoid overwhelming the recall with loosely connected results. At most MAX_GRAPH_NEIGHBORS = 3 neighbors are injected per direct hit.
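A runnable sketch of the injection step follows. The graph and result shapes are hypothetical stand-ins, and it assumes each neighbor inherits the direct hit's semantic score before the 50% penalty (one reading of the snippet above, not confirmed by the source).

```python
import math

SEMANTIC_WEIGHT, HALF_LIFE_DAYS, MAX_GRAPH_NEIGHBORS = 0.7, 14.0, 3

def inject_neighbors(direct_hits, graph):
    """Append unseen 1-hop neighbors of each direct hit with a halved semantic component."""
    results = list(direct_hits)
    seen = {hit["task_id"] for hit in direct_hits}
    for hit in direct_hits:
        injected = 0
        for nb in graph.get(hit["task_id"], []):
            if nb["task_id"] in seen or injected >= MAX_GRAPH_NEIGHBORS:
                continue
            recency = math.exp(-nb["age_days"] / HALF_LIFE_DAYS)
            # Halve the (inherited) semantic score to penalize indirect results.
            combined = SEMANTIC_WEIGHT * (hit["semantic"] * 0.5) + (1 - SEMANTIC_WEIGHT) * recency
            results.append({"task_id": nb["task_id"], "score": combined})
            seen.add(nb["task_id"])
            injected += 1
    return results

hits = [{"task_id": "task_a", "semantic": 0.6}]
graph = {"task_a": [{"task_id": "task_b", "age_days": 3.0}]}
out = inject_neighbors(hits, graph)
```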


Soft Invalidation

When a task's results become obsolete (e.g., a decision is reversed), the node is soft-invalidated rather than deleted:

def soft_invalidate_node(self, task_id, reason, stamp=""):
    node = self.nodes[task_id]        # look up the node to invalidate
    node.description += f"\n[invalidated:{stamp}] {reason}"
    node.success = None               # reset outcome to "pending"
    self._save()

This preserves the full audit trail while signaling to the recall system that this task's results should be treated with lower confidence.


Storage

The entire graph is stored as a single JSON file:

{
  "nodes": {
    "task_a1b2c3": {
      "task_id": "task_a1b2c3",
      "title": "Fix API authentication",
      "timestamp": 1713564000.0,
      "success": true,
      "description": ""
    }
  },
  "edges": [
    {
      "source_id": "task_a1b2c3",
      "target_id": "task_d4e5f6",
      "relationship": "blocks",
      "created_at": 1713564100.0,
      "note": ""
    }
  ],
  "_meta": {
    "saved_at": 1713564200.0,
    "node_count": 42,
    "edge_count": 67
  }
}

Writes are atomic (write to .tmp, then rename) to prevent data corruption during crashes. The graph is loaded once at startup as a singleton and held in memory for fast traversal.
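The write-to-tmp-then-rename pattern can be sketched as below. The payload follows the JSON schema above; the `save_graph` function name and the added `fsync` call are illustrative assumptions.

```python
import json
import os
import tempfile

def save_graph(path, nodes, edges):
    """Atomically persist the graph: write a .tmp sibling, then rename over the target."""
    payload = {
        "nodes": nodes,
        "edges": edges,
        "_meta": {"node_count": len(nodes), "edge_count": len(edges)},
    }
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(payload, f, indent=2)
        f.flush()
        os.fsync(f.fileno())       # ensure bytes hit disk before the rename
    os.replace(tmp_path, path)     # atomic on POSIX: readers see old or new, never partial

# Usage: write to a scratch directory, then read back.
graph_dir = tempfile.mkdtemp()
graph_path = os.path.join(graph_dir, "knowledge_graph.json")
save_graph(graph_path, {"task_a1b2c3": {"title": "Fix API authentication"}}, [])
```

`os.replace` (rather than `os.rename`) is used because it overwrites the destination atomically even when the target file already exists.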