6 phases, one per commit:
- Phase 1: data structures (CallGraph, ExpensiveOp, StateMutation) - 15 unit tests
- Phase 2: trace_action + ActionProfile + cost model + AST walking - 8 tests (synthetic + integration on real src/)
- Phase 3: JSON / markdown / Mermaid output - 4 tests
- Phase 4: MCP tool + CLI surface - 3 tests
- Phase 5: run audit on 3 actions; commit report
- Phase 6: tracks.md update
TDD pattern: each task has synthetic-data unit test, then real implementation, then integration with real src/, then commit. The state.toml scaffold is created in Phase 0 Step 0.1 and advanced after each phase.
3 actions in scope (MMA is cold per user):
- ai_message_lifecycle (5 entry points)
- discussion_save_load (4 entry points)
- gui_startup (3 entry points)
Two follow-up tracks recorded but NOT in this track:
- pipeline_runtime_profiling_20260607
- pipeline_pruning_20260607
No new pip dependencies; pure stdlib (ast, json, pathlib, dataclasses). Read-only on src/; new files are the tool, the tests, and the report under docs/reports/code_path_audit/2026-06-07/.
Code Path & Data Pipeline Audit Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Build src/code_path_audit.py — a static-analysis tool that audits the 3 major actions (AI message lifecycle, discussion save/load, GUI startup) for expensive operations, redundant calls, and pipelining candidates. Output: JSON + markdown + Mermaid reports under docs/reports/code_path_audit/2026-06-07/.
Architecture: Single new module src/code_path_audit.py. No new dependencies. Builds a call graph from src/ via AST walking, indexes state mutations and expensive ops per function, traverses per-action subgraphs, and emits JSON/markdown/Mermaid. Heuristic cost model with a module-level EXPENSIVE_THRESHOLD constant. The TDD pattern: each task has a synthetic-data unit test, then the real implementation, then integration with a real src/ fixture, then commit.
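The core mechanism in the architecture paragraph, building a call graph by walking the AST, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the helper name calls_per_function and the sample source are hypothetical and not part of the plan, and callees are recorded as bare names only.

```python
import ast

SOURCE = '''
def load(path):
    return open(path).read()

def run(path):
    data = load(path)
    return len(data)
'''

def calls_per_function(source: str) -> dict[str, list[str]]:
    """Map each function name to the sorted bare names it calls."""
    tree = ast.parse(source)
    out: dict[str, list[str]] = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            names: list[str] = []
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call):
                    if isinstance(sub.func, ast.Name):
                        names.append(sub.func.id)       # plain call: open(...)
                    elif isinstance(sub.func, ast.Attribute):
                        names.append(sub.func.attr)     # method call: x.read()
            out[node.name] = sorted(names)
    return out

graph = calls_per_function(SOURCE)
```

Because only the attribute name is kept for method calls, the receiver is lost: x.read() and y.read() both appear as read. That keeps the walker simple at the cost of precision.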
Tech Stack: Python 3.11+, ast (stdlib), pathlib (stdlib), json (stdlib), dataclasses (stdlib). No new pip dependencies.
Phase 0: Setup
Files: conductor/tracks/code_path_audit_20260607/state.toml (create), src/code_path_audit.py (create empty), tests/test_code_path_audit.py (create empty).
- Step 0.1: Create state.toml
Write conductor/tracks/code_path_audit_20260607/state.toml:
# Track state for code_path_audit_20260607
# Updated by Tier 2 Tech Lead as tasks complete
[meta]
track_id = "code_path_audit_20260607"
name = "Code Path & Data Pipeline Audit"
status = "active"
current_phase = 0
last_updated = "2026-06-07"
[phases]
phase_1 = { status = "pending", checkpointsha = "", name = "Data structures (CallGraph, ExpensiveOpIndex, StateMutationIndex)" }
phase_2 = { status = "pending", checkpointsha = "", name = "trace_action + ActionProfile + cost model" }
phase_3 = { status = "pending", checkpointsha = "", name = "Output (JSON / markdown / Mermaid)" }
phase_4 = { status = "pending", checkpointsha = "", name = "MCP tool + CLI surface" }
phase_5 = { status = "pending", checkpointsha = "", name = "Run audit on 3 actions; commit report" }
phase_6 = { status = "pending", checkpointsha = "", name = "tracks.md update" }
[verification]
call_graph_json_produced = false
expensive_ops_json_produced = false
state_mutations_json_produced = false
actions_ai_message_produced = false
actions_save_load_produced = false
actions_gui_startup_produced = false
summary_md_produced = false
optimization_candidates_md_produced = false
unit_tests_passing = false
- Step 0.2: Create empty src/code_path_audit.py
New-Item -ItemType File -Path src/code_path_audit.py -Force | Out-Null
- Step 0.3: Create empty tests/test_code_path_audit.py
New-Item -ItemType File -Path tests/test_code_path_audit.py -Force | Out-Null
- Step 0.4: Confirm git state clean for the new files
Run: git status --short
Expected: only the user's pre-existing modifications (not the new files; the new files are untracked, which is fine for now).
- Step 0.5: Conductor - User Manual Verification (per workflow.md)
Ask the user to confirm Phase 0 setup is right before proceeding.
Phase 1: Data structures (CallGraph, ExpensiveOpIndex, StateMutationIndex)
Files: src/code_path_audit.py, tests/test_code_path_audit.py.
This phase is one commit. Three sub-tasks (one per data structure), each with TDD.
Task 1.1: CallGraph
Files:
- Modify: src/code_path_audit.py
- Test: tests/test_code_path_audit.py
- Step 1.1.1: Add the CallGraph + FunctionNode dataclasses to src/code_path_audit.py
Write the following at the top of src/code_path_audit.py (1-space indent per the project's Python style):
"""Static code path & data pipeline audit tool.
Builds a call graph of src/, indexes state mutations and expensive
operations per function, and produces per-action load profiles for
the 3 major actions (AI message lifecycle, discussion save/load,
GUI startup). See conductor/tracks/code_path_audit_20260607/spec.md.
"""
from __future__ import annotations
import ast
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Literal
EXPENSIVE_THRESHOLD: int = 40_000
COST_CLASS_WEIGHTS: dict[str, int] = {
"file_io": 100,
"network": 500,
"ast_parse": 200,
"json_io": 150,
"pickle": 300,
"deep_copy": 200,
"loop_amplified": 1,
}
@dataclass
class FunctionNode:
fqname: str
file: str
line: int
calls: list[str] = field(default_factory=list)
state_mutations: list["StateMutation"] = field(default_factory=list)
expensive_ops: list["ExpensiveOp"] = field(default_factory=list)
@dataclass
class CallGraph:
nodes: dict[str, FunctionNode] = field(default_factory=dict)
edges: dict[str, set[str]] = field(default_factory=dict)
def add_edge(self, caller: str, callee: str) -> None:
self.edges.setdefault(caller, set()).add(callee)
self.nodes.setdefault(caller, FunctionNode(fqname=caller, file="", line=0))
self.nodes.setdefault(callee, FunctionNode(fqname=callee, file="", line=0))
def transitive_callees(self, root: str, max_depth: int = 10) -> set[str]:
seen: set[str] = set()
stack: list[tuple[str, int]] = [(root, 0)]
while stack:
node, depth = stack.pop()
if depth > max_depth or node in seen:
continue
seen.add(node)
for callee in self.edges.get(node, ()):
stack.append((callee, depth + 1))
seen.discard(root)
return seen
def render_mermaid(self, root: str, max_depth: int = 5) -> str:
seen: set[str] = {root}
edges: list[tuple[str, str]] = []
stack: list[tuple[str, int]] = [(root, 0)]
while stack:
node, depth = stack.pop()
if depth >= max_depth:
continue
for callee in self.edges.get(node, ()):
edges.append((node, callee))
if callee not in seen:
seen.add(callee)
stack.append((callee, depth + 1))
lines = ["graph LR"]
for caller, callee in sorted(edges):
lines.append(f' {caller.replace(".", "_")}["{caller}"] --> {callee.replace(".", "_")}["{callee}"]')
return "\n".join(lines)
- Step 1.1.2: Write the failing test for CallGraph in tests/test_code_path_audit.py
"""Tests for src.code_path_audit."""
from src.code_path_audit import CallGraph, EXPENSIVE_THRESHOLD
def test_callgraph_add_edge_creates_nodes() -> None:
cg = CallGraph()
cg.add_edge("a.b.c", "a.b.d")
assert "a.b.c" in cg.nodes
assert "a.b.d" in cg.nodes
assert "a.b.d" in cg.edges["a.b.c"]
def test_callgraph_transitive_callees_5_node_synthetic() -> None:
# a -> b -> c
# \-> d -> e
cg = CallGraph()
cg.add_edge("a", "b")
cg.add_edge("b", "c")
cg.add_edge("a", "d")
cg.add_edge("d", "e")
result = cg.transitive_callees("a", max_depth=10)
assert result == {"b", "c", "d", "e"}
def test_callgraph_transitive_callees_respects_max_depth() -> None:
cg = CallGraph()
cg.add_edge("a", "b")
cg.add_edge("b", "c")
cg.add_edge("c", "d")
# max_depth=1 from a: only b (direct callee); not c (depth 2)
result = cg.transitive_callees("a", max_depth=1)
assert result == {"b"}
def test_callgraph_render_mermaid_basic() -> None:
cg = CallGraph()
cg.add_edge("a", "b")
cg.add_edge("b", "c")
md = cg.render_mermaid("a", max_depth=5)
assert "graph LR" in md
assert 'a["a"] --> b["b"]' in md
assert 'b["b"] --> c["c"]' in md
def test_expensive_threshold_default() -> None:
assert EXPENSIVE_THRESHOLD == 40_000
- Step 1.1.3: Run the tests to verify they pass
(Note: Step 1.1.1 added the dataclasses before Step 1.1.2 added the tests, so both already exist and the tests should PASS on the first run. If 1.1.1 had been skipped, collection would fail with ImportError: cannot import name 'CallGraph'.)
Run: uv run pytest tests/test_code_path_audit.py -q 2>&1 | Select-Object -Last 10
Expected: 5 tests pass.
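One property of the depth-limited traversal in Task 1.1 is worth checking. With a depth-first stack and a shared seen set, a node first reached along a long path is marked seen at a large depth, and its callees may then be cut off even though a shorter path exists. Whether that happens depends on set iteration order. A breadth-first variant avoids the issue because each node is first seen at its minimum depth. The sketch below is an alternative under that assumption; the name callees_within is hypothetical and this is not the plan's implementation.

```python
from collections import deque

def callees_within(
    edges: dict[str, set[str]], root: str, max_depth: int
) -> dict[str, int]:
    """Breadth-first walk: map each callee to its minimum depth from root."""
    depth_of: dict[str, int] = {root: 0}
    queue: deque[str] = deque([root])
    while queue:
        node = queue.popleft()
        if depth_of[node] == max_depth:
            continue  # callees of this node would exceed the limit
        for callee in sorted(edges.get(node, ())):
            if callee not in depth_of:
                depth_of[callee] = depth_of[node] + 1
                queue.append(callee)
    del depth_of[root]
    return depth_of

# a -> b -> c -> d, plus a shortcut a -> c that puts d at depth 2.
EDGES = {"a": {"b", "c"}, "b": {"c"}, "c": {"d"}}
within_two = callees_within(EDGES, "a", max_depth=2)
within_one = callees_within(EDGES, "a", max_depth=1)
```

Here d is always found at depth 2 through the shortcut, regardless of the order in which the callees of a are visited.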
Task 1.2: ExpensiveOpIndex + cost classes
Files:
- Modify: src/code_path_audit.py
- Test: tests/test_code_path_audit.py
- Step 1.2.1: Add the ExpensiveOp dataclass + the 7 cost class detection functions to src/code_path_audit.py
Append to src/code_path_audit.py:
CostClass = Literal[
"file_io", "network", "ast_parse", "json_io", "pickle", "deep_copy", "loop_amplified"
]
@dataclass
class ExpensiveOp:
callee: str
cost_class: CostClass
data_size_estimate: int | None
line: int
weight: int
_FILE_IO_PATTERNS: frozenset[str] = frozenset({"open", "read_text", "write_text", "read_bytes", "write_bytes"})
_NETWORK_PATTERNS: frozenset[str] = frozenset({"get", "post", "put", "delete", "request", "urlopen", "send"})
_AST_PATTERNS: frozenset[str] = frozenset({"parse", "walk", "iter_child_nodes"})
_JSON_PATTERNS: frozenset[str] = frozenset({"dump", "dumps", "load", "loads"})
_PICKLE_PATTERNS: frozenset[str] = frozenset({"pickle"})
_DEEP_COPY_PATTERNS: frozenset[str] = frozenset({"deepcopy"})
def classify_call(callee_name: str) -> CostClass | None:
"""Return the cost class for a call name, or None if not expensive."""
if callee_name in _FILE_IO_PATTERNS:
return "file_io"
if callee_name in _NETWORK_PATTERNS:
return "network"
if callee_name in _AST_PATTERNS:
return "ast_parse"
if callee_name in _JSON_PATTERNS:
return "json_io"
if "pickle" in callee_name:
return "pickle"
if callee_name in _DEEP_COPY_PATTERNS:
return "deep_copy"
return None
- Step 1.2.2: Write the failing tests for the 7 cost classes
Append to tests/test_code_path_audit.py:
from src.code_path_audit import classify_call, COST_CLASS_WEIGHTS
def test_classify_call_file_io() -> None:
assert classify_call("open") == "file_io"
assert classify_call("read_text") == "file_io"
def test_classify_call_network() -> None:
assert classify_call("get") == "network"
assert classify_call("urlopen") == "network"
def test_classify_call_ast_parse() -> None:
assert classify_call("parse") == "ast_parse"
assert classify_call("walk") == "ast_parse"
def test_classify_call_json_io() -> None:
assert classify_call("dump") == "json_io"
assert classify_call("load") == "json_io"
def test_classify_call_pickle() -> None:
assert classify_call("pickle_dump") == "pickle"
assert "pickle" in (classify_call("custom_pickle_helper") or "")
def test_classify_call_deep_copy() -> None:
assert classify_call("deepcopy") == "deep_copy"
def test_classify_call_non_expensive() -> None:
assert classify_call("len") is None
assert classify_call("print") is None
assert classify_call("range") is None
def test_cost_class_weights_all_seven() -> None:
 expected = {"file_io", "network", "ast_parse", "json_io", "pickle", "deep_copy", "loop_amplified"}
 assert set(COST_CLASS_WEIGHTS.keys()) == expected
 assert COST_CLASS_WEIGHTS["network"] > COST_CLASS_WEIGHTS["file_io"]
 # json_io is weighted 150 and file_io 100 in COST_CLASS_WEIGHTS, so json_io ranks higher.
 assert COST_CLASS_WEIGHTS["json_io"] > COST_CLASS_WEIGHTS["file_io"]
- Step 1.2.3: Run all tests
Run: uv run pytest tests/test_code_path_audit.py -q 2>&1 | Select-Object -Last 5
Expected: 13 tests pass (5 from 1.1.2 + 8 from 1.2.2).
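Because classify_call matches bare names, a call such as cfg.get('k') on a dict is classified as network just like requests.get(...). One possible refinement, sketched below, also looks at the receiver name. The allow-list NETWORK_RECEIVERS and the helper network_calls are assumptions made for illustration; the plan itself does not define them.

```python
import ast

# Hypothetical allow-list: receiver names that make a bare "get"/"post" look like HTTP.
NETWORK_RECEIVERS = {"requests", "httpx", "session", "client"}
NETWORK_METHODS = {"get", "post", "put", "delete", "request"}

def network_calls(source: str) -> list[str]:
    """Return 'receiver.method' for calls whose receiver is in the allow-list."""
    hits: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.attr in NETWORK_METHODS
            and node.func.value.id in NETWORK_RECEIVERS
        ):
            hits.append(f"{node.func.value.id}.{node.func.attr}")
    return sorted(hits)

# The sample is only parsed, never executed, so `requests` need not be installed.
SAMPLE = "cfg = {}\nx = cfg.get('k')\nr = requests.get('https://example.com')\n"
found = network_calls(SAMPLE)
```

The trade-off is recall: a client stored under an unlisted variable name would be missed, which may be acceptable for a heuristic cost model.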
Task 1.3: StateMutationIndex + 5 mutation kinds
Files:
- Modify: src/code_path_audit.py
- Test: tests/test_code_path_audit.py
- Step 1.3.1: Add the StateMutation dataclass + detection helpers to src/code_path_audit.py
Append to src/code_path_audit.py:
MutationKind = Literal["attr_write", "container_mutate", "file_write", "ipc_emit", "global_write"]
@dataclass
class StateMutation:
target: str
kind: MutationKind
line: int
- Step 1.3.2: Write the failing tests for the 5 mutation kinds (the kinds are documented in the spec; the AST detection is built in Phase 2 — Phase 1 only verifies the data structure)
Append to tests/test_code_path_audit.py:
from src.code_path_audit import StateMutation
def test_state_mutation_5_kinds() -> None:
"""The 5 mutation kinds documented in the spec."""
expected = {"attr_write", "container_mutate", "file_write", "ipc_emit", "global_write"}
mutations = [
StateMutation(target="self.history", kind="attr_write", line=10),
StateMutation(target="self.entries.append", kind="container_mutate", line=20),
StateMutation(target="file:logs/foo.log", kind="file_write", line=30),
StateMutation(target="queue.put", kind="ipc_emit", line=40),
StateMutation(target="module.events.X", kind="global_write", line=50),
]
assert {m.kind for m in mutations} == expected
def test_state_mutation_fields() -> None:
m = StateMutation(target="self.x", kind="attr_write", line=42)
assert m.target == "self.x"
assert m.line == 42
- Step 1.3.3: Run all tests
Run: uv run pytest tests/test_code_path_audit.py -q 2>&1 | Select-Object -Last 5
Expected: 15 tests pass.
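Phase 1 only defines the StateMutation record; detection comes later. To make two of the five kinds concrete, here is a minimal sketch of how attr_write and container_mutate could be recognised from the AST. The MUTATING_METHODS set and the find_mutations helper are hypothetical, and the other three kinds (file_write, ipc_emit, global_write) are not covered.

```python
import ast

# Assumed list of method names that mutate their receiver in place.
MUTATING_METHODS = {"append", "extend", "insert", "pop", "remove", "clear", "update", "add"}

def find_mutations(source: str) -> list[tuple[str, str, int]]:
    """Return (target, kind, line) tuples sorted by line number."""
    out: list[tuple[str, str, int]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Attribute):
                    out.append((ast.unparse(target), "attr_write", node.lineno))
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            if node.func.attr in MUTATING_METHODS:
                out.append((ast.unparse(node.func), "container_mutate", node.lineno))
    return sorted(out, key=lambda m: m[2])

SAMPLE = (
    "class History:\n"
    "    def add(self, entry):\n"
    "        self.last = entry\n"
    "        self.entries.append(entry)\n"
)
mutations = find_mutations(SAMPLE)
```

The target strings follow the shape used in test_state_mutation_5_kinds, for example self.entries.append for a container mutation.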
Task 1.4: Commit Phase 1
- Step 1.4.1: Stage and commit
git add src/code_path_audit.py tests/test_code_path_audit.py
git commit -m "feat(audit): add code_path_audit data structures (CallGraph, ExpensiveOp, StateMutation)
15 unit tests passing on synthetic 5-node graphs. The 7 cost
classes (file_io, network, ast_parse, json_io, pickle,
deep_copy, loop_amplified) and 5 mutation kinds (attr_write,
container_mutate, file_write, ipc_emit, global_write) are
defined as Literal types. classify_call() detects the cost
classes by name pattern, except loop_amplified, which has a
weight but no detector yet.
EXPENSIVE_THRESHOLD = 40_000 module constant. COST_CLASS_WEIGHTS
dict with the 7 classes; network (500) > json_io (150) >
file_io (100) ordering. Tests in tests/test_code_path_audit.py.
Phase 2 will add trace_action + build_call_graph (AST walking
src/ to populate the indexes) + ActionProfile."
- Step 1.4.2: Attach git note
git notes add -m "feat(audit) Phase 1: data structures (CallGraph, ExpensiveOp, StateMutation)
15 unit tests pass on synthetic graphs and pattern detection.
The 3 data structures + 7 cost classes + 5 mutation kinds are
the foundation; Phase 2 adds the AST walker that populates them
from real src/.
EXPENSIVE_THRESHOLD = 40_000 is a module-level constant. The
runtime-profiling follow-up will calibrate this number based
on actual measurements." <commit_hash>
- Step 1.4.3: Update state.toml: mark phase_1 = completed; current_phase = 2; commit
- Step 1.4.4: Conductor - User Manual Verification
Phase 2: trace_action + ActionProfile + cost model
Files: src/code_path_audit.py, tests/test_code_path_audit.py.
This phase is one commit. AST walking the real src/ to populate the indexes.
Task 2.1: Action + ActionProfile dataclasses
- Step 2.1.1: Add the Action + ActionProfile dataclasses to src/code_path_audit.py
Append to src/code_path_audit.py:
@dataclass
class Action:
name: str
entry_points: list[str]
description: str
@dataclass
class ActionProfile:
action: Action
call_graph: CallGraph
expensive_ops: list[ExpensiveOp]
state_mutations: list[StateMutation]
redundancy: list[tuple[str, int]]
pipelining_candidates: list[list[str]]
total_load_estimate: int
unresolved_calls: list[str]
mermaid: str
markdown: str
# The 3 actions in scope (MMA is cold per user)
ACTIONS: dict[str, Action] = {
"ai_message_lifecycle": Action(
name="ai_message_lifecycle",
entry_points=[
"src.app_controller.AppController.process_user_request",
"src.ai_client.AIClient.send",
"src.aggregate.build_file_items",
"src.summarize._summarise_generic",
"src.summarize._summarise_markdown",
],
description="AI message lifecycle: context aggregation -> AI call -> response -> history update.",
),
"discussion_save_load": Action(
name="discussion_save_load",
entry_points=[
"src.project_manager.save_project",
"src.project_manager.load_project",
"src.history.HistoryManager.save_snapshot",
"src.models.parse_history_entries",
],
description="Discussion save/load: snapshot -> serialize -> write TOML -> read TOML -> deserialize.",
),
"gui_startup": Action(
name="gui_startup",
entry_points=[
"gui_2.App.__init__",
"src.app_controller.AppController.__init__",
"src.paths._resolve_config_path",
],
description="GUI startup: paths init -> config load -> controller init -> first render.",
),
}
- Step 2.1.2: Write tests for the 3 actions in tests/test_code_path_audit.py
Append to tests/test_code_path_audit.py:
from src.code_path_audit import ACTIONS
def test_actions_3_in_scope() -> None:
assert set(ACTIONS.keys()) == {"ai_message_lifecycle", "discussion_save_load", "gui_startup"}
def test_actions_have_entry_points() -> None:
for name, action in ACTIONS.items():
assert action.entry_points, f"{name} has no entry points"
assert action.description, f"{name} has no description"
def test_ai_message_entry_points_cover_pipeline() -> None:
entry = ACTIONS["ai_message_lifecycle"].entry_points
# context aggregation -> AI send -> history update -> summarization
assert any("build_file_items" in e for e in entry)
assert any("AIClient.send" in e for e in entry)
assert any("process_user_request" in e for e in entry)
def test_mma_is_cold_not_in_actions() -> None:
for action in ACTIONS.values():
for ep in action.entry_points:
assert "multi_agent_conductor" not in ep
assert "ConductorEngine" not in ep
assert "WorkerPool" not in ep
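The tests in Step 2.1.2 check that entry points are declared, but not that each one resolves to a node in the call graph. An entry point such as gui_2.App.__init__, which judging by its name appears to live outside src/, would then be skipped without any signal. A small check along these lines could surface that early; the helper name missing_entry_points and the sample data are hypothetical.

```python
def missing_entry_points(
    entry_points: dict[str, list[str]], known_nodes: set[str]
) -> dict[str, list[str]]:
    """Return, per action, the entry points that have no node in the graph."""
    missing: dict[str, list[str]] = {}
    for action, eps in entry_points.items():
        absent = [ep for ep in eps if ep not in known_nodes]
        if absent:
            missing[action] = absent
    return missing

# Illustrative data only: a graph that knows two of the three names.
KNOWN = {
    "src.paths._resolve_config_path",
    "src.app_controller.AppController.__init__",
}
ENTRY = {
    "gui_startup": [
        "gui_2.App.__init__",
        "src.app_controller.AppController.__init__",
        "src.paths._resolve_config_path",
    ],
}
gaps = missing_entry_points(ENTRY, KNOWN)
```

In the real tool, known_nodes would presumably be the key set of the graph's nodes mapping, and a non-empty result could be reported in the profile rather than raised.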
Task 2.2: AST walker + build_call_graph + build_*_index
- Step 2.2.1: Add the AST walking + index builders to src/code_path_audit.py
Append to src/code_path_audit.py:
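def _fqname(file: str, name: str) -> str:
 # Normalise Windows separators first so src\foo.py and src/foo.py both map to src.foo.
 module = file.removesuffix(".py").replace("\\", "/").replace("/", ".")
 return f"{module}.{name}"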
def _walk_calls(node: ast.AST) -> list[tuple[str, int]]:
"""Return [(callee_name, line)] for every Call in node."""
out: list[tuple[str, int]] = []
for n in ast.walk(node):
if isinstance(n, ast.Call):
if isinstance(n.func, ast.Name):
out.append((n.func.id, n.lineno))
elif isinstance(n.func, ast.Attribute):
out.append((n.func.attr, n.lineno))
return out
def _walk_attr_writes(node: ast.AST) -> list[tuple[str, int]]:
"""Return [(target, line)] for self.X = ... and module.X = ... assignments."""
out: list[tuple[str, int]] = []
for n in ast.walk(node):
if isinstance(n, ast.Assign):
for target in n.targets:
if isinstance(target, ast.Attribute) and isinstance(target.value, ast.Name):
out.append((f"{target.value.id}.{target.attr}", n.lineno))
return out
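def build_call_graph(src_dir: str = "src") -> CallGraph:
 cg = CallGraph()
 for py_file in Path(src_dir).rglob("*.py"):
  if "__pycache__" in py_file.parts:
   continue
  try:
   tree = ast.parse(py_file.read_text(encoding="utf-8"))
  except SyntaxError:
   continue
  # Map each method to its owning class so fqnames use the Class.method form
  # that the ACTIONS entry points expect (e.g. AppController.process_user_request).
  owner: dict[int, str] = {}
  for cls in ast.walk(tree):
   if isinstance(cls, ast.ClassDef):
    for item in cls.body:
     if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
      owner[id(item)] = cls.name
  module_fq = _fqname(str(py_file), "<module>")
  cg.nodes[module_fq] = FunctionNode(fqname=module_fq, file=str(py_file), line=0)
  for node in ast.walk(tree):
   if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
    qual = f"{owner[id(node)]}.{node.name}" if id(node) in owner else node.name
    fq = _fqname(str(py_file), qual)
    calls = _walk_calls(node)
    cg.nodes[fq] = FunctionNode(fqname=fq, file=str(py_file), line=node.lineno)
    cg.nodes[fq].calls = [c for c, _ in calls]
    cg.nodes[fq].state_mutations = [
     StateMutation(target=t, kind="attr_write", line=ln)
     for t, ln in _walk_attr_writes(node)
    ]
    # Callees are bare names (e.g. "open"), so edges do not resolve to fqnames yet.
    for callee, _ in calls:
     cg.add_edge(fq, callee)
 return cg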
def build_expensive_ops_index(cg: CallGraph) -> dict[str, list[ExpensiveOp]]:
 out: dict[str, list[ExpensiveOp]] = {}
 for fq, node in cg.nodes.items():
  ops: list[ExpensiveOp] = []
  # FunctionNode.calls holds bare callee names only (list[str]), so the
  # enclosing function's line stands in for the per-call line here.
  for callee in node.calls:
   cost_class = classify_call(callee)
   if cost_class is None:
    continue
   weight = COST_CLASS_WEIGHTS[cost_class]
   ops.append(ExpensiveOp(callee=callee, cost_class=cost_class, data_size_estimate=None, line=node.line, weight=weight))
  out[fq] = ops
 return out
def build_state_mutations_index(cg: CallGraph) -> dict[str, list[StateMutation]]:
return {fq: node.state_mutations for fq, node in cg.nodes.items() if node.state_mutations}
(Note: the file uses 1-space indent per project style. The implementer should preserve this when adding the code.)
- Step 2.2.2: Add the trace_action function to src/code_path_audit.py
Append:
def trace_action(action: Action, max_depth: int = 10) -> ActionProfile:
 cg = build_call_graph()
 expensive_index = build_expensive_ops_index(cg)
 mutations_index = build_state_mutations_index(cg)
 reachable: set[str] = set()
 for ep in action.entry_points:
  # Entry points missing from the graph are skipped silently.
  if ep in cg.nodes:
   reachable.add(ep)
   reachable |= cg.transitive_callees(ep, max_depth=max_depth)
 # Sorted iteration keeps the report order stable between runs.
 ordered = sorted(reachable)
 reachable_ops: list[ExpensiveOp] = []
 for fq in ordered:
  reachable_ops.extend(expensive_index.get(fq, []))
  cg.nodes[fq].expensive_ops = expensive_index.get(fq, [])
 reachable_mutations: list[StateMutation] = []
 for fq in ordered:
  reachable_mutations.extend(mutations_index.get(fq, []))
 call_counts: dict[str, int] = {}
 for fq in ordered:
  for callee in cg.nodes[fq].calls:
   call_counts[callee] = call_counts.get(callee, 0) + 1
 redundancy = [(op, n) for op, n in call_counts.items() if n > 1]
 total_load = sum(op.weight for op in reachable_ops)
 subgraph = CallGraph()
 for fq in ordered:
  subgraph.nodes[fq] = cg.nodes[fq]
  for callee in cg.edges.get(fq, ()):
   if callee in reachable:
    subgraph.add_edge(fq, callee)
 # Every ExpensiveOp has a cost class, so filtering on cost_class is None was always empty.
 # A reachable name with no source file is a placeholder created by add_edge,
 # i.e. a callee that was never matched to a definition under src/.
 unresolved = [fq for fq in ordered if not cg.nodes[fq].file]
 return ActionProfile(
  action=action,
  call_graph=subgraph,
  expensive_ops=reachable_ops,
  state_mutations=reachable_mutations,
  redundancy=redundancy,
  pipelining_candidates=[],
  total_load_estimate=total_load,
  unresolved_calls=unresolved,
  mermaid="",
  markdown="",
 )
- Step 2.2.3: Add an integration test that builds the real call graph and traces a real action
Append to tests/test_code_path_audit.py:
def test_build_call_graph_real_src() -> None:
"""Smoke test: build the real call graph of src/ and verify it has nodes."""
from src.code_path_audit import build_call_graph
cg = build_call_graph()
assert len(cg.nodes) > 50, f"expected 50+ nodes, got {len(cg.nodes)}"
def test_trace_action_ai_message_returns_profile() -> None:
"""Integration test: trace the AI message action against real src/."""
from src.code_path_audit import trace_action, ACTIONS
profile = trace_action(ACTIONS["ai_message_lifecycle"], max_depth=5)
assert profile.action.name == "ai_message_lifecycle"
assert len(profile.expensive_ops) >= 0
assert profile.total_load_estimate >= 0
def test_trace_action_save_load_returns_profile() -> None:
from src.code_path_audit import trace_action, ACTIONS
profile = trace_action(ACTIONS["discussion_save_load"], max_depth=5)
assert profile.action.name == "discussion_save_load"
def test_trace_action_gui_startup_returns_profile() -> None:
from src.code_path_audit import trace_action, ACTIONS
profile = trace_action(ACTIONS["gui_startup"], max_depth=5)
assert profile.action.name == "gui_startup"
- Step 2.2.4: Run all tests
Run: uv run pytest tests/test_code_path_audit.py -q 2>&1 | Select-Object -Last 10
Expected: 23 tests pass (15 from Phase 1 + 4 from 2.1.2 + 4 from 2.2.3). The integration tests may take a few seconds (AST-walking 61 files).
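The loop_amplified cost class has a weight in COST_CLASS_WEIGHTS, but classify_call never returns it, so nothing in Phase 2 produces that class. One way a detector might work is to flag calls that sit inside a loop body. The sketch below is an assumption about how that could look, with a hypothetical helper name calls_inside_loops; comprehensions and generator expressions are deliberately ignored.

```python
import ast

def calls_inside_loops(source: str) -> list[tuple[str, int]]:
    """Return sorted (callee_name, line) pairs for calls made inside loop bodies."""
    hits: set[tuple[str, int]] = set()
    for loop in ast.walk(ast.parse(source)):
        if not isinstance(loop, (ast.For, ast.AsyncFor, ast.While)):
            continue
        # Only the body repeats; the iterable of a for-loop is evaluated once.
        for stmt in loop.body:
            for sub in ast.walk(stmt):
                if not isinstance(sub, ast.Call):
                    continue
                if isinstance(sub.func, ast.Name):
                    hits.add((sub.func.id, sub.lineno))
                elif isinstance(sub.func, ast.Attribute):
                    hits.add((sub.func.attr, sub.lineno))
    return sorted(hits, key=lambda h: (h[1], h[0]))

SAMPLE = (
    "def f(paths):\n"
    "    total = len(paths)\n"
    "    for p in paths:\n"
    "        data = open(p).read()\n"
    "    return total\n"
)
amplified = calls_inside_loops(SAMPLE)
```

Using a set means a call inside nested loops is reported once, even though both the inner and outer loop walks reach it.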
Task 2.3: Commit Phase 2
- Step 2.3.1: Stage and commit
git add src/code_path_audit.py tests/test_code_path_audit.py
git commit -m "feat(audit): add trace_action + ActionProfile + AST walking
The 3 actions in scope (ai_message_lifecycle, discussion_save_load,
gui_startup) are declared as Action instances. trace_action()
builds the full call graph of src/, traverses the entry points to
depth 10, and returns an ActionProfile with: expensive ops,
state mutations, redundancy (ops called >1x), unresolved calls,
and total load estimate.
AST walker: build_call_graph() parses each src/*.py file,
extracts FunctionDef/AsyncFunctionDef nodes, indexes self.X
assignments as StateMutation, and classifies each Call as
file_io / network / ast_parse / json_io / pickle / deep_copy
or non-expensive.
MMA worker spawn is intentionally absent from ACTIONS (per
user: 'keeping that cold until the 1:1 discussion UX is
dogfooded in a few projects').
23 unit + integration tests passing on synthetic + real src/."
- Step 2.3.2: Attach git note + update state.toml (phase_2 = completed; current_phase = 3)
- Step 2.3.3: Conductor - User Manual Verification
Phase 3: Output (JSON / markdown / Mermaid)
Files: src/code_path_audit.py, tests/test_code_path_audit.py.
This phase is one commit. Three sub-tasks (one per output format).
Task 3.1: JSON serializer
- Step 3.1.1: Add the to_json and dump_json functions to src/code_path_audit.py
Append:
def _to_jsonable(obj: object) -> object:
if isinstance(obj, (str, int, float, bool, type(None))):
return obj
if isinstance(obj, (list, tuple, set)):
return [_to_jsonable(x) for x in obj]
if isinstance(obj, dict):
return {str(k): _to_jsonable(v) for k, v in obj.items()}
if hasattr(obj, "__dataclass_fields__"):
return {k: _to_jsonable(getattr(obj, k)) for k in obj.__dataclass_fields__}
return repr(obj)
def to_json(profile: ActionProfile) -> str:
return json.dumps(_to_jsonable(profile), indent=2)
def dump_json(profile: ActionProfile, path: str) -> None:
Path(path).write_text(to_json(profile), encoding="utf-8")
- Step 3.1.2: Add tests for JSON output
Append to tests/test_code_path_audit.py:
import json
from src.code_path_audit import trace_action, ACTIONS, to_json
def test_to_json_round_trip() -> None:
profile = trace_action(ACTIONS["ai_message_lifecycle"], max_depth=3)
js = to_json(profile)
parsed = json.loads(js)
assert parsed["action"]["name"] == "ai_message_lifecycle"
assert "call_graph" in parsed
assert "expensive_ops" in parsed
assert "state_mutations" in parsed
def test_to_json_serializes_sets_as_lists() -> None:
from src.code_path_audit import CallGraph
cg = CallGraph()
cg.add_edge("a", "b")
js = to_json(cg)
parsed = json.loads(js)
assert isinstance(parsed["nodes"], dict)
assert isinstance(parsed["edges"]["a"], list)
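Since the call graph stores edges as sets, converting them to lists in arbitrary order could make two runs over identical source produce different JSON, which adds noise to committed reports. A minimal sketch of a deterministic variant follows. The Graph dataclass and this to_jsonable function are stand-ins written for the example, and sorting assumes the set elements are mutually comparable (strings here).

```python
import json
from dataclasses import dataclass, field, fields, is_dataclass

@dataclass
class Graph:  # stand-in for the plan's CallGraph, reduced to one field
    edges: dict[str, set[str]] = field(default_factory=dict)

def to_jsonable(obj: object) -> object:
    """Convert nested dataclasses, dicts and sets into JSON-ready values."""
    if isinstance(obj, (str, int, float, bool, type(None))):
        return obj
    if isinstance(obj, (set, frozenset)):
        return sorted(to_jsonable(x) for x in obj)  # stable order for sets
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(x) for x in obj]
    if isinstance(obj, dict):
        return {str(k): to_jsonable(v) for k, v in obj.items()}
    if is_dataclass(obj) and not isinstance(obj, type):
        return {f.name: to_jsonable(getattr(obj, f.name)) for f in fields(obj)}
    return repr(obj)

g = Graph(edges={"a": {"c", "b"}})
text = json.dumps(to_jsonable(g), sort_keys=True)
```

Passing sort_keys=True to json.dumps orders dictionary keys as well, so the whole document becomes diff-friendly.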
Task 3.2: Markdown renderer
- Step 3.2.1: Add the to_markdown function to src/code_path_audit.py
Append:
def to_markdown(profile: ActionProfile) -> str:
lines: list[str] = [
f"# Action Profile: {profile.action.name}",
"",
f"**Description:** {profile.action.description}",
"",
f"**Total load estimate:** {profile.total_load_estimate:,}",
f"**Expensive ops count:** {len(profile.expensive_ops)}",
f"**State mutations count:** {len(profile.state_mutations)}",
f"**Redundancy:** {len(profile.redundancy)} ops called >1x",
f"**Unresolved calls:** {len(profile.unresolved_calls)}",
"",
"## Expensive Operations",
"",
"| Callee | Cost class | Weight | Line |",
"|--------|------------|--------|------|",
]
for op in sorted(profile.expensive_ops, key=lambda o: -o.weight)[:50]:
lines.append(f"| `{op.callee}` | {op.cost_class} | {op.weight:,} | {op.line} |")
if not profile.expensive_ops:
lines.append("| _(none)_ | - | - | - |")
lines += ["", "## State Mutations (first 50)", "", "| Target | Kind | Line |", "|--------|------|------|"]
for m in profile.state_mutations[:50]:
lines.append(f"| `{m.target}` | {m.kind} | {m.line} |")
if not profile.state_mutations:
lines.append("| _(none)_ | - | - |")
lines += ["", "## Redundancy (ops called >1x)", ""]
if profile.redundancy:
for op, count in sorted(profile.redundancy, key=lambda x: -x[1])[:20]:
lines.append(f"- `{op}` called {count}x")
else:
lines.append("_(none)_")
lines += ["", "## Unresolved Calls", ""]
if profile.unresolved_calls:
for c in profile.unresolved_calls[:20]:
lines.append(f"- `{c}`")
else:
lines.append("_(none)_")
return "\n".join(lines)
- Step 3.2.2: Add tests for markdown output
Append to tests/test_code_path_audit.py:
from src.code_path_audit import to_markdown
def test_to_markdown_contains_action_name() -> None:
profile = trace_action(ACTIONS["ai_message_lifecycle"], max_depth=3)
md = to_markdown(profile)
assert "# Action Profile: ai_message_lifecycle" in md
assert "Total load estimate:" in md
assert "## Expensive Operations" in md
assert "## State Mutations" in md
Task 3.3: Mermaid generator
- Step 3.3.1: Add the to_mermaid function to src/code_path_audit.py
Append:
def to_mermaid(profile: ActionProfile, max_depth: int = 5) -> str:
return profile.call_graph.render_mermaid(
profile.action.entry_points[0] if profile.action.entry_points else "<none>",
max_depth=max_depth,
)
- Step 3.3.2: Add tests for Mermaid output
Append to tests/test_code_path_audit.py:
from src.code_path_audit import to_mermaid
def test_to_mermaid_basic() -> None:
profile = trace_action(ACTIONS["ai_message_lifecycle"], max_depth=3)
mmd = to_mermaid(profile, max_depth=3)
assert "graph LR" in mmd or "<none>" in mmd
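render_mermaid builds node ids by replacing dots with underscores, which leaves any other non-identifier character in the id. A more defensive id function could replace every character outside letters, digits and underscore, as in this sketch. The names mermaid_id and render and the sample edge are hypothetical, and distinct names can still collide after sanitising (a.b and a_b map to the same id).

```python
import re

def mermaid_id(name: str) -> str:
    """Replace every character that is not a letter, digit or underscore."""
    return re.sub(r"[^0-9A-Za-z_]", "_", name)

def render(edges: list[tuple[str, str]]) -> str:
    """Render edges as a left-to-right Mermaid graph with quoted labels."""
    lines = ["graph LR"]
    for caller, callee in sorted(set(edges)):
        lines.append(
            f'  {mermaid_id(caller)}["{caller}"] --> {mermaid_id(callee)}["{callee}"]'
        )
    return "\n".join(lines)

# Made-up edge; the hyphen shows a character that dot replacement alone would keep.
diagram = render([("a.b", "c-d.e")])
```

The original name stays visible in the quoted label, so only the id is altered.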
Task 3.4: Commit Phase 3
- Step 3.4.1: Stage and commit
git add src/code_path_audit.py tests/test_code_path_audit.py
git commit -m "feat(audit): add JSON / markdown / Mermaid output
to_json / to_markdown / to_mermaid functions serialize an
ActionProfile. JSON is round-trippable. Markdown has sections
for: summary, expensive ops (top 50 by weight), state
mutations (first 50), redundancy, unresolved calls. Mermaid
is rendered from the subgraph rooted at the first entry
point.
27 tests passing total."
- Step 3.4.2: Attach git note + update state.toml (phase_3 = completed; current_phase = 4)
- Step 3.4.3: Conductor - User Manual Verification
Phase 4: MCP tool + CLI surface
Files: src/code_path_audit.py (extends), opencode.json (modify if MCP registration needed).
Task 4.1: CLI
- Step 4.1.1: Add the CLI module-level entry to src/code_path_audit.py
Append:
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser(description="Code path audit tool.")
parser.add_argument("--action", required=True, choices=list(ACTIONS.keys()))
parser.add_argument("--depth", type=int, default=10)
parser.add_argument("--output-dir", default="docs/reports/code_path_audit")
parser.add_argument("--date", default=None, help="ISO date; defaults to today")
args = parser.parse_args()
profile = trace_action(ACTIONS[args.action], max_depth=args.depth)
profile.markdown = to_markdown(profile)
profile.mermaid = to_mermaid(profile, max_depth=min(5, args.depth))
from datetime import date
date_str = args.date or date.today().isoformat()
out_dir = Path(args.output_dir) / date_str
(out_dir / "actions").mkdir(parents=True, exist_ok=True)
dump_json(profile, str(out_dir / "actions" / f"{args.action}.json"))
(out_dir / "actions" / f"{args.action}.md").write_text(profile.markdown, encoding="utf-8")
(out_dir / "actions" / f"{args.action}.mmd").write_text(profile.mermaid, encoding="utf-8")
print(f"Wrote {out_dir / 'actions' / args.action}.{{json,md,mmd}}")
- Step 4.1.2: Add a CLI smoke test
Append to tests/test_code_path_audit.py:
def test_cli_help() -> None:
 import subprocess
 import sys
 # sys.executable keeps the subprocess on the same interpreter (and venv) as pytest.
 result = subprocess.run(
  [sys.executable, "-m", "src.code_path_audit", "--help"],
  capture_output=True, text=True, timeout=10,
 )
 assert result.returncode == 0
 assert "--action" in result.stdout
Task 4.2: MCP tool registration
- Step 4.2.1: Add the MCP-tool function to src/code_path_audit.py
Append:
def code_path_audit(action_name: str, max_depth: int = 10) -> dict:
"""MCP tool: trace an action and return its profile as a dict."""
if action_name not in ACTIONS:
raise ValueError(f"Unknown action: {action_name}. Known: {list(ACTIONS.keys())}")
profile = trace_action(ACTIONS[action_name], max_depth=max_depth)
profile.markdown = to_markdown(profile)
profile.mermaid = to_mermaid(profile, max_depth=min(5, max_depth))
return _to_jsonable(profile)
- Step 4.2.2: Add the tool to src/models.py (or wherever MCP tools are registered)
Check src/models.py:100 and src/models.py:144 (per the spec's audit finding). If the project has an MCP_TOOLS or similar registry, add code_path_audit there. If not, add it where appropriate. (This step requires the implementer to find the right registration point; the tests in Step 4.2.3 only verify that the function exists and behaves, not that it is registered.)
- Step 4.2.3: Add a test for the MCP-tool function
Append to tests/test_code_path_audit.py:
def test_code_path_audit_mcp_function() -> None:
    from src.code_path_audit import code_path_audit
    result = code_path_audit("ai_message_lifecycle", max_depth=3)
    assert result["action"]["name"] == "ai_message_lifecycle"
    assert "expensive_ops" in result

def test_code_path_audit_mcp_unknown_action_raises() -> None:
    from src.code_path_audit import code_path_audit
    import pytest
    with pytest.raises(ValueError, match="Unknown action"):
        code_path_audit("mma_worker_spawn", max_depth=3)
Task 4.3: Commit Phase 4
- Step 4.3.1: Stage and commit
git add src/code_path_audit.py tests/test_code_path_audit.py src/models.py opencode.json
git commit -m "feat(audit): add MCP tool + CLI surface
CLI: python -m src.code_path_audit --action <name> [--depth N]
[--output-dir DIR] [--date YYYY-MM-DD]. Writes JSON + MD + MMD
to <output-dir>/<date>/actions/<name>.{json,md,mmd}.
MCP tool: code_path_audit(action_name, max_depth=10) -> dict.
Raises ValueError on unknown action. Registered alongside the
other MCP tools in src/models.py.
30 tests passing total."
- Step 4.3.2: Attach git note + update state.toml (phase_4 = completed; current_phase = 5)
- Step 4.3.3: Conductor - User Manual Verification
Phase 5: Run audit on 3 actions; commit report
Files: docs/reports/code_path_audit/2026-06-07/ (create + populate).
This phase is one commit. The deliverable IS the report.
Task 5.1: Run audits + produce the report
- Step 5.1.1: Run the audit on the 3 actions
uv run python -m src.code_path_audit --action ai_message_lifecycle --depth 10 --date 2026-06-07
uv run python -m src.code_path_audit --action discussion_save_load --depth 10 --date 2026-06-07
uv run python -m src.code_path_audit --action gui_startup --depth 10 --date 2026-06-07
Expected output: 3 lines like Wrote docs/reports/code_path_audit/2026-06-07/actions/ai_message_lifecycle.{json,md,mmd}.
- Step 5.1.2: Generate the full call graph + indexes
uv run python -c "
import json
from pathlib import Path
from src.code_path_audit import build_call_graph, build_expensive_ops_index, build_state_mutations_index, _to_jsonable
cg = build_call_graph()
out = Path('docs/reports/code_path_audit/2026-06-07')
out.mkdir(parents=True, exist_ok=True)
(out / 'call_graph.json').write_text(json.dumps(_to_jsonable(cg), indent=2), encoding='utf-8')
(out / 'expensive_ops.json').write_text(json.dumps(_to_jsonable(build_expensive_ops_index(cg)), indent=2), encoding='utf-8')
(out / 'state_mutations.json').write_text(json.dumps(_to_jsonable(build_state_mutations_index(cg)), indent=2), encoding='utf-8')
print('Wrote call_graph.json, expensive_ops.json, state_mutations.json')
"
Expected: Wrote call_graph.json, expensive_ops.json, state_mutations.json
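Before writing the summary by hand, it may help to confirm that Steps 5.1.1 and 5.1.2 produced every generated file. The check below is a sketch under stated assumptions: the helper names `generated_artifacts` and `missing_artifacts` are hypothetical, and the file list is taken from the commands above (3 index files plus 3 files per action).

```python
from pathlib import Path

ACTION_NAMES = ("ai_message_lifecycle", "discussion_save_load", "gui_startup")
INDEX_FILES = ("call_graph.json", "expensive_ops.json", "state_mutations.json")


def generated_artifacts(report_dir: Path) -> list[Path]:
    """List every file that Steps 5.1.1 and 5.1.2 are expected to write."""
    paths = [report_dir / name for name in INDEX_FILES]
    for action in ACTION_NAMES:
        for ext in ("json", "md", "mmd"):
            paths.append(report_dir / "actions" / f"{action}.{ext}")
    return paths


def missing_artifacts(report_dir: Path) -> list[Path]:
    """Return the expected files that do not exist yet."""
    return [p for p in generated_artifacts(report_dir) if not p.is_file()]


if __name__ == "__main__":
    report_dir = Path("docs/reports/code_path_audit/2026-06-07")
    for path in missing_artifacts(report_dir):
        print(f"MISSING: {path}")
```

No output means all 12 generated files are present. The two hand-written files (summary.md and optimization_candidates.md) are deliberately not checked here because they are produced in later steps.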
- Step 5.1.3: Produce summary.md
The implementer writes docs/reports/code_path_audit/2026-06-07/summary.md by hand, synthesizing the 3 per-action reports. The summary structure (from the spec):
# Code Path Audit — 2026-06-07
## Top-level summary
- AI message lifecycle: N expensive ops, total load X
- Discussion save/load: N expensive ops, total load X
- GUI startup: N expensive ops, total load X
## Top optimization candidates (across all 3 actions)
1. (caller, callee) — current cost X, proposed reduction Y, effort Z
2. ...
## Caveats
- The cost model is heuristic; EXPENSIVE_THRESHOLD = 40_000.
- AST walking misses dynamic patterns (eval, getattr, decorator dispatch).
- The runtime-profiling follow-up (pipeline_runtime_profiling_20260607) will calibrate.
- MMA worker spawn is OUT of scope (user: cold until 1:1 discussion UX dogfooded).
The actual content is filled in by the implementer based on the 3 per-action reports.
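The "N expensive ops, total load X" figures can be read off the per-action JSON files rather than counted by hand. The sketch below assumes that `expensive_ops` is a list of objects and that each object carries a numeric `cost` field; the `cost` name and the helper names are assumptions, so adjust them to whatever schema `_to_jsonable` actually emits.

```python
import json
from pathlib import Path


def summary_line(profile: dict) -> str:
    """Format one 'Top-level summary' bullet from a per-action profile dict."""
    ops = profile["expensive_ops"]
    # ASSUMPTION: each op is a dict with a numeric "cost" field.
    total_load = sum(op.get("cost", 0) for op in ops)
    name = profile["action"]["name"]
    return f"- {name}: {len(ops)} expensive ops, total load {total_load}"


def summary_lines(actions_dir: Path) -> list[str]:
    """Build one bullet per actions/<name>.json file, in filename order."""
    lines = []
    for path in sorted(actions_dir.glob("*.json")):
        profile = json.loads(path.read_text(encoding="utf-8"))
        lines.append(summary_line(profile))
    return lines


# Placeholder profile for illustration only; these are not real audit numbers.
example = {
    "action": {"name": "gui_startup"},
    "expensive_ops": [{"cost": 50_000}, {"cost": 45_000}],
}
print(summary_line(example))
# -> - gui_startup: 2 expensive ops, total load 95000
```

The generated bullets are only a starting point; the candidates and caveats sections still need the implementer's judgment.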
- Step 5.1.4: Produce optimization_candidates.md
A ranked list extracted from summary.md. Each candidate: location, current cost, proposed change, expected reduction, effort, priority.
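The candidate fields listed above map naturally onto a small dataclass, with priority derived from the sort order. This is a sketch, not the spec's format: the `Candidate` class, the ranking rule (largest expected reduction first, lower effort breaking ties), and the table layout are all assumptions, and the sample rows are placeholders rather than audit findings.

```python
from dataclasses import dataclass

EFFORT_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Candidate:
    location: str            # e.g. "caller -> callee"
    current_cost: int
    proposed_change: str
    expected_reduction: int
    effort: str              # one of the EFFORT_RANK keys


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Sort by expected reduction (descending), then by effort (ascending)."""
    return sorted(candidates, key=lambda c: (-c.expected_reduction, EFFORT_RANK[c.effort]))


def candidates_to_markdown(candidates: list[Candidate]) -> str:
    """Render the ranked candidates as a markdown table; priority is the rank."""
    lines = [
        "| Priority | Location | Current cost | Proposed change | Expected reduction | Effort |",
        "|---|---|---|---|---|---|",
    ]
    for priority, c in enumerate(rank(candidates), start=1):
        lines.append(
            f"| {priority} | {c.location} | {c.current_cost} | {c.proposed_change} "
            f"| {c.expected_reduction} | {c.effort} |"
        )
    return "\n".join(lines)


# Placeholder rows for illustration only.
placeholders = [
    Candidate("func_a -> func_b", 60_000, "cache result", 20_000, "medium"),
    Candidate("func_c -> func_d", 90_000, "defer to background", 50_000, "high"),
    Candidate("func_e -> func_f", 45_000, "drop duplicate call", 20_000, "low"),
]
print(candidates_to_markdown(placeholders))
```

Sorting by expected reduction alone is a simple default; a reduction-per-effort ratio would be an equally defensible rule once the runtime-profiling follow-up supplies measured costs.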
Task 5.2: Commit Phase 5
- Step 5.2.1: Stage and commit the report
git add docs/reports/code_path_audit/2026-06-07/
git commit -m "docs(audit): run code path audit on 3 actions; commit report
Generated artifacts under docs/reports/code_path_audit/2026-06-07/:
- call_graph.json (full call graph of src/)
- expensive_ops.json (per-function expensive-op index)
- state_mutations.json (per-function state-mutation index)
- actions/ai_message_lifecycle.{json,md,mmd}
- actions/discussion_save_load.{json,md,mmd}
- actions/gui_startup.{json,md,mmd}
- summary.md (cross-action summary + top candidates)
- optimization_candidates.md (ranked list)
EXPENSIVE_THRESHOLD = 40_000. Cost model is heuristic; the
pipeline_runtime_profiling_20260607 follow-up will calibrate.
MMA worker spawn is intentionally absent from ACTIONS."
- Step 5.2.2: Attach git note + update state.toml (phase_5 = completed; current_phase = 6; update verification booleans)
- Step 5.2.3: Conductor - User Manual Verification (per workflow.md)
Ask the user to review the report. This is the deliverable — they need to confirm the optimization candidates make sense.
Phase 6: tracks.md update
Files: conductor/tracks.md (modify).
- Step 6.1: Add the track entry to conductor/tracks.md
Open conductor/tracks.md. Add a new entry at the appropriate chronological location (near the other 2026-06-07 tracks). Use the format from recent tracks:
- [x] **Track: Code Path & Data Pipeline Audit** `[checkpoint: <last_commit_sha>]`
*Link: [./tracks/code_path_audit_20260607/](./tracks/code_path_audit_20260607/), Spec: [./tracks/code_path_audit_20260607/spec.md](./tracks/code_path_audit_20260607/spec.md), Plan: [./tracks/code_path_audit_20260607/plan.md](./tracks/code_path_audit_20260607/plan.md)*
*Goal: Build `src/code_path_audit.py`, a static-analysis tool that audits 3 major actions (AI message, save/load, GUI startup) for expensive ops, redundant calls, and pipelining candidates. 7 cost classes, 5 mutation kinds, EXPENSIVE_THRESHOLD = 40_000. Output: JSON + MD + Mermaid in `docs/reports/code_path_audit/2026-06-07/`. MMA worker spawn is OUT of scope (user: cold). Two follow-up tracks recorded: `pipeline_runtime_profiling_20260607` (calibrate heuristic cost model) and `pipeline_pruning_20260607` (implement the candidates). 30 tests passing.*
Replace <last_commit_sha> with the SHA from the report commit in Phase 5.
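The SHA lookup and substitution can be scripted to avoid copy-paste mistakes. The sketch below is an assumption about workflow, not something the plan prescribes: `last_commit_sha` and `fill_checkpoint` are hypothetical helpers, and whether tracks.md expects a short or a full SHA should be checked against the existing entries.

```python
import re
import subprocess

PLACEHOLDER = "<last_commit_sha>"


def last_commit_sha(path: str) -> str:
    """Return the full SHA of the most recent commit that touched `path`."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%H", "--", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def fill_checkpoint(entry: str, sha: str) -> str:
    """Replace the placeholder in a tracks.md entry with a validated SHA."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", sha):
        raise ValueError(f"Not a plausible git SHA: {sha!r}")
    if PLACEHOLDER not in entry:
        raise ValueError("Entry has no <last_commit_sha> placeholder")
    return entry.replace(PLACEHOLDER, sha)


# Demonstration with a made-up SHA; in practice pass
# last_commit_sha("docs/reports/code_path_audit/2026-06-07/").
print(fill_checkpoint("`[checkpoint: <last_commit_sha>]`", "abc1234"))
# -> `[checkpoint: abc1234]`
```

Validating the SHA before substitution catches the case where `git log` returns an empty string because the report path has not been committed yet.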
- Step 6.2: Commit the tracks.md update
git add conductor/tracks.md
git commit -m "conductor(tracks): mark Code Path Audit track as complete
Phase 6 verification complete: 6 atomic commits landed, the
audit was run on the 3 actions, the report is in
docs/reports/code_path_audit/2026-06-07/, and all 30 unit and
integration tests pass.
MMA worker spawn is out of scope (per user: cold). Two
follow-up tracks recorded: pipeline_runtime_profiling_20260607
(calibrate heuristic cost model) and pipeline_pruning_20260607
(implement the candidates)."
- Step 6.3: Attach git note + update state.toml (phase_6 = completed; current_phase = "complete"; status = "completed")
- Step 6.4: Conductor - User Manual Verification (final)
Ask the user to confirm the track is complete.
Summary
- 6 phases, 6 atomic commits, 30 tests.
- 3 per-action reports + 1 cross-action summary + 1 ranked candidates list.
- 2 follow-up tracks recorded (runtime profiling + pruning).
- No new pip dependencies; pure stdlib (ast, json, pathlib, dataclasses).
- Read-only on src/: the audit doesn't modify existing code, apart from the MCP tool registration in Step 4.2.2. The new files are src/code_path_audit.py + tests/test_code_path_audit.py + the report under docs/reports/code_path_audit/2026-06-07/.
- Reusable: re-run after any src/ change to see drift. The 3 actions are declared once in ACTIONS; adding a 4th is one Action(...) declaration.
- Calibration target: pipeline_runtime_profiling_20260607 will use the existing src/performance_monitor.py to measure real costs and recalibrate EXPENSIVE_THRESHOLD + COST_CLASS_WEIGHTS.
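Drift between two audit runs can be checked by comparing their index files. The sketch below assumes that expensive_ops.json is a JSON object keyed by function name (it is described above as a per-function index); the `index_drift` helper and that key layout are assumptions, not the tool's documented output.

```python
import json
from pathlib import Path


def index_drift(old: dict, new: dict) -> dict:
    """Compare two per-function indexes and report added, removed, and changed keys."""
    old_keys, new_keys = set(old), set(new)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "changed": sorted(k for k in old_keys & new_keys if old[k] != new[k]),
    }


def load_index(path: Path) -> dict:
    """Load one index file written by the audit."""
    return json.loads(path.read_text(encoding="utf-8"))


# Placeholder indexes for illustration only; real runs would use
# load_index(old_dir / "expensive_ops.json") and its newer counterpart.
before = {"mod.load": ["file_io"], "mod.render": ["ui"]}
after = {"mod.load": ["file_io", "json_parse"], "mod.save": ["file_io"]}
print(index_drift(before, after))
# -> {'added': ['mod.save'], 'removed': ['mod.render'], 'changed': ['mod.load']}
```

Because the comparison is plain equality on the JSON values, any schema change in the tool itself will show up as every function being "changed", so compare only reports produced by the same tool version.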