# Track: Code Path & Data Pipeline Audit

**Status:** Spec approved 2026-06-07; revised 2026-06-08 with post-4-tracks timing and 5-source framing
**Initialized:** 2026-06-07
**Owner:** Tier 2 Tech Lead
**Priority:** Medium (foundational; enables follow-up pruning track)

> **Revision note (2026-06-08).** The user specified that this audit should run *after* the 4 foundational tracks complete (`qwen_llama_grok_integration_20260606`, `data_oriented_error_handling_20260606`, `data_structure_strengthening_20260606`, `mcp_architecture_refactor_20260606`). The 4 tracks will significantly reshape `src/ai_client.py`, `src/mcp_client.py`, `src/app_controller.py`, and `src/type_aliases.py` — running the audit on the pre-refactor code would produce a report that's stale on day 1. The post-4-tracks timing ensures the audit grounds optimization decisions for the *resulting* architecture, not the pre-refactor one. See §"Timing" below.

---

## Overview

Build `src/code_path_audit.py` — a data-oriented static-analysis tool that audits the 3 major actions (AI message lifecycle, discussion save/load, GUI startup) for expensive operations, redundant calls, and pipelining candidates. The output (custom postfix `.dsl` data + markdown + Mermaid + prefix tree text) is the artifact that informs pipeline-pruning decisions; the actual code changes are a follow-up track (`pipeline_pruning_20260607`).

Per the user's framing: "anything that can even remotely smell as an expensive bulk action or major action that takes more than 10-40 microseconds." The audit focuses on **expensive** operations (file I/O, network, AST parsing, big loops, anything that smells like a bulk action) inside the 3 actions — not on every state mutation. The cost model is heuristic, calibrated by a runtime-profiling follow-up (`pipeline_runtime_profiling_20260607`) that catches the cases static analysis can't resolve (C-extension cost, import cost, JIT effects, decorator-driven dispatch).

The MMA worker spawn action is **out of scope** for this track (per user: "keeping that cold for a while until I like the main ux loop with ai in a discussion fully dogfooded").

## Timing (post-4-tracks)

This track is intentionally **deferred** until *after* the 4 foundational tracks ship:

1. `qwen_llama_grok_integration_20260606` — adds 3 vendors (`_send_qwen`, `_send_llama`, `_send_grok`) and refactors `_send_minimax` to use the shared `send_openai_compatible()` helper. Modifies `src/ai_client.py`, `src/openai_compatible.py` (new), `src/vendor_capabilities.py` (new).
2. `data_oriented_error_handling_20260606` — refactors `ai_client._send_` to return `Result[str]`, modifies `mcp_client.py` (30+ sites), `rag_engine.py` (Result returns).
3. `data_structure_strengthening_20260606` — adds `src/type_aliases.py` with 10 TypeAliases, replaces 345 weak-type sites across 6 files.
4. `mcp_architecture_refactor_20260606` — splits `src/mcp_client.py` (2,205 lines → 6 sub-MCPs + 1 external), adds `src/mcp_client_legacy.py` for backward compat.

Running the audit on the **pre-refactor** `src/` would produce a report that's stale on day 1. The post-4-tracks timing ensures:

- The audit's data grounds optimization decisions for the *resulting* architecture (post-Fleury-style "effective codepaths" and "ECS archetype tables" if the 4 tracks are implemented with the data-oriented philosophy).
- The `pipeline_pruning_20260607` follow-up has the *right* candidates to optimize — the 4 tracks will move the expensive ops around, and pruning the wrong ones wastes work.
- The runtime-profiling follow-up (`pipeline_runtime_profiling_20260607`) measures the *new* code paths, not the old ones.

**Pre-flight check (verifies the 4-tracks baseline before this track starts):** confirm that all 4 tracks are marked `[x]` completed in `conductor/tracks.md`. If any of the 4 are still `[~]` in-progress, this track is blocked — the audit would catch the in-progress state as drift.
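The pre-flight check above could be automated along the following lines. This is a minimal sketch, not the project's actual tooling: it assumes each track appears in `conductor/tracks.md` on a single line that contains both a `[x]` / `[~]` / `[ ]` marker and the track ID, which the spec does not confirm. The names `FOUNDATIONAL_TRACKS`, `track_statuses`, and `blocking_tracks` are hypothetical.

```python
from __future__ import annotations

import re

# The 4 prerequisite track IDs, copied from the Timing section.
FOUNDATIONAL_TRACKS = (
    "qwen_llama_grok_integration_20260606",
    "data_oriented_error_handling_20260606",
    "data_structure_strengthening_20260606",
    "mcp_architecture_refactor_20260606",
)

# ASSUMPTION: status markers look like "[x]", "[~]" or "[ ]" and sit on the
# same line as the track ID. Adjust if tracks.md uses a different layout.
MARKER = re.compile(r"\[([ x~])\]")


def track_statuses(tracks_md: str) -> dict[str, str | None]:
    """Map each foundational track to its marker character (None if not found)."""
    statuses: dict[str, str | None] = {t: None for t in FOUNDATIONAL_TRACKS}
    for line in tracks_md.splitlines():
        match = MARKER.search(line)
        if match is None:
            continue
        for track in FOUNDATIONAL_TRACKS:
            if track in line:
                statuses[track] = match.group(1)
    return statuses


def blocking_tracks(tracks_md: str) -> list[str]:
    """Return the tracks that are missing or not marked [x]; empty means go."""
    return [t for t, s in track_statuses(tracks_md).items() if s != "x"]
```

A caller would pass in the file contents, for example `blocking_tracks(Path("conductor/tracks.md").read_text(encoding="utf-8"))`, and treat a non-empty result as "this track is blocked".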
## Analytical Framing (5-source lens)

The 5 sources loaded into context for the post-4-tracks audit collectively reframe *what* to look for in the 3 actions. The audit's static cost model and pipeline-pruning recommendations should be informed by:

| Source | Lens the audit inherits |
|---|---|
| [Ryan Fleury, "A Taxonomy of Computation Shapes"](https://www.dgtlgrove.com/p/a-taxonomy-of-computation-shapes) (Feb 2023) | The 6 shapes: instruction, codepath, wide codepath, codecycle, wide codecycle, codecycle graph. The audit's `trace_action` is a codepath visualization; the `redundancy` (call_count > 1) field detects **wide codepaths** that could be split into parallel sub-codepaths. |
| [Ryan Fleury, "The Codepath Combinatoric Explosion"](https://www.dgtlgrove.com/p/the-codepath-combinatoric-explosion) (Apr 2023) | The "effective codepath" concept. The audit's `pipelining_candidates` field detects codepaths that *could be defused* (multiple real codepaths collapsed into 1 effective codepath via nil sentinels, generational handles, or immediate-mode APIs). The `redundancy` field is the *first indicator* of defusing opportunities. |
| [Casey Muratori, "The Big OOPs: Anatomy of a Thirty-Five-Year Mistake" (BSC 2025)](https://youtu.be/wo84LFzx5nI) | The 35-year-historical indictment of compile-time domain hierarchies. The audit's per-function `state_mutations` index reveals whether a function is in the *system* pattern (mutates component-like data, not entity state) or the *entity-hierarchy* pattern (mutates a single object's identity, where the cost compounds per type). Functions in the latter pattern are the *highest-priority* refactor targets — they may need to be split into components + systems. |
| [Andrew Reece, "Assuming as Much as Possible" (BSC 2025)](https://www.youtube.com/watch?v=i-h95QIGchY) | The "assume as much as possible" engineering discipline. The audit's `expensive_ops` index, for any function that calls a general-purpose primitive (e.g., `json.dumps`, `Path.read_text`, `ast.parse`), should ask: **"can this caller assume a smaller input domain and use a specialized primitive instead?"** A function that calls `json.dumps` 50 times per action with 1KB payloads each may be replaceable by a function that calls a domain-specific serializer once with a 50KB payload. |
| User's chunk-ideation archive (May 2026) | The "fixed-size slices" + "ECS archetype tables" pattern. The audit's per-function calls that operate on lists/arrays should be flagged if they: (a) don't have a chunk-aware variant, (b) are in a hot path, and (c) operate on data whose shape is uniform enough to chunk. Functions that match all 3 are the **prime candidates** for `pipeline_pruning_20260607` — chunkification is a known pattern with bounded risk. |

**Concrete audit-time heuristics** that emerge from this framing:

- **Effective-codepath count:** when a function has 3+ branches that all do roughly the same thing with different inputs, the audit should report "this is N real codepaths behaving as 1 effective codepath — could be defused with a nil sentinel or generational handle." The runtime-profiling follow-up measures the actual savings.
- **Entity-hierarchy fingerprint:** when a function's `state_mutations` list has > 3 writes to a single `self.X` with a `type` discriminator, the audit should report "this function is operating on entity-hierarchy state; consider ECS split into components + systems." A *concrete Manual Slop example* the audit should catch: any function that does `if self.active_ticket.kind == TicketKind.X:` and then mutates multiple fields.
- **Assumed-too-much detector:** when a function calls `ast.parse` (or any `tree_sitter.*`) on a file that *could be assumed* to be already-parsed (because the file is in the context composition and the `aggregate.py` pipeline has already done it), the audit should report "this is re-parsing data that was already parsed upstream; consider memoizing or threading the parsed AST through." This is the "assume as much as possible" pattern at the data-passing level.
- **Chunkification candidates:** when a function loops over a `list[dict]` with a known uniform shape (heuristic: all dicts have the same key set), the audit should report "consider chunkifying — uniform data, hot path, no chunk awareness." The user has explicit code (`docs/ideation/ed_chunk_data_structures_20260523.md`) for the chunk pattern, so the audit's optimization candidates can cite it.

These heuristics are *guidance for the audit's report interpretation* — they don't change the audit's static cost model (which is data-grounded in the existing `EXPENSIVE_THRESHOLD` + per-class weights). They shape how the Tier 2 Tech Lead and the user interpret the report.

## Current State Audit (as of `ca781543`)

`src/` has 61 `.py` files (27,447 total lines; 23,845 code lines). The call graph is non-trivial; per-action traversal is what makes the analysis tractable.

### Already Implemented (DO NOT re-implement; KEEP / build on)

1. **`src/mcp_client.py:934-992` — `derive_code_path(target, max_depth=5)`.** A single-symbol recursive call tracer with text output. Doesn't render multi-action graphs, doesn't track mutations, doesn't measure cost. The new tool is the multi-action + mutation + cost version of this primitive. **Build on this:** lift the AST traversal logic and `trace()` recursion pattern into `code_path_audit.py`.
2. **`scripts/audit_main_thread_imports.py`** — static CI gate for import-time purity.
   Different concern (startup-time import cost), but its AST-walking pattern is the model for `code_path_audit.py`'s implementation.
3. **`src/performance_monitor.py`** — runtime profiling with `monitor.scope("name")` and per-component hit counts + latencies. Used at runtime; the follow-up `pipeline_runtime_profiling_20260607` track will use it to calibrate the heuristic cost model.
4. **`conductor/archive/code_path_analysis_20260507/`** — prior manual audit + `PIPELINE_ANALYSIS.md` + Mermaid diagrams for the major pipelines. Manual effort, no reusable tool. New track is the data-grounded successor.
5. **`conductor/archive/ai_interaction_call_graph_20260507/`** — sequence diagram for the AI loop. New track supersedes this for the 3 actions in scope.
6. **SDM docstrings** (`[C: ...]` / `[M: ...]` tags in `src/*.py` docstrings) — pre-computed caller/mutation info. The new audit tool will be a more rigorous version of what SDM already documents ad-hoc.

### Gaps to Fill (this track's scope)

- A static call-graph builder for all of `src/` (multi-action, depth-configurable, machine-readable output).
- A state-mutation index per function (5 mutation kinds: `attr_write`, `container_mutate`, `file_write`, `ipc_emit`, `global_write`).
- An expensive-ops index (7 cost classes, with a heuristic data-size estimate).
- A per-action traversal API (`trace_action(action, max_depth=10) -> ActionProfile`).
- An output suite: custom postfix `.dsl` data files + markdown summaries + Mermaid per-action call graphs + prefix-tree text view.
- A CLI (`python -m src.code_path_audit --action `) and an MCP tool (`code_path_audit(action_name, max_depth)`).
- The actual audit run on the 3 actions, with the report committed to `docs/reports/code_path_audit/2026-06-07/`.

## Goals

1. **Produce a queryable artifact.** The custom postfix `.dsl` output is the source of truth; markdown + Mermaid + prefix-tree text are for human review. Re-run after any `src/` change to see drift.
2. **Surface the top-N optimization candidates per action.** The `summary.md` ranks candidates by potential data-transform load reduction. This is what the user will use to decide which pruning/optimization work to do next.
3. **Data-grounded design.** The audit's data structure is the spec; the heuristics and the threshold are module-level constants tunable from one place.
4. **Reusable across actions.** The `trace_action` API takes any `Action` (entry point + description). Adding a 4th action (e.g., MMA worker spawn, when it's no longer cold) is one `Action(...)` declaration.
5. **Surface calibration gaps clearly.** When the static heuristic can't resolve a call (C-extension, decorator-driven dispatch, `getattr` magic), the report flags it as "unresolved" so the runtime-profiling follow-up targets it.

## Non-Goals

- Not implementing the actual code optimizations — that's `pipeline_pruning_20260607`.
- Not profiling runtime costs — that's `pipeline_runtime_profiling_20260607`.
- Not analyzing the MMA worker spawn action (cold per user).
- Not analyzing `simulation/*` or `tests/*` directories.
- Not analyzing actions beyond the 3 in scope.
- Not resolving C-extension call costs statically.
- Not resolving decorator-driven call dispatch statically (e.g., `@property`, `@imscope`).
- Not providing real microsecond measurements — the cost is heuristic (calibrated later).

## Architecture

`src/code_path_audit.py` — single new module, no new dependencies. Exposes both an MCP tool surface (for agents) and a CLI (`python -m src.code_path_audit ...`).

### Public API

```python
from typing import Literal


class CallGraph:
    """Directed graph: nodes are functions; edges are call sites."""
    nodes: dict[str, "FunctionNode"]        # fully-qualified name -> node
    edges: dict[str, set[str]]              # caller -> set of callees

    def add_edge(self, caller: str, callee: str) -> None: ...
    def transitive_callees(self, root: str, max_depth: int = 10) -> set[str]: ...
    def render_mermaid(self, root: str, max_depth: int = 5) -> str: ...


class FunctionNode:
    fqname: str                             # "src.ai_client.AIClient.send"
    file: str
    line: int
    calls: list[str]                        # all callees (resolved or not)
    state_mutations: list["StateMutation"]
    expensive_ops: list["ExpensiveOp"]


class StateMutation:
    target: str                             # "self.history", "module.events", "file:..."
    kind: Literal["attr_write", "container_mutate", "file_write", "ipc_emit", "global_write"]
    line: int


class ExpensiveOp:
    callee: str
    cost_class: Literal["file_io", "network", "ast_parse", "json_io", "pickle", "deep_copy", "loop_amplified"]
    data_size_estimate: int | None          # bytes or container length, heuristic
    line: int                               # call site in the caller
    weight: int                             # cost_class_weight * data_size (data_size treated as 1 if unknown)


class Action:
    name: str                               # "ai_message_lifecycle"
    entry_points: list[str]                 # ["src.app_controller.AppController.process_user_request", ...]
    description: str


class ActionProfile:
    action: Action
    call_graph: CallGraph                   # subgraph reachable from entry points
    expensive_ops: list[ExpensiveOp]        # all expensive ops in the subgraph
    state_mutations: list[StateMutation]    # all mutations in the subgraph
    redundancy: list[tuple[str, int]]       # (op_fqname, call_count) where count > 1
    pipelining_candidates: list[list[str]]  # groups of independent ops currently sequential
    total_load_estimate: int                # sum(weight) heuristic
    unresolved_calls: list[str]             # calls the AST walker couldn't resolve
    mermaid: str                            # rendered Mermaid
    markdown: str                           # human-readable per-action report


def trace_action(action: Action, max_depth: int = 10) -> ActionProfile: ...
def build_call_graph(src_dir: str = "src") -> CallGraph: ...  # full call graph
def build_expensive_ops_index(cg: CallGraph) -> dict[str, list[ExpensiveOp]]: ...
def build_state_mutations_index(cg: CallGraph) -> dict[str, list[StateMutation]]: ...
```

### Cost Model (heuristic, calibrated by the runtime-profiling follow-up)

| Pattern | Cost class | Default weight | Data size source |
|---------|-----------|----------------|------------------|
| `open()`, `Path.read_*`, `Path.write_*`, `*.write_text` | `file_io` | 100 | file size from `Path.stat()` when resolvable, else `None` |
| `requests.*`, `urllib.*`, `websockets.*`, `client.send` (with httpx-like signatures) | `network` | 500 | payload size from param literal/typed hint |
| `ast.parse`, `ast.walk`, `tree_sitter.*` | `ast_parse` | 200 | source bytes from the path arg |
| `json.dump`, `json.load`, `tomli_w.dump`, `tomllib.load` | `json_io` | 150 | container length if param is a list/dict |
| `pickle.dump`, `pickle.load` | `pickle` | 300 | container length |
| `copy.deepcopy` | `deep_copy` | 200 | container length |
| Any call inside the body of a `for` / `while` loop | `loop_amplified` | caller_weight × loop_bound_estimate | loop bound = `range(...)` literal/arg, else 1 |

**Expense threshold:** `EXPENSIVE_THRESHOLD = 40_000` (module-level constant). Any `ExpensiveOp.weight > EXPENSIVE_THRESHOLD` is flagged "expensive" in the per-action report. The 40,000 default matches the user's stated 10-40μs range; the runtime-profiling follow-up will calibrate it.

**Unresolved calls:** when the AST walker cannot resolve a callee (e.g., attribute access on `self.X` where `X` is set dynamically; `getattr`; decorator-wrapped method dispatch), the call goes into `unresolved_calls` with an `"unresolved"` cost class and weight 0. The report's caveats section notes these; the runtime-profiling follow-up measures them.

### Out of the static analysis

- C-extension call costs (imgui-bundle, tree-sitter native) — runtime profiling only.
- Decorator-driven dispatch (e.g., `@property`, `@imscope`) — runtime profiling only.
- Import cost at module load time — covered by the existing `scripts/audit_main_thread_imports.py`.
- `eval` / `exec` calls — flagged as unresolved, not analyzed.

## Per-Action Design

For each of the 3 actions, the audit is invoked with one or more entry points and a depth limit (default 10). The audit produces an `ActionProfile` that the report renders.

| Action | Entry points | Expected high-cost ops the audit should surface |
|--------|--------------|------------------------------------------------|
| **AI message lifecycle** | `src.app_controller.AppController.process_user_request`, `src.ai_client.AIClient.send`, `src.aggregate.build_file_items`, `src.summarize._summarise_*` | Per-context-file AST parse in `build_file_items`; AI network call; history append + comms log append + session_logger file write; sub-agent summarization (network + AST, loop-amplified over context files) |
| **Discussion save/load** | `src.project_manager.save_project`, `src.project_manager.load_project`, `src.history.HistoryManager.save_snapshot`, `src.models.parse_history_entries` | `tomli_w.dump` / `tomllib.load` on project TOML; `json.dump` on comms log (loop-amplified per entry); history file read/write; AST parse on schema validation |
| **GUI startup** | `sloppy.main` → `gui_2.App.__init__`, `src.app_controller.AppController.__init__`, `src.paths._resolve_*` | `tomllib.load` on config.toml; AST parses for tool registration; file stat on log paths; `sloppy.py` first-frame import chain (covered by the existing `scripts/audit_main_thread_imports.py`) |

The user can extend with more actions later (e.g., MMA worker spawn when it's no longer cold). Each action is one `Action(...)` declaration + a `trace_action()` call.
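To make the "one `Action(...)` declaration" claim concrete, the three in-scope actions could be declared roughly as follows. This is an illustrative sketch only: the `@dataclass` decorator, the names `discussion_save_load` and `gui_startup`, the description strings, and the `ACTIONS` / `ACTIONS_BY_NAME` containers are assumptions, not part of the spec. Entry points are copied from the table above, including the `_summarise_*` and `_resolve_*` wildcards, whose expansion rule the spec leaves open.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """Stand-in for the `Action` record in the Public API section."""
    name: str
    entry_points: list[str]
    description: str


ACTIONS = [
    Action(
        name="ai_message_lifecycle",  # name used in the CLI example
        entry_points=[
            "src.app_controller.AppController.process_user_request",
            "src.ai_client.AIClient.send",
            "src.aggregate.build_file_items",
            "src.summarize._summarise_*",  # wildcard kept verbatim
        ],
        description="User request through AI response (illustrative text).",
    ),
    Action(
        name="discussion_save_load",  # hypothetical name
        entry_points=[
            "src.project_manager.save_project",
            "src.project_manager.load_project",
            "src.history.HistoryManager.save_snapshot",
            "src.models.parse_history_entries",
        ],
        description="Project and history persistence (illustrative text).",
    ),
    Action(
        name="gui_startup",  # hypothetical name
        entry_points=[
            "sloppy.main",
            "gui_2.App.__init__",
            "src.app_controller.AppController.__init__",
            "src.paths._resolve_*",  # wildcard kept verbatim
        ],
        description="Process start through first frame (illustrative text).",
    ),
]

# Lookup table a CLI `--action` flag could resolve against.
ACTIONS_BY_NAME = {action.name: action for action in ACTIONS}
```

Under these assumptions, adding the MMA worker spawn action later would mean appending one more `Action(...)` entry to the list and passing it to `trace_action()`.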
## Output Format

CLI:

```bash
uv run python -m src.code_path_audit --action ai_message_lifecycle [--depth N] [--dsl] [--tree] [--markdown] [--mermaid]
```

MCP tool (for agents):

```python
def code_path_audit(action_name: str, max_depth: int = 10) -> dict: ...
```

Generated artifacts (all under `docs/reports/code_path_audit//`):

| File | Format | Purpose |
|------|--------|---------|
| `call_graph.dsl` | Custom postfix DSL | Full call graph (all of `src/`); machine-readable, parses in ~30 lines |
| `expensive_ops.dsl` | Custom postfix DSL | Expensive ops index (per-file, per-function) |
| `state_mutations.dsl` | Custom postfix DSL | State mutations index (per function) |
| `actions/.dsl` | Custom postfix DSL | Per-action profile (machine-readable) |
| `actions/.tree` | Prefix tree (text) | Per-action human-readable tree (for human review) |
| `actions/.md` | Markdown | Per-action summary + table (for code review) |
| `actions/.mmd` | Mermaid | Per-action call graph (visual) |
| `summary.md` | Markdown | Top-level cross-action summary + ranked optimization candidates |
| `optimization_candidates.md` | Markdown | Ranked list with: candidate, current cost, proposed reduction, effort, priority |

The two follow-up tracks consume the `.dsl` files; the markdown + tree are for human review.

**The custom DSL is postfix (RPN) with length-prefixed lists** — no brackets, no braces, no commas, no colons. Each "word" is a tagged constructor that consumes a known number of args from the stack (e.g., `fn` consumes 3, `exp-op` consumes 5, `mut` consumes 3, `N list` consumes N items). Whitespace-tokenized. Strings are bare atoms when they have no whitespace; quoted only when needed. `nil` for null. `\` for line comments. The DSL is deliberately NOT strict Forth — it's a custom postfix format tailored to the audit's record shapes (function, call, mutation, expensive op, pair, list).
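The "split + walk + evaluate against an arity table" reader described above might look like the following. Treat it as a sketch under stated assumptions rather than the track's parser: the arities for `fn`, `mut`, and `exp-op` come from the paragraph above, while the arity of `call` (1), the conversion of digit-only atoms to `int`, the tuple representation of records, and the names `ARITY`, `TOKEN`, and `parse_dsl` internals are guesses. Quoted strings containing `"` are not handled.

```python
import re

# Arity table: fn/mut/exp-op are from the spec text; call=1 is an ASSUMPTION.
ARITY = {"fn": 3, "call": 1, "mut": 3, "exp-op": 5}

# A token is either a double-quoted string or a run of non-whitespace.
TOKEN = re.compile(r'"[^"]*"|\S+')


def _atom(tok):
    """Convert a non-word token into a Python value."""
    if tok.startswith('"'):
        return tok[1:-1]          # strip the surrounding quotes
    if tok == "nil":
        return None
    if re.fullmatch(r"-?\d+", tok):
        return int(tok)
    return tok                    # bare atom, e.g. file_io


def _pop_n(stack, n, word):
    """Remove and return the top n stack items in their original order."""
    if not isinstance(n, int) or n < 0 or n > len(stack):
        raise ValueError(f"stack underflow or bad count at {word!r}")
    items = stack[len(stack) - n:]
    del stack[len(stack) - n:]
    return items


def parse_dsl(text):
    """Evaluate postfix DSL text and return the final stack."""
    stack = []
    for line in text.splitlines():
        for tok in TOKEN.findall(line):
            if tok == "\\":       # line comment: skip the rest of the line
                break
            if tok == "list":     # `N list`: N is on top of the stack
                count = _pop_n(stack, 1, tok)[0]
                stack.append(_pop_n(stack, count, tok))
            elif tok in ARITY:    # tagged constructor
                stack.append((tok, *_pop_n(stack, ARITY[tok], tok)))
            else:
                stack.append(_atom(tok))
    return stack
```

Records come back as tuples whose first element is the tag, so a consumer can filter with something like `[r for r in records if r[0] == "exp-op"]`. A real implementation would also need a matching `to_dsl` writer for the round-trip test named in the Verification section.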
Example of a single FunctionNode record:

```text
\ FunctionNode: fqname file line fn
"src.ai_client.AIClient.send" "src/ai_client.py" 100 fn
"build_file_items" call
"process_response" call
"self.history" attr_write 110 mut
\ ExpensiveOp: callee cost_class data_size_estimate line weight exp-op
"open" file_io nil 120 100 exp-op
```

**The prefix tree renderer** is a separate human-readable view of the same data — top-down, `├─`/`└─`/`│` box-drawing, scannable. Generated by a recursive walker. Inlined in the markdown reports (optionally produced as `actions/.tree` for tooling).

**Why custom postfix DSL (not JSON, not s-expressions, not strict Forth):**

- **Not JSON** (JSON performs poorly for this use: quoting, escaping, hash table allocation, no streaming).
- **Not s-expressions** (the bracket version drifts back toward s-exprs; the user wanted postfix specifically).
- **Not strict Forth** (the user wants a format ideal for call-graph recording, not a Turing-complete Forth program).
- **Postfix** (per user: "I want a post-fix heiarchy"): stack-based, no delimiters to count.
- **Length-prefixed lists** (standard postfix solution for nesting): `N list` consumes N items, unambiguous.
- **Trivial parser** (~30 lines: split + walk + evaluate tagged words against a known arity table).
- **Compact**: ~30-40% fewer characters than JSON for the same data.
- **Streamable**: no need to parse the whole file to find a record; you can scan for tags.
- **Extensible**: add new metric types by adding new tagged words (`metric(name value sample_size)`, `histogram(buckets)`, etc.).

## Verification (TDD per `conductor/workflow.md`)

Unit tests in `tests/test_code_path_audit.py`:

- `CallGraph.add_edge` + `transitive_callees` correctness on a synthetic 5-node graph.
- `ExpensiveOpIndex` detects each of the 7 cost classes on synthetic source.
- `StateMutationIndex` detects each of the 5 mutation kinds on synthetic source.
- `trace_action` produces an `ActionProfile` for a synthetic action whose expected cost is computable by hand.
- Custom postfix `.dsl` output round-trips (`parse_dsl(to_dsl(profile))` equals the in-memory structure).
- Prefix tree renderer produces well-formed box-drawing output for the 3 per-action reports.
- Markdown output is well-formed (header per section, table per category).
- Mermaid output parses as valid Mermaid syntax.

Smoke test: run `python -m src.code_path_audit --action ai_message_lifecycle --depth 5` against a fixture project; verify the report is produced and contains the expected high-cost ops (per the table above).

Manual verification: the report is the deliverable. A Tier 2 Tech Lead + user review the produced `summary.md` to confirm the optimization candidates make sense.

## Commit Structure (6 atomic commits, in order)

```
1. feat(audit): add code_path_audit data structures (CallGraph, ExpensiveOpIndex, StateMutationIndex)
   - src/code_path_audit.py (initial data structures)
   - tests/test_code_path_audit.py (unit tests)
2. feat(audit): add trace_action + ActionProfile + cost model
   - src/code_path_audit.py (extends with action tracing)
   - tests/test_code_path_audit.py (integration tests)
3. feat(audit): add custom postfix DSL writer + parser + tree renderer / markdown / Mermaid output
4. feat(audit): add MCP tool + CLI surface
5. docs(audit): run audit on 3 actions; commit report
   - docs/reports/code_path_audit/2026-06-07/* (the deliverable)
6. conductor(tracks): mark Code Path Audit track complete
   - tracks.md update
```

Each commit gets a `git notes add -m "..."` summary per `conductor/workflow.md` steps 9.1-9.3.

## Risks

| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| Heuristic cost model is imprecise; reported "expensive" ops aren't actually expensive at runtime. | Medium | Medium (false positives dilute the report) | `EXPENSIVE_THRESHOLD` is a module-level constant; the runtime-profiling follow-up calibrates it. |
| AST walking misses dynamic patterns (eval, getattr, decorator-driven dispatch). | Medium | Medium (under-estimates some calls) | Document the limitations in the report's caveats section; the runtime-profiling follow-up catches these. |
| Mermaid diagrams exceed renderable size for deep actions. | Medium | Low (visualization only) | Default `max_depth=5` for `--mermaid`; full graph available as `.dsl`. |
| The 3 actions' entry points are not exactly the functions the user has in mind. | Medium | Low (the report is the artifact; user can re-run with different entry points) | Document the chosen entry points in the report; CLI/MCP tool accepts any fully-qualified function name. |
| Report is too large to review (thousands of expensive ops). | Low | Medium | Per-action scoping; default `--depth 5`; ranked optimization candidates in `summary.md` make the top-N obvious. |
| Existing `derive_code_path` is the de-facto call-graph tool and the new one is redundant. | Low | Low (the new one is a strict superset) | `derive_code_path` stays as a thin wrapper around `code_path_audit.trace_action` for backward compat, OR gets a `@deprecated` shim. |
| The 3 actions are not actually the user's top 3 (user might have meant a different 3). | Low | Low (the tool is generic; re-run with different actions is one CLI call) | CLI accepts any `Action`; user can re-run. |

## Coordination with Pending Tracks

This track has **no conflicts** with the 5 active planned tracks. Its only blocker is the pre-flight check in §"Timing": the 4 foundational tracks must be marked complete before the audit runs. Once it ships, **it enables** future refactors:

| Pending track | Could use this analysis for... |
|----------------|--------------------------------|
| `qwen_llama_grok_integration_20260606` | Identifying redundant OpenAI-compatible request paths in `_send_*` functions |
| `data_oriented_error_handling_20260606` | Showing the call paths the new `Result[T]` return values will thread through |
| `data_structure_strengthening_20260606` | Pinpointing hot functions where the new type aliases matter most |
| `mcp_architecture_refactor_20260606` | Identifying which sub-MCPs have the most expensive operations (file_io vs network vs ast) |
| `test_batching_refactor_20260606` | Confirming which tests trigger the most expensive paths (to optimize test selection) |

This track's analysis is **read-only** — it doesn't modify `src/`, doesn't change the public API, doesn't add tests to the existing test suite. The only new files are `src/code_path_audit.py` (the tool), `tests/test_code_path_audit.py` (the tests), and the report under `docs/reports/code_path_audit/2026-06-07/`.

## Follow-up

- **`pipeline_runtime_profiling_20260607`** (the user-requested follow-up; NOT in this track): adds a runtime profiling harness using the existing `src/performance_monitor.py` + a per-action test fixture. Measures real costs for the 3 actions. Calibrates the heuristic cost model (`EXPENSIVE_THRESHOLD` + per-class weights). Catches "things that aren't easy to resolve statically" — import cost, JIT effects, GC pauses, C-extension call cost (imgui-bundle, tree-sitter native), decorator-driven dispatch. Output: `scripts/runtime_profiler.py` + updated `code_path_audit.py` cost model.
- **`pipeline_pruning_20260607`** (the second follow-up; NOT in this track): implements the high-priority optimization candidates surfaced by this track's report. Will be scoped AFTER this track ships, since the report itself defines what to prune.

## Out of Scope

- **MMA worker spawn action** (deferred per user — keeping MMA cold until the 1:1 discussion UX is dogfooded in a few projects).
- **Implementing the optimization fixes** (deferred to `pipeline_pruning_20260607`).
- **Runtime profiling** (deferred to `pipeline_runtime_profiling_20260607` per the user's explicit ask).
- **Other major actions** beyond AI message, save/load, GUI startup.
- **C-extension call costs** (deferred to runtime profiling).
- **Decorator-driven call dispatch** (deferred to runtime profiling).
- **`simulation/*` and `tests/*` directories** (analysis is `src/`-only for this track; can be extended later).
- **Modifying `src/`** (read-only analysis).

## See Also

- `conductor/archive/code_path_analysis_20260507/` — prior manual audit; the new track is its data-grounded successor.
- `conductor/archive/ai_interaction_call_graph_20260507/` — prior sequence diagram for the AI loop.
- `src/mcp_client.py:934-992` — `derive_code_path(target, max_depth=5)` (single-symbol tracer; the new tool supersedes this for multi-action use).
- `src/performance_monitor.py` — runtime profiling infrastructure used by the `pipeline_runtime_profiling_20260607` follow-up.
- `scripts/audit_main_thread_imports.py` — related static CI gate (startup-time import cost).
- `docs/reports/PLANNING_DIGEST_20260606.md` — planning context; the 5 active planned tracks are independent of this one.
- `docs/guide_data_oriented.md` (if it exists; otherwise `conductor/product-guidelines.md` "Data-Oriented & Immediate Mode Heuristics") — the project's data-oriented design philosophy this track follows.
- **`conductor/tracks/nagent_review_20260608/report.md` §15** (Pitfalls #2 and #4, "provider-specific history in process globals" and "AI client is a stateful singleton") — the audit's `state_mutations` index will surface both of these in the post-4-tracks `src/ai_client.py`; the optimization candidates should specifically address them.
- **`docs/transcripts/wo84LFzx5nI_big_oops_casemuratori.txt`** — full transcript of Casey Muratori's "The Big OOPs" talk, loaded 2026-06-08 for context.
  The historical genealogy (Stroustrup, Kay, Simula, Hoare) grounds the audit's "entity-hierarchy fingerprint" heuristic (above). Specifically, Hoare's 1966 "Record Handling" paper introduced discriminated unions — which Simula kept (as `inspect`) but C++ removed. The audit's `actions/ai_message_lifecycle.tree` should be checked for `if/else` chains that *would be* a discriminated union if `Result[T]` were threaded through.
- **`docs/transcripts/i-h95QIGchY_assuming_as_much_as_possible_andrewreece.txt`** — full transcript of Andrew Reece's "Assuming as Much as Possible" talk, loaded 2026-06-08 for context. Reece's "Xar" data structure (8-byte header, power-of-2 chunks, bitwise divmod, no `realloc` copy) is the *exemplar* for the chunkification-candidate heuristic. The `summary.md` of the audit's report should note the Xar pattern as a possible optimization target for any function in the hot path that does append-heavy work on a list of uniform items.
- **`docs/ideation/ed_chunk_data_structures_20260523.md`** — user's chunk-based-data-structure ideation (May 2026). The 5-image archive is the source of the "chunkification candidates" heuristic. Specifically, the user notes: *"if my chunk size is 1,000 elements, but I only have 5 elements to store, aren't I wasting a massive amount of memory?"* — the audit should distinguish *real* chunkification candidates (uniform data, hot path, large N) from *false* chunkification candidates (small N, low frequency, polymorphic data).
- **`docs/reports/computational_shapes_ssdl_digest_20260608.md`** — the SSDL digest synthesizing the 4-source computational-shapes thinking. The audit's `actions/.tree` and `actions/.mmd` outputs *are* computational-shape visualizations; the SSDL vocabulary (6 primitives + 7 modifiers) is the conceptual model the audit's tree renderer should follow.