From a9333bbb59dcc14f96d090c41000227e0179d453 Mon Sep 17 00:00:00 2001
From: conductor-tier2
Date: Mon, 8 Jun 2026 22:05:54 -0400
Subject: [PATCH] conductor(track-update): code_path_audit_20260607 - post-4-tracks timing + 5-source framing
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The user specified that the code_path_audit_20260607 track should run AFTER the 4 foundational tracks complete (qwen_llama_grok, data_oriented_error_handling, data_structure_strengthening, mcp_architecture_refactor). This commit formalizes that timing and grounds the audit's analytical framing in the 5 sources loaded into context on 2026-06-08.

5 surgical changes to the spec, plan, and tracks.md, no task changes:

1. Post-4-tracks timing (new section in spec.md §"Timing", plus a "Timing" callout in plan.md's opening):
   - The 4 tracks will significantly reshape src/ai_client.py, src/mcp_client.py, src/app_controller.py, and src/type_aliases.py
   - Running the audit on pre-refactor code would produce a report that's stale on day 1
   - The post-4-tracks timing ensures the audit grounds optimization decisions for the *resulting* architecture
   - Pre-flight check: verify all 4 tracks are [x] completed in conductor/tracks.md before starting this track

2. Analytical framing (new section in spec.md §"Analytical Framing (5-source lens)"):
   - Maps each of the 5 sources (Fleury taxonomy + Fleury combinatoric + Muratori Big OOPs + Reece Assuming + user's chunk ideation) to specific audit-time heuristics
   - 4 concrete heuristics: effective-codepath count, entity-hierarchy fingerprint, assumed-too-much detector, chunkification candidates
   - The heuristics shape REPORT INTERPRETATION, not the static cost model (which stays data-grounded in EXPENSIVE_THRESHOLD + per-class weights)

3. See Also cross-references in spec.md (5 new entries):
   - nagent_review Pitfalls #2 and #4 (provider history globals + stateful singleton)
   - wo84LFzx5nI Big OOPs transcript (full text, 4310 segments, 200KB; loaded 2026-06-08)
   - i-h95QIGchY Assuming transcript (full text, 3719 segments, 162KB; loaded 2026-06-08)
   - ed_chunk_data_structures_20260523.md (5-image archive of user's chunk ideation, 19KB; saved 2026-06-08)
   - computational_shapes_ssdl_digest_20260608.md (the SSDL digest that synthesizes the 4-source computational-shapes thinking; the audit's tree/mermaid outputs ARE computational-shape visualizations)

4. tracks.md entry updated to include the spec/plan links and a brief status note that the audit is post-4-tracks.

5. plan.md has a "Timing" callout at the top stating the 4 tracks must ship before the plan executes.

No code modified. The audit's tasks (Phases 1-6) are unchanged in structure; the new sections only add analytical context and timing constraints.
---
 conductor/tracks.md                               |  3 +-
 .../tracks/code_path_audit_20260607/plan.md       |  8 ++++
 .../tracks/code_path_audit_20260607/spec.md       | 46 ++++++++++++++++++-
 3 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/conductor/tracks.md b/conductor/tracks.md
index 2e88c16a..32a9059e 100644
--- a/conductor/tracks.md
+++ b/conductor/tracks.md
@@ -533,7 +533,8 @@ User review surfaced five outstanding UI issues, each previously attempted witho
 *Link: [./tracks/test_batching_post_refactor_polish_20260607/](./tracks/test_batching_post_refactor_polish_20260607/)*
 
 #### Track: Code Path Audit
-*Link: [./tracks/code_path_audit_20260607/](./tracks/code_path_audit_20260607/)*
+*Link: [./tracks/code_path_audit_20260607/](./tracks/code_path_audit_20260607/), Spec: [./tracks/code_path_audit_20260607/spec.md](./tracks/code_path_audit_20260607/spec.md), Plan: [./tracks/code_path_audit_20260607/plan.md](./tracks/code_path_audit_20260607/plan.md) (to be authored by writing-plans skill)*
+*Goal: Build `src/code_path_audit.py` — a static-analysis tool that audits the 3 major actions (AI message lifecycle, discussion save/load, GUI startup) for expensive operations, redundant calls, and pipelining candidates. Output: custom postfix `.dsl` data + markdown + Mermaid + prefix tree text under `docs/reports/code_path_audit//`. The follow-up `pipeline_pruning_20260607` consumes the `.dsl` files; the markdown + tree are for human review. MMA worker spawn is **cold per user**. **Timing (revised 2026-06-08):** the audit must run *after* the 4 foundational tracks ship (`qwen_llama_grok`, `data_oriented_error_handling`, `data_structure_strengthening`, `mcp_architecture_refactor`); pre-4-tracks code is too stale to ground optimization decisions.*
 
 #### Track: GUI Architecture Refinement
 *Link: [./tracks/gui_architecture_refinement_20260512/](./tracks/gui_architecture_refinement_20260512/) (no spec.md; needs scoping before planning)*
diff --git a/conductor/tracks/code_path_audit_20260607/plan.md b/conductor/tracks/code_path_audit_20260607/plan.md
index 971929e1..6a07288f 100644
--- a/conductor/tracks/code_path_audit_20260607/plan.md
+++ b/conductor/tracks/code_path_audit_20260607/plan.md
@@ -2,6 +2,14 @@
 
 > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
 
+> **Timing (added 2026-06-08).** This plan should **not** be executed until *all 4 foundational tracks* are shipped:
+> 1. `qwen_llama_grok_integration_20260606`
+> 2. `data_oriented_error_handling_20260606`
+> 3. `data_structure_strengthening_20260606`
+> 4. `mcp_architecture_refactor_20260606`
+>
+> The 4 tracks will significantly reshape `src/ai_client.py`, `src/mcp_client.py`, `src/app_controller.py`, and `src/type_aliases.py`. Running this audit on the pre-refactor `src/` would produce a report that's stale on day 1. The Tier 2 Tech Lead should verify the 4-tracks baseline (all marked `[x]` in `conductor/tracks.md`) before starting Phase 1.
+
 **Goal:** Build `src/code_path_audit.py` — a static-analysis tool that audits the 3 major actions (AI message lifecycle, discussion save/load, GUI startup) for expensive operations, redundant calls, and pipelining candidates. Output: custom postfix `.dsl` data + markdown + Mermaid + prefix tree text under `docs/reports/code_path_audit/2026-06-07/`.
 
 **Architecture:** Single new module `src/code_path_audit.py`. No new dependencies. Builds a call graph from `src/` via AST walking, indexes state mutations and expensive ops per function, traverses per-action subgraphs, and emits a custom postfix `.dsl` (machine) + markdown + Mermaid (visual) + prefix tree text (human). The postfix `.dsl` is a custom DSL tailored to the audit's record shapes — tagged records (each "word" is a constructor with a known arity), length-prefixed lists, whitespace-tokenized, with `"..."` quoting only when needed. The prefix tree renderer is a separate view of the same data, generated by a recursive walker. Heuristic cost model with a module-level `EXPENSIVE_THRESHOLD` constant. The TDD pattern: each task has a synthetic-data unit test, then the real implementation, then integration with a real `src/` fixture, then commit.
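Reviewer note, not part of the patch: the call-graph step named in the Architecture paragraph above can be sketched with the standard-library `ast` module. This is a minimal sketch under stated assumptions, not the planned `src/code_path_audit.py`. The names `build_call_counts` and `SAMPLE` are hypothetical. Only plain `name(...)` and `obj.attr(...)` calls are recorded. The counts are static call sites, not runtime call counts.

```python
import ast
from collections import Counter

# Hypothetical input standing in for a file under src/.
SAMPLE = '''
import json

def save(items):
    for item in items:
        write(json.dumps(item))
    write(json.dumps({"count": len(items)}))

def write(text):
    print(text)
'''


def build_call_counts(source: str) -> dict[str, Counter]:
    """Map each function name to a Counter of the names it calls.

    Assumption: callees are keyed by bare name (`foo`) or attribute name
    (`dumps` for `json.dumps`); calls inside nested functions are also
    attributed to the enclosing function.
    """
    counts: dict[str, Counter] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            callee_counts = counts.setdefault(node.name, Counter())
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call):
                    func = inner.func
                    if isinstance(func, ast.Name):
                        callee_counts[func.id] += 1
                    elif isinstance(func, ast.Attribute):
                        callee_counts[func.attr] += 1
    return counts


graph = build_call_counts(SAMPLE)
# Callees that appear at more than one call site within a single caller.
redundant = {
    caller: sorted(name for name, n in calls.items() if n > 1)
    for caller, calls in graph.items()
}
print(redundant)
```

A real audit would likely also need to resolve imports and methods across modules; this sketch stays within one source string to keep the idea visible.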
diff --git a/conductor/tracks/code_path_audit_20260607/spec.md b/conductor/tracks/code_path_audit_20260607/spec.md
index 3f2a6930..7e18db5c 100644
--- a/conductor/tracks/code_path_audit_20260607/spec.md
+++ b/conductor/tracks/code_path_audit_20260607/spec.md
@@ -1,10 +1,12 @@
 # Track: Code Path & Data Pipeline Audit
 
-**Status:** Spec approved 2026-06-07
+**Status:** Spec approved 2026-06-07; revised 2026-06-08 with post-4-tracks timing and 5-source framing
 **Initialized:** 2026-06-07
 **Owner:** Tier 2 Tech Lead
 **Priority:** Medium (foundational; enables follow-up pruning track)
 
+> **Revision note (2026-06-08).** The user specified that this audit should run *after* the 4 foundational tracks complete (`qwen_llama_grok_integration_20260606`, `data_oriented_error_handling_20260606`, `data_structure_strengthening_20260606`, `mcp_architecture_refactor_20260606`). The 4 tracks will significantly reshape `src/ai_client.py`, `src/mcp_client.py`, `src/app_controller.py`, and `src/type_aliases.py` — running the audit on the pre-refactor code would produce a report that's stale on day 1. The post-4-tracks timing ensures the audit grounds optimization decisions for the *resulting* architecture, not the pre-refactor one. See §"Timing" below.
+
 ---
 
 ## Overview
@@ -15,6 +17,43 @@ Per the user's framing: "anything that can even remotely smell as an expensive b
 
 The MMA worker spawn action is **out of scope** for this track (per user: "keeping that cold for a while until I like the main ux loop with ai in a discussion fully dogfooded").
 
+## Timing (post-4-tracks)
+
+This track is intentionally **deferred** until *after* the 4 foundational tracks ship:
+
+1. `qwen_llama_grok_integration_20260606` — adds 3 vendors (`_send_qwen`, `_send_llama`, `_send_grok`) and refactors `_send_minimax` to use the shared `send_openai_compatible()` helper. Modifies `src/ai_client.py`, `src/openai_compatible.py` (new), `src/vendor_capabilities.py` (new).
+2. `data_oriented_error_handling_20260606` — refactors `ai_client._send_` to return `Result[str]`, modifies `mcp_client.py` (30+ sites), `rag_engine.py` (Result returns).
+3. `data_structure_strengthening_20260606` — adds `src/type_aliases.py` with 10 TypeAliases, replaces 345 weak-type sites across 6 files.
+4. `mcp_architecture_refactor_20260606` — splits `src/mcp_client.py` (2,205 lines → 6 sub-MCPs + 1 external), adds `src/mcp_client_legacy.py` for backward compat.
+
+Running the audit on the **pre-refactor** `src/` would produce a report that's stale on day 1. The post-4-tracks timing ensures:
+- The audit's data grounds optimization decisions for the *resulting* architecture (post-Fleury-style "effective codepaths" and "ECS archetype tables" if the 4 tracks are implemented with the data-oriented philosophy).
+- The `pipeline_pruning_20260607` follow-up has the *right* candidates to optimize — the 4 tracks will move the expensive ops around, and pruning the wrong ones wastes work.
+- The runtime-profiling follow-up (`pipeline_runtime_profiling_20260607`) measures the *new* code paths, not the old ones.
+
+**Pre-flight check (verifies the 4-tracks baseline before this track starts):** confirm that all 4 tracks are marked `[x]` completed in `conductor/tracks.md`. If any of the 4 are still `[~]` in-progress, this track is blocked — the audit would catch the in-progress state as drift.
+
+## Analytical Framing (5-source lens)
+
+The 5 sources loaded into context for the post-4-tracks audit collectively reframe *what* to look for in the 3 actions. The audit's static cost model and pipeline-pruning recommendations should be informed by:
+
+| Source | Lens the audit inherits |
+|---|---|
+| [Ryan Fleury, "A Taxonomy of Computation Shapes"](https://www.dgtlgrove.com/p/a-taxonomy-of-computation-shapes) (Feb 2023) | The 6 shapes: instruction, codepath, wide codepath, codecycle, wide codecycle, codecycle graph. The audit's `trace_action` is a codepath visualization; the `redundancy` (call_count > 1) field detects **wide codepaths** that could be split into parallel sub-codepaths. |
+| [Ryan Fleury, "The Codepath Combinatoric Explosion"](https://www.dgtlgrove.com/p/the-codepath-combinatoric-explosion) (Apr 2023) | The "effective codepath" concept. The audit's `pipelining_candidates` field detects codepaths that *could be defused* (multiple real codepaths collapsed into 1 effective codepath via nil sentinels, generational handles, or immediate-mode APIs). The `redundancy` field is the *first indicator* of defusing opportunities. |
+| [Casey Muratori, "The Big OOPs: Anatomy of a Thirty-Five-Year Mistake" (BSC 2025)](https://youtu.be/wo84LFzx5nI) | The 35-year-historical indictment of compile-time domain hierarchies. The audit's per-function `state_mutations` index reveals whether a function is in the *system* pattern (mutates component-like data, not entity state) or the *entity-hierarchy* pattern (mutates a single object's identity, where the cost compounds per type). Functions in the latter pattern are the *highest-priority* refactor targets — they may need to be split into components + systems. |
+| [Andrew Reece, "Assuming as Much as Possible" (BSC 2025)](https://www.youtube.com/watch?v=i-h95QIGchY) | The "assume as much as possible" engineering discipline. The audit's `expensive_ops` index, for any function that calls a general-purpose primitive (e.g., `json.dumps`, `Path.read_text`, `ast.parse`), should ask: **"can this caller assume a smaller input domain and use a specialized primitive instead?"** A function that calls `json.dumps` 50 times per action with 1KB payloads each may be replaceable by a function that calls a domain-specific serializer once with a 50KB payload. |
+| User's chunk-ideation archive (May 2026) | The "fixed-size slices" + "ECS archetype tables" pattern. The audit's per-function calls that operate on lists/arrays should be flagged if they: (a) don't have a chunk-aware variant, (b) are in a hot path, (c) the data shape is uniform enough to chunk. Functions that match all 3 are the **prime candidates** for `pipeline_pruning_20260607` — chunkification is a known pattern with bounded risk. |
+
+**Concrete audit-time heuristics** that emerge from this framing:
+
+- **Effective-codepath count:** when a function has 3+ branches that all do roughly the same thing with different inputs, the audit should report "this is N real codepaths behaving as 1 effective codepath — could be defused with a nil sentinel or generational handle." The runtime-profiling follow-up measures the actual savings.
+- **Entity-hierarchy fingerprint:** when a function's `state_mutations` list has > 3 writes to a single `self.X` with a `type` discriminator, the audit should report "this function is operating on entity-hierarchy state; consider ECS split into components + systems." A *concrete Manual Slop example* the audit should catch: any function that does `if self.active_ticket.kind == TicketKind.X:` and then mutates multiple fields.
+- **Assumed-too-much detector:** when a function calls `ast.parse` (or any `tree_sitter.*`) on a file that *could be assumed* to be already-parsed (because the file is in the context composition and the `aggregate.py` pipeline has already done it), the audit should report "this is re-parsing data that was already parsed upstream; consider memoizing or threading the parsed AST through." This is the "assume as much as possible" pattern at the data-passing level.
+- **Chunkification candidates:** when a function loops over a `list[dict]` with a known uniform shape (heuristic: all dicts have the same key set), the audit should report "consider chunkifying — uniform data, hot path, no chunk awareness." The user has explicit code (`docs/ideation/ed_chunk_data_structures_20260523.md`) for the chunk pattern, so the audit's optimization candidates can cite it.
+
+These heuristics are *guidance for the audit's report interpretation* — they don't change the audit's static cost model (which is data-grounded in the existing `EXPENSIVE_THRESHOLD` + per-class weights). They shape how the Tier 2 Tech Lead and the user interpret the report.
+
 ## Current State Audit (as of `ca781543`)
 
 `src/` has 61 `.py` files (27,447 total lines; 23,845 code lines). The call graph is non-trivial; per-action traversal is what makes the analysis tractable.
@@ -291,3 +330,8 @@ This track's analysis is **read-only** — it doesn't modify `src/`, doesn't cha
 - `scripts/audit_main_thread_imports.py` — related static CI gate (startup-time import cost).
 - `docs/reports/PLANNING_DIGEST_20260606.md` — planning context; the 5 active planned tracks are independent of this one.
 - `docs/guide_data_oriented.md` (if it exists; otherwise `conductor/product-guidelines.md` "Data-Oriented & Immediate Mode Heuristics") — the project's data-oriented design philosophy this track follows.
+- **`conductor/tracks/nagent_review_20260608/report.md` §15** (Pitfalls #2 and #4, "provider-specific history in process globals" and "AI client is a stateful singleton") — the audit's `state_mutations` index will surface both of these in the post-4-tracks `src/ai_client.py`; the optimization candidates should specifically address them.
+- **`docs/transcripts/wo84LFzx5nI_big_oops_casemuratori.txt`** — full transcript of Casey Muratori's "The Big OOPs" talk, loaded 2026-06-08 for context. The historical genealogy (Stroustrup, Kay, Simula, Hoare) grounds the audit's "entity-hierarchy fingerprint" heuristic (above). Specifically, Hoare's 1966 "Record Handling" paper introduced discriminated unions — which Simula kept (as `inspect`) but C++ removed. The audit's `actions/ai_message_lifecycle.tree` should be checked for `if/else` chains that *would be* a discriminated union if `Result[T]` were threaded through.
+- **`docs/transcripts/i-h95QIGchY_assuming_as_much_as_possible_andrewreece.txt`** — full transcript of Andrew Reece's "Assuming as Much as Possible" talk, loaded 2026-06-08 for context. Reece's "Xar" data structure (8-byte header, power-of-2 chunks, bitwise divmod, no `realloc` copy) is the *exemplar* for the chunkification-candidate heuristic. The `summary.md` of the audit's report should note the Xar pattern as a possible optimization target for any function in the hot path that does append-heavy work on a list of uniform items.
+- **`docs/ideation/ed_chunk_data_structures_20260523.md`** — user's chunk-based-data-structure ideation (May 2026). The 5-image archive is the source of the "chunkification candidates" heuristic. Specifically, the user notes: *"if my chunk size is 1,000 elements, but I only have 5 elements to store, aren't I wasting a massive amount of memory?"* — the audit should distinguish *real* chunkification candidates (uniform data, hot path, large N) from *false* chunkification candidates (small N, low frequency, polymorphic data).
+- **`docs/reports/computational_shapes_ssdl_digest_20260608.md`** — the SSDL digest synthesizing the 4-source computational-shapes thinking. The audit's `actions/.tree` and `actions/.mmd` outputs *are* computational-shape visualizations; the SSDL vocabulary (6 primitives + 7 modifiers) is the conceptual model the audit's tree renderer should follow.
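Reviewer note, not part of the patch: the split between *real* and *false* chunkification candidates described in the See Also entries above (uniform data, hot path, large N versus small N, low frequency, polymorphic data) can be sketched as a small classifier. This is a minimal sketch under stated assumptions. The names `has_uniform_keys`, `classify_chunk_candidate`, and the `MIN_CHUNK_CANDIDATE_LEN` cutoff of 1,000 are hypothetical and chosen only for illustration. The sketch inspects sample data at runtime, whereas the spec describes a static audit, so it shows the decision rule rather than the detection mechanism.

```python
from typing import Any, Mapping, Sequence

# Hypothetical cutoff: below this many items, fixed-size chunks would
# mostly hold unused slots (the "5 elements in a 1,000-element chunk" worry).
MIN_CHUNK_CANDIDATE_LEN = 1000


def has_uniform_keys(records: Sequence[Mapping[str, Any]]) -> bool:
    """Return True when every record has the same key set.

    An empty sequence is treated as not uniform, since there is no shape
    to chunk on.
    """
    if not records:
        return False
    first_keys = set(records[0])
    return all(set(record) == first_keys for record in records[1:])


def classify_chunk_candidate(
    records: Sequence[Mapping[str, Any]],
    in_hot_path: bool,
    min_len: int = MIN_CHUNK_CANDIDATE_LEN,
) -> str:
    """Label a list of dict-like records as a real or false candidate."""
    if not has_uniform_keys(records):
        return "false: polymorphic data"
    if len(records) < min_len:
        return "false: small N"
    if not in_hot_path:
        return "false: low frequency"
    return "real"


# Usage with synthetic data (all values are made up for the example).
large_uniform = [{"id": i, "text": "x"} for i in range(2000)]
small_uniform = [{"id": i, "text": "x"} for i in range(5)]
polymorphic = [{"id": 1}, {"id": 2, "extra": 3}]

print(classify_chunk_candidate(large_uniform, in_hot_path=True))
print(classify_chunk_candidate(small_uniform, in_hot_path=True))
print(classify_chunk_candidate(polymorphic, in_hot_path=True))
```

The checks run cheapest-to-justify first: a polymorphic list is rejected regardless of size, so the size cutoff only matters once the shape is known to be uniform.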