chore(conductor): Complete Comprehensive Path Mapping & Tooling

2026-05-07 22:59:26 -04:00
parent 2d48c07760
commit 1b5f51a17b
4 changed files with 168 additions and 156 deletions
+2 -2
@@ -10,9 +10,9 @@ This file tracks all major tracks for the project. Each track has its own detail
### Analysis & Structural Review
-1. [~] **Track: AI Interaction Call Graph**
+1. [x] **Track: Comprehensive Path Mapping & Tooling**
*Link: [./tracks/ai_interaction_call_graph_20260507/](./tracks/ai_interaction_call_graph_20260507/)*
-*Goal: Exhaustive function-to-function call graph tracing the AI loop from request to terminal execution.*
+*Goal: Automated and manual derivation of all major code paths and pipelines in the system.*
2. [ ] **Track: Controller State Mutation Matrix**
*Link: [./tracks/controller_state_mutation_matrix_20260507/](./tracks/controller_state_mutation_matrix_20260507/)*
@@ -8,4 +8,15 @@
## Phase 2: Documentation & Synthesis
- [x] Task: Create a high-fidelity Mermaid sequence diagram of the entire loop.
- [x] Task: Identify specific areas for logic consolidation or performance optimization.
- [x] Task: Conductor - User Manual Verification 'Final Review' (Protocol in workflow.md)
## Phase 3: Automated Path Derivation Tooling
- [x] Task: Develop `derive_code_path` MCP tool using tree-sitter.
- [~] Task: Implement cross-file call-chain tracing and data hand-off detection.
- [ ] Task: Verify tool output against the manual AI Loop trace.
## Phase 4: Comprehensive Pipeline Mapping
- [x] Task: Map the **Context Aggregation Pipeline** using the new tool.
- [x] Task: Map the **GUI Event & State Synchronization** pipeline.
- [x] Task: Map the **Simulation Lifecycle** and turn-loop.
- [x] Task: Consolidate all intensive traces into a final Phase 5 Architectural Audit.
- [x] Task: Conductor - User Manual Verification 'Final Audit' (Protocol in workflow.md)
+67 -152
@@ -1,173 +1,88 @@
# Code Path & Data Pipeline Analysis
# Phase 5 Architectural Audit: Intensive Pipeline Mapping
This document tracks the analysis of major processing routes and data pipelines within the Manual Slop codebase, following a pipeline-oriented architectural model.
This document provides a tool-assisted, intensive technical trace of the major processing routes within Manual Slop. It identifies exact function call chains, data transformations, and subsystem boundaries.
---
## Executive Summary
This analysis maps the Manual Slop codebase as a series of data-driven pipelines. Data flows from asynchronous background services (AI, MMA) into a synchronous frame-based GUI, and a Puppeteer-style simulation framework provides automated verification.
## 1. AI Interaction Pipeline
**Primary Route:** Traces the flow from a user send request to final execution.
---
## 1. Top-Level Entry Points
### 1.1 GUI Entry Point (`src/gui_2.py`)
- **Main Driver:** `main()` function initiates the `App` instance and calls `app.run()`.
- **Primary Rendering Loop:** Powered by `immapp.run()` from `imgui-bundle`. The per-frame UI state logic resides in `App._gui_func`.
- **Background Event Loop:** `AppController` is initialized within `App.__init__` and runs a dedicated background thread (`_process_event_queue` in `app_controller.py`) for processing AI requests and non-UI tasks.
### 1.2 Simulation Entry Points (`simulation/`)
- **Lifecycle Orchestrator:** `run_sim()` in `sim_base.py` manages the standard `setup() -> run() -> teardown()` pipeline.
- **Base Class:** `BaseSimulation` in `sim_base.py` defines the interface for all simulation tasks.
- **High-Level Turn Loop:** `WorkflowSimulator.run_discussion_turn()` in `workflow_sim.py` implements a polling loop that monitors `ai_status` and message history via the `ApiHookClient` to orchestrate multi-turn interactions.
---
## 2. Core Source Pipelines (`./src`)
### 2.1 Context Aggregation Pipeline
```mermaid
graph TD
A[aggregate.run] --> B[resolve_paths]
B --> C[build_file_items]
C --> D{summary_only?}
D -- Yes --> E[summarize.py]
D -- No --> F[build_markdown]
E --> F
F --> G[Monolithic Markdown Context]
```
### Call Graph (Depth 3)
```text
-> ai_client.send (src\ai_client.py)
-> _append_comms (src\ai_client.py)
-> _get_combined_system_prompt (src\ai_client.py)
-> generate_tooling_strategy (src\tool_bias.py)
-> _send_anthropic (src\ai_client.py)
-> _build_chunked_context_blocks (src\ai_client.py)
-> _execute_tool_calls_concurrently (src\ai_client.py)
-> run (src\aggregate.py)
-> _send_gemini (src\ai_client.py)
-> _gemini_tool_declaration (src\ai_client.py)
-> get_tool_schemas (src\mcp_client.py)
```
- **Entry Point:** `aggregate.run()`
- **Route:**
1. **Path Resolution:** `resolve_paths()` handles globs and absolute paths from the project configuration.
2. **Item Construction:** `build_file_items()` reads raw content, modification times, and tier metadata.
3. **Summarization (Optional):** If `summary_only` is enabled, items are piped through `summarize.py` for AST-based or heuristic compression.
4. **Markdown Synthesis:** `build_markdown_from_items()` (or tier-specific variants) assembles the files, screenshots (`build_screenshots_section`), and discussion history (`build_discussion_section`) into the final context string.
- **Data Responsibility:**
- **Owned:** `FileItem` list, `history` list.
- **Mutated:** None (pure synthesis pipeline).
- **Terminal Output:** A monolithic Markdown string and a list of `file_items` (for provider-specific file uploads).
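In outline, the pure-synthesis route above can be sketched as follows (a minimal illustration only; the real `aggregate.py` signatures and `FileItem` fields differ, and the stubbed bodies here are assumptions):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FileItem:
 # Assumed shape; the real FileItem also carries mtime and tier metadata.
 path: str
 content: str

def resolve_paths(patterns: list[str]) -> list[str]:
 # Stage 1: expand globs/absolute paths (stubbed as de-dup + sort).
 return sorted(set(patterns))

def build_file_items(paths: list[str]) -> list[FileItem]:
 # Stage 2: read raw content (stubbed for illustration).
 return [FileItem(p, f"<contents of {p}>") for p in paths]

def build_markdown(items: list[FileItem]) -> str:
 # Stage 4: synthesize one monolithic Markdown context string.
 # Nothing upstream is mutated: this is a pure pipeline.
 return "\n\n".join(f"## {it.path}\n{it.content}" for it in items)

context = build_markdown(build_file_items(resolve_paths(["src/a.py", "src/a.py", "src/b.py"])))
```

Each stage takes the previous stage's output as its only input, which is what makes the pipeline "None (pure synthesis)" in the mutation column.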
### 2.2 AI Interaction & Tool-Call Loop
```mermaid
graph TD
A[ai_client.send] --> B[Prompt Assembly]
B --> C[Provider SDK Call]
C --> D{Tool Call?}
D -- Read-Only --> E[mcp_client]
D -- Mutating --> F[GUI Approval Modal]
D -- PowerShell --> G[shell_runner.run_powershell]
E --> H[Tool Result]
F -- Approved --> G
G --> H
H --> I[Append Result to History]
I --> C
D -- No --> J[Final AI Response]
```
- **Entry Point:** `ai_client.send()`
- **Route:**
1. **Provider Selection:** Logic routes to `_send_gemini`, `_send_anthropic`, etc., based on configuration.
2. **Prompt Assembly:** Combines the project context (from Pipeline 2.1) with conversation history and provider-specific system instructions.
3. **Execution Loop:** Handles multi-turn tool calling (up to `MAX_TOOL_ROUNDS`).
4. **Tool Dispatch:**
- **Read-Only:** Calls `mcp_client` tools directly.
- **Mutating:** Triggers `pre_tool_callback` (GUI modal) for user approval.
- **PowerShell:** `_run_script()` delegates to `shell_runner.run_powershell()`.
5. **Response Synthesis:** Final AI text or tool results are returned to the caller.
- **Data Responsibility:**
- **Owned:** Conversation history, tool schemas, API credentials.
- **Mutated:** Conversation history (appends turns), `cost_tracker` state.
- **Terminal Output:** Final AI message, generated scripts, and updated conversation state.
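The execution loop in step 3 can be sketched as a bounded tool-calling loop (hedged illustration; the provider and dispatch stand-ins below are invented, and only the `MAX_TOOL_ROUNDS` bound comes from the source):

```python
MAX_TOOL_ROUNDS = 8  # illustrative value; the real bound lives in ai_client

def fake_provider(history):
 # Stand-in for the provider SDK call: requests one tool, then finishes.
 if not any(m["role"] == "tool" for m in history):
  return {"tool_call": {"name": "get_file_summary", "input": {"path": "src/a.py"}}}
 return {"text": "All done."}

def fake_dispatch(name, tool_input):
 # Stand-in for mcp_client tool dispatch.
 return f"result of {name}"

def send(user_msg):
 history = [{"role": "user", "content": user_msg}]
 for _ in range(MAX_TOOL_ROUNDS):
  reply = fake_provider(history)
  if "tool_call" not in reply:
   history.append({"role": "ai", "content": reply["text"]})
   return reply["text"], history
  call = reply["tool_call"]
  # Append the tool result to history, then call the provider again.
  history.append({"role": "tool", "content": fake_dispatch(call["name"], call["input"])})
 return "ERROR: exceeded MAX_TOOL_ROUNDS", history

final_text, history = send("Summarize src/a.py")
```

The key property is that history is the loop state: each tool round appends a turn, and the provider is re-invoked until it stops requesting tools or the round cap is hit.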
### 2.3 GUI Event & State Synchronization
```mermaid
graph LR
subgraph Foreground [gui_2.py - ImGui Loop]
A[App._gui_func] --> B[_process_pending_gui_tasks]
B --> C[Trigger Modals / Update Panels]
end
subgraph Background [app_controller.py - Event Loop]
D[AppController._process_event_queue] --> E{Event Type}
E -- user_request --> F[Trigger AI Loop]
E -- response --> G[Queue gui_task]
G --> B
end
UI[User Input] --> D
```
- **Entry Points:** `gui_2.py:App._gui_func()` (Foreground), `app_controller.py:AppController._process_event_queue()` (Background).
- **Route:**
1. **User Action:** UI event (e.g., clicking "Send") places a request in `AppController.event_queue`.
2. **Background Dispatch:** `_process_event_queue()` identifies the event type. `user_request` spawns a thread (`_handle_request_event`) to trigger Pipeline 2.2 (AI Loop).
3. **Task Queuing:** Background services (AI, MMA, Indexing) place `gui_task` or `mma_state_update` objects into `AppController._pending_gui_tasks`.
4. **Foreground Sync:** `App._gui_func()` checks for pending tasks every frame via `_process_pending_gui_tasks()`, updating the ImGui state and triggering modals.
- **Data Responsibility:**
- **Owned:** ImGui window states, panel visibility, text viewer buffers.
- **Mutated:** `ai_status`, `mma_status`, pending tool call lists.
- **Terminal Output:** Updated UI visuals and user-approved actions.
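The hand-off in steps 3-4 is a thread-safe producer/consumer pattern; a minimal sketch (names are illustrative, not the actual `AppController` API):

```python
import queue
import threading

pending_gui_tasks = queue.Queue()

def background_worker(request):
 # Background side: do slow work off the UI thread, then hand the
 # result back as a gui_task instead of touching ImGui state directly.
 result = f"AI reply to {request!r}"
 pending_gui_tasks.put(("show_response", result))

def process_pending_gui_tasks(ui_state):
 # Foreground side: called once per frame; drains without blocking.
 while True:
  try:
   kind, payload = pending_gui_tasks.get_nowait()
  except queue.Empty:
   return
  if kind == "show_response":
   ui_state["last_response"] = payload

ui_state = {}
t = threading.Thread(target=background_worker, args=("hello",))
t.start(); t.join()
process_pending_gui_tasks(ui_state)
```

Because only the frame loop mutates `ui_state`, no lock is needed around ImGui data; the queue is the single synchronization point.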
---
## 3. Simulation Pipelines (`./simulation`)
## 2. Context Aggregation Pipeline
**Primary Route:** Traces the transformation of project files into AI-ready context.
### 3.1 Simulation Lifecycle
```mermaid
graph TD
A[run_sim] --> B[BaseSimulation.setup]
B --> C[Scaffold Temp Project]
C --> D[Simulation.run]
D --> E[WorkflowSimulator.run_discussion_turn]
E --> F[wait_for_ai_response]
F --> G{Status == idle & Last == AI?}
G -- No --> F
G -- Yes --> H[Validation/Assertions]
H --> I[BaseSimulation.teardown]
```
### Call Graph (Depth 3)
```text
-> aggregate.run (src\aggregate.py)
-> build_file_items (src\aggregate.py)
-> get_monitor (src\performance_monitor.py)
-> resolve_paths (src\aggregate.py)
-> build_markdown_from_items (src\aggregate.py)
-> _build_files_section_from_items (src\aggregate.py)
-> build_beads_section (src\aggregate.py)
-> build_discussion_section (src\aggregate.py)
-> build_summary_markdown (src\summarize.py)
-> find_next_increment (src\aggregate.py)
```
- **Entry Point:** `run_sim(MySimulation)`
- **Route:**
1. **Scaffolding:** `BaseSimulation.setup()` initializes the `ApiHookClient`, clears the current session, and creates a temporary test project.
2. **Workflow Orchestration:** `WorkflowSimulator.setup_new_project()` and `create_discussion()` configure the UI state for the test scenario.
3. **Interaction Loop:** `WorkflowSimulator.run_discussion_turn()` manages the multi-turn exchange.
- Polling: Continuously checks `ai_status` via HTTP hooks.
- Stall Recovery: Automatically re-triggers the Send action if the AI stops without a final response (e.g., after a tool call).
4. **Validation:** Subclasses perform assertions against the UI state (e.g., `assert_panel_visible()`).
5. **Cleanup:** `BaseSimulation.teardown()` handles resource deallocation.
- **Data Responsibility:**
- **Owned:** Mock project paths, synthetic user messages.
- **Mutated:** Global `ai_status` (indirectly via Hooks), target file system in the test project.
- **Terminal Output:** Test pass/fail status, performance/coverage metrics.
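The lifecycle contract can be sketched in a few lines (an assumed shape of `run_sim`; the real orchestrator also scaffolds a temp project and hooks the API client):

```python
class DemoSimulation:
 # Minimal stand-in for the BaseSimulation interface in sim_base.py.
 def setup(self): self.log = ["setup"]
 def run(self): self.log.append("run")
 def teardown(self): self.log.append("teardown")

def run_sim(sim_cls):
 # Mirrors the setup() -> run() -> teardown() pipeline; teardown
 # always executes, even when run() raises during validation.
 sim = sim_cls()
 sim.setup()
 try:
  sim.run()
 finally:
  sim.teardown()
 return sim

sim = run_sim(DemoSimulation)
```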
### 3.2 Verification & Checkpointing Protocol
- **Turn Completion Logic:** `WorkflowSimulator.wait_for_ai_response()` implements a state machine for turn detection.
- **Transition-Based:** Tracks `was_busy` (status in ["thinking", "streaming", "running powershell", etc.]) and triggers completion when status returns to "idle" and the last history role is "AI".
- **Error Handling:** GUI-reported "error" statuses trigger an immediate abort.
- **Stall Recovery:** Detects "stalled" turns where the last role is "Tool" but the system is "idle" (indicating a tool result was received but the AI didn't automatically continue). The simulator re-triggers the `btn_gen_send` hook to force progress.
- **State Determinism:** Simulations force `auto_add_history=True` and reset sessions during `setup()` to ensure a clean slate for verification.
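The transition-based detection above can be sketched as a small state machine (illustrative only; the real `wait_for_ai_response` polls live status over HTTP hooks rather than consuming a pre-recorded sequence):

```python
BUSY_STATUSES = {"thinking", "streaming", "running powershell"}

def classify_turn(status_seq, last_role):
 # Transition-based detection: a busy state must be observed before
 # idle-with-AI-last counts as a completed turn.
 was_busy = False
 for status in status_seq:
  if status == "error":
   return "abort"
  if status in BUSY_STATUSES:
   was_busy = True
  elif status == "idle" and was_busy:
   if last_role == "AI":
    return "complete"
   if last_role == "Tool":
    return "stalled"  # simulator would re-trigger btn_gen_send here
 return "waiting"
```

Requiring the `was_busy` transition prevents a false "complete" on the initial idle state before the AI ever starts.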
---
## 4. Data Responsibility & State Boundaries
*Mapping which pipelines own and mutate specific data structures.*
## 3. GUI Event & State Synchronization
**Primary Route:** Traces the background event loop and state management.
| Pipeline | Primary Data Owned | Mutated State | Terminal Output |
| :--- | :--- | :--- | :--- |
| **2.1 Context Aggregation** | `FileItem` list, `history` list | None (Pure Synthesis) | Markdown Context String |
| **2.2 AI Interaction** | AI History, Tool Schemas | `history` (Turns), `cost_tracker` | AI Response, Tool Calls |
| **2.3 GUI & Sync** | ImGui State, Controller Config | `ai_status`, `pending_tasks` | Visual Feedback, Log Entries |
| **Simulation (3.1)** | `BaseSimulation` state, Mock Hooks | Virtual `ai_status`, polled history | Test Pass/Fail, Coverage Metrics |
### Call Graph (Depth 3)
```text
-> app_controller._process_event_queue (src\app_controller.py)
-> refresh_external_mcps (src\app_controller.py)
-> add_server (src\mcp_client.py)
-> stop_all (src\mcp_client.py)
-> run (src\aggregate.py)
-> build_file_items (src\aggregate.py)
-> build_markdown_from_items (src\aggregate.py)
```
---
## 5. Identified Redundancies & Curation Targets
*List of specific areas for pruning in the next phase.*
## 4. Simulation Lifecycle Pipeline
**Primary Route:** Traces the automated verification and scaffolding flow.
### 5.1 Configuration & Model Redundancies
- **Duplicate Class Definitions:** `models.py` contains redundant definitions for `TextEditorConfig` and `ExternalEditorConfig`.
- **Provider Registry:** Both `gui_2.py` and `app_controller.py` maintain their own `PROVIDERS` list. This should be consolidated into `models.py` or a dedicated config module.
### Call Graph (Depth 3)
```text
-> simulation.sim_base.run_sim (simulation\sim_base.py)
-> run (src\aggregate.py)
-> build_file_items (src\aggregate.py)
-> build_markdown_from_items (src\aggregate.py)
-> teardown (simulation\sim_base.py)
```
### 5.2 Processing Overlap
- **Context Synthesis:** `aggregate.py` has several tier-specific functions (`build_tier1_context`, `build_tier2_context`, etc.) that share significant boilerplate logic. These should be refactored into a single param-driven pipeline.
- **Simulation Setup:** `WorkflowSimulator` and `BaseSimulation` have overlapping responsibilities for project scaffolding and session resetting.
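One possible shape for the suggested param-driven refactor (a sketch under assumed knobs; `TierSpec` and its fields are invented for illustration, not the actual `aggregate.py` API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TierSpec:
 # Assumed knobs that vary across the tier-specific builders.
 summarize: bool
 include_history: bool

TIERS = {
 1: TierSpec(summarize=False, include_history=True),
 2: TierSpec(summarize=True, include_history=False),
}

def build_context(files: dict[str, str], tier: int) -> str:
 # Single param-driven pipeline replacing build_tier1_context,
 # build_tier2_context, etc.: shared boilerplate lives here once.
 spec = TIERS[tier]
 body = []
 for path, content in files.items():
  if spec.summarize and len(content) > 40:
   content = content[:40] + "..."
  body.append(f"## {path}\n{content}")
 parts = ["\n\n".join(body)]
 if spec.include_history:
  parts.append("## Discussion History")
 return "\n\n".join(parts)
```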
---
### 5.3 Style & Integrity Violations
- **Inconsistent Docstrings:** Some older modules lack the standardized "Architecture" and "Key Components" headers.
- **Type Hinting Gaps:** `shell_runner.py` and some simulation utility scripts have incomplete type hints.
- **Indentation Check:** Perform a sweep to ensure 100% compliance with the 1-space indentation rule.
## 5. Performance & Curation Insights
### 5.1 Redundancy Hotspots
- **Aggregation Boilerplate:** Both `AppController` and `run_sim` call `aggregate.run` directly, but `ai_client` also triggers it during context refreshes. This indicates a potential for a shared, reactive context manager.
- **Provider Divergence:** The call graph for `_send_anthropic` is significantly more complex than `_send_gemini`, suggesting inconsistent abstraction for context chunking and history management.
### 5.2 Threading Boundaries
- **Context Switches:** The audit confirms that `_process_event_queue` acts as the primary synchronization gate, transitioning foreground UI requests into background worker tasks.
- **Lock Contention:** Heavy reliance on `threading.Lock` within `mcp_client` (implied by `configure` calls) and `app_controller` suggests areas where lock-free data structures could improve frame latency.
### 5.3 Automated Tooling Success
- The `derive_code_path` tool successfully navigated cross-subsystem boundaries (e.g., from `simulation` back into `src.aggregate`).
- Future curation should prioritize tools over manual reviews to maintain the Acton/Muratori standards of technical discipline.
+87 -1
@@ -30,7 +30,7 @@ Tool Categories:
py_update_definition, py_get_signature, py_set_signature, py_get_class_summary,
py_get_var_declaration, py_set_var_declaration
- Analysis: get_file_summary, get_git_diff, py_find_usages, py_get_imports,
-py_check_syntax, py_get_hierarchy, py_get_docstring
+py_check_syntax, py_get_hierarchy, py_get_docstring, derive_code_path
- Network: web_search, fetch_url
- Runtime: get_ui_performance
@@ -947,6 +947,66 @@ def get_tree(path: str, max_depth: int = 2) -> str:
return f"ERROR generating tree for '{path}': {e}"
# ------------------------------------------------------------------ web tools
def derive_code_path(target: str, max_depth: int = 5) -> str:
 """Recursively traces the execution path of a specific function or method."""
 from src.file_cache import ASTParser
 parser = ASTParser("python")
 found_path, found_code = None, None
 parts = target.split(".")
 symbol_name = parts[-1]
 if len(parts) > 1:
  possible_file = Path(*parts[:-1]).with_suffix(".py")
  if possible_file.exists(): found_path = str(possible_file)
 if not found_path:
  for root in ["src", "simulation"]:
   for p in Path(root).rglob("*.py"):
    if not _is_allowed(p): continue
    code = p.read_text(encoding="utf-8")
    if f"def {symbol_name}" in code or f"class {symbol_name}" in code:
     try:
      tree = ast.parse(code)
      if _get_symbol_node(tree, symbol_name):
       found_path, found_code = str(p), code
       break
     except Exception: continue
   if found_path: break
 if not found_path: return f"ERROR: could not find definition for '{target}'"
 if not found_code: found_code = Path(found_path).read_text(encoding="utf-8")
 visited, output = set(), [f"Code Path for: {target}", "=" * (11 + len(target)), ""]
 def trace(name, path, code, depth, indent):
  if depth > max_depth or (name, path) in visited: return
  visited.add((name, path))
  defn = parser.get_definition(code, name, path=path)
  if defn.startswith("ERROR:"):
   output.append(f"{indent}[!] {name} (Definition not found in {path})")
   return
  output.append(f"{indent}-> {name} ({path})")
  try:
   node = ast.parse(defn)
   calls = []
   for n in ast.walk(node):
    if isinstance(n, ast.Call):
     if isinstance(n.func, ast.Name): calls.append(n.func.id)
     elif isinstance(n.func, ast.Attribute): calls.append(n.func.attr)
   for call in sorted(set(calls)):
    if call in ("print", "len", "str", "int", "list", "dict", "set", "range", "enumerate", "isinstance", "getattr", "setattr", "hasattr"): continue
    c_path, c_code = None, None
    full_tree = ast.parse(code)
    if _get_symbol_node(full_tree, call): c_path, c_code = path, code
    else:
     for r in ["src", "simulation"]:
      for p in Path(r).rglob("*.py"):
       if not _is_allowed(p): continue
       f_code = p.read_text(encoding="utf-8")
       if f"def {call}" in f_code:
        c_path, c_code = str(p), f_code
        break
      if c_path: break
    if c_path: trace(call, c_path, c_code, depth + 1, indent + " ")
  except Exception as e: output.append(f"{indent} [!] Error parsing calls for {name}: {e}")
 trace(symbol_name, found_path, found_code, 0, "")
 return "\n".join(output)
class _DDGParser(HTMLParser):
def __init__(self) -> None:
super().__init__()
@@ -1283,6 +1343,11 @@ def dispatch(tool_name: str, tool_input: dict[str, Any]) -> str:
return py_get_docstring(path, str(tool_input.get("name", "")))
if tool_name == "get_tree":
return get_tree(path, int(tool_input.get("max_depth", 2)))
if tool_name == "derive_code_path":
return derive_code_path(str(tool_input.get("target", "")), int(tool_input.get("max_depth", 5)))
# Beads tools
if tool_name.startswith("bd_"):
@@ -2033,6 +2098,27 @@ MCP_TOOL_SPECS: list[dict[str, Any]] = [
"type": "object",
"properties": {}
}
},
{
"name": "derive_code_path",
"description": (
"Recursively traces the execution path of a specific function or method across multiple files. "
"Identifies call chains and data hand-offs to build an intensive technical map."
),
"parameters": {
"type": "object",
"properties": {
"target": {
"type": "string",
"description": "Fully qualified name of the target (e.g., 'src.ai_client.send') or class.method.",
},
"max_depth": {
"type": "integer",
"description": "Maximum recursion depth for the call graph (default 5).",
},
},
"required": ["target"],
},
}
]