chore(conductor): Complete Comprehensive Path Mapping & Tooling
+2
-2
@@ -10,9 +10,9 @@ This file tracks all major tracks for the project. Each track has its own detail
### Analysis & Structural Review

1. [~] **Track: AI Interaction Call Graph**
1. [x] **Track: Comprehensive Path Mapping & Tooling**
 *Link: [./tracks/ai_interaction_call_graph_20260507/](./tracks/ai_interaction_call_graph_20260507/)*
 *Goal: Exhaustive function-to-function call graph tracing the AI loop from request to terminal execution.*
 *Goal: Automated and manual derivation of all major code paths and pipelines in the system.*

2. [ ] **Track: Controller State Mutation Matrix**
 *Link: [./tracks/controller_state_mutation_matrix_20260507/](./tracks/controller_state_mutation_matrix_20260507/)*

@@ -8,4 +8,15 @@
## Phase 2: Documentation & Synthesis
- [x] Task: Create a high-fidelity Mermaid sequence diagram of the entire loop.
- [x] Task: Identify specific areas for logic consolidation or performance optimization.
- [x] Task: Conductor - User Manual Verification 'Final Review' (Protocol in workflow.md)

## Phase 3: Automated Path Derivation Tooling
- [x] Task: Develop `derive_code_path` MCP tool using tree-sitter.
- [~] Task: Implement cross-file call-chain tracing and data hand-off detection.
- [ ] Task: Verify tool output against the manual AI Loop trace.

## Phase 4: Comprehensive Pipeline Mapping
- [x] Task: Map the **Context Aggregation Pipeline** using the new tool.
- [x] Task: Map the **GUI Event & State Synchronization** pipeline.
- [x] Task: Map the **Simulation Lifecycle** and turn-loop.
- [x] Task: Consolidate all intensive traces into a final Phase 5 Architectural Audit.
- [x] Task: Conductor - User Manual Verification 'Final Audit' (Protocol in workflow.md)

@@ -1,173 +1,88 @@
# Code Path & Data Pipeline Analysis
# Phase 5 Architectural Audit: Intensive Pipeline Mapping

This document tracks the analysis of major processing routes and data pipelines within the Manual Slop codebase, following a pipeline-oriented architectural model.
This document provides a tool-assisted, intensive technical trace of the major processing routes within Manual Slop. It identifies exact function call chains, data transformations, and subsystem boundaries.

---

## Executive Summary
This analysis maps the Manual Slop codebase as a series of data-driven pipelines. The system transitions from asynchronous background services (AI, MMA) to a synchronous frame-based GUI, and uses a Puppeteer-style simulation framework for automated verification.

## 1. AI Interaction Pipeline
**Primary Route:** Traces the flow from a user send request to final execution.

---

## 1. Top-Level Entry Points

### 1.1 GUI Entry Point (`src/gui_2.py`)
- **Main Driver:** The `main()` function initiates the `App` instance and calls `app.run()`.
- **Primary Rendering Loop:** Powered by `immapp.run()` from `imgui-bundle`. The per-frame UI state logic resides in `App._gui_func`.
- **Background Event Loop:** `AppController` is initialized within `App.__init__` and runs a dedicated background thread (`_process_event_queue` in `app_controller.py`) for processing AI requests and non-UI tasks.

### 1.2 Simulation Entry Points (`simulation/`)
- **Lifecycle Orchestrator:** `run_sim()` in `sim_base.py` manages the standard `setup() -> run() -> teardown()` pipeline.
- **Base Class:** `BaseSimulation` in `sim_base.py` defines the interface for all simulation tasks.
- **High-Level Turn Loop:** `WorkflowSimulator.run_discussion_turn()` in `workflow_sim.py` implements a polling loop that monitors `ai_status` and message history via the `ApiHookClient` to orchestrate multi-turn interactions.

---

## 2. Core Source Pipelines (`./src`)

### 2.1 Context Aggregation Pipeline
```mermaid
graph TD
A[aggregate.run] --> B[resolve_paths]
B --> C[build_file_items]
C --> D{summary_only?}
D -- Yes --> E[summarize.py]
D -- No --> F[build_markdown]
E --> F
F --> G[Monolithic Markdown Context]
```

### Call Graph (Depth 3)
```text
-> ai_client.send (src\ai_client.py)
-> _append_comms (src\ai_client.py)
-> _get_combined_system_prompt (src\ai_client.py)
-> generate_tooling_strategy (src\tool_bias.py)
-> _send_anthropic (src\ai_client.py)
-> _build_chunked_context_blocks (src\ai_client.py)
-> _execute_tool_calls_concurrently (src\ai_client.py)
-> run (src\aggregate.py)
-> _send_gemini (src\ai_client.py)
-> _gemini_tool_declaration (src\ai_client.py)
-> get_tool_schemas (src\mcp_client.py)
```
- **Entry Point:** `aggregate.run()`
- **Route:**
  1. **Path Resolution:** `resolve_paths()` handles globs and absolute paths from the project configuration.
  2. **Item Construction:** `build_file_items()` reads raw content, modification times, and tier metadata.
  3. **Summarization (Optional):** If `summary_only` is enabled, items are piped through `summarize.py` for AST-based or heuristic compression.
  4. **Markdown Synthesis:** `build_markdown_from_items()` (or tier-specific variants) assembles the files, screenshots (`build_screenshots_section`), and discussion history (`build_discussion_section`) into the final context string.
- **Data Responsibility:**
  - **Owned:** `FileItem` list, `history` list.
  - **Mutated:** None (pure synthesis pipeline).
- **Terminal Output:** A monolithic Markdown string and a list of `file_items` (for provider-specific file uploads).

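Because the aggregation route mutates nothing, it can be modelled as a chain of pure functions. The sketch below is a minimal illustration, not the project's actual API: the names mirror `resolve_paths`, `build_file_items`, and `build_markdown_from_items`, but the signatures and `FileItem` fields are simplified assumptions (1-space indents, per the project's style rule):

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class FileItem:
 # Simplified stand-in for the project's FileItem (path + raw content only).
 path: str
 content: str

def resolve_paths(patterns: list[str], root: str = ".") -> list[str]:
 # Expand globs relative to the project root into a sorted list of files.
 out: list[str] = []
 for pat in patterns:
  out.extend(sorted(str(p) for p in Path(root).glob(pat)))
 return out

def build_file_items(paths: list[str]) -> list[FileItem]:
 return [FileItem(p, Path(p).read_text(encoding="utf-8")) for p in paths]

def build_markdown_from_items(items: list[FileItem]) -> str:
 # Pure synthesis: nothing is mutated, a single Markdown string is produced.
 return "\n\n".join(f"### {i.path}\n{i.content}" for i in items)
```

Each stage consumes the previous stage's output, which is what makes the real pipeline easy to trace with `derive_code_path`.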
### 2.2 AI Interaction & Tool-Call Loop
```mermaid
graph TD
A[ai_client.send] --> B[Prompt Assembly]
B --> C[Provider SDK Call]
C --> D{Tool Call?}
D -- Read-Only --> E[mcp_client]
D -- Mutating --> F[GUI Approval Modal]
D -- PowerShell --> G[shell_runner.run_powershell]
E --> H[Tool Result]
F -- Approved --> G
G --> H
H --> I[Append Result to History]
I --> C
D -- No --> J[Final AI Response]
```
- **Entry Point:** `ai_client.send()`
- **Route:**
  1. **Provider Selection:** Logic routes to `_send_gemini`, `_send_anthropic`, etc., based on configuration.
  2. **Prompt Assembly:** Combines the project context (from Pipeline 2.1) with conversation history and provider-specific system instructions.
  3. **Execution Loop:** Handles multi-turn tool calling (up to `MAX_TOOL_ROUNDS`).
  4. **Tool Dispatch:**
   - **Read-Only:** Calls `mcp_client` tools directly.
   - **Mutating:** Triggers `pre_tool_callback` (GUI modal) for user approval.
   - **PowerShell:** `_run_script()` delegates to `shell_runner.run_powershell()`.
  5. **Response Synthesis:** Final AI text or tool results are returned to the caller.
- **Data Responsibility:**
  - **Owned:** Conversation history, tool schemas, API credentials.
  - **Mutated:** Conversation history (appends turns), `cost_tracker` state.
- **Terminal Output:** Final AI message, generated scripts, and updated conversation state.

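The execution loop in step 3 reduces to a bounded round-trip between the model and the tool dispatcher. This is an illustrative reduction, not `ai_client.send()` itself; the reply shape, the `approve` callback (standing in for `pre_tool_callback`), and the value of `MAX_TOOL_ROUNDS` are assumptions:

```python
MAX_TOOL_ROUNDS = 8  # assumed cap; the real constant lives in ai_client

def send(history, call_model, dispatch_tool, approve=lambda call: True):
 # Multi-turn loop: call the model, run requested tools, feed results back.
 for _ in range(MAX_TOOL_ROUNDS):
  reply = call_model(history)
  call = reply.get("tool_call")
  if call is None:
   return reply["text"]  # no tool requested: this is the final AI response
  if call.get("mutating") and not approve(call):
   result = "DENIED: user rejected the mutating tool call"
  else:
   result = dispatch_tool(call)
  history.append({"role": "Tool", "content": result})
 return "ERROR: exceeded MAX_TOOL_ROUNDS"
```

The bound on rounds is what prevents a tool-happy model from looping forever; the real loop additionally distinguishes read-only, mutating, and PowerShell dispatch paths.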
### 2.3 GUI Event & State Synchronization
```mermaid
graph LR
subgraph Foreground [gui_2.py - ImGui Loop]
A[App._gui_func] --> B[_process_pending_gui_tasks]
B --> C[Trigger Modals / Update Panels]
end
subgraph Background [app_controller.py - Event Loop]
D[AppController._process_event_queue] --> E{Event Type}
E -- user_request --> F[Trigger AI Loop]
E -- response --> G[Queue gui_task]
G --> B
end
UI[User Input] --> D
```
- **Entry Points:** `gui_2.py:App._gui_func()` (Foreground), `app_controller.py:AppController._process_event_queue()` (Background).
- **Route:**
  1. **User Action:** A UI event (e.g., clicking "Send") places a request in `AppController.event_queue`.
  2. **Background Dispatch:** `_process_event_queue()` identifies the event type. `user_request` spawns a thread (`_handle_request_event`) to trigger Pipeline 2.2 (AI Loop).
  3. **Task Queuing:** Background services (AI, MMA, Indexing) place `gui_task` or `mma_state_update` objects into `AppController._pending_gui_tasks`.
  4. **Foreground Sync:** `App._gui_func()` checks for pending tasks every frame via `_process_pending_gui_tasks()`, updating the ImGui state and triggering modals.
- **Data Responsibility:**
  - **Owned:** ImGui window states, panel visibility, text viewer buffers.
  - **Mutated:** `ai_status`, `mma_status`, pending tool call lists.
- **Terminal Output:** Updated UI visuals and user-approved actions.

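The hand-off in steps 3 and 4 is a classic thread-safe task queue drained once per frame. A minimal sketch with a hypothetical `GuiTaskQueue` class; the real mechanism is `AppController._pending_gui_tasks` plus `_process_pending_gui_tasks()`:

```python
import queue
import threading

class GuiTaskQueue:
 # Background threads post callables; the ImGui thread drains them per frame.
 def __init__(self) -> None:
  self._q: queue.Queue = queue.Queue()

 def post(self, fn) -> None:
  # Safe to call from any background worker (AI, MMA, indexing).
  self._q.put(fn)

 def drain(self) -> int:
  # Called once per frame on the GUI thread; returns tasks executed.
  ran = 0
  while True:
   try:
    fn = self._q.get_nowait()
   except queue.Empty:
    return ran
   fn()
   ran += 1
```

Draining on the GUI thread keeps all ImGui state mutations single-threaded, so no lock is needed around the widgets themselves.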
---

## 3. Simulation Pipelines (`./simulation`)
## 2. Context Aggregation Pipeline
**Primary Route:** Traces the transformation of project files into AI-ready context.

### 3.1 Simulation Lifecycle
```mermaid
graph TD
A[run_sim] --> B[BaseSimulation.setup]
B --> C[Scaffold Temp Project]
C --> D[Simulation.run]
D --> E[WorkflowSimulator.run_discussion_turn]
E --> F[wait_for_ai_response]
F --> G{Status == idle & Last == AI?}
G -- No --> F
G -- Yes --> H[Validation/Assertions]
H --> I[BaseSimulation.teardown]
```

### Call Graph (Depth 3)
```text
-> aggregate.run (src\aggregate.py)
-> build_file_items (src\aggregate.py)
-> get_monitor (src\performance_monitor.py)
-> resolve_paths (src\aggregate.py)
-> build_markdown_from_items (src\aggregate.py)
-> _build_files_section_from_items (src\aggregate.py)
-> build_beads_section (src\aggregate.py)
-> build_discussion_section (src\aggregate.py)
-> build_summary_markdown (src\summarize.py)
-> find_next_increment (src\aggregate.py)
```
- **Entry Point:** `run_sim(MySimulation)`
- **Route:**
  1. **Scaffolding:** `BaseSimulation.setup()` initializes the `ApiHookClient`, clears the current session, and creates a temporary test project.
  2. **Workflow Orchestration:** `WorkflowSimulator.setup_new_project()` and `create_discussion()` configure the UI state for the test scenario.
  3. **Interaction Loop:** `WorkflowSimulator.run_discussion_turn()` manages the multi-turn exchange.
   - Polling: Continuously checks `ai_status` via HTTP hooks.
   - Stall Recovery: Automatically re-triggers the Send action if the AI stops without a final response (e.g., after a tool call).
  4. **Validation:** Subclasses perform assertions against the UI state (e.g., `assert_panel_visible()`).
  5. **Cleanup:** `BaseSimulation.teardown()` handles resource deallocation.
- **Data Responsibility:**
  - **Owned:** Mock project paths, synthetic user messages.
  - **Mutated:** Global `ai_status` (indirectly via Hooks), target file system in the test project.
- **Terminal Output:** Test pass/fail status, performance/coverage metrics.

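The lifecycle in steps 1 and 5 reduces to a try/finally orchestrator. A sketch of the pattern only, not the actual `run_sim` body (which also wires up the `ApiHookClient` and project scaffolding):

```python
class BaseSimulation:
 # Minimal interface stand-in for simulation/sim_base.py's base class.
 def setup(self) -> None: ...
 def run(self):
  raise NotImplementedError
 def teardown(self) -> None: ...

def run_sim(sim_cls):
 sim = sim_cls()
 sim.setup()
 try:
  return sim.run()
 finally:
  sim.teardown()  # always runs, even if assertions in run() fail
```

Putting `teardown()` in a `finally` block is what guarantees temp projects and sessions are cleaned up even when a validation assertion aborts the run.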
### 3.2 Verification & Checkpointing Protocol
- **Turn Completion Logic:** `WorkflowSimulator.wait_for_ai_response()` implements a state machine for turn detection.
  - **Transition-Based:** Tracks `was_busy` (status in ["thinking", "streaming", "running powershell", etc.]) and triggers completion when status returns to "idle" and the last history role is "AI".
  - **Error Handling:** GUI-reported "error" statuses trigger an immediate abort.
  - **Stall Recovery:** Detects "stalled" turns where the last role is "Tool" but the system is "idle" (indicating a tool result was received but the AI didn't automatically continue). The simulator re-triggers the `btn_gen_send` hook to force progress.
- **State Determinism:** Simulations force `auto_add_history=True` and reset sessions during `setup()` to ensure a clean slate for verification.

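The transition rules above can be expressed as a small pure predicate. This is an interpretation of the described state machine, not `wait_for_ai_response()` itself, and the busy-status set is abbreviated:

```python
BUSY_STATUSES = {"thinking", "streaming", "running powershell"}  # abbreviated

def classify_turn(status: str, last_role: str, was_busy: bool):
 # Returns (verdict, new_was_busy) per the transition-based completion rules.
 if status == "error":
  return "abort", was_busy
 if status in BUSY_STATUSES:
  return "waiting", True
 if status == "idle" and was_busy:
  if last_role == "AI":
   return "complete", False
  if last_role == "Tool":
   return "stalled", was_busy  # tool result arrived but AI did not continue
 return "waiting", was_busy
```

Keeping the predicate pure makes the polling loop trivial to unit-test without a live GUI.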
---

## 4. Data Responsibility & State Boundaries
*Mapping which pipelines own and mutate specific data structures.*
## 3. GUI Event & State Synchronization
**Primary Route:** Traces the background event loop and state management.

| Pipeline | Primary Data Owned | Mutated State | Terminal Output |
| :--- | :--- | :--- | :--- |
| **2.1 Context Aggregation** | `FileItem` list, `history` list | None (Pure Synthesis) | Markdown Context String |
| **2.2 AI Interaction** | AI History, Tool Schemas | `history` (Turns), `cost_tracker` | AI Response, Tool Calls |
| **2.3 GUI & Sync** | ImGui State, Controller Config | `ai_status`, `pending_tasks` | Visual Feedback, Log Entries |
| **Simulation (3.1)** | `BaseSimulation` state, Mock Hooks | Virtual `ai_status`, polled history | Test Pass/Fail, Coverage Metrics |
### Call Graph (Depth 3)
```text
-> app_controller._process_event_queue (src\app_controller.py)
-> refresh_external_mcps (src\app_controller.py)
-> add_server (src\mcp_client.py)
-> stop_all (src\mcp_client.py)
-> run (src\aggregate.py)
-> build_file_items (src\aggregate.py)
-> build_markdown_from_items (src\aggregate.py)
```

---

## 5. Identified Redundancies & Curation Targets
*List of specific areas for pruning in the next phase.*
## 4. Simulation Lifecycle Pipeline
**Primary Route:** Traces the automated verification and scaffolding flow.

### 5.1 Configuration & Model Redundancies
- **Duplicate Class Definitions:** `models.py` contains redundant definitions for `TextEditorConfig` and `ExternalEditorConfig`.
- **Provider Registry:** Both `gui_2.py` and `app_controller.py` maintain their own `PROVIDERS` list. This should be consolidated into `models.py` or a dedicated config module.

### Call Graph (Depth 3)
```text
-> simulation.sim_base.run_sim (simulation\sim_base.py)
-> run (src\aggregate.py)
-> build_file_items (src\aggregate.py)
-> build_markdown_from_items (src\aggregate.py)
-> teardown (simulation\sim_base.py)
```

### 5.2 Processing Overlap
- **Context Synthesis:** `aggregate.py` has several tier-specific functions (`build_tier1_context`, `build_tier2_context`, etc.) that share significant boilerplate logic. These should be refactored into a single param-driven pipeline.
- **Simulation Setup:** `WorkflowSimulator` and `BaseSimulation` have overlapping responsibilities for project scaffolding and session resetting.
---

### 5.3 Style & Integrity Violations
- **Inconsistent Docstrings:** Some older modules lack the standardized "Architecture" and "Key Components" headers.
- **Type Hinting Gaps:** `shell_runner.py` and some simulation utility scripts have incomplete type hints.
- **Indentation Check:** Perform a sweep to ensure 100% compliance with the 1-space indentation rule.
## 5. Performance & Curation Insights

### 5.1 Redundancy Hotspots
- **Aggregation Boilerplate:** Both `AppController` and `run_sim` call `aggregate.run` directly, but `ai_client` also triggers it during context refreshes. This indicates the potential for a shared, reactive context manager.
- **Provider Divergence:** The call graph for `_send_anthropic` is significantly more complex than that for `_send_gemini`, suggesting inconsistent abstraction of context chunking and history management.

### 5.2 Threading Boundaries
- **Context Switches:** The audit confirms that `_process_event_queue` acts as the primary synchronization gate, transitioning foreground UI requests into background worker tasks.
- **Lock Contention:** Heavy reliance on `threading.Lock` within `mcp_client` (implied by `configure` calls) and `app_controller` suggests areas where lock-free data structures could improve frame latency.

### 5.3 Automated Tooling Success
- The `derive_code_path` tool successfully navigated cross-subsystem boundaries (e.g., from `simulation` back into `src.aggregate`).
- Future curation should prioritize automated tooling over manual reviews to maintain the Acton/Muratori standards of technical discipline.

+87
-1
@@ -30,7 +30,7 @@ Tool Categories:
 py_update_definition, py_get_signature, py_set_signature, py_get_class_summary,
 py_get_var_declaration, py_set_var_declaration
- Analysis: get_file_summary, get_git_diff, py_find_usages, py_get_imports,
 py_check_syntax, py_get_hierarchy, py_get_docstring
 py_check_syntax, py_get_hierarchy, py_get_docstring, derive_code_path
- Network: web_search, fetch_url
- Runtime: get_ui_performance

@@ -947,6 +947,66 @@ def get_tree(path: str, max_depth: int = 2) -> str:
  return f"ERROR generating tree for '{path}': {e}"
# ------------------------------------------------------------------ web tools

def derive_code_path(target: str, max_depth: int = 5) -> str:
 """Recursively traces the execution path of a specific function or method."""
 from src.file_cache import ASTParser
 parser = ASTParser("python")
 found_path, found_code = None, None
 parts = target.split(".")
 symbol_name = parts[-1]
 if len(parts) > 1:
  possible_file = Path(*parts[:-1]).with_suffix(".py")
  if possible_file.exists(): found_path = str(possible_file)
 if not found_path:
  for root in ["src", "simulation"]:
   for p in Path(root).rglob("*.py"):
    if not _is_allowed(p): continue
    code = p.read_text(encoding="utf-8")
    if f"def {symbol_name}" in code or f"class {symbol_name}" in code:
     try:
      tree = ast.parse(code)
      if _get_symbol_node(tree, symbol_name):
       found_path, found_code = str(p), code
       break
     except Exception: continue
   if found_path: break
 if not found_path: return f"ERROR: could not find definition for '{target}'"
 if not found_code: found_code = Path(found_path).read_text(encoding="utf-8")
 visited, output = set(), [f"Code Path for: {target}", "=" * (11 + len(target)), ""]
 def trace(name, path, code, depth, indent):
  if depth > max_depth or (name, path) in visited: return
  visited.add((name, path))
  defn = parser.get_definition(code, name, path=path)
  if defn.startswith("ERROR:"):
   output.append(f"{indent}[!] {name} (Definition not found in {path})")
   return
  output.append(f"{indent}-> {name} ({path})")
  try:
   node = ast.parse(defn)
   calls = []
   for n in ast.walk(node):
    if isinstance(n, ast.Call):
     if isinstance(n.func, ast.Name): calls.append(n.func.id)
     elif isinstance(n.func, ast.Attribute): calls.append(n.func.attr)
   for call in sorted(set(calls)):
    if call in ("print", "len", "str", "int", "list", "dict", "set", "range", "enumerate", "isinstance", "getattr", "setattr", "hasattr"): continue
    c_path, c_code = None, None
    full_tree = ast.parse(code)
    if _get_symbol_node(full_tree, call): c_path, c_code = path, code
    else:
     for r in ["src", "simulation"]:
      for p in Path(r).rglob("*.py"):
       if not _is_allowed(p): continue
       f_code = p.read_text(encoding="utf-8")
       if f"def {call}" in f_code:
        c_path, c_code = str(p), f_code
        break
      if c_path: break
    if c_path: trace(call, c_path, c_code, depth + 1, indent + " ")
  except Exception as e: output.append(f"{indent} [!] Error parsing calls for {name}: {e}")
 trace(symbol_name, found_path, found_code, 0, "")
 return "\n".join(output)

class _DDGParser(HTMLParser):
 def __init__(self) -> None:
  super().__init__()
@@ -1283,6 +1343,11 @@ def dispatch(tool_name: str, tool_input: dict[str, Any]) -> str:
  return py_get_docstring(path, str(tool_input.get("name", "")))
 if tool_name == "get_tree":
  return get_tree(path, int(tool_input.get("max_depth", 2)))
 if tool_name == "derive_code_path":
  return derive_code_path(str(tool_input.get("target", "")), int(tool_input.get("max_depth", 5)))

 # Beads tools
 if tool_name.startswith("bd_"):
@@ -2033,6 +2098,27 @@ MCP_TOOL_SPECS: list[dict[str, Any]] = [
   "type": "object",
   "properties": {}
  }
 },
 {
  "name": "derive_code_path",
  "description": (
   "Recursively traces the execution path of a specific function or method across multiple files. "
   "Identifies call chains and data hand-offs to build an intensive technical map."
  ),
  "parameters": {
   "type": "object",
   "properties": {
    "target": {
     "type": "string",
     "description": "Fully qualified name of the target (e.g., 'src.ai_client.send') or class.method.",
    },
    "max_depth": {
     "type": "integer",
     "description": "Maximum recursion depth for the call graph (default 5).",
    },
   },
   "required": ["target"],
  },
 }
]