From 822d803ad8248574e3a2201f88d78a99fe8f6236 Mon Sep 17 00:00:00 2001
From: Ed_
Date: Thu, 7 May 2026 22:01:25 -0400
Subject: [PATCH] chore(conductor): Complete Code Path & Data Pipeline Analysis

---
 PIPELINE_ANALYSIS.md                          | 173 ++++++++++++++++++
 conductor/tracks.md                           |   2 +-
 .../code_path_analysis_20260507/plan.md       |  34 ++--
 3 files changed, 191 insertions(+), 18 deletions(-)
 create mode 100644 PIPELINE_ANALYSIS.md

diff --git a/PIPELINE_ANALYSIS.md b/PIPELINE_ANALYSIS.md
new file mode 100644
index 0000000..adeb9a7
--- /dev/null
+++ b/PIPELINE_ANALYSIS.md
@@ -0,0 +1,173 @@
+# Code Path & Data Pipeline Analysis
+
+This document tracks the analysis of major processing routes and data pipelines within the Manual Slop codebase, following a pipeline-oriented architectural model.
+
+---
+
+## Executive Summary
+This analysis maps the Manual Slop codebase as a series of data-driven pipelines. The system transitions from asynchronous background services (AI, MMA) to a synchronous frame-based GUI, and uses a Puppeteer-style simulation framework for automated verification.
+
+---
+
+## 1. Top-Level Entry Points
+
+### 1.1 GUI Entry Point (`src/gui_2.py`)
+- **Main Driver:** `main()` function initiates the `App` instance and calls `app.run()`.
+- **Primary Rendering Loop:** Powered by `immapp.run()` from `imgui-bundle`. The per-frame UI state logic resides in `App._gui_func`.
+- **Background Event Loop:** `AppController` is initialized within `App.__init__` and runs a dedicated background thread (`_process_event_queue` in `app_controller.py`) for processing AI requests and non-UI tasks.
+
+### 1.2 Simulation Entry Points (`simulation/`)
+- **Lifecycle Orchestrator:** `run_sim()` in `sim_base.py` manages the standard `setup() -> run() -> teardown()` pipeline.
+- **Base Class:** `BaseSimulation` in `sim_base.py` defines the interface for all simulation tasks.
+- **High-Level Turn Loop:** `WorkflowSimulator.run_discussion_turn()` in `workflow_sim.py` implements a polling loop that monitors `ai_status` and message history via the `ApiHookClient` to orchestrate multi-turn interactions.
+
+---
+
+## 2. Core Source Pipelines (`./src`)
+
+### 2.1 Context Aggregation Pipeline
+```mermaid
+graph TD
+    A[aggregate.run] --> B[resolve_paths]
+    B --> C[build_file_items]
+    C --> D{summary_only?}
+    D -- Yes --> E[summarize.py]
+    D -- No --> F[build_markdown]
+    E --> F
+    F --> G[Monolithic Markdown Context]
+```
+- **Entry Point:** `aggregate.run()`
+- **Route:**
+  1. **Path Resolution:** `resolve_paths()` handles globs and absolute paths from the project configuration.
+  2. **Item Construction:** `build_file_items()` reads raw content, modification times, and tier metadata.
+  3. **Summarization (Optional):** If `summary_only` is enabled, items are piped through `summarize.py` for AST-based or heuristic compression.
+  4. **Markdown Synthesis:** `build_markdown_from_items()` (or tier-specific variants) assembles the files, screenshots (`build_screenshots_section`), and discussion history (`build_discussion_section`) into the final context string.
+- **Data Responsibility:**
+  - **Owned:** `FileItem` list, `history` list.
+  - **Mutated:** None (pure synthesis pipeline).
+  - **Terminal Output:** A monolithic Markdown string and a list of `file_items` (for provider-specific file uploads).
+
+### 2.2 AI Interaction & Tool-Call Loop
+```mermaid
+graph TD
+    A[ai_client.send] --> B[Prompt Assembly]
+    B --> C[Provider SDK Call]
+    C --> D{Tool Call?}
+    D -- Read-Only --> E[mcp_client]
+    D -- Mutating --> F[GUI Approval Modal]
+    D -- PowerShell --> G[shell_runner.run_powershell]
+    E --> H[Tool Result]
+    F -- Approved --> G
+    G --> H
+    H --> I[Append Result to History]
+    I --> C
+    D -- No --> J[Final AI Response]
+```
+- **Entry Point:** `ai_client.send()`
+- **Route:**
+  1. **Provider Selection:** Logic routes to `_send_gemini`, `_send_anthropic`, etc., based on configuration.
+  2. **Prompt Assembly:** Combines the project context (from Pipeline 2.1) with conversation history and provider-specific system instructions.
+  3. **Execution Loop:** Handles multi-turn tool calling (up to `MAX_TOOL_ROUNDS`).
+  4. **Tool Dispatch:**
+     - **Read-Only:** Calls `mcp_client` tools directly.
+     - **Mutating:** Triggers `pre_tool_callback` (GUI modal) for user approval.
+     - **PowerShell:** `_run_script()` delegates to `shell_runner.run_powershell()`.
+  5. **Response Synthesis:** Final AI text or tool results are returned to the caller.
+- **Data Responsibility:**
+  - **Owned:** Conversation history, tool schemas, API credentials.
+  - **Mutated:** Conversation history (appends turns), `cost_tracker` state.
+  - **Terminal Output:** Final AI message, generated scripts, and updated conversation state.
+
+### 2.3 GUI Event & State Synchronization
+```mermaid
+graph LR
+    subgraph Foreground [gui_2.py - ImGui Loop]
+        A[App._gui_func] --> B[_process_pending_gui_tasks]
+        B --> C[Trigger Modals / Update Panels]
+    end
+    subgraph Background [app_controller.py - Event Loop]
+        D[AppController._process_event_queue] --> E{Event Type}
+        E -- user_request --> F[Trigger AI Loop]
+        E -- response --> G[Queue gui_task]
+        G --> B
+    end
+    UI[User Input] --> D
+```
+- **Entry Points:** `gui_2.py:App._gui_func()` (Foreground), `app_controller.py:AppController._process_event_queue()` (Background).
+- **Route:**
+  1. **User Action:** UI event (e.g., clicking "Send") places a request in `AppController.event_queue`.
+  2. **Background Dispatch:** `_process_event_queue()` identifies the event type. `user_request` spawns a thread (`_handle_request_event`) to trigger Pipeline 2.2 (AI Loop).
+  3. **Task Queuing:** Background services (AI, MMA, Indexing) place `gui_task` or `mma_state_update` objects into `AppController._pending_gui_tasks`.
+  4. **Foreground Sync:** `App._gui_func()` checks for pending tasks every frame via `_process_pending_gui_tasks()`, updating the ImGui state and triggering modals.
+- **Data Responsibility:**
+  - **Owned:** ImGui window states, panel visibility, text viewer buffers.
+  - **Mutated:** `ai_status`, `mma_status`, pending tool call lists.
+  - **Terminal Output:** Updated UI visuals and user-approved actions.
+
+---
+
+## 3. Simulation Pipelines (`./simulation`)
+
+### 3.1 Simulation Lifecycle
+```mermaid
+graph TD
+    A[run_sim] --> B[BaseSimulation.setup]
+    B --> C[Scaffold Temp Project]
+    C --> D[Simulation.run]
+    D --> E[WorkflowSimulator.run_discussion_turn]
+    E --> F[wait_for_ai_response]
+    F --> G{Status == idle & Last == AI?}
+    G -- No --> F
+    G -- Yes --> H[Validation/Assertions]
+    H --> I[BaseSimulation.teardown]
+```
+- **Entry Point:** `run_sim(MySimulation)`
+- **Route:**
+  1. **Scaffolding:** `BaseSimulation.setup()` initializes the `ApiHookClient`, clears the current session, and creates a temporary test project.
+  2. **Workflow Orchestration:** `WorkflowSimulator.setup_new_project()` and `create_discussion()` configure the UI state for the test scenario.
+  3. **Interaction Loop:** `WorkflowSimulator.run_discussion_turn()` manages the multi-turn exchange.
+     - Polling: Continuously checks `ai_status` via HTTP hooks.
+     - Stall Recovery: Automatically re-triggers the Send action if the AI stops without a final response (e.g., after a tool call).
+  4. **Validation:** Subclasses perform assertions against the UI state (e.g., `assert_panel_visible()`).
+  5. **Cleanup:** `BaseSimulation.teardown()` handles resource deallocation.
+- **Data Responsibility:**
+  - **Owned:** Mock project paths, synthetic user messages.
+  - **Mutated:** Global `ai_status` (indirectly via Hooks), target file system in the test project.
+  - **Terminal Output:** Test pass/fail status, performance/coverage metrics.
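A minimal sketch of the interaction loop in step 3 above: poll until the status is idle and the last history role is AI, and re-send a bounded number of times when the turn stalls after a tool result. `wait_for_turn`, `poll_state`, `resend`, and the parameters are hypothetical stand-ins. The real `ApiHookClient` and `WorkflowSimulator` signatures are not shown in this analysis and may differ.

```python
import time
from typing import Callable, Tuple


def wait_for_turn(
    poll_state: Callable[[], Tuple[str, str]],
    resend: Callable[[], None],
    *,
    max_polls: int = 120,
    interval_s: float = 0.5,
    max_resends: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the turn completes; return False if max_polls is exhausted.

    poll_state (hypothetical) returns (ai_status, last_history_role).
    resend (hypothetical) re-triggers the Send action in the GUI.
    Assumes the user message is already in the history before polling starts.
    """
    resends = 0
    for _ in range(max_polls):
        status, last_role = poll_state()
        if status == "idle" and last_role == "AI":
            # Completion condition from the diagram: idle and last role is AI.
            return True
        if status == "idle" and last_role == "Tool" and resends < max_resends:
            # Stalled after a tool result: nudge the GUI, but cap the retries.
            resend()
            resends += 1
        sleep(interval_s)
    return False
```

Passing `sleep` as a parameter lets a test drive the loop without real delays. The retry cap is an assumption of this sketch, added so a GUI that never recovers cannot be re-triggered indefinitely.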
+
+### 3.2 Verification & Checkpointing Protocol
+- **Turn Completion Logic:** `WorkflowSimulator.wait_for_ai_response()` implements a state machine for turn detection.
+  - **Transition-Based:** Tracks `was_busy` (status in ["thinking", "streaming", "running powershell", etc.]) and triggers completion when status returns to "idle" and the last history role is "AI".
+  - **Error Handling:** GUI-reported "error" statuses trigger an immediate abort.
+- **Stall Recovery:** Detects "stalled" turns where the last role is "Tool" but the system is "idle" (indicating a tool result was received but the AI didn't automatically continue). The simulator re-triggers the `btn_gen_send` hook to force progress.
+- **State Determinism:** Simulations force `auto_add_history=True` and reset sessions during `setup()` to ensure a clean slate for verification.
+
+---
+
+## 4. Data Responsibility & State Boundaries
+*Mapping which pipelines own and mutate specific data structures.*
+
+| Pipeline | Primary Data Owned | Mutated State | Terminal Output |
+| :--- | :--- | :--- | :--- |
+| **2.1 Context Aggregation** | `FileItem` list, `history` list | None (Pure Synthesis) | Markdown Context String |
+| **2.2 AI Interaction** | AI History, Tool Schemas | `history` (Turns), `cost_tracker` | AI Response, Tool Calls |
+| **2.3 GUI & Sync** | ImGui State, Controller Config | `ai_status`, `pending_tasks` | Visual Feedback, Log Entries |
+| **Simulation (3.1)** | `BaseSimulation` state, Mock Hooks | Virtual `ai_status`, polled history | Test Pass/Fail, Coverage Metrics |
+
+---
+
+## 5. Identified Redundancies & Curation Targets
+*List of specific areas for pruning in the next phase.*
+
+### 5.1 Configuration & Model Redundancies
+- **Duplicate Class Definitions:** `models.py` contains redundant definitions for `TextEditorConfig` and `ExternalEditorConfig`.
+- **Provider Registry:** Both `gui_2.py` and `app_controller.py` maintain their own `PROVIDERS` list. This should be consolidated into `models.py` or a dedicated config module.
+
+### 5.2 Processing Overlap
+- **Context Synthesis:** `aggregate.py` has several tier-specific functions (`build_tier1_context`, `build_tier2_context`, etc.) that share significant boilerplate logic. These should be refactored into a single param-driven pipeline.
+- **Simulation Setup:** `WorkflowSimulator` and `BaseSimulation` have overlapping responsibilities for project scaffolding and session resetting.
+
+### 5.3 Style & Integrity Violations
+- **Inconsistent Docstrings:** Some older modules lack the standardized "Architecture" and "Key Components" headers.
+- **Type Hinting Gaps:** `shell_runner.py` and some simulation utility scripts have incomplete type hints.
+- **Indentation Check:** Perform a sweep to ensure 100% compliance with the 1-space indentation rule.
diff --git a/conductor/tracks.md b/conductor/tracks.md
index 2ba53da..35ad8bf 100644
--- a/conductor/tracks.md
+++ b/conductor/tracks.md
@@ -10,7 +10,7 @@ This file tracks all major tracks for the project. Each track has its own detail
 
 ### Analysis & Structural Review
 
-1. [ ] **Track: Code Path & Data Pipeline Analysis**
+1. [x] **Track: Code Path & Data Pipeline Analysis**
    *Link: [./tracks/code_path_analysis_20260507/](./tracks/code_path_analysis_20260507/)*
    *Goal: Comprehensive analysis of major processing routes in `./src` and `./simulation`. Identify data pipelines and responsibilities. Map core execution flows to inform curation efforts.*
 
diff --git a/conductor/tracks/code_path_analysis_20260507/plan.md b/conductor/tracks/code_path_analysis_20260507/plan.md
index c8bd5e8..b346b4e 100644
--- a/conductor/tracks/code_path_analysis_20260507/plan.md
+++ b/conductor/tracks/code_path_analysis_20260507/plan.md
@@ -1,26 +1,26 @@
 # Implementation Plan: Code Path & Data Pipeline Analysis (code_path_analysis_20260507)
 
 ## Phase 1: Structural Exploration & Tooling Setup
-- [ ] Task: Initialize `PIPELINE_ANALYSIS.md` template.
-- [ ] Task: Deploy `codebase_investigator` subagents to identify top-level entry points in `gui_2.py` and `simulation/`.
-- [ ] Task: Verify usage of existing tree-sitter tools to generate initial call-graph skeletons for `./src`.
-- [ ] Task: Conductor - User Manual Verification 'Phase 1' (Protocol in workflow.md)
+- [x] Task: Initialize `PIPELINE_ANALYSIS.md` template.
+- [x] Task: Deploy `codebase_investigator` subagents to identify top-level entry points in `gui_2.py` and `simulation/`.
+- [x] Task: Verify usage of existing tree-sitter tools to generate initial call-graph skeletons for `./src`.
+- [x] Task: Conductor - User Manual Verification 'Phase 1' (Protocol in workflow.md)
 
 ## Phase 2: Mapping Core Source Pipelines (`./src`)
-- [ ] Task: Map the **Context Aggregation Pipeline** (`aggregate.py`, `models.py`).
-- [ ] Task: Map the **AI Interaction Loop** (`ai_client.py`, `mcp_client.py`, `shell_runner.py`).
-- [ ] Task: Map the **GUI Event & State Pipeline** (`gui_2.py`, `app_controller.py`).
-- [ ] Task: Document data responsibilities and state boundaries for each route.
-- [ ] Task: Conductor - User Manual Verification 'Phase 2' (Protocol in workflow.md)
+- [x] Task: Map the **Context Aggregation Pipeline** (`aggregate.py`, `models.py`).
+- [x] Task: Map the **AI Interaction Loop** (`ai_client.py`, `mcp_client.py`, `shell_runner.py`).
+- [x] Task: Map the **GUI Event & State Pipeline** (`gui_2.py`, `app_controller.py`).
+- [x] Task: Document data responsibilities and state boundaries for each route.
+- [x] Task: Conductor - User Manual Verification 'Phase 2' (Protocol in workflow.md)
 
 ## Phase 3: Mapping Simulation Pipelines (`./simulation`)
-- [ ] Task: Map the **Simulation Lifecycle** (`sim_base.py`, `sim_context.py`, `workflow_sim.py`).
-- [ ] Task: Analyze data flow between `sim_ai_settings.py` and the execution engine.
-- [ ] Task: Document the "Verification & Checkpointing" route in simulations.
-- [ ] Task: Conductor - User Manual Verification 'Phase 3' (Protocol in workflow.md)
+- [x] Task: Map the **Simulation Lifecycle** (`sim_base.py`, `sim_context.py`, `workflow_sim.py`).
+- [x] Task: Analyze data flow between `sim_ai_settings.py` and the execution engine.
+- [x] Task: Document the "Verification & Checkpointing" route in simulations.
+- [x] Task: Conductor - User Manual Verification 'Phase 3' (Protocol in workflow.md)
 
 ## Phase 4: Synthesis & Reporting
-- [ ] Task: Consolidate all findings into Mermaid diagrams within `PIPELINE_ANALYSIS.md`.
-- [ ] Task: Identify specific "Curation Targets" (redundancies, style violations) for the next track.
-- [ ] Task: Final review and hand-off to Track 2 (Codebase Curation).
-- [ ] Task: Conductor - User Manual Verification 'Phase 4' (Protocol in workflow.md)
+- [x] Task: Consolidate all findings into Mermaid diagrams within `PIPELINE_ANALYSIS.md`.
+- [x] Task: Identify specific "Curation Targets" (redundancies, style violations) for the next track.
+- [x] Task: Final review and hand-off to Track 2 (Codebase Curation).
+- [x] Task: Conductor - User Manual Verification 'Phase 4' (Protocol in workflow.md)