manual_slop/PIPELINE_ANALYSIS.md
Code Path & Data Pipeline Analysis

This document tracks the analysis of major processing routes and data pipelines within the Manual Slop codebase, following a pipeline-oriented architectural model.


Executive Summary

This analysis maps the Manual Slop codebase as a series of data-driven pipelines. The system bridges asynchronous background services (AI, MMA) with a synchronous frame-based GUI, and uses a Puppeteer-style simulation framework for automated verification.


1. Top-Level Entry Points

1.1 GUI Entry Point (src/gui_2.py)

  • Main Driver: main() instantiates App and calls app.run().
  • Primary Rendering Loop: Powered by immapp.run() from imgui-bundle. The per-frame UI state logic resides in App._gui_func.
  • Background Event Loop: AppController is initialized within App.__init__ and runs a dedicated background thread (_process_event_queue in app_controller.py) for processing AI requests and non-UI tasks.
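The foreground/background split described above can be sketched as a minimal queue-draining controller. The names `AppController` and `_process_event_queue` come from the document; the method bodies are illustrative assumptions, not the real implementation.

```python
import queue
import threading

class AppController:
    def __init__(self):
        self.event_queue: "queue.Queue[dict]" = queue.Queue()
        self.results: list[dict] = []
        # Dedicated background thread, mirroring the loop started in App.__init__.
        self._thread = threading.Thread(target=self._process_event_queue, daemon=True)
        self._thread.start()

    def _process_event_queue(self) -> None:
        while True:
            event = self.event_queue.get()
            if event.get("type") == "shutdown":
                break
            # Non-UI work (AI requests, indexing) would happen here.
            self.results.append({"handled": event["type"]})
            self.event_queue.task_done()

controller = AppController()
controller.event_queue.put({"type": "user_request"})
controller.event_queue.join()  # block until the background thread has processed it
controller.event_queue.put({"type": "shutdown"})
```

The key property is that the GUI thread only enqueues; all blocking work stays on the background thread.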

1.2 Simulation Entry Points (simulation/)

  • Lifecycle Orchestrator: run_sim() in sim_base.py manages the standard setup() -> run() -> teardown() pipeline.
  • Base Class: BaseSimulation in sim_base.py defines the interface for all simulation tasks.
  • High-Level Turn Loop: WorkflowSimulator.run_discussion_turn() in workflow_sim.py implements a polling loop that monitors ai_status and message history via the ApiHookClient to orchestrate multi-turn interactions.

2. Core Source Pipelines (./src)

2.1 Context Aggregation Pipeline

```mermaid
graph TD
    A[aggregate.run] --> B[resolve_paths]
    B --> C[build_file_items]
    C --> D{summary_only?}
    D -- Yes --> E[summarize.py]
    D -- No --> F[build_markdown]
    E --> F
    F --> G[Monolithic Markdown Context]
```
  • Entry Point: aggregate.run()
  • Route:
    1. Path Resolution: resolve_paths() handles globs and absolute paths from the project configuration.
    2. Item Construction: build_file_items() reads raw content, modification times, and tier metadata.
    3. Summarization (Optional): If summary_only is enabled, items are piped through summarize.py for AST-based or heuristic compression.
    4. Markdown Synthesis: build_markdown_from_items() (or tier-specific variants) assembles the files, screenshots (build_screenshots_section), and discussion history (build_discussion_section) into the final context string.
  • Data Responsibility:
    • Owned: FileItem list, history list.
    • Mutated: None (pure synthesis pipeline).
    • Terminal Output: A monolithic Markdown string and a list of file_items (for provider-specific file uploads).
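Because this pipeline is pure synthesis, it can be sketched as a chain of functions that never mutate their inputs. `FileItem` and the `build_*` helpers below are simplified stand-ins for the real aggregate.py code, with invented bodies.

```python
from dataclasses import dataclass

@dataclass
class FileItem:
    path: str
    content: str

def resolve_paths(patterns: list[str]) -> list[str]:
    # The real code expands globs and absolute paths; here we just sort.
    return sorted(patterns)

def build_file_items(paths: list[str], sources: dict[str, str]) -> list[FileItem]:
    return [FileItem(p, sources[p]) for p in paths]

def build_markdown_from_items(items: list[FileItem]) -> str:
    # Pure synthesis: inputs are combined, never mutated.
    sections = [f"## {it.path}\n{it.content}" for it in items]
    return "\n\n".join(sections)

sources = {"src/a.py": "print('a')", "src/b.py": "print('b')"}
items = build_file_items(resolve_paths(list(sources)), sources)
context = build_markdown_from_items(items)
```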

2.2 AI Interaction & Tool-Call Loop

```mermaid
graph TD
    A[ai_client.send] --> B[Prompt Assembly]
    B --> C[Provider SDK Call]
    C --> D{Tool Call?}
    D -- Read-Only --> E[mcp_client]
    D -- Mutating --> F[GUI Approval Modal]
    D -- PowerShell --> G[shell_runner.run_powershell]
    E --> H[Tool Result]
    F -- Approved --> G
    G --> H
    H --> I[Append Result to History]
    I --> C
    D -- No --> J[Final AI Response]
```
  • Entry Point: ai_client.send()
  • Route:
    1. Provider Selection: Logic routes to _send_gemini, _send_anthropic, etc., based on configuration.
    2. Prompt Assembly: Combines the project context (from Pipeline 2.1) with conversation history and provider-specific system instructions.
    3. Execution Loop: Handles multi-turn tool calling (up to MAX_TOOL_ROUNDS).
    4. Tool Dispatch:
      • Read-Only: Calls mcp_client tools directly.
      • Mutating: Triggers pre_tool_callback (GUI modal) for user approval.
      • PowerShell: _run_script() delegates to shell_runner.run_powershell().
    5. Response Synthesis: Final AI text or tool results are returned to the caller.
  • Data Responsibility:
    • Owned: Conversation history, tool schemas, API credentials.
    • Mutated: Conversation history (appends turns), cost_tracker state.
    • Terminal Output: Final AI message, generated scripts, and updated conversation state.
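The execution loop in step 3 can be sketched as follows. `MAX_TOOL_ROUNDS` and `pre_tool_callback` are names from the document; the provider stub, tool registry shape, and loop body are assumptions for illustration.

```python
MAX_TOOL_ROUNDS = 5

def run_tool_loop(provider_call, tools, history, pre_tool_callback=None):
    for _ in range(MAX_TOOL_ROUNDS):
        reply = provider_call(history)
        if reply.get("tool") is None:
            return reply["text"]  # final AI response, no further dispatch
        tool = tools[reply["tool"]]
        # Mutating tools require user approval via the GUI modal callback.
        if tool.get("mutating") and pre_tool_callback and not pre_tool_callback(reply):
            history.append({"role": "Tool", "content": "denied"})
            continue
        result = tool["fn"](reply.get("args", {}))
        history.append({"role": "Tool", "content": result})
    raise RuntimeError("exceeded MAX_TOOL_ROUNDS")

# Fake provider: first turn requests a read-only tool, second turn answers.
calls = {"n": 0}
def provider_call(history):
    calls["n"] += 1
    if calls["n"] == 1:
        return {"tool": "read_file", "args": {"path": "x"}}
    return {"tool": None, "text": f"done after {len(history)} history entries"}

tools = {"read_file": {"mutating": False, "fn": lambda args: "contents of " + args["path"]}}
history = []
answer = run_tool_loop(provider_call, tools, history)
```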

2.3 GUI Event & State Synchronization

```mermaid
graph LR
    subgraph Foreground [gui_2.py - ImGui Loop]
    A[App._gui_func] --> B[_process_pending_gui_tasks]
    B --> C[Trigger Modals / Update Panels]
    end
    subgraph Background [app_controller.py - Event Loop]
    D[AppController._process_event_queue] --> E{Event Type}
    E -- user_request --> F[Trigger AI Loop]
    E -- response --> G[Queue gui_task]
    G --> B
    end
    UI[User Input] --> D
```
  • Entry Points: gui_2.py:App._gui_func() (Foreground), app_controller.py:AppController._process_event_queue() (Background).
  • Route:
    1. User Action: UI event (e.g., clicking "Send") places a request in AppController.event_queue.
    2. Background Dispatch: _process_event_queue() identifies the event type. user_request spawns a thread (_handle_request_event) to trigger Pipeline 2.2 (AI Loop).
    3. Task Queuing: Background services (AI, MMA, Indexing) place gui_task or mma_state_update objects into AppController._pending_gui_tasks.
    4. Foreground Sync: App._gui_func() checks for pending tasks every frame via _process_pending_gui_tasks(), updating the ImGui state and triggering modals.
  • Data Responsibility:
    • Owned: ImGui window states, panel visibility, text viewer buffers.
    • Mutated: ai_status, mma_status, pending tool call lists.
    • Terminal Output: Updated UI visuals and user-approved actions.
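The per-frame sync in step 4 is a standard "drain a queue on the UI thread" pattern. The name `_pending_gui_tasks` comes from the document; representing tasks as callables is an assumption for this sketch.

```python
import queue

class App:
    def __init__(self):
        self._pending_gui_tasks: "queue.Queue" = queue.Queue()
        self.ai_status = "idle"

    def queue_gui_task(self, fn) -> None:
        # Called from background threads; thread-safe enqueue only.
        self._pending_gui_tasks.put(fn)

    def _process_pending_gui_tasks(self) -> None:
        # Called once per frame from the ImGui loop; drains without blocking.
        while True:
            try:
                task = self._pending_gui_tasks.get_nowait()
            except queue.Empty:
                return
            task(self)  # mutate UI-side state on the GUI thread only

app = App()
app.queue_gui_task(lambda a: setattr(a, "ai_status", "thinking"))
app._process_pending_gui_tasks()  # one simulated frame
```

This keeps all ImGui state mutation on the frame thread, which is why background services never touch panel state directly.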

3. Simulation Pipelines (./simulation)

3.1 Simulation Lifecycle

```mermaid
graph TD
    A[run_sim] --> B[BaseSimulation.setup]
    B --> C[Scaffold Temp Project]
    C --> D[Simulation.run]
    D --> E[WorkflowSimulator.run_discussion_turn]
    E --> F[wait_for_ai_response]
    F --> G{Status == idle & Last == AI?}
    G -- No --> F
    G -- Yes --> H[Validation/Assertions]
    H --> I[BaseSimulation.teardown]
```
  • Entry Point: run_sim(MySimulation)
  • Route:
    1. Scaffolding: BaseSimulation.setup() initializes the ApiHookClient, clears the current session, and creates a temporary test project.
    2. Workflow Orchestration: WorkflowSimulator.setup_new_project() and create_discussion() configure the UI state for the test scenario.
    3. Interaction Loop: WorkflowSimulator.run_discussion_turn() manages the multi-turn exchange.
      • Polling: Continuously checks ai_status via HTTP hooks.
      • Stall Recovery: Automatically re-triggers the Send action if the AI stops without a final response (e.g., after a tool call).
    4. Validation: Subclasses perform assertions against the UI state (e.g., assert_panel_visible()).
    5. Cleanup: BaseSimulation.teardown() handles resource deallocation.
  • Data Responsibility:
    • Owned: Mock project paths, synthetic user messages.
    • Mutated: Global ai_status (indirectly via Hooks), target file system in the test project.
    • Terminal Output: Test pass/fail status, performance/coverage metrics.
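The lifecycle contract above, where teardown must run even when validation fails, can be sketched like this. Only the method names `setup`/`run`/`teardown` and `run_sim` come from the document; the bodies are hypothetical.

```python
class BaseSimulation:
    def setup(self): pass
    def run(self): pass
    def teardown(self): pass

def run_sim(sim_cls) -> bool:
    sim = sim_cls()
    sim.setup()
    try:
        sim.run()
        return True
    except AssertionError:
        return False
    finally:
        sim.teardown()  # resource deallocation runs even on assertion failure

calls: list[str] = []

class DemoSim(BaseSimulation):
    def setup(self): calls.append("setup")
    def run(self):
        calls.append("run")
        raise AssertionError("simulated failing validation step")
    def teardown(self): calls.append("teardown")

ok = run_sim(DemoSim)
```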

3.2 Verification & Checkpointing Protocol

  • Turn Completion Logic: WorkflowSimulator.wait_for_ai_response() implements a state machine for turn detection.
    • Transition-Based: Tracks was_busy (status in ["thinking", "streaming", "running powershell", etc.]) and triggers completion when status returns to "idle" and the last history role is "AI".
    • Error Handling: GUI-reported "error" statuses trigger an immediate abort.
  • Stall Recovery: Detects "stalled" turns where the last role is "Tool" but the system is "idle" (indicating a tool result was received but the AI didn't automatically continue). The simulator re-triggers the btn_gen_send hook to force progress.
  • State Determinism: Simulations force auto_add_history=True and reset sessions during setup() to ensure a clean slate for verification.
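A minimal sketch of the transition-based turn detector described above. The busy statuses and the idle/role rules come from the document; the polling interface (a callable returning `(status, last_role)` tuples) is an assumption.

```python
BUSY = {"thinking", "streaming", "running powershell"}

def wait_for_ai_response(poll) -> str:
    was_busy = False
    while True:
        status, last_role = poll()
        if status == "error":
            return "abort"        # GUI-reported errors abort immediately
        if status in BUSY:
            was_busy = True       # record the busy -> idle transition
            continue
        if status == "idle" and was_busy:
            if last_role == "AI":
                return "complete"
            if last_role == "Tool":
                return "stalled"  # caller re-triggers btn_gen_send

def make_poll(states):
    it = iter(states)
    return lambda: next(it)

done = wait_for_ai_response(make_poll([("thinking", "User"), ("idle", "AI")]))
stall = wait_for_ai_response(make_poll([("streaming", "User"), ("idle", "Tool")]))
```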

4. Data Responsibility & State Boundaries

Mapping which pipelines own and mutate specific data structures.

| Pipeline | Primary Data Owned | Mutated State | Terminal Output |
| --- | --- | --- | --- |
| 2.1 Context Aggregation | FileItem list, history list | None (pure synthesis) | Markdown context string |
| 2.2 AI Interaction | AI history, tool schemas | history (turns), cost_tracker | AI response, tool calls |
| 2.3 GUI & Sync | ImGui state, controller config | ai_status, pending_tasks | Visual feedback, log entries |
| 3.1 Simulation | BaseSimulation state, mock hooks | Virtual ai_status, polled history | Test pass/fail, coverage metrics |

5. Identified Redundancies & Curation Targets

List of specific areas for pruning in the next phase.

5.1 Configuration & Model Redundancies

  • Duplicate Class Definitions: models.py contains redundant definitions for TextEditorConfig and ExternalEditorConfig.
  • Provider Registry: Both gui_2.py and app_controller.py maintain their own PROVIDERS list. This should be consolidated into models.py or a dedicated config module.
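One way the consolidation could look: define the registry once in models.py (or a config module) and have gui_2.py and app_controller.py import it. The entries below are invented placeholders keyed to the `_send_*` names mentioned in section 2.2.

```python
# models.py (proposed single source of truth for the provider registry)
PROVIDERS: dict[str, dict] = {
    "gemini": {"send": "_send_gemini"},
    "anthropic": {"send": "_send_anthropic"},
}

def get_provider(name: str) -> dict:
    try:
        return PROVIDERS[name]
    except KeyError:
        raise ValueError(f"unknown provider: {name}") from None

provider = get_provider("gemini")
```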

5.2 Processing Overlap

  • Context Synthesis: aggregate.py has several tier-specific functions (build_tier1_context, build_tier2_context, etc.) that share significant boilerplate logic. These should be refactored into a single param-driven pipeline.
  • Simulation Setup: WorkflowSimulator and BaseSimulation have overlapping responsibilities for project scaffolding and session resetting.
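The tier-specific builders could collapse into one parameter-driven function along these lines. The tier-filtering rule shown is an invented placeholder for whatever logic the real `build_tier*_context` variants actually share.

```python
def build_context(items: list[dict], max_tier: int) -> str:
    # Single param-driven pipeline replacing build_tier1_context, build_tier2_context, etc.
    included = [it for it in items if it["tier"] <= max_tier]
    return "\n\n".join(f"## {it['path']}\n{it['content']}" for it in included)

items = [
    {"path": "core.py", "content": "x = 1", "tier": 1},
    {"path": "extras.py", "content": "y = 2", "tier": 2},
]
tier1 = build_context(items, max_tier=1)
tier2 = build_context(items, max_tier=2)
```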

5.3 Style & Integrity Violations

  • Inconsistent Docstrings: Some older modules lack the standardized "Architecture" and "Key Components" headers.
  • Type Hinting Gaps: shell_runner.py and some simulation utility scripts have incomplete type hints.
  • Indentation Check: Perform a sweep to ensure 100% compliance with the 1-space indentation rule.