docs(conductor): Enforce execution order dependencies in phase 2 specs

2026-02-27 19:23:38 -05:00
parent 0dedcc1773
commit ef7040c3fd
14 changed files with 364 additions and 30 deletions
--- a/conductor/archive/mma_implementation_20260224/migration_epics.md
+++ b/conductor/archive/mma_implementation_20260224/migration_epics.md
@@ -0,0 +1,128 @@
+# MMA Migration: Epics and Detailed Tasks
+
+## Track 1: The Memory Foundations (AST Parser)
+
+**Goal:** Build the engine that prevents token-bloat by turning massive source files into curated memory views.
+
+### 1. TDD Approach for `tree-sitter` Integration
+- Create `tests/test_file_cache_ast.py`.
+- Define mock Python source files containing various structures (classes, functions, docstrings, `@core_logic` decorators, `# [HOT]` comments).
+- Write failing tests that instantiate `ASTParser` and assert that `get_skeleton_view()` and `get_curated_view()` return the precisely filtered strings.
+- **Red Phase:** Ensure tests fail because `ASTParser` does not exist.
+- **Green Phase:** Implement the tree-sitter logic iteratively until strings match exactly.
+
+### 2. `ASTParser` Extraction Rules (Tasks)
+- **Task 1.1: Dependency Setup**
+  - Add `tree-sitter` and `tree-sitter-python` to `pyproject.toml` / `requirements.txt`.
+- **Task 1.2: Core Parser Class**
+  - Create `ASTParser` in `file_cache.py` that initializes the language parser.
+- **Task 1.3: Skeleton View Extraction**
+  - Write query to extract `function_definition` and `class_definition`.
+  - Keep signatures, parameters, and return type hints.
+  - Replace all bodies with `pass`.
+- **Task 1.4: Curated View Extraction**
+  - Write query to keep class structures and `expression_statement` docstrings.
+  - Implement heuristic to preserve full bodies of functions decorated with `@core_logic` or containing `# [HOT]` comments.
+  - Replace all other function bodies with `... # Hidden`.
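The Skeleton View rule from Task 1.3 (keep signatures, replace bodies with `pass`) can be sketched as follows. This is only an illustration: it uses the standard-library `ast` module as a stand-in for tree-sitter, and the function name `skeleton_view` is hypothetical, not the planned `ASTParser.get_skeleton_view()` API.

```python
import ast


def skeleton_view(source: str) -> str:
    """Return the source with every function body replaced by `pass`.

    Stand-in for the tree-sitter approach: signatures, parameters, and
    return annotations survive; bodies (including docstrings) do not.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            node.body = [ast.Pass()]
    # ast.unparse requires Python 3.9+ and drops comments and formatting.
    return ast.unparse(tree)


if __name__ == "__main__":
    demo = "class A:\n    def f(self, x: int) -> str:\n        return str(x)\n"
    print(skeleton_view(demo))
```

Because `ast.unparse` discards comments, a `# [HOT]` marker would not survive this approach; that is one reason a concrete syntax tree such as tree-sitter's suits the Curated View better.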
+
+### 3. Acceptance Testing Criteria
+- **Unit Tests:** All AST parsing tests pass with >90% coverage for `file_cache.py`.
+- **Integration Test:** Execute the parser on a large, complex project file (e.g., `ai_client.py`). The output `Skeleton View` must be less than 15% of the original token count. The `Curated View` must correctly retain docstrings and marked functions while stripping standard bodies.
+
+## Track 2: State Machine & Data Structures
+
+**Goal:** Define the rigid Python objects (Pydantic/Dataclasses) that AI agents will pass to each other, enforcing structured data over loose chat strings.
+
+### 1. TDD Approach for `models.py`
+- Create `tests/test_models.py`.
+- Write failing tests that instantiate `Track`, `Ticket`, and `WorkerContext` with various valid and invalid schemas.
+- Write tests that assert state transitions (e.g., from `pending` to `locked`, from `step_paused` to `completed`) correctly update internal flags and dependencies.
+- **Red Phase:** Tests fail because `models.py` classes are undefined or lack transition methods.
+- **Green Phase:** Implement the dataclasses and state mutators.
+
+### 2. State Machine Tasks
+- **Task 2.1: The Dataclasses**
+  - Create `models.py`. Define `Ticket` (id, target_file, prompt, worker_archetype, status, dependencies).
+  - Define `Track` (id, title, description, status, tickets).
+- **Task 2.2: Worker Context Definition**
+  - Define `WorkerContext` holding a `Ticket` ID, assigned model, configuration injection, and an ephemeral `messages` array.
+- **Task 2.3: State Mutator Methods**
+  - Implement methods like `ticket.mark_blocked(dependency_id)`, `ticket.mark_complete()`, and `track.get_executable_tickets()`. Ensure strict validation of valid state transitions.
+
+### 3. Acceptance Testing Criteria
+- **Unit Tests:** `models.py` has 100% test coverage for all state transitions.
+- **Integration Test:** Instantiate a `Track` with 3 dependent `Ticket` objects in Python. Programmatically mark tickets as complete and assert that the subsequent dependent tickets transition from `locked` to `pending` without any AI involvement.
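A minimal sketch of the dataclasses and the dependency-unlocking behaviour described in this track, under stated assumptions: the field names follow Task 2.1, but the default values, the `"active"` track status, and the `complete_ticket` helper are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    id: str
    target_file: str
    prompt: str
    worker_archetype: str = "default"  # assumed default
    status: str = "pending"
    dependencies: list[str] = field(default_factory=list)

    def mark_blocked(self, dependency_id: str) -> None:
        if self.status == "completed":
            raise ValueError("a completed ticket cannot be blocked")
        if dependency_id not in self.dependencies:
            self.dependencies.append(dependency_id)
        self.status = "locked"

    def mark_complete(self) -> None:
        if self.status != "pending":
            raise ValueError(f"cannot complete a ticket in state {self.status!r}")
        self.status = "completed"


@dataclass
class Track:
    id: str
    title: str
    description: str = ""
    status: str = "active"  # assumed status value
    tickets: list[Ticket] = field(default_factory=list)

    def get_executable_tickets(self) -> list[Ticket]:
        return [t for t in self.tickets if t.status == "pending"]

    def complete_ticket(self, ticket_id: str) -> None:
        """Hypothetical helper: complete one ticket, then unlock dependents."""
        by_id = {t.id: t for t in self.tickets}
        by_id[ticket_id].mark_complete()
        done = {t.id for t in self.tickets if t.status == "completed"}
        for t in self.tickets:
            if t.status == "locked" and set(t.dependencies) <= done:
                t.status = "pending"
```

With three chained tickets (`b` blocked on `a`, `c` blocked on `b`), only `a` is executable at first; each call to `complete_ticket` moves the next ticket from `locked` to `pending`.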
+
+## Track 3: The Linear Orchestrator & Execution Clutch
+
+**Goal:** Build the synchronous, debuggable core loop that runs a single Tier 3 Worker and pauses for human approval.
+
+### 1. TDD Approach for `multi_agent_conductor.py`
+- Create `tests/test_conductor.py`.
+- Write tests that mock the AI client response (e.g., returning a mock tool call like `write_file`).
+- Test that `run_worker_lifecycle(ticket: Ticket)` fetches the Raw View from `file_cache.py`, formats messages, and processes the mock output.
+- Test that execution pauses (waits for a simulated human signal) when the `trust_level` dictates.
+- **Red Phase:** Failure occurs because `multi_agent_conductor.py` lacks the lifecycle execution loop.
+- **Green Phase:** Implement the `ConductorEngine` core execution block.
+
+### 2. Linear Orchestration Tasks
+- **Task 3.1: The Engine Core**
+  - Create `multi_agent_conductor.py`. Implement the `ConductorEngine` class containing the `run_worker_lifecycle` synchronous execution.
+- **Task 3.2: Context Injection**
+  - Implement logic reading the Ticket target, querying `file_cache.py` for the `Raw View`, and formatting the messages array for the API.
+- **Task 3.3: The HITL Execution Clutch**
+  - Before executing tools via `mcp_client.py` or `shell_runner.py`, intercept the tool payload if the Worker's archetype dictates a `step` mode.
+  - Wait for explicit user confirmation via a CLI prompt (or event block for UI future-proofing). Allow editing of the JSON payload.
+  - Flush history upon `TicketCompleted`.
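The execution clutch in Task 3.3 could look roughly like the sketch below. The function name `execution_clutch`, the `y`/`e`/`n` prompt protocol, and the injectable `ask` callable are all assumptions for illustration; only the `step` mode and the ability to edit the JSON payload come from the task description.

```python
import json
from typing import Callable, Optional


def execution_clutch(
    payload: dict,
    mode: str,
    ask: Callable[[str], str] = input,
) -> Optional[dict]:
    """Gate a tool payload behind human approval when mode is "step".

    Returns the (possibly edited) payload to execute, or None if rejected.
    """
    if mode != "step":
        return payload  # no interception outside step mode
    print(json.dumps(payload, indent=2))
    answer = ask("Approve [y], edit [e], or reject [n]? ").strip().lower()
    if answer == "y":
        return payload
    if answer == "e":
        # The reviewer supplies a full replacement payload as JSON.
        return json.loads(ask("Replacement JSON payload: "))
    return None
```

Passing `ask` as a parameter keeps the clutch testable without a terminal: a test can supply canned answers in place of `input`, which mirrors the "simulated human signal" mentioned in the TDD steps.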
+
+### 3. Acceptance Testing Criteria
+- **Unit Tests:** Context generation, API schema mapping, and event-blocking are tested for all edge cases.
+- **Integration Test:** Manually execute a script pointing the `ConductorEngine` at a dummy file. The CLI should pause before `write_file` execution, display the diff, allow manual JSON editing via terminal input, execute the updated JSON file modification, and return `Task Complete`.
+
+## Track 4: Tier 4 QA Interception
+
+**Goal:** Stop error traces from destroying the Worker's token window by routing crashes through a cheap, stateless translator.
+
+### 1. TDD Approach for `shell_runner.py`
+- Create `tests/test_shell_runner.py`.
+- Write tests that mock a local execution failure (e.g., returning a mock 3000-line Python stack trace).
+- Test that the error is intercepted and passed to a mock Tier 4 agent.
+- Test that the output is compressed into a 20-word fix before returning.
+- **Red Phase:** Fails because no interception loop exists in `shell_runner.py`.
+- **Green Phase:** Implement the try/except logic handling `subprocess.run()` with `returncode != 0`.
+
+### 2. QA Interception Tasks
+- **Task 4.1: The Interceptor Loop**
+  - Open `shell_runner.py` and catch execution errors.
+- **Task 4.2: Tier 4 Instantiation**
+  - Construct a secondary, synchronous API call directly to the `default_cheap` model, sending the raw `stderr` and the offending code snippet.
+- **Task 4.3: Payload Formatting**
+  - Inject the 20-word fix response from the Tier 4 agent back into the main Tier 3 worker's history context as a system hint.
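The interception flow in Tasks 4.1 to 4.3 might be sketched as below. The names `run_with_interception` and `summarize` are hypothetical; `summarize` stands in for the Tier 4 call to the cheap model, and the hard 20-word cap is applied locally as a safety net rather than trusted to the model.

```python
import subprocess
import sys
from typing import Callable


def run_with_interception(
    cmd: list[str],
    summarize: Callable[[str], str],
    max_words: int = 20,
) -> str:
    """Run a command; on failure, return a short hint instead of raw stderr."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return result.stdout
    # The raw stderr goes only to the stateless summarizer, never to history.
    hint = summarize(result.stderr)
    return "QA hint: " + " ".join(hint.split()[:max_words])


if __name__ == "__main__":
    # Mock Tier 4 agent: keep only the final line of the traceback.
    def last_line(stderr: str) -> str:
        return stderr.strip().splitlines()[-1]

    failing = [sys.executable, "-c", "raise ValueError('boom')"]
    print(run_with_interception(failing, summarize=last_line))
```

Note that `subprocess.run()` does not raise on a non-zero exit unless `check=True` is passed, so this sketch checks `returncode` directly instead of relying on try/except.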
+
+### 3. Acceptance Testing Criteria
+- **Unit Tests:** Verify that massive error outputs never leak uncompressed into the main history logs.
+- **Integration Test:** Purposely introduce a syntax error in a local script. Ensure the orchestrator catches it, pings the mock/cheap API, and the history log receives the 20-word hint instead of the 200-line stack trace.
+
+## Track 5: UI Decoupling & Tier 1/2 Routing (The Final Boss)
+
+**Goal:** Bring the whole system online by letting Tier 1 and Tier 2 generate Tickets dynamically, managed via an asynchronous Event Bus.
+
+### 1. TDD Approach for `gui_2.py` Decoupling
+- Create `tests/test_gui_decoupling.py`.
+- Write tests that instantiate a mocked GUI instance listening to an `asyncio.Queue`.
+- Mock pushing `TrackStateUpdated` and `TicketStarted` events into the queue and ensure the GUI updates its view state rather than calling LLM endpoints directly.
+- **Red Phase:** Failure occurs because `gui_2.py` is tightly coupled with `ai_client.py` logic.
+- **Green Phase:** Implement the `AgentBus` messaging system linking `multi_agent_conductor.py` to `gui_2.py`.
+
+### 2. Tier 1/2 Routing Tasks
+- **Task 5.1: The Event Bus**
+  - Implement an `asyncio.Queue` in `multi_agent_conductor.py`.
+- **Task 5.2: Tier 1 & 2 System Prompts**
+  - Define system prompts that force the 3.1 Pro/3.5 Sonnet models to output strict JSON arrays defining the Tracks and Tickets (utilizing native Structured Outputs).
+- **Task 5.3: The Dispatcher**
+  - Write an async loop that reads JSON from Tier 2, converts them into `Ticket` objects, and pushes them onto the queue.
+  - Implement the Stub Resolver to enforce `contract_stubber` dependent execution flow.
+- **Task 5.4: UI Component Update**
+  - Remove direct LLM calls from `gui_2.py`. Wire user inputs into `UserRequestEvents` for the Orchestrator's queue.
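The event-bus wiring from Tasks 5.1 and 5.3 can be illustrated with a small `asyncio.Queue` sketch. Everything beyond the queue itself and the `TicketStarted` event name is an assumption: the tuple event shape, the `Shutdown` sentinel, and the `dispatch` and `gui_listener` functions are hypothetical stand-ins for the real dispatcher and GUI.

```python
import asyncio
import json


async def dispatch(raw_json: str, bus: asyncio.Queue) -> None:
    """Turn a Tier 2 JSON array into events and push them onto the bus."""
    for item in json.loads(raw_json):
        await bus.put(("TicketStarted", item["id"]))
    await bus.put(("Shutdown", None))  # hypothetical sentinel to end the demo


async def gui_listener(bus: asyncio.Queue, view_state: dict) -> None:
    """Stand-in for the GUI: it only reacts to events, never calls an LLM."""
    while True:
        event, ticket_id = await bus.get()
        if event == "Shutdown":
            break
        view_state[ticket_id] = event


async def main() -> dict:
    bus: asyncio.Queue = asyncio.Queue()
    view_state: dict = {}
    await asyncio.gather(
        dispatch('[{"id": "T1"}, {"id": "T2"}]', bus),
        gui_listener(bus, view_state),
    )
    return view_state


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The listener depends only on the queue, so a test can feed it synthetic events without any model client, which is the decoupling the Red and Green phases above aim for.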