ed/manual_slop

Fork 0

Files

Ed_ 67734c92a1 docs(mma): Draft Track 5 - UI Decoupling & Tier 1/2 Routing

2026-02-24 22:27:22 -05:00

8.8 KiB

Raw Blame History

MMA Migration: Epics and Detailed Tasks

Track 1: The Memory Foundations (AST Parser)

Goal: Build the engine that prevents token-bloat by turning massive source files into curated memory views.

1. TDD Approach for `tree-sitter` Integration

Create tests/test_file_cache_ast.py.
Define mock Python source files containing various structures (classes, functions, docstrings, @core_logic decorators, # [HOT] comments).
Write failing tests that instantiate ASTParser and assert that get_skeleton_view() and get_curated_view() return the precisely filtered strings.
Red Phase: Ensure tests fail because ASTParser does not exist.
Green Phase: Implement the tree-sitter logic iteratively until strings match exactly.

2. `ASTParser` Extraction Rules (Tasks)

Task 1.1: Dependency Setup
- Add tree-sitter and tree-sitter-python to pyproject.toml / requirements.txt.
Task 1.2: Core Parser Class
- Create ASTParser in file_cache.py that initializes the language parser.
Task 1.3: Skeleton View Extraction
- Write query to extract function_definition and class_definition.
- Keep signatures, parameters, and return type hints.
- Replace all bodies with pass.
Task 1.4: Curated View Extraction
- Write query to keep class structures and expression_statement docstrings.
- Implement heuristic to preserve full bodies of functions decorated with @core_logic or containing # [HOT] comments.
- Replace all other function bodies with ... # Hidden.

3. Acceptance Testing Criteria

Unit Tests: All AST parsing tests pass with >90% coverage for file_cache.py.
Integration Test: Execute the parser on a large, complex project file (e.g., ai_client.py). The output Skeleton View must be less than 15% of the original token count. The Curated View must correctly retain docstrings and marked functions while stripping standard bodies.

Track 2: State Machine & Data Structures

Goal: Define the rigid Python objects (Pydantic/Dataclasses) that AI agents will pass to each other, enforcing structured data over loose chat strings.

1. TDD Approach for \models.py\

Create \ ests/test_models.py.
Write failing tests that instantiate \Track, \Ticket, and \WorkerContext\ with various valid and invalid schemas.
Write tests that assert state transitions (e.g., from \pending\ to \locked, from \step_paused\ to \completed) correctly update internal flags and dependencies.
Red Phase: Tests fail because \models.py\ classes are undefined or lack transition methods.
Green Phase: Implement the dataclasses and state mutators.

2. State Machine Tasks

Task 2.1: The Dataclasses
- Create \models.py. Define \Ticket\ (id, target_file, prompt, worker_archetype, status, dependencies).
- Define \Track\ (id, title, description, status, tickets).
Task 2.2: Worker Context Definition
- Define \WorkerContext\ holding a \Ticket\ ID, assigned model, configuration injection, and an ephemeral \messages\ array.
Task 2.3: State Mutator Methods
- Implement methods like \ icket.mark_blocked(dependency_id), \ icket.mark_complete(), and \ rack.get_executable_tickets(). Ensure strict validation of valid state transitions.

3. Acceptance Testing Criteria

Unit Tests: \models.py\ has 100% test coverage for all state transitions.
Integration Test: Instantiate a \Track\ with 3 dependent \Tickets\ in Python. Programmatically mark tickets as complete and assert that the subsequent dependent tickets transition from \locked\ to \pending\ without any AI involvement.

Track 3: The Linear Orchestrator & Execution Clutch

Goal: Build the synchronous, debuggable core loop that runs a single Tier 3 Worker and pauses for human approval.

1. TDD Approach for \multi_agent_conductor.py\

Create \ ests/test_conductor.py.
Write tests that mock the AI client response (e.g., returning a mock tool call like \write_file).
Test that
un_worker_lifecycle(ticket: Ticket)\ fetches the Raw View from \ile_cache.py, formats messages, and processes the mock output.
Test that execution pauses (waits for a simulated human signal) when the \ rust_level\ dictates.
Red Phase: Failure occurs because \multi_agent_conductor.py\ lacks the lifecycle execution loop.
Green Phase: Implement the \ConductorEngine\ core execution block.

2. Linear Orchestration Tasks

Task 3.1: The Engine Core
- Create \multi_agent_conductor.py. Implement the \ConductorEngine\ class containing the
  un_worker_lifecycle\ synchronous execution.
Task 3.2: Context Injection
- Implement logic reading the Ticket target, querying \ile_cache.py\ for the \Raw View, and formatting the messages array for the API.
Task 3.3: The HITL Execution Clutch
- Before executing tools via \mcp_client.py\ or \shell_runner.py, intercept the tool payload if the Worker's archetype dictates a \step\ mode.
- Wait for explicit user confirmation via a CLI prompt (or event block for UI future-proofing). Allow editing of the JSON payload.
- Flush history upon \TicketCompleted.

3. Acceptance Testing Criteria

Unit Tests: Context generation, API schema mapping, and event-blocking are tested for all Edge cases.
Integration Test: Manually execute a script pointing the \ConductorEngine\ at a dummy file. The CLI should pause before \write_file\ execution, display the diff, allow manual JSON editing via terminal input, execute the updated JSON file modification, and return \Task Complete.

Track 4: Tier 4 QA Interception

Goal: Stop error traces from destroying the Worker's token window by routing crashes through a cheap, stateless translator.

1. TDD Approach for \shell_runner.py\

Create \ ests/test_shell_runner.py.
Write tests that mock a local execution failure (e.g., returning a mock 3000-line Python stack trace).
Test that the error is intercepted and passed to a mock Tier 4 agent.
Test that the output is compressed into a 20-word fix before returning.
Red Phase: Fails because no interception loop exists in \shell_runner.py.
Green Phase: Implement the try/except logic handling \subprocess.run()\ with
eturncode != 0.

2. QA Interception Tasks

Task 4.1: The Interceptor Loop
- Open \shell_runner.py\ and catch execution errors.
Task 4.2: Tier 4 Instantiation
- Construct a secondary, synchronous API call directly to the \default_cheap\ model, sending the raw \stderr\ and the offending code snippet.
Task 4.3: Payload Formatting
- Inject the 20-word fix response from the Tier 4 agent back into the main Tier 3 worker's history context as a system hint.

3. Acceptance Testing Criteria

Unit Tests: Verify that massive error outputs never leak uncompressed into the main history logs.
Integration Test: Purposely introduce a syntax error in a local script. Ensure the orchestrator catches it, pings the mock/cheap API, and the history log receives the 20-word hint instead of the 200-line stack trace.

Track 5: UI Decoupling & Tier 1/2 Routing (The Final Boss)

Goal: Bring the whole system online by letting Tier 1 and Tier 2 generate Tickets dynamically, managed via an asynchronous Event Bus.

1. TDD Approach for \gui_2.py\ Decoupling

Create \ ests/test_gui_decoupling.py.
Write tests that instantiate a mocked GUI instance listening to an \syncio.Queue.
Mock pushing \TrackStateUpdated\ and \TicketStarted\ events into the queue and ensure the GUI updates its view state rather than calling LLM endpoints directly.
Red Phase: Failure occurs because \gui_2.py\ is tightly coupled with \i_client.py\ logic.
Green Phase: Implement the \AgentBus\ messaging system linking \multi_agent_conductor.py\ to \gui_2.py.

2. Tier 1/2 Routing Tasks

Task 5.1: The Event Bus
- Implement an \syncio.Queue\ in \multi_agent_conductor.py.
Task 5.2: Tier 1 & 2 System Prompts
- Define system prompts that force the 3.1 Pro/3.5 Sonnet models to output strict JSON arrays defining the Tracks and Tickets (utilizing native Structured Outputs).
Task 5.3: The Dispatcher
- Write an async loop that reads JSON from Tier 2, converts them into \Ticket\ objects, and pushes them onto the queue.
- Implement the Stub Resolver to enforce \contract_stubber\ dependent execution flow.
Task 5.4: UI Component Update
- Remove direct LLM calls from \gui_2.py. Wire user inputs into \UserRequestEvents\ for the Orchestrator's queue.

3. Acceptance Testing Criteria

Integration Test: Execute the full app stack in simulation. Issue a vague prompt ("Refactor the config system"). Ensure Tier 1 outputs a Track. Tier 2 breaks it into an interface stub Ticket and an implementation Ticket. The system executes the stub, updates the AST, and finishes the implementation automatically or allows step-through in Linear mode.

8.8 KiB Raw Blame History Unescape Escape

MMA Migration: Epics and Detailed Tasks

Track 1: The Memory Foundations (AST Parser)

1. TDD Approach for tree-sitter Integration

2. ASTParser Extraction Rules (Tasks)

3. Acceptance Testing Criteria

Track 2: State Machine & Data Structures

1. TDD Approach for \models.py\

2. State Machine Tasks

3. Acceptance Testing Criteria

Track 3: The Linear Orchestrator & Execution Clutch

1. TDD Approach for \multi_agent_conductor.py\

2. Linear Orchestration Tasks

3. Acceptance Testing Criteria

Track 4: Tier 4 QA Interception

1. TDD Approach for \shell_runner.py\

2. QA Interception Tasks

3. Acceptance Testing Criteria

Track 5: UI Decoupling & Tier 1/2 Routing (The Final Boss)

1. TDD Approach for \gui_2.py\ Decoupling

2. Tier 1/2 Routing Tasks

3. Acceptance Testing Criteria

8.8 KiB

Raw Blame History

1. TDD Approach for `tree-sitter` Integration

2. `ASTParser` Extraction Rules (Tasks)