
Ed_ 0d2b6049d1 conductor: Create 3 MVP tracks with surgical specs from full codebase analysis

Three new tracks identified by analyzing product.md requirements against
actual codebase state using 1M-context Opus with all architecture docs loaded:

1. mma_pipeline_fix_20260301 (P0, blocker):
   - Diagnoses why Tier 3 worker output never reaches mma_streams in GUI
   - Identifies 4 root cause candidates: positional arg ordering, asyncio.Queue
     thread-safety violation, ai_client.reset_session() side effects, token
     stats stub returning empty dict
   - 2 phases, 6 tasks with exact line references

2. simulation_hardening_20260301 (P1, depends on pipeline fix):
   - Addresses 3 documented issues from robust_live_simulation session compression
   - Mock triggers wrong approval popup, popup state desync, approval ambiguity
   - 3 phases, 9 tasks including standalone mock test suite

3. context_token_viz_20260301 (P2):
   - Builds UI for product.md primary use case #2 'Context & Memory Management'
   - Backend already complete (get_history_bleed_stats, 140 lines)
   - Token budget bar, proportion breakdown, trimming preview, cache status
   - 3 phases, 10 tasks

Execution order: pipeline_fix -> simulation_hardening -> gui_ux (parallel w/ token_viz)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-01 09:58:34 -05:00


Implementation Plan: MMA Pipeline Fix & Worker Stream Verification

Phase 1: Diagnose & Fix Worker Stream Pipeline

Task 1.1: Add diagnostic logging to run_worker_lifecycle (multi_agent_conductor.py:280-290). Before the _queue_put call, add print(f"[MMA] Pushing Tier 3 response for {ticket.id}, loop={'present' if loop else 'NONE'}, stream_id={response_payload['stream_id']}"). Also add a print inside the except Exception as e block that currently silently swallows errors. This will reveal whether (a) the function reaches the push point, (b) loop is passed correctly, (c) any exceptions are being swallowed.
Task 1.2: Remove the unsafe else branch in run_worker_lifecycle (multi_agent_conductor.py:289-290) that calls event_queue._queue.put_nowait(). asyncio.Queue is not thread-safe, so it must not be mutated directly from any thread other than the one running the event loop. The else branch should either raise an error (raise RuntimeError("loop is required for thread-safe event queue access")) or use a fallback that is thread-safe. Apply the same fix in confirm_execution (line 156) and confirm_spawn (line 183).
Task 1.3: Verify the run_in_executor positional argument order at multi_agent_conductor.py:118-127 matches run_worker_lifecycle's signature exactly: (ticket, context, context_files, event_queue, engine, md_content, loop). The signature at line 207 is: (ticket, context, context_files=None, event_queue=None, engine=None, md_content="", loop=None). Positional args must be in this exact order. If any are swapped, fix the call site.
Task 1.4: Write a unit test that creates a mock AsyncEventQueue and asyncio.AbstractEventLoop, calls run_worker_lifecycle with a mock ai_client.send (returning a fixed string), and verifies the ("response", {...}) event was pushed with the correct stream_id format "Tier 3 (Worker): {ticket.id}".
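
The Phase 1 tasks can be illustrated with a minimal sketch. It assumes a plain asyncio.Queue in place of the project's event queue, and the names queue_put_threadsafe, run_worker, and the ticket id "T-1" are hypothetical stand-ins rather than the real run_worker_lifecycle code. It shows the diagnostic print from Task 1.1, the RuntimeError guard from Task 1.2, and one possible way to avoid the positional-order risk from Task 1.3 by binding arguments by keyword with functools.partial.

```python
import asyncio
import functools


def queue_put_threadsafe(queue, loop, event_name, payload):
    """Schedule a put on an asyncio.Queue from a non-event-loop thread."""
    if loop is None:
        # Fail loudly instead of touching the queue from the wrong thread.
        raise RuntimeError("loop is required for thread-safe event queue access")
    # call_soon_threadsafe runs put_nowait on the loop's own thread.
    loop.call_soon_threadsafe(queue.put_nowait, (event_name, payload))


def run_worker(ticket_id, event_queue=None, loop=None):
    """Hypothetical stand-in for run_worker_lifecycle; runs in a worker thread."""
    payload = {"stream_id": f"Tier 3 (Worker): {ticket_id}", "text": "done"}
    print(
        f"[MMA] Pushing Tier 3 response for {ticket_id}, "
        f"loop={'present' if loop else 'NONE'}, stream_id={payload['stream_id']}"
    )
    queue_put_threadsafe(event_queue, loop, "response", payload)


async def main():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    # Keyword binding means a reordered signature cannot silently swap arguments.
    call = functools.partial(run_worker, "T-1", event_queue=queue, loop=loop)
    await loop.run_in_executor(None, call)
    # Time out rather than hang if the event never arrives.
    return await asyncio.wait_for(queue.get(), timeout=1.0)


result = asyncio.run(main())
```

A unit test along the lines of Task 1.4 could then assert on the event name and the stream_id format of the value returned by main().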

Phase 2: Fix Token Usage Tracking

Task 2.1: In run_worker_lifecycle (multi_agent_conductor.py:295-298), the stats = {} stub produces zero token counts. Replace with stats = ai_client.get_history_bleed_stats() which returns a dict containing "total_input_tokens" and "total_output_tokens" (see ai_client.py:1657-1796). Extract the relevant fields and update engine.tier_usage["Tier 3"]. If get_history_bleed_stats is too heavy, use the simpler approach: after ai_client.send(), read the last comms log entry from ai_client.get_comms_log()[-1] which contains payload.usage with token counts.
Task 2.2: Similarly fix Tier 1 and Tier 2 token tracking. In _cb_plan_epic (gui_2.py:1985-2010) and wherever Tier 2 calls happen, ensure mma_tier_usage is updated with actual token counts from comms log entries.
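
The comms-log approach from Tasks 2.1 and 2.2 might look like the sketch below. The entry layout is an assumption: the plan only says an entry contains payload.usage with token counts, so the key names "input_tokens" and "output_tokens", the "input"/"output" bucket keys, and the helper names usage_from_entry and add_tier_usage are all hypothetical and would need to be matched against the real ai_client.get_comms_log() output.

```python
def usage_from_entry(entry):
    """Return (input_tokens, output_tokens) from one comms log entry.

    Assumes the entry is a dict shaped like {"payload": {"usage": {...}}};
    missing fields count as zero so a malformed entry cannot raise.
    """
    usage = (entry.get("payload") or {}).get("usage") or {}
    return int(usage.get("input_tokens", 0)), int(usage.get("output_tokens", 0))


def add_tier_usage(tier_usage, tier, entry):
    """Accumulate one entry's token counts into tier_usage[tier]."""
    input_tokens, output_tokens = usage_from_entry(entry)
    bucket = tier_usage.setdefault(tier, {"input": 0, "output": 0})
    bucket["input"] += input_tokens
    bucket["output"] += output_tokens
    return bucket
```

Accumulating rather than overwriting keeps the totals correct when a tier makes several calls in one run. Under these assumptions, the call after ai_client.send() would be something like add_tier_usage(engine.tier_usage, "Tier 3", ai_client.get_comms_log()[-1]).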

Phase 3: End-to-End Verification

Task 3.1: Update tests/visual_sim_mma_v2.py Stage 8 to assert that mma_streams contains a key matching "Tier 3" with non-empty content after a full mock MMA run. If this already passes with the fixes from Phase 1, mark as verified. If not, trace the specific failure point using the diagnostic logging from Task 1.1.
Task 3.2: Conductor - User Manual Verification 'Phase 3: End-to-End Verification' (Protocol in workflow.md)
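
The Stage 8 check from Task 3.1 could be expressed with a small helper like the one below. It assumes mma_streams is a dict mapping stream ids to accumulated text, which the plan implies but does not state, and the name tier3_streams is hypothetical.

```python
def tier3_streams(mma_streams):
    """Return only the Tier 3 streams that hold non-whitespace content."""
    return {
        stream_id: content
        for stream_id, content in mma_streams.items()
        if "Tier 3" in stream_id and str(content).strip()
    }
```

The simulation would then assert that tier3_streams(mma_streams) is non-empty after the mock MMA run. Filtering out whitespace-only content keeps a stream that was created but never written to from passing the check.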
