Addresses four gaps where Claude Code and Gemini CLI outperform Manual Slop's
MMA during actual execution:
1. Live worker streaming: Wire comms_log_callback to per-ticket streams so
users see real-time output instead of waiting for worker completion.
2. Per-tier model config: Replace hardcoded get_model_for_role with GUI
dropdowns persisted to project TOML.
3. Parallel DAG execution: asyncio.gather for independent tickets (exploratory
— _send_lock may block, needs investigation).
4. Auto-retry with escalation: flash-lite -> flash -> pro on BLOCKED, up to
2 retries (wires existing --failure-count mechanism into ConductorEngine).
Adds 7 new tasks in Phase 6, bringing the total to 30 tasks across 6 phases.
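The escalation in item 4 could be sketched roughly as below. This is only an
illustration, not the ConductorEngine code: MODEL_LADDER, model_for_attempt,
run_with_escalation and the run_ticket callback are hypothetical names, and the
short model labels stand in for whatever model identifiers the project uses.

```python
from typing import Callable, Tuple

# Hypothetical ladder; the labels mirror the flash-lite -> flash -> pro order.
MODEL_LADDER = ["flash-lite", "flash", "pro"]
MAX_RETRIES = 2  # up to 2 retries, i.e. at most 3 attempts in total


def model_for_attempt(failure_count: int) -> str:
    """Pick a model from the number of failures so far, capped at the top tier."""
    index = min(failure_count, len(MODEL_LADDER) - 1)
    return MODEL_LADDER[index]


def run_with_escalation(
    run_ticket: Callable[[str, str], str], ticket_id: str
) -> Tuple[str, str]:
    """Re-run a ticket on a stronger model each time it comes back BLOCKED."""
    for failure_count in range(MAX_RETRIES + 1):
        model = model_for_attempt(failure_count)
        status = run_ticket(ticket_id, model)
        if status != "BLOCKED":
            return status, model
    # Every attempt was BLOCKED; report the last model that was tried.
    return "BLOCKED", model


# Stand-in worker (assumption): only the strongest model gets unblocked.
attempted = []


def fake_run_ticket(ticket_id: str, model: str) -> str:
    attempted.append(model)
    return "DONE" if model == "pro" else "BLOCKED"


result = run_with_escalation(fake_run_ticket, "T-1")
```

With the stand-in worker, the ticket is attempted once per rung of the ladder
and finishes on the third attempt.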
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mma_exec.py changes:
- get_role_documents: Tier 1 now gets docs/guide_architecture.md + guide_mma.md
(was: only product.md). Tier 2 gets same (was: only tech-stack + workflow).
Tier 3 gets guide_architecture.md (was: only workflow.md — workers modifying
gui_2.py had zero knowledge of threading model). Tier 4 gets guide_architecture.md
(was: nothing).
- Tier 3 system directive: Added ARCHITECTURE REFERENCE callout, CRITICAL
THREADING RULE (never write GUI state from background thread), TASK FORMAT
instruction (follow WHERE/WHAT/HOW/SAFETY from surgical tasks), and
py_get_definition to tool list.
- Tier 4 system directive: Added ARCHITECTURE REFERENCE callout and instruction
to trace errors through thread domains documented in guide_architecture.md.
conductor/workflow.md changes:
- Red Phase delegation prompt: Replaced 'with a prompt to create tests' with
surgical prompt format example showing WHERE/WHAT/HOW/SAFETY.
- Green Phase delegation prompt: Replaced 'with a highly specific prompt' with
surgical prompt format example with exact line refs and API calls.
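For illustration, a WHERE/WHAT/HOW/SAFETY delegation prompt might be assembled
as in the sketch below. The SurgicalTask class and its render method are
hypothetical, and the field values are placeholders rather than text from
workflow.md; only the four section names and the threading rule come from the
changes described above.

```python
from dataclasses import dataclass


@dataclass
class SurgicalTask:
    """Hypothetical container for the four parts of a surgical prompt."""

    where: str   # file and location to change
    what: str    # the change to make
    how: str     # API calls or approach to use
    safety: str  # constraints the worker must not violate

    def render(self) -> str:
        """Render the task as a four-line prompt, one labeled line per part."""
        return "\n".join(
            [
                f"WHERE: {self.where}",
                f"WHAT: {self.what}",
                f"HOW: {self.how}",
                f"SAFETY: {self.safety}",
            ]
        )


# Placeholder values, for illustration only.
task = SurgicalTask(
    where="gui_2.py, <function name and line range>",
    what="<one concrete change>",
    how="<exact API calls to use>",
    safety="Never write GUI state from a background thread.",
)
prompt = task.render()
```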
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrites comprehensive_gui_ux_20260228 spec and plan using deep analysis of
the actual gui_2.py implementation (3078 lines). The previous spec asked to
implement features that already exist (Track Browser, DAG tree, epic planning,
approval dialogs, token table, performance monitor). The new spec:
- Documents 15 already-implemented features with exact line references
- Identifies 8 actual gaps (tier stream panels, DAG editing, cost tracking,
conductor lifecycle forms, track-scoped discussions, approval indicators,
track proposal editing, stream scrollability)
- Rewrites all 5 phases with surgical task descriptions referencing exact
gui_2.py line ranges, function names, and data structures
- Each task specifies the precise imgui API calls to use
- References docs/guide_architecture.md for threading constraints
- References docs/guide_mma.md for Ticket/Track data structures
Also adds architecture documentation fallback references to:
- conductor/workflow.md (new principle #9)
- conductor/product.md (new Architecture Reference section)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three independent root causes fixed:
- gui_2.py: Route mma_spawn_approval/mma_step_approval events in _process_event_queue
- multi_agent_conductor.py: Pass asyncio loop from ConductorEngine.run() through to
thread-pool workers for thread-safe event queue access; add _queue_put helper
- ai_client.py: Preserve GeminiCliAdapter in reset_session() instead of nulling it
Test: visual_sim_mma_v2::test_mma_complete_lifecycle passes in ~8s
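One common way to get thread-safe access to an asyncio queue from a worker
thread is to hand the worker the loop and schedule the put on it. The sketch
below shows that pattern only; it is not the actual _queue_put helper, and
queue_put, demo and the event payload are assumptions made for the example.

```python
import asyncio
import threading
from typing import Any


def queue_put(loop: asyncio.AbstractEventLoop, queue: asyncio.Queue, item: Any) -> None:
    """Schedule a put on the loop's own thread; safe to call from any thread."""
    # asyncio.Queue is not thread-safe, so the put is handed to the loop
    # instead of being called directly from the worker thread.
    loop.call_soon_threadsafe(queue.put_nowait, item)


async def demo() -> Any:
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    # Hypothetical event payload pushed from a background worker thread.
    event = ("mma_step_approval", {"ticket": "T-1"})
    worker = threading.Thread(target=queue_put, args=(loop, queue, event))
    worker.start()

    # The consumer stays on the loop thread and simply awaits the queue.
    received = await asyncio.wait_for(queue.get(), timeout=5)
    worker.join()
    return received


received_event = asyncio.run(demo())
```

Passing the loop explicitly matters because a thread-pool worker has no running
loop of its own to look up.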
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>