Implementation Plan: Comprehensive Conductor & MMA GUI UX

Architecture reference: docs/guide_architecture.md, docs/guide_mma.md

Phase 1: Tier Stream Panels & Approval Indicators

Focus: Make all 4 tier output streams visible and indicate pending approvals.

Task 1.1: Replace the single Tier 1 strategy text box in _render_mma_dashboard (gui_2.py:2700-2701) with four collapsible sections — one per tier. Each section uses imgui.collapsing_header(f"Tier {N}: {label}") wrapping a begin_child scrollable region (200px height). Tier 1 = "Strategy", Tier 2 = "Tech Lead", Tier 3 = "Workers", Tier 4 = "QA". Tier 3 should aggregate all mma_streams keys containing "Tier 3" with ticket ID sub-headers. Each section auto-scrolls to bottom when new content arrives (track previous scroll position, scroll only if user was at bottom).
Task 1.2: Add approval state indicators to the MMA dashboard. After the "Status:" line in _render_mma_dashboard (gui_2.py:2672-2676), check self._pending_mma_spawn, self._pending_mma_approval, and self._pending_ask_dialog. When any is active, render a colored blinking badge: imgui.text_colored(ImVec4(1, 0.3, 0.3, alpha), "APPROVAL PENDING"), where alpha pulses with sin(time.time()*5). Follow it with imgui.same_line() and a "Go to Approval" button that scrolls to or focuses the relevant dialog.
Task 1.3: Write unit tests verifying: (a) mma_streams with keys "Tier 1", "Tier 2 (Tech Lead)", "Tier 3: T-001", "Tier 4 (QA)" are all rendered (check by mocking imgui.collapsing_header calls); (b) approval indicators appear when _pending_mma_spawn is not None.
Task 1.4: Conductor - User Manual Verification 'Phase 1: Tier Stream Panels & Approval Indicators' (Protocol in workflow.md)
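
The pure-logic parts of Tasks 1.1 and 1.2 (grouping streams by tier, deciding when to auto-scroll, and computing the pulse alpha) can be kept free of imgui calls so they are easy to unit test. The following is a minimal sketch, not the project's actual code: the helper names group_streams_by_tier, should_autoscroll, and pulse_alpha are hypothetical, as are the 0.3 alpha floor and the 1px scroll tolerance.

```python
import math

# Tier labels as listed in Task 1.1.
TIER_LABELS = {1: "Strategy", 2: "Tech Lead", 3: "Workers", 4: "QA"}


def group_streams_by_tier(mma_streams):
    """Group stream entries by tier number.

    Returns {tier: [(sub_header, text), ...]}. The sub-header is whatever
    follows ": " in the key (e.g. the ticket ID in "Tier 3: T-001"),
    or an empty string when the key has no such suffix.
    """
    grouped = {tier: [] for tier in TIER_LABELS}
    for key, text in mma_streams.items():
        for tier in TIER_LABELS:
            if f"Tier {tier}" in key:
                _, sep, suffix = key.partition(": ")
                grouped[tier].append((suffix if sep else "", text))
                break
    return grouped


def should_autoscroll(scroll_y, scroll_max_y, tolerance=1.0):
    """True when the user was already at (or within tolerance of) the bottom."""
    return scroll_y >= scroll_max_y - tolerance


def pulse_alpha(now, speed=5.0, floor=0.3):
    """Map sin(now * speed) from [-1, 1] into [floor, 1.0] for a blink effect."""
    unit = (math.sin(now * speed) + 1.0) / 2.0
    return floor + (1.0 - floor) * unit
```

In the render function, the grouped result would drive one collapsing header per tier, and pulse_alpha(time.time()) would supply the fourth component of the badge color.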

Phase 2: Cost Tracking & Enhanced Token Table

Focus: Add cost estimation to the existing token usage display.

Task 2.1: Create a new module cost_tracker.py with a MODEL_PRICING dict mapping model name patterns to {"input_per_mtok": float, "output_per_mtok": float}. Include entries for: gemini-2.5-flash-lite ($0.075/$0.30), gemini-2.5-flash ($0.15/$0.60), gemini-3-flash-preview ($0.15/$0.60), gemini-3.1-pro-preview ($3.50/$10.50), claude-*-sonnet ($3/$15), claude-*-opus ($15/$75), deepseek-v3 ($0.27/$1.10). Function: estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float that does pattern matching on model name and returns dollar cost.
Task 2.2: Extend the token usage table in _render_mma_dashboard (gui_2.py:2685-2699) from 3 columns to 5 by adding "Est. Cost" and "Model". Populate the cost column using cost_tracker.estimate_cost() with the model name from self.mma_tier_usage. This requires either extending the tier_usage dict in ConductorEngine._push_state to include the model name per tier, or falling back to a default mapping: Tier 1 → gemini-3.1-pro-preview, Tier 2 → gemini-3-flash-preview, Tier 3 → gemini-2.5-flash-lite, Tier 4 → gemini-2.5-flash-lite. Show a total cost row at the bottom.
Task 2.3: Write tests for cost_tracker.estimate_cost() covering all model patterns and edge cases (unknown model returns 0).
Task 2.4: Conductor - User Manual Verification 'Phase 2: Cost Tracking & Enhanced Token Table' (Protocol in workflow.md)
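
One possible shape for cost_tracker.py from Task 2.1 is sketched below. The prices are the ones listed in the task. The matching strategy is an assumption: patterns are tried in insertion order using fnmatchcase with a trailing wildcard, so gemini-2.5-flash-lite has to be listed before gemini-2.5-flash, and versioned names (a date suffix, for example) still match.

```python
from fnmatch import fnmatchcase

# Prices in USD per million tokens, as listed in Task 2.1.
# Order matters: more specific patterns must precede their prefixes.
MODEL_PRICING = {
    "gemini-2.5-flash-lite": {"input_per_mtok": 0.075, "output_per_mtok": 0.30},
    "gemini-2.5-flash": {"input_per_mtok": 0.15, "output_per_mtok": 0.60},
    "gemini-3-flash-preview": {"input_per_mtok": 0.15, "output_per_mtok": 0.60},
    "gemini-3.1-pro-preview": {"input_per_mtok": 3.50, "output_per_mtok": 10.50},
    "claude-*-sonnet": {"input_per_mtok": 3.0, "output_per_mtok": 15.0},
    "claude-*-opus": {"input_per_mtok": 15.0, "output_per_mtok": 75.0},
    "deepseek-v3": {"input_per_mtok": 0.27, "output_per_mtok": 1.10},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated dollar cost, or 0.0 for an unknown model."""
    name = model.lower()
    for pattern, price in MODEL_PRICING.items():
        # Assumption: a trailing "*" lets versioned model names match.
        if fnmatchcase(name, pattern + "*"):
            return (
                input_tokens * price["input_per_mtok"]
                + output_tokens * price["output_per_mtok"]
            ) / 1_000_000
    return 0.0
```

Summing estimate_cost over the per-tier usage rows would give the total cost row described in Task 2.2.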

Phase 3: Track Proposal Editing & Conductor Lifecycle Forms

Focus: Make track proposals editable and add conductor setup/newTrack GUI forms.

Task 3.1: Enhance _render_track_proposal_modal (gui_2.py:2146-2173) to make track titles and goals editable. Replace imgui.text_colored for title with imgui.input_text(f"##track_title_{idx}", track['title']). Replace imgui.text_wrapped for goal with imgui.input_text_multiline(f"##track_goal_{idx}", track['goal'], ImVec2(-1, 60)). Add a "Remove" button per track (imgui.button(f"Remove##{idx}")) that pops from self.proposed_tracks. Edited values must be written back to self.proposed_tracks[idx].
Task 3.2: Add a "Conductor Setup" collapsible section at the top of the MMA dashboard (before the Track Browser). Contains a "Run Setup" button. On click, reads conductor/workflow.md, conductor/tech-stack.md, conductor/product.md using Path.read_text(), computes a readiness summary (files found, line counts, track count via project_manager.get_all_tracks()), and displays it in a read-only text region. This is informational only — no backend changes.
Task 3.3: Add a "New Track" form below the Track Browser. Fields: track name (input_text), description (input_text_multiline), type dropdown (feature/chore/fix via imgui.combo). "Create" button calls a new helper _cb_create_track(name, desc, type) that: creates conductor/tracks/{name}_{date}/ directory, writes a minimal spec.md from the description, writes an empty plan.md template, writes metadata.json with the track ID/type/status="new", then refreshes self.tracks via project_manager.get_all_tracks().
Task 3.4: Write tests for track creation helper: verify directory structure, file contents, and metadata.json format. Test proposal modal editing by verifying proposed_tracks list is mutated correctly.
Task 3.5: Conductor - User Manual Verification 'Phase 3: Track Proposal Editing & Conductor Lifecycle Forms' (Protocol in workflow.md)
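
The filesystem side of _cb_create_track (Task 3.3) could look roughly like the sketch below. It is written as a standalone function so it can be tested against a temporary directory. The function name create_track, the YYYYMMDD date format, the file templates, and the exact metadata.json keys are all assumptions; the plan only specifies that the track ID, type, and status="new" are recorded.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def create_track(base_dir: Path, name: str, desc: str, track_type: str,
                 today: Optional[date] = None) -> Path:
    """Create conductor/tracks/{name}_{date}/ with spec.md, plan.md, metadata.json."""
    today = today or date.today()
    track_id = f"{name}_{today:%Y%m%d}"  # date format is an assumption
    track_dir = Path(base_dir) / "conductor" / "tracks" / track_id
    # Fail loudly instead of overwriting an existing track.
    track_dir.mkdir(parents=True, exist_ok=False)

    # Minimal spec built from the form's description field.
    (track_dir / "spec.md").write_text(f"# {name}\n\n{desc}\n", encoding="utf-8")
    # Empty plan template (heading only).
    (track_dir / "plan.md").write_text(f"# Implementation Plan: {name}\n", encoding="utf-8")

    metadata = {"id": track_id, "type": track_type, "status": "new"}
    (track_dir / "metadata.json").write_text(
        json.dumps(metadata, indent=2), encoding="utf-8"
    )
    return track_dir
```

The GUI callback would then call this helper and refresh self.tracks via project_manager.get_all_tracks().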

Phase 4: DAG Editing & Track-Scoped Discussion

Focus: Allow GUI-based ticket manipulation and track-specific discussion history.

Task 4.1: Add an "Add Ticket" button below the Task DAG section in _render_mma_dashboard. On click, show an inline form: ticket ID (input_text, default auto-increment like "T-NNN"), description (input_text_multiline), target_file (input_text), depends_on (multi-select or comma-separated input of existing ticket IDs). "Create" button appends a new Ticket dict to self.active_tickets with status="todo" and triggers _push_mma_state_update() to synchronize the ConductorEngine. Cancel hides the form. Store the form visibility in self._show_add_ticket_form: bool.
Task 4.2: Add a "Delete" button to each DAG node in _render_ticket_dag_node (gui_2.py:2770-2773, after the Skip button). On click, show a confirmation popup. On confirm, remove the ticket from self.active_tickets, remove it from all other tickets' depends_on lists, and push state update. Only allow deletion of todo or blocked tickets (not in_progress or completed).
Task 4.3: Add track-scoped discussion support. In _render_discussion_panel (gui_2.py:2295-2483), add a toggle checkbox "Track Discussion" (visible only when self.active_track is set). When toggled ON: load history via project_manager.load_track_history(self.active_track.id, base_dir) into self.disc_entries, set a flag self._track_discussion_active = True. When toggled OFF or track changes: restore project discussion. On save/flush, if _track_discussion_active, write to track history file instead of project history.
Task 4.4: Write tests for: (a) adding a ticket updates active_tickets and has correct default fields; (b) deleting a ticket removes it from all depends_on references; (c) track discussion toggle switches disc_entries source.
Task 4.5: Conductor - User Manual Verification 'Phase 4: DAG Editing & Track-Scoped Discussion' (Protocol in workflow.md)
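
The ticket bookkeeping behind Tasks 4.1 and 4.2 can be sketched independently of the GUI. The sketch assumes tickets are plain dicts with "id", "status", and "depends_on" keys; the helper names next_ticket_id and delete_ticket are hypothetical.

```python
# Only tickets in these states may be deleted (Task 4.2).
DELETABLE_STATUSES = {"todo", "blocked"}


def next_ticket_id(tickets):
    """Suggest the next auto-incremented "T-NNN" ID for the Add Ticket form."""
    numbers = [
        int(t["id"][2:])
        for t in tickets
        if t["id"].startswith("T-") and t["id"][2:].isdigit()
    ]
    return f"T-{max(numbers, default=0) + 1:03d}"


def delete_ticket(tickets, ticket_id):
    """Remove a ticket in place and scrub it from every depends_on list.

    Returns True on success, False if the ticket is missing or is not
    in a deletable state (e.g. in_progress or completed).
    """
    target = next((t for t in tickets if t["id"] == ticket_id), None)
    if target is None or target["status"] not in DELETABLE_STATUSES:
        return False
    tickets.remove(target)
    for ticket in tickets:
        ticket["depends_on"] = [
            dep for dep in ticket.get("depends_on", []) if dep != ticket_id
        ]
    return True
```

After a successful delete_ticket call, the GUI would push the state update so the ConductorEngine sees the new DAG.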

Phase 5: Visual Polish & Integration Testing

Focus: Dense, responsive dashboard with arcade aesthetics and end-to-end verification.

Task 5.1: Add color-coded styling to the Track Browser table. Status column uses colored text: "new" = gray, "active" = yellow, "done" = green, "blocked" = red. Progress bar uses imgui.push_style_color to tint: <33% red, 33-66% yellow, >66% green.
Task 5.2: Improve the DAG tree nodes with status-colored left borders. Use imgui.get_cursor_screen_pos() and imgui.get_window_draw_list().add_rect_filled() to draw a 4px colored strip to the left of each tree node matching its status color.
Task 5.3: Add a "Dashboard Summary" header line at the top of _render_mma_dashboard showing: Track: {name} | Tickets: {done}/{total} | Cost: ${total_cost:.4f} | Status: {mma_status} in a single dense line with colored segments.
Task 5.4: Write an end-to-end integration test (extending tests/visual_sim_mma_v2.py or creating tests/visual_sim_gui_ux.py) that verifies via ApiHookClient: (a) track creation form produces correct directory structure; (b) tier streams are populated during MMA execution; (c) approval indicators appear when expected; (d) cost tracking shows non-zero values after execution.
Task 5.5: Verify all new UI elements maintain >30 FPS via get_ui_performance during a full MMA simulation run.
Task 5.6: Conductor - User Manual Verification 'Phase 5: Visual Polish & Integration Testing' (Protocol in workflow.md)
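
The color rules in Task 5.1 and the summary line in Task 5.3 reduce to two small lookups and a format string. The sketch below keeps them as plain functions returning RGBA tuples; the specific RGBA values and the function names are assumptions, and the boundaries at exactly 33% and 66% are treated as yellow.

```python
# RGBA tuples in the 0..1 range; exact shades are placeholders.
GRAY = (0.6, 0.6, 0.6, 1.0)
YELLOW = (1.0, 0.9, 0.2, 1.0)
GREEN = (0.3, 0.9, 0.3, 1.0)
RED = (1.0, 0.3, 0.3, 1.0)

# Track status to text color (Task 5.1).
STATUS_COLORS = {"new": GRAY, "active": YELLOW, "done": GREEN, "blocked": RED}


def progress_color(fraction):
    """Tint for the progress bar: <33% red, 33-66% yellow, >66% green."""
    if fraction < 0.33:
        return RED
    if fraction <= 0.66:
        return YELLOW
    return GREEN


def dashboard_summary(track_name, done, total, total_cost, mma_status):
    """Build the dense one-line header described in Task 5.3."""
    return (
        f"Track: {track_name} | Tickets: {done}/{total} | "
        f"Cost: ${total_cost:.4f} | Status: {mma_status}"
    )
```

In the renderer, each segment of the summary could be emitted with its own text color rather than as a single string.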

Phase 6: Live Worker Streaming & Engine Enhancements

Focus: Make MMA execution observable in real-time and configurable from the GUI. Currently workers are black boxes until completion.

Task 6.1: Wire ai_client.comms_log_callback to per-ticket streams during run_worker_lifecycle (multi_agent_conductor.py:207-300). Before calling ai_client.send(), set ai_client.comms_log_callback to a closure that pushes intermediate text chunks to the GUI via _queue_put(event_queue, loop, "response", {"text": chunk, "stream_id": f"Tier 3 (Worker): {ticket.id}", "status": "streaming..."}). After send() returns, restore the original callback. This gives real-time output streaming to the Tier 3 stream panels from Phase 1.
Task 6.2: Add per-tier model configuration to the MMA dashboard. Below the token usage table in _render_mma_dashboard, add a collapsible "Tier Model Config" section with 4 rows (Tier 1-4). Each row: tier label + imgui.combo dropdown populated from ai_client.list_models() (cached). Store selections in self.mma_tier_models: dict[str, str] with defaults from mma_exec.get_model_for_role(). On change, write to self.project["mma"]["tier_models"] for persistence.
Task 6.3: Wire per-tier model config into the execution pipeline. In ConductorEngine.run (multi_agent_conductor.py:105-135), when creating WorkerContext, read the model name from the GUI's mma_tier_models dict (passed via the event queue or stored on the engine). Pass it through to run_worker_lifecycle which should use it in ai_client.set_provider/ai_client.set_model_params before calling send(). Also update mma_exec.py:get_model_for_role to accept an override parameter.
Task 6.4: Add parallel DAG execution. In ConductorEngine.run (multi_agent_conductor.py:100-135), replace the sequential for ticket in ready_tasks loop with an awaited asyncio.gather(*[loop.run_in_executor(None, run_worker_lifecycle, ...) for ticket in ready_tasks]). Each worker already calls its own ai_client.reset_session(), so sessions are isolated. Check how ai_client._send_lock behaves first: if the lock serializes all sends, parallel execution won't help; in that case, create per-worker provider instances or use separate session IDs. Treat this task as exploratory: if _send_lock blocks parallelism, document the constraint and defer.
Task 6.5: Add automatic retry with model escalation. In ConductorEngine.run, after run_worker_lifecycle returns, check if ticket.status == "blocked". If so, and retry_count < max_retries (default 2), increment retry count, escalate the model (e.g., flash-lite → flash → pro), and re-execute. Store retry_count as a field on the ticket dict. After max retries, leave as blocked.
Task 6.6: Write tests for: (a) streaming callback pushes intermediate content to event queue; (b) per-tier model config persists to project TOML; (c) retry escalation increments model tier.
Task 6.7: Conductor - User Manual Verification 'Phase 6: Live Worker Streaming & Engine Enhancements' (Protocol in workflow.md)
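
Two pieces of Phase 6 lend themselves to small, testable helpers: restoring the original callback after send() (Task 6.1) and the retry loop with model escalation (Task 6.5). The sketch below is illustrative only. The names temporary_callback, escalate_model, and run_with_retries are hypothetical, and the ladder maps "flash-lite → flash → pro" onto model names taken from the pricing list in Task 2.1, which is an assumption. A context manager is used so the callback is restored even if send() raises.

```python
from contextlib import contextmanager

# Assumed concrete models for the "flash-lite -> flash -> pro" ladder.
ESCALATION_LADDER = [
    "gemini-2.5-flash-lite",
    "gemini-2.5-flash",
    "gemini-3.1-pro-preview",
]


@contextmanager
def temporary_callback(owner, attr, callback):
    """Swap owner.<attr> for callback, restoring the original on exit."""
    original = getattr(owner, attr, None)
    setattr(owner, attr, callback)
    try:
        yield
    finally:
        setattr(owner, attr, original)


def escalate_model(model):
    """Return the next model up the ladder; unknown or top models are unchanged."""
    if model not in ESCALATION_LADDER:
        return model
    index = ESCALATION_LADDER.index(model)
    return ESCALATION_LADDER[min(index + 1, len(ESCALATION_LADDER) - 1)]


def run_with_retries(ticket, run_worker, model, max_retries=2):
    """Run a worker, retrying blocked tickets with an escalated model.

    run_worker(ticket, model) is expected to update ticket["status"].
    Returns the model used for the final attempt.
    """
    ticket.setdefault("retry_count", 0)
    run_worker(ticket, model)
    while ticket["status"] == "blocked" and ticket["retry_count"] < max_retries:
        ticket["retry_count"] += 1
        model = escalate_model(model)
        run_worker(ticket, model)
    return model
```

Inside run_worker_lifecycle, the send() call would sit in a with temporary_callback(ai_client, "comms_log_callback", push_chunk): block, where push_chunk is the closure that forwards chunks to the event queue.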
