Ed_ 52f3820199 conductor(gui_ux): Add Phase 6 — live streaming, per-tier model config, parallel DAG, auto-retry

Addresses four gaps where Claude Code and Gemini CLI outperform Manual Slop's
MMA during actual execution:

1. Live worker streaming: Wire comms_log_callback to per-ticket streams so
   users see real-time output instead of waiting for worker completion.
2. Per-tier model config: Replace hardcoded get_model_for_role with GUI
   dropdowns persisted to project TOML.
3. Parallel DAG execution: asyncio.gather for independent tickets (exploratory
   — _send_lock may block, needs investigation).
4. Auto-retry with escalation: flash-lite -> flash -> pro on BLOCKED, up to
   2 retries (wires existing --failure-count mechanism into ConductorEngine).

7 new tasks across Phase 6, bringing total to 30 tasks across 6 phases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-01 10:24:29 -05:00

8.5 KiB

Track Specification: Comprehensive Conductor & MMA GUI UX

Overview

This track enhances the existing MMA orchestration GUI from its current functional-but-minimal state to a production-quality control surface. The existing implementation already has a working Track Browser, DAG tree visualizer, epic planning flow, approval dialogs, and token usage table. This track focuses on the gaps: dedicated tier stream panels, DAG editing, track-scoped discussions, conductor lifecycle GUI forms, cost tracking, and visual polish.

Current State Audit (as of `08e003a`)

Already Implemented (DO NOT re-implement)

Track Browser table (_render_mma_dashboard, lines 2633-2660): Title, status, progress bar, Load button per track.
Epic Planning (_render_projects_panel, lines 1968-1983 + _cb_plan_epic): Input field + "Plan Epic (Tier 1)" button, background thread orchestration.
Track Proposal Modal (_render_track_proposal_modal, lines 2146-2173): Shows proposed tracks, Start/Accept/Cancel.
Step Mode toggle: Checkbox for "Step Mode (HITL)" with self.mma_step_mode.
Active Track Info: Description + ticket progress bar.
Token Usage Table: Per-tier input/output display in a 3-column ImGui table.
Tier 1 Strategy Stream: mma_streams.get("Tier 1") rendered as read-only multiline (150px).
Task DAG Tree (_render_ticket_dag_node, lines 2726-2785): Recursive tree with color-coded status (gray/yellow/green/red/orange), tooltips showing ID/target/description/dependencies/worker-stream, Retry/Skip buttons.
Spawn Interceptor (MMASpawnApprovalDialog): Editable prompt, context_md, abort capability.
MMA Step Approval (MMAApprovalDialog): Editable payload, approve/reject.
Script Confirmation (ConfirmDialog): Editable script, approve/reject.
Comms History Panel (_render_comms_history_panel, lines 2859-2984).
Tool Calls Panel (_render_tool_calls_panel, lines 2787-2857).
Performance Monitor: FPS, Frame Time, CPU, Input Lag via perf_monitor.

Gaps to Fill (This Track's Scope)

Tier Stream Panels: Only Tier 1 gets a dedicated text box. Tier 2/3/4 streams exist in mma_streams dict but have no dedicated UI. Tier 3 output is tooltip-only on DAG nodes. No Tier 2 (Tech Lead) or Tier 4 (QA) visibility at all.
DAG Editing: Can Retry/Skip tickets but cannot reorder, insert, or delete tasks from the GUI.
Conductor Lifecycle Forms: /conductor:setup and /conductor:newTrack have no GUI equivalents — they're CLI-only. Users must use slash commands or the epic planning flow.
Track-Scoped Discussion: Discussions are global. When a track is active, the discussion panel should optionally isolate to that track's context. project_manager.load_track_history() exists but isn't wired to the GUI.
Cost Estimation: Token counts are displayed but not converted to estimated cost per tier or per track.
Approval State Indicators: The dashboard doesn't visually indicate when a spawn/step/tool approval is pending. pending_mma_spawn_approval, pending_mma_step_approval, pending_tool_approval are tracked but not rendered.
Track Proposal Editing: The modal shows proposed tracks read-only. No ability to edit track titles, goals, or remove unwanted tracks before accepting.
Stream Scrollability: The Tier 1 stream is a fixed 150px non-scrolling text box. All tier streams need scrollable, resizable panels.

Goals

Tier Stream Visibility: Dedicated, scrollable panels for all 4 tier output streams (Tier 1 Strategy, Tier 2 Tech Lead, Tier 3 Worker, Tier 4 QA) with auto-scroll and copy support.
DAG Manipulation: Add/remove tickets from the active track's DAG via the GUI, with dependency validation.
Conductor GUI Forms: Setup and track creation forms that invoke the same logic as the CLI slash commands.
Track-Scoped Discussions: Switch the discussion panel to track-specific history when a track is active.
Cost Tracking: Per-tier and per-track cost estimation based on model pricing.
Approval Indicators: Clear visual cues (blinking, color changes) when any approval gate is pending.
Track Proposal Editing: Allow editing/removing proposed tracks before acceptance.
Polish & Density: Make the dashboard information-dense and responsive to the MMA engine's state.

Functional Requirements

Tier Stream Panels

Four collapsible/expandable text regions in the MMA dashboard, one per tier.
Auto-scroll to bottom on new content. Toggle for manual scroll lock.
Each stream populated from self.mma_streams keyed by tier prefix.
Tier 3 streams: aggregate all "Tier 3: T-xxx" keyed entries, render with ticket ID headers.
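
The Tier 3 aggregation above could look roughly like the following. This is a minimal sketch, not the project's implementation: the function name `aggregate_tier3_streams`, the header format, and the assumption that the ticket ID is whatever follows the first colon in the key are all illustrative.

```python
from typing import Dict


def aggregate_tier3_streams(mma_streams: Dict[str, str], prefix: str = "Tier 3") -> str:
    """Combine every per-ticket Tier 3 stream into one text block.

    Assumption: per-ticket keys start with `prefix` and carry the ticket ID
    after the first colon, e.g. "Tier 3: T-001".
    """
    sections = []
    for key in sorted(mma_streams):
        if not key.startswith(prefix):
            continue  # other tiers are rendered in their own panels
        _, _, ticket_id = key.partition(":")
        header = ticket_id.strip() or "(no ticket id)"
        sections.append(f"=== {header} ===\n{mma_streams[key]}")
    return "\n\n".join(sections)
```

Sorting the keys keeps the panel order stable between frames, which matters for an immediate-mode GUI that rebuilds the text every frame.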

DAG Editing

"Add Ticket" button: opens an inline form (ID, description, target_file, depends_on dropdown).
"Remove Ticket" button on each DAG node (with confirmation).
Changes must update self.active_tickets, rebuild the ConductorEngine's TrackDAG, and push state via _push_state.
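
Dependency validation for the add and remove actions can be sketched as below. The plain `dict` of ticket ID to dependency list stands in for the real Ticket and TrackDAG structures, and both function names are hypothetical.

```python
from typing import Dict, List


def validate_add(deps: Dict[str, List[str]], ticket_id: str, depends_on: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the ticket may be added.

    A new ticket that depends only on existing tickets cannot introduce a
    cycle, because nothing depends on the new ticket yet.
    """
    problems = []
    if ticket_id in deps:
        problems.append(f"duplicate id: {ticket_id}")
    for dep in depends_on:
        if dep == ticket_id:
            problems.append("ticket depends on itself")
        elif dep not in deps:
            problems.append(f"unknown dependency: {dep}")
    return problems


def dependents_of(deps: Dict[str, List[str]], ticket_id: str) -> List[str]:
    """Tickets that list ticket_id as a dependency; these block a plain removal."""
    return sorted(t for t, ds in deps.items() if ticket_id in ds)
```

A "Remove Ticket" confirmation could list the result of `dependents_of` so the user sees which tickets would be left with a dangling dependency.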

Conductor Lifecycle Forms

"Setup Conductor" button that reads conductor/workflow.md, conductor/tech-stack.md, conductor/product.md and displays a readiness summary.
"New Track" form: name, description, type dropdown. Creates the track directory structure under conductor/tracks/.

Track-Scoped Discussion

When self.active_track is set, add a "Track Discussion" toggle that switches the discussion panel to the history returned by project_manager.load_track_history(track_id).
Saving flushes to the track's history file instead of the project's.

Cost Tracking

Model pricing table (configurable or hardcoded initial version).
Compute cost per tier as (input_tokens / 1M) * input_price + (output_tokens / 1M) * output_price, with prices expressed per million tokens.
Display as additional column in the existing token usage table.
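
A minimal sketch of the cost computation follows. The model names and prices in `PRICING` are placeholders, not real vendor pricing, and the shape of the `usage` dictionary is an assumption made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical pricing table: model -> (input_price, output_price) in USD per 1M tokens.
PRICING: Dict[str, Tuple[float, float]] = {
    "model-a": (0.10, 0.40),
    "model-b": (1.25, 10.00),
}


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Cost in USD, with prices quoted per one million tokens."""
    return (input_tokens / 1_000_000) * input_price + (output_tokens / 1_000_000) * output_price


def tier_costs(usage: Dict[str, Dict[str, int]],
               tier_models: Dict[str, str]) -> Dict[str, Optional[float]]:
    """Per-tier cost; None when the tier's model has no entry in PRICING."""
    costs: Dict[str, Optional[float]] = {}
    for tier, counts in usage.items():
        prices = PRICING.get(tier_models.get(tier, ""))
        costs[tier] = None if prices is None else estimate_cost(
            counts["input"], counts["output"], *prices)
    return costs
```

Returning `None` for an unpriced model lets the table show a placeholder instead of a misleading zero; the per-track total would then be the sum of the non-`None` entries.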

Approval Indicators

When _pending_mma_spawn is not None: flash the "MMA Dashboard" tab header or show a blinking indicator.
When _pending_mma_approval is not None: show the same indicator.
When _pending_ask_dialog is True: show the same indicator.
Use imgui.push_style_color to tint the relevant UI region.
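
One way to drive the blinking is to derive the tint from the clock each frame. The sketch below only computes the colour; it deliberately makes no imgui calls. Both function names and the RGBA-tuple representation are assumptions, and the result would be handed to imgui.push_style_color (paired with a matching pop) by the rendering code.

```python
import math
from typing import Optional, Tuple

RGBA = Tuple[float, float, float, float]


def approval_pending(spawn: Optional[object], step: Optional[object], ask_dialog: bool) -> bool:
    """True when any of the three approval gates is waiting on the user."""
    return spawn is not None or step is not None or bool(ask_dialog)


def blink_color(base: RGBA, now: float, period: float = 1.0, min_alpha: float = 0.3) -> RGBA:
    """Return `base` with its alpha oscillating between min_alpha and 1.0.

    `now` is a timestamp in seconds (for example time.monotonic()).
    """
    r, g, b, _ = base
    phase = 0.5 * (1.0 + math.sin(2.0 * math.pi * now / period))  # ranges over 0..1
    return (r, g, b, min_alpha + (1.0 - min_alpha) * phase)
```

Because the colour is a pure function of time, the indicator needs no extra state and stops blinking as soon as `approval_pending` returns False.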

Track Proposal Editing

Make track titles and goals editable in the proposal modal.
Add a "Remove" button per proposed track.
Edited data flows back to self.proposed_tracks before acceptance.

Non-Functional Requirements

Thread Safety: All new data mutations from background threads must go through _pending_gui_tasks. No direct GUI state writes from non-main threads.
No New Dependencies: Use only existing Dear PyGui / imgui-bundle APIs.
Performance: New panels must not degrade FPS below 30 under normal operation. Verify via get_ui_performance.
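
The thread-safety rule amounts to a hand-off queue that background threads append to and the main thread drains once per frame. The class below is a hypothetical stand-in for illustration; the real _pending_gui_tasks structure and its action catalog are described in docs/guide_architecture.md and may differ.

```python
import threading
from typing import Any, Dict, List


class GuiTaskQueue:
    """Hypothetical hand-off queue between worker threads and the GUI thread."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._tasks: List[Dict[str, Any]] = []

    def post(self, task: Dict[str, Any]) -> None:
        """Safe to call from any thread; never touches GUI state directly."""
        with self._lock:
            self._tasks.append(task)

    def drain(self) -> List[Dict[str, Any]]:
        """Call from the main thread once per frame; returns and clears pending tasks."""
        with self._lock:
            tasks, self._tasks = self._tasks, []
        return tasks
```

Swapping the list out under the lock keeps the critical section short, so a worker posting a task never waits on the GUI thread while it applies the previous batch.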

Architecture Reference

Threading model and _process_pending_gui_tasks action catalog: docs/guide_architecture.md
MMA data structures (Ticket, Track, WorkerContext): docs/guide_mma.md
Hook API for testing: docs/guide_tools.md
Simulation patterns: docs/guide_simulations.md

Functional Requirements (Engine Enhancements)

Live Worker Streaming

During run_worker_lifecycle, set ai_client.comms_log_callback to push intermediate text chunks to the per-ticket stream via the event queue. Currently workers are black boxes until completion, whereas both Claude Code and Gemini CLI stream in real time. The callback should push {"text": chunk, "stream_id": "Tier 3 (Worker): {ticket.id}", "status": "streaming..."} events.
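
A sketch of such a callback factory is shown below. It assumes comms_log_callback is invoked with a single text chunk and that the event queue exposes a `put` method like `queue.Queue`; neither is confirmed by this spec, and `make_stream_callback` is a hypothetical name.

```python
import queue
from typing import Callable


def make_stream_callback(event_queue: "queue.Queue", ticket_id: str) -> Callable[[str], None]:
    """Build a per-ticket callback that forwards text chunks as stream events."""
    stream_id = f"Tier 3 (Worker): {ticket_id}"

    def on_chunk(chunk: str) -> None:
        # Assumption: the callback receives one text chunk per call.
        event_queue.put({"text": chunk, "stream_id": stream_id, "status": "streaming..."})

    return on_chunk
```

The worker lifecycle would install the returned function before the request and restore the previous callback afterwards, so one ticket's chunks never leak into another ticket's stream.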

Per-Tier Model Configuration

mma_exec.py:get_model_for_role is hardcoded. Add a GUI section with imgui.combo dropdowns for each tier's model. Persist to project["mma"]["tier_models"]. Wire into ConductorEngine and run_worker_lifecycle.
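
The lookup and persistence path might be sketched as follows, operating on the already-parsed project dictionary. The tier keys, the default model names, and both helper names are assumptions; only the project["mma"]["tier_models"] location comes from this spec.

```python
from typing import Any, Dict

# Hypothetical defaults used when the project file has no override for a tier.
DEFAULT_TIER_MODELS: Dict[str, str] = {
    "tier1": "default-model-1",
    "tier2": "default-model-2",
    "tier3": "default-model-3",
    "tier4": "default-model-4",
}


def resolve_model(project: Dict[str, Any], tier: str) -> str:
    """Return the configured model for a tier, falling back to the default."""
    return project.get("mma", {}).get("tier_models", {}).get(tier, DEFAULT_TIER_MODELS[tier])


def set_model(project: Dict[str, Any], tier: str, model: str) -> None:
    """Record a dropdown selection; the caller then writes the project back to TOML."""
    project.setdefault("mma", {}).setdefault("tier_models", {})[tier] = model
```

Keeping the fallback inside `resolve_model` means existing project files without an mma.tier_models table keep working unchanged.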

Parallel DAG Execution

ConductorEngine.run() executes ready tickets sequentially. DAG-independent tickets should run in parallel via asyncio.gather. Constraint: ai_client._send_lock serializes all API calls, so parallel workers may need separate provider instances, or the lock may need to be per-session rather than global. Mark as exploratory.
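
A wave-based version of the parallel scheduler can be sketched as below. It is an illustration under stated assumptions, not ConductorEngine's code: `run_dag` and the `deps` mapping are hypothetical, and `run_ticket` stands in for whatever coroutine runs one worker.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def run_dag(deps: Dict[str, List[str]],
                  run_ticket: Callable[[str], Awaitable[None]]) -> List[List[str]]:
    """Run tickets in waves; every ticket in a wave has all dependencies done.

    Returns the waves in execution order. Raises ValueError if no ticket is
    ready while some remain (a cycle or a dependency on an unknown ticket).
    """
    done: set = set()
    waves: List[List[str]] = []
    while len(done) < len(deps):
        ready = sorted(t for t, ds in deps.items()
                       if t not in done and all(d in done for d in ds))
        if not ready:
            raise ValueError("cycle or missing dependency in DAG")
        # Independent tickets in the same wave are awaited concurrently.
        await asyncio.gather(*(run_ticket(t) for t in ready))
        done.update(ready)
        waves.append(ready)
    return waves
```

Waves are simpler than fully dynamic scheduling but leave a ticket waiting for the slowest member of the previous wave. Note also that asyncio.gather only overlaps coroutines that actually yield; if each worker blocks on a global threading lock, the tickets would still execute one at a time.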

Automatic Retry with Model Escalation

mma_exec.py has --failure-count for escalation, but ConductorEngine doesn't use it. When a worker produces BLOCKED, auto-retry with a more capable model (flash-lite -> flash -> pro), up to 2 retries.
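
The escalation loop could be sketched as follows. The ladder entries are the shorthand names from the commit message rather than real model identifiers, `run_worker` is a hypothetical callable returning a status string, and how the failure count is actually passed to mma_exec.py is not shown here.

```python
from typing import Callable, List, Sequence, Tuple

# Shorthand tier names, weakest first; real model IDs are not specified here.
ESCALATION_LADDER: Sequence[str] = ("flash-lite", "flash", "pro")


def model_for_attempt(failure_count: int, ladder: Sequence[str] = ESCALATION_LADDER) -> str:
    """Pick the model for the given number of prior failures, capped at the top rung."""
    return ladder[min(failure_count, len(ladder) - 1)]


def run_with_escalation(run_worker: Callable[[str], str],
                        max_retries: int = 2) -> List[Tuple[str, str]]:
    """Run a worker, retrying with a stronger model while it reports BLOCKED.

    Returns the (model, status) pair for every attempt made.
    """
    attempts: List[Tuple[str, str]] = []
    for failure_count in range(max_retries + 1):
        model = model_for_attempt(failure_count)
        status = run_worker(model)
        attempts.append((model, status))
        if status != "BLOCKED":
            break
    return attempts
```

With two retries the loop makes at most three attempts, one per rung of the ladder; a ticket that is still BLOCKED on the last rung would fall back to the existing manual Retry/Skip controls.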

Out of Scope

Remote management via web browser.
Visual diagram generation (Dear PyGui node editor for DAG — future track).
Docking/floating multi-viewport layout (requires imgui docking branch investigation — future track).
