- Add cost tracking with new cost_tracker.py module
- Enhance Track Proposal modal with editable titles and goals
- Add Conductor Setup summary and New Track creation form to MMA Dashboard
- Implement Task DAG editing (add/delete tickets) and track-scoped discussion
- Add visual polish: color-coded statuses, tinted progress bars, and node indicators
- Support live worker streaming from AI providers to GUI panels
- Fix numerous integration test regressions and stabilize headless service
Implementation Plan: Comprehensive Conductor & MMA GUI UX
Architecture reference: docs/guide_architecture.md, docs/guide_mma.md
Phase 1: Tier Stream Panels & Approval Indicators
Focus: Make all 4 tier output streams visible and indicate pending approvals.
- Task 1.1: Replace the single Tier 1 strategy text box in `_render_mma_dashboard` (`gui_2.py:2700-2701`) with four collapsible sections, one per tier. Each section uses `imgui.collapsing_header(f"Tier {N}: {label}")` wrapping a `begin_child` scrollable region (200px height). Tier 1 = "Strategy", Tier 2 = "Tech Lead", Tier 3 = "Workers", Tier 4 = "QA". Tier 3 should aggregate all `mma_streams` keys containing "Tier 3" with ticket ID sub-headers. Each section auto-scrolls to bottom when new content arrives (track previous scroll position, scroll only if user was at bottom).
- Task 1.2: Add approval state indicators to the MMA dashboard. After the "Status:" line in `_render_mma_dashboard` (`gui_2.py:2672-2676`), check `self._pending_mma_spawn`, `self._pending_mma_approval`, and `self._pending_ask_dialog`. When any is active, render a colored blinking badge: `imgui.text_colored(ImVec4(1,0.3,0.3,1), "APPROVAL PENDING")` using `sin(time.time()*5)` for alpha pulse. Also add an `imgui.same_line()` button "Go to Approval" that scrolls/focuses the relevant dialog.
- Task 1.3: Write unit tests verifying: (a) `mma_streams` with keys "Tier 1", "Tier 2 (Tech Lead)", "Tier 3: T-001", "Tier 4 (QA)" are all rendered (check by mocking `imgui.collapsing_header` calls); (b) approval indicators appear when `_pending_mma_spawn is not None`.
- Task 1.4: Conductor - User Manual Verification 'Phase 1: Tier Stream Panels & Approval Indicators' (Protocol in workflow.md)
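The GUI-independent parts of Tasks 1.1 and 1.2 can be sketched as two pure functions, which also makes them easy to unit-test without mocking `imgui`. This is a minimal sketch under assumptions: the helper names `pulse_alpha` and `group_tier3_streams` are hypothetical, the alpha floor of 0.3 is an arbitrary choice, and the ticket ID is assumed to follow the first `": "` in the stream key.

```python
from __future__ import annotations

import math
from collections import defaultdict


def pulse_alpha(now: float, speed: float = 5.0, floor: float = 0.3) -> float:
    """Map sin(now * speed) from [-1, 1] into [floor, 1.0] for a blinking badge."""
    wave = (math.sin(now * speed) + 1.0) / 2.0  # normalised to 0.0 .. 1.0
    return floor + (1.0 - floor) * wave


def group_tier3_streams(mma_streams: dict[str, str]) -> dict[str, str]:
    """Collect every stream whose key contains "Tier 3", keyed by ticket ID."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for key, text in mma_streams.items():
        if "Tier 3" not in key:
            continue
        # Assumed key shape: "Tier 3: T-001" or "Tier 3 (Worker): T-001".
        _, _, ticket_id = key.partition(": ")
        grouped[ticket_id or "(unlabelled)"].append(text)
    return {tid: "\n".join(chunks) for tid, chunks in grouped.items()}
```

In the dashboard, the value from `pulse_alpha(time.time())` would go into the fourth component of the badge color, and each key of the grouped dict would become one ticket sub-header inside the Tier 3 section.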
Phase 2: Cost Tracking & Enhanced Token Table
Focus: Add cost estimation to the existing token usage display.
- Task 2.1: Create a new module `cost_tracker.py` with a `MODEL_PRICING` dict mapping model name patterns to `{"input_per_mtok": float, "output_per_mtok": float}`. Include entries (input/output, USD per million tokens) for: `gemini-2.5-flash-lite` ($0.075/$0.30), `gemini-2.5-flash` ($0.15/$0.60), `gemini-3-flash-preview` ($0.15/$0.60), `gemini-3.1-pro-preview` ($3.50/$10.50), `claude-*-sonnet` ($3/$15), `claude-*-opus` ($15/$75), `deepseek-v3` ($0.27/$1.10). Function: `estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float` that pattern-matches on the model name and returns the dollar cost.
- Task 2.2: Extend the token usage table in `_render_mma_dashboard` (`gui_2.py:2685-2699`) from 3 columns to 5: add "Est. Cost" and "Model". Populate using `cost_tracker.estimate_cost()` with the model name from `self.mma_tier_usage` (need to extend the `tier_usage` dict in `ConductorEngine._push_state` to include model name per tier, or use a default mapping: Tier 1 → `gemini-3.1-pro-preview`, Tier 2 → `gemini-3-flash-preview`, Tier 3 → `gemini-2.5-flash-lite`, Tier 4 → `gemini-2.5-flash-lite`). Show total cost row at bottom.
- Task 2.3: Write tests for `cost_tracker.estimate_cost()` covering all model patterns and edge cases (unknown model returns 0).
- [~] Task 2.4: Conductor - User Manual Verification 'Phase 2: Cost Tracking & Enhanced Token Table' (Protocol in workflow.md)
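One possible shape for `cost_tracker.py` from Task 2.1 is sketched below. The prices are copied from the task; the use of `fnmatch` glob matching, the trailing `*` on every pattern (so dated or suffixed model names still match), and the lower-casing of the model name are assumptions rather than part of the plan.

```python
from __future__ import annotations

from fnmatch import fnmatchcase

# USD per million tokens, as listed in Task 2.1.
# Insertion order matters: the first matching pattern wins, so the more
# specific "flash-lite" entry must precede the broader "flash" entry.
MODEL_PRICING: dict[str, dict[str, float]] = {
    "gemini-2.5-flash-lite*": {"input_per_mtok": 0.075, "output_per_mtok": 0.30},
    "gemini-2.5-flash*": {"input_per_mtok": 0.15, "output_per_mtok": 0.60},
    "gemini-3-flash-preview*": {"input_per_mtok": 0.15, "output_per_mtok": 0.60},
    "gemini-3.1-pro-preview*": {"input_per_mtok": 3.50, "output_per_mtok": 10.50},
    "claude-*-sonnet*": {"input_per_mtok": 3.0, "output_per_mtok": 15.0},
    "claude-*-opus*": {"input_per_mtok": 15.0, "output_per_mtok": 75.0},
    "deepseek-v3*": {"input_per_mtok": 0.27, "output_per_mtok": 1.10},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated dollar cost, or 0.0 when no pattern matches."""
    name = model.lower()
    for pattern, price in MODEL_PRICING.items():
        if fnmatchcase(name, pattern):
            total = (
                input_tokens * price["input_per_mtok"]
                + output_tokens * price["output_per_mtok"]
            )
            return total / 1_000_000
    return 0.0
```

Returning 0.0 for unknown models matches the edge case named in Task 2.3 and keeps the total cost row from failing when a tier uses a model that has no pricing entry.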
Phase 3: Track Proposal Editing & Conductor Lifecycle Forms
Focus: Make track proposals editable and add conductor setup/newTrack GUI forms.
- Task 3.1: Enhance `_render_track_proposal_modal` (`gui_2.py:2146-2173`) to make track titles and goals editable. Replace `imgui.text_colored` for title with `imgui.input_text(f"##track_title_{idx}", track['title'])`. Replace `imgui.text_wrapped` for goal with `imgui.input_text_multiline(f"##track_goal_{idx}", track['goal'], ImVec2(-1, 60))`. Add a "Remove" button per track (`imgui.button(f"Remove##{idx}")`) that pops from `self.proposed_tracks`. Edited values must be written back to `self.proposed_tracks[idx]`.
- Task 3.2: Add a "Conductor Setup" collapsible section at the top of the MMA dashboard (before the Track Browser). Contains a "Run Setup" button. On click, reads `conductor/workflow.md`, `conductor/tech-stack.md`, `conductor/product.md` using `Path.read_text()`, computes a readiness summary (files found, line counts, track count via `project_manager.get_all_tracks()`), and displays it in a read-only text region. This is informational only; no backend changes.
- Task 3.3: Add a "New Track" form below the Track Browser. Fields: track name (input_text), description (input_text_multiline), type dropdown (feature/chore/fix via `imgui.combo`). "Create" button calls a new helper `_cb_create_track(name, desc, type)` that: creates the `conductor/tracks/{name}_{date}/` directory, writes a minimal `spec.md` from the description, writes an empty `plan.md` template, writes `metadata.json` with the track ID/type/status="new", then refreshes `self.tracks` via `project_manager.get_all_tracks()`.
- Task 3.4: Write tests for track creation helper: verify directory structure, file contents, and metadata.json format. Test proposal modal editing by verifying the `proposed_tracks` list is mutated correctly.
- [~] Task 3.5: Conductor - User Manual Verification 'Phase 3: Track Proposal Editing & Conductor Lifecycle Forms' (Protocol in workflow.md)
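The filesystem half of `_cb_create_track` from Task 3.3 might look like the sketch below, written as a free function so it can be tested against a temporary directory. Several details are assumptions not fixed by the plan: the function name `create_track`, the `YYYYMMDD` date stamp, the `metadata.json` key names, and the contents of the `spec.md` and `plan.md` stubs.

```python
from __future__ import annotations

import json
from datetime import date
from pathlib import Path


def create_track(
    base_dir: Path, name: str, desc: str, track_type: str, today: date | None = None
) -> Path:
    """Create conductor/tracks/{name}_{date}/ with spec.md, plan.md, metadata.json."""
    stamp = (today or date.today()).strftime("%Y%m%d")  # assumed date format
    track_id = f"{name}_{stamp}"
    track_dir = Path(base_dir) / "conductor" / "tracks" / track_id
    # exist_ok=False: refuse to overwrite a track created earlier the same day.
    track_dir.mkdir(parents=True, exist_ok=False)

    (track_dir / "spec.md").write_text(f"# {name}\n\n{desc}\n", encoding="utf-8")
    (track_dir / "plan.md").write_text(
        f"# Implementation Plan: {name}\n", encoding="utf-8"
    )
    # Key names are assumed; the plan only requires track ID, type and status.
    metadata = {"id": track_id, "type": track_type, "status": "new"}
    (track_dir / "metadata.json").write_text(
        json.dumps(metadata, indent=2), encoding="utf-8"
    )
    return track_dir
```

The GUI callback would then call this helper and refresh `self.tracks` via `project_manager.get_all_tracks()`; injecting `today` keeps the directory name deterministic in tests.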
Phase 4: DAG Editing & Track-Scoped Discussion
Focus: Allow GUI-based ticket manipulation and track-specific discussion history.
- Task 4.1: Add an "Add Ticket" button below the Task DAG section in `_render_mma_dashboard`. On click, show an inline form: ticket ID (input_text, default auto-increment like "T-NNN"), description (input_text_multiline), target_file (input_text), depends_on (multi-select or comma-separated input of existing ticket IDs). "Create" button appends a new `Ticket` dict to `self.active_tickets` with `status="todo"` and triggers `_push_mma_state_update()` to synchronize the ConductorEngine. Cancel hides the form. Store the form visibility in `self._show_add_ticket_form: bool`.
- Task 4.2: Add a "Delete" button to each DAG node in `_render_ticket_dag_node` (`gui_2.py:2770-2773`, after the Skip button). On click, show a confirmation popup. On confirm, remove the ticket from `self.active_tickets`, remove it from all other tickets' `depends_on` lists, and push state update. Only allow deletion of `todo` or `blocked` tickets (not `in_progress` or `completed`).
- Task 4.3: Add track-scoped discussion support. In `_render_discussion_panel` (`gui_2.py:2295-2483`), add a toggle checkbox "Track Discussion" (visible only when `self.active_track` is set). When toggled ON: load history via `project_manager.load_track_history(self.active_track.id, base_dir)` into `self.disc_entries`, set a flag `self._track_discussion_active = True`. When toggled OFF or track changes: restore project discussion. On save/flush, if `_track_discussion_active`, write to track history file instead of project history.
- Task 4.4: Write tests for: (a) adding a ticket updates `active_tickets` and has correct default fields; (b) deleting a ticket removes it from all `depends_on` references; (c) track discussion toggle switches `disc_entries` source.
- [~] Task 4.5: Conductor - User Manual Verification 'Phase 4: DAG Editing & Track-Scoped Discussion' (Protocol in workflow.md)
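The list manipulation behind Tasks 4.1 and 4.2 can be sketched without any GUI code. The functions below are hypothetical helpers operating on a plain list of ticket dicts; the field names follow the form fields named in Task 4.1, and rejecting unknown dependency IDs is an added assumption.

```python
from __future__ import annotations

import re

DELETABLE_STATUSES = {"todo", "blocked"}


def next_ticket_id(tickets: list[dict]) -> str:
    """Return the next "T-NNN" ID after the highest existing numeric ID."""
    numbers = [
        int(m.group(1)) for t in tickets if (m := re.fullmatch(r"T-(\d+)", t["id"]))
    ]
    return f"T-{max(numbers, default=0) + 1:03d}"


def add_ticket(
    tickets: list[dict], description: str, target_file: str, depends_on: list[str] = ()
) -> dict:
    """Append a new "todo" ticket; dependencies must reference existing tickets."""
    known = {t["id"] for t in tickets}
    unknown = [dep for dep in depends_on if dep not in known]
    if unknown:
        raise ValueError(f"unknown dependencies: {unknown}")
    ticket = {
        "id": next_ticket_id(tickets),
        "description": description,
        "target_file": target_file,
        "depends_on": list(depends_on),
        "status": "todo",
    }
    tickets.append(ticket)
    return ticket


def delete_ticket(tickets: list[dict], ticket_id: str) -> bool:
    """Remove a todo/blocked ticket and scrub it from every depends_on list."""
    target = next((t for t in tickets if t["id"] == ticket_id), None)
    if target is None or target["status"] not in DELETABLE_STATUSES:
        return False
    tickets.remove(target)
    for t in tickets:
        t["depends_on"] = [d for d in t.get("depends_on", []) if d != ticket_id]
    return True
```

In the dashboard, both helpers would be followed by `_push_mma_state_update()` so the ConductorEngine sees the edited DAG.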
Phase 5: Visual Polish & Integration Testing
Focus: Dense, responsive dashboard with arcade aesthetics and end-to-end verification.
- [~] Task 5.1: Add color-coded styling to the Track Browser table. Status column uses colored text: "new" = gray, "active" = yellow, "done" = green, "blocked" = red. Progress bar uses `imgui.push_style_color` to tint: <33% red, 33-66% yellow, >66% green.
- Task 5.2: Improve the DAG tree nodes with status-colored left borders. Use `imgui.get_cursor_screen_pos()` and `imgui.get_window_draw_list().add_rect_filled()` to draw a 4px colored strip to the left of each tree node matching its status color.
- Task 5.3: Add a "Dashboard Summary" header line at the top of `_render_mma_dashboard` showing: `Track: {name} | Tickets: {done}/{total} | Cost: ${total_cost:.4f} | Status: {mma_status}` in a single dense line with colored segments.
- Task 5.4: Write an end-to-end integration test (extending `tests/visual_sim_mma_v2.py` or creating `tests/visual_sim_gui_ux.py`) that verifies via `ApiHookClient`: (a) track creation form produces correct directory structure; (b) tier streams are populated during MMA execution; (c) approval indicators appear when expected; (d) cost tracking shows non-zero values after execution.
- Task 5.5: Verify all new UI elements maintain >30 FPS via `get_ui_performance` during a full MMA simulation run.
- Task 5.6: Conductor - User Manual Verification 'Phase 5: Visual Polish & Integration Testing' (Protocol in workflow.md)
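The color rules in Task 5.1 reduce to two lookups that can live outside the render loop. This sketch assumes RGBA tuples in the 0.0 to 1.0 range (to be wrapped in `ImVec4` at the call site); the exact shades, the white fallback for unknown statuses, and the function names are illustrative choices.

```python
from __future__ import annotations

RGBA = tuple[float, float, float, float]

GRAY: RGBA = (0.6, 0.6, 0.6, 1.0)
YELLOW: RGBA = (1.0, 0.85, 0.2, 1.0)
GREEN: RGBA = (0.3, 0.9, 0.3, 1.0)
RED: RGBA = (1.0, 0.3, 0.3, 1.0)
WHITE: RGBA = (1.0, 1.0, 1.0, 1.0)

# Status-to-color mapping from Task 5.1.
STATUS_COLORS: dict[str, RGBA] = {
    "new": GRAY,
    "active": YELLOW,
    "done": GREEN,
    "blocked": RED,
}


def status_color(status: str) -> RGBA:
    """Color for the Track Browser status column; unknown statuses fall back to white."""
    return STATUS_COLORS.get(status, WHITE)


def progress_tint(percent: float) -> RGBA:
    """Tint for the progress bar: <33% red, 33-66% yellow, >66% green."""
    if percent < 33:
        return RED
    if percent <= 66:
        return YELLOW
    return GREEN
```

The same `status_color` result could be reused for the 4px strip in Task 5.2, so the table and the DAG nodes stay visually consistent.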
Phase 6: Live Worker Streaming & Engine Enhancements
Focus: Make MMA execution observable in real-time and configurable from the GUI. Currently workers are black boxes until completion.
- Task 6.1: Wire `ai_client.comms_log_callback` to per-ticket streams during `run_worker_lifecycle` (`multi_agent_conductor.py:207-300`). Before calling `ai_client.send()`, set `ai_client.comms_log_callback` to a closure that pushes intermediate text chunks to the GUI via `_queue_put(event_queue, loop, "response", {"text": chunk, "stream_id": f"Tier 3 (Worker): {ticket.id}", "status": "streaming..."})`. After `send()` returns, restore the original callback. This gives real-time output streaming to the Tier 3 stream panels from Phase 1.
- Task 6.2: Add per-tier model configuration to the MMA dashboard. Below the token usage table in `_render_mma_dashboard`, add a collapsible "Tier Model Config" section with 4 rows (Tier 1-4). Each row: tier label + `imgui.combo` dropdown populated from `ai_client.list_models()` (cached). Store selections in `self.mma_tier_models: dict[str, str]` with defaults from `mma_exec.get_model_for_role()`. On change, write to `self.project["mma"]["tier_models"]` for persistence.
- Task 6.3: Wire per-tier model config into the execution pipeline. In `ConductorEngine.run` (`multi_agent_conductor.py:105-135`), when creating `WorkerContext`, read the model name from the GUI's `mma_tier_models` dict (passed via the event queue or stored on the engine). Pass it through to `run_worker_lifecycle`, which should use it in `ai_client.set_provider` / `ai_client.set_model_params` before calling `send()`. Also update `mma_exec.py:get_model_for_role` to accept an override parameter.
- Task 6.4: Add parallel DAG execution. In `ConductorEngine.run` (`multi_agent_conductor.py:100-135`), replace the sequential `for ticket in ready_tasks` loop with `asyncio.gather(*[loop.run_in_executor(None, run_worker_lifecycle, ...) for ticket in ready_tasks])`. Each worker already gets its own `ai_client.reset_session()` so they're isolated. Guard with `ai_client._send_lock` awareness: if the lock serializes all sends, parallel execution won't help. In that case, create per-worker provider instances or use separate session IDs. Mark this task as exploratory: if `_send_lock` blocks parallelism, document the constraint and defer.
- Task 6.5: Add automatic retry with model escalation. In `ConductorEngine.run`, after `run_worker_lifecycle` returns, check if `ticket.status == "blocked"`. If so, and `retry_count < max_retries` (default 2), increment retry count, escalate the model (e.g., flash-lite → flash → pro), and re-execute. Store `retry_count` as a field on the ticket dict. After max retries, leave as blocked.
- Task 6.6: Write tests for: (a) streaming callback pushes intermediate content to event queue; (b) per-tier model config persists to project TOML; (c) retry escalation increments model tier.
- Task 6.7: Conductor - User Manual Verification 'Phase 6: Live Worker Streaming & Engine Enhancements' (Protocol in workflow.md)
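The retry loop described in Task 6.5 can be sketched independently of the engine. Everything named here is hypothetical: `run_with_escalation` stands in for the logic inside `ConductorEngine.run`, `run_worker` stands in for `run_worker_lifecycle`, and the mapping of "flash-lite → flash → pro" onto the concrete model names from Task 2.1 is an assumption.

```python
from __future__ import annotations

from typing import Callable

# Assumed ladder: "flash-lite -> flash -> pro" mapped onto names from Task 2.1.
ESCALATION_LADDER = [
    "gemini-2.5-flash-lite",
    "gemini-2.5-flash",
    "gemini-3.1-pro-preview",
]


def escalate_model(model: str) -> str:
    """Return the next model up the ladder; top-tier or unknown models are unchanged."""
    if model not in ESCALATION_LADDER:
        return model
    idx = ESCALATION_LADDER.index(model)
    return ESCALATION_LADDER[min(idx + 1, len(ESCALATION_LADDER) - 1)]


def run_with_escalation(
    ticket: dict,
    model: str,
    run_worker: Callable[[dict, str], None],
    max_retries: int = 2,
) -> str:
    """Run the worker, retrying on a stronger model while the ticket stays blocked.

    The worker is expected to set ticket["status"]. Returns the model used
    for the final attempt; retry_count is stored on the ticket dict.
    """
    ticket.setdefault("retry_count", 0)
    run_worker(ticket, model)
    while ticket["status"] == "blocked" and ticket["retry_count"] < max_retries:
        ticket["retry_count"] += 1
        model = escalate_model(model)
        run_worker(ticket, model)
    return model
```

With the default `max_retries` of 2, a ticket that starts on flash-lite gets at most three attempts and reaches the top of the ladder on the last one; if it is still blocked after that, it is left blocked as the task requires.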