diff --git a/conductor/tracks/comprehensive_gui_ux_20260228/plan.md b/conductor/tracks/comprehensive_gui_ux_20260228/plan.md
index 6c9e021..1462379 100644
--- a/conductor/tracks/comprehensive_gui_ux_20260228/plan.md
+++ b/conductor/tracks/comprehensive_gui_ux_20260228/plan.md
@@ -45,3 +45,14 @@ Focus: Dense, responsive dashboard with arcade aesthetics and end-to-end verific
 - [ ] Task 5.4: Write an end-to-end integration test (extending `tests/visual_sim_mma_v2.py` or creating `tests/visual_sim_gui_ux.py`) that verifies via `ApiHookClient`: (a) track creation form produces correct directory structure; (b) tier streams are populated during MMA execution; (c) approval indicators appear when expected; (d) cost tracking shows non-zero values after execution.
 - [ ] Task 5.5: Verify all new UI elements maintain >30 FPS via `get_ui_performance` during a full MMA simulation run.
 - [ ] Task 5.6: Conductor - User Manual Verification 'Phase 5: Visual Polish & Integration Testing' (Protocol in workflow.md)
+
+## Phase 6: Live Worker Streaming & Engine Enhancements
+Focus: Make MMA execution observable in real-time and configurable from the GUI. Currently workers are black boxes until completion.
+
+- [ ] Task 6.1: Wire `ai_client.comms_log_callback` to per-ticket streams during `run_worker_lifecycle` (multi_agent_conductor.py:207-300). Before calling `ai_client.send()`, set `ai_client.comms_log_callback` to a closure that pushes intermediate text chunks to the GUI via `_queue_put(event_queue, loop, "response", {"text": chunk, "stream_id": f"Tier 3 (Worker): {ticket.id}", "status": "streaming..."})`. After `send()` returns, restore the original callback. This gives real-time output streaming to the Tier 3 stream panels from Phase 1.
+- [ ] Task 6.2: Add per-tier model configuration to the MMA dashboard. Below the token usage table in `_render_mma_dashboard`, add a collapsible "Tier Model Config" section with 4 rows (Tier 1-4). Each row: tier label + `imgui.combo` dropdown populated from `ai_client.list_models()` (cached). Store selections in `self.mma_tier_models: dict[str, str]` with defaults from `mma_exec.get_model_for_role()`. On change, write to `self.project["mma"]["tier_models"]` for persistence.
+- [ ] Task 6.3: Wire per-tier model config into the execution pipeline. In `ConductorEngine.run` (multi_agent_conductor.py:105-135), when creating `WorkerContext`, read the model name from the GUI's `mma_tier_models` dict (passed via the event queue or stored on the engine). Pass it through to `run_worker_lifecycle` which should use it in `ai_client.set_provider`/`ai_client.set_model_params` before calling `send()`. Also update `mma_exec.py:get_model_for_role` to accept an override parameter.
+- [ ] Task 6.4: Add parallel DAG execution. In `ConductorEngine.run` (multi_agent_conductor.py:100-135), replace the sequential `for ticket in ready_tasks` loop with `asyncio.gather(*[loop.run_in_executor(None, run_worker_lifecycle, ...) for ticket in ready_tasks])`. Each worker already gets its own `ai_client.reset_session()` so they're isolated. Guard with `ai_client._send_lock` awareness — if the lock serializes all sends, parallel execution won't help. In that case, create per-worker provider instances or use separate session IDs. Mark this task as exploratory — if `_send_lock` blocks parallelism, document the constraint and defer.
+- [ ] Task 6.5: Add automatic retry with model escalation. In `ConductorEngine.run`, after `run_worker_lifecycle` returns, check if `ticket.status == "blocked"`. If so, and `retry_count < max_retries` (default 2), increment retry count, escalate the model (e.g., flash-lite → flash → pro), and re-execute. Store `retry_count` as a field on the ticket dict. After max retries, leave as blocked.
+- [ ] Task 6.6: Write tests for: (a) streaming callback pushes intermediate content to event queue; (b) per-tier model config persists to project TOML; (c) retry escalation increments model tier.
+- [ ] Task 6.7: Conductor - User Manual Verification 'Phase 6: Live Worker Streaming & Engine Enhancements' (Protocol in workflow.md)
diff --git a/conductor/tracks/comprehensive_gui_ux_20260228/spec.md b/conductor/tracks/comprehensive_gui_ux_20260228/spec.md
index 9718b29..5b6e652 100644
--- a/conductor/tracks/comprehensive_gui_ux_20260228/spec.md
+++ b/conductor/tracks/comprehensive_gui_ux_20260228/spec.md
@@ -92,8 +92,21 @@ This track enhances the existing MMA orchestration GUI from its current function
 - Hook API for testing: [docs/guide_tools.md](../../docs/guide_tools.md)
 - Simulation patterns: [docs/guide_simulations.md](../../docs/guide_simulations.md)
 
+## Functional Requirements (Engine Enhancements)
+
+### Live Worker Streaming
+- During `run_worker_lifecycle`, set `ai_client.comms_log_callback` to push intermediate text chunks to the per-ticket stream via the event queue. Currently workers are black boxes until completion — both Claude Code and Gemini CLI stream in real-time. The callback should push `{"text": chunk, "stream_id": "Tier 3 (Worker): {ticket.id}", "status": "streaming..."}` events.
+
+### Per-Tier Model Configuration
+- `mma_exec.py:get_model_for_role` is hardcoded. Add a GUI section with `imgui.combo` dropdowns for each tier's model. Persist to `project["mma"]["tier_models"]`. Wire into `ConductorEngine` and `run_worker_lifecycle`.
+
+### Parallel DAG Execution
+- `ConductorEngine.run()` executes ready tickets sequentially. DAG-independent tickets should run in parallel via `asyncio.gather`. Constraint: `ai_client._send_lock` serializes all API calls — parallel workers may need separate provider instances or the lock needs to be per-session rather than global. Mark as exploratory.
+
+### Automatic Retry with Model Escalation
+- `mma_exec.py` has `--failure-count` for escalation but `ConductorEngine` doesn't use it. When a worker produces BLOCKED, auto-retry with a more capable model (up to 2 retries).
+
 ## Out of Scope
-- Automated "Auto-Fix" loops without user intervention.
 - Remote management via web browser.
 - Visual diagram generation (Dear PyGui node editor for DAG — future track).
 - Docking/floating multi-viewport layout (requires imgui docking branch investigation — future track).
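Note on Task 6.1 (set the callback, call `send()`, restore the callback): a minimal sketch of that save/restore pattern, not the project's code. `StubClient` is a hypothetical stand-in that models only the `comms_log_callback` attribute named in the plan, and a plain `queue.Queue` stands in for `_queue_put(event_queue, loop, ...)`, whose real signature is not shown in this diff.

```python
import queue
from contextlib import contextmanager


class StubClient:
    """Hypothetical stand-in for ai_client; only comms_log_callback is modeled."""

    def __init__(self):
        self.comms_log_callback = None

    def send(self, prompt):
        # Emit two intermediate chunks through the callback, then return the full text.
        chunks = ["partial ", "output"]
        for chunk in chunks:
            if self.comms_log_callback is not None:
                self.comms_log_callback(chunk)
        return "".join(chunks)


@contextmanager
def streaming_to(client, events, ticket_id):
    """Temporarily route chunks into `events`; restore the old callback on exit."""
    original = client.comms_log_callback

    def push(chunk):
        # Payload shape copied from Task 6.1; the queue itself is an assumption.
        events.put(("response", {
            "text": chunk,
            "stream_id": f"Tier 3 (Worker): {ticket_id}",
            "status": "streaming...",
        }))

    client.comms_log_callback = push
    try:
        yield
    finally:
        # Restore even if send() raises, so a failed worker cannot leak its closure.
        client.comms_log_callback = original


client = StubClient()
events = queue.Queue()
with streaming_to(client, events, "T-001"):
    result = client.send("do the work")

# Drain whatever the callback pushed while send() was running.
streamed = [events.get_nowait() for _ in range(events.qsize())]
```

Doing the restore in a `finally` block is a design suggestion beyond what the task text states: it keeps one ticket's closure from staying attached if `send()` raises.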
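Note on Task 6.4 (parallel DAG execution and the `_send_lock` caveat): one way to check whether a shared lock defeats `asyncio.gather` plus `run_in_executor` is to measure peak concurrency inside the send step. The sketch below assumes nothing about the real `ai_client`. `worker` is a hypothetical stand-in for `run_worker_lifecycle`, and the optional `send_lock` argument plays the role of a global lock.

```python
import asyncio
import threading
import time


class ConcurrencyProbe:
    """Records how many threads were inside the probed region at once."""

    def __init__(self):
        self._guard = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        with self._guard:
            self.active += 1
            self.peak = max(self.peak, self.active)

    def __exit__(self, *exc):
        with self._guard:
            self.active -= 1


def worker(ticket_id, probe, send_lock=None):
    """Hypothetical stand-in for run_worker_lifecycle; the sleep simulates send()."""

    def fake_send():
        with probe:
            time.sleep(0.05)

    if send_lock is not None:
        with send_lock:  # a global lock forces one send at a time
            fake_send()
    else:
        fake_send()
    return ticket_id


async def run_ready(ticket_ids, send_lock=None):
    loop = asyncio.get_running_loop()
    probe = ConcurrencyProbe()
    # Same shape as the expression in Task 6.4: one executor job per ready ticket.
    results = await asyncio.gather(*[
        loop.run_in_executor(None, worker, tid, probe, send_lock)
        for tid in ticket_ids
    ])
    return results, probe.peak


ready = ["T-1", "T-2", "T-3"]
locked_results, locked_peak = asyncio.run(run_ready(ready, threading.Lock()))
free_results, free_peak = asyncio.run(run_ready(ready))
```

With the shared lock the peak cannot exceed 1. Without it the peak is typically higher, depending on how the thread pool schedules the jobs. A probe like this could back the "document the constraint and defer" decision with a measurement.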
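Note on Task 6.5 (retry with model escalation): a sketch of the loop described there, under stated assumptions. The ladder reuses the example names from the task (`flash-lite`, `flash`, `pro`). `run_with_escalation`, `next_model`, and the stub workers are hypothetical. Only the `"blocked"` status and the `retry_count` ticket field come from the plan; the `"todo"` and `"completed"` values are placeholders.

```python
# Hypothetical escalation ladder, taken from the example in Task 6.5.
ESCALATION = ["flash-lite", "flash", "pro"]


def next_model(current):
    """Return the next model up the ladder, staying on the top rung once reached."""
    idx = ESCALATION.index(current)
    return ESCALATION[min(idx + 1, len(ESCALATION) - 1)]


def run_with_escalation(ticket, run_worker, model, max_retries=2):
    """Run once, then retry while the ticket is blocked and retries remain."""
    ticket.setdefault("retry_count", 0)
    run_worker(ticket, model)
    while ticket["status"] == "blocked" and ticket["retry_count"] < max_retries:
        ticket["retry_count"] += 1
        model = next_model(model)
        run_worker(ticket, model)
    return model  # the model used for the final attempt


attempts = []


def stub_worker(ticket, model):
    # Stand-in worker: only the top-tier model manages to finish the ticket.
    attempts.append(model)
    ticket["status"] = "completed" if model == "pro" else "blocked"


def always_blocked(ticket, model):
    # Stand-in worker that never succeeds, to exercise the retry limit.
    ticket["status"] = "blocked"


ticket = {"id": "T-001", "status": "todo"}
final_model = run_with_escalation(ticket, stub_worker, "flash-lite")

stuck = {"id": "T-002", "status": "todo"}
run_with_escalation(stuck, always_blocked, "flash-lite")
```

Keeping `retry_count` on the ticket dict, as the task specifies, means the count survives if the engine re-reads the ticket. A ticket that is still blocked after `max_retries` is left as blocked.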