conductor(tracks): archive 3 completed tracks, update tracks.md with active/archived sections

2026-03-02 10:46:08 -05:00
parent e7879f45a6
commit c35f372f52
13 changed files with 17 additions and 7 deletions

@@ -0,0 +1,34 @@
# Track Specification: Simulation Hardening
## Overview
The `robust_live_simulation_verification` track is marked complete, but its session compression documents three unresolved issues: (1) a brittle mock that triggers the wrong approval popup, (2) popup state desynchronization after "Accept" clicks, and (3) Tier 3 output never appearing in `mma_streams` (fixed by the `mma_pipeline_fix` track). This track stabilizes the simulation framework so that it reliably passes end-to-end.
## Prerequisites
- `mma_pipeline_fix_20260301` MUST be completed first (fixes Tier 3 stream plumbing).
## Current Issues (from session compression 2026-02-28)
### Issue 1: Mock Triggers Wrong Approval Popup
`mock_gemini_cli.py` defaults to emitting a `read_file` tool call, which triggers the general tool approval popup (`_pending_ask_dialog`) instead of the MMA spawn popup (`_pending_mma_spawn`). The test expects the spawn popup and times out.
**Root cause**: The mock's default response path doesn't distinguish between MMA orchestration prompts and Tier 3 worker prompts. It must not emit tool calls for orchestration-level prompts (Tier 1/2); it should emit them only for worker-level prompts, where tool use is expected.
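A minimal sketch of one way the mock could make that distinction, assuming the prompt text itself identifies the tier. The marker string, `classify_prompt`, `build_response`, and the response dictionary shape are all hypothetical and are not taken from `mock_gemini_cli.py`; the real mock would need markers that match the actual prompt templates.

```python
# Hypothetical marker; the real Tier 3 prompt template may be worded differently.
WORKER_MARKER = "tier 3"


def classify_prompt(prompt: str) -> str:
    """Return 'worker' for Tier 3 prompts, 'orchestration' for everything else."""
    if WORKER_MARKER in prompt.lower():
        return "worker"
    return "orchestration"


def build_response(prompt: str) -> dict:
    """Emit a tool call only for worker-level prompts (illustrative response shape)."""
    if classify_prompt(prompt) == "worker":
        return {"type": "tool_call", "name": "read_file", "args": {"path": "README.md"}}
    # Orchestration prompts get plain text, so no tool approval is requested.
    return {"type": "message", "text": "ok"}
```

Defaulting to "orchestration" means an unrecognized prompt never produces a tool call, which is the safer failure mode for a test that expects the spawn popup.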
### Issue 2: Popup State Desynchronization
After clicking "Accept" on the track proposal modal, `_show_track_proposal_modal` is set to `False` but the test still sees the popup as active. The hook API's `mma_status` returns stale `proposed_tracks` data.
**Root cause**: `_cb_accept_tracks` (gui_2.py:2012-2045) processes tracks and clears `proposed_tracks`, but it runs on the GUI thread. `ApiHookClient.get_mma_status()` reads state via the GUI trampoline pattern, and there may be a frame delay before the state updates become visible to the test.
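If a frame delay is indeed the cause, the test could wait for the state to settle rather than asserting on the first read. The helper below is a sketch under that assumption; `wait_for` is a hypothetical name, and the default timeout and interval are placeholders, not measured values.

```python
import time
from typing import Callable


def wait_for(fetch: Callable[[], dict], predicate: Callable[[dict], bool],
             timeout: float = 5.0, interval: float = 0.1) -> dict:
    """Call fetch() repeatedly until predicate(status) is true or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch()
        if predicate(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("status did not settle within the timeout")
        time.sleep(interval)
```

In the simulation this might be called as `wait_for(client.get_mma_status, lambda s: not s.get("proposed_tracks"))` after clicking "Accept", where `client` is an `ApiHookClient` instance.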
### Issue 3: Approval Type Ambiguity
The test polling loop auto-approves `pending_approval` but can't distinguish between tool approval (`_pending_ask_dialog`), MMA step approval (`_pending_mma_approval`), and spawn approval (`_pending_mma_spawn`). The simulation needs explicit handling for each type.
**Already resolved in code**: `get_mma_status` now returns separate `pending_tool_approval`, `pending_mma_step_approval`, and `pending_mma_spawn_approval` booleans. The test in `visual_sim_mma_v2.py` already checks these individually. The remaining fix is to stop the mock from triggering unexpected approval types.
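As an illustration of checking the three flags explicitly, the sketch below maps a status dictionary to a single approval kind and fails fast when an unexpected kind appears. The flag names come from the `get_mma_status` description above; the function names, kind labels, and priority order are assumptions, and this is not the code in `visual_sim_mma_v2.py`.

```python
from typing import Optional

# Flag names as described for get_mma_status; the labels and ordering are illustrative.
APPROVAL_FLAGS = (
    ("pending_mma_spawn_approval", "spawn"),
    ("pending_mma_step_approval", "mma_step"),
    ("pending_tool_approval", "tool"),
)


def pending_approval_kind(status: dict) -> Optional[str]:
    """Return the first pending approval kind, or None if nothing is pending."""
    for flag, kind in APPROVAL_FLAGS:
        if status.get(flag):
            return kind
    return None


def is_expected_approval(status: dict, expected: str) -> bool:
    """True if the expected approval is pending; raise if a different one is."""
    kind = pending_approval_kind(status)
    if kind is not None and kind != expected:
        raise AssertionError(f"expected {expected!r} approval, got {kind!r}")
    return kind == expected
```

Raising on a mismatched kind turns the Issue 1 symptom (a timeout while waiting for the spawn popup) into an immediate, descriptive failure.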
## Goals
1. Make `tests/visual_sim_mma_v2.py` pass reliably against the live GUI.
2. Clean up `mock_gemini_cli.py` to be deterministic and not trigger spurious approvals.
3. Add retry/timeout resilience to polling loops.
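For goal 3, one possible shape for retry resilience is a small wrapper that retries a hook API call on transient errors with exponential backoff. This is a sketch only: `call_with_retry`, its defaults, and the choice of retryable exception types are assumptions, and the real client may raise different exceptions.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(func: Callable[[], T], attempts: int = 3, base_delay: float = 0.1,
                    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError)) -> T:
    """Call func(), retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error to the test.
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")
```

Exceptions outside `retry_on` propagate immediately, so genuine test failures are not masked by the retry loop.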
## Architecture Reference
- Simulation patterns: [docs/guide_simulations.md](../../docs/guide_simulations.md)
- Hook API endpoints: [docs/guide_tools.md](../../docs/guide_tools.md) — see `/api/gui/mma_status` response fields
- HITL mechanism: [docs/guide_architecture.md](../../docs/guide_architecture.md) — see "The Execution Clutch"