conductor(tracks): archive 3 completed tracks, update tracks.md with active/archived sections
5
conductor/archive/simulation_hardening_20260301/index.md
Normal file
@@ -0,0 +1,5 @@
# Track simulation_hardening_20260301 Context

- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)
@@ -0,0 +1,10 @@
{
  "track_id": "simulation_hardening_20260301",
  "description": "Stabilize visual_sim_mma_v2.py and mock_gemini_cli.py for reliable end-to-end MMA simulation.",
  "type": "fix",
  "status": "new",
  "priority": "P1",
  "depends_on": ["mma_pipeline_fix_20260301"],
  "created_at": "2026-03-01T15:45:00Z",
  "updated_at": "2026-03-01T15:45:00Z"
}
22
conductor/archive/simulation_hardening_20260301/plan.md
Normal file
@@ -0,0 +1,22 @@
# Implementation Plan: Simulation Hardening

Depends on: `mma_pipeline_fix_20260301`
Architecture reference: [docs/guide_simulations.md](../../docs/guide_simulations.md)

## Phase 1: Mock Provider Cleanup

- [x] Task 1.1: PRE-RESOLVED: `mock_gemini_cli.py` default path already returns plain text JSON (not function_call). Routing verified by code inspection: Epic/Sprint/Worker/tool-result all return plain text. Covered by Task 1.3 test.
- [x] Task 1.2: Fix mock sprint planning ticket format. Current mock returns `goal`/`target_file` fields; `ConductorEngine.parse_json_tickets` expects `description`/`status`/`assigned_to`. Also add `'generate the implementation tickets'` keyword detection alongside `'PATH: Sprint Planning'`. 0593b28
- [x] Task 1.3: Write a standalone test (`tests/test_mock_gemini_cli.py`) that invokes the mock script via `subprocess.run()` with various stdin prompts and verifies: (a) epic prompt → Track JSON, no tool calls; (b) sprint prompt → Ticket JSON, no tool calls; (c) worker prompt → plain text, no tool calls; (d) tool-result prompt → plain text response. 0873453
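
A minimal sketch of the kind of check Task 1.3 describes, assuming nothing about the real `mock_gemini_cli.py` beyond what the tasks above state. The stand-in script written to a temporary file, the ticket values, and the helper names `make_stand_in`, `run_mock`, and `check_sprint_prompt` are hypothetical; the sprint keywords and the `description`/`status`/`assigned_to` fields are the ones named in Task 1.2.

```python
import json
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in for mock_gemini_cli.py: reads the prompt from stdin
# and prints a plain-text response, never a tool call.
STAND_IN_MOCK = textwrap.dedent('''
    import json, sys
    prompt = sys.stdin.read()
    sprint_keywords = ("PATH: Sprint Planning", "generate the implementation tickets")
    if any(keyword in prompt for keyword in sprint_keywords):
        print(json.dumps([{"description": "demo", "status": "todo", "assigned_to": "worker"}]))
    else:
        print("plain text reply")
''')


def make_stand_in() -> Path:
    """Write the stand-in script to a fresh temporary directory."""
    script = Path(tempfile.mkdtemp()) / "stand_in_mock.py"
    script.write_text(STAND_IN_MOCK)
    return script


def run_mock(script: Path, prompt: str) -> str:
    """Invoke the script as a subprocess: prompt on stdin, reply on stdout."""
    result = subprocess.run(
        [sys.executable, str(script)],
        input=prompt,
        capture_output=True,
        text=True,
        timeout=30,
        check=True,
    )
    return result.stdout.strip()


def check_sprint_prompt(script: Path) -> list:
    """Sprint prompt must yield ticket JSON with the expected fields and no tool call."""
    tickets = json.loads(run_mock(script, "PATH: Sprint Planning"))
    for ticket in tickets:
        assert {"description", "status", "assigned_to"} <= ticket.keys()
        assert "function_call" not in ticket
    return tickets
```

Running the script in a subprocess, rather than importing it, exercises the same stdin/stdout contract the application relies on, and the `timeout` argument keeps a hung mock from stalling the test run.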

## Phase 2: Simulation Stability

- [x] Task 2.1: PRE-RESOLVED: `visual_sim_mma_v2.py` already has 0.3–1.5s frame-sync sleeps after every state-changing click, implemented in mma_pipeline_fix track (89a8d9b).
- [x] Task 2.2: PRE-RESOLVED: `_poll()` with condition lambdas already covers all state-transition waits cleanly. `wait_for_value` exists in `ApiHookClient` but `_poll()` is more flexible and already in use.
- [x] Task 2.3: Add `@pytest.mark.timeout(300)` to `test_mma_complete_lifecycle` to prevent infinite CI hangs. 63fa181

## Phase 3: End-to-End Verification

- [x] Task 3.1: PRE-RESOLVED: `visual_sim_mma_v2.py` passes in 11s against live GUI with real Gemini API (gemini-2.5-flash-lite). Verified in mma_pipeline_fix track. All 8 stages pass. ce5b6d2
- [x] Task 3.2: Added Stage 9 to sim test: non-blocking poll for `mma_tier_usage` Tier 3 non-zero (30s, warns if not wired). Tier 3 stream and `mma_status` checks already covered by Stages 7-8. 63fa181
- [x] Task 3.3: Fixed `pending_script_approval` gap (`btn_approve_script` unwired, `_pending_dialog` not in hook API). Sim test PASSED in 19.73s. Tier 3 token usage confirmed: input=34839, output=514. 90fc38f
34
conductor/archive/simulation_hardening_20260301/spec.md
Normal file
@@ -0,0 +1,34 @@
# Track Specification: Simulation Hardening

## Overview

The `robust_live_simulation_verification` track is marked complete but its session compression documents three unresolved issues: (1) brittle mock that triggers the wrong approval popup, (2) popup state desynchronization after "Accept" clicks, (3) Tier 3 output never appearing in `mma_streams` (fixed by `mma_pipeline_fix` track). This track stabilizes the simulation framework so it reliably passes end-to-end.

## Prerequisites

- `mma_pipeline_fix_20260301` MUST be completed first (fixes Tier 3 stream plumbing).

## Current Issues (from session compression 2026-02-28)

### Issue 1: Mock Triggers Wrong Approval Popup

`mock_gemini_cli.py` defaults to emitting a `read_file` tool call, which triggers the general tool approval popup (`_pending_ask_dialog`) instead of the MMA spawn popup (`_pending_mma_spawn`). The test expects the spawn popup and times out.

**Root cause**: The mock's default response path doesn't distinguish between MMA orchestration prompts and Tier 3 worker prompts. It needs to NOT emit tool calls for orchestration-level prompts (Tier 1/2), only for worker-level prompts where tool use is expected.
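
The routing rule in the root cause can be sketched as follows. This is an illustration under stated assumptions, not the mock's actual code: the marker strings, the `PromptKind` enum, and both function names are hypothetical, and the real mock may detect prompt types differently.

```python
from enum import Enum


class PromptKind(Enum):
    ORCHESTRATION = "orchestration"  # Tier 1/2: epic and sprint planning
    TOOL_RESULT = "tool_result"      # follow-up carrying a tool's output
    WORKER = "worker"                # Tier 3


# Hypothetical markers; the real mock's keywords may differ.
ORCHESTRATION_MARKERS = ("PATH: Epic Initialization", "PATH: Sprint Planning")
TOOL_RESULT_MARKER = '"role": "tool"'


def classify_prompt(prompt: str) -> PromptKind:
    """Classify a prompt by the first marker it contains; default to worker."""
    if any(marker in prompt for marker in ORCHESTRATION_MARKERS):
        return PromptKind.ORCHESTRATION
    if TOOL_RESULT_MARKER in prompt:
        return PromptKind.TOOL_RESULT
    return PromptKind.WORKER


def may_emit_tool_call(prompt: str) -> bool:
    """Only worker-level prompts are allowed to produce a tool call."""
    return classify_prompt(prompt) is PromptKind.WORKER
```

Checking the orchestration markers first means an orchestration prompt can never fall through to the tool-call path, which is the failure described in this issue.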

### Issue 2: Popup State Desynchronization

After clicking "Accept" on the track proposal modal, `_show_track_proposal_modal` is set to `False` but the test still sees the popup as active. The hook API's `mma_status` returns stale `proposed_tracks` data.

**Root cause**: `_cb_accept_tracks` (gui_2.py:2012-2045) processes tracks and clears `proposed_tracks`, but this runs on the GUI thread. The `ApiHookClient.get_mma_status()` reads via the GUI trampoline pattern, but there may be a frame delay before the state updates are visible.
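
One way to tolerate such a frame delay is to poll until the state actually changes instead of reading it once. The sketch below is a generic helper under stated assumptions: `poll` and `FakeHookClient` are hypothetical names, and the fake client only imitates a `get_mma_status()` that returns stale `proposed_tracks` for its first few reads.

```python
import time
from typing import Callable, Optional


def poll(
    fetch: Callable[[], dict],
    condition: Callable[[dict], bool],
    timeout: float = 10.0,
    interval: float = 0.1,
) -> Optional[dict]:
    """Call fetch() until condition(status) holds; return that status, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch()
        if condition(status):
            return status
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


class FakeHookClient:
    """Stand-in for the hook client: the first reads return stale data."""

    def __init__(self, stale_reads: int) -> None:
        self._stale_reads = stale_reads

    def get_mma_status(self) -> dict:
        if self._stale_reads > 0:
            self._stale_reads -= 1
            return {"proposed_tracks": [{"id": "stale"}]}
        return {"proposed_tracks": []}
```

A call such as `poll(client.get_mma_status, lambda s: not s["proposed_tracks"])` then waits for the cleared list rather than failing on the first stale read, and the deadline bounds how long a genuinely stuck state can hold up the test.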

### Issue 3: Approval Type Ambiguity

The test polling loop auto-approves `pending_approval` but can't distinguish between tool approval (`_pending_ask_dialog`), MMA step approval (`_pending_mma_approval`), and spawn approval (`_pending_mma_spawn`). The simulation needs explicit handling for each type.

**Already resolved in code**: `get_mma_status` now returns separate `pending_tool_approval`, `pending_mma_step_approval`, `pending_mma_spawn_approval` booleans. The test in `visual_sim_mma_v2.py` already checks these individually. The fix is in making the mock not trigger unexpected approval types.
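
Explicit per-type handling can be sketched as a small dispatch over the three booleans. The field names are the ones listed above; the action names, the priority order, and the function name are hypothetical and do not describe what `visual_sim_mma_v2.py` actually does.

```python
from typing import Optional

# Field names come from the status response described above.
# The action names and the priority order are assumptions for illustration.
APPROVAL_ACTIONS = (
    ("pending_mma_spawn_approval", "approve_spawn"),
    ("pending_mma_step_approval", "approve_mma_step"),
    ("pending_tool_approval", "approve_tool"),
)


def next_approval_action(status: dict) -> Optional[str]:
    """Return the action for the first pending approval type, or None if nothing is pending."""
    for field, action in APPROVAL_ACTIONS:
        if status.get(field):
            return action
    return None
```

Keeping the mapping in one table makes an unexpected approval type visible in the test output as a distinct action name instead of being silently auto-approved.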

## Goals

1. Make `tests/visual_sim_mma_v2.py` pass reliably against the live GUI.
2. Clean up `mock_gemini_cli.py` to be deterministic and not trigger spurious approvals.
3. Add retry/timeout resilience to polling loops.

## Architecture Reference

- Simulation patterns: [docs/guide_simulations.md](../../docs/guide_simulations.md)
- Hook API endpoints: [docs/guide_tools.md](../../docs/guide_tools.md); see `/api/gui/mma_status` response fields
- HITL mechanism: [docs/guide_architecture.md](../../docs/guide_architecture.md); see "The Execution Clutch"