From 8248a49f1ebf81183d06da27afbebe401af0219f Mon Sep 17 00:00:00 2001 From: Ed_ Date: Mon, 8 Jun 2026 09:25:18 -0400 Subject: [PATCH] docs(todo): simple todo list for fixing test_full_live_workflow race --- TODO_test_full_live_workflow.md | 74 +++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 TODO_test_full_live_workflow.md diff --git a/TODO_test_full_live_workflow.md b/TODO_test_full_live_workflow.md new file mode 100644 index 00000000..1e96dcbe --- /dev/null +++ b/TODO_test_full_live_workflow.md @@ -0,0 +1,74 @@ +# TODO: Fix test_full_live_workflow race condition + +**Report:** `docs/reports/test_full_live_workflow_root_cause_20260608.md` +**Failure reproducibility:** 100% in tier-3 batch, 0% in isolation +**Status:** Not started (investigation only; no fix code yet) + +## Tasks (simple, ordered by ROI) + +### 1. [HIGH] Add deterministic signal endpoint +- **What:** Add `GET /api/project_switch_status` returning `{"in_progress": bool, "path": str | null, "error": str | null}`. +- **Where:** `src/api_hooks.py` (new handler) + `src/app_controller.py` (track `_project_switch_in_progress` + `_project_switch_error` state). +- **Why:** Polling the project dict is fragile (returns stale state from prior tests). Polling a purpose-built signal is deterministic. +- **Pattern:** See `src/api_hooks.py:336-363` (`/api/warmup_wait`) for the existing pattern of "block until condition, return final state". +- **Acceptance:** Test polls `/api/project_switch_status` until `in_progress == False` and `path == expected` and `error is None`. Times out after 30s with clear error. + +### 2. [HIGH] Reset project state in `_handle_reset_session` +- **What:** Add `self.project = {}; self.active_project_path = ""; self.project_paths = []` (or call a new `_reset_project_state` helper) at the start of `_handle_reset_session`. +- **Where:** `src/app_controller.py:3244-3296`. +- **Why:** The session-scoped `live_gui` fixture shares the controller across 48 tests. Prior tests leave stale project state. The reset handler currently clears AI session but not project state. +- **Acceptance:** After `client.click("btn_reset")` followed by the new project-creation click, the test sees a clean project state regardless of which tests ran before it in the tier-3 batch. + +### 3. [MED] Replace `os.path.abspath("tests/artifacts/temp_project.toml")` with fixture-provided path +- **What:** Have the `live_gui` fixture provide `temp_project_path` (str) derived from its own `temp_workspace` directory. +- **Where:** `tests/conftest.py` (live_gui fixture) + `tests/test_live_workflow.py:50`. +- **Why:** cwd-relative path is fragile; fixture-relative path is stable. +- **Acceptance:** Test does `temp_project_path = live_gui_temp_project_path` (or accesses it as a fixture attribute). No more `os.path.abspath("tests/artifacts/...")`. + +### 4. [MED] Replace 10×1s blind poll with condition-based wait +- **What:** Use the new `/api/project_switch_status` endpoint with a single `wait_for_condition` call (or `client.wait_for_project_active(name, timeout=30)` helper). +- **Where:** `tests/test_live_workflow.py:58-65` + new `ApiHookClient.wait_for_project_active` method. +- **Why:** Blind polling of derived state is fragile; condition-based wait is deterministic and surfaces the failure reason immediately. +- **Pattern:** See `src/api_hook_client.py:wait_for_server` (existing pattern in the same client). +- **Acceptance:** Test fails fast (within 5-10s) with a clear `error` message from the API instead of timing out at 10s with "Project failed to activate". + +### 5. [LOW] Add defensive state assertions +- **What:** Before polling for activation, verify: + - The file was created: `assert os.path.exists(temp_project_path)` + - The click was enqueued: check `client.get_events()` for the `click` task +- **Where:** `tests/test_live_workflow.py:55-65`. +- **Why:** Catches the case where the click was dropped or the handler crashed before writing the file. +- **Acceptance:** If the file doesn't exist after the click, the test fails immediately with "temp_project.toml not created" instead of timing out. + +### 6. [LOW] Add `pytest.mark.live` to pyproject.toml markers +- **What:** Append `"live: marks tests as live visualization tests (not in CI by default)"` to `[tool.pytest.ini_options].markers`. +- **Where:** `pyproject.toml`. +- **Why:** Silences the `PytestUnknownMarkWarning: Unknown pytest.mark.live` warnings emitted by `test_visual_mma.py`, `test_visual_sim_gui_ux.py`. The mark already exists; pyproject just doesn't know about it. +- **Acceptance:** `uv run pytest tests/ 2>&1 | grep -i UnknownMark` returns 0 lines. + +### 7. [LOW] Add `tests/.test_durations.json` recording in CI / dev convenience +- **What:** Add a dev-mode shortcut to record durations once the fix lands (e.g. `python scripts/run_tests_batched.py --durations`). +- **Where:** `scripts/run_tests_batched.py` already has `--durations` flag; just need a one-time run + commit. +- **Why:** The categorizer uses `.test_durations.json` for `speed` auto-inference. Currently all files default to MEDIUM speed. +- **Acceptance:** `tests/.test_durations.json` exists, has timing data for all 295+ tests. (Not strictly needed for the live_workflow fix.) + +## Order of work + +1, 2, 3, 4 are tightly coupled (all about making the test deterministic and isolated). Do them in one PR. + +5 is a defensive complement. Add with 1-4. + +6, 7 are unrelated cleanup. Do in a separate small commit. + +## Estimated time + +- Tasks 1, 2, 3, 4, 5: 2-3 hours (mostly test + 1 endpoint + 1 reset path) +- Tasks 6, 7: 5-10 minutes each + +## Verification + +After fix: +- `uv run python scripts/run_tests_batched.py --tiers 3 --no-xdist --no-color` shows `<<< tier-3-live_gui PASS` +- `uv run pytest tests/test_live_workflow.py` still PASSes in isolation +- `uv run pytest tests/test_live_workflow.py tests/test_extended_sims.py tests/test_command_palette_sim.py` (siblings) PASSes +- Failure message on real regression is clear and actionable (e.g. "click was not dispatched within 5s" or "/api/project_switch_status returned error: file not found")