diff --git a/docs/guide_testing.md b/docs/guide_testing.md
index 62191795..70ced839 100644
--- a/docs/guide_testing.md
+++ b/docs/guide_testing.md
@@ -161,6 +161,55 @@ def test_my_thing(live_gui):
 
 ---
 
+## Per-test Subprocess Resilience (2026-06-09)
+
+Added in the `test_infrastructure_hardening_20260609` track. These three mechanisms address the "subprocess state pollution" and "controller state pollution" failure modes that caused batch regressions.
+
+### `_LiveGuiHandle` class (tests/conftest.py:393)
+
+The `live_gui` fixture yields a `_LiveGuiHandle` instead of a `(process, gui_script)` tuple. The handle exposes:
+
+| Attribute/Method | Purpose |
+|---|---|
+| `process` | The `subprocess.Popen` for the sloppy.py subprocess |
+| `gui_script` | Absolute path to sloppy.py |
+| `workspace` | Absolute path to the subprocess's working directory (pytest tmp dir) |
+| `is_alive()` | True if the subprocess is running |
+| `ensure_alive()` | Stub: increments `respawn_count` if the subprocess is dead but does not respawn it (respawn deferred) |
+| `respawn_count` | Number of times the subprocess was found dead |
+
+**Backward compat:** The handle is iterable as `(process, gui_script)`, so existing `proc, _ = live_gui` patterns still work.
+
+### `live_gui_workspace` fixture (tests/conftest.py:657)
+
+Yields the absolute path to the live_gui subprocess's workspace (a `tmp_path_factory.mktemp("live_gui_workspace")` directory in pytest's tmp dir). Tests that need to create files in the workspace should request this fixture instead of hardcoding `Path("tests/artifacts/live_gui_workspace")`.
+
+```python
+def test_rag_setup(live_gui, live_gui_workspace):
+    test_file = live_gui_workspace / "my_input.txt"
+    test_file.write_text("hello")
+    # ... configure RAG, index, query
+```
+
+### `_check_live_gui_health` autouse fixture (tests/conftest.py:650)
+
+Runs before every test that uses `live_gui`. Calls `handle.ensure_alive()` to detect subprocess death between tests. If the subprocess died, `respawn_count` increments, but the subprocess is not respawned (see `ensure_alive()` above).
+
+### `clean_baseline` marker
+
+Opt-in marker for tests that need a fresh controller state. Tests marked with `@pytest.mark.clean_baseline` get `/api/reset_session` called before they start, ensuring no pollution from prior tests.
+
+```python
+@pytest.mark.clean_baseline
+def test_rag_final_verify(live_gui):
+    # ai_input is guaranteed empty, controller is in a known state
+    ...
+```
+
+Use this for tests that are sensitive to controller state pollution from prior tests in the same session. The `test_rag_phase4_final_verify` test is marked this way because the 4 sims in `test_extended_sims.py` mutate controller state (provider, model, etc.) that would otherwise pollute the RAG test.
+
+---
+
 ## Test Categories
 
 ### 1. Unit Tests (no fixtures, fast)