# Implementation Plan: Test Suite Stabilization & Consolidation (test_stabilization_20260302)

## Phase 1: Infrastructure & Paradigm Consolidation [checkpoint: 8666137]

- [x] Task: Initialize MMA Environment `activate_skill mma-orchestrator` [Manual]
- [x] Task: Setup Artifact Isolation Directories [570c0ea]
    - [ ] WHERE: Project root
    - [ ] WHAT: Create `./tests/artifacts/` and `./tests/logs/` directories. Add `.gitignore` to both containing `*` and `!.gitignore`.
    - [ ] HOW: Use PowerShell `New-Item` and `Out-File`.
    - [ ] SAFETY: Do not commit artifacts.
- [x] Task: Migrate Manual Launchers to `live_gui` Fixture [6b7cd0a]
    - [ ] WHERE: `tests/visual_mma_verification.py` (lines 15-40), `simulation/` scripts.
    - [ ] WHAT: Replace `subprocess.Popen(["python", "gui_2.py"])` with the `live_gui` fixture injected into `pytest` test functions. Remove manual while-loop sleeps.
    - [ ] HOW: Use standard pytest `def test_... (live_gui):` and rely on `ApiHookClient` with proper timeouts.
    - [ ] SAFETY: Ensure `subprocess` is not orphaned if test fails.
- [ ] Task: Conductor - User Manual Verification 'Phase 1: Infrastructure & Consolidation' (Protocol in workflow.md)

## Phase 2: Asyncio Stabilization & Logging

- [ ] Task: Audit and Fix `conftest.py` Loop Lifecycle
    - [ ] WHERE: `tests/conftest.py:20-50` (around `app_instance` fixture).
    - [ ] WHAT: Ensure the `app._loop.stop()` cleanup safely cancels pending background tasks.
    - [ ] HOW: Use `asyncio.all_tasks(loop)` and `task.cancel()` before stopping the loop in the fixture teardown.
    - [ ] SAFETY: Thread-safety; only cancel tasks belonging to the app's loop.
- [ ] Task: Resolve `Event loop is closed` in Core Test Suite
    - [ ] WHERE: `tests/test_spawn_interception.py`, `tests/test_gui_streaming.py`.
    - [ ] WHAT: Update blocking calls to use `ThreadPoolExecutor` or `asyncio.run_coroutine_threadsafe(..., loop)`.
    - [ ] HOW: Pass the active loop from `app_instance` to the functions triggering the events.
    - [ ] SAFETY: Prevent event queue deadlocks.
- [ ] Task: Implement Centralized Sectioned Logging Utility
    - [ ] WHERE: `tests/conftest.py:50-80` (`VerificationLogger`).
    - [ ] WHAT: Route `VerificationLogger` output to `./tests/logs/` instead of `logs/test/`.
    - [ ] HOW: Update `self.logs_dir = Path(f"tests/logs/{datetime.datetime.now().strftime('%Y%m%d_%H%M%S')}")`.
    - [ ] SAFETY: No state impact.
- [ ] Task: Conductor - User Manual Verification 'Phase 2: Asyncio & Logging' (Protocol in workflow.md)

## Phase 3: Assertion Implementation & Legacy Cleanup

- [ ] Task: Replace `pytest.fail` with Functional Assertions (`api_events`, `execution_engine`)
    - [ ] WHERE: `tests/test_api_events.py:40`, `tests/test_execution_engine.py:45`.
    - [ ] WHAT: Implement actual `assert` statements testing the mock calls and status updates.
    - [ ] HOW: Use `MagicMock.assert_called_with` and check `ticket.status == "completed"`.
    - [ ] SAFETY: Isolate mocks.
- [ ] Task: Replace `pytest.fail` with Functional Assertions (`token_usage`, `agent_capabilities`)
    - [ ] WHERE: `tests/test_token_usage.py`, `tests/test_agent_capabilities.py`.
    - [ ] WHAT: Implement tests verifying the `usage_metadata` extraction and `list_models` output count.
    - [ ] HOW: Check for 6 models (including `gemini-2.0-flash`) in `list_models` test.
    - [ ] SAFETY: Isolate mocks.
- [ ] Task: Resolve Simulation Entry Count Regressions
    - [ ] WHERE: `tests/test_extended_sims.py:20`.
    - [ ] WHAT: Fix `AssertionError: Expected at least 2 entries, found 0`.
    - [ ] HOW: Update simulation flow to properly wait for the `User` and `AI` entries to populate the GUI history before asserting.
    - [ ] SAFETY: Use dynamic wait (`ApiHookClient.wait_for_event`) instead of static sleeps.
- [ ] Task: Remove Legacy `gui_legacy` Test Imports & File
    - [ ] WHERE: `tests/test_gui_events.py`, `tests/test_gui_updates.py`, `tests/test_gui_diagnostics.py`, and project root.
    - [ ] WHAT: Change `from gui_legacy import App` to `from gui_2 import App`. Fix any breaking UI locators. Then delete `gui_legacy.py`.
    - [ ] HOW: String replacement and standard `os.remove`.
    - [ ] SAFETY: Verify no remaining imports exist across the suite using `grep_search`.
- [ ] Task: Conductor - User Manual Verification 'Phase 3: Assertions & Legacy Cleanup' (Protocol in workflow.md)

## Phase 4: Documentation & Final Verification

- [ ] Task: Model Switch Request
    - [ ] Ask the user to run the `/model` command to switch to a high reasoning model for the documentation phase. Wait for their confirmation before proceeding.
- [ ] Task: Update Core Documentation & Workflow Contract
    - [ ] WHERE: `Readme.md`, `docs/guide_simulations.md`, `conductor/workflow.md`.
    - [ ] WHAT: Document artifact locations, `live_gui` standard, and the strict "Structural Testing Contract".
    - [ ] HOW: Markdown editing. Add sections explicitly banning arbitrary `unittest.mock.patch` on core infra for Tier 3 workers.
    - [ ] SAFETY: Keep formatting clean.
- [ ] Task: Full Suite Validation & Warning Cleanup
- [ ] Task: Final Artifact Isolation Verification
- [ ] Task: Conductor - User Manual Verification 'Phase 4: Documentation & Final Verification' (Protocol in workflow.md)
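
Reference sketch for the Phase 2 task "Audit and Fix `conftest.py` Loop Lifecycle": one possible shape for the teardown, assuming the app's event loop runs in a background thread. The helper names (`start_background_loop`, `_cancel_pending`, `shutdown_loop`) are hypothetical and do not come from the existing `conftest.py`. The real fixture would pass `app._loop` and its owning thread instead of the stand-in loop created here.

```python
import asyncio
import threading


def start_background_loop():
    """Start an event loop in a daemon thread (hypothetical stand-in for the app's loop)."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    return loop, thread


async def _cancel_pending():
    """Cancel every task on the running loop except this one, then wait for them."""
    current = asyncio.current_task()
    # all_tasks() with no argument only returns tasks of the running loop,
    # so tasks that belong to any other loop are never touched.
    pending = [t for t in asyncio.all_tasks() if t is not current]
    for task in pending:
        task.cancel()
    # return_exceptions=True keeps CancelledError from propagating out of gather.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(pending)


def shutdown_loop(loop, thread, timeout=5.0):
    """Cancel pending tasks on `loop`, stop it, join its thread, and close it.

    Returns the number of tasks that were cancelled.
    """
    # Run the cancellation inside the loop's own thread; asyncio tasks are not
    # thread-safe, so they should not be cancelled directly from the test thread.
    future = asyncio.run_coroutine_threadsafe(_cancel_pending(), loop)
    cancelled = future.result(timeout=timeout)
    loop.call_soon_threadsafe(loop.stop)
    thread.join(timeout=timeout)
    loop.close()
    return cancelled
```

In a fixture, `shutdown_loop` would be called after the `yield`, so the cancellation runs even when the test body fails. Closing the loop only after the thread has joined is what avoids a later `Event loop is closed` error from a task that is still running.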