From c0a87772040637625968736b44bde4f081abee45 Mon Sep 17 00:00:00 2001
From: Ed_ <edwardgz@gmail.com>
Date: Tue, 3 Mar 2026 23:38:08 -0500
Subject: [PATCH] chore(conductor): Archive track 'Test Suite Stabilization &
 Consolidation'

---
 .../test_stabilization_20260302/index.md      |  5 ++
 .../test_stabilization_20260302/metadata.json |  8 ++
 .../test_stabilization_20260302/plan.md       | 86 +++++++++++++++++++
 .../test_stabilization_20260302/spec.md       | 43 ++++++++++
 conductor/tracks.md                           | 24 +++---
 5 files changed, 154 insertions(+), 12 deletions(-)
 create mode 100644 conductor/archive/test_stabilization_20260302/index.md
 create mode 100644 conductor/archive/test_stabilization_20260302/metadata.json
 create mode 100644 conductor/archive/test_stabilization_20260302/plan.md
 create mode 100644 conductor/archive/test_stabilization_20260302/spec.md
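
Reviewer note on plan.md, Phase 2: the loop-lifecycle tasks ask for pending tasks to be cancelled via `asyncio.all_tasks(loop)` and `task.cancel()` before the loop is stopped, with cross-thread work routed through `asyncio.run_coroutine_threadsafe(..., loop)`. The sketch below shows one way that teardown could look. It is not the code from `tests/conftest.py`; the helper names `start_background_loop`, `_cancel_pending`, and `shutdown_loop` are hypothetical, and it assumes the app's loop runs on its own thread.

```python
import asyncio
import threading


def start_background_loop() -> tuple[asyncio.AbstractEventLoop, threading.Thread]:
    """Start an event loop on a daemon thread (stand-in for the app's loop)."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    return loop, thread


async def _cancel_pending() -> int:
    """Cancel every task on the running loop except this one; return the count."""
    current = asyncio.current_task()
    pending = [t for t in asyncio.all_tasks() if t is not current]
    for task in pending:
        task.cancel()
    # Wait for the cancellations to land; collect CancelledError instead of raising.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(pending)


def shutdown_loop(
    loop: asyncio.AbstractEventLoop, thread: threading.Thread, timeout: float = 5.0
) -> int:
    """Cancel pending tasks from the caller's thread, then stop and close the loop."""
    # Only tasks belonging to this loop are touched, and only from inside the loop.
    cancelled = asyncio.run_coroutine_threadsafe(_cancel_pending(), loop).result(timeout)
    loop.call_soon_threadsafe(loop.stop)
    thread.join(timeout)
    loop.close()
    return cancelled
```

Running the cancellation inside the loop, rather than calling `task.cancel()` directly from the test thread, keeps the teardown within asyncio's thread-safety rules, which is the SAFETY concern the plan names.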

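Reviewer note on plan.md, Phase 3: the simulation fix replaces static sleeps with a dynamic wait (`ApiHookClient.wait_for_event` in the plan). The signature of that method is not shown in this patch, so the sketch below uses a generic, hypothetical `wait_until` helper to illustrate the same idea: poll for a condition until it holds or a deadline passes.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    probe: Callable[[], Optional[T]],
    timeout: float = 10.0,
    interval: float = 0.05,
) -> T:
    """Poll `probe` until it returns a truthy value or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout:.1f}s")
        time.sleep(interval)
```

A test waiting for the `User` and `AI` history entries could, under these assumptions, call `wait_until(lambda: history if len(history) >= 2 else None)` where `history` stands in for whatever the hook client returns. The test then takes only as long as the GUI needs and fails with a clear timeout instead of a bare entry-count assertion.
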
diff --git a/conductor/archive/test_stabilization_20260302/index.md b/conductor/archive/test_stabilization_20260302/index.md
new file mode 100644
index 0000000..341517e
--- /dev/null
+++ b/conductor/archive/test_stabilization_20260302/index.md
@@ -0,0 +1,5 @@
+# Track test_stabilization_20260302 Context
+
+- [Specification](./spec.md)
+- [Implementation Plan](./plan.md)
+- [Metadata](./metadata.json)
diff --git a/conductor/archive/test_stabilization_20260302/metadata.json b/conductor/archive/test_stabilization_20260302/metadata.json
new file mode 100644
index 0000000..dd6cd66
--- /dev/null
+++ b/conductor/archive/test_stabilization_20260302/metadata.json
@@ -0,0 +1,8 @@
+{
+  "track_id": "test_stabilization_20260302",
+  "type": "chore",
+  "status": "new",
+  "created_at": "2026-03-02T22:09:00Z",
+  "updated_at": "2026-03-02T22:09:00Z",
+  "description": "Comprehensive Test Suite Stabilization & Consolidation. Fixes asyncio errors, resolves artifact leakage, and unifies testing paradigms."
+}
diff --git a/conductor/archive/test_stabilization_20260302/plan.md b/conductor/archive/test_stabilization_20260302/plan.md
new file mode 100644
index 0000000..fcc4c29
--- /dev/null
+++ b/conductor/archive/test_stabilization_20260302/plan.md
@@ -0,0 +1,86 @@
+# Implementation Plan: Test Suite Stabilization & Consolidation (test_stabilization_20260302)
+
+## Phase 1: Infrastructure & Paradigm Consolidation [checkpoint: 8666137]
+- [x] Task: Initialize MMA Environment `activate_skill mma-orchestrator` [Manual]
+- [x] Task: Setup Artifact Isolation Directories [570c0ea]
+    - [ ] WHERE: Project root
+    - [ ] WHAT: Create `./tests/artifacts/` and `./tests/logs/` directories. Add a `.gitignore` to each, containing `*` and `!.gitignore`.
+    - [ ] HOW: Use PowerShell `New-Item` and `Out-File`.
+    - [ ] SAFETY: Do not commit artifacts.
+- [x] Task: Migrate Manual Launchers to `live_gui` Fixture [6b7cd0a]
+    - [ ] WHERE: `tests/visual_mma_verification.py` (lines 15-40), `simulation/` scripts.
+    - [ ] WHAT: Replace `subprocess.Popen(["python", "gui_2.py"])` with the `live_gui` fixture injected into `pytest` test functions. Remove manual while-loop sleeps.
+    - [ ] HOW: Use standard pytest `def test_... (live_gui):` and rely on `ApiHookClient` with proper timeouts.
+    - [ ] SAFETY: Ensure the GUI subprocess is not orphaned if a test fails.
+- [ ] Task: Conductor - User Manual Verification 'Phase 1: Infrastructure & Consolidation' (Protocol in workflow.md)
+
+## Phase 2: Asyncio Stabilization & Logging [checkpoint: 14613df]
+- [x] Task: Audit and Fix `conftest.py` Loop Lifecycle [5a0ec66]
+    - [ ] WHERE: `tests/conftest.py:20-50` (around `app_instance` fixture).
+    - [ ] WHAT: Ensure the `app._loop.stop()` cleanup safely cancels pending background tasks.
+    - [ ] HOW: Use `asyncio.all_tasks(loop)` and `task.cancel()` before stopping the loop in the fixture teardown.
+    - [ ] SAFETY: Thread-safety; only cancel tasks belonging to the app's loop.
+- [x] Task: Resolve `Event loop is closed` in Core Test Suite [82aa288]
+    - [ ] WHERE: `tests/test_spawn_interception.py`, `tests/test_gui_streaming.py`.
+    - [ ] WHAT: Update blocking calls to use `ThreadPoolExecutor` or `asyncio.run_coroutine_threadsafe(..., loop)`.
+    - [ ] HOW: Pass the active loop from `app_instance` to the functions triggering the events.
+    - [ ] SAFETY: Prevent event queue deadlocks.
+- [x] Task: Implement Centralized Sectioned Logging Utility [51f7c2a]
+    - [ ] WHERE: `tests/conftest.py:50-80` (`VerificationLogger`).
+    - [ ] WHAT: Route `VerificationLogger` output to `./tests/logs/` instead of `logs/test/`.
+    - [ ] HOW: Update `self.logs_dir = Path(f"tests/logs/{datetime.datetime.now().strftime('%Y%m%d_%H%M%S')}")`.
+    - [ ] SAFETY: No state impact.
+- [ ] Task: Conductor - User Manual Verification 'Phase 2: Asyncio & Logging' (Protocol in workflow.md)
+
+## Phase 3: Assertion Implementation & Legacy Cleanup [checkpoint: 14ac983]
+- [x] Task: Replace `pytest.fail` with Functional Assertions (`api_events`, `execution_engine`) [194626e]
+    - [ ] WHERE: `tests/test_api_events.py:40`, `tests/test_execution_engine.py:45`.
+    - [ ] WHAT: Implement actual `assert` statements testing the mock calls and status updates.
+    - [ ] HOW: Use `MagicMock.assert_called_with` and check `ticket.status == "completed"`.
+    - [ ] SAFETY: Isolate mocks.
+- [x] Task: Replace `pytest.fail` with Functional Assertions (`token_usage`, `agent_capabilities`) [ffc5d75]
+    - [ ] WHERE: `tests/test_token_usage.py`, `tests/test_agent_capabilities.py`.
+    - [ ] WHAT: Implement tests verifying the `usage_metadata` extraction and `list_models` output count.
+    - [ ] HOW: Check for 6 models (including `gemini-2.0-flash`) in `list_models` test.
+    - [ ] SAFETY: Isolate mocks.
+- [x] Task: Resolve Simulation Entry Count Regressions [dbd955a]
+    - [ ] WHERE: `tests/test_extended_sims.py:20`.
+    - [ ] WHAT: Fix `AssertionError: Expected at least 2 entries, found 0`.
+    - [ ] HOW: Update simulation flow to properly wait for the `User` and `AI` entries to populate the GUI history before asserting.
+    - [ ] SAFETY: Use dynamic wait (`ApiHookClient.wait_for_event`) instead of static sleeps.
+- [x] Task: Remove Legacy `gui_legacy` Test Imports & File [4d171ff]
+    - [x] WHERE: `tests/test_gui_events.py`, `tests/test_gui_updates.py`, `tests/test_gui_diagnostics.py`, and project root.
+    - [x] WHAT: Change `from gui_legacy import App` to `from gui_2 import App`. Fix any breaking UI locators. Then delete `gui_legacy.py`.
+    - [x] HOW: String replacement and standard `os.remove`.
+    - [x] SAFETY: Verify no remaining imports exist across the suite using `grep_search`.
+- [x] Task: Resolve `pytest.fail` in `tests/test_agent_tools_wiring.py` [20b2e2d]
+    - [x] WHERE: `tests/test_agent_tools_wiring.py`.
+    - [x] WHAT: Implement actual assertions for `test_set_agent_tools`.
+    - [x] HOW: Verify that `ai_client.set_agent_tools` correctly updates the active tool set.
+    - [x] SAFETY: Use mocks for `ai_client` if necessary.
+- [ ] Task: Conductor - User Manual Verification 'Phase 3: Assertions & Legacy Cleanup' (Protocol in workflow.md)
+
+## Phase 4: Documentation & Final Verification [checkpoint: 2d3820b]
+- [x] Task: Model Switch Request [Manual]
+    - [x] Ask the user to run the `/model` command to switch to a high reasoning model for the documentation phase. Wait for their confirmation before proceeding.
+- [x] Task: Update Core Documentation & Workflow Contract [6b2270f]
+    - [x] WHERE: `Readme.md`, `docs/guide_simulations.md`, `conductor/workflow.md`.
+    - [x] WHAT: Document artifact locations, `live_gui` standard, and the strict "Structural Testing Contract".
+    - [x] HOW: Markdown editing. Add sections explicitly banning arbitrary `unittest.mock.patch` on core infra for Tier 3 workers.
+    - [x] SAFETY: Keep formatting clean.
+- [x] Task: Full Suite Validation & Warning Cleanup [5401fc7]
+- [x] Task: Final Artifact Isolation Verification [7c70f74]
+- [x] Task: Conductor - User Manual Verification 'Phase 4: Documentation & Final Verification' (Protocol in workflow.md) [Manual]
+
+## Phase 5: Resolution of Lingering Regressions
+- [~] Task: Identify failing test batches [Isolated]
+- [ ] Task: Resolve `tests/test_visual_sim_mma_v2.py` (Epic Planning Hang)
+    - [ ] WHERE: `gui_2.py`, `gemini_cli_adapter.py`, `tests/mock_gemini_cli.py`.
+    - [ ] WHAT: Fix the hang where Tier 1 epic planning never completes in simulation.
+    - [ ] HOW: Add debug logging to adapter and mock. Fix stdin closure if needed.
+- [ ] Task: Resolve `tests/test_gemini_cli_edge_cases.py` (Loop Termination Hang)
+    - [ ] WHERE: `tests/test_gemini_cli_edge_cases.py`.
+    - [ ] WHAT: Fix `test_gemini_cli_loop_termination` timeout.
+- [ ] Task: Resolve `tests/test_live_workflow.py` and `tests/test_visual_orchestration.py`
+- [ ] Task: Resolve `conductor/tests/` failures
+- [ ] Task: Final Artifact Isolation & Batched Test Verification
diff --git a/conductor/archive/test_stabilization_20260302/spec.md b/conductor/archive/test_stabilization_20260302/spec.md
new file mode 100644
index 0000000..ac83bd2
--- /dev/null
+++ b/conductor/archive/test_stabilization_20260302/spec.md
@@ -0,0 +1,43 @@
+# Specification: Test Suite Stabilization & Consolidation (test_stabilization_20260302)
+
+## Overview
+The goal of this track is to stabilize and unify the project's test suite. This involves resolving pervasive `asyncio` lifecycle errors, consolidating redundant testing paradigms (specifically manual GUI subprocesses), ensuring artifact isolation in `./tests/artifacts/`, implementing functional assertions for currently mocked-out tests, and updating documentation to reflect the finalized verification framework.
+
+## Architectural Constraints: Combating Mock-Rot
+To prevent future testing entropy caused by "Green-Light Bias" and stateless Tier 3 delegation, this track establishes strict constraints:
+- **Ban on Aggressive Mocking:** Tests MUST NOT use `unittest.mock.patch` to arbitrarily hollow out core infrastructure (e.g., the `App` lifecycle or async loops) just to achieve exit code 0.
+- **Mandatory Centralized Fixtures:** All tests interacting with the GUI or AI client MUST use the centralized `app_instance` or `live_gui` fixtures defined in `conftest.py`.
+- **Structural Testing Contract:** The project workflow must enforce that future AI agents write integration tests against the live state rather than hallucinated mocked environments.
+
+## Functional Requirements
+- **Asyncio Lifecycle Stabilization:**
+  - Resolve `RuntimeError: Event loop is closed` across the suite.
+  - Implement `ThreadPoolExecutor` for blocking calls in GUI-bound tests.
+  - Audit and fix fixture cleanup in `conftest.py`.
+- **Paradigm Consolidation (from testing_consolidation_20260302):**
+  - Refactor integration/visual tests to exclusively use the `live_gui` pytest fixture.
+  - Eliminate all manual `subprocess.Popen` calls to `gui_2.py` in the `tests/` and `simulation/` directories.
+  - Update legacy tests (e.g., `test_gui_events.py`, `test_gui_diagnostics.py`) that still import the deprecated `gui_legacy.py` to use `gui_2.py`.
+  - Completely remove `gui_legacy.py` from the project to eliminate confusion.
+- **Artifact Isolation & Discipline:**
+  - All test-generated files (temporary projects, mocks, sessions) MUST be isolated in `./tests/artifacts/`.
+  - Prevent leakage into `conductor/tracks/` or project root.
+- **Enhanced Test Reporting:**
+  - Implement structured, sectioned logging in `./tests/logs/` with timestamps (consolidating `VerificationLogger` outputs).
+- **Assertion Implementation:**
+  - Replace `pytest.fail` placeholders with full functional assertions.
+- **Simulation Regression Fixes:**
+  - Debug and resolve `test_context_sim_live` entry count issues.
+- **Documentation Updates:**
+  - Update `Readme.md` (Testing section) to explain the new log/artifact locations and the `--enable-test-hooks` requirement.
+  - Update `docs/guide_simulations.md` to document the centralized `pytest` usage instead of standalone simulator scripts.
+
+## Acceptance Criteria
+- [ ] Full suite run completes without `RuntimeError: Event loop is closed` errors or warnings.
+- [ ] No `subprocess.Popen` calls to `gui_2.py` exist in the test codebase.
+- [ ] No test files import `gui_legacy.py`.
+- [ ] `gui_legacy.py` has been deleted from the repository.
+- [ ] All test artifacts are isolated in `./tests/artifacts/`.
+- [ ] All tests previously marked with `pytest.fail` now have passing functional assertions.
+- [ ] Simulation tests pass with correct entry counts.
+- [ ] `Readme.md` and `docs/guide_simulations.md` accurately reflect the new testing infrastructure.
diff --git a/conductor/tracks.md b/conductor/tracks.md
index 7e66752..d65ff2a 100644
--- a/conductor/tracks.md
+++ b/conductor/tracks.md
@@ -8,40 +8,40 @@ This file tracks all major tracks for the project. Each track has its own detail
 
 *The following tracks MUST be executed in this exact order to safely resolve tech debt before feature development.*
 
-1. [x] **Track: Test Suite Stabilization & Consolidation** (Active/Next)
-*Link: [./tracks/test_stabilization_20260302/](./tracks/test_stabilization_20260302/)*
-
-2. [ ] **Track: Strict Static Analysis & Type Safety**
+1. [ ] **Track: Strict Static Analysis & Type Safety**
 *Link: [./tracks/strict_static_analysis_and_typing_20260302/](./tracks/strict_static_analysis_and_typing_20260302/)*
 
-3. [ ] **Track: Codebase Migration to `src` & Cleanup**
+2. [ ] **Track: Codebase Migration to `src` & Cleanup**
 *Link: [./tracks/codebase_migration_20260302/](./tracks/codebase_migration_20260302/)*
 
-4. [ ] **Track: GUI Decoupling & Controller Architecture**
+3. [ ] **Track: GUI Decoupling & Controller Architecture**
 *Link: [./tracks/gui_decoupling_controller_20260302/](./tracks/gui_decoupling_controller_20260302/)*
 
-5. [ ] **Track: Hook API UI State Verification**
+4. [ ] **Track: Hook API UI State Verification**
 *Link: [./tracks/hook_api_ui_state_verification_20260302/](./tracks/hook_api_ui_state_verification_20260302/)*
 
-6. [ ] **Track: Robust JSON Parsing for Tech Lead**
+5. [ ] **Track: Robust JSON Parsing for Tech Lead**
 *Link: [./tracks/robust_json_parsing_tech_lead_20260302/](./tracks/robust_json_parsing_tech_lead_20260302/)*
 
-7. [ ] **Track: Concurrent Tier Source Isolation**
+6. [ ] **Track: Concurrent Tier Source Isolation**
 *Link: [./tracks/concurrent_tier_source_tier_20260302/](./tracks/concurrent_tier_source_tier_20260302/)*
 
-8. [ ] **Track: Test Suite Performance & Flakiness**
+7. [ ] **Track: Test Suite Performance & Flakiness**
 *Link: [./tracks/test_suite_performance_and_flakiness_20260302/](./tracks/test_suite_performance_and_flakiness_20260302/)*
 
-9. [ ] **Track: Manual UX Validation & Polish**
+8. [ ] **Track: Manual UX Validation & Polish**
 *Link: [./tracks/manual_ux_validation_20260302/](./tracks/manual_ux_validation_20260302/)*
 
-10. [ ] **Track: Asynchronous Tool Execution Engine**
+9. [ ] **Track: Asynchronous Tool Execution Engine**
 *Link: [./tracks/async_tool_execution_20260303/](./tracks/async_tool_execution_20260303/)*
 
 ---
 
 ## Completed / Archived
 
+- [x] **Track: Test Suite Stabilization & Consolidation**
+*Link: [./archive/test_stabilization_20260302/](./archive/test_stabilization_20260302/)*
+
 - [x] **Track: Tech Debt & Test Discipline Cleanup**
 *Link: [./archive/tech_debt_and_test_cleanup_20260302/](./archive/tech_debt_and_test_cleanup_20260302/)*