chore(conductor): Add new track 'Codebase Migration to src & Cleanup'

chore(conductor): Ensure plan complies with Surgical Spec Protocol
chore(conductor): Add model switch requirement to Phase 4
2026-03-02 22:28:56 -05:00 · 2026-03-02 22:22:52 -05:00 · 2026-03-02 22:19:52 -05:00 · 2026-03-02 22:18:42 -05:00 · 2026-03-02 22:16:40 -05:00 · 2026-03-02 22:15:17 -05:00
12 changed files with 226 additions and 41 deletions
@@ -6,6 +6,12 @@ This file tracks all major tracks for the project. Each track has its own detail

 ## Current Tracks

+- [ ] **Track: Test Suite Stabilization & Consolidation**
+*Link: [./tracks/test_stabilization_20260302/](./tracks/test_stabilization_20260302/)*
+
+- [ ] **Track: Codebase Migration to `src` & Cleanup**
+*Link: [./tracks/codebase_migration_20260302/](./tracks/codebase_migration_20260302/)*
+
 ---

 ## Completed / Archived
@@ -1,4 +1,4 @@
-# Track testing_consolidation_20260302 Context
+# Track codebase_migration_20260302 Context

 - [Specification](./spec.md)
 - [Implementation Plan](./plan.md)
@@ -0,0 +1,8 @@
+{
+  "track_id": "codebase_migration_20260302",
+  "type": "chore",
+  "status": "new",
+  "created_at": "2026-03-02T22:28:00Z",
+  "updated_at": "2026-03-02T22:28:00Z",
+  "description": "Move the codebase from the main directory to a src directory. Alleviate clutter by doing so. Remove files that are not used at all by the current application's implementation."
+}
@@ -0,0 +1,54 @@
+# Implementation Plan: Codebase Migration to `src` & Cleanup (codebase_migration_20260302)
+
+## Phase 1: Unused File Identification & Removal
+- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
+- [ ] Task: Audit Codebase for Dead Files
+    - [ ] WHERE: Project root
+    - [ ] WHAT: Run `py_find_usages` or grep on suspected unused files to verify they are not referenced by `gui_2.py`, `tests/`, `simulation/`, or core config files.
+    - [ ] HOW: Gather a list of unused files.
+    - [ ] SAFETY: Do not delete files referenced in `.toml` files or GitHub Actions workflows.
+- [ ] Task: Delete Unused Files
+    - [ ] WHERE: Project root
+    - [ ] WHAT: Use `run_powershell` with `Remove-Item` to delete the identified unused files.
+    - [ ] HOW: Explicitly list and delete them.
+    - [ ] SAFETY: Stage deletions to Git carefully.
+- [ ] Task: Conductor - User Manual Verification 'Phase 1: Unused File Identification & Removal' (Protocol in workflow.md)
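Review note on the Phase 1 audit: a minimal Python sketch of the "gather a list of unused files" step, offered as one possible approach rather than what `py_find_usages` does. It assumes that "unused" means "no root-level file or file under `tests/` / `simulation/` imports the module by name". The helper names `imported_modules` and `unreferenced_root_modules` are hypothetical. It does not inspect `.toml` files or workflow definitions, so those still need the text search described in the plan.

```python
import ast
from pathlib import Path
from typing import Iterable, Set


def imported_modules(source: str) -> Set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped: they never name a root module.
            names.add(node.module.split(".")[0])
    return names


def unreferenced_root_modules(
    root: Path, search_dirs: Iterable[str] = ("tests", "simulation")
) -> Set[str]:
    """List root-level .py modules that no scanned file imports by name."""
    root_files = {p.stem: p for p in root.glob("*.py")}
    scanned = list(root_files.values())
    for name in search_dirs:
        directory = root / name
        if directory.is_dir():
            scanned.extend(directory.rglob("*.py"))
    used: Set[str] = set()
    for path in scanned:
        try:
            used |= imported_modules(path.read_text(encoding="utf-8"))
        except SyntaxError:
            continue  # files that do not parse need a manual review
    return set(root_files) - used
```

The result is only a candidate list. An entry-point script that nothing imports would show up in it, and modules loaded dynamically or referenced from config files would be missed, so each candidate should be confirmed by hand before deletion.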
+
+## Phase 2: Directory Restructuring & Migration
+- [ ] Task: Create `src/` Directory
+    - [ ] WHERE: Project root
+    - [ ] WHAT: Create the `src/` directory. Add an empty `__init__.py` to make it a package.
+    - [ ] HOW: `New-Item -ItemType Directory src; New-Item src/__init__.py`.
+    - [ ] SAFETY: None.
+- [ ] Task: Move Application Files to `src/`
+    - [ ] WHERE: Project root
+    - [ ] WHAT: Move core `.py` files (`gui_2.py`, `ai_client.py`, `mcp_client.py`, `shell_runner.py`, `project_manager.py`, `events.py`, etc.) into `src/`.
+    - [ ] HOW: Use `git mv` via `run_powershell` or standard `Move-Item`.
+    - [ ] SAFETY: Preserve git history of these files.
+- [ ] Task: Conductor - User Manual Verification 'Phase 2: Directory Restructuring & Migration' (Protocol in workflow.md)
+
+## Phase 3: Entry Point & Import Resolution
+- [ ] Task: Create `sloppy.py` Entry Point
+    - [ ] WHERE: Project root (`sloppy.py`)
+    - [ ] WHAT: Create the script to act as the primary launch point. It should import `App` from `src.gui_2` and pass CLI args.
+    - [ ] HOW: Write a standard Python script wrapper.
+    - [ ] SAFETY: Ensure it correctly propagates `sys.argv`.
+- [ ] Task: Resolve Absolute and Relative Imports
+    - [ ] WHERE: `src/*.py`, `tests/*.py`, `simulation/*.py`
+    - [ ] WHAT: Update import statements so that, for example, `import gui_2` becomes `from src import gui_2`. Alternatively, adjust the `sys.path.append` calls in tests so the moved modules still resolve.
+    - [ ] HOW: Surgical string replacements. Ensure `pytest` can still find fixtures and test modules.
+    - [ ] SAFETY: Run `uv run pytest` to aggressively check for `ModuleNotFoundError`s.
+- [ ] Task: Conductor - User Manual Verification 'Phase 3: Entry Point & Import Resolution' (Protocol in workflow.md)
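Review note on the Phase 3 entry point: a minimal sketch of what a `sloppy.py` wrapper could look like. The plan only says it imports `App` from `src.gui_2` and passes CLI args. The no-argument constructor, the `run()` method, and the helper names `load_app_class` and `main` are assumptions made for illustration, not the project's confirmed API.

```python
import importlib
import sys
from typing import Callable, Optional, Sequence


def load_app_class(module_name: str = "src.gui_2", attr: str = "App") -> type:
    """Import the App class lazily so the wrapper itself imports without the GUI."""
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def main(
    argv: Optional[Sequence[str]] = None,
    loader: Callable[[], type] = load_app_class,
) -> int:
    """Launch the application, leaving CLI flags visible to it through sys.argv."""
    if argv is not None:
        # Explicit arguments replace everything after the program name.
        sys.argv = [sys.argv[0], *argv]
    app_cls = loader()
    app = app_cls()  # assumption: App takes no constructor arguments
    app.run()        # assumption: App exposes a blocking run() method
    return 0
```

In the real script the file would end with an `if __name__ == "__main__": raise SystemExit(main())` guard. Injecting `loader` keeps the argument propagation testable without starting the GUI.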
+
+## Phase 4: Final Validation & Documentation
+- [ ] Task: Full Test Suite Validation
+    - [ ] WHERE: Project root
+    - [ ] WHAT: Run `uv run pytest`. Fix any remaining path resolution issues for logs, artifacts, and configs.
+    - [ ] HOW: Verify 100% pass rate.
+    - [ ] SAFETY: Artifacts must still be written to `tests/artifacts/`.
+- [ ] Task: Update Core Documentation
+    - [ ] WHERE: `Readme.md`, `docs/`, `conductor/tech-stack.md`
+    - [ ] WHAT: Document `sloppy.py` as the new entry point. Document the `src/` directory layout.
+    - [ ] HOW: Surgical text replacement.
+    - [ ] SAFETY: Accurate representation of new structure.
+- [ ] Task: Conductor - User Manual Verification 'Phase 4: Final Validation & Documentation' (Protocol in workflow.md)
@@ -0,0 +1,33 @@
+# Track Specification: Codebase Migration to `src` & Cleanup (codebase_migration_20260302)
+
+## Overview
+This track focuses on restructuring the codebase to alleviate clutter by moving the main implementation files from the project root into a dedicated `src/` directory. Additionally, files that are completely unused by the current implementation will be automatically identified and removed. A new clean entry point (`sloppy.py`) will be created in the root directory.
+
+## Functional Requirements
+- **Directory Restructuring**:
+  - Move all active Python implementation files (e.g., `gui_2.py`, `ai_client.py`, `mcp_client.py`, `shell_runner.py`, `project_manager.py`, `events.py`, etc.) into a new `src/` directory.
+  - Update internal imports within all moved files to reflect their new locations or ensure the Python path resolves them correctly.
+- **Root Directory Retention**:
+  - Keep configuration files (e.g., `config.toml`, `pyproject.toml`, `requirements.txt`, `.gitignore`) in the project root.
+  - Keep documentation files and directories (e.g., `Readme.md`, `BUILD.md`, `docs/`) in the project root.
+  - Keep the `tests/` and `simulation/` directories at the root level.
+- **New Entry Point**:
+  - Create a new file `sloppy.py` in the root directory.
+  - `sloppy.py` will serve as the primary entry point to launch the application (it starts the underlying `gui_2.py` logic, which will be moved into `src/`).
+- **Dead Code/File Removal**:
+  - Automatically identify completely unused files and scripts in the project root (e.g., legacy files, unreferenced tools).
+  - Delete the identified unused files to clean up the repository.
+
+## Non-Functional Requirements
+- Ensure all automated tests (`tests/`) and simulations (`simulation/`) continue to run without `ModuleNotFoundError`s.
+- `sloppy.py` must support existing CLI arguments (e.g., `--enable-test-hooks`).
+
+## Acceptance Criteria
+- [ ] A `src/` directory exists and contains the main application logic.
+- [ ] The root directory is clean, containing mainly configs, docs, `tests/`, `simulation/`, and `sloppy.py`.
+- [ ] `sloppy.py` successfully launches the application.
+- [ ] The full test suite runs and passes (i.e. all imports are correctly resolved).
+- [ ] Obsolete/unused files have been successfully deleted from the repository.
+
+## Out of Scope
+- Complete refactoring of `gui_2.py` into a fully modular system (this track only moves it, though preparing it for future non-monolithic structure is conceptually aligned).
@@ -0,0 +1,5 @@
+# Track test_stabilization_20260302 Context
+
+- [Specification](./spec.md)
+- [Implementation Plan](./plan.md)
+- [Metadata](./metadata.json)
@@ -0,0 +1,8 @@
+{
+  "track_id": "test_stabilization_20260302",
+  "type": "chore",
+  "status": "new",
+  "created_at": "2026-03-02T22:09:00Z",
+  "updated_at": "2026-03-02T22:09:00Z",
+  "description": "Comprehensive Test Suite Stabilization & Consolidation. Fixes asyncio errors, resolves artifact leakage, and unifies testing paradigms."
+}
@@ -0,0 +1,68 @@
+# Implementation Plan: Test Suite Stabilization & Consolidation (test_stabilization_20260302)
+
+## Phase 1: Infrastructure & Paradigm Consolidation
+- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
+- [ ] Task: Setup Artifact Isolation Directories
+    - [ ] WHERE: Project root
+    - [ ] WHAT: Create `./tests/artifacts/` and `./tests/logs/` directories. Add `.gitignore` to both containing `*` and `!.gitignore`.
+    - [ ] HOW: Use PowerShell `New-Item` and `Out-File`.
+    - [ ] SAFETY: Do not commit artifacts.
+- [ ] Task: Migrate Manual Launchers to `live_gui` Fixture
+    - [ ] WHERE: `tests/visual_mma_verification.py` (lines 15-40), `simulation/` scripts.
+    - [ ] WHAT: Replace `subprocess.Popen(["python", "gui_2.py"])` with the `live_gui` fixture injected into `pytest` test functions. Remove manual while-loop sleeps.
+    - [ ] HOW: Use standard pytest `def test_... (live_gui):` and rely on `ApiHookClient` with proper timeouts.
+    - [ ] SAFETY: Ensure `subprocess` is not orphaned if test fails.
+- [ ] Task: Conductor - User Manual Verification 'Phase 1: Infrastructure & Paradigm Consolidation' (Protocol in workflow.md)
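Review note on the artifact isolation task: the plan uses PowerShell `New-Item` and `Out-File`. A Python equivalent, sketched here with the hypothetical helper name `ensure_ignored_dir`, could look like this. It creates the directory and writes the `.gitignore` with the two patterns the plan names.

```python
from pathlib import Path


def ensure_ignored_dir(path: Path) -> Path:
    """Create a directory whose contents Git ignores, except the .gitignore itself."""
    path.mkdir(parents=True, exist_ok=True)
    # "*" ignores every file; "!.gitignore" re-includes this file so the
    # otherwise empty directory can still be committed.
    (path / ".gitignore").write_text("*\n!.gitignore\n", encoding="utf-8")
    return path
```

Calling it for `tests/artifacts` and `tests/logs` is idempotent, so it could also run from a session-scoped fixture if that turns out to be convenient.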
+
+## Phase 2: Asyncio Stabilization & Logging
+- [ ] Task: Audit and Fix `conftest.py` Loop Lifecycle
+    - [ ] WHERE: `tests/conftest.py:20-50` (around `app_instance` fixture).
+    - [ ] WHAT: Ensure the `app._loop.stop()` cleanup safely cancels pending background tasks.
+    - [ ] HOW: Use `asyncio.all_tasks(loop)` and `task.cancel()` before stopping the loop in the fixture teardown.
+    - [ ] SAFETY: Thread-safety; only cancel tasks belonging to the app's loop.
+- [ ] Task: Resolve `Event loop is closed` in Core Test Suite
+    - [ ] WHERE: `tests/test_spawn_interception.py`, `tests/test_gui_streaming.py`.
+    - [ ] WHAT: Update blocking calls to use `ThreadPoolExecutor` or `asyncio.run_coroutine_threadsafe(..., loop)`.
+    - [ ] HOW: Pass the active loop from `app_instance` to the functions triggering the events.
+    - [ ] SAFETY: Prevent event queue deadlocks.
+- [ ] Task: Implement Centralized Sectioned Logging Utility
+    - [ ] WHERE: `tests/conftest.py:50-80` (`VerificationLogger`).
+    - [ ] WHAT: Route `VerificationLogger` output to `./tests/logs/` instead of `logs/test/`.
+    - [ ] HOW: Update `self.logs_dir = Path(f"tests/logs/{datetime.datetime.now().strftime('%Y%m%d_%H%M%S')}")`.
+    - [ ] SAFETY: No state impact.
+- [ ] Task: Conductor - User Manual Verification 'Phase 2: Asyncio Stabilization & Logging' (Protocol in workflow.md)
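Review note on the loop-lifecycle task: a minimal sketch of a teardown that cancels pending tasks before stopping the loop. It assumes the app's loop runs via `run_forever()` in a background thread. The helper name `shutdown_loop` is hypothetical. As a variation on the plan's `asyncio.all_tasks(loop)` call, the cancellation is scheduled onto the loop's own thread, so only that loop's tasks are touched and nothing is mutated from the test thread.

```python
import asyncio
import threading


def shutdown_loop(
    loop: asyncio.AbstractEventLoop, thread: threading.Thread, timeout: float = 5.0
) -> None:
    """Cancel pending tasks on a loop running in another thread, then stop and close it."""

    async def _cancel_pending() -> None:
        current = asyncio.current_task()
        # all_tasks() without arguments refers to the running loop, i.e. the app's loop.
        pending = [t for t in asyncio.all_tasks() if t is not current]
        for task in pending:
            task.cancel()
        # Give cancelled tasks a chance to run their cleanup handlers.
        await asyncio.gather(*pending, return_exceptions=True)

    # Run the cancellation on the loop's own thread and wait for it to finish.
    asyncio.run_coroutine_threadsafe(_cancel_pending(), loop).result(timeout)
    loop.call_soon_threadsafe(loop.stop)
    thread.join(timeout)
    if not thread.is_alive():
        loop.close()  # only safe once run_forever() has returned
```

Closing the loop only after the thread has exited avoids callbacks being scheduled onto a closed loop, which is one common source of `RuntimeError: Event loop is closed`.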
+
+## Phase 3: Assertion Implementation & Legacy Cleanup
+- [ ] Task: Replace `pytest.fail` with Functional Assertions (`api_events`, `execution_engine`)
+    - [ ] WHERE: `tests/test_api_events.py:40`, `tests/test_execution_engine.py:45`.
+    - [ ] WHAT: Implement actual `assert` statements testing the mock calls and status updates.
+    - [ ] HOW: Use `MagicMock.assert_called_with` and check `ticket.status == "completed"`.
+    - [ ] SAFETY: Isolate mocks.
+- [ ] Task: Replace `pytest.fail` with Functional Assertions (`token_usage`, `agent_capabilities`)
+    - [ ] WHERE: `tests/test_token_usage.py`, `tests/test_agent_capabilities.py`.
+    - [ ] WHAT: Implement tests verifying the `usage_metadata` extraction and `list_models` output count.
+    - [ ] HOW: Check for 6 models (including `gemini-2.0-flash`) in `list_models` test.
+    - [ ] SAFETY: Isolate mocks.
+- [ ] Task: Resolve Simulation Entry Count Regressions
+    - [ ] WHERE: `tests/test_extended_sims.py:20`.
+    - [ ] WHAT: Fix `AssertionError: Expected at least 2 entries, found 0`.
+    - [ ] HOW: Update simulation flow to properly wait for the `User` and `AI` entries to populate the GUI history before asserting.
+    - [ ] SAFETY: Use dynamic wait (`ApiHookClient.wait_for_event`) instead of static sleeps.
+- [ ] Task: Remove Legacy `gui_legacy` Test Imports & File
+    - [ ] WHERE: `tests/test_gui_events.py`, `tests/test_gui_updates.py`, `tests/test_gui_diagnostics.py`, and project root.
+    - [ ] WHAT: Change `from gui_legacy import App` to `from gui_2 import App`. Fix any breaking UI locators. Then delete `gui_legacy.py`.
+    - [ ] HOW: String replacement and standard `os.remove`.
+    - [ ] SAFETY: Verify no remaining imports exist across the suite using `grep_search`.
+- [ ] Task: Conductor - User Manual Verification 'Phase 3: Assertion Implementation & Legacy Cleanup' (Protocol in workflow.md)
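Review note on the "dynamic wait instead of static sleeps" safety item: the plan points at `ApiHookClient.wait_for_event`, whose signature is not shown in this diff. The generic polling helper below, with the hypothetical name `wait_until`, only illustrates the idea of waiting on a condition with a deadline. It is not that method's implementation.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(probe: Callable[[], T], timeout: float = 10.0, interval: float = 0.1) -> T:
    """Poll probe() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        value = probe()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout:.1f}s")
        time.sleep(interval)
```

A test could wrap its history lookup in a probe that returns the entries once at least two are present, and assert on the returned value. The test then finishes as soon as the GUI has caught up and fails with a clear timeout otherwise.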
+
+## Phase 4: Documentation & Final Verification
+- [ ] Task: Model Switch Request
+    - [ ] Ask the user to run the `/model` command to switch to a high-reasoning model for the documentation phase. Wait for their confirmation before proceeding.
+- [ ] Task: Update Core Documentation & Workflow Contract
+    - [ ] WHERE: `Readme.md`, `docs/guide_simulations.md`, `conductor/workflow.md`.
+    - [ ] WHAT: Document artifact locations, `live_gui` standard, and the strict "Structural Testing Contract".
+    - [ ] HOW: Markdown editing. Add sections explicitly banning arbitrary `unittest.mock.patch` on core infra for Tier 3 workers.
+    - [ ] SAFETY: Keep formatting clean.
+- [ ] Task: Full Suite Validation & Warning Cleanup
+- [ ] Task: Final Artifact Isolation Verification
+- [ ] Task: Conductor - User Manual Verification 'Phase 4: Documentation & Final Verification' (Protocol in workflow.md)
@@ -0,0 +1,43 @@
+# Specification: Test Suite Stabilization & Consolidation (test_stabilization_20260302)
+
+## Overview
+The goal of this track is to stabilize and unify the project's test suite. This involves resolving pervasive `asyncio` lifecycle errors, consolidating redundant testing paradigms (specifically manual GUI subprocesses), ensuring artifact isolation in `./tests/artifacts/`, implementing functional assertions for currently mocked-out tests, and updating documentation to reflect the finalized verification framework.
+
+## Architectural Constraints: Combating Mock-Rot
+To prevent future testing entropy caused by "Green-Light Bias" and stateless Tier 3 delegation, this track establishes strict constraints:
+- **Ban on Aggressive Mocking:** Tests MUST NOT use `unittest.mock.patch` to arbitrarily hollow out core infrastructure (e.g., the `App` lifecycle or async loops) just to achieve exit code 0.
+- **Mandatory Centralized Fixtures:** All tests interacting with the GUI or AI client MUST use the centralized `app_instance` or `live_gui` fixtures defined in `conftest.py`.
+- **Structural Testing Contract:** The project workflow must enforce that future AI agents write integration tests against the live state rather than hallucinated mocked environments.
+
+## Functional Requirements
+- **Asyncio Lifecycle Stabilization:**
+  - Resolve `RuntimeError: Event loop is closed` across the suite.
+  - Implement `ThreadPoolExecutor` for blocking calls in GUI-bound tests.
+  - Audit and fix fixture cleanup in `conftest.py`.
+- **Paradigm Consolidation (from testing_consolidation_20260302):**
+  - Refactor integration/visual tests to exclusively use the `live_gui` pytest fixture.
+  - Eliminate all manual `subprocess.Popen` calls to `gui_2.py` in the `tests/` and `simulation/` directories.
+  - Update legacy tests (e.g., `test_gui_events.py`, `test_gui_diagnostics.py`) that still import the deprecated `gui_legacy.py` to use `gui_2.py`.
+  - Completely remove `gui_legacy.py` from the project to eliminate confusion.
+- **Artifact Isolation & Discipline:**
+  - All test-generated files (temporary projects, mocks, sessions) MUST be isolated in `./tests/artifacts/`.
+  - Prevent leakage into `conductor/tracks/` or project root.
+- **Enhanced Test Reporting:**
+  - Implement structured, sectioned logging in `./tests/logs/` with timestamps (consolidating `VerificationLogger` outputs).
+- **Assertion Implementation:**
+  - Replace `pytest.fail` placeholders with full functional implementation.
+- **Simulation Regression Fixes:**
+  - Debug and resolve `test_context_sim_live` entry count issues.
+- **Documentation Updates:**
+  - Update `Readme.md` (Testing section) to explain the new log/artifact locations and the `--enable-test-hooks` requirement.
+  - Update `docs/guide_simulations.md` to document the centralized `pytest` usage instead of standalone simulator scripts.
+
+## Acceptance Criteria
+- [ ] Full suite run completes without any `RuntimeError: Event loop is closed` errors.
+- [ ] No `subprocess.Popen` calls to `gui_2.py` exist in the test codebase.
+- [ ] No test files import `gui_legacy.py`.
+- [ ] `gui_legacy.py` has been deleted from the repository.
+- [ ] All test artifacts are isolated in `./tests/artifacts/`.
+- [ ] All tests previously marked with `pytest.fail` now have passing functional assertions.
+- [ ] Simulation tests pass with correct entry counts.
+- [ ] `Readme.md` and `docs/guide_simulations.md` accurately reflect the new testing infrastructure.
@@ -1,8 +0,0 @@
-{
-  "track_id": "testing_consolidation_20260302",
-  "type": "chore",
-  "status": "new",
-  "created_at": "2026-03-02T00:00:00Z",
-  "updated_at": "2026-03-02T00:00:00Z",
-  "description": "Consolidate divergent simulation tests to uniformly use the pytest live_gui fixture and remove redundant subprocess launcher scripts."
-}
@@ -1,16 +0,0 @@
-# Implementation Plan: Testing & Simulation Consolidation
-
-Architecture reference: [docs/guide_simulations.md](../../../docs/guide_simulations.md)
-
---
-
-## Phase 1: Migrate Manual Launchers to Pytest Fixtures
-Focus: Remove `subprocess.Popen` from visual verification scripts and convert them to proper pytest tests.
-
- [ ] Task 1.1: Refactor `tests/visual_mma_verification.py` to be a standard pytest function: `def test_visual_mma_verification(live_gui):`. Remove all `subprocess.Popen` and directory changing logic.
- [ ] Task 1.2: Audit `tests/` for any other file containing `subprocess.Popen` pointing to `gui_2.py` and refactor them similarly.
-
-## Phase 2: Consolidate Simulation Scripts
-Focus: Ensure the `simulation/` directory integrates cleanly with the pytest framework or serves a distinct non-testing purpose.
-
- [ ] Task 2.1: Audit the `simulation/` directory. If scripts there are just tests in disguise, move them into `tests/` and wrap them in the `live_gui` fixture. If they are intended as standalone interactive demos, clearly document their purpose and ensure they don't duplicate `conftest.py` logic unnecessarily.
@@ -1,16 +0,0 @@
-# Track Specification: Testing & Simulation Consolidation
-
-## Overview
-Currently, the codebase has redundant testing paradigms. Some tests (`tests/visual_sim_gui_ux.py`) properly use the `live_gui` fixture managed by `pytest` in `conftest.py`. However, other visual verification scripts (like `tests/visual_mma_verification.py` and potentially files in `simulation/`) reinvent the wheel by manually opening subprocesses with `subprocess.Popen` to launch the GUI. This fragmentation causes tech debt and test flakiness.
-
-## Current State Audit
-1. **Redundant Subprocess Launching**: `tests/visual_mma_verification.py` manually spawns `gui_2.py` via `subprocess.Popen` instead of using the `conftest.py` `live_gui` fixture.
-2. **Simulation Redundancy**: The `simulation/` directory contains `sim_base.py`, `workflow_sim.py`, etc., that also use `ApiHookClient` but may be reinventing pytest workflows outside of the standard test runner.
-
-## Desired State
- All "visual" or "integration" testing scripts that interact with the live GUI via `ApiHookClient` MUST use the `live_gui` pytest fixture and be executed via `pytest`.
- Any standalone scripts in `tests/` that manually spawn `subprocess.Popen` for `gui_2.py` must be rewritten as standard pytest functions taking the `live_gui` argument.
-
-## Technical Constraints
- No tests should manually spawn `gui_2.py`. They must rely on `conftest.py`.
- Keep testing framework unified strictly under `pytest`.
Author	SHA1	Message	Date
ed	034acb0e54	chore(conductor): Add new track 'Codebase Migration to src & Cleanup'	2026-03-02 22:28:56 -05:00
ed	6141a958d3	chore(conductor): Ensure plan complies with Surgical Spec Protocol	2026-03-02 22:22:52 -05:00
ed	9a2dff9d66	chore(conductor): Add model switch requirement to Phase 4	2026-03-02 22:19:52 -05:00
ed	96c51f22b3	chore(conductor): Add constraints against Mock-Rot to stabilization track	2026-03-02 22:18:42 -05:00
ed	e8479bf9ab	chore(conductor): Add gui_legacy.py deletion to test stabilization track	2026-03-02 22:16:40 -05:00
ed	6e71960976	chore(conductor): Update test stabilization track based on deep audit	2026-03-02 22:15:17 -05:00
ed	84239e6d47	chore(conductor): Add Test Suite Stabilization & Consolidation track	2026-03-02 22:09:36 -05:00