conductor(track): Initialize 'tech_debt_and_test_cleanup' and 'conductor_workflow_improvements' tracks

2026-03-02 12:14:57 -05:00
parent 821983065c
commit 95bf42aa37
9 changed files with 140 additions and 0 deletions
--- a/TASKS.md
+++ b/TASKS.md
@@ -24,3 +24,31 @@
 - No Focus Agent selector widget in Operations Hub
 **Scope:** Phase 1 (tier tagging) → Phase 2 (tool log dict migration) → Phase 3 (Focus Agent UI + filter). Per-tier token stats deferred to sub-track.
 ### `tech_debt_and_test_cleanup_20260302` (initialized)
 **Priority:** High
 **Depends on:** `feature_bleed_cleanup_20260302`
 **Track dir:** `conductor/tracks/tech_debt_and_test_cleanup_20260302/`
 **Audit-confirmed gaps:**
 - 13 test files duplicate the `app_instance` fixture instead of using `conftest.py`.
 - Duplicate test files (`test_ast_parser_curated.py`).
 - Multiple simulation tests silently pass with no assertions.
 - `gui_2.py` initializes 9 state variables in `__init__` that are never read.
 - `gui_2.py` has 17 uncalled HTTP/background methods.
 **Scope:** Phase 1 (Fixture deduplication) → Phase 2 (False-positive test fixing) → Phase 3 (Dead code excision in `gui_2.py`).
 ### `conductor_workflow_improvements_20260302` (initialized)
 **Priority:** High
 **Depends on:** None
 **Track dir:** `conductor/tracks/conductor_workflow_improvements_20260302/`
 **Audit-confirmed gaps:**
 - Tier 2 skill lacks enforcement of AST pre-implementation scans to prevent duplicate state variables.
 - Tier 2 skill lacks explicit rejection of non-TDD execution.
 - Tier 3 skill does not strictly forbid implementing code without failing tests.
 - `workflow.md` lacks explicit warnings against zero-assertion tests and redundant `__init__` state.
 **Scope:** Phase 1 (Update MMA Skill prompts) → Phase 2 (Update `workflow.md`).
--- a/conductor/tracks/conductor_workflow_improvements_20260302/index.md
+++ b/conductor/tracks/conductor_workflow_improvements_20260302/index.md
@@ -0,0 +1,5 @@
 # Track conductor_workflow_improvements_20260302 Context
 - [Specification](./spec.md)
 - [Implementation Plan](./plan.md)
 - [Metadata](./metadata.json)
--- a/conductor/tracks/conductor_workflow_improvements_20260302/metadata.json
+++ b/conductor/tracks/conductor_workflow_improvements_20260302/metadata.json
@@ -0,0 +1,8 @@
 {
  "track_id": "conductor_workflow_improvements_20260302",
  "type": "chore",
  "status": "new",
  "created_at": "2026-03-02T00:00:00Z",
  "updated_at": "2026-03-02T00:00:00Z",
  "description": "Improve MMA Skill prompts and Conductor workflow docs to enforce TDD, prevent feature bleed, and require pre-implementation architecture audits."
 }
--- a/conductor/tracks/conductor_workflow_improvements_20260302/plan.md
+++ b/conductor/tracks/conductor_workflow_improvements_20260302/plan.md
@@ -0,0 +1,17 @@
 # Implementation Plan: Conductor Workflow Improvements
 Architecture reference: [docs/guide_mma.md](../../../docs/guide_mma.md)
 ---
 ## Phase 1: Skill Document Hardening
 Focus: Update the agent skill prompts to enforce strict discipline.
 - [ ] Task 1.1: Update `.gemini/skills/mma-tier2-tech-lead/SKILL.md`. Add a new section `## Anti-Entropy Protocol` requiring the Tech Lead to: (1) Use `py_get_code_outline` on the target class's `__init__` to check for redundant state before adding new variables; (2) Ensure failing tests are written and executed *before* delegating implementation to Tier 3.
 - [ ] Task 1.2: Update `.gemini/skills/mma-tier3-worker/SKILL.md`. Add an explicit directive in the `## Responsibilities` section: "You MUST write a failing test and verify it fails (the Red phase) BEFORE writing any implementation code. Do NOT write tests that contain only `pass` or lack assertions."
 ## Phase 2: Workflow Documentation Updates
 Focus: Add safeguards to the global Conductor workflow.
 - [ ] Task 2.1: Update `conductor/workflow.md`. In the `High-Signal Research Phase` section, add a requirement to audit class initializers (`__init__`) for existing, unused, or duplicate state variables before adding new ones.
 - [ ] Task 2.2: Update `conductor/workflow.md`. In the `Test-Driven Development` section, explicitly ban zero-assertion tests and state that a test is only valid if it contains assertions that test the behavioral change.
--- a/conductor/tracks/conductor_workflow_improvements_20260302/spec.md
+++ b/conductor/tracks/conductor_workflow_improvements_20260302/spec.md
@@ -0,0 +1,19 @@
 # Track Specification: Conductor Workflow Improvements
 ## Overview
 Recent Tier 2 track implementations have resulted in feature bleed, redundant code, unread state variables, and degradation of TDD discipline (e.g., zero-assertion tests).
 This track updates the Conductor documentation (`workflow.md`) and the Gemini skills for Tiers 2 and 3 to hard-enforce TDD, prevent hallucinated "mock" implementations, and enforce strict codebase auditing before writing code.
 ## Current State Audit
 1. **Tier 2 Tech Lead Skill (`.gemini/skills/mma-tier2-tech-lead/SKILL.md`)**: Lacks explicit instructions forbidding the merging of code without verified failing test runs. Also lacks mandatory instructions to use `py_get_code_outline` or AST scans specifically to prevent duplicate state variables.
 2. **Tier 3 Worker Skill (`.gemini/skills/mma-tier3-worker/SKILL.md`)**: Mentions TDD, but does not explicitly instruct the agent to refuse to write implementation code if failing tests haven't been written and executed first.
 3. **Workflow Document (`conductor/workflow.md`)**: Mentions TDD and a Research-First Protocol, but lacks a strict "Zero-Assertion Prevention" rule and doesn't emphasize AST analysis of `__init__` functions when modifying state.
 ## Desired State
 - The `mma-tier2-tech-lead` skill forces the Tech Lead to execute tests and verify failure *before* delegating the implementation. It also mandates an explicit check of `__init__` for existing variables before adding new ones.
 - The `mma-tier3-worker` skill includes an explicit safeguard: "Do NOT write implementation code if you have not first written and executed a failing test for it."
 - The `conductor/workflow.md` explicitly calls out the danger of zero-assertion tests and requires AST checks for redundant state.
 ## Technical Constraints
 - The `.gemini/skills/` documents are the ultimate source of truth for agent behavior and must be updated directly.
 - The updates should be clear, commanding, and reference the specific errors encountered (e.g., "feature bleed", "zero-assertion tests").
--- a/conductor/tracks/tech_debt_and_test_cleanup_20260302/index.md
+++ b/conductor/tracks/tech_debt_and_test_cleanup_20260302/index.md
@@ -0,0 +1,5 @@
 # Track tech_debt_and_test_cleanup_20260302 Context
 - [Specification](./spec.md)
 - [Implementation Plan](./plan.md)
 - [Metadata](./metadata.json)
--- a/conductor/tracks/tech_debt_and_test_cleanup_20260302/metadata.json
+++ b/conductor/tracks/tech_debt_and_test_cleanup_20260302/metadata.json
@@ -0,0 +1,8 @@
 {
  "track_id": "tech_debt_and_test_cleanup_20260302",
  "type": "chore",
  "status": "new",
  "created_at": "2026-03-02T00:00:00Z",
  "updated_at": "2026-03-02T00:00:00Z",
  "description": "Tech debt cleanup: Centralize duplicate app_instance fixtures, fix zero-assertion tests, and remove dead unused variables/methods from gui_2.py."
 }
--- a/conductor/tracks/tech_debt_and_test_cleanup_20260302/plan.md
+++ b/conductor/tracks/tech_debt_and_test_cleanup_20260302/plan.md
@@ -0,0 +1,26 @@
 # Implementation Plan: Tech Debt & Test Discipline Cleanup
 Architecture reference: [docs/guide_architecture.md](../../../docs/guide_architecture.md)
 ---
 ## Phase 1: Test Suite Deduplication and Centralization
 Focus: Move `app_instance` and `mock_app` to `tests/conftest.py` and remove them from individual test files.
 - [ ] Task 1.1: Add `app_instance` and `mock_app` fixtures to `tests/conftest.py`. Ensure they properly yield the App instance and tear down.
 - [ ] Task 1.2: Remove local `app_instance` and `mock_app` fixtures from all 13 identified test files. (Tier 3 Worker string replacement / rewrite).
 - [ ] Task 1.3: Delete `tests/test_ast_parser_curated.py` if its contents are fully duplicated in `test_ast_parser.py`, or merge any missing tests.
 - [ ] Task 1.4: Run the test suite (`pytest`) to ensure no fixture resolution errors.
 ## Phase 2: False-Positive Test Exposure
 Focus: Make zero-assertion tests fail loudly so they can be properly tracked.
 - [ ] Task 2.1: Add `pytest.fail("TODO: Implement assertions")` to `test_workflow_sim.py`, `test_sim_ai_settings.py`, `test_sim_tools.py`, `test_api_events.py`, and any other tests identified as having zero assertions or only a `pass` body.
 - [ ] Task 2.2: Add `@pytest.mark.skip(reason="TODO: Implement assertions")` to the visual simulation tests that only have a `pass` block.
 ## Phase 3: Dead Code Excision in `gui_2.py`
 Focus: Remove unused state variables and dead HTTP/background methods.
 - [ ] Task 3.1: In `gui_2.py` `__init__`, remove the initialization of `_role`, `_ticket_id`, `_uid`, `_base_dir`, `last_md_path`, `_scroll_tool_calls_to_bottom`, `_token_budget_limit`, `_token_budget_pct`, `_token_budget_current`.
 - [ ] Task 3.2: Delete the following unused method definitions from `gui_2.py`: `do_fetch`, `do_post`, `fetch_stats`, `health`, `get_session`, `list_sessions`, `delete_session`, `status`, `get_context`, `_bg_task`, `_push_t1_usage`, `_load_fonts`, `run_prune`, `_parse_history_entries`, `confirm_action`, `pending_actions`, `token_stats`.
 - [ ] Task 3.3: Run `gui_2.py --headless` to verify the application still initializes properly without these variables/methods.
--- a/conductor/tracks/tech_debt_and_test_cleanup_20260302/spec.md
+++ b/conductor/tracks/tech_debt_and_test_cleanup_20260302/spec.md
@@ -0,0 +1,24 @@
 # Track Specification: Tech Debt & Test Discipline Cleanup
 ## Overview
 Due to rapid iterative development and feature bleed across multiple Tier 2-led tracks, significant tech debt has accumulated in both the testing suite and `gui_2.py`.
 This track will clean up test fixtures, enforce test assertion integrity, and remove dead code.
 ## Current State Audit
 1. **Duplicate Fixtures**: The `app_instance` fixture is duplicated across 13 test files (e.g. `test_gui_events.py`, `test_process_pending_gui_tasks.py`). `mock_app` is similarly duplicated. They should live in `tests/conftest.py`.
 2. **Duplicate Tests**: `test_ast_parser_get_curated_view` exists in both `test_ast_parser.py` and `test_ast_parser_curated.py`.
 3. **Zero-Assertion Tests**: Many simulation tests and API event tests (e.g., `test_setup_new_project`, `test_sim_ai_settings.py`, `visual_sim_gui_ux.py`) consist only of `pass` or execute commands without asserting anything, so they pass unconditionally and act as false positives for code coverage.
 4. **Dead State/Methods in gui_2.py**:
   - `__init__` in `gui_2.py` assigns state variables that are never read: `_role`, `_ticket_id`, `_uid`, `_base_dir`, `last_md_path`, `_scroll_tool_calls_to_bottom`, `_token_budget_limit`, `_token_budget_pct`, `_token_budget_current`.
   - `gui_2.py` has uncalled boilerplate methods (FastAPI leftovers or old logic): `do_fetch`, `do_post`, `fetch_stats`, `health`, `get_session`, `list_sessions`, `delete_session`, `status`, `get_context`, `_bg_task`, `_push_t1_usage`, `_load_fonts`, `run_prune`, `_parse_history_entries`, `confirm_action`, `pending_actions`, `token_stats`.
 ## Desired State
 - `app_instance` and `mock_app` fixtures centralized in `conftest.py`.
 - Duplicate test files/functions removed.
 - Tests without assertions marked with `pytest.fail("TODO: Implement assertions")` so they correctly show as incomplete.
 - Unused variables and methods completely removed from `gui_2.py`.
 ## Technical Constraints
 - The `app_instance` fixture requires the `live_gui` logic or an isolated `App` instance setup. It must not leak state between tests once it lives in `conftest.py`.
 - Removing the unused variables from `gui_2.py` must not break any reflection or serialization code, such as config savers, that might touch them indirectly (the AST audit confirmed only that they are not read locally).
 - Must adhere to 1-space indentation for `gui_2.py`.
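The audit of unread `__init__` state described above can be approximated with the standard `ast` module. This is a sketch under stated assumptions, not the tool used for the audit. The function name `unread_init_attributes` and the sample class are hypothetical, and the check only sees direct `self.<name>` reads inside the class body.

```python
import ast


def _is_self_attr(node: ast.AST, ctx_type: type) -> bool:
    """True for `self.<name>` used in the given context (Store or Load)."""
    return (
        isinstance(node, ast.Attribute)
        and isinstance(node.ctx, ctx_type)
        and isinstance(node.value, ast.Name)
        and node.value.id == "self"
    )


def unread_init_attributes(source: str, class_name: str) -> list[str]:
    """List attributes assigned in __init__ that the class never reads."""
    tree = ast.parse(source)
    cls = next(
        (n for n in ast.walk(tree)
         if isinstance(n, ast.ClassDef) and n.name == class_name),
        None,
    )
    if cls is None:
        return []

    assigned = set()
    for item in cls.body:
        if isinstance(item, ast.FunctionDef) and item.name == "__init__":
            for node in ast.walk(item):
                if _is_self_attr(node, ast.Store):
                    assigned.add(node.attr)

    # Any `self.<name>` in Load context anywhere in the class counts as a read.
    loaded = {n.attr for n in ast.walk(cls) if _is_self_attr(n, ast.Load)}
    return sorted(assigned - loaded)


# Illustrative input only; this is not the real gui_2.py.
SAMPLE = '''
class App:
    def __init__(self):
        self._role = None
        self.history = []
        self._token_budget_limit = 0

    def add(self, entry):
        self.history.append(entry)
'''

if __name__ == "__main__":
    print(unread_init_attributes(SAMPLE, "App"))
```

Reads through `getattr`, `vars(self)`, `__dict__`, or code in other modules are invisible to this scan, which is why the reflection and serialization constraint above still needs a manual check before deleting anything.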