From 95bf42aa374d66e0d13f58603e039e5983462063 Mon Sep 17 00:00:00 2001
From: Ed_
Date: Mon, 2 Mar 2026 12:14:57 -0500
Subject: [PATCH] conductor(track): Initialize 'tech_debt_and_test_cleanup' and 'conductor_workflow_improvements' tracks

---
 TASKS.md          | 28 +++++++++++++++++++
 .../index.md      |  5 ++++
 .../metadata.json |  8 ++++++
 .../plan.md       | 17 +++++++++++
 .../spec.md       | 19 +++++++++++++
 .../index.md      |  5 ++++
 .../metadata.json |  8 ++++++
 .../plan.md       | 26 +++++++++++++++++
 .../spec.md       | 24 ++++++++++++++++
 9 files changed, 140 insertions(+)
 create mode 100644 conductor/tracks/conductor_workflow_improvements_20260302/index.md
 create mode 100644 conductor/tracks/conductor_workflow_improvements_20260302/metadata.json
 create mode 100644 conductor/tracks/conductor_workflow_improvements_20260302/plan.md
 create mode 100644 conductor/tracks/conductor_workflow_improvements_20260302/spec.md
 create mode 100644 conductor/tracks/tech_debt_and_test_cleanup_20260302/index.md
 create mode 100644 conductor/tracks/tech_debt_and_test_cleanup_20260302/metadata.json
 create mode 100644 conductor/tracks/tech_debt_and_test_cleanup_20260302/plan.md
 create mode 100644 conductor/tracks/tech_debt_and_test_cleanup_20260302/spec.md

diff --git a/TASKS.md b/TASKS.md
index 6a9ecaf..b00684f 100644
--- a/TASKS.md
+++ b/TASKS.md
@@ -24,3 +24,31 @@
 - No Focus Agent selector widget in Operations Hub
 
 **Scope:** Phase 1 (tier tagging) → Phase 2 (tool log dict migration) → Phase 3 (Focus Agent UI + filter). Per-tier token stats deferred to sub-track.
+
+### `tech_debt_and_test_cleanup_20260302` (initialized)
+**Priority:** High
+**Depends on:** `feature_bleed_cleanup_20260302`
+**Track dir:** `conductor/tracks/tech_debt_and_test_cleanup_20260302/`
+
+**Audit-confirmed gaps:**
+- 13 test files duplicate `app_instance` fixture instead of using `conftest.py`.
+- Duplicate test files (`test_ast_parser_curated.py`).
+- Multiple simulation tests silently pass with no assertions.
+- `gui_2.py` initializes 9 state variables in `__init__` that are never read.
+- `gui_2.py` has over 15 uncalled HTTP/background methods.
+
+**Scope:** Phase 1 (Fixture deduplication) → Phase 2 (False-positive test fixing) → Phase 3 (Dead code excision in `gui_2.py`).
+
+### `conductor_workflow_improvements_20260302` (initialized)
+**Priority:** High
+**Depends on:** None
+**Track dir:** `conductor/tracks/conductor_workflow_improvements_20260302/`
+
+**Audit-confirmed gaps:**
+- Tier 2 skill lacks enforcement of AST pre-implementation scans to prevent duplicate state variables.
+- Tier 2 skill lacks explicit rejection of non-TDD execution.
+- Tier 3 skill does not strictly forbid implementing code without failing tests.
+- `workflow.md` lacks explicit warnings against zero-assertion tests and redundant `__init__` state.
+
+**Scope:** Phase 1 (Update MMA Skill prompts) → Phase 2 (Update `workflow.md`).
+
diff --git a/conductor/tracks/conductor_workflow_improvements_20260302/index.md b/conductor/tracks/conductor_workflow_improvements_20260302/index.md
new file mode 100644
index 0000000..ad6e45e
--- /dev/null
+++ b/conductor/tracks/conductor_workflow_improvements_20260302/index.md
@@ -0,0 +1,5 @@
+# Track conductor_workflow_improvements_20260302 Context
+
+- [Specification](./spec.md)
+- [Implementation Plan](./plan.md)
+- [Metadata](./metadata.json)
\ No newline at end of file
diff --git a/conductor/tracks/conductor_workflow_improvements_20260302/metadata.json b/conductor/tracks/conductor_workflow_improvements_20260302/metadata.json
new file mode 100644
index 0000000..5e21c81
--- /dev/null
+++ b/conductor/tracks/conductor_workflow_improvements_20260302/metadata.json
@@ -0,0 +1,8 @@
+{
+ "track_id": "conductor_workflow_improvements_20260302",
+ "type": "chore",
+ "status": "new",
+ "created_at": "2026-03-02T00:00:00Z",
+ "updated_at": "2026-03-02T00:00:00Z",
+ "description": "Improve MMA Skill prompts and Conductor workflow docs to enforce TDD, prevent feature bleed, and force mandatory pre-implementation architecture audits."
+}
\ No newline at end of file
diff --git a/conductor/tracks/conductor_workflow_improvements_20260302/plan.md b/conductor/tracks/conductor_workflow_improvements_20260302/plan.md
new file mode 100644
index 0000000..ac63e82
--- /dev/null
+++ b/conductor/tracks/conductor_workflow_improvements_20260302/plan.md
@@ -0,0 +1,17 @@
+# Implementation Plan: Conductor Workflow Improvements
+
+Architecture reference: [docs/guide_mma.md](../../../docs/guide_mma.md)
+
+---
+
+## Phase 1: Skill Document Hardening
+Focus: Update the agent skill prompts to enforce strict discipline.
+
+- [ ] Task 1.1: Update `.gemini/skills/mma-tier2-tech-lead/SKILL.md`. Add a new section `## Anti-Entropy Protocol` requiring the Tech Lead to: (1) Use `py_get_code_outline` on the target class's `__init__` to check for redundant state before adding new variables; (2) Ensure failing tests are written and executed *before* delegating implementation to Tier 3.
+- [ ] Task 1.2: Update `.gemini/skills/mma-tier3-worker/SKILL.md`. Add an explicit directive in the `## Responsibilities` section: "You MUST write a failing test and verify it fails (the Red phase) BEFORE writing any implementation code. Do NOT write tests that contain only `pass` or lack assertions."
+
+## Phase 2: Workflow Documentation Updates
+Focus: Add safeguards to the global Conductor workflow.
+
+- [ ] Task 2.1: Update `conductor/workflow.md`. In the `High-Signal Research Phase` section, add a requirement to audit class initializers (`__init__`) for existing, unused, or duplicate state variables before adding new ones.
+- [ ] Task 2.2: Update `conductor/workflow.md`. In the `Test-Driven Development` section, explicitly ban zero-assertion tests and state that a test is only valid if it contains assertions that test the behavioral change.
\ No newline at end of file
diff --git a/conductor/tracks/conductor_workflow_improvements_20260302/spec.md b/conductor/tracks/conductor_workflow_improvements_20260302/spec.md
new file mode 100644
index 0000000..4fd566d
--- /dev/null
+++ b/conductor/tracks/conductor_workflow_improvements_20260302/spec.md
@@ -0,0 +1,19 @@
+# Track Specification: Conductor Workflow Improvements
+
+## Overview
+Recent Tier 2 track implementations have resulted in feature bleed, redundant code, unread state variables, and degradation of TDD discipline (e.g., zero-assertion tests).
+This track updates the Conductor documentation (`workflow.md`) and the Gemini skills for Tiers 2 and 3 to hard-enforce TDD, prevent hallucinated "mock" implementations, and enforce strict codebase auditing before writing code.
+
+## Current State Audit
+1. **Tier 2 Tech Lead Skill (`.gemini/skills/mma-tier2-tech-lead/SKILL.md`)**: Lacks explicit instructions forbidding the merging of code without verified failing test runs. Also lacks mandatory instructions to use `py_get_code_outline` or AST scans specifically to prevent duplicate state variables.
+2. **Tier 3 Worker Skill (`.gemini/skills/mma-tier3-worker/SKILL.md`)**: Mentions TDD, but does not explicitly instruct the agent to refuse to write implementation code if failing tests haven't been written and executed first.
+3. **Workflow Document (`conductor/workflow.md`)**: Mentions TDD and a Research-First Protocol, but lacks a strict "Zero-Assertion Prevention" rule and doesn't emphasize AST analysis of `__init__` functions when modifying state.
+
+## Desired State
+- The `mma-tier2-tech-lead` skill forces the Tech Lead to execute tests and verify failure *before* delegating the implementation. It also mandates an explicit check of `__init__` for existing variables before adding new ones.
+- The `mma-tier3-worker` skill includes an explicit safeguard: "Do NOT write implementation code if you have not first written and executed a failing test for it."
+- The `conductor/workflow.md` explicitly calls out the danger of zero-assertion tests and requires AST checks for redundant state.
+
+## Technical Constraints
+- The `.gemini/skills/` documents are the ultimate source of truth for agent behavior and must be updated directly.
+- The updates should be clear, commanding, and reference the specific errors encountered (e.g., "feature bleed", "zero-assertion tests").
\ No newline at end of file
diff --git a/conductor/tracks/tech_debt_and_test_cleanup_20260302/index.md b/conductor/tracks/tech_debt_and_test_cleanup_20260302/index.md
new file mode 100644
index 0000000..23ae00b
--- /dev/null
+++ b/conductor/tracks/tech_debt_and_test_cleanup_20260302/index.md
@@ -0,0 +1,5 @@
+# Track tech_debt_and_test_cleanup_20260302 Context
+
+- [Specification](./spec.md)
+- [Implementation Plan](./plan.md)
+- [Metadata](./metadata.json)
\ No newline at end of file
diff --git a/conductor/tracks/tech_debt_and_test_cleanup_20260302/metadata.json b/conductor/tracks/tech_debt_and_test_cleanup_20260302/metadata.json
new file mode 100644
index 0000000..776a8f2
--- /dev/null
+++ b/conductor/tracks/tech_debt_and_test_cleanup_20260302/metadata.json
@@ -0,0 +1,8 @@
+{
+ "track_id": "tech_debt_and_test_cleanup_20260302",
+ "type": "chore",
+ "status": "new",
+ "created_at": "2026-03-02T00:00:00Z",
+ "updated_at": "2026-03-02T00:00:00Z",
+ "description": "Tech debt cleanup: Centralize duplicate app_instance fixtures, fix zero-assertion tests, and remove dead unused variables/methods from gui_2.py."
+}
\ No newline at end of file
diff --git a/conductor/tracks/tech_debt_and_test_cleanup_20260302/plan.md b/conductor/tracks/tech_debt_and_test_cleanup_20260302/plan.md
new file mode 100644
index 0000000..64c0752
--- /dev/null
+++ b/conductor/tracks/tech_debt_and_test_cleanup_20260302/plan.md
@@ -0,0 +1,26 @@
+# Implementation Plan: Tech Debt & Test Discipline Cleanup
+
+Architecture reference: [docs/guide_architecture.md](../../../docs/guide_architecture.md)
+
+---
+
+## Phase 1: Test Suite Deduplication and Centralization
+Focus: Move `app_instance` and `mock_app` to `tests/conftest.py` and remove them from individual test files.
+
+- [ ] Task 1.1: Add `app_instance` and `mock_app` fixtures to `tests/conftest.py`. Ensure they properly yield the App instance and tear down.
+- [ ] Task 1.2: Remove local `app_instance` and `mock_app` fixtures from all 13 identified test files. (Tier 3 Worker string replacement / rewrite).
+- [ ] Task 1.3: Delete `tests/test_ast_parser_curated.py` if its contents are fully duplicated in `test_ast_parser.py`, or merge any missing tests.
+- [ ] Task 1.4: Run the test suite (`pytest`) to ensure no fixture resolution errors.
+
+## Phase 2: False-Positive Test Exposure
+Focus: Make zero-assertion tests fail loudly so they can be properly tracked.
+
+- [ ] Task 2.1: Add `pytest.fail("TODO: Implement assertions")` to `test_workflow_sim.py`, `test_sim_ai_settings.py`, `test_sim_tools.py`, `test_api_events.py` and any other tests identified as having zero assertions or just a `pass`.
+- [ ] Task 2.2: Add `@pytest.mark.skip(reason="TODO: Implement assertions")` to the visual simulation tests that only have a `pass` block.
+
+## Phase 3: Dead Code Excision in `gui_2.py`
+Focus: Remove unused state variables and dead HTTP/background methods.
+
+- [ ] Task 3.1: In `gui_2.py` `__init__`, remove the initialization of `_role`, `_ticket_id`, `_uid`, `_base_dir`, `last_md_path`, `_scroll_tool_calls_to_bottom`, `_token_budget_limit`, `_token_budget_pct`, `_token_budget_current`.
+- [ ] Task 3.2: Delete the following unused method definitions from `gui_2.py`: `do_fetch`, `do_post`, `fetch_stats`, `health`, `get_session`, `list_sessions`, `delete_session`, `status`, `get_context`, `_bg_task`, `_push_t1_usage`, `_load_fonts`, `run_prune`, `_parse_history_entries`, `confirm_action`, `pending_actions`, `token_stats`.
+- [ ] Task 3.3: Run `gui_2.py --headless` to verify the application still initializes properly without these variables/methods.
\ No newline at end of file
diff --git a/conductor/tracks/tech_debt_and_test_cleanup_20260302/spec.md b/conductor/tracks/tech_debt_and_test_cleanup_20260302/spec.md
new file mode 100644
index 0000000..1748ee3
--- /dev/null
+++ b/conductor/tracks/tech_debt_and_test_cleanup_20260302/spec.md
@@ -0,0 +1,24 @@
+# Track Specification: Tech Debt & Test Discipline Cleanup
+
+## Overview
+Due to rapid iterative development and feature bleed across multiple Tier 2-led tracks, significant tech debt has accumulated in both the testing suite and `gui_2.py`.
+This track will clean up test fixtures, enforce test assertion integrity, and remove dead codebase remnants.
+
+## Current State Audit
+1. **Duplicate Fixtures**: The `app_instance` fixture is duplicated across 13 test files (e.g. `test_gui_events.py`, `test_process_pending_gui_tasks.py`). `mock_app` is similarly duplicated. They should live in `tests/conftest.py`.
+2. **Duplicate Tests**: `test_ast_parser_get_curated_view` exists in both `test_ast_parser.py` and `test_ast_parser_curated.py`.
+3. **Zero-Assertion Tests**: Many simulation tests and API event tests (e.g., `test_setup_new_project`, `test_sim_ai_settings.py`, `visual_sim_gui_ux.py`) merely run `pass` or execute commands without assertions, acting as a false positive for code coverage.
+4. **Dead State/Methods in gui_2.py**:
+   - `gui_2.py.__init__` assigns state variables never read: `_role`, `_ticket_id`, `_uid`, `_base_dir`, `last_md_path`, `_scroll_tool_calls_to_bottom`, `_token_budget_limit`, `_token_budget_pct`, `_token_budget_current`.
+   - `gui_2.py` has uncalled boilerplate methods (FastAPI leftovers or old logic): `do_fetch`, `do_post`, `fetch_stats`, `health`, `get_session`, `list_sessions`, `delete_session`, `status`, `get_context`, `_bg_task`, `_push_t1_usage`, `_load_fonts`, `run_prune`, `_parse_history_entries`, `confirm_action`, `pending_actions`, `token_stats`.
+
+## Desired State
+- `app_instance` and `mock_app` fixtures centralized in `conftest.py`.
+- Duplicate test files/functions removed.
+- Tests without assertions marked with `pytest.fail("TODO: Add assertions")` so they correctly show as incomplete.
+- Unused variables and methods completely removed from `gui_2.py`.
+
+## Technical Constraints
+- The `app_instance` fixture requires the `live_gui` logic or an isolated `App` instance setup. Must ensure it does not leak state when placed in `conftest.py`.
+- Ensure removal of unused variables in `gui_2.py` does not break any reflection/serialization if they are coincidentally used by config savers (though AST confirmed they are not read locally).
+- Must adhere to 1-space indentation for `gui_2.py`.
\ No newline at end of file
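
The two audits this patch relies on (state assigned in `__init__` but never read, and tests with no assertions) could be approximated with Python's standard `ast` module. The sketch below is illustrative only and is not the tooling used for the audit: `unread_init_attrs`, `zero_assertion_tests`, and both sample sources are hypothetical, and the scan only sees direct `self.<name>` reads, so reads through `getattr` or serialization would be missed.

```python
import ast

# Hypothetical sample sources, written with 1-space indentation.
SAMPLE_CLASS = '''
class App:
 def __init__(self):
  self._role = None
  self.title = "x"
 def render(self):
  return self.title
'''

SAMPLE_TESTS = '''
def test_empty():
 pass
def test_real():
 assert 1 + 1 == 2
def test_todo():
 pytest.fail("TODO: Implement assertions")
def helper():
 pass
'''


def _is_self_attr(node: ast.AST, ctx_type: type) -> bool:
    """True for `self.<attr>` nodes used in the given context (Store or Load)."""
    return (
        isinstance(node, ast.Attribute)
        and isinstance(node.ctx, ctx_type)
        and isinstance(node.value, ast.Name)
        and node.value.id == "self"
    )


def unread_init_attrs(source: str, class_name: str) -> list[str]:
    """Return attributes assigned on self in __init__ but never read in the class."""
    tree = ast.parse(source)
    classes = [
        n for n in ast.walk(tree)
        if isinstance(n, ast.ClassDef) and n.name == class_name
    ]
    if not classes:
        raise ValueError(f"class {class_name!r} not found")
    cls = classes[0]

    assigned = set()
    for item in cls.body:
        if isinstance(item, ast.FunctionDef) and item.name == "__init__":
            assigned |= {n.attr for n in ast.walk(item) if _is_self_attr(n, ast.Store)}

    # Reads anywhere in the class body count, including inside __init__ itself.
    read = {n.attr for n in ast.walk(cls) if _is_self_attr(n, ast.Load)}
    return sorted(assigned - read)


def _has_check(func: ast.AST) -> bool:
    """True if the function has an assert statement or a fail/raises/assert* call."""
    for node in ast.walk(func):
        if isinstance(node, ast.Assert):
            return True
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            name = node.func.attr
            if name in {"fail", "raises"} or name.startswith("assert"):
                return True
    return False


def zero_assertion_tests(source: str) -> list[str]:
    """Return names of test_* functions that contain no check at all."""
    tree = ast.parse(source)
    return sorted(
        n.name
        for n in ast.walk(tree)
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
        and n.name.startswith("test_")
        and not _has_check(n)
    )
```

On the samples, `unread_init_attrs(SAMPLE_CLASS, "App")` reports `_role` and `zero_assertion_tests(SAMPLE_TESTS)` reports `test_empty`. A test that calls `pytest.fail(...)` is deliberately not flagged, since Phase 2 of the cleanup plan uses that call to mark incomplete tests.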