conductor(track): Initialize 'tech_debt_and_test_cleanup' and 'conductor_workflow_improvements' tracks

2026-03-02 12:14:57 -05:00
parent 821983065c
commit 95bf42aa37
9 changed files with 140 additions and 0 deletions

@@ -0,0 +1,5 @@
# Track conductor_workflow_improvements_20260302 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)

@@ -0,0 +1,8 @@
{
"track_id": "conductor_workflow_improvements_20260302",
"type": "chore",
"status": "new",
"created_at": "2026-03-02T00:00:00Z",
"updated_at": "2026-03-02T00:00:00Z",
"description": "Improve MMA Skill prompts and Conductor workflow docs to enforce TDD, prevent feature bleed, and force mandatory pre-implementation architecture audits."
}

@@ -0,0 +1,17 @@
# Implementation Plan: Conductor Workflow Improvements
Architecture reference: [docs/guide_mma.md](../../../docs/guide_mma.md)
---
## Phase 1: Skill Document Hardening
Focus: Update the agent skill prompts to enforce strict discipline.
- [ ] Task 1.1: Update `.gemini/skills/mma-tier2-tech-lead/SKILL.md`. Add a new section `## Anti-Entropy Protocol` requiring the Tech Lead to: (1) Use `py_get_code_outline` on the target class's `__init__` to check for redundant state before adding new variables; (2) Ensure failing tests are written and executed *before* delegating implementation to Tier 3.
- [ ] Task 1.2: Update `.gemini/skills/mma-tier3-worker/SKILL.md`. Add an explicit directive in the `## Responsibilities` section: "You MUST write a failing test and verify it fails (the Red phase) BEFORE writing any implementation code. Do NOT write tests that contain only `pass` or lack assertions."
## Phase 2: Workflow Documentation Updates
Focus: Add safeguards to the global Conductor workflow.
- [ ] Task 2.1: Update `conductor/workflow.md`. In the `High-Signal Research Phase` section, add a requirement to audit class initializers (`__init__`) for existing, unused, or duplicate state variables before adding new ones.
- [ ] Task 2.2: Update `conductor/workflow.md`. In the `Test-Driven Development` section, explicitly ban zero-assertion tests and state that a test is only valid if it contains assertions that test the behavioral change.

@@ -0,0 +1,19 @@
# Track Specification: Conductor Workflow Improvements
## Overview
Recent Tier 2 track implementations have resulted in feature bleed, redundant code, unread state variables, and degradation of TDD discipline (e.g., zero-assertion tests).
This track updates the Conductor documentation (`workflow.md`) and the Gemini skills for Tiers 2 and 3 to hard-enforce TDD, prevent hallucinated "mock" implementations, and enforce strict codebase auditing before writing code.
## Current State Audit
1. **Tier 2 Tech Lead Skill (`.gemini/skills/mma-tier2-tech-lead/SKILL.md`)**: Lacks explicit instructions forbidding the merging of code without verified failing test runs. Also lacks mandatory instructions to use `py_get_code_outline` or AST scans specifically to prevent duplicate state variables.
2. **Tier 3 Worker Skill (`.gemini/skills/mma-tier3-worker/SKILL.md`)**: Mentions TDD, but does not explicitly instruct the agent to refuse to write implementation code if failing tests haven't been written and executed first.
3. **Workflow Document (`conductor/workflow.md`)**: Mentions TDD and a Research-First Protocol, but lacks a strict "Zero-Assertion Prevention" rule and doesn't emphasize AST analysis of `__init__` methods when modifying state.
## Desired State
- The `mma-tier2-tech-lead` skill forces the Tech Lead to execute tests and verify failure *before* delegating the implementation. It also mandates an explicit check of `__init__` for existing variables before adding new ones.
- The `mma-tier3-worker` skill includes an explicit safeguard: "Do NOT write implementation code if you have not first written and executed a failing test for it."
- The `conductor/workflow.md` explicitly calls out the danger of zero-assertion tests and requires AST checks for redundant state.
## Technical Constraints
- The `.gemini/skills/` documents are the ultimate source of truth for agent behavior and must be updated directly.
- The updates should be clear, commanding, and reference the specific errors encountered (e.g., "feature bleed", "zero-assertion tests").
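As an illustration of the "AST checks for redundant state" mentioned above, here is a minimal sketch using only the standard-library `ast` module. It is not the `py_get_code_outline` tool. The function name `unread_init_attributes` and the sample class are hypothetical, and the scan only sees `self.<attr>` accesses inside the class body itself.

```python
import ast


def unread_init_attributes(source: str, class_name: str) -> list[str]:
    """List self.<attr> names assigned in __init__ but never read in the class."""
    tree = ast.parse(source)
    cls = next(
        n for n in ast.walk(tree)
        if isinstance(n, ast.ClassDef) and n.name == class_name
    )
    assigned: list[str] = []
    read: set[str] = set()
    for method in cls.body:
        if not isinstance(method, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for node in ast.walk(method):
            is_self_attr = (
                isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id == "self"
            )
            if not is_self_attr:
                continue
            if isinstance(node.ctx, ast.Store):
                # Only assignments made in __init__ are candidates.
                if method.name == "__init__" and node.attr not in assigned:
                    assigned.append(node.attr)
            else:
                # Load (or Del) context: the attribute is used somewhere.
                read.add(node.attr)
    return [name for name in assigned if name not in read]


SAMPLE = '''
class App:
    def __init__(self):
        self._role = None
        self.title = "demo"
        self._uid = 0

    def render(self):
        return self.title
'''

print(unread_init_attributes(SAMPLE, "App"))  # ['_role', '_uid']
```

Reads through `getattr`, from other modules, or from subclasses are invisible to this scan, so its output is a list of candidates to review rather than proof that a variable is dead.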

@@ -0,0 +1,5 @@
# Track tech_debt_and_test_cleanup_20260302 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)

@@ -0,0 +1,8 @@
{
"track_id": "tech_debt_and_test_cleanup_20260302",
"type": "chore",
"status": "new",
"created_at": "2026-03-02T00:00:00Z",
"updated_at": "2026-03-02T00:00:00Z",
"description": "Tech debt cleanup: Centralize duplicate app_instance fixtures, fix zero-assertion tests, and remove dead unused variables/methods from gui_2.py."
}

@@ -0,0 +1,26 @@
# Implementation Plan: Tech Debt & Test Discipline Cleanup
Architecture reference: [docs/guide_architecture.md](../../../docs/guide_architecture.md)
---
## Phase 1: Test Suite Deduplication and Centralization
Focus: Move `app_instance` and `mock_app` to `tests/conftest.py` and remove them from individual test files.
- [ ] Task 1.1: Add `app_instance` and `mock_app` fixtures to `tests/conftest.py`. Ensure they properly yield the App instance and tear down.
- [ ] Task 1.2: Remove local `app_instance` and `mock_app` fixtures from all 13 identified test files. (Tier 3 Worker string replacement / rewrite).
- [ ] Task 1.3: Delete `tests/test_ast_parser_curated.py` if its contents are fully duplicated in `test_ast_parser.py`, or merge any missing tests.
- [ ] Task 1.4: Run the test suite (`pytest`) to ensure no fixture resolution errors.
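A centralized `tests/conftest.py` along the lines of Task 1.1 might look like the sketch below. The `App` class here is a hypothetical stand-in, as are its `running`, `pending_tasks`, and `shutdown` members. The real fixtures would import the actual application class and use whatever teardown it provides.

```python
from unittest.mock import MagicMock

import pytest


class App:
    """Hypothetical stand-in for the real application class."""

    def __init__(self) -> None:
        self.running = True
        self.pending_tasks: list[str] = []

    def shutdown(self) -> None:
        self.pending_tasks.clear()
        self.running = False


@pytest.fixture
def app_instance():
    """Yield a fresh App per test, then tear it down so no state leaks."""
    app = App()
    yield app
    # Code after the yield runs as teardown once the test has finished.
    app.shutdown()


@pytest.fixture
def mock_app():
    """A mock restricted to the attributes defined on the App class."""
    return MagicMock(spec=App)


def test_app_starts_running(app_instance):
    assert app_instance.running
```

Because both fixtures use the default function scope, each test gets its own instance. Widening the scope to `module` or `session` would reintroduce the shared-state risk.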
## Phase 2: False-Positive Test Exposure
Focus: Make zero-assertion tests fail loudly so they can be properly tracked.
- [ ] Task 2.1: Add `pytest.fail("TODO: Implement assertions")` to `test_workflow_sim.py`, `test_sim_ai_settings.py`, `test_sim_tools.py`, `test_api_events.py` and any other tests identified as having zero assertions or just a `pass`.
- [ ] Task 2.2: Add `@pytest.mark.skip(reason="TODO: Implement assertions")` to the visual simulation tests that only have a `pass` block.
## Phase 3: Dead Code Excision in `gui_2.py`
Focus: Remove unused state variables and dead HTTP/background methods.
- [ ] Task 3.1: In `gui_2.py` `__init__`, remove the initialization of `_role`, `_ticket_id`, `_uid`, `_base_dir`, `last_md_path`, `_scroll_tool_calls_to_bottom`, `_token_budget_limit`, `_token_budget_pct`, `_token_budget_current`.
- [ ] Task 3.2: Delete the following unused method definitions from `gui_2.py`: `do_fetch`, `do_post`, `fetch_stats`, `health`, `get_session`, `list_sessions`, `delete_session`, `status`, `get_context`, `_bg_task`, `_push_t1_usage`, `_load_fonts`, `run_prune`, `_parse_history_entries`, `confirm_action`, `pending_actions`, `token_stats`.
- [ ] Task 3.3: Run `gui_2.py --headless` to verify the application still initializes properly without these variables/methods.

@@ -0,0 +1,24 @@
# Track Specification: Tech Debt & Test Discipline Cleanup
## Overview
Due to rapid iterative development and feature bleed across multiple Tier 2-led tracks, significant tech debt has accumulated in both the testing suite and `gui_2.py`.
This track will clean up test fixtures, enforce test assertion integrity, and remove dead code from `gui_2.py`.
## Current State Audit
1. **Duplicate Fixtures**: The `app_instance` fixture is duplicated across 13 test files (e.g. `test_gui_events.py`, `test_process_pending_gui_tasks.py`). `mock_app` is similarly duplicated. They should live in `tests/conftest.py`.
2. **Duplicate Tests**: `test_ast_parser_get_curated_view` exists in both `test_ast_parser.py` and `test_ast_parser_curated.py`.
3. **Zero-Assertion Tests**: Many simulation tests and API event tests (e.g., `test_setup_new_project`, `test_sim_ai_settings.py`, `visual_sim_gui_ux.py`) contain only `pass` or execute commands without asserting on the result, so they always pass and inflate code coverage without verifying any behavior.
4. **Dead State/Methods in gui_2.py**:
- The `__init__` method in `gui_2.py` assigns state variables that are never read: `_role`, `_ticket_id`, `_uid`, `_base_dir`, `last_md_path`, `_scroll_tool_calls_to_bottom`, `_token_budget_limit`, `_token_budget_pct`, `_token_budget_current`.
- `gui_2.py` has uncalled boilerplate methods (FastAPI leftovers or old logic): `do_fetch`, `do_post`, `fetch_stats`, `health`, `get_session`, `list_sessions`, `delete_session`, `status`, `get_context`, `_bg_task`, `_push_t1_usage`, `_load_fonts`, `run_prune`, `_parse_history_entries`, `confirm_action`, `pending_actions`, `token_stats`.
## Desired State
- `app_instance` and `mock_app` fixtures centralized in `conftest.py`.
- Duplicate test files/functions removed.
- Tests without assertions are marked with `pytest.fail("TODO: Implement assertions")` so they correctly show as incomplete.
- Unused variables and methods completely removed from `gui_2.py`.
## Technical Constraints
- The `app_instance` fixture requires the `live_gui` logic or an isolated `App` instance setup. It must not leak state between tests once it lives in `conftest.py`.
- Ensure that removing unused variables from `gui_2.py` does not break any reflection/serialization, in case they are coincidentally used by config savers (though an AST scan confirmed they are not read locally).
- Must adhere to 1-space indentation for `gui_2.py`.
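The reflection/serialization concern above can be partly checked by looking for the variable names as string literals, which is how `getattr` or a config saver would typically refer to them. The sketch below is one possible approach under that assumption. The helper `string_references` and the sample `save` function are hypothetical.

```python
import ast


def string_references(source: str, names: set[str]) -> dict[str, list[int]]:
    """Map each name to the line numbers where it appears as a string literal."""
    hits: dict[str, list[int]] = {name: [] for name in sorted(names)}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if node.value in hits:
                hits[node.value].append(node.lineno)
    return hits


SAMPLE = '''
def save(app):
    return {key: getattr(app, key) for key in ("last_md_path", "title")}
'''

print(string_references(SAMPLE, {"_role", "last_md_path"}))
# {'_role': [], 'last_md_path': [3]}
```

A non-empty list is a reason to inspect that line before deleting the variable. Names built dynamically, for example by string concatenation, would not be found this way.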