chore(conductor): Add 6 new tracks to the strict execution order queue

2026-03-02 22:34:25 -05:00
parent 034acb0e54
commit 51939c430a
26 changed files with 262 additions and 20 deletions

@@ -92,31 +92,39 @@ To ensure smooth execution, execute the tracks in the following order:
---
## Future Backlog (Post-Cleanup)
*To be evaluated in a future Tier 1 session after the immediate tech debt queue is cleared.*
## Planned: Upcoming Tracks
*The following tracks have been initialized and ordered for execution.*
### `gui_decoupling_controller`
**Context:** `gui_2.py` is over 3,500 lines and acts as a monolithic God Object. It violates the "Data-Oriented & Immediate Mode" heuristics by owning complex business logic, orchestrator hooks (`_bg_create_track`), and markdown file building instead of acting as a pure view.
**Goal:** Create a headless `orchestrator_pm.py` or `app_controller.py` that handles the core lifecycle, allowing `gui_2.py` to be a lagless, immediate-mode projection of the state.
### 1. `test_stabilization_20260302` (Active/Next)
**Priority:** High
**Goal:** Stabilize `asyncio` errors, ban mock-rot, and consolidate testing paradigms.
### `robust_json_parsing_tech_lead`
**Context:** In `conductor_tech_lead.py`, the `generate_tickets` function relies on a generic `try...except` block to parse the LLM's JSON ticket array. If the model hallucinates or outputs invalid JSON, it silently returns an empty array `[]`, causing the GUI to fail the track creation process without giving the model a chance to self-correct.
**Goal:** Implement a programmatic retry loop that catches `JSONDecodeError` and feeds the error back to the Tier 2 model for self-correction before failing the UI operation.
### 2. `strict_static_analysis_and_typing_20260302`
**Priority:** High
**Goal:** Resolve 512+ mypy errors and remaining ruff violations to secure the foundation before refactoring. Add pre-commit hooks.
### `strict_static_analysis_and_typing`
**Context:** Running `uv run ruff check .` and `uv run mypy --explicit-package-bases .` revealed substantial type-safety debt (512+ mypy errors across 64 files, 200+ remaining ruff violations). `gui_2.py` and `api_hook_client.py` in particular leak `Any` widely and declare incorrect unions.
**Goal:** Resolve all static analysis errors. Enforce strict `mypy` compliance, remove implicit `Optional` types, and fix ambiguous variables (`l`). Integrate `ruff` and `mypy` into a CI pre-commit hook so Tier 3 workers are forced to write type-safe code going forward.
### 3. `codebase_migration_20260302`
**Priority:** High
**Goal:** Restructure directories to a `src/` layout. Doing this after static analysis ensures no hidden import bugs are introduced.
### `hook_api_ui_state_verification`
**Context:** Manual verification of UI widget state is difficult and unreliable. The `live_gui` fixture and `ApiHookClient` exist, but new widget state vars (e.g. `ui_focus_agent`) are not wired to `_settable_fields` or GET endpoints. Future tracks must add state to `_settable_fields` and write `live_gui`-based tests instead of relying on user confirmation.
**Goal:** Add `ui_focus_agent` (and a standard pattern for future widgets) to `_settable_fields`; add a `/api/gui/state` GET endpoint returning key UI vars; write `live_gui` integration test for Focus Agent filter.
### 4. `gui_decoupling_controller_20260302`
**Priority:** High
**Goal:** Extract the state machine and core lifecycle into a headless `app_controller.py`, leaving `gui_2.py` as a pure, immediate-mode view.
### `concurrent_tier_source_tier`
**Context:** `ai_client.current_tier` is a module-level `str | None`. This is safe today because the MMA engine serializes `send()` calls. Once concurrent Tier 3/4 agents run in parallel (multiple tickets processed simultaneously), the shared value will produce incorrect tier tags.
**Goal:** Replace with `threading.local()` storage or pass `source_tier` explicitly through the `send()` call signature so each concurrent agent self-identifies without sharing module state.
### 5. `hook_api_ui_state_verification_20260302`
**Priority:** Medium
**Goal:** Add a `/api/gui/state` GET endpoint. Wire UI state into `_settable_fields` to enable programmatic `live_gui` testing without user confirmation.
### `test_suite_performance_and_flakiness`
**Context:** Running `uv run pytest` takes over 5 minutes and frequently hangs on integration tests (e.g. `test_spawn_interception.py`). Several simulation tests (`test_sim_ai_settings.py`, `test_extended_sims.py`) are also currently failing or timing out.
**Goal:** Audit the test suite for `time.sleep()` abuse. Replace hardcoded sleeps with `threading.Event()` hooks or robust polling. Isolate slow integration tests with `@pytest.mark.slow` and ensure the core unit test suite runs in under 10 seconds to maintain high-velocity TDD.
### 6. `robust_json_parsing_tech_lead_20260302`
**Priority:** Medium
**Goal:** Implement an auto-retry loop that catches `JSONDecodeError` and feeds the traceback to the Tier 2 model for self-correction.
### 7. `concurrent_tier_source_tier_20260302`
**Priority:** Low
**Goal:** Replace global state with `threading.local()` or explicit context passing to guarantee thread-safe logging when multiple Tier 3 workers process tickets in parallel.
### 8. `test_suite_performance_and_flakiness_20260302`
**Priority:** Low
**Goal:** Replace `time.sleep()` with deterministic polling or `threading.Event()` triggers. Mark exceptionally heavy tests with `@pytest.mark.slow`.

@@ -9,9 +9,27 @@ This file tracks all major tracks for the project. Each track has its own detail
- [ ] **Track: Test Suite Stabilization & Consolidation**
*Link: [./tracks/test_stabilization_20260302/](./tracks/test_stabilization_20260302/)*
- [ ] **Track: Strict Static Analysis & Type Safety**
*Link: [./tracks/strict_static_analysis_and_typing_20260302/](./tracks/strict_static_analysis_and_typing_20260302/)*
- [ ] **Track: Codebase Migration to `src` & Cleanup**
*Link: [./tracks/codebase_migration_20260302/](./tracks/codebase_migration_20260302/)*
- [ ] **Track: GUI Decoupling & Controller Architecture**
*Link: [./tracks/gui_decoupling_controller_20260302/](./tracks/gui_decoupling_controller_20260302/)*
- [ ] **Track: Hook API UI State Verification**
*Link: [./tracks/hook_api_ui_state_verification_20260302/](./tracks/hook_api_ui_state_verification_20260302/)*
- [ ] **Track: Robust JSON Parsing for Tech Lead**
*Link: [./tracks/robust_json_parsing_tech_lead_20260302/](./tracks/robust_json_parsing_tech_lead_20260302/)*
- [ ] **Track: Concurrent Tier Source Isolation**
*Link: [./tracks/concurrent_tier_source_tier_20260302/](./tracks/concurrent_tier_source_tier_20260302/)*
- [ ] **Track: Test Suite Performance & Flakiness**
*Link: [./tracks/test_suite_performance_and_flakiness_20260302/](./tracks/test_suite_performance_and_flakiness_20260302/)*
---
## Completed / Archived

@@ -0,0 +1,5 @@
# Track concurrent_tier_source_tier_20260302 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)

@@ -0,0 +1,8 @@
{
"track_id": "concurrent_tier_source_tier_20260302",
"type": "refactor",
"status": "new",
"created_at": "2026-03-02T22:30:00Z",
"updated_at": "2026-03-02T22:30:00Z",
"description": "Replace ai_client.current_tier global state with threading.local() for parallel agent safety."
}

@@ -0,0 +1,10 @@
# Implementation Plan: Concurrent Tier Isolation
## Phase 1: Thread-Local Storage
- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
- [ ] Task: Replace `current_tier` with `threading.local()`.
- [ ] Task: Conductor - User Manual Verification 'Phase 1'
## Phase 2: Refactor & Test
- [ ] Task: Update loggers and test with mock concurrent threads.
- [ ] Task: Conductor - User Manual Verification 'Phase 2'

@@ -0,0 +1,8 @@
# Track Specification: Concurrent Tier Source Isolation
## Overview
Prepares the architecture for parallel Tier 3/4 agents by replacing the module-level `ai_client.current_tier` with `threading.local()` storage or an explicit `source_tier` argument passed through `send()`.
## Functional Requirements
- Refactor `current_tier` to be thread-safe.
- Update all logging calls to use the thread-safe context.
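
A minimal sketch of the `threading.local()` option, offered as an illustration rather than the project's implementation. The helper names (`set_current_tier`, `get_current_tier`, `run_worker`, `tag_in_parallel`) and the stand-in `send()` are hypothetical; only `current_tier` and `threading.local()` come from the spec.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Optional

# Each thread sees its own attributes on this object.
_tier_context = threading.local()


def set_current_tier(tier: Optional[str]) -> None:
    """Store the tier tag for the calling thread only."""
    _tier_context.tier = tier


def get_current_tier() -> Optional[str]:
    """Return the calling thread's tier, or None if it was never set."""
    return getattr(_tier_context, "tier", None)


def send(message: str) -> str:
    """Hypothetical stand-in for ai_client.send(): tags output with the tier."""
    return f"[{get_current_tier()}] {message}"


def run_worker(tier: str, message: str) -> str:
    """Set the tier, do the work, then clear it so pooled threads start clean."""
    set_current_tier(tier)
    try:
        return send(message)
    finally:
        set_current_tier(None)


def tag_in_parallel(jobs: list[tuple[str, str]]) -> list[str]:
    """Run (tier, message) jobs concurrently; results keep the input order."""
    tiers = [tier for tier, _ in jobs]
    messages = [message for _, message in jobs]
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(run_worker, tiers, messages))
```

Clearing the value in `finally` matters when threads are reused by a pool, since a stale tier would otherwise carry over to the next ticket. Passing `source_tier` explicitly avoids that hazard entirely, at the cost of a wider call signature.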

@@ -0,0 +1,5 @@
# Track gui_decoupling_controller_20260302 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)

@@ -0,0 +1,8 @@
{
"track_id": "gui_decoupling_controller_20260302",
"type": "refactor",
"status": "new",
"created_at": "2026-03-02T22:30:00Z",
"updated_at": "2026-03-02T22:30:00Z",
"description": "Extract the state machine and core lifecycle into a headless app_controller.py, leaving gui_2.py as a pure immediate-mode view."
}

@@ -0,0 +1,18 @@
# Implementation Plan: GUI Decoupling
## Phase 1: Controller Skeleton
- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
- [ ] Task: Create `app_controller.py`.
- [ ] Task: Conductor - User Manual Verification 'Phase 1'
## Phase 2: State Migration
- [ ] Task: Move App state from `gui_2.py` to controller.
- [ ] Task: Conductor - User Manual Verification 'Phase 2'
## Phase 3: Logic Migration
- [ ] Task: Move non-rendering methods to controller.
- [ ] Task: Conductor - User Manual Verification 'Phase 3'
## Phase 4: Validation
- [ ] Task: Update all tests to mock/use the controller.
- [ ] Task: Conductor - User Manual Verification 'Phase 4'

@@ -0,0 +1,9 @@
# Track Specification: GUI Decoupling & Controller Architecture
## Overview
`gui_2.py` is a monolithic God Object. This track extracts its business logic and state machine into `app_controller.py`, leaving the GUI as a pure immediate-mode view adhering to Data-Oriented Design.
## Functional Requirements
- Create `app_controller.py`.
- Migrate state variables and lifecycle methods from `gui_2.py` to the controller.
- Ensure `gui_2.py` only reads state and dispatches events.
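
The split described above might look roughly like the following sketch. `AppState`, `AppController`, `dispatch`, the `create_track` event, and `render` are all hypothetical names chosen for illustration; the real `app_controller.py` and `gui_2.py` interfaces are not defined in this track yet.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AppState:
    """Plain data owned by the controller; the view only reads it."""
    tracks: list[str] = field(default_factory=list)
    status: str = "idle"


class AppController:
    """Headless owner of state and lifecycle logic (no rendering code)."""

    def __init__(self) -> None:
        self.state = AppState()
        # Event name -> handler; the view never calls handlers directly.
        self._handlers: dict[str, Callable[[dict], None]] = {
            "create_track": self._create_track,
        }

    def dispatch(self, event: str, payload: dict) -> None:
        """Single entry point the view uses to request a state change."""
        self._handlers[event](payload)

    def _create_track(self, payload: dict) -> None:
        name = payload["name"]
        self.state.tracks.append(name)
        self.state.status = f"created {name}"


def render(state: AppState) -> list[str]:
    """Pure view function: builds output from state and mutates nothing."""
    return [f"status: {state.status}"] + [f"- {track}" for track in state.tracks]
```

Because the controller has no GUI dependency, tests can drive it with `dispatch()` and assert on `state` without opening a window.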

@@ -0,0 +1,5 @@
# Track hook_api_ui_state_verification_20260302 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)

@@ -0,0 +1,8 @@
{
"track_id": "hook_api_ui_state_verification_20260302",
"type": "feature",
"status": "new",
"created_at": "2026-03-02T22:30:00Z",
"updated_at": "2026-03-02T22:30:00Z",
"description": "Add /api/gui/state GET endpoint and wire UI state variables for programmatic live_gui testing."
}

@@ -0,0 +1,14 @@
# Implementation Plan: Hook API UI State
## Phase 1: API Endpoint
- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
- [ ] Task: Implement `/api/gui/state` GET endpoint.
- [ ] Task: Conductor - User Manual Verification 'Phase 1'
## Phase 2: State Wiring
- [ ] Task: Add UI state fields to `_settable_fields`.
- [ ] Task: Conductor - User Manual Verification 'Phase 2'
## Phase 3: Integration Tests
- [ ] Task: Write `live_gui` tests validating state retrieval.
- [ ] Task: Conductor - User Manual Verification 'Phase 3'

@@ -0,0 +1,9 @@
# Track Specification: Hook API UI State Verification
## Overview
Adds an `/api/gui/state` endpoint to expose internal UI widget states (like `ui_focus_agent`) for reliable programmatic testing without user confirmation.
## Functional Requirements
- Add `/api/gui/state` endpoint to the HookServer.
- Wire UI state variables into `_settable_fields`.
- Write `live_gui` integration tests to assert widget states.
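
As a rough sketch of what a read-only state endpoint could look like, the example below uses the standard library's `http.server`. This is an assumption made for illustration: the spec does not say how HookServer is built, and `UI_STATE`, `STATE_LOCK`, `StateHandler`, and `start_state_server` are hypothetical names.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical stand-in for the GUI's widget state.
UI_STATE: dict = {"ui_focus_agent": None}
STATE_LOCK = threading.Lock()


class StateHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/api/gui/state":
            self.send_error(404)
            return
        # Snapshot under the lock so a concurrent UI update cannot tear the read.
        with STATE_LOCK:
            body = json.dumps(UI_STATE).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args: object) -> None:
        """Silence per-request logging to keep test output clean."""


def start_state_server(port: int = 0) -> ThreadingHTTPServer:
    """Serve on localhost in a daemon thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), StateHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A `live_gui`-style test would set a widget value, issue a GET to `/api/gui/state`, and assert on the returned JSON instead of asking the user to confirm what they see.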

@@ -0,0 +1,5 @@
# Track robust_json_parsing_tech_lead_20260302 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)

@@ -0,0 +1,8 @@
{
"track_id": "robust_json_parsing_tech_lead_20260302",
"type": "bug",
"status": "new",
"created_at": "2026-03-02T22:30:00Z",
"updated_at": "2026-03-02T22:30:00Z",
"description": "Implement programmatic retry loop catching JSONDecodeError in Tier 2 ticket generation."
}

@@ -0,0 +1,10 @@
# Implementation Plan: Robust JSON Parsing
## Phase 1: Retry Logic
- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
- [ ] Task: Implement retry loop in `conductor_tech_lead.py`.
- [ ] Task: Conductor - User Manual Verification 'Phase 1'
## Phase 2: Validation
- [ ] Task: Write unit tests simulating JSON hallucination.
- [ ] Task: Conductor - User Manual Verification 'Phase 2'

@@ -0,0 +1,9 @@
# Track Specification: Robust JSON Parsing for Tech Lead
## Overview
`conductor_tech_lead.py` silently fails if Tier 2 outputs invalid JSON. This track adds an auto-retry loop that feeds tracebacks back to the LLM for self-correction.
## Functional Requirements
- Add retry loop in `generate_tickets`.
- Catch `JSONDecodeError` and reprompt the model.
- Abort after N consecutive failures and fail the UI operation explicitly rather than returning an empty array.
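
The requirements above can be sketched as follows. This is an illustrative outline, not the existing `generate_tickets` code: `generate_tickets_with_retry`, the `ask_model` callable, and the default of three attempts are assumptions made for the example.

```python
import json
from typing import Any, Callable, Optional


def generate_tickets_with_retry(
    ask_model: Callable[[str], str],  # hypothetical: prompt in, raw reply out
    prompt: str,
    max_attempts: int = 3,
) -> list[dict[str, Any]]:
    """Ask for a JSON ticket array, re-prompting with the parse error on failure."""
    current_prompt = prompt
    last_error: Optional[json.JSONDecodeError] = None
    for _ in range(max_attempts):
        raw = ask_model(current_prompt)
        try:
            tickets = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            # Feed the decoder's message and position back for self-correction.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was not valid JSON "
                f"({exc.msg} at line {exc.lineno}, column {exc.colno}). "
                "Reply with only a JSON array of tickets."
            )
            continue
        if isinstance(tickets, list):
            return tickets
        # Valid JSON but the wrong shape: ask again with a shape hint.
        current_prompt = (
            f"{prompt}\n\nYour previous reply was valid JSON but not an array. "
            "Reply with only a JSON array of tickets."
        )
    # Fail loudly instead of silently returning [].
    raise ValueError(
        f"no valid ticket array after {max_attempts} attempts"
    ) from last_error
```

Raising after the final attempt lets the caller show a real error in the GUI, which addresses the silent empty-array behavior this track targets.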

@@ -0,0 +1,5 @@
# Track strict_static_analysis_and_typing_20260302 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)

@@ -0,0 +1,8 @@
{
"track_id": "strict_static_analysis_and_typing_20260302",
"type": "chore",
"status": "new",
"created_at": "2026-03-02T22:30:00Z",
"updated_at": "2026-03-02T22:30:00Z",
"description": "Resolve all mypy/ruff violations, enforce strict typing, and add pre-commit hooks."
}

@@ -0,0 +1,18 @@
# Implementation Plan: Strict Static Analysis & Type Safety
## Phase 1: Configuration & Tooling
- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
- [ ] Task: Configure strict `mypy.ini` and update `pyproject.toml`.
- [ ] Task: Conductor - User Manual Verification 'Phase 1'
## Phase 2: Core Library Typing
- [ ] Task: Resolve typing in `api_hook_client.py` and models.
- [ ] Task: Conductor - User Manual Verification 'Phase 2'
## Phase 3: GUI Typing
- [ ] Task: Resolve typing in `gui_2.py`.
- [ ] Task: Conductor - User Manual Verification 'Phase 3'
## Phase 4: CI Integration
- [ ] Task: Implement pre-commit hooks for ruff and mypy.
- [ ] Task: Conductor - User Manual Verification 'Phase 4'

@@ -0,0 +1,10 @@
# Track Specification: Strict Static Analysis & Type Safety
## Overview
The codebase suffers from massive type-safety debt (512+ mypy errors). This track resolves all violations, enforces strict typing across `gui_2.py` and `api_hook_client.py`, and integrates pre-commit checks.
## Functional Requirements
- Resolve all mypy errors.
- Resolve all remaining ruff violations.
- Enforce strict typing.
- Add a CI/pre-commit hook that runs `ruff` and `mypy`.
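
To make two of the targeted fixes concrete, here is a small before/after sketch covering an implicit `Optional` default and an ambiguous variable name. The function `first_line` is invented for illustration and does not exist in the codebase.

```python
from typing import Optional

# Before (hypothetical): the default of None contradicts the declared `str`
# type, and the single-letter name `l` is easily confused with `1` or `I`.
#
#   def first_line(l: list[str], default: str = None) -> str:
#       return l[0] if l else default


def first_line(lines: list[str], default: Optional[str] = None) -> Optional[str]:
    """Return the first line, or `default` when the list is empty."""
    return lines[0] if lines else default
```

Making the `None` case explicit in both the parameter and the return type forces callers to handle it, which is the kind of error strict type checking is meant to surface.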

@@ -0,0 +1,5 @@
# Track test_suite_performance_and_flakiness_20260302 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)

@@ -0,0 +1,8 @@
{
"track_id": "test_suite_performance_and_flakiness_20260302",
"type": "chore",
"status": "new",
"created_at": "2026-03-02T22:30:00Z",
"updated_at": "2026-03-02T22:30:00Z",
"description": "Replace arbitrary time.sleep() calls with deterministic polling/Events and optimize test speed."
}

@@ -0,0 +1,14 @@
# Implementation Plan: Test Suite Performance
## Phase 1: Audit & Polling Primitives
- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
- [ ] Task: Create deterministic polling primitives in `conftest.py`.
- [ ] Task: Conductor - User Manual Verification 'Phase 1'
## Phase 2: Refactoring Sleeps
- [ ] Task: Replace `time.sleep` across integration tests.
- [ ] Task: Conductor - User Manual Verification 'Phase 2'
## Phase 3: Test Marking
- [ ] Task: Apply `@pytest.mark.slow` to long-running tests.
- [ ] Task: Conductor - User Manual Verification 'Phase 3'

@@ -0,0 +1,9 @@
# Track Specification: Test Suite Performance & Flakiness
## Overview
The test suite is slow and flaky because of hardcoded `time.sleep()` calls. This track replaces them with deterministic polling or `threading.Event()` signaling, aiming for a <10s core TDD loop.
## Functional Requirements
- Audit and remove `time.sleep()` in tests.
- Implement deterministic polling and `threading.Event()`-based waits.
- Mark slow integration tests with `@pytest.mark.slow`.
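
One possible shape for the polling primitives, sketched under the assumption that they live in `conftest.py` as plain helpers. The names `wait_until` and `run_in_background` are hypothetical.

```python
import threading
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.01,
) -> bool:
    """Poll `predicate` until it is true or `timeout` seconds elapse.

    Returns as soon as the condition holds, unlike a fixed sleep that
    always costs its full duration.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # short, bounded back-off between checks


def run_in_background(work: Callable[[], None]) -> threading.Event:
    """Run `work` in a daemon thread and return an Event set when it finishes."""
    done = threading.Event()

    def _target() -> None:
        try:
            work()
        finally:
            done.set()  # signal completion even if `work` raised

    threading.Thread(target=_target, daemon=True).start()
    return done
```

A test would then call `assert done.wait(timeout=2.0)` or `assert wait_until(lambda: ...)` in place of a hardcoded sleep. Where the code under test can signal completion itself, the `Event` form is preferable because it involves no polling interval at all.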