- Mark live_gui tests as flaky by design in TASKS.md until stabilization tracks complete
- Add test debt notes to upcoming tracks to guide testing strategies
# TASKS.md
<!-- Quick-read pointer to active and planned conductor tracks -->
<!-- Source of truth for task state is conductor/tracks/*/plan.md -->

## Active Tracks
*(none — all planned tracks queued below)*

## Completed This Session
- `mma_agent_focus_ux_20260302` — Per-tier source_tier tagging on comms+tool entries; Focus Agent combo UI; filter logic in comms+tool panels; [tier] label per comms entry. 18 tests. Checkpoint: b30e563.
- `feature_bleed_cleanup_20260302` — Removed dead comms panel dup, dead menubar block, duplicate __init__ vars; added working Quit; fixed Token Budget layout. All phases verified. Checkpoint: 0d081a2.
- `context_token_viz_20260301` — Token budget panel (color bar, breakdown table, trim warning, cache status, auto-refresh). All phases verified. Commit: d577457.
- `tech_debt_and_test_cleanup_20260302` — [BOTCHED/ARCHIVED] Centralized fixtures but exposed deep asyncio flaws.

---

## Planned: The Strict Execution Queue
*All previously loose backlog items have been rigorously spec'd and initialized as Conductor Tracks. They MUST be executed in this exact order.*

> [!WARNING] TEST ARCHITECTURE DEBT NOTICE (2026-03-05)
> The `gui_decoupling` track exposed deep flaws in the test architecture (asyncio event loop exhaustion, IPC polling race conditions, phantom Windows subprocesses).
> **Current Testing Policy:**
> - Full-suite integration tests (`live_gui` / extended sims) are currently considered **"flaky by design"**.
> - Do NOT write new `live_gui` simulations until Tracks #5 and #6 are complete.
> - If unit tests pass but `test_extended_sims.py` hangs or fails locally, you may manually verify the GUI behavior and proceed.

### 1. `test_stabilization_20260302` (Archived)
- **Status:** Completed
- **Priority:** High
- **Goal:** Stabilize `asyncio` errors, ban mock-rot, completely remove `gui_legacy.py`, and consolidate testing paradigms.

### 2. `strict_static_analysis_and_typing_20260302` (Archived)
- **Status:** Completed
- **Priority:** High
- **Goal:** Resolve 512+ mypy errors and remaining ruff violations to secure the foundation before refactoring. Add pre-commit hooks.

### 3. `codebase_migration_20260302` (Archived)
- **Status:** Completed
- **Priority:** High
- **Goal:** Restructure directories to a `src/` layout. Doing this after static analysis ensures no hidden import bugs are introduced. Creates `sloppy.py` entry point.

### 4. `gui_decoupling_controller_20260302` (Archived)
- **Status:** Completed
- **Priority:** High
- **Goal:** Extract the state machine and core lifecycle into a headless `app_controller.py`, leaving `gui_2.py` as a pure, immediate-mode view.

### 5. `hook_api_ui_state_verification_20260302` (Active/Next)
- **Status:** Initialized / Looked Over
- **Priority:** High
- **Goal:** Add a `/api/gui/state` GET endpoint. Wire UI state into `_settable_fields` to enable programmatic `live_gui` testing without user confirmation.
- **Fixes Test Debt:** Replaces brittle `time.sleep()` and string-matching assertions in simulations with deterministic API queries.
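
A minimal sketch of what "deterministic API queries" could look like in a simulation, in place of a fixed `time.sleep()`. Only the `/api/gui/state` path comes from this track; the base URL, the flat-dict response shape, the `status` field, and both helper names are assumptions for illustration, not the actual hook API.

```python
import json
import time
import urllib.request
from typing import Any, Callable


def fetch_gui_state(base_url: str = "http://127.0.0.1:8999") -> dict[str, Any]:
    """GET the planned /api/gui/state endpoint (base URL and JSON shape are assumed)."""
    with urllib.request.urlopen(f"{base_url}/api/gui/state", timeout=2.0) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_state(
    predicate: Callable[[dict[str, Any]], bool],
    fetch: Callable[[], dict[str, Any]] = fetch_gui_state,
    timeout: float = 5.0,
    interval: float = 0.05,
) -> dict[str, Any]:
    """Poll until predicate(state) holds; fail loudly with the last state on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        state = fetch()
        if predicate(state):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"GUI state never matched; last state: {state!r}")
        time.sleep(interval)


# Stub fetcher standing in for the real endpoint so the sketch runs without a GUI.
_responses = iter([{"status": "running"}, {"status": "running"}, {"status": "idle"}])
final = wait_for_state(
    lambda state: state["status"] == "idle",  # "status" is a hypothetical field
    fetch=lambda: next(_responses),
    interval=0.0,
)
print(final)
```

The test returns as soon as the condition holds and reports the last observed state on failure, so a slow machine produces a clear timeout rather than a silent mismatch.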

### 6. `test_suite_performance_and_flakiness_20260302`
- **Status:** Initialized / Looked Over
- **Priority:** High
- **Goal:** Resolve deep asyncio/threading deadlocks. Replace `asyncio.Queue` in `AppController` with a standard `queue.Queue`. Ensure phantom subprocesses are killed.
- **Fixes Test Debt:** Eliminates `RuntimeError: Event loop is closed` and zombie port 8999 hijacking. Restores full-suite reliability.
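
A rough sketch of the queue swap, assuming the controller only needs a thread-safe handoff between worker threads and the GUI frame loop. `asyncio.Queue` is not thread-safe and is tied to an event loop, whereas `queue.Queue` can be used from any thread. `ControllerSketch`, `submit`, and `drain` are hypothetical names, not the real `AppController` interface.

```python
import queue
import threading


class ControllerSketch:
    """Hypothetical stand-in for AppController using a thread-safe queue.Queue."""

    def __init__(self) -> None:
        self._events: "queue.Queue[str]" = queue.Queue()

    def submit(self, event: str) -> None:
        # Safe to call from any thread; no event loop has to be running.
        self._events.put(event)

    def drain(self) -> list[str]:
        # Intended to run once per GUI frame: take everything queued so far, never block.
        drained: list[str] = []
        while True:
            try:
                drained.append(self._events.get_nowait())
            except queue.Empty:
                return drained


controller = ControllerSketch()
workers = [
    threading.Thread(target=controller.submit, args=(f"event-{i}",)) for i in range(3)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

# Arrival order depends on thread scheduling, so sort before comparing.
events = sorted(controller.drain())
print(events)
```

Because nothing here depends on a running loop, tearing the controller down between tests cannot leave a closed event loop behind.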

### 7. `robust_json_parsing_tech_lead_20260302`
- **Status:** Initialized / Looked Over
- **Priority:** Medium
- **Goal:** Implement an auto-retry loop that catches `JSONDecodeError` and feeds the traceback to the Tier 2 model for self-correction.
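
One possible shape for the retry loop, sketched under the assumption that the Tier 2 model can be treated as a plain prompt-in, text-out callable. `parse_with_retry`, `ask_model`, the correction prompt wording, and the default of three attempts are all illustrative choices, not the track's spec.

```python
import json
import traceback
from typing import Any, Callable


def parse_with_retry(
    ask_model: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
) -> Any:
    """Parse the model reply as JSON; on failure, re-prompt with the traceback."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        reply = ask_model(current_prompt)
        try:
            return json.loads(reply)
        except json.JSONDecodeError:
            if attempt == max_attempts:
                raise  # out of retries: surface the original parse error
            # Feed the bad reply and the parser traceback back for self-correction.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was not valid JSON:\n{reply}\n\n"
                f"Parser traceback:\n{traceback.format_exc()}\n"
                "Reply again with valid JSON only."
            )


# Stub model: first reply is truncated JSON, second reply is valid.
replies = iter(['{"tickets": [', '{"tickets": []}'])
result = parse_with_retry(lambda _prompt: next(replies), "Plan the track.")
print(result)
```

Capping the attempts keeps a persistently malformed reply from looping forever, and re-raising the last `JSONDecodeError` leaves the failure visible to the caller.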

### 8. `concurrent_tier_source_tier_20260302`
- **Status:** Initialized / Looked Over
- **Priority:** Low
- **Goal:** Replace global state with `threading.local()` or explicit context passing to guarantee thread-safe logging when multiple Tier 3 workers process tickets in parallel.
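
A small sketch of the `threading.local()` option, assuming the current tier is a single string read at log time. The function names and the `tier3-worker-N` labels are made up for illustration and do not reflect the existing logging code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_context = threading.local()  # each thread sees its own attribute values


def set_source_tier(tier: str) -> None:
    _context.source_tier = tier


def get_source_tier() -> str:
    # Threads that never set a tier get a default instead of an AttributeError.
    return getattr(_context, "source_tier", "unknown")


def log_entry(message: str) -> dict[str, str]:
    """Build a log entry tagged with the calling thread's tier."""
    return {"source_tier": get_source_tier(), "message": message}


def process_ticket(ticket_id: int) -> dict[str, str]:
    # Hypothetical Tier 3 worker: tag this thread, then log from it.
    set_source_tier(f"tier3-worker-{ticket_id}")
    return log_entry(f"processed ticket {ticket_id}")


with ThreadPoolExecutor(max_workers=4) as pool:
    entries = list(pool.map(process_ticket, range(4)))

for entry in entries:
    print(entry)
print(get_source_tier())  # main thread never set a tier
```

Pool threads are reused, so a thread-local value set by one ticket stays visible to the next ticket on the same thread unless it is reset. Explicit context passing avoids that, at the cost of threading an extra argument through the call chain.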

### 9. `manual_ux_validation_20260302`
- **Status:** Initialized / Looked Over
- **Priority:** Medium
- **Goal:** Highly interactive human-in-the-loop track to review and adjust GUI UX, animations, popups, and layout structures based on slow-interval simulation feedback.

### 10. `test_architecture_integrity_audit_20260304`
- **Status:** Audit Completed
- **Priority:** High
- **Goal:** Comprehensive audit of testing infrastructure and simulation framework. Produced `report_gemini.md` detailing exact mechanical failures and remediation paths.