docs: Update Journal and Tasks with session 5 strategic shift
132
TASKS.md
@@ -9,126 +9,54 @@
- `mma_agent_focus_ux_20260302` — Per-tier source_tier tagging on comms+tool entries; Focus Agent combo UI; filter logic in comms+tool panels; [tier] label per comms entry. 18 tests. Checkpoint: b30e563.
- `feature_bleed_cleanup_20260302` — Removed dead comms panel dup, dead menubar block, duplicate __init__ vars; added working Quit; fixed Token Budget layout. All phases verified. Checkpoint: 0d081a2.
- `context_token_viz_20260301` — Token budget panel (color bar, breakdown table, trim warning, cache status, auto-refresh). All phases verified. Commit: d577457.

## Planned: Next Track

### `mma_agent_focus_ux_20260302` — COMPLETED (b30e563)
~~(initialized — run after bleed cleanup)~~
**Priority:** High
**Depends on:** `feature_bleed_cleanup_20260302` Phase 1 (dead comms panel removed)
**Track dir:** `conductor/tracks/mma_agent_focus_ux_20260302/`

**Audit-confirmed gaps:**
- `ai_client._append_comms` emits entries with no `source_tier` key
- `ai_client` has no `current_tier` module variable — no way for tiers to self-identify
- `_tool_log` is `list[tuple[str,str,float]]` — no tier field, tuple must migrate to dict
- `run_worker_lifecycle` replaces `comms_log_callback` but never stamps `source_tier`
- `generate_tickets` (Tier 2) does NOT replace callback at all
- No Focus Agent selector widget in Operations Hub

**Scope:** Phase 1 (tier tagging) → Phase 2 (tool log dict migration) → Phase 3 (Focus Agent UI + filter). Per-tier token stats deferred to sub-track.

### `tech_debt_and_test_cleanup_20260302` (initialized)
**Priority:** High
**Depends on:** `feature_bleed_cleanup_20260302`
**Track dir:** `conductor/tracks/tech_debt_and_test_cleanup_20260302/`

**Audit-confirmed gaps:**
- 13 test files duplicate `app_instance` fixture instead of using `conftest.py`.
- Duplicate test files (`test_ast_parser_curated.py`).
- Multiple simulation tests silently pass with no assertions.
- `gui_2.py` initializes 9 state variables in `__init__` that are never read.
- `gui_2.py` has over 15 uncalled HTTP/background methods.

**Scope:** Phase 1 (Fixture deduplication) → Phase 2 (False-positive test fixing) → Phase 3 (Dead code excision in `gui_2.py`).

### `conductor_workflow_improvements_20260302` (initialized)
**Priority:** High
**Depends on:** None
**Track dir:** `conductor/tracks/conductor_workflow_improvements_20260302/`

**Audit-confirmed gaps:**
- Tier 2 skill lacks enforcement of AST pre-implementation scans to prevent duplicate state variables.
- Tier 2 skill lacks explicit rejection of non-TDD execution.
- Tier 3 skill does not strictly forbid implementing code without failing tests.
- `workflow.md` lacks explicit warnings against zero-assertion tests and redundant `__init__` state.

**Scope:** Phase 1 (Update MMA Skill prompts) → Phase 2 (Update `workflow.md`).

### `architecture_boundary_hardening_20260302` (initialized)
**Priority:** High
**Depends on:** None
**Track dir:** `conductor/tracks/architecture_boundary_hardening_20260302/`

**Audit-confirmed gaps:**
- `ai_client.py` loops execute `set_file_slice` and `py_update_definition` instantly without checking `pre_tool_callback`, bypassing GUI approval.
- New `mcp_client.py` tools are not exposed in the GUI or `manual_slop.toml` config for user control.
- `mma_exec.py` bypasses skeletonization for `mcp_client`, causing token bloat.
- `dag_engine.py` does not cascade `blocked` states, causing orchestrator infinite loops.

**Scope:** Phase 1 (Meta-tooling token fix) → Phase 2 (Complete MCP Tool Integration & Seal GUI HITL bypass) → Phase 3 (Fix DAG Engine cascading blocks).

### `testing_consolidation_20260302` (initialized)
**Priority:** Medium
**Depends on:** `tech_debt_and_test_cleanup_20260302`
**Track dir:** `conductor/tracks/testing_consolidation_20260302/`

**Audit-confirmed gaps:**
- `visual_mma_verification.py` manually runs `subprocess.Popen` instead of using the robust `live_gui` fixture.
- Duplicate architectural logic between tests and `simulation/` directories causing fragmentation.

**Scope:** Phase 1 (Migrate manual launchers to fixtures) → Phase 2 (Consolidate simulation scripts).
- `tech_debt_and_test_cleanup_20260302` — [BOTCHED/ARCHIVED] Centralized fixtures but exposed deep asyncio flaws.

---

## Track Dependency Order (Execution Guide)
To ensure smooth execution, execute the tracks in the following order:
1. `feature_bleed_cleanup_20260302` (Base cleanup of GUI structure)
2. `mma_agent_focus_ux_20260302` (Depends on feature bleed cleanup Phase 1)
3. `architecture_boundary_hardening_20260302` (Fixes critical HITL & Token leaks; independent but foundational)
4. `tech_debt_and_test_cleanup_20260302` (Re-establishes testing foundation; run after feature tracks)
5. `testing_consolidation_20260302` (Refactors testing methodology; depends on tech debt cleanup)
6. `conductor_workflow_improvements_20260302` (Meta-level updates to skills/workflow docs; can be run anytime)

---

## Planned: Upcoming Tracks
*The following tracks have been initialized and ordered for execution.*
## Planned: The Strict Execution Queue
*All previously loose backlog items have been rigorously spec'd and initialized as Conductor Tracks. They MUST be executed in this exact order.*

### 1. `test_stabilization_20260302` (Active/Next)
**Priority:** High
**Goal:** Stabilize `asyncio` errors, ban mock-rot, and consolidate testing paradigms.
- **Status:** Initialized / Looked Over
- **Priority:** High
- **Goal:** Stabilize `asyncio` errors, ban mock-rot, completely remove `gui_legacy.py`, and consolidate testing paradigms.

### 2. `strict_static_analysis_and_typing_20260302`
**Priority:** High
**Goal:** Resolve 512+ mypy errors and remaining ruff violations to secure the foundation before refactoring. Add pre-commit hooks.
- **Status:** Initialized / Looked Over
- **Priority:** High
- **Goal:** Resolve 512+ mypy errors and remaining ruff violations to secure the foundation before refactoring. Add pre-commit hooks.

### 3. `codebase_migration_20260302`
**Priority:** High
**Goal:** Restructure directories to a `src/` layout. Doing this after static analysis ensures no hidden import bugs are introduced.
- **Status:** Initialized / Looked Over
- **Priority:** High
- **Goal:** Restructure directories to a `src/` layout. Doing this after static analysis ensures no hidden import bugs are introduced. Creates `sloppy.py` entry point.

### 4. `gui_decoupling_controller_20260302`
**Priority:** High
**Goal:** Extract the state machine and core lifecycle into a headless `app_controller.py`, leaving `gui_2.py` as a pure, immediate-mode view.
- **Status:** Initialized / Looked Over
- **Priority:** High
- **Goal:** Extract the state machine and core lifecycle into a headless `app_controller.py`, leaving `gui_2.py` as a pure, immediate-mode view.

### 5. `hook_api_ui_state_verification_20260302`
**Priority:** Medium
**Goal:** Add a `/api/gui/state` GET endpoint. Wire UI state into `_settable_fields` to enable programmatic `live_gui` testing without user confirmation.
- **Status:** Initialized / Looked Over
- **Priority:** Medium
- **Goal:** Add a `/api/gui/state` GET endpoint. Wire UI state into `_settable_fields` to enable programmatic `live_gui` testing without user confirmation.

### 6. `robust_json_parsing_tech_lead_20260302`
**Priority:** Medium
**Goal:** Implement an auto-retry loop that catches `JSONDecodeError` and feeds the traceback to the Tier 2 model for self-correction.
- **Status:** Initialized / Looked Over
- **Priority:** Medium
- **Goal:** Implement an auto-retry loop that catches `JSONDecodeError` and feeds the traceback to the Tier 2 model for self-correction.
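
The retry loop described in this goal could look roughly like the sketch below. It is only an illustration under stated assumptions: `parse_json_with_retry`, the `ask_model` callable, and the wording of the correction prompt are hypothetical and do not come from the project's Tier 2 client.

```python
import json
import traceback
from typing import Any, Callable


def parse_json_with_retry(
    ask_model: Callable[[str], str],  # hypothetical: sends a prompt, returns raw text
    prompt: str,
    max_attempts: int = 3,
) -> Any:
    """Ask the model for JSON; on a parse failure, resend with the traceback."""
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        raw = ask_model(current_prompt)
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            if attempt == max_attempts:
                raise  # give up and surface the last parse error
            # Feed the bad reply and the traceback back for self-correction.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was not valid JSON.\n"
                f"Reply:\n{raw}\n\nTraceback:\n{traceback.format_exc()}\n"
                "Return only corrected JSON."
            )
```

Capping the attempts keeps a persistently malformed reply from looping forever; the final `JSONDecodeError` is re-raised so the caller can decide how to fail.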

### 7. `concurrent_tier_source_tier_20260302`
**Priority:** Low
**Goal:** Replace global state with `threading.local()` or explicit context passing to guarantee thread-safe logging when multiple Tier 3 workers process tickets in parallel.
- **Status:** Initialized / Looked Over
- **Priority:** Low
- **Goal:** Replace global state with `threading.local()` or explicit context passing to guarantee thread-safe logging when multiple Tier 3 workers process tickets in parallel.
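
A minimal sketch of the `threading.local()` option, assuming each worker thread sets its own tier before logging. All names here (`set_current_tier`, `append_comms`, `comms_log`, the `"Tier 3:<ticket>"` label) are hypothetical stand-ins, not the project's actual `ai_client` API.

```python
import threading
from typing import Optional

_tier_context = threading.local()  # per-thread storage instead of a module global
_log_lock = threading.Lock()       # protects the shared log list
comms_log = []                     # hypothetical shared comms log


def set_current_tier(tier: str) -> None:
    _tier_context.tier = tier


def get_current_tier() -> Optional[str]:
    # Threads that never set a tier see None rather than another thread's value.
    return getattr(_tier_context, "tier", None)


def append_comms(message: str) -> None:
    entry = {"message": message, "source_tier": get_current_tier()}
    with _log_lock:
        comms_log.append(entry)


def worker(ticket_id: str) -> None:
    set_current_tier(f"Tier 3:{ticket_id}")
    append_comms(f"processing {ticket_id}")


def run_workers(ticket_ids) -> None:
    threads = [threading.Thread(target=worker, args=(t,)) for t in ticket_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Because the tier lives in thread-local storage, one worker setting its tier cannot overwrite the value another worker stamps onto its entries. Explicit context passing (handing the tier to `append_comms` as an argument) would achieve the same result without hidden state.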

### 8. `test_suite_performance_and_flakiness_20260302`
**Priority:** Low
**Goal:** Replace `time.sleep()` with deterministic polling or `threading.Event()` triggers. Mark exceptionally heavy tests with `@pytest.mark.slow`.
- **Status:** Initialized / Looked Over
- **Priority:** Low
- **Goal:** Replace `time.sleep()` with deterministic polling or `threading.Event()` triggers. Mark exceptionally heavy tests with `@pytest.mark.slow`.
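
The two replacements named in this goal can be sketched as follows. This is an illustrative sketch only: `wait_until` and `run_job_and_wait` are hypothetical helpers, and the timeouts are arbitrary.

```python
import threading
import time


def wait_until(predicate, timeout: float = 5.0, interval: float = 0.01) -> bool:
    """Poll `predicate` until it is true or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)  # short bounded pause, not a fixed guess at duration
    return predicate()


def run_job_and_wait(timeout: float = 2.0):
    """Start a background job and block until it signals completion."""
    finished = threading.Event()
    results = []

    def background_job() -> None:
        results.append("done")
        finished.set()  # wake the waiter as soon as the work is complete

    threading.Thread(target=background_job, daemon=True).start()
    signalled = finished.wait(timeout=timeout)  # returns False on timeout
    return signalled, results
```

Both forms return as soon as the condition holds, so a test waits only as long as the work actually takes, and a timeout shows up as an explicit `False` instead of a silent race.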

### 9. `manual_ux_validation_20260302`
**Priority:** Medium
**Goal:** Highly interactive human-in-the-loop track to review and adjust GUI UX, animations, popups, and layout structures based on slow-interval simulation feedback.

- **Status:** Initialized / Looked Over
- **Priority:** Medium
- **Goal:** Highly interactive human-in-the-loop track to review and adjust GUI UX, animations, popups, and layout structures based on slow-interval simulation feedback.