TASKS.md
Active Tracks
(none — all planned tracks queued below)
Completed This Session
- `mma_agent_focus_ux_20260302`: Per-tier `source_tier` tagging on comms+tool entries; Focus Agent combo UI; filter logic in comms+tool panels; [tier] label per comms entry. 18 tests. Checkpoint: `b30e563`.
- `feature_bleed_cleanup_20260302`: Removed dead comms panel dup, dead menubar block, duplicate init vars; added working Quit; fixed Token Budget layout. All phases verified. Checkpoint: `0d081a2`.
- `context_token_viz_20260301`: Token budget panel (color bar, breakdown table, trim warning, cache status, auto-refresh). All phases verified. Commit: `d577457`.
Planned: Next Track
mma_agent_focus_ux_20260302 — COMPLETED (b30e563)
(initialized — run after bleed cleanup)
Priority: High
Depends on: feature_bleed_cleanup_20260302 Phase 1 (dead comms panel removed)
Track dir: conductor/tracks/mma_agent_focus_ux_20260302/
Audit-confirmed gaps:
- `ai_client._append_comms` emits entries with no `source_tier` key
- `ai_client` has no `current_tier` module variable, so tiers have no way to self-identify
- `_tool_log` is `list[tuple[str,str,float]]`: no tier field, tuple must migrate to dict
- `run_worker_lifecycle` replaces `comms_log_callback` but never stamps `source_tier`
- `generate_tickets` (Tier 2) does NOT replace callback at all
- No Focus Agent selector widget in Operations Hub
Scope: Phase 1 (tier tagging) → Phase 2 (tool log dict migration) → Phase 3 (Focus Agent UI + filter). Per-tier token stats deferred to sub-track.
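The Phase 2 tuple-to-dict migration and the Phase 3 filter can be sketched roughly as follows. The gap list only gives the tuple's types (`str, str, float`), so the field names `script`, `result`, and `timestamp` are assumptions, as are the helper names `migrate_tool_log` and `filter_by_tier`; only `source_tier` comes from the track description.

```python
from typing import List, Optional, Tuple, TypedDict


class ToolLogEntry(TypedDict):
    # Field names other than source_tier are hypothetical; the audit only
    # states that the old entries were tuple[str, str, float].
    script: str
    result: str
    timestamp: float
    source_tier: Optional[str]


def migrate_tool_log(
    old_log: List[Tuple[str, str, float]],
    source_tier: Optional[str] = None,
) -> List[ToolLogEntry]:
    """Convert legacy tuple entries into dict entries carrying a tier tag."""
    return [
        {"script": script, "result": result, "timestamp": ts, "source_tier": source_tier}
        for script, result, ts in old_log
    ]


def filter_by_tier(
    entries: List[ToolLogEntry], focus_tier: Optional[str]
) -> List[ToolLogEntry]:
    """Return only entries stamped with focus_tier; None means show everything."""
    if focus_tier is None:
        return list(entries)
    return [e for e in entries if e.get("source_tier") == focus_tier]
```

Legacy entries have no tier information, so the sketch defaults `source_tier` to `None` rather than guessing which tier produced them.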
tech_debt_and_test_cleanup_20260302 (initialized)
Priority: High
Depends on: feature_bleed_cleanup_20260302
Track dir: conductor/tracks/tech_debt_and_test_cleanup_20260302/
Audit-confirmed gaps:
- 13 test files duplicate the `app_instance` fixture instead of using `conftest.py`.
- Duplicate test files (`test_ast_parser_curated.py`).
- Multiple simulation tests silently pass with no assertions.
- `gui_2.py` initializes 9 state variables in `__init__` that are never read.
- `gui_2.py` has over 15 uncalled HTTP/background methods.
Scope: Phase 1 (Fixture deduplication) → Phase 2 (False-positive test fixing) → Phase 3 (Dead code excision in gui_2.py).
conductor_workflow_improvements_20260302 (initialized)
Priority: High
Depends on: None
Track dir: conductor/tracks/conductor_workflow_improvements_20260302/
Audit-confirmed gaps:
- Tier 2 skill lacks enforcement of AST pre-implementation scans to prevent duplicate state variables.
- Tier 2 skill lacks explicit rejection of non-TDD execution.
- Tier 3 skill does not strictly forbid implementing code without failing tests.
- `workflow.md` lacks explicit warnings against zero-assertion tests and redundant `__init__` state.
Scope: Phase 1 (Update MMA Skill prompts) → Phase 2 (Update workflow.md).
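The AST pre-implementation scan mentioned in the gaps could look something like the sketch below, which reports attributes assigned on `self` more than once inside a class's `__init__`. The function name `duplicate_init_attributes` is hypothetical, and this is only an illustration of the idea, not the check the skills will actually enforce.

```python
import ast
from collections import Counter
from typing import List


def duplicate_init_attributes(source: str, class_name: str) -> List[str]:
    """List self.<attr> names assigned more than once in class_name.__init__."""
    counts: Counter = Counter()
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.ClassDef) and node.name == class_name):
            continue
        for item in node.body:
            if not (isinstance(item, ast.FunctionDef) and item.name == "__init__"):
                continue
            for stmt in ast.walk(item):
                if isinstance(stmt, ast.Assign):
                    targets = stmt.targets
                elif isinstance(stmt, ast.AnnAssign):
                    targets = [stmt.target]
                else:
                    continue
                for target in targets:
                    if (
                        isinstance(target, ast.Attribute)
                        and isinstance(target.value, ast.Name)
                        and target.value.id == "self"
                    ):
                        counts[target.attr] += 1
    return sorted(name for name, n in counts.items() if n > 1)
```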
architecture_boundary_hardening_20260302 (initialized)
Priority: High
Depends on: None
Track dir: conductor/tracks/architecture_boundary_hardening_20260302/
Audit-confirmed gaps:
- `ai_client.py` loops execute `set_file_slice` and `py_update_definition` instantly without checking `pre_tool_callback`, bypassing GUI approval.
- New `mcp_client.py` tools are not exposed in the GUI or `manual_slop.toml` config for user control.
- `mma_exec.py` bypasses skeletonization for `mcp_client`, causing token bloat.
- `dag_engine.py` does not cascade `blocked` states, causing orchestrator infinite loops.
Scope: Phase 1 (Meta-tooling token fix) → Phase 2 (Complete MCP Tool Integration & Seal GUI HITL bypass) → Phase 3 (Fix DAG Engine cascading blocks).
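For Phase 3, cascading a `blocked` state means every ticket that depends, directly or transitively, on a blocked ticket is also marked blocked, so the orchestrator stops waiting on work that can never start. The following is a minimal sketch under assumed data shapes (a status dict and a dependency dict); it does not reflect the real `dag_engine.py` structures, and the status names `todo` and `completed` are placeholders.

```python
from collections import deque
from typing import Dict, List


def cascade_blocked(
    status: Dict[str, str], depends_on: Dict[str, List[str]]
) -> Dict[str, str]:
    """Return a copy of status where dependents of blocked tickets are blocked."""
    # Invert the dependency map: ticket -> tickets that depend on it.
    dependents: Dict[str, List[str]] = {}
    for ticket, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(ticket)

    result = dict(status)
    queue = deque(t for t, s in result.items() if s == "blocked")
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            # Completed work is left alone; already-blocked tickets are
            # skipped, so each ticket is enqueued at most once.
            if result.get(child) not in ("blocked", "completed"):
                result[child] = "blocked"
                queue.append(child)
    return result
```

Because each ticket is enqueued at most once, the walk terminates even if the dependency data accidentally contains a cycle.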
testing_consolidation_20260302 (initialized)
Priority: Medium
Depends on: tech_debt_and_test_cleanup_20260302
Track dir: conductor/tracks/testing_consolidation_20260302/
Audit-confirmed gaps:
- `visual_mma_verification.py` manually runs `subprocess.Popen` instead of using the robust `live_gui` fixture.
- Duplicate architectural logic between tests and `simulation/` directories causing fragmentation.
Scope: Phase 1 (Migrate manual launchers to fixtures) → Phase 2 (Consolidate simulation scripts).
Track Dependency Order (Execution Guide)
Execute the tracks in the following order:
1. `feature_bleed_cleanup_20260302` (Base cleanup of GUI structure)
2. `mma_agent_focus_ux_20260302` (Depends on feature bleed cleanup Phase 1)
3. `architecture_boundary_hardening_20260302` (Fixes critical HITL & Token leaks; independent but foundational)
4. `tech_debt_and_test_cleanup_20260302` (Re-establishes testing foundation; run after feature tracks)
5. `testing_consolidation_20260302` (Refactors testing methodology; depends on tech debt cleanup)
6. `conductor_workflow_improvements_20260302` (Meta-level updates to skills/workflow docs; can be run anytime)
Planned: Upcoming Tracks
The following tracks have been initialized and ordered for execution.
1. test_stabilization_20260302 (Active/Next)
Priority: High
Goal: Stabilize asyncio errors, ban mock-rot, and consolidate testing paradigms.
2. strict_static_analysis_and_typing_20260302
Priority: High
Goal: Resolve 512+ mypy errors and remaining ruff violations to secure the foundation before refactoring. Add pre-commit hooks.
3. codebase_migration_20260302
Priority: High
Goal: Restructure directories to a src/ layout. Doing this after static analysis ensures no hidden import bugs are introduced.
4. gui_decoupling_controller_20260302
Priority: High
Goal: Extract the state machine and core lifecycle into a headless app_controller.py, leaving gui_2.py as a pure, immediate-mode view.
5. hook_api_ui_state_verification_20260302
Priority: Medium
Goal: Add a /api/gui/state GET endpoint. Wire UI state into _settable_fields to enable programmatic live_gui testing without user confirmation.
6. robust_json_parsing_tech_lead_20260302
Priority: Medium
Goal: Implement an auto-retry loop that catches JSONDecodeError and feeds the traceback to the Tier 2 model for self-correction.
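A rough sketch of such a retry loop is shown below. The `ask_model` callable stands in for whatever function sends a prompt to the Tier 2 model and returns its raw text; that name, `parse_json_with_retry`, and the wording of the correction prompt are all assumptions for illustration.

```python
import json
import traceback
from typing import Any, Callable


def parse_json_with_retry(
    ask_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> Any:
    """Ask the model for JSON; on a decode failure, re-ask with the traceback."""
    current_prompt = prompt
    last_error = None
    for _ in range(max_attempts):
        raw = ask_model(current_prompt)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            tb = "".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            )
            # Feed the failure back so the model can correct its own output.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was not valid JSON.\n"
                f"{tb}\nReply with valid JSON only."
            )
    raise ValueError(f"no valid JSON after {max_attempts} attempts") from last_error
```

Capping the attempts keeps a persistently malformed reply from turning into an unbounded loop of model calls.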
7. concurrent_tier_source_tier_20260302
Priority: Low
Goal: Replace global state with threading.local() or explicit context passing to guarantee thread-safe logging when multiple Tier 3 workers process tickets in parallel.
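The `threading.local()` option can be illustrated with a small standalone sketch. The helper names (`set_current_tier`, `get_current_tier`, `make_entry`, `run_workers`) are hypothetical and not taken from `ai_client`; the point is only that each worker thread sees its own tier value, so one worker cannot overwrite the tag another worker is about to log.

```python
import threading
from typing import Dict, List, Optional

# Each thread gets its own independent attribute namespace on this object.
_tier_context = threading.local()


def set_current_tier(tier: Optional[str]) -> None:
    _tier_context.tier = tier


def get_current_tier() -> Optional[str]:
    # Threads that never called set_current_tier see None.
    return getattr(_tier_context, "tier", None)


def make_entry(message: str) -> Dict[str, Optional[str]]:
    """Build a log entry stamped with the calling thread's tier."""
    return {"message": message, "source_tier": get_current_tier()}


def run_workers(names: List[str]) -> List[Dict[str, Optional[str]]]:
    entries: List[Dict[str, Optional[str]]] = []
    lock = threading.Lock()

    def worker(name: str) -> None:
        set_current_tier(name)
        entry = make_entry(f"hello from {name}")
        with lock:  # protect the shared list, not the tier value
            entries.append(entry)

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return entries
```

Explicit context passing, the other option named in the goal, avoids hidden state entirely at the cost of threading a tier argument through every call.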
8. test_suite_performance_and_flakiness_20260302
Priority: Low
Goal: Replace time.sleep() with deterministic polling or threading.Event() triggers. Mark exceptionally heavy tests with @pytest.mark.slow.
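The two replacements named in the goal can be sketched as follows: a bounded polling helper for conditions that cannot signal on their own, and a `threading.Event` for background work that can. The names `wait_until` and `run_background_job` are hypothetical.

```python
import threading
import time
from typing import Callable, List


def wait_until(
    predicate: Callable[[], bool], timeout: float = 5.0, interval: float = 0.01
) -> bool:
    """Poll predicate until it is true or timeout elapses; return the outcome."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)  # short, bounded pause between checks
    return predicate()


def run_background_job(done: threading.Event, result: List[str]) -> None:
    """Stand-in for background work that signals completion explicitly."""
    result.append("ready")
    done.set()
```

A test would then call `done.wait(timeout=...)` or `wait_until(...)` and assert on the returned boolean, so it finishes as soon as the condition holds instead of always paying a fixed sleep.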
9. manual_ux_validation_20260302
Priority: Medium
Goal: Highly interactive human-in-the-loop track to review and adjust GUI UX, animations, popups, and layout structures based on slow-interval simulation feedback.