WIP next tracks planning

2026-03-06 14:52:10 -05:00
parent 3336959e02
commit 2c90020682
28 changed files with 482 additions and 540 deletions
@@ -6,85 +6,99 @@
 *(none — all planned tracks queued below)*

 ## Completed This Session
- `test_architecture_integrity_audit_20260304` — Comprehensive test architecture audit completed. Wrote exhaustive report_gemini.md detailing fixing the "Triple Bingo" streaming history explosion, Destructive IPC Read drops, and Asyncio deadlocks. Checkpoint: e3c6b9e.
- `mma_agent_focus_ux_20260302` — Per-tier source_tier tagging on comms+tool entries; Focus Agent combo UI; filter logic in comms+tool panels; [tier] label per comms entry. 18 tests. Checkpoint: b30e563.
- `feature_bleed_cleanup_20260302` — Removed dead comms panel dup, dead menubar block, duplicate __init__ vars; added working Quit; fixed Token Budget layout. All phases verified. Checkpoint: 0d081a2.
+*(See archive: strict_execution_queue_completed_20260306)*

 ---

-## Planned: The Strict Execution Queue
-*All previously loose backlog items have been rigorously spec'd and initialized as Conductor Tracks. They MUST be executed in this exact order.*
+## Phase 3: Future Horizons (Tracks 1-19)
+*Initialized: 2026-03-06*

-> [!WARNING] TEST ARCHITECTURE DEBT NOTICE (2026-03-05)
-> The `gui_decoupling` track exposed deep flaws in the test architecture (asyncio event loop exhaustion, IPC polling race conditions, phantom Windows subprocesses). 
-> **Current Testing Policy:** 
-> - Full-suite integration tests (`live_gui` / extended sims) are currently considered **"flaky by design"**. 
-> - Do NOT write new `live_gui` simulations until Track #1, #2, and #3 are complete. 
-> - If unit tests pass but `test_extended_sims.py` hangs or fails locally, you may manually verify the GUI behavior and proceed.
+### Architecture & Backend

-### 1. `hook_api_ui_state_verification_20260302` (Active/Next)
- **Status:** Initialized
+#### 1. `true_parallel_worker_execution_20260306`
+- **Status:** Planned
 - **Priority:** High
- **Goal:** Add a `/api/gui/state` GET endpoint. Wire UI state into `_settable_fields` to enable programmatic `live_gui` testing without user confirmation. 
- **Fixes Test Debt:** Replaces brittle `time.sleep()` and string-matching assertions in simulations with deterministic API queries.
+- **Goal:** Implement true concurrency for the DAG engine. Once `threading.local()` is in place, the `ExecutionEngine` should spawn independent Tier 3 workers in parallel (e.g., 4 workers handling 4 isolated tests simultaneously). Requires strict file-locking or a Git-based diff-merging strategy to prevent AST collision.

-### 2. `asyncio_decoupling_refactor_20260306`
- **Status:** Initialized
+#### 2. `deep_ast_context_pruning_20260306`
+- **Status:** Planned
 - **Priority:** High
- **Goal:** Resolve deep asyncio/threading deadlocks. Replace `asyncio.Queue` in `AppController` with a standard `queue.Queue`. Ensure phantom subprocesses are killed.
- **Fixes Test Debt:** Eliminates `RuntimeError: Event loop is closed` and zombie port 8999 hijacking. Restores full-suite reliability.
+- **Goal:** Before dispatching a Tier 3 worker, use `tree_sitter` to automatically parse the target file's AST, strip out unrelated function bodies, and inject a surgically condensed skeleton into the worker's prompt. Guarantees the AI only "sees" what it needs to edit, drastically reducing token burn.

-### 3. `mock_provider_hardening_20260305`
- **Status:** Initialized
+#### 3. `visual_dag_ticket_editing_20260306`
+- **Status:** Planned
 - **Priority:** Medium
- **Goal:** Introduce negative testing paths (malformed JSON, timeouts) into the mock AI provider.
- **Fixes Test Debt:** Allows the test suite to verify error handling flows that were previously masked by a mock provider that only ever returned success.
+- **Goal:** Replace the linear ticket list in the GUI with an interactive Node Graph using ImGui Bundle's node editor. Allow the user to visually drag dependency lines, split nodes, or delete tasks before clicking "Execute Pipeline."

-### 4. `robust_json_parsing_tech_lead_20260302`
- **Status:** Initialized
+#### 4. `tier4_auto_patching_20260306`
+- **Status:** Planned
 - **Priority:** Medium
- **Goal:** Implement an auto-retry loop that catches `JSONDecodeError` and feeds the traceback to the Tier 2 model for self-correction.
- **Test Debt Note:** Rely strictly on in-process `unittest.mock` to verify the retry logic until stabilization tracks are done.
+- **Goal:** Elevate Tier 4 from a log summarizer to an auto-patcher. When a verification test fails, Tier 4 generates a `.patch` file. The GUI intercepts this and presents a side-by-side Diff Viewer. The user clicks "Apply Patch" to instantly resume the pipeline.

-### 5. `concurrent_tier_source_tier_20260302`
- **Status:** Initialized
+#### 5. `native_orchestrator_20260306`
+- **Status:** Planned
 - **Priority:** Low
- **Goal:** Replace global state with `threading.local()` or explicit context passing to guarantee thread-safe logging when multiple Tier 3 workers process tickets in parallel.
- **Test Debt Note:** Use in-process mocks to verify concurrency.
-
-### 6. `manual_ux_validation_20260302`
- **Status:** Initialized
- **Priority:** Medium
- **Goal:** Highly interactive human-in-the-loop track to review and adjust GUI UX, animations, popups, and layout structures based on slow-interval simulation feedback.
- **Test Debt Note:** Naturally bypasses automated testing debt as it is purely human-in-the-loop.
-
-### 7. `async_tool_execution_20260303`
- **Status:** Initialized
- **Priority:** Medium
- **Goal:** Refactor MCP tool execution to utilize `asyncio.gather` or thread pools to run multiple tools concurrently within a single AI loop.
- **Test Debt Note:** Use in-process mocks to verify concurrency.
-
-### 8. `simulation_fidelity_enhancement_20260305`
- **Status:** Initialized
- **Priority:** Low
- **Goal:** Add human-like jitter, hesitation, and reading latency to the UserSimAgent.
+- **Goal:** Absorb the Conductor extension entirely into the core application. Manual Slop should natively read/write `plan.md`, manage the `metadata.json`, and orchestrate the MMA tiers in pure Python, removing the dependency on external CLI shell executions (`mma_exec.py`).

 ---

-## Phase 3: Future Horizons (Post-Hardening Backlog)
-*To be evaluated in a future Tier 1 session once the Strict Execution Queue is cleared and the architectural foundation is stabilized.*
+### GUI Overhauls & Visualizations

-### 1. True Parallel Worker Execution (The DAG Realization)
-**Goal:** Implement true concurrency for the DAG engine. Once `threading.local()` is in place, the `ExecutionEngine` should spawn independent Tier 3 workers in parallel (e.g., 4 workers handling 4 isolated tests simultaneously). Requires strict file-locking or a Git-based diff-merging strategy to prevent AST collision.
+#### 6. `cost_token_analytics_20260306`
+- **Status:** Planned
+- **Priority:** High
+- **Goal:** Real-time cost tracking panel displaying cost per model, session totals, and breakdown by tier. Uses the existing `cost_tracker.py`, which is implemented but has no GUI.

-### 2. Deep AST-Driven Context Pruning (RAG for Code)
-**Goal:** Before dispatching a Tier 3 worker, use `tree_sitter` to automatically parse the target file's AST, strip out unrelated function bodies, and inject a surgically condensed skeleton into the worker's prompt. Guarantees the AI only "sees" what it needs to edit, drastically reducing token burn.
+#### 7. `performance_dashboard_20260306`
+- **Status:** Planned
+- **Priority:** High
+- **Goal:** Expand the performance metrics panel with CPU/RAM usage, frame time, and input lag, each with historical graphs. Uses the existing `performance_monitor.py`, which has basic metrics but no detailed visualization.

-### 3. Visual DAG & Interactive Ticket Editing
-**Goal:** Replace the linear ticket list in the GUI with an interactive Node Graph using ImGui Bundle's node editor. Allow the user to visually drag dependency lines, split nodes, or delete tasks before clicking "Execute Pipeline."
+#### 8. `mma_multiworker_viz_20260306`
+- **Status:** Planned
+- **Priority:** High
+- **Goal:** Split-view GUI for parallel worker streams per tier. Visualize multiple concurrent workers with individual status, output tabs, and resource usage. Enable kill/restart per worker.

-### 4. Advanced Tier 4 QA Auto-Patching
-**Goal:** Elevate Tier 4 from a log summarizer to an auto-patcher. When a verification test fails, Tier 4 generates a `.patch` file. The GUI intercepts this and presents a side-by-side Diff Viewer. The user clicks "Apply Patch" to instantly resume the pipeline.
+#### 9. `cache_analytics_20260306`
+- **Status:** Planned
+- **Priority:** Medium
+- **Goal:** Gemini cache hit/miss visualization, memory usage, TTL status display. Uses the existing `ai_client.get_gemini_cache_stats()`, which is not displayed in the GUI.

-### 5. Transitioning to a Native Orchestrator
-**Goal:** Absorb the Conductor extension entirely into the core application. Manual Slop should natively read/write `plan.md`, manage the `metadata.json`, and orchestrate the MMA tiers in pure Python, removing the dependency on external CLI shell executions (`mma_exec.py`).
+#### 10. `tool_usage_analytics_20260306`
+- **Status:** Planned
+- **Priority:** Medium
+- **Goal:** Analytics panel showing most-used tools, average execution time, and failure rates. Uses existing `tool_log_callback` data.
+
+#### 11. `session_insights_20260306`
+- **Status:** Planned
+- **Priority:** Medium
+- **Goal:** Token usage over time, cost projections, session summary with efficiency scores. Visualize `session_logger` data.
+
+#### 12. `track_progress_viz_20260306`
+- **Status:** Planned
+- **Priority:** Medium
+- **Goal:** Progress bars and percentage completion for active tracks and tickets. Better visualization of DAG execution state.
+
+#### 13. `manual_skeleton_injection_20260306`
+- **Status:** Planned
+- **Priority:** Medium
+- **Goal:** Add UI controls to manually flag files for skeleton injection in discussions. Allow agent to request full file reads or specific def/class definitions on-demand. Currently skeletons are auto-generated for workers only; extend to manual discussions with user-controlled file selection and def-level retrieval.
+
+#### 14. `on_demand_def_lookup_20260306`
+- **Status:** Planned
+- **Priority:** Medium
+- **Goal:** Add the ability for the agent to request specific class/function definitions during a discussion. The user can @mention a symbol to get its full definition inline, or let the AI auto-fetch definitions when it encounters unknown symbols. Complements skeleton injection by providing deep-dive capability.
+
+#### 15. `manual_ux_validation_20260302`
+- **Status:** Planned
+- **Priority:** Medium
+- **Goal:** Interactive human-in-the-loop track to review and adjust GUI UX, animations, popups, and layout structures.
+
+---
+
+### Manual UX Controls
+
+#### 15. `ticket_queue_mgmt_20260306`
+- **Status:** Planned
+- **Priority:** High
+- **Goal:** Allow user to manually reorder, prioritize, or requeue tickets in the DAG. Add drag-drop reordering, priority tags, and bulk selection for execute/skip/block.
+
+#### 16. `kill_abort_workers_20260306`
+- **Status:** Planned
+- **Priority:** High
+- **Goal:** Add ability to kill/abort a running Tier 3 worker mid-execution. Currently workers run to completion; add cancel button with forced termination option.
+
+#### 17. `manual_block_control_20260306`
+- **Status:** Planned
+- **Priority:** Medium
+- **Goal:** Allow user to manually block or unblock tickets with custom reasons. Currently blocked tickets rely on dependency resolution; add manual override.
+
+#### 18. `pipeline_pause_resume_20260306`
+- **Status:** Planned
+- **Priority:** Medium
+- **Goal:** Add global pause/resume for the entire DAG execution pipeline. Allow user to freeze all worker activity and resume later.
+
+#### 19. `per_ticket_model_20260306`
+- **Status:** Planned
+- **Priority:** Low
+- **Goal:** Allow user to manually select which model to use for a specific ticket, overriding the default tier model. Useful for forcing a smarter model on hard tickets.
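
The `kill_abort_workers` and `pipeline_pause_resume` goals above could share one control object that workers poll between units of work. The following is a minimal sketch under that assumption, not the project's implementation: `PipelineControl`, `checkpoint`, and `run_ticket` are hypothetical names, and the `time.sleep` call stands in for real Tier 3 work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class PipelineControl:
    """Hypothetical shared flags for pause/resume and cooperative abort."""

    def __init__(self) -> None:
        self._running = threading.Event()
        self._running.set()              # set = running, cleared = paused
        self._cancelled = threading.Event()

    def pause(self) -> None:
        # Ignore pause requests once cancellation has been requested.
        if not self._cancelled.is_set():
            self._running.clear()

    def resume(self) -> None:
        self._running.set()

    def cancel(self) -> None:
        self._cancelled.set()
        self._running.set()              # wake paused workers so they can exit

    def checkpoint(self) -> bool:
        """Block while paused; return False once cancellation is requested."""
        self._running.wait()
        return not self._cancelled.is_set()


def run_ticket(control: PipelineControl, ticket_id: str, steps: int) -> str:
    """Stand-in for a worker: checks the control flags before every step."""
    for _ in range(steps):
        if not control.checkpoint():
            return f"{ticket_id}: aborted"
        time.sleep(0.01)                 # placeholder for one unit of real work
    return f"{ticket_id}: done"


if __name__ == "__main__":
    control = PipelineControl()
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(run_ticket, control, f"ticket-{i}", 5) for i in range(2)
        ]
        control.pause()                  # workers freeze at their next checkpoint
        time.sleep(0.05)
        control.resume()
        print([f.result() for f in futures])
```

This only covers cooperative cancellation: a worker stops at its next checkpoint. Python threads cannot be killed from outside, so the "forced termination option" would likely require running each worker in a subprocess that can be terminated.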