conductor: Create 3 MVP tracks with surgical specs from full codebase analysis

Three new tracks identified by analyzing product.md requirements against
actual codebase state using 1M-context Opus with all architecture docs loaded:

1. mma_pipeline_fix_20260301 (P0, blocker):
   - Diagnoses why Tier 3 worker output never reaches mma_streams in GUI
   - Identifies 4 root cause candidates: positional arg ordering,
     asyncio.Queue thread-safety violation, ai_client.reset_session()
     side effects, token stats stub returning empty dict
   - 2 phases, 6 tasks with exact line references

2. simulation_hardening_20260301 (P1, depends on pipeline fix):
   - Addresses 3 documented issues from robust_live_simulation session
     compression
   - Mock triggers wrong approval popup, popup state desync, approval
     ambiguity
   - 3 phases, 9 tasks including standalone mock test suite

3. context_token_viz_20260301 (P2):
   - Builds UI for product.md primary use case #2 'Context & Memory
     Management'
   - Backend already complete (get_history_bleed_stats, 140 lines)
   - Token budget bar, proportion breakdown, trimming preview, cache status
   - 3 phases, 10 tasks

Execution order: pipeline_fix -> simulation_hardening -> gui_ux (parallel w/ token_viz)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-01 09:58:34 -05:00
parent d93f650c3a
commit 0d2b6049d1
9 changed files with 194 additions and 0 deletions
--- a/conductor/tracks/context_token_viz_20260301/spec.md
+++ b/conductor/tracks/context_token_viz_20260301/spec.md
@@ -0,0 +1,42 @@
+# Track Specification: Context & Token Visualization
+
+## Overview
+product.md lists "Context & Memory Management" as primary use case #2: "Better visualization and management of token usage and context memory, allowing developers to optimize prompt limits manually." The backend already computes everything needed via `ai_client.get_history_bleed_stats()` (ai_client.py:1657-1796, 140 lines). This track builds the UI to expose it.
+
+## Current State
+
+### Backend (already implemented)
+`get_history_bleed_stats(md_content=None) -> dict[str, Any]` returns:
+- `provider`: Active provider name
+- `model`: Active model name
+- `history_turns`: Number of conversation turns
+- `estimated_prompt_tokens`: Total estimated prompt tokens (system + history + tools)
+- `max_prompt_tokens`: Provider's max (180K Anthropic, 900K Gemini)
+- `utilization_pct`: `estimated / max * 100`
+- `headroom_tokens`: Tokens remaining before trimming kicks in
+- `would_trim`: Boolean — whether the next call would trigger history trimming
+- `trimmable_turns`: Number of turns that could be dropped
+- `system_tokens`: Tokens consumed by system prompt + context
+- `tools_tokens`: Tokens consumed by tool definitions
+- `history_tokens`: Tokens consumed by conversation history
+- Per-message breakdown with role, token estimate, and whether it contains tool use
+
+### GUI (missing)
+No UI exists to display any of this. The user has zero visibility into:
+- How close they are to hitting the context window limit
+- What proportion is system prompt vs history vs tools
+- Which messages would be trimmed and when
+- Whether Gemini's server-side cache is active and how large it is
+
+## Goals
+1. **Token Budget Bar**: A prominent progress bar showing context utilization (green < 50%, yellow 50-80%, red > 80%).
+2. **Breakdown Panel**: Stacked bar or table showing system/tools/history proportions.
+3. **Trimming Preview**: When `would_trim` is true, show which turns would be dropped.
+4. **Cache Status**: For Gemini, show whether `_gemini_cache` exists, its size in tokens, and TTL remaining.
+5. **Refresh**: Auto-refresh on provider/model switch and after each AI response.
+
+## Architecture Reference
+- AI client state: [docs/guide_architecture.md](../../docs/guide_architecture.md) — see "AI Client: Multi-Provider Architecture"
+- Gemini cache: [docs/guide_architecture.md](../../docs/guide_architecture.md) — see "Gemini Cache Strategy"
+- Anthropic cache: [docs/guide_architecture.md](../../docs/guide_architecture.md) — see "Anthropic Cache Strategy (4-Breakpoint System)"
+- Frame-sync: [docs/guide_architecture.md](../../docs/guide_architecture.md) — see `_process_pending_gui_tasks` for how to safely read backend state from GUI thread
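
Note on Goals 1 and 2 of the spec above: a minimal sketch of how the color bands and the system/tools/history proportions could be derived from the dict that `get_history_bleed_stats()` is documented to return. The helper names `utilization_band` and `breakdown_pct` are hypothetical and not part of the codebase, the sample numbers are made up for illustration, and treating exactly 50% and exactly 80% as yellow is an assumption, since the spec only says "yellow 50-80%".

```python
from typing import Any


def utilization_band(utilization_pct: float) -> str:
    """Map context utilization to the spec's color bands (hypothetical helper).

    Assumption: the boundaries 50 and 80 both count as yellow.
    """
    if utilization_pct < 50:
        return "green"
    if utilization_pct <= 80:
        return "yellow"
    return "red"


def breakdown_pct(stats: dict[str, Any]) -> dict[str, float]:
    """Return each component's share of the tokens, in percent (hypothetical helper).

    Reads the `system_tokens`, `tools_tokens`, and `history_tokens` keys
    listed in the spec; a missing key is treated as 0.
    """
    parts = {name: stats.get(f"{name}_tokens", 0) for name in ("system", "tools", "history")}
    total = sum(parts.values())
    if total == 0:
        # Nothing counted yet (e.g. a fresh session): avoid dividing by zero.
        return {name: 0.0 for name in parts}
    return {name: tokens * 100 / total for name, tokens in parts.items()}


# Illustrative values only, shaped like the documented return dict.
sample_stats = {
    "estimated_prompt_tokens": 100_000,
    "max_prompt_tokens": 180_000,
    "system_tokens": 20_000,
    "tools_tokens": 5_000,
    "history_tokens": 75_000,
}
sample_pct = sample_stats["estimated_prompt_tokens"] / sample_stats["max_prompt_tokens"] * 100
sample_band = utilization_band(sample_pct)
sample_breakdown = breakdown_pct(sample_stats)
```

Keeping both helpers as pure functions of the stats dict means the GUI thread only needs a snapshot of the backend state to render the bar and the breakdown, which fits the frame-sync constraint referenced in the spec.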