manual_slop/conductor/tracks/context_token_viz_20260301/plan.md
Ed_ 0d2b6049d1 conductor: Create 3 MVP tracks with surgical specs from full codebase analysis
Three new tracks identified by analyzing product.md requirements against
actual codebase state using 1M-context Opus with all architecture docs loaded:

1. mma_pipeline_fix_20260301 (P0, blocker):
   - Diagnoses why Tier 3 worker output never reaches mma_streams in GUI
   - Identifies 4 root cause candidates: positional arg ordering, asyncio.Queue
     thread-safety violation, ai_client.reset_session() side effects, token
     stats stub returning empty dict
   - 2 phases, 6 tasks with exact line references

2. simulation_hardening_20260301 (P1, depends on pipeline fix):
   - Addresses 3 documented issues from robust_live_simulation session compression
   - Mock triggers wrong approval popup, popup state desync, approval ambiguity
   - 3 phases, 9 tasks including standalone mock test suite

3. context_token_viz_20260301 (P2):
   - Builds UI for product.md primary use case #2 'Context & Memory Management'
   - Backend already complete (get_history_bleed_stats, 140 lines)
   - Token budget bar, proportion breakdown, trimming preview, cache status
   - 3 phases, 10 tasks

Execution order: pipeline_fix -> simulation_hardening -> gui_ux (in parallel with token_viz)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-01 09:58:34 -05:00


Implementation Plan: Context & Token Visualization

Architecture reference: docs/guide_architecture.md — AI Client section

Phase 1: Token Budget Display

  • Task 1.1: Add a new method _render_token_budget_panel(self) in gui_2.py. Place it in the Provider panel area (after _render_provider_panel, gui_2.py:2485-2542), or as a new collapsible section within the provider panel. Call ai_client.get_history_bleed_stats(self._last_stable_md); this requires caching self._last_stable_md from the last _do_generate() call (gui_2.py:1408-1425, the stable_md return value). Store the result in self._token_stats: dict = {}, refreshed on each _do_generate call and on every provider/model switch.
  • Task 1.2: Render the utilization bar. Use imgui.progress_bar(stats['utilization_pct'] / 100, ImVec2(-1, 0), f"{stats['utilization_pct']:.1f}%"). Color-code via imgui.push_style_color(imgui.Col_.plot_histogram, ...): green below 50%, yellow from 50% to 80%, red above 80%. Below the bar, show: f"{stats['estimated_prompt_tokens']:,} / {stats['max_prompt_tokens']:,} tokens ({stats['headroom_tokens']:,} remaining)".
  • Task 1.3: Render the proportion breakdown as a 3-row table: System (system_tokens), Tools (tools_tokens), History (history_tokens). Each row shows token count and percentage of total. Use imgui.begin_table("token_breakdown", 3) with columns: Component, Tokens, Pct.
  • Task 1.4: Write tests verifying _render_token_budget_panel calls get_history_bleed_stats and handles the empty dict case (when no provider is configured).
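Most of Phase 1 is imgui rendering that needs a live GUI context, but the threshold and formatting logic in Tasks 1.2-1.3 (and the empty-dict case from Task 1.4) can be factored into pure helpers and unit-tested headlessly. A minimal sketch — the helper names here are illustrative, not existing code in gui_2.py:

```python
def pick_bar_color(utilization_pct: float) -> tuple:
    """RGBA for the budget bar: green below 50%, yellow 50-80%, red above 80%."""
    if utilization_pct < 50:
        return (0.2, 0.8, 0.2, 1.0)
    if utilization_pct <= 80:
        return (0.9, 0.8, 0.1, 1.0)
    return (0.9, 0.2, 0.2, 1.0)


def format_budget_label(stats: dict) -> str:
    """Text shown under the bar (Task 1.2); tolerates the empty dict
    returned when no provider is configured (Task 1.4)."""
    if not stats:
        return "Token stats unavailable"
    return (f"{stats['estimated_prompt_tokens']:,} / "
            f"{stats['max_prompt_tokens']:,} tokens "
            f"({stats['headroom_tokens']:,} remaining)")


def breakdown_rows(stats: dict) -> list:
    """Rows for the Component/Tokens/Pct table (Task 1.3)."""
    total = stats.get('estimated_prompt_tokens') or 1  # avoid divide-by-zero
    rows = []
    for label, key in (("System", "system_tokens"),
                       ("Tools", "tools_tokens"),
                       ("History", "history_tokens")):
        tokens = stats.get(key, 0)
        rows.append((label, tokens, 100.0 * tokens / total))
    return rows
```

The render method would then only call imgui with these helpers' outputs, keeping the tests in Task 1.4 free of any imgui dependency.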

Phase 2: Trimming Preview & Cache Status

  • Task 2.1: When stats.get('would_trim') is True, render a warning: imgui.text_colored(ImVec4(1, 0.3, 0, 1), "WARNING: Next call will trim history"). Below it, show f"Trimmable turns: {stats['trimmable_turns']}". If stats contains a per-message breakdown, render the first three trimmable messages with their role and token count in a compact list.
  • Task 2.2: Add a Gemini cache status display. Read ai_client._gemini_cache (checking that it is not None), ai_client._gemini_cache_created_at, and ai_client._GEMINI_CACHE_TTL. If the cache exists, show: "Gemini Cache: ACTIVE | Age: {age_seconds}s / {ttl}s | Renews at: {ttl * 0.9:.0f}s". Otherwise show "Gemini Cache: INACTIVE". Guard the whole block with if ai_client._provider == "gemini":.
  • Task 2.3: Add an Anthropic cache hint. When the provider is "anthropic", show: "Anthropic: 4-breakpoint ephemeral caching (auto-managed)", along with the number of history turns and whether the latest response used cache reads (check the last comms log entry for cache_read_input_tokens).
  • Task 2.4: Write tests for trimming warning visibility and cache status display.
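The string-building and age arithmetic in Tasks 2.1-2.2 can likewise be isolated from imgui for the tests in Task 2.4. A sketch under the attribute names the tasks reference (_gemini_cache, _gemini_cache_created_at, _GEMINI_CACHE_TTL); the helper functions themselves are hypothetical:

```python
import time


def trimming_warning_lines(stats: dict) -> list:
    """Lines for the Task 2.1 warning; empty list when no trim is pending."""
    if not stats.get('would_trim'):
        return []
    return ["WARNING: Next call will trim history",
            f"Trimmable turns: {stats['trimmable_turns']}"]


def gemini_cache_status(cache, created_at, ttl_seconds: float,
                        now: float = None) -> str:
    """Status line per Task 2.2: an active cache shows its age against the
    TTL and the 0.9 * TTL renewal point; otherwise report INACTIVE."""
    if cache is None or created_at is None:
        return "Gemini Cache: INACTIVE"
    now = time.time() if now is None else now
    age_seconds = int(now - created_at)
    return (f"Gemini Cache: ACTIVE | Age: {age_seconds}s / {int(ttl_seconds)}s"
            f" | Renews at: {ttl_seconds * 0.9:.0f}s")
```

The render code would pass ai_client._gemini_cache, ai_client._gemini_cache_created_at, and ai_client._GEMINI_CACHE_TTL straight through, so tests can exercise both branches with plain values.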

Phase 3: Auto-Refresh & Integration

  • Task 3.1: Hook the _token_stats refresh into three trigger points: (a) after _do_generate() completes — cache stable_md and call get_history_bleed_stats; (b) after a provider/model switch in the current_provider and current_model setters — clear and re-fetch; (c) after each handle_ai_response in _process_pending_gui_tasks — refresh stats, since the history has grown. For (c), set a flag self._token_stats_dirty = True and perform the refresh in the next frame's render call, so repeated responses in one frame trigger at most one stats fetch.
  • Task 3.2: Add the token budget panel to the Hook API. Extend /api/gui/mma_status (or add a new /api/gui/token_stats endpoint) to expose _token_stats for simulation verification. This allows tests to assert on token utilization levels.
  • Task 3.3: Conductor - User Manual Verification 'Phase 3: Auto-Refresh & Integration' (Protocol in workflow.md)
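The dirty-flag scheme in Task 3.1(c) coalesces any number of history-growth events into at most one stats fetch per rendered frame. A minimal sketch of that pattern — the class and method names are illustrative, not part of gui_2.py:

```python
class TokenStatsRefresher:
    """Defers get_history_bleed_stats calls to frame boundaries
    (Task 3.1, trigger (c))."""

    def __init__(self, fetch_stats):
        # fetch_stats: zero-arg callable, e.g. get_history_bleed_stats
        # already bound to the cached stable_md.
        self._fetch_stats = fetch_stats
        self._dirty = True          # fetch on the first frame
        self._stats: dict = {}

    def mark_dirty(self) -> None:
        """Cheap; call after every handle_ai_response."""
        self._dirty = True

    def stats_for_frame(self) -> dict:
        """Call once per render frame; fetches only when dirty."""
        if self._dirty:
            self._stats = self._fetch_stats() or {}
            self._dirty = False
        return self._stats
```

Provider/model switches (trigger (b)) would simply call mark_dirty() after clearing any cached stable_md, and the /api/gui/token_stats endpoint from Task 3.2 can serve the same cached dict without forcing an extra fetch.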