Three new tracks identified by analyzing product.md requirements against
actual codebase state using 1M-context Opus with all architecture docs loaded:
1. mma_pipeline_fix_20260301 (P0, blocker):
- Diagnoses why Tier 3 worker output never reaches mma_streams in GUI
- Identifies 4 root cause candidates: positional arg ordering, asyncio.Queue
thread-safety violation, ai_client.reset_session() side effects, token
stats stub returning empty dict
- 2 phases, 6 tasks with exact line references
2. simulation_hardening_20260301 (P1, depends on pipeline fix):
- Addresses 3 documented issues from robust_live_simulation session compression
- Mock triggers wrong approval popup, popup state desync, approval ambiguity
- 3 phases, 9 tasks including standalone mock test suite
3. context_token_viz_20260301 (P2):
- Builds UI for product.md primary use case #2 'Context & Memory Management'
- Backend already complete (get_history_bleed_stats, 140 lines)
- Token budget bar, proportion breakdown, trimming preview, cache status
- 3 phases, 10 tasks
Execution order: pipeline_fix -> simulation_hardening -> gui_ux (parallel w/ token_viz)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implementation Plan: Context & Token Visualization
Architecture reference: docs/guide_architecture.md — AI Client section
Phase 1: Token Budget Display
- Task 1.1: Add a new method `_render_token_budget_panel(self)` in `gui_2.py`. Place it in the Provider panel area (after `_render_provider_panel`, gui_2.py:2485-2542), or as a new collapsible section within the provider panel. Call `ai_client.get_history_bleed_stats(self._last_stable_md)`; this requires caching `self._last_stable_md` from the last `_do_generate()` call (gui_2.py:1408-1425, the `stable_md` return value). Store the result in `self._token_stats: dict = {}`, refreshed on each `_do_generate` call and on provider/model switch.
- Task 1.2: Render the utilization bar. Use `imgui.progress_bar(stats['utilization_pct'] / 100, ImVec2(-1, 0), f"{stats['utilization_pct']:.1f}%")`. Color-code via `imgui.push_style_color(imgui.Col_.plot_histogram, ...)`: green if <50%, yellow if 50-80%, red if >80%. Below the bar, show: `f"{stats['estimated_prompt_tokens']:,} / {stats['max_prompt_tokens']:,} tokens ({stats['headroom_tokens']:,} remaining)"`.
- Task 1.3: Render the proportion breakdown as a 3-row table: System (`system_tokens`), Tools (`tools_tokens`), History (`history_tokens`). Each row shows the token count and its percentage of the total. Use `imgui.begin_table("token_breakdown", 3)` with columns: Component, Tokens, Pct.
- Task 1.4: Write tests verifying that `_render_token_budget_panel` calls `get_history_bleed_stats` and handles the empty-dict case (when no provider is configured).
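The display logic in Tasks 1.2-1.3 can be kept testable by separating the pure formatting from the imgui calls. A minimal sketch, assuming the stat key names listed above (`utilization_pct`, `estimated_prompt_tokens`, `max_prompt_tokens`, `headroom_tokens`, `system_tokens`, `tools_tokens`, `history_tokens`); the helper names are illustrative, not existing code:

```python
def budget_bar_color(utilization_pct: float) -> tuple:
    """RGBA for the progress bar: green <50%, yellow 50-80%, red >80%."""
    if utilization_pct < 50:
        return (0.2, 0.8, 0.2, 1.0)   # green
    if utilization_pct <= 80:
        return (0.9, 0.8, 0.1, 1.0)   # yellow
    return (0.9, 0.2, 0.2, 1.0)       # red

def budget_summary(stats: dict) -> str:
    """One-line summary shown under the progress bar (Task 1.2)."""
    return (f"{stats['estimated_prompt_tokens']:,} / "
            f"{stats['max_prompt_tokens']:,} tokens "
            f"({stats['headroom_tokens']:,} remaining)")

def breakdown_rows(stats: dict) -> list:
    """Rows for the Component/Tokens/Pct table (Task 1.3)."""
    parts = [("System", stats.get("system_tokens", 0)),
             ("Tools", stats.get("tools_tokens", 0)),
             ("History", stats.get("history_tokens", 0))]
    total = sum(n for _, n in parts) or 1  # avoid div-by-zero on empty stats
    return [(name, n, 100.0 * n / total) for name, n in parts]
```

`_render_token_budget_panel` would then only feed these values into `imgui.progress_bar`, `imgui.push_style_color`, and the table rows, so the thresholds and formatting are coverable by plain unit tests.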
Phase 2: Trimming Preview & Cache Status
- Task 2.1: When `stats.get('would_trim')` is True, render a warning: `imgui.text_colored(ImVec4(1,0.3,0,1), "WARNING: Next call will trim history")`. Below it, show `f"Trimmable turns: {stats['trimmable_turns']}"`. If `stats` contains a per-message breakdown, render the first 3 trimmable messages with their role and token count in a compact list.
- Task 2.2: Add Gemini cache status display. Read `ai_client._gemini_cache` (check `is not None`), `ai_client._gemini_cache_created_at`, and `ai_client._GEMINI_CACHE_TTL`. If the cache exists, show: `"Gemini Cache: ACTIVE | Age: {age_seconds}s / {ttl}s | Renews at: {ttl * 0.9:.0f}s"`. If not, show `"Gemini Cache: INACTIVE"`. Guard with `if ai_client._provider == "gemini":`.
- Task 2.3: Add an Anthropic cache hint. When the provider is `"anthropic"`, show: `"Anthropic: 4-breakpoint ephemeral caching (auto-managed)"` together with the number of history turns and whether the latest response used cache reads (check the last comms log entry for `cache_read_input_tokens`).
- Task 2.4: Write tests for trimming-warning visibility and cache status display.
Phase 3: Auto-Refresh & Integration
- Task 3.1: Hook the `_token_stats` refresh into three trigger points: (a) after `_do_generate()` completes, cache `stable_md` and call `get_history_bleed_stats`; (b) after a provider/model switch in the `current_provider` and `current_model` setters, clear and re-fetch; (c) after each `handle_ai_response` in `_process_pending_gui_tasks`, refresh stats since the history grew. For (c), use a flag `self._token_stats_dirty = True` and refresh in the next frame's render call to avoid calling the stats function too frequently.
- Task 3.2: Add the token budget panel to the Hook API. Extend `/api/gui/mma_status` (or add a new `/api/gui/token_stats` endpoint) to expose `_token_stats` for simulation verification. This allows tests to assert on token utilization levels.
- Task 3.3: Conductor - User Manual Verification 'Phase 3: Auto-Refresh & Integration' (Protocol in workflow.md)
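The dirty-flag pattern from Task 3.1 (c) can be sketched in isolation. This is a minimal sketch, not the real GUI class: `TokenStatsHolder`, `on_ai_response`, and `refresh_if_dirty` are hypothetical names, and only `get_history_bleed_stats` comes from the actual codebase:

```python
class TokenStatsHolder:
    """Defers the (potentially heavy) stats call to the next render frame
    instead of running it on every queued AI response."""

    def __init__(self, ai_client):
        self._ai_client = ai_client
        self._last_stable_md = ""       # cached from the last _do_generate()
        self._token_stats: dict = {}
        self._token_stats_dirty = False

    def on_ai_response(self):
        # Trigger point (c): history grew, mark stale but do no work yet.
        self._token_stats_dirty = True

    def refresh_if_dirty(self):
        # Called once per render frame, before drawing the panel; at most
        # one stats call per frame no matter how many responses arrived.
        if self._token_stats_dirty:
            self._token_stats = self._ai_client.get_history_bleed_stats(
                self._last_stable_md) or {}
            self._token_stats_dirty = False
```

Exposing `self._token_stats` through the Hook API endpoint in Task 3.2 then costs nothing extra, since the dict is already materialized each frame.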