chore(conductor): Archive track 'context_token_viz_20260301'
5
conductor/archive/context_token_viz_20260301/index.md
Normal file
@@ -0,0 +1,5 @@
# Track context_token_viz_20260301 Context

- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)
9
conductor/archive/context_token_viz_20260301/metadata.json
Normal file
@@ -0,0 +1,9 @@
{
  "track_id": "context_token_viz_20260301",
  "description": "Build UI for context window utilization, token breakdown, trimming preview, and cache status.",
  "type": "feature",
  "status": "new",
  "priority": "P2",
  "created_at": "2026-03-01T15:50:00Z",
  "updated_at": "2026-03-01T15:50:00Z"
}
23
conductor/archive/context_token_viz_20260301/plan.md
Normal file
@@ -0,0 +1,23 @@
# Implementation Plan: Context & Token Visualization

Architecture reference: [docs/guide_architecture.md](../../docs/guide_architecture.md) — AI Client section

## Phase 1: Token Budget Display

- [x] Task 1.1: Add a new method `_render_token_budget_panel(self)` in `gui_2.py`. Place it in the Provider panel area (after `_render_provider_panel`, gui_2.py:2485-2542), or as a new collapsible section within the provider panel. Call `ai_client.get_history_bleed_stats(self._last_stable_md)` — `self._last_stable_md` must be cached from the last `_do_generate()` call (gui_2.py:1408-1425, the `stable_md` return value). Store the result in `self._token_stats: dict = {}`, refreshed on each `_do_generate` call and on provider/model switch. 5bfb20f
- [x] Task 1.2: Render the utilization bar. Use `imgui.progress_bar(stats['utilization_pct'] / 100, ImVec2(-1, 0), f"{stats['utilization_pct']:.1f}%")`. Color-code via `imgui.push_style_color(imgui.Col_.plot_histogram, ...)`: green if <50%, yellow if 50-80%, red if >80%. Below the bar, show: `f"{stats['estimated_prompt_tokens']:,} / {stats['max_prompt_tokens']:,} tokens ({stats['headroom_tokens']:,} remaining)"`. 5bfb20f
- [x] Task 1.3: Render the proportion breakdown as a 3-row table: System (`system_tokens`), Tools (`tools_tokens`), History (`history_tokens`). Each row shows token count and percentage of total. Use `imgui.begin_table("token_breakdown", 3)` with columns: Component, Tokens, Pct. 5bfb20f
- [x] Task 1.4: Write tests verifying `_render_token_budget_panel` calls `get_history_bleed_stats` and handles the empty dict case (when no provider is configured). 5bfb20f
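The rendering in Tasks 1.2-1.3 is mostly imgui calls, but the bar's fraction, color, and label are plain data derived from the stats dict, which is what the Task 1.4 tests can target without a GUI. A minimal sketch of that derivation (the helper name and exact RGBA values are illustrative; the dict keys and color thresholds come from the tasks above):

```python
def budget_bar_props(stats: dict) -> dict:
    """Derive progress-bar fraction, color, and label from token stats.

    `stats` uses the get_history_bleed_stats() keys; an empty dict means
    no provider is configured (Task 1.4's edge case).
    """
    if not stats:
        return {}
    pct = stats["utilization_pct"]
    # Thresholds from Task 1.2: green <50%, yellow 50-80%, red >80%.
    if pct < 50:
        color = (0.2, 0.8, 0.2, 1.0)   # green
    elif pct <= 80:
        color = (0.9, 0.8, 0.1, 1.0)   # yellow
    else:
        color = (0.9, 0.2, 0.2, 1.0)   # red
    label = (f"{stats['estimated_prompt_tokens']:,} / "
             f"{stats['max_prompt_tokens']:,} tokens "
             f"({stats['headroom_tokens']:,} remaining)")
    return {"fraction": pct / 100, "color": color, "label": label}
```

The render method would then just feed `fraction` and `label` into `imgui.progress_bar` after pushing `color` onto `Col_.plot_histogram`.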

## Phase 2: Trimming Preview & Cache Status

- [x] Task 2.1: When `stats.get('would_trim')` is True, render a warning: `imgui.text_colored(ImVec4(1,0.3,0,1), "WARNING: Next call will trim history")`. Below it, show `f"Trimmable turns: {stats['trimmable_turns']}"`. If `stats` contains a per-message breakdown, render the first 3 trimmable messages with their role and token count in a compact list. 7b5d9b1
- [x] Task 2.2: Add Gemini cache status display. Read `ai_client._gemini_cache` (check `is not None`), `ai_client._gemini_cache_created_at`, and `ai_client._GEMINI_CACHE_TTL`. If the cache exists, show: `"Gemini Cache: ACTIVE | Age: {age_seconds}s / {ttl}s | Renews at: {ttl * 0.9:.0f}s"`. If not, show `"Gemini Cache: INACTIVE"`. Guard with `if ai_client._provider == "gemini":`. 7b5d9b1
- [x] Task 2.3: Add Anthropic cache hint. When the provider is `"anthropic"`, show: `"Anthropic: 4-breakpoint ephemeral caching (auto-managed)"` with the number of history turns and whether the latest response used cache reads (check the last comms log entry for `cache_read_input_tokens`). 7b5d9b1
- [x] Task 2.4: Write tests for trimming warning visibility and cache status display. 7b5d9b1
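The Task 2.2 status line reduces to a pure string-formatting function, which is the easy thing to unit-test per Task 2.4. A sketch, assuming `_gemini_cache_created_at` is a `time.time()` timestamp and `_GEMINI_CACHE_TTL` is in seconds (the attribute names come from Task 2.2; their units are my assumption):

```python
import time


def gemini_cache_status(cache, created_at, ttl, now=None) -> str:
    """Format the Gemini cache line from Task 2.2.

    Renewal is shown at 90% of TTL, matching the `ttl * 0.9` in the task.
    `now` is injectable so tests don't depend on the wall clock.
    """
    if cache is None or created_at is None:
        return "Gemini Cache: INACTIVE"
    now = time.time() if now is None else now
    age_seconds = int(now - created_at)
    return (f"Gemini Cache: ACTIVE | Age: {age_seconds}s / {ttl}s"
            f" | Renews at: {ttl * 0.9:.0f}s")
```

The GUI caller would guard this with the `ai_client._provider == "gemini"` check before rendering the returned string.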

## Phase 3: Auto-Refresh & Integration

- [x] Task 3.1: Hook `_token_stats` refresh into three trigger points: (a) after `_do_generate()` completes — cache `stable_md` and call `get_history_bleed_stats`; (b) after a provider/model switch in `current_provider.setter` and `current_model.setter` — clear and re-fetch; (c) after each `handle_ai_response` in `_process_pending_gui_tasks` — refresh stats since the history grew. For (c), set a flag `self._token_stats_dirty = True` and refresh in the next frame's render call to avoid calling the stats function too frequently. 6f18102
- [x] Task 3.2: Add the token budget panel to the Hook API. Extend `/api/gui/mma_status` (or add a new `/api/gui/token_stats` endpoint) to expose `_token_stats` for simulation verification. This allows tests to assert on token utilization levels. 6f18102
- [x] Task 3.3: Conductor - User Manual Verification 'Phase 3: Auto-Refresh & Integration' (Protocol in workflow.md) — verified by user, panel rendering correctly. 2929a64
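The dirty-flag pattern in Task 3.1(c) coalesces any number of triggers into at most one stats fetch per rendered frame. A standalone sketch of the pattern (class and callback names are hypothetical; the actual implementation lives as attributes on the GUI class):

```python
class TokenStatsRefresher:
    """Coalesce refresh triggers into one stats fetch per frame (Task 3.1c)."""

    def __init__(self, fetch_stats):
        self._fetch = fetch_stats   # e.g. a bound get_history_bleed_stats call
        self._dirty = True          # fetch once on the first frame
        self.token_stats: dict = {}

    def mark_dirty(self):
        # Called from _do_generate, the provider/model setters, and
        # handle_ai_response; cheap, safe to call many times per frame.
        self._dirty = True

    def on_frame(self):
        # Called once per render frame; fetches at most once no matter
        # how many triggers fired since the last frame.
        if self._dirty:
            self.token_stats = self._fetch()
            self._dirty = False
```

This keeps the potentially non-trivial token estimation off the hot path: bursts of AI responses between frames still cost a single `get_history_bleed_stats` call.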
42
conductor/archive/context_token_viz_20260301/spec.md
Normal file
@@ -0,0 +1,42 @@
# Track Specification: Context & Token Visualization

## Overview
product.md lists "Context & Memory Management" as primary use case #2: "Better visualization and management of token usage and context memory, allowing developers to optimize prompt limits manually." The backend already computes everything needed via `ai_client.get_history_bleed_stats()` (ai_client.py:1657-1796, 140 lines). This track builds the UI to expose it.

## Current State

### Backend (already implemented)
`get_history_bleed_stats(md_content=None) -> dict[str, Any]` returns:
- `provider`: Active provider name
- `model`: Active model name
- `history_turns`: Number of conversation turns
- `estimated_prompt_tokens`: Total estimated prompt tokens (system + history + tools)
- `max_prompt_tokens`: Provider's max (180K Anthropic, 900K Gemini)
- `utilization_pct`: `estimated / max * 100`
- `headroom_tokens`: Tokens remaining before trimming kicks in
- `would_trim`: Boolean — whether the next call would trigger history trimming
- `trimmable_turns`: Number of turns that could be dropped
- `system_tokens`: Tokens consumed by system prompt + context
- `tools_tokens`: Tokens consumed by tool definitions
- `history_tokens`: Tokens consumed by conversation history
- Per-message breakdown with role, token estimate, and whether it contains tool use
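Several of these fields are simple functions of the raw token counts; a sketch making the relationships explicit (the assumption that trimming triggers exactly at `max_prompt_tokens` is mine — the real `get_history_bleed_stats()` may use an earlier threshold):

```python
def derive_budget_fields(estimated_prompt_tokens: int,
                         max_prompt_tokens: int) -> dict:
    """Relate the derived stats fields to the raw token counts.

    Assumes the trim threshold equals max_prompt_tokens; the actual
    implementation may trim with some headroom to spare.
    """
    return {
        # utilization_pct: `estimated / max * 100` per the spec
        "utilization_pct": estimated_prompt_tokens / max_prompt_tokens * 100,
        # headroom_tokens: tokens remaining before trimming kicks in
        "headroom_tokens": max(max_prompt_tokens - estimated_prompt_tokens, 0),
        # would_trim: whether the next call would trigger history trimming
        "would_trim": estimated_prompt_tokens >= max_prompt_tokens,
    }
```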

### GUI (missing)
No UI exists to display any of this. The user has zero visibility into:
- How close they are to hitting the context window limit
- What proportion is system prompt vs history vs tools
- Which messages would be trimmed and when
- Whether Gemini's server-side cache is active and how large it is

## Goals
1. **Token Budget Bar**: A prominent progress bar showing context utilization (green < 50%, yellow 50-80%, red > 80%).
2. **Breakdown Panel**: Stacked bar or table showing system/tools/history proportions.
3. **Trimming Preview**: When `would_trim` is true, show which turns would be dropped.
4. **Cache Status**: For Gemini, show whether `_gemini_cache` exists, its size in tokens, and TTL remaining.
5. **Refresh**: Auto-refresh on provider/model switch and after each AI response.

## Architecture Reference
- AI client state: [docs/guide_architecture.md](../../docs/guide_architecture.md) — see "AI Client: Multi-Provider Architecture"
- Gemini cache: [docs/guide_architecture.md](../../docs/guide_architecture.md) — see "Gemini Cache Strategy"
- Anthropic cache: [docs/guide_architecture.md](../../docs/guide_architecture.md) — see "Anthropic Cache Strategy (4-Breakpoint System)"
- Frame-sync: [docs/guide_architecture.md](../../docs/guide_architecture.md) — see `_process_pending_gui_tasks` for how to safely read backend state from the GUI thread