feat(token-viz): Phase 3 — auto-refresh triggers and /api/gui/token_stats endpoint
@@ -11,13 +11,13 @@ Architecture reference: [docs/guide_architecture.md](../../docs/guide_architecture.md)

## Phase 2: Trimming Preview & Cache Status

-- [x] Task 2.1: When `stats.get('would_trim')` is True, render a warning: `imgui.text_colored(ImVec4(1,0.3,0,1), "WARNING: Next call will trim history")`. Below it, show `f"Trimmable turns: {stats['trimmable_turns']}"`. If `stats` contains a per-message breakdown, render the first 3 trimmable messages with their role and token count in a compact list.
-- [x] Task 2.2: Add Gemini cache status display. Read `ai_client._gemini_cache` (check `is not None`), `ai_client._gemini_cache_created_at`, and `ai_client._GEMINI_CACHE_TTL`. If the cache exists, show: `"Gemini Cache: ACTIVE | Age: {age_seconds}s / {ttl}s | Renews at: {ttl * 0.9:.0f}s"`. If not, show `"Gemini Cache: INACTIVE"`. Guard with `if ai_client._provider == "gemini":`.
-- [x] Task 2.3: Add Anthropic cache hint. When the provider is `"anthropic"`, show: `"Anthropic: 4-breakpoint ephemeral caching (auto-managed)"` with the number of history turns and whether the latest response used cache reads (check the last comms log entry for `cache_read_input_tokens`).
-- [~] Task 2.4: Write tests for trimming warning visibility and cache status display.
+- [x] Task 2.1: 7b5d9b1 When `stats.get('would_trim')` is True, render a warning: `imgui.text_colored(ImVec4(1,0.3,0,1), "WARNING: Next call will trim history")`. Below it, show `f"Trimmable turns: {stats['trimmable_turns']}"`. If `stats` contains a per-message breakdown, render the first 3 trimmable messages with their role and token count in a compact list.
+- [x] Task 2.2: Add Gemini cache status display. 7b5d9b1 Read `ai_client._gemini_cache` (check `is not None`), `ai_client._gemini_cache_created_at`, and `ai_client._GEMINI_CACHE_TTL`. If the cache exists, show: `"Gemini Cache: ACTIVE | Age: {age_seconds}s / {ttl}s | Renews at: {ttl * 0.9:.0f}s"`. If not, show `"Gemini Cache: INACTIVE"`. Guard with `if ai_client._provider == "gemini":`.
+- [x] Task 2.3: Add Anthropic cache hint. 7b5d9b1 When the provider is `"anthropic"`, show: `"Anthropic: 4-breakpoint ephemeral caching (auto-managed)"` with the number of history turns and whether the latest response used cache reads (check the last comms log entry for `cache_read_input_tokens`).
+- [x] Task 2.4: Write tests for trimming warning visibility and cache status display. 7b5d9b1

## Phase 3: Auto-Refresh & Integration

-- [ ] Task 3.1: Hook `_token_stats` refresh into three trigger points: (a) after `_do_generate()` completes, cache `stable_md` and call `get_history_bleed_stats`; (b) after a provider/model switch in `current_provider.setter` and `current_model.setter`, clear and re-fetch; (c) after each `handle_ai_response` in `_process_pending_gui_tasks`, refresh stats since the history grew. For (c), set a flag `self._token_stats_dirty = True` and refresh in the next frame's render call to avoid calling the stats function too frequently.
-- [ ] Task 3.2: Add the token budget panel to the Hook API. Extend `/api/gui/mma_status` (or add a new `/api/gui/token_stats` endpoint) to expose `_token_stats` for simulation verification. This allows tests to assert on token utilization levels.
+- [x] Task 3.1: Hook `_token_stats` refresh into three trigger points: (a) after `_do_generate()` completes, cache `stable_md` and call `get_history_bleed_stats`; (b) after a provider/model switch in `current_provider.setter` and `current_model.setter`, clear and re-fetch; (c) after each `handle_ai_response` in `_process_pending_gui_tasks`, refresh stats since the history grew. For (c), set a flag `self._token_stats_dirty = True` and refresh in the next frame's render call to avoid calling the stats function too frequently.
+- [~] Task 3.2: Add the token budget panel to the Hook API. Extend `/api/gui/mma_status` (or add a new `/api/gui/token_stats` endpoint) to expose `_token_stats` for simulation verification. This allows tests to assert on token utilization levels.
- [ ] Task 3.3: Conductor - User Manual Verification 'Phase 3: Auto-Refresh & Integration' (Protocol in workflow.md)
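The throttling in Task 3.1(c) is a standard dirty-flag pattern: event handlers only mark the stats stale, and the render loop performs at most one refresh per frame. A minimal sketch with hypothetical names (the real refresh call in `gui_2.py` is `_refresh_api_metrics`):

```python
# Sketch of the Task 3.1(c) dirty-flag debounce; StatsPanel and fetch_stats
# are illustrative stand-ins, not classes from the actual codebase.
class StatsPanel:
    def __init__(self, fetch_stats):
        self._fetch_stats = fetch_stats      # the expensive stats function
        self._token_stats: dict = {}
        self._token_stats_dirty = False
        self.refresh_count = 0               # instrumentation for the sketch

    def on_ai_response(self) -> None:
        # Cheap: just mark stale; may fire many times between frames.
        self._token_stats_dirty = True

    def render(self) -> dict:
        # At most one refresh per rendered frame, regardless of how many
        # responses arrived since the last frame.
        if self._token_stats_dirty:
            self._token_stats_dirty = False
            self._token_stats = self._fetch_stats()
            self.refresh_count += 1
        return self._token_stats
```

Consuming the flag before calling the stats function means a refresh that itself enqueues work cannot retrigger in the same frame.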
17
gui_2.py
@@ -298,6 +298,7 @@ class App:
        self._gemini_cache_text = ""
        self._last_stable_md: str = ''
        self._token_stats: dict = {}
+       self._token_stats_dirty: bool = False
        self.ui_disc_truncate_pairs: int = 2
        self.ui_auto_scroll_comms = True
        self.ui_auto_scroll_tool_calls = True
@@ -360,7 +361,10 @@ class App:
        if hasattr(self, 'hook_server'):
            self.hook_server.start()
        self.available_models = []
        self._fetch_models(value)
+       self._token_stats = {}
+       self._token_stats_dirty = True

    @property
    def current_model(self) -> str:
@@ -372,6 +376,8 @@ class App:
        self._current_model = value
        ai_client.reset_session()
        ai_client.set_provider(self.current_provider, value)
+       self._token_stats = {}
+       self._token_stats_dirty = True

    def _init_ai_and_hooks(self) -> None:
        ai_client.set_provider(self.current_provider, self.current_model)
@@ -607,6 +613,12 @@ class App:
        async def stream(req: GenerateRequest) -> Any:
            """Placeholder for streaming AI generation responses (Not yet implemented)."""
            raise HTTPException(status_code=501, detail="Streaming endpoint (/api/v1/stream) is not yet supported in this version.")

+       @api.get("/api/gui/token_stats", dependencies=[Depends(get_api_key)])
+       def token_stats() -> dict[str, Any]:
+           """Returns current token budget stats for simulation/test verification."""
+           return dict(self._token_stats)
+
        return api

    # ---------------------------------------------------------------- project loading
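One detail in the new endpoint worth noting: returning `dict(self._token_stats)` hands callers a shallow copy, so a later GUI-side refresh cannot mutate a response that was already handed out. A standalone sketch (the `AppStub` class here is illustrative, not the real `App`):

```python
# Illustrative stub showing why the endpoint returns dict(self._token_stats)
# rather than the attribute itself.
class AppStub:
    def __init__(self) -> None:
        self._token_stats: dict = {"total_tokens": 100}

    def token_stats(self) -> dict:
        # Shallow copy: the caller gets a snapshot, not a live reference.
        return dict(self._token_stats)

app = AppStub()
snapshot = app.token_stats()
app._token_stats["total_tokens"] = 999  # simulate a GUI-side refresh
```

A shallow copy is enough as long as the stats dict holds scalars; nested structures would still be shared.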
@@ -935,6 +947,8 @@ class App:
        self.ai_response = text
        self.ai_status = payload.get("status", "done")
        self._trigger_blink = True
+       if not stream_id:
+           self._token_stats_dirty = True
        if self.ui_auto_add_history and not stream_id:
            role = payload.get("role", "AI")
            with self._pending_history_adds_lock:
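The `if not stream_id:` guard in the hunk above means stats are marked stale only once per completed response, not on every streaming chunk. A reduced sketch, assuming (as the surrounding code suggests) that `stream_id` is present on chunks and absent on final responses:

```python
# Hypothetical reduction of the handler above: stream_id is assumed to be
# set on streaming chunks and missing on final, complete responses.
def mark_dirty_on_response(state: dict, payload: dict) -> None:
    stream_id = payload.get("stream_id")
    if not stream_id:
        # Final response: conversation history grew, token stats are stale.
        state["token_stats_dirty"] = True

state = {"token_stats_dirty": False}
mark_dirty_on_response(state, {"text": "partial", "stream_id": "s1"})
dirty_after_chunk = state["token_stats_dirty"]
mark_dirty_on_response(state, {"text": "final answer"})
```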
@@ -2732,6 +2746,9 @@ class App:
            imgui.text_colored(C_SUB, self._gemini_cache_text)

    def _render_token_budget_panel(self) -> None:
+       if self._token_stats_dirty:
+           self._token_stats_dirty = False
+           self._refresh_api_metrics({}, md_content=self._last_stable_md or None)
        stats = self._token_stats
        if not stats:
            imgui.text_disabled("Token stats unavailable")