# Implementation Plan: AI Provider Caching Optimization

## Phase 1: Metric Tracking & Prefix Stabilization

- [ ] Task: Implement cache metric tracking for OpenAI and DeepSeek in `src/ai_client.py`.
  - [ ] Update `_send_deepseek` to extract `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens` from usage metadata.
  - [ ] Update `_send_openai` (or its equivalent) to extract `cached_tokens` from `prompt_tokens_details`.
  - [ ] Update `_append_comms` and the `response_received` event to propagate these metrics.
- [ ] Task: Optimize prompt structure for OpenAI and DeepSeek to stabilize prefixes.
  - [ ] Ensure system instructions and tool definitions are at the absolute beginning of the messages array.
  - [ ] Research and implement the `prompt_cache_key` parameter for OpenAI, if applicable, to increase hit rates.
- [ ] Task: Conductor - User Manual Verification 'Phase 1: Metric Tracking & Prefix Stabilization' (Protocol in workflow.md)

## Phase 2: Anthropic 4-Breakpoint Optimization

- [ ] Task: Implement hierarchical caching for Anthropic in `src/ai_client.py`.
  - [ ] Refactor `_send_anthropic` to use exactly 4 breakpoints:
    1. Global System block.
    2. Project Context block.
    3. Context Injection block (file contents).
    4. Sliding history window (last N turns).
- [ ] Task: Research and implement "Automatic Caching" if supported by the SDK.
  - [ ] Check whether `cache_control: {"type": "ephemeral"}` can be applied at the request level to simplify history caching.
- [ ] Task: Conductor - User Manual Verification 'Phase 2: Anthropic 4-Breakpoint Optimization' (Protocol in workflow.md)

## Phase 3: Gemini Caching & TTL Management

- [ ] Task: Optimize Gemini explicit caching logic.
  - [ ] Update `_send_gemini` to handle the 32k-token threshold more intelligently (e.g., only create `CachedContent` when multiple turns are expected).
  - [ ] Expose `_GEMINI_CACHE_TTL` as a configurable setting in `config.toml`.
- [ ] Task: Implement manual cache controls in `src/ai_client.py`.
  - [ ] Add `invalidate_provider_caches(provider)` to delete server-side caches.
- [ ] Task: Conductor - User Manual Verification 'Phase 3: Gemini Caching & TTL Management' (Protocol in workflow.md)

## Phase 4: GUI Integration & Visualization

- [ ] Task: Enhance the AI Metrics panel in `src/gui_2.py`.
  - [ ] Add "Saved Tokens" and "Cache Hit Rate" displays.
  - [ ] Implement visual indicators (badges) for cached files in the Context Hub.
- [ ] Task: Add manual cache management buttons to the AI Settings panel.
  - [ ] Add "Force Cache Rebuild" and "Clear All Server Caches" buttons.
- [ ] Task: Final end-to-end efficiency audit across all providers.
- [ ] Task: Conductor - User Manual Verification 'Phase 4: GUI Integration & Visualization' (Protocol in workflow.md)
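The Phase 1 metric-tracking subtasks could be sketched as a small normalizer that the `_send_*` methods feed their raw usage metadata into. This is a minimal sketch, not the implementation: the helper name `extract_cache_metrics` and the unified return shape are assumptions, though the provider field names (`prompt_cache_hit_tokens` / `prompt_cache_miss_tokens` for DeepSeek, `prompt_tokens_details.cached_tokens` for OpenAI) are the ones named in the tasks above.

```python
def extract_cache_metrics(provider: str, usage: dict) -> dict:
    """Normalize provider-specific cache usage fields into one shape.

    `usage` is the raw usage/usage_metadata dict from the API response.
    The unified keys (cached_tokens, uncached_tokens, hit_rate) are a
    hypothetical shape for `_append_comms` / the `response_received` event.
    """
    if provider == "deepseek":
        # DeepSeek reports cache hits and misses directly.
        hit = usage.get("prompt_cache_hit_tokens", 0)
        miss = usage.get("prompt_cache_miss_tokens", 0)
    elif provider == "openai":
        # OpenAI reports cached tokens nested under prompt_tokens_details;
        # misses are inferred from the total prompt token count.
        details = usage.get("prompt_tokens_details") or {}
        hit = details.get("cached_tokens", 0)
        miss = usage.get("prompt_tokens", 0) - hit
    else:
        raise ValueError(f"unsupported provider: {provider}")
    total = hit + miss
    return {
        "cached_tokens": hit,
        "uncached_tokens": miss,
        "hit_rate": hit / total if total else 0.0,
    }
```

Keeping the normalization in one place means the Phase 4 "Saved Tokens" and "Cache Hit Rate" displays can consume a single shape regardless of provider.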