chore(conductor): Add new track 'AI Provider Caching Optimization'
This commit is contained in:

conductor/tracks/caching_optimization_20260308/plan.md (new file, 39 lines)
@@ -0,0 +1,39 @@

# Implementation Plan: AI Provider Caching Optimization

## Phase 1: Metric Tracking & Prefix Stabilization

- [ ] Task: Implement cache metric tracking for OpenAI and DeepSeek in `src/ai_client.py`.
  - [ ] Update `_send_deepseek` to extract `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens` from the usage metadata.
  - [ ] Update `_send_openai` (or its equivalent) to extract `cached_tokens` from `prompt_tokens_details`.
  - [ ] Update `_append_comms` and the `response_received` event to propagate these metrics.
- [ ] Task: Optimize prompt structure for OpenAI and DeepSeek to stabilize prefixes.
  - [ ] Ensure system instructions and tool definitions are at the absolute beginning of the messages array.
  - [ ] Research and implement the `prompt_cache_key` parameter for OpenAI, if applicable, to increase hit rates.
- [ ] Task: Conductor - User Manual Verification 'Phase 1: Metric Tracking & Prefix Stabilization' (Protocol in workflow.md)

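The metric extraction above could be sketched roughly as follows. The helper names are hypothetical, and the dict-shaped `usage` payloads stand in for the SDK response objects; only the field names (`prompt_cache_hit_tokens` / `prompt_cache_miss_tokens` for DeepSeek, `prompt_tokens_details.cached_tokens` for OpenAI) come from the tasks above:

```python
# Illustrative sketch only: helper names and dict-shaped payloads are
# assumptions; real code would read these fields off the SDK response objects.

def extract_deepseek_cache_metrics(usage: dict) -> dict:
    """Pull cache hit/miss token counts from a DeepSeek usage payload."""
    return {
        "cache_hit_tokens": usage.get("prompt_cache_hit_tokens", 0),
        "cache_miss_tokens": usage.get("prompt_cache_miss_tokens", 0),
    }


def extract_openai_cache_metrics(usage: dict) -> dict:
    """Pull cached-token counts from an OpenAI usage payload."""
    details = usage.get("prompt_tokens_details") or {}
    return {
        "cached_tokens": details.get("cached_tokens", 0),
        "prompt_tokens": usage.get("prompt_tokens", 0),
    }
```

Defaulting every field to 0 keeps `_append_comms` and the `response_received` event well-formed even when a provider omits cache metadata.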
## Phase 2: Anthropic 4-Breakpoint Optimization

- [ ] Task: Implement hierarchical caching for Anthropic in `src/ai_client.py`.
  - [ ] Refactor `_send_anthropic` to use exactly 4 breakpoints:
    1. Global System block.
    2. Project Context block.
    3. Context Injection block (file contents).
    4. Sliding history window (last N turns).
- [ ] Task: Research and implement "Automatic Caching" if supported by the SDK.
  - [ ] Check if `cache_control: {"type": "ephemeral"}` can be applied at the request level to simplify history caching.
- [ ] Task: Conductor - User Manual Verification 'Phase 2: Anthropic 4-Breakpoint Optimization' (Protocol in workflow.md)

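A minimal sketch of the 4-breakpoint layout described above, assuming string-content history messages; the builder name and block ordering are illustrative, not the project's confirmed `_send_anthropic` shape:

```python
# Sketch of the 4-breakpoint request layout (assumed helper, not real code).
EPHEMERAL = {"type": "ephemeral"}

def build_anthropic_request(global_system: str, project_context: str,
                            injected_files: str, history: list,
                            window: int = 6) -> dict:
    system_blocks = [
        # Breakpoints 1-3: stable prefix blocks, each marked cacheable.
        {"type": "text", "text": global_system, "cache_control": EPHEMERAL},
        {"type": "text", "text": project_context, "cache_control": EPHEMERAL},
        {"type": "text", "text": injected_files, "cache_control": EPHEMERAL},
    ]
    # Breakpoint 4: sliding window over the last N turns; marking the final
    # message lets the whole prefix up to it be reused on the next turn.
    messages = [dict(m) for m in history[-window:]]
    if messages:
        last = messages[-1]
        last["content"] = [{"type": "text", "text": last["content"],
                           "cache_control": EPHEMERAL}]
    return {"system": system_blocks, "messages": messages}
```

The point of exactly four markers is that Anthropic caches everything up to each breakpoint, so the stable prefix (1-3) and the rolling history tail (4) can be invalidated independently.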
## Phase 3: Gemini Caching & TTL Management

- [ ] Task: Optimize Gemini explicit caching logic.
  - [ ] Update `_send_gemini` to handle the 32k-token threshold more intelligently (e.g., only create `CachedContent` when multiple turns are expected).
  - [ ] Expose `_GEMINI_CACHE_TTL` as a configurable setting in `config.toml`.
- [ ] Task: Implement manual cache controls in `src/ai_client.py`.
  - [ ] Add `invalidate_provider_caches(provider)` to delete server-side caches.
- [ ] Task: Conductor - User Manual Verification 'Phase 3: Gemini Caching & TTL Management' (Protocol in workflow.md)

## Phase 4: GUI Integration & Visualization

- [ ] Task: Enhance the AI Metrics panel in `src/gui_2.py`.
  - [ ] Add "Saved Tokens" and "Cache Hit Rate" displays.
  - [ ] Implement visual indicators (badges) for cached files in the Context Hub.
- [ ] Task: Add manual cache management buttons to the AI Settings panel.
  - [ ] Add "Force Cache Rebuild" and "Clear All Server Caches" actions.
- [ ] Task: Final end-to-end efficiency audit across all providers.
- [ ] Task: Conductor - User Manual Verification 'Phase 4: GUI Integration & Visualization' (Protocol in workflow.md)

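The two panel figures could be aggregated from the propagated `response_received` metrics along these lines; the function name and the exact definitions ("Saved Tokens" as total cached tokens, "Cache Hit Rate" as cached over total prompt tokens) are assumptions about how the panel would compute them:

```python
# Hypothetical aggregation for the AI Metrics panel; definitions of the two
# figures are assumptions, not existing gui_2.py code.

def accumulate_metrics(events: list) -> dict:
    """Aggregate per-response cache metrics into panel-ready figures."""
    cached = sum(e.get("cached_tokens", 0) for e in events)
    prompt = sum(e.get("prompt_tokens", 0) for e in events)
    return {
        "saved_tokens": cached,                               # "Saved Tokens"
        "cache_hit_rate": cached / prompt if prompt else 0.0,  # "Cache Hit Rate"
    }
```

Guarding the division keeps the panel sane before the first request, when no prompt tokens have been sent.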