Implementation Plan: AI Provider Caching Optimization
Phase 1: Metric Tracking & Prefix Stabilization
- Task: Implement cache metric tracking for OpenAI and DeepSeek in `src/ai_client.py`.
  - Update `_send_deepseek` to extract `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens` from usage metadata.
  - Update `_send_openai` (or its equivalent) to extract `cached_tokens` from `prompt_tokens_details`.
  - Update `_append_comms` and the `response_received` event to propagate these metrics.
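The extraction step above can be sketched as a small provider-agnostic helper. The field names (`prompt_cache_hit_tokens` / `prompt_cache_miss_tokens` for DeepSeek, `prompt_tokens_details.cached_tokens` for OpenAI) come from the plan itself; the helper name and the dict-shaped usage payload are illustrative assumptions, not the actual `src/ai_client.py` API.

```python
def extract_cache_metrics(provider: str, usage: dict) -> dict:
    """Normalize per-provider usage metadata into {hit_tokens, miss_tokens}.

    Illustrative sketch: real SDKs return typed objects, not plain dicts.
    """
    if provider == "deepseek":
        # DeepSeek reports hits and misses directly in the usage block.
        hits = usage.get("prompt_cache_hit_tokens", 0)
        misses = usage.get("prompt_cache_miss_tokens", 0)
    elif provider == "openai":
        # OpenAI reports only the cached portion; misses are the remainder.
        details = usage.get("prompt_tokens_details") or {}
        hits = details.get("cached_tokens", 0)
        misses = usage.get("prompt_tokens", 0) - hits
    else:
        hits, misses = 0, usage.get("prompt_tokens", 0)
    return {"hit_tokens": hits, "miss_tokens": misses}
```

A normalized shape like this lets `_append_comms` and the `response_received` event carry one metric structure for every provider.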
- Task: Optimize prompt structure for OpenAI and DeepSeek to stabilize prefixes.
  - Ensure system instructions and tool definitions are at the absolute beginning of the messages array.
  - Research and implement the `prompt_cache_key` parameter for OpenAI, if applicable, to increase hit rates.
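A prefix-stable request builder for this task could look like the sketch below. Provider prompt caches match on the leading tokens of the request, so static content (system prompt, tool schemas) must come first and the fresh user turn last; `prompt_cache_key` is OpenAI's optional routing hint for grouping related requests, and whether it applies here is exactly the research item above. The function name, model string, and session-id keying are assumptions.

```python
def build_request(system_prompt: str, tools: list, history: list,
                  user_msg: str, session_id: str) -> dict:
    """Assemble a chat request with the static prefix first (sketch)."""
    messages = [{"role": "system", "content": system_prompt}]  # static prefix first
    messages += history                                        # then stable history
    messages.append({"role": "user", "content": user_msg})     # dynamic tail last
    return {
        "model": "gpt-4o",                  # placeholder model name
        "messages": messages,
        "tools": tools,                     # tool schemas are part of the cached prefix
        "prompt_cache_key": session_id,     # hint: route one session's requests together
    }
```

Keeping this assembly in one place makes it hard for a later change to accidentally inject dynamic content ahead of the cacheable prefix.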
- Task: Conductor - User Manual Verification 'Phase 1: Metric Tracking & Prefix Stabilization' (Protocol in workflow.md)
Phase 2: Anthropic 4-Breakpoint Optimization
- Task: Implement hierarchical caching for Anthropic in `src/ai_client.py`.
  - Refactor `_send_anthropic` to use exactly 4 breakpoints:
    - Global System block.
    - Project Context block.
    - Context Injection block (file contents).
    - Sliding history window (last N turns).
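The 4-breakpoint layout could be assembled as below. Anthropic's API accepts at most four `cache_control` markers per request, and each marker caches everything up to and including its block, so the breakpoints should sit at boundaries ordered from most to least stable. The model string is a placeholder and the block contents are stand-ins; this is a shape sketch, not the real `_send_anthropic`.

```python
CACHE = {"cache_control": {"type": "ephemeral"}}

def build_anthropic_request(global_system: str, project_context: str,
                            file_context: str, history: list) -> dict:
    """Lay out the four cache breakpoints, most stable first (sketch)."""
    system = [
        {"type": "text", "text": global_system, **CACHE},    # breakpoint 1: global system
        {"type": "text", "text": project_context, **CACHE},  # breakpoint 2: project context
        {"type": "text", "text": file_context, **CACHE},     # breakpoint 3: injected files
    ]
    messages = [dict(m) for m in history]  # shallow copy; don't mutate caller's history
    if messages:
        # Breakpoint 4: mark the last turn of the sliding window so the
        # whole history prefix up to it is cached.
        last = messages[-1]
        last["content"] = [{"type": "text", "text": last["content"], **CACHE}]
    return {"model": "claude-sonnet-4-20250514",  # placeholder model name
            "system": system, "messages": messages, "max_tokens": 1024}
```

Because the history marker moves forward each turn, only the changed suffix is re-billed at the uncached rate while the three static blocks stay pinned.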
- Task: Research and implement "Automatic Caching" if supported by the SDK.
  - Check if `cache_control: {"type": "ephemeral"}` can be applied at the request level to simplify history caching.
- Task: Conductor - User Manual Verification 'Phase 2: Anthropic 4-Breakpoint Optimization' (Protocol in workflow.md)
Phase 3: Gemini Caching & TTL Management
- Task: Optimize Gemini explicit caching logic.
  - Update `_send_gemini` to handle the 32k-token threshold more intelligently (e.g., only create `CachedContent` when multiple turns are expected).
  - Expose `_GEMINI_CACHE_TTL` as a configurable setting in `config.toml`.
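The create-or-skip decision for `_send_gemini` might reduce to a predicate like this. Gemini explicit caching has a minimum cacheable size (the 32k threshold from the plan) and creating a `CachedContent` only pays off when it will be reused, so the sketch also requires an expected multi-turn session and reads the TTL from config rather than the hard-coded `_GEMINI_CACHE_TTL`. The config key name is an assumption.

```python
GEMINI_MIN_CACHE_TOKENS = 32_000  # threshold named in the plan above

def should_create_cache(prompt_tokens: int, expected_turns: int, config: dict) -> bool:
    """Decide whether creating a Gemini CachedContent is worthwhile (sketch)."""
    ttl = config.get("gemini_cache_ttl", 3600)  # seconds; assumed config.toml key
    if ttl <= 0:
        return False  # TTL of zero disables explicit caching entirely
    if prompt_tokens < GEMINI_MIN_CACHE_TOKENS:
        return False  # below the minimum cacheable context size
    return expected_turns > 1  # only cache when the context will be reused
```

Gating on `expected_turns` avoids paying cache-creation and storage cost for one-shot requests that would never hit the cache.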
- Task: Implement manual cache controls in `src/ai_client.py`.
  - Add `invalidate_provider_caches(provider)` to delete server-side caches.
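One way to back `invalidate_provider_caches(provider)` is a small registry of server-side cache handles, sketched below. The registry and the injectable delete hook are assumptions about how `src/ai_client.py` might track caches; only some providers (e.g. Gemini) expose an explicit delete API, so the hook defaults to a no-op.

```python
class CacheRegistry:
    """Track server-side cache handles per provider (illustrative sketch)."""

    def __init__(self, delete_fn=None):
        self._handles = {}  # provider -> list of server-side cache names
        # Injectable hook so tests and no-delete providers can use a no-op.
        self._delete = delete_fn or (lambda provider, name: None)

    def register(self, provider: str, name: str) -> None:
        self._handles.setdefault(provider, []).append(name)

    def invalidate_provider_caches(self, provider: str) -> int:
        """Delete all tracked caches for `provider`; return how many were removed."""
        names = self._handles.pop(provider, [])
        for name in names:
            self._delete(provider, name)  # server-side delete where supported
        return len(names)
```

Returning the count gives the GUI's "Clear All Server Caches" button something concrete to report.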
- Task: Conductor - User Manual Verification 'Phase 3: Gemini Caching & TTL Management' (Protocol in workflow.md)
Phase 4: GUI Integration & Visualization
- Task: Enhance the AI Metrics panel in `src/gui_2.py`.
  - Add "Saved Tokens" and "Cache Hit Rate" displays.
  - Implement visual indicators (badges) for cached files in the Context Hub.
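The two derived metrics for the panel could be computed as below. "Cache Hit Rate" is hit tokens over total prompt tokens; "Saved Tokens" here simply surfaces the hit count (providers bill cached tokens at a discount rather than for free, so the cost saving depends on per-provider pricing). These formulas are this plan's working assumptions, not provider specifications.

```python
def panel_metrics(hit_tokens: int, miss_tokens: int) -> dict:
    """Derive the AI Metrics panel values from raw token counts (sketch)."""
    total = hit_tokens + miss_tokens
    rate = (hit_tokens / total) if total else 0.0  # guard the empty-session case
    return {"saved_tokens": hit_tokens, "cache_hit_rate": f"{rate:.1%}"}
```

Feeding this from the normalized per-response metrics keeps the panel logic provider-agnostic.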
- Task: Add manual cache management buttons to the AI Settings panel.
  - "Force Cache Rebuild" and "Clear All Server Caches".
- Task: Update Comms Log UI to show per-response metrics.
  - Modify `_render_comms_history_panel` in `src/gui_2.py` to display token usage (including cache hits) for each response entry.
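Per-entry display in the Comms Log might reduce to a formatter like this. The real `_render_comms_history_panel` emits GUI widgets; this sketch only produces the summary string an entry could carry, and the entry/usage field names are assumptions tied to the normalized metrics from Phase 1.

```python
def format_entry_metrics(entry: dict) -> str:
    """Render one comms-log entry's token usage as a short label (sketch)."""
    usage = entry.get("usage", {})
    hits = usage.get("hit_tokens", 0)        # cached portion of the prompt
    total = usage.get("prompt_tokens", 0)    # full prompt size
    out = usage.get("completion_tokens", 0)  # response size
    return f"in {total} tok ({hits} cached) / out {out} tok"
```

Showing the cached share inline per response makes prefix-breaking regressions visible at a glance during the Phase 4 audit.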
- Task: Final end-to-end efficiency audit across all providers.
- Task: Conductor - User Manual Verification 'Phase 4: GUI Integration & Visualization' (Protocol in workflow.md)