fix to ai_client.py
This commit is contained in:
@@ -13,15 +13,15 @@ Is a local GUI tool for manually curating and sending context to AI APIs. It agg
|
||||
|
||||
**Files:**
|
||||
- `gui.py` - main GUI, `App` class, all panels, all callbacks, confirmation dialog, layout persistence, rich comms rendering
|
||||
- `ai_client.py` - unified provider wrapper, model listing, session management, send, tool/function-call loop, comms log, provider error classification
|
||||
- `aggregate.py` - reads config, collects files/screenshots/discussion, writes numbered `.md` files to `output_dir`
|
||||
- `ai_client.py` - unified provider wrapper, model listing, session management, send, tool/function-call loop, comms log, provider error classification, token estimation, and aggressive history truncation
|
||||
- `aggregate.py` - reads config, collects files/screenshots/discussion, builds `file_items` with `mtime` for cache optimization, writes numbered `.md` files to `output_dir` using `build_markdown_from_items` to avoid double I/O; `run()` returns `(markdown_str, path, file_items)` tuple; `summary_only=False` by default (full file contents sent, not heuristic summaries)
|
||||
- `shell_runner.py` - subprocess wrapper that runs PowerShell scripts sandboxed to `base_dir`, returns stdout/stderr/exit code as a string
|
||||
- `session_logger.py` - opens timestamped log files at session start; writes comms entries as JSON-L and tool calls as markdown; saves each AI-generated script as a `.ps1` file
|
||||
- `project_manager.py` - per-project .toml load/save, entry serialisation (entry_to_str/str_to_entry with @timestamp support), default_project/default_discussion factories, migrate_from_legacy_config, flat_config for aggregate.run(), git helpers (get_git_commit, get_git_log)
|
||||
- `theme.py` - palette definitions, font loading, scale, load_from_config/save_to_config
|
||||
- `gemini.py` - legacy standalone Gemini wrapper (not used by the main GUI; superseded by `ai_client.py`)
|
||||
- `file_cache.py` - stub; Anthropic Files API path removed; kept so stale imports don't break
|
||||
- `mcp_client.py` - MCP-style read-only file tools (read_file, list_directory, search_files, get_file_summary); allowlist enforced against project file_items + base_dirs; dispatched by ai_client tool-use loop for both Anthropic and Gemini
|
||||
- `mcp_client.py` - MCP-style tools (read_file, list_directory, search_files, get_file_summary, web_search, fetch_url); allowlist enforced against project file_items + base_dirs for file tools; web tools are unrestricted; dispatched by ai_client tool-use loop for both Anthropic and Gemini
|
||||
- `summarize.py` - local heuristic summariser (no AI); .py via AST, .toml via regex, .md headings, generic preview; used by mcp_client.get_file_summary and aggregate.build_summary_section
|
||||
- `config.toml` - global-only settings: [ai] provider+model+system_prompt, [theme] palette+font+scale, [projects] paths array + active path
|
||||
- `manual_slop.toml` - per-project file: [project] name+git_dir+system_prompt+main_context, [output] namespace+output_dir, [files] base_dir+paths, [screenshots] base_dir+paths, [discussion] roles+active+[discussion.discussions.<name>] git_commit+last_updated+history
|
||||
@@ -87,7 +87,7 @@ Is a local GUI tool for manually curating and sending context to AI APIs. It agg
|
||||
- All tool calls (script + result/rejection) are appended to `_tool_log` and displayed in the Tool Calls panel
|
||||
|
||||
**Dynamic file context refresh (ai_client.py):**
|
||||
- After the last tool call in each round, all project files from `file_items` are re-read from disk via `_reread_file_items()`. The `file_items` variable is reassigned so subsequent rounds see fresh content.
|
||||
- After the last tool call in each round, project files from `file_items` are checked via `_reread_file_items()`. It uses `mtime` to only re-read modified files, returning only the `changed` files to build a minimal `[FILES UPDATED]` block.
|
||||
- For Anthropic: the refreshed file contents are injected as a `text` block appended to the `tool_results` user message, prefixed with `[FILES UPDATED]` and an instruction not to re-read them.
|
||||
- For Gemini: refreshed file contents are appended to the last function response's `output` string as a `[SYSTEM: FILES UPDATED]` block. On the next tool round, stale `[FILES UPDATED]` blocks are stripped from history and old tool outputs are truncated to `_history_trunc_limit` characters to control token growth.
|
||||
- `_build_file_context_text(file_items)` formats the refreshed files as markdown code blocks (same format as the original context)
|
||||
@@ -141,10 +141,12 @@ Entry layout: index + timestamp + direction + kind + provider/model header row,
|
||||
- `log_tool_call(script, result, script_path)` writes the script to `scripts/generated/<ts>_<seq:04d>.ps1` and appends a markdown record to the toolcalls log without the script body (just the file path + result); uses a `threading.Lock` for the sequence counter
|
||||
- `close_session()` flushes and closes both file handles; called just before `dpg.destroy_context()`
|
||||
|
||||
**Anthropic prompt caching:**
|
||||
**Anthropic prompt caching & history management:**
|
||||
- System prompt + context are combined into one string, chunked into <=120k char blocks, and sent as the `system=` parameter array. Only the LAST chunk gets `cache_control: ephemeral`, so the entire system prefix is cached as one unit.
|
||||
- Last tool in `_ANTHROPIC_TOOLS` (`run_powershell`) has `cache_control: ephemeral`; this means the tools prefix is cached together with the system prefix after the first request.
|
||||
- The user message is sent as a plain `[{"type": "text", "text": user_message}]` block with NO cache_control. The context lives in `system=`, not in the first user message.
|
||||
- `_add_history_cache_breakpoint` places `cache_control:ephemeral` on the last content block of the second-to-last user message, using the 4th cache breakpoint to cache the conversation history prefix.
|
||||
- `_trim_anthropic_history` uses token estimation (`_CHARS_PER_TOKEN = 3.5`) to keep the prompt under `_ANTHROPIC_MAX_PROMPT_TOKENS = 180_000`. It strips stale file refreshes from old turns, and drops oldest turn pairs if still over budget.
|
||||
- The tools list is built once per session via `_get_anthropic_tools()` and reused across all API calls within the tool loop, avoiding redundant Python-side reconstruction.
|
||||
- `_strip_cache_controls()` removes stale `cache_control` markers from all history entries before each API call, ensuring only the stable system/tools prefix consumes cache breakpoint slots.
|
||||
- Cache stats (creation tokens, read tokens) are surfaced in the comms log usage dict and displayed in the Comms History panel
|
||||
@@ -180,13 +182,15 @@ Entry layout: index + timestamp + direction + kind + provider/model header row,
|
||||
**MCP file tools (mcp_client.py + ai_client.py):**
|
||||
- Four read-only tools exposed to the AI as native function/tool declarations: `read_file`, `list_directory`, `search_files`, `get_file_summary`
|
||||
- Access control: `mcp_client.configure(file_items, extra_base_dirs)` is called before each send; builds an allowlist of resolved absolute paths from the project's `file_items` plus the `base_dir`; any path that is not explicitly in the list or not under one of the allowed directories returns `ACCESS DENIED`
|
||||
- `mcp_client.dispatch(tool_name, tool_input)` is the single dispatch entry point used by both Anthropic and Gemini tool-use loops
|
||||
- `mcp_client.dispatch(tool_name, tool_input)` is the single dispatch entry point used by both Anthropic and Gemini tool-use loops; `TOOL_NAMES` set now includes all six tool names
|
||||
- Anthropic: MCP tools appear before `run_powershell` in the tools list (no `cache_control` on them; only `run_powershell` carries `cache_control: ephemeral`)
|
||||
- Gemini: MCP tools are included in the `FunctionDeclaration` list alongside `run_powershell`
|
||||
- `get_file_summary` uses `summarize.summarise_file()` — same heuristic used for the initial `<context>` block, so the AI gets the same compact structural view it already knows
|
||||
- `list_directory` sorts dirs before files; shows name, type, and size
|
||||
- `search_files` uses `Path.glob()` with the caller-supplied pattern (supports `**/*.py` style)
|
||||
- `read_file` returns raw UTF-8 text; errors (not found, access denied, decode error) are returned as error strings rather than exceptions, so the AI sees them as tool results
|
||||
- `web_search(query)` queries DuckDuckGo HTML endpoint and returns the top 5 results (title, URL, snippet) as a formatted string; uses a custom `_DDGParser` (HTMLParser subclass)
|
||||
- `fetch_url(url)` fetches a URL, strips HTML tags/scripts via `_TextExtractor` (HTMLParser subclass), collapses whitespace, and truncates to 40k chars to prevent context blowup; handles DuckDuckGo redirect links automatically
|
||||
- `summarize.py` heuristics: `.py` → AST imports + ALL_CAPS constants + classes+methods + top-level functions; `.toml` → table headers + top-level keys; `.md` → h1–h3 headings with indentation; all others → line count + first 8 lines preview
|
||||
- Comms log: MCP tool calls log `OUT/tool_call` with `{"name": ..., "args": {...}}` and `IN/tool_result` with `{"name": ..., "output": ...}`; rendered in the Comms History panel via `_render_payload_tool_call` (shows each arg key/value) and `_render_payload_tool_result` (shows output)
|
||||
|
||||
@@ -199,7 +203,9 @@ Entry layout: index + timestamp + direction + kind + provider/model header row,
|
||||
|
||||
### Gemini Context Management
|
||||
- Gemini uses explicit caching via `client.caches.create()` to store the `system_instruction` + tools as an immutable cached prefix with a 1-hour TTL. The cache is created once per chat session.
|
||||
- Proactively rebuilds cache at 90% of `_GEMINI_CACHE_TTL = 3600` to avoid stale-reference errors.
|
||||
- When context changes (detected via `md_content` hash), the old cache is deleted, a new cache is created, and chat history is migrated to a fresh chat session pointing at the new cache.
|
||||
- Trims history by dropping oldest pairs if input tokens exceed `_GEMINI_MAX_INPUT_TOKENS = 900_000`.
|
||||
- If cache creation fails (e.g., content is under the minimum token threshold — 1024 for Flash, 4096 for Pro), the system falls back to inline `system_instruction` in the chat config. Implicit caching may still provide cost savings in this case.
|
||||
- The `<context>` block lives inside `system_instruction`, NOT in user messages, preventing history bloat across turns.
|
||||
- On cleanup/exit, active caches are deleted via `ai_client.cleanup()` to prevent orphaned billing.
|
||||
@@ -244,3 +250,20 @@ Documentation has been completely rewritten matching the strict, structural form
|
||||
- `docs/guide_architecture.md`: Details the Python implementation algorithms, queue management for UI rendering, the specific AST heuristics used for context aggregation, and the distinct algorithms for trimming Anthropic history vs Gemini state caching.
|
||||
- `docs/Readme.md`: The core interface manual.
|
||||
- `docs/guide_tools.md`: Security architecture for `_is_allowed` paths and definitions of the read-only vs destructive tool pipeline.
|
||||
|
||||
|
||||
|
||||
|
||||
## Updates (2026-02-22 — ai_client.py & aggregate.py)
|
||||
|
||||
### mcp_client.py — Web Tools Added
|
||||
- `web_search(query)` and `fetch_url(url)` added as two new MCP tools alongside the existing four file tools.
|
||||
- `TOOL_NAMES` set updated to include all six tool names for dispatch routing.
|
||||
- `MCP_TOOL_SPECS` list extended with full JSON schema definitions for both web tools.
|
||||
- Both tools are declared in `_build_anthropic_tools()` and `_gemini_tool_declaration()` so they are available to both providers.
|
||||
- Web tools bypass the `_is_allowed` path check (no filesystem access); file tools retain the allowlist enforcement.
|
||||
|
||||
### aggregate.py — run() double-I/O elimination
|
||||
- `run()` now calls `build_file_items()` once, then passes the result to `build_markdown_from_items()` instead of calling `build_files_section()` separately. This avoids reading every file twice per send.
|
||||
- `build_markdown_from_items()` accepts a `summary_only` flag (default `False`); when `False` it inlines full file content; when `True` it delegates to `summarize.build_summary_markdown()` for compact structural summaries.
|
||||
- `run()` returns a 3-tuple `(markdown_str, output_path, file_items)` — the `file_items` list is passed through to `gui.py` as `self.last_file_items` for dynamic context refresh after tool calls.
|
||||
|
||||
Reference in New Issue
Block a user