Compare commits
7 Commits
not_sure...68e895cb8a

| Author | SHA1 | Date |
|---|---|---|
| | 68e895cb8a | |
| | b4734f4bba | |
| | 8a3c2d8e21 | |
| | 73fad80257 | |
| | 17eebff5f8 | |
| | 1581380a43 | |
| | 8bf95866dc | |

+44 -7
@@ -12,16 +12,16 @@ Is a local GUI tool for manually curating and sending context to AI APIs. It agg
 - `uv` - package/env management

 **Files:**
-- `gui.py` - main GUI, `App` class, all panels, all callbacks, confirmation dialog, layout persistence, rich comms rendering
+- `gui.py` - main GUI, `App` class, all panels, all callbacks, confirmation dialog, layout persistence, rich comms rendering; `[+ Maximize]` buttons in `ConfirmDialog` and `win_script_output` now pass text directly as `user_data` / read from `self._last_script` / `self._last_output` instance vars instead of `dpg.get_value(tag)` — fixes glitch when word-wrap is ON or dialog is dismissed before viewer opens
-- `ai_client.py` - unified provider wrapper, model listing, session management, send, tool/function-call loop, comms log, provider error classification
+- `ai_client.py` - unified provider wrapper, model listing, session management, send, tool/function-call loop, comms log, provider error classification, token estimation, and aggressive history truncation
-- `aggregate.py` - reads config, collects files/screenshots/discussion, writes numbered `.md` files to `output_dir`
+- `aggregate.py` - reads config, collects files/screenshots/discussion, builds `file_items` with `mtime` for cache optimization, writes numbered `.md` files to `output_dir` using `build_markdown_from_items` to avoid double I/O; `run()` returns `(markdown_str, path, file_items)` tuple; `summary_only=False` by default (full file contents sent, not heuristic summaries)
 - `shell_runner.py` - subprocess wrapper that runs PowerShell scripts sandboxed to `base_dir`, returns stdout/stderr/exit code as a string
 - `session_logger.py` - opens timestamped log files at session start; writes comms entries as JSON-L and tool calls as markdown; saves each AI-generated script as a `.ps1` file
 - `project_manager.py` - per-project .toml load/save, entry serialisation (entry_to_str/str_to_entry with @timestamp support), default_project/default_discussion factories, migrate_from_legacy_config, flat_config for aggregate.run(), git helpers (get_git_commit, get_git_log)
 - `theme.py` - palette definitions, font loading, scale, load_from_config/save_to_config
 - `gemini.py` - legacy standalone Gemini wrapper (not used by the main GUI; superseded by `ai_client.py`)
 - `file_cache.py` - stub; Anthropic Files API path removed; kept so stale imports don't break
-- `mcp_client.py` - MCP-style read-only file tools (read_file, list_directory, search_files, get_file_summary); allowlist enforced against project file_items + base_dirs; dispatched by ai_client tool-use loop for both Anthropic and Gemini
+- `mcp_client.py` - MCP-style tools (read_file, list_directory, search_files, get_file_summary, web_search, fetch_url); allowlist enforced against project file_items + base_dirs for file tools; web tools are unrestricted; dispatched by ai_client tool-use loop for both Anthropic and Gemini
 - `summarize.py` - local heuristic summariser (no AI); .py via AST, .toml via regex, .md headings, generic preview; used by mcp_client.get_file_summary and aggregate.build_summary_section
 - `config.toml` - global-only settings: [ai] provider+model+system_prompt, [theme] palette+font+scale, [projects] paths array + active path
 - `manual_slop.toml` - per-project file: [project] name+git_dir+system_prompt+main_context, [output] namespace+output_dir, [files] base_dir+paths, [screenshots] base_dir+paths, [discussion] roles+active+[discussion.discussions.<name>] git_commit+last_updated+history
@@ -87,7 +87,7 @@ Is a local GUI tool for manually curating and sending context to AI APIs. It agg
 - All tool calls (script + result/rejection) are appended to `_tool_log` and displayed in the Tool Calls panel

 **Dynamic file context refresh (ai_client.py):**
-- After the last tool call in each round, all project files from `file_items` are re-read from disk via `_reread_file_items()`. The `file_items` variable is reassigned so subsequent rounds see fresh content.
+- After the last tool call in each round, project files from `file_items` are checked via `_reread_file_items()`. It uses `mtime` to only re-read modified files, returning only the `changed` files to build a minimal `[FILES UPDATED]` block.
 - For Anthropic: the refreshed file contents are injected as a `text` block appended to the `tool_results` user message, prefixed with `[FILES UPDATED]` and an instruction not to re-read them.
 - For Gemini: refreshed file contents are appended to the last function response's `output` string as a `[SYSTEM: FILES UPDATED]` block. On the next tool round, stale `[FILES UPDATED]` blocks are stripped from history and old tool outputs are truncated to `_history_trunc_limit` characters to control token growth.
 - `_build_file_context_text(file_items)` formats the refreshed files as markdown code blocks (same format as the original context)
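A minimal sketch of the mtime-skip refresh the bullet above describes, simplified from the `_reread_file_items` diff further down (dict keys match `build_file_items`):

```python
from pathlib import Path

def reread_changed(file_items: list[dict]) -> tuple[list[dict], list[dict]]:
    """Re-read only files whose mtime changed; returns (all_items, changed_items)."""
    refreshed, changed = [], []
    for item in file_items:
        path = item.get("path")
        if path is None:                       # unresolved entry — keep the error item
            refreshed.append(item)
            continue
        p = Path(path)
        try:
            mtime = p.stat().st_mtime
            if mtime == item.get("mtime", 0.0):
                refreshed.append(item)         # unchanged — skip the disk read
                continue
            new_item = {**item, "content": p.read_text(encoding="utf-8"),
                        "error": False, "mtime": mtime}
        except Exception as e:
            new_item = {**item, "content": f"ERROR re-reading {p}: {e}",
                        "error": True, "mtime": 0.0}
        refreshed.append(new_item)
        changed.append(new_item)
    return refreshed, changed
```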
@@ -141,10 +141,12 @@ Entry layout: index + timestamp + direction + kind + provider/model header row,
 - `log_tool_call(script, result, script_path)` writes the script to `scripts/generated/<ts>_<seq:04d>.ps1` and appends a markdown record to the toolcalls log without the script body (just the file path + result); uses a `threading.Lock` for the sequence counter
 - `close_session()` flushes and closes both file handles; called just before `dpg.destroy_context()`

-**Anthropic prompt caching:**
+**Anthropic prompt caching & history management:**
 - System prompt + context are combined into one string, chunked into <=120k char blocks, and sent as the `system=` parameter array. Only the LAST chunk gets `cache_control: ephemeral`, so the entire system prefix is cached as one unit.
 - Last tool in `_ANTHROPIC_TOOLS` (`run_powershell`) has `cache_control: ephemeral`; this means the tools prefix is cached together with the system prefix after the first request.
 - The user message is sent as a plain `[{"type": "text", "text": user_message}]` block with NO cache_control. The context lives in `system=`, not in the first user message.
+- `_add_history_cache_breakpoint` places `cache_control:ephemeral` on the last content block of the second-to-last user message, using the 4th cache breakpoint to cache the conversation history prefix.
+- `_trim_anthropic_history` uses token estimation (`_CHARS_PER_TOKEN = 3.5`) to keep the prompt under `_ANTHROPIC_MAX_PROMPT_TOKENS = 180_000`. It strips stale file refreshes from old turns, and drops oldest turn pairs if still over budget.
 - The tools list is built once per session via `_get_anthropic_tools()` and reused across all API calls within the tool loop, avoiding redundant Python-side reconstruction.
 - `_strip_cache_controls()` removes stale `cache_control` markers from all history entries before each API call, ensuring only the stable system/tools prefix consumes cache breakpoint slots.
 - Cache stats (creation tokens, read tokens) are surfaced in the comms log usage dict and displayed in the Comms History panel
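A minimal sketch of the chunk-and-cache pattern these bullets describe (the 120k block size is from the notes; the helper name here is illustrative):

```python
_MAX_BLOCK_CHARS = 120_000  # per the notes: <=120k chars per system block

def chunk_system_blocks(system_text: str) -> list[dict]:
    """Split system prompt + context into text blocks; only the LAST block
    carries cache_control, so the whole system prefix caches as one unit."""
    chunks = [system_text[i:i + _MAX_BLOCK_CHARS]
              for i in range(0, len(system_text), _MAX_BLOCK_CHARS)] or [""]
    blocks = [{"type": "text", "text": c} for c in chunks]
    blocks[-1]["cache_control"] = {"type": "ephemeral"}
    return blocks
```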
@@ -180,13 +182,15 @@ Entry layout: index + timestamp + direction + kind + provider/model header row,
 **MCP file tools (mcp_client.py + ai_client.py):**
 - Four read-only tools exposed to the AI as native function/tool declarations: `read_file`, `list_directory`, `search_files`, `get_file_summary`
 - Access control: `mcp_client.configure(file_items, extra_base_dirs)` is called before each send; builds an allowlist of resolved absolute paths from the project's `file_items` plus the `base_dir`; any path that is not explicitly in the list or not under one of the allowed directories returns `ACCESS DENIED`
-- `mcp_client.dispatch(tool_name, tool_input)` is the single dispatch entry point used by both Anthropic and Gemini tool-use loops
+- `mcp_client.dispatch(tool_name, tool_input)` is the single dispatch entry point used by both Anthropic and Gemini tool-use loops; `TOOL_NAMES` set now includes all six tool names
 - Anthropic: MCP tools appear before `run_powershell` in the tools list (no `cache_control` on them; only `run_powershell` carries `cache_control: ephemeral`)
 - Gemini: MCP tools are included in the `FunctionDeclaration` list alongside `run_powershell`
 - `get_file_summary` uses `summarize.summarise_file()` — same heuristic used for the initial `<context>` block, so the AI gets the same compact structural view it already knows
 - `list_directory` sorts dirs before files; shows name, type, and size
 - `search_files` uses `Path.glob()` with the caller-supplied pattern (supports `**/*.py` style)
 - `read_file` returns raw UTF-8 text; errors (not found, access denied, decode error) are returned as error strings rather than exceptions, so the AI sees them as tool results
+- `web_search(query)` queries DuckDuckGo HTML endpoint and returns the top 5 results (title, URL, snippet) as a formatted string; uses a custom `_DDGParser` (HTMLParser subclass)
+- `fetch_url(url)` fetches a URL, strips HTML tags/scripts via `_TextExtractor` (HTMLParser subclass), collapses whitespace, and truncates to 40k chars to prevent context blowup; handles DuckDuckGo redirect links automatically
 - `summarize.py` heuristics: `.py` → AST imports + ALL_CAPS constants + classes+methods + top-level functions; `.toml` → table headers + top-level keys; `.md` → h1–h3 headings with indentation; all others → line count + first 8 lines preview
 - Comms log: MCP tool calls log `OUT/tool_call` with `{"name": ..., "args": {...}}` and `IN/tool_result` with `{"name": ..., "output": ...}`; rendered in the Comms History panel via `_render_payload_tool_call` (shows each arg key/value) and `_render_payload_tool_result` (shows output)

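A sketch of the allowlist check the access-control bullet implies (`_is_allowed` is named in the guide_tools.md bullet further down; the module-level containers are assumptions):

```python
from pathlib import Path

_allowed_files: set[Path] = set()    # resolved paths from the project's file_items
_allowed_dirs: list[Path] = []       # base_dir plus any extra base dirs

def _is_allowed(raw_path: str) -> bool:
    """True if the path is explicitly allowlisted or under an allowed directory."""
    p = Path(raw_path).resolve()
    if p in _allowed_files:
        return True
    # Path.is_relative_to needs Python 3.9+
    return any(p.is_relative_to(d) for d in _allowed_dirs)
```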
@@ -199,7 +203,9 @@ Entry layout: index + timestamp + direction + kind + provider/model header row,

 ### Gemini Context Management
 - Gemini uses explicit caching via `client.caches.create()` to store the `system_instruction` + tools as an immutable cached prefix with a 1-hour TTL. The cache is created once per chat session.
+- Proactively rebuilds cache at 90% of `_GEMINI_CACHE_TTL = 3600` to avoid stale-reference errors.
 - When context changes (detected via `md_content` hash), the old cache is deleted, a new cache is created, and chat history is migrated to a fresh chat session pointing at the new cache.
+- Trims history by dropping oldest pairs if input tokens exceed `_GEMINI_MAX_INPUT_TOKENS = 900_000`.
 - If cache creation fails (e.g., content is under the minimum token threshold — 1024 for Flash, 4096 for Pro), the system falls back to inline `system_instruction` in the chat config. Implicit caching may still provide cost savings in this case.
 - The `<context>` block lives inside `system_instruction`, NOT in user messages, preventing history bloat across turns.
 - On cleanup/exit, active caches are deleted via `ai_client.cleanup()` to prevent orphaned billing.
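A condensed sketch of the create/rebuild flow (the `caches.create` call shape is taken from the ai_client.py diff below; the wrapper function itself is illustrative):

```python
import time
from google.genai import types

_GEMINI_CACHE_TTL = 3600  # seconds; rebuild proactively at 90% of TTL

def ensure_cache(client, model, sys_instr, tools_decl, cache, created_at):
    """Return (cache, created_at), rebuilding if missing or near expiry."""
    if cache and created_at and time.time() - created_at > _GEMINI_CACHE_TTL * 0.9:
        try:
            client.caches.delete(name=cache.name)  # drop the nearly-expired cache
        except Exception:
            pass
        cache, created_at = None, None
    if cache is None:
        cache = client.caches.create(
            model=model,
            config=types.CreateCachedContentConfig(
                system_instruction=sys_instr,
                tools=tools_decl,
                ttl=f"{_GEMINI_CACHE_TTL}s",
            ),
        )
        created_at = time.time()
    return cache, created_at
```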
@@ -244,3 +250,34 @@ Documentation has been completely rewritten matching the strict, structural form
 - `docs/guide_architecture.md`: Details the Python implementation algorithms, queue management for UI rendering, the specific AST heuristics used for context aggregation, and the distinct algorithms for trimming Anthropic history vs Gemini state caching.
 - `docs/Readme.md`: The core interface manual.
 - `docs/guide_tools.md`: Security architecture for `_is_allowed` paths and definitions of the read-only vs destructive tool pipeline.
+
+
+
+
+## Updates (2026-02-22 — ai_client.py & aggregate.py)
+
+### mcp_client.py — Web Tools Added
+- `web_search(query)` and `fetch_url(url)` added as two new MCP tools alongside the existing four file tools.
+- `TOOL_NAMES` set updated to include all six tool names for dispatch routing.
+- `MCP_TOOL_SPECS` list extended with full JSON schema definitions for both web tools.
+- Both tools are declared in `_build_anthropic_tools()` and `_gemini_tool_declaration()` so they are available to both providers.
+- Web tools bypass the `_is_allowed` path check (no filesystem access); file tools retain the allowlist enforcement.
+
+### aggregate.py — run() double-I/O elimination
+- `run()` now calls `build_file_items()` once, then passes the result to `build_markdown_from_items()` instead of calling `build_files_section()` separately. This avoids reading every file twice per send.
+- `build_markdown_from_items()` accepts a `summary_only` flag (default `False`); when `False` it inlines full file content; when `True` it delegates to `summarize.build_summary_markdown()` for compact structural summaries.
+- `run()` returns a 3-tuple `(markdown_str, output_path, file_items)` — the `file_items` list is passed through to `gui.py` as `self.last_file_items` for dynamic context refresh after tool calls.
+
+
+## Updates (2026-02-22 — gui.py [+ Maximize] bug fix)
+
+### Problem
+Three `[+ Maximize]` buttons were reading their text content via `dpg.get_value(tag)` at click time:
+1. `ConfirmDialog.show()` — passed `f"{self._tag}_script"` as `user_data` and called `dpg.get_value(u)` in the lambda. If the dialog was dismissed before the viewer opened, the item no longer existed and the call would fail silently or crash.
+2. `win_script_output` Script `[+ Maximize]` — used `user_data="last_script_text"` and `dpg.get_value(u)`. When word-wrap is ON, `last_script_text` is hidden (`show=False`); in some DPG versions `dpg.get_value` on a hidden `input_text` returns `""`.
+3. `win_script_output` Output `[+ Maximize]` — same issue with `"last_script_output"`.
+
+### Fix
+- `ConfirmDialog.show()`: changed `user_data` to `self._script` (the actual text string captured at button-creation time) and the callback to `lambda s, a, u: _show_text_viewer("Confirm Script", u)`. The text is now baked in at dialog construction, not read from a potentially-deleted widget.
+- `App._append_tool_log()`: added `self._last_script = script` and `self._last_output = result` assignments so the latest values are always available as instance state.
+- `win_script_output` buttons: both `[+ Maximize]` buttons now use `lambda s, a, u: _show_text_viewer("...", self._last_script/output)` directly, bypassing DPG widget state entirely.
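The shape of the fix, as a minimal Dear PyGui fragment (`_show_text_viewer` is the project's helper, stubbed here; `script_text` stands in for the captured string):

```python
import dearpygui.dearpygui as dpg

def _show_text_viewer(title: str, text: str):
    ...  # stand-in for the project's global text viewer

script_text = "Write-Host 'hello'"  # captured at button-creation time

# Before (fragile): reads a widget that may be hidden (word-wrap ON) or already deleted.
dpg.add_button(label="[+ Maximize]", user_data="last_script_text",
               callback=lambda s, a, u: _show_text_viewer("Script", dpg.get_value(u)))

# After (robust): the text itself travels as user_data; widget state is irrelevant.
dpg.add_button(label="[+ Maximize]", user_data=script_text,
               callback=lambda s, a, u: _show_text_viewer("Script", u))
```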
+45 -7
@@ -98,24 +98,28 @@ def build_file_items(base_dir: Path, files: list[str]) -> list[dict]:
         entry : str (original config entry string)
         content : str (file text, or error string)
         error : bool
+        mtime : float (last modification time, for skip-if-unchanged optimization)
     """
     items = []
     for entry in files:
         paths = resolve_paths(base_dir, entry)
         if not paths:
-            items.append({"path": None, "entry": entry, "content": f"ERROR: no files matched: {entry}", "error": True})
+            items.append({"path": None, "entry": entry, "content": f"ERROR: no files matched: {entry}", "error": True, "mtime": 0.0})
             continue
         for path in paths:
             try:
                 content = path.read_text(encoding="utf-8")
+                mtime = path.stat().st_mtime
                 error = False
             except FileNotFoundError:
                 content = f"ERROR: file not found: {path}"
+                mtime = 0.0
                 error = True
             except Exception as e:
                 content = f"ERROR: {e}"
+                mtime = 0.0
                 error = True
-            items.append({"path": path, "entry": entry, "content": content, "error": error})
+            items.append({"path": path, "entry": entry, "content": content, "error": error, "mtime": mtime})
     return items


 def build_summary_section(base_dir: Path, files: list[str]) -> str:
@@ -126,6 +130,40 @@ def build_summary_section(base_dir: Path, files: list[str]) -> str:
     items = build_file_items(base_dir, files)
     return summarize.build_summary_markdown(items)

+def _build_files_section_from_items(file_items: list[dict]) -> str:
+    """Build the files markdown section from pre-read file items (avoids double I/O)."""
+    sections = []
+    for item in file_items:
+        path = item.get("path")
+        entry = item.get("entry", "unknown")
+        content = item.get("content", "")
+        if path is None:
+            sections.append(f"### `{entry}`\n\n```text\n{content}\n```")
+            continue
+        suffix = path.suffix.lstrip(".") if hasattr(path, "suffix") else "text"
+        lang = suffix if suffix else "text"
+        original = entry if "*" not in entry else str(path)
+        sections.append(f"### `{original}`\n\n```{lang}\n{content}\n```")
+    return "\n\n---\n\n".join(sections)
+
+
+def build_markdown_from_items(file_items: list[dict], screenshot_base_dir: Path, screenshots: list[str], history: list[str], summary_only: bool = False) -> str:
+    """Build markdown from pre-read file items instead of re-reading from disk."""
+    parts = []
+    # STATIC PREFIX: Files and Screenshots must go first to maximize Cache Hits
+    if file_items:
+        if summary_only:
+            parts.append("## Files (Summary)\n\n" + summarize.build_summary_markdown(file_items))
+        else:
+            parts.append("## Files\n\n" + _build_files_section_from_items(file_items))
+    if screenshots:
+        parts.append("## Screenshots\n\n" + build_screenshots_section(screenshot_base_dir, screenshots))
+    # DYNAMIC SUFFIX: History changes every turn, must go last
+    if history:
+        parts.append("## Discussion History\n\n" + build_discussion_section(history))
+    return "\n\n---\n\n".join(parts)
+
+
 def build_markdown(base_dir: Path, files: list[str], screenshot_base_dir: Path, screenshots: list[str], history: list[str], summary_only: bool = False) -> str:
     parts = []
     # STATIC PREFIX: Files and Screenshots must go first to maximize Cache Hits
@@ -141,7 +179,7 @@ def build_markdown(base_dir: Path, files: list[str], screenshot_base_dir: Path,
         parts.append("## Discussion History\n\n" + build_discussion_section(history))
     return "\n\n---\n\n".join(parts)

-def run(config: dict) -> tuple[str, Path]:
+def run(config: dict) -> tuple[str, Path, list[dict]]:
     namespace = config.get("project", {}).get("name")
     if not namespace:
         namespace = config.get("output", {}).get("namespace", "project")
@@ -155,11 +193,11 @@
     output_dir.mkdir(parents=True, exist_ok=True)
     increment = find_next_increment(output_dir, namespace)
     output_file = output_dir / f"{namespace}_{increment:03d}.md"
-    # Provide full files to trigger Gemini's 32k cache threshold and give the AI immediate context
-    markdown = build_markdown(base_dir, files, screenshot_base_dir, screenshots, history,
-                              summary_only=False)
-    output_file.write_text(markdown, encoding="utf-8")
+    # Build file items once, then construct markdown from them (avoids double I/O)
     file_items = build_file_items(base_dir, files)
+    markdown = build_markdown_from_items(file_items, screenshot_base_dir, screenshots, history,
+                                         summary_only=False)
+    output_file.write_text(markdown, encoding="utf-8")
     return markdown, output_file, file_items


 def main():
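Caller-side view of the new contract, as a sketch (the loader call is hypothetical; `flat_config` is named in the project_manager notes above):

```python
import aggregate
import project_manager

project = project_manager.load("manual_slop.toml")     # hypothetical loader name
config = project_manager.flat_config(project)          # flat dict for aggregate.run()
markdown, output_path, file_items = aggregate.run(config)
# gui.py keeps file_items (with per-file mtimes) so ai_client can
# re-read only changed files after each tool round.
print(f"wrote {output_path}: {len(markdown)} chars, {len(file_items)} file items")
```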
+160 -52
@@ -13,6 +13,7 @@ during chat creation to avoid massive history bloat.
 # ai_client.py
 import tomllib
 import json
+import time
 import datetime
 from pathlib import Path
 import file_cache
@@ -34,6 +35,12 @@ def set_model_params(temp: float, max_tok: int, trunc_limit: int = 8000):
 _gemini_client = None
 _gemini_chat = None
 _gemini_cache = None
+_gemini_cache_md_hash: int | None = None
+_gemini_cache_created_at: float | None = None
+
+# Gemini cache TTL in seconds. Caches are created with this TTL and
+# proactively rebuilt at 90% of this value to avoid stale-reference errors.
+_GEMINI_CACHE_TTL = 3600

 _anthropic_client = None
 _anthropic_history: list[dict] = []
@@ -216,6 +223,7 @@ def cleanup():

 def reset_session():
     global _gemini_client, _gemini_chat, _gemini_cache
+    global _gemini_cache_md_hash, _gemini_cache_created_at
     global _anthropic_client, _anthropic_history
     global _CACHED_ANTHROPIC_TOOLS
     if _gemini_client and _gemini_cache:
@@ -226,6 +234,8 @@
     _gemini_client = None
     _gemini_chat = None
     _gemini_cache = None
+    _gemini_cache_md_hash = None
+    _gemini_cache_created_at = None
    _anthropic_client = None
    _anthropic_history = []
    _CACHED_ANTHROPIC_TOOLS = None
@@ -383,12 +393,15 @@ def _run_script(script: str, base_dir: str) -> str:

 # ------------------------------------------------------------------ dynamic file context refresh

-def _reread_file_items(file_items: list[dict]) -> list[dict]:
+def _reread_file_items(file_items: list[dict]) -> tuple[list[dict], list[dict]]:
     """
-    Re-read every file in file_items from disk, returning a fresh list.
-    This is called after tool calls so the AI sees updated file contents.
+    Re-read file_items from disk, but only files whose mtime has changed.
+    Returns (all_items, changed_items) — all_items is the full refreshed list,
+    changed_items contains only the files that were actually modified since
+    the last read (used to build a minimal [FILES UPDATED] block).
     """
     refreshed = []
+    changed = []
     for item in file_items:
         path = item.get("path")
         if path is None:
@@ -397,11 +410,20 @@ def _reread_file_items(file_items: list[dict]) -> list[dict]:
         from pathlib import Path as _P
         p = _P(path) if not isinstance(path, _P) else path
         try:
+            current_mtime = p.stat().st_mtime
+            prev_mtime = item.get("mtime", 0.0)
+            if current_mtime == prev_mtime:
+                refreshed.append(item)  # unchanged — skip re-read
+                continue
             content = p.read_text(encoding="utf-8")
-            refreshed.append({**item, "content": content, "error": False})
+            new_item = {**item, "content": content, "error": False, "mtime": current_mtime}
+            refreshed.append(new_item)
+            changed.append(new_item)
         except Exception as e:
-            refreshed.append({**item, "content": f"ERROR re-reading {p}: {e}", "error": True})
-    return refreshed
+            err_item = {**item, "content": f"ERROR re-reading {p}: {e}", "error": True, "mtime": 0.0}
+            refreshed.append(err_item)
+            changed.append(err_item)
+    return refreshed, changed


 def _build_file_context_text(file_items: list[dict]) -> str:
@@ -453,8 +475,20 @@ def _ensure_gemini_client():
     _gemini_client = genai.Client(api_key=creds["gemini"]["api_key"])


+
+def _get_gemini_history_list(chat):
+    if not chat: return []
+    # google-genai SDK stores the mutable list in _history
+    if hasattr(chat, "_history"):
+        return chat._history
+    if hasattr(chat, "history"):
+        return chat.history
+    if hasattr(chat, "get_history"):
+        return chat.get_history()
+    return []
+
 def _send_gemini(md_content: str, user_message: str, base_dir: str, file_items: list[dict] | None = None) -> str:
-    global _gemini_chat, _gemini_cache
+    global _gemini_chat, _gemini_cache, _gemini_cache_md_hash, _gemini_cache_created_at
     from google.genai import types
     try:
         _ensure_gemini_client(); mcp_client.configure(file_items or [], [base_dir])
@@ -464,15 +498,29 @@
         # DYNAMIC CONTEXT: Check if files/context changed mid-session
         current_md_hash = hash(md_content)
         old_history = None
-        if _gemini_chat and getattr(_gemini_chat, "_last_md_hash", None) != current_md_hash:
-            old_history = list(_gemini_chat.history) if _gemini_chat.history else []
+        if _gemini_chat and _gemini_cache_md_hash != current_md_hash:
+            old_history = list(_get_gemini_history_list(_gemini_chat)) if _get_gemini_history_list(_gemini_chat) else []
             if _gemini_cache:
                 try: _gemini_client.caches.delete(name=_gemini_cache.name)
                 except: pass
             _gemini_chat = None
             _gemini_cache = None
+            _gemini_cache_created_at = None
             _append_comms("OUT", "request", {"message": "[CONTEXT CHANGED] Rebuilding cache and chat session..."})
+
+        # CACHE TTL: Proactively rebuild before the cache expires server-side.
+        # If we don't, send_message() will reference a deleted cache and fail.
+        if _gemini_chat and _gemini_cache and _gemini_cache_created_at:
+            elapsed = time.time() - _gemini_cache_created_at
+            if elapsed > _GEMINI_CACHE_TTL * 0.9:
+                old_history = list(_get_gemini_history_list(_gemini_chat)) if _get_gemini_history_list(_gemini_chat) else []
+                try: _gemini_client.caches.delete(name=_gemini_cache.name)
+                except: pass
+                _gemini_chat = None
+                _gemini_cache = None
+                _gemini_cache_created_at = None
+                _append_comms("OUT", "request", {"message": f"[CACHE TTL] Rebuilding cache (expired after {int(elapsed)}s)..."})

         if not _gemini_chat:
             chat_config = types.GenerateContentConfig(
                 system_instruction=sys_instr,
@@ -488,9 +536,10 @@
                     config=types.CreateCachedContentConfig(
                         system_instruction=sys_instr,
                         tools=tools_decl,
-                        ttl="3600s",
+                        ttl=f"{_GEMINI_CACHE_TTL}s",
                    )
                )
+                _gemini_cache_created_at = time.time()
                chat_config = types.GenerateContentConfig(
                    cached_content=_gemini_cache.name,
                    temperature=_temperature,
@@ -499,35 +548,38 @@
                )
                _append_comms("OUT", "request", {"message": f"[CACHE CREATED] {_gemini_cache.name}"})
            except Exception as e:
-                _gemini_cache = None # Ensure clean state on failure
+                _gemini_cache = None
+                _gemini_cache_created_at = None
+                _append_comms("OUT", "request", {"message": f"[CACHE FAILED] {type(e).__name__}: {e} — falling back to inline system_instruction"})

            kwargs = {"model": _model, "config": chat_config}
            if old_history:
                kwargs["history"] = old_history

            _gemini_chat = _gemini_client.chats.create(**kwargs)
-            _gemini_chat._last_md_hash = current_md_hash
+            _gemini_cache_md_hash = current_md_hash

        _append_comms("OUT", "request", {"message": f"[ctx {len(md_content)} + msg {len(user_message)}]"})
        payload, all_text = user_message, []

-        for r_idx in range(MAX_TOOL_ROUNDS + 2):
-            # Strip stale file refreshes and truncate old tool outputs in Gemini history
-            if _gemini_chat and _gemini_chat.history:
-                for msg in _gemini_chat.history:
+        # Strip stale file refreshes and truncate old tool outputs ONCE before
+        # entering the tool loop (not per-round — history entries don't change).
+        if _gemini_chat and _get_gemini_history_list(_gemini_chat):
+            for msg in _get_gemini_history_list(_gemini_chat):
                if msg.role == "user" and hasattr(msg, "parts"):
                    for p in msg.parts:
                        if hasattr(p, "function_response") and p.function_response and hasattr(p.function_response, "response"):
                            r = p.function_response.response
                            if isinstance(r, dict) and "output" in r:
                                val = r["output"]
                                if isinstance(val, str):
                                    if "[SYSTEM: FILES UPDATED]" in val:
                                        val = val.split("[SYSTEM: FILES UPDATED]")[0].strip()
                                    if _history_trunc_limit > 0 and len(val) > _history_trunc_limit:
                                        val = val[:_history_trunc_limit] + "\n\n... [TRUNCATED BY SYSTEM TO SAVE TOKENS.]"
                                    r["output"] = val

+        for r_idx in range(MAX_TOOL_ROUNDS + 2):
            resp = _gemini_chat.send_message(payload)
            txt = "\n".join(p.text for c in resp.candidates if getattr(c, "content", None) for p in c.content.parts if hasattr(p, "text") and p.text)
            if txt: all_text.append(txt)
@@ -543,23 +595,25 @@

            # Guard: if Gemini reports input tokens approaching the limit, drop oldest history pairs
            total_in = usage.get("input_tokens", 0)
-            if total_in > _GEMINI_MAX_INPUT_TOKENS and _gemini_chat and _gemini_chat.history:
-                hist = _gemini_chat.history
+            if total_in > _GEMINI_MAX_INPUT_TOKENS and _gemini_chat and _get_gemini_history_list(_gemini_chat):
+                hist = _get_gemini_history_list(_gemini_chat)
                dropped = 0
                # Drop oldest pairs (user+model) but keep at least the last 2 entries
                while len(hist) > 4 and total_in > _GEMINI_MAX_INPUT_TOKENS * 0.7:
-                    # Rough estimate: each dropped message saves ~(chars/4) tokens
+                    # Drop in pairs (user + model) to maintain alternating roles required by Gemini
                    saved = 0
-                    for p in hist[0].parts:
-                        if hasattr(p, "text") and p.text:
-                            saved += len(p.text) // 4
-                        elif hasattr(p, "function_response") and p.function_response:
-                            r = getattr(p.function_response, "response", {})
-                            if isinstance(r, dict):
-                                saved += len(str(r.get("output", ""))) // 4
-                    hist.pop(0)
-                    total_in -= max(saved, 100)
-                    dropped += 1
+                    for _ in range(2):
+                        if not hist: break
+                        for p in hist[0].parts:
+                            if hasattr(p, "text") and p.text:
+                                saved += len(p.text) // 4
+                            elif hasattr(p, "function_response") and p.function_response:
+                                r = getattr(p.function_response, "response", {})
+                                if isinstance(r, dict):
+                                    saved += len(str(r.get("output", ""))) // 4
+                        hist.pop(0)
+                        dropped += 1
+                    total_in -= max(saved, 200)
                if dropped > 0:
                    _append_comms("OUT", "request", {"message": f"[GEMINI HISTORY TRIMMED: dropped {dropped} old entries to stay within token budget]"})

@@ -579,8 +633,8 @@

                if i == len(calls) - 1:
                    if file_items:
-                        file_items = _reread_file_items(file_items)
-                        ctx = _build_file_context_text(file_items)
+                        file_items, changed = _reread_file_items(file_items)
+                        ctx = _build_file_context_text(changed)
                        if ctx:
                            out += f"\n\n[SYSTEM: FILES UPDATED]\n\n{ctx}"
                    if r_idx == MAX_TOOL_ROUNDS: out += "\n\n[SYSTEM: MAX ROUNDS. PROVIDE FINAL ANSWER.]"
@@ -616,7 +670,15 @@ _FILE_REFRESH_MARKER = "[FILES UPDATED"


 def _estimate_message_tokens(msg: dict) -> int:
-    """Rough token estimate for a single Anthropic message dict."""
+    """
+    Rough token estimate for a single Anthropic message dict.
+    Caches the result on the dict as '_est_tokens' so repeated calls
+    (e.g., from _trim_anthropic_history) don't re-scan unchanged messages.
+    Call _invalidate_token_estimate() when a message's content is modified.
+    """
+    cached = msg.get("_est_tokens")
+    if cached is not None:
+        return cached
     total_chars = 0
     content = msg.get("content", "")
     if isinstance(content, str):
@@ -634,7 +696,14 @@
                total_chars += len(_json.dumps(inp, ensure_ascii=False))
            elif isinstance(block, str):
                total_chars += len(block)
-    return max(1, int(total_chars / _CHARS_PER_TOKEN))
+    est = max(1, int(total_chars / _CHARS_PER_TOKEN))
+    msg["_est_tokens"] = est
+    return est
+
+
+def _invalidate_token_estimate(msg: dict):
+    """Remove the cached token estimate so the next call recalculates."""
+    msg.pop("_est_tokens", None)


 def _estimate_prompt_tokens(system_blocks: list[dict], history: list[dict]) -> int:
@@ -646,7 +715,7 @@ def _estimate_prompt_tokens(system_blocks: list[dict], history: list[dict]) -> i
        total += max(1, int(len(text) / _CHARS_PER_TOKEN))
    # Tool definitions (rough fixed estimate — they're ~2k tokens for our set)
    total += 2500
-    # History messages
+    # History messages (uses cached estimates for unchanged messages)
    for msg in history:
        total += _estimate_message_tokens(msg)
    return total
@@ -681,6 +750,7 @@ def _strip_stale_file_refreshes(history: list[dict]):
                cleaned.append(block)
            if len(cleaned) < len(content):
                msg["content"] = cleaned
+                _invalidate_token_estimate(msg)


 def _trim_anthropic_history(system_blocks: list[dict], history: list[dict]):
@@ -772,6 +842,28 @@ def _strip_cache_controls(history: list[dict]):
            if isinstance(block, dict):
                block.pop("cache_control", None)

+def _add_history_cache_breakpoint(history: list[dict]):
+    """
+    Place cache_control:ephemeral on the last content block of the
+    second-to-last user message. This uses one of the 4 allowed Anthropic
+    cache breakpoints to cache the conversation prefix so the full history
+    isn't reprocessed on every request.
+    """
+    user_indices = [i for i, m in enumerate(history) if m.get("role") == "user"]
+    if len(user_indices) < 2:
+        return  # Only one user message (the current turn) — nothing stable to cache
+    target_idx = user_indices[-2]
+    content = history[target_idx].get("content")
+    if isinstance(content, list) and content:
+        last_block = content[-1]
+        if isinstance(last_block, dict):
+            last_block["cache_control"] = {"type": "ephemeral"}
+    elif isinstance(content, str):
+        history[target_idx]["content"] = [
+            {"type": "text", "text": content, "cache_control": {"type": "ephemeral"}}
+        ]
+
+
 def _repair_anthropic_history(history: list[dict]):
     """
     If history ends with an assistant message that contains tool_use blocks
@@ -809,23 +901,36 @@ def _send_anthropic(md_content: str, user_message: str, base_dir: str, file_item
    _ensure_anthropic_client()
    mcp_client.configure(file_items or [], [base_dir])

-    system_text = _get_combined_system_prompt() + f"\n\n<context>\n{md_content}\n</context>"
-    system_blocks = _build_chunked_context_blocks(system_text)
+    # Split system into two cache breakpoints:
+    # 1. Stable system prompt (never changes — always a cache hit)
+    # 2. Dynamic file context (invalidated only when files change)
+    stable_prompt = _get_combined_system_prompt()
+    stable_blocks = [{"type": "text", "text": stable_prompt, "cache_control": {"type": "ephemeral"}}]
+    context_text = f"\n\n<context>\n{md_content}\n</context>"
+    context_blocks = _build_chunked_context_blocks(context_text)
+    system_blocks = stable_blocks + context_blocks

    user_content = [{"type": "text", "text": user_message}]

    # COMPRESS HISTORY: Truncate massive tool outputs from previous turns
    for msg in _anthropic_history:
        if msg.get("role") == "user" and isinstance(msg.get("content"), list):
+            modified = False
            for block in msg["content"]:
                if isinstance(block, dict) and block.get("type") == "tool_result":
                    t_content = block.get("content", "")
                    if _history_trunc_limit > 0 and isinstance(t_content, str) and len(t_content) > _history_trunc_limit:
                        block["content"] = t_content[:_history_trunc_limit] + "\n\n... [TRUNCATED BY SYSTEM TO SAVE TOKENS. Original output was too large.]"
+                        modified = True
+            if modified:
+                _invalidate_token_estimate(msg)

    _strip_cache_controls(_anthropic_history)
    _repair_anthropic_history(_anthropic_history)
    _anthropic_history.append({"role": "user", "content": user_content})
+    # Use the 4th cache breakpoint to cache the conversation history prefix.
+    # This is placed on the second-to-last user message (the last stable one).
+    _add_history_cache_breakpoint(_anthropic_history)

    n_chunks = len(system_blocks)
    _append_comms("OUT", "request", {
@@ -850,13 +955,16 @@
        ),
    })

+    def _strip_private_keys(history):
+        return [{k: v for k, v in m.items() if not k.startswith("_")} for m in history]
+
    response = _anthropic_client.messages.create(
        model=_model,
        max_tokens=_max_tokens,
        temperature=_temperature,
        system=system_blocks,
        tools=_get_anthropic_tools(),
-        messages=_anthropic_history,
+        messages=_strip_private_keys(_anthropic_history),
    )

    # Convert SDK content block objects to plain dicts before storing in history
@@ -939,10 +1047,10 @@
                    "content": output,
                })

-            # Refresh file context after tool calls and inject into tool result message
+            # Refresh file context after tool calls — only inject CHANGED files
            if file_items:
-                file_items = _reread_file_items(file_items)
-                refreshed_ctx = _build_file_context_text(file_items)
+                file_items, changed = _reread_file_items(file_items)
+                refreshed_ctx = _build_file_context_text(changed)
                if refreshed_ctx:
                    tool_results.append({
                        "type": "text",
+2 -2
@@ -10,11 +10,11 @@ system_prompt = "DO NOT EVER make a shell script unless told to. DO NOT EVER mak
 palette = "10x Dark"
 font_path = "C:/Users/Ed/AppData/Local/uv/cache/archive-v0/WSthkYsQ82b_ywV6DkiaJ/pygame_gui/data/FiraCode-Regular.ttf"
 font_size = 18.0
-scale = 1.1
+scale = 1.25

 [projects]
 paths = [
     "manual_slop.toml",
     "C:/projects/forth/bootslop/bootslop.toml",
 ]
-active = "C:/projects/forth/bootslop/bootslop.toml"
+active = "manual_slop.toml"
+3 -2
@@ -29,7 +29,7 @@ Controls what is explicitly fed into the context compiler.

 - **Base Dir:** Defines the root for path resolution and tool constraints.
 - **Paths:** Explicit files or wildcard globs (e.g., src/**/*.rs).
-- When generating a request, these files are summarized symbolically (summarize.py) to conserve tokens, unless the AI explicitly decides to read their full contents via its internal tools.
+- When generating a request, full file contents are inlined into the context by default (`summary_only=False`). The AI can also call `get_file_summary` via its MCP tools to get a compact structural view of any file on demand.

 ## Interaction Panels

@@ -46,8 +46,9 @@ Switch between API backends (Gemini, Anthropic) on the fly. Clicking "Fetch Mode

 ### Global Text Viewer & Script Outputs

-- **Last Script Output:** Whenever the AI executes a background script, this window pops up, flashing blue. It contains both the executed script and the stdout/stderr.
+- **Last Script Output:** Whenever the AI executes a background script, this window pops up, flashing blue. It contains both the executed script and the stdout/stderr. The `[+ Maximize]` buttons read directly from stored instance variables (`_last_script`, `_last_output`) rather than DPG widget tags, so they work correctly regardless of word-wrap state.
 - **Text Viewer:** A large, resizable global popup invoked anytime you click a [+] or [+ Maximize] button in the UI. Used for deep-reading long logs, discussion entries, or script bodies.
+- **Confirm Dialog:** The `[+ Maximize]` button in the script approval modal passes the script text directly as `user_data` at button-creation time, so it remains safe to click even after the dialog has been dismissed.

 ## System Prompts

|
|||||||
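The behavior described in those bullets relies on a standard DearPyGui pattern: bind the payload to the button via `user_data` at creation time instead of looking up a widget's value in the callback. A minimal, self-contained illustration (not the app's actual code):

```python
import dearpygui.dearpygui as dpg

dpg.create_context()

def show_viewer(sender, app_data, user_data):
    # user_data is the text captured when the button was created; no widget
    # lookup happens here, so it still works if the source widget is gone
    # or its displayed value has been mangled by word-wrap.
    print(user_data)

with dpg.window(label="Demo"):
    script_text = "Write-Host 'hello'"
    dpg.add_button(label="[+ Maximize]", user_data=script_text, callback=show_viewer)

dpg.create_viewport(title="user_data demo", width=400, height=200)
dpg.setup_dearpygui()
dpg.show_viewport()
dpg.start_dearpygui()
dpg.destroy_context()
```
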
@@ -1,4 +1,4 @@
 # Guide: Architecture

 Overview of the package design, state management, and code-path layout.

@@ -33,10 +33,9 @@ This occurs inside aggregate.run.
 If using the default workflow, aggregate.py hashes through the following process:

 1. **Glob Resolution:** Iterates through config["files"]["paths"] and unpacks any wildcards (e.g., src/**/*.rs) against the designated base_dir.
-2. **Summarization Pass:** Instead of concatenating raw file bodies (which would quickly overwhelm the ~200k token limit over multiple rounds), the files are passed to summarize.py.
-3. **AST Parsing:** summarize.py runs a heuristic pass. For Python files, it uses the standard ast module to read structural nodes (Classes, Methods, Imports, Constants). It outputs a compact Markdown table.
-4. **Markdown Generation:** The final <project>_00N.md string is constructed, comprising the truncated AST summaries, the user's current project system prompt, and the active discussion branch.
-5. The Markdown file is persisted to disk (./md_gen/ by default) for auditing.
+2. **File Item Build:** `build_file_items()` reads each resolved file once, storing path, content, and `mtime`. This list is returned alongside the markdown so `ai_client.py` can use it for dynamic context refresh after tool calls without re-reading from disk.
+3. **Markdown Generation:** `build_markdown_from_items()` assembles the final `<project>_00N.md` string. By default (`summary_only=False`) it inlines full file contents. If `summary_only=True`, it delegates to `summarize.build_summary_markdown()` which uses AST-based heuristics to produce compact structural summaries instead.
+4. The Markdown file is persisted to disk (`./md_gen/` by default) for auditing. `run()` returns a 3-tuple `(markdown_str, output_path, file_items)`.

 ### AI Communication & The Tool Loop

@@ -85,3 +84,4 @@ All I/O bound session data is recorded sequentially. session_logger.py hooks int
 - logs/comms_<ts>.log: A JSON-L structured timeline of every raw payload sent/received.
 - logs/toolcalls_<ts>.log: A sequential markdown record detailing every AI tool invocation and its exact stdout result.
 - scripts/generated/: Every .ps1 script approved and executed by the shell runner is physically written to disk for version control transparency.
+

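Under the revised pipeline, a caller consumes `run()` roughly as below. Only the 3-tuple return shape is stated in the guide; the loader helper shown here is an assumption, and only `flat_config` is a documented name:

```python
import aggregate
import project_manager

# Load the active project and flatten it for aggregate.run().
# load_project() is a hypothetical name; flat_config() is documented.
project = project_manager.load_project("manual_slop.toml")
config = project_manager.flat_config(project)

markdown_str, output_path, file_items = aggregate.run(config)

print(output_path)      # e.g. md_gen/<project>_003.md
print(len(file_items))  # one item per resolved file (path/content/mtime)
```
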
+12 -7
@@ -12,17 +12,22 @@ Implemented in mcp_client.py. These tools allow the AI to selectively expand its

 ### Security & Scope

-Every filesystem MCP tool passes its arguments through _resolve_and_check. This function ensures that the requested path falls under one of the allowed directories defined in the GUI's Base Dir configurations.
+Every **filesystem** MCP tool passes its arguments through `_resolve_and_check`. This function ensures that the requested path falls under one of the allowed directories defined in the GUI's Base Dir configurations.
 If the AI attempts to read or search a path outside the project bounds, the tool safely catches the constraint violation and returns ACCESS DENIED.

+The two **web tools** (`web_search`, `fetch_url`) bypass this check entirely — they have no filesystem access and are unrestricted.
+
 ### Supplied Tools:

-* read_file(path): Returns the raw UTF-8 text of a file.
-* list_directory(path): Returns a formatted table of a directory's contents, showing file vs dir and byte sizes.
-* search_files(path, pattern): Executes an absolute glob search (e.g., **/*.py) to find specific files.
-* get_file_summary(path): Invokes the local summarize.py heuristic parser to get the AST structure of a file without reading the whole body.
-* web_search(query): Queries DuckDuckGo's raw HTML endpoint and returns the top 5 results (Titles, URLs, Snippets) using a native HTMLParser to avoid heavy dependencies.
-* fetch_url(url): Downloads a target webpage and strips out all scripts, styling, and structural HTML, returning only the raw prose content (clamped to 40,000 characters).
+**Filesystem tools** (access-controlled via `_resolve_and_check`):
+* `read_file(path)`: Returns the raw UTF-8 text of a file.
+* `list_directory(path)`: Returns a formatted table of a directory's contents, showing file vs dir and byte sizes.
+* `search_files(path, pattern)`: Executes a glob search (e.g., `**/*.py`) within an allowed directory.
+* `get_file_summary(path)`: Invokes the local `summarize.py` heuristic parser to get the AST structure of a file without reading the whole body.
+
+**Web tools** (unrestricted — no filesystem access):
+* `web_search(query)`: Queries DuckDuckGo's raw HTML endpoint and returns the top 5 results (title, URL, snippet) using a native `_DDGParser` (HTMLParser subclass) to avoid heavy dependencies.
+* `fetch_url(url)`: Downloads a target webpage and strips out all scripts, styling, and structural HTML via `_TextExtractor`, returning only the raw prose content (clamped to 40,000 characters). Automatically resolves DuckDuckGo redirect links.

 ## 2. Destructive Execution (run_powershell)

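For context, a minimal sketch of the containment check that `_resolve_and_check` performs; the real signature, allowed-roots source, and error handling are not shown in this diff:

```python
from pathlib import Path

# Allowed roots would come from the GUI's Base Dir configuration.
ALLOWED_DIRS = [Path("C:/projects/forth/bootslop")]

def _resolve_and_check(path: str) -> Path:
    """Resolve a requested path and reject anything outside the allowed roots."""
    resolved = Path(path).resolve()
    for root in ALLOWED_DIRS:
        if resolved.is_relative_to(root.resolve()):  # Python 3.9+
            return resolved
    raise PermissionError("ACCESS DENIED")
```

Each filesystem tool would wrap its path argument in this check and convert the failure into the ACCESS DENIED string rather than surfacing an exception to the model.
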
@@ -1,4 +1,4 @@
 # gui.py
 """
 Note(Gemini):
 The main DearPyGui interface orchestrator.
@@ -302,8 +302,8 @@ class ConfirmDialog:
             dpg.add_text("Script:")
             dpg.add_button(
                 label="[+ Maximize]",
-                user_data=f"{self._tag}_script",
-                callback=lambda s, a, u: _show_text_viewer("Confirm Script", dpg.get_value(u))
+                user_data=self._script,
+                callback=lambda s, a, u: _show_text_viewer("Confirm Script", u)
             )
             dpg.add_input_text(
                 tag=f"{self._tag}_script",
@@ -432,6 +432,8 @@ class App:
         self._pending_dialog_lock = threading.Lock()

         self._tool_log: list[tuple[str, str]] = []
+        self._last_script: str = ""
+        self._last_output: str = ""

         # Comms log entries queued from background thread for main-thread rendering
         self._pending_comms: list[dict] = []
@@ -748,6 +750,8 @@ class App:
         return output

     def _append_tool_log(self, script: str, result: str):
+        self._last_script = script
+        self._last_output = result
         self._tool_log.append((script, result))
         self._rebuild_tool_log()

@@ -1917,8 +1921,7 @@ class App:
             dpg.add_text("Script:")
             dpg.add_button(
                 label="[+ Maximize]",
-                user_data="last_script_text",
-                callback=lambda s, a, u: _show_text_viewer("Last Script", dpg.get_value(u))
+                callback=lambda s, a, u: _show_text_viewer("Last Script", self._last_script),
             )
             dpg.add_input_text(
                 tag="last_script_text",
@@ -1934,8 +1937,7 @@ class App:
             dpg.add_text("Output:")
             dpg.add_button(
                 label="[+ Maximize]",
-                user_data="last_script_output",
-                callback=lambda s, a, u: _show_text_viewer("Last Output", dpg.get_value(u))
+                callback=lambda s, a, u: _show_text_viewer("Last Output", self._last_output),
             )
             dpg.add_input_text(
                 tag="last_script_output",
@@ -2120,3 +2122,7 @@ def main():

 if __name__ == "__main__":
     main()
+
+
+
+

+44 -29
File diff suppressed because one or more lines are too long