# Manual Slop

## Summary

Manual Slop is a local GUI tool for manually curating and sending context to AI APIs. It aggregates files, screenshots, and discussion history into a structured markdown file and sends it to a chosen AI provider with a user-written message. The AI can also execute PowerShell scripts within the project directory, with user confirmation required before each execution.

**Stack:**
- `dearpygui` - GUI with docking/floating/resizable panels
- `google-genai` - Gemini API
- `anthropic` - Anthropic API
- `tomli-w` - TOML writing
- `uv` - package/env management

**Files:**
- `gui.py` - main GUI, `App` class, all panels, all callbacks, confirmation dialog, layout persistence, rich comms rendering
- `ai_client.py` - unified provider wrapper, model listing, session management, send, tool/function-call loop, comms log, provider error classification
- `aggregate.py` - reads config, collects files/screenshots/discussion, writes numbered `.md` files to `output_dir`
- `shell_runner.py` - subprocess wrapper that runs PowerShell scripts sandboxed to `base_dir`, returns stdout/stderr/exit code as a string
- `session_logger.py` - opens timestamped log files at session start; writes comms entries as JSON-L and tool calls as markdown; saves each AI-generated script as a `.ps1` file
- `project_manager.py` - per-project .toml load/save, entry serialisation (entry_to_str/str_to_entry with @timestamp support), default_project/default_discussion factories, migrate_from_legacy_config, flat_config for aggregate.run(), git helpers (get_git_commit, get_git_log)
- `theme.py` - palette definitions, font loading, scale, load_from_config/save_to_config
- `gemini.py` - legacy standalone Gemini wrapper (not used by the main GUI; superseded by `ai_client.py`)
- `file_cache.py` - stub; Anthropic Files API path removed; kept so stale imports don't break
- `mcp_client.py` - MCP-style read-only file tools (read_file, list_directory, search_files, get_file_summary); allowlist enforced against project file_items + base_dirs; dispatched by ai_client tool-use loop for both Anthropic and Gemini
- `summarize.py` - local heuristic summariser (no AI); .py via AST, .toml via regex, .md headings, generic preview; used by mcp_client.get_file_summary and aggregate.build_summary_section
- `config.toml` - global-only settings: [ai] provider+model+system_prompt, [theme] palette+font+scale, [projects] paths array + active path
- `manual_slop.toml` - per-project file: [project] name+git_dir+system_prompt+main_context, [output] namespace+output_dir, [files] base_dir+paths, [screenshots] base_dir+paths, [discussion] roles+active+[discussion.discussions.] git_commit+last_updated+history
- `credentials.toml` - gemini api_key, anthropic api_key
- `dpg_layout.ini` - Dear PyGui window layout file (auto-saved on exit, auto-loaded on startup); gitignore this per-user file

**GUI Panels:**
- **Projects** - active project name display (green), git directory input + Browse button, scrollable list of loaded project paths (click name to switch, x to remove), Add Project / New Project / Save All buttons
- **Config** - namespace, output dir, save (these are project-level fields from the active .toml)
- **Files** - base_dir, scrollable path list with remove, add file(s), add wildcard
- **Screenshots** - base_dir, scrollable path list with remove, add screenshot(s)
- **Discussion History** - discussion selector (collapsible header): listbox of named discussions, git commit + last_updated display, Update Commit button, Create/Rename/Delete buttons with name input; structured entry editor: each entry has collapse toggle (-/+), role combo, timestamp display, multiline content field; per-entry Ins/Del buttons when collapsed; global toolbar: + Entry, -All, +All, Clear All, Save; collapsible **Roles** sub-section; -> History buttons on Message and Response panels append current message/response as new entry with timestamp
- **Provider** - provider combo (gemini/anthropic), model listbox populated from API, fetch models button
- **Message** - multiline input, Gen+Send button, MD Only button, Reset session button, -> History button
- **Response** - readonly multiline displaying last AI response, -> History button
- **Tool Calls** - scrollable log of every PowerShell tool call the AI made; Clear button
- **System Prompts** - global (all projects) and project-specific multiline text areas for injecting custom system instructions. These are combined with the built-in tool prompt.
- **Comms History** - rich structured live log of every API interaction; status line at top; colour legend; Clear button

**Layout persistence:**
- `dpg.configure_app(..., init_file="dpg_layout.ini")` loads the ini at startup if it exists; DPG silently ignores a missing file
- `dpg.save_init_file("dpg_layout.ini")` is called immediately before `dpg.destroy_context()` on clean exit
- The ini records window positions, sizes, and dock node assignments in DPG's native format
- First run (no ini) uses the hardcoded `pos=` defaults in `_build_ui()`; after that the ini takes over
- Delete `dpg_layout.ini` to reset to defaults

**Project management:**
- `config.toml` is global-only: `[ai]`, `[theme]`, `[projects]` (paths list + active path). No project data lives here.
- Each project has its own `.toml` file (e.g. `manual_slop.toml`). Multiple project tomls can be registered by path.
- `App.__init__` loads global config, then loads the active project `.toml` via `project_manager.load_project()`. Falls back to `migrate_from_legacy_config()` if no valid project file exists, creating a new `.toml` automatically.
- `_flush_to_project()` pulls widget values into `self.project` (the per-project dict) and serialises disc_entries into the active discussion's history list
- `_flush_to_config()` writes global settings ([ai], [theme], [projects]) into `self.config`
- `_save_active_project()` writes `self.project` to the active `.toml` path via `project_manager.save_project()`
- `_do_generate()` calls both flush methods, saves both files, then uses `project_manager.flat_config()` to produce the dict that `aggregate.run()` expects, so `aggregate.py` needs zero changes
- Switching projects: saves current project, loads new one, refreshes all GUI state, resets AI session
- New project: file dialog for save path, creates default project structure, saves it, switches to it

**Discussion management (per-project):**
- Each project `.toml` stores one or more named discussions under `[discussion.discussions.]`
- Each discussion has: `git_commit` (str), `last_updated` (ISO timestamp), `history` (list of serialised entry strings)
- `active` key in `[discussion]` tracks which discussion is currently selected
- Creating a discussion: adds a new empty discussion dict via `default_discussion()`, switches to it
- Renaming: moves the dict to a new key, updates `active` if it was the current one
- Deleting: removes the dict; cannot delete the last discussion; switches to first remaining if active was deleted
- Switching: flushes current entries to project, loads new discussion's history, rebuilds disc list
- Update Commit button: runs `git rev-parse HEAD` in the project's `git_dir` and stores result + timestamp in the active discussion
- Timestamps: each disc entry carries a `ts` field (ISO datetime); shown next to the role combo; new entries from `-> History` or `+ Entry` get `now_ts()`

**Entry serialisation (project_manager):**
- `entry_to_str(entry)` → `"@\n:\n"` (or `":\n"` if no ts)
- `str_to_entry(raw, roles)` → parses optional `@` prefix, then role line, then content; returns `{role, content, collapsed, ts}`
- Round-trips correctly through TOML string arrays; handles legacy entries without timestamps

**AI Tool Use (PowerShell):**
- Both Gemini and Anthropic are configured with a `run_powershell` tool/function declaration
- When the AI wants to edit or create files it emits a tool call with a `script` string
- `ai_client` runs a loop (max `MAX_TOOL_ROUNDS = 5`) feeding tool results back until the AI stops calling tools
- Before any script runs, `gui.py` shows a modal `ConfirmDialog` on the main thread; the background send thread blocks on a `threading.Event` until the user clicks Approve or Reject
- The dialog displays `base_dir`, shows the script in an editable text box (allowing last-second tweaks), and has Approve & Run / Reject buttons
- On approval the (possibly edited) script is passed to `shell_runner.run_powershell()` which prepends `Set-Location -LiteralPath ''` and runs it via `powershell -NoProfile -NonInteractive -Command`
- stdout, stderr, and exit code are returned to the AI as the tool result
- Rejections return `"USER REJECTED: command was not executed"` to the AI
- All tool calls (script + result/rejection) are appended to `_tool_log` and displayed in the Tool Calls panel

**Dynamic file context refresh (ai_client.py):**
- After every tool call round, all project files from `file_items` are re-read from disk via `_reread_file_items()`
- For Anthropic: the refreshed file contents are injected as a `text` block appended to the `tool_results` user message, prefixed with `[FILES UPDATED]` and an instruction not to re-read them
- For Gemini: files are re-read (updating the `file_items` list in place) but cannot be injected into tool results due to Gemini's structured function response format
- `_build_file_context_text(file_items)` formats the refreshed files as markdown code blocks (same format as the original context)
- The `tool_result_send` comms log entry filters out the injected text block (only logs actual `tool_result` entries) to keep the comms panel clean
- `file_items` flows from `aggregate.build_file_items()` → `gui.py` `self.last_file_items` → `ai_client.send(file_items=...)` → `_send_anthropic(file_items=...)` / `_send_gemini(file_items=...)`
- System prompt updated to tell the AI: "the user's context files are automatically refreshed after every tool call, so you do NOT need to re-read files that are already provided in the block"

**Anthropic bug fixes applied (session history):**
- Bug 1: SDK ContentBlock objects now converted to plain dicts via `_content_block_to_dict()` before storing in `_anthropic_history`; prevents re-serialisation failures on subsequent tool-use rounds
- Bug 2: `_repair_anthropic_history` simplified to dict-only path since history always contains dicts
- Bug 3: Gemini part.function_call access now guarded with `hasattr` check
- Bug 4: Anthropic `b.type == "tool_use"` changed to `getattr(b, "type", None) == "tool_use"` for safe access during response processing

**Comms Log (ai_client.py):**
- `_comms_log: list[dict]` accumulates every API interaction during a session
- `_append_comms(direction, kind, payload)` called at each boundary: OUT/request before sending, IN/response after each model reply, OUT/tool_call before executing, IN/tool_result after executing, OUT/tool_result_send when returning results to the model
- Entry fields: `ts` (HH:MM:SS), `direction` (OUT/IN), `kind`, `provider`, `model`, `payload` (dict)
- Anthropic responses also include `usage` (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens) and `stop_reason` in payload
- `get_comms_log()` returns a snapshot; `clear_comms_log()` empties it
- `comms_log_callback` (injected by gui.py) is called from the background thread with each new entry; gui queues entries in `_pending_comms` (lock-protected) and flushes them to the DPG panel each render frame
- `COMMS_CLAMP_CHARS = 300` in gui.py governs the display cutoff for heavy text fields

**Comms History panel - rich structured rendering (gui.py):**

Rather than showing raw JSON, each comms entry is rendered using a kind-specific renderer function. Unknown kinds fall back to a generic key/value layout.

Colour maps:
- Direction: OUT = blue-ish `(100,200,255)`, IN = green-ish `(140,255,160)`
- Kind: request=gold, response=light-green, tool_call=orange, tool_result=light-blue, tool_result_send=lavender
- Labels: grey `(180,180,180)`; values: near-white `(220,220,220)`; dict keys/indices: `(140,200,255)`; numbers/token counts: `(180,255,180)`; sub-headers: `(220,200,120)`

Helper functions:
- `_add_text_field(parent, label, value)` - labelled text; strings longer than `COMMS_CLAMP_CHARS` render as an 80px readonly scrollable `input_text`; shorter strings render as `add_text`
- `_add_kv_row(parent, key, val)` - single horizontal key: value row
- `_render_usage(parent, usage)` - renders Anthropic token usage dict in a fixed display order (input → cache_read → cache_creation → output)
- `_render_tool_calls_list(parent, tool_calls)` - iterates tool call list, showing name, id, and all args via `_add_text_field`

Kind-specific renderers (in `_KIND_RENDERERS` dict, dispatched by `_render_comms_entry`):
- `_render_payload_request` - shows `message` field via `_add_text_field`
- `_render_payload_response` - shows round, stop_reason (orange), text, tool_calls list, usage block
- `_render_payload_tool_call` - shows name, optional id, script via `_add_text_field`
- `_render_payload_tool_result` - shows name, optional id, output via `_add_text_field`
- `_render_payload_tool_result_send` - iterates results list, shows tool_use_id and content per result
- `_render_payload_generic` - fallback for unknown kinds; renders all keys, using `_add_text_field` for keys in `_HEAVY_KEYS`, `_add_kv_row` for others; dicts/lists are JSON-serialised

Entry layout: index + timestamp + direction + kind + provider/model header row, then payload rendered by the appropriate function, then a separator line.
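
The dispatch described above (a dict of kind-specific renderers with a generic fallback) can be sketched without any GUI dependency. This is an illustrative sketch only, not the code in `gui.py`: the renderers here return plain strings instead of creating DPG widgets, `render_comms_entry` and `_clamp_tag` are hypothetical helpers, and the membership of `_HEAVY_KEYS` is an assumption.

```python
import json

COMMS_CLAMP_CHARS = 300  # threshold named in the text; here it only tags long values
# Assumed membership, for illustration; the real set lives in gui.py.
_HEAVY_KEYS = {"message", "text", "script", "output", "content"}


def _clamp_tag(value: str) -> str:
    """Say which widget style a value would get: scrollable box or plain text."""
    return "scroll" if len(value) > COMMS_CLAMP_CHARS else "text"


def _render_payload_request(payload: dict) -> list[str]:
    # A request payload only needs its message field.
    msg = str(payload.get("message", ""))
    return [f"message [{_clamp_tag(msg)}]: {msg[:40]}"]


def _render_payload_generic(payload: dict) -> list[str]:
    # Fallback: render every key; dicts and lists are JSON-serialised first.
    rows = []
    for key, val in payload.items():
        if isinstance(val, (dict, list)):
            val = json.dumps(val)
        val = str(val)
        if key in _HEAVY_KEYS:
            rows.append(f"{key} [{_clamp_tag(val)}]: {val[:40]}")
        else:
            rows.append(f"{key}: {val}")
    return rows


# Only one kind is registered here; unknown kinds use the generic renderer.
_KIND_RENDERERS = {"request": _render_payload_request}


def render_comms_entry(index: int, entry: dict) -> list[str]:
    """Header row, then the payload via the matching renderer, then a separator."""
    header = (
        f"#{index} {entry['ts']} {entry['direction']} {entry['kind']} "
        f"{entry['provider']}/{entry['model']}"
    )
    renderer = _KIND_RENDERERS.get(entry["kind"], _render_payload_generic)
    return [header, *renderer(entry["payload"]), "-" * 20]
```

Looking up the renderer with `dict.get` and a default keeps the fallback in one place, so adding a new kind means registering one function rather than extending a chain of conditionals.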
**Session Logger (session_logger.py):**
- `open_session()` called once at GUI startup; creates `logs/` and `scripts/generated/` directories; opens `logs/comms_.log` and `logs/toolcalls_.log` (line-buffered)
- `log_comms(entry)` appends each comms entry as a JSON-L line to the comms log; called from `App._on_comms_entry` (background thread); thread-safe via GIL + line buffering
- `log_tool_call(script, result, script_path)` writes the script to `scripts/generated/_.ps1` and appends a markdown record to the toolcalls log without the script body (just the file path + result); uses a `threading.Lock` for the sequence counter
- `close_session()` flushes and closes both file handles; called just before `dpg.destroy_context()`

**Anthropic prompt caching:**
- System prompt sent as an array with `cache_control: ephemeral` on the text block
- Last tool in `_ANTHROPIC_TOOLS` has `cache_control: ephemeral`; system + tools prefix is cached together after the first request
- First user message content[0] is the `` block with `cache_control: ephemeral`; content[1] is the user question without cache control
- Cache stats (creation tokens, read tokens) are surfaced in the comms log usage dict and displayed in the Comms History panel

**Data flow:**
1. GUI edits are held in `App` state (`self.files`, `self.screenshots`, `self.disc_entries`, `self.project`) and dpg widget values
2. `_flush_to_project()` pulls all widget values into `self.project` dict (per-project data)
3. `_flush_to_config()` pulls global settings into `self.config` dict
4. `_do_generate()` calls both flush methods, saves both files, calls `project_manager.flat_config(self.project, disc_name)` to produce a dict for `aggregate.run()`, which writes the md and returns `(markdown_str, path, file_items)`
5. `cb_generate_send()` calls `_do_generate()` then threads a call to `ai_client.send(md, message, base_dir)`
6. `ai_client.send()` prepends the md as a `` block to the user message and sends via the active provider chat session
7. If the AI responds with tool calls, the loop handles them (with GUI confirmation) before returning the final text response
8. Sessions are stateful within a run (chat history maintained), `Reset` clears them, the tool log, and the comms log

**Config persistence:**
- `config.toml` - global only: `[ai]` provider+model, `[theme]` palette+font+scale, `[projects]` paths array + active path
- `.toml` - per-project: output, files, screenshots, discussion (roles, active discussion name, all named discussions with their history+metadata)
- On every send and save, both files are written
- On clean exit, `run()` calls `_flush_to_project()`, `_save_active_project()`, `_flush_to_config()`, `save_config()` before destroying context

**Threading model:**
- DPG render loop runs on the main thread
- AI sends and model fetches run on daemon background threads
- `_pending_dialog` (guarded by a `threading.Lock`) is set by the background thread and consumed by the render loop each frame, calling `dialog.show()` on the main thread
- `dialog.wait()` blocks the background thread on a `threading.Event` until the user acts
- `_pending_comms` (guarded by a separate `threading.Lock`) is populated by `_on_comms_entry` (background thread) and drained by `_flush_pending_comms()` each render frame (main thread)

**Provider error handling:**
- `ProviderError(kind, provider, original)` wraps upstream API exceptions with a classified `kind`: quota, rate_limit, auth, balance, network, unknown
- `_classify_anthropic_error` and `_classify_gemini_error` inspect exception types and status codes/message bodies to assign the kind
- `ui_message()` returns a human-readable label for display in the Response panel

**MCP file tools (mcp_client.py + ai_client.py):**
- Four read-only tools exposed to the AI as native function/tool declarations: `read_file`, `list_directory`, `search_files`, `get_file_summary`
- Access control: `mcp_client.configure(file_items, extra_base_dirs)` is called before each send; builds an allowlist of resolved absolute paths from the project's `file_items` plus the `base_dir`; any path that is not explicitly in the list or not under one of the allowed directories returns `ACCESS DENIED`
- `mcp_client.dispatch(tool_name, tool_input)` is the single dispatch entry point used by both Anthropic and Gemini tool-use loops
- Anthropic: MCP tools appear before `run_powershell` in the tools list (no `cache_control` on them; only `run_powershell` carries `cache_control: ephemeral`)
- Gemini: MCP tools are included in the `FunctionDeclaration` list alongside `run_powershell`
- `get_file_summary` uses `summarize.summarise_file()` - same heuristic used for the initial `` block, so the AI gets the same compact structural view it already knows
- `list_directory` sorts dirs before files; shows name, type, and size
- `search_files` uses `Path.glob()` with the caller-supplied pattern (supports `**/*.py` style)
- `read_file` returns raw UTF-8 text; errors (not found, access denied, decode error) are returned as error strings rather than exceptions, so the AI sees them as tool results
- `summarize.py` heuristics: `.py` → AST imports + ALL_CAPS constants + classes+methods + top-level functions; `.toml` → table headers + top-level keys; `.md` → h1–h3 headings with indentation; all others → line count + first 8 lines preview
- Comms log: MCP tool calls log `OUT/tool_call` with `{"name": ..., "args": {...}}` and `IN/tool_result` with `{"name": ..., "output": ...}`; rendered in the Comms History panel via `_render_payload_tool_call` (shows each arg key/value) and `_render_payload_tool_result` (shows output)

**Known extension points:**
- Add more providers by adding a section to `credentials.toml`, a `_list_*` and `_send_*` function in `ai_client.py`, and the provider name to the `PROVIDERS` list in `gui.py`
- System prompt support could be added as a field in the project `.toml` and passed in `ai_client.send()`
- Discussion history excerpts could be individually toggleable for inclusion in the generated md
- `MAX_TOOL_ROUNDS` in `ai_client.py` caps agentic loops at 5 rounds; adjustable
- `COMMS_CLAMP_CHARS` in `gui.py` controls the character threshold for clamping heavy payload fields in the Comms History panel
- Additional project metadata (description, tags, created date) could be added to `[project]` in the per-project toml

### Gemini Context Management

- Investigating ways to prevent context duplication in `_gemini_chat` history, as currently `{md_content}` is prepended to the user message on every single request, causing history bloat.
- Discussing explicit Gemini Context Caching API (`client.caches.create()`) to store read-only file context and avoid re-reading files across sessions.

### Latest Changes

- Removed `Config` panel from the GUI to streamline per-project configuration.
- `output_dir` was moved into the Projects panel.
- `auto_add_history` was moved to the Discussion History panel.
- `namespace` is no longer a configurable field; `aggregate.py` automatically uses the active project's `name` property.

### UI / Visual Updates

- The success blink notification on the response text box is now dimmer and more transparent to be less visually jarring.
- Added a new floating **Last Script Output** popup window. This window automatically displays and blinks blue whenever the AI executes a PowerShell tool, showing both the executed script and its result in real-time.

## Recent Changes (Text Viewer Maximization)

- **Global Text Viewer (gui.py)**: Added a dedicated, large popup window (`win_text_viewer`) to allow reading and scrolling through large, dense text blocks without feeling cramped.
- **Comms History**: Every multi-line text field in the comms log now has a [+] button next to its label that opens the text in the Global Text Viewer.
- **Tool Log History**: Added [+ Script] and [+ Output] buttons next to each logged tool call to easily maximize and read the full executed scripts and raw tool outputs.
- **Last Script Output Popup**: Expanded the default size of the popup (now 800x600) and gave the input script panel more vertical space to prevent it from feeling 'scrunched'. Added [+ Maximize] buttons for both the script and the output sections to inspect them in full detail.
- **Confirm Dialog**: The script confirmation modal now has a [+ Maximize] button so you can read large generated scripts in full-screen before approving them.
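
The confirmation modal mentioned in the last item depends on the handshake described under the threading model: a background thread posts a request, the main thread picks it up on a later frame, and the background thread stays blocked on a `threading.Event` until the user decides. The following is a minimal sketch of that pattern under stated assumptions. `ConfirmRequest`, `request_confirmation`, and `take_pending` are hypothetical names, no GUI is involved, and the approved branch returns a placeholder string instead of running PowerShell.

```python
import threading
from typing import Optional


class ConfirmRequest:
    """Hypothetical stand-in for the dialog: holds the script and the decision."""

    def __init__(self, script: str) -> None:
        self.script = script
        self.approved = False
        self._done = threading.Event()

    def resolve(self, approved: bool, edited_script: Optional[str] = None) -> None:
        # Main-thread side: record the decision (and any edit), then wake the waiter.
        if edited_script is not None:
            self.script = edited_script
        self.approved = approved
        self._done.set()

    def wait(self, timeout: Optional[float] = None) -> bool:
        # Background-thread side: block until resolve() has been called.
        return self._done.wait(timeout)


_pending_lock = threading.Lock()
_pending_dialog: Optional[ConfirmRequest] = None


def request_confirmation(script: str) -> str:
    """Background thread: post a request, block, then return the tool result."""
    global _pending_dialog
    req = ConfirmRequest(script)
    with _pending_lock:
        _pending_dialog = req
    req.wait()
    if not req.approved:
        return "USER REJECTED: command was not executed"
    # Placeholder for illustration; the real tool would execute the script here.
    return f"WOULD RUN: {req.script}"


def take_pending() -> Optional[ConfirmRequest]:
    """Main thread: called once per frame to claim a posted request, if any."""
    global _pending_dialog
    with _pending_lock:
        req, _pending_dialog = _pending_dialog, None
    return req
```

Swapping the pending slot to `None` inside the lock means each request is claimed exactly once, even if the render loop polls again before the user has answered.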