manual_slop/MainContext.md
2026-02-21 18:36:37 -05:00

Make sure to update this file every time the project changes.

manual_slop is a local GUI tool for manually curating and sending context to AI APIs. It aggregates files, screenshots, and discussion history into a structured markdown file and sends it to a chosen AI provider with a user-written message. The AI can also execute PowerShell scripts within the project directory, with user confirmation required before each execution.

Stack:

  • dearpygui - GUI with docking/floating/resizable panels
  • google-genai - Gemini API
  • anthropic - Anthropic API
  • tomli-w - TOML writing
  • uv - package/env management

Files:

  • gui.py - main GUI, App class, all panels, all callbacks, confirmation dialog, layout persistence, rich comms rendering
  • ai_client.py - unified provider wrapper, model listing, session management, send, tool/function-call loop, comms log, provider error classification
  • aggregate.py - reads config, collects files/screenshots/discussion, writes numbered .md files to output_dir
  • shell_runner.py - subprocess wrapper that runs PowerShell scripts sandboxed to base_dir, returns stdout/stderr/exit code as a string
  • session_logger.py - opens timestamped log files at session start; writes comms entries as JSON-L and tool calls as markdown; saves each AI-generated script as a .ps1 file
  • gemini.py - legacy standalone Gemini wrapper (not used by the main GUI; superseded by ai_client.py)
  • config.toml - namespace, output_dir, files paths+base_dir, screenshots paths+base_dir, discussion history array, ai provider+model
  • credentials.toml - gemini api_key, anthropic api_key
  • dpg_layout.ini - Dear PyGui window layout file (auto-saved on exit, auto-loaded on startup); per-user file, keep it gitignored
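
A rough sketch of what config.toml might look like; the top-level keys come from the description above, but the exact table names, key spellings, and example values are assumptions, not the real file:

```toml
namespace = "myproject"
output_dir = "output"

[files]
base_dir = "C:/work/myproject"
paths = ["src/**/*.py", "README.md"]

[screenshots]
base_dir = "C:/work/myproject"
paths = ["shots/latest.png"]

[discussion]
history = ["first saved entry", "second saved entry"]

[ai]
provider = "gemini"
model = "example-model-name"
```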

GUI Panels:

  • Config - namespace, output dir, save
  • Files - base_dir, scrollable path list with remove, add file(s), add wildcard
  • Screenshots - base_dir, scrollable path list with remove, add screenshot(s)
  • Discussion History - structured block editor; each entry has a role combo (User/AI/Vendor API/System) and a multiline content field; buttons: Insert Before, Remove per entry; global buttons: + Entry, Clear All, Save; -> History buttons on Message and Response panels append the current message/response as a new entry
  • Provider - provider combo (gemini/anthropic), model listbox populated from API, fetch models button
  • Message - multiline input, Gen+Send button, MD Only button, Reset session button
  • Response - readonly multiline displaying last AI response
  • Tool Calls - scrollable log of every PowerShell tool call the AI made; shows first line of script + result (script body omitted from display, full script saved to .ps1 file via session_logger); Clear button
  • Comms History - rich structured live log of every API interaction; status line at top; colour legend; Clear button; each entry rendered with kind-specific layout rather than raw JSON

Layout persistence:

  • dpg.configure_app(..., init_file="dpg_layout.ini") loads the ini at startup if it exists; DPG silently ignores a missing file
  • dpg.save_init_file("dpg_layout.ini") is called immediately before dpg.destroy_context() on clean exit
  • The ini records window positions, sizes, and dock node assignments in DPG's native format
  • First run (no ini) uses the hardcoded pos= defaults in _build_ui(); after that the ini takes over
  • Delete dpg_layout.ini to reset to defaults

AI Tool Use (PowerShell):

  • Both Gemini and Anthropic are configured with a run_powershell tool/function declaration
  • When the AI wants to edit or create files it emits a tool call with a script string
  • ai_client runs a loop (max MAX_TOOL_ROUNDS = 5) feeding tool results back until the AI stops calling tools
  • Before any script runs, gui.py shows a modal ConfirmDialog on the main thread; the background send thread blocks on a threading.Event until the user clicks Approve or Reject
  • The dialog displays base_dir, shows the script in an editable text box (allowing last-second tweaks), and has Approve & Run / Reject buttons
  • On approval the (possibly edited) script is passed to shell_runner.run_powershell() which prepends Set-Location -LiteralPath '<base_dir>' and runs it via powershell -NoProfile -NonInteractive -Command
  • stdout, stderr, and exit code are returned to the AI as the tool result
  • Rejections return "USER REJECTED: command was not executed" to the AI
  • All tool calls (script + result/rejection) are appended to _tool_log and displayed in the Tool Calls panel
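
The confirm-and-execute loop above can be sketched as follows; `send_with_tools` and its callback parameters are hypothetical stand-ins, not the real ai_client API:

```python
MAX_TOOL_ROUNDS = 5  # cap on agentic rounds, as in ai_client.py


def send_with_tools(model_reply, confirm, run_script):
    """Feed tool results back to the model until it stops calling tools.

    model_reply(tool_results) -> (text, tool_calls) stands in for one
    provider round-trip; confirm(script) returns the (possibly edited)
    script, or None on rejection; run_script executes approved scripts.
    """
    results = None
    text = ""
    for _ in range(MAX_TOOL_ROUNDS):
        text, tool_calls = model_reply(results)
        if not tool_calls:
            return text  # model produced a final answer
        results = []
        for call in tool_calls:
            approved = confirm(call["script"])  # blocks on the user's decision
            if approved is None:
                results.append("USER REJECTED: command was not executed")
            else:
                results.append(run_script(approved))
    return text  # rounds exhausted; return the last text seen
```

The same structure works for both providers: only `model_reply`'s internals (Gemini vs Anthropic chat session) differ.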

Comms Log (ai_client.py):

  • _comms_log: list[dict] accumulates every API interaction during a session
  • _append_comms(direction, kind, payload) called at each boundary: OUT/request before sending, IN/response after each model reply, OUT/tool_call before executing, IN/tool_result after executing, OUT/tool_result_send when returning results to the model
  • Entry fields: ts (HH:MM:SS), direction (OUT/IN), kind, provider, model, payload (dict)
  • Anthropic responses also include usage (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens) and stop_reason in payload
  • get_comms_log() returns a snapshot; clear_comms_log() empties it
  • comms_log_callback (injected by gui.py) is called from the background thread with each new entry; gui queues entries in _pending_comms (lock-protected) and flushes them to the DPG panel each render frame
  • MAX_FIELD_CHARS = 400 in ai_client (unused in display logic; kept as reference); COMMS_CLAMP_CHARS = 300 in gui.py governs the display cutoff for heavy text fields
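
A minimal sketch of the accumulation pattern described above; the class and method names are hypothetical (ai_client uses module-level `_comms_log` and `_append_comms`), but the entry fields match the list:

```python
import threading
import time


class CommsLog:
    """Accumulates API-interaction entries and notifies an injected callback."""

    def __init__(self, provider, model, callback=None):
        self._entries = []
        self._lock = threading.Lock()
        self.provider = provider
        self.model = model
        self.callback = callback  # e.g. the GUI's _on_comms_entry

    def append(self, direction, kind, payload):
        entry = {
            "ts": time.strftime("%H:%M:%S"),
            "direction": direction,  # "OUT" or "IN"
            "kind": kind,            # request / response / tool_call / ...
            "provider": self.provider,
            "model": self.model,
            "payload": payload,
        }
        with self._lock:
            self._entries.append(entry)
        if self.callback:
            # runs on whatever thread called append (usually the send thread)
            self.callback(entry)

    def snapshot(self):
        with self._lock:
            return list(self._entries)

    def clear(self):
        with self._lock:
            self._entries.clear()
```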

Comms History panel — rich structured rendering (gui.py):

Rather than showing raw JSON, each comms entry is rendered using a kind-specific renderer function. Unknown kinds fall back to a generic key/value layout.

Colour maps:

  • Direction: OUT = blue-ish (100,200,255), IN = green-ish (140,255,160)
  • Kind: request=gold, response=light-green, tool_call=orange, tool_result=light-blue, tool_result_send=lavender
  • Labels: grey (180,180,180); values: near-white (220,220,220); dict keys/indices: (140,200,255); numbers/token counts: (180,255,180); sub-headers: (220,200,120)

Helper functions:

  • _add_text_field(parent, label, value) — labelled text; strings longer than COMMS_CLAMP_CHARS render as an 80px readonly scrollable input_text; shorter strings render as add_text
  • _add_kv_row(parent, key, val) — single horizontal key: value row
  • _render_usage(parent, usage) — renders Anthropic token usage dict in a fixed display order (input → cache_read → cache_creation → output)
  • _render_tool_calls_list(parent, tool_calls) — iterates tool call list, showing name, id, and all args via _add_text_field

Kind-specific renderers (in _KIND_RENDERERS dict, dispatched by _render_comms_entry):

  • _render_payload_request — shows message field via _add_text_field
  • _render_payload_response — shows round, stop_reason (orange), text, tool_calls list, usage block
  • _render_payload_tool_call — shows name, optional id, script via _add_text_field
  • _render_payload_tool_result — shows name, optional id, output via _add_text_field
  • _render_payload_tool_result_send — iterates results list, shows tool_use_id and content per result
  • _render_payload_generic — fallback for unknown kinds; renders all keys, using _add_text_field for keys in _HEAVY_KEYS, _add_kv_row for others; dicts/lists are JSON-serialised
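
The dispatch-with-fallback pattern can be sketched like this; the renderers return strings here rather than building DPG widgets, and the function names are simplified from the real ones:

```python
def render_request(payload):
    return f"message: {payload.get('message', '')}"


def render_tool_call(payload):
    return f"{payload.get('name')}: {payload.get('script', '')[:40]}"


def render_generic(payload):
    # fallback for unknown kinds: plain key/value lines
    return "\n".join(f"{k}: {v}" for k, v in payload.items())


KIND_RENDERERS = {
    "request": render_request,
    "tool_call": render_tool_call,
    # response / tool_result / tool_result_send would slot in here
}


def render_entry(kind, payload):
    # dict.get with a default function gives the generic fallback for free
    return KIND_RENDERERS.get(kind, render_generic)(payload)
```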

Entry layout: index + timestamp + direction + kind + provider/model header row, then payload rendered by the appropriate function, then a separator line.

Status line and colour legend live at the top of the Comms History window (above the scrollable child window comms_scroll).

Session Logger (session_logger.py):

  • open_session() called once at GUI startup; creates logs/ and scripts/generated/ directories; opens logs/comms_<ts>.log and logs/toolcalls_<ts>.log (line-buffered)
  • log_comms(entry) appends each comms entry as a JSON-L line to the comms log; called from App._on_comms_entry (background thread); thread-safe via GIL + line buffering
  • log_tool_call(script, result, script_path) writes the script to scripts/generated/<ts>_<seq:04d>.ps1 and appends a markdown record to the toolcalls log without the script body (just the file path + result), keeping the log readable; uses a threading.Lock for the sequence counter
  • close_session() flushes and closes both file handles; called just before dpg.destroy_context()
  • _on_tool_log in App is wired to ai_client.tool_log_callback and calls session_logger.log_tool_call
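
A sketch of the logger's shape, assuming the directory layout and filename patterns described above; the class form and exact markdown record format are illustrative:

```python
import json
import threading
import time
from pathlib import Path


class SessionLogger:
    """Timestamped log files plus per-script .ps1 dumps (names are assumptions)."""

    def __init__(self, root="."):
        ts = time.strftime("%Y%m%d_%H%M%S")
        self.logs = Path(root) / "logs"
        self.scripts = Path(root) / "scripts" / "generated"
        self.logs.mkdir(parents=True, exist_ok=True)
        self.scripts.mkdir(parents=True, exist_ok=True)
        # line-buffered so entries hit disk promptly
        self.comms = open(self.logs / f"comms_{ts}.log", "a", buffering=1)
        self.tools = open(self.logs / f"toolcalls_{ts}.log", "a", buffering=1)
        self._seq = 0
        self._seq_lock = threading.Lock()
        self._ts = ts

    def log_comms(self, entry):
        # JSON-L: one JSON object per line
        self.comms.write(json.dumps(entry) + "\n")

    def log_tool_call(self, script, result):
        with self._seq_lock:  # sequence counter guarded against races
            self._seq += 1
            seq = self._seq
        path = self.scripts / f"{self._ts}_{seq:04d}.ps1"
        path.write_text(script, encoding="utf-8")
        # markdown record references the file; script body stays out of the log
        self.tools.write(f"## call {seq}\n\nscript: {path}\nresult:\n{result}\n\n")
        return path

    def close(self):
        self.comms.close()
        self.tools.close()
```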

Anthropic prompt caching:

  • System prompt sent as an array with cache_control: ephemeral on the text block
  • Last tool in _ANTHROPIC_TOOLS has cache_control: ephemeral; system + tools prefix is cached together after the first request
  • First user message content[0] is the <context> block with cache_control: ephemeral; content[1] is the user question without cache control
  • Cache stats (creation tokens, read tokens) are surfaced in the comms log usage dict and displayed in the Comms History panel
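
The cache_control layout above can be assembled as plain dicts in the shape the Anthropic Messages API expects; this is a sketch of the structure only, with illustrative text values:

```python
def build_cached_request(system_text, tools, context_md, question):
    """Assemble the three ephemeral cache breakpoints described above."""
    # system as a block list so the text block can carry cache_control
    system = [
        {"type": "text", "text": system_text,
         "cache_control": {"type": "ephemeral"}},
    ]
    # mark the last tool: the system + tools prefix is then cached together
    tools = [dict(t) for t in tools]
    tools[-1]["cache_control"] = {"type": "ephemeral"}
    messages = [{
        "role": "user",
        "content": [
            {"type": "text",
             "text": f"<context>\n{context_md}\n</context>",
             "cache_control": {"type": "ephemeral"}},  # content[0]: cached
            {"type": "text", "text": question},        # content[1]: not cached
        ],
    }]
    return {"system": system, "tools": tools, "messages": messages}
```

On the second and later requests in a session, the cached prefix shows up as `cache_read_input_tokens` in the usage dict rather than being re-billed at full input rate.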

Data flow:

  1. GUI edits are held in App state lists (self.files, self.screenshots, self.history) and dpg widget values
  2. _flush_to_config() pulls all widget values into self.config dict
  3. _do_generate() calls _flush_to_config(), saves config.toml, calls aggregate.run(config) which writes the md and returns (markdown_str, path)
  4. cb_generate_send() calls _do_generate() then threads a call to ai_client.send(md, message, base_dir)
  5. ai_client.send() prepends the md as a <context> block to the user message and sends via the active provider chat session
  6. If the AI responds with tool calls, the loop handles them (with GUI confirmation) before returning the final text response
  7. Sessions are stateful within a run (chat history maintained), Reset clears them, the tool log, and the comms log

Config persistence:

  • Every send and save writes config.toml with current state including selected provider and model under [ai]
  • Discussion history is stored as a TOML array of strings in [discussion] history
  • File and screenshot paths are stored as TOML arrays, support absolute paths, relative paths from base_dir, and **/* wildcards
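
One way the three path forms could be resolved, sketched with `pathlib`; the function name and exact precedence rules are assumptions, not aggregate.py's actual logic:

```python
from pathlib import Path


def expand_paths(base_dir, entries):
    """Resolve config path entries: absolute, relative to base_dir, or globs."""
    base = Path(base_dir)
    out = []
    for entry in entries:
        p = Path(entry)
        if any(ch in entry for ch in "*?["):
            # wildcard patterns (e.g. **/*.py) are globbed under base_dir
            out.extend(sorted(base.glob(entry)))
        elif p.is_absolute():
            out.append(p)
        else:
            out.append(base / p)
    return out
```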

Threading model:

  • DPG render loop runs on the main thread
  • AI sends and model fetches run on daemon background threads
  • _pending_dialog (guarded by a threading.Lock) is set by the background thread and consumed by the render loop each frame, calling dialog.show() on the main thread
  • dialog.wait() blocks the background thread on a threading.Event until the user acts
  • _pending_comms (guarded by a separate threading.Lock) is populated by _on_comms_entry (background thread) and drained by _flush_pending_comms() each render frame (main thread)
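
The dialog handshake reduces to a small Event-based gate; a minimal sketch with hypothetical names (the real ConfirmDialog also carries the editable script text and DPG widgets):

```python
import threading


class ConfirmGate:
    """Cross-thread approve/reject handshake around a threading.Event."""

    def __init__(self):
        self._event = threading.Event()
        self._result = None

    def wait(self):
        # background send thread parks here until the user acts
        self._event.wait()
        return self._result  # approved (possibly edited) script, or None

    def resolve(self, approved_script):
        # main render thread: pass None for Reject
        self._result = approved_script
        self._event.set()
```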

Provider error handling:

  • ProviderError(kind, provider, original) wraps upstream API exceptions with a classified kind: quota, rate_limit, auth, balance, network, unknown
  • _classify_anthropic_error and _classify_gemini_error inspect exception types and status codes/message bodies to assign the kind
  • ui_message() returns a human-readable label for display in the Response panel
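
A sketch of the wrapper's shape; the kind labels come from the list above, but the ui_message wording and the status-code mapping are illustrative, not the real classifiers:

```python
class ProviderError(Exception):
    """Wraps an upstream API exception with a classified kind."""

    KINDS = ("quota", "rate_limit", "auth", "balance", "network", "unknown")
    _LABELS = {
        "quota": "API quota exhausted",
        "rate_limit": "rate limited, retry shortly",
        "auth": "authentication failed, check credentials.toml",
        "balance": "insufficient account balance",
        "network": "network error, check connectivity",
        "unknown": "unexpected provider error",
    }

    def __init__(self, kind, provider, original):
        super().__init__(f"{provider}: {kind}: {original}")
        self.kind = kind if kind in self.KINDS else "unknown"
        self.provider = provider
        self.original = original

    def ui_message(self):
        return f"[{self.provider}] {self._LABELS[self.kind]}"


def classify_by_status(status_code):
    # illustrative HTTP-status mapping; the real classifiers also
    # inspect exception types and message bodies
    return {401: "auth", 403: "auth", 402: "balance",
            429: "rate_limit"}.get(status_code, "unknown")
```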

Known extension points:

  • Add more providers by adding a section to credentials.toml, a _list_* and _send_* function in ai_client.py, and the provider name to the PROVIDERS list in gui.py
  • System prompt support could be added as a field in config.toml and passed in ai_client.send()
  • Discussion history excerpts could be individually toggleable for inclusion in the generated md
  • MAX_TOOL_ROUNDS in ai_client.py caps agentic loops at 5 rounds; adjustable
  • COMMS_CLAMP_CHARS in gui.py controls the character threshold for clamping heavy payload fields in the Comms History panel