12 KiB
Make sure to update this file every time.
manual_slop is a local GUI tool for manually curating and sending context to AI APIs. It aggregates files, screenshots, and discussion history into a structured markdown file and sends it to a chosen AI provider with a user-written message. The AI can also execute PowerShell scripts within the project directory, with user confirmation required before each execution.
Stack:
dearpygui- GUI with docking/floating/resizable panelsgoogle-genai- Gemini APIanthropic- Anthropic APItomli-w- TOML writinguv- package/env management
Files:
gui.py- main GUI,Appclass, all panels, all callbacks, confirmation dialog, layout persistence, rich comms renderingai_client.py- unified provider wrapper, model listing, session management, send, tool/function-call loop, comms log, provider error classificationaggregate.py- reads config, collects files/screenshots/discussion, writes numbered.mdfiles tooutput_dirshell_runner.py- subprocess wrapper that runs PowerShell scripts sandboxed tobase_dir, returns stdout/stderr/exit code as a stringsession_logger.py- opens timestamped log files at session start; writes comms entries as JSON-L and tool calls as markdown; saves each AI-generated script as a.ps1filegemini.py- legacy standalone Gemini wrapper (not used by the main GUI; superseded byai_client.py)config.toml- namespace, output_dir, files paths+base_dir, screenshots paths+base_dir, discussion history array, ai provider+modelcredentials.toml- gemini api_key, anthropic api_keydpg_layout.ini- Dear PyGui window layout file (auto-saved on exit, auto-loaded on startup); gitignore this per-user
GUI Panels:
- Config - namespace, output dir, save
- Files - base_dir, scrollable path list with remove, add file(s), add wildcard
- Screenshots - base_dir, scrollable path list with remove, add screenshot(s)
- Discussion History - structured block editor; each entry has a role combo (User/AI/Vendor API/System) and a multiline content field; buttons: Insert Before, Remove per entry; global buttons: + Entry, Clear All, Save;
-> Historybuttons on Message and Response panels append the current message/response as a new entry - Provider - provider combo (gemini/anthropic), model listbox populated from API, fetch models button
- Message - multiline input, Gen+Send button, MD Only button, Reset session button
- Response - readonly multiline displaying last AI response
- Tool Calls - scrollable log of every PowerShell tool call the AI made; shows first line of script + result (script body omitted from display, full script saved to
.ps1file via session_logger); Clear button - Comms History - rich structured live log of every API interaction; status line at top; colour legend; Clear button; each entry rendered with kind-specific layout rather than raw JSON
Layout persistence:
dpg.configure_app(..., init_file="dpg_layout.ini")loads the ini at startup if it exists; DPG silently ignores a missing filedpg.save_init_file("dpg_layout.ini")is called immediately beforedpg.destroy_context()on clean exit- The ini records window positions, sizes, and dock node assignments in DPG's native format
- First run (no ini) uses the hardcoded
pos=defaults in_build_ui(); after that the ini takes over - Delete
dpg_layout.inito reset to defaults
AI Tool Use (PowerShell):
- Both Gemini and Anthropic are configured with a
run_powershelltool/function declaration - When the AI wants to edit or create files it emits a tool call with a
scriptstring ai_clientruns a loop (maxMAX_TOOL_ROUNDS = 5) feeding tool results back until the AI stops calling tools- Before any script runs,
gui.pyshows a modalConfirmDialogon the main thread; the background send thread blocks on athreading.Eventuntil the user clicks Approve or Reject - The dialog displays
base_dir, shows the script in an editable text box (allowing last-second tweaks), and has Approve & Run / Reject buttons - On approval the (possibly edited) script is passed to
shell_runner.run_powershell()which prependsSet-Location -LiteralPath '<base_dir>'and runs it viapowershell -NoProfile -NonInteractive -Command - stdout, stderr, and exit code are returned to the AI as the tool result
- Rejections return
"USER REJECTED: command was not executed"to the AI - All tool calls (script + result/rejection) are appended to
_tool_logand displayed in the Tool Calls panel
Comms Log (ai_client.py):
_comms_log: list[dict]accumulates every API interaction during a session_append_comms(direction, kind, payload)called at each boundary: OUT/request before sending, IN/response after each model reply, OUT/tool_call before executing, IN/tool_result after executing, OUT/tool_result_send when returning results to the model- Entry fields:
ts(HH:MM:SS),direction(OUT/IN),kind,provider,model,payload(dict) - Anthropic responses also include
usage(input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens) andstop_reasonin payload get_comms_log()returns a snapshot;clear_comms_log()empties itcomms_log_callback(injected by gui.py) is called from the background thread with each new entry; gui queues entries in_pending_comms(lock-protected) and flushes them to the DPG panel each render frameMAX_FIELD_CHARS = 400in ai_client (unused in display logic; kept as reference);COMMS_CLAMP_CHARS = 300in gui.py governs the display cutoff for heavy text fields
Comms History panel — rich structured rendering (gui.py):
Rather than showing raw JSON, each comms entry is rendered using a kind-specific renderer function. Unknown kinds fall back to a generic key/value layout.
Colour maps:
- Direction: OUT = blue-ish
(100,200,255), IN = green-ish(140,255,160) - Kind: request=gold, response=light-green, tool_call=orange, tool_result=light-blue, tool_result_send=lavender
- Labels: grey
(180,180,180); values: near-white(220,220,220); dict keys/indices:(140,200,255); numbers/token counts:(180,255,180); sub-headers:(220,200,120)
Helper functions:
_add_text_field(parent, label, value)— labelled text; strings longer thanCOMMS_CLAMP_CHARSrender as an 80px readonly scrollableinput_text; shorter strings render asadd_text_add_kv_row(parent, key, val)— single horizontal key: value row_render_usage(parent, usage)— renders Anthropic token usage dict in a fixed display order (input → cache_read → cache_creation → output)_render_tool_calls_list(parent, tool_calls)— iterates tool call list, showing name, id, and all args via_add_text_field
Kind-specific renderers (in _KIND_RENDERERS dict, dispatched by _render_comms_entry):
_render_payload_request— showsmessagefield via_add_text_field_render_payload_response— shows round, stop_reason (orange), text, tool_calls list, usage block_render_payload_tool_call— shows name, optional id, script via_add_text_field_render_payload_tool_result— shows name, optional id, output via_add_text_field_render_payload_tool_result_send— iterates results list, shows tool_use_id and content per result_render_payload_generic— fallback for unknown kinds; renders all keys, using_add_text_fieldfor keys in_HEAVY_KEYS,_add_kv_rowfor others; dicts/lists are JSON-serialised
Entry layout: index + timestamp + direction + kind + provider/model header row, then payload rendered by the appropriate function, then a separator line.
Status line and colour legend live at the top of the Comms History window (above the scrollable child window comms_scroll).
Session Logger (session_logger.py):
open_session()called once at GUI startup; createslogs/andscripts/generated/directories; openslogs/comms_<ts>.logandlogs/toolcalls_<ts>.log(line-buffered)log_comms(entry)appends each comms entry as a JSON-L line to the comms log; called fromApp._on_comms_entry(background thread); thread-safe via GIL + line bufferinglog_tool_call(script, result, script_path)writes the script toscripts/generated/<ts>_<seq:04d>.ps1and appends a markdown record to the toolcalls log without the script body (just the file path + result), keeping the log readable; uses athreading.Lockfor the sequence counterclose_session()flushes and closes both file handles; called just beforedpg.destroy_context()_on_tool_loginAppis wired toai_client.tool_log_callbackand callssession_logger.log_tool_call
Anthropic prompt caching:
- System prompt sent as an array with
cache_control: ephemeralon the text block - Last tool in
_ANTHROPIC_TOOLShascache_control: ephemeral; system + tools prefix is cached together after the first request - First user message content[0] is the
<context>block withcache_control: ephemeral; content[1] is the user question without cache control - Cache stats (creation tokens, read tokens) are surfaced in the comms log usage dict and displayed in the Comms History panel
Data flow:
- GUI edits are held in
Appstate lists (self.files,self.screenshots,self.history) and dpg widget values _flush_to_config()pulls all widget values intoself.configdict_do_generate()calls_flush_to_config(), savesconfig.toml, callsaggregate.run(config)which writes the md and returns(markdown_str, path)cb_generate_send()calls_do_generate()then threads a call toai_client.send(md, message, base_dir)ai_client.send()prepends the md as a<context>block to the user message and sends via the active provider chat session- If the AI responds with tool calls, the loop handles them (with GUI confirmation) before returning the final text response
- Sessions are stateful within a run (chat history maintained),
Resetclears them, the tool log, and the comms log
Config persistence:
- Every send and save writes
config.tomlwith current state including selected provider and model under[ai] - Discussion history is stored as a TOML array of strings in
[discussion] history - File and screenshot paths are stored as TOML arrays, support absolute paths, relative paths from base_dir, and
**/*wildcards
Threading model:
- DPG render loop runs on the main thread
- AI sends and model fetches run on daemon background threads
_pending_dialog(guarded by athreading.Lock) is set by the background thread and consumed by the render loop each frame, callingdialog.show()on the main threaddialog.wait()blocks the background thread on athreading.Eventuntil the user acts_pending_comms(guarded by a separatethreading.Lock) is populated by_on_comms_entry(background thread) and drained by_flush_pending_comms()each render frame (main thread)
Provider error handling:
ProviderError(kind, provider, original)wraps upstream API exceptions with a classifiedkind: quota, rate_limit, auth, balance, network, unknown_classify_anthropic_errorand_classify_gemini_errorinspect exception types and status codes/message bodies to assign the kindui_message()returns a human-readable label for display in the Response panel
Known extension points:
- Add more providers by adding a section to
credentials.toml, a_list_*and_send_*function inai_client.py, and the provider name to thePROVIDERSlist ingui.py - System prompt support could be added as a field in
config.tomland passed inai_client.send() - Discussion history excerpts could be individually toggleable for inclusion in the generated md
MAX_TOOL_ROUNDSinai_client.pycaps agentic loops at 5 rounds; adjustableCOMMS_CLAMP_CHARSingui.pycontrols the character threshold for clamping heavy payload fields in the Comms History panel