Make sure to update this file every time the project changes.
manual_slop is a local GUI tool for manually curating and sending context to AI APIs. It aggregates files, screenshots, and discussion history into a structured markdown file and sends it to a chosen AI provider with a user-written message. The AI can also execute PowerShell scripts within the project directory, with user confirmation required before each execution.
Stack:
- `dearpygui` - GUI with docking/floating/resizable panels
- `google-genai` - Gemini API
- `anthropic` - Anthropic API
- `tomli-w` - TOML writing
- `uv` - package/env management
Files:
- `gui.py` - main GUI, `App` class, all panels, all callbacks, confirmation dialog, layout persistence
- `ai_client.py` - unified provider wrapper, model listing, session management, send, tool/function-call loop, comms log
- `aggregate.py` - reads config, collects files/screenshots/discussion, writes numbered `.md` files to `output_dir`
- `shell_runner.py` - subprocess wrapper that runs PowerShell scripts sandboxed to `base_dir`, returns stdout/stderr/exit code as a string
- `config.toml` - namespace, output_dir, files paths+base_dir, screenshots paths+base_dir, discussion history array, ai provider+model
- `credentials.toml` - gemini api_key, anthropic api_key
- `dpg_layout.ini` - Dear PyGui window layout file (auto-saved on exit, auto-loaded on startup); gitignore this per-user
GUI Panels:
- Config - namespace, output dir, save
- Files - base_dir, scrollable path list with remove, add file(s), add wildcard
- Screenshots - base_dir, scrollable path list with remove, add screenshot(s)
- Discussion History - multiline text box, `---` as separator between excerpts; save splits on `---` back into the toml array
- Provider - provider combo (gemini/anthropic), model listbox populated from API, fetch models button
- Message - multiline input, Gen+Send button, MD Only button, Reset session button
- Response - readonly multiline displaying last AI response
- Tool Calls - scrollable log of every PowerShell tool call the AI made, showing script and result; Clear button
- Comms History - live log of every raw request/response/tool_call/tool_result exchanged with the vendor API; status line lives here; Clear button; heavy fields (message, text, script, output) clamped to an 80px scrollable box when they exceed `COMMS_CLAMP_CHARS` (300) characters
Layout persistence:
- `dpg.configure_app(..., init_file="dpg_layout.ini")` loads the ini at startup if it exists; DPG silently ignores a missing file
- `dpg.save_init_file("dpg_layout.ini")` is called immediately before `dpg.destroy_context()` on clean exit
- The ini records window positions, sizes, and dock node assignments in DPG's native format
- First run (no ini) uses the hardcoded `pos=` defaults in `_build_ui()`; after that the ini takes over
- Delete `dpg_layout.ini` to reset to defaults
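The hooks above can be sketched as a minimal skeleton (not the real `gui.py`; the `dearpygui` import is deferred so the pattern is readable without a running display):

```python
# Sketch of the layout load/save hooks; assumes the dearpygui package.
LAYOUT_FILE = "dpg_layout.ini"

def run_gui(build_ui):
    import dearpygui.dearpygui as dpg  # deferred: needs a display to run

    dpg.create_context()
    # init_file loads LAYOUT_FILE at startup; a missing file is silently ignored
    dpg.configure_app(docking=True, docking_space=True, init_file=LAYOUT_FILE)
    build_ui()  # hardcoded pos= defaults only matter on first run (no ini yet)
    dpg.create_viewport()
    dpg.setup_dearpygui()
    dpg.show_viewport()
    dpg.start_dearpygui()
    dpg.save_init_file(LAYOUT_FILE)  # persist layout before teardown
    dpg.destroy_context()
```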
AI Tool Use (PowerShell):
- Both Gemini and Anthropic are configured with a `run_powershell` tool/function declaration
- When the AI wants to edit or create files it emits a tool call with a `script` string
- `ai_client` runs a loop (max `MAX_TOOL_ROUNDS = 5`) feeding tool results back until the AI stops calling tools
- Before any script runs, `gui.py` shows a modal `ConfirmDialog` on the main thread; the background send thread blocks on a `threading.Event` until the user clicks Approve or Reject
- The dialog displays `base_dir`, shows the script in an editable text box (allowing last-second tweaks), and has Approve & Run / Reject buttons
- On approval the (possibly edited) script is passed to `shell_runner.run_powershell()`, which prepends `Set-Location -LiteralPath '<base_dir>'` and runs it via `powershell -NoProfile -NonInteractive -Command`; stdout, stderr, and exit code are returned to the AI as the tool result
- Rejections return `"USER REJECTED: command was not executed"` to the AI
- All tool calls (script + result/rejection) are appended to `_tool_log` and displayed in the Tool Calls panel
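The agentic loop can be sketched as below; `send_results` and `confirm_and_run` are hypothetical stand-ins for the real API round-trip and the ConfirmDialog wait:

```python
# Hypothetical sketch of the tool loop in ai_client, per the description above:
# feed tool results back until the model stops calling tools, capped at
# MAX_TOOL_ROUNDS. Response shape here is illustrative, not the real SDK's.
MAX_TOOL_ROUNDS = 5
REJECTION = "USER REJECTED: command was not executed"

def tool_loop(response, send_results, confirm_and_run):
    """response: {'text': str, 'tool_calls': [{'script': str}, ...]}"""
    for _ in range(MAX_TOOL_ROUNDS):
        calls = response.get("tool_calls") or []
        if not calls:
            break  # model produced a final text answer
        results = []
        for call in calls:
            # confirm_and_run shows the modal dialog; returns None on reject
            out = confirm_and_run(call["script"])
            results.append(out if out is not None else REJECTION)
        response = send_results(results)  # tool results go back to the model
    return response.get("text", "")
```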
Comms Log (ai_client.py):
- `_comms_log: list[dict]` accumulates every API interaction during a session
- `_append_comms(direction, kind, payload)` is called at each boundary: OUT/request before sending, IN/response after each model reply, OUT/tool_call before executing, IN/tool_result after executing, OUT/tool_result_send when returning results to the model
- Entry fields: `ts` (HH:MM:SS), `direction` (OUT/IN), `kind`, `provider`, `model`, `payload` (dict)
- Anthropic responses also include `usage` (input_tokens/output_tokens) and `stop_reason` in payload
- `get_comms_log()` returns a snapshot; `clear_comms_log()` empties it
- `comms_log_callback` (injected by gui.py) is called from the background thread with each new entry; the gui queues entries in `_pending_comms` (lock-protected) and flushes them to the DPG panel each render frame
- `MAX_FIELD_CHARS = 400` in ai_client is the threshold used for the clamp decision in the UI (`COMMS_CLAMP_CHARS = 300` in gui.py governs the display cutoff)
Comms History panel rendering:
- Each entry shows: index, timestamp, direction (colour-coded blue=OUT / green=IN), kind (colour-coded), provider/model
- Payload fields rendered below the header; fields in `_HEAVY_KEYS` (message, text, script, output, content) that exceed `COMMS_CLAMP_CHARS` are shown in an 80px tall readonly scrollable `input_text` box instead of a plain `add_text`
- Colour legend row at the top of the panel
- Status line (formerly in Provider panel) moved to top of Comms History panel
- Reset session also clears the comms log and panel; Clear button in Comms History clears only the comms log
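The clamp decision reduces to a small predicate; a sketch, assuming string payload values:

```python
# Sketch of the clamp decision used when rendering a payload field.
# Heavy fields over the threshold go into an 80px readonly scrollable
# input_text widget; everything else is rendered with a plain add_text.
_HEAVY_KEYS = {"message", "text", "script", "output", "content"}
COMMS_CLAMP_CHARS = 300

def needs_clamp(key: str, value) -> bool:
    return (
        key in _HEAVY_KEYS
        and isinstance(value, str)
        and len(value) > COMMS_CLAMP_CHARS
    )
```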
Data flow:
- GUI edits are held in `App` state lists (`self.files`, `self.screenshots`, `self.history`) and dpg widget values
- `_flush_to_config()` pulls all widget values into the `self.config` dict
- `_do_generate()` calls `_flush_to_config()`, saves `config.toml`, and calls `aggregate.run(config)`, which writes the md and returns `(markdown_str, path)`
- `cb_generate_send()` calls `_do_generate()` then threads a call to `ai_client.send(md, message, base_dir)`
- `ai_client.send()` prepends the md as a `<context>` block to the user message and sends via the active provider chat session
- If the AI responds with tool calls, the loop handles them (with GUI confirmation) before returning the final text response
- Sessions are stateful within a run (chat history maintained); Reset clears them, the tool log, and the comms log
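The `<context>` wrapping can be sketched as a one-liner; the exact whitespace around the block is an assumption:

```python
# Sketch of how ai_client.send assembles the outgoing message: the generated
# markdown is prepended as a <context> block ahead of the user's text.
def build_prompt(markdown: str, message: str) -> str:
    return f"<context>\n{markdown}\n</context>\n\n{message}"
```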
Config persistence:
- Every send and save writes `config.toml` with current state, including selected provider and model under `[ai]`
- Discussion history is stored as a TOML array of strings in `[discussion] history`
- File and screenshot paths are stored as TOML arrays; they support absolute paths, relative paths from base_dir, and `**/*` wildcards
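An illustrative `config.toml` shape inferred from the descriptions above; the exact key names inside each table are assumptions:

```toml
# Hypothetical example - table names from the doc, keys inferred.
namespace = "myproject"
output_dir = "out"

[files]
base_dir = "C:/work/myproject"
paths = ["src/main.py", "docs/**/*.md", "C:/notes/todo.txt"]

[screenshots]
base_dir = "C:/work/shots"
paths = ["panel.png"]

[discussion]
history = ["first excerpt", "second excerpt"]

[ai]
provider = "gemini"
model = "gemini-2.0-flash"
```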
Threading model:
- DPG render loop runs on the main thread
- AI sends and model fetches run on daemon background threads
- `_pending_dialog` (guarded by a `threading.Lock`) is set by the background thread and consumed by the render loop each frame, calling `dialog.show()` on the main thread
- `dialog.wait()` blocks the background thread on a `threading.Event` until the user acts
- `_pending_comms` (guarded by a separate `threading.Lock`) is populated by `_on_comms_entry` (background thread) and drained by `_flush_pending_comms()` each render frame (main thread)
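The `_pending_comms` handoff is a lock-guarded queue drained once per frame; class and method names in this sketch are illustrative:

```python
# Sketch of the background-thread-to-render-loop handoff described above.
import threading

class PendingComms:
    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []

    def on_comms_entry(self, entry):
        # Called from the background send thread for each new log entry.
        with self._lock:
            self._pending.append(entry)

    def flush(self):
        # Called once per render frame on the main thread; swap under the
        # lock so the background thread never blocks on widget updates.
        with self._lock:
            drained, self._pending = self._pending, []
        return drained  # the GUI appends these to the Comms History panel
```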
Known extension points:
- Add more providers by adding a section to `credentials.toml`, a `_list_*` and `_send_*` function in `ai_client.py`, and the provider name to the `PROVIDERS` list in `gui.py`
- System prompt support could be added as a field in `config.toml` and passed in `ai_client.send()`
- Discussion history excerpts could be individually toggleable for inclusion in the generated md
- `MAX_TOOL_ROUNDS` in `ai_client.py` caps agentic loops at 5 rounds; adjustable
- `COMMS_CLAMP_CHARS` in `gui.py` controls the character threshold for clamping heavy payload fields
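A registry-style sketch of the provider extension point; `PROVIDER_REGISTRY` and `dispatch_send` are hypothetical names, while the real code uses a `PROVIDERS` list plus per-provider `_list_*`/`_send_*` functions:

```python
# Hypothetical dispatch shape for adding providers without touching callers.
PROVIDER_REGISTRY: dict[str, dict] = {}

def register_provider(name, list_models, send):
    """list_models() -> list[str]; send(md, message, base_dir) -> str."""
    PROVIDER_REGISTRY[name] = {"list_models": list_models, "send": send}

def dispatch_send(provider, md, message, base_dir):
    # Look up the active provider's send function and delegate to it.
    return PROVIDER_REGISTRY[provider]["send"](md, message, base_dir)
```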