manual_slop/docs/guide_architecture.md
Ed_ 08e003a137 docs: Complete documentation rewrite at gencpp/VEFontCache reference quality
Rewrites all docs from Gemini's 330-line executive summaries to 1874 lines
of expert-level architectural reference matching the pedagogical depth of
gencpp (Parser_Algo.md, AST_Types.md) and VEFontCache-Odin (guide_architecture.md).

Changes:
- guide_architecture.md: 73 -> 542 lines. Adds inline data structures for all
  dialog classes, cross-thread communication patterns, complete action type
  catalog, provider comparison table, 4-breakpoint Anthropic cache strategy,
  Gemini server-side cache lifecycle, context refresh algorithm.
- guide_tools.md: 66 -> 385 lines. Full 26-tool inventory with parameters,
  3-layer MCP security model walkthrough, all Hook API GET/POST endpoints
  with request/response formats, ApiHookClient method reference, /api/ask
  synchronous HITL protocol, shell runner with env config.
- guide_mma.md: NEW (368 lines). Fills major documentation gap — complete
  Ticket/Track/WorkerContext data structures, DAG engine algorithms (cycle
  detection, topological sort), ConductorEngine execution loop, Tier 2 ticket
  generation, Tier 3 worker lifecycle with context amnesia, token firewalling.
- guide_simulations.md: 64 -> 377 lines. 8-stage Puppeteer simulation
  lifecycle, mock_gemini_cli.py JSON-L protocol, approval automation pattern,
  ASTParser tree-sitter vs stdlib ast comparison, VerificationLogger.
- Readme.md: Rewritten with module map, architecture summary, config examples.
- docs/Readme.md: Proper index with guide contents table and GUI panel docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-01 09:44:50 -05:00


Architecture

Top | Tools & IPC | MMA Orchestration | Simulations


Philosophy: The Decoupled State Machine

Manual Slop resolves a single tension: AI reasoning is high-latency and non-deterministic; GUI interaction must be low-latency and responsive. The engine enforces strict decoupling between its thread domains so that multi-second LLM calls never block the render loop, and every AI-generated payload passes through a human-auditable gate before execution.


Thread Domains

Four distinct thread domains operate concurrently:

| Domain | Created By | Purpose | Lifecycle |
| --- | --- | --- | --- |
| Main / GUI | immapp.run() | Dear ImGui retained-mode render loop; sole writer of GUI state | App lifetime |
| Asyncio Worker | App.__init__ via threading.Thread(daemon=True) | Event queue processing, AI client calls | Daemon (dies with process) |
| HookServer | api_hooks.HookServer.start() | HTTP API on :8999 for external automation and IPC | Daemon thread |
| Ad-hoc | Transient threading.Thread calls | Model-fetching, legacy send paths | Short-lived |

The asyncio worker is not the main thread's event loop. It runs a dedicated asyncio.new_event_loop() on its own daemon thread:

# App.__init__:
self._loop = asyncio.new_event_loop()
self._loop_thread = threading.Thread(target=self._run_event_loop, daemon=True)
self._loop_thread.start()

# _run_event_loop:
def _run_event_loop(self) -> None:
    asyncio.set_event_loop(self._loop)
    self._loop.create_task(self._process_event_queue())
    self._loop.run_forever()

The GUI thread uses asyncio.run_coroutine_threadsafe(coro, self._loop) to push work into this loop.
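
A minimal, self-contained sketch of this handoff pattern; the coroutine name and payload are invented for illustration, not taken from the app:

```python
import asyncio
import threading

# Dedicated loop on a daemon thread, mirroring App.__init__ / _run_event_loop.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

async def handle(payload: str) -> str:
    # Stand-in for a coroutine running on the asyncio worker.
    return f"handled:{payload}"

# From a non-asyncio thread (e.g. the GUI): schedule work, get a Future back.
future = asyncio.run_coroutine_threadsafe(handle("user_request"), loop)
print(future.result(timeout=2.0))
loop.call_soon_threadsafe(loop.stop)
```

The returned concurrent.futures.Future lets the caller block or poll without ever touching the loop directly, which is why this is the only safe cross-thread entry point into the worker loop.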


Cross-Thread Data Structures

All cross-thread communication uses one of three patterns:

Pattern A: AsyncEventQueue (GUI -> Asyncio)

# events.py
class AsyncEventQueue:
    _queue: asyncio.Queue  # holds Tuple[str, Any] items

    async def put(self, event_name: str, payload: Any = None) -> None
    async def get(self) -> Tuple[str, Any]

The central event bus. Uses asyncio.Queue, so non-asyncio threads must enqueue via asyncio.run_coroutine_threadsafe(). Consumer is App._process_event_queue(), running as a long-lived coroutine on the asyncio loop.
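
A self-contained sketch of what the documented interface implies; the real internals may differ, and demo() is an illustrative driver, not part of events.py:

```python
import asyncio
from typing import Any, Tuple

class AsyncEventQueue:
    """Minimal sketch matching the documented put/get interface."""
    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()  # holds (name, payload)

    async def put(self, event_name: str, payload: Any = None) -> None:
        await self._queue.put((event_name, payload))

    async def get(self) -> Tuple[str, Any]:
        return await self._queue.get()

async def demo() -> Tuple[str, Any]:
    q = AsyncEventQueue()
    await q.put("user_request", {"prompt": "hi"})
    return await q.get()

print(asyncio.run(demo()))
```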

Pattern B: Guarded Lists (Any Thread -> GUI)

Background threads cannot write GUI state directly. They append task dicts to lock-guarded lists; the main thread drains these once per frame:

# App.__init__:
self._pending_gui_tasks: list[dict[str, Any]] = []
self._pending_gui_tasks_lock = threading.Lock()

self._pending_comms: list[dict[str, Any]] = []
self._pending_comms_lock = threading.Lock()

self._pending_tool_calls: list[tuple[str, str, float]] = []
self._pending_tool_calls_lock = threading.Lock()

self._pending_history_adds: list[dict[str, Any]] = []
self._pending_history_adds_lock = threading.Lock()

Additional locks:

self._send_thread_lock = threading.Lock()       # Guards send_thread creation
self._pending_dialog_lock = threading.Lock()     # Guards _pending_dialog + _pending_actions dict
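
The producer side of Pattern B can be sketched as follows; producer and drain are hypothetical stand-ins for the background threads and the per-frame drain:

```python
import threading
from typing import Any

pending: list[dict[str, Any]] = []
pending_lock = threading.Lock()

def producer(task: dict[str, Any]) -> None:
    # Any background thread: append under the lock, never touch GUI state.
    with pending_lock:
        pending.append(task)

def drain() -> list[dict[str, Any]]:
    # Main thread, once per frame: snapshot and clear under the lock.
    with pending_lock:
        tasks = pending[:]
        pending.clear()
    return tasks

threads = [threading.Thread(target=producer, args=({"action": f"t{i}"},))
           for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(len(drain()), len(pending))
```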

Pattern C: Condition-Variable Dialogs (Bidirectional Blocking)

Used for Human-in-the-Loop (HITL) approval. Background thread blocks on threading.Condition; GUI thread signals after user action. See the HITL section below.


Event System

Three classes in events.py (89 lines, no external dependencies beyond asyncio and typing):

EventEmitter

class EventEmitter:
    _listeners: Dict[str, List[Callable]]

    def on(self, event_name: str, callback: Callable) -> None
    def emit(self, event_name: str, *args: Any, **kwargs: Any) -> None

Synchronous pub-sub. Callbacks execute in the caller's thread. Used by ai_client.events for lifecycle hooks (request_start, response_received, tool_execution). No thread safety — relies on consistent single-thread usage.
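
A minimal sketch consistent with the documented interface (internals are assumed):

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

class EventEmitter:
    """Synchronous pub-sub; callbacks run in the emitter's thread."""
    def __init__(self) -> None:
        self._listeners: Dict[str, List[Callable]] = defaultdict(list)

    def on(self, event_name: str, callback: Callable) -> None:
        self._listeners[event_name].append(callback)

    def emit(self, event_name: str, *args: Any, **kwargs: Any) -> None:
        # No locking: correctness relies on consistent single-thread usage.
        for cb in self._listeners[event_name]:
            cb(*args, **kwargs)

seen = []
em = EventEmitter()
em.on("request_start", lambda model: seen.append(model))
em.emit("request_start", "gemini-2.5-flash-lite")
print(seen)
```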

AsyncEventQueue

Described above in Pattern A.

UserRequestEvent

class UserRequestEvent:
    prompt: str           # User's raw input text
    stable_md: str        # Generated markdown context (files, screenshots)
    file_items: List[Any] # File attachment items for dynamic refresh
    disc_text: str        # Serialized discussion history
    base_dir: str         # Working directory for shell commands

    def to_dict(self) -> Dict[str, Any]

Pure data carrier. Created on the GUI thread in _handle_generate_send, consumed on the asyncio thread in _handle_request_event.
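
Sketched as a dataclass with the documented fields; the asdict-based to_dict and the default values are assumptions for illustration:

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List

@dataclass
class UserRequestEvent:
    """Pure data carrier: GUI thread creates it, asyncio thread consumes it."""
    prompt: str
    stable_md: str = ""
    file_items: List[Any] = field(default_factory=list)
    disc_text: str = ""
    base_dir: str = "."

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)

print(UserRequestEvent("hello").to_dict()["prompt"])
```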


Application Lifetime

Boot Sequence

The App.__init__ (lines 152-296) follows this precise order:

  1. Config hydration: Reads config.toml (global) and <project>.toml (local). Builds the initial "world view" — tracked files, discussion history, active models.
  2. Thread bootstrapping:
    • Asyncio event loop thread starts (_loop_thread).
    • HookServer starts as a daemon if test_hooks_enabled or provider is gemini_cli.
  3. Callback wiring (_init_ai_and_hooks): Connects ai_client.confirm_and_run_callback, comms_log_callback, tool_log_callback to GUI handlers.
  4. UI entry: Main thread enters immapp.run(). GUI is now alive; background threads are ready.

Shutdown Sequence

When immapp.run() returns (user closed window):

  1. hook_server.stop() — shuts down HTTP server, joins thread.
  2. perf_monitor.stop().
  3. ai_client.cleanup() — destroys server-side API caches (Gemini CachedContent).
  4. Dual-Flush persistence: _flush_to_project(), _save_active_project(), _flush_to_config(), save_config() — commits state back to both project and global configs.
  5. session_logger.close_session().

The asyncio loop thread is a daemon — it dies with the process. App.shutdown() exists for explicit cleanup in test scenarios:

def shutdown(self) -> None:
    if self._loop.is_running():
        self._loop.call_soon_threadsafe(self._loop.stop)
    if self._loop_thread.is_alive():
        self._loop_thread.join(timeout=2.0)

The Task Pipeline: Producer-Consumer Synchronization

Request Flow

GUI Thread                    Asyncio Thread                      GUI Thread (next frame)
──────────                    ──────────────                      ──────────────────────
1. User clicks "Gen + Send"
2. _handle_generate_send():
   - Compiles md context
   - Creates UserRequestEvent
   - Enqueues via
     run_coroutine_threadsafe  ──>  3. _process_event_queue():
                                       awaits event_queue.get()
                                       routes "user_request" to
                                       _handle_request_event()
                                   4. Configures ai_client
                                   5. ai_client.send() BLOCKS
                                      (seconds to minutes)
                                   6. On completion, enqueues
                                      "response" event back       ──>  7. _process_pending_gui_tasks():
                                                                          Drains task list under lock
                                                                          Sets ai_response text
                                                                          Triggers terminal blink

Event Types Routed by _process_event_queue

| Event Name | Action |
| --- | --- |
| "user_request" | Calls _handle_request_event(payload) — synchronous blocking AI call |
| "response" | Appends {"action": "handle_ai_response", ...} to _pending_gui_tasks |
| "mma_state_update" | Appends {"action": "mma_state_update", ...} to _pending_gui_tasks |
| "mma_spawn_approval" | Appends the raw payload for HITL dialog creation |
| "mma_step_approval" | Appends the raw payload for HITL dialog creation |

The pattern: events arriving on the asyncio thread that need GUI state changes are serialized into _pending_gui_tasks for consumption on the next render frame.

Frame-Sync Mechanism: _process_pending_gui_tasks

Called once per ImGui frame on the main GUI thread. This is the sole safe point for mutating GUI-visible state.

Locking strategy — copy-and-clear:

def _process_pending_gui_tasks(self) -> None:
    if not self._pending_gui_tasks:
        return
    with self._pending_gui_tasks_lock:
        tasks = self._pending_gui_tasks[:]   # Snapshot
        self._pending_gui_tasks.clear()       # Release lock fast
    for task in tasks:
        # Process each task outside the lock

Acquires the lock briefly to snapshot the task list, then processes outside the lock. Minimizes lock contention with producer threads.

Complete Action Type Catalog

| Action | Source | Effect |
| --- | --- | --- |
| "refresh_api_metrics" | asyncio/hooks | Updates API metrics display |
| "handle_ai_response" | asyncio | Sets ai_response, ai_status, mma_streams[stream_id]; triggers blink; optionally auto-adds to discussion history |
| "show_track_proposal" | asyncio | Sets proposed_tracks list, opens modal |
| "mma_state_update" | asyncio | Updates mma_status, active_tier, mma_tier_usage, active_tickets, active_track |
| "set_value" | HookServer | Sets any field in _settable_fields map via setattr; special-cases current_provider/current_model to reconfigure AI client |
| "click" | HookServer | Dispatches to _clickable_actions map; introspects signatures to decide whether to pass user_data |
| "select_list_item" | HookServer | Routes to _switch_discussion() for discussion listbox |
| {"type": "ask"} | HookServer | Opens ask dialog: sets _pending_ask_dialog = True, stores _ask_request_id and _ask_tool_data |
| "clear_ask" | HookServer | Clears ask dialog state if request_id matches |
| "custom_callback" | HookServer | Executes an arbitrary callable with args |
| "mma_step_approval" | asyncio (MMA engine) | Creates MMAApprovalDialog, stores in _pending_mma_approval |
| "mma_spawn_approval" | asyncio (MMA engine) | Creates MMASpawnApprovalDialog, stores in _pending_mma_spawn |
| "refresh_from_project" | HookServer/internal | Reloads all UI state from project dict |

The Execution Clutch: Human-in-the-Loop

The "Execution Clutch" ensures every destructive AI action passes through an auditable human gate. Three dialog types implement this, all sharing the same blocking pattern.

Dialog Classes

ConfirmDialog — PowerShell script execution approval:

class ConfirmDialog:
    _uid: str                        # uuid4 identifier
    _script: str                     # The PowerShell script text (editable)
    _base_dir: str                   # Working directory
    _condition: threading.Condition  # Blocking primitive
    _done: bool                      # Signal flag
    _approved: bool                  # User's decision

    def wait(self) -> tuple[bool, str]   # Blocks until _done; returns (approved, script)

MMAApprovalDialog — MMA tier step approval:

class MMAApprovalDialog:
    _ticket_id: str
    _payload: str                    # The step payload (editable)
    _condition: threading.Condition
    _done: bool
    _approved: bool

    def wait(self) -> tuple[bool, str]   # Returns (approved, payload)

MMASpawnApprovalDialog — Sub-agent spawn approval:

class MMASpawnApprovalDialog:
    _ticket_id: str
    _role: str                       # tier3-worker, tier4-qa, etc.
    _prompt: str                     # Spawn prompt (editable)
    _context_md: str                 # Context document (editable)
    _condition: threading.Condition
    _done: bool
    _approved: bool
    _abort: bool                     # Can abort entire track

    def wait(self) -> dict[str, Any]   # Returns {approved, abort, prompt, context_md}

Blocking Flow

Using ConfirmDialog as exemplar:

   ASYNCIO THREAD (ai_client tool callback)         GUI MAIN THREAD
   ─────────────────────────────────────────         ───────────────
   1. ai_client calls _confirm_and_run(script)
   2. Creates ConfirmDialog(script, base_dir)
   3. Stores dialog:
      - Headless: _pending_actions[uid] = dialog
      - GUI mode: _pending_dialog = dialog
   4. If test_hooks_enabled:
      pushes to _api_event_queue
   5. dialog.wait() BLOCKS on _condition
                                                    6. Next frame: ImGui renders
                                                       _pending_dialog in modal
                                                    7. User clicks Approve/Reject
                                                    8. _handle_approve_script():
                                                       with dialog._condition:
                                                           dialog._approved = True
                                                           dialog._done = True
                                                           dialog._condition.notify_all()
   9. wait() returns (True, potentially_edited_script)
   10. Executes shell_runner.run_powershell()
   11. Returns output to ai_client

The _condition.wait(timeout=0.1) uses a 100ms polling interval inside a loop — a polling-with-condition hybrid that ensures the blocking thread wakes periodically.
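
A hypothetical reconstruction of that loop (class and helper names are invented; the real ConfirmDialog carries more state):

```python
import threading

class ConfirmDialogSketch:
    """Sketch of the polling-with-condition wait() documented above."""
    def __init__(self, script: str) -> None:
        self._script = script
        self._condition = threading.Condition()
        self._done = False
        self._approved = False

    def wait(self) -> tuple[bool, str]:
        with self._condition:
            while not self._done:
                # 100 ms timeout: wake periodically even if a notify is missed.
                self._condition.wait(timeout=0.1)
        return self._approved, self._script

dialog = ConfirmDialogSketch("Get-ChildItem")

def approve() -> None:
    # What the GUI thread does when the user clicks Approve.
    with dialog._condition:
        dialog._approved = True
        dialog._done = True
        dialog._condition.notify_all()

threading.Timer(0.05, approve).start()  # simulate the user approving later
print(dialog.wait())
```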

Resolution Paths

GUI button path (normal interactive use): _handle_approve_script() / _handle_approve_mma_step() / _handle_approve_spawn() directly manipulate the dialog's condition variable from the GUI thread.

HTTP API path (headless/automation): resolve_pending_action(action_id, approved) looks up the dialog by UUID in _pending_actions dict (headless) or _pending_dialog (GUI), then signals the condition:

def resolve_pending_action(self, action_id: str, approved: bool) -> bool:
    with self._pending_dialog_lock:
        if action_id in self._pending_actions:
            dialog = self._pending_actions[action_id]
            with dialog._condition:
                dialog._approved = approved
                dialog._done = True
                dialog._condition.notify_all()
            return True
    return False

MMA approval path: _handle_mma_respond(approved, payload, abort, prompt, context_md) is the unified resolver. It uses a dialog_container — a one-element list [None] used as a mutable reference shared between the MMA engine (which creates the container) and the GUI (which populates it via _process_pending_gui_tasks).


AI Client: Multi-Provider Architecture

ai_client.py operates as a stateful singleton — all provider state is held in module-level globals. There is no class wrapping; the module itself is the abstraction layer.

Module-Level State

_provider: str = "gemini"              # "gemini" | "anthropic" | "deepseek" | "gemini_cli"
_model: str = "gemini-2.5-flash-lite"
_temperature: float = 0.0
_max_tokens: int = 8192
_history_trunc_limit: int = 8000       # Char limit for truncating old tool outputs

_send_lock: threading.Lock             # Serializes ALL send() calls across providers

Per-provider client objects:

# Gemini (SDK-managed stateful chat)
_gemini_client: genai.Client | None
_gemini_chat: Any                      # Holds history internally
_gemini_cache: Any                     # Server-side CachedContent
_gemini_cache_md_hash: int | None      # For cache invalidation
_GEMINI_CACHE_TTL: int = 3600          # 1-hour; rebuilt at 90% (3240s)

# Anthropic (client-managed history)
_anthropic_client: anthropic.Anthropic | None
_anthropic_history: list[dict]         # Mutable [{role, content}, ...]
_anthropic_history_lock: threading.Lock

# DeepSeek (raw HTTP, client-managed history)
_deepseek_history: list[dict]
_deepseek_history_lock: threading.Lock

# Gemini CLI (adapter wrapper)
_gemini_cli_adapter: GeminiCliAdapter | None

Safety limits:

MAX_TOOL_ROUNDS: int = 10              # Max tool-call loop iterations per send()
_MAX_TOOL_OUTPUT_BYTES: int = 500_000  # 500KB cumulative tool output budget
_ANTHROPIC_CHUNK_SIZE: int = 120_000   # Max chars per system text block
_ANTHROPIC_MAX_PROMPT_TOKENS: int = 180_000  # 200k limit minus headroom
_GEMINI_MAX_INPUT_TOKENS: int = 900_000      # 1M window minus headroom

The send() Dispatcher

def send(md_content, user_message, base_dir=".", file_items=None,
         discussion_history="", stream=False,
         pre_tool_callback=None, qa_callback=None) -> str:
    with _send_lock:
        if _provider == "gemini":      return _send_gemini(...)
        elif _provider == "gemini_cli": return _send_gemini_cli(...)
        elif _provider == "anthropic":  return _send_anthropic(...)
        elif _provider == "deepseek":   return _send_deepseek(..., stream=stream)

_send_lock serializes all API calls — only one provider call can be in-flight at a time. All providers share the same callback signatures. Return type is always str.

Provider Comparison

| Aspect | Gemini SDK | Anthropic | DeepSeek | Gemini CLI |
| --- | --- | --- | --- | --- |
| Client | genai.Client | anthropic.Anthropic | Raw requests.post | GeminiCliAdapter (subprocess) |
| History | SDK-managed (_gemini_chat._history) | Client-managed list | Client-managed list | CLI-managed (session ID) |
| Caching | Server-side CachedContent with TTL | Prompt caching via cache_control: ephemeral (4 breakpoints) | None | None |
| Tool format | types.FunctionDeclaration | JSON Schema dict | Not declared | Same as SDK via adapter |
| Tool results | Part.from_function_response(response={"output": ...}) | {"type": "tool_result", "tool_use_id": ..., "content": ...} | {"role": "tool", "tool_call_id": ..., "content": ...} | {"role": "tool", ...} |
| History trimming | In-place at 40% of 900K token estimate | 2-phase: strip stale file refreshes, then drop turn pairs at 180K | None | None |
| Streaming | No | No | Yes | No |

Tool-Call Loop (common pattern across providers)

All providers follow the same high-level loop, iterated up to MAX_TOOL_ROUNDS + 2 times:

  1. Send message (or tool results from prior round) to API.
  2. Extract text response and any function calls.
  3. Log to comms log; emit events.
  4. If no function calls or max rounds exceeded: break.
  5. For each function call:
    • If pre_tool_callback rejects: return rejection text.
    • Dispatch to mcp_client.dispatch() or shell_runner.run_powershell().
    • After the last call of this round: run _reread_file_items() for context refresh.
    • Truncate tool output at _history_trunc_limit chars.
    • Accumulate _cumulative_tool_bytes.
  6. If cumulative bytes > 500KB: inject warning.
  7. Package tool results in provider-specific format; loop.
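
The loop above can be sketched as follows; send_to_api, dispatch_tool, and the return conventions are illustrative stand-ins for the provider-specific internals:

```python
# All names below are hypothetical; the real loop lives in each _send_* path.
MAX_TOOL_ROUNDS = 10
_MAX_TOOL_OUTPUT_BYTES = 500_000
_history_trunc_limit = 8000

def tool_call_loop(send_to_api, dispatch_tool, pre_tool_callback=None) -> str:
    cumulative_bytes = 0
    pending_results = None
    text = ""
    for _ in range(MAX_TOOL_ROUNDS + 2):
        # 1-2. Send message (or prior tool results); extract text + calls.
        text, calls = send_to_api(pending_results)
        if not calls:
            return text                              # 4. no calls: done
        results = []
        for call in calls:                           # 5. execute each call
            if pre_tool_callback and not pre_tool_callback(call):
                return "Tool call rejected."
            output = dispatch_tool(call)
            output = output[:_history_trunc_limit]   # truncate for history
            cumulative_bytes += len(output.encode())
            results.append(output)
        if cumulative_bytes > _MAX_TOOL_OUTPUT_BYTES:
            results.append("[SYSTEM: tool output budget exceeded]")  # 6.
        pending_results = results                    # 7. feed back; loop
    return text
```

A round that returns no function calls exits immediately; exhausting the round budget returns whatever text the last round produced.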

Context Refresh Mechanism

After the last tool call in each round, _reread_file_items(file_items) checks mtimes of all tracked files:

  1. For each file item: compare Path.stat().st_mtime against stored mtime.
  2. If unchanged: pass through as-is.
  3. If changed: re-read content, store old_content for diffing, update mtime.
  4. Changed files are diffed via _build_file_diff_text:
    • Files <= 200 lines: emit full content.
    • Files > 200 lines with old_content: emit difflib.unified_diff.
  5. Diff is appended to the last tool's output as [SYSTEM: FILES UPDATED]\n\n{diff}.
  6. Stale [FILES UPDATED] blocks are stripped from older history turns by _strip_stale_file_refreshes to prevent context bloat.
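
Steps 1 through 5 can be sketched as below, under stated assumptions: the helper names and the file-item dict shape are invented, and the real code tracks old_content separately for diffing:

```python
import difflib
from pathlib import Path

def build_file_diff_text(path: str, old_content: str, new_content: str) -> str:
    # Files <= 200 lines are emitted whole; larger ones get a unified diff.
    new_lines = new_content.splitlines(keepends=True)
    if len(new_lines) <= 200:
        return new_content
    diff = difflib.unified_diff(
        old_content.splitlines(keepends=True), new_lines,
        fromfile=f"{path} (stale)", tofile=f"{path} (current)")
    return "".join(diff)

def reread_if_changed(item: dict) -> dict:
    # Compare the on-disk mtime against the stored one; refresh on mismatch.
    mtime = Path(item["path"]).stat().st_mtime
    if mtime == item["mtime"]:
        return item                                  # unchanged: pass through
    old = item["content"]
    item["content"] = Path(item["path"]).read_text()
    item["mtime"] = mtime
    item["refresh_block"] = ("[SYSTEM: FILES UPDATED]\n\n"
                             + build_file_diff_text(item["path"], old,
                                                    item["content"]))
    return item
```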

Anthropic Cache Strategy (4-Breakpoint System)

Anthropic allows a maximum of 4 cache_control: ephemeral breakpoints:

| # | Location | Purpose |
| --- | --- | --- |
| 1 | Last block of stable system prompt | Cache base instructions |
| 2 | Last block of context chunks | Cache file context |
| 3 | Last tool definition | Cache tool schema |
| 4 | Second-to-last user message | Cache conversation prefix |

Before placing breakpoint 4, all existing cache_control is stripped from history to prevent exceeding the limit.
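
The strip-then-place step for breakpoint 4 might look like this; the message shape follows the Anthropic Messages API's list-of-content-blocks format, but the function itself is a hypothetical sketch:

```python
def place_history_breakpoint(history: list[dict]) -> list[dict]:
    # Strip all existing cache_control markers so the 4-breakpoint cap holds.
    for msg in history:
        if isinstance(msg.get("content"), list):
            for block in msg["content"]:
                block.pop("cache_control", None)
    # Mark the second-to-last user message to cache the conversation prefix.
    users = [m for m in history if m["role"] == "user"]
    if len(users) >= 2 and isinstance(users[-2].get("content"), list):
        users[-2]["content"][-1]["cache_control"] = {"type": "ephemeral"}
    return history
```

Stripping first makes the operation idempotent: repeated send() calls keep exactly one history breakpoint no matter how the conversation grows.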

Gemini Cache Strategy (Server-Side TTL)

System instruction content is hashed. On each call, a 3-way decision:

  • Hash changed: Delete old cache, rebuild with new content.
  • Cache age > 90% of TTL: Proactive renewal (delete + rebuild).
  • No cache exists: Create new CachedContent if token count >= 2048; otherwise inline.
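
The 3-way decision (plus the reuse and inline fallthroughs) as a pure-logic sketch; cache_decision is hypothetical, and the actual genai SDK cache calls are omitted:

```python
import time

_GEMINI_CACHE_TTL = 3600  # 1 hour; proactive renewal past 90% (3240 s)

def cache_decision(new_hash, cached_hash, created_at, token_count, now=None):
    now = now or time.time()
    if cached_hash is not None and new_hash != cached_hash:
        return "rebuild"          # hash changed: delete old cache, rebuild
    if cached_hash is not None and now - created_at > 0.9 * _GEMINI_CACHE_TTL:
        return "renew"            # past 90% of TTL: proactive delete + rebuild
    if cached_hash is None:
        # Only worth a server-side CachedContent above the token threshold.
        return "create" if token_count >= 2048 else "inline"
    return "reuse"
```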

Comms Log System

Every API interaction is logged to a module-level list with real-time GUI push:

def _append_comms(direction: str, kind: str, payload: dict[str, Any]) -> None:
    entry = {
        "ts":        datetime.now().strftime("%H:%M:%S"),
        "direction": direction,     # "OUT" (to API) or "IN" (from API)
        "kind":      kind,          # "request" | "response" | "tool_call" | "tool_result"
        "provider":  _provider,
        "model":     _model,
        "payload":   payload,
    }
    _comms_log.append(entry)
    if comms_log_callback:
        comms_log_callback(entry)   # Real-time push to GUI

State Machines

ai_status (Informal)

"idle" -> "sending..." -> [AI call in progress]
    -> "running powershell..." -> "powershell done, awaiting AI..."
    -> "fetching url..." | "searching web..."
    -> "done" | "error"
    -> "idle" (on reset)

HITL Dialog State (Binary per type)

  • _pending_dialog is not None — script confirmation active
  • _pending_mma_approval is not None — MMA step approval active
  • _pending_mma_spawn is not None — spawn approval active
  • _pending_ask_dialog == True — tool ask dialog active

Security: The MCP Allowlist

Every filesystem tool (read, list, search, write) is gated by the MCP Bridge (mcp_client.py). See guide_tools.md for the complete security model, tool inventory, and endpoint reference.

Summary: Every path is resolved to an absolute path and checked against a dynamically-built allowlist constructed from the project's tracked files and base directories. Files named history.toml or *_history.toml are hard-blacklisted.


Telemetry & Auditing

Every interaction is designed to be auditable:

  • JSON-L Comms Logs: Raw API traffic logged to logs/sessions/<id>/comms.log for debugging and token cost analysis.
  • Tool Call Logs: Markdown-formatted sequential records to toolcalls.log.
  • Generated Scripts: Every PowerShell script that passes through the Execution Clutch is saved to scripts/generated/<ts>_<seq>.ps1.
  • API Hook Logs: All HTTP hook invocations logged to apihooks.log.
  • CLI Call Logs: Subprocess execution details (command, stdin, stdout, stderr, latency) to clicalls.log as JSON-L.
  • Performance Monitor: Real-time FPS, Frame Time, CPU, Input Lag tracked and queryable via Hook API.

Architectural Invariants

  1. Single-writer principle: All GUI state mutations happen on the main thread via _process_pending_gui_tasks. Background threads never write GUI state directly.
  2. Copy-and-clear lock pattern: _process_pending_gui_tasks snapshots and clears the task list under the lock, then processes outside the lock.
  3. Context Amnesia: Each MMA Tier 3 Worker starts with ai_client.reset_session(). No conversational bleed between tickets.
  4. Send serialization: _send_lock ensures only one provider call is in-flight at a time across all threads.
  5. Dual-Flush persistence: On exit, state is committed to both project-level and global-level config files.