# Architecture [Top](../README.md) | [Tools & IPC](guide_tools.md) | [MMA Orchestration](guide_mma.md) | [Simulations](guide_simulations.md) --- ## Philosophy: The Decoupled State Machine Manual Slop solves a single tension: **AI reasoning is high-latency and non-deterministic; GUI interaction must be low-latency and responsive.** The engine enforces strict decoupling between four thread domains so that multi-second LLM calls never block the render loop, and every AI-generated payload passes through a human-auditable gate before execution. The architectural philosophy follows data-oriented design principles: - The GUI (`gui_2.py`, `app_controller.py`) remains a pure visualization of application state - State mutations occur only through lock-guarded queues consumed on the main render thread - Background threads never write GUI state directly — they serialize task dicts for later consumption - All cross-thread communication uses explicit synchronization primitives (Locks, Conditions, Events) ## Project Structure The codebase is organized into a `src/` layout to separate implementation from configuration and artifacts. ``` manual_slop/ ├── conductor/ # Conductor tracks, specs, and plans ├── docs/ # Deep-dive architectural documentation ├── logs/ # Session logs, agent traces, and errors ├── scripts/ # Build, migration, and IPC bridge scripts ├── src/ # Core Python implementation │ ├── ai_client.py # LLM provider abstraction │ ├── gui_2.py # Main ImGui application │ ├── mcp_client.py # MCP tool implementation │ └── ... 
# Other core modules ├── tests/ # Pytest suite and simulation fixtures ├── simulation/ # Workflow and agent simulation logic ├── sloppy.py # Primary application entry point ├── config.toml # Global application settings └── manual_slop.toml # Project-specific configuration ``` --- ## Thread Domains Four distinct thread domains operate concurrently: | Domain | Created By | Purpose | Lifecycle | Key Synchronization Primitives | |---|---|---|---|---| | **Main / GUI** | `immapp.run()` | Dear ImGui retained-mode render loop; sole writer of GUI state | App lifetime | None (consumer of queues) | | **Asyncio Worker** | `App.__init__` via `threading.Thread(daemon=True)` | Event queue processing, AI client calls | Daemon (dies with process) | `AsyncEventQueue`, `threading.Lock` | | **HookServer** | `api_hooks.HookServer.start()` | HTTP API on `:8999` for external automation and IPC | Daemon thread | `threading.Lock`, `threading.Event` | | **Ad-hoc** | Transient `threading.Thread` calls | Model-fetching, legacy send paths, log pruning | Short-lived | Task-specific locks | The asyncio worker is **not** the main thread's event loop. It runs a dedicated `asyncio.new_event_loop()` on its own daemon thread: ```python # AppController.__init__: self._loop = asyncio.new_event_loop() self._loop_thread = threading.Thread(target=self._run_event_loop, daemon=True) self._loop_thread.start() # _run_event_loop: def _run_event_loop(self) -> None: asyncio.set_event_loop(self._loop) self._loop.create_task(self._process_event_queue()) self._loop.run_forever() ``` The GUI thread uses `asyncio.run_coroutine_threadsafe(coro, self._loop)` to push work into this loop. 
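The handoff into this loop can be sketched standalone (a minimal example, not the application's real coroutines; `do_work` is illustrative):

```python
import asyncio
import threading

# Dedicated event loop on a daemon thread, mirroring AppController's setup.
loop = asyncio.new_event_loop()

def _run_event_loop() -> None:
    asyncio.set_event_loop(loop)
    loop.run_forever()

threading.Thread(target=_run_event_loop, daemon=True).start()

async def do_work(x: int) -> int:
    # Stand-in for an AI call or event-queue put.
    await asyncio.sleep(0.01)
    return x * 2

# From the GUI (or any non-asyncio) thread: schedule the coroutine on the
# background loop and receive a concurrent.futures.Future back.
future = asyncio.run_coroutine_threadsafe(do_work(21), loop)
print(future.result(timeout=1.0))  # blocks the calling thread; prints 42

loop.call_soon_threadsafe(loop.stop)  # tidy shutdown of the background loop
```

The returned `Future` gives the calling thread a synchronous handle on work executing inside the loop, which is exactly how blocking GUI-side code waits on asyncio-side results.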
### Thread-Local Context Isolation For concurrent multi-agent execution, the application uses `threading.local()` to manage per-thread context: ```python # ai_client.py _local_storage = threading.local() def get_current_tier() -> Optional[str]: """Returns the current tier from thread-local storage.""" return getattr(_local_storage, "current_tier", None) def set_current_tier(tier: Optional[str]) -> None: """Sets the current tier in thread-local storage.""" _local_storage.current_tier = tier ``` This ensures that comms log entries and tool calls are correctly tagged with their source tier even when multiple workers execute concurrently. --- ## Cross-Thread Data Structures All cross-thread communication uses one of three patterns: ### Pattern A: AsyncEventQueue (GUI -> Asyncio) ```python # events.py class AsyncEventQueue: _queue: asyncio.Queue # holds Tuple[str, Any] items async def put(self, event_name: str, payload: Any = None) -> None async def get(self) -> Tuple[str, Any] ``` The central event bus. Uses `asyncio.Queue`, so non-asyncio threads must enqueue via `asyncio.run_coroutine_threadsafe()`. Consumer is `App._process_event_queue()`, running as a long-lived coroutine on the asyncio loop. ### Pattern B: Guarded Lists (Any Thread -> GUI) Background threads cannot write GUI state directly. 
They append task dicts to lock-guarded lists; the main thread drains these once per frame: ```python # App.__init__: self._pending_gui_tasks: list[dict[str, Any]] = [] self._pending_gui_tasks_lock = threading.Lock() self._pending_comms: list[dict[str, Any]] = [] self._pending_comms_lock = threading.Lock() self._pending_tool_calls: list[tuple[str, str, float]] = [] self._pending_tool_calls_lock = threading.Lock() self._pending_history_adds: list[dict[str, Any]] = [] self._pending_history_adds_lock = threading.Lock() ``` Additional locks: ```python self._send_thread_lock = threading.Lock() # Guards send_thread creation self._pending_dialog_lock = threading.Lock() # Guards _pending_dialog + _pending_actions dict ``` ### Pattern C: Condition-Variable Dialogs (Bidirectional Blocking) Used for Human-in-the-Loop (HITL) approval. Background thread blocks on `threading.Condition`; GUI thread signals after user action. See the [HITL section](#the-execution-clutch-human-in-the-loop) below. --- ## Event System Three classes in `events.py` (89 lines, no external dependencies beyond `asyncio` and `typing`): ### EventEmitter ```python class EventEmitter: _listeners: Dict[str, List[Callable]] def on(self, event_name: str, callback: Callable) -> None def emit(self, event_name: str, *args: Any, **kwargs: Any) -> None ``` Synchronous pub-sub. Callbacks execute in the caller's thread. Used by `ai_client.events` for lifecycle hooks (`request_start`, `response_received`, `tool_execution`). No thread safety — relies on consistent single-thread usage. ### AsyncEventQueue Described above in Pattern A. ### UserRequestEvent ```python class UserRequestEvent: prompt: str # User's raw input text stable_md: str # Generated markdown context (files, screenshots) file_items: List[Any] # File attachment items for dynamic refresh disc_text: str # Serialized discussion history base_dir: str # Working directory for shell commands def to_dict(self) -> Dict[str, Any] ``` Pure data carrier. 
Created on the GUI thread in `_handle_generate_send`, consumed on the asyncio thread in `_handle_request_event`. --- ## Application Lifetime ### Boot Sequence The `App.__init__` (lines 152-296) follows this precise order: 1. **Config hydration**: Reads `config.toml` (global) and `.toml` (local). Builds the initial "world view" — tracked files, discussion history, active models. 2. **Thread bootstrapping**: - Asyncio event loop thread starts (`_loop_thread`). - `HookServer` starts as a daemon if `test_hooks_enabled` or provider is `gemini_cli`. 3. **Callback wiring** (`_init_ai_and_hooks`): Connects `ai_client.confirm_and_run_callback`, `comms_log_callback`, `tool_log_callback` to GUI handlers. 4. **UI entry**: Main thread enters `immapp.run()`. GUI is now alive; background threads are ready. ### Shutdown Sequence When `immapp.run()` returns (user closed window): 1. `hook_server.stop()` — shuts down HTTP server, joins thread. 2. `perf_monitor.stop()`. 3. `ai_client.cleanup()` — destroys server-side API caches (Gemini `CachedContent`). 4. **Dual-Flush persistence**: `_flush_to_project()`, `_save_active_project()`, `_flush_to_config()`, `save_config()` — commits state back to both project and global configs. 5. `session_logger.close_session()`. The asyncio loop thread is a daemon — it dies with the process. `App.shutdown()` exists for explicit cleanup in test scenarios: ```python def shutdown(self) -> None: if self._loop.is_running(): self._loop.call_soon_threadsafe(self._loop.stop) if self._loop_thread.is_alive(): self._loop_thread.join(timeout=2.0) ``` --- ## The Task Pipeline: Producer-Consumer Synchronization ### Request Flow ``` GUI Thread Asyncio Thread GUI Thread (next frame) ────────── ────────────── ────────────────────── 1. User clicks "Gen + Send" 2. _handle_generate_send(): - Compiles md context - Creates UserRequestEvent - Enqueues via run_coroutine_threadsafe ──> 3. 
_process_event_queue(): awaits event_queue.get() routes "user_request" to _handle_request_event() 4. Configures ai_client 5. ai_client.send() BLOCKS (seconds to minutes) 6. On completion, enqueues "response" event back ──> 7. _process_pending_gui_tasks(): Drains task list under lock Sets ai_response text Triggers terminal blink ``` ### Event Types Routed by `_process_event_queue` | Event Name | Action | |---|---| | `"user_request"` | Calls `_handle_request_event(payload)` — synchronous blocking AI call | | `"response"` | Appends `{"action": "handle_ai_response", ...}` to `_pending_gui_tasks` | | `"mma_state_update"` | Appends `{"action": "mma_state_update", ...}` to `_pending_gui_tasks` | | `"mma_spawn_approval"` | Appends the raw payload for HITL dialog creation | | `"mma_step_approval"` | Appends the raw payload for HITL dialog creation | The pattern: events arriving on the asyncio thread that need GUI state changes are **serialized into `_pending_gui_tasks`** for consumption on the next render frame. ### Frame-Sync Mechanism: `_process_pending_gui_tasks` Called once per ImGui frame on the **main GUI thread**. This is the sole safe point for mutating GUI-visible state. **Locking strategy** — copy-and-clear: ```python def _process_pending_gui_tasks(self) -> None: if not self._pending_gui_tasks: return with self._pending_gui_tasks_lock: tasks = self._pending_gui_tasks[:] # Snapshot self._pending_gui_tasks.clear() # Release lock fast for task in tasks: # Process each task outside the lock ``` Acquires the lock briefly to snapshot the task list, then processes outside the lock. Minimizes lock contention with producer threads. 
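The same copy-and-clear discipline can be demonstrated in isolation (names are illustrative; the real consumer runs once per ImGui frame):

```python
import threading

pending_tasks: list[dict] = []
pending_lock = threading.Lock()
processed: list[dict] = []

def producer(n: int) -> None:
    # Background thread: serialize work as a task dict, never touch GUI state.
    with pending_lock:
        pending_tasks.append({"action": "handle_ai_response", "seq": n})

def drain_pending() -> None:
    # Main-thread consumer: snapshot under the lock, process outside it.
    if not pending_tasks:
        return
    with pending_lock:
        tasks = pending_tasks[:]
        pending_tasks.clear()
    for task in tasks:
        processed.append(task)  # stand-in for mutating GUI-visible state

threads = [threading.Thread(target=producer, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
drain_pending()
print(len(processed))  # 8
```

Because the producers only ever hold the lock for a single `append`, and the consumer only for a copy-and-clear, neither side can stall the other for more than a few microseconds.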
### Complete Action Type Catalog | Action | Source | Effect | |---|---|---| | `"refresh_api_metrics"` | asyncio/hooks | Updates API metrics display | | `"handle_ai_response"` | asyncio | Sets `ai_response`, `ai_status`, `mma_streams[stream_id]`; triggers blink; optionally auto-adds to discussion history | | `"show_track_proposal"` | asyncio | Sets `proposed_tracks` list, opens modal | | `"mma_state_update"` | asyncio | Updates `mma_status`, `active_tier`, `mma_tier_usage`, `active_tickets`, `active_track` | | `"set_value"` | HookServer | Sets any field in `_settable_fields` map via `setattr`; special-cases `current_provider`/`current_model` to reconfigure AI client | | `"click"` | HookServer | Dispatches to `_clickable_actions` map; introspects signatures to decide whether to pass `user_data` | | `"select_list_item"` | HookServer | Routes to `_switch_discussion()` for discussion listbox | | `{"type": "ask"}` | HookServer | Opens ask dialog: sets `_pending_ask_dialog = True`, stores `_ask_request_id` and `_ask_tool_data` | | `"clear_ask"` | HookServer | Clears ask dialog state if request_id matches | | `"custom_callback"` | HookServer | Executes an arbitrary callable with args | | `"mma_step_approval"` | asyncio (MMA engine) | Creates `MMAApprovalDialog`, stores in `_pending_mma_approval` | | `"mma_spawn_approval"` | asyncio (MMA engine) | Creates `MMASpawnApprovalDialog`, stores in `_pending_mma_spawn` | | `"refresh_from_project"` | HookServer/internal | Reloads all UI state from project dict | --- ## The Execution Clutch: Human-in-the-Loop The "Execution Clutch" ensures every destructive AI action passes through an auditable human gate. Three dialog types implement this, all sharing the same blocking pattern. 
### Dialog Classes **`ConfirmDialog`** — PowerShell script execution approval: ```python class ConfirmDialog: _uid: str # uuid4 identifier _script: str # The PowerShell script text (editable) _base_dir: str # Working directory _condition: threading.Condition # Blocking primitive _done: bool # Signal flag _approved: bool # User's decision def wait(self) -> tuple[bool, str] # Blocks until _done; returns (approved, script) ``` **`MMAApprovalDialog`** — MMA tier step approval: ```python class MMAApprovalDialog: _ticket_id: str _payload: str # The step payload (editable) _condition: threading.Condition _done: bool _approved: bool def wait(self) -> tuple[bool, str] # Returns (approved, payload) ``` **`MMASpawnApprovalDialog`** — Sub-agent spawn approval: ```python class MMASpawnApprovalDialog: _ticket_id: str _role: str # tier3-worker, tier4-qa, etc. _prompt: str # Spawn prompt (editable) _context_md: str # Context document (editable) _condition: threading.Condition _done: bool _approved: bool _abort: bool # Can abort entire track def wait(self) -> dict[str, Any] # Returns {approved, abort, prompt, context_md} ``` ### Blocking Flow Using `ConfirmDialog` as exemplar: ``` ASYNCIO THREAD (ai_client tool callback) GUI MAIN THREAD ───────────────────────────────────────── ─────────────── 1. ai_client calls _confirm_and_run(script) 2. Creates ConfirmDialog(script, base_dir) 3. Stores dialog: - Headless: _pending_actions[uid] = dialog - GUI mode: _pending_dialog = dialog 4. If test_hooks_enabled: pushes to _api_event_queue 5. dialog.wait() BLOCKS on _condition 6. Next frame: ImGui renders _pending_dialog in modal 7. User clicks Approve/Reject 8. _handle_approve_script(): with dialog._condition: dialog._approved = True dialog._done = True dialog._condition.notify_all() 9. wait() returns (True, potentially_edited_script) 10. Executes shell_runner.run_powershell() 11. 
Returns output to ai_client ``` The `_condition.wait(timeout=0.1)` uses a 100ms polling interval inside a loop — a polling-with-condition hybrid that ensures the blocking thread wakes periodically. ### Resolution Paths **GUI button path** (normal interactive use): `_handle_approve_script()` / `_handle_approve_mma_step()` / `_handle_approve_spawn()` directly manipulate the dialog's condition variable from the GUI thread. **HTTP API path** (headless/automation): `resolve_pending_action(action_id, approved)` looks up the dialog by UUID in `_pending_actions` dict (headless) or `_pending_dialog` (GUI), then signals the condition: ```python def resolve_pending_action(self, action_id: str, approved: bool) -> bool: with self._pending_dialog_lock: if action_id in self._pending_actions: dialog = self._pending_actions[action_id] with dialog._condition: dialog._approved = approved dialog._done = True dialog._condition.notify_all() return True ``` **MMA approval path**: `_handle_mma_respond(approved, payload, abort, prompt, context_md)` is the unified resolver. It uses a `dialog_container` — a one-element list `[None]` used as a mutable reference shared between the MMA engine (which creates the container) and the GUI (which populates it via `_process_pending_gui_tasks`). --- ## AI Client: Multi-Provider Architecture `ai_client.py` operates as a **stateful singleton** — all provider state is held in module-level globals. There is no class wrapping; the module itself is the abstraction layer. 
### Module-Level State ```python _provider: str = "gemini" # "gemini" | "anthropic" | "deepseek" | "gemini_cli" _model: str = "gemini-2.5-flash-lite" _temperature: float = 0.0 _max_tokens: int = 8192 _history_trunc_limit: int = 8000 # Char limit for truncating old tool outputs _send_lock: threading.Lock # Serializes ALL send() calls across providers ``` Per-provider client objects: ```python # Gemini (SDK-managed stateful chat) _gemini_client: genai.Client | None _gemini_chat: Any # Holds history internally _gemini_cache: Any # Server-side CachedContent _gemini_cache_md_hash: int | None # For cache invalidation _GEMINI_CACHE_TTL: int = 3600 # 1-hour; rebuilt at 90% (3240s) # Anthropic (client-managed history) _anthropic_client: anthropic.Anthropic | None _anthropic_history: list[dict] # Mutable [{role, content}, ...] _anthropic_history_lock: threading.Lock # DeepSeek (raw HTTP, client-managed history) _deepseek_history: list[dict] _deepseek_history_lock: threading.Lock # Gemini CLI (adapter wrapper) _gemini_cli_adapter: GeminiCliAdapter | None ``` Safety limits: ```python MAX_TOOL_ROUNDS: int = 10 # Max tool-call loop iterations per send() _MAX_TOOL_OUTPUT_BYTES: int = 500_000 # 500KB cumulative tool output budget _ANTHROPIC_CHUNK_SIZE: int = 120_000 # Max chars per system text block _ANTHROPIC_MAX_PROMPT_TOKENS: int = 180_000 # 200k limit minus headroom _GEMINI_MAX_INPUT_TOKENS: int = 900_000 # 1M window minus headroom ``` ### The `send()` Dispatcher ```python def send(md_content, user_message, base_dir=".", file_items=None, discussion_history="", stream=False, pre_tool_callback=None, qa_callback=None) -> str: with _send_lock: if _provider == "gemini": return _send_gemini(...) elif _provider == "gemini_cli": return _send_gemini_cli(...) elif _provider == "anthropic": return _send_anthropic(...) elif _provider == "deepseek": return _send_deepseek(..., stream=stream) ``` `_send_lock` serializes all API calls — only one provider call can be in-flight at a time. 
All providers share the same callback signatures. Return type is always `str`. ### Provider Comparison | Aspect | Gemini SDK | Anthropic | DeepSeek | Gemini CLI | |---|---|---|---|---| | **Client** | `genai.Client` | `anthropic.Anthropic` | Raw `requests.post` | `GeminiCliAdapter` (subprocess) | | **History** | SDK-managed (`_gemini_chat._history`) | Client-managed list | Client-managed list | CLI-managed (session ID) | | **Caching** | Server-side `CachedContent` with TTL | Prompt caching via `cache_control: ephemeral` (4 breakpoints) | None | None | | **Tool format** | `types.FunctionDeclaration` | JSON Schema dict | Not declared | Same as SDK via adapter | | **Tool results** | `Part.from_function_response(response={"output": ...})` | `{"type": "tool_result", "tool_use_id": ..., "content": ...}` | `{"role": "tool", "tool_call_id": ..., "content": ...}` | `{"role": "tool", ...}` | | **History trimming** | In-place at 40% of 900K token estimate | 2-phase: strip stale file refreshes, then drop turn pairs at 180K | None | None | | **Streaming** | No | No | Yes | No | ### Tool-Call Loop (common pattern across providers) All providers follow the same high-level loop, iterated up to `MAX_TOOL_ROUNDS + 2` times: 1. Send message (or tool results from prior round) to API. 2. Extract text response and any function calls. 3. Log to comms log; emit events. 4. If no function calls or max rounds exceeded: **break**. 5. For each function call: - If `pre_tool_callback` rejects: return rejection text. - Dispatch to `mcp_client.dispatch()` or `shell_runner.run_powershell()`. - After the **last** call of this round: run `_reread_file_items()` for context refresh. - Truncate tool output at `_history_trunc_limit` chars. - Accumulate `_cumulative_tool_bytes`. 6. If cumulative bytes > 500KB: inject warning. 7. Package tool results in provider-specific format; loop. 
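Stripped of provider-specific packaging, the control flow of the steps above looks roughly like this (a sketch with stubbed `call_api` and `dispatch_tool`; constants mirror the documented limits, the helper names are illustrative):

```python
MAX_TOOL_ROUNDS = 10
_MAX_TOOL_OUTPUT_BYTES = 500_000
_history_trunc_limit = 8000

def tool_call_loop(call_api, dispatch_tool, message):
    """Generic sketch of the shared loop; real providers differ in packaging."""
    cumulative = 0
    payload = message
    for round_no in range(MAX_TOOL_ROUNDS + 2):
        text, tool_calls = call_api(payload)        # steps 1-2: send, extract
        if not tool_calls or round_no >= MAX_TOOL_ROUNDS:
            return text                             # step 4: final answer
        results = []
        for call in tool_calls:                     # step 5: dispatch each call
            output = dispatch_tool(call)
            output = output[:_history_trunc_limit]  # truncate tool output
            cumulative += len(output.encode())
            results.append(output)
        if cumulative > _MAX_TOOL_OUTPUT_BYTES:     # step 6: budget warning
            results.append("SYSTEM WARNING: tool output budget exceeded.")
        payload = results                           # step 7: feed back, loop

# Stubbed API: one tool round, then a final answer.
calls = {"n": 0}
def call_api(payload):
    calls["n"] += 1
    if calls["n"] == 1:
        return "", [{"name": "run_powershell", "script": "Get-Date"}]
    return "final answer", []

answer = tool_call_loop(call_api, lambda c: "tool output", "hi")
print(answer)  # final answer
```

The loop terminates on whichever comes first: a response with no function calls, or the round cap.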
### Context Refresh Mechanism After the last tool call in each round, `_reread_file_items(file_items)` checks mtimes of all tracked files: 1. For each file item: compare `Path.stat().st_mtime` against stored `mtime`. 2. If unchanged: pass through as-is. 3. If changed: re-read content, store `old_content` for diffing, update `mtime`. 4. Changed files are diffed via `_build_file_diff_text`: - Files <= 200 lines: emit full content. - Files > 200 lines with `old_content`: emit `difflib.unified_diff`. 5. Diff is appended to the last tool's output as `[SYSTEM: FILES UPDATED]\n\n{diff}`. 6. Stale `[FILES UPDATED]` blocks are stripped from older history turns by `_strip_stale_file_refreshes` to prevent context bloat. ### Anthropic Cache Strategy (4-Breakpoint System) Anthropic allows a maximum of 4 `cache_control: ephemeral` breakpoints: | # | Location | Purpose | |---|---|---| | 1 | Last block of stable system prompt | Cache base instructions | | 2 | Last block of context chunks | Cache file context | | 3 | Last tool definition | Cache tool schema | | 4 | Second-to-last user message | Cache conversation prefix | Before placing breakpoint 4, all existing `cache_control` is stripped from history to prevent exceeding the limit. ### Gemini Cache Strategy (Server-Side TTL) System instruction content is hashed. On each call, a 3-way decision: - **Hash changed**: Delete old cache, rebuild with new content. - **Cache age > 90% of TTL**: Proactive renewal (delete + rebuild). - **No cache exists**: Create new `CachedContent` if token count >= 2048; otherwise inline. 
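The 3-way cache decision can be sketched as a pure function (illustrative names and cache shape; the real code calls the Gemini SDK's caching API and also handles deletion):

```python
import time

CACHE_TTL = 3600        # seconds (1-hour server-side TTL)
RENEW_FRACTION = 0.9    # proactive rebuild at 90% of TTL (3240s)
MIN_CACHE_TOKENS = 2048 # below this, inline the content instead

def decide_cache_action(content_hash: int, token_count: int, cache) -> str:
    """cache is None or a dict like {"hash": int, "created": float}."""
    if cache is not None and cache["hash"] != content_hash:
        return "rebuild"   # hash changed: delete old cache, rebuild with new content
    if cache is not None and time.time() - cache["created"] > CACHE_TTL * RENEW_FRACTION:
        return "renew"     # past 90% of TTL: proactive delete + rebuild
    if cache is None:
        return "create" if token_count >= MIN_CACHE_TOKENS else "inline"
    return "reuse"         # valid, fresh cache: attach as-is

print(decide_cache_action(42, 5000, None))                                  # create
print(decide_cache_action(42, 5000, {"hash": 42, "created": time.time()}))  # reuse
print(decide_cache_action(7, 5000, {"hash": 42, "created": time.time()}))   # rebuild
```

Hashing the system-instruction content makes invalidation cheap: any edit to tracked files changes the hash, which forces a rebuild on the next call rather than serving stale context.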
--- ## Comms Log System Every API interaction is logged to a module-level list with real-time GUI push: ```python def _append_comms(direction: str, kind: str, payload: dict[str, Any]) -> None: entry = { "ts": datetime.now().strftime("%H:%M:%S"), "direction": direction, # "OUT" (to API) or "IN" (from API) "kind": kind, # "request" | "response" | "tool_call" | "tool_result" "provider": _provider, "model": _model, "payload": payload, } _comms_log.append(entry) if comms_log_callback: comms_log_callback(entry) # Real-time push to GUI ``` --- ## State Machines ### `ai_status` (Informal) ``` "idle" -> "sending..." -> [AI call in progress] -> "running powershell..." -> "powershell done, awaiting AI..." -> "fetching url..." | "searching web..." -> "done" | "error" -> "idle" (on reset) ``` ### HITL Dialog State (Binary per type) - `_pending_dialog is not None` — script confirmation active - `_pending_mma_approval is not None` — MMA step approval active - `_pending_mma_spawn is not None` — spawn approval active - `_pending_ask_dialog == True` — tool ask dialog active --- ## Security: The MCP Allowlist Every filesystem tool (read, list, search, write) is gated by the MCP Bridge (`mcp_client.py`). See [guide_tools.md](guide_tools.md) for the complete security model, tool inventory, and endpoint reference. Summary: Every path is resolved to an absolute path and checked against a dynamically-built allowlist constructed from the project's tracked files and base directories. Files named `history.toml` or `*_history.toml` are hard-blacklisted. --- ## Telemetry & Auditing Every interaction is designed to be auditable: - **JSON-L Comms Logs**: Raw API traffic logged to `logs/sessions//comms.log` for debugging and token cost analysis. - **Tool Call Logs**: Markdown-formatted sequential records to `toolcalls.log`. - **Generated Scripts**: Every PowerShell script that passes through the Execution Clutch is saved to `scripts/generated/_.ps1`. 
- **API Hook Logs**: All HTTP hook invocations logged to `apihooks.log`. - **CLI Call Logs**: Subprocess execution details (command, stdin, stdout, stderr, latency) to `clicalls.log` as JSON-L. - **Performance Monitor**: Real-time FPS, Frame Time, CPU, Input Lag tracked and queryable via Hook API. ### Telemetry Data Structures ```python # Comms log entry (JSON-L) { "ts": "14:32:05", "direction": "OUT", "kind": "tool_call", "provider": "gemini", "model": "gemini-2.5-flash-lite", "payload": { "name": "run_powershell", "id": "call_abc123", "script": "Get-ChildItem" }, "source_tier": "Tier 3", "local_ts": 1709875925.123 } # Performance metrics (via get_metrics()) { "fps": 60.0, "fps_avg": 58.5, "last_frame_time_ms": 16.67, "frame_time_ms_avg": 17.1, "cpu_percent": 12.5, "cpu_percent_avg": 15.2, "input_lag_ms": 2.3, "input_lag_ms_avg": 3.1, "time_render_mma_dashboard_ms": 5.2, "time_render_mma_dashboard_ms_avg": 4.8 } ``` --- ## MMA Engine Architecture ### WorkerPool: Concurrent Worker Management The `WorkerPool` class in `multi_agent_conductor.py` manages a bounded pool of worker threads: ```python class WorkerPool: def __init__(self, max_workers: int = 4): self.max_workers = max_workers self._active: dict[str, threading.Thread] = {} self._lock = threading.Lock() self._semaphore = threading.Semaphore(max_workers) def spawn(self, ticket_id: str, target: Callable, args: tuple) -> Optional[threading.Thread]: with self._lock: if len(self._active) >= self.max_workers: return None def wrapper(*a, **kw): try: with self._semaphore: target(*a, **kw) finally: with self._lock: self._active.pop(ticket_id, None) t = threading.Thread(target=wrapper, args=args, daemon=True) with self._lock: self._active[ticket_id] = t t.start() return t ``` **Key behaviors**: - **Bounded concurrency**: `max_workers` (default 4) limits parallel ticket execution - **Semaphore gating**: Ensures no more than `max_workers` can execute simultaneously - **Automatic cleanup**: Thread removes itself from 
`_active` dict on completion - **Non-blocking spawn**: Returns `None` if pool is full, allowing the engine to defer ### ConductorEngine: Orchestration Loop The `ConductorEngine` orchestrates ticket execution within a track: ```python class ConductorEngine: def __init__(self, track: Track, event_queue: Optional[SyncEventQueue] = None, auto_queue: bool = False) -> None: self.track = track self.event_queue = event_queue self.dag = TrackDAG(self.track.tickets) self.engine = ExecutionEngine(self.dag, auto_queue=auto_queue) self.pool = WorkerPool(max_workers=4) self._abort_events: dict[str, threading.Event] = {} self._pause_event = threading.Event() self._tier_usage_lock = threading.Lock() self.tier_usage = { "Tier 1": {"input": 0, "output": 0, "model": "gemini-3.1-pro-preview"}, "Tier 2": {"input": 0, "output": 0, "model": "gemini-3-flash-preview"}, "Tier 3": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"}, "Tier 4": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"}, } ``` **Main execution loop** (`run` method): 1. **Pause check**: If `_pause_event` is set, sleep and broadcast "paused" status 2. **DAG tick**: Call `engine.tick()` to get ready tasks 3. **Completion check**: If no ready tasks and all completed, break with "done" status 4. **Wait for workers**: If tasks in-progress or pool active, sleep and continue 5. **Blockage detection**: If no ready, no in-progress, and not all done, break with "blocked" status 6. **Spawn workers**: For each ready task, spawn a worker via `pool.spawn()` 7. **Model escalation**: Workers use `models_list[min(retry_count, 2)]` for capability upgrade on retries ### Abort Event Propagation Each ticket has an associated `threading.Event` for abort signaling: ```python # Before spawning worker self._abort_events[ticket.id] = threading.Event() # Worker checks abort at three points: # 1. Before major work if abort_event.is_set(): ticket.status = "killed" return "ABORTED" # 2. 
Before tool execution (in clutch_callback) if abort_event.is_set(): return False # Reject tool # 3. After blocking send() returns if abort_event.is_set(): ticket.status = "killed" return "ABORTED" ``` --- ## Architectural Invariants 1. **Single-writer principle**: All GUI state mutations happen on the main thread via `_process_pending_gui_tasks`. Background threads never write GUI state directly. 2. **Copy-and-clear lock pattern**: `_process_pending_gui_tasks` snapshots and clears the task list under the lock, then processes outside the lock. 3. **Context Amnesia**: Each MMA Tier 3 Worker starts with `ai_client.reset_session()`. No conversational bleed between tickets. 4. **Send serialization**: `_send_lock` ensures only one provider call is in-flight at a time across all threads. 5. **Dual-Flush persistence**: On exit, state is committed to both project-level and global-level config files. 6. **No cross-thread GUI mutation**: Background threads must push tasks to `_pending_gui_tasks` rather than calling GUI methods directly. 7. **Abort-before-execution**: Workers check abort events before major work phases, enabling clean cancellation. 8. **Bounded worker pool**: `WorkerPool` enforces `max_workers` limit to prevent resource exhaustion. 
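Invariant 7, for example, can be sketched as a worker that checks its abort event at each phase boundary (illustrative names; real workers also gate individual tool calls through the clutch callback):

```python
import threading
import time

def worker(abort_event: threading.Event, log: list[str]) -> str:
    if abort_event.is_set():          # check 1: before major work
        return "ABORTED"
    log.append("prompt built")
    time.sleep(0.05)                  # stand-in for the blocking send()
    if abort_event.is_set():          # check 3: after send() returns
        return "ABORTED"
    log.append("result committed")    # only reached if never aborted
    return "DONE"

abort = threading.Event()
log: list[str] = []
result: list[str] = []
t = threading.Thread(target=lambda: result.append(worker(abort, log)))
t.start()
abort.set()                           # operator kills the ticket mid-flight
t.join()
print(result[0])
```

Because the checks bracket every expensive phase, an abort set at any point prevents the worker from committing results, without needing to forcibly kill the thread.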
---

## Error Classification & Recovery

### ProviderError Taxonomy

The `ProviderError` class provides structured error classification:

```python
class ProviderError(Exception):
    def __init__(self, kind: str, provider: str, original: Exception):
        self.kind = kind  # "quota" | "rate_limit" | "auth" | "balance" | "network" | "unknown"
        self.provider = provider
        self.original = original

    def ui_message(self) -> str:
        labels = {
            "quota": "QUOTA EXHAUSTED",
            "rate_limit": "RATE LIMITED",
            "auth": "AUTH / API KEY ERROR",
            "balance": "BALANCE / BILLING ERROR",
            "network": "NETWORK / CONNECTION ERROR",
            "unknown": "API ERROR",
        }
        return f"[{self.provider.upper()} {labels.get(self.kind, 'API ERROR')}]\n\n{self.original}"
```

### Error Recovery Patterns

| Error Kind | Recovery Strategy |
|---|---|
| `quota` | Display in UI, await user intervention |
| `rate_limit` | Exponential backoff (not yet implemented) |
| `auth` | Prompt for credential verification |
| `balance` | Display billing alert |
| `network` | Auto-retry with timeout |
| `unknown` | Log full traceback, display in UI |

---

## Memory Management

### History Trimming Strategies

**Gemini (40% threshold)**:

```python
if total_in > _GEMINI_MAX_INPUT_TOKENS * 0.4:
    while len(hist) > 4 and total_in > _GEMINI_MAX_INPUT_TOKENS * 0.3:
        # Drop the oldest user/model turn pair
        hist.pop(0)  # User
        hist.pop(0)  # Model
```

**Anthropic (180K limit)**:

```python
def _trim_anthropic_history(system_blocks, history):
    est = _estimate_prompt_tokens(system_blocks, history)
    while len(history) > 3 and est > _ANTHROPIC_MAX_PROMPT_TOKENS:
        # Drop turn pairs, preserving tool_result chains
        ...
```

### Tool Output Budget

```python
_MAX_TOOL_OUTPUT_BYTES: int = 500_000  # 500KB cumulative

if _cumulative_tool_bytes > _MAX_TOOL_OUTPUT_BYTES:
    # Inject warning, force final answer
    parts.append("SYSTEM WARNING: Cumulative tool output exceeded 500KB budget.")
```

### AST Cache (file_cache.py)

```python
_ast_cache: Dict[str, Tuple[float, tree_sitter.Tree]] = {}

def get_cached_tree(self, path: Optional[str], code: str) -> tree_sitter.Tree:
    if path is None:
        return self.parse(code)  # no path: parse without caching
    p = Path(path)
    mtime = p.stat().st_mtime if p.exists() else 0.0
    if path in _ast_cache:
        cached_mtime, tree = _ast_cache[path]
        if cached_mtime == mtime:
            return tree
    # Parse and cache with simple FIFO eviction (max 10 entries)
    if len(_ast_cache) >= 10:
        del _ast_cache[next(iter(_ast_cache))]
    tree = self.parse(code)
    _ast_cache[path] = (mtime, tree)
    return tree
```