# Architecture

[Top](../README.md) | [Tools & IPC](guide_tools.md) | [MMA Orchestration](guide_mma.md) | [Simulations](guide_simulations.md)

---

## Philosophy: The Decoupled State Machine

Manual Slop solves a single tension: **AI reasoning is high-latency and non-deterministic; GUI interaction must be low-latency and responsive.** The engine enforces strict decoupling between four thread domains so that multi-second LLM calls never block the render loop, and every AI-generated payload passes through a human-auditable gate before execution.

The architectural philosophy follows data-oriented design principles:

- The GUI (`gui_2.py`, `app_controller.py`) remains a pure visualization of application state
- State mutations occur only through lock-guarded queues consumed on the main render thread
- Background threads never write GUI state directly — they serialize task dicts for later consumption
- All cross-thread communication uses explicit synchronization primitives (Locks, Conditions, Events)

## Project Structure

Four distinct thread domains operate concurrently:

| Domain | Created By | Purpose | Lifecycle | Key Synchronization Primitives |
|---|---|---|---|---|
| **Main / GUI** | `immapp.run()` | Dear ImGui retained-mode render loop; sole writer of GUI state | App lifetime | None (consumer of queues) |
| **Asyncio Worker** | `App.__init__` via `threading.Thread(daemon=True)` | Event queue processing, AI client calls | Daemon (dies with process) | `AsyncEventQueue`, `threading.Lock` |
| **HookServer** | `api_hooks.HookServer.start()` | HTTP API on `:8999` for external automation and IPC | Daemon thread | `threading.Lock`, `threading.Event` |
| **Ad-hoc** | Transient `threading.Thread` calls | Model-fetching, legacy send paths, log pruning | Short-lived | Task-specific locks |

The asyncio worker is **not** the main thread's event loop. It runs a dedicated `asyncio.new_event_loop()` on its own daemon thread:

```python
# AppController.__init__:
self._loop = asyncio.new_event_loop()
self._loop_thread = threading.Thread(target=self._run_event_loop, daemon=True)
self._loop_thread.start()
```

The GUI thread uses `asyncio.run_coroutine_threadsafe(coro, self._loop)` to push work into this loop.
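
In isolation, that handoff looks like this (a minimal self-contained sketch, not the actual `AppController` code; the coroutine stands in for a slow provider call):

```python
import asyncio
import threading

# Dedicated event loop on a daemon thread, mirroring the setup above.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()


async def fetch_reply(prompt: str) -> str:
    await asyncio.sleep(0.01)  # stands in for a multi-second LLM call
    return f"reply to {prompt!r}"


# GUI thread: schedule the coroutine on the worker loop. The render loop
# would poll the future; .result() is only called here for demonstration.
future = asyncio.run_coroutine_threadsafe(fetch_reply("ping"), loop)
print(future.result(timeout=5))  # blocks this thread only
loop.call_soon_threadsafe(loop.stop)
```

`run_coroutine_threadsafe` returns a `concurrent.futures.Future`, so the GUI side can check `future.done()` each frame instead of blocking.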

### Thread-Local Context Isolation

For concurrent multi-agent execution, the application uses `threading.local()` to manage per-thread context:

```python
# ai_client.py
_local_storage = threading.local()


def get_current_tier() -> Optional[str]:
    """Returns the current tier from thread-local storage."""
    return getattr(_local_storage, "current_tier", None)


def set_current_tier(tier: Optional[str]) -> None:
    """Sets the current tier in thread-local storage."""
    _local_storage.current_tier = tier
```

This ensures that comms log entries and tool calls are correctly tagged with their source tier even when multiple workers execute concurrently.
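
The isolation can be demonstrated with two workers tagging concurrently (a sketch that re-declares the helpers above for self-containment; the `seen` dict is illustrative):

```python
import threading
from typing import Optional

_local_storage = threading.local()


def get_current_tier() -> Optional[str]:
    return getattr(_local_storage, "current_tier", None)


def set_current_tier(tier: Optional[str]) -> None:
    _local_storage.current_tier = tier


seen: dict = {}


def worker(tier: str) -> None:
    set_current_tier(tier)
    # Any log entry written here reads its own thread's tier only.
    seen[tier] = get_current_tier()


threads = [threading.Thread(target=worker, args=(t,))
           for t in ("Tier 2", "Tier 3")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker sees only the value it set, and the main thread, which never called `set_current_tier`, still reads `None`.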

---

## Cross-Thread Data Structures

Every interaction is designed to be auditable:

- **CLI Call Logs**: Subprocess execution details (command, stdin, stdout, stderr, latency) to `clicalls.log` as JSON-L.
- **Performance Monitor**: Real-time FPS, Frame Time, CPU, Input Lag tracked and queryable via Hook API.

### Telemetry Data Structures

```python
# Comms log entry (JSON-L)
{
    "ts": "14:32:05",
    "direction": "OUT",
    "kind": "tool_call",
    "provider": "gemini",
    "model": "gemini-2.5-flash-lite",
    "payload": {
        "name": "run_powershell",
        "id": "call_abc123",
        "script": "Get-ChildItem"
    },
    "source_tier": "Tier 3",
    "local_ts": 1709875925.123
}

# Performance metrics (via get_metrics())
{
    "fps": 60.0,
    "fps_avg": 58.5,
    "last_frame_time_ms": 16.67,
    "frame_time_ms_avg": 17.1,
    "cpu_percent": 12.5,
    "cpu_percent_avg": 15.2,
    "input_lag_ms": 2.3,
    "input_lag_ms_avg": 3.1,
    "time_render_mma_dashboard_ms": 5.2,
    "time_render_mma_dashboard_ms_avg": 4.8
}
```

---

## MMA Engine Architecture

### WorkerPool: Concurrent Worker Management

The `WorkerPool` class in `multi_agent_conductor.py` manages a bounded pool of worker threads:

```python
import threading
from typing import Callable, Optional


class WorkerPool:
    def __init__(self, max_workers: int = 4):
        self.max_workers = max_workers
        self._active: dict[str, threading.Thread] = {}
        self._lock = threading.Lock()
        self._semaphore = threading.Semaphore(max_workers)

    def spawn(self, ticket_id: str, target: Callable, args: tuple) -> Optional[threading.Thread]:
        with self._lock:
            if len(self._active) >= self.max_workers:
                return None

        def wrapper(*a, **kw):
            try:
                with self._semaphore:
                    target(*a, **kw)
            finally:
                with self._lock:
                    self._active.pop(ticket_id, None)

        t = threading.Thread(target=wrapper, args=args, daemon=True)
        with self._lock:
            self._active[ticket_id] = t
        t.start()
        return t
```

**Key behaviors**:

- **Bounded concurrency**: `max_workers` (default 4) limits parallel ticket execution
- **Semaphore gating**: Ensures no more than `max_workers` can execute simultaneously
- **Automatic cleanup**: Thread removes itself from `_active` dict on completion
- **Non-blocking spawn**: Returns `None` if pool is full, allowing the engine to defer

### ConductorEngine: Orchestration Loop

The `ConductorEngine` orchestrates ticket execution within a track:

```python
class ConductorEngine:
    def __init__(self, track: Track, event_queue: Optional[SyncEventQueue] = None,
                 auto_queue: bool = False) -> None:
        self.track = track
        self.event_queue = event_queue
        self.dag = TrackDAG(self.track.tickets)
        self.engine = ExecutionEngine(self.dag, auto_queue=auto_queue)
        self.pool = WorkerPool(max_workers=4)
        self._abort_events: dict[str, threading.Event] = {}
        self._pause_event = threading.Event()
        self._tier_usage_lock = threading.Lock()
        self.tier_usage = {
            "Tier 1": {"input": 0, "output": 0, "model": "gemini-3.1-pro-preview"},
            "Tier 2": {"input": 0, "output": 0, "model": "gemini-3-flash-preview"},
            "Tier 3": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"},
            "Tier 4": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"},
        }
```

**Main execution loop** (`run` method):

1. **Pause check**: If `_pause_event` is set, sleep and broadcast "paused" status
2. **DAG tick**: Call `engine.tick()` to get ready tasks
3. **Completion check**: If no ready tasks and all completed, break with "done" status
4. **Wait for workers**: If tasks in-progress or pool active, sleep and continue
5. **Blockage detection**: If no ready, no in-progress, and not all done, break with "blocked" status
6. **Spawn workers**: For each ready task, spawn a worker via `pool.spawn()`
7. **Model escalation**: Workers use `models_list[min(retry_count, 2)]` for capability upgrade on retries

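The decision ladder in steps 1–7 can be compressed into a toy loop. This is a sketch against a stub engine (the stub's `tick()` drains pending tasks and its workers finish synchronously, which is not how the real `ExecutionEngine` behaves):

```python
import threading


def run_loop(engine, pause_event: threading.Event, models_list):
    """Toy version of the run-method decision ladder (steps 1-7)."""
    while True:
        if pause_event.is_set():                # 1. pause check
            return "paused"
        ready = engine.tick()                   # 2. DAG tick
        if not ready and engine.all_done():     # 3. completion check
            return "done"
        if not ready and engine.in_progress():  # 4. wait for workers
            continue
        if not ready:                           # 5. blockage detection
            return "blocked"
        for task in ready:                      # 6. spawn workers
            # 7. model escalation: retries climb the capability ladder
            engine.start(task, models_list[min(task["retries"], 2)])


class StubEngine:
    """Stand-in engine: completes each task synchronously on start()."""

    def __init__(self, tasks):
        self.pending, self.finished = list(tasks), []

    def tick(self):
        ready, self.pending = self.pending, []
        return ready

    def all_done(self):
        return not self.pending

    def in_progress(self):
        return False

    def start(self, task, model):
        self.finished.append((task["id"], model))


engine = StubEngine([{"id": "T1", "retries": 0}, {"id": "T2", "retries": 5}])
status = run_loop(engine, threading.Event(), ["flash-lite", "flash", "pro"])
```

Note how the `min(retries, 2)` clamp means a ticket on its fifth retry still gets the top-tier model rather than indexing past the end of the list.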
### Abort Event Propagation

Each ticket has an associated `threading.Event` for abort signaling:

```python
# Before spawning worker
self._abort_events[ticket.id] = threading.Event()

# Worker checks abort at three points:
# 1. Before major work
if abort_event.is_set():
    ticket.status = "killed"
    return "ABORTED"

# 2. Before tool execution (in clutch_callback)
if abort_event.is_set():
    return False  # Reject tool

# 3. After blocking send() returns
if abort_event.is_set():
    ticket.status = "killed"
    return "ABORTED"
```

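A self-contained sketch of checkpoints 1 and 3 (the tool-gate checkpoint is omitted; the `started` event exists only to make the demo deterministic and is not part of the real worker):

```python
import threading
import time

abort = threading.Event()
started = threading.Event()
results: list = []


def worker() -> None:
    # Checkpoint 1: before major work
    if abort.is_set():
        results.append("ABORTED-early")
        return
    started.set()
    time.sleep(0.05)  # stands in for a blocking send()
    # Checkpoint 3: after the blocking call returns
    if abort.is_set():
        results.append("killed")
        return
    results.append("done")


t = threading.Thread(target=worker, daemon=True)
t.start()
started.wait()  # ensure the worker is past checkpoint 1
abort.set()     # operator kills the ticket while send() is in flight
t.join()
```

Because the blocking call cannot be interrupted, the abort takes effect at the *next* checkpoint; the ticket is marked killed only after `send()` returns.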
---

## Architectural Invariants

1. **Single-writer principle**: All GUI state mutations happen on the main thread via `_process_pending_gui_tasks`. Background threads never write GUI state directly.

2. **Copy-and-clear lock pattern**: `_process_pending_gui_tasks` snapshots and clears the task list under the lock, then processes outside the lock.

3. **Context Amnesia**: Each MMA Tier 3 Worker starts with `ai_client.reset_session()`. No conversational bleed between tickets.

4. **Send serialization**: `_send_lock` ensures only one provider call is in-flight at a time across all threads.

5. **Dual-Flush persistence**: On exit, state is committed to both project-level and global-level config files.

6. **No cross-thread GUI mutation**: Background threads must push tasks to `_pending_gui_tasks` rather than calling GUI methods directly.

7. **Abort-before-execution**: Workers check abort events before major work phases, enabling clean cancellation.

8. **Bounded worker pool**: `WorkerPool` enforces the `max_workers` limit to prevent resource exhaustion.

---

## Error Classification & Recovery

### ProviderError Taxonomy

The `ProviderError` class provides structured error classification:

```python
class ProviderError(Exception):
    def __init__(self, kind: str, provider: str, original: Exception):
        self.kind = kind  # "quota" | "rate_limit" | "auth" | "balance" | "network" | "unknown"
        self.provider = provider
        self.original = original

    def ui_message(self) -> str:
        labels = {
            "quota": "QUOTA EXHAUSTED",
            "rate_limit": "RATE LIMITED",
            "auth": "AUTH / API KEY ERROR",
            "balance": "BALANCE / BILLING ERROR",
            "network": "NETWORK / CONNECTION ERROR",
            "unknown": "API ERROR",
        }
        return f"[{self.provider.upper()} {labels.get(self.kind, 'API ERROR')}]\n\n{self.original}"
```

### Error Recovery Patterns

| Error Kind | Recovery Strategy |
|---|---|
| `quota` | Display in UI, await user intervention |
| `rate_limit` | Exponential backoff (not yet implemented) |
| `auth` | Prompt for credential verification |
| `balance` | Display billing alert |
| `network` | Auto-retry with timeout |
| `unknown` | Log full traceback, display in UI |

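The document does not show how a raw provider exception is mapped onto a `kind`; such classifiers are typically substring-based. A speculative sketch, purely illustrative of the taxonomy above (the matching rules here are assumptions, not the real classifier):

```python
def classify(exc: Exception) -> str:
    """Speculative substring-based classifier for the kinds above."""
    msg = str(exc).lower()
    rules = [
        ("quota", ("quota", "resource_exhausted")),
        ("rate_limit", ("rate limit", "429", "too many requests")),
        ("auth", ("api key", "unauthorized", "401", "403")),
        ("balance", ("billing", "insufficient funds", "balance")),
        ("network", ("timeout", "connection", "dns")),
    ]
    for kind, needles in rules:
        if any(n in msg for n in needles):
            return kind
    return "unknown"  # falls through to the catch-all row


kind = classify(RuntimeError("HTTP 429: Too Many Requests"))
```

Ordering matters: quota errors from some providers also mention rate limits, so the more specific rule is checked first.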
---

## Memory Management

### History Trimming Strategies

**Gemini (40% threshold)**:

```python
if total_in > _GEMINI_MAX_INPUT_TOKENS * 0.4:
    while len(hist) > 4 and total_in > _GEMINI_MAX_INPUT_TOKENS * 0.3:
        # Drop oldest message pairs
        hist.pop(0)  # Assistant
        hist.pop(0)  # User
```

**Anthropic (180K limit)**:

```python
def _trim_anthropic_history(system_blocks, history):
    est = _estimate_prompt_tokens(system_blocks, history)
    while len(history) > 3 and est > _ANTHROPIC_MAX_PROMPT_TOKENS:
        # Drop turn pairs, preserving tool_result chains
        ...
```

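Both strategies share one shape: drop the oldest turn pairs while over budget, never shrinking below a minimum tail. A self-contained sketch of that shape, with token estimation simplified to a per-message count (the message format here is illustrative, not either provider's):

```python
def trim_history(history: list, max_tokens: int, keep_min: int = 4) -> list:
    """Drop the oldest turn pair while the estimated prompt size
    exceeds the budget, never shrinking below keep_min messages."""
    hist = list(history)  # never mutate the caller's list
    total = sum(m["tokens"] for m in hist)
    while len(hist) > keep_min and total > max_tokens:
        total -= hist.pop(0)["tokens"]  # oldest turn
        total -= hist.pop(0)["tokens"]  # its paired reply
    return hist


# 8 messages of 100 estimated tokens each = 800 tokens total
msgs = [{"role": "user" if i % 2 == 0 else "assistant", "tokens": 100}
        for i in range(8)]
trimmed = trim_history(msgs, max_tokens=500)
```

Dropping in pairs keeps the alternating user/assistant structure intact, which both providers' APIs require.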
### Tool Output Budget

```python
_MAX_TOOL_OUTPUT_BYTES: int = 500_000  # 500KB cumulative

if _cumulative_tool_bytes > _MAX_TOOL_OUTPUT_BYTES:
    # Inject warning, force final answer
    parts.append("SYSTEM WARNING: Cumulative tool output exceeded 500KB budget.")
```

### AST Cache (file_cache.py)

```python
from pathlib import Path

_ast_cache: Dict[str, Tuple[float, tree_sitter.Tree]] = {}


def get_cached_tree(self, path: Optional[str], code: str) -> tree_sitter.Tree:
    p = Path(path) if path else None
    mtime = p.stat().st_mtime if p and p.exists() else 0.0
    if path in _ast_cache:
        cached_mtime, tree = _ast_cache[path]
        if cached_mtime == mtime:
            return tree
    # Parse and cache, evicting the oldest-inserted entry (max 10 entries)
    if len(_ast_cache) >= 10:
        del _ast_cache[next(iter(_ast_cache))]
    tree = self.parse(code)
    _ast_cache[path] = (mtime, tree)
    return tree
```
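
The mtime-invalidation logic can be exercised without tree-sitter by swapping in a stub parser (a sketch; `parse_count` and the stub are illustrative):

```python
import os
import tempfile
from typing import Dict, Tuple

_cache: Dict[str, Tuple[float, str]] = {}
parse_count = 0


def parse(code: str) -> str:
    """Stub for tree_sitter parsing; counts how often it runs."""
    global parse_count
    parse_count += 1
    return f"tree({len(code)} bytes)"


def get_cached(path: str, code: str) -> str:
    mtime = os.stat(path).st_mtime if os.path.exists(path) else 0.0
    if path in _cache:
        cached_mtime, tree = _cache[path]
        if cached_mtime == mtime:
            return tree  # hit: file unchanged since last parse
    if len(_cache) >= 10:  # evict oldest-inserted entry
        del _cache[next(iter(_cache))]
    tree = parse(code)
    _cache[path] = (mtime, tree)
    return tree


with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("x = 1")
    path = f.name
get_cached(path, "x = 1")
get_cached(path, "x = 1")  # second call hits the cache, no re-parse
os.unlink(path)
```

Keying on mtime rather than file contents makes hits cheap (one `stat` call), at the cost of missing sub-second edits on filesystems with coarse timestamp resolution.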