docs update (wip)

This commit is contained in:
2026-03-08 01:46:34 -05:00
parent d9a06fd2fe
commit d34c35941f
14 changed files with 1213 additions and 105 deletions

Readme.md

@@ -1,14 +1,56 @@
# Manual Slop
![img](./gallery/splash.png)
A high-density GUI orchestrator for local LLM-driven coding sessions. Manual Slop bridges high-latency AI reasoning with a low-latency ImGui render loop via a thread-safe asynchronous pipeline, ensuring every AI-generated payload passes through a human-auditable gate before execution.
**Tech Stack**: Python 3.11+, Dear PyGui / ImGui Bundle, FastAPI, Uvicorn, tree-sitter
**Providers**: Gemini API, Anthropic API, DeepSeek, Gemini CLI (headless), MiniMax
**Platform**: Windows (PowerShell) — single developer, local use
**Design Philosophy**: Full manual control over vendor API metrics, agent capabilities, and context memory usage. High information density, tactile interactions, and explicit confirmation for destructive actions.
![img](./gallery/python_2026-03-01_23-45-34.png)
![img](./gallery/python_2026-03-07_14-32-50.png)
---
## Key Features
### Multi-Provider Integration
- **Gemini SDK**: Server-side context caching with TTL management, automatic cache rebuilding at 90% TTL
- **Anthropic**: Ephemeral prompt caching with 4-breakpoint system, automatic history truncation at 180K tokens
- **DeepSeek**: Dedicated SDK for code-optimized reasoning
- **Gemini CLI**: Headless adapter with full functional parity, synchronous HITL bridge
- **MiniMax**: Alternative provider support
### 4-Tier MMA Orchestration
Hierarchical task decomposition with specialized models and strict token firewalling:
- **Tier 1 (Orchestrator)**: Product alignment, epic → tracks
- **Tier 2 (Tech Lead)**: Track → tickets (DAG), persistent context
- **Tier 3 (Worker)**: Stateless TDD implementation, context amnesia
- **Tier 4 (QA)**: Stateless error analysis, no fixes
### Strict Human-in-the-Loop (HITL)
- **Execution Clutch**: All destructive actions suspend on `threading.Condition` pending GUI approval
- **Three Dialog Types**: ConfirmDialog (scripts), MMAApprovalDialog (steps), MMASpawnApprovalDialog (workers)
- **Editable Payloads**: Review, modify, or reject any AI-generated content before execution
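The Execution Clutch can be sketched as a condition-variable gate: the worker thread blocks on a `threading.Condition` until the GUI thread records a decision (class and method names here are illustrative, not the app's actual API):

```python
import threading

class ExecutionClutch:
    """Minimal sketch of a condition-variable approval gate (names hypothetical)."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._decision: bool | None = None

    def await_approval(self, timeout: float = 300.0) -> bool:
        # Worker thread: suspend until the GUI records a decision (or timeout).
        with self._cond:
            self._cond.wait_for(lambda: self._decision is not None, timeout=timeout)
            return bool(self._decision)

    def resolve(self, approved: bool) -> None:
        # GUI thread: called when the user clicks Approve/Reject.
        with self._cond:
            self._decision = approved
            self._cond.notify_all()
```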
### 26 MCP Tools with Sandboxing
Three-layer security model: Allowlist Construction → Path Validation → Resolution Gate
- **File I/O**: read, list, search, slice, edit, tree
- **AST-Based (Python)**: skeleton, outline, definition, signature, class summary, docstring
- **Analysis**: summary, git diff, find usages, imports, syntax check, hierarchy
- **Network**: web search, URL fetch
- **Runtime**: UI performance metrics
### Parallel Tool Execution
Multiple independent tool calls within a single AI turn execute concurrently via `asyncio.gather`, significantly reducing latency.
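A minimal sketch of this fan-out (the tool bodies are stand-ins with simulated latency):

```python
import asyncio

async def call_tool(name: str, delay: float) -> str:
    # Stand-in for one MCP tool invocation.
    await asyncio.sleep(delay)
    return f"{name}: ok"

async def run_turn(calls: list[tuple[str, float]]) -> list[str]:
    # Independent tool calls from one AI turn run concurrently;
    # gather preserves the original call order in its results.
    return await asyncio.gather(*(call_tool(n, d) for n, d in calls))

results = asyncio.run(run_turn([("read_file", 0.05), ("get_tree", 0.05)]))
print(results)  # → ['read_file: ok', 'get_tree: ok']
```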
### AST-Based Context Management
- **Skeleton View**: Signatures + docstrings, bodies replaced with `...`
- **Curated View**: Preserves `@core_logic` decorated functions and `[HOT]` comment blocks
- **Targeted View**: Extracts only specified symbols and their dependencies
- **Heuristic Summaries**: Token-efficient structural descriptions without AI calls
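The skeleton transformation can be illustrated with stdlib `ast` (the app's ASTParser uses tree-sitter; this is a simplified sketch of the same idea):

```python
import ast

def skeleton(source: str) -> str:
    """Keep signatures and docstrings; replace function bodies with `...`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            new_body: list[ast.stmt] = []
            if ast.get_docstring(node) is not None:
                new_body.append(node.body[0])  # keep the docstring expression
            new_body.append(ast.Expr(ast.Constant(...)))  # body becomes `...`
            node.body = new_body
    return ast.unparse(tree)

src = 'def add(a, b):\n    """Sum two numbers."""\n    return a + b\n'
print(skeleton(src))
```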
---
@@ -26,35 +68,12 @@ The **MMA (Multi-Model Agent)** system decomposes epics into tracks, tracks into
| Guide | Scope |
|---|---|
| [Readme](./docs/Readme.md) | Documentation index, GUI panel reference, configuration files, environment variables |
| [Architecture](./docs/guide_architecture.md) | Threading model, event system, AI client multi-provider architecture, HITL mechanism, comms logging |
| [Tools & IPC](./docs/guide_tools.md) | MCP Bridge 3-layer security, 26 tool inventory, Hook API endpoints, ApiHookClient reference, shell runner |
| [MMA Orchestration](./docs/guide_mma.md) | 4-tier hierarchy, Ticket/Track data structures, DAG engine, ConductorEngine, worker lifecycle, abort propagation |
| [Simulations](./docs/guide_simulations.md) | `live_gui` fixture, Puppeteer pattern, mock provider, visual verification, ASTParser / summarizer |
| [Meta-Boundary](./docs/guide_meta_boundary.md) | Application vs Meta-Tooling domains, inter-domain bridges, safety model separation |
---
@@ -105,6 +124,151 @@ uv run pytest tests/ -v
---
## MMA 4-Tier Architecture
The Multi-Model Agent system uses hierarchical task decomposition with specialized models at each tier:
| Tier | Role | Model | Responsibility |
|------|------|-------|----------------|
| **Tier 1** | Orchestrator | `gemini-3.1-pro-preview` | Product alignment, epic → tracks, track initialization |
| **Tier 2** | Tech Lead | `gemini-3-flash-preview` | Track → tickets (DAG), architectural oversight, persistent context |
| **Tier 3** | Worker | `gemini-2.5-flash-lite` / `deepseek-v3` | Stateless TDD implementation per ticket, context amnesia |
| **Tier 4** | QA | `gemini-2.5-flash-lite` / `deepseek-v3` | Stateless error analysis, diagnostics only (no fixes) |
**Key Principles:**
- **Context Amnesia**: Tier 3/4 workers start with `ai_client.reset_session()` — no history bleed
- **Token Firewalling**: Each tier receives only the context it needs
- **Model Escalation**: Failed tickets automatically retry with more capable models
- **WorkerPool**: Bounded concurrency (default: 4 workers) with semaphore gating
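The bounded WorkerPool idea can be sketched with an `asyncio.Semaphore` gating concurrent ticket runs (the ticket runner is a stand-in):

```python
import asyncio

async def run_ticket(ticket_id: str) -> str:
    # Stand-in for a Tier 3 worker: fresh session, one ticket, no shared history.
    await asyncio.sleep(0.01)
    return f"{ticket_id}: done"

async def worker_pool(ticket_ids: list[str], max_workers: int = 4) -> list[str]:
    # Semaphore bounds concurrency, matching the default of 4 workers.
    gate = asyncio.Semaphore(max_workers)

    async def gated(tid: str) -> str:
        async with gate:
            return await run_ticket(tid)

    return await asyncio.gather(*(gated(t) for t in ticket_ids))

print(asyncio.run(worker_pool([f"T-{i}" for i in range(6)])))
```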
---
## Module by Domain
### src/ — Core implementation
| File | Role |
|---|---|
| `src/gui_2.py` | Primary ImGui interface — App class, frame-sync, HITL dialogs, event system |
| `src/ai_client.py` | Multi-provider LLM abstraction (Gemini, Anthropic, DeepSeek, MiniMax) |
| `src/mcp_client.py` | 26 MCP tools with filesystem sandboxing and tool dispatch |
| `src/api_hooks.py` | HookServer — REST API on `127.0.0.1:8999` for external automation |
| `src/api_hook_client.py` | Python client for the Hook API (used by tests and external tooling) |
| `src/multi_agent_conductor.py` | ConductorEngine — Tier 2 orchestration loop with DAG execution |
| `src/conductor_tech_lead.py` | Tier 2 ticket generation from track briefs |
| `src/dag_engine.py` | TrackDAG (dependency graph) + ExecutionEngine (tick-based state machine) |
| `src/models.py` | Ticket, Track, WorkerContext dataclasses, track state metadata |
| `src/events.py` | EventEmitter, AsyncEventQueue, UserRequestEvent |
| `src/project_manager.py` | TOML config persistence, discussion management, track state |
| `src/session_logger.py` | JSON-L + markdown audit trails (comms, tools, CLI, hooks) |
| `src/shell_runner.py` | PowerShell execution with timeout, env config, QA callback |
| `src/file_cache.py` | ASTParser (tree-sitter) — skeleton, curated, and targeted views |
| `src/summarize.py` | Heuristic file summaries (imports, classes, functions) |
| `src/outline_tool.py` | Hierarchical code outline via stdlib `ast` |
| `src/performance_monitor.py` | FPS, frame time, CPU, input lag tracking |
| `src/log_registry.py` | Session metadata persistence |
| `src/log_pruner.py` | Automated log cleanup based on age and whitelist |
| `src/paths.py` | Centralized path resolution with environment variable overrides |
| `src/cost_tracker.py` | Token cost estimation for API calls |
| `src/gemini_cli_adapter.py` | CLI subprocess adapter with session management |
| `src/mma_prompts.py` | Tier-specific system prompts for MMA orchestration |
| `src/theme_*.py` | UI theming (dark, light modes) |
Simulation modules in `simulation/`:
| File | Role |
|---|---|
| `simulation/sim_base.py` | BaseSimulation class with setup/teardown lifecycle |
| `simulation/workflow_sim.py` | WorkflowSimulator — high-level GUI automation |
| `simulation/user_agent.py` | UserSimAgent — simulated user behavior (reading time, thinking delays) |
---
## Security Model
The MCP Bridge implements a three-layer security model in `mcp_client.py`. Every tool accessing the filesystem passes through `_resolve_and_check(path)` before any I/O.
### Layer 1: Allowlist Construction (`configure`)
Called by `ai_client` before each send cycle:
1. Resets `_allowed_paths` and `_base_dirs` to empty sets.
2. Sets `_primary_base_dir` from `extra_base_dirs[0]` (resolved) or falls back to cwd().
3. Iterates `file_items`, resolving each path to an absolute path, adding to `_allowed_paths`; its parent directory is added to `_base_dirs`.
4. Any entries in `extra_base_dirs` that are valid directories are also added to `_base_dirs`.
### Layer 2: Path Validation (`_is_allowed`)
Checks run in this exact order:
1. **Blacklist**: `history.toml`, `*_history.toml`, `config`, `credentials` → hard deny
2. **Explicit allowlist**: Path in `_allowed_paths` → allow
3. **CWD fallback**: If no base dirs, any path under `cwd()` is allowed (fail-safe for projects without explicit base dirs)
4. **Base containment**: Must be a subpath of at least one entry in `_base_dirs` (via `relative_to()`)
5. **Default deny**: All other paths rejected
### Layer 3: Resolution Gate (`_resolve_and_check`)
Every tool call passes through this:
1. Convert raw path string to `Path`.
2. If not absolute, prepend `_primary_base_dir`.
3. Resolve to absolute.
4. Call `_is_allowed()`.
5. Return `(resolved_path, "")` on success, `(None, error_message)` on failure
All paths are resolved (following symlinks) before comparison, preventing symlink-based traversal attacks.
---
## Conductor System
The project uses a spec-driven track system in `conductor/` for structured development:
```
conductor/
├── workflow.md # Task lifecycle, TDD protocol, phase verification
├── tech-stack.md # Technology constraints and patterns
├── product.md # Product vision and guidelines
├── product-guidelines.md # Code standards, UX principles
└── tracks/
└── <track_name>_<YYYYMMDD>/
├── spec.md # Track specification
├── plan.md # Implementation plan with checkbox tasks
├── metadata.json # Track metadata
└── state.toml # Structured state with task list
```
**Key Concepts:**
- **Tracks**: Self-contained implementation units with spec, plan, and state
- **TDD Protocol**: Red (failing tests) → Green (pass) → Refactor
- **Phase Checkpoints**: Verification gates with git notes for audit trails
- **MMA Delegation**: Tracks are executed via the 4-tier agent hierarchy
See `conductor/workflow.md` for the full development workflow.
---
## Project Configuration
Projects are stored as `<name>.toml` files. The discussion history is split into a sibling `<name>_history.toml` to keep the main config lean.
@@ -134,3 +298,31 @@ run_powershell = true
read_file = true
# ... 26 tool flags
```
---
## Quick Reference
### Hook API Endpoints (port 8999)
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/status` | GET | Health check |
| `/api/project` | GET/POST | Project config |
| `/api/session` | GET/POST | Discussion entries |
| `/api/gui` | POST | GUI task queue |
| `/api/gui/mma_status` | GET | Full MMA state |
| `/api/gui/value/<tag>` | GET | Read GUI field |
| `/api/ask` | POST | Blocking HITL dialog |
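For example, a blocking `/api/ask` call might be issued like this (the payload and response field names are assumptions; check ApiHookClient for the real schema):

```python
import json
import urllib.request

BASE = "http://127.0.0.1:8999"

def ask(question: str, timeout: float = 300.0) -> str:
    """Blocking HITL request: POST /api/ask and wait for the human's reply."""
    req = urllib.request.Request(
        f"{BASE}/api/ask",
        data=json.dumps({"question": question}).encode(),  # field name assumed
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The call blocks until the user answers the dialog in the GUI (or timeout).
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["answer"]  # field name assumed
```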
### MCP Tool Categories
| Category | Tools |
|----------|-------|
| **File I/O** | `read_file`, `list_directory`, `search_files`, `get_tree`, `get_file_slice`, `set_file_slice`, `edit_file` |
| **AST (Python)** | `py_get_skeleton`, `py_get_code_outline`, `py_get_definition`, `py_update_definition`, `py_get_signature`, `py_set_signature`, `py_get_class_summary`, `py_get_var_declaration`, `py_set_var_declaration`, `py_get_docstring` |
| **Analysis** | `get_file_summary`, `get_git_diff`, `py_find_usages`, `py_get_imports`, `py_check_syntax`, `py_get_hierarchy` |
| **Network** | `web_search`, `fetch_url` |
| **Runtime** | `get_ui_performance` |
---


@@ -1,6 +1,12 @@
# Documentation Index
[Top](../README.md)
---
## Overview
This documentation suite provides comprehensive technical reference for the Manual Slop application — a GUI orchestrator for local LLM-driven coding sessions. The guides follow a strict old-school technical documentation style, emphasizing architectural depth, state management details, algorithmic breakdowns, and structural formats.
---
@@ -8,68 +14,341 @@
| Guide | Contents |
|---|---|
| [Architecture](guide_architecture.md) | Thread domains (GUI Main, Asyncio Worker, HookServer, Ad-hoc), cross-thread data structures (AsyncEventQueue, Guarded Lists, Condition-Variable Dialogs), event system (EventEmitter, SyncEventQueue, UserRequestEvent), application lifetime (boot sequence, shutdown sequence), task pipeline (producer-consumer synchronization), Execution Clutch (HITL mechanism with ConfirmDialog, MMAApprovalDialog, MMASpawnApprovalDialog), AI client multi-provider architecture (Gemini SDK, Anthropic, DeepSeek, Gemini CLI, MiniMax), Anthropic/Gemini caching strategies (4-breakpoint system, server-side TTL), context refresh mechanism (mtime-based file re-reading, diff injection), comms logging (JSON-L format), state machines (ai_status, HITL dialog state) |
| [Meta-Boundary](guide_meta_boundary.md) | Explicit distinction between the Application's domain (Strict HITL: `gui_2.py`, `ai_client.py`, `multi_agent_conductor.py`, `dag_engine.py`) and the Meta-Tooling domain (`scripts/mma_exec.py`, `scripts/claude_mma_exec.py`, `scripts/tool_call.py`, `scripts/mcp_server.py`, `.gemini/`, `.claude/`), preventing feature bleed and safety bypasses via shared bridges like `mcp_client.py`. Documents the Inter-Domain Bridges (`cli_tool_bridge.py`, `claude_tool_bridge.py`) and the `GEMINI_CLI_HOOK_CONTEXT` environment variable. |
| [Tools & IPC](guide_tools.md) | MCP Bridge 3-layer security model (Allowlist Construction, Path Validation, Resolution Gate), all 26 native tool signatures with parameters and behavior (File I/O, AST-Based, Analysis, Network, Runtime), Hook API GET/POST endpoints with request/response formats, ApiHookClient method reference (Connection Methods, State Query Methods, GUI Manipulation Methods, Polling Methods, HITL Method), `/api/ask` synchronous HITL protocol (blocking request-response over HTTP), session logging (comms.log, toolcalls.log, apihooks.log, clicalls.log, scripts/generated/*.ps1), shell runner (mcp_env.toml configuration, run_powershell function with timeout handling and QA callback integration) |
| [MMA Orchestration](guide_mma.md) | Ticket/Track/WorkerContext data structures (from `models.py`), DAG engine (TrackDAG class with cycle detection, topological sort, cascade_blocks; ExecutionEngine class with tick-based state machine), ConductorEngine execution loop (run method, _push_state for state broadcast, parse_json_tickets for ingestion), Tier 2 ticket generation (generate_tickets, topological_sort), Tier 3 worker lifecycle (run_worker_lifecycle with Context Amnesia, AST skeleton injection, HITL clutch integration via confirm_spawn and confirm_execution), Tier 4 QA integration (run_tier4_analysis, run_tier4_patch_callback), token firewalling (tier_usage tracking, model escalation), track state persistence (TrackState, save_track_state, load_track_state, get_all_tracks) |
| [Simulations](guide_simulations.md) | Structural Testing Contract (Ban on Arbitrary Core Mocking, `live_gui` Standard, Artifact Isolation), `live_gui` pytest fixture lifecycle (spawning, readiness polling, failure path, teardown, session isolation via reset_ai_client), VerificationLogger for structured diagnostic logging, process cleanup (kill_process_tree for Windows/Unix), Puppeteer pattern (8-stage MMA simulation with mock provider setup, epic planning, track acceptance, ticket loading, status transitions, worker output verification), mock provider strategy (`tests/mock_gemini_cli.py` with JSON-L protocol, input mechanisms, response routing, output protocol), visual verification patterns (DAG integrity, stream telemetry, modal state, performance monitoring), supporting analysis modules (ASTParser with tree-sitter, summarize.py heuristic summaries, outline_tool.py hierarchical outlines) |
---
## GUI Panels
### Context Hub
The primary panel for project and file management.
- **Project Selector**: Switch between `<project>.toml` configurations. Changing projects swaps the active file list, discussion history, and settings.
- **Git Directory**: Path to the repository for commit tracking and git operations.
- **Main Context File**: Optional primary context document for the project.
- **Output Dir**: Directory where generated markdown files are written.
- **Word-Wrap Toggle**: Dynamically swaps text rendering in large read-only panels between unwrapped (code formatting) and wrapped (prose).
- **Summary Only**: When enabled, sends file structure summaries instead of full content to reduce token usage.
- **Auto-Scroll Comms/Tool History**: Automatically scrolls to the bottom when new entries arrive.
### Files & Media Panel
Controls what context is compiled and sent to the AI.
- **Base Dir**: Root directory for path resolution and MCP tool constraints.
- **Paths**: Explicit files or wildcard globs (`src/**/*.py`).
- **File Flags**:
- **Auto-Aggregate**: Include in context compilation.
- **Force Full**: Bypass summary-only mode for this file.
- **Cache Indicator**: Green dot (●) indicates file is in provider's context cache.
### Discussion Hub
Manages conversational branches to prevent context poisoning across tasks.
- **Discussions Sub-Menu**: Create separate timelines for different tasks (e.g., "Refactoring Auth" vs. "Adding API Endpoints").
- **Git Commit Tracking**: "Update Commit" reads HEAD from the project's git directory and stamps the discussion.
- **Entry Management**: Each turn has a Role (User, AI, System, Context, Tool, Vendor API). Toggle between Read/Edit modes, collapse entries, or open in the Global Text Viewer via `[+ Max]`.
- **Auto-Add**: When toggled, Message panel sends and Response panel returns are automatically appended to the current discussion.
- **Truncate History**: Reduces history to N most recent User/AI pairs.
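The truncation rule can be sketched as follows (the entry schema is illustrative; the real one lives in `project_manager.py`):

```python
def truncate_history(entries: list[dict], keep_pairs: int) -> list[dict]:
    """Keep only the last N User/AI pairs.

    Everything before the Nth-from-last User turn is dropped.
    """
    pair_starts = [i for i, e in enumerate(entries) if e["role"] == "User"]
    if len(pair_starts) <= keep_pairs:
        return entries
    cut = pair_starts[-keep_pairs]
    return entries[cut:]

history = [
    {"role": "User", "text": "q1"}, {"role": "AI", "text": "a1"},
    {"role": "User", "text": "q2"}, {"role": "AI", "text": "a2"},
    {"role": "User", "text": "q3"}, {"role": "AI", "text": "a3"},
]
print(truncate_history(history, 2))  # keeps q2/a2 and q3/a3
```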
### AI Settings Panel
- **Provider**: Switch between API backends (Gemini, Anthropic, DeepSeek, Gemini CLI, MiniMax).
- **Model**: Select from available models for the current provider.
- **Fetch Models**: Queries the active provider for the latest model list.
- **Temperature / Max Tokens**: Generation parameters.
- **History Truncation Limit**: Character limit for truncating old tool outputs.
### Token Budget Panel
- **Current Usage**: Real-time token counts (input, output, cache read, cache creation).
- **Budget Percentage**: Visual indicator of context window utilization.
- **Provider-Specific Limits**: Anthropic (180K prompt), Gemini (900K input).
### Cache Panel
- **Gemini Cache Stats**: Count, total size, and list of cached files.
- **Clear Cache**: Forces cache invalidation on next send.
### Tool Analytics Panel
- **Per-Tool Statistics**: Call count, total time, failure count for each tool.
- **Session Insights**: Burn rate estimation, average latency.
### Message & Response Panels
- **Message**: User input field with auto-expanding height.
- **Gen + Send**: Compiles markdown context and dispatches to the AI via `AsyncEventQueue`.
- **MD Only**: Dry-runs the compiler for context inspection without API cost.
- **Response**: Read-only output; flashes green on new response.
### Operations Hub
- **Last Script Output**: Pops up (flashing blue) whenever the AI executes a script. Shows both the executed script and stdout/stderr. `[+ Maximize]` reads from stored instance variables, not DPG widget tags, so it works regardless of word-wrap state.
- **Text Viewer**: Large resizable popup invoked by `[+]` / `[+ Maximize]` buttons. For deep-reading long logs, discussion entries, or script bodies.
- **Confirm Dialog**: The `[+ Maximize]` button in the script approval modal passes script text as `user_data` at button-creation time — safe to click even after the dialog is dismissed.
### Tool Calls & Comms History
- **Focus Agent Filter**: Show comms/tool history for specific tier (All, Tier 2, Tier 3, Tier 4).
- **Comms History**: Real-time display of raw API traffic (timestamp, direction, kind, provider, model, payload preview).
- **Tool Calls**: Sequential log of tool invocations with script/args and result preview.
### MMA Dashboard
The 4-tier orchestration control center.
- **Track Browser**: List of all tracks with status, progress, and actions (Load, Delete).
- **Active Track Summary**: Color-coded progress bar, ticket status breakdown (Completed, In Progress, Blocked, Todo), ETA estimation.
- **Visual Task DAG**: Node-based visualization using `imgui-node-editor` with color-coded states (Ready, Running, Blocked, Done).
- **Ticket Queue Management**: Bulk operations (Execute, Skip, Block), drag-and-drop reordering, priority assignment.
- **Tier Streams**: Real-time output from Tier 1/2/3/4 agents.
### Tier Stream Panels
Dedicated windows for each MMA tier:
- **Tier 1: Strategy**: Orchestrator output for epic planning and track initialization.
- **Tier 2: Tech Lead**: Architectural decisions and ticket generation.
- **Tier 3: Workers**: Individual worker output streams (one per active ticket).
- **Tier 4: QA**: Error analysis and diagnostic summaries.
### System Prompts
Two text inputs for instruction overrides:
1. **Global**: Applied across every project.
2. **Project**: Specific to the active workspace.
Both are concatenated onto the base tool-usage guidelines.
### Log Management
- **Session Registry**: Table of all session logs with metadata (start time, message count, size, whitelist status).
- **Star/Unstar**: Mark sessions for preservation during pruning.
- **Force Prune**: Manually trigger aggressive log cleanup.
### Diagnostics Panel
- **Performance Telemetry**: FPS, Frame Time, CPU %, Input Lag with moving averages.
- **Detailed Component Timings**: Per-panel rendering times with threshold alerts.
- **Performance Graphs**: Historical plots for selected metrics.
---
## Configuration Files
### config.toml (Global)
```toml
[ai]
provider = "gemini"
model = "gemini-2.5-flash-lite"
temperature = 0.0
max_tokens = 8192
history_trunc_limit = 8000
system_prompt = ""
[projects]
active = "path/to/project.toml"
paths = ["path/to/project.toml"]
[gui]
separate_message_panel = false
separate_response_panel = false
separate_tool_calls_panel = false
show_windows = { "Context Hub" = true, ... }
[paths]
logs_dir = "logs/sessions"
scripts_dir = "scripts/generated"
conductor_dir = "conductor"
[mma]
max_workers = 4
```
### <project>.toml (Per-Project)
```toml
[project]
name = "my_project"
git_dir = "./my_repo"
system_prompt = ""
main_context = ""
[files]
base_dir = "."
paths = ["src/**/*.py"]
tier_assignments = { "src/core.py" = 1 }
[screenshots]
base_dir = "."
paths = []
[output]
output_dir = "./md_gen"
[gemini_cli]
binary_path = "gemini"
[deepseek]
reasoning_effort = "medium"
[agent.tools]
run_powershell = true
read_file = true
list_directory = true
search_files = true
get_file_summary = true
web_search = true
fetch_url = true
py_get_skeleton = true
py_get_code_outline = true
get_file_slice = true
set_file_slice = false
edit_file = false
py_get_definition = true
py_update_definition = false
py_get_signature = true
py_set_signature = false
py_get_class_summary = true
py_get_var_declaration = true
py_set_var_declaration = false
get_git_diff = true
py_find_usages = true
py_get_imports = true
py_check_syntax = true
py_get_hierarchy = true
py_get_docstring = true
get_tree = true
get_ui_performance = true
[mma]
epic = ""
active_track_id = ""
tracks = []
```
### credentials.toml
```toml
[gemini]
api_key = "YOUR_KEY"
[anthropic]
api_key = "YOUR_KEY"
[deepseek]
api_key = "YOUR_KEY"
[minimax]
api_key = "YOUR_KEY"
```
### mcp_env.toml (Optional)
```toml
[path]
prepend = ["C:/custom/bin"]
[env]
MY_VAR = "some_value"
EXPANDED = "${HOME}/subdir"
```
---
## Environment Variables
| Variable | Purpose |
|---|---|
| `SLOP_CONFIG` | Override path to `config.toml` |
| `SLOP_CREDENTIALS` | Override path to `credentials.toml` |
| `SLOP_MCP_ENV` | Override path to `mcp_env.toml` |
| `SLOP_TEST_HOOKS` | Set to `"1"` to enable test hooks |
| `SLOP_LOGS_DIR` | Override logs directory |
| `SLOP_SCRIPTS_DIR` | Override generated scripts directory |
| `SLOP_CONDUCTOR_DIR` | Override conductor directory |
| `GEMINI_CLI_HOOK_CONTEXT` | Set by bridge scripts to bypass HITL for sub-agents |
| `CLAUDE_CLI_HOOK_CONTEXT` | Set by bridge scripts to bypass HITL for sub-agents |
---
## Exit Codes
| Code | Meaning |
|---|---|
| 0 | Normal exit |
| 1 | General error |
| 2 | Configuration error |
| 3 | API error |
| 4 | Test failure |
---
## File Layout
```
manual_slop/
├── conductor/ # Conductor system
│ ├── tracks/ # Track directories
│ │ └── <track_id>/ # Per-track files
│ │ ├── spec.md
│ │ ├── plan.md
│ │ ├── metadata.json
│ │ └── state.toml
│ ├── archive/ # Completed tracks
│ ├── product.md # Product definition
│ ├── product-guidelines.md
│ ├── tech-stack.md
│ └── workflow.md
├── docs/ # Deep-dive documentation
│ ├── guide_architecture.md
│ ├── guide_meta_boundary.md
│ ├── guide_mma.md
│ ├── guide_simulations.md
│ └── guide_tools.md
├── logs/ # Runtime logs
│ ├── sessions/ # Session logs
│ │ └── <session_id>/ # Per-session files
│ │ ├── comms.log
│ │ ├── toolcalls.log
│ │ ├── apihooks.log
│ │ └── clicalls.log
│ ├── agents/ # Sub-agent logs
│ ├── errors/ # Error logs
│ └── test/ # Test logs
├── scripts/ # Utility scripts
│ ├── generated/ # AI-generated scripts
│ └── *.py # Build/execution scripts
├── src/ # Core implementation
│ ├── gui_2.py # Primary ImGui interface
│ ├── app_controller.py # Headless controller
│ ├── ai_client.py # Multi-provider LLM abstraction
│ ├── mcp_client.py # 26 MCP tools
│ ├── api_hooks.py # HookServer REST API
│ ├── api_hook_client.py # Hook API client
│ ├── multi_agent_conductor.py # ConductorEngine
│ ├── conductor_tech_lead.py # Tier 2 ticket generation
│ ├── dag_engine.py # TrackDAG + ExecutionEngine
│ ├── models.py # Ticket, Track, WorkerContext
│ ├── events.py # EventEmitter, SyncEventQueue
│ ├── project_manager.py # TOML persistence
│ ├── session_logger.py # JSON-L logging
│ ├── shell_runner.py # PowerShell execution
│ ├── file_cache.py # ASTParser (tree-sitter)
│ ├── summarize.py # Heuristic summaries
│ ├── outline_tool.py # Code outlining
│ ├── performance_monitor.py # FPS/CPU tracking
│ ├── log_registry.py # Session metadata
│ ├── log_pruner.py # Log cleanup
│ ├── paths.py # Path resolution
│ ├── cost_tracker.py # Token cost estimation
│ ├── gemini_cli_adapter.py # CLI subprocess adapter
│ ├── mma_prompts.py # Tier system prompts
│ └── theme*.py # UI theming
├── simulation/ # Test simulations
│ ├── sim_base.py # BaseSimulation class
│ ├── workflow_sim.py # WorkflowSimulator
│ ├── user_agent.py # UserSimAgent
│ └── sim_*.py # Specific simulations
├── tests/ # Test suite
│ ├── conftest.py # Fixtures (live_gui)
│ ├── artifacts/ # Test outputs
│ └── test_*.py # Test files
├── sloppy.py # Main entry point
├── config.toml # Global configuration
└── credentials.toml # API keys
```
---
# Architecture
[Top](../README.md) | [Tools & IPC](guide_tools.md) | [MMA Orchestration](guide_mma.md) | [Simulations](guide_simulations.md)
---
## Philosophy: The Decoupled State Machine
Manual Slop solves a single tension: **AI reasoning is high-latency and non-deterministic; GUI interaction must be low-latency and responsive.** The engine enforces strict decoupling between four thread domains so that multi-second LLM calls never block the render loop, and every AI-generated payload passes through a human-auditable gate before execution.
The architectural philosophy follows data-oriented design principles:
- The GUI (`gui_2.py`, `app_controller.py`) remains a pure visualization of application state
- State mutations occur only through lock-guarded queues consumed on the main render thread
- Background threads never write GUI state directly — they serialize task dicts for later consumption
- All cross-thread communication uses explicit synchronization primitives (Locks, Conditions, Events)
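A minimal sketch of the lock-guarded queue pattern described above (names are illustrative, not the actual `gui_2.py` attributes):

```python
import threading

class PendingTasks:
    """Lock-guarded task list: background producers, single main-thread consumer."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []

    def push(self, task):
        # Background threads serialize a task dict; they never touch GUI state.
        with self._lock:
            self._pending.append(task)

    def drain(self):
        # Main render thread: snapshot and clear under the lock, then
        # process the snapshot outside it (copy-and-clear pattern).
        with self._lock:
            tasks, self._pending = self._pending, []
        return tasks

q = PendingTasks()
worker = threading.Thread(target=lambda: q.push({"kind": "set_status", "text": "done"}))
worker.start()
worker.join()
print(q.drain())  # [{'kind': 'set_status', 'text': 'done'}]
```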
Four distinct thread domains operate concurrently:
| Domain | Created By | Purpose | Lifecycle | Key Synchronization Primitives |
|---|---|---|---|---|
| **Main / GUI** | `immapp.run()` | Dear ImGui retained-mode render loop; sole writer of GUI state | App lifetime | None (consumer of queues) |
| **Asyncio Worker** | `App.__init__` via `threading.Thread(daemon=True)` | Event queue processing, AI client calls | Daemon (dies with process) | `AsyncEventQueue`, `threading.Lock` |
| **HookServer** | `api_hooks.HookServer.start()` | HTTP API on `:8999` for external automation and IPC | Daemon thread | `threading.Lock`, `threading.Event` |
| **Ad-hoc** | Transient `threading.Thread` calls | Model-fetching, legacy send paths, log pruning | Short-lived | Task-specific locks |
The asyncio worker is **not** the main thread's event loop. It runs a dedicated `asyncio.new_event_loop()` on its own daemon thread:
```python
# AppController.__init__:
self._loop = asyncio.new_event_loop()
self._loop_thread = threading.Thread(target=self._run_event_loop, daemon=True)
self._loop_thread.start()
```
The GUI thread uses `asyncio.run_coroutine_threadsafe(coro, self._loop)` to push work into this loop.
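A self-contained sketch of that handoff, simplified from the actual `AppController` wiring:

```python
import asyncio
import threading

# Dedicated worker loop on its own daemon thread, as in AppController.__init__.
loop = asyncio.new_event_loop()
t = threading.Thread(target=loop.run_forever, daemon=True)
t.start()

async def slow_call():
    await asyncio.sleep(0.01)  # stand-in for a multi-second AI call
    return "done"

# From the GUI thread: schedule onto the worker loop; result() blocks only
# the caller, never the worker loop itself.
future = asyncio.run_coroutine_threadsafe(slow_call(), loop)
print(future.result(timeout=2))  # done
loop.call_soon_threadsafe(loop.stop)
```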
### Thread-Local Context Isolation
For concurrent multi-agent execution, the application uses `threading.local()` to manage per-thread context:
```python
# ai_client.py
_local_storage = threading.local()
def get_current_tier() -> Optional[str]:
"""Returns the current tier from thread-local storage."""
return getattr(_local_storage, "current_tier", None)
def set_current_tier(tier: Optional[str]) -> None:
"""Sets the current tier in thread-local storage."""
_local_storage.current_tier = tier
```
This ensures that comms log entries and tool calls are correctly tagged with their source tier even when multiple workers execute concurrently.
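A runnable illustration of the isolation guarantee (hypothetical worker code, not the actual MMA worker):

```python
import threading

_local = threading.local()

def set_current_tier(tier):
    _local.current_tier = tier

def get_current_tier():
    return getattr(_local, "current_tier", None)

results = {}

def worker(tier):
    set_current_tier(tier)               # visible only to this thread
    results[tier] = get_current_tier()

threads = [threading.Thread(target=worker, args=(f"Tier {i}",)) for i in (3, 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Each worker saw its own tier; the main thread never set one.
print(results, get_current_tier())
```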
---
## Cross-Thread Data Structures
@@ -553,12 +578,247 @@ Every interaction is designed to be auditable:
- **CLI Call Logs**: Subprocess execution details (command, stdin, stdout, stderr, latency) to `clicalls.log` as JSON-L.
- **Performance Monitor**: Real-time FPS, Frame Time, CPU, Input Lag tracked and queryable via Hook API.
### Telemetry Data Structures
```python
# Comms log entry (JSON-L)
{
"ts": "14:32:05",
"direction": "OUT",
"kind": "tool_call",
"provider": "gemini",
"model": "gemini-2.5-flash-lite",
"payload": {
"name": "run_powershell",
"id": "call_abc123",
"script": "Get-ChildItem"
},
"source_tier": "Tier 3",
"local_ts": 1709875925.123
}
# Performance metrics (via get_metrics())
{
"fps": 60.0,
"fps_avg": 58.5,
"last_frame_time_ms": 16.67,
"frame_time_ms_avg": 17.1,
"cpu_percent": 12.5,
"cpu_percent_avg": 15.2,
"input_lag_ms": 2.3,
"input_lag_ms_avg": 3.1,
"time_render_mma_dashboard_ms": 5.2,
"time_render_mma_dashboard_ms_avg": 4.8
}
```
---
## MMA Engine Architecture
### WorkerPool: Concurrent Worker Management
The `WorkerPool` class in `multi_agent_conductor.py` manages a bounded pool of worker threads:
```python
class WorkerPool:
def __init__(self, max_workers: int = 4):
self.max_workers = max_workers
self._active: dict[str, threading.Thread] = {}
self._lock = threading.Lock()
self._semaphore = threading.Semaphore(max_workers)
def spawn(self, ticket_id: str, target: Callable, args: tuple) -> Optional[threading.Thread]:
with self._lock:
if len(self._active) >= self.max_workers:
return None
def wrapper(*a, **kw):
try:
with self._semaphore:
target(*a, **kw)
finally:
with self._lock:
self._active.pop(ticket_id, None)
t = threading.Thread(target=wrapper, args=args, daemon=True)
with self._lock:
self._active[ticket_id] = t
t.start()
return t
```
**Key behaviors**:
- **Bounded concurrency**: `max_workers` (default 4) limits parallel ticket execution
- **Semaphore gating**: Ensures no more than `max_workers` can execute simultaneously
- **Automatic cleanup**: Thread removes itself from `_active` dict on completion
- **Non-blocking spawn**: Returns `None` if pool is full, allowing the engine to defer
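A deferral loop against a stripped-down stand-in for the pool might look like this (illustrative only; the real engine defers inside its tick loop):

```python
import threading
import time

class TinyPool:
    """Stand-in for WorkerPool: just enough to demonstrate spawn-or-defer."""

    def __init__(self, max_workers=2):
        self.max_workers = max_workers
        self._active = {}
        self._lock = threading.Lock()

    def spawn(self, key, target):
        with self._lock:
            if len(self._active) >= self.max_workers:
                return None          # pool full: caller defers
        def wrapper():
            try:
                target()
            finally:
                with self._lock:     # automatic cleanup on completion
                    self._active.pop(key, None)
        t = threading.Thread(target=wrapper, daemon=True)
        with self._lock:
            self._active[key] = t
        t.start()
        return t

pool = TinyPool(max_workers=2)
done = []
ready = [f"T-{i}" for i in range(5)]
while ready:
    ticket = ready[0]
    if pool.spawn(ticket, lambda t=ticket: done.append(t)) is None:
        time.sleep(0.01)             # pool full — defer and retry next tick
        continue
    ready.pop(0)
while len(done) < 5:
    time.sleep(0.01)
```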
### ConductorEngine: Orchestration Loop
The `ConductorEngine` orchestrates ticket execution within a track:
```python
class ConductorEngine:
def __init__(self, track: Track, event_queue: Optional[SyncEventQueue] = None,
auto_queue: bool = False) -> None:
self.track = track
self.event_queue = event_queue
self.dag = TrackDAG(self.track.tickets)
self.engine = ExecutionEngine(self.dag, auto_queue=auto_queue)
self.pool = WorkerPool(max_workers=4)
self._abort_events: dict[str, threading.Event] = {}
self._pause_event = threading.Event()
self._tier_usage_lock = threading.Lock()
self.tier_usage = {
"Tier 1": {"input": 0, "output": 0, "model": "gemini-3.1-pro-preview"},
"Tier 2": {"input": 0, "output": 0, "model": "gemini-3-flash-preview"},
"Tier 3": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"},
"Tier 4": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"},
}
```
**Main execution loop** (`run` method):
1. **Pause check**: If `_pause_event` is set, sleep and broadcast "paused" status
2. **DAG tick**: Call `engine.tick()` to get ready tasks
3. **Completion check**: If no ready tasks and all completed, break with "done" status
4. **Wait for workers**: If tasks in-progress or pool active, sleep and continue
5. **Blockage detection**: If no ready, no in-progress, and not all done, break with "blocked" status
6. **Spawn workers**: For each ready task, spawn a worker via `pool.spawn()`
7. **Model escalation**: Workers use `models_list[min(retry_count, 2)]` for capability upgrade on retries
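Condensed into code, with trivial fake engine/pool stand-ins (the real loop also tracks in-progress tasks, abort events, and escalation):

```python
import threading
import time

class FakeEngine:
    def __init__(self, tasks):
        self.todo = list(tasks)
    def tick(self):
        return [self.todo.pop(0)] if self.todo else []
    def all_completed(self):
        return not self.todo

class FakePool:
    def active(self):
        return False
    def spawn(self, task):
        pass  # the real pool runs a worker thread per ticket

states = []

def run(engine, pool, pause_event):
    while True:
        if pause_event.is_set():                   # 1. pause check
            states.append("paused")
            time.sleep(0.05)
            continue
        ready = engine.tick()                      # 2. DAG tick
        if not ready and engine.all_completed():   # 3. completion check
            states.append("done")
            break
        if not ready and pool.active():            # 4. wait for workers
            time.sleep(0.05)
            continue
        if not ready:                              # 5. blockage detection
            states.append("blocked")
            break
        for task in ready:                         # 6. spawn workers
            pool.spawn(task)

run(FakeEngine(["a", "b"]), FakePool(), threading.Event())
```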
### Abort Event Propagation
Each ticket has an associated `threading.Event` for abort signaling:
```python
# Before spawning worker
self._abort_events[ticket.id] = threading.Event()
# Worker checks abort at three points:
# 1. Before major work
if abort_event.is_set():
ticket.status = "killed"
return "ABORTED"
# 2. Before tool execution (in clutch_callback)
if abort_event.is_set():
return False # Reject tool
# 3. After blocking send() returns
if abort_event.is_set():
ticket.status = "killed"
return "ABORTED"
```
---
## Architectural Invariants
1. **Single-writer principle**: All GUI state mutations happen on the main thread via `_process_pending_gui_tasks`. Background threads never write GUI state directly.
2. **Copy-and-clear lock pattern**: `_process_pending_gui_tasks` snapshots and clears the task list under the lock, then processes outside the lock.
3. **Context Amnesia**: Each MMA Tier 3 Worker starts with `ai_client.reset_session()`. No conversational bleed between tickets.
4. **Send serialization**: `_send_lock` ensures only one provider call is in-flight at a time across all threads.
5. **Dual-Flush persistence**: On exit, state is committed to both project-level and global-level config files.
6. **No cross-thread GUI mutation**: Background threads must push tasks to `_pending_gui_tasks` rather than calling GUI methods directly.
7. **Abort-before-execution**: Workers check abort events before major work phases, enabling clean cancellation.
8. **Bounded worker pool**: `WorkerPool` enforces `max_workers` limit to prevent resource exhaustion.
---
## Error Classification & Recovery
### ProviderError Taxonomy
The `ProviderError` class provides structured error classification:
```python
class ProviderError(Exception):
def __init__(self, kind: str, provider: str, original: Exception):
self.kind = kind # "quota" | "rate_limit" | "auth" | "balance" | "network" | "unknown"
self.provider = provider
self.original = original
def ui_message(self) -> str:
labels = {
"quota": "QUOTA EXHAUSTED",
"rate_limit": "RATE LIMITED",
"auth": "AUTH / API KEY ERROR",
"balance": "BALANCE / BILLING ERROR",
"network": "NETWORK / CONNECTION ERROR",
"unknown": "API ERROR",
}
return f"[{self.provider.upper()} {labels.get(self.kind, 'API ERROR')}]\n\n{self.original}"
```
### Error Recovery Patterns
| Error Kind | Recovery Strategy |
|---|---|
| `quota` | Display in UI, await user intervention |
| `rate_limit` | Exponential backoff (not yet implemented) |
| `auth` | Prompt for credential verification |
| `balance` | Display billing alert |
| `network` | Auto-retry with timeout |
| `unknown` | Log full traceback, display in UI |
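A rough sketch of how an exception might be mapped to a `kind` (the real classifier presumably inspects provider SDK exception types and HTTP status codes, not just message text):

```python
def classify(exc: Exception) -> str:
    """Heuristic message-based classification into ProviderError kinds."""
    msg = str(exc).lower()
    if "quota" in msg or "resource_exhausted" in msg:
        return "quota"
    if "429" in msg or "rate" in msg:
        return "rate_limit"
    if "401" in msg or "api key" in msg or "unauthorized" in msg:
        return "auth"
    if "billing" in msg or "balance" in msg:
        return "balance"
    if "timeout" in msg or "connection" in msg:
        return "network"
    return "unknown"
```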
---
## Memory Management
### History Trimming Strategies
**Gemini (40% threshold)**:
```python
if total_in > _GEMINI_MAX_INPUT_TOKENS * 0.4:
while len(hist) > 4 and total_in > _GEMINI_MAX_INPUT_TOKENS * 0.3:
# Drop oldest message pairs
hist.pop(0) # Assistant
hist.pop(0) # User
```
**Anthropic (180K limit)**:
```python
def _trim_anthropic_history(system_blocks, history):
est = _estimate_prompt_tokens(system_blocks, history)
while len(history) > 3 and est > _ANTHROPIC_MAX_PROMPT_TOKENS:
# Drop turn pairs, preserving tool_result chains
...
```
### Tool Output Budget
```python
_MAX_TOOL_OUTPUT_BYTES: int = 500_000 # 500KB cumulative
if _cumulative_tool_bytes > _MAX_TOOL_OUTPUT_BYTES:
# Inject warning, force final answer
parts.append("SYSTEM WARNING: Cumulative tool output exceeded 500KB budget.")
```
### AST Cache (file_cache.py)
```python
_ast_cache: Dict[str, Tuple[float, tree_sitter.Tree]] = {}
def get_cached_tree(self, path: Optional[str], code: str) -> tree_sitter.Tree:
p = Path(path) if path else None
mtime = p.stat().st_mtime if p and p.exists() else 0.0
if path in _ast_cache:
cached_mtime, tree = _ast_cache[path]
if cached_mtime == mtime:
return tree
# Parse and cache with simple LRU (max 10 entries)
if len(_ast_cache) >= 10:
del _ast_cache[next(iter(_ast_cache))]
tree = self.parse(code)
_ast_cache[path] = (mtime, tree)
return tree
```
---
---
## WorkerPool (`multi_agent_conductor.py`)
Bounded concurrent worker pool with semaphore gating.
```python
class WorkerPool:
def __init__(self, max_workers: int = 4):
self.max_workers = max_workers
self._active: dict[str, threading.Thread] = {}
self._lock = threading.Lock()
self._semaphore = threading.Semaphore(max_workers)
```
**Key Methods:**
- `spawn(ticket_id, target, args)` — Spawns a worker thread if pool has capacity. Returns `None` if full.
- `join_all(timeout)` — Waits for all active workers to complete.
- `get_active_count()` — Returns current number of active workers.
- `is_full()` — Returns `True` if at capacity.
**Thread Safety:** All state mutations are protected by `_lock`. The semaphore ensures at most `max_workers` threads execute concurrently.
**Configuration:** `max_workers` is loaded from `[mma].max_workers` in `config.toml` (default: 4).
---
## ConductorEngine (`multi_agent_conductor.py`)
The Tier 2 orchestrator. Owns the execution loop that drives tickets through the DAG.
```python
class ConductorEngine:
self.track = track
self.event_queue = event_queue
self.tier_usage = {
"Tier 1": {"input": 0, "output": 0, "model": "gemini-3.1-pro-preview"},
"Tier 2": {"input": 0, "output": 0, "model": "gemini-3-flash-preview"},
"Tier 3": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"},
"Tier 4": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"},
}
self.dag = TrackDAG(self.track.tickets)
self.engine = ExecutionEngine(self.dag, auto_queue=auto_queue)
self.pool = WorkerPool(max_workers=max_workers)
self._abort_events: dict[str, threading.Event] = {}
self._pause_event: threading.Event = threading.Event()
```
### State Broadcast (`_push_state`)
Each tier operates within its own token budget.
---
## Abort Event Propagation
Workers can be killed mid-execution via abort events:
```python
# In ConductorEngine.__init__:
self._abort_events: dict[str, threading.Event] = {}
# When spawning a worker:
self._abort_events[ticket.id] = threading.Event()
# To kill a worker:
def kill_worker(self, ticket_id: str) -> None:
if ticket_id in self._abort_events:
self._abort_events[ticket_id].set() # Signal abort
thread = self._active_workers.get(ticket_id)
if thread:
thread.join(timeout=1.0) # Wait for graceful shutdown
```
**Abort Check Points in `run_worker_lifecycle`:**
1. **Before major work** — checked immediately after `ai_client.reset_session()`
2. **During clutch_callback** — checked before each tool execution
3. **After blocking send()** — checked after AI call returns
When abort is detected, the ticket status is set to `"killed"` and the worker exits immediately.
---
## Pause/Resume Control
The engine supports pausing the entire orchestration pipeline:
```python
def pause(self) -> None:
self._pause_event.set()
def resume(self) -> None:
self._pause_event.clear()
```
In the main `run()` loop:
```python
while True:
if self._pause_event.is_set():
self._push_state(status="paused", active_tier="Paused")
time.sleep(0.5)
continue
# ... normal execution
```
This allows the user to pause execution without killing workers.
---
## Model Escalation
Workers automatically escalate to more capable models on retry:
```python
models_list = [
"gemini-2.5-flash-lite", # First attempt
"gemini-2.5-flash", # Second attempt
"gemini-3.1-pro-preview" # Third+ attempt
]
model_idx = min(ticket.retry_count, len(models_list) - 1)
model_name = models_list[model_idx]
```
The `ticket.model_override` field can bypass this logic with a specific model.
---
## Track State Persistence
Track state can be persisted to disk via `project_manager.py`:
---
```python
class ASTParser:
self.parser = tree_sitter.Parser(self.language)
def parse(self, code: str) -> tree_sitter.Tree
def get_skeleton(self, code: str, path: str = "") -> str
def get_curated_view(self, code: str, path: str = "") -> str
def get_targeted_view(self, code: str, symbols: List[str], path: str = "") -> str
```
**`get_curated_view` algorithm:**
Enhanced skeleton that preserves bodies under two conditions:
If either condition is true, the body is preserved verbatim. This enables a two-tier code view: hot paths shown in full, boilerplate compressed.
**`get_targeted_view` algorithm:**
Extracts only the specified symbols and their dependencies:
1. Find all requested symbol definitions (classes, functions, methods).
2. For each symbol, traverse its body to find referenced names.
3. Include only the definitions that are directly referenced.
4. Used for surgical context injection when `target_symbols` is specified on a Ticket.
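As a toy illustration of step 1 using the stdlib `ast` module (the real implementation walks a tree-sitter tree and also pulls in referenced definitions, which this sketch omits):

```python
import ast

def targeted_view(code: str, symbols: list[str]) -> str:
    """Return source for only the requested top-level definitions."""
    tree = ast.parse(code)
    keep = [
        node for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and node.name in symbols
    ]
    return "\n\n".join(ast.get_source_segment(code, node) for node in keep)

src = "def a():\n    return 1\n\ndef b():\n    return 2\n"
print(targeted_view(src, ["b"]))  # only def b() survives
```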
### `summarize.py` — Heuristic File Summaries
Token-efficient structural descriptions without AI calls:
---
The `_get_symbol_node` helper supports dot notation (`ClassName.method_name`).
---
## Parallel Tool Execution
Tools can be executed concurrently via `async_dispatch`:
```python
async def async_dispatch(tool_name: str, tool_input: dict[str, Any]) -> str:
"""Dispatch an MCP tool call asynchronously."""
return await asyncio.to_thread(dispatch, tool_name, tool_input)
```
In `ai_client.py`, multiple tool calls within a single AI turn are executed in parallel:
```python
async def _execute_tool_calls_concurrently(calls, base_dir, ...):
tasks = []
for fc in calls:
tasks.append(_execute_single_tool_call_async(name, args, ...))
results = await asyncio.gather(*tasks)
return results
```
This significantly reduces latency when the AI makes multiple independent file reads in a single turn.
**Thread Safety Note:** The `configure()` function resets global state. In concurrent environments, ensure configuration is complete before dispatching tools.
---
## The Hook API: Remote Control & Telemetry
Manual Slop exposes a REST-based IPC interface on `127.0.0.1:8999` using Python's `ThreadingHTTPServer`. Each incoming request gets its own thread.
---
## Parallel Tool Execution
Tool calls are executed concurrently within a single AI turn using `asyncio.gather`. This significantly reduces latency when multiple independent tools need to be called.
### `async_dispatch` Implementation
```python
async def async_dispatch(tool_name: str, tool_input: dict[str, Any]) -> str:
"""
Dispatch an MCP tool call by name asynchronously.
Returns the result as a string.
"""
# Run blocking I/O bound tools in a thread to allow parallel execution
return await asyncio.to_thread(dispatch, tool_name, tool_input)
```
All tools are wrapped in `asyncio.to_thread()` to prevent blocking the event loop. This enables `ai_client.py` to execute multiple tools via `asyncio.gather()`:
```python
results = await asyncio.gather(
async_dispatch("read_file", {"path": "src/module_a.py"}),
async_dispatch("read_file", {"path": "src/module_b.py"}),
async_dispatch("get_file_summary", {"path": "src/module_c.py"}),
)
```
### Concurrency Benefits
| Scenario | Sequential | Parallel |
|----------|------------|----------|
| 3 file reads (100ms each) | 300ms | ~100ms |
| 5 file reads + 1 web fetch (200ms each) | 1200ms | ~200ms |
| Mixed I/O operations | Sum of all | Max of all |
The parallel execution model is particularly effective for:
- Reading multiple source files simultaneously
- Fetching URLs while performing local file operations
- Running syntax checks across multiple files
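The latency claim is easy to verify with `asyncio.to_thread` stand-ins (timings are approximate and machine-dependent):

```python
import asyncio
import time

def blocking_read(i):
    time.sleep(0.1)            # stand-in for a 100 ms file read
    return f"file-{i}"

async def main():
    start = time.perf_counter()
    # Three blocking reads run on threads concurrently, as in async_dispatch.
    results = await asyncio.gather(
        *(asyncio.to_thread(blocking_read, i) for i in range(3))
    )
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
# Three 100 ms reads overlap: wall time is roughly 0.1 s, not 0.3 s.
```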
---
## Synthetic Context Refresh
To minimize token churn and redundant `read_file` calls, the `ai_client` performs a post-tool-execution context refresh. See [guide_architecture.md](guide_architecture.md#context-refresh-mechanism) for the full algorithm.
---
"""
Base Simulation Framework - Abstract base class for GUI automation tests.
This module provides the foundation for all simulation-based tests in the
Manual Slop test suite. Simulations act as external "puppeteers" that drive
the GUI through the ApiHookClient HTTP interface.
Architecture:
- BaseSimulation: Abstract base class with setup/teardown lifecycle
- WorkflowSimulator: High-level workflow operations (project setup, file mgmt)
- ApiHookClient: Low-level HTTP client for Hook API communication
Typical Usage:
class MySimulation(BaseSimulation):
def run(self) -> None:
self.client.set_value('mma_epic_input', 'My epic description')
self.client.click('btn_mma_plan_epic')
# Poll for completion...
status = self.client.get_mma_status()
assert status['mma_status'] == 'done'
if __name__ == '__main__':
run_sim(MySimulation)
Lifecycle:
1. setup() - Connects to GUI, resets session, scaffolds temp project
2. run() - Implemented by subclass with simulation logic
3. teardown() - Cleanup (optional file retention for debugging)
Prerequisites:
- GUI must be running with --enable-test-hooks flag
- HookServer must be listening on http://127.0.0.1:8999
Thread Safety:
- Simulations are designed to run in the main thread
- ApiHookClient handles its own connection pooling
See Also:
- simulation/workflow_sim.py for WorkflowSimulator
- tests/conftest.py for live_gui pytest fixture
- docs/guide_simulations.md for full simulation documentation
"""
import sys
import os
import time
---
"""
Workflow Simulator - High-level GUI workflow automation for testing.
This module provides the WorkflowSimulator class which orchestrates complex
multi-step workflows through the GUI via the ApiHookClient. It is designed
for integration testing and automated verification of GUI behavior.
Key Capabilities:
- Project setup and configuration
- Discussion creation and switching
- AI turn execution with stall detection
- Context file management
- MMA (Multi-Model Agent) orchestration simulation
Stall Detection:
The run_discussion_turn() method implements intelligent stall detection:
- Monitors ai_status for transitions from busy -> idle
- Detects stalled Tool results (non-busy state with Tool as last role)
- Automatically triggers btn_gen_send to recover from stalls
Integration with UserSimAgent:
WorkflowSimulator delegates user simulation behavior (reading time, delays)
to UserSimAgent for realistic interaction patterns.
Thread Safety:
This class is NOT thread-safe. All methods should be called from a single
thread (typically the main test thread).
Example Usage:
client = ApiHookClient()
sim = WorkflowSimulator(client)
sim.setup_new_project("TestProject", "/path/to/git/dir")
sim.create_discussion("Feature A")
result = sim.run_discussion_turn("Please implement feature A")
See Also:
- simulation/sim_base.py for BaseSimulation class
- simulation/user_agent.py for UserSimAgent
- api_hook_client.py for ApiHookClient
- docs/guide_simulations.md for full simulation documentation
"""
import time
from api_hook_client import ApiHookClient
from simulation.user_agent import UserSimAgent
---
"""
DAG Engine - Directed Acyclic Graph execution for MMA ticket orchestration.
This module provides the core graph data structures and state machine logic
for executing implementation tickets in dependency order within the MMA
(Multi-Model Agent) system.
Key Classes:
- TrackDAG: Graph representation with cycle detection, topological sorting,
and transitive blocking propagation.
- ExecutionEngine: Tick-based state machine that evaluates the DAG and
manages task status transitions.
Architecture Integration:
- TrackDAG is constructed from a list of Ticket objects (from models.py)
- ExecutionEngine is consumed by ConductorEngine (multi_agent_conductor.py)
- The tick() method is called in the main orchestration loop to determine
which tasks are ready for execution
Thread Safety:
- This module is NOT thread-safe. Callers must synchronize access if used
from multiple threads (e.g., the ConductorEngine's async loop).
See Also:
- docs/guide_mma.md for the full MMA orchestration documentation
- src/models.py for Ticket and Track data structures
- src/multi_agent_conductor.py for ConductorEngine integration
"""
from typing import List
from src.models import Ticket
---
"""
Events - Decoupled event emission and queuing for cross-thread communication.
This module provides three complementary patterns for thread-safe communication
between the GUI main thread and background workers:
1. EventEmitter: Pub/sub pattern for synchronous event broadcast
- Used for: API lifecycle events (request_start, response_received, tool_execution)
- Thread-safe: Callbacks execute on emitter's thread
- Example: ai_client.py emits 'request_start' and 'response_received' events
2. SyncEventQueue: Producer-consumer pattern via queue.Queue
- Used for: Decoupled task submission where consumer polls at its own pace
- Thread-safe: Built on Python's thread-safe queue.Queue
- Example: Background workers submit tasks, main thread drains queue
3. UserRequestEvent: Structured payload for AI request data
- Used for: Bundling prompt, context, files, and base_dir into single object
- Immutable data transfer object for cross-thread handoff
Integration Points:
- ai_client.py: EventEmitter for API lifecycle events
- gui_2.py: Consumes events via _process_event_queue()
- multi_agent_conductor.py: Uses SyncEventQueue for state updates
- api_hooks.py: Pushes events to _api_event_queue for external visibility
Thread Safety:
- EventEmitter: NOT thread-safe for concurrent on/emit (use from single thread)
- SyncEventQueue: FULLY thread-safe (built on queue.Queue)
- UserRequestEvent: Immutable, safe for concurrent access
"""
import queue
from typing import Callable, Any, Dict, List, Tuple
---
"""
MCP Client - Multi-tool filesystem and network operations with sandboxing.
This module implements a Model Context Protocol (MCP)-like interface for AI
agents to interact with the filesystem and network. It provides 26 tools
with a three-layer security model to prevent unauthorized access.
Three-Layer Security Model:
1. Allowlist Construction (configure()):
- Builds _allowed_paths from project file_items
- Populates _base_dirs from file parents and extra_base_dirs
- Sets _primary_base_dir for relative path resolution
2. Path Validation (_is_allowed()):
- Blacklist check: history.toml, *_history.toml, config, credentials
- Explicit allowlist check: _allowed_paths membership
- CWD fallback: allows cwd() subpaths if no base_dirs configured
- Base directory containment: must be subpath of _base_dirs
3. Resolution Gate (_resolve_and_check()):
- Converts relative paths using _primary_base_dir
- Resolves symlinks to prevent traversal attacks
- Returns (resolved_path, error_message) tuple
Tool Categories:
- File I/O: read_file, list_directory, search_files, get_tree
- Surgical Edits: set_file_slice, edit_file
- AST-Based (Python): py_get_skeleton, py_get_code_outline, py_get_definition,
py_update_definition, py_get_signature, py_set_signature, py_get_class_summary,
py_get_var_declaration, py_set_var_declaration
- Analysis: get_file_summary, get_git_diff, py_find_usages, py_get_imports,
py_check_syntax, py_get_hierarchy, py_get_docstring
- Network: web_search, fetch_url
- Runtime: get_ui_performance
Mutating Tools:
The MUTATING_TOOLS frozenset defines tools that modify files. ai_client.py
checks this set and routes to pre_tool_callback (GUI approval) if present.
Thread Safety:
This module uses module-level global state (_allowed_paths, _base_dirs).
Call configure() before dispatch() in multi-threaded environments.
See Also:
- docs/guide_tools.md for complete tool inventory and security model
- src/ai_client.py for tool dispatch integration
- src/shell_runner.py for PowerShell execution
"""
# mcp_client.py
# MCP-style file context tools for manual_slop.
# Exposes read-only filesystem tools the AI can call to selectively fetch file
# content on demand, instead of having everything inlined into the context block.
# All access is restricted to paths that are either:
# - Explicitly listed in the project's allowed_paths set, OR
# - Contained within an allowed base_dir (must resolve to a subpath of it)
# Tools exposed:
# read_file(path) - return full UTF-8 content of a file
# list_directory(path) - list entries in a directory (names + type)
# search_files(path, pattern) - glob pattern search within an allowed dir
# get_file_summary(path) - return the summarize.py heuristic summary
#
from __future__ import annotations
import asyncio
from pathlib import Path
---
"""
Models - Core data structures for MMA orchestration and project configuration.
This module defines the primary dataclasses used throughout the Manual Slop
application for representing tasks, tracks, and execution context.
Key Data Structures:
- Ticket: Atomic unit of work with status, dependencies, and context requirements
- Track: Collection of tickets with a shared goal
- WorkerContext: Execution context for a Tier 3 worker
- Metadata: Track metadata (id, name, status, timestamps)
- TrackState: Serializable track state with discussion history
- FileItem: File configuration with auto-aggregate and force-full flags
Status Machine (Ticket):
todo -> in_progress -> completed
| |
v v
blocked blocked
Serialization:
All dataclasses provide to_dict() and from_dict() class methods for TOML/JSON
persistence via project_manager.py.
Thread Safety:
These dataclasses are NOT thread-safe. Callers must synchronize mutations
if sharing instances across threads (e.g., during ConductorEngine execution).
Configuration Integration:
- load_config() / save_config() read/write the global config.toml
- AGENT_TOOL_NAMES defines the canonical list of MCP tools available to agents
See Also:
- docs/guide_mma.md for MMA orchestration documentation
- src/dag_engine.py for TrackDAG and ExecutionEngine
- src/multi_agent_conductor.py for ConductorEngine
- src/project_manager.py for persistence layer
"""
from __future__ import annotations
import tomllib
import datetime