docs update (wip)

2026-03-08 01:46:34 -05:00
parent d9a06fd2fe
commit d34c35941f
14 changed files with 1213 additions and 105 deletions
@@ -1,14 +1,56 @@
-# Sloppy
+# Manual Slop

 ![img](./gallery/splash.png)

-A GUI orchestrator for local LLM-driven coding sessions. Manual Slop bridges high-latency AI reasoning with a low-latency ImGui render loop via a thread-safe asynchronous pipeline, ensuring every AI-generated payload passes through a human-auditable gate before execution.
+A high-density GUI orchestrator for local LLM-driven coding sessions. Manual Slop bridges high-latency AI reasoning with a low-latency ImGui render loop via a thread-safe asynchronous pipeline, ensuring every AI-generated payload passes through a human-auditable gate before execution.

-**Tech Stack**: Python 3.11+, Dear PyGui / ImGui, FastAPI, Uvicorn
-**Providers**: Gemini API, Anthropic API, DeepSeek, Gemini CLI (headless)
+**Design Philosophy**: Full manual control over vendor API metrics, agent capabilities, and context memory usage. High information density, tactile interactions, and explicit confirmation for destructive actions.
+
+**Tech Stack**: Python 3.11+, Dear PyGui / ImGui Bundle, FastAPI, Uvicorn, tree-sitter
+**Providers**: Gemini API, Anthropic API, DeepSeek, Gemini CLI (headless), MiniMax
 **Platform**: Windows (PowerShell) — single developer, local use

-![img](./gallery/python_2026-03-01_23-45-34.png)
+![img](./gallery/python_2026-03-07_14-32-50.png)
+
+---
+
+## Key Features
+
+### Multi-Provider Integration
+- **Gemini SDK**: Server-side context caching with TTL management, automatic cache rebuilding at 90% TTL
+- **Anthropic**: Ephemeral prompt caching with 4-breakpoint system, automatic history truncation at 180K tokens
+- **DeepSeek**: Dedicated SDK for code-optimized reasoning
+- **Gemini CLI**: Headless adapter with full functional parity, synchronous HITL bridge
+- **MiniMax**: Alternative provider support
+
+### 4-Tier MMA Orchestration
+Hierarchical task decomposition with specialized models and strict token firewalling:
+- **Tier 1 (Orchestrator)**: Product alignment, epic → tracks
+- **Tier 2 (Tech Lead)**: Track → tickets (DAG), persistent context
+- **Tier 3 (Worker)**: Stateless TDD implementation, context amnesia
+- **Tier 4 (QA)**: Stateless error analysis, no fixes
+
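The track → ticket DAG handled by Tier 2 can be pictured with the standard library's `graphlib`. This is a minimal sketch under stated assumptions, not the project's `TrackDAG`: the ticket IDs and the `ready_batches` helper are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter


def ready_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tickets into batches whose dependencies are already finished.

    `dependencies` maps a ticket id to the ids it depends on (hypothetical shape).
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    batches: list[list[str]] = []
    while sorter.is_active():
        batch = sorted(sorter.get_ready())  # tickets with no unmet dependencies
        batches.append(batch)
        sorter.done(*batch)  # mark the whole batch as finished
    return batches


# Hypothetical track: T3 needs T1 and T2, T4 needs T3.
tickets = {"T1": set(), "T2": set(), "T3": {"T1", "T2"}, "T4": {"T3"}}
batches = ready_batches(tickets)
```

Tickets in the same batch have no dependencies on each other, so they are the ones a worker pool could run side by side.
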
+### Strict Human-in-the-Loop (HITL)
+- **Execution Clutch**: All destructive actions suspend on `threading.Condition` pending GUI approval
+- **Three Dialog Types**: ConfirmDialog (scripts), MMAApprovalDialog (steps), MMASpawnApprovalDialog (workers)
+- **Editable Payloads**: Review, modify, or reject any AI-generated content before execution
+
+### 26 MCP Tools with Sandboxing
+Three-layer security model: Allowlist Construction → Path Validation → Resolution Gate
+- **File I/O**: read, list, search, slice, edit, tree
+- **AST-Based (Python)**: skeleton, outline, definition, signature, class summary, docstring
+- **Analysis**: summary, git diff, find usages, imports, syntax check, hierarchy
+- **Network**: web search, URL fetch
+- **Runtime**: UI performance metrics
+
+### Parallel Tool Execution
+Multiple independent tool calls within a single AI turn execute concurrently via `asyncio.gather`, so the turn waits roughly for the slowest call rather than for the sum of all calls.
+
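A small sketch of concurrent tool calls with `asyncio.gather`. The `fake_tool` coroutine is a hypothetical stand-in for an I/O-bound tool call; the tool names are taken from the tool list in this README, and the delays are made up.

```python
import asyncio
import time


async def fake_tool(name: str, delay: float) -> str:
    """Hypothetical stand-in for an I/O-bound tool call."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def run_tools_concurrently(calls: list[tuple[str, float]]) -> list[str]:
    # gather returns results in argument order,
    # regardless of which call finishes first.
    return await asyncio.gather(*(fake_tool(name, delay) for name, delay in calls))


def timed_run() -> tuple[list[str], float]:
    calls = [("read_file", 0.2), ("get_git_diff", 0.2), ("py_get_skeleton", 0.2)]
    start = time.perf_counter()
    results = asyncio.run(run_tools_concurrently(calls))
    return results, time.perf_counter() - start
```

Run sequentially, the three calls would take about 0.6 s; gathered, the elapsed time stays close to a single 0.2 s call.
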
+### AST-Based Context Management
+- **Skeleton View**: Signatures + docstrings, bodies replaced with `...`
+- **Curated View**: Preserves `@core_logic` decorated functions and `[HOT]` comment blocks
+- **Targeted View**: Extracts only specified symbols and their dependencies
+- **Heuristic Summaries**: Token-efficient structural descriptions without AI calls
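
The skeleton view can be approximated with the standard library alone. The sketch below uses stdlib `ast` rather than tree-sitter, so it is a swapped-in technique and not the project's `ASTParser`; `skeleton` and `SAMPLE` are hypothetical names.

```python
import ast


class _SkeletonTransformer(ast.NodeTransformer):
    """Replace function bodies with `...`, keeping any docstring."""

    def _strip(self, node):
        new_body = []
        if ast.get_docstring(node, clean=False) is not None:
            new_body.append(node.body[0])  # keep the docstring expression
        new_body.append(ast.Expr(value=ast.Constant(value=...)))
        node.body = new_body
        return node

    def visit_FunctionDef(self, node):
        return self._strip(node)

    # Async functions get the same treatment.
    visit_AsyncFunctionDef = visit_FunctionDef


def skeleton(source: str) -> str:
    """Return `source` with every function body reduced to docstring + `...`."""
    tree = _SkeletonTransformer().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


SAMPLE = '''
class Greeter:
    """Says hello."""

    def greet(self, name: str) -> str:
        """Return a greeting."""
        message = "hello " + name
        return message


def helper(x):
    return x * 2
'''

print(skeleton(SAMPLE))
```

Class bodies are left in place so that methods are still visited; only function and method bodies are collapsed.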

 ---

@@ -26,35 +68,12 @@ The **MMA (Multi-Model Agent)** system decomposes epics into tracks, tracks into

 | Guide | Scope |
 |---|---|
+| [Readme](./docs/Readme.md) | Documentation index, GUI panel reference, configuration files, environment variables |
 | [Architecture](./docs/guide_architecture.md) | Threading model, event system, AI client multi-provider architecture, HITL mechanism, comms logging |
-| [Tools & IPC](./docs/guide_tools.md) | MCP Bridge security model, all 26 native tools, Hook API endpoints, ApiHookClient reference, shell runner |
-| [MMA Orchestration](./docs/guide_mma.md) | 4-tier hierarchy, Ticket/Track data structures, DAG engine, ConductorEngine execution loop, worker lifecycle |
-| [Simulations](./docs/guide_simulations.md) | `live_gui` fixture, Puppeteer pattern, mock provider, visual verification patterns, ASTParser / summarizer |
-
---
-
-## Module Map
-
-Core implementation resides in the `src/` directory.
-
-| File | Role |
-|---|---|
-| `src/gui_2.py` | Primary ImGui interface — App class, frame-sync, HITL dialogs |
-| `src/ai_client.py` | Multi-provider LLM abstraction (Gemini, Anthropic, DeepSeek, Gemini CLI) |
-| `src/mcp_client.py` | 26 MCP tools with filesystem sandboxing and tool dispatch |
-| `src/api_hooks.py` | HookServer — REST API for external automation on `:8999` |
-| `src/api_hook_client.py` | Python client for the Hook API (used by tests and external tooling) |
-| `src/multi_agent_conductor.py` | ConductorEngine — Tier 2 orchestration loop with DAG execution |
-| `src/conductor_tech_lead.py` | Tier 2 ticket generation from track briefs |
-| `src/dag_engine.py` | TrackDAG (dependency graph) + ExecutionEngine (tick-based state machine) |
-| `src/models.py` | Ticket, Track, WorkerContext dataclasses |
-| `src/events.py` | EventEmitter, AsyncEventQueue, UserRequestEvent |
-| `src/project_manager.py` | TOML config persistence, discussion management, track state |
-| `src/session_logger.py` | JSON-L + markdown audit trails (comms, tools, CLI, hooks) |
-| `src/shell_runner.py` | PowerShell execution with timeout, env config, QA callback |
-| `src/file_cache.py` | ASTParser (tree-sitter) — skeleton and curated views |
-| `src/summarize.py` | Heuristic file summaries (imports, classes, functions) |
-| `src/outline_tool.py` | Hierarchical code outline via stdlib `ast` |
+| [Tools & IPC](./docs/guide_tools.md) | MCP Bridge 3-layer security, 26 tool inventory, Hook API endpoints, ApiHookClient reference, shell runner |
+| [MMA Orchestration](./docs/guide_mma.md) | 4-tier hierarchy, Ticket/Track data structures, DAG engine, ConductorEngine, worker lifecycle, abort propagation |
+| [Simulations](./docs/guide_simulations.md) | `live_gui` fixture, Puppeteer pattern, mock provider, visual verification, ASTParser / summarizer |
+| [Meta-Boundary](./docs/guide_meta_boundary.md) | Application vs Meta-Tooling domains, inter-domain bridges, safety model separation |

 ---

@@ -105,6 +124,151 @@ uv run pytest tests/ -v

 ---

+## MMA 4-Tier Architecture
+
+The Multi-Model Agent system uses hierarchical task decomposition with specialized models at each tier:
+
+| Tier | Role | Model | Responsibility |
+|------|------|-------|----------------|
+| **Tier 1** | Orchestrator | `gemini-3.1-pro-preview` | Product alignment, epic → tracks, track initialization |
+| **Tier 2** | Tech Lead | `gemini-3-flash-preview` | Track → tickets (DAG), architectural oversight, persistent context |
+| **Tier 3** | Worker | `gemini-2.5-flash-lite` / `deepseek-v3` | Stateless TDD implementation per ticket, context amnesia |
+| **Tier 4** | QA | `gemini-2.5-flash-lite` / `deepseek-v3` | Stateless error analysis, diagnostics only (no fixes) |
+
+**Key Principles:**
+- **Context Amnesia**: Tier 3/4 workers start with `ai_client.reset_session()` — no history bleed
+- **Token Firewalling**: Each tier receives only the context it needs
+- **Model Escalation**: Failed tickets automatically retry with more capable models
+- **WorkerPool**: Bounded concurrency (default: 4 workers) with semaphore gating
+
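Semaphore gating for bounded concurrency can be sketched as follows. This is an illustration under assumptions, not the project's `WorkerPool`: `BoundedRunner`, `peak_active`, and `double_slowly` are hypothetical, and the thread count of 16 is chosen only so that the semaphore, not the executor, is the limiting factor.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class BoundedRunner:
    """Runs jobs on threads while a semaphore caps how many execute at once."""

    def __init__(self, max_workers: int = 4) -> None:
        self._slots = threading.Semaphore(max_workers)
        self._lock = threading.Lock()
        self._active = 0
        self.peak_active = 0  # highest concurrency observed, for illustration

    def _run(self, job, item):
        with self._slots:  # blocks while max_workers jobs are already running
            with self._lock:
                self._active += 1
                self.peak_active = max(self.peak_active, self._active)
            try:
                return job(item)
            finally:
                with self._lock:
                    self._active -= 1

    def map(self, job, items):
        # More threads than slots on purpose, so the semaphore does the gating.
        with ThreadPoolExecutor(max_workers=16) as pool:
            return list(pool.map(lambda item: self._run(job, item), items))


def double_slowly(n: int) -> int:
    time.sleep(0.02)  # simulate a short unit of work
    return n * 2


runner = BoundedRunner(max_workers=4)
doubled = runner.map(double_slowly, range(12))
```
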
+---
+
+## Modules by Domain
+
+### src/ — Core implementation
+
+| File | Role |
+|---|---|
+| `src/gui_2.py` | Primary ImGui interface — App class, frame-sync, HITL dialogs, event system |
+| `src/ai_client.py` | Multi-provider LLM abstraction (Gemini, Anthropic, DeepSeek, MiniMax) |
+| `src/mcp_client.py` | 26 MCP tools with filesystem sandboxing and tool dispatch |
+| `src/api_hooks.py` | HookServer — REST API on `127.0.0.1:8999` for external automation |
+| `src/api_hook_client.py` | Python client for the Hook API (used by tests and external tooling) |
+| `src/multi_agent_conductor.py` | ConductorEngine — Tier 2 orchestration loop with DAG execution |
+| `src/conductor_tech_lead.py` | Tier 2 ticket generation from track briefs |
+| `src/dag_engine.py` | TrackDAG (dependency graph) + ExecutionEngine (tick-based state machine) |
+| `src/models.py` | Ticket, Track, WorkerContext, Metadata, Track state |
+| `src/events.py` | EventEmitter, AsyncEventQueue, UserRequestEvent |
+| `src/project_manager.py` | TOML config persistence, discussion management, track state |
+| `src/session_logger.py` | JSON-L + markdown audit trails (comms, tools, CLI, hooks) |
+| `src/shell_runner.py` | PowerShell execution with timeout, env config, QA callback |
+| `src/file_cache.py` | ASTParser (tree-sitter) — skeleton, curated, and targeted views |
+| `src/summarize.py` | Heuristic file summaries (imports, classes, functions) |
+| `src/outline_tool.py` | Hierarchical code outline via stdlib `ast` |
+| `src/performance_monitor.py` | FPS, frame time, CPU, input lag tracking |
+| `src/log_registry.py` | Session metadata persistence |
+| `src/log_pruner.py` | Automated log cleanup based on age and whitelist |
+| `src/paths.py` | Centralized path resolution with environment variable overrides |
+| `src/cost_tracker.py` | Token cost estimation for API calls |
+| `src/gemini_cli_adapter.py` | CLI subprocess adapter with session management |
+| `src/mma_prompts.py` | Tier-specific system prompts for MMA orchestration |
+| `src/theme_*.py` | UI theming (dark, light modes) |
+
+Simulation modules in `simulation/`:
+
+| File | Role |
+|---|---|
+| `simulation/sim_base.py` | BaseSimulation class with setup/teardown lifecycle |
+| `simulation/workflow_sim.py` | WorkflowSimulator — high-level GUI automation |
+| `simulation/user_agent.py` | UserSimAgent — simulated user behavior (reading time, thinking delays) |
+
+---
+
+## Security Model
+
+The MCP Bridge implements a three-layer security model in `mcp_client.py`. Every tool accessing the filesystem passes through `_resolve_and_check(path)` before any I/O.
+
+### Layer 1: Allowlist Construction (`configure`)
+Called by `ai_client` before each send cycle:
+1. Resets `_allowed_paths` and `_base_dirs` to empty sets.
+2. Sets `_primary_base_dir` from `extra_base_dirs[0]` (resolved), falling back to `cwd()`.
+3. Iterates `file_items`, resolving each path to an absolute path and adding it to `_allowed_paths`; its parent directory is added to `_base_dirs`.
+4. Any entries in `extra_base_dirs` that are valid directories are also added to `_base_dirs`.
+
+### Layer 2: Path Validation (`_is_allowed`)
+Checks run in this exact order:
+1. **Blacklist**: `history.toml`, `*_history.toml`, `config.toml`, `credentials.toml` → hard deny
+2. **Explicit allowlist**: Path in `_allowed_paths` → allow
+3. **CWD fallback**: If no base dirs are configured, any path under `cwd()` is allowed (fallback for projects without explicit base dirs)
+4. **Base containment**: Must be a subpath of at least one entry in `_base_dirs` (via `relative_to()`)
+5. **Default deny**: All other paths rejected
+
+### Layer 3: Resolution Gate (`_resolve_and_check`)
+Every tool call passes through this:
+1. Convert raw path string to `Path`.
+2. If not absolute, prepend `_primary_base_dir`.
+3. Resolve to absolute (following symlinks).
+4. Call `_is_allowed()`.
+5. Return `(resolved_path, "")` on success, `(None, error_message)` on failure.
+
+All paths are resolved (following symlinks) before comparison, preventing symlink-based traversal attacks.
+
+---
+
+## Conductor System
+
+The project uses a spec-driven track system in `conductor/` for structured development:
+
+```
+conductor/
+├── workflow.md           # Task lifecycle, TDD protocol, phase verification
+├── tech-stack.md         # Technology constraints and patterns
+├── product.md            # Product vision and guidelines
+├── product-guidelines.md # Code standards, UX principles
+└── tracks/
+    └── <track_name>_<YYYYMMDD>/
+        ├── spec.md       # Track specification
+        ├── plan.md       # Implementation plan with checkbox tasks
+        ├── metadata.json # Track metadata
+        └── state.toml    # Structured state with task list
+```
+
+**Key Concepts:**
+- **Tracks**: Self-contained implementation units with spec, plan, and state
+- **TDD Protocol**: Red (failing tests) → Green (pass) → Refactor
+- **Phase Checkpoints**: Verification gates with git notes for audit trails
+- **MMA Delegation**: Tracks are executed via the 4-tier agent hierarchy
+
+See `conductor/workflow.md` for the full development workflow.
+
+---
+
 ## Project Configuration

 Projects are stored as `<name>.toml` files. The discussion history is split into a sibling `<name>_history.toml` to keep the main config lean.
@@ -134,3 +298,31 @@ run_powershell = true
 read_file = true
 # ... 26 tool flags
 ```
+
+---
+
+## Quick Reference
+
+### Hook API Endpoints (port 8999)
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/status` | GET | Health check |
+| `/api/project` | GET/POST | Project config |
+| `/api/session` | GET/POST | Discussion entries |
+| `/api/gui` | POST | GUI task queue |
+| `/api/gui/mma_status` | GET | Full MMA state |
+| `/api/gui/value/<tag>` | GET | Read GUI field |
+| `/api/ask` | POST | Blocking HITL dialog |
+
+### MCP Tool Categories
+
+| Category | Tools |
+|----------|-------|
+| **File I/O** | `read_file`, `list_directory`, `search_files`, `get_tree`, `get_file_slice`, `set_file_slice`, `edit_file` |
+| **AST (Python)** | `py_get_skeleton`, `py_get_code_outline`, `py_get_definition`, `py_update_definition`, `py_get_signature`, `py_set_signature`, `py_get_class_summary`, `py_get_var_declaration`, `py_set_var_declaration`, `py_get_docstring` |
+| **Analysis** | `get_file_summary`, `get_git_diff`, `py_find_usages`, `py_get_imports`, `py_check_syntax`, `py_get_hierarchy` |
+| **Network** | `web_search`, `fetch_url` |
+| **Runtime** | `get_ui_performance` |
+
+---