Manual Slop

A high-density GUI orchestrator for local LLM-driven coding sessions. Manual Slop bridges high-latency AI reasoning with a low-latency ImGui render loop via a thread-safe asynchronous pipeline, ensuring every AI-generated payload passes through a human-auditable gate before execution.

Design Philosophy: Full manual control over vendor API metrics, agent capabilities, and context memory usage. High information density, tactile interactions, and explicit confirmation for destructive actions.

Tech Stack: Python 3.11+, Dear PyGui / ImGui Bundle, FastAPI, Uvicorn, tree-sitter

Providers: Gemini API, Anthropic API, DeepSeek, Gemini CLI (headless), MiniMax

Platform: Windows (PowerShell), single developer, local use


Key Features

Multi-Provider Integration

  • Gemini SDK: Server-side context caching with TTL management, automatic cache rebuilding at 90% TTL
  • Anthropic: Ephemeral prompt caching with 4-breakpoint system, automatic history truncation at 180K tokens
  • DeepSeek: Dedicated SDK for code-optimized reasoning
  • Gemini CLI: Headless adapter with full functional parity, synchronous HITL bridge
  • MiniMax: Alternative provider support
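
As a rough illustration of the "rebuild at 90% TTL" rule in the Gemini bullet, the check might look like the sketch below. The function name `should_rebuild_cache` and its parameters are illustrative assumptions, not names taken from the codebase.

```python
def should_rebuild_cache(created_at: float, ttl_seconds: float, now: float,
                         threshold: float = 0.9) -> bool:
    """Return True once the cache has used up `threshold` of its TTL."""
    if ttl_seconds <= 0:
        return True  # a non-positive TTL is treated as already expired
    age = now - created_at
    return age >= threshold * ttl_seconds


# A one-hour cache created at t=0 is still fresh at 50 minutes
# and due for a rebuild at 55 minutes.
fresh = should_rebuild_cache(0.0, 3600.0, now=3000.0)
stale = should_rebuild_cache(0.0, 3600.0, now=3300.0)
```

Rebuilding slightly before expiry, rather than at it, avoids a send cycle landing on a cache that expires mid-request.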

4-Tier MMA Orchestration

Hierarchical task decomposition with specialized models and strict token firewalling:

  • Tier 1 (Orchestrator): Product alignment, epic → tracks
  • Tier 2 (Tech Lead): Track → tickets (DAG), persistent context
  • Tier 3 (Worker): Stateless TDD implementation, context amnesia
  • Tier 4 (QA): Stateless error analysis, no fixes

Strict Human-in-the-Loop (HITL)

  • Execution Clutch: All destructive actions suspend on threading.Condition pending GUI approval
  • Three Dialog Types: ConfirmDialog (scripts), MMAApprovalDialog (steps), MMASpawnApprovalDialog (workers)
  • Editable Payloads: Review, modify, or reject any AI-generated content before execution

26 MCP Tools with Sandboxing

Three-layer security model: Allowlist Construction → Path Validation → Resolution Gate

  • File I/O: read, list, search, slice, edit, tree
  • AST-Based (Python): skeleton, outline, definition, signature, class summary, docstring
  • Analysis: summary, git diff, find usages, imports, syntax check, hierarchy
  • Network: web search, URL fetch
  • Runtime: UI performance metrics

Parallel Tool Execution

Multiple independent tool calls within a single AI turn execute concurrently via asyncio.gather, so the turn waits roughly as long as its slowest call rather than the sum of all calls.
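
A minimal sketch of the pattern, assuming each tool call is an awaitable. `call_tool` is a hypothetical stand-in that sleeps to simulate I/O; it is not the project's dispatch function.

```python
import asyncio


async def call_tool(name: str, delay: float) -> str:
    """Stand-in for one tool call; the sleep simulates I/O latency."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def run_tools_concurrently(calls: list[tuple[str, float]]) -> list[str]:
    # gather returns results in the order the awaitables were passed in,
    # regardless of which one finishes first.
    return await asyncio.gather(*(call_tool(n, d) for n, d in calls))


results = asyncio.run(
    run_tools_concurrently([("read_file", 0.02), ("get_git_diff", 0.01)])
)
```

Because result order follows argument order, each result can be matched back to the tool call that requested it without extra bookkeeping.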

AST-Based Context Management

  • Skeleton View: Signatures + docstrings, bodies replaced with ...
  • Curated View: Preserves @core_logic decorated functions and [HOT] comment blocks
  • Targeted View: Extracts only specified symbols and their dependencies
  • Heuristic Summaries: Token-efficient structural descriptions without AI calls
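
To make the Skeleton View concrete, here is a small sketch that keeps signatures and docstrings and replaces function bodies with `...`. It uses the stdlib `ast` module rather than tree-sitter, which the project's ASTParser is described as using, so treat it as an illustration of the idea and not the actual implementation.

```python
import ast


def skeleton(source: str) -> str:
    """Keep signatures and docstrings; replace function bodies with `...`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node, clean=False)
            new_body: list[ast.stmt] = []
            if doc is not None:
                # Re-insert the docstring as the first statement.
                new_body.append(ast.Expr(value=ast.Constant(value=doc)))
            # An Ellipsis expression stands in for the removed body.
            new_body.append(ast.Expr(value=ast.Constant(value=...)))
            node.body = new_body
    return ast.unparse(ast.fix_missing_locations(tree))


SAMPLE = '''
def add(a, b):
    """Add two numbers."""
    return a + b
'''

out = skeleton(SAMPLE)
print(out)
```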

Architecture at a Glance

Four thread domains operate concurrently: the ImGui main loop, an asyncio worker for AI calls, a HookServer (HTTP on :8999) for external automation, and transient threads for model fetching. Background threads never write GUI state directly — they serialize task dicts into lock-guarded lists that the main thread drains once per frame (details).
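
The hand-off described above can be sketched as follows. The class name `PendingTasks` and the shape of the task dicts are assumptions made for illustration.

```python
import threading


class PendingTasks:
    """Lock-guarded list: background threads append, the GUI thread drains."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._items: list[dict] = []

    def push(self, task: dict) -> None:
        with self._lock:
            self._items.append(task)

    def drain(self) -> list[dict]:
        # Swap the list out under the lock so tasks are applied without holding it.
        with self._lock:
            items, self._items = self._items, []
        return items


pending = PendingTasks()


def worker(n: int) -> None:
    # Background threads only describe the change; they never touch GUI state.
    pending.push({"action": "set_value", "tag": f"field_{n}", "value": n})


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# One "frame": the main thread applies everything queued since the last frame.
gui_state: dict[str, int] = {}
for task in pending.drain():
    gui_state[task["tag"]] = task["value"]
```

Swapping the list inside the lock keeps the critical section short, so a slow task handler cannot stall background producers.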

The Execution Clutch suspends the AI execution thread on a threading.Condition when a destructive action (PowerShell script, sub-agent spawn) is requested. The GUI renders a modal where the user can read, edit, or reject the payload. On approval, the condition is signaled and execution resumes (details).

The MMA (Multi-Model Agent) system decomposes epics into tracks, tracks into DAG-ordered tickets, and executes each ticket with a stateless Tier 3 worker that starts from ai_client.reset_session() — no conversational bleed between tickets (details).
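
DAG-ordered execution can be illustrated with the stdlib `graphlib` module. This is a sketch under assumptions: the ticket ids are invented, and the project's own TrackDAG and ExecutionEngine are not shown here.

```python
from graphlib import TopologicalSorter

# ticket id -> ids of the tickets it depends on (hypothetical track)
tickets: dict[str, set[str]] = {
    "T1-models": set(),
    "T2-parser": {"T1-models"},
    "T3-cli": {"T1-models"},
    "T4-integration": {"T2-parser", "T3-cli"},
}


def execution_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tickets into waves; every ticket in a wave has all dependencies done."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(tickets)
```

Tickets in the same wave have no dependencies on each other, so they are the candidates for parallel workers.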


Documentation

| Guide | Scope |
|---|---|
| Readme | Documentation index, GUI panel reference, configuration files, environment variables |
| Architecture | Threading model, event system, AI client multi-provider architecture, HITL mechanism, comms logging |
| Tools & IPC | MCP Bridge 3-layer security, 26 tool inventory, Hook API endpoints, ApiHookClient reference, shell runner |
| MMA Orchestration | 4-tier hierarchy, Ticket/Track data structures, DAG engine, ConductorEngine, worker lifecycle, abort propagation |
| Simulations | live_gui fixture, Puppeteer pattern, mock provider, visual verification, ASTParser / summarizer |
| Meta-Boundary | Application vs Meta-Tooling domains, inter-domain bridges, safety model separation |

Setup

Prerequisites

  • Python 3.11+
  • uv for package management

Installation

git clone <repo>
cd manual_slop
uv sync

Credentials

Configure in credentials.toml:

[gemini]
api_key = "YOUR_KEY"

[anthropic]
api_key = "YOUR_KEY"

[deepseek]
api_key = "YOUR_KEY"

Running

uv run sloppy.py                        # Normal mode
uv run sloppy.py --enable-test-hooks    # With Hook API on :8999

Running Tests

uv run pytest tests/ -v

Note: See the Structural Testing Contract for rules regarding mock patching, live_gui standard usage, and artifact isolation (logs are generated in tests/logs/ and tests/artifacts/).


MMA 4-Tier Architecture

The Multi-Model Agent system uses hierarchical task decomposition with specialized models at each tier:

| Tier | Role | Model | Responsibility |
|---|---|---|---|
| Tier 1 | Orchestrator | gemini-3.1-pro-preview | Product alignment, epic → tracks, track initialization |
| Tier 2 | Tech Lead | gemini-3-flash-preview | Track → tickets (DAG), architectural oversight, persistent context |
| Tier 3 | Worker | gemini-2.5-flash-lite / deepseek-v3 | Stateless TDD implementation per ticket, context amnesia |
| Tier 4 | QA | gemini-2.5-flash-lite / deepseek-v3 | Stateless error analysis, diagnostics only (no fixes) |

Key Principles:

  • Context Amnesia: Tier 3/4 workers start with ai_client.reset_session() — no history bleed
  • Token Firewalling: Each tier receives only the context it needs
  • Model Escalation: Failed tickets automatically retry with more capable models
  • WorkerPool: Bounded concurrency (default: 4 workers) with semaphore gating
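
Semaphore-gated bounded concurrency can be sketched as below. This version uses asyncio for brevity; whether the project's WorkerPool is built on asyncio or on threads is an assumption left open, and `run_bounded` is a hypothetical name.

```python
import asyncio


async def run_bounded(ticket_ids: list[str],
                      max_workers: int = 4) -> tuple[list[str], int]:
    """Run one coroutine per ticket, never more than max_workers at once."""
    sem = asyncio.Semaphore(max_workers)
    active = 0
    peak = 0

    async def worker(ticket_id: str) -> str:
        nonlocal active, peak
        async with sem:  # blocks here while max_workers are already running
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for the real ticket work
            active -= 1
            return f"{ticket_id}: ok"

    results = await asyncio.gather(*(worker(t) for t in ticket_ids))
    return list(results), peak


results, peak = asyncio.run(run_bounded([f"T{i}" for i in range(10)]))
```

The `peak` counter is only there to show that the number of simultaneously running workers never exceeds the limit.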

Module by Domain

src/ — Core implementation

| File | Role |
|---|---|
| src/gui_2.py | Primary ImGui interface: App class, frame-sync, HITL dialogs, event system |
| src/ai_client.py | Multi-provider LLM abstraction (Gemini, Anthropic, DeepSeek, MiniMax) |
| src/mcp_client.py | 26 MCP tools with filesystem sandboxing and tool dispatch |
| src/api_hooks.py | HookServer: REST API on 127.0.0.1:8999 for external automation |
| src/api_hook_client.py | Python client for the Hook API (used by tests and external tooling) |
| src/multi_agent_conductor.py | ConductorEngine: Tier 2 orchestration loop with DAG execution |
| src/conductor_tech_lead.py | Tier 2 ticket generation from track briefs |
| src/dag_engine.py | TrackDAG (dependency graph) + ExecutionEngine (tick-based state machine) |
| src/models.py | Ticket, Track, WorkerContext, Metadata, Track state |
| src/events.py | EventEmitter, AsyncEventQueue, UserRequestEvent |
| src/project_manager.py | TOML config persistence, discussion management, track state |
| src/session_logger.py | JSON-L + markdown audit trails (comms, tools, CLI, hooks) |
| src/shell_runner.py | PowerShell execution with timeout, env config, QA callback |
| src/file_cache.py | ASTParser (tree-sitter): skeleton, curated, and targeted views |
| src/summarize.py | Heuristic file summaries (imports, classes, functions) |
| src/outline_tool.py | Hierarchical code outline via stdlib ast |
| src/performance_monitor.py | FPS, frame time, CPU, input lag tracking |
| src/log_registry.py | Session metadata persistence |
| src/log_pruner.py | Automated log cleanup based on age and whitelist |
| src/paths.py | Centralized path resolution with environment variable overrides |
| src/cost_tracker.py | Token cost estimation for API calls |
| src/gemini_cli_adapter.py | CLI subprocess adapter with session management |
| src/mma_prompts.py | Tier-specific system prompts for MMA orchestration |
| src/theme_*.py | UI theming (dark, light modes) |

Simulation modules in simulation/:

| File | Role |
|---|---|
| simulation/sim_base.py | BaseSimulation class with setup/teardown lifecycle |
| simulation/workflow_sim.py | WorkflowSimulator: high-level GUI automation |
| simulation/user_agent.py | UserSimAgent: simulated user behavior (reading time, thinking delays) |

Security Model

The MCP Bridge implements a three-layer security model in mcp_client.py. Every tool accessing the filesystem passes through _resolve_and_check(path) before any I/O.

Layer 1: Allowlist Construction (configure)

Called by ai_client before each send cycle:

  1. Resets _allowed_paths and _base_dirs to empty sets.
  2. Sets _primary_base_dir from extra_base_dirs[0] (resolved) or falls back to cwd().
  3. Iterates file_items, resolving each path to an absolute path, adding to _allowed_paths; its parent directory is added to _base_dirs.
  4. Any entries in extra_base_dirs that are valid directories are also added to _base_dirs.

Layer 2: Path Validation (_is_allowed)

Checks run in this exact order:

  1. Blacklist: history.toml, *_history.toml, config, credentials → hard deny
  2. Explicit allowlist: Path in _allowed_paths → allow
  3. CWD fallback: If no base dirs are configured, any path under cwd() is allowed (fail-safe for projects without explicit base dirs)
  4. Base containment: Must be a subpath of at least one entry in _base_dirs (via relative_to())
  5. Default deny: All other paths rejected.

All paths are resolved (following symlinks) before comparison, preventing symlink-based traversal attacks.

Layer 3: Resolution Gate (_resolve_and_check)

Every tool call passes through this:

  1. Convert raw path string to Path.
  2. If not absolute, prepend _primary_base_dir.
  3. Resolve to absolute (following symlinks).
  4. Call _is_allowed().
  5. Return (resolved_path, "") on success, (None, error_message) on failure.

Conductor System

The project uses a spec-driven track system in conductor/ for structured development:

conductor/
├── workflow.md           # Task lifecycle, TDD protocol, phase verification
├── tech-stack.md         # Technology constraints and patterns
├── product.md            # Product vision and guidelines
├── product-guidelines.md # Code standards, UX principles
└── tracks/
    └── <track_name>_<YYYYMMDD>/
        ├── spec.md       # Track specification
        ├── plan.md       # Implementation plan with checkbox tasks
        ├── metadata.json # Track metadata
        └── state.toml    # Structured state with task list

Key Concepts:

  • Tracks: Self-contained implementation units with spec, plan, and state
  • TDD Protocol: Red (failing tests) → Green (pass) → Refactor
  • Phase Checkpoints: Verification gates with git notes for audit trails
  • MMA Delegation: Tracks are executed via the 4-tier agent hierarchy

See conductor/workflow.md for the full development workflow.


Project Configuration

Projects are stored as <name>.toml files. The discussion history is split into a sibling <name>_history.toml to keep the main config lean.

[project]
name = "my_project"
git_dir = "./my_repo"
system_prompt = ""

[files]
base_dir = "./my_repo"
paths = ["src/**/*.py", "README.md"]

[screenshots]
base_dir = "./my_repo"
paths = []

[output]
output_dir = "./md_gen"

[gemini_cli]
binary_path = "gemini"

[agent.tools]
run_powershell = true
read_file = true
# ... 26 tool flags

Quick Reference

Hook API Endpoints (port 8999)

| Endpoint | Method | Description |
|---|---|---|
| /status | GET | Health check |
| /api/project | GET/POST | Project config |
| /api/session | GET/POST | Discussion entries |
| /api/gui | POST | GUI task queue |
| /api/gui/mma_status | GET | Full MMA state |
| /api/gui/value/<tag> | GET | Read GUI field |
| /api/ask | POST | Blocking HITL dialog |
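
As a hedged example of calling the read-only endpoints from a script, the sketch below uses only the stdlib. The helper names are hypothetical, the response format is not documented here so the body is returned as raw text, and the server must be running with --enable-test-hooks. The project's own ApiHookClient is the supported route for anything beyond a quick check.

```python
import urllib.request

BASE_URL = "http://127.0.0.1:8999"


def hook_url(path: str, base: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path with exactly one slash."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def http_get(path: str, timeout: float = 2.0) -> str:
    """GET an endpoint and return the raw response body as text."""
    with urllib.request.urlopen(hook_url(path), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def get_status() -> str:
    return http_get("/status")


def get_gui_value(tag: str) -> str:
    return http_get(f"/api/gui/value/{tag}")
```

Nothing is sent at import time; call `get_status()` once the application is running to confirm the Hook API is reachable.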

MCP Tool Categories

| Category | Tools |
|---|---|
| File I/O | read_file, list_directory, search_files, get_tree, get_file_slice, set_file_slice, edit_file |
| AST (Python) | py_get_skeleton, py_get_code_outline, py_get_definition, py_update_definition, py_get_signature, py_set_signature, py_get_class_summary, py_get_var_declaration, py_set_var_declaration, py_get_docstring |
| Analysis | get_file_summary, get_git_diff, py_find_usages, py_get_imports, py_check_syntax, py_get_hierarchy |
| Network | web_search, fetch_url |
| Runtime | get_ui_performance |
