68 lines
12 KiB
Markdown
68 lines
12 KiB
Markdown
# Product Guide: Manual Slop
|
|
|
|
## Vision
|
|
|
|
To serve as an expert-level utility for personal developer use on small projects, providing full, manual control over vendor API metrics, agent capabilities, and context memory usage.
|
|
|
|
## Architecture Reference
|
|
|
|
For deep implementation details when planning or implementing tracks, consult `docs/` (last updated: 08e003a):
|
|
|
|
- **[docs/guide_architecture.md](../docs/guide_architecture.md):** Threading model, event system, AI client, HITL mechanism
|
|
- **[docs/guide_meta_boundary.md](../docs/guide_meta_boundary.md):** The critical distinction between the Application's Strict-HITL environment and the Meta-Tooling environment used to build it.
|
|
- **[docs/guide_tools.md](../docs/guide_tools.md):** MCP Bridge, Hook API, ApiHookClient, shell runner
|
|
- **[docs/guide_mma.md](../docs/guide_mma.md):** 4-tier orchestration, DAG engine, worker lifecycle
|
|
- **[docs/guide_simulations.md](../docs/guide_simulations.md):** Test framework, mock provider, verification patterns
|
|
|
|
## Primary Use Cases
|
|
|
|
- **Full Control over Vendor APIs:** Exposing detailed API metrics and configuring deep agent capabilities directly within the GUI.
|
|
- **Context & Memory Management:** Better visualization and management of token usage and context memory. Includes granular per-file flags (**Auto-Aggregate**, **Force Full**) and a dedicated **'Context' role** for manual injections, allowing developers to optimize prompt limits with expert precision.
|
|
- **Manual "Vibe Coding" Assistant:** Serving as an auxiliary, multi-provider assistant that natively interacts with the codebase via sandboxed PowerShell scripts and MCP-like file tools, emphasizing manual developer oversight and explicit confirmation.
|
|
|
|
## Key Features
|
|
|
|
- **Multi-Provider Integration:** Supports Gemini, Anthropic, and DeepSeek with seamless switching.
|
|
- **4-Tier Hierarchical Multi-Model Architecture:** Orchestrates an intelligent cascade of specialized models to isolate cognitive loads and minimize token burn.
|
|
- **Tier 1 (Orchestrator):** Strategic product alignment, setup (`/conductor:setup`), and track initialization (`/conductor:newTrack`) using `gemini-3.1-pro-preview`.
|
|
- **Tier 2 (Tech Lead):** Technical oversight and track execution (`/conductor:implement`) using `gemini-2.5-flash`. Maintains persistent context throughout implementation.
|
|
- **Tier 3 (Worker):** Surgical code implementation and TDD using `gemini-2.5-flash` or `deepseek-v3`. Operates statelessly with tool access and dependency skeletons.
|
|
- **Tier 4 (QA):** Error analysis and diagnostics using `gemini-2.5-flash` or `deepseek-v3`. Operates statelessly with tool access.
|
|
- **MMA Delegation Engine:** Route tasks, ensuring role-scoped context and detailed observability via timestamped sub-agent logs. Supports dynamic ticket creation and dependency resolution via an automated Dispatcher Loop.
|
|
- **MMA Observability Dashboard:** A high-density control center within the GUI for monitoring and managing the 4-Tier architecture.
|
|
- **Track Browser:** Real-time visualization of all implementation tracks with status indicators and progress bars. Includes a dedicated **Active Track Summary** featuring a color-coded progress bar, precise ticket status breakdown (Completed, In Progress, Blocked, Todo), and dynamic **ETA estimation** based on historical completion times.
|
|
- **Visual Task DAG:** An interactive, node-based visualizer for the active track's task dependencies using `imgui-node-editor`. Features color-coded state tracking (Ready, Running, Blocked, Done), drag-and-drop dependency creation, and right-click deletion.
|
|
- **Strategy Visualization:** Dedicated real-time output streams for Tier 1 (Strategic Planning) and Tier 2/3 (Execution) agents, allowing the user to follow the agent's reasoning chains alongside the task DAG.
|
|
- **Track-Scoped State Management:** Segregates discussion history and task progress into per-track state files (e.g., `conductor/tracks/<track_id>/state.toml`). This prevents global context pollution and ensures the Tech Lead session is isolated to the specific track's objective.
|
|
**Native DAG Execution Engine:** Employs a Python-based Directed Acyclic Graph (DAG) engine to manage complex task dependencies. Supports automated topological sorting, robust cycle detection, and **transitive blocking propagation** (cascading `blocked` status to downstream dependents to prevent execution stalls).
|
|
|
|
- **Programmable Execution State machine:** Governing the transition between "Auto-Queue" (autonomous worker spawning) and "Step Mode" (explicit manual approval for each task transition).
|
|
- **Role-Scoped Documentation:** Automated mapping of foundational documents to specific tiers to prevent token bloat and maintain high-signal context.
|
|
- **Tiered Context Scoping:** Employs optimized context subsets for each tier. Tiers 1 & 2 receive strategic documents and full history, while Tier 3/4 workers receive task-specific "Focus Files" and automated AST dependency skeletons.
|
|
- **Worker Spawn Interceptor:** A mandatory security gate that intercepts every sub-agent launch. Provides a GUI modal allowing the user to review, modify, or reject the worker's prompt and file context before it is sent to the API.
|
|
- **Strict Memory Siloing:** Employs tree-sitter AST-based interface extraction (Skeleton View, Curated View, and Targeted View) and "Context Amnesia" to provide workers only with the absolute minimum context required. Includes **Manual Skeleton Context Injection**, allowing developers to preview and manually inject file skeletons or full content into discussions via a dedicated GUI modal. Features multi-level dependency traversal and AST caching to minimize re-parsing overhead and token burn.
|
|
- **Explicit Execution Control:** All AI-generated PowerShell scripts require explicit human confirmation via interactive UI dialogs before execution, supported by a global "Linear Execution Clutch" for deterministic debugging.
|
|
- **Parallel Multi-Agent Execution:** Executes multiple AI workers in parallel using a non-blocking execution engine and a dedicated `WorkerPool`. Features configurable concurrency limits (defaulting to 4) to optimize resource usage and prevent API rate limiting.
|
|
- **Parallel Tool Execution:** Executes independent tool calls (e.g., parallel file reads) concurrently within a single agent turn using an asynchronous execution engine, significantly reducing end-to-end latency.
|
|
- **Automated Tier 4 QA:** Integrates real-time error interception in the shell runner, automatically forwarding technical failures to cheap sub-agents for 20-word diagnostic summaries injected back into the worker history.
|
|
- **High-Fidelity Selectable UI:** Most read-only labels and logs across the interface (including discussion history, comms payloads, tool outputs, and telemetry metrics) are now implemented as selectable text fields. This enables standard OS-level text selection and copying (Ctrl+C) while maintaining a high-density, non-editable aesthetic.
|
|
- **Enhanced MMA Observability:** Worker streams and ticket previews now support direct text selection, allowing for easy extraction of specific logs or reasoning fragments during parallel execution.
|
|
- **Detailed History Management:** Rich discussion history with branching, timestamping, and specific git commit linkage per conversation.
|
|
- **Advanced Log Management:** Optimizes log storage by offloading large data (AI-generated scripts and tool outputs) to unique files within the session directory, using compact `[REF:filename]` pointers in JSON-L logs to minimize token overhead during analysis.
|
|
- **Full Session Restoration:** Allows users to load and reconstruct entire historical sessions from their log directories. Includes a dedicated, tinted **'Historical Replay' mode** that populates discussion history and provides a read-only view of prior agent activities.
|
|
- **Dedicated Diagnostic Logging:** Consolidates transient system warnings and performance alerts into a separate **System Diagnostics** tab within the Log Management panel, ensuring the persistent Discussion History remains focused on high-signal content while providing deep visibility into runtime health.
|
|
- **Improved MMA Observability:** Enhances sub-agent logging by injecting precise ticket IDs and descriptive roles into communication metadata, enabling granular filtering and tracking of parallel worker activities within the Comms History.
|
|
- **In-Depth Toolset Access:** MCP-like file exploration, URL fetching, search, and dynamic context aggregation embedded within a multi-viewport Dear PyGui/ImGui interface.
|
|
- **Integrated Workspace:** A consolidated Hub-based layout (Context, AI Settings, Discussion, Operations) designed for expert multi-monitor workflows.
|
|
- **Session Analysis:** Ability to load and visualize historical session logs with a dedicated tinted "Prior Session" viewing mode.
|
|
- **Structured Log Taxonomy:** Automated session-based log organization into configurable directories (defaulting to `logs/sessions/`). Includes a dedicated GUI panel for monitoring and manual whitelisting. Features an intelligent heuristic-based pruner that automatically cleans up insignificant logs older than 24 hours while preserving valuable sessions.
|
|
- **Clean Project Root:** Enforces a "Cruft-Free Root" policy by organizing core implementation into a `src/` directory and redirecting all temporary test data, configurations, and AI-generated artifacts to `tests/artifacts/`.
|
|
- **Performance Diagnostics:** Comprehensive, conditional per-component profiling across the entire application. Features a dedicated **Diagnostics Panel** providing real-time telemetry for FPS, Frame Time, CPU usage, and **Detailed Component Timings** for all GUI panels and background threads, including automated threshold-based latency alerts.
|
|
- **Automated UX Verification:** A robust IPC mechanism via API hooks and a modular simulation suite allows for human-like simulation walkthroughs and automated regression testing of the full GUI lifecycle across multiple specialized scenarios.
|
|
- **Headless Backend Service:** Optional headless mode allowing the core AI and tool execution logic to run as a decoupled REST API service (FastAPI), optimized for Docker and server-side environments (e.g., Unraid).
|
|
- **Remote Confirmation Protocol:** A non-blocking, ID-based challenge/response mechanism for approving AI actions via the REST API, enabling remote "Human-in-the-Loop" safety.
|
|
- **Gemini CLI Integration:** Allows using the `gemini` CLI as a headless backend provider. This enables leveraging Gemini subscriptions with advanced features like persistent sessions, while maintaining full "Human-in-the-Loop" safety through a dedicated bridge for synchronous tool call approvals within the Manual Slop GUI. Now features full functional parity with the direct API, including accurate token estimation, safety settings, and robust system instruction handling.
|
|
- **Context & Token Visualization:** Detailed UI panels for monitoring real-time token usage, history depth, and **visual cache awareness** (tracking specific files currently live in the provider's context cache).
|
|
- **On-Demand Definition Lookup:** Allows developers to request specific class or function definitions during discussions using `@SymbolName` syntax. Injected definitions feature syntax highlighting, intelligent collapsing for long blocks, and a **[Source]** button for instant navigation to the full file.
|
|
- **Manual Ticket Queue Management:** Provides a dedicated GUI panel for granular control over the implementation queue. Features include color-coded priority assignment (High, Medium, Low), multi-select bulk operations (Execute, Skip, Block), and interactive drag-and-drop reordering with real-time Directed Acyclic Graph (DAG) validation.
|