# Manual Slop: Verification & Simulation Framework
Detailed specification of the live GUI testing infrastructure, simulation lifecycle, and the mock provider strategy.
---
## 1. Live GUI Verification Infrastructure
To verify complex UI state and asynchronous interactions, Manual Slop employs a **Live Verification** strategy using the application's built-in API hooks.
### `--enable-test-hooks`
When launched with this flag, the application starts the `HookServer` on port `8999`, exposing its internal state to external HTTP requests. This is the foundation for all automated visual verification.
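The readiness check an external verifier performs can be sketched as follows. Since the real `HookServer` listens on port `8999` inside the application, this self-contained sketch stands up a minimal `/status` endpoint on an ephemeral port; the `{"ready": True}` payload shape is an assumption, not the documented schema.

```python
# Sketch: polling a /status endpoint the way an external verifier would.
# The real HookServer listens on port 8999; a stand-in server on an
# ephemeral port keeps this example self-contained.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/status":
            body = json.dumps({"ready": True}).encode()  # assumed payload shape
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep output quiet during tests
        pass

server = HTTPServer(("127.0.0.1", 0), StatusHandler)  # port 0 = ephemeral
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/status"
with urllib.request.urlopen(url, timeout=2) as resp:
    status = json.loads(resp.read())
server.shutdown()
```

Once a `GET /status` succeeds, the harness knows the hook server is up and automated verification can begin.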
### The `live_gui` pytest Fixture
Defined in `tests/conftest.py`, this session-scoped fixture manages the lifecycle of the application under test:
1. **Startup:** Spawns `gui_2.py` in a separate process with `--enable-test-hooks`.
2. **Telemetry:** Polls `/status` until the hook server is ready.
3. **Isolation:** Resets the AI session and clears comms logs between tests to prevent state pollution.
4. **Teardown:** Robustly kills the process tree on completion or failure.
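The four steps above can be sketched as a plain context manager; in `tests/conftest.py` the equivalent logic lives inside a session-scoped pytest fixture. The helper names (`poll_until`, `hook_ready`) and the exact status URL path are assumptions; `gui_2.py` and `--enable-test-hooks` come from the description above.

```python
# Sketch of the live_gui lifecycle: spawn, poll for readiness, yield, teardown.
import contextlib
import subprocess
import sys
import time
import urllib.error
import urllib.request

HOOK_STATUS_URL = "http://127.0.0.1:8999/status"  # assumed path on the hook server

def poll_until(predicate, timeout=30.0, interval=0.25):
    """Repeatedly call `predicate` until it returns True or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False

def hook_ready():
    try:
        with urllib.request.urlopen(HOOK_STATUS_URL, timeout=1) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

@contextlib.contextmanager
def live_gui():
    # Startup: launch the app with its hook server enabled.
    proc = subprocess.Popen([sys.executable, "gui_2.py", "--enable-test-hooks"])
    try:
        # Telemetry: wait until /status answers.
        if not poll_until(hook_ready):
            raise RuntimeError("hook server never became ready")
        # Isolation hooks (session reset, log clearing) would run per test here.
        yield proc
    finally:
        # Teardown: a real fixture kills the whole process tree, not just proc.
        proc.kill()
        proc.wait()
```

Wrapping this in `@pytest.fixture(scope="session")` gives the behavior described above: one application instance shared across the session, with per-test isolation handled between yields.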
---
## 2. Simulation Lifecycle: The "Puppeteer" Pattern
Simulations (like `tests/visual_sim_mma_v2.py`) act as a "Puppeteer," driving the GUI through the `ApiHookClient`.
### Phase 1: Environment Setup
* **Provider Mocking:** The simulation sets the `current_provider` to `gemini_cli` and redirects the `gcli_path` to a mock script (e.g., `tests/mock_gemini_cli.py`).
* **Workspace Isolation:** The `files_base_dir` is pointed to a temporary artifacts directory to prevent accidental modification of the host project.
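The Phase 1 setup might look like the following sketch. `ApiHookClient`'s real interface isn't reproduced here, so a minimal in-memory stub stands in for it; the setting keys (`current_provider`, `gcli_path`, `files_base_dir`) and the mock-script path come from the description above.

```python
# Sketch of Phase 1: point the provider at the mock CLI and isolate the workspace.
import tempfile

class StubHookClient:
    """Stand-in that records set_value calls, mimicking ApiHookClient."""
    def __init__(self):
        self.values = {}

    def set_value(self, key, value):
        self.values[key] = value

client = StubHookClient()
artifacts_dir = tempfile.mkdtemp(prefix="sim_artifacts_")

# Provider mocking: route AI calls through the scripted mock CLI.
client.set_value("current_provider", "gemini_cli")
client.set_value("gcli_path", "tests/mock_gemini_cli.py")
# Workspace isolation: keep all file writes inside a throwaway directory.
client.set_value("files_base_dir", artifacts_dir)
```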
### Phase 2: User Interaction Loop
The simulation replicates a human workflow by invoking client methods:
1. `client.set_value('mma_epic_input', '...')`: Injects the epic description.
2. `client.click('btn_mma_plan_epic')`: Triggers the orchestration engine.
### Phase 3: Polling & Assertion
Because AI orchestration is asynchronous, simulations use a **Polling with Multi-Modal Approval** loop:
* **State Polling:** The script polls `client.get_mma_status()` in a loop.
* **Auto-Approval:** If the status indicates a pending tool or spawn request, the simulation automatically clicks the approval buttons (`btn_approve_spawn`, `btn_approve_tool`).
* **Verification:** Once the expected state (e.g., "Mock Goal 1" appears in the track list) is detected, the simulation proceeds to the next phase or asserts success.
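The polling-with-approval loop can be sketched as below. The button ids come from the description above; the status-field names (`pending_spawn`, `pending_tool`, `tracks`) are assumptions about the `get_mma_status()` payload, not its documented schema.

```python
# Sketch of the Phase 3 loop: poll status, auto-approve, stop on the goal state.
import time

def drive_until(client, done, timeout=120.0, interval=0.5):
    """Poll MMA status, auto-approving spawn/tool requests, until `done`
    returns True for a status payload or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = client.get_mma_status()
        if status.get("pending_spawn"):          # assumed field name
            client.click("btn_approve_spawn")
        elif status.get("pending_tool"):         # assumed field name
            client.click("btn_approve_tool")
        elif done(status):
            return status
        time.sleep(interval)
    raise TimeoutError("expected state never appeared")
```

A simulation would then call something like `drive_until(client, lambda s: "Mock Goal 1" in s.get("tracks", []))` to block until the expected track appears.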
---
## 3. Mock Provider Strategy
To test the 4-Tier MMA hierarchy without incurring API costs or latency, Manual Slop uses a **Script-Based Mocking** strategy via the `gemini_cli` adapter.
### `tests/mock_gemini_cli.py`
This script simulates the behavior of the `gemini` CLI by:
1. **Input Parsing:** Reading the system prompt and user message from the environment/stdin.
2. **Deterministic Response:** Returning pre-defined JSON payloads (e.g., track definitions, worker implementation scripts) based on keywords in the prompt.
3. **Tool Simulation:** Mimicking function-call responses to trigger the "Execution Clutch" within the GUI.
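The keyword-dispatch idea behind the mock can be sketched as follows. The exact prompts, payloads, and I/O contract of the real `tests/mock_gemini_cli.py` are assumptions; only the keyword-to-canned-JSON strategy is taken from the description above.

```python
# Sketch: deterministic keyword dispatch, the core of a script-based mock CLI.
CANNED = {
    # keyword in prompt -> canned payload (contents are illustrative)
    "plan": {"tracks": [{"name": "Mock Goal 1"}, {"name": "Mock Goal 2"}]},
    "implement": {"script": "print('worker output')"},
}

def respond(prompt: str) -> dict:
    """Return the first canned payload whose keyword appears in the prompt."""
    lowered = prompt.lower()
    for keyword, payload in CANNED.items():
        if keyword in lowered:
            return payload
    return {"error": "no canned response for prompt"}

# In the real script, the entry point would read the prompt from stdin/env
# and print the payload as JSON, e.g.:
#   print(json.dumps(respond(sys.stdin.read())))
```

Because the dispatch is purely keyword-driven, the same epic description always yields the same track definitions, which is what makes the simulations repeatable.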
---
## 4. Visual Verification Examples
Tests in this framework don't just check return values; they verify the **rendered state** of the application:
* **DAG Integrity:** Verifying that `active_tickets` in the MMA status matches the expected task graph.
* **Stream Telemetry:** Checking `mma_streams` to ensure that output from multiple tiers is correctly captured and displayed in the terminal.
* **Modal State:** Asserting that the correct dialog (e.g., `ConfirmDialog`) is active during a pending tool call.
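The three checks above might be expressed as assertions over the status payload, as in this sketch; every field name here (`active_tickets`, `mma_streams`, `active_dialog`, `pending_tool`) is an assumption about the status schema rather than a documented API.

```python
# Sketch: rendered-state assertions over a hypothetical get_mma_status() payload.
def verify_rendered_state(status, expected_tickets):
    # DAG integrity: the live ticket set must match the planned task graph.
    assert set(status["active_tickets"]) == set(expected_tickets)
    # Stream telemetry: every tier should have produced captured output.
    assert all(stream["output"] for stream in status["mma_streams"])
    # Modal state: a pending tool call must surface the confirmation dialog.
    if status.get("pending_tool"):
        assert status["active_dialog"] == "ConfirmDialog"

status = {
    "active_tickets": ["T1", "T2"],
    "mma_streams": [{"tier": 1, "output": "planning..."}],
    "pending_tool": True,
    "active_dialog": "ConfirmDialog",
}
verify_rendered_state(status, ["T2", "T1"])  # order-insensitive DAG check
```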
By combining these techniques, Manual Slop verifies asynchronous, multi-tier UI behavior with a rigor usually reserved for high-stakes embedded systems or complex graphics engines.