# Manual Slop: Verification & Simulation Framework
Detailed specification of the live GUI testing infrastructure, simulation lifecycle, and the mock provider strategy.
---
## 1. Live GUI Verification Infrastructure
To verify complex UI state and asynchronous interactions, Manual Slop employs a **Live Verification** strategy using the application's built-in API hooks.
### `--enable-test-hooks`
When launched with this flag, the application starts the `HookServer` on port `8999`, exposing its internal state to external HTTP requests. This is the foundation for all automated visual verification.
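The readiness check an external verifier performs can be sketched as follows. Since the real `HookServer` listens on port `8999` inside the application, this self-contained sketch stands up a minimal `/status` endpoint on an ephemeral port; the `{"ready": True}` payload shape is an assumption, not the documented schema.

```python
# Sketch: polling a /status endpoint the way an external verifier would.
# The real HookServer listens on port 8999; a stand-in server on an
# ephemeral port keeps this example self-contained.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/status":
            body = json.dumps({"ready": True}).encode()  # assumed payload shape
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep output quiet during tests
        pass

server = HTTPServer(("127.0.0.1", 0), StatusHandler)  # port 0 = ephemeral
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/status"
with urllib.request.urlopen(url, timeout=2) as resp:
    status = json.loads(resp.read())
server.shutdown()
```

Once a `GET /status` succeeds, the harness knows the hook server is up and automated verification can begin.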
### The `live_gui` pytest Fixture
Defined in `tests/conftest.py`, this session-scoped fixture manages the lifecycle of the application under test:
1. **Startup:** Spawns `gui_2.py` in a separate process with `--enable-test-hooks`.
2. **Telemetry:** Polls `/status` until the hook server is ready.
3. **Isolation:** Resets the AI session and clears comms logs between tests to prevent state pollution.
4. **Teardown:** Robustly kills the process tree on completion or failure.
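The four steps above can be sketched as a plain context manager; in `tests/conftest.py` the equivalent logic lives inside a session-scoped pytest fixture. The helper names (`poll_until`, `hook_ready`) and the exact status URL path are assumptions; `gui_2.py` and `--enable-test-hooks` come from the description above.

```python
# Sketch of the live_gui lifecycle: spawn, poll for readiness, yield, teardown.
import contextlib
import subprocess
import sys
import time
import urllib.error
import urllib.request

HOOK_STATUS_URL = "http://127.0.0.1:8999/status"  # assumed path on the hook server

def poll_until(predicate, timeout=30.0, interval=0.25):
    """Repeatedly call `predicate` until it returns True or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False

def hook_ready():
    try:
        with urllib.request.urlopen(HOOK_STATUS_URL, timeout=1) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

@contextlib.contextmanager
def live_gui():
    # Startup: launch the app with its hook server enabled.
    proc = subprocess.Popen([sys.executable, "gui_2.py", "--enable-test-hooks"])
    try:
        # Telemetry: wait until /status answers.
        if not poll_until(hook_ready):
            raise RuntimeError("hook server never became ready")
        # Isolation hooks (session reset, log clearing) would run per test here.
        yield proc
    finally:
        # Teardown: a real fixture kills the whole process tree, not just proc.
        proc.kill()
        proc.wait()
```

Wrapping this in `@pytest.fixture(scope="session")` gives the behavior described above: one application instance shared across the session, with per-test isolation handled between yields.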
---
## 2. Simulation Lifecycle: The "Puppeteer" Pattern
Simulations (like `tests/visual_sim_mma_v2.py`) act as a "Puppeteer," driving the GUI through the `ApiHookClient`.
### Phase 1: Environment Setup
* **Provider Mocking:** The simulation sets the `current_provider` to `gemini_cli` and redirects the `gcli_path` to a mock script (e.g., `tests/mock_gemini_cli.py`).
* **Workspace Isolation:** The `files_base_dir` is pointed to a temporary artifacts directory to prevent accidental modification of the host project.
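The Phase 1 setup might look like the following sketch. `ApiHookClient`'s real interface isn't reproduced here, so a minimal in-memory stub stands in for it; the setting keys (`current_provider`, `gcli_path`, `files_base_dir`) and the mock-script path come from the description above.

```python
# Sketch of Phase 1: point the provider at the mock CLI and isolate the workspace.
import tempfile

class StubHookClient:
    """Stand-in that records set_value calls, mimicking ApiHookClient."""
    def __init__(self):
        self.values = {}

    def set_value(self, key, value):
        self.values[key] = value

client = StubHookClient()
artifacts_dir = tempfile.mkdtemp(prefix="sim_artifacts_")

# Provider mocking: route AI calls through the scripted mock CLI.
client.set_value("current_provider", "gemini_cli")
client.set_value("gcli_path", "tests/mock_gemini_cli.py")
# Workspace isolation: keep all file writes inside a throwaway directory.
client.set_value("files_base_dir", artifacts_dir)
```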
### Phase 2: User Interaction Loop
The simulation replicates a human workflow by invoking client methods:
1. `client.set_value('mma_epic_input', '...')`: Injects the epic description.
2. `client.click('btn_mma_plan_epic')`: Triggers the orchestration engine.
### Phase 3: Polling & Assertion
Because AI orchestration is asynchronous, simulations use a **Polling with Multi-Modal Approval** loop:
* **State Polling:** The script polls `client.get_mma_status()` in a loop.
* **Auto-Approval:** If the status indicates a pending tool or spawn request, the simulation automatically clicks the approval buttons (`btn_approve_spawn`, `btn_approve_tool`).
* **Verification:** Once the expected state (e.g., "Mock Goal 1" appears in the track list) is detected, the simulation proceeds to the next phase or asserts success.
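The polling-with-approval loop can be sketched as below. The button ids come from the description above; the status-field names (`pending_spawn`, `pending_tool`, `tracks`) are assumptions about the `get_mma_status()` payload, not its documented schema.

```python
# Sketch of the Phase 3 loop: poll status, auto-approve, stop on the goal state.
import time

def drive_until(client, done, timeout=120.0, interval=0.5):
    """Poll MMA status, auto-approving spawn/tool requests, until `done`
    returns True for a status payload or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = client.get_mma_status()
        if status.get("pending_spawn"):          # assumed field name
            client.click("btn_approve_spawn")
        elif status.get("pending_tool"):         # assumed field name
            client.click("btn_approve_tool")
        elif done(status):
            return status
        time.sleep(interval)
    raise TimeoutError("expected state never appeared")
```

A simulation would then call something like `drive_until(client, lambda s: "Mock Goal 1" in s.get("tracks", []))` to block until the expected track appears.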
---
## 3. Mock Provider Strategy
To test the 4-Tier MMA hierarchy without incurring API costs or latency, Manual Slop uses a **Script-Based Mocking** strategy via the `gemini_cli` adapter.
### `tests/mock_gemini_cli.py`
This script simulates the behavior of the `gemini` CLI by:
1. **Input Parsing:** Reading the system prompt and user message from the environment/stdin.
2. **Deterministic Response:** Returning pre-defined JSON payloads (e.g., track definitions, worker implementation scripts) based on keywords in the prompt.
3. **Tool Simulation:** Mimicking function-call responses to trigger the "Execution Clutch" within the GUI.
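The keyword-dispatch idea behind the mock can be sketched as follows. The exact prompts, payloads, and I/O contract of the real `tests/mock_gemini_cli.py` are assumptions; only the keyword-to-canned-JSON strategy is taken from the description above.

```python
# Sketch: deterministic keyword dispatch, the core of a script-based mock CLI.
CANNED = {
    # keyword in prompt -> canned payload (contents are illustrative)
    "plan": {"tracks": [{"name": "Mock Goal 1"}, {"name": "Mock Goal 2"}]},
    "implement": {"script": "print('worker output')"},
}

def respond(prompt: str) -> dict:
    """Return the first canned payload whose keyword appears in the prompt."""
    lowered = prompt.lower()
    for keyword, payload in CANNED.items():
        if keyword in lowered:
            return payload
    return {"error": "no canned response for prompt"}

# In the real script, the entry point would read the prompt from stdin/env
# and print the payload as JSON, e.g.:
#   print(json.dumps(respond(sys.stdin.read())))
```

Because the dispatch is purely keyword-driven, the same epic description always yields the same track definitions, which is what makes the simulations repeatable.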
---
## 4. Visual Verification Examples
Tests in this framework don't just check return values; they verify the **rendered state** of the application:
* **DAG Integrity:** Verifying that `active_tickets` in the MMA status matches the expected task graph.
* **Stream Telemetry:** Checking `mma_streams` to ensure that output from multiple tiers is correctly captured and displayed in the terminal.
* **Modal State:** Asserting that the correct dialog (e.g., `ConfirmDialog`) is active during a pending tool call.
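The three checks above might be expressed as assertions over the status payload, as in this sketch; every field name here (`active_tickets`, `mma_streams`, `active_dialog`, `pending_tool`) is an assumption about the status schema rather than a documented API.

```python
# Sketch: rendered-state assertions over a hypothetical get_mma_status() payload.
def verify_rendered_state(status, expected_tickets):
    # DAG integrity: the live ticket set must match the planned task graph.
    assert set(status["active_tickets"]) == set(expected_tickets)
    # Stream telemetry: every tier should have produced captured output.
    assert all(stream["output"] for stream in status["mma_streams"])
    # Modal state: a pending tool call must surface the confirmation dialog.
    if status.get("pending_tool"):
        assert status["active_dialog"] == "ConfirmDialog"

status = {
    "active_tickets": ["T1", "T2"],
    "mma_streams": [{"tier": 1, "output": "planning..."}],
    "pending_tool": True,
    "active_dialog": "ConfirmDialog",
}
verify_rendered_state(status, ["T2", "T1"])  # order-insensitive DAG check
```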
By combining these techniques, Manual Slop verifies asynchronous, multi-tier UI behavior with a rigor usually reserved for high-stakes embedded systems or complex graphics engines.