# Manual Slop: Verification & Simulation Framework

Detailed specification of the live GUI testing infrastructure, simulation lifecycle, and the mock provider strategy.

---

## 1. Live GUI Verification Infrastructure

To verify complex UI state and asynchronous interactions, Manual Slop employs a **Live Verification** strategy using the application's built-in API hooks.

### `--enable-test-hooks`

When launched with this flag, the application starts the `HookServer` on port `8999`, exposing its internal state over HTTP. This is the foundation for all automated visual verification.
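
The readiness handshake can be sketched with only the standard library. The port `8999` and the `/status` endpoint come from this document; treating any HTTP 200 as "ready" is an assumption about the `HookServer`'s actual response:

```python
import time
import urllib.error
import urllib.request

HOOK_BASE = "http://127.0.0.1:8999"  # port used by --enable-test-hooks

def hook_server_ready(base_url=HOOK_BASE, timeout=1.0):
    """Return True if the HookServer answers its /status endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/status", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def wait_for_hook_server(base_url=HOOK_BASE, deadline_s=30.0, interval_s=0.5):
    """Block until the HookServer is reachable or the deadline expires."""
    deadline = time.monotonic() + deadline_s
    while time.monotonic() < deadline:
        if hook_server_ready(base_url):
            return True
        time.sleep(interval_s)
    return False
```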

### The `live_gui` pytest Fixture

Defined in `tests/conftest.py`, this session-scoped fixture manages the lifecycle of the application under test:

1. **Startup:** Spawns `gui_2.py` in a separate process with `--enable-test-hooks`.
2. **Telemetry:** Polls `/status` until the hook server is ready.
3. **Isolation:** Resets the AI session and clears comms logs between tests to prevent state pollution.
4. **Teardown:** Robustly kills the process tree on completion or failure.
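
A minimal sketch of the lifecycle those four steps describe, written as a plain context manager rather than the actual fixture — the spawn command and `/status` probe follow the text, but the real `conftest.py` implementation will differ:

```python
import subprocess
import sys
import time
import urllib.error
import urllib.request
from contextlib import contextmanager

HOOK_STATUS = "http://127.0.0.1:8999/status"

def _hooks_up(url=HOOK_STATUS):
    """True once the HookServer answers its status endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=1.0) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

@contextmanager
def launched_gui(cmd=(sys.executable, "gui_2.py", "--enable-test-hooks"),
                 deadline_s=30.0):
    # 1. Startup: spawn the application in a separate process.
    proc = subprocess.Popen(list(cmd))
    try:
        # 2. Telemetry: poll /status until the hook server is ready.
        deadline = time.monotonic() + deadline_s
        while time.monotonic() < deadline and not _hooks_up():
            if proc.poll() is not None:
                raise RuntimeError("GUI exited before its hooks came up")
            time.sleep(0.5)
        # 3. Isolation (resetting the AI session, clearing comms logs) is a
        #    per-test concern done through hook calls; omitted in this sketch.
        yield proc
    finally:
        # 4. Teardown: kill the process on completion or failure.
        proc.kill()
        proc.wait()
```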

---

## 2. Simulation Lifecycle: The "Puppeteer" Pattern

Simulations (like `tests/visual_sim_mma_v2.py`) act as a "Puppeteer," driving the GUI through the `ApiHookClient`.

### Phase 1: Environment Setup

* **Provider Mocking:** The simulation sets the `current_provider` to `gemini_cli` and redirects the `gcli_path` to a mock script (e.g., `tests/mock_gemini_cli.py`).
* **Workspace Isolation:** The `files_base_dir` is pointed to a temporary artifacts directory to prevent accidental modification of the host project.
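
These two setup steps can be expressed as a small helper. The config keys (`current_provider`, `gcli_path`, `files_base_dir`) are the ones named above, while `set_config` stands in for whatever setter `ApiHookClient` actually exposes:

```python
def setup_mock_environment(client, artifacts_dir):
    """Phase 1: point the GUI at the mock provider and an isolated workspace.

    `set_config` is a stand-in for the real ApiHookClient setter.
    """
    client.set_config("current_provider", "gemini_cli")          # provider mocking
    client.set_config("gcli_path", "tests/mock_gemini_cli.py")   # redirect to mock
    client.set_config("files_base_dir", str(artifacts_dir))      # workspace isolation
```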

### Phase 2: User Interaction Loop

The simulation replicates a human workflow by invoking client methods:

1. `client.set_value('mma_epic_input', '...')`: Injects the epic description.
2. `client.click('btn_mma_plan_epic')`: Triggers the orchestration engine.
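
Wrapped as a helper, the two calls above might look like this; the widget ids and method names are the ones quoted in the list, but the helper itself is illustrative:

```python
def drive_epic_planning(client, epic_text):
    """Phase 2: replicate the human workflow that starts an epic plan."""
    client.set_value("mma_epic_input", epic_text)  # inject the epic description
    client.click("btn_mma_plan_epic")              # trigger the orchestration engine
```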

### Phase 3: Polling & Assertion

Because AI orchestration is asynchronous, simulations use a **Polling with Multi-Modal Approval** loop:

* **State Polling:** The script polls `client.get_mma_status()` in a loop.
* **Auto-Approval:** If the status indicates a pending tool or spawn request, the simulation automatically clicks the approval buttons (`btn_approve_spawn`, `btn_approve_tool`).
* **Verification:** Once the expected state (e.g., "Mock Goal 1" appears in the track list) is detected, the simulation proceeds to the next phase or asserts success.
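
The loop can be sketched as below. `get_mma_status()` and the button ids come from the text; the status-dict keys (`pending_spawn`, `pending_tool`) are assumptions about the payload shape:

```python
import time

# Map from an assumed status flag to the approval button it unblocks.
APPROVAL_BUTTONS = {
    "pending_spawn": "btn_approve_spawn",
    "pending_tool": "btn_approve_tool",
}

def poll_until(client, done, timeout_s=60.0, interval_s=0.5):
    """Poll get_mma_status(), auto-approving pending requests, until
    done(status) is truthy; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = client.get_mma_status()
        if done(status):
            return status
        # Auto-approval: click through any pending spawn/tool request.
        for state_key, button in APPROVAL_BUTTONS.items():
            if status.get(state_key):
                client.click(button)
        time.sleep(interval_s)
    raise TimeoutError("expected MMA state never appeared")
```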

---

## 3. Mock Provider Strategy

To test the 4-Tier MMA hierarchy without incurring API costs or latency, Manual Slop uses a **Script-Based Mocking** strategy via the `gemini_cli` adapter.

### `tests/mock_gemini_cli.py`

This script simulates the behavior of the `gemini` CLI by:

1. **Input Parsing:** Reading the system prompt and user message from the environment/stdin.
2. **Deterministic Response:** Returning pre-defined JSON payloads (e.g., track definitions, worker implementation scripts) based on keywords in the prompt.
3. **Tool Simulation:** Mimicking function-call responses to trigger the "Execution Clutch" within the GUI.
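
A hypothetical sketch of such a script, covering the first two behaviors (keyword-keyed canned JSON over stdin/stdout); the specific keywords and payload shapes are invented for illustration, with "Mock Goal 1" taken from the simulation above. The function-call (tool) simulation is omitted:

```python
#!/usr/bin/env python3
"""Illustrative sketch of a tests/mock_gemini_cli.py-style stub: read the
prompt from stdin, print a canned JSON payload chosen by keyword."""
import json
import sys

# Deterministic responses keyed by prompt keyword (illustrative shapes).
CANNED = {
    "plan": {"tracks": [{"name": "Mock Goal 1"}, {"name": "Mock Goal 2"}]},
    "implement": {"script": "print('worker output')"},
}

def respond(prompt: str) -> str:
    """Return the first canned payload whose keyword appears in the prompt."""
    for keyword, payload in CANNED.items():
        if keyword in prompt.lower():
            return json.dumps(payload)
    return json.dumps({"text": "mock fallback"})

if __name__ == "__main__":
    print(respond(sys.stdin.read()))
```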

---

## 4. Visual Verification Examples

Tests in this framework don't just check return values; they verify the **rendered state** of the application:

* **DAG Integrity:** Verifying that `active_tickets` in the MMA status matches the expected task graph.
* **Stream Telemetry:** Checking `mma_streams` to ensure that output from multiple tiers is correctly captured and displayed in the terminal.
* **Modal State:** Asserting that the correct dialog (e.g., `ConfirmDialog`) is active during a pending tool call.
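
The three checks above might be bundled into one snapshot verifier. `active_tickets` and `mma_streams` are the fields quoted in the list; the `pending_tool` and `active_dialog` keys are assumptions about the status payload:

```python
def verify_rendered_state(status, expected_tickets):
    """Return a list of discrepancies between a status snapshot and the
    expected rendered state; an empty list means the check passed."""
    problems = []
    # DAG integrity: active_tickets must match the expected task graph.
    if set(status.get("active_tickets", [])) != set(expected_tickets):
        problems.append("active_tickets diverges from expected task graph")
    # Stream telemetry: every tier's stream should have captured output.
    if not all(status.get("mma_streams", {}).values()):
        problems.append("a tier stream is empty")
    # Modal state: a pending tool call must surface the ConfirmDialog.
    if status.get("pending_tool") and status.get("active_dialog") != "ConfirmDialog":
        problems.append("pending tool without ConfirmDialog")
    return problems
```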

By combining these techniques, Manual Slop achieves a level of verification rigor usually reserved for high-stakes embedded systems or complex graphics engines.