Manual Slop: Verification & Simulation Framework
Detailed specification of the live GUI testing infrastructure, simulation lifecycle, and the mock provider strategy.
1. Live GUI Verification Infrastructure
To verify complex UI state and asynchronous interactions, Manual Slop employs a Live Verification strategy using the application's built-in API hooks.
`--enable-test-hooks`
When launched with this flag, the application starts the HookServer on port 8999, exposing its internal state to external HTTP requests. This is the foundation for all automated visual verification.
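A readiness check against the hook server can be sketched as a simple polling helper. The `/status` path and port 8999 come from the description above; the timeout, interval, and return convention are illustrative choices, not the project's actual API.

```python
# Sketch: poll the HookServer until it answers, assuming the /status
# endpoint and port 8999 described above. Timeout behaviour is illustrative.
import time
import urllib.error
import urllib.request

def wait_for_hook_server(url: str = "http://127.0.0.1:8999/status",
                         timeout: float = 15.0,
                         interval: float = 0.25) -> bool:
    """Return True once the hook server responds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=1.0) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; keep polling
        time.sleep(interval)
    return False
```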
The live_gui pytest Fixture
Defined in `tests/conftest.py`, this session-scoped fixture manages the lifecycle of the application under test:
- Startup: Spawns `gui_2.py` in a separate process with `--enable-test-hooks`.
- Telemetry: Polls `/status` until the hook server is ready.
- Isolation: Resets the AI session and clears comms logs between tests to prevent state pollution.
- Teardown: Robustly kills the process tree on completion or failure.
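The fixture's shape can be sketched as follows. Only `gui_2.py`, `--enable-test-hooks`, and the teardown behaviour are taken from the docs; the `LiveGuiProcess` helper and its method names are assumptions for illustration.

```python
# Sketch of a session-scoped fixture in the spirit of tests/conftest.py.
# The LiveGuiProcess helper is illustrative, not the project's real class.
import subprocess
import sys

class LiveGuiProcess:
    """Owns the application-under-test subprocess for the session."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(argv)

    def stop(self):
        # Teardown: terminate politely, then hard-kill if it hangs.
        self.proc.terminate()
        try:
            self.proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            self.proc.kill()
            self.proc.wait()

# In conftest.py this would be wrapped roughly as:
#
# @pytest.fixture(scope="session")
# def live_gui():
#     gui = LiveGuiProcess([sys.executable, "gui_2.py", "--enable-test-hooks"])
#     yield gui          # tests run here; state is reset between them
#     gui.stop()         # robust teardown on completion or failure
```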
2. Simulation Lifecycle: The "Puppeteer" Pattern
Simulations (like `tests/visual_sim_mma_v2.py`) act as a "Puppeteer," driving the GUI through the `ApiHookClient`.
Phase 1: Environment Setup
- Provider Mocking: The simulation sets the `current_provider` to `gemini_cli` and redirects the `gcli_path` to a mock script (e.g., `tests/mock_gemini_cli.py`).
- Workspace Isolation: The `files_base_dir` is pointed to a temporary artifacts directory to prevent accidental modification of the host project.
Phase 2: User Interaction Loop
The simulation replicates a human workflow by invoking client methods:
- `client.set_value('mma_epic_input', '...')`: Injects the epic description.
- `client.click('btn_mma_plan_epic')`: Triggers the orchestration engine.
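A minimal client behind those calls might wrap the hook server's HTTP interface like this. The `/set_value` and `/click` endpoint paths and the JSON payload shape are assumptions; only the method names `set_value` and `click` come from the simulations themselves.

```python
# Sketch of an ApiHookClient over the HookServer's HTTP interface.
# Endpoint names and payload shapes are assumed for illustration.
import json
import urllib.request

class ApiHookClient:
    def __init__(self, base_url="http://127.0.0.1:8999"):
        self.base_url = base_url

    @staticmethod
    def _payload(widget_id, value=None):
        body = {"id": widget_id}
        if value is not None:
            body["value"] = value
        return body

    def _post(self, path, payload):
        req = urllib.request.Request(
            self.base_url + path,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=5) as resp:
            return json.loads(resp.read() or b"{}")

    def set_value(self, widget_id, value):
        return self._post("/set_value", self._payload(widget_id, value))

    def click(self, widget_id):
        return self._post("/click", self._payload(widget_id))
```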
Phase 3: Polling & Assertion
Because AI orchestration is asynchronous, simulations use a Polling with Multi-Modal Approval loop:
- State Polling: The script polls `client.get_mma_status()` in a loop.
- Auto-Approval: If the status indicates a pending tool or spawn request, the simulation automatically clicks the approval buttons (`btn_approve_spawn`, `btn_approve_tool`).
- Verification: Once the expected state (e.g., "Mock Goal 1" appears in the track list) is detected, the simulation proceeds to the next phase or asserts success.
3. Mock Provider Strategy
To test the 4-Tier MMA hierarchy without incurring API costs or latency, Manual Slop uses a Script-Based Mocking strategy via the `gemini_cli` adapter.
`tests/mock_gemini_cli.py`
This script simulates the behavior of the `gemini` CLI by:
- Input Parsing: Reading the system prompt and user message from the environment/stdin.
- Deterministic Response: Returning pre-defined JSON payloads (e.g., track definitions, worker implementation scripts) based on keywords in the prompt.
- Tool Simulation: Mimicking function-call responses to trigger the "Execution Clutch" within the GUI.
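A keyword-driven mock in this spirit might look like the sketch below. The trigger words and JSON shapes are invented for illustration; the real `tests/mock_gemini_cli.py` returns whatever payloads the MMA tiers actually expect.

```python
# Sketch of a deterministic, keyword-driven mock provider.
# Keywords and payload shapes are illustrative assumptions.
import json

def respond(prompt: str) -> dict:
    """Return a pre-canned payload based on keywords in the prompt."""
    lowered = prompt.lower()
    if "plan" in lowered:
        # Planner tier: track definitions.
        return {"tracks": [{"name": "Mock Goal 1", "tickets": ["T-1"]}]}
    if "implement" in lowered:
        # Worker tier: a scripted implementation artifact.
        return {"files": [{"path": "hello.py", "content": "print('hi')"}]}
    # Default: mimic a function-call response to engage the Execution Clutch.
    return {"tool_call": {"name": "run_shell", "args": {"cmd": "echo ok"}}}

def render(prompt: str) -> str:
    """What the script would print to stdout for the adapter to parse."""
    return json.dumps(respond(prompt))
```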
4. Visual Verification Examples
Tests in this framework don't just check return values; they verify the rendered state of the application:
- DAG Integrity: Verifying that `active_tickets` in the MMA status matches the expected task graph.
- Stream Telemetry: Checking `mma_streams` to ensure that output from multiple tiers is correctly captured and displayed in the terminal.
- Modal State: Asserting that the correct dialog (e.g., `ConfirmDialog`) is active during a pending tool call.
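These checks reduce to assertions over the status payload. The field names `active_tickets`, `mma_streams`, and the `ConfirmDialog` value come from the examples above; the exact schema (and the `active_dialog` key) is assumed for this sketch.

```python
# Illustrative rendered-state assertions; the status schema is assumed.
def verify_rendered_state(status, expected_tickets, expected_dialog=None):
    # DAG integrity: the live ticket set matches the planned task graph.
    assert set(status["active_tickets"]) == set(expected_tickets)
    # Stream telemetry: every tier produced some captured output.
    assert all(stream for stream in status["mma_streams"].values())
    # Modal state: the expected dialog is active, if one is expected.
    if expected_dialog is not None:
        assert status.get("active_dialog") == expected_dialog
```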
By combining these techniques, Manual Slop achieves a level of verification rigor usually reserved for high-stakes embedded systems or complex graphics engines.