docs(guidelines): add Testing Requirements section with 4 standards

- Structural Testing Contract (mirrors workflow.md) - Isolated-Pass Verification Fallacy (Lesson 1, with link to the test_infrastructure_hardening_batch_green_20260610 incident report that motivated the rule) - Audit Scripts as CI Gates (4 scripts: check_test_toml_paths, audit_main_thread_imports, audit_weak_types, audit_no_models_config_io) - Skip Markers Are Documentation, Not Avoidance (workflow.md policy)
2026-06-10 20:20:58 -04:00
parent 965e015709
commit 72b237457e
1 changed files with 9 additions and 0 deletions
@@ -47,6 +47,15 @@
  - **Functions/Methods:** `[C: Caller1, Caller2]` (Primary callers).
  - **State Variables:** `[M: File:Line, Method]` (Mutation points) and `[U: File]` (Major use paths).

+## Testing Requirements
+
+These are the process standards the project's test infrastructure enforces. For the full implementation contract (fixture names, anti-patterns, audit scripts), see [docs/guide_testing.md §Structural Testing Contract](../docs/guide_testing.md) and the per-styleguide audit scripts in [code_styleguides/](code_styleguides/).
+
+- **Structural Testing Contract:** Ban on arbitrary core mocking with `unittest.mock.patch` (unless explicitly authorized for a specific boundary test). All integration and end-to-end testing must use the `live_gui` fixture to interact with a real instance of the application via the Hook API. Bypassing the hook server to directly mutate GUI state in tests is prohibited. All test-generated artifacts (logs, temporary workspaces, mock outputs) MUST be written to `tests/artifacts/` or `tests/logs/` (gitignored).
+- **Isolated-Pass Verification Fallacy (Added 2026-06-10):** A test that "passes when run after test X but fails in isolation" is a **fragile test, not a fragile fixture**. The flip side is also true: a test that "passes in isolation but fails in batch" is failing — its failure is masked by isolation. The only verification that matters for `live_gui` tests (or any test that depends on shared subprocess state) is the **batch run** in the suite the test will ship in. Do NOT commit a fix that has only been verified in isolation. The 4-day test-hell saga of 2026-06-06 to 2026-06-10 was the result of agents committing fixes after isolated passes; the bisect required both directions and was only caught at the suite-level batch green on 2026-06-10. See [docs/reports/test_infrastructure_hardening_batch_green_20260610.md](../docs/reports/test_infrastructure_hardening_batch_green_20260610.md) for the full incident.
+- **Audit Scripts as CI Gates:** The 4 audit scripts (`check_test_toml_paths.py`, `audit_main_thread_imports.py`, `audit_weak_types.py`, `audit_no_models_config_io.py`) enforce the conventions above. They run as pre-commit/CI gates and exit non-zero on regression. New conventions must be paired with a new audit script per [conductor/workflow.md §Audit Script Policy](workflow.md).
+- **Skip Markers Are Documentation, Not Avoidance:** `@pytest.mark.skip(reason=...)` is a record of a known failure, not an escape from fixing the underlying bug. Skip markers are valid for opt-in integration tests (require external resources, env-var-gated) or features behind a feature flag. They are NOT valid for pre-existing failing tests, tests the agent doesn't understand, or racy assertions the agent doesn't want to debug. When you add a skip, document the underlying issue in `reason=` and commit with a follow-up note. See [conductor/workflow.md §Skip-Marker Policy](workflow.md).
+
 ## See Also — Applied Conventions

 The product guidelines are best understood alongside the per-source-file guides that demonstrate them: