docs: append test performance track to backlog based on timeout evaluation

docs: Add Meta-Level Sanity Check responsibility to Tier 2 skill
2026-03-02 13:22:45 -05:00 · 2026-03-02 13:09:36 -05:00
2 changed files with 5 additions and 0 deletions
@@ -21,6 +21,7 @@ When implementing tracks, consult these docs for threading, data flow, and modul
 - Maintain persistent context throughout a track's implementation phase (No Context Amnesia).
 - Review implementations and coordinate bug fixes via Tier 4 QA.
 - **CRITICAL: ATOMIC PER-TASK COMMITS**: You MUST commit your progress on a per-task basis. Immediately after a task is verified successfully, you must stage the changes, commit them, attach the git note summary, and update `plan.md` before moving to the next task. Do NOT batch multiple tasks into a single commit.
 - **Meta-Level Sanity Check**: After completing a track (or upon explicit request), perform a codebase sanity check. Run `uv run ruff check .` and `uv run mypy --explicit-package-bases .` to ensure Tier 3 Workers haven't degraded static analysis constraints. Identify broken simulation tests and append them to a tech debt track or fix them immediately.
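
A minimal sketch of how the two commands in the sanity check might be scripted so that both always run and their exit codes are collected. The names `CHECKS`, `run_sanity_checks`, and `all_passed` are hypothetical and do not appear in the repository. The `runner` parameter is an assumption, added only so the logic can be exercised without invoking `uv`.

```python
import subprocess
from typing import Callable, Sequence

# The two commands named in the Meta-Level Sanity Check bullet.
CHECKS: tuple[tuple[str, ...], ...] = (
    ("uv", "run", "ruff", "check", "."),
    ("uv", "run", "mypy", "--explicit-package-bases", "."),
)


def run_sanity_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> dict[str, int]:
    """Run every check (no early exit) and map each command line to its exit code."""
    results: dict[str, int] = {}
    for cmd in checks:
        # check=False: a failing linter should be recorded, not raised.
        completed = runner(list(cmd), check=False)
        results[" ".join(cmd)] = completed.returncode
    return results


def all_passed(results: dict[str, int]) -> bool:
    """True only if every recorded exit code is zero."""
    return all(code == 0 for code in results.values())
```

Running both tools before reporting, rather than stopping at the first failure, gives one combined picture of the static analysis state after a track completes.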
 ## Surgical Delegation Protocol
 When delegating to Tier 3 workers, construct prompts that specify:
@@ -104,4 +104,8 @@ To ensure smooth execution, execute the tracks in the following order:
 **Context:** Running `uv run ruff check .` and `uv run mypy --explicit-package-bases .` revealed substantial type-safety debt: 512+ mypy errors across 64 files and 200+ remaining Ruff violations. `gui_2.py` and `api_hook_client.py` in particular suffer from pervasive `Any` leakage and incorrect union types.
 **Goal:** Resolve all static analysis errors. Enforce strict `mypy` compliance, remove implicit `Optional` types, and rename ambiguous variables (e.g. `l`). Integrate `ruff` and `mypy` into a CI pre-commit hook so Tier 3 workers are forced to write type-safe code going forward.
 ### `test_suite_performance_and_flakiness`
 **Context:** Running `uv run pytest` takes over 5 minutes and frequently hangs on integration tests (e.g. `test_spawn_interception.py`). Several simulation tests (`test_sim_ai_settings.py`, `test_extended_sims.py`) are also currently failing or timing out.
 **Goal:** Audit the test suite for `time.sleep()` abuse. Replace hardcoded sleeps with `threading.Event()` hooks or robust polling. Isolate slow integration tests with `@pytest.mark.slow` and ensure the core unit test suite runs in under 10 seconds to maintain high-velocity TDD.
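
The sleep replacement described in the goal could look roughly like the sketch below. It uses only the standard library. `start_worker`, `wait_until`, and both test functions are hypothetical illustrations and are not taken from the project's test suite. The short `time.sleep` inside the worker only stands in for real work.

```python
import threading
import time


def start_worker(ready: threading.Event, delay: float = 0.05) -> threading.Thread:
    """Hypothetical background task that sets `ready` when its work is done."""
    def _run() -> None:
        time.sleep(delay)  # stands in for real work
        ready.set()

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()
    return thread


def wait_until(predicate, timeout: float = 2.0, interval: float = 0.01) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()  # one final check at the deadline


def test_worker_signals_ready() -> None:
    ready = threading.Event()
    start_worker(ready)
    # Returns as soon as the event is set; fails after 2 s instead of hanging.
    assert ready.wait(timeout=2.0)


def test_state_becomes_visible() -> None:
    state = {"done": False}
    threading.Timer(0.05, lambda: state.update(done=True)).start()
    # Polling fallback for code that exposes no event to wait on.
    assert wait_until(lambda: state["done"], timeout=2.0)
```

Both patterns bound the wait with a timeout, so a stuck test fails quickly rather than hanging the run. For the isolation step, tests decorated with `@pytest.mark.slow` can be deselected with `pytest -m "not slow"`, provided the `slow` marker is registered under `markers` in the pytest configuration.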
Author	SHA1	Message	Date
ed	54635d8d1c	docs: append test performance track to backlog based on timeout evaluation	2026-03-02 13:22:45 -05:00
ed	7afa3f3090	docs: Add Meta-Level Sanity Check responsibility to Tier 2 skill	2026-03-02 13:09:36 -05:00