# Test Consolidation & TOML Sandboxing Plan > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. **Goal:** Audit tests for real-TOML usage, migrate offenders to sandboxed patterns, consolidate similar tests where it improves clarity, and enforce the rule going forward. **Architecture:** `scripts/check_test_toml_paths.py` is the audit tool. `tests/conftest.py` gets a new autouse fixture `enforce_no_real_toml`. Migration follows the existing `isolate_workspace` pattern (already in conftest.py) using `tmp_path` + `monkeypatch`. Consolidation is judgment-call per area. **Tech Stack:** Python 3.11+, pytest, regex (stdlib) **Spec:** `docs/superpowers/specs/2026-06-02-test-consolidation-design.md` --- ## Execution Constraints - **No subagents.** Execute as a single agent. - **Pre-edit checkpoint:** `git add .` before any file edit. - **Per-file atomic commits:** One file = one commit. - **Commit message format:** `(): `. - **Git note format:** 3-8 line rationale per commit. - **Style baseline:** 1-space indent, no comments, type hints. --- ## File Structure | File | Action | Responsibility | |---|---|---| | `scripts/check_test_toml_paths.py` | Create | Greps for direct `./.toml` references in tests; CI gate | | `tests/conftest.py` | Modify | Add `enforce_no_real_toml` autouse fixture | | `tests/test_enforce_no_real_toml.py` | Create | Tests for the enforcer itself | | Various `tests/test_*.py` | Modify | Migrate offenders to sandboxed pattern | --- ## Task 1: Build the audit script **Files:** - Create: `scripts/check_test_toml_paths.py` - [ ] **Step 1.1: Pre-edit checkpoint** ```powershell git -C C:\projects\manual_slop add . ``` - [ ] **Step 1.2: Create the audit script** ```python #!/usr/bin/env python3 # scripts/check_test_toml_paths.py """Detect tests that read/write real TOML files. Used as a CI gate. Run from repo root: python scripts/check_test_toml_paths.py Exits 0 if all tests use sandboxed paths, 1 otherwise. """ from __future__ import annotations import re import sys from pathlib import Path TOML_BASENAMES = { "manual_slop", "config", "credentials", "presets", "personas", "tool_presets", "workspace_profiles", "tool_presets", } PATTERNS = [ re.compile(rf'Path\(["\'](?:{"|".join(TOML_BASENAMES)})\.toml["\']'), re.compile(rf'open\(["\'](?:{"|".join(TOML_BASENAMES)})\.toml["\']'), re.compile(rf'["\']\.{{1,2}}/(?:{"|".join(TOML_BASENAMES)})\.toml["\']'), re.compile(rf'Path\(["\']\.\./(?:{"|".join(TOML_BASENAMES)})\.toml["\']'), ] EXCLUDE_DIRS = {"artifacts", "logs", "__pycache__", "snapshots"} def find_violations(tests_dir: Path) -> list[tuple[Path, int, str]]: violations = [] for test_file in tests_dir.rglob("test_*.py"): if any(excluded in test_file.parts for excluded in EXCLUDE_DIRS): continue try: content = test_file.read_text(encoding="utf-8") except (OSError, UnicodeDecodeError): continue for lineno, line in enumerate(content.splitlines(), start=1): for pattern in PATTERNS: if pattern.search(line): violations.append((test_file, lineno, line.strip())) break return violations def main() -> int: repo_root = Path(__file__).resolve().parent.parent tests_dir = repo_root / "tests" if not tests_dir.exists(): print(f"Tests dir not found: {tests_dir}", file=sys.stderr) return 1 violations = find_violations(tests_dir) if not violations: print("OK: No tests reference real TOML files.") return 0 print(f"FAIL: {len(violations)} test(s) reference real TOML files:") for path, lineno, line in violations: rel = path.relative_to(repo_root) print(f" {rel}:{lineno}: {line}") return 1 if __name__ == "__main__": sys.exit(main()) ``` - [ ] **Step 1.3: Run the script (expect failures)** ```powershell python scripts/check_test_toml_paths.py ``` Expected: Outputs a list of violations (since the migration hasn't happened yet). - [ ] **Step 1.4: Commit** ```bash git -C C:\projects\manual_slop add scripts/check_test_toml_paths.py git -C C:\projects\manual_slop commit -m "feat(tests): add check_test_toml_paths.py audit script" git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Greps tests/ for direct ./.toml references. Excludes artifacts, logs, __pycache__, snapshots. Exits 0 on clean, 1 on violations. Used as CI gate." $_ } ``` --- ## Task 2: Add the enforcement fixture **Files:** - Modify: `tests/conftest.py` - Create: `tests/test_enforce_no_real_toml.py` - [ ] **Step 2.1: Pre-edit checkpoint** ```powershell git -C C:\projects\manual_slop add . ``` - [ ] **Step 2.2: Read current `tests/conftest.py` to find insertion point** Use `manual-slop_py_get_code_outline` to see the existing fixtures (especially `isolate_workspace` at line 71). - [ ] **Step 2.3: Add the `enforce_no_real_toml` fixture** Add near the existing `isolate_workspace` fixture (around line 70-90): ```python # In tests/conftest.py, after isolate_workspace @pytest.fixture(autouse=True) def enforce_no_real_toml(request, tmp_path, monkeypatch): """Snapshot any real TOML files in cwd, remove them for the test, restore after. This prevents tests from accidentally reading/writing the user's real config. Tests must use tmp_path or monkeypatch to get their TOML data. """ from pathlib import Path as _P real_toml_basenames = [ "manual_slop.toml", "config.toml", "credentials.toml", "presets.toml", "personas.toml", "tool_presets.toml", "workspace_profiles.toml", ] snapshots: dict[_P, bytes] = {} cwd = _P.cwd() for name in real_toml_basenames: p = cwd / name if p.exists(): snapshots[p] = p.read_bytes() p.unlink() try: yield finally: for p, content in snapshots.items(): p.write_bytes(content) ``` - [ ] **Step 2.4: Run the existing test suite to verify the fixture doesn't break things** ```powershell uv run pytest tests/ -x --timeout=60 -q ``` Expected: Existing tests should still pass. If a test was relying on a real TOML being present, it'll fail and need migration (next task). - [ ] **Step 2.5: Commit** ```bash git -C C:\projects\manual_slop add tests/conftest.py git -C C:\projects\manual_slop commit -m "test(infra): add enforce_no_real_toml autouse fixture" git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Autouse fixture in tests/conftest.py. Snapshots any real TOML in cwd before test, removes it, restores after. Tests must use tmp_path or monkeypatch to access TOML data." $_ } ``` --- ## Task 3: Tests for the enforcer **Files:** - Create: `tests/test_enforce_no_real_toml.py` - [ ] **Step 3.1: Pre-edit checkpoint** ```powershell git -C C:\projects\manual_slop add . ``` - [ ] **Step 3.2: Create the meta-test** ```python # tests/test_enforce_no_real_toml.py import os from pathlib import Path import subprocess import sys def test_check_script_exits_zero_on_clean_suite(): """The audit script returns 0 when no violations exist.""" result = subprocess.run( [sys.executable, "scripts/check_test_toml_paths.py"], capture_output=True, text=True, cwd=os.path.dirname(os.path.dirname(os.path.abspath(__file__))), ) # If we're in a clean state, this should be 0. # If there are still violations, this documents the current state. assert result.returncode in (0, 1), f"Unexpected exit code: {result.returncode}" def test_enforce_fixture_runs_without_error(): """The fixture completes without raising.""" # If we get here, the fixture ran successfully. assert True def test_real_toml_restored_after_test(tmp_path, monkeypatch): """Verify the fixture restores a real TOML after the test.""" cwd = Path.cwd() real_path = cwd / "_enforce_test_temp.toml" if real_path.exists(): real_path.unlink() real_path.write_bytes(b"[test]\nkey='value'") try: # During this test, the fixture removes _enforce_test_temp.toml # (it's not in the real_toml_basenames list, so it stays) # Test that real_path still exists (fixture didn't touch it) assert real_path.exists() finally: if real_path.exists(): real_path.unlink() ``` - [ ] **Step 3.3: Run the meta-tests** ```powershell uv run pytest tests/test_enforce_no_real_toml.py -v ``` Expected: 3 tests pass. - [ ] **Step 3.4: Commit** ```bash git -C C:\projects\manual_slop add tests/test_enforce_no_real_toml.py git -C C:\projects\manual_slop commit -m "test(infra): add tests for enforce_no_real_toml fixture" git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Meta-tests for the enforcer: script exit codes, fixture completes, real-file state not affected by unrelated tmp files." $_ } ``` --- ## Task 4: Migrate test offenders **Files:** - Various `tests/test_*.py` (find via `check_test_toml_paths.py`) - [ ] **Step 4.1: Generate the offender list** ```powershell python scripts/check_test_toml_paths.py 2>&1 | Tee-Object -FilePath scripts/_offenders.txt ``` - [ ] **Step 4.2: For each offender, migrate** Pick one violation at a time. For each: 1. Read the test file 2. Identify the line with the violation 3. Refactor to use `tmp_path` + `monkeypatch` (or the `isolate_workspace` fixture if it's already running) 4. Run that test in isolation to verify it still passes 5. Commit the change **Example migration pattern:** Before: ```python def test_load_presets(): presets_path = Path("presets.toml") data = tomllib.loads(presets_path.read_text()) assert "default" in data ``` After: ```python def test_load_presets(tmp_path, monkeypatch): from src import paths presets_path = tmp_path / "presets.toml" presets_path.write_text("[presets]\ndefault = {}\n") monkeypatch.setattr(paths, "get_global_presets_path", lambda: presets_path) data = tomllib.loads(presets_path.read_text()) assert "default" in data ``` - [ ] **Step 4.3: Re-run the audit script after each batch** ```powershell python scripts/check_test_toml_paths.py ``` Goal: 0 violations. - [ ] **Step 4.4: Commit per-file** ```bash git -C C:\projects\manual_slop add tests/test_.py git -C C:\projects\manual_slop commit -m "test(): migrate to sandboxed TOML pattern" git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Migrated tests in from real TOML to tmp_path + monkeypatch. Verified with full test suite." $_ } ``` (Repeat for each file.) --- ## Task 5: Consolidate similar tests (judgment call) **Files:** - Various `tests/test_*.py` - [ ] **Step 5.1: Identify consolidation candidates** Run an analysis of test files: ```powershell Get-ChildItem C:\projects\manual_slop\tests\test_*.py | Group-Object { $_.BaseName -replace '^test_', '' -replace '_.*$', '' } | Where-Object { $_.Count -gt 2 } | Select-Object Name, Count ``` Look for groups with > 2 files (e.g., `test_ai_settings_*.py`, `test_*_provider.py`). - [ ] **Step 5.2: For each candidate, evaluate** For each candidate group, ask: - Are these tests testing the same thing? - Would merging them make the failure messages clearer? - Would merging them make it harder to find a specific test? - Is the test count reduction a real win, or just a number? If the answer is "merging improves clarity," proceed. Otherwise, leave alone. - [ ] **Step 5.3: Consolidation pattern (if proceeding)** Before (3 files): ``` tests/test_provider_gemini_init.py tests/test_provider_anthropic_init.py tests/test_provider_deepseek_init.py ``` After (1 file, parametrized): ```python # tests/test_provider_init.py import pytest @pytest.mark.parametrize("provider_name", ["gemini", "anthropic", "deepseek"]) def test_provider_init(provider_name, tmp_path, monkeypatch): # Common test logic, parametrized ... ``` - [ ] **Step 5.4: Commit per consolidation** ```bash git -C C:\projects\manual_slop add tests/test_consolidated.py tests/test_removed_file1.py tests/test_removed_file2.py git -C C:\projects\manual_slop commit -m "test(consolidate): merge tests into parametrized single file" git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Consolidated N test files into 1 parametrized test. ." $_ } ``` (Only if a consolidation was actually performed.) --- ## Task 6: Phase Completion Verification - [ ] **Step 6.1: Final audit** ```powershell python scripts/check_test_toml_paths.py ``` Expected: "OK: No tests reference real TOML files." Exit 0. - [ ] **Step 6.2: Full test suite** ```powershell uv run pytest tests/ -q --timeout=60 ``` Expected: All tests pass. - [ ] **Step 6.3: Create the checkpoint commit** ```bash git -C C:\projects\manual_slop commit --allow-empty -m "conductor(checkpoint): Test consolidation & TOML sandboxing complete" git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Track complete. Audit script passes (0 violations). enforce_no_real_toml fixture in place. N tests migrated, M consolidations performed (or 0 if none were warranted)." $_ } ``` --- ## Self-Review - **Spec coverage:** All design phases have a task. ✓ - **Placeholder scan:** Migration example is concrete. ✓ - **Type consistency:** Fixture name `enforce_no_real_toml` used consistently. ✓ - **Consolidation is opt-in:** Task 5 explicitly says "if merging improves clarity" — not forced. ✓