# Test Consolidation & TOML Sandboxing Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Audit tests for real-TOML usage, migrate offenders to sandboxed patterns, consolidate similar tests where it improves clarity, and enforce the rule going forward.

**Architecture:** `scripts/check_test_toml_paths.py` is the audit tool. `tests/conftest.py` gets a new autouse fixture `enforce_no_real_toml`. Migration follows the existing `isolate_workspace` pattern (already in conftest.py) using `tmp_path` + `monkeypatch`. Consolidation is judgment-call per area.

**Tech Stack:** Python 3.11+, pytest, regex (stdlib)

**Spec:** `docs/superpowers/specs/2026-06-02-test-consolidation-design.md`

---
## Execution Constraints
- **No subagents.** Execute as a single agent.
- **Pre-edit checkpoint:** `git add .` before any file edit.
- **Per-file atomic commits:** One file = one commit.
- **Commit message format:** `<type>(<scope>): <imperative description>`.
- **Git note format:** 3-8 line rationale per commit.
- **Style baseline:** 1-space indent, no comments, type hints.
---
## File Structure
| File | Action | Responsibility |
|---|---|---|
| `scripts/check_test_toml_paths.py` | Create | Greps for direct `./<name>.toml` references in tests; CI gate |
| `tests/conftest.py` | Modify | Add `enforce_no_real_toml` autouse fixture |
| `tests/test_enforce_no_real_toml.py` | Create | Tests for the enforcer itself |
| Various `tests/test_*.py` | Modify | Migrate offenders to sandboxed pattern |
---
## Task 1: Build the audit script
**Files:**
- Create: `scripts/check_test_toml_paths.py`
- [ ] **Step 1.1: Pre-edit checkpoint**
```powershell
git -C C:\projects\manual_slop add .
```
- [ ] **Step 1.2: Create the audit script**
```python
#!/usr/bin/env python3
# scripts/check_test_toml_paths.py
"""Detect tests that read/write real TOML files. Used as a CI gate.
Run from repo root: python scripts/check_test_toml_paths.py
Exits 0 if all tests use sandboxed paths, 1 otherwise.
"""
from __future__ import annotations
import re
import sys
from pathlib import Path
TOML_BASENAMES = {
 "manual_slop", "config", "credentials",
 "presets", "personas", "tool_presets",
 "workspace_profiles",
}
_NAMES = "|".join(sorted(TOML_BASENAMES))
PATTERNS = [
 re.compile(rf'Path\(["\'](?:{_NAMES})\.toml["\']'),
 re.compile(rf'open\(["\'](?:{_NAMES})\.toml["\']'),
 re.compile(rf'["\']\.{{1,2}}/(?:{_NAMES})\.toml["\']'),
 re.compile(rf'Path\(["\']\.\./(?:{_NAMES})\.toml["\']'),
]
EXCLUDE_DIRS = {"artifacts", "logs", "__pycache__", "snapshots"}
def find_violations(tests_dir: Path) -> list[tuple[Path, int, str]]:
 violations: list[tuple[Path, int, str]] = []
 for test_file in tests_dir.rglob("test_*.py"):
  if any(excluded in test_file.parts for excluded in EXCLUDE_DIRS):
   continue
  try:
   content = test_file.read_text(encoding="utf-8")
  except (OSError, UnicodeDecodeError):
   continue
  for lineno, line in enumerate(content.splitlines(), start=1):
   for pattern in PATTERNS:
    if pattern.search(line):
     violations.append((test_file, lineno, line.strip()))
     break
 return violations
def main() -> int:
 repo_root = Path(__file__).resolve().parent.parent
 tests_dir = repo_root / "tests"
 if not tests_dir.exists():
  print(f"Tests dir not found: {tests_dir}", file=sys.stderr)
  return 1
 violations = find_violations(tests_dir)
 if not violations:
  print("OK: No tests reference real TOML files.")
  return 0
 print(f"FAIL: {len(violations)} test(s) reference real TOML files:")
 for path, lineno, line in violations:
  rel = path.relative_to(repo_root)
  print(f" {rel}:{lineno}: {line}")
 return 1
if __name__ == "__main__":
 sys.exit(main())
```
- [ ] **Step 1.3: Run the script (expect failures)**
```powershell
python scripts/check_test_toml_paths.py
```
Expected: Outputs a list of violations (since the migration hasn't happened yet).
- [ ] **Step 1.4: Commit**
```powershell
git -C C:\projects\manual_slop add scripts/check_test_toml_paths.py
git -C C:\projects\manual_slop commit -m "feat(tests): add check_test_toml_paths.py audit script"
git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Greps tests/ for direct ./<name>.toml references. Excludes artifacts, logs, __pycache__, snapshots. Exits 0 on clean, 1 on violations. Used as CI gate." $_ }
```
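
To make the audit's reach concrete, the sketch below rebuilds the first three patterns and runs them against a few sample lines. It is a standalone illustration rather than the script itself: `QUOTE`, `is_flagged`, and the sample lines are hypothetical. It suggests that paths built with the `/` operator (for example `Path.cwd() / "config.toml"`) are not matched, so a clean audit result is a necessary check, not a sufficient one.

```python
import re

# Assumed copy of the basenames used by the audit script.
TOML_BASENAMES = [
 "manual_slop", "config", "credentials",
 "presets", "personas", "tool_presets", "workspace_profiles",
]
NAMES = "|".join(TOML_BASENAMES)
QUOTE = r"""["']"""  # either quote character

PATTERNS = [
 re.compile(rf"Path\({QUOTE}(?:{NAMES})\.toml{QUOTE}"),
 re.compile(rf"open\({QUOTE}(?:{NAMES})\.toml{QUOTE}"),
 re.compile(rf"{QUOTE}\.{{1,2}}/(?:{NAMES})\.toml{QUOTE}"),
]

def is_flagged(line: str) -> bool:
 # A line is a violation if any pattern matches anywhere in it.
 return any(pattern.search(line) for pattern in PATTERNS)

# Hypothetical sample lines: the first three are flagged, the last two are not.
SAMPLES = [
 'p = Path("presets.toml")',
 "f = open('config.toml')",
 'p = "../credentials.toml"',
 'p = tmp_path / "presets.toml"',
 'p = Path.cwd() / "config.toml"',
]
for sample in SAMPLES:
 print(is_flagged(sample), sample)
```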
---
## Task 2: Add the enforcement fixture
**Files:**
- Modify: `tests/conftest.py`
- [ ] **Step 2.1: Pre-edit checkpoint**
```powershell
git -C C:\projects\manual_slop add .
```
- [ ] **Step 2.2: Read current `tests/conftest.py` to find insertion point**
Use `manual-slop_py_get_code_outline` to see the existing fixtures (especially `isolate_workspace` at line 71).
- [ ] **Step 2.3: Add the `enforce_no_real_toml` fixture**
Add near the existing `isolate_workspace` fixture (around line 70-90):
```python
# In tests/conftest.py, after isolate_workspace
@pytest.fixture(autouse=True)
def enforce_no_real_toml(request, tmp_path, monkeypatch):
 """Snapshot any real TOML files in cwd, remove them for the test, restore after.
 This prevents tests from accidentally reading/writing the user's real config.
 Tests must use tmp_path or monkeypatch to get their TOML data.
 """
 from pathlib import Path as _P
 real_toml_basenames = [
  "manual_slop.toml", "config.toml", "credentials.toml",
  "presets.toml", "personas.toml", "tool_presets.toml",
  "workspace_profiles.toml",
 ]
 snapshots: dict[_P, bytes] = {}
 cwd = _P.cwd()
 for name in real_toml_basenames:
  p = cwd / name
  if p.exists():
   snapshots[p] = p.read_bytes()
   p.unlink()
 try:
  yield
 finally:
  for p, content in snapshots.items():
   p.write_bytes(content)
```
- [ ] **Step 2.4: Run the existing test suite to verify the fixture doesn't break things**
```powershell
uv run pytest tests/ -x --timeout=60 -q
```
Expected: Existing tests should still pass. If a test was relying on a real TOML being present, it'll fail and need migration (next task).
- [ ] **Step 2.5: Commit**
```powershell
git -C C:\projects\manual_slop add tests/conftest.py
git -C C:\projects\manual_slop commit -m "test(infra): add enforce_no_real_toml autouse fixture"
git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Autouse fixture in tests/conftest.py. Snapshots any real TOML in cwd before test, removes it, restores after. Tests must use tmp_path or monkeypatch to access TOML data." $_ }
```
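
The snapshot, remove, restore sequence used by the fixture can be tried outside pytest. The sketch below expresses the same idea as a context manager over a temporary directory; `hide_files` and `demo` are hypothetical names used only for illustration. The `try`/`finally` mirrors the fixture: the restore runs even if the body raises.

```python
import contextlib
import tempfile
from collections.abc import Iterator
from pathlib import Path

@contextlib.contextmanager
def hide_files(directory: Path, names: list[str]) -> Iterator[None]:
 # Hypothetical helper: snapshot each listed file that exists, then remove it.
 snapshots: dict[Path, bytes] = {}
 for name in names:
  p = directory / name
  if p.exists():
   snapshots[p] = p.read_bytes()
   p.unlink()
 try:
  yield
 finally:
  # Write the original bytes back, even when the body raised.
  for p, content in snapshots.items():
   p.write_bytes(content)

def demo() -> tuple[bool, bytes]:
 # Returns (file was hidden inside the block, bytes restored afterwards).
 with tempfile.TemporaryDirectory() as d:
  root = Path(d)
  cfg = root / "config.toml"
  cfg.write_bytes(b"key = 1\n")
  with hide_files(root, ["config.toml", "presets.toml"]):
   hidden = not cfg.exists()
  return hidden, cfg.read_bytes()

print(demo())
```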
---
## Task 3: Tests for the enforcer
**Files:**
- Create: `tests/test_enforce_no_real_toml.py`
- [ ] **Step 3.1: Pre-edit checkpoint**
```powershell
git -C C:\projects\manual_slop add .
```
- [ ] **Step 3.2: Create the meta-test**
```python
# tests/test_enforce_no_real_toml.py
import os
from pathlib import Path
import subprocess
import sys
def test_check_script_exits_zero_on_clean_suite():
 """The audit script exits 0 (clean) or 1 (violations found), never anything else."""
 result = subprocess.run(
  [sys.executable, "scripts/check_test_toml_paths.py"],
  capture_output=True, text=True,
  cwd=os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
 )
 # If we're in a clean state, this should be 0.
 # If there are still violations, this documents the current state.
 assert result.returncode in (0, 1), f"Unexpected exit code: {result.returncode}"
def test_enforce_fixture_runs_without_error():
 """The fixture completes without raising."""
 # If we get here, the fixture ran successfully.
 assert True
def test_unlisted_toml_left_untouched(tmp_path, monkeypatch):
 """Verify the fixture leaves TOML files outside real_toml_basenames alone."""
 cwd = Path.cwd()
 real_path = cwd / "_enforce_test_temp.toml"
 if real_path.exists():
  real_path.unlink()
 real_path.write_bytes(b"[test]\nkey='value'")
 try:
  # _enforce_test_temp.toml is not in real_toml_basenames,
  # so the fixture never removes it and it must still exist here.
  assert real_path.exists()
 finally:
  if real_path.exists():
   real_path.unlink()
```
- [ ] **Step 3.3: Run the meta-tests**
```powershell
uv run pytest tests/test_enforce_no_real_toml.py -v
```
Expected: 3 tests pass.
- [ ] **Step 3.4: Commit**
```powershell
git -C C:\projects\manual_slop add tests/test_enforce_no_real_toml.py
git -C C:\projects\manual_slop commit -m "test(infra): add tests for enforce_no_real_toml fixture"
git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Meta-tests for the enforcer: script exit codes, fixture completes, real-file state not affected by unrelated tmp files." $_ }
```
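
The meta-tests above do not directly observe the fixture removing a listed file. One possible extra check is sketched below. It assumes the autouse fixture is active and that the working directory has not changed since fixture setup; `REAL_TOML_BASENAMES` is an assumed copy of the fixture's list, and `listed_tomls_present` is a hypothetical helper.

```python
from pathlib import Path

# Assumed copy of the list inside enforce_no_real_toml.
REAL_TOML_BASENAMES = [
 "manual_slop.toml", "config.toml", "credentials.toml",
 "presets.toml", "personas.toml", "tool_presets.toml",
 "workspace_profiles.toml",
]

def listed_tomls_present(directory: Path) -> list[str]:
 # Hypothetical helper: names from the list that currently exist in `directory`.
 return [name for name in REAL_TOML_BASENAMES if (directory / name).exists()]

def test_listed_tomls_absent_during_test() -> None:
 # Assumption: the fixture removed every listed file from cwd before this body runs.
 assert listed_tomls_present(Path.cwd()) == []
```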
---
## Task 4: Migrate test offenders
**Files:**
- Various `tests/test_*.py` (find via `check_test_toml_paths.py`)
- [ ] **Step 4.1: Generate the offender list**
```powershell
python scripts/check_test_toml_paths.py 2>&1 | Tee-Object -FilePath scripts/_offenders.txt
```
- [ ] **Step 4.2: For each offender, migrate**
Pick one violation at a time. For each:
1. Read the test file
2. Identify the line with the violation
3. Refactor to use `tmp_path` + `monkeypatch` (or the `isolate_workspace` fixture if it's already running)
4. Run that test in isolation to verify it still passes
5. Commit the change
**Example migration pattern:**
Before:
```python
def test_load_presets():
 presets_path = Path("presets.toml")
 data = tomllib.loads(presets_path.read_text())
 assert "default" in data
```
After:
```python
def test_load_presets(tmp_path, monkeypatch):
 from src import paths
 presets_path = tmp_path / "presets.toml"
 presets_path.write_text("default = {}\n")
 monkeypatch.setattr(paths, "get_global_presets_path", lambda: presets_path)
 data = tomllib.loads(presets_path.read_text())
 assert "default" in data
```
- [ ] **Step 4.3: Re-run the audit script after each batch**
```powershell
python scripts/check_test_toml_paths.py
```
Goal: 0 violations.
- [ ] **Step 4.4: Commit per-file**
```powershell
git -C C:\projects\manual_slop add tests/test_<migrated_file>.py
git -C C:\projects\manual_slop commit -m "test(<scope>): migrate <file> to sandboxed TOML pattern"
git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Migrated <N> tests in <file> from real TOML to tmp_path + monkeypatch. Verified with full test suite." $_ }
```
(Repeat for each file.)
---
## Task 5: Consolidate similar tests (judgment call)
**Files:**
- Various `tests/test_*.py`
- [ ] **Step 5.1: Identify consolidation candidates**
Run an analysis of test files:
```powershell
Get-ChildItem C:\projects\manual_slop\tests\test_*.py | Group-Object { $_.BaseName -replace '^test_', '' -replace '_.*$', '' } | Where-Object { $_.Count -gt 2 } | Select-Object Name, Count
```
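Look for groups with more than 2 files (e.g., `test_ai_settings_*.py`). The command groups by the first word after `test_`, so families that share only a suffix (e.g., `test_*_provider.py`) will not appear as one group and need a separate look.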
- [ ] **Step 5.2: For each candidate, evaluate**
For each candidate group, ask:
- Are these tests testing the same thing?
- Would merging them make the failure messages clearer?
- Would merging them make it harder to find a specific test?
- Is the test count reduction a real win, or just a number?
If the answer is "merging improves clarity," proceed. Otherwise, leave alone.
- [ ] **Step 5.3: Consolidation pattern (if proceeding)**
Before (3 files):
```
tests/test_provider_gemini_init.py
tests/test_provider_anthropic_init.py
tests/test_provider_deepseek_init.py
```
After (1 file, parametrized):
```python
# tests/test_provider_init.py
import pytest
@pytest.mark.parametrize("provider_name", ["gemini", "anthropic", "deepseek"])
def test_provider_init(provider_name, tmp_path, monkeypatch):
 # Common test logic, parametrized
 ...
```
- [ ] **Step 5.4: Commit per consolidation**
```powershell
git -C C:\projects\manual_slop add tests/test_consolidated.py tests/test_removed_file1.py tests/test_removed_file2.py
git -C C:\projects\manual_slop commit -m "test(consolidate): merge <topic> tests into parametrized single file"
git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Consolidated N test files into 1 parametrized test. <Rationale for clarity gain>." $_ }
```
(Only if a consolidation was actually performed.)
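
The candidate search in Step 5.1 can also be done in Python, counting files by both the first and the last word of the name so that prefix and suffix families both show up. This is a rough sketch: `family_counts` is a hypothetical helper, and the `test_*_provider.py` names in the sample are made up for illustration.

```python
from collections import Counter
from pathlib import Path

def family_counts(filenames: list[str]) -> tuple[Counter, Counter]:
 # Count test files by the first and by the last word after the `test_` prefix.
 prefixes: Counter = Counter()
 suffixes: Counter = Counter()
 for name in filenames:
  parts = Path(name).stem.removeprefix("test_").split("_")
  prefixes[parts[0]] += 1
  suffixes[parts[-1]] += 1
 return prefixes, suffixes

# Sample names: the first three come from Step 5.3, the rest are hypothetical.
SAMPLE = [
 "test_provider_gemini_init.py",
 "test_provider_anthropic_init.py",
 "test_provider_deepseek_init.py",
 "test_gemini_provider.py",
 "test_anthropic_provider.py",
 "test_deepseek_provider.py",
]
prefixes, suffixes = family_counts(SAMPLE)
print({k: v for k, v in prefixes.items() if v > 2})
print({k: v for k, v in suffixes.items() if v > 2})
```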
---
## Task 6: Phase Completion Verification
- [ ] **Step 6.1: Final audit**
```powershell
python scripts/check_test_toml_paths.py
```
Expected: "OK: No tests reference real TOML files." Exit 0.
- [ ] **Step 6.2: Full test suite**
```powershell
uv run pytest tests/ -q --timeout=60
```
Expected: All tests pass.
- [ ] **Step 6.3: Create the checkpoint commit**
```powershell
git -C C:\projects\manual_slop commit --allow-empty -m "conductor(checkpoint): Test consolidation & TOML sandboxing complete"
git -C C:\projects\manual_slop log -1 --format='%H' | ForEach-Object { git -C C:\projects\manual_slop notes add -m "Track complete. Audit script passes (0 violations). enforce_no_real_toml fixture in place. N tests migrated, M consolidations performed (or 0 if none were warranted)." $_ }
```
---
## Self-Review
- **Spec coverage:** All design phases have a task. ✓
- **Placeholder scan:** Migration example is concrete. ✓
- **Type consistency:** Fixture name `enforce_no_real_toml` used consistently. ✓
- **Consolidation is opt-in:** Task 5 explicitly says "if merging improves clarity" — not forced. ✓