# Test Consolidation & TOML Sandboxing Enforcement
**Date:** 2026-06-02

**Status:** Draft (pending review)

---
## Context & Motivation
The Manual Slop test suite has grown to ~258 test files. Many tests read or write project TOML files (manual_slop.toml, config.toml, credentials.toml, presets.toml, etc.) for fixtures. The pattern is inconsistent:
- Some tests use `tmp_path` + `monkeypatch` (good — isolated)
- Some tests use real `./` paths (bad — pollutes user config)
- Some tests use mock paths at module level (good — fast)
The user wants to:
1. Audit tests for real-TOML usage
2. Migrate offenders to sandboxed variants
3. Consolidate similar tests where it improves clarity
4. Enforce the rule going forward
The `isolate_workspace` autouse fixture in `tests/conftest.py` (added in the May 2026 docs refresh work) is the foundation for the migration pattern.

---
## Scope
### In Scope
- Audit all `tests/*.py` for direct path references to `./` TOML files
- Migrate offenders to use `tmp_path` + `monkeypatch` (or `isolate_workspace`)
- Consolidate similar tests where it improves clarity (judgment call)
- Add a `tests/conftest.py` autouse fixture that prevents regression
- Add a `scripts/check_test_toml_paths.py` script for CI/pre-commit
- Add tests for the enforcement mechanism itself
### Out of Scope
- Rewriting tests for clarity (only consolidation where it improves maintainability)
- Adding new tests
- Changing the test runner (pytest stays)
- Coverage tooling changes
---
## Design
### Phase 1: Audit
A script that greps `tests/*.py` for problematic patterns:
```python
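# scripts/check_test_toml_paths.py
import re
from pathlib import Path
from typing import List, Tuple

PROBLEMATIC_PATTERNS = [
    r'Path\(["\'](?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']\)',
    r'open\(["\'](?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']',
    r'["\']\.{1,2}/(?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']',
]

def find_violations(tests_dir: Path) -> List[Tuple[Path, int, str]]:
    """Returns list of (file, line, pattern) for each violation."""
    violations = []
    for path in sorted(tests_dir.glob("*.py")):  # tests/*.py, non-recursive
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for pattern in PROBLEMATIC_PATTERNS:
                if re.search(pattern, line):
                    violations.append((path, lineno, pattern))
    return violations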
```
Run this script as the first step. Output a report grouped by file.
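One way the grouped report could be produced is sketched below. It takes the `(file, line, pattern)` tuples that `find_violations` returns; the `format_report` name and the exact text layout are assumptions for illustration, not part of the spec.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Tuple

Violation = Tuple[Path, int, str]  # (file, line number, matched pattern)


def format_report(violations: List[Violation]) -> str:
    """Render violations as plain text with one section per file."""
    by_file: Dict[Path, List[Violation]] = defaultdict(list)
    for violation in violations:
        by_file[violation[0]].append(violation)

    lines = []
    for path in sorted(by_file):
        # Within a file, list violations in line order
        entries = sorted(by_file[path], key=lambda v: v[1])
        lines.append(f"{path} ({len(entries)} violation(s))")
        for _, lineno, pattern in entries:
            lines.append(f"  line {lineno}: matched {pattern}")
    return "\n".join(lines)
```

Grouping by file keeps each migration batch in Phase 2 scoped to one test module at a time.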
### Phase 2: Migrate Offenders
For each violation, refactor the test to use the sandboxed pattern:
**Before (real TOML):**
```python
def test_load_presets():
    path = Path("presets.toml")  # Real file!
    if path.exists():
        data = tomllib.loads(path.read_text())
        assert data is not None
```
**After (sandboxed):**
```python
def test_load_presets(tmp_path, monkeypatch):
    path = tmp_path / "presets.toml"
    path.write_text("[presets.test]\nkey = 'value'\n")
    # Patch the path module to point to tmp_path; the monkeypatch fixture
    # undoes this automatically on teardown, even if the test fails
    monkeypatch.setattr("src.paths.get_global_presets_path", lambda: path)
    data = tomllib.loads(path.read_text())
    assert data["presets"]["test"]["key"] == "value"
```
Alternatively, rely on the `isolate_workspace` autouse fixture (already in `tests/conftest.py`), which redirects all path resolution to `tmp_path`.
### Phase 3: Consolidate (Judgment Call)
Examples of consolidation opportunities (NOT a forced refactor):
| Current | Proposed | Rationale |
|---|---|---|
| `test_ai_settings_layout.py` + `test_sim_ai_settings.py` | `test_ai_settings.py` with parametrize | Tests cover same surface |
| `test_*_provider.py` (5+ files) | `test_providers.py` parametrized | Each provider test has same shape |
| `test_*_preset*.py` (3 files) | `test_presets.py` with class organization | Settings/presets/tools all CRUD TOML |
| `test_*_screenshot*.py` | `test_screenshots.py` | Currently fragmented |
Each consolidation is reviewed case-by-case. **Test count is not a goal; test clarity is.** Don't merge tests that test different things just to reduce file count.
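As an illustration of the parametrized shape proposed in the table, the sketch below folds several same-shaped provider tests into one function. The provider names and the `write_provider_config` helper are hypothetical placeholders; the real values would come from the existing `test_*_provider.py` files.

```python
from pathlib import Path

import pytest

# Hypothetical provider names, for illustration only
PROVIDERS = ["alpha", "beta", "gamma"]


def write_provider_config(directory: Path, provider: str) -> Path:
    """Hypothetical helper: writes a minimal per-provider TOML into a sandbox directory."""
    path = directory / "config.toml"
    path.write_text(f'[ai]\nprovider = "{provider}"\n', encoding="utf-8")
    return path


@pytest.mark.parametrize("provider", PROVIDERS)
def test_provider_config_is_sandboxed(provider, tmp_path):
    # Each parametrized case gets its own tmp_path, so cases cannot interfere
    path = write_provider_config(tmp_path, provider)
    assert path.parent == tmp_path
    assert f'provider = "{provider}"' in path.read_text(encoding="utf-8")
```

pytest reports each provider as a separate test case, so a failure still points at one provider even though the file count drops.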
### Phase 4: Enforce
**4a. Autouse fixture** in `tests/conftest.py`:
```python
import pytest
from pathlib import Path

@pytest.fixture(autouse=True)
def enforce_no_real_toml():
    """Hides the real ./<name>.toml files for the duration of each test, so any
    test that reads one gets FileNotFoundError, then restores them afterwards."""
    # Anchor to the cwd up front so restoration still targets the right files
    # if the test (or another fixture) changes directory
    cwd = Path.cwd()
    real_toml_paths = [
        cwd / "manual_slop.toml",
        cwd / "config.toml",
        cwd / "credentials.toml",
        cwd / "presets.toml",
        cwd / "personas.toml",
        cwd / "tool_presets.toml",
        cwd / "workspace_profiles.toml",
    ]
    # If any real TOML exists in the cwd, save it for restoration
    snapshots = {}
    for p in real_toml_paths:
        if p.exists():
            snapshots[p] = p.read_bytes()
            p.unlink()  # Remove to prevent test from reading
    yield  # Run the test
    # Restore after test (pytest runs this even when the test fails)
    for p, content in snapshots.items():
        p.write_bytes(content)
```
This is **strict** — any test that tries to read a real TOML will get FileNotFoundError. Tests must use `tmp_path` or `monkeypatch`.
If this is too aggressive, a softer alternative:
```python
@pytest.fixture(autouse=True)
def warn_on_real_toml():
    """Warns if a test reads a real TOML. Does not fail by default;
    set ENFORCE_NO_REAL_TOML=1 to convert warnings to failures."""
    ...
```
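The spec leaves the detection mechanism for the softer variant open. One possible approach, sketched here as an assumption rather than the chosen design, is to wrap `builtins.open` and `io.open` for the duration of a test. The `watch_real_toml_reads` name is hypothetical, and files opened through `os.open` or native extensions would not be seen.

```python
import builtins
import io
import os
import warnings
from contextlib import contextmanager
from pathlib import Path

PROTECTED_NAMES = {
    "manual_slop.toml", "config.toml", "credentials.toml", "presets.toml",
    "personas.toml", "tool_presets.toml", "workspace_profiles.toml",
}


@contextmanager
def watch_real_toml_reads(root: Path):
    """Warn (or raise, if ENFORCE_NO_REAL_TOML=1) when a protected TOML directly under root is opened."""
    root = root.resolve()
    real_open = builtins.open
    strict = os.environ.get("ENFORCE_NO_REAL_TOML") == "1"

    def guarded_open(file, *args, **kwargs):
        try:
            raw = os.fspath(file)  # raises TypeError for file descriptors
        except TypeError:
            raw = None
        if isinstance(raw, str):
            target = Path(raw).resolve()
            if target.parent == root and target.name in PROTECTED_NAMES:
                message = f"test opened real TOML: {target}"
                if strict:
                    raise AssertionError(message)
                warnings.warn(message, stacklevel=2)
        return real_open(file, *args, **kwargs)

    # pathlib looks up io.open at call time, so patch both names
    builtins.open = io.open = guarded_open
    try:
        yield
    finally:
        builtins.open = io.open = real_open
```

An autouse fixture could wrap its `yield` in `with watch_real_toml_reads(Path.cwd()):`. Because nothing is deleted, a crash mid-test cannot leave the user's config missing, at the cost of weaker coverage than the strict variant.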
**4b. CI script** `scripts/check_test_toml_paths.py` — runs on every commit:
```python
# Greps for direct ./<name>.toml references
# Exits non-zero if any found
# Output: "test_foo.py:42: Path('presets.toml') — direct reference to real TOML"
```
Add to `conductor/...` workflow or as a pre-commit hook (out of scope for this track — just provide the script).
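A minimal sketch of the script's entry point follows, assuming the three Phase 1 patterns (here accepting either quote style) and a `main(argv)` function that returns the exit code. The `main` signature and the ASCII separator in the output line are assumptions; the real script may differ.

```python
import re
from pathlib import Path
from typing import List

NAMES = r"(?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml"
PATTERNS = [
    re.compile(r"""Path\(["']""" + NAMES + r"""["']\)"""),
    re.compile(r"""open\(["']""" + NAMES + r"""["']"""),
    re.compile(r"""["']\.{1,2}/""" + NAMES + r"""["']"""),
]


def main(argv: List[str]) -> int:
    """Scan argv[1] (default: tests) and return 1 if any direct reference is found."""
    tests_dir = Path(argv[1]) if len(argv) > 1 else Path("tests")
    found = 0
    for path in sorted(tests_dir.glob("*.py")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for pattern in PATTERNS:
                match = pattern.search(line)
                if match:
                    found += 1
                    print(f"{path.name}:{lineno}: {match.group(0)} - direct reference to real TOML")
                    break  # report each line once
    return 1 if found else 0
```

The script itself would end with `sys.exit(main(sys.argv))` so that CI or a pre-commit hook sees the non-zero status. Sandboxed references such as `tmp_path / "presets.toml"` are not flagged, because none of the patterns match a bare quoted file name.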
### Phase 5: Test the Enforcer
`tests/test_enforce_no_real_toml.py` — meta-test:
```python
import pytest
from pathlib import Path

PROTECTED = Path("presets.toml")  # one of the names the fixture hides
ORIGINAL = b"[test]\nkey='original'"

@pytest.fixture(scope="module")
def planted_toml():
    """Plants a real-looking TOML in the cwd before the per-test enforcer runs,
    and verifies the enforcer restored it once the module's tests are done."""
    target = PROTECTED.resolve()
    if target.exists():
        pytest.skip("a real presets.toml exists; refusing to overwrite it")
    target.write_bytes(ORIGINAL)
    yield target
    # Module teardown runs after the enforcer's teardown, so the file must be back
    try:
        assert target.read_bytes() == ORIGINAL, "enforcer did not restore the TOML"
    finally:
        target.unlink(missing_ok=True)

def test_enforcer_catches_violation(planted_toml):
    """Verify the fixture prevents reading a real TOML."""
    # Module-scoped fixtures are set up first, so the enforcer has already
    # removed the planted file by the time the test body starts
    with pytest.raises(FileNotFoundError):
        planted_toml.read_text()
```
---
## File Structure
- `scripts/check_test_toml_paths.py` — NEW: greps for violations, exits non-zero
- `tests/conftest.py` — MODIFY: add `enforce_no_real_toml` autouse fixture (strict or warn-only)
- `tests/test_enforce_no_real_toml.py` — NEW: tests for the enforcer
- Various `tests/test_*.py` — MODIFY: migrate offenders to sandboxed pattern
- Various `tests/test_*.py` — MODIFY: consolidate where it improves clarity
---
## Acceptance Criteria
- All existing tests pass after migration
- `scripts/check_test_toml_paths.py` exits 0 on the test suite after migration
- The autouse fixture catches new violations in CI
- Test count is approximately the same after consolidation (slight decrease acceptable)
- No real TOML files in the user's project are touched by the test suite
---
## Risks
1. **Test breakage:** Migration may break tests that depend on real-file behavior. Mitigation: run full test suite after each migration batch.
2. **Performance:** The autouse fixture adds overhead to every test. Mitigation: keep it cheap (it only checks seven paths in the cwd and snapshots the ones that exist).
3. **Coverage regression:** Removing real-file behavior may hide bugs. Mitigation: add explicit tests for the sandboxed path resolution.