Test Consolidation & TOML Sandboxing Enforcement
Date: 2026-06-02
Status: Draft (pending review)
Context & Motivation
The Manual Slop test suite has grown to ~258 test files. Many tests read or write project TOML files (manual_slop.toml, config.toml, credentials.toml, presets.toml, etc.) for fixtures. The pattern is inconsistent:
- Some tests use tmp_path + monkeypatch (good: isolated)
- Some tests use real ./ paths (bad: pollutes user config)
- Some tests use mock paths at module level (good: fast)
The user wants to:
- Audit tests for real-TOML usage
- Migrate offenders to sandboxed variants
- Consolidate similar tests where it improves clarity
- Enforce the rule going forward
The isolate_workspace autouse fixture in tests/conftest.py (added in the May 2026 docs refresh work) is the foundation for the migration pattern.
Scope
In Scope
- Audit all tests/*.py for direct path references to ./ TOML files
- Migrate offenders to use tmp_path + monkeypatch (or isolate_workspace)
- Consolidate similar tests where it improves clarity (judgment call)
- Add a tests/conftest.py autouse fixture that prevents regression
- Add a scripts/check_test_toml_paths.py script for CI/pre-commit
- Add tests for the enforcement mechanism itself
Out of Scope
- Rewriting tests for clarity (only consolidation where it improves maintainability)
- Adding new tests (other than the tests for the enforcement mechanism itself)
- Changing the test runner (pytest stays)
- Coverage tooling changes
Design
Phase 1: Audit
A script that greps tests/*.py for problematic patterns:
# scripts/check_test_toml_paths.py
import re
from pathlib import Path
from typing import List, Tuple

# Each pattern accepts single or double quotes around the file name.
PROBLEMATIC_PATTERNS = [
    r'Path\(["\'](?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']\)',
    r'open\(["\'](?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']',
    r'["\']\.{1,2}/(?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']',
]

def find_violations(tests_dir: Path) -> List[Tuple[Path, int, str]]:
    """Returns list of (file, line, pattern) for each violation."""
    ...
Run this script as the first step. Output a report grouped by file.
Phase 2: Migrate Offenders
For each violation, refactor the test to use the sandboxed pattern:
Before (real TOML):
def test_load_presets():
    path = Path("presets.toml")  # Real file!
    if path.exists():
        data = tomllib.loads(path.read_text())
        assert data is not None
After (sandboxed):
def test_load_presets(tmp_path, monkeypatch):
    path = tmp_path / "presets.toml"
    path.write_text("[presets.test]\nkey = 'value'\n")
    # Patch the path module to point to tmp_path; the monkeypatch fixture
    # undoes this automatically, even if the test fails
    monkeypatch.setattr("src.paths.get_global_presets_path", lambda: path)
    data = tomllib.loads(path.read_text())
    assert data["presets"]["test"]["key"] == "value"
Or use the isolate_workspace autouse fixture (already in conftest.py) which redirects all path resolution to tmp_path.
Phase 3: Consolidate (Judgment Call)
Examples of consolidation opportunities (NOT a forced refactor):
| Current | Proposed | Rationale |
|---|---|---|
| test_ai_settings_layout.py + test_sim_ai_settings.py | test_ai_settings.py with parametrize | Tests cover same surface |
| test_*_provider.py (5+ files) | test_providers.py parametrized | Each provider test has same shape |
| test_*_preset*.py (3 files) | test_presets.py with class organization | Settings/presets/tools all CRUD TOML |
| test_*_screenshot*.py | test_screenshots.py | Currently fragmented |
Each consolidation is reviewed case-by-case. Test count is not a goal; test clarity is. Don't merge tests that test different things just to reduce file count.
Phase 4: Enforce
4a. Autouse fixture in tests/conftest.py:
@pytest.fixture(autouse=True)
def enforce_no_real_toml():
    """Hides any ./<name>.toml for the duration of each test so that a direct
    read fails with FileNotFoundError, then restores the files afterwards."""
    # Resolve against the cwd at setup so restoration still targets the right
    # directory if the test changes the working directory
    root = Path.cwd()
    real_toml_paths = [
        root / "manual_slop.toml",
        root / "config.toml",
        root / "credentials.toml",
        root / "presets.toml",
        root / "personas.toml",
        root / "tool_presets.toml",
        root / "workspace_profiles.toml",
    ]
    # If any real TOML exists in the cwd, save it for restoration
    snapshots = {}
    for p in real_toml_paths:
        if p.exists():
            snapshots[p] = p.read_bytes()
            p.unlink()  # Remove to prevent test from reading
    try:
        yield  # Run the test
    finally:
        # Restore after test
        for p, content in snapshots.items():
            p.write_bytes(content)
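This is strict: any test that tries to read a real TOML from the cwd gets FileNotFoundError, so tests must use tmp_path or monkeypatch. Because the real files are deleted from disk for the duration of each test and their contents are held only in memory, an interrupted run (for example, a killed process) can leave them missing.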
If this is too aggressive, a softer alternative:
@pytest.fixture(autouse=True)
def warn_on_real_toml():
    """Warns if a test reads a real TOML. Does not fail by default;
    set ENFORCE_NO_REAL_TOML=1 to convert warnings to failures."""
    ...
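One way the warn-only variant might detect reads is to wrap the open call while a test runs. The sketch below is an illustration under assumptions, not the project's implementation: `guard_real_toml` and `GUARDED_NAMES` are hypothetical names, only files directly inside the given project root are checked, and only opens that go through `builtins.open` or `io.open` (which pathlib uses) are seen.

```python
import builtins
import io
import os
import warnings
from contextlib import contextmanager
from pathlib import Path
from unittest import mock

# Hypothetical constant: the TOML file names listed in the strict fixture.
GUARDED_NAMES = frozenset({
    "manual_slop.toml", "config.toml", "credentials.toml", "presets.toml",
    "personas.toml", "tool_presets.toml", "workspace_profiles.toml",
})


@contextmanager
def guard_real_toml(project_root):
    """Warn when a guarded TOML directly inside project_root is opened.

    With ENFORCE_NO_REAL_TOML=1 in the environment, raise AssertionError instead.
    """
    real_open = builtins.open
    root = Path(project_root).resolve()
    enforce = os.environ.get("ENFORCE_NO_REAL_TOML") == "1"

    def guarded_open(file, *args, **kwargs):
        # File descriptors (ints) and bytes paths are passed through unchecked.
        if isinstance(file, (str, Path)):
            target = Path(file).resolve()
            if target.parent == root and target.name in GUARDED_NAMES:
                message = f"test opened real TOML: {target}"
                if enforce:
                    raise AssertionError(message)
                warnings.warn(message, stacklevel=2)
        return real_open(file, *args, **kwargs)

    # pathlib opens files through io.open, so both names are patched.
    with mock.patch.object(builtins, "open", guarded_open), \
         mock.patch.object(io, "open", guarded_open):
        yield
```

Under these assumptions, the body of warn_on_real_toml could be a `with guard_real_toml(Path.cwd()):` block around its `yield`. Unlike the strict fixture, this approach never removes the user's files; the trade-off is that code bypassing `open` (for example `os.open`) goes unnoticed.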
4b. CI script scripts/check_test_toml_paths.py — runs on every commit:
# Greps for direct ./<name>.toml references
# Exits non-zero if any found
# Output: "test_foo.py:42: Path('presets.toml') — direct reference to real TOML"
Add to conductor/... workflow or as a pre-commit hook (out of scope for this track — just provide the script).
Phase 5: Test the Enforcer
tests/test_enforce_no_real_toml.py — meta-test:
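import os
from pathlib import Path

import pytest

ORIGINAL = b"[test]\nkey='original'"

@pytest.fixture(scope="module")
def planted_toml(tmp_path_factory):
    """Plants a guarded TOML in a scratch cwd. Module scope makes this run before
    the function-scoped enforcer, so the enforcer finds the file at setup.
    Assumes no other fixture changes the cwd in between."""
    root = tmp_path_factory.mktemp("enforcer")
    planted = root / "presets.toml"
    planted.write_bytes(ORIGINAL)
    previous_cwd = Path.cwd()
    os.chdir(root)
    yield planted
    os.chdir(previous_cwd)
    # Runs after the last test's enforcer teardown, so the file must be back
    assert planted.read_bytes() == ORIGINAL

def test_enforcer_catches_violation(planted_toml):
    """Verify the fixture prevents reading a real TOML."""
    # The fixture removed the planted file during setup; try to read it
    with pytest.raises(FileNotFoundError):
        planted_toml.read_text()

def test_enforcer_restores_real_tomls(planted_toml):
    """Verify the fixture restores real TOMLs after the test."""
    # The previous test's teardown restored the file and this test's setup hid
    # it again; the restored content is checked in planted_toml's teardown
    assert not planted_toml.exists()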
File Structure
- scripts/check_test_toml_paths.py (NEW): greps for violations, exits non-zero
- tests/conftest.py (MODIFY): add enforce_no_real_toml autouse fixture (strict or warn-only)
- tests/test_enforce_no_real_toml.py (NEW): tests for the enforcer
- Various tests/test_*.py (MODIFY): migrate offenders to sandboxed pattern
- Various tests/test_*.py (MODIFY): consolidate where it improves clarity
Acceptance Criteria
- All existing tests pass after migration
- scripts/check_test_toml_paths.py exits 0 on the test suite after migration
- The autouse fixture catches new violations in CI
- Test count is approximately the same after consolidation (slight decrease acceptable)
- No real TOML files in the user's project are touched by the test suite
Risks
- Test breakage: Migration may break tests that depend on real-file behavior. Mitigation: run full test suite after each migration batch.
- Performance: The autouse fixture adds overhead to every test. Mitigation: keep it cheap (it only snapshots and restores the few guarded TOML files that actually exist).
- Coverage regression: Removing real-file behavior may hide bugs. Mitigation: add explicit tests for the sandboxed path resolution.