Test Consolidation & TOML Sandboxing Enforcement

Date: 2026-06-02 Status: Draft (pending review)


Context & Motivation

The Manual Slop test suite has grown to ~258 test files. Many tests read or write project TOML files (manual_slop.toml, config.toml, credentials.toml, presets.toml, etc.) for fixtures. The pattern is inconsistent:

  • Some tests use tmp_path + monkeypatch (good — isolated)
  • Some tests use real ./ paths (bad — pollutes user config)
  • Some tests use mock paths at module level (good — fast)

The user wants to:

  1. Audit tests for real-TOML usage
  2. Migrate offenders to sandboxed variants
  3. Consolidate similar tests where it improves clarity
  4. Enforce the rule going forward

The isolate_workspace autouse fixture in tests/conftest.py (added in the May 2026 docs refresh work) is the foundation for the migration pattern.


Scope

In Scope

  • Audit all tests/*.py for direct path references to ./ TOML files
  • Migrate offenders to use tmp_path + monkeypatch (or isolate_workspace)
  • Consolidate similar tests where it improves clarity (judgment call)
  • Add a tests/conftest.py autouse fixture that prevents regression
  • Add a scripts/check_test_toml_paths.py script for CI/pre-commit
  • Add tests for the enforcement mechanism itself

Out of Scope

  • Rewriting individual tests for clarity (the only restructuring is consolidation, and only where it improves maintainability)
  • Adding new tests
  • Changing the test runner (pytest stays)
  • Coverage tooling changes

Design

Phase 1: Audit

A script that greps tests/*.py for problematic patterns:

# scripts/check_test_toml_paths.py
import re
from pathlib import Path
from typing import List, Tuple

# Each pattern matches a direct reference to a project TOML in the cwd.
# The first pattern accepts both quote styles, e.g. Path("presets.toml") and Path('presets.toml').
PROBLEMATIC_PATTERNS = [
    r'Path\(["\'](?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']\)',
    r'open\(["\'](?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']',
    r'["\']\.{1,2}/(?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']',
]

def find_violations(tests_dir: Path) -> List[Tuple[Path, int, str]]:
    """Returns a list of (file, line number, matched pattern) for each violation."""
    ...

Run this script as the first step. Output a report grouped by file.
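A minimal sketch of how find_violations could be filled in, assuming a flat tests/ directory and one report per offending line. The third tuple element here is the matched source text rather than the regex, and the skip_names parameter is a hypothetical addition that is not part of the signature above.

```python
import re
from pathlib import Path
from typing import FrozenSet, List, Tuple

NAMES = r'(?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)'
PATTERNS = [
    re.compile(r'Path\(["\']' + NAMES + r'\.toml["\']\)'),
    re.compile(r'open\(["\']' + NAMES + r'\.toml["\']'),
    re.compile(r'["\']\.{1,2}/' + NAMES + r'\.toml["\']'),
]


def find_violations(
    tests_dir: Path,
    skip_names: FrozenSet[str] = frozenset(),  # hypothetical allowlist of file names
) -> List[Tuple[Path, int, str]]:
    """Return (file, line number, matched text) for each offending line."""
    violations: List[Tuple[Path, int, str]] = []
    for file in sorted(tests_dir.glob("*.py")):
        if file.name in skip_names:
            continue
        lines = file.read_text(encoding="utf-8").splitlines()
        for line_no, line in enumerate(lines, start=1):
            for pattern in PATTERNS:
                match = pattern.search(line)
                if match:
                    violations.append((file, line_no, match.group(0)))
                    break  # report each line once, even if several patterns match
    return violations
```

An allowlist of some kind is probably needed in practice: tests/conftest.py itself lists Path("config.toml") and friends for the enforcement fixture, so a plain grep over tests/*.py would flag it.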

Phase 2: Migrate Offenders

For each violation, refactor the test to use the sandboxed pattern:

Before (real TOML):

import tomllib
from pathlib import Path

def test_load_presets():
    path = Path("presets.toml")  # Real file in the cwd!
    if path.exists():
        data = tomllib.loads(path.read_text())
    assert data is not None  # UnboundLocalError when the file is missing

After (sandboxed):

import tomllib

def test_load_presets(tmp_path, monkeypatch):
    path = tmp_path / "presets.toml"
    path.write_text("[presets.test]\nkey = 'value'\n")
    # Point the path resolver at the sandboxed file. The monkeypatch fixture
    # undoes the patch automatically, even when an assertion fails.
    monkeypatch.setattr("src.paths.get_global_presets_path", lambda: path)
    data = tomllib.loads(path.read_text())
    assert data["presets"]["test"]["key"] == "value"

Or use the isolate_workspace autouse fixture (already in conftest.py) which redirects all path resolution to tmp_path.
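The implementation of isolate_workspace is not shown in this spec. As a stdlib-only illustration of the general idea, and assuming the redirection works by changing the working directory (an assumption, not a description of the real fixture), a relative TOML path then lands inside the sandbox. The name sandboxed_cwd is hypothetical.

```python
import contextlib
import os
import tempfile
from pathlib import Path
from typing import Iterator


@contextlib.contextmanager
def sandboxed_cwd() -> Iterator[Path]:
    """Run the body with the cwd set to a fresh temporary directory."""
    previous = Path.cwd()
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            yield Path(tmp).resolve()
        finally:
            # Leave the temporary directory before it is deleted.
            os.chdir(previous)


if __name__ == "__main__":
    with sandboxed_cwd() as root:
        # A relative path now resolves inside the sandbox, not the project root.
        Path("presets.toml").write_text("[presets.test]\nkey = 'value'\n", encoding="utf-8")
        print((root / "presets.toml").is_file())
```

Inside a pytest fixture the same effect is usually obtained with monkeypatch.chdir(tmp_path), which restores the previous cwd at teardown.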

Phase 3: Consolidate (Judgment Call)

Examples of consolidation opportunities (NOT a forced refactor):

| Current | Proposed | Rationale |
| --- | --- | --- |
| test_ai_settings_layout.py + test_sim_ai_settings.py | test_ai_settings.py with parametrize | Tests cover same surface |
| test_*_provider.py (5+ files) | test_providers.py parametrized | Each provider test has same shape |
| test_*_preset*.py (3 files) | test_presets.py with class organization | Settings/presets/tools all CRUD TOML |
| test_*_screenshot*.py | test_screenshots.py | Currently fragmented |

Each consolidation is reviewed case-by-case. Test count is not a goal; test clarity is. Don't merge tests that test different things just to reduce file count.

Phase 4: Enforce

4a. Autouse fixture in tests/conftest.py:

from pathlib import Path

import pytest

@pytest.fixture(autouse=True)
def enforce_no_real_toml():
    """Prevents any test from reading ./<name>.toml by removing the file from
    the cwd for the duration of the test and restoring it afterwards."""

    real_toml_paths = [
        Path("manual_slop.toml"),
        Path("config.toml"),
        Path("credentials.toml"),
        Path("presets.toml"),
        Path("personas.toml"),
        Path("tool_presets.toml"),
        Path("workspace_profiles.toml"),
    ]

    snapshots = {}
    try:
        # If any real TOML exists in the cwd, save it for restoration
        for p in real_toml_paths:
            if p.exists():
                snapshots[p] = p.read_bytes()
                p.unlink()  # Remove to prevent test from reading
        yield  # Run the test (exactly one yield, outside the loop)
    finally:
        # Restore after the test, and also if snapshotting fails part-way
        for p, content in snapshots.items():
            p.write_bytes(content)

This is strict: any test that tries to read one of the listed TOML files from the cwd gets FileNotFoundError. Tests must use tmp_path or monkeypatch.

If this is too aggressive, a softer alternative:

@pytest.fixture(autouse=True)
def warn_on_real_toml():
    """Warns if a test reads a real TOML. Does not fail by default;
    set ENFORCE_NO_REAL_TOML=1 to convert warnings to failures."""
    ...
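One possible way to detect reads for the warn-only variant is a Python audit hook, since the interpreter raises an "open" audit event for open(), io.open() and os.open(). The sketch below is an assumption about how this could be built, not the planned implementation. The names is_real_toml, warn_or_fail and install_open_hook are hypothetical, and audit hooks cannot be removed once installed, so this would be set up once per session rather than per test.

```python
import os
import sys
import warnings
from pathlib import Path
from typing import Callable

GUARDED_NAMES = frozenset({
    "manual_slop.toml", "config.toml", "credentials.toml", "presets.toml",
    "personas.toml", "tool_presets.toml", "workspace_profiles.toml",
})


def is_real_toml(candidate: object, project_root: Path) -> bool:
    """True when candidate names a guarded TOML directly inside project_root."""
    try:
        resolved = Path(os.fsdecode(candidate)).resolve()
    except (TypeError, ValueError, OSError, RuntimeError):
        return False  # file descriptors and other non-path arguments
    return resolved.parent == project_root and resolved.name in GUARDED_NAMES


def warn_or_fail(path: Path) -> None:
    """Warn by default; fail when ENFORCE_NO_REAL_TOML=1 is set."""
    message = f"test opened real TOML: {path}"
    if os.environ.get("ENFORCE_NO_REAL_TOML") == "1":
        raise AssertionError(message)
    warnings.warn(message, stacklevel=2)


def install_open_hook(
    project_root: Path,
    on_hit: Callable[[Path], None] = warn_or_fail,
) -> None:
    """Call on_hit for every open of a guarded TOML under project_root."""
    root = project_root.resolve()

    def hook(event: str, args: tuple) -> None:
        # The first argument of the "open" event is the path being opened.
        if event == "open" and args and is_real_toml(args[0], root):
            on_hit(Path(os.fsdecode(args[0])))

    sys.addaudithook(hook)
```

Unlike the strict fixture, this approach never deletes or rewrites the user's files. The trade-off is that the hook runs on every audit event in the process, so the check should stay as cheap as possible.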

4b. CI script scripts/check_test_toml_paths.py — runs on every commit:

# Greps for direct ./<name>.toml references
# Exits non-zero if any found
# Output: "test_foo.py:42: Path('presets.toml') — direct reference to real TOML"

Add to conductor/... workflow or as a pre-commit hook (out of scope for this track — just provide the script).
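The reporting half of the script could look roughly like the sketch below, which loosely follows the output line described in the comment above and returns the process exit code. The names format_violation and report are hypothetical, and the violation tuples are assumed to carry the matched source text as their third element.

```python
import sys
from pathlib import Path
from typing import Iterable, TextIO, Tuple

Violation = Tuple[Path, int, str]


def format_violation(violation: Violation) -> str:
    """Render one violation as 'file:line: snippet - reason'."""
    file, line_no, snippet = violation
    return f"{file.name}:{line_no}: {snippet} - direct reference to real TOML"


def report(violations: Iterable[Violation], stream: TextIO = sys.stdout) -> int:
    """Print every violation and return the exit code (0 when clean, 1 otherwise)."""
    count = 0
    for violation in violations:
        print(format_violation(violation), file=stream)
        count += 1
    return 1 if count else 0
```

A script entry point would then end with something like sys.exit(report(find_violations(Path("tests")))), so that CI or a pre-commit hook fails on the first new violation.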

Phase 5: Test the Enforcer

tests/test_enforce_no_real_toml.py — meta-test:

import pytest
from pathlib import Path

def test_enforcer_catches_violation():
    """Verify the fixture prevents reading a real TOML."""
    # The fixture runs at setup, before this body, and only removes the listed
    # names, so the check must target a guarded name rather than a new file.
    # Building the path from Path.cwd() keeps this file clean under
    # scripts/check_test_toml_paths.py.
    guarded = Path.cwd() / "presets.toml"
    with pytest.raises(FileNotFoundError):
        guarded.read_text()

def test_enforcer_restores_real_tomls():
    """Verify the fixture leaves files written during the test intact."""
    # NOTE: restoration of real TOMLs runs at fixture teardown, after this body
    # returns, so it cannot be asserted here. A check that runs after teardown
    # is still needed to cover the restore step itself.
    real_path = Path.cwd() / "test_enforcer_temp2.toml"
    original = b"[test]\nkey='original'"
    real_path.write_bytes(original)
    try:
        assert real_path.exists()
        assert real_path.read_bytes() == original
    finally:
        real_path.unlink()
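Because the restore step happens at teardown, one way to make it directly testable is to move the snapshot/restore logic into a plain context manager that the fixture wraps. The sketch below assumes that refactor. The name hidden_files is hypothetical and nothing like it exists in conftest.py yet.

```python
import contextlib
import tempfile
from pathlib import Path
from typing import Dict, Iterable, Iterator


@contextlib.contextmanager
def hidden_files(paths: Iterable[Path]) -> Iterator[None]:
    """Temporarily remove each existing file in paths, restoring it on exit."""
    snapshots: Dict[Path, bytes] = {}
    try:
        for p in paths:
            if p.is_file():
                snapshots[p] = p.read_bytes()
                p.unlink()
        yield
    finally:
        # Runs even when the body raises, so the originals always come back.
        for p, content in snapshots.items():
            p.write_bytes(content)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "presets.toml"
        target.write_bytes(b"[test]\nkey='original'")
        with hidden_files([target]):
            print(target.exists())  # hidden while the body runs
        print(target.read_bytes())  # restored afterwards
```

The autouse fixture would then reduce to entering hidden_files(real_toml_paths) around its yield, and the meta-tests could exercise both the hide and the restore behavior against files in tmp_path without touching the project root.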

File Structure

  • scripts/check_test_toml_paths.py — NEW: greps for violations, exits non-zero
  • tests/conftest.py — MODIFY: add enforce_no_real_toml autouse fixture (strict or warn-only)
  • tests/test_enforce_no_real_toml.py — NEW: tests for the enforcer
  • Various tests/test_*.py — MODIFY: migrate offenders to sandboxed pattern
  • Various tests/test_*.py — MODIFY: consolidate where it improves clarity

Acceptance Criteria

  • All existing tests pass after migration
  • scripts/check_test_toml_paths.py exits 0 on the test suite after migration
  • The autouse fixture catches new violations in CI
  • Test count is approximately the same after consolidation (slight decrease acceptable)
  • No real TOML files in the user's project are touched by the test suite

Risks

  1. Test breakage: Migration may break tests that depend on real-file behavior. Mitigation: run full test suite after each migration batch.
  2. Performance: The autouse fixture adds overhead to every test. Mitigation: keep it cheap (an existence check per listed file, plus a byte snapshot and restore only for files that actually exist).
  3. Coverage regression: Removing real-file behavior may hide bugs. Mitigation: add explicit tests for the sandboxed path resolution.