manual_slop/docs/superpowers/specs/2026-06-02-test-consolidation-design.md

# Test Consolidation & TOML Sandboxing Enforcement

**Date:** 2026-06-02
**Status:** Draft (pending review)

---

## Context & Motivation

The Manual Slop test suite has grown to ~258 test files. Many tests read or write project TOML files (`manual_slop.toml`, `config.toml`, `credentials.toml`, `presets.toml`, etc.) for fixtures. The pattern is inconsistent:

- Some tests use `tmp_path` + `monkeypatch` (good — isolated)
- Some tests use real `./` paths (bad — pollutes user config)
- Some tests use mock paths at module level (good — fast)

The user wants to:
1. Audit tests for real-TOML usage
2. Migrate offenders to sandboxed variants
3. Consolidate similar tests where it improves clarity
4. Enforce the rule going forward

The `isolate_workspace` autouse fixture in `tests/conftest.py` (added in the May 2026 docs refresh work) is the foundation for the migration pattern.

---

## Scope

### In Scope

- Audit all `tests/*.py` for direct path references to `./` TOML files
- Migrate offenders to use `tmp_path` + `monkeypatch` (or `isolate_workspace`)
- Consolidate similar tests where it improves clarity (judgment call)
- Add a `tests/conftest.py` autouse fixture that prevents regression
- Add a `scripts/check_test_toml_paths.py` script for CI/pre-commit
- Add tests for the enforcement mechanism itself

### Out of Scope

- Rewriting tests for clarity (only consolidation where it improves maintainability)
- Adding new tests (other than the enforcement tests listed under In Scope)
- Changing the test runner (pytest stays)
- Coverage tooling changes

---

## Design

### Phase 1: Audit

A script that greps `tests/*.py` for problematic patterns:

```python
# scripts/check_test_toml_paths.py
import re
from pathlib import Path
from typing import List, Tuple

PROBLEMATIC_PATTERNS = [
    # Path("presets.toml") or Path('presets.toml')
    r'Path\(["\'](?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']\)',
    # open("presets.toml", ...) with either quote style
    r'open\(["\'](?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']',
    # "./presets.toml" or "../presets.toml" as a string literal
    r'["\']\.{1,2}/(?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']',
]

def find_violations(tests_dir: Path) -> List[Tuple[Path, int, str]]:
    """Returns list of (file, line, pattern) for each violation."""
    ...
```

Run this script as the first step. Output a report grouped by file.
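
As a rough illustration of the audit step, the scan and the grouped report could look like the sketch below. It is not the final script: `TOML_NAMES`, `PATTERNS`, `Violation`, and `format_report` are hypothetical names, and the third tuple element here is the matched text rather than the pattern, which is an assumption about what is most useful in the report.

```python
import re
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Tuple

# Assumed list of protected file names, taken from the patterns above.
TOML_NAMES = r'(?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml'

PATTERNS = [
    re.compile(r'Path\(["\']' + TOML_NAMES + r'["\']\)'),
    re.compile(r'open\(["\']' + TOML_NAMES + r'["\']'),
    re.compile(r'["\']\.{1,2}/' + TOML_NAMES + r'["\']'),
]

Violation = Tuple[Path, int, str]  # (file, 1-based line number, matched text)


def find_violations(tests_dir: Path) -> List[Violation]:
    """Scan tests_dir/*.py line by line and collect suspicious TOML references."""
    violations: List[Violation] = []
    for test_file in sorted(tests_dir.glob("*.py")):
        lines = test_file.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for pattern in PATTERNS:
                match = pattern.search(line)
                if match:
                    violations.append((test_file, lineno, match.group(0)))
                    break  # report each line at most once
    return violations


def format_report(violations: List[Violation]) -> str:
    """Group violations by file, with one indented entry per offending line."""
    by_file: Dict[Path, List[Violation]] = defaultdict(list)
    for violation in violations:
        by_file[violation[0]].append(violation)
    chunks: List[str] = []
    for test_file in sorted(by_file):
        chunks.append(f"{test_file} ({len(by_file[test_file])} violation(s))")
        chunks.extend(f"  line {lineno}: {text}" for _, lineno, text in by_file[test_file])
    return "\n".join(chunks)
```

Because the scan is purely textual, it also flags matches inside comments and docstrings. That seems acceptable for an audit, where a false positive costs one manual look.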

### Phase 2: Migrate Offenders

For each violation, refactor the test to use the sandboxed pattern:

**Before (real TOML):**
```python
def test_load_presets():
    path = Path("presets.toml")  # Real file!
    if path.exists():
        data = tomllib.loads(path.read_text())
    assert data is not None
```

**After (sandboxed):**
```python
import tomllib

def test_load_presets(tmp_path, monkeypatch):
    path = tmp_path / "presets.toml"
    path.write_text("[presets.test]\nkey = 'value'\n")
    # Redirect path resolution to tmp_path. The monkeypatch fixture undoes
    # the patch automatically, even if the test fails.
    monkeypatch.setattr("src.paths.get_global_presets_path", lambda: path)
    data = tomllib.loads(path.read_text())
    assert data["presets"]["test"]["key"] == "value"
```

Alternatively, rely on the `isolate_workspace` autouse fixture (already in `conftest.py`), which redirects all path resolution to `tmp_path`.

### Phase 3: Consolidate (Judgment Call)

Examples of consolidation opportunities (NOT a forced refactor):

| Current | Proposed | Rationale |
|---|---|---|
| `test_ai_settings_layout.py` + `test_sim_ai_settings.py` | `test_ai_settings.py` with parametrize | Tests cover same surface |
| `test_*_provider.py` (5+ files) | `test_providers.py` parametrized | Each provider test has same shape |
| `test_*_preset*.py` (3 files) | `test_presets.py` with class organization | Settings/presets/tools all CRUD TOML |
| `test_*_screenshot*.py` | `test_screenshots.py` | Currently fragmented |

Each consolidation is reviewed case-by-case. **Test count is not a goal; test clarity is.** Don't merge tests that test different things just to reduce file count.

### Phase 4: Enforce

**4a. Autouse fixture** in `tests/conftest.py`:

```python
from pathlib import Path

import pytest

@pytest.fixture(autouse=True)
def enforce_no_real_toml():
    """Hides any real ./<name>.toml for the duration of each test, so a test
    that reads one gets FileNotFoundError instead of the user's data."""

    # Resolve up front so restoration still targets the right files
    # if a test changes the working directory.
    real_toml_paths = [
        Path("manual_slop.toml").resolve(),
        Path("config.toml").resolve(),
        Path("credentials.toml").resolve(),
        Path("presets.toml").resolve(),
        Path("personas.toml").resolve(),
        Path("tool_presets.toml").resolve(),
        Path("workspace_profiles.toml").resolve(),
    ]

    # If any real TOML exists in the cwd, save it for restoration
    snapshots = {}
    try:
        for p in real_toml_paths:
            if p.exists():
                snapshots[p] = p.read_bytes()
                p.unlink()  # Remove to prevent test from reading
        yield  # Run the test (exactly one yield, outside the loop)
    finally:
        # Restore after the test, even if setup or the test itself failed
        for p, content in snapshots.items():
            p.write_bytes(content)
```

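This is **strict**: any test that tries to read a real TOML from the working directory gets `FileNotFoundError`. Tests must use `tmp_path` or `monkeypatch`. Because the fixture deletes and rewrites the real files, a test run that is killed before teardown can leave them missing.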
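
One way to make the hide-and-restore behaviour testable outside pytest is to factor it into a plain context manager. The sketch below is only an illustration. `hidden_files` is a hypothetical helper name, and using `try`/`finally` for restoration is an assumption about how the fixture should behave when a test fails.

```python
import contextlib
from pathlib import Path
from typing import Dict, Iterable, Iterator


@contextlib.contextmanager
def hidden_files(paths: Iterable[Path]) -> Iterator[Dict[Path, bytes]]:
    """Temporarily remove the given files, restoring their bytes on exit."""
    snapshots: Dict[Path, bytes] = {}
    try:
        for path in paths:
            absolute = path.resolve()  # pin the location in case cwd changes later
            if absolute.is_file():
                snapshots[absolute] = absolute.read_bytes()
                absolute.unlink()
        yield snapshots
    finally:
        # Runs even if the body raises, so the originals always come back.
        for absolute, content in snapshots.items():
            absolute.write_bytes(content)
```

With a helper like this, the fixture body would reduce to `with hidden_files(real_toml_paths): yield`, and the helper itself can be exercised against files in a temporary directory.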

If this is too aggressive, a softer alternative:

```python
@pytest.fixture(autouse=True)
def warn_on_real_toml():
    """Warns if a test reads a real TOML. Does not fail by default;
    set ENFORCE_NO_REAL_TOML=1 to convert warnings to failures."""
    ...
```
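
The warn-only variant needs some way to notice that a real TOML was opened. One possible approach, sketched here under assumptions, is to wrap `open()`. `make_guarded_open` is a hypothetical helper, and raising `AssertionError` in enforcing mode is an illustrative choice rather than a decided behaviour.

```python
import builtins
import os
import warnings
from pathlib import Path
from typing import Callable, Iterable


def make_guarded_open(protected: Iterable[Path], real_open: Callable = builtins.open) -> Callable:
    """Wrap open() so that touching a protected path warns, or raises when
    ENFORCE_NO_REAL_TOML=1 is set in the environment."""
    protected_set = {Path(p).resolve() for p in protected}

    def guarded_open(file, *args, **kwargs):
        # open() also accepts file descriptors (ints); only inspect path-like arguments.
        if isinstance(file, (str, bytes, os.PathLike)):
            target = Path(os.fsdecode(file)).resolve()
            if target in protected_set:
                message = f"test opened real TOML: {target}"
                if os.environ.get("ENFORCE_NO_REAL_TOML") == "1":
                    raise AssertionError(message)
                warnings.warn(message, stacklevel=2)
        return real_open(file, *args, **kwargs)

    return guarded_open
```

Inside the fixture this could be installed with `monkeypatch.setattr(builtins, "open", make_guarded_open(real_toml_paths))`. One caveat: `pathlib` helpers such as `Path.read_text()` call `io.open`, so patching only `builtins.open` would likely miss them, and a real implementation would probably need to patch both.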

**4b. CI script** `scripts/check_test_toml_paths.py` — runs on every commit:

```python
# Greps for direct ./<name>.toml references
# Exits non-zero if any found
# Output: "test_foo.py:42: Path('presets.toml') — direct reference to real TOML"
```

Add to `conductor/...` workflow or as a pre-commit hook (out of scope for this track — just provide the script).
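
For the exit-code contract, a minimal sketch of the reporting half of the script might look as follows. `render` and `main` are hypothetical names, the violation tuples are assumed to have the `(file, line, text)` shape from Phase 1, and the separator before the explanation is adapted from the sample output above.

```python
import sys
from pathlib import Path
from typing import Iterable, List, Tuple

Violation = Tuple[Path, int, str]  # (file, 1-based line number, matched text)


def render(violations: Iterable[Violation]) -> List[str]:
    """Format each violation as 'file:line: snippet: explanation'."""
    return [
        f"{path.name}:{lineno}: {snippet}: direct reference to real TOML"
        for path, lineno, snippet in violations
    ]


def main(violations: List[Violation]) -> int:
    """Print every violation to stderr and return the process exit status."""
    for line in render(violations):
        print(line, file=sys.stderr)
    return 1 if violations else 0
```

The script's entry point would then pass the scan result to `main` and hand the return value to `sys.exit`, so a CI job or hook fails whenever at least one violation is printed.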

### Phase 5: Test the Enforcer

`tests/test_enforce_no_real_toml.py` — meta-test:

```python
import os
from pathlib import Path

import pytest

ORIGINAL = b"[test]\nkey = 'original'\n"


@pytest.fixture(scope="module")
def planted_toml(tmp_path_factory):
    """Plant a protected-name TOML in a scratch cwd before the enforcer runs.

    Module scope matters: higher-scoped fixtures are set up before the
    function-scoped autouse enforcer, so the file exists when it snapshots.
    Assumes the enforcer resolves its paths against the cwd set here.
    """
    root = tmp_path_factory.mktemp("enforcer_cwd")
    previous_cwd = Path.cwd()
    os.chdir(root)
    path = root / "presets.toml"
    path.write_bytes(ORIGINAL)
    try:
        yield path
        # The enforcer has torn down for every test in this module by now,
        # so the file must be back with its original content.
        assert path.read_bytes() == ORIGINAL
    finally:
        os.chdir(previous_cwd)


def test_enforcer_hides_real_toml(planted_toml):
    """Verify the fixture prevents reading a real TOML during a test."""
    with pytest.raises(FileNotFoundError):
        planted_toml.read_text()
```

---

## File Structure

- `scripts/check_test_toml_paths.py` — NEW: greps for violations, exits non-zero
- `tests/conftest.py` — MODIFY: add `enforce_no_real_toml` autouse fixture (strict or warn-only)
- `tests/test_enforce_no_real_toml.py` — NEW: tests for the enforcer
- Various `tests/test_*.py` — MODIFY: migrate offenders to sandboxed pattern
- Various `tests/test_*.py` — MODIFY: consolidate where it improves clarity

---

## Acceptance Criteria

- All existing tests pass after migration
- `scripts/check_test_toml_paths.py` exits 0 on the test suite after migration
- The autouse fixture catches new violations in CI
- Test count is approximately the same after consolidation (slight decrease acceptable)
- No real TOML files in the user's project are touched by the test suite

---

## Risks

1. **Test breakage:** Migration may break tests that depend on real-file behavior. Mitigation: run full test suite after each migration batch.
2. **Performance:** The autouse fixture adds overhead to every test. Mitigation: keep it cheap by snapshotting and restoring only the protected files that actually exist in the working directory.
3. **Coverage regression:** Removing real-file behavior may hide bugs. Mitigation: add explicit tests for the sandboxed path resolution.