# Test Consolidation & TOML Sandboxing Enforcement

**Date:** 2026-06-02
**Status:** Draft (pending review)

---
## Context & Motivation

The Manual Slop test suite has grown to ~258 test files. Many tests read or write project TOML files (`manual_slop.toml`, `config.toml`, `credentials.toml`, `presets.toml`, etc.) as fixtures, and the pattern is inconsistent:

- Some tests use `tmp_path` + `monkeypatch` (good: isolated)
- Some tests use real `./` paths (bad: pollutes user config)
- Some tests use mock paths at module level (good: fast)

The user wants to:

1. Audit tests for real-TOML usage
2. Migrate offenders to sandboxed variants
3. Consolidate similar tests where it improves clarity
4. Enforce the rule going forward

The `isolate_workspace` autouse fixture in `tests/conftest.py` (added in the May 2026 docs refresh work) is the foundation for the migration pattern.

---
## Scope

### In Scope

- Audit all `tests/*.py` for direct path references to `./` TOML files
- Migrate offenders to use `tmp_path` + `monkeypatch` (or `isolate_workspace`)
- Consolidate similar tests where it improves clarity (judgment call)
- Add a `tests/conftest.py` autouse fixture that prevents regression
- Add a `scripts/check_test_toml_paths.py` script for CI/pre-commit
- Add tests for the enforcement mechanism itself

### Out of Scope

- Rewriting tests for clarity (only consolidation where it improves maintainability)
- Adding new tests (other than the enforcer tests listed under In Scope)
- Changing the test runner (pytest stays)
- Coverage tooling changes

---
## Design

### Phase 1: Audit

A script scans `tests/*.py` for problematic patterns:

```python
# scripts/check_test_toml_paths.py
import re
from pathlib import Path

# Each pattern accepts single or double quotes around the file name.
PROBLEMATIC_PATTERNS = [
    r'Path\(["\'](?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']\)',
    r'open\(["\'](?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']',
    r'["\']\.{1,2}/(?:manual_slop|config|credentials|presets|personas|tool_presets|workspace_profiles)\.toml["\']',
]


def find_violations(tests_dir: Path) -> list[tuple[Path, int, str]]:
    """Return (file, line number, pattern) for each violation."""
    violations = []
    for path in sorted(tests_dir.glob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for pattern in PROBLEMATIC_PATTERNS:
                if re.search(pattern, line):
                    violations.append((path, lineno, pattern))
    return violations
```

Run this script as the first step and output a report grouped by file.

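
The grouped report could be produced along these lines. This is a minimal sketch, not part of the proposed script: `group_by_file` and `format_report` are hypothetical helper names, and the input is assumed to be the `(file, line, pattern)` tuples that `find_violations` is documented to return.

```python
from collections import defaultdict
from pathlib import Path


def group_by_file(violations):
    """Group (file, line, pattern) tuples into {file: [(line, pattern), ...]}."""
    grouped = defaultdict(list)
    for path, lineno, pattern in violations:
        grouped[path].append((lineno, pattern))
    # Sort files and hits so the report is stable and reads top to bottom.
    return {path: sorted(hits) for path, hits in sorted(grouped.items())}


def format_report(violations):
    """Render one header line per file followed by its indented hits."""
    lines = []
    for path, hits in group_by_file(violations).items():
        lines.append(f"{path} ({len(hits)} violation(s))")
        lines.extend(f"  line {lineno}: {pattern}" for lineno, pattern in hits)
    return "\n".join(lines)
```

Grouping by file keeps each migration batch in Phase 2 scoped to one file at a time.
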
### Phase 2: Migrate Offenders

For each violation, refactor the test to use the sandboxed pattern:

**Before (real TOML):**
```python
import tomllib
from pathlib import Path


def test_load_presets():
    path = Path("presets.toml")  # Real file!
    if path.exists():
        data = tomllib.loads(path.read_text())
        assert data is not None
```

**After (sandboxed):**
```python
import tomllib


def test_load_presets(tmp_path, monkeypatch):
    path = tmp_path / "presets.toml"
    path.write_text("[presets.test]\nkey = 'value'\n")
    # Point the path lookup at tmp_path; the monkeypatch fixture
    # undoes the patch automatically at teardown.
    monkeypatch.setattr("src.paths.get_global_presets_path", lambda: path)
    data = tomllib.loads(path.read_text())
    assert data["presets"]["test"]["key"] == "value"
```

Alternatively, rely on the `isolate_workspace` autouse fixture (already in `conftest.py`), which redirects all path resolution to `tmp_path`.

### Phase 3: Consolidate (Judgment Call)

Examples of consolidation opportunities (NOT a forced refactor):

| Current | Proposed | Rationale |
|---|---|---|
| `test_ai_settings_layout.py` + `test_sim_ai_settings.py` | `test_ai_settings.py` with parametrize | Tests cover same surface |
| `test_*_provider.py` (5+ files) | `test_providers.py` parametrized | Each provider test has same shape |
| `test_*_preset*.py` (3 files) | `test_presets.py` with class organization | Settings/presets/tools all CRUD TOML |
| `test_*_screenshot*.py` | `test_screenshots.py` | Currently fragmented |

Each consolidation is reviewed case-by-case. **Test count is not a goal; test clarity is.** Don't merge tests that test different things just to reduce file count.

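
As an illustration of the "parametrized" rows in the table, the sketch below shows the general shape such a merged file might take. Everything in it is a stand-in: the provider names, the `PROVIDERS` dictionary, and the test names are hypothetical, since the actual provider tests are not shown in this document.

```python
import pytest

# Hypothetical stand-ins; the real suite would import its provider objects.
PROVIDERS = {
    "alpha": {"model": "alpha-small", "supports_streaming": True},
    "beta": {"model": "beta-large", "supports_streaming": False},
}


@pytest.mark.parametrize("name", sorted(PROVIDERS))
def test_provider_config_has_model(name):
    """One parametrized test stands in for one near-identical file per provider."""
    config = PROVIDERS[name]
    assert isinstance(config["model"], str) and config["model"]


@pytest.mark.parametrize(
    ("name", "expected"),
    [("alpha", True), ("beta", False)],
)
def test_provider_streaming_flag(name, expected):
    """Provider-specific expectations stay explicit in the parameter list."""
    assert PROVIDERS[name]["supports_streaming"] is expected
```

pytest still reports each parameter set as its own test case, so a failure continues to name the provider involved.
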
### Phase 4: Enforce

**4a. Autouse fixture** in `tests/conftest.py`:

```python
from pathlib import Path

import pytest


@pytest.fixture(autouse=True)
def enforce_no_real_toml():
    """Hide every ./<name>.toml in the cwd for the duration of a test.

    Tests that need TOML data must create it under tmp_path or
    monkeypatch the path lookup.
    """
    real_toml_paths = [
        Path("manual_slop.toml"),
        Path("config.toml"),
        Path("credentials.toml"),
        Path("presets.toml"),
        Path("personas.toml"),
        Path("tool_presets.toml"),
        Path("workspace_profiles.toml"),
    ]

    # If any real TOML exists in the cwd, save it for restoration.
    # Keys are absolute so the restore still works if the test changes cwd.
    snapshots = {}
    for p in real_toml_paths:
        if p.exists():
            snapshots[p.resolve()] = p.read_bytes()
            p.unlink()  # Remove to prevent the test from reading it

    yield  # Run the test

    # Restore after the test
    for p, content in snapshots.items():
        p.write_bytes(content)
```

This is **strict**: any test that tries to read a real TOML gets `FileNotFoundError`, so tests must use `tmp_path` or `monkeypatch`. Because the fixture deletes the real files and rewrites them from an in-memory snapshot, a hard kill mid-test would lose them.

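
The hide-and-restore step can also be written so that the original bytes never live only in memory. The sketch below is one hedged variant, not the fixture proposed above: it renames each file to a sibling with a `.hidden` suffix and renames it back afterwards. `hide_files` and the `.hidden` suffix are hypothetical names chosen for illustration.

```python
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def hide_files(paths, suffix=".hidden"):
    """Temporarily rename existing files so that reads raise FileNotFoundError."""
    moved = []
    try:
        for path in paths:
            path = Path(path).resolve()
            if path.exists():
                backup = path.with_name(path.name + suffix)
                path.rename(backup)  # the data stays on disk the whole time
                moved.append((path, backup))
        yield [original for original, _ in moved]
    finally:
        # Put every file back, even if the body raised.
        for original, backup in reversed(moved):
            backup.replace(original)
```

Inside a fixture, the `yield` that runs the test would sit in the body of `with hide_files(...)`.
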
If this is too aggressive, a softer alternative:

```python
@pytest.fixture(autouse=True)
def warn_on_real_toml():
    """Warns if a test reads a real TOML. Does not fail by default;
    set ENFORCE_NO_REAL_TOML=1 to convert warnings to failures."""
    ...
```

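
The body of `warn_on_real_toml` is left open above. As a rough sketch of the decision it has to make, the helper below classifies a path and either warns or raises depending on `ENFORCE_NO_REAL_TOML`. `PROTECTED_NAMES`, `is_real_toml`, `check_toml_access`, and the `project_root` parameter are all hypothetical, and how the fixture would intercept file reads in the first place is deliberately not shown.

```python
import os
import warnings
from pathlib import Path

PROTECTED_NAMES = {
    "manual_slop.toml", "config.toml", "credentials.toml", "presets.toml",
    "personas.toml", "tool_presets.toml", "workspace_profiles.toml",
}


def is_real_toml(path, project_root):
    """True if path is a protected TOML sitting directly in project_root."""
    resolved = Path(path).resolve()
    root = Path(project_root).resolve()
    return resolved.name in PROTECTED_NAMES and resolved.parent == root


def check_toml_access(path, project_root, environ=None):
    """Warn on access to a real TOML, or raise if ENFORCE_NO_REAL_TOML=1."""
    environ = os.environ if environ is None else environ
    if not is_real_toml(path, project_root):
        return
    message = f"test touched real TOML: {Path(path).name}"
    if environ.get("ENFORCE_NO_REAL_TOML") == "1":
        raise AssertionError(message)
    warnings.warn(message, UserWarning, stacklevel=2)
```

A file with the same name under `tmp_path` is not flagged, because only files directly inside `project_root` count as real.
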
**4b. CI script** `scripts/check_test_toml_paths.py`, run on every commit:

```python
# Greps for direct ./<name>.toml references
# Exits non-zero if any found
# Output: "test_foo.py:42: Path('presets.toml') - direct reference to real TOML"
```

Add to `conductor/...` workflow or as a pre-commit hook (out of scope for this track; just provide the script).

### Phase 5: Test the Enforcer

`tests/test_enforce_no_real_toml.py` (meta-test):

```python
import os
from pathlib import Path

import pytest

ORIGINAL = b"[test]\nkey = 'original'\n"


@pytest.fixture(scope="module", autouse=True)
def planted_toml(tmp_path_factory):
    """Plant a protected TOML in a scratch cwd and verify it is restored.

    Module-scoped fixtures are set up before the function-scoped enforcer
    and torn down after it, so the enforcer sees the file at setup.
    Assumes no other fixture changes the cwd in between.
    """
    scratch = tmp_path_factory.mktemp("enforcer")
    previous_cwd = Path.cwd()
    os.chdir(scratch)
    planted = scratch / "presets.toml"
    planted.write_bytes(ORIGINAL)
    yield planted
    os.chdir(previous_cwd)
    # Runs after the enforcer's teardown: the file must be back, byte for byte.
    assert planted.read_bytes() == ORIGINAL


def test_enforcer_catches_violation():
    """Verify the fixture prevents reading a real TOML."""
    # The enforcer removed ./presets.toml before this test body started.
    with pytest.raises(FileNotFoundError):
        Path("presets.toml").read_text()
```

The file must exist before the enforcer's setup runs, which is why it is planted from a module-scoped fixture rather than inside the test body. Restoration happens at the enforcer's teardown, so it is checked in the module fixture's own teardown.

---
## File Structure

- `scripts/check_test_toml_paths.py` (NEW): greps for violations, exits non-zero
- `tests/conftest.py` (MODIFY): add `enforce_no_real_toml` autouse fixture (strict or warn-only)
- `tests/test_enforce_no_real_toml.py` (NEW): tests for the enforcer
- Various `tests/test_*.py` (MODIFY): migrate offenders to sandboxed pattern
- Various `tests/test_*.py` (MODIFY): consolidate where it improves clarity

---
## Acceptance Criteria

- All existing tests pass after migration
- `scripts/check_test_toml_paths.py` exits 0 on the test suite after migration
- The autouse fixture catches new violations in CI
- Test count is approximately the same after consolidation (slight decrease acceptable)
- No real TOML files in the user's project are touched by the test suite

---
## Risks

1. **Test breakage:** Migration may break tests that depend on real-file behavior. Mitigation: run the full test suite after each migration batch.
2. **Performance:** The autouse fixture adds overhead to every test. Mitigation: keep it cheap (only snapshot and restore the TOML files that actually exist in the cwd).
3. **Coverage regression:** Removing real-file behavior may hide bugs. Mitigation: add explicit tests for the sandboxed path resolution.