docs(tier2): add user guide for Tier 2 autonomous sandbox

2026-06-16 22:48:13 -04:00
parent 3e17aa6c8b
commit 8bf7cd175b
@@ -0,0 +1,114 @@
# Tier 2 Autonomous Sandbox
## Why this exists
When you run Tier 2 in the main repo, every `edit` and every `bash`
call prompts you for approval (`permission: ask`). For well-regularized
tracks (TDD red/green with atomic per-task commits), this is noise.
This track adds an **autonomous mode** in a sibling clone where Tier 2
runs unattended, with a 3-layer enforcement stack to keep it contained.
## One-time bootstrap
```powershell
cd C:\projects\manual_slop
pwsh -File scripts\tier2\setup_tier2_clone.ps1 -WhatIf # dry run first
pwsh -File scripts\tier2\setup_tier2_clone.ps1 # actual bootstrap
```
The bootstrap:
1. Clones the main repo to `C:\projects\manual_slop_tier2\`
2. Sets `origin = C:\projects\manual_slop` (a local path; no network remote is configured)
3. Copies the agent, slash command, and opencode.json templates to the clone
4. Installs the git hooks (`pre-push` refuses all pushes; `post-checkout` logs checkouts)
5. Creates `C:\Users\Ed\AppData\Local\manual_slop\tier2\` with restricted ACLs
6. Creates a "Tier 2 (Sandboxed)" desktop shortcut
## Per-track invocation
1. Double-click the "Tier 2 (Sandboxed)" desktop shortcut
(or run `pwsh -File C:\projects\manual_slop\scripts\tier2\run_tier2_sandboxed.ps1` manually)
2. In the OpenCode session, type:
   ```
   /tier-2-auto-execute <track-name>
   ```
   Examples:
   - `/tier-2-auto-execute result_migration_review_pass`
   - `/tier-2-auto-execute data_structure_strengthening_20260606 --resume`
   - `/tier-2-auto-execute rag_test_failures_20260615 --toast`
3. Tier 2 runs the track autonomously, commits once per task, and monitors the failcount
4. On success: prints a summary
5. On give-up: writes a failure report and prints the path
## Review and merge
After Tier 2 finishes (success or give-up):
1. `cd C:\projects\manual_slop` (back to main)
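2. `git fetch C:/projects/manual_slop_tier2 tier2/<track-name>:tier2/<track-name>` (the `src:dst` refspec creates a local `tier2/<track-name>` branch; a bare fetch from a path would only update `FETCH_HEAD`)
3. Review the diff with Tier 1 (interactive)
4. On approval: with main checked out, run `git merge --no-ff tier2/<track-name>`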
## The 4 hard bans and path containment (enforced at up to 3 layers)
| Ban | Layer 1 (OpenCode) | Layer 2 (OS) | Layer 3 (git hook) |
|---|---|---|---|
| `git push*` (any push) | `permission.bash` deny rule | n/a | `pre-push` hook refuses all pushes |
| `git checkout*` (any form) | `permission.bash` deny rule | n/a | `post-checkout` hook logs the checkout |
| `git restore*` (any form) | `permission.bash` deny rule | n/a | n/a |
| `git reset*` (any form) | `permission.bash` deny rule | n/a | n/a |
| File access outside Tier 2 clone + app-data dir | `permission.read`/`write` path allowlist | Windows ACL | n/a |
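
To make the Layer 1 column concrete, here is a minimal Python sketch of how prefix globs such as `git push*` separate banned commands from allowed ones. It is an illustration only: `DENY_PATTERNS` and `is_denied` are hypothetical names, and OpenCode's actual `permission.bash` matcher may behave differently.

```python
from fnmatch import fnmatchcase

# Glob patterns copied from the ban table above.
DENY_PATTERNS = ["git push*", "git checkout*", "git restore*", "git reset*"]


def is_denied(command: str) -> bool:
    """Return True if the command matches any deny pattern.

    Illustrative matcher only; not OpenCode's implementation.
    """
    normalized = " ".join(command.split())  # collapse runs of whitespace
    return any(fnmatchcase(normalized, pattern) for pattern in DENY_PATTERNS)
```

In this sketch, `git reset --hard HEAD~1` is denied while `git switch -c test-branch` is not. A plain prefix glob also misses spellings such as `git -C . push origin main`, which is one reason a second layer like the `pre-push` hook is useful.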
## The failcount threshold
Tier 2 gives up as soon as ANY of these thresholds is reached:
- 3 consecutive red-phase failures (the test doesn't fail when it should)
- 3 consecutive green-phase failures (the implementation doesn't make the test pass)
- 30 minutes with no progress (no commit, no green test)
Override via `scripts/tier2/failcount.toml`.
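
The give-up rule above can be sketched in Python as follows. This is a minimal sketch, not the shipped implementation: `FailcountState`, `record_result`, `give_up_signal`, `max_red`, and `max_green` are hypothetical names, and only `no_progress_minutes` is a key this guide mentions for `failcount.toml`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailcountState:
    red_failures: int = 0              # consecutive red-phase failures
    green_failures: int = 0            # consecutive green-phase failures
    minutes_since_progress: float = 0  # time since last commit or green test


def record_result(state: FailcountState, phase: str, ok: bool) -> None:
    """Update the streak for one phase; a success resets that phase's count."""
    if phase == "red":
        state.red_failures = 0 if ok else state.red_failures + 1
    elif phase == "green":
        state.green_failures = 0 if ok else state.green_failures + 1
    else:
        raise ValueError(f"unknown phase: {phase!r}")


def give_up_signal(
    state: FailcountState,
    max_red: int = 3,
    max_green: int = 3,
    no_progress_minutes: float = 30,
) -> Optional[str]:
    """Return the name of the first threshold reached, or None to keep going."""
    if state.red_failures >= max_red:
        return "red_phase"
    if state.green_failures >= max_green:
        return "green_phase"
    if state.minutes_since_progress >= no_progress_minutes:
        return "no_progress"
    return None
```

Because the counters track consecutive failures, two green-phase failures followed by a pass reset the streak to zero rather than leaving Tier 2 one failure away from giving up.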
## The failure report
Written to `C:\Users\Ed\AppData\Local\manual_slop\tier2_failures\<track>_<timestamp>.md` with 7 sections:
1. Header (track, branch, started, stopped, duration, give-up signal)
2. Tasks completed
3. Current task (where it stopped)
4. Last 3 failures
5. Failcount state
6. Git state (`git log tier2/<track> ^origin/main`)
7. Recommendation (heuristic-based)
A `.STOPPED` flag file is created alongside the report. The main repo
can check for it on next Tier 1 session start (an opt-in banner).
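
A session-start check for the flag could look roughly like the Python sketch below. It assumes the flag is any file in the failure-report directory whose name ends in `.STOPPED`; the exact flag file name is not specified here, and `find_stopped_flags` is a hypothetical helper.

```python
from pathlib import Path
from typing import List, Union


def find_stopped_flags(failure_dir: Union[str, Path]) -> List[Path]:
    """Return .STOPPED flag files in failure_dir, sorted by path.

    Assumes flags are regular files whose names end in '.STOPPED'.
    Returns an empty list if the directory does not exist.
    """
    directory = Path(failure_dir)
    if not directory.is_dir():
        return []
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.name.endswith(".STOPPED")
    )
```

A banner would call this with the failure-report directory and print one line per returned path; an empty list means no give-up is pending review.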
## Verify the sandbox (manual checklist)
After bootstrap, run these inside the Tier 2 sandboxed OpenCode session
to verify the bans are enforced:
- [ ] Try `git restore tests/test_failcount.py` — should print "denied"
- [ ] Try `git push origin main` — should print "denied" (or the pre-push hook fires)
- [ ] Try `git checkout -- src/foo.py` — should print "denied"
- [ ] Try `git reset --hard HEAD~1` — should print "denied"
- [ ] Try to read `C:\Users\Ed\Documents\test.txt` (from a Python subprocess) — should print "ACCESS_DENIED"
And verify allowed operations work:
- [ ] `git status` — works
- [ ] `git switch -c test-branch` — works
- [ ] Edit a file in the Tier 2 clone — works
- [ ] `git add <file> && git commit -m "test"` — works
## Troubleshooting
- **"Tier 2 (Sandboxed) shortcut doesn't work"**: check that
`pwsh.exe` is on the PATH (`where.exe pwsh`).
- **"Permission denied" on file access inside the sandbox**: the
Windows ACL may be too restrictive. Re-run the bootstrap
(`setup_tier2_clone.ps1` is idempotent).
- **"Failcount state not found"**: the `<app-data>/tier2/<track>/`
dir may be missing. The bootstrap creates it; check `$env:LOCALAPPDATA`.
- **"Pre-push hook not firing"**: check that `.git/hooks/pre-push`
is executable. On Windows, Git Bash runs the hook; check
`git config core.hooksPath` if you have a custom hooks dir.
- **"Tier 2 keeps giving up at 30 min"**: increase
`no_progress_minutes` in `scripts/tier2/failcount.toml`.