Tier 2 Autonomous Sandbox

Why this exists

When you run Tier 2 in the main repo, every edit and every bash call prompts you for approval (permission: ask). For well-regularized tracks (TDD red/green with atomic per-task commits), those prompts are noise. This track adds an autonomous mode in a sibling clone where Tier 2 runs unattended, with a 3-layer enforcement stack to keep it contained.

One-time bootstrap

cd C:\projects\manual_slop
pwsh -File scripts\tier2\setup_tier2_clone.ps1 -WhatIf   # dry run first
pwsh -File scripts\tier2\setup_tier2_clone.ps1            # actual bootstrap

The bootstrap:

  1. Clones the main repo to C:\projects\manual_slop_tier2\
  2. Sets origin = C:\projects\manual_slop (a local path; no network remote is configured)
  3. Copies the agent, slash command, and opencode.json templates to the clone
  4. Installs the git hooks (pre-push refuses all pushes; post-checkout logs checkouts)
  5. Creates C:\Users\Ed\AppData\Local\manual_slop\tier2\ with restricted ACLs
  6. Creates a "Tier 2 (Sandboxed)" desktop shortcut

Per-track invocation

  1. Double-click the "Tier 2 (Sandboxed)" desktop shortcut (or run pwsh -File C:\projects\manual_slop\scripts\tier2\run_tier2_sandboxed.ps1 manually)
  2. In the OpenCode session, type:
    /tier-2-auto-execute <track-name>
    
    Examples:
    • /tier-2-auto-execute result_migration_review_pass
    • /tier-2-auto-execute data_structure_strengthening_20260606 --resume
    • /tier-2-auto-execute rag_test_failures_20260615 --toast
  3. Tier 2 runs the track autonomously, commits per task, monitors failcount
  4. On success: prints a summary
  5. On give-up: writes a failure report and prints the path

Review and merge

After Tier 2 finishes (success or give-up):

  1. cd C:\projects\manual_slop (back to main)
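  2. git fetch C:/projects/manual_slop_tier2 tier2/<track-name>:tier2/<track-name> (the explicit refspec creates a local tier2/<track-name> branch; without it the fetched commits are only reachable as FETCH_HEAD)
  3. Review the diff with Tier 1 (interactive)
  4. On approval, with main checked out: git merge --no-ff tier2/<track-name>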

The hard bans (enforced across 3 layers)

| Ban | Layer 1 (OpenCode) | Layer 2 (OS) | Layer 3 (git hook) |
|---|---|---|---|
| git push* (any push) | permission.bash deny rule | n/a | pre-push hook refuses all pushes |
| git checkout* (any form) | permission.bash deny rule | n/a | post-checkout hook logs the checkout |
| git restore* (any form) | permission.bash deny rule | n/a | n/a |
| git reset* (any form) | permission.bash deny rule | n/a | n/a |
| File access outside Tier 2 clone + app-data dir | permission.read/write path allowlist | Windows ACL | n/a |
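
To get a feel for what the Layer 1 glob patterns cover, here is a minimal Python sketch that checks a command string against the four git patterns from the table. It is only an illustration: the name is_denied and the use of fnmatch are assumptions, and OpenCode's real permission matcher may behave differently.

```python
from fnmatch import fnmatchcase

# Glob patterns copied from the ban table; the matching logic is an assumption,
# not OpenCode's actual implementation.
DENY_PATTERNS = ("git push*", "git checkout*", "git restore*", "git reset*")


def is_denied(command: str) -> bool:
    """Return True if the command matches any of the hard-ban globs."""
    # Collapse repeated whitespace so "git  push" is treated like "git push".
    normalized = " ".join(command.split())
    return any(fnmatchcase(normalized, pattern) for pattern in DENY_PATTERNS)


if __name__ == "__main__":
    for cmd in ("git push origin main", "git switch -c test-branch"):
        print(cmd, "->", "denied" if is_denied(cmd) else "allowed")
```

Note that a naive prefix match like this would not catch a compound command such as cd sub && git push, which illustrates why a second layer behind the pattern rules is useful.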

The failcount threshold

Tier 2 gives up if ANY of these hit:

  • 3 consecutive red-phase failures (the test doesn't fail when it should)
  • 3 consecutive green-phase failures (the implementation doesn't make the test pass)
  • 30 minutes with no progress (no commit, no green test)

Override via scripts/tier2/failcount.toml.
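
The three give-up conditions can be sketched as a single check. This is a hedged illustration, not the project's code: FailcountState, give_up_signal, max_red, and max_green are hypothetical names, and only no_progress_minutes is taken from this guide (see Troubleshooting).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailcountState:
    """Hypothetical snapshot of the counters Tier 2 tracks."""
    red_failures: int = 0                # consecutive red-phase failures
    green_failures: int = 0              # consecutive green-phase failures
    minutes_since_progress: float = 0.0  # time since last commit or green test


def give_up_signal(
    state: FailcountState,
    max_red: int = 3,               # assumed parameter name
    max_green: int = 3,             # assumed parameter name
    no_progress_minutes: float = 30,
) -> Optional[str]:
    """Return the first threshold that was hit, or None to keep going."""
    if state.red_failures >= max_red:
        return "red_phase"
    if state.green_failures >= max_green:
        return "green_phase"
    if state.minutes_since_progress >= no_progress_minutes:
        return "no_progress"
    return None
```

Because the counters are consecutive, a real implementation would presumably reset red_failures or green_failures to zero whenever the corresponding phase succeeds.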

The failure report

Written to C:\Users\Ed\AppData\Local\manual_slop\tier2_failures\<track>_<timestamp>.md with 7 sections:

  1. Header (track, branch, started, stopped, duration, give-up signal)
  2. Tasks completed
  3. Current task (where it stopped)
  4. Last 3 failures
  5. Failcount state
  6. Git state (git log tier2/<track> ^origin/main)
  7. Recommendation (heuristic-based)

A .STOPPED flag file is created alongside the report. The main repo can check for it on next Tier 1 session start (an opt-in banner).
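
A session-start check for the flag could look roughly like the sketch below. It assumes the flag's file name ends in .STOPPED and that it sits in the same directory as the report; the exact file name is not specified here, and find_stopped_flags is a hypothetical helper.

```python
from pathlib import Path


def find_stopped_flags(failures_dir: Path) -> list[Path]:
    """Return flag files ending in .STOPPED, sorted by name.

    Assumption: flags live directly in the failure-report directory.
    """
    if not failures_dir.is_dir():
        return []  # nothing to report if the directory was never created
    return sorted(
        entry
        for entry in failures_dir.iterdir()
        if entry.is_file() and entry.name.endswith(".STOPPED")
    )
```

A Tier 1 startup script could call this with the failure-report directory and show a banner when the returned list is non-empty.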

Verify the sandbox (manual checklist)

After bootstrap, run these inside the Tier 2 sandboxed OpenCode session to verify the bans are enforced:

  • Try git restore tests/test_failcount.py — should print "denied"
  • Try git push origin main — should print "denied" (or the pre-push hook fires)
  • Try git checkout -- src/foo.py — should print "denied"
  • Try git reset --hard HEAD~1 — should print "denied"
  • Try to read C:\Users\Ed\Documents\test.txt (from a Python subprocess) — should print "ACCESS_DENIED"
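
The file-access check in the last item can be scripted. The sketch below is one way to do it, with probe_read as a hypothetical helper name; it relies on Python raising PermissionError when the operating system denies the open call.

```python
def probe_read(path: str) -> str:
    """Try to read one byte from path and classify the outcome."""
    try:
        with open(path, "rb") as handle:
            handle.read(1)
    except PermissionError:
        # The OS refused access, which is the expected result outside the sandbox.
        return "ACCESS_DENIED"
    except FileNotFoundError:
        # The file does not exist, so the check is inconclusive.
        return "NOT_FOUND"
    return "READABLE"


if __name__ == "__main__":
    print(probe_read(r"C:\Users\Ed\Documents\test.txt"))
```

A NOT_FOUND result says nothing about the ACL, so make sure the target file exists before treating the check as passed.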

And verify allowed operations work:

  • git status — works
  • git switch -c test-branch — works
  • Edit a file in the Tier 2 clone — works
  • git add <file> && git commit -m "test" — works

Troubleshooting

  • "Tier 2 (Sandboxed) shortcut doesn't work": check that pwsh.exe is on the PATH (where.exe pwsh).
  • "Permission denied" on file access inside the sandbox: the Windows ACL may be too restrictive. Re-run the bootstrap (setup_tier2_clone.ps1 is idempotent).
  • "Failcount state not found": the <app-data>/tier2/<track>/ dir may be missing. The bootstrap creates it; check $env:LOCALAPPDATA.
  • "Pre-push hook not firing": check that .git/hooks/pre-push exists and is executable. On Windows, Git Bash runs the hook. Also check git config core.hooksPath: if it is set, Git looks for hooks in that directory instead of .git/hooks.
  • "Tier 2 keeps giving up at 30 min": increase no_progress_minutes in scripts/tier2/failcount.toml.