Merge branch 'master' of C:\projects\manual_slop into tier2/send_result_to_send_20260616
@@ -75,19 +75,30 @@ Written to `C:\Users\Ed\AppData\Local\manual_slop\tier2_failures\<track>_<timest
3. Current task (where it stopped)
4. Last 3 failures
5. Failcount state
6. Git state (`git log tier2/<track> ^origin/main`)
6. Git state (`git log tier2/<track> ^origin/master`)
7. Recommendation (heuristic-based)

A `.STOPPED` flag file is created alongside the report. The main repo
can check for it at the next Tier 1 session start (an opt-in banner).
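
The flag check described above could be sketched as follows. This is only an illustration, not the repo's actual implementation: the function names `find_stopped_flags` and `banner_lines` are hypothetical, and the sketch assumes that flag files can be recognized by a name ending in `.STOPPED` inside the failures directory.

```python
from pathlib import Path


def find_stopped_flags(failures_dir: Path) -> list[Path]:
    """Return the flag files found in failures_dir, sorted by path."""
    if not failures_dir.is_dir():
        return []
    # Assumption: a flag file is any regular file whose name ends in ".STOPPED".
    return sorted(
        p for p in failures_dir.iterdir()
        if p.is_file() and p.name.endswith(".STOPPED")
    )


def banner_lines(flags: list[Path]) -> list[str]:
    """Build one banner line per flag file (hypothetical wording)."""
    return [f"Tier 2 stopped: see the report next to {flag.name}" for flag in flags]
```

A Tier 1 startup hook could call `find_stopped_flags` on the failures directory and print the banner lines only when the user has opted in.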

## Conventions (added 2026-06-17)

These are enforced by the Tier 2 agent prompt. The agent MUST follow them; they are not optional.

- **Test runner:** Tier 2 always uses `uv run python scripts/run_tests_batched.py`. Never `uv run pytest` directly. The batched runner provides tier-based filtering, parallelization (xdist), and a summary table that direct pytest doesn't.
- **Default branch:** this repo uses `master` (not `main`). When fetching or branching, use `origin/master`. Tier 2 may otherwise get confused by the missing `main` reference.
- **Line endings:** Tier 2 preserves existing line endings on edit. This repo has a mix of CRLF and LF; standardizing to repo-wide LF is a future track. For now, do not normalize.
- **Throw-away scripts:** Tier 2 writes its working scripts to `scripts/tier2/artifacts/<track-name>/`, NOT the base `scripts/tier2/` directory. The base directory is reserved for production code. Throw-away scripts are kept for archival but isolated in a track-specific subdir.
- **End-of-track report:** at the end of every track, Tier 2 writes `docs/reports/TRACK_COMPLETION_<track-name>.md` (follow the precedent set by `TRACK_COMPLETION_tier2_autonomous_sandbox_20260616.md`) and updates `conductor/tracks/<track-name>/state.toml` to `status = "completed"`. The user reads this report to decide whether to merge.
- **Run-time expectation:** tracks are expected to take 1-4 hours. If the model reports it is running out of context, Tier 2 writes its progress to disk and continues. The user expects autonomous runs to complete without manual "press continue" intervention.
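
As an illustration of the line-endings convention, here is a minimal sketch of how an edit could keep a file's existing line-ending style. The helper names `detect_newline` and `with_original_newlines` are hypothetical and not part of the repo. The sketch also assumes each file uses one dominant style; a file with mixed endings would be rewritten to its dominant style, so such files would need per-line handling instead.

```python
def detect_newline(text: str) -> str:
    """Return "\r\n" if CRLF endings outnumber bare LF endings, else "\n"."""
    crlf = text.count("\r\n")
    lf = text.count("\n") - crlf  # bare LF endings only
    return "\r\n" if crlf > lf else "\n"


def with_original_newlines(original: str, edited: str) -> str:
    """Rewrite the line endings of `edited` to match the style of `original`."""
    newline = detect_newline(original)
    # Collapse to LF first so existing CRLF pairs are not doubled.
    normalized = edited.replace("\r\n", "\n")
    return normalized.replace("\n", newline)
```

For this to work, the file has to be read and written with `open(path, newline="")`, which stops Python from translating line endings on its own.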

## Verify the sandbox (manual checklist)

After bootstrap, run these inside the Tier 2 sandboxed OpenCode session
to verify the bans are enforced:

- [ ] Try `git restore tests/test_failcount.py`: should print "denied"
- [ ] Try `git push origin main`: should print "denied" (or the pre-push hook fires)
- [ ] Try `git push origin master`: should print "denied" (or the pre-push hook fires)
- [ ] Try `git checkout -- src/foo.py`: should print "denied"
- [ ] Try `git reset --hard HEAD~1`: should print "denied"
- [ ] Try to read `C:\Users\Ed\Documents\test.txt` (from a Python subprocess): should print "ACCESS_DENIED"
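
For the last checklist item, a small probe along these lines could be run from a Python subprocess. It is a sketch under assumptions: `probe_read` is a hypothetical helper, and it assumes the sandbox surfaces a blocked read as `PermissionError`. If the sandbox reports the denial some other way, the probe would need adjusting.

```python
def probe_read(path: str) -> str:
    """Try to read one byte from path and report the outcome as a short label."""
    try:
        with open(path, "rb") as handle:
            handle.read(1)
    except PermissionError:
        # Assumption: the sandbox denial shows up as PermissionError.
        return "ACCESS_DENIED"
    except FileNotFoundError:
        return "NOT_FOUND"
    return "READABLE"
```

Printing the result of `probe_read(r"C:\Users\Ed\Documents\test.txt")` inside the sandbox should then show `ACCESS_DENIED` if the read ban is in effect and the file exists.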
@@ -112,3 +123,8 @@ And verify allowed operations work:
`git config core.hooksPath` if you have a custom hooks dir.
- **"Tier 2 keeps giving up at 30 min"**: increase
`no_progress_minutes` in `scripts/tier2/failcount.toml`.
- **"Tier 2 ran out of context"**: the model stopped mid-track. The
user (interactive Tier 1) should `cd` to the Tier 2 clone, inspect
`<app-data>/tier2/<track>/state.json` for the last completed task,
and re-invoke with `/tier-2-auto-execute <track-name> --resume`
to continue. The state file persists across runs.
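
When inspecting the state file after an out-of-context stop, a short helper like the one below could save opening the JSON by hand. The schema of `state.json` is not documented here, so the key name `last_completed_task` is purely an assumption and should be replaced with whatever the file actually contains.

```python
import json
from pathlib import Path


def last_completed_task(state_path: Path, key: str = "last_completed_task"):
    """Return the value stored under `key` in the state file, or None."""
    state = json.loads(state_path.read_text(encoding="utf-8"))
    # Assumption: the state file is a JSON object with the task under `key`.
    if not isinstance(state, dict):
        return None
    return state.get(key)
```

If the helper returns `None`, the key name is probably wrong for the real file, and reading the JSON directly is the safer route before resuming.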