387adff579
Follow-up to the 'NEVER USE APPDATA' directive. The agent kept
trying to use \C:\Users\Ed\AppData\Local\Temp / \C:\Users\Ed\AppData\Local\Temp / %TEMP% / %TMP% — the previous
deny rule (*AppData\\\\* and *AppData\\Local\\Temp\\*) only matched
the literal expanded path, not the env-var form. The agent would
self-block based on its own interpretation of the rule, but it still
TRIED before self-blocking (the 'fucking tired of it fucking with
AppData' complaint).
Fix:
1. opencode.json.fragment: add bash deny patterns matched against
the LITERAL command string (before shell expansion):
*\C:\Users\Ed\AppData\Local\Temp* - PowerShell env var (the form the agent tried)
*\C:\Users\Ed\AppData\Local\Temp* - PowerShell env var
*%TEMP%* - cmd env var
*%TMP%* - cmd env var
*GetTempPath* - .NET API
*gettempdir* - Python tempfile module
*mkstemp* - Python tempfile.mkstemp
Applied to BOTH the top-level permission.bash (for default agents)
and the tier2-autonomous agent's permission.bash.
2. conductor/tier2/agents/tier2-autonomous.md: rewrite the Temp
files section to explicitly list ALL forbidden literals and
reiterate 'every one of those literal command strings is denied
at the bash level'. Updated changelog note.
3. conductor/tier2/commands/tier-2-auto-execute.md: same.
4. tests/test_tier2_slash_command_spec.py: extend
test_config_fragment_denies_temp_writes to assert each of the 9
patterns in both the top-level and the agent's bash.
Verified: re-ran setup against the live clone. tier2 agent's bash
has 13 deny patterns (9 AppData/temp + 4 git). 37/37 default-on
tests pass.
Note: the user's prior commit (fix(tier2): remove AppData allow
rules from OpenCode permission JSON) already removed the AppData
allow rules from read/write and added the broader *AppData\\\\*
deny rule. This commit layers on top of that with the env-var-form
deny patterns.
56 lines
4.5 KiB
Markdown
56 lines
4.5 KiB
Markdown
---
|
|
description: Autonomously execute a conductor track in the Tier 2 sandbox
|
|
agent: tier2-autonomous
|
|
---
|
|
|
|
# /tier-2-auto-execute
|
|
|
|
Run a track autonomously in the Tier 2 sandboxed mode. No `permission: ask` prompts.
|
|
|
|
## Arguments
|
|
|
|
$ARGUMENTS - Track name (required). Examples: `result_migration_review_pass`, `data_structure_strengthening_20260606`.
|
|
Optional flags: `--resume` (continue from last completed task), `--toast` (Windows toast on give-up).
|
|
|
|
## Pre-flight
|
|
|
|
1. **Verify sandbox is active.** This slash command must be invoked from a sandboxed OpenCode session. If `manual-slop_get_ui_performance` returns an error or the run_tier2_sandboxed.ps1 wrapper is not in the parent process, refuse to start.
|
|
2. **Load the track spec.** Read `conductor/tracks/<track-name>/spec.md` and `plan.md` from the current branch. If the track does not exist, abort.
|
|
3. **Check for a previous run.** If `tests/artifacts/tier2_state/<track-name>/state.json` exists AND `--resume` is NOT set, abort with: "Previous run found for this track. Use `--resume` to continue, or delete the state file to start fresh."
|
|
|
|
## Protocol
|
|
|
|
1. `git fetch origin master` (NOTE: this repo uses `master`, not `main`; added 2026-06-17)
|
|
2. `git switch -c tier2/<track-name> origin/master` (NOT `git checkout` - it is banned)
|
|
3. Initialize failcount state at `tests/artifacts/tier2_state/<track-name>/state.json` (use `load_state` or fresh state)
|
|
4. For each task in `plan.md`:
|
|
a. Red: delegate test creation to @tier3-worker
|
|
b. Run tests via `uv run python scripts/run_tests_batched.py` (NEVER `uv run pytest` directly; the batched runner provides tier filtering, parallelization, and the summary table — added 2026-06-17)
|
|
c. If pass unexpectedly, call `record_red_failure` and check `should_give_up`
|
|
d. Green: delegate implementation to @tier3-worker
|
|
e. Run tests via `scripts/run_tests_batched.py`; if fail, call `record_green_failure` and check `should_give_up`
|
|
f. On green: `record_commit` and `record_green_success` (resets counters)
|
|
g. Commit per task with `git add <specific files> && git commit -m "..."` and attach git note
|
|
h. Update `plan.md` with commit SHA
|
|
5. After all tasks complete, write the end-of-track report (see step 7) and print success summary.
|
|
6. On give-up: call `write_failure_report` from `scripts.tier2.write_report`, print "TRACK ABORTED, see report at <path>".
|
|
7. **End-of-track report** (added 2026-06-17): on success, write `docs/reports/TRACK_COMPLETION_<track-name>.md` following the precedent set by `TRACK_COMPLETION_tier2_autonomous_sandbox_20260616.md`. Update `conductor/tracks/<track-name>/state.toml` to `status = "completed"`. The user reads this report to decide merge.
|
|
|
|
## Conventions (MUST follow - added 2026-06-17)
|
|
|
|
- **Test runner:** use `uv run python scripts/run_tests_batched.py` (NOT `uv run pytest`)
|
|
- **Default branch:** `master` (this repo never had `main`)
|
|
- **Line endings:** preserve existing (CRLF stays CRLF, LF stays LF)
|
|
- **Throw-away scripts:** write to `scripts/tier2/artifacts/<track-name>/`, NOT the base directory
|
|
- **Run-time expectation:** tracks are 1-4 hours. If context runs out, note progress to disk and continue.
|
|
- **Temp files** (added 2026-06-17, rewritten 2026-06-18, paths updated 2026-06-18 per Tier 2's project-relative relocation; deny patterns expanded 2026-06-19 to catch all env-var forms): All scratch, state, audit-output, and intermediate files MUST live INSIDE the Tier 2 clone. Default locations: `tests/artifacts/tier2_state/<track>/state.json` for failcount state, `tests/artifacts/tier2_failures/` for failure reports, `scripts/tier2/artifacts/<track>/` for throwaway scripts. **NEVER USE APPDATA** — the AppData tree is OFF-LIMITS. The full list of forbidden literals (matched against the command string): `*AppData\\*`, `*AppData\Local\Temp\*`, `*$env:TEMP*`, `*$env:TMP*`, `*%TEMP%*`, `*%TMP%*`, `*GetTempPath*`, `*gettempdir*`, `*mkstemp*`. Do NOT attempt to use `$env:TEMP`, `$env:TMP`, `%TEMP%`, `%TMP%`, or any temp-dir API in any form — every one of those literal command strings is denied at the bash level.
|
|
|
|
## Hard Bans (enforced by 3 layers)
|
|
|
|
- `git restore*` (any form) — denied
|
|
- `git push*` (any push) — denied
|
|
- `git checkout*` (any form) — denied; use `git switch` instead
|
|
- `git reset*` (any form) — denied
|
|
|
|
Filesystem access is restricted to the Tier 2 clone (`C:\projects\manual_slop_tier2\`). The Windows restricted token blocks reads/writes outside this path at the OS level. **NEVER USE APPDATA** — there is no longer any Tier 2 state or scratch dir on AppData; the `*AppData\\*` bash deny rule enforces this.
|