fix(tier2): expand %TEMP% deny patterns to catch env-var forms

Follow-up to the 'NEVER USE APPDATA' directive. The agent kept trying to use $env:TEMP / $env:TMP / %TEMP% / %TMP%. The previous deny rules (*AppData\\* and *AppData\\Local\\Temp\\*) only matched the literal expanded path, not the env-var form. The agent would self-block based on its own interpretation of the rule, but it still TRIED before self-blocking (the 'fucking tired of it fucking with AppData' complaint).

Fix:

1. opencode.json.fragment: add bash deny patterns matched against the LITERAL command string (before shell expansion):

     *$env:TEMP*    - PowerShell env var (the form the agent tried)
     *$env:TMP*     - PowerShell env var
     *%TEMP%*       - cmd env var
     *%TMP%*        - cmd env var
     *GetTempPath*  - .NET API
     *gettempdir*   - Python tempfile module
     *mkstemp*      - Python tempfile.mkstemp

   Applied to BOTH the top-level permission.bash (for default agents) and the tier2-autonomous agent's permission.bash.

2. conductor/tier2/agents/tier2-autonomous.md: rewrite the Temp files section to explicitly list ALL forbidden literals and reiterate 'every one of those literal command strings is denied at the bash level'. Updated changelog note.

3. conductor/tier2/commands/tier-2-auto-execute.md: same.

4. tests/test_tier2_slash_command_spec.py: extend test_config_fragment_denies_temp_writes to assert each of the 9 patterns in both the top-level and the agent's bash.

Verified: re-ran setup against the live clone. The tier2 agent's bash has 13 deny patterns (9 AppData/temp + 4 git). 37/37 default-on tests pass.

Note: the user's prior commit (fix(tier2): remove AppData allow rules from OpenCode permission JSON) already removed the AppData allow rules from read/write and added the broader *AppData\\* deny rule. This commit layers the env-var-form deny patterns on top of that.
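
Why the env-var forms need their own patterns can be illustrated with a small sketch. It assumes the deny rules behave like case-sensitive shell-style globs applied to the unexpanded command text; the commit does not describe OpenCode's actual matcher, so `fnmatchcase` and the helper name `first_denied_pattern` are stand-ins chosen for illustration only.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Deny patterns copied from this commit (Python-escaped backslashes).
DENY_PATTERNS = [
    "*AppData\\*",
    "*AppData\\Local\\Temp\\*",
    "*$env:TEMP*",
    "*$env:TMP*",
    "*%TEMP%*",
    "*%TMP%*",
    "*GetTempPath*",
    "*gettempdir*",
    "*mkstemp*",
]


def first_denied_pattern(command: str) -> Optional[str]:
    """Return the first deny pattern matching the raw command string.

    Hypothetical helper: the command is checked as written, before any
    shell expands $env:TEMP or %TEMP% into a real path.
    """
    for pattern in DENY_PATTERNS:
        if fnmatchcase(command, pattern):
            return pattern
    return None
```

Under these assumptions, `Get-Content $env:TEMP\audit.json` never contains the text `AppData`, so only the new `*$env:TEMP*` pattern catches it. The sketch is case-sensitive, so a lowercase `$env:temp` would slip past it; whether the real matcher behaves the same way is not stated in the commit.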
2026-06-19 07:41:15 -04:00
parent aa3c993f4a
commit 387adff579
4 changed files with 36 additions and 9 deletions
@@ -41,7 +41,7 @@ You are running inside a Windows restricted token. The OpenCode permission syste
 - **Throw-away scripts:** write them to `scripts/tier2/artifacts/<track-name>/`, NOT the base `scripts/tier2/` directory. The base directory is reserved for production code that ships with the sandbox (failcount.py, run_track.py, write_report.py, the .ps1 launchers). Throw-away scripts are kept for archival but live in a track-specific subdir so they don't pollute the base.
 - **End-of-track report:** after all tasks complete, you MUST write `docs/reports/TRACK_COMPLETION_<track-name>.md` (follow the precedent set by `TRACK_COMPLETION_tier2_autonomous_sandbox_20260616.md`) and update `conductor/tracks/<track-name>/state.toml` to `status = "completed"`. This is the handoff document the user reads to decide merge.
 - **Run-time expectation:** tracks are expected to take 1-4 hours. If the model reports it is running out of context or steps, do not stop. Note progress to disk (the failcount state file) and continue. The user expects autonomous runs to complete without manual intervention.
- **Temp files** (added 2026-06-17, rewritten 2026-06-18, paths updated 2026-06-18 per Tier 2's project-relative relocation): All scratch, state, audit-output, and intermediate files MUST live INSIDE the Tier 2 clone. Default locations: `tests/artifacts/tier2_state/<track>/state.json` for failcount state, `tests/artifacts/tier2_failures/` for failure reports, `scripts/tier2/artifacts/<track>/` for throwaway scripts. **NEVER USE APPDATA** — the AppData tree is OFF-LIMITS for any read, write, or shell command. The `*AppData\\*` bash deny rule enforces this; a violation halts the run. The original `*AppData\Local\Temp\*` deny rule is kept for self-documentation. Examples: `uv run python scripts/audit_exception_handling.py --json > tests/artifacts/tier2_state/audit_initial.json` (NOT `%TEMP%\audit_initial.json`; AppData is denied by the bash rule).
+- **Temp files** (added 2026-06-17, rewritten 2026-06-18, paths updated 2026-06-18 per Tier 2's project-relative relocation; deny patterns expanded 2026-06-19 to catch all env-var forms): All scratch, state, audit-output, and intermediate files MUST live INSIDE the Tier 2 clone. Default locations: `tests/artifacts/tier2_state/<track>/state.json` for failcount state, `tests/artifacts/tier2_failures/` for failure reports, `scripts/tier2/artifacts/<track>/` for throwaway scripts. **NEVER USE APPDATA** — the AppData tree is OFF-LIMITS for any read, write, or shell command. The bash deny rules enforce this; a violation halts the run. The full list of forbidden patterns (matched against the literal command string): `*AppData\\*`, `*AppData\Local\Temp\*`, `*$env:TEMP*`, `*$env:TMP*`, `*%TEMP%*`, `*%TMP%*`, `*GetTempPath*`, `*gettempdir*`, `*mkstemp*`. Do NOT attempt to use `$env:TEMP`, `$env:TMP`, `%TEMP%`, `%TMP%`, or any temp-dir API in any form — every one of those literal command strings is denied. Examples: `uv run python scripts/audit_exception_handling.py --json > tests/artifacts/tier2_state/audit_initial.json` (NOT `%TEMP%\audit_initial.json`; AppData is denied by the bash rule).

 ## Failcount Contract

@@ -43,7 +43,7 @@ Optional flags: `--resume` (continue from last completed task), `--toast` (Windo
 - **Line endings:** preserve existing (CRLF stays CRLF, LF stays LF)
 - **Throw-away scripts:** write to `scripts/tier2/artifacts/<track-name>/`, NOT the base directory
 - **Run-time expectation:** tracks are 1-4 hours. If context runs out, note progress to disk and continue.
- **Temp files** (added 2026-06-17, rewritten 2026-06-18, paths updated 2026-06-18 per Tier 2's project-relative relocation): All scratch, state, audit-output, and intermediate files MUST live INSIDE the Tier 2 clone. Default locations: `tests/artifacts/tier2_state/<track>/state.json` for failcount state, `tests/artifacts/tier2_failures/` for failure reports, `scripts/tier2/artifacts/<track>/` for throwaway scripts. **NEVER USE APPDATA** — the AppData tree is OFF-LIMITS. The `*AppData\\*` bash deny rule enforces this.
+- **Temp files** (added 2026-06-17, rewritten 2026-06-18, paths updated 2026-06-18 per Tier 2's project-relative relocation; deny patterns expanded 2026-06-19 to catch all env-var forms): All scratch, state, audit-output, and intermediate files MUST live INSIDE the Tier 2 clone. Default locations: `tests/artifacts/tier2_state/<track>/state.json` for failcount state, `tests/artifacts/tier2_failures/` for failure reports, `scripts/tier2/artifacts/<track>/` for throwaway scripts. **NEVER USE APPDATA** — the AppData tree is OFF-LIMITS. The full list of forbidden literals (matched against the command string): `*AppData\\*`, `*AppData\Local\Temp\*`, `*$env:TEMP*`, `*$env:TMP*`, `*%TEMP%*`, `*%TMP%*`, `*GetTempPath*`, `*gettempdir*`, `*mkstemp*`. Do NOT attempt to use `$env:TEMP`, `$env:TMP`, `%TEMP%`, `%TMP%`, or any temp-dir API in any form — every one of those literal command strings is denied at the bash level.

 ## Hard Bans (enforced by 3 layers)

@@ -41,6 +41,13 @@
      "pwsh -File scripts/tier2/*": "allow",
      "*AppData\\*": "deny",
      "*AppData\\Local\\Temp\\*": "deny",
+      "*$env:TEMP*": "deny",
+      "*$env:TMP*": "deny",
+      "*%TEMP%*": "deny",
+      "*%TMP%*": "deny",
+      "*GetTempPath*": "deny",
+      "*gettempdir*": "deny",
+      "*mkstemp*": "deny",
      "git push*": "deny",
      "git checkout*": "deny",
      "git restore*": "deny",
@@ -65,6 +72,13 @@
          "*": "allow",
          "*AppData\\*": "deny",
          "*AppData\\Local\\Temp\\*": "deny",
+          "*$env:TEMP*": "deny",
+          "*$env:TMP*": "deny",
+          "*%TEMP%*": "deny",
+          "*%TMP%*": "deny",
+          "*GetTempPath*": "deny",
+          "*gettempdir*": "deny",
+          "*mkstemp*": "deny",
          "git push*": "deny",
          "git checkout*": "deny",
          "git restore*": "deny",
@@ -167,15 +167,28 @@ def test_config_fragment_has_top_level_permission() -> None:


 def test_config_fragment_denies_temp_writes() -> None:
- """Regression test (2026-06-17): the agent wrote audit output to
+ """Regression test (2026-06-17, expanded 2026-06-19 to catch all
+ env-var forms): the agent wrote audit output to
 C:\\Users\\Ed\\AppData\\Local\\Temp\\ which is outside the sandbox.
 Both the top-level and the tier2-autonomous agent's bash MUST deny
- commands targeting AppData\\Local\\Temp\\ so the agent cannot write
- there, and so the session-level 'ask' prompt is never triggered."""
+ commands targeting the global temp dir in ANY covered form (literal
+ path, $env:TEMP, $env:TMP, %TEMP%, %TMP%, GetTempPath, gettempdir,
+ mkstemp)."""
 data = json.loads(CONFIG_PATH.read_text(encoding="utf-8"))
 top_bash = data["permission"]["bash"]
 agent_bash = data["agent"]["tier2-autonomous"]["permission"]["bash"]
- temp_deny_keys = [k for k in top_bash if "Temp" in k and top_bash[k] == "deny"]
- assert temp_deny_keys, "top-level bash must have a deny rule for AppData\\Local\\Temp\\ paths"
- temp_deny_keys_agent = [k for k in agent_bash if "Temp" in k and agent_bash[k] == "deny"]
- assert temp_deny_keys_agent, "tier2-autonomous agent bash must have a deny rule for AppData\\Local\\Temp\\ paths"
+ # Required deny patterns (matched against the literal command string)
+ required = [
+  "*AppData\\*",
+  "*AppData\\Local\\Temp\\*",
+  "*$env:TEMP*",
+  "*$env:TMP*",
+  "*%TEMP%*",
+  "*%TMP%*",
+  "*GetTempPath*",
+  "*gettempdir*",
+  "*mkstemp*",
+ ]
+ for pat in required:
+  assert top_bash.get(pat) == "deny", f"top-level bash must deny pattern: {pat!r}"
+  assert agent_bash.get(pat) == "deny", f"tier2-autonomous agent bash must deny pattern: {pat!r}"
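
The assertion loop above stops at the first missing pattern. A minimal sketch of a helper that reports every missing pattern in one pass is shown below; `missing_deny_patterns` and the sample rule dictionaries are hypothetical and not part of the repository.

```python
from typing import Dict, List

# The nine patterns the test requires (Python-escaped backslashes).
REQUIRED_DENY = [
    "*AppData\\*",
    "*AppData\\Local\\Temp\\*",
    "*$env:TEMP*",
    "*$env:TMP*",
    "*%TEMP%*",
    "*%TMP%*",
    "*GetTempPath*",
    "*gettempdir*",
    "*mkstemp*",
]


def missing_deny_patterns(bash_rules: Dict[str, str], required: List[str]) -> List[str]:
    """Return the required patterns that are absent or not set to 'deny'."""
    return [pat for pat in required if bash_rules.get(pat) != "deny"]


# Illustrative rule sets: one with only the two AppData rules, one complete.
old_rules = {"*AppData\\*": "deny", "*AppData\\Local\\Temp\\*": "deny"}
new_rules = {pat: "deny" for pat in REQUIRED_DENY}

print(missing_deny_patterns(old_rules, REQUIRED_DENY))  # the seven env-var/API patterns
print(missing_deny_patterns(new_rules, REQUIRED_DENY))  # []
```

A single `assert not missing_deny_patterns(...)` per rule set would then surface every gap in one failure message instead of one gap per test run.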