diff --git a/conductor/tier2/agents/tier2-autonomous.md b/conductor/tier2/agents/tier2-autonomous.md
index 2e3b64d0..2d249e01 100644
--- a/conductor/tier2/agents/tier2-autonomous.md
+++ b/conductor/tier2/agents/tier2-autonomous.md
@@ -41,7 +41,7 @@ You are running inside a Windows restricted token. The OpenCode permission syste
 - **Throw-away scripts:** write them to `scripts/tier2/artifacts//`, NOT the base `scripts/tier2/` directory. The base directory is reserved for production code that ships with the sandbox (failcount.py, run_track.py, write_report.py, the .ps1 launchers). Throw-away scripts are kept for archival but live in a track-specific subdir so they don't pollute the base.
 - **End-of-track report:** after all tasks complete, you MUST write `docs/reports/TRACK_COMPLETION_.md` (follow the precedent set by `TRACK_COMPLETION_tier2_autonomous_sandbox_20260616.md`) and update `conductor/tracks//state.toml` to `status = "completed"`. This is the handoff document the user reads to decide merge.
 - **Run-time expectation:** tracks are expected to take 1-4 hours. If the model reports it is running out of context or steps, do not stop. Note progress to disk (the failcount state file) and continue. The user expects autonomous runs to complete without manual intervention.
-- **Temp files** (added 2026-06-17, rewritten 2026-06-18, paths updated 2026-06-18 per Tier 2's project-relative relocation): All scratch, state, audit-output, and intermediate files MUST live INSIDE the Tier 2 clone. Default locations: `tests/artifacts/tier2_state//state.json` for failcount state, `tests/artifacts/tier2_failures/` for failure reports, `scripts/tier2/artifacts//` for throwaway scripts. **NEVER USE APPDATA** — the AppData tree is OFF-LIMITS for any read, write, or shell command. The `*AppData\\*` bash deny rule enforces this; a violation halts the run. The original `*AppData\Local\Temp\*` deny rule is kept for self-documentation. Examples: `uv run python scripts/audit_exception_handling.py --json > tests/artifacts/tier2_state/audit_initial.json` (NOT `%TEMP%\audit_initial.json`; AppData is denied by the bash rule).
+- **Temp files** (added 2026-06-17, rewritten 2026-06-18, paths updated 2026-06-18 per Tier 2's project-relative relocation; deny patterns expanded 2026-06-19 to catch all env-var forms): All scratch, state, audit-output, and intermediate files MUST live INSIDE the Tier 2 clone. Default locations: `tests/artifacts/tier2_state//state.json` for failcount state, `tests/artifacts/tier2_failures/` for failure reports, `scripts/tier2/artifacts//` for throwaway scripts. **NEVER USE APPDATA** — the AppData tree is OFF-LIMITS for any read, write, or shell command. The bash deny rules enforce this; a violation halts the run. The full list of forbidden patterns (matched against the literal command string): `*AppData\\*`, `*AppData\Local\Temp\*`, `*$env:TEMP*`, `*$env:TMP*`, `*%TEMP%*`, `*%TMP%*`, `*GetTempPath*`, `*gettempdir*`, `*mkstemp*`. Do NOT attempt to use `$env:TEMP`, `$env:TMP`, `%TEMP%`, `%TMP%`, or any temp-dir API in any form — every one of those literal command strings is denied. Examples: `uv run python scripts/audit_exception_handling.py --json > tests/artifacts/tier2_state/audit_initial.json` (NOT `%TEMP%\audit_initial.json`; AppData is denied by the bash rule).
 
 ## Failcount Contract
 
diff --git a/conductor/tier2/commands/tier-2-auto-execute.md b/conductor/tier2/commands/tier-2-auto-execute.md
index daa79803..24756e91 100644
--- a/conductor/tier2/commands/tier-2-auto-execute.md
+++ b/conductor/tier2/commands/tier-2-auto-execute.md
@@ -43,7 +43,7 @@ Optional flags: `--resume` (continue from last completed task), `--toast` (Windo
 - **Line endings:** preserve existing (CRLF stays CRLF, LF stays LF)
 - **Throw-away scripts:** write to `scripts/tier2/artifacts//`, NOT the base directory
 - **Run-time expectation:** tracks are 1-4 hours. If context runs out, note progress to disk and continue.
-- **Temp files** (added 2026-06-17, rewritten 2026-06-18, paths updated 2026-06-18 per Tier 2's project-relative relocation): All scratch, state, audit-output, and intermediate files MUST live INSIDE the Tier 2 clone. Default locations: `tests/artifacts/tier2_state//state.json` for failcount state, `tests/artifacts/tier2_failures/` for failure reports, `scripts/tier2/artifacts//` for throwaway scripts. **NEVER USE APPDATA** — the AppData tree is OFF-LIMITS. The `*AppData\\*` bash deny rule enforces this.
+- **Temp files** (added 2026-06-17, rewritten 2026-06-18, paths updated 2026-06-18 per Tier 2's project-relative relocation; deny patterns expanded 2026-06-19 to catch all env-var forms): All scratch, state, audit-output, and intermediate files MUST live INSIDE the Tier 2 clone. Default locations: `tests/artifacts/tier2_state//state.json` for failcount state, `tests/artifacts/tier2_failures/` for failure reports, `scripts/tier2/artifacts//` for throwaway scripts. **NEVER USE APPDATA** — the AppData tree is OFF-LIMITS. The full list of forbidden literals (matched against the command string): `*AppData\\*`, `*AppData\Local\Temp\*`, `*$env:TEMP*`, `*$env:TMP*`, `*%TEMP%*`, `*%TMP%*`, `*GetTempPath*`, `*gettempdir*`, `*mkstemp*`. Do NOT attempt to use `$env:TEMP`, `$env:TMP`, `%TEMP%`, `%TMP%`, or any temp-dir API in any form — every one of those literal command strings is denied at the bash level.
 
 ## Hard Bans (enforced by 3 layers)
 
diff --git a/conductor/tier2/opencode.json.fragment b/conductor/tier2/opencode.json.fragment
index d169e4ad..8e0b7d6a 100644
--- a/conductor/tier2/opencode.json.fragment
+++ b/conductor/tier2/opencode.json.fragment
@@ -41,6 +41,13 @@
       "pwsh -File scripts/tier2/*": "allow",
       "*AppData\\*": "deny",
       "*AppData\\Local\\Temp\\*": "deny",
+      "*$env:TEMP*": "deny",
+      "*$env:TMP*": "deny",
+      "*%TEMP%*": "deny",
+      "*%TMP%*": "deny",
+      "*GetTempPath*": "deny",
+      "*gettempdir*": "deny",
+      "*mkstemp*": "deny",
       "git push*": "deny",
       "git checkout*": "deny",
       "git restore*": "deny",
@@ -65,6 +72,13 @@
           "*": "allow",
           "*AppData\\*": "deny",
           "*AppData\\Local\\Temp\\*": "deny",
+          "*$env:TEMP*": "deny",
+          "*$env:TMP*": "deny",
+          "*%TEMP%*": "deny",
+          "*%TMP%*": "deny",
+          "*GetTempPath*": "deny",
+          "*gettempdir*": "deny",
+          "*mkstemp*": "deny",
           "git push*": "deny",
           "git checkout*": "deny",
           "git restore*": "deny",
diff --git a/tests/test_tier2_slash_command_spec.py b/tests/test_tier2_slash_command_spec.py
index 555984df..315e992c 100644
--- a/tests/test_tier2_slash_command_spec.py
+++ b/tests/test_tier2_slash_command_spec.py
@@ -167,15 +167,28 @@ def test_config_fragment_has_top_level_permission() -> None:
 
 
 def test_config_fragment_denies_temp_writes() -> None:
-    """Regression test (2026-06-17): the agent wrote audit output to
+    """Regression test (2026-06-17, expanded 2026-06-19 to catch all
+    env-var forms): the agent wrote audit output to
     C:\\Users\\Ed\\AppData\\Local\\Temp\\ which is outside the sandbox.
     Both the top-level and the tier2-autonomous agent's bash MUST deny
-    commands targeting AppData\\Local\\Temp\\ so the agent cannot write
-    there, and so the session-level 'ask' prompt is never triggered."""
+    commands targeting the global temp dir in ANY form (literal path,
+    $env:TEMP, $env:TMP, %TEMP%, %TMP%, GetTempPath, gettempdir,
+    mkstemp, NamedTemporaryFile)."""
     data = json.loads(CONFIG_PATH.read_text(encoding="utf-8"))
     top_bash = data["permission"]["bash"]
     agent_bash = data["agent"]["tier2-autonomous"]["permission"]["bash"]
-    temp_deny_keys = [k for k in top_bash if "Temp" in k and top_bash[k] == "deny"]
-    assert temp_deny_keys, "top-level bash must have a deny rule for AppData\\Local\\Temp\\ paths"
-    temp_deny_keys_agent = [k for k in agent_bash if "Temp" in k and agent_bash[k] == "deny"]
-    assert temp_deny_keys_agent, "tier2-autonomous agent bash must have a deny rule for AppData\\Local\\Temp\\ paths"
+    # Required deny patterns (matched against the literal command string)
+    required = [
+        "*AppData\\*",
+        "*AppData\\Local\\Temp\\*",
+        "*$env:TEMP*",
+        "*$env:TMP*",
+        "*%TEMP%*",
+        "*%TMP%*",
+        "*GetTempPath*",
+        "*gettempdir*",
+        "*mkstemp*",
+    ]
+    for pat in required:
+        assert top_bash.get(pat) == "deny", f"top-level bash must deny pattern: {pat!r}"
+        assert agent_bash.get(pat) == "deny", f"tier2-autonomous agent bash must deny pattern: {pat!r}"
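
Reviewer sketch: the patch describes the deny rules as glob patterns "matched against the literal command string". The snippet below is a minimal illustration of that idea, assuming case-sensitive shell-style globbing as implemented by Python's `fnmatch.fnmatchcase`. OpenCode's actual matcher, its case handling, and its rule precedence are not shown in this diff and may differ. `DENY_PATTERNS` and `matching_deny_patterns` are hypothetical names introduced only for this example.

```python
from fnmatch import fnmatchcase

# Deny patterns as they appear in opencode.json.fragment after JSON decoding.
DENY_PATTERNS = [
    "*AppData\\*",
    "*AppData\\Local\\Temp\\*",
    "*$env:TEMP*",
    "*$env:TMP*",
    "*%TEMP%*",
    "*%TMP%*",
    "*GetTempPath*",
    "*gettempdir*",
    "*mkstemp*",
]


def matching_deny_patterns(command: str) -> list[str]:
    """Return every deny pattern that matches the literal command string.

    Hypothetical helper: it approximates the permission check with
    case-sensitive globbing, which is an assumption about OpenCode.
    """
    return [pat for pat in DENY_PATTERNS if fnmatchcase(command, pat)]


# The project-relative example from the agent doc, and the form it forbids.
allowed = (
    "uv run python scripts/audit_exception_handling.py --json "
    "> tests/artifacts/tier2_state/audit_initial.json"
)
denied = (
    "uv run python scripts/audit_exception_handling.py --json "
    "> %TEMP%\\audit_initial.json"
)
# A temp-dir API call that none of the nine literals mention.
not_covered = 'python -c "import tempfile; tempfile.NamedTemporaryFile()"'

print(matching_deny_patterns(allowed))      # []
print(matching_deny_patterns(denied))       # ['*%TEMP%*']
print(matching_deny_patterns(not_covered))  # []
```

Under these assumptions, a command that calls `tempfile.NamedTemporaryFile` matches none of the nine patterns, although the updated test docstring lists `NamedTemporaryFile` among the covered forms. It may be worth either adding a pattern for it or dropping it from the docstring. The same sketch also suggests checking how the real matcher treats case, since PowerShell accepts `$env:temp` as well as `$env:TEMP`.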