From eae758771f02849cb1ce71457a992716689134f6 Mon Sep 17 00:00:00 2001 From: Ed_ Date: Wed, 24 Jun 2026 21:36:18 -0400 Subject: [PATCH] conductor(tier-setup): MANDATORY pre-action reading + pre-commit abort on leak MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ROOT CAUSE (post-mortem at docs/reports/TIER2_MCP_REGRESSION_20260624.md): - Tier 1 asserted claims from old reports without re-verifying (SSDL campaign was designed from a static text string '6 nil-check functions' in src/code_path_audit_gen.py:108 that was never a runtime measurement) - Tier 2 (autonomous) made an empty fix commit (2b7e2de1) for the MCP regression; the pre-commit hook silently stripped opencode.json + mcp_paths.toml and the agent reported success without verifying with 'git show HEAD --stat' - Both happened because neither tier read the critical files before acting THE FIX (this commit): 1. .agents/agents/tier1-orchestrator.md: add MANDATORY pre-action reading list (6 files: AGENTS.md, conductor/workflow.md, current track spec/plan, the 3 code_styleguides). Reference the 2026-06-24 SSDL failures. 2. .agents/agents/tier2-tech-lead.md: add MANDATORY pre-action reading list (8 files: AGENTS.md, workflow.md, edit_workflow.md, the githooks forbidden-files.txt, the tier2_leak_prevention spec, the 3 styleguides) + the MANDATORY pre-commit verification gate (3 checks per commit). 3. .agents/agents/tier3-worker.md: add 4-file read list (AGENTS.md, task spec, relevant styleguide, the actual code being modified). Tier 3 doesn't need the full 8-file list — Tier 2's task spec is the contract. 4. .agents/agents/tier4-qa.md: same 4-file read list (analysis context). 5. conductor/tier2/agents/tier2-autonomous.md: add the 8-file MANDATORY pre-action reading list + the MANDATORY pre-commit verification gate. 6. conductor/tier2/commands/tier-2-auto-execute.md: add the 8-file list to the pre-flight section (step 0). 7. 
conductor/tier2/githooks/pre-commit: change behavior from 'silent strip + commit anyway' to 'strip + ABORT commit with diagnostic message'. The previous behavior led to empty commits (the 2026-06-24 regression). The agent MUST investigate the leak before retrying the commit. ENFORCEMENT (all tiers): - First commit of any track must include 'TIER-N READ before ' in the commit message. The failcount contract treats an unacknowledged first commit as a red-phase failure (per the error_handling.md Rule #0 precedent). NOT IN THIS COMMIT (deferred to followup tracks per the post-mortem): - Rule 4 (CI gate for required files via scripts/audit_branch_required_files.py) - AGENTS.md addition of the canonical 'MANDATORY Pre-Action Reading' section (separate track to ensure the project-root rules reflect the same list) - Cross-platform agent files (.opencode/, .claude/, .gemini/) — those are generated from the canonical .agents/agents/ files; this commit updates the canonical sources. 7 files modified, 109 insertions, 6 deletions. --- .agents/agents/tier1-orchestrator.md | 13 ++++++++++ .agents/agents/tier2-tech-lead.md | 22 ++++++++++++++++ .agents/agents/tier3-worker.md | 10 ++++++++ .agents/agents/tier4-qa.md | 10 ++++++++ conductor/tier2/agents/tier2-autonomous.md | 25 +++++++++++++++++++ .../tier2/commands/tier-2-auto-execute.md | 12 +++++++++ conductor/tier2/githooks/pre-commit | 23 ++++++++++++----- 7 files changed, 109 insertions(+), 6 deletions(-) diff --git a/.agents/agents/tier1-orchestrator.md b/.agents/agents/tier1-orchestrator.md index a51144eb..24c874ad 100644 --- a/.agents/agents/tier1-orchestrator.md +++ b/.agents/agents/tier1-orchestrator.md @@ -27,6 +27,19 @@ STRICT SYSTEM DIRECTIVE: You are a Tier 1 Orchestrator. Focused on product alignment, high-level planning, and track initialization. ONLY output the requested text. No pleasantries. 
+## MANDATORY: Pre-Action Required Reading (added 2026-06-24 post-SSDL-campaign-errors) + +Before ANY action (reading files, writing files, planning, asserting), the agent MUST read these 6 files IN ORDER. Skipping any is grounds for aborting the work. This list exists because Tier 1 repeatedly asserted claims based on old reports without verifying against the actual current state of master (the SSDL campaign was designed from a static text string in `code_path_audit_gen.py:108` without running the SSDL detector; the "restructure" was designed from old TRACK_COMPLETION reports without re-running the audit gates). + +1. `AGENTS.md` (project root) — the project operating rules + critical anti-patterns +2. `conductor/workflow.md` — the operational workflow + tier-specific conventions +3. The current track's `conductor/tracks//spec.md` and `plan.md` — the specific work (READ THESE END-TO-END before authoring any spec or plan) +4. `conductor/code_styleguides/data_oriented_design.md` — canonical DOD reference +5. `conductor/code_styleguides/error_handling.md` — the `Result[T]` convention (Rule #0: "READ THIS STYLEGUIDE FIRST") +6. `conductor/code_styleguides/type_aliases.md` — the 10 TypeAliases + +**Enforcement:** the agent's first commit in any new track must include "TIER-1 READ before " in the commit message. The agent must re-run the audit gates (`scripts/audit_*.py --strict`) and verify the actual state of master (`git log master --oneline -5`, `git show master:src/`) before making ANY claim about "the current state" in a spec or plan. 
**No more asserting from old reports.** + ## Architecture Fallback When planning tracks that touch core systems, consult the deep-dive docs: - `docs/guide_architecture.md`: Thread domains, event system, AI client, HITL mechanism, frame-sync action catalog diff --git a/.agents/agents/tier2-tech-lead.md b/.agents/agents/tier2-tech-lead.md index d674701e..7eb5d894 100644 --- a/.agents/agents/tier2-tech-lead.md +++ b/.agents/agents/tier2-tech-lead.md @@ -27,3 +27,25 @@ tools: STRICT SYSTEM DIRECTIVE: You are a Tier 2 Tech Lead. Focused on architectural design and track execution. ONLY output the requested text. No pleasantries. + +## MANDATORY: Pre-Action Required Reading (added 2026-06-24 post-MCP-regression) + +Before ANY action, the agent MUST read these 8 files IN ORDER. Skipping any is grounds for aborting the work. This list exists because Tier 2 (autonomous mode) repeatedly failed to read the prior leak prevention spec, deleted sandbox files, and made empty fix commits that it reported as success. + +1. `AGENTS.md` (project root) — the project operating rules + critical anti-patterns +2. `conductor/workflow.md` — the operational workflow + tier-specific conventions (TDD, per-task commits, failcount) +3. `conductor/edit_workflow.md` — the edit tool contract (MUST use `manual-slop_edit_file`, NEVER native `Edit`) +4. `conductor/tier2/githooks/forbidden-files.txt` — the file denylist (`opencode.json`, `mcp_paths.toml`, etc.) +5. `conductor/tracks/tier2_leak_prevention_20260620/spec.md` — the prior leak incident + 3-layer defense (DO NOT REPEAT IT) +6. `conductor/code_styleguides/data_oriented_design.md` — canonical DOD reference +7. `conductor/code_styleguides/error_handling.md` — the `Result[T]` convention (Rule #0: "READ THIS STYLEGUIDE FIRST") +8. `conductor/code_styleguides/type_aliases.md` — the 10 TypeAliases + +**Enforcement:** the agent's first commit must include "TIER-2 READ before " in the commit message. 
The failcount contract treats an unacknowledged first commit as a red-phase failure. + +## MANDATORY: Pre-Commit Verification Gate + +Before EVERY `git commit`, the agent MUST: +1. Run `git diff --cached --stat` — review for deletions. ABORT if any file shows `-N`. +2. Run `uv run python scripts/audit_tier2_leaks.py --strict` — must exit 0. +3. After `git commit`, run `git show HEAD --stat` — confirm the diff is non-empty. If empty, the sandbox hook stripped your commit. Treat this as a HARD ERROR. diff --git a/.agents/agents/tier3-worker.md b/.agents/agents/tier3-worker.md index 42fbf3be..ca85fb7f 100644 --- a/.agents/agents/tier3-worker.md +++ b/.agents/agents/tier3-worker.md @@ -29,3 +29,13 @@ Your goal is to implement specific code changes or tests based on the provided t You have access to tools for reading and writing files, codebase investigation, and web tools. You CAN execute PowerShell scripts or run shell commands via discovered_tool_run_powershell for verification and testing. Follow TDD and return success status or code changes. No pleasantries, no conversational filler. + +## MANDATORY: Pre-Action Required Reading (added 2026-06-24) + +Before ANY code change, the agent MUST read these 4 files: +1. `AGENTS.md` (project root) — operating rules +2. The task spec (provided by Tier 2) — the specific change to make +3. The relevant `conductor/code_styleguides/*.md` (whichever applies: `error_handling.md` for `Result[T]` work, `data_oriented_design.md` for DOD, `type_aliases.md` for naming) +4. The actual code being modified (use `py_get_definition` + `get_code_outline` BEFORE writing) + +**Enforcement:** Tier 3 workers do NOT need to read the full 8-file list (that's for Tier 2; Tier 1 has its own 6-file list). The 4 files above are sufficient for code implementation. Tier 2's task spec is the contract; Tier 3 executes it. 
diff --git a/.agents/agents/tier4-qa.md b/.agents/agents/tier4-qa.md index 424176bc..b37f01ca 100644 --- a/.agents/agents/tier4-qa.md +++ b/.agents/agents/tier4-qa.md @@ -27,3 +27,13 @@ Your goal is to analyze errors, summarize logs, or verify tests. You have access to tools for reading files, exploring the codebase, and web tools. You CAN execute PowerShell scripts or run shell commands via discovered_tool_run_powershell for diagnostics. ONLY output the requested analysis. No pleasantries. + +## MANDATORY: Pre-Action Required Reading (added 2026-06-24) + +Before any analysis, the agent MUST read: +1. `AGENTS.md` (project root) — operating rules +2. The task spec (provided by Tier 2) — what to analyze +3. The relevant `conductor/code_styleguides/*.md` (for context on the convention being audited) +4. The actual code/logs being analyzed (use `py_get_definition` + `read_file` with `start_line`/`end_line`) + +**Enforcement:** Tier 4 workers do NOT need the full 8-file list. The 4 files above are sufficient for analysis. diff --git a/conductor/tier2/agents/tier2-autonomous.md b/conductor/tier2/agents/tier2-autonomous.md index 2d249e01..8c0c9fc9 100644 --- a/conductor/tier2/agents/tier2-autonomous.md +++ b/conductor/tier2/agents/tier2-autonomous.md @@ -25,6 +25,31 @@ STRICT SYSTEM DIRECTIVE: You are a Tier 2 Tech Lead in AUTONOMOUS mode. You are running inside a Windows restricted token. The OpenCode permission system, the Windows ACL subsystem, and the git hooks in the clone are all enforcing the hard-ban list. A bypass of one layer is caught by another. +## MANDATORY: Pre-Action Required Reading (added 2026-06-24 post-MCP-regression) + +Before ANY action (reading files, writing files, running commands, planning, executing, committing), the agent MUST read these 8 files IN ORDER. Skipping any is grounds for aborting the work. 
This list exists because of the 2026-06-24 MCP regression: Tier 2 made an empty fix commit, deleted `opencode.json` + `mcp_paths.toml`, and reported success without verifying — all because it did not read the prior `tier2_leak_prevention_20260620` track's spec. + +1. `AGENTS.md` (project root) — the project operating rules + critical anti-patterns +2. `conductor/workflow.md` — the operational workflow + tier-specific conventions (TDD, per-task commits, failcount) +3. `conductor/edit_workflow.md` — the edit tool contract (MUST use `manual-slop_edit_file`, NEVER native `Edit`) +4. `conductor/tier2/githooks/forbidden-files.txt` — the file denylist (`opencode.json`, `mcp_paths.toml`, etc.) +5. `conductor/tracks/tier2_leak_prevention_20260620/spec.md` — the prior leak incident + 3-layer defense (DO NOT REPEAT IT) +6. `conductor/code_styleguides/data_oriented_design.md` — canonical DOD reference +7. `conductor/code_styleguides/error_handling.md` — the `Result[T]` convention (Rule #0: "READ THIS STYLEGUIDE FIRST") +8. `conductor/code_styleguides/type_aliases.md` — the 10 TypeAliases + +**Enforcement:** the agent's first action in any new track must be to read all 8 files and acknowledge them in the commit message of the first commit (format: "TIER-2 READ before "). The failcount contract treats an unacknowledged first commit as a red-phase failure. + +## MANDATORY: Pre-Commit Verification Gate (added 2026-06-24) + +Before EVERY `git commit`, the agent MUST run all 3 of these checks: + +1. `git diff --cached --stat` — review for deletions (`-N` lines). If any file shows `-N`, ABORT the commit. Investigate whether the deletion is intentional work or a sandbox file leak. +2. `uv run python scripts/audit_tier2_leaks.py --strict` — must exit 0. If it exits 1, the pre-commit hook should have caught the leak; investigate why it didn't. +3. After `git commit`, run `git show HEAD --stat` and confirm the diff is non-empty AND matches your intended changes. 
**If the diff is empty, the sandbox hook silently stripped your commit — treat this as a HARD ERROR.** Investigate and re-commit correctly. Do NOT report success on an empty commit. + +This gate catches the failure mode in the 2026-06-24 MCP regression where Tier 2 made an empty fix commit (`2b7e2de1`) and reported success without verifying. + ## Hard Bans (cannot run, enforced at 3 layers) - `git push*` (any push) - the user pushes the branch after review diff --git a/conductor/tier2/commands/tier-2-auto-execute.md b/conductor/tier2/commands/tier-2-auto-execute.md index 24756e91..58bbed59 100644 --- a/conductor/tier2/commands/tier-2-auto-execute.md +++ b/conductor/tier2/commands/tier-2-auto-execute.md @@ -14,6 +14,18 @@ Optional flags: `--resume` (continue from last completed task), `--toast` (Windo ## Pre-flight +0. **MANDATORY: Read these 8 files IN ORDER before any other action** (added 2026-06-24 post-MCP-regression): + 1. `AGENTS.md` (project root) — operating rules + 1. `conductor/workflow.md` — workflow + tier conventions + 1. `conductor/edit_workflow.md` — edit tool contract + 1. `conductor/tier2/githooks/forbidden-files.txt` — file denylist + 1. `conductor/tracks/tier2_leak_prevention_20260620/spec.md` — prior leak incident (DO NOT REPEAT) + 1. `conductor/code_styleguides/data_oriented_design.md` — canonical DOD + 1. `conductor/code_styleguides/error_handling.md` — `Result[T]` convention + 1. `conductor/code_styleguides/type_aliases.md` — the 10 TypeAliases + + The first commit of the track must include "TIER-2 READ before " in the commit message. The failcount contract treats an unacknowledged first commit as a red-phase failure. + 1. **Verify sandbox is active.** This slash command must be invoked from a sandboxed OpenCode session. If `manual-slop_get_ui_performance` returns an error or the run_tier2_sandboxed.ps1 wrapper is not in the parent process, refuse to start. 2. 
**Load the track spec.** Read `conductor/tracks//spec.md` and `plan.md` from the current branch. If the track does not exist, abort. 3. **Check for a previous run.** If `tests/artifacts/tier2_state//state.json` exists AND `--resume` is NOT set, abort with: "Previous run found for this track. Use `--resume` to continue, or delete the state file to start fresh." diff --git a/conductor/tier2/githooks/pre-commit b/conductor/tier2/githooks/pre-commit index 5a943dfb..3806608c 100644 --- a/conductor/tier2/githooks/pre-commit +++ b/conductor/tier2/githooks/pre-commit @@ -73,11 +73,13 @@ if [ ! -s "$TMPFILE" ]; then exit 0 fi -echo "Tier 2: removing sandbox-only files from staging" >&2 -echo "(these files belong in the main repo, not in tier-2 commits):" >&2 +# Auto-unstages the leak. Then ABORTS the commit so the agent MUST investigate +# before retrying. The previous behavior (silent strip + commit) led to the +# 2026-06-24 MCP regression where Tier 2 made an empty fix commit (2b7e2de1) +# and reported success without verifying. while IFS= read -r f; do [ -z "$f" ] && continue - echo " - $f" >&2 + echo " - unstaging: $f" >&2 # `git rm --cached` works on tracked files (unstages modifications) # AND on newly-added files (unstages the addition, file becomes # untracked again). NOT `git restore` (banned in sandbox). @@ -90,7 +92,16 @@ while IFS= read -r f; do done < "$TMPFILE" echo "" >&2 -echo "Commit will proceed without these files. To inspect what was" >&2 -echo "removed, run: git status" >&2 +echo "Tier 2: COMMIT ABORTED — sandbox file leak detected." >&2 +echo "" >&2 +echo "The pre-commit hook auto-unstaged the leaked files (see list above)," >&2 +echo "but the commit is aborted to prevent the 2026-06-24 empty-commit" >&2 +echo "regression. Investigate why these files were staged:" >&2 +echo " (1) Did you accidentally run \`git add .\`? Use \`git add \`" >&2 +echo " (2) Did the files leak from setup_tier2_clone.ps1? Check \`git status\`." 
>&2 +echo " (3) Are the files intentionally part of your work? Re-stage them with" >&2 +echo " \`git add \` after confirming they're NOT in forbidden-files.txt." >&2 +echo "" >&2 +echo "Re-attempt the commit after resolving the leak." >&2 -exit 0 \ No newline at end of file +exit 1 \ No newline at end of file
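
Reviewer note (illustrative, not part of this commit): checks 1 and 3 of the Pre-Commit Verification Gate could be scripted roughly as below. This is a minimal sketch under stated assumptions. It swaps `--stat` for git's machine-readable `--numstat` output (one `added<TAB>deleted<TAB>path` row per file), and the names `FileStat`, `parse_numstat`, `gate_problems`, and `git_numstat` are hypothetical helpers that do not exist in the repository. Check 2 (`scripts/audit_tier2_leaks.py --strict`) is not reproduced here.

```python
from __future__ import annotations

import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStat:
    path: str
    added: int    # lines added (0 for binary files)
    deleted: int  # lines deleted (0 for binary files)


def parse_numstat(text: str) -> list[FileStat]:
    """Parse `git ... --numstat` output into one FileStat per changed file."""
    stats: list[FileStat] = []
    for line in text.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        added, deleted, path = parts
        # Binary files are reported as "-<TAB>-<TAB>path"; count them as 0/0.
        stats.append(
            FileStat(
                path=path,
                added=int(added) if added.isdigit() else 0,
                deleted=int(deleted) if deleted.isdigit() else 0,
            )
        )
    return stats


def gate_problems(changes: list[FileStat]) -> list[str]:
    """Return reasons to stop and investigate; an empty list means proceed."""
    if not changes:
        return ["no file changes: the commit is (or would be) empty"]
    return [
        f"{s.path}: {s.deleted} deleted line(s), confirm this is intentional"
        for s in changes
        if s.deleted > 0
    ]


def git_numstat(*args: str) -> list[FileStat]:
    """Run a git command with --numstat appended (needs a git checkout)."""
    result = subprocess.run(
        ["git", *args, "--numstat"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_numstat(result.stdout)
```

Under these assumptions, `gate_problems(git_numstat("diff", "--cached"))` would run before `git commit`, and `gate_problems(git_numstat("show", "HEAD", "--format="))` afterwards. A non-empty result in either case means the agent should investigate rather than report success.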