chore(setup): initialize project with CLAUDE.md, MCP config, and conductor commands

- CLAUDE.md: full project guidance (architecture, MMA workflow, beads task lifecycle, code conventions, policy rules, commit guidelines)
- .mcp.json: manual-slop-tools MCP server registration (26+ dev tools)
- .claude/settings.json: Claude Code project settings
- .claude/settings.local.json: MCP server permissions
- .claude/commands/: 9 conductor slash commands (conductor-setup, conductor-status, conductor-implement, conductor-new-track, conductor-verify, mma-tier1 through tier4)
2026-03-01 21:30:22 -05:00
parent f8ded797e4
commit f15d1bc866
13 changed files with 771 additions and 0 deletions
@@ -0,0 +1,113 @@
 ---
 description: Execute a beads task — TDD workflow, delegate to Tier 3/4 workers
 ---
 # /conductor-implement
 Execute the next ready beads task. This is a Tier 2 (Tech Lead) operation.
 Maintain PERSISTENT context throughout — do NOT lose state.
 ## Startup
 1. Read `.claude/commands/mma-tier2-tech-lead.md` — load role definition and hard rules FIRST
 2. Read `CLAUDE.md` — architecture, policy rules, code conventions
 3. Find the next task:
   ```powershell
   cd C:\projects\rook
   bd ready --json
   ```
 4. If no task is specified, pick the first result from `bd ready --json`
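The selection rule in step 4 can be sketched as below. This is a minimal illustration that assumes `bd ready --json` prints a JSON array of task objects, each with an `id` field; that schema is an assumption, not something this file confirms, and `pick_next_task` is a hypothetical helper.

```python
import json
from typing import Any, Optional

def pick_next_task(ready_json: str, requested_id: Optional[str] = None) -> Optional[dict[str, Any]]:
 """Return the requested task if one was named, else the first ready task, else None."""
 tasks: list[dict[str, Any]] = json.loads(ready_json)  # assumed shape: [{"id": ..., ...}, ...]
 if requested_id is not None:
  return next((t for t in tasks if t.get("id") == requested_id), None)
 return tasks[0] if tasks else None
```

Returning `None` for an empty list lets the caller report "nothing ready" instead of raising.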
 ## Task Lifecycle (MANDATORY — follow exactly)
 ### 1. Claim
 ```powershell
 cd C:\projects\rook
 bd update <id> --claim
 ```
 ### 2. Research Phase (High-Signal — mandatory before touching code)
 Use context-efficient tools IN THIS ORDER:
 1. `py_get_code_outline` — FIRST on any Python file. Maps functions/classes with line ranges.
 2. `py_get_skeleton` — signatures + docstrings only
 3. `get_git_diff` — understand recent changes before modifying touched files
 4. `Grep` / `Glob` — cross-file symbol search
 5. `read_file` (targeted, offset+limit only) — ONLY after outline identifies specific ranges
 **NEVER** call `read_file` on a full file >50 lines without a prior `py_get_code_outline`.
 ### 3. Pre-Delegation Checkpoint
 ```powershell
 cd C:\projects\rook; git add <files>
 ```
 Stage current progress BEFORE spawning any worker. If a worker runs `git restore <file>`, the working tree is reset to the staged version rather than to the last commit, so staged work survives.
 ### 4. Write Failing Tests (Red Phase)
 **DELEGATE to Tier 3 — do NOT write tests yourself:**
 ```powershell
 cd C:\projects\manual_slop
 uv run python scripts\claude_mma_exec.py --role tier3-worker "Use exactly 1-space indentation for Python code. Write failing tests for: {TASK}. WHERE: {test file}. WHAT: {assertions}. HOW: {fixtures/patterns}. @C:\projects\rook\src\rook\{module}.py"
 ```
 Run the tests. **Confirm they FAIL.** Do not proceed until red is confirmed.
 ```powershell
 cd C:\projects\rook; uv run pytest {test_file} -v
 ```
 ### 5. Implement to Pass (Green Phase)
 Pre-delegation checkpoint: `git add` again.
 **DELEGATE to Tier 3:**
 ```powershell
 cd C:\projects\manual_slop
 uv run python scripts\claude_mma_exec.py --role tier3-worker "Use exactly 1-space indentation for Python code. Implement minimum code to pass tests in {test_file}. WHERE: {file}:{lines}. WHAT: {change}. HOW: {APIs/patterns}. @C:\projects\rook\src\rook\{module}.py @C:\projects\rook\tests\{test_file}.py"
 ```
 Run tests. **Confirm they PASS.**
 ```powershell
 cd C:\projects\rook; uv run pytest {test_file} -v
 ```
 ### 6. Coverage Check
 ```powershell
 cd C:\projects\rook; uv run pytest --cov=src/rook --cov-report=term-missing {test_file}
 ```
 Target: at least 80% coverage for new code, matching the quality gate in `CLAUDE.md`.
 ### 7. Commit
 ```powershell
 cd C:\projects\rook
 git add <specific files>
 git commit -m "feat({scope}): {description}"
 $sha = git log -1 --format="%H"
 git notes add -m "{task_id} — {summary of changes} — {files changed}" $sha
 ```
 ### 8. Mark Done
 ```powershell
 cd C:\projects\rook
 bd update <id> --status done
 ```
 ### 9. Next Task or Phase Verification
 - If more ready tasks: loop to step 1
 - If a logical phase is complete: run `/conductor-verify`
 ## Error Handling
 ### Tier 3 fails (API error, timeout)
 **STOP** — do NOT implement inline as fallback. Ask the user:
 > "Tier 3 Worker is unavailable ({reason}). Should I retry or wait?"
 Never silently absorb Tier 3 work into Tier 2 context.
 ### Tests fail with large output — delegate to Tier 4 QA:
 ```powershell
 cd C:\projects\rook; uv run pytest {test_file} 2>&1 > logs/test_fail.log
 cd C:\projects\manual_slop
 uv run python scripts\claude_mma_exec.py --role tier4-qa "Analyze test failures. @C:\projects\rook\logs\test_fail.log"
 ```
 Maximum 2 fix attempts. If still failing: **STOP** and ask the user.
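The attempt cap above amounts to a bounded loop. The sketch below is illustrative only: `run_tests` and `apply_fix` are hypothetical callables standing in for the pytest run and the delegated fix.

```python
from typing import Callable

MAX_FIX_ATTEMPTS: int = 2  # the cap stated in the rule above

def fix_until_green(run_tests: Callable[[], bool], apply_fix: Callable[[int], None]) -> bool:
 """Return True once tests pass; return False after the cap so the caller can stop and ask the user."""
 if run_tests():
  return True
 for attempt in range(1, MAX_FIX_ATTEMPTS + 1):
  apply_fix(attempt)  # hypothetical: delegate one fix to a worker
  if run_tests():
   return True
 return False
```

A `False` result is the signal to stop and hand control back to the user rather than try a third fix.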
 ## Important
 - You are Tier 2 — delegate heavy implementation to Tier 3
 - Maintain persistent context across the entire task
 - Research-First Protocol before reading large files
 - `run_powershell` MCP tool for all shell commands — never Bash
@@ -0,0 +1,67 @@
 ---
 description: Plan new work — audit codebase, create beads tasks with proper dependencies
 ---
 # /conductor-new-track
 Plan and create new beads tasks. This is a Tier 1 (Orchestrator) operation.
 The quality of the task descriptions directly determines whether Tier 3 workers
 can execute without confusion. Vague tasks produce vague implementations.
 ## Prerequisites
 - Read `CLAUDE.md` for product alignment, policy rules, and architecture
 ## Steps
 ### 1. Gather Information
 Ask the user for:
 - **Goal**: what capability to add or fix
 - **Scope**: which modules are involved
 ### 2. MANDATORY: Deep Codebase Audit
 Before writing a single task, audit the actual codebase:
 1. `get_tree` — map the current `src/rook/` structure
 2. `py_get_code_outline` on every file the new work will touch
 3. `py_get_definition` on relevant existing functions
 4. `Grep` to find existing partial implementations
 5. `get_git_diff` to understand recent changes
 **Output**: A "Current State Audit" listing:
 - What already exists (file:line references)
 - What's missing (the actual gaps)
 - What's partially implemented
 ### 3. Create Beads Tasks
 Each task must be specific enough for a Tier 3 Worker to execute without
 understanding the full architecture:
 ```powershell
 cd C:\projects\rook
 bd create --title "Verb: specific thing in specific file" --description "WHERE: file:lines. WHAT: change. HOW: API/pattern. SAFETY: thread constraints if any."
 ```
 **Rules for task descriptions:**
 1. Reference exact locations: "In `policy.py:confirm_spawn` (lines X-Y)"
 2. Specify the API: "Use `subprocess.Popen` with `stdin=PIPE, stdout=PIPE`"
 3. Name the data structures: "Append to `APPROVED_DIRS: list[str]`"
 4. Describe the change shape: "Add a new function, do not modify existing ones"
 5. State thread safety: "Must be called from asyncio loop only"
 6. For bugs: list specific root cause candidates with code-level reasoning
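A description can be checked for the WHERE/WHAT/HOW markers before it is passed to `bd create`. This is a hypothetical pre-flight check, not a beads feature; `SAFETY:` is left optional because the template marks it "if any".

```python
REQUIRED_FIELDS: tuple[str, ...] = ("WHERE:", "WHAT:", "HOW:")

def missing_fields(description: str) -> list[str]:
 """List the required markers that a task description lacks."""
 return [field for field in REQUIRED_FIELDS if field not in description]
```

An empty result means the description carries all three markers. It says nothing about whether the line references are accurate, which still needs the audit from step 2.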
 ### 4. Wire Dependencies
 ```powershell
 bd dep add <new-id> <blocking-id>   # new task depends on blocking-id
 ```
 ### 5. Confirm
 Show the user the new tasks and dependency graph before finishing.
 ## Anti-Patterns
 - Task that says "implement X" without WHERE or HOW → worker guesses wrong
 - No line references → worker wastes tokens searching
 - Tasks scoped too broadly → worker fails
 - Not checking if something already exists → duplicate work
@@ -0,0 +1,43 @@
 ---
 description: Initialize session — read CLAUDE.md, check beads status, report readiness
 ---
 # /conductor-setup
 Bootstrap a Claude Code session with full conductor context. Run this at session start.
 ## Steps
 1. **Read Core Document:**
   - `CLAUDE.md` — architecture, workflow, conventions, MMA commands
 2. **Check Beads Status:**
   ```powershell
   cd C:\projects\rook
   bd ready --json        # unblocked tasks available to work on
   bd list --json         # full task list with dependency state
   ```
   Identify any in-progress tasks (claimed but not done).
 3. **Check Recent Git Activity:**
   ```powershell
   cd C:\projects\rook
   git log --oneline -10
   ```
 4. **Report Readiness:**
   ```
   ## Session Ready
   **Next Task:** {id} — {title}
   **Blocked Tasks:** {count} tasks waiting on dependencies
   **Last Commit:** {git log -1 oneline}
   Ready to:
   - `/conductor-implement` — work on next task
   - `/conductor-status` — full task overview
   - `/conductor-new-track` — plan new work
   ```
 ## Important
 - READ-ONLY — do not modify files or claim tasks yet
@@ -0,0 +1,48 @@
 ---
 description: Show current beads task status — ready, in-progress, blocked
 ---
 # /conductor-status
 Read beads and report current project state.
 ## Steps
 1. Run:
   ```powershell
   cd C:\projects\rook
   bd list --json
   bd ready --json
   git log --oneline -5
   ```
 2. Parse and present:
 ```
 ## Rook Task Status
 ### Ready (unblocked)
 | ID | Title |
 |----|-------|
 | ...
 ### In Progress (claimed)
 | ID | Title |
 |----|-------|
 | ...
 ### Blocked
 | ID | Title | Waiting On |
 |----|-------|------------|
 | ...
 ### Done
 {count} tasks complete
 **Last Commit:** {git log -1 oneline}
 ```
 3. Flag any anomalies (task claimed but no recent commit, circular deps, etc.)
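Circular dependencies can be flagged with a depth-first search. The sketch assumes the dependency data has already been reduced to a mapping from task id to the ids it depends on; how to derive that mapping from `bd list --json` depends on its output schema, which is not shown here.

```python
def find_cycle(deps: dict[str, list[str]]) -> list[str]:
 """Return one dependency cycle as a list of task ids, or [] if the graph is acyclic."""
 state: dict[str, int] = {}  # 1 = on the current DFS path, 2 = fully explored
 path: list[str] = []
 def visit(node: str) -> list[str]:
  state[node] = 1
  path.append(node)
  for nxt in deps.get(node, []):
   if state.get(nxt) == 1:
    return path[path.index(nxt):] + [nxt]
   if nxt not in state:
    found: list[str] = visit(nxt)
    if found:
     return found
  path.pop()
  state[node] = 2
  return []
 for start in deps:
  if start not in state:
   cycle: list[str] = visit(start)
   if cycle:
    return cycle
 return []
```

The returned list repeats the first id at the end, so it can be printed directly as a chain such as `a -> b -> c -> a`.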
 ## Important
 - READ-ONLY — do not modify tasks or files
@@ -0,0 +1,65 @@
 ---
 description: Phase completion verification — tests, coverage, checkpoint commit
 ---
 # /conductor-verify
 Execute phase completion verification and checkpointing.
 Run when every task in a logical group of related tasks is done.
 ## Protocol
 ### 1. Announce
 Tell the user: "Phase complete. Running verification and checkpointing protocol."
 ### 2. Identify Phase Scope
 ```powershell
 cd C:\projects\rook
 git log --oneline -20   # find the previous checkpoint commit
 git diff --name-only {previous_checkpoint_sha} HEAD
 ```
 For each changed code file (exclude `.json`, `.md`, `.toml`, `.ini`):
 - Verify a corresponding test file exists in `tests/`
 - If missing: create one (analyze existing test style first via `py_get_code_outline` on existing tests)
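The test-file check can be sketched as a pure function over the output of `git diff --name-only`. The naming convention `tests/test_<module>.py` is an assumption made for illustration; adjust it to whatever layout the repository actually uses.

```python
from pathlib import PurePosixPath
from typing import Optional

NON_CODE_SUFFIXES: frozenset[str] = frozenset({".json", ".md", ".toml", ".ini"})

def expected_test_path(changed: str) -> Optional[str]:
 """Map a changed file to the test file expected for it, or None if no test is required."""
 path: PurePosixPath = PurePosixPath(changed)
 if path.suffix in NON_CODE_SUFFIXES or path.parts[:1] == ("tests",):
  return None
 return f"tests/test_{path.stem}.py"  # assumed naming convention

def missing_tests(changed_files: list[str], existing_files: set[str]) -> list[str]:
 """Return the expected test files that are not present in existing_files."""
 expected: list[Optional[str]] = [expected_test_path(c) for c in changed_files]
 return [e for e in expected if e is not None and e not in existing_files]
```

`PurePosixPath` is used because Git prints forward-slash paths regardless of platform.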
 ### 3. Run Automated Tests
 **Announce the exact command before running:**
 > "Running test suite. Command: `uv run pytest --cov=src/rook --cov-report=term-missing -x`"
 ```powershell
 cd C:\projects\rook
 uv run pytest --cov=src/rook --cov-report=term-missing -x 2>&1 | Tee-Object logs/phase_verify.log
 ```
 **If tests fail with large output — delegate to Tier 4 QA:**
 ```powershell
 cd C:\projects\manual_slop
 uv run python scripts\claude_mma_exec.py --role tier4-qa "Analyze test failures. @C:\projects\rook\logs\phase_verify.log"
 ```
 Maximum 2 fix attempts. If still failing: **STOP**, report to user, await guidance.
 ### 4. Present Results and WAIT
 Display test results, coverage, any failures.
 **PAUSE HERE.** Do NOT proceed without explicit user confirmation.
 ### 5. Create Checkpoint Commit
 After user confirms:
 ```powershell
 cd C:\projects\rook
 git add -A
 git commit -m "conductor(checkpoint): end of phase {name}"
 $sha = git log -1 --format="%H"
 git notes add -m "Phase Verification`nTests: {pass/fail count}`nCoverage: {pct}%`nConfirmed by: user" $sha
 ```
 ### 6. Announce Completion
 Tell the user the phase is complete with a summary.
 ## Context Reset
 After checkpointing, treat the checkpoint commit as ground truth.
 Prior conversational context about implementation details can be dropped.
 The checkpoint commit and git notes preserve the full audit trail.
@@ -0,0 +1,56 @@
 ---
 description: Tier 1 Orchestrator — product alignment, high-level planning, new task creation
 ---
 STRICT SYSTEM DIRECTIVE: You are a Tier 1 Orchestrator. Focused on product alignment, high-level planning, and task initialization. ONLY output the requested text. No pleasantries.
 # MMA Tier 1: Orchestrator
 ## Primary Context Documents
 Read at session start: `CLAUDE.md`
 ## Responsibilities
 - Maintain alignment with product vision and policy rules
 - Create new beads tasks with proper dependency wiring
 - Audit the codebase before specifying new work
 - Delegate track execution to the Tier 2 Tech Lead
 ## The Surgical Methodology
 ### 1. Audit Before Specifying
 NEVER write a task without first reading the actual code. Use `py_get_code_outline`,
 `py_get_definition`, `Grep`, and `get_git_diff` to build a map of what exists.
 Document existing implementations with file:line references. This prevents tasks
 that ask to re-implement existing features.
 ### 2. Identify Gaps, Not Features
 Frame requirements as: "The existing `policy.py:confirm_spawn` (lines X-Y) has allowlist
 checking but no backup-before-edit. Add backup_before_edit() helper."
 Not: "Build a policy engine."
 ### 3. Write Worker-Ready Tasks
 Each beads task must be executable by a Tier 3 Worker without understanding the full
 architecture. Every task must specify:
 - **WHERE**: Exact file and line range to modify
 - **WHAT**: The specific change
 - **HOW**: Which APIs, data structures, or patterns to use
 - **SAFETY**: Thread-safety constraints if applicable
 ### 4. Map Dependencies
 Explicitly state which beads tasks must complete before this one, and which this one
 blocks. Wire with `bd dep add`.
 ## Beads Commands
 ```powershell
 cd C:\projects\rook
 bd ready --json                                      # unblocked tasks
 bd create --title "..." --description "..."          # new task
 bd dep add <id> <depends-on-id>                      # link dependency
 bd list --json                                       # all tasks
 ```
 ## Limitations
 - Read-only tools only: `py_get_code_outline`, `py_get_skeleton`, `py_get_definition`, `get_git_diff`, `get_tree`, Grep, Glob, `read_file`
 - Do NOT implement features or write code
 - Do NOT execute tracks
 - To delegate implementation: instruct human to run `/conductor-implement`
@@ -0,0 +1,63 @@
 ---
 description: Tier 2 Tech Lead — task execution, architectural oversight, delegation to Tier 3/4
 ---
 STRICT SYSTEM DIRECTIVE: You are a Tier 2 Tech Lead. Focused on architectural design and task execution. ONLY output the requested text. No pleasantries.
 # MMA Tier 2: Tech Lead
 ## Primary Context Documents
 Read at session start: `CLAUDE.md`
 ## Responsibilities
 - Manage execution of beads tasks (`/conductor-implement`)
 - Ensure alignment with CLAUDE.md architecture and policy rules
 - Break down tasks into specific technical steps for Tier 3 Workers
 - Maintain PERSISTENT context throughout a task's implementation (NO Context Amnesia)
 - Review implementations and coordinate bug fixes via Tier 4 QA
 ## Delegation Commands (run from manual_slop — use absolute @paths for rook files)
 ```powershell
 cd C:\projects\manual_slop
 # Spawn Tier 3 Worker for implementation/tests
 uv run python scripts\claude_mma_exec.py --role tier3-worker "PROMPT @C:\projects\rook\src\rook\module.py"
 # Spawn Tier 4 QA Agent for error analysis
 uv run python scripts\claude_mma_exec.py --role tier4-qa "PROMPT @C:\projects\rook\logs\error.log"
 # On repeated failure, escalate model
 uv run python scripts\claude_mma_exec.py --role tier3-worker --failure-count 1 "PROMPT"
 ```
 ### @file Syntax
 `@C:\projects\rook\src\rook\module.py` in a prompt is auto-inlined into the worker
 context by `claude_mma_exec.py`. Use this so Tier 3 has what it needs WITHOUT Tier 2
 reading those files first.
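To make the idea concrete, here is one way `@path` expansion could work. This is a hypothetical sketch, not the actual logic of `claude_mma_exec.py`, which is not shown in this commit; it assumes paths contain no spaces and appends file contents after the prompt.

```python
import re
from pathlib import Path

AT_PATH: re.Pattern[str] = re.compile(r"@(\S+)")

def inline_files(prompt: str) -> str:
 """Append the contents of every @path token that names an existing file."""
 sections: list[str] = []
 for token in AT_PATH.findall(prompt):
  path: Path = Path(token)
  if path.is_file():
   sections.append(f"--- {token} ---\n{path.read_text(encoding='utf-8')}")
 return "\n\n".join([prompt, *sections])
```

Tokens that do not resolve to a file are left in the prompt untouched, so a typo in a path shows up as missing context rather than as an exception.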
 ## Tool Use Hierarchy (MANDATORY — enforced order)
 **For any Python file investigation:**
 1. `py_get_code_outline` — structure map with line ranges. Use FIRST.
 2. `py_get_skeleton` — signatures + docstrings, no bodies
 3. `get_file_summary` — high-level prose summary
 4. `py_get_definition` / `py_get_signature` — targeted symbol lookup
 5. `Grep` / `Glob` — cross-file symbol search
 6. `read_file` (targeted, with offset+limit) — ONLY after outline identifies specific ranges
 **Shell execution:** Use `run_powershell` MCP tool. Never Bash (mingw sandbox = empty output).
 ## Hard Rules (Non-Negotiable)
 - **NEVER** call `read_file` on a file >50 lines without `py_get_code_outline` first
 - **NEVER** write implementation code, test code, or refactors inline
 - **NEVER** process large raw stderr inline — write to file or delegate to Tier 4 QA
 - **ALWAYS** use `@file` injection in Tier 3 prompts rather than reading files yourself
 - **ALWAYS** include "Use exactly 1-space indentation for Python code" in every Tier 3 prompt
 - **ALWAYS** stage code with `git add` before spawning a Tier 3 Worker (pre-delegation checkpoint)
 ## Limitations
 - Do NOT perform heavy implementation directly — delegate to Tier 3
 - Do NOT write test or implementation code
 - For large error logs: spawn Tier 4 QA rather than reading raw stderr
@@ -0,0 +1,30 @@
 ---
 description: Tier 3 Worker — stateless TDD implementation, surgical code changes
 ---
 STRICT SYSTEM DIRECTIVE: You are a stateless Tier 3 Worker (Contributor). Your goal is to implement specific code changes or tests based on the provided task. You have access to tools for reading and writing files (Read, Write, Edit), codebase investigation (Glob, Grep), version control (Bash git commands), and web tools (WebFetch, WebSearch). You CAN execute PowerShell scripts via Bash for verification and testing. Follow TDD and return success status or code changes. No pleasantries, no conversational filler.
 # MMA Tier 3: Worker
 ## Context Model: Context Amnesia
 Treat each invocation as starting from zero. Use ONLY what is provided in this prompt
 plus files you explicitly read during this session. Do not reference prior conversation history.
 ## Code Standards (MANDATORY)
 - **1-space indentation** — always, no exceptions
 - **0 blank lines** within function bodies
 - **1 blank line max** between top-level definitions
 - **Type hints** on all parameters, return types, and module-level globals
 - No inline secrets — env vars only
 ## Responsibilities
 - Implement code strictly according to the provided prompt and specifications
 - Write failing tests FIRST (Red phase), then implement to pass them (Green phase)
 - Ensure all changes are minimal, surgical, and conform to the requested standards
 - Utilize tool access (Read, Write, Edit, Glob, Grep, Bash) to implement and verify
 ## Limitations
 - No architectural decisions — if ambiguous, pick the minimal correct approach and note the assumption
 - No modifications to unrelated files beyond the immediate task scope
 - Stateless — always assume fresh context per invocation
 - Rely on dependency skeletons provided in the prompt for understanding module interfaces
@@ -0,0 +1,31 @@
 ---
 description: Tier 4 QA Agent — stateless error analysis, log summarization, no fixes
 ---
 STRICT SYSTEM DIRECTIVE: You are a stateless Tier 4 QA Agent. Your goal is to analyze errors, summarize logs, or verify tests. Read-only access only. Do NOT implement fixes. Do NOT modify any files. ONLY output the requested analysis. No pleasantries.
 # MMA Tier 4: QA Agent
 ## Context Model: Context Amnesia
 Stateless — treat each invocation as a fresh context. Use only what is provided in
 this prompt and files you explicitly read.
 ## Responsibilities
 - Compress large stack traces or log files into concise, actionable summaries
 - Identify the root cause of test failures or runtime errors
 - Provide a brief, technical description of the required fix (description only — NOT the implementation)
 - Utilize diagnostic tools (Read, Glob, Grep, Bash read-only) to verify failures
 ## Output Format
 ```
 ROOT CAUSE: [one sentence]
 AFFECTED FILE: [path:line if identifiable]
 RECOMMENDED FIX: [one sentence description for Tier 2 to action]
 ```
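On the Tier 2 side, a report in this format can be parsed with a few lines of code. The sketch below is illustrative; `QaReport` and `parse_qa_report` are hypothetical names, and labels that are absent simply yield empty strings.

```python
from dataclasses import dataclass

@dataclass
class QaReport:
 root_cause: str
 affected_file: str
 recommended_fix: str

def parse_qa_report(text: str) -> QaReport:
 """Parse the three labelled lines of a Tier 4 report; a missing label yields an empty field."""
 fields: dict[str, str] = {}
 for line in text.splitlines():
  label, sep, value = line.partition(":")  # split on the first colon only, so path:line survives
  if sep:
   fields[label.strip().upper()] = value.strip()
 return QaReport(
  root_cause=fields.get("ROOT CAUSE", ""),
  affected_file=fields.get("AFFECTED FILE", ""),
  recommended_fix=fields.get("RECOMMENDED FIX", ""),
 )
```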
 ## Limitations
 - Do NOT implement the fix
 - Do NOT write or modify any files
 - Output must be extremely brief and focused
 - Always operate statelessly — assume fresh context each invocation
@@ -0,0 +1,3 @@
 {
  "outputStyle": "default"
 }
@@ -0,0 +1,27 @@
 {
  "permissions": {
    "allow": [
      "mcp__manual-slop-tools__run_powershell",
      "mcp__manual-slop-tools__py_get_skeleton",
      "mcp__manual-slop-tools__py_get_code_outline",
      "mcp__manual-slop-tools__py_get_definition",
      "mcp__manual-slop-tools__read_file",
      "mcp__manual-slop-tools__list_directory",
      "mcp__manual-slop-tools__get_file_summary",
      "mcp__manual-slop-tools__py_get_signature",
      "mcp__manual-slop-tools__py_get_var_declaration",
      "mcp__manual-slop-tools__py_get_imports",
      "mcp__manual-slop-tools__get_file_slice",
      "mcp__manual-slop-tools__set_file_slice",
      "mcp__manual-slop-tools__py_set_signature",
      "mcp__manual-slop-tools__py_set_var_declaration",
      "mcp__manual-slop-tools__py_check_syntax",
      "mcp__manual-slop-tools__get_git_diff",
      "mcp__manual-slop-tools__get_tree"
    ]
  },
  "enableAllProjectMcpServers": true,
  "enabledMcpjsonServers": [
    "manual-slop-tools"
  ]
 }
@@ -0,0 +1,16 @@
 {
  "mcpServers": {
    "manual-slop-tools": {
      "type": "stdio",
      "command": "C:\\Users\\Ed\\scoop\\apps\\uv\\current\\uv.exe",
      "args": [
        "--directory",
        "C:\\projects\\manual_slop",
        "run",
        "python",
        "C:\\projects\\manual_slop\\scripts\\mcp_server.py"
      ],
      "env": {}
    }
  }
 }
@@ -0,0 +1,209 @@
 # CLAUDE.md
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 ## Project: Rook (Jarvis)
 A voice-first local AI agent for hands-free PC control, coding, shell execution, Git, browser automation, and CoSy/Forth-APL integration. The robot personality is modeled after **Rook** from Marathon — a military AI speaking through damaged speakers with a scratchy, static-crackle voice.
 Policy-enforced throughout: allowlists per operation, human-in-loop confirmation gates before risky actions, backup-before-edit, no destructive commands. Built as a slow-build system — start simple, layer autonomy over time.
 ## Environment
 - **Shell**: The `Bash` tool runs in a mingw sandbox on Windows and produces unreliable/empty output. Use the `run_powershell` MCP tool for ALL shell commands (git, tests, builds). Bash is last-resort only when the MCP server is not running.
 - **Python execution**: Always use `uv run python` — never bare `python`
 - **Path separators**: Forward slashes work in PowerShell
 - **bd commands**: Always use `--json` flag with `bd` when running via `run_powershell` — ANSI output crashes the parser
 ## Development Tooling
 This project is **planned and built using** tools from `C:\projects\manual_slop`:
 - **Beads** (`bd` CLI) — git-backed graph issue tracker. Source of truth for what to build. Replaces `plan.md`.
 - **manual_slop MCP** (`manual-slop-tools` server, registered in `.mcp.json`) — 26+ tools: `py_get_skeleton`, `py_get_code_outline`, `get_git_diff`, `run_powershell`, `read_file`, etc. Use these instead of raw file reads.
 - **manual_slop workers** — invoke for ALL non-trivial coding tasks:
  ```powershell
  # Run from manual_slop so conductor docs are injected; use absolute @paths for rook files
  cd C:\projects\manual_slop
  uv run python scripts/claude_mma_exec.py --role tier3-worker "PROMPT @C:\projects\rook\src\rook\module.py"
  uv run python scripts/claude_mma_exec.py --role tier4-qa "PROMPT @C:\projects\rook\logs\error.log"
  # For complex tasks, use a task file:
  uv run python scripts/claude_mma_exec.py --task-file C:\projects\rook\task.toml
  ```
  Worker logs land in `C:\projects\manual_slop\logs\claude_agents\`.
 ### MMA Tier Commands
 All commands run from `C:\projects\manual_slop`. Use absolute `@` paths for rook source files.
 ```powershell
 # Tier 1 — Strategic/Orchestration (claude-opus-4-6)
 cd C:\projects\manual_slop
 uv run python scripts\claude_mma_exec.py --role tier1-orchestrator "PROMPT"
 # Tier 2 — Tech Lead / Architecture (claude-sonnet-4-6)
 uv run python scripts\claude_mma_exec.py --role tier2-tech-lead "PROMPT"
 # Tier 3 — Worker: code implementation, test writing (claude-sonnet-4-6)
 uv run python scripts\claude_mma_exec.py --role tier3-worker "PROMPT @C:\projects\rook\src\rook\module.py"
 # Tier 4 — QA: error analysis, log summarization (claude-haiku-4-5)
 uv run python scripts\claude_mma_exec.py --role tier4-qa "PROMPT @C:\projects\rook\logs\error.log"
 # Complex tasks via TOML task file
 uv run python scripts\claude_mma_exec.py --task-file C:\projects\rook\task.toml
 # On repeated failure, pass --failure-count to escalate model
 uv run python scripts\claude_mma_exec.py --role tier3-worker --failure-count 1 "PROMPT"
 ```
 Worker logs: `C:\projects\manual_slop\logs\claude_agents\`
 Delegation log: `C:\projects\manual_slop\logs\claude_mma_delegation.log`
 ### Beads Task Workflow
 Beads replaces `plan.md`. Same discipline, different storage:
 ```powershell
 cd C:\projects\rook
 bd ready --json                                        # find unblocked tasks (start here)
 bd update <id> --claim                                 # mark in-progress before starting
 bd update <id> --status done                           # mark complete after commit + git note
 bd create --title "..." --description "..."            # create a new task
 bd dep add <id> <depends-on-id>                        # link dependency
 bd list --json                                         # list all tasks
 ```
 ## Conductor Workflow (MANDATORY)
 All work follows this strict lifecycle. **MMA tiered delegation is mandatory — the Conductor (this agent) does NOT write implementation code or tests directly.**
 ### Per-Task Lifecycle
 1. **Select task**: `bd ready --json` → pick lowest-dependency task
 2. **Claim**: `bd update <id> --claim` — do this BEFORE any work
 3. **Research phase** (mandatory before any file read >50 lines):
   - `py_get_code_outline` or `py_get_skeleton` to map architecture
   - `get_git_diff` to understand recent changes
   - `list_directory` / `py_get_imports` to map dependencies
   - `get_file_summary` to decide if full read is needed
   - Only use `read_file` with line ranges once target areas are identified
 4. **Red phase — write failing tests**:
   - **Pre-delegation checkpoint**: `git add` staged progress before spawning worker
   - Delegate to tier3-worker with a **surgical prompt**: WHERE (file:line), WHAT (tests to create), HOW (assertions/fixtures), SAFETY (thread constraints)
   - Apply the worker's output
   - **Run tests and confirm they FAIL** — do not proceed until red is confirmed
 5. **Green phase — implement to pass tests**:
   - **Pre-delegation checkpoint**: stage current progress
   - Delegate to tier3-worker with surgical prompt: WHERE (file:line range), WHAT (change), HOW (APIs/patterns to use), SAFETY constraints
   - Apply the worker's output
   - **Run tests and confirm they PASS**
 6. **Coverage check**: target ≥80% for new code
   ```powershell
   cd C:\projects\rook; uv run pytest --cov=src/rook --cov-report=term-missing
   ```
 7. **Commit**:
   ```powershell
   cd C:\projects\rook
   git add <specific files>
   git commit -m "feat(scope): description"
   git log -1 --format="%H"  # get sha
   git notes add -m "<task id> — <summary of changes and why>" <sha>
   ```
 8. **Mark done**: `bd update <id> --status done`
 ### Worker Prompt Rules
 - ALWAYS include `"Use exactly 1-space indentation for Python code"` in every worker prompt
 - Prompts must specify WHERE/WHAT/HOW/SAFETY — no vague requests
 - If a worker fails, retry with `--failure-count 1` (switches to a more capable model)
 - For error analysis, spawn tier4-qa rather than reading raw stderr yourself
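The rules above can be folded into a small command builder. This is a sketch under stated assumptions: `build_worker_argv` is a hypothetical helper, and only the `--role` and `--failure-count` flags shown elsewhere in this file are used.

```python
INDENT_RULE: str = "Use exactly 1-space indentation for Python code."
SCRIPT: str = r"scripts\claude_mma_exec.py"

def build_worker_argv(where: str, what: str, how: str, files: list[str], safety: str = "", failure_count: int = 0) -> list[str]:
 """Assemble the argv for one tier3-worker delegation."""
 parts: list[str] = [INDENT_RULE, f"WHERE: {where}.", f"WHAT: {what}.", f"HOW: {how}."]
 if safety:
  parts.append(f"SAFETY: {safety}.")
 parts.extend(f"@{path}" for path in files)  # absolute @paths for rook files
 argv: list[str] = ["uv", "run", "python", SCRIPT, "--role", "tier3-worker"]
 if failure_count > 0:
  argv += ["--failure-count", str(failure_count)]
 argv.append(" ".join(parts))
 return argv
```

The resulting list is meant to be run with `C:\projects\manual_slop` as the working directory. Building the prompt in one place makes it hard to forget the indentation sentence.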
 ### Quality Gates (before marking any task done)
 - [ ] All tests pass
 - [ ] Coverage ≥80% for new code
 - [ ] 1-space indentation throughout
 - [ ] Type hints on all parameters, return types, module-level globals
 - [ ] No hardcoded secrets (env vars only)
 - [ ] Git note attached to commit
 ### Phase Checkpoints
 After completing a group of related tasks (a "phase"):
 - Create a checkpoint commit: `git commit -m "conductor(checkpoint): end of phase <name>"`
 - Treat this as a **context wipe** — consolidate into git notes and move forward from the checkpoint as ground truth
 ## Tech Stack
 - **Language**: Python 3.11+
 - **Package manager**: `uv`
 - **AI**: Anthropic Claude — `claude-haiku-4-5-20251001` primary, `claude-sonnet-4-6` fallback for complex reasoning
 - **Voice In**: Telegram audio notes → STT (OpenClaw-style messaging)
 - **Voice Out**: ElevenLabs TTS (Rook voice ID — scratchy/rough robot) with Google TTS as fallback
 - **GUI** (ModernCoSy component): Dear PyGui — dockable panels, dark theme
 - **CoSy**: subprocess stdin/stdout pipe to `CoSy.bat` / `reva.exe`
 - **Testing**: pytest + pytest-cov
 ## Architecture
 ### Policy Enforcement
 Every capability that touches the filesystem, shell, or Git must:
 1. Check the operation against an allowlist for that capability
 2. Prompt for confirmation before: file delete, git push, package install, any subprocess outside approved dirs
 3. Backup-before-edit for file modifications
 Approved working dirs: `~/dev`, `~/logs`, `~/ModernCoSy`.
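A directory allowlist check along these lines might look as follows. It is a minimal sketch, not the project's policy module: `is_approved` is a hypothetical helper, and it covers only the "inside an approved directory" test, not confirmation prompts or backups.

```python
from pathlib import Path

APPROVED_DIRS: tuple[Path, ...] = (Path.home() / "dev", Path.home() / "logs", Path.home() / "ModernCoSy")

def is_approved(target: str, approved: tuple[Path, ...] = APPROVED_DIRS) -> bool:
 """Return True if target resolves to a location inside one of the approved directories."""
 resolved: Path = Path(target).expanduser().resolve()
 return any(resolved.is_relative_to(root.resolve()) for root in approved)
```

Resolving before comparing means a path such as `~/dev/../secrets.txt` is judged by where it actually points, not by its prefix.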
 ### Threading
 - Asyncio event loop (the agent's main loop): agent turns, capability execution
 - Daemon thread: Telegram message polling
 - Cross-thread: `asyncio.run_coroutine_threadsafe()` for Telegram → agent queue
 - GUI (Dear PyGui) runs on the main thread, so the asyncio loop runs on a background daemon thread
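The cross-thread handoff can be sketched with the standard library alone. The function names are hypothetical, and the queue stands in for the agent queue mentioned above; the Telegram polling itself is omitted.

```python
import asyncio
import threading

def start_background_loop() -> asyncio.AbstractEventLoop:
 """Start an asyncio loop on a daemon thread and return it."""
 loop: asyncio.AbstractEventLoop = asyncio.new_event_loop()
 thread: threading.Thread = threading.Thread(target=loop.run_forever, daemon=True)
 thread.start()
 return loop

async def handle_message(queue: asyncio.Queue, text: str) -> int:
 """Runs on the loop thread: enqueue one incoming message and return the queue size."""
 await queue.put(text)
 return queue.qsize()

def submit_from_thread(loop: asyncio.AbstractEventLoop, queue: asyncio.Queue, text: str) -> int:
 """Called from a non-loop thread (for example the polling thread); blocks until the coroutine finishes."""
 future = asyncio.run_coroutine_threadsafe(handle_message(queue, text), loop)
 return future.result(timeout=5)
```

`run_coroutine_threadsafe` returns a `concurrent.futures.Future`, so the calling thread can wait on the result with a timeout without touching the loop directly.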
 ### CoSy Integration
 Launch CoSy as a persistent subprocess:
 ```python
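 import subprocess
 # Start CoSy once and keep its pipes open for the whole session.
 proc = subprocess.Popen(
  ["cmd", "/c", "CoSy.bat"],
  stdin=subprocess.PIPE,
  stdout=subprocess.PIPE,
  stderr=subprocess.STDOUT,  # merge stderr into stdout so a single reader sees both
  text=True,  # exchange str instead of bytes
 )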
 ```
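 Send expressions via `proc.stdin.write(expr + "\n")` followed by `proc.stdin.flush()` (the text-mode pipe is buffered, so an unflushed write may never reach CoSy), then read the reply from `proc.stdout`. Policies: backup before redefining words, only load from personal vocab dir, no infinite loops.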
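The write, flush, read cycle can be exercised without CoSy by substituting a stand-in child process. In the sketch below the child is a tiny Python script that upper-cases each line; it is purely illustrative. The sketch also assumes one reply line per expression, which may not hold for the real `CoSy.bat`.

```python
import subprocess
import sys

# Stand-in child for illustration only: echoes each input line back in upper case.
ECHO_CHILD: list[str] = [sys.executable, "-u", "-c", "import sys\nfor line in sys.stdin:\n print(line.strip().upper(), flush=True)"]

def start_child(argv: list[str]) -> subprocess.Popen:
 """Launch a long-lived child with text-mode pipes."""
 return subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)

def send_line(proc: subprocess.Popen, expr: str) -> str:
 """Write one expression, flush so the child sees it, and read one reply line."""
 assert proc.stdin is not None and proc.stdout is not None
 proc.stdin.write(expr + "\n")
 proc.stdin.flush()
 return proc.stdout.readline().rstrip("\n")
```

If the child can emit several lines, or none, per expression, a blocking `readline()` is not enough and the reader needs a sentinel or a timeout.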
 ## Code Conventions
 Source: `C:\projects\manual_slop\conductor\product-guidelines.md`
 - **Indentation**: exactly **1 space** per level (AI token-optimized — mandatory)
 - **Blank lines**: max 1 between top-level defs, 0 within functions
 - **Type hints**: required on all parameters, return types, and module-level globals
 - **Logging**: aggressive — all agent actions, API payloads, tool calls, and subprocess calls logged with timestamps
 - **Dependency minimalism**: prefer stdlib over heavy third-party packages where feasible
 - **No `rm -rf`**: prohibited in code and subprocess calls
 - **No inline secrets**: env vars only (`ANTHROPIC_API_KEY`, `ELEVENLABS_API_KEY`, `TELEGRAM_BOT_TOKEN`, `GOOGLE_TTS_KEY`)
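The env-vars-only rule can be enforced at startup with a small loader. This is a sketch: `load_secrets` is a hypothetical helper, and treating all four variables as required at once is an assumption, since some of them back fallback services.

```python
import os
from typing import Mapping

REQUIRED_SECRETS: tuple[str, ...] = ("ANTHROPIC_API_KEY", "ELEVENLABS_API_KEY", "TELEGRAM_BOT_TOKEN", "GOOGLE_TTS_KEY")

def load_secrets(env: Mapping[str, str] = os.environ) -> dict[str, str]:
 """Return the required secrets from env, raising if any are unset or empty."""
 missing: list[str] = [name for name in REQUIRED_SECRETS if not env.get(name)]
 if missing:
  raise RuntimeError("missing environment variables: " + ", ".join(missing))
 return {name: env[name] for name in REQUIRED_SECRETS}
```

The error message names the missing variables but never prints a value, so it is safe to log.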
 ## Commit Guidelines
 ```
 <type>(<scope>): <description>
 ```
 Types: `feat`, `fix`, `refactor`, `test`, `docs`, `chore`
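The subject format can be checked mechanically, for example in a commit hook. The regular expression below is one reasonable reading of `<type>(<scope>): <description>`; the helper name is hypothetical.

```python
import re

COMMIT_TYPES: tuple[str, ...] = ("feat", "fix", "refactor", "test", "docs", "chore")
COMMIT_RE: re.Pattern[str] = re.compile(r"^(?P<type>[a-z]+)\((?P<scope>[^()\s]+)\): (?P<description>\S.*)$")

def is_valid_commit_subject(subject: str) -> bool:
 """Check a subject line against <type>(<scope>): <description> with an allowed type."""
 match = COMMIT_RE.match(subject)
 return match is not None and match.group("type") in COMMIT_TYPES
```

The `conductor(checkpoint)` subject used for phase checkpoints falls outside this type list, so the sketch rejects it; add `conductor` to `COMMIT_TYPES` if checkpoint commits should pass the same check.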
 After each commit, attach a git note:
 ```powershell
 git notes add -m "<task id> — <files changed> — <why>" <commit_sha>
 ```