checkpoint: Claude Code integration + implement missing MCP var tools

Add Claude Code conductor commands, MCP server, MMA exec scripts, and implement py_get_var_declaration / py_set_var_declaration, which were registered in dispatch and tool specs but had no function bodies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 10:47:42 -05:00
parent d36632c21a
commit a2a1447f58
22 changed files with 1845 additions and 0 deletions
@@ -0,0 +1,97 @@
+---
+description: Execute a conductor track — follow TDD workflow, delegate to Tier 3/4 workers
+---
+
+# /conductor-implement
+
+Execute a track's implementation plan. This is a Tier 2 (Tech Lead) operation.
+You maintain PERSISTENT context throughout the track — do NOT lose state.
+
+## Startup
+
+1. Read `conductor/workflow.md` for the full task lifecycle protocol
+2. Read `conductor/tech-stack.md` for technology constraints
+3. Read the target track's `spec.md` and `plan.md`
+4. Identify the current task: first `[ ]` or `[~]` in `plan.md`
+
+If no track name is provided, run `/conductor-status` first and ask which track to implement.
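Review note: startup step 4 (first `[ ]` or `[~]` in `plan.md`) can be sketched in Python as below. This is only an illustration; the helper name `find_current_task` and the assumption that tasks are written as `- [ ] ...` list items (as in the `/conductor-new-track` template) are mine, not part of the commit.

```python
import re
from typing import Optional, Tuple

# Matches "[ ] text", "[~] text" or "[x] text", with an optional list bullet.
TASK_RE = re.compile(r"^\s*(?:[-*]\s+)?\[( |~|x)\]\s+(.*)$")


def find_current_task(plan_text: str) -> Optional[Tuple[int, str, str]]:
    """Return (line_index, marker, description) of the first open task.

    A task counts as open when its marker is "[ ]" (pending) or
    "[~]" (in progress). Returns None when every task is "[x]".
    """
    for index, line in enumerate(plan_text.splitlines()):
        match = TASK_RE.match(line)
        if match and match.group(1) in (" ", "~"):
            return index, match.group(1), match.group(2).strip()
    return None
```

Scanning top to bottom means an interrupted `[~]` task is resumed before any later `[ ]` task is started.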
+
+## Task Lifecycle (per task)
+
+Follow this EXACTLY per `conductor/workflow.md`:
+
+### 1. Mark In Progress
+Edit `plan.md`: change `[ ]` → `[~]` for the current task.
+
+### 2. Research Phase (High-Signal)
+Before touching code, use context-efficient tools:
+- `py_get_code_outline` or `py_get_skeleton` (via MCP tools) to map architecture
+- `get_git_diff` to understand recent changes
+- `Grep`/`Glob` to locate symbols
+- Only `Read` full files after identifying specific target ranges
+
+### 3. Write Failing Tests (Red Phase — TDD)
+**DELEGATE to Tier 3 Worker** — do NOT write tests yourself:
+```powershell
+uv run python scripts\claude_mma_exec.py --role tier3-worker "Write failing tests for: {TASK_DESCRIPTION}. Focus files: {FILE_LIST}. Spec: {RELEVANT_SPEC_EXCERPT}"
+```
+Run the tests. Confirm they FAIL. This is the Red phase.
+
+### 4. Implement to Pass (Green Phase)
+**DELEGATE to Tier 3 Worker**:
+```powershell
+uv run python scripts\claude_mma_exec.py --role tier3-worker "Implement minimum code to pass these tests: {TEST_FILE}. Focus files: {FILE_LIST}"
+```
+Run tests. Confirm they PASS. This is the Green phase.
+
+### 5. Refactor (Optional)
+With passing tests as a safety net, refactor if needed. Rerun tests.
+
+### 6. Verify Coverage
+```powershell
+uv run pytest --cov=. --cov-report=term-missing {TEST_FILE}
+```
+Target: >80% for new code.
+
+### 7. Commit
+Stage changes. Message format:
+```
+feat({scope}): {description}
+```
+
+### 8. Attach Git Notes
+```powershell
+$sha = git log -1 --format="%H"
+git notes add -m "Task: {TASK_NAME}`nSummary: {CHANGES}`nFiles: {FILE_LIST}" $sha
+```
+
+### 9. Update plan.md
+Change `[~]` → `[x]` and append first 7 chars of commit SHA:
+```
+[x] Task description. abc1234
+```
+Commit: `conductor(plan): Mark task '{TASK_NAME}' as complete`
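Review note: a minimal sketch of the step 9 edit (flip `[~]` to `[x]` and append the first 7 characters of the commit SHA). The function name `mark_task_complete` and the substring match on the task description are illustrative assumptions; the commit itself leaves this edit to the agent.

```python
def mark_task_complete(plan_text: str, task_description: str, commit_sha: str) -> str:
    """Mark the in-progress task containing task_description as done.

    Only a line that holds both "[~]" and the description is changed;
    the 7-character short SHA is appended to that line.
    """
    short_sha = commit_sha[:7]
    lines = plan_text.splitlines()
    for index, line in enumerate(lines):
        if "[~]" in line and task_description in line:
            lines[index] = line.replace("[~]", "[x]", 1).rstrip() + " " + short_sha
            break
    else:
        raise ValueError(f"no in-progress task matching {task_description!r}")
    # Preserve the trailing newline if the input had one.
    return "\n".join(lines) + ("\n" if plan_text.endswith("\n") else "")
```

Raising when no `[~]` line matches keeps `plan.md` from drifting silently out of sync with the commit history.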
+
+### 10. Next Task or Phase Completion
+- If more tasks in current phase: loop to step 1 with next task
+- If phase complete: run `/conductor-verify`
+
+## Error Handling
+If tests fail with large output, delegate to Tier 4 QA:
+```powershell
+uv run python scripts\claude_mma_exec.py --role tier4-qa "Analyze this test failure: {ERROR_SUMMARY}. Test file: {TEST_FILE}"
+```
+Maximum 2 fix attempts. If still failing: STOP and ask the user.
+
+## Deviations from Tech Stack
+If implementation requires something not in `tech-stack.md`:
+1. **STOP** implementation
+2. Update `tech-stack.md` with justification
+3. Add a dated note recording the change
+4. Resume
+
+## Important
+- You are Tier 2 — delegate heavy implementation to Tier 3
+- Maintain persistent context across the entire track
+- Use Research-First Protocol before reading large files
+- The plan.md is the SOURCE OF TRUTH for task state
@@ -0,0 +1,100 @@
+---
+description: Initialize a new conductor track with spec, plan, and metadata
+---
+
+# /conductor-new-track
+
+Create a new track in the conductor system. This is a Tier 1 (Orchestrator) operation.
+
+## Prerequisites
+- Read `conductor/product.md` and `conductor/product-guidelines.md` for product alignment
+- Read `conductor/tech-stack.md` for technology constraints
+
+## Steps
+
+### 1. Gather Information
+Ask the user for:
+- **Track name**: descriptive, snake_case (e.g., `add_auth_system`)
+- **Track type**: `feat`, `fix`, `refactor`, `chore`
+- **Description**: one-line summary
+- **Requirements**: functional requirements for the spec
+
+### 2. Create Track Directory
+```
+conductor/tracks/{track_name}_{YYYYMMDD}/
+```
+Use today's date in YYYYMMDD format.
+
+### 3. Create metadata.json
+```json
+{
+  "track_id": "{track_name}_{YYYYMMDD}",
+  "type": "{feat|fix|refactor|chore}",
+  "status": "new",
+  "created_at": "{ISO8601}",
+  "updated_at": "{ISO8601}",
+  "description": "{description}"
+}
+```
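Review note: steps 2 and 3 could be generated along these lines. This is a sketch under assumptions: the helper name `build_track_metadata` is hypothetical, and using the local, timezone-aware clock for both the `YYYYMMDD` suffix and the ISO 8601 timestamps is my reading of "today's date", not something the commit specifies.

```python
from datetime import datetime
from typing import Dict, Optional

VALID_TYPES = {"feat", "fix", "refactor", "chore"}


def build_track_metadata(
    track_name: str,
    track_type: str,
    description: str,
    now: Optional[datetime] = None,
) -> Dict[str, str]:
    """Build the metadata.json payload for a new track."""
    if track_type not in VALID_TYPES:
        raise ValueError(f"unknown track type: {track_type!r}")
    # Assumption: local time with an explicit UTC offset.
    now = now or datetime.now().astimezone()
    timestamp = now.isoformat(timespec="seconds")
    return {
        "track_id": f"{track_name}_{now.strftime('%Y%m%d')}",
        "type": track_type,
        "status": "new",
        "created_at": timestamp,
        "updated_at": timestamp,
        "description": description,
    }
```

The `track_id` doubles as the directory name under `conductor/tracks/`, so deriving both from the same `now` value keeps them consistent.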
+
+### 4. Create index.md
+```markdown
+# Track: {Track Title}
+
+- [Specification](spec.md)
+- [Implementation Plan](plan.md)
+```
+
+### 5. Create spec.md
+```markdown
+# {Track Title} — Specification
+
+## Overview
+{Description of what this track delivers}
+
+## Functional Requirements
+1. {Requirement from user input}
+2. ...
+
+## Non-Functional Requirements
+- Performance: {if applicable}
+- Testing: >80% coverage for new code
+
+## Acceptance Criteria
+- [ ] {Criterion 1}
+- [ ] {Criterion 2}
+
+## Out of Scope
+- {Explicitly excluded items}
+
+## Context
+- Tech stack: see `conductor/tech-stack.md`
+- Product guidelines: see `conductor/product-guidelines.md`
+```
+
+### 6. Create plan.md
+```markdown
+# {Track Title} — Implementation Plan
+
+## Phase 1: {Phase Name}
+- [ ] Task: {Description}
+- [ ] Task: {Description}
+
+## Phase 2: {Phase Name}
+- [ ] Task: {Description}
+```
+
+Break requirements into phases with 2-5 tasks each. Each task should be a single atomic unit of work suitable for a Tier 3 Worker.
+
+### 7. Update Track Registry
+If `conductor/tracks.md` exists, add the new track entry.
+
+### 8. Commit
+```
+conductor(track): Initialize track '{track_name}'
+```
+
+## Important
+- Do NOT start implementing — track initialization only
+- Implementation is done via `/conductor-implement`
+- Each task should be scoped for a single Tier 3 Worker delegation
@@ -0,0 +1,46 @@
+---
+description: Initialize conductor context — read product docs, verify structure, report readiness
+---
+
+# /conductor-setup
+
+Bootstrap a Claude Code session with full conductor context. Run this at session start.
+
+## Steps
+
+1. **Read Core Documents:**
+   - `conductor/index.md` — navigation hub
+   - `conductor/product.md` — product vision
+   - `conductor/product-guidelines.md` — UX/code standards
+   - `conductor/tech-stack.md` — technology constraints
+   - `conductor/workflow.md` — task lifecycle (skim; reference during implementation)
+
+2. **Check Active Tracks:**
+   - List all directories in `conductor/tracks/`
+   - Read each `metadata.json` for status
+   - Read each `plan.md` for current task state
+   - Identify any track whose `plan.md` has `[~]` (in-progress) tasks
+
+3. **Check Session Context:**
+   - Read `TASKS.md` if it exists — check for IN_PROGRESS or BLOCKED tasks
+   - Read the last 3 entries in `JOURNAL.md` for recent activity
+   - Run `git log --oneline -10` for recent commits
+
+4. **Report Readiness:**
+   Present a session startup summary:
+   ```
+   ## Session Ready
+
+   **Active Track:** {track name} — Phase {N}, Task: {current task description}
+   **Recent Activity:** {last journal entry title}
+   **Last Commit:** {git log -1 oneline}
+
+   Ready to:
+   - `/conductor-implement` — resume active track
+   - `/conductor-status` — full status overview
+   - `/conductor-new-track` — start new work
+   ```
+
+## Important
+- This is READ-ONLY — do not modify files
+- This replaces Gemini's `activate_skill mma-orchestrator` + `/conductor:setup`
@@ -0,0 +1,32 @@
+---
+description: Show current conductor track status — active tracks, phases, pending tasks
+---
+
+# /conductor-status
+
+Read the conductor track registry and all active tracks, then report current project state.
+
+## Steps
+
+1. Read `conductor/tracks.md` for the track registry
+2. For each track directory in `conductor/tracks/`:
+   - Read `metadata.json` for status
+   - Read `plan.md` and count: total tasks, completed `[x]`, in-progress `[~]`, pending `[ ]`
+   - Identify the current phase (first phase with `[~]` or `[ ]` tasks)
+3. Read the last 3 entries of `JOURNAL.md` for recent activity context
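Review note: the per-track counting in step 2 can be illustrated with the sketch below. The name `summarize_plan` and the returned dictionary shape are hypothetical; the `## Phase N: ...` heading and `- [ ]` task formats are taken from the `plan.md` template in `/conductor-new-track`.

```python
import re
from typing import Dict, Optional, Union

PHASE_RE = re.compile(r"^##\s+(Phase\b.*)$")
TASK_RE = re.compile(r"^\s*(?:[-*]\s+)?\[( |~|x)\]")
MARKER_KEYS = {"x": "completed", "~": "in_progress", " ": "pending"}


def summarize_plan(plan_text: str) -> Dict[str, Union[int, Optional[str]]]:
    """Count tasks by state and find the first phase with open tasks."""
    summary: Dict[str, Union[int, Optional[str]]] = {
        "total": 0, "completed": 0, "in_progress": 0, "pending": 0,
        "current_phase": None,
    }
    phase: Optional[str] = None
    for line in plan_text.splitlines():
        phase_match = PHASE_RE.match(line)
        if phase_match:
            phase = phase_match.group(1).strip()
            continue
        task_match = TASK_RE.match(line)
        if not task_match:
            continue
        key = MARKER_KEYS[task_match.group(1)]
        summary[key] += 1
        summary["total"] += 1
        # The current phase is the first one holding a "[~]" or "[ ]" task.
        if key != "completed" and summary["current_phase"] is None:
            summary["current_phase"] = phase
    return summary
```

A `current_phase` of `None` with a non-zero `total` means every task is `[x]`, which is the kind of state worth comparing against `metadata.json` when flagging discrepancies.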
+
+## Output Format
+
+Present a summary table:
+
+```
+| Track | Status | Phase | Progress | Last SHA |
+|-------|--------|-------|----------|----------|
+```
+
+Then for each in-progress track, list the specific next pending task.
+
+## Important
+- This is READ-ONLY — do not modify any files
+- Report exactly what the plan.md files say
+- Flag any discrepancies (e.g., metadata says "new" but plan.md has [x] tasks)
@@ -0,0 +1,85 @@
+---
+description: Run phase completion verification — tests, coverage, checkpoint commit
+---
+
+# /conductor-verify
+
+Execute the Phase Completion Verification and Checkpointing Protocol.
+Run this when all tasks in a phase are marked `[x]`.
+
+## Protocol
+
+### 1. Announce
+Tell the user: "Phase complete. Running verification and checkpointing protocol."
+
+### 2. Verify Test Coverage for Phase
+
+Find the phase scope:
+- Read `plan.md` to find the previous phase's checkpoint SHA
+- If no previous checkpoint: scope is all changes since first commit
+- Run: `git diff --name-only {previous_checkpoint_sha} HEAD`
+- For each changed code file (exclude `.json`, `.md`, `.yaml`, `.toml`):
+  - Check if a corresponding test file exists
+  - If missing: create one (analyze existing test style first)
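Review note: the filtering in step 2 might look like the following, given the output of `git diff --name-only`. The `tests/test_<module>.py` naming convention and the helper name `files_missing_tests` are assumptions for illustration; the commit does not say how a "corresponding test file" is located.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Extensions the protocol excludes from the coverage check.
NON_CODE_SUFFIXES = {".json", ".md", ".yaml", ".toml"}


def files_missing_tests(changed_files: Iterable[str], existing_files: Iterable[str]) -> List[str]:
    """Return changed code files with no tests/test_<stem><suffix> counterpart.

    Paths use forward slashes, as printed by `git diff --name-only`.
    """
    existing = set(existing_files)
    missing = []
    for name in changed_files:
        path = PurePosixPath(name)
        if path.suffix in NON_CODE_SUFFIXES:
            continue
        if path.name.startswith("test_"):
            continue  # the changed file is itself a test
        candidate = f"tests/test_{path.stem}{path.suffix}"  # assumed convention
        if candidate not in existing:
            missing.append(name)
    return missing
```

Keeping the function free of filesystem and git calls makes it easy to exercise; the caller supplies both lists.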
+
+### 3. Run Automated Tests
+
+**ANNOUNCE the exact command before running:**
+> "I will now run the automated test suite. Command: `uv run pytest --cov=. --cov-report=term-missing -x`"
+
+Execute the command.
+
+**If tests fail with large output:**
+- Pipe output to `logs/phase_verify.log`
+- Spawn Tier 4 QA for analysis:
+```powershell
+uv run python scripts\claude_mma_exec.py --role tier4-qa "Analyze test failures from logs/phase_verify.log"
+```
+- Maximum 2 fix attempts
+- If still failing: **STOP**, report to user, await guidance
+
+### 4. API Hook Verification (if applicable)
+
+If the track involves UI changes:
+- Check if GUI test hooks are available on port 8999
+- Run relevant simulation tests from `tests/visual_sim_*.py`
+- Log results
+
+### 5. Present Results and WAIT
+
+Display:
+- Test results (pass/fail count)
+- Coverage report
+- Any verification logs
+
+**PAUSE HERE.** Do NOT proceed without explicit user confirmation.
+
+### 6. Create Checkpoint Commit
+
+After user confirms:
+```powershell
+git add -A
+git commit -m "conductor(checkpoint): Checkpoint end of Phase {N} - {Phase Name}"
+```
+
+### 7. Attach Verification Report via Git Notes
+```powershell
+$sha = git log -1 --format="%H"
+git notes add -m "Phase Verification Report`nCommand: {test_command}`nResult: {pass/fail}`nCoverage: {percentage}`nConfirmed by: user" $sha
+```
+
+### 8. Update plan.md
+
+Update the phase heading to include checkpoint SHA:
+```markdown
+## Phase N: {Name} [checkpoint: {sha_7}]
+```
+Commit: `conductor(plan): Mark phase '{Phase Name}' as complete`
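Review note: one possible way to perform the step 8 heading edit, shown only as a sketch. The function name `stamp_phase_checkpoint` is hypothetical, and replacing an existing `[checkpoint: ...]` suffix instead of appending a second one is my assumption about the desired behavior.

```python
import re


def stamp_phase_checkpoint(plan_text: str, phase_number: int, commit_sha: str) -> str:
    """Append "[checkpoint: <sha_7>]" to the "## Phase N" heading."""
    short_sha = commit_sha[:7]
    heading_re = re.compile(
        rf"^(## Phase {phase_number}\b[^\[\n]*?)\s*(\[checkpoint: [0-9a-f]+\])?\s*$"
    )
    lines = plan_text.splitlines()
    for index, line in enumerate(lines):
        match = heading_re.match(line)
        if match:
            # group(1) is the heading without any previous checkpoint tag.
            lines[index] = f"{match.group(1)} [checkpoint: {short_sha}]"
            break
    else:
        raise ValueError(f"no heading found for phase {phase_number}")
    return "\n".join(lines) + ("\n" if plan_text.endswith("\n") else "")
```

The `\b` after the phase number stops phase 1 from matching a `## Phase 10` heading.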
+
+### 9. Announce Completion
+Tell the user the phase is complete with a summary of the verification report.
+
+## Context Reset
+After phase checkpointing, treat the checkpoint as ground truth.
+Prior conversational context about implementation details can be dropped.
+The checkpoint commit and git notes preserve the audit trail.
@@ -0,0 +1,25 @@
+---
+description: Tier 1 Orchestrator — product alignment, high-level planning, track initialization
+---
+
+STRICT SYSTEM DIRECTIVE: You are a Tier 1 Orchestrator. Focused on product alignment, high-level planning, and track initialization. ONLY output the requested text. No pleasantries.
+
+# MMA Tier 1: Orchestrator
+
+## Primary Context Documents
+Read at session start: `conductor/product.md`, `conductor/product-guidelines.md`
+
+## Responsibilities
+- Maintain alignment with the product guidelines and definition
+- Define track boundaries and initialize new tracks (`/conductor-new-track`)
+- Set up the project environment (`/conductor-setup`)
+- Delegate track execution to the Tier 2 Tech Lead
+
+## Limitations
+- Read-only tools only: Read, Glob, Grep, WebFetch, WebSearch, Bash (read-only ops)
+- Do NOT execute tracks or implement features
+- Do NOT write code or edit files
+- Do NOT perform low-level bug fixing
+- Keep context strictly focused on product definitions and high-level strategy
+- To delegate track execution: instruct the human operator to run:
+  `uv run python scripts\claude_mma_exec.py --role tier2-tech-lead "[PROMPT]"`
@@ -0,0 +1,38 @@
+---
+description: Tier 2 Tech Lead — track execution, architectural oversight, delegation to Tier 3/4
+---
+
+STRICT SYSTEM DIRECTIVE: You are a Tier 2 Tech Lead. Focused on architectural design and track execution. ONLY output the requested text. No pleasantries.
+
+# MMA Tier 2: Tech Lead
+
+## Primary Context Documents
+Read at session start: `conductor/tech-stack.md`, `conductor/workflow.md`
+
+## Responsibilities
+- Manage the execution of implementation tracks (`/conductor-implement`)
+- Ensure alignment with `tech-stack.md` and project architecture
+- Break down tasks into specific technical steps for Tier 3 Workers
+- Maintain PERSISTENT context throughout a track's implementation phase (NO Context Amnesia)
+- Review implementations and coordinate bug fixes via Tier 4 QA
+
+## Delegation Commands (PowerShell)
+
+```powershell
+# Spawn Tier 3 Worker for implementation tasks
+uv run python scripts\claude_mma_exec.py --role tier3-worker "[PROMPT]"
+
+# Spawn Tier 4 QA Agent for error analysis
+uv run python scripts\claude_mma_exec.py --role tier4-qa "[PROMPT]"
+```
+
+Use `@file/path.py` syntax in prompts to inject file context for the sub-agent.
+
+## Limitations
+- Do NOT perform heavy implementation work directly — delegate to Tier 3
+- Do NOT write test or implementation code directly
+- Minimize full file reads; use Research-First Protocol before reading files >50 lines:
+  - `py_get_code_outline` / `Grep` to map architecture
+  - `git diff` to understand recent changes
+  - `Glob` / `Grep` to locate symbols
+- For large error logs, always spawn Tier 4 QA rather than reading raw stderr
@@ -0,0 +1,22 @@
+---
+description: Tier 3 Worker — stateless TDD implementation, surgical code changes
+---
+
+STRICT SYSTEM DIRECTIVE: You are a stateless Tier 3 Worker (Contributor). Your goal is to implement specific code changes or tests based on the provided task. You have access to tools for reading and writing files (Read, Write, Edit), codebase investigation (Glob, Grep), version control (Bash git commands), and web tools (WebFetch, WebSearch). You CAN execute PowerShell scripts via Bash for verification and testing. Follow TDD and return success status or code changes. No pleasantries, no conversational filler.
+
+# MMA Tier 3: Worker
+
+## Context Model: Context Amnesia
+Treat each invocation as starting from zero. Use ONLY what is provided in this prompt plus files you explicitly read during this session. Do not reference prior conversation history.
+
+## Responsibilities
+- Implement code strictly according to the provided prompt and specifications
+- Write failing tests FIRST (Red phase), then implement code to pass them (Green phase)
+- Ensure all changes are minimal, surgical, and conform to the requested standards
+- Utilize tool access (Read, Write, Edit, Glob, Grep, Bash) to implement and verify
+
+## Limitations
+- No architectural decisions — if ambiguous, pick the minimal correct approach and note the assumption
+- No modifications to unrelated files beyond the immediate task scope
+- Stateless — always assume a fresh context per invocation
+- Rely on dependency skeletons provided in the prompt for understanding module interfaces
@@ -0,0 +1,30 @@
+---
+description: Tier 4 QA Agent — stateless error analysis, log summarization, no fixes
+---
+
+STRICT SYSTEM DIRECTIVE: You are a stateless Tier 4 QA Agent. Your goal is to analyze errors, summarize logs, or verify tests. Read-only access only. Do NOT implement fixes. Do NOT modify any files. ONLY output the requested analysis. No pleasantries.
+
+# MMA Tier 4: QA Agent
+
+## Context Model: Context Amnesia
+Stateless — treat each invocation as a fresh context. Use only what is provided in this prompt and files you explicitly read.
+
+## Responsibilities
+- Compress large stack traces or log files into concise, actionable summaries
+- Identify the root cause of test failures or runtime errors
+- Provide a brief, technical description of the required fix (description only — NOT the implementation)
+- Utilize diagnostic tools (Read, Glob, Grep, Bash read-only) to verify failures
+
+## Output Format
+
+```
+ROOT CAUSE: [one sentence]
+AFFECTED FILE: [path:line if identifiable]
+RECOMMENDED FIX: [one sentence description for Tier 2 to action]
+```
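Review note: because the Tier 4 output format is three labeled lines, a Tier 2 caller could parse it mechanically. The sketch below assumes the agent emits the labels exactly as shown above; `parse_qa_report` is a hypothetical helper, not part of `claude_mma_exec.py`.

```python
from typing import Dict

FIELDS = ("ROOT CAUSE", "AFFECTED FILE", "RECOMMENDED FIX")


def parse_qa_report(text: str) -> Dict[str, str]:
    """Extract the three labeled fields from a Tier 4 QA report."""
    report: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only, so "path:line" values stay intact.
        label, separator, value = line.partition(":")
        label = label.strip()
        if separator and label in FIELDS and label not in report:
            report[label] = value.strip()
    missing = [field for field in FIELDS if field not in report]
    if missing:
        raise ValueError(f"QA report is missing fields: {missing}")
    return report
```

Rejecting reports with missing fields gives the Tech Lead a clear signal to re-run the QA agent instead of acting on a partial analysis.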
+
+## Limitations
+- Do NOT implement the fix directly
+- Do NOT write or modify any files
+- Ensure output is extremely brief and focused
+- Always operate statelessly — assume fresh context each invocation