docs(conductor): Update workflow with mma-exec and 4-tier model definitions

2026-02-25 20:23:25 -05:00
parent 6710b58d25
commit 5e256d1c12


@@ -23,12 +23,12 @@ All tasks follow a strict lifecycle:
 2. **Mark In Progress:** Before beginning work, edit `plan.md` and change the task from `[ ]` to `[~]`
 3. **Write Failing Tests (Red Phase):**
-   - **Delegate Test Creation:** Do NOT write test code directly. Spawn a Tier 3 Worker (`run_subagent.ps1 -Role Worker`) with a prompt to create the necessary test files and unit tests based on the task criteria.
+   - **Delegate Test Creation:** Do NOT write test code directly. Spawn a Tier 3 Worker (`python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`) with a prompt to create the necessary test files and unit tests based on the task criteria.
    - Take the code generated by the Worker and apply it.
    - **CRITICAL:** Run the tests and confirm that they fail as expected. This is the "Red" phase of TDD. Do not proceed until you have failing tests.
 4. **Implement to Pass Tests (Green Phase):**
-   - **Delegate Implementation:** Do NOT write the implementation code directly. Spawn a Tier 3 Worker (`run_subagent.ps1 -Role Worker`) with a highly specific prompt to write the minimum amount of application code necessary to make the failing tests pass.
+   - **Delegate Implementation:** Do NOT write the implementation code directly. Spawn a Tier 3 Worker (`python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`) with a highly specific prompt to write the minimum amount of application code necessary to make the failing tests pass.
    - Take the code generated by the Worker and apply it.
    - Run the test suite again and confirm that all tests now pass. This is the "Green" phase.
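The delegation call and the Red/Green checks in this hunk can be sketched in Python. This is only an illustration under stated assumptions: `build_mma_command`, `phase_gate`, and `KNOWN_ROLES` are hypothetical helper names, and the only command-line shape taken from the workflow text is `python scripts/mma_exec.py --role <role> "[PROMPT]"`.

```python
import sys

# Roles that appear in the workflow text. Any other role name is undocumented
# here, so this sketch rejects it rather than guessing.
KNOWN_ROLES = ("tier3-worker", "tier4-qa")


def build_mma_command(role: str, prompt: str,
                      script: str = "scripts/mma_exec.py") -> list[str]:
    """Build the argv list for one delegation call (hypothetical helper)."""
    if role not in KNOWN_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # sys.executable stands in for `python` so the same interpreter is reused.
    return [sys.executable, script, "--role", role, prompt]


def phase_gate(phase: str, returncode: int) -> bool:
    """Check a test run's exit code against the TDD phase.

    The Red phase expects failing tests (non-zero exit code);
    the Green phase expects passing tests (exit code 0).
    """
    if phase == "red":
        return returncode != 0
    if phase == "green":
        return returncode == 0
    raise ValueError(f"unknown phase: {phase!r}")
```

A caller would pass the result of `build_mma_command(...)` to `subprocess.run`, apply the Worker's output, run the tests, and only proceed when `phase_gate` returns `True` for the current phase.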
@@ -88,7 +88,7 @@ All tasks follow a strict lifecycle:
    - Before execution, you **must** announce the exact shell command you will use to run the tests.
    - **Example Announcement:** "I will now run the automated test suite to verify the phase. **Command:** `CI=true npm test`"
    - Execute the announced command.
-   - If tests fail with significant output (e.g., a large traceback), **DO NOT** attempt to read the raw `stderr` directly into your context. Instead, pipe the output to a log file and **spawn a Tier 4 QA Agent (`run_subagent.ps1 -Role QA`)** to summarize the failure.
+   - If tests fail with significant output (e.g., a large traceback), **DO NOT** attempt to read the raw `stderr` directly into your context. Instead, pipe the output to a log file and **spawn a Tier 4 QA Agent (`python scripts/mma_exec.py --role tier4-qa "[PROMPT]"`)** to summarize the failure.
    - You **must** inform the user and begin debugging using the QA Agent's summary. You may attempt to propose a fix a **maximum of two times**. If the tests still fail after your second proposed fix, you **must stop**, report the persistent failure, and ask the user for guidance.
 4. **Execute Automated API Hook Verification:**
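The "pipe the output to a log file" step and the two-fix limit described in this hunk could look roughly like the following. It is a sketch, not the project's implementation: `run_tests_to_log`, `verify_phase`, and `MAX_FIX_ATTEMPTS` are hypothetical names, and `propose_fix` stands in for whatever the agent does with the QA Agent's summary.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable

MAX_FIX_ATTEMPTS = 2  # the workflow allows at most two proposed fixes


def run_tests_to_log(command: list[str], log_path: Path) -> int:
    """Run the test command with stdout and stderr redirected to a log file.

    The raw output never enters the caller's context; only the exit code does.
    """
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("w", encoding="utf-8") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


def verify_phase(run_tests: Callable[[], int],
                 propose_fix: Callable[[int], None]) -> bool:
    """Return True once tests pass; return False after the second failed fix."""
    if run_tests() == 0:
        return True
    for attempt in range(1, MAX_FIX_ATTEMPTS + 1):
        propose_fix(attempt)       # e.g. act on the QA Agent's summary
        if run_tests() == 0:
            return True
    return False                   # stop, report, and ask the user for guidance
```

When `verify_phase` returns `False`, the workflow requires stopping and asking the user for guidance rather than attempting a third fix.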
@@ -370,15 +370,22 @@ To emulate the 4-Tier MMA Architecture within the standard Conductor extension w
 ### 1. Active Model Switching (Simulating the 4 Tiers)
 - **Activate MMA Orchestrator Skill:** To enforce the 4-Tier token firewall, the agent MUST invoke `activate_skill mma-orchestrator` at the start of any implementation phase.
-- **Tiered Delegation (The Role-Based Protocol):**
-  - **Tier 3 Worker (Implementation):** For significant code modifications (Coding > 50 lines), delegate to a stateless sub-agent:
-    `.\scripts\run_subagent.ps1 -Role Worker -Prompt "Modify [FILE] to implement [SPEC]..."`
-  - **Tier 4 QA Agent (Error Analysis):** If tests fail with large traces (Errors > 100 lines), delegate to a QA agent for compression:
-    `.\scripts\run_subagent.ps1 -Role QA -Prompt "Summarize this stack trace into a 20-word fix: [SNIPPET]"`
-- **Traceability:** Use the `-ShowContext` flag during debugging to see the role-specific system prompts and hand-offs in the terminal.
+- **The MMA Bridge (`mma_exec.py`):** All tiered delegation is routed through `python scripts/mma_exec.py`. This script acts as the primary bridge, managing model selection, context injection, and logging.
+- **Model Tiers:**
+  - **Tier 1 (Strategic/Orchestration):** `gemini-3.1-pro-preview`. Used for planning and high-level logic.
+  - **Tier 2 (Architectural/Tech Lead):** `gemini-3-flash-preview`. Used for code review and structural design.
+  - **Tier 3 (Execution/Worker):** `gemini-2.5-flash-lite`. Used for surgical code implementation and test generation.
+  - **Tier 4 (Utility/QA):** `gemini-2.5-flash-lite`. Used for log summarization and error analysis.
+- **Tiered Delegation Protocol:**
+  - **Tier 3 Worker:** `python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`
+  - **Tier 4 QA Agent:** `python scripts/mma_exec.py --role tier4-qa "[PROMPT]"`
 - **Logging:** All hierarchical interactions are automatically recorded in `logs/mma_delegation.log` for auditable verification.
-### 2. Context Checkpoints (The Token Firewall)
+### 2. Context Management and Token Firewalling
+- **Context Amnesia:** `mma_exec.py` enforces "Context Amnesia" by executing sub-agents in a stateless manner. Each call starts with a clean slate, receiving only the strictly necessary documents and prompts. This prevents conversational "hallucination bleed" and keeps token costs low.
+- **AST Skeleton Views:** For Tier 3 implementation, `mma_exec.py` automatically generates "AST Skeleton Views" of project dependencies. This provides the worker model with the interface-level structure (function signatures, docstrings) of imported modules without the full source code, maximizing the signal-to-noise ratio in the context window.
+### 3. Phase Checkpoints (The Final Defense)
 - The **Phase Completion Verification and Checkpointing Protocol** is the project's primary defense against token bloat.
 - When a Phase is marked complete and a checkpoint commit is created, the AI Agent must actively interpret this as a **"Context Wipe"** signal. It should summarize the outcome in its git notes and move forward treating the checkpoint as absolute truth, deliberately dropping earlier conversational history.
 - **MMA Phase Memory Wipe:** After completing a major Phase, use the Tier 1/2 Orchestrator's perspective to consolidate state into Git Notes and then disregard previous trial-and-error histories.
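The "AST Skeleton Views" idea (signatures and docstrings without full source) can be illustrated with Python's standard `ast` module. This commit does not show how `mma_exec.py` actually builds its skeletons, so the following is a minimal sketch under that assumption; `skeleton_view` is a hypothetical name.

```python
import ast


def skeleton_view(source: str) -> str:
    """Return `source` with every function body reduced to its docstring and `...`.

    Imports, class headers, signatures, and annotations are kept, so a worker
    model still sees the interface without the implementation.
    Requires Python 3.9+ for ast.unparse.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            body: list[ast.stmt] = []
            if doc is not None:
                # Keep the docstring as the first statement of the stub.
                body.append(ast.Expr(value=ast.Constant(value=doc)))
            # Replace the implementation with a literal `...`.
            body.append(ast.Expr(value=ast.Constant(value=Ellipsis)))
            node.body = body
    return ast.unparse(ast.fix_missing_locations(tree))
```

Calling `skeleton_view` on a dependency's source and passing only the result to the Tier 3 Worker is one way to keep the context window small; comments and original formatting are lost because `ast.unparse` regenerates the code from the tree.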