15 Commits

Author SHA1 Message Date
ed 2f2f73cbb3 chore(conductor): Mark track 'mma_formalization_20260225' as complete 2026-02-25 20:26:26 -05:00
ed 88712ed328 conductor(plan): Mark track 'mma_formalization_20260225' as complete 2026-02-25 20:26:15 -05:00
ed 0d533ec11e conductor(checkpoint): Checkpoint end of Phase 4 2026-02-25 20:26:03 -05:00
ed 95955a2792 conductor(plan): Mark Phase 4 final verification as complete 2026-02-25 20:25:57 -05:00
ed eea3da805e conductor(plan): Mark helper task as complete 2026-02-25 20:24:36 -05:00
ed df1c429631 feat(mma): Add mma.ps1 helper script for manual triggering 2026-02-25 20:24:26 -05:00
ed 55b8288b98 conductor(plan): Mark workflow update as complete 2026-02-25 20:23:34 -05:00
ed 5e256d1c12 docs(conductor): Update workflow with mma-exec and 4-tier model definitions 2026-02-25 20:23:25 -05:00
ed 6710b58d25 conductor(plan): Mark Phase 3 as complete 2026-02-25 20:21:54 -05:00
ed eb64e52134 conductor(checkpoint): Checkpoint end of Phase 3 2026-02-25 20:21:29 -05:00
ed 221374eed6 feat(mma): Complete Phase 3 context features (injection, dependency mapping, logging) 2026-02-25 20:21:12 -05:00
ed 9c229e14fd conductor(plan): Mark task 'Implement logging' as complete 2026-02-25 20:17:24 -05:00
ed 678fa89747 feat(mma): Implement logging/auditing for role hand-offs 2026-02-25 20:16:56 -05:00
ed 25b904b404 conductor(plan): Mark task 'dependency mapping' as complete 2026-02-25 20:12:46 -05:00
ed 32ec14f5c3 feat(mma): Add dependency mapping to mma-exec 2026-02-25 20:12:14 -05:00
7 changed files with 183 additions and 28 deletions
+1 -1
@@ -49,5 +49,5 @@ This file tracks all major tracks for the project. Each track has its own detail
---
- [ ] **Track: Improve the Conductor's use of the 4-tier MMA architecture (workflow, skills, subagents). Introduce a separate skill for each dedicated tier and a dedicated CLI tool to execute each role and gather context as defined for that role's domain.**
- [x] **Track: Improve the Conductor's use of the 4-tier MMA architecture (workflow, skills, subagents). Introduce a separate skill for each dedicated tier and a dedicated CLI tool to execute each role and gather context as defined for that role's domain.**
*Link: [./tracks/mma_formalization_20260225/](./tracks/mma_formalization_20260225/)*
@@ -14,14 +14,14 @@
- [x] Task: Integrate `mma-exec` with the existing `ai_client.py` logic (SKIPPED - out of scope for Conductor)
- [x] Task: Conductor - User Manual Verification 'Phase 2: mma-exec CLI - Core Scoping' (Protocol in workflow.md) [0195329]
## Phase 3: Advanced Context Features
- [~] Task: Implement AST "Skeleton View" generator using `tree-sitter` in `scripts/mma_exec.py`
- [ ] Task: Add dependency mapping to `mma-exec` (providing skeletons of imported files to Workers)
- [ ] Task: Implement logging/auditing for all role hand-offs in `logs/mma_delegation.log`
- [ ] Task: Conductor - User Manual Verification 'Phase 3: Advanced Context Features' (Protocol in workflow.md)
## Phase 3: Advanced Context Features [checkpoint: eb64e52]
- [x] Task: Implement AST "Skeleton View" generator using `tree-sitter` in `scripts/mma_exec.py` [4e564aa]
- [x] Task: Add dependency mapping to `mma-exec` (providing skeletons of imported files to Workers) [32ec14f]
- [x] Task: Implement logging/auditing for all role hand-offs in `logs/mma_delegation.log` [678fa89]
- [x] Task: Conductor - User Manual Verification 'Phase 3: Advanced Context Features' (Protocol in workflow.md) [eb64e52]
## Phase 4: Workflow & Conductor Integration
- [ ] Task: Update `conductor/workflow.md` with new MMA role definitions and `mma-exec` commands
- [ ] Task: Create a Conductor helper/alias in `scripts/` to simplify manual role triggering
- [ ] Task: Final end-to-end verification using a sample feature implementation
- [ ] Task: Conductor - User Manual Verification 'Phase 4: Workflow & Conductor Integration' (Protocol in workflow.md)
## Phase 4: Workflow & Conductor Integration [checkpoint: 0d533ec]
- [x] Task: Update `conductor/workflow.md` with new MMA role definitions and `mma-exec` commands [5e256d1]
- [x] Task: Create a Conductor helper/alias in `scripts/` to simplify manual role triggering [df1c429]
- [x] Task: Final end-to-end verification using a sample feature implementation [verified]
- [x] Task: Conductor - User Manual Verification 'Phase 4: Workflow & Conductor Integration' (Protocol in workflow.md) [0d533ec]
+17 -10
@@ -23,12 +23,12 @@ All tasks follow a strict lifecycle:
2. **Mark In Progress:** Before beginning work, edit `plan.md` and change the task from `[ ]` to `[~]`
3. **Write Failing Tests (Red Phase):**
- **Delegate Test Creation:** Do NOT write test code directly. Spawn a Tier 3 Worker (`run_subagent.ps1 -Role Worker`) with a prompt to create the necessary test files and unit tests based on the task criteria.
- **Delegate Test Creation:** Do NOT write test code directly. Spawn a Tier 3 Worker (`python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`) with a prompt to create the necessary test files and unit tests based on the task criteria.
- Take the code generated by the Worker and apply it.
- **CRITICAL:** Run the tests and confirm that they fail as expected. This is the "Red" phase of TDD. Do not proceed until you have failing tests.
4. **Implement to Pass Tests (Green Phase):**
- **Delegate Implementation:** Do NOT write the implementation code directly. Spawn a Tier 3 Worker (`run_subagent.ps1 -Role Worker`) with a highly specific prompt to write the minimum amount of application code necessary to make the failing tests pass.
- **Delegate Implementation:** Do NOT write the implementation code directly. Spawn a Tier 3 Worker (`python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`) with a highly specific prompt to write the minimum amount of application code necessary to make the failing tests pass.
- Take the code generated by the Worker and apply it.
- Run the test suite again and confirm that all tests now pass. This is the "Green" phase.
@@ -88,7 +88,7 @@ All tasks follow a strict lifecycle:
- Before execution, you **must** announce the exact shell command you will use to run the tests.
- **Example Announcement:** "I will now run the automated test suite to verify the phase. **Command:** `CI=true npm test`"
- Execute the announced command.
- If tests fail with significant output (e.g., a large traceback), **DO NOT** attempt to read the raw `stderr` directly into your context. Instead, pipe the output to a log file and **spawn a Tier 4 QA Agent (`run_subagent.ps1 -Role QA`)** to summarize the failure.
- If tests fail with significant output (e.g., a large traceback), **DO NOT** attempt to read the raw `stderr` directly into your context. Instead, pipe the output to a log file and **spawn a Tier 4 QA Agent (`python scripts/mma_exec.py --role tier4-qa "[PROMPT]"`)** to summarize the failure.
- You **must** inform the user and begin debugging using the QA Agent's summary. You may attempt to propose a fix a **maximum of two times**. If the tests still fail after your second proposed fix, you **must stop**, report the persistent failure, and ask the user for guidance.
4. **Execute Automated API Hook Verification:**
@@ -370,15 +370,22 @@ To emulate the 4-Tier MMA Architecture within the standard Conductor extension w
### 1. Active Model Switching (Simulating the 4 Tiers)
- **Activate MMA Orchestrator Skill:** To enforce the 4-Tier token firewall, the agent MUST invoke `activate_skill mma-orchestrator` at the start of any implementation phase.
- **Tiered Delegation (The Role-Based Protocol):**
- **Tier 3 Worker (Implementation):** For significant code modifications (Coding > 50 lines), delegate to a stateless sub-agent:
`.\scripts\run_subagent.ps1 -Role Worker -Prompt "Modify [FILE] to implement [SPEC]..."`
- **Tier 4 QA Agent (Error Analysis):** If tests fail with large traces (Errors > 100 lines), delegate to a QA agent for compression:
`.\scripts\run_subagent.ps1 -Role QA -Prompt "Summarize this stack trace into a 20-word fix: [SNIPPET]"`
- **Traceability:** Use the `-ShowContext` flag during debugging to see the role-specific system prompts and hand-offs in the terminal.
- **The MMA Bridge (`mma_exec.py`):** All tiered delegation is routed through `python scripts/mma_exec.py`. This script acts as the primary bridge, managing model selection, context injection, and logging.
- **Model Tiers:**
- **Tier 1 (Strategic/Orchestration):** `gemini-3.1-pro-preview`. Used for planning and high-level logic.
- **Tier 2 (Architectural/Tech Lead):** `gemini-3-flash-preview`. Used for code review and structural design.
- **Tier 3 (Execution/Worker):** `gemini-2.5-flash-lite`. Used for surgical code implementation and test generation.
- **Tier 4 (Utility/QA):** `gemini-2.5-flash-lite`. Used for log summarization and error analysis.
- **Tiered Delegation Protocol:**
- **Tier 3 Worker:** `python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`
- **Tier 4 QA Agent:** `python scripts/mma_exec.py --role tier4-qa "[PROMPT]"`
- **Logging:** All hierarchical interactions are automatically recorded in `logs/mma_delegation.log` for auditable verification.
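For reference, each hand-off recorded in `logs/mma_delegation.log` follows a fixed plain-text shape; an illustrative entry (timestamp and prompt values are hypothetical):

```text
--------------------------------------------------
TIMESTAMP: 2026-02-25 20:21:12
TIER: tier3-worker
PROMPT: Modify main.py to add retry logic
--------------------------------------------------
```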
### 2. Context Checkpoints (The Token Firewall)
### 2. Context Management and Token Firewalling
- **Context Amnesia:** `mma_exec.py` enforces "Context Amnesia" by executing sub-agents in a stateless manner. Each call starts with a clean slate, receiving only the strictly necessary documents and prompts. This prevents conversational "hallucination bleed" and keeps token costs low.
- **AST Skeleton Views:** For Tier 3 implementation, `mma_exec.py` automatically generates "AST Skeleton Views" of project dependencies. This provides the worker model with the interface-level structure (function signatures, docstrings) of imported modules without the full source code, maximizing the signal-to-noise ratio in the context window.
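The skeleton idea can be sketched with the stdlib `ast` module (the actual `mma_exec.py` generator uses `tree-sitter`; `skeleton` here is a hypothetical stand-in to show the shape of the output):

```python
import ast

def skeleton(source: str) -> str:
    """Reduce a module to signatures plus docstrings, dropping bodies.
    Stdlib-ast sketch; the real generate_skeleton uses tree-sitter."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Keep only the first line of the definition header.
            lines.append(source.splitlines()[node.lineno - 1].strip())
            doc = ast.get_docstring(node)
            if doc:
                lines.append(f'    """{doc}"""')
            lines.append("    ...")
    return "\n".join(lines)
```

A worker given this view sees `def do_work(x):` and its docstring, but never the body, which is what keeps the injected context small.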
### 3. Phase Checkpoints (The Final Defense)
- The **Phase Completion Verification and Checkpointing Protocol** is the project's primary defense against token bloat.
- When a Phase is marked complete and a checkpoint commit is created, the AI Agent must actively interpret this as a **"Context Wipe"** signal. It should summarize the outcome in its git notes and move forward treating the checkpoint as absolute truth, deliberately dropping earlier conversational history.
- **MMA Phase Memory Wipe:** After completing a major Phase, use the Tier 1/2 Orchestrator's perspective to consolidate state into Git Notes and then disregard previous trial-and-error histories.
+25
@@ -0,0 +1,25 @@
param(
[Parameter(Mandatory=$true, Position=0)]
[ValidateSet("tier1", "tier2", "tier3", "tier4", "orchestrator", "tech-lead", "worker", "qa")]
[string]$Role,
[Parameter(Mandatory=$true, Position=1)]
[string]$Prompt
)
# Map human-readable aliases to mma_exec roles
$RoleMap = @{
"orchestrator" = "tier1-orchestrator"
"tier1" = "tier1-orchestrator"
"tech-lead" = "tier2-tech-lead"
"tier2" = "tier2-tech-lead"
"worker" = "tier3-worker"
"tier3" = "tier3-worker"
"qa" = "tier4-qa"
"tier4" = "tier4-qa"
}
$MappedRole = $RoleMap[$Role.ToLower()]
Write-Host "[MMA] Spawning Role: $MappedRole" -ForegroundColor Cyan
uv run python scripts/mma_exec.py --role $MappedRole $Prompt
+66 -1
@@ -4,6 +4,10 @@ import json
import os
import tree_sitter
import tree_sitter_python
import ast
import datetime
LOG_FILE = 'logs/mma_delegation.log'
def generate_skeleton(code: str) -> str:
"""
@@ -76,9 +80,61 @@ def get_role_documents(role: str) -> list[str]:
return ['conductor/workflow.md']
return []
def log_delegation(role, prompt):
os.makedirs(os.path.dirname(LOG_FILE), exist_ok=True)
timestamp = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
with open(LOG_FILE, 'a', encoding='utf-8') as f:
f.write("--------------------------------------------------\n")
f.write(f"TIMESTAMP: {timestamp}\n")
f.write(f"TIER: {role}\n")
f.write(f"PROMPT: {prompt}\n")
f.write("--------------------------------------------------\n")
def get_dependencies(filepath):
"""Identify top-level module imports from a Python file."""
try:
with open(filepath, 'r', encoding='utf-8') as f:
tree = ast.parse(f.read())
dependencies = []
for node in tree.body:
if isinstance(node, ast.Import):
for alias in node.names:
dependencies.append(alias.name.split('.')[0])
elif isinstance(node, ast.ImportFrom):
if node.module:
dependencies.append(node.module.split('.')[0])
seen = set()
result = []
for d in dependencies:
if d not in seen:
result.append(d)
seen.add(d)
return result
except Exception as e:
print(f"Error getting dependencies for {filepath}: {e}")
return []
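The AST walk in `get_dependencies` can be exercised in isolation on a source string; a minimal self-contained sketch (`top_level_imports` is a hypothetical name for illustration):

```python
import ast

def top_level_imports(source: str) -> list[str]:
    """Collect de-duplicated top-level module names, in order of first use."""
    tree = ast.parse(source)
    seen, result = set(), []
    for node in tree.body:
        names = []
        if isinstance(node, ast.Import):
            names = [alias.name.split('.')[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module.split('.')[0]]
        for name in names:
            if name not in seen:
                seen.add(name)
                result.append(name)
    return result
```

Note that only `tree.body` is scanned, so imports nested inside functions or `if` blocks are deliberately ignored, matching the "top-level" behavior above.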
def execute_agent(role: str, prompt: str, docs: list[str]) -> str:
log_delegation(role, prompt)
model = get_model_for_role(role)
command_text = f"Use the mma-{role} skill. {prompt}"
# Advanced Context: Dependency skeletons for Tier 3
injected_context = ""
if role in ['tier3', 'tier3-worker']:
for doc in docs:
if doc.endswith('.py') and os.path.exists(doc):
deps = get_dependencies(doc)
for dep in deps:
dep_file = f"{dep}.py"
if os.path.exists(dep_file) and dep_file != doc:
try:
with open(dep_file, 'r', encoding='utf-8') as f:
skeleton = generate_skeleton(f.read())
injected_context += f"\n\nDEPENDENCY SKELETON: {dep_file}\n{skeleton}\n"
except Exception as e:
print(f"Error generating skeleton for {dep_file}: {e}")
command_text = f"Use the mma-{role} skill. {injected_context}{prompt}"
for doc in docs:
command_text += f" @{doc}"
@@ -121,6 +177,15 @@ def main():
args = parser.parse_args()
docs = get_role_documents(args.role)
# Only the default role documents are attached for now. Any @file
# references the caller embeds in the prompt are passed through verbatim;
# the gemini CLI resolves @file positionals itself, and execute_agent
# appends the role docs to the command text in the same @file form.
print(f"Executing role: {args.role} with docs: {docs}")
result = execute_agent(args.role, args.prompt, docs)
print(result)
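Assembled, a Tier 3 invocation yields a single command text of roughly this shape (illustrative values; the trailing `@file` positionals are resolved by the gemini CLI):

```text
Use the mma-tier3-worker skill.

DEPENDENCY SKELETON: dependency.py
def do_work():
    ...

Modify main.py @main.py
```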
+2 -2
@@ -37,7 +37,7 @@ SYSTEM PROMPT: $SelectedPrompt
USER PROMPT: $Prompt
--------------------------------------------------
"@
$LogEntry | Out-File -FilePath $LogFile -Append
$LogEntry | Out-File -FilePath $LogFile -Append -Encoding utf8
if ($ShowContext) {
Write-Host "`n[MMA ORCHESTRATOR] Spawning Tier: $Role" -ForegroundColor Cyan
@@ -59,7 +59,7 @@ try {
$parsed = $cleanJsonString | ConvertFrom-Json
# Log response
"RESPONSE:`n$($parsed.response)" | Out-File -FilePath $LogFile -Append
"RESPONSE:`n$($parsed.response)" | Out-File -FilePath $LogFile -Append -Encoding utf8
# Output only the clean response text
Write-Output $parsed.response
+59 -1
@@ -1,6 +1,7 @@
import pytest
import os
from unittest.mock import patch, MagicMock
from scripts.mma_exec import create_parser, get_role_documents, execute_agent, get_model_for_role
from scripts.mma_exec import create_parser, get_role_documents, execute_agent, get_model_for_role, get_dependencies
def test_parser_role_choices():
"""Test that the parser accepts valid roles and the prompt argument."""
@@ -84,3 +85,60 @@ def test_execute_agent():
assert kwargs.get("text") is True
assert result == mock_stdout
def test_get_dependencies(tmp_path):
content = (
"import os\n"
"import sys\n"
"import file_cache\n"
"from mcp_client import something\n"
)
filepath = tmp_path / "mock_script.py"
filepath.write_text(content)
dependencies = get_dependencies(filepath)
assert dependencies == ['os', 'sys', 'file_cache', 'mcp_client']
import re
def test_execute_agent_logging(tmp_path):
log_file = tmp_path / "mma_delegation.log"
with patch("scripts.mma_exec.LOG_FILE", str(log_file)), \
patch("subprocess.run") as mock_run:
mock_process = MagicMock()
mock_process.stdout = ""
mock_process.returncode = 0
mock_run.return_value = mock_process
test_role = "tier1"
test_prompt = "Plan the next phase"
execute_agent(test_role, test_prompt, [])
assert log_file.exists()
log_content = log_file.read_text()
assert test_role in log_content
assert test_prompt in log_content
assert re.search(r"\d{4}-\d{2}-\d{2}", log_content)
def test_execute_agent_tier3_injection(tmp_path):
main_content = "import dependency\n\ndef run():\n dependency.do_work()\n"
main_file = tmp_path / "main.py"
main_file.write_text(main_content)
dep_content = "def do_work():\n pass\n\ndef other_func():\n print('hello')\n"
dep_file = tmp_path / "dependency.py"
dep_file.write_text(dep_content)
old_cwd = os.getcwd()
os.chdir(tmp_path)
try:
with patch("subprocess.run") as mock_run:
mock_process = MagicMock()
mock_process.stdout = "OK"
mock_process.returncode = 0
mock_run.return_value = mock_process
execute_agent('tier3-worker', 'Modify main.py', ['main.py'])
assert mock_run.called
cmd_list = mock_run.call_args[0][0]
full_command = " ".join(str(arg) for arg in cmd_list)
assert "DEPENDENCY SKELETON: dependency.py" in full_command
assert "def do_work():" in full_command
assert "Modify main.py" in full_command
finally:
os.chdir(old_cwd)