feat(mma): Refine tier roles, tool access, and observability

2026-02-26 08:31:19 -05:00
parent 732f3d4e13
commit 91693a5168
10 changed files with 108 additions and 78 deletions


@@ -1,18 +1,19 @@
--- ---
name: mma-tier1-orchestrator name: mma-tier1-orchestrator
description: Focused on product alignment, high-level planning, and track management. description: Focused on product alignment, high-level planning, and track initialization.
--- ---
# MMA Tier 1: Orchestrator # MMA Tier 1: Orchestrator
You are the Tier 1 Orchestrator. Your role is to oversee the product direction, ensure alignment with the product definition, manage high-level planning, and track execution within the MMA framework. You are the Tier 1 Orchestrator. Your role is to oversee the product direction and manage project/track initialization within the Conductor framework.
## Responsibilities ## Responsibilities
- Maintain alignment with the product guidelines and definition. - Maintain alignment with the product guidelines and definition.
- Define track boundaries and accept tasks from users. - Define track boundaries and initialize new tracks (`/conductor:newTrack`).
- Delegate architectural planning to the Tier 2 Tech Lead. - Set up the project environment (`/conductor:setup`).
- Act as the primary interface for track management. - Delegate track execution to the Tier 2 Tech Lead.
## Limitations ## Limitations
- Do not execute tracks or implement features.
- Do not write code or perform low-level bug fixing. - Do not write code or perform low-level bug fixing.
- Keep context strictly focused on product definitions and track plans. - Keep context strictly focused on product definitions and high-level strategy.


@@ -1,19 +1,21 @@
--- ---
name: mma-tier2-tech-lead name: mma-tier2-tech-lead
description: Focused on architectural design, tech stack alignment, and code review. description: Focused on track execution, architectural design, and implementation oversight.
--- ---
# MMA Tier 2: Tech Lead # MMA Tier 2: Tech Lead
You are the Tier 2 Tech Lead. Your role is to ensure architectural integrity, align with the defined tech stack, review code, and provide detailed technical specifications for the Tier 3 Workers. You are the Tier 2 Tech Lead. Your role is to manage the implementation of tracks (`/conductor:implement`), ensure architectural integrity, and oversee the work of Tier 3 and 4 sub-agents.
## Responsibilities ## Responsibilities
- Manage the execution of implementation tracks.
- Ensure alignment with `tech-stack.md` and project architecture. - Ensure alignment with `tech-stack.md` and project architecture.
- Break down tasks into specific technical steps and specifications. - Break down tasks into specific technical steps for Tier 3 Workers.
- Review implementations produced by Tier 3 Workers. - Maintain persistent context throughout a track's implementation phase (No Context Amnesia).
- Guide the resolution of complex technical issues. - Review implementations and coordinate bug fixes via Tier 4 QA.
## Limitations ## Limitations
- Do not write boilerplate or exhaustive feature code yourself. - Do not perform heavy implementation work directly; delegate to Tier 3.
- Delegate implementation tasks to Tier 3 Workers using `uv run python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`. - Delegate implementation tasks to Tier 3 Workers using `uv run python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`.
- For error analysis of large logs, use `uv run python scripts/mma_exec.py --role tier4-qa "[PROMPT]"`. - For error analysis of large logs, use `uv run python scripts/mma_exec.py --role tier4-qa "[PROMPT]"`.
- Minimize full file reads for large modules; rely on "Skeleton Views" and git diffs.
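The two delegation commands above share one shape: a role flag followed by a single prompt string. A minimal sketch of building that command from Python, assuming the invocation is exactly the one quoted in the skill file; `build_delegation_command` and `KNOWN_ROLES` are hypothetical names for illustration and are not part of this commit:

```python
import shlex

# Roles quoted in the Tier 2 skill file; any other value is rejected here.
# (Assumption: the real script may accept more aliases, e.g. "tier3".)
KNOWN_ROLES = {"tier3-worker", "tier4-qa"}


def build_delegation_command(role: str, prompt: str) -> list[str]:
    """Return the argv list for one stateless sub-agent call (hypothetical helper)."""
    if role not in KNOWN_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # An argv list keeps the prompt as one argument without manual shell quoting.
    return ["uv", "run", "python", "scripts/mma_exec.py", "--role", role, prompt]


def as_shell_string(cmd: list[str]) -> str:
    """Render the argv list as a copy-pasteable POSIX shell command."""
    return shlex.join(cmd)
```

Passing the list to `subprocess.run` directly sidesteps quoting problems when a prompt contains quotes or newlines; `as_shell_string` is only for display.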


@@ -5,14 +5,16 @@ description: Focused on TDD implementation, surgical code changes, and following
# MMA Tier 3: Worker # MMA Tier 3: Worker
You are the Tier 3 Worker. Your role is to implement specific, scoped technical requirements, follow Test-Driven Development (TDD), and make surgical code modifications. You are the Tier 3 Worker. Your role is to implement specific, scoped technical requirements, follow Test-Driven Development (TDD), and make surgical code modifications. You operate in a stateless manner (Context Amnesia).
## Responsibilities ## Responsibilities
- Implement code strictly according to the provided prompt and specifications. - Implement code strictly according to the provided prompt and specifications.
- Write failing tests first, then implement the code to pass them. - Write failing tests first, then implement the code to pass them.
- Ensure all changes are minimal, functional, and conform to the requested standards. - Ensure all changes are minimal, functional, and conform to the requested standards.
- Utilize provided tool access (read_file, write_file, etc.) to perform implementation and verification.
## Limitations ## Limitations
- Do not make architectural decisions. - Do not make architectural decisions.
- Do not modify unrelated files. - Do not modify unrelated files beyond the immediate task scope.
- Operate statelessly and return only the requested code or diff. - Always operate statelessly; assume each task starts with a clean context.
- Rely on "Skeleton Views" provided by Tier 2/Orchestrator for understanding dependencies.


@@ -5,13 +5,15 @@ description: Focused on test analysis, error summarization, and bug reproduction
# MMA Tier 4: QA Agent # MMA Tier 4: QA Agent
You are the Tier 4 QA Agent. Your role is to analyze massive error logs, summarize tracebacks, and help diagnose issues efficiently. You are the Tier 4 QA Agent. Your role is to analyze error logs, summarize tracebacks, and help diagnose issues efficiently. You operate in a stateless manner (Context Amnesia).
## Responsibilities ## Responsibilities
- Compress massive stack traces or log files into concise, actionable summaries. - Compress large stack traces or log files into concise, actionable summaries.
- Identify the root cause of test failures or runtime errors. - Identify the root cause of test failures or runtime errors.
- Provide a brief description of the required fix. - Provide a brief, technical description of the required fix.
- Utilize provided diagnostic and exploration tools to verify failures.
## Limitations ## Limitations
- Do not implement the fix directly. - Do not implement the fix directly.
- Ensure your output is extremely brief and stateless. - Ensure your output is extremely brief and focused.
- Always operate statelessly; assume each analysis starts with a clean context.


@@ -11,11 +11,11 @@ To serve as an expert-level utility for personal developer use on small projects
## Key Features ## Key Features
- **Multi-Provider Integration:** Supports Gemini, Anthropic, and DeepSeek with seamless switching. - **Multi-Provider Integration:** Supports Gemini, Anthropic, and DeepSeek with seamless switching.
- **4-Tier Hierarchical Multi-Model Architecture:** Orchestrates an intelligent cascade of specialized models to isolate cognitive loads and minimize token burn. - **4-Tier Hierarchical Multi-Model Architecture:** Orchestrates an intelligent cascade of specialized models to isolate cognitive loads and minimize token burn.
- **Tier 1 (Orchestrator):** Product alignment and high-level strategy using `gemini-3.1-pro-preview`. - **Tier 1 (Orchestrator):** Strategic product alignment, setup (`/conductor:setup`), and track initialization (`/conductor:newTrack`) using `gemini-3.1-pro-preview`.
- **Tier 2 (Tech Lead):** Architectural design and technical planning using `gemini-3-flash-preview`. - **Tier 2 (Tech Lead):** Technical oversight and track execution (`/conductor:implement`) using `gemini-3-flash-preview`. Maintains persistent context throughout implementation.
- **Tier 3 (Worker):** Focused implementation and surgical code changes using `gemini-2.5-flash-lite` or `deepseek-v3`. - **Tier 3 (Worker):** Surgical code implementation and TDD using `gemini-2.5-flash-lite` or `deepseek-v3`. Operates statelessly with tool access and dependency skeletons.
- **Tier 4 (QA):** Bug reproduction, test analysis, and error translation using `gemini-2.5-flash-lite` or `deepseek-v3`. - **Tier 4 (QA):** Error analysis and diagnostics using `gemini-2.5-flash-lite` or `deepseek-v3`. Operates statelessly with tool access.
- **MMA Delegation Engine:** Utilizes the `mma-exec` CLI and `mma.ps1` helper to route tasks, ensuring each tier receives role-scoped context (e.g., Orchestrators get Product docs; Workers get Workflow specs). - **MMA Delegation Engine:** Utilizes the `mma-exec` CLI and `mma.ps1` helper to route tasks, ensuring role-scoped context and detailed observability via timestamped sub-agent logs.
- **Role-Scoped Documentation:** Automated mapping of foundational documents to specific tiers to prevent token bloat and maintain high-signal context. - **Role-Scoped Documentation:** Automated mapping of foundational documents to specific tiers to prevent token bloat and maintain high-signal context.
- **Strict Memory Siloing:** Employs AST-based interface extraction and "Context Amnesia" to provide workers only with the absolute minimum context required, preventing hallucination loops. - **Strict Memory Siloing:** Employs AST-based interface extraction and "Context Amnesia" to provide workers only with the absolute minimum context required, preventing hallucination loops.
- **Explicit Execution Control:** All AI-generated PowerShell scripts require explicit human confirmation via interactive UI dialogs before execution, supported by a global "Linear Execution Clutch" for deterministic debugging. - **Explicit Execution Control:** All AI-generated PowerShell scripts require explicit human confirmation via interactive UI dialogs before execution, supported by a global "Linear Execution Clutch" for deterministic debugging.


@@ -28,7 +28,7 @@
## Configuration & Tooling ## Configuration & Tooling
- **tree-sitter & tree-sitter-python:** For deterministic AST parsing and generation of curated "Skeleton Views" and interface-level memory structures. - **tree-sitter & tree-sitter-python:** For deterministic AST parsing and automated generation of curated "Skeleton Views" (signatures and docstrings) to minimize context bloat for sub-agents.
- **pydantic / dataclasses:** For defining strict state schemas (Tracks, Tickets) used in linear orchestration. - **pydantic / dataclasses:** For defining strict state schemas (Tracks, Tickets) used in linear orchestration.
- **tomli-w:** For writing TOML configuration files. - **tomli-w:** For writing TOML configuration files.
- **psutil:** For system and process monitoring (CPU/Memory telemetry). - **psutil:** For system and process monitoring (CPU/Memory telemetry).


@@ -1,23 +1,23 @@
# Implementation Plan: MMA Utilization Refinement # Implementation Plan: MMA Utilization Refinement
## Phase 1: Skill Segregation and Tier Re-Alignment ## Phase 1: Skill Segregation and Tier Re-Alignment
- [ ] Task: Refine `mma-tier1-orchestrator` skill to focus exclusively on project/track initialization. - [x] Task: Refine `mma-tier1-orchestrator` skill to focus exclusively on project/track initialization. e950601
- [ ] Task: Refine `mma-tier2-tech-lead` skill for track execution, ensuring persistent memory across tasks (Disable Context Amnesia). - [x] Task: Refine `mma-tier2-tech-lead` skill for track execution, ensuring persistent memory across tasks (Disable Context Amnesia). e950601
- [ ] Task: Refine `mma-tier3-worker` and `mma-tier4-qa` skills to be stateless (Enable Context Amnesia) but equipped with full file read/write tools. - [x] Task: Refine `mma-tier3-worker` and `mma-tier4-qa` skills to be stateless but equipped with full file read/write tools; beyond that, they should be provided only the project context they need, via AST skeleton extraction or whatever Tier 2 provides them. e950601
- [ ] Task: Conductor - User Manual Verification 'Phase 1' (Protocol in workflow.md) - [ ] Task: Conductor - User Manual Verification 'Phase 1' (Protocol in workflow.md)
## Phase 2: AST Skeleton Extraction (Skeleton Views) ## Phase 2: AST Skeleton Extraction (Skeleton Views)
- [ ] Task: Enhance `mcp_client.py` with `get_python_skeleton` functionality using `tree-sitter` to extract signatures and docstrings. - [x] Task: Enhance `mcp_client.py` with `get_python_skeleton` functionality using `tree-sitter` to extract signatures and docstrings. e950601
- [ ] Task: Update `mma_exec.py` to utilize these skeletons for non-target dependencies when preparing context for Tier 3. - [x] Task: Update `mma_exec.py` to utilize these skeletons for non-target dependencies when preparing context for Tier 3. e950601
- [ ] Task: Integrate "Interface-level" scrubbed versions into the sub-agent injection logic. - [x] Task: Integrate "Interface-level" scrubbed versions into the sub-agent injection logic. e950601
- [ ] Task: Conductor - User Manual Verification 'Phase 2' (Protocol in workflow.md) - [ ] Task: Conductor - User Manual Verification 'Phase 2' (Protocol in workflow.md)
## Phase 3: Sub-Agent Observability ## Phase 3: Sub-Agent Observability
- [ ] Task: Implement a dedicated logging mechanism for sub-agents (e.g., `logs/mma_subagents.log`) that captures reasoning and tool output. - [x] Task: Implement a dedicated logging mechanism for sub-agents (e.g., `logs/agents/mma_tier<#>_task_<timestamp>.log`) that captures reasoning and tool output. e950601
- [ ] Task: Ensure sub-agent executions do not pollute the primary Gemini CLI history while remaining visible to the user via the log. - [x] Task: Ensure sub-agent executions do not pollute the primary Gemini CLI history while remaining visible to the user via the log. e950601
- [ ] Task: Conductor - User Manual Verification 'Phase 3' (Protocol in workflow.md) - [ ] Task: Conductor - User Manual Verification 'Phase 3' (Protocol in workflow.md)
## Phase 4: Workflow Optimization and Validation ## Phase 4: Workflow Optimization and Validation
- [ ] Task: Update `conductor/workflow.md` to formally document the refined tier roles and tool permissions. - [x] Task: Update `conductor/workflow.md` to formally document the refined tier roles and tool permissions. e950601
- [ ] Task: Conduct a full end-to-end "Dry Run" (Create a dummy track and implement a small feature) to verify the new architecture. - [x] Task: Conduct a full end-to-end "Dry Run" (Create a dummy track and implement a small feature) to verify the new architecture. e950601
- [ ] Task: Conductor - User Manual Verification 'Phase 4' (Protocol in workflow.md) - [ ] Task: Conductor - User Manual Verification 'Phase 4' (Protocol in workflow.md)


@@ -372,17 +372,18 @@ To emulate the 4-Tier MMA Architecture within the standard Conductor extension w
- **Activate MMA Orchestrator Skill:** To enforce the 4-Tier token firewall, the agent MUST invoke `activate_skill mma-orchestrator` at the start of any implementation phase. - **Activate MMA Orchestrator Skill:** To enforce the 4-Tier token firewall, the agent MUST invoke `activate_skill mma-orchestrator` at the start of any implementation phase.
- **The MMA Bridge (`mma_exec.py`):** All tiered delegation is routed through `python scripts/mma_exec.py`. This script acts as the primary bridge, managing model selection, context injection, and logging. - **The MMA Bridge (`mma_exec.py`):** All tiered delegation is routed through `python scripts/mma_exec.py`. This script acts as the primary bridge, managing model selection, context injection, and logging.
- **Model Tiers:** - **Model Tiers:**
- **Tier 1 (Strategic/Orchestration):** `gemini-3.1-pro-preview`. Used for planning and high-level logic. - **Tier 1 (Strategic/Orchestration):** `gemini-3.1-pro-preview`. Focused on product alignment, setup (`/conductor:setup`), and track initialization (`/conductor:newTrack`).
- **Tier 2 (Architectural/Tech Lead):** `gemini-3-flash-preview`. Used for code review and structural design. - **Tier 2 (Architectural/Tech Lead):** `gemini-3-flash-preview`. Focused on architectural design and track execution (`/conductor:implement`). **Note:** Tier 2 maintains persistent memory throughout a track's implementation.
- **Tier 3 (Execution/Worker):** `gemini-2.5-flash-lite`. Used for surgical code implementation and test generation. - **Tier 3 (Execution/Worker):** `gemini-2.5-flash-lite`. Used for surgical code implementation and test generation. Operates statelessly (Context Amnesia) but has access to file I/O tools.
- **Tier 4 (Utility/QA):** `gemini-2.5-flash-lite`. Used for log summarization and error analysis. - **Tier 4 (Utility/QA):** `gemini-2.5-flash-lite`. Used for log summarization and error analysis. Operates statelessly (Context Amnesia) but has access to diagnostic tools.
- **Tiered Delegation Protocol:** - **Tiered Delegation Protocol:**
- **Tier 3 Worker:** `python scripts/mma_exec.py --role tier3-worker "[PROMPT]"` - **Tier 3 Worker:** `python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`
- **Tier 4 QA Agent:** `python scripts/mma_exec.py --role tier4-qa "[PROMPT]"` - **Tier 4 QA Agent:** `python scripts/mma_exec.py --role tier4-qa "[PROMPT]"`
- **Logging:** All hierarchical interactions are automatically recorded in `logs/mma_delegation.log` for auditable verification. - **Observability:** All hierarchical interactions are recorded in `logs/mma_delegation.log` and detailed sub-agent logs are saved to `logs/agents/`.
### 2. Context Management and Token Firewalling ### 2. Context Management and Token Firewalling
- **Context Amnesia:** `mma_exec.py` enforces "Context Amnesia" by executing sub-agents in a stateless manner. Each call starts with a clean slate, receiving only the strictly necessary documents and prompts. This prevents conversational "hallucination bleed" and keeps token costs low. - **Context Amnesia (Tiers 3 & 4):** `mma_exec.py` enforces "Context Amnesia" by executing sub-agents in a stateless manner. Each call starts with a clean slate, receiving only the strictly necessary documents and prompts.
- **Persistent Memory (Tier 2):** The Tier 2 Tech Lead does NOT use Context Amnesia during track implementation to ensure continuity of technical strategy.
- **AST Skeleton Views:** For Tier 3 implementation, `mma_exec.py` automatically generates "AST Skeleton Views" of project dependencies. This provides the worker model with the interface-level structure (function signatures, docstrings) of imported modules without the full source code, maximizing the signal-to-noise ratio in the context window. - **AST Skeleton Views:** For Tier 3 implementation, `mma_exec.py` automatically generates "AST Skeleton Views" of project dependencies. This provides the worker model with the interface-level structure (function signatures, docstrings) of imported modules without the full source code, maximizing the signal-to-noise ratio in the context window.
### 3. Phase Checkpoints (The Final Defense) ### 3. Phase Checkpoints (The Final Defense)


@@ -5,34 +5,33 @@ description: Enforces the 4-Tier Hierarchical Multi-Model Architecture (MMA) wit
# MMA Token Firewall & Tiered Delegation Protocol # MMA Token Firewall & Tiered Delegation Protocol
You are operating as a Tier 1 Product Manager or Tier 2 Tech Lead within the MMA Framework. Your context window is extremely valuable and must be protected from token bloat (such as raw, repetitive code edits, trial-and-error histories, or massive stack traces). You are operating within the MMA Framework, acting as either the **Tier 1 Orchestrator** (for setup/init) or the **Tier 2 Tech Lead** (for execution). Your context window is extremely valuable and must be protected from token bloat (such as raw, repetitive code edits, trial-and-error histories, or massive stack traces).
To accomplish this, you MUST delegate token-heavy or stateless tasks to "Tier 3 Contributors" or "Tier 4 QA Agents" by spawning secondary Gemini CLI instances via `run_shell_command`. To accomplish this, you MUST delegate token-heavy or stateless tasks to **Tier 3 Workers** or **Tier 4 QA Agents** by spawning secondary Gemini CLI instances via `run_shell_command`.
**CRITICAL Prerequisite:** **CRITICAL Prerequisite:**
To avoid hanging the CLI and ensure proper environment authentication, you MUST NOT call the `gemini` command directly. Instead, you MUST use the wrapper script: To ensure proper environment handling and logging, you MUST NOT call the `gemini` command directly for sub-tasks. Instead, use the wrapper script:
`uv run python scripts/mma_exec.py --role <Role> "..."` `uv run python scripts/mma_exec.py --role <Role> "..."`
## 1. The Tier 3 Worker (Heads-Down Coding) ## 1. The Tier 3 Worker (Execution)
When you need to perform a significant code modification (e.g., refactoring a 50-line+ script, writing a massive class, or implementing a predefined spec): When performing code modifications or implementing specific requirements:
1. **DO NOT** attempt to write or use `replace`/`write_file` yourself. Your history will bloat. 1. **DO NOT** perform large code writes yourself.
2. **DO** construct a single, highly specific prompt. 2. **DO** construct a single, highly specific prompt with a clear objective.
3. **DO** spawn a sub-agent using `run_shell_command` pointing to the target file. 3. **DO** spawn a Tier 3 Worker.
*Command:* `uv run python scripts/mma_exec.py --role tier3-worker "Read [FILE_PATH] and modify it to implement [SPECIFIC_INSTRUCTION]. Only write the code, no pleasantries."` *Command:* `uv run python scripts/mma_exec.py --role tier3-worker "Implement [SPECIFIC_INSTRUCTION] in [FILE_PATH]. Follow TDD and return success status or code changes."`
4. The Tier 3 Worker is stateless and has no tool access. You must take the clean code it returns and apply it to the file system using your own `replace` or `write_file` tools. 4. The Tier 3 Worker is stateless and has tool access for file I/O.
## 2. The Tier 4 QA Agent (Error Translation) ## 2. The Tier 4 QA Agent (Diagnostics)
If you run a local test (e.g., `npm test`, `pytest`, `go run`) via `run_shell_command` and it fails with a massive traceback (e.g., 100+ lines of `stderr`): If you run a test or command that fails with a significant error or large traceback:
1. **DO NOT** analyze the raw `stderr` in your own context window. 1. **DO NOT** analyze the raw logs in your own context window.
2. **DO** immediately spawn a stateless Tier 4 agent to compress the error. 2. **DO** spawn a stateless Tier 4 agent to diagnose the failure.
3. *Command:* `uv run python scripts/mma_exec.py --role tier4-qa "Summarize this stack trace into a 20-word fix: [PASTE_SNIPPET_OF_STDERR_HERE]"` 3. *Command:* `uv run python scripts/mma_exec.py --role tier4-qa "Analyze this failure and summarize the root cause: [LOG_DATA]"`
4. Use the 20-word fix returned by the Tier 4 agent to inform your next architectural decision or pass it to the Tier 3 worker.
## 3. Context Amnesia (Phase Checkpoints) ## 3. Persistent Tech Lead Memory (Tier 2)
When you complete a major Phase or Track within the `conductor` workflow: Unlike the stateless sub-agents (Tiers 3 & 4), the **Tier 2 Tech Lead** maintains persistent context throughout the implementation of a track. Do NOT apply "Context Amnesia" to your own session during track implementation. You are responsible for the continuity of the technical strategy.
1. Stage your changes and commit them.
2. Draft a comprehensive summary of the state changes in a Git Note attached to the commit. ## 4. AST Skeleton Views
3. Treat the checkpoint as a "Memory Wipe." Actively disregard previous conversational turns and trial-and-error histories. Rely exclusively on the newly generated Git Note and the physical state of the files on disk for your next Phase. To minimize context bloat for Tier 3, use "Skeleton Views" of dependencies (extracted via `mcp_client.py` or similar) instead of full file contents, unless the Tier 3 worker is explicitly modifying that specific file.
<examples> <examples>
### Example 1: Spawning a Tier 4 QA Agent ### Example 1: Spawning a Tier 4 QA Agent


@@ -81,17 +81,33 @@ def get_role_documents(role: str) -> list[str]:
return ['conductor/workflow.md'] return ['conductor/workflow.md']
return [] return []
def log_delegation(role, prompt): def log_delegation(role, prompt, result=None):
os.makedirs('logs/agents', exist_ok=True)
timestamp = datetime.datetime.now().strftime('%Y%m%d_%H%M%S')
log_file = f'logs/agents/mma_{role}_task_{timestamp}.log'
with open(log_file, 'w', encoding='utf-8') as f:
f.write("==================================================\n")
f.write(f"ROLE: {role}\n")
f.write(f"TIMESTAMP: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")
f.write("--------------------------------------------------\n")
f.write(f"PROMPT:\n{prompt}\n")
f.write("--------------------------------------------------\n")
if result:
f.write(f"RESULT:\n{result}\n")
f.write("==================================================\n")
# Also keep the master log
os.makedirs(os.path.dirname(LOG_FILE), exist_ok=True) os.makedirs(os.path.dirname(LOG_FILE), exist_ok=True)
timestamp = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
with open(LOG_FILE, 'a', encoding='utf-8') as f: with open(LOG_FILE, 'a', encoding='utf-8') as f:
f.write("--------------------------------------------------\n") f.write(f"[{datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}] {role}: {prompt[:100]}... (Log: {log_file})\n")
f.write(f"TIMESTAMP: {timestamp}\n")
f.write(f"TIER: {role}\n") return log_file
f.write(f"PROMPT: {prompt}\n")
f.write("--------------------------------------------------\n")
def get_dependencies(filepath): def execute_agent(role: str, prompt: str, docs: list[str]) -> str:
model = get_model_for_role(role)
# Advanced Context: Dependency skeletons for Tier 3
"""Identify top-level module imports from a Python file.""" """Identify top-level module imports from a Python file."""
try: try:
with open(filepath, 'r', encoding='utf-8') as f: with open(filepath, 'r', encoding='utf-8') as f:
@@ -116,7 +132,6 @@ def get_dependencies(filepath):
return [] return []
def execute_agent(role: str, prompt: str, docs: list[str]) -> str: def execute_agent(role: str, prompt: str, docs: list[str]) -> str:
log_delegation(role, prompt)
model = get_model_for_role(role) model = get_model_for_role(role)
# Advanced Context: Dependency skeletons for Tier 3 # Advanced Context: Dependency skeletons for Tier 3
@@ -158,16 +173,17 @@ def execute_agent(role: str, prompt: str, docs: list[str]) -> str:
# MMA Protocol: Tier 3 and 4 are stateless. # MMA Protocol: Tier 3 and 4 are stateless.
if role in ['tier3', 'tier3-worker']: if role in ['tier3', 'tier3-worker']:
system_directive = "STRICT SYSTEM DIRECTIVE: You are a stateless Tier 3 Worker (Contributor). " \ system_directive = "STRICT SYSTEM DIRECTIVE: You are a stateless Tier 3 Worker (Contributor). " \
"Your goal is to generate high-quality code or diffs based on the provided task. " \ "Your goal is to implement specific code changes or tests based on the provided task. " \
"DO NOT USE ANY TOOLS (no write_file, no run_shell_command, etc.). " \ "You have access to tools for reading and writing files. " \
"ONLY output the clean code or the requested diff. No pleasantries, no conversational filler." "Follow TDD and return success status or code changes. No pleasantries, no conversational filler."
elif role in ['tier4', 'tier4-qa']: elif role in ['tier4', 'tier4-qa']:
system_directive = "STRICT SYSTEM DIRECTIVE: You are a stateless Tier 4 QA Agent. " \ system_directive = "STRICT SYSTEM DIRECTIVE: You are a stateless Tier 4 QA Agent. " \
"Your goal is to analyze errors, summarize logs, or verify tests. " \ "Your goal is to analyze errors, summarize logs, or verify tests. " \
"DO NOT USE ANY TOOLS. ONLY output the requested analysis. No pleasantries." "You have access to tools for reading files and exploring the codebase. " \
"ONLY output the requested analysis. No pleasantries."
else: else:
system_directive = f"STRICT SYSTEM DIRECTIVE: You are a stateless {role}. " \ system_directive = f"STRICT SYSTEM DIRECTIVE: You are a stateless {role}. " \
"DO NOT USE ANY TOOLS. ONLY output the requested text. No pleasantries." "ONLY output the requested text. No pleasantries."
command_text = f"{system_directive}\n\n{injected_context}\n\n" command_text = f"{system_directive}\n\n{injected_context}\n\n"
@@ -194,8 +210,13 @@ def execute_agent(role: str, prompt: str, docs: list[str]) -> str:
try: try:
process = subprocess.run(cmd, input=command_text, capture_output=True, text=True, encoding='utf-8') process = subprocess.run(cmd, input=command_text, capture_output=True, text=True, encoding='utf-8')
result = process.stdout
if not process.stdout and process.stderr: if not process.stdout and process.stderr:
return f"Error: {process.stderr}" result = f"Error: {process.stderr}"
# Log the attempt and result
log_file = log_delegation(role, command_text, result)
print(f"Sub-agent log created: {log_file}")
stdout = process.stdout stdout = process.stdout
start_index = stdout.find('{') start_index = stdout.find('{')
@@ -208,7 +229,9 @@ def execute_agent(role: str, prompt: str, docs: list[str]) -> str:
return stdout return stdout
return stdout return stdout
except Exception as e: except Exception as e:
return f"Execution failed: {str(e)}" err_msg = f"Execution failed: {str(e)}"
log_delegation(role, command_text, err_msg)
return err_msg
def create_parser(): def create_parser():
parser = argparse.ArgumentParser(description="MMA Execution Script") parser = argparse.ArgumentParser(description="MMA Execution Script")