Fixes to mma and conductor.
@@ -30,12 +30,14 @@ All tasks follow a strict lifecycle:
 - **Minimize Token Burn:** Only use `read_file` with `start_line`/`end_line` for specific implementation details once target areas are identified.
 4. **Write Failing Tests (Red Phase):**
 - **Pre-Delegation Checkpoint:** Before spawning a worker for dangerous or non-trivial changes, ensure your current progress is staged (`git add .`) or committed. This prevents losing iterations if a sub-agent incorrectly uses `git restore`.
+- **Code Style:** ALWAYS explicitly mention "Use exactly 1-space indentation for Python code" when prompting a sub-agent.
 - **Delegate Test Creation:** Do NOT write test code directly. Spawn a Tier 3 Worker (`python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`) with a prompt to create the necessary test files and unit tests based on the task criteria. (If repeating due to failures, pass `--failure-count X` to switch to a more capable model).
 - Take the code generated by the Worker and apply it.
 - **CRITICAL:** Run the tests and confirm that they fail as expected. This is the "Red" phase of TDD. Do not proceed until you have failing tests.
 
 4. **Implement to Pass Tests (Green Phase):**
 - **Pre-Delegation Checkpoint:** Ensure current progress is staged or committed before delegating.
+- **Code Style:** ALWAYS explicitly mention "Use exactly 1-space indentation for Python code" when prompting a sub-agent.
 - **Delegate Implementation:** Do NOT write the implementation code directly. Spawn a Tier 3 Worker (`python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`) with a highly specific prompt to write the minimum amount of application code necessary to make the failing tests pass. (If repeating due to failures, pass `--failure-count X` to switch to a more capable model).
 - Take the code generated by the Worker and apply it.
 - Run the test suite again and confirm that all tests now pass. This is the "Green" phase.
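Review note: the Red and Green steps in this hunk both ask the orchestrator to remember the indentation reminder and the optional `--failure-count` flag by hand. A minimal sketch of a helper that assembles the worker command in one place might look like the following. The helper name `build_worker_command`, the constant `STYLE_RULE`, and the position of `--failure-count` relative to the prompt are assumptions for illustration, not part of this commit. The sketch itself uses 1-space indentation to match the rule it enforces.

```python
# Hypothetical helper (not in the commit): builds the argv for a Tier 3 Worker call.
STYLE_RULE = "Use exactly 1-space indentation for Python code."


def build_worker_command(task: str, failure_count: int = 0) -> list[str]:
 # Always append the style reminder so it cannot be forgotten in a prompt.
 prompt = f"{task} {STYLE_RULE}"
 cmd = ["python", "scripts/mma_exec.py", "--role", "tier3-worker"]
 # Only pass the escalation flag after at least one recorded failure.
 if failure_count > 0:
  cmd += ["--failure-count", str(failure_count)]
 cmd.append(prompt)
 return cmd
```

The returned list can be handed to `subprocess.run` as-is, which avoids shell quoting problems with long prompts.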
@@ -16,12 +16,13 @@ To ensure proper environment handling and logging, you MUST NOT call the `gemini
 ## 1. The Tier 3 Worker (Execution)
 When performing code modifications or implementing specific requirements:
 1. **Pre-Delegation Checkpoint:** For dangerous or non-trivial changes, ALWAYS stage your changes (`git add .`) or commit before delegating to a Tier 3 Worker. If the worker fails or runs `git restore`, you will lose all prior AI iterations for that file if it wasn't staged/committed.
-2. **DO NOT** perform large code writes yourself.
-3. **DO** construct a single, highly specific prompt with a clear objective.
-4. **DO** spawn a Tier 3 Worker.
-*Command:* `uv run python scripts/mma_exec.py --role tier3-worker "Implement [SPECIFIC_INSTRUCTION] in [FILE_PATH]. Follow TDD and return success status or code changes."`
-5. **Handling Repeated Failures:** If a Tier 3 Worker fails multiple times on the same task, it may lack the necessary capability. You must track failures and retry with `--failure-count <N>` (e.g., `--failure-count 2`). This tells `mma_exec.py` to escalate the sub-agent to a more powerful reasoning model (like `gemini-3-flash`).
-6. The Tier 3 Worker is stateless and has tool access for file I/O.
+2. **Code Style Enforcement:** You MUST explicitly remind the worker to "use exactly 1-space indentation for Python code" in your prompt to prevent them from breaking the established codebase style.
+3. **DO NOT** perform large code writes yourself.
+4. **DO** construct a single, highly specific prompt with a clear objective.
+5. **DO** spawn a Tier 3 Worker.
+*Command:* `uv run python scripts/mma_exec.py --role tier3-worker "Implement [SPECIFIC_INSTRUCTION] in [FILE_PATH]. Use 1-space indentation. Follow TDD and return success status or code changes."`
+6. **Handling Repeated Failures:** If a Tier 3 Worker fails multiple times on the same task, it may lack the necessary capability. You must track failures and retry with `--failure-count <N>` (e.g., `--failure-count 2`). This tells `mma_exec.py` to escalate the sub-agent to a more powerful reasoning model (like `gemini-3-flash`).
+7. The Tier 3 Worker is stateless and has tool access for file I/O.
 
 ## 2. The Tier 4 QA Agent (Diagnostics)
 If you run a test or command that fails with a significant error or large traceback:
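Review note: the "Handling Repeated Failures" step asks the caller to track failures and retry with `--failure-count <N>`. A minimal sketch of that bookkeeping is below, assuming a hypothetical callable `run_worker(failure_count)` that wraps the `mma_exec.py` invocation and returns `True` on success. Neither `run_with_escalation` nor `run_worker` exists in this commit, and the cap of three attempts is an arbitrary choice for the example.

```python
from typing import Callable


def run_with_escalation(run_worker: Callable[[int], bool], max_attempts: int = 3) -> int:
 # Each attempt passes the number of failures seen so far, so the first
 # call uses 0 and later calls let the worker escalate to a stronger model.
 for failure_count in range(max_attempts):
  if run_worker(failure_count):
   return failure_count
 # Every attempt failed; the caller decides what to do next.
 return -1
```

The return value is the failure count at which the worker succeeded, or `-1` if it never did, which keeps the retry policy separate from how the worker is actually spawned.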
@@ -126,8 +126,42 @@ def get_dependencies(filepath: str) -> list[str]:
   print(f"Error getting dependencies for {filepath}: {e}")
   return []
 
+import os
+import subprocess
+import json
+
+# Mock Response Definitions
+MOCK_PLANNING_RESPONSE = {
+ "status": "success",
+ "message": "Mock response for planning task.",
+ "data": {
+  "task_type": "planning",
+  "details": "Mocked plan generated."
+ }
+}
+
+MOCK_GENERIC_RESPONSE = {
+ "status": "success",
+ "message": "Mock response from the agent.",
+ "data": {
+  "task_type": "generic_mock",
+  "details": "This is a generic mock response."
+ }
+}
+
+
 def execute_agent(role: str, prompt: str, docs: list[str], debug: bool = False, failure_count: int = 0) -> str:
  model = get_model_for_role(role, failure_count)
+
+ # --- NEW MOCK HANDLING LOGIC ---
+ if model == 'mock':
+  # The 'prompt' argument here represents the user's task/command text.
+  if "Epic Initialization" in prompt or "Sprint Planning" in prompt:
+   return json.dumps(MOCK_PLANNING_RESPONSE)
+  else:
+   return json.dumps(MOCK_GENERIC_RESPONSE)
+ # --- END NEW MOCK HANDLING LOGIC ---
+
  # Advanced Context: Dependency skeletons for Tier 3
  injected_context = ""
  # Whitelist of modules that sub-agents have "unfettered" (full) access to.
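Review note: the mock branch added to `execute_agent` selects a canned response by substring match on the prompt. A standalone sketch of the same dispatch, pulled into a function that can be unit tested without spawning an agent, could look like this. The function name `mock_response` and the tuple `PLANNING_MARKERS` are assumptions introduced for the example; the two response dictionaries are copied from the hunk above.

```python
import json

MOCK_PLANNING_RESPONSE = {
 "status": "success",
 "message": "Mock response for planning task.",
 "data": {"task_type": "planning", "details": "Mocked plan generated."},
}

MOCK_GENERIC_RESPONSE = {
 "status": "success",
 "message": "Mock response from the agent.",
 "data": {"task_type": "generic_mock", "details": "This is a generic mock response."},
}

# Hypothetical constant: the two phrases the commit checks for in the prompt.
PLANNING_MARKERS = ("Epic Initialization", "Sprint Planning")


def mock_response(prompt: str) -> str:
 # Planning prompts get the planning payload; everything else is generic.
 is_planning = any(marker in prompt for marker in PLANNING_MARKERS)
 payload = MOCK_PLANNING_RESPONSE if is_planning else MOCK_GENERIC_RESPONSE
 return json.dumps(payload)
```

Note that the match is case-sensitive, as in the commit, so a prompt containing "sprint planning" in lowercase would fall through to the generic response.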
@@ -163,6 +197,7 @@ def execute_agent(role: str, prompt: str, docs: list[str], debug: bool = False,
  if role in ['tier3', 'tier3-worker']:
   system_directive = "STRICT SYSTEM DIRECTIVE: You are a stateless Tier 3 Worker (Contributor). " \
    "Your goal is to implement specific code changes or tests based on the provided task. " \
+   "CRITICAL CODE STYLE RULE: ALL Python code MUST use exactly 1 SPACE for indentation. DO NOT use 4 spaces or tabs. " \
    "You have access to tools for reading and writing files (e.g., read_file, write_file, replace), " \
    "codebase investigation (discovered_tool_py_get_code_outline, discovered_tool_py_get_skeleton, discovered_tool_py_find_usages, discovered_tool_py_get_imports, discovered_tool_py_check_syntax, discovered_tool_get_tree), " \
    "version control (discovered_tool_get_git_diff), and web tools (discovered_tool_web_search, discovered_tool_fetch_url). " \