Fixes to mma and conductor.

2026-02-28 21:59:28 -05:00
parent ed56e56a2c
commit cb0e14e1c0
3 changed files with 44 additions and 6 deletions
@@ -30,12 +30,14 @@ All tasks follow a strict lifecycle:
   - **Minimize Token Burn:** Only use `read_file` with `start_line`/`end_line` for specific implementation details once target areas are identified.
 4. **Write Failing Tests (Red Phase):**
   - **Pre-Delegation Checkpoint:** Before spawning a worker for dangerous or non-trivial changes, ensure your current progress is staged (`git add .`) or committed. This prevents losing iterations if a sub-agent incorrectly uses `git restore`.
   - **Code Style:** ALWAYS explicitly mention "Use exactly 1-space indentation for Python code" when prompting a sub-agent.
   - **Delegate Test Creation:** Do NOT write test code directly. Spawn a Tier 3 Worker (`python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`) with a prompt to create the necessary test files and unit tests based on the task criteria. (If repeating due to failures, pass `--failure-count X` to switch to a more capable model).
   - Take the code generated by the Worker and apply it.
   - **CRITICAL:** Run the tests and confirm that they fail as expected. This is the "Red" phase of TDD. Do not proceed until you have failing tests.
 4. **Implement to Pass Tests (Green Phase):**
   - **Pre-Delegation Checkpoint:** Ensure current progress is staged or committed before delegating.
   - **Code Style:** ALWAYS explicitly mention "Use exactly 1-space indentation for Python code" when prompting a sub-agent.
   - **Delegate Implementation:** Do NOT write the implementation code directly. Spawn a Tier 3 Worker (`python scripts/mma_exec.py --role tier3-worker "[PROMPT]"`) with a highly specific prompt to write the minimum amount of application code necessary to make the failing tests pass. (If repeating due to failures, pass `--failure-count X` to switch to a more capable model).
   - Take the code generated by the Worker and apply it.
   - Run the test suite again and confirm that all tests now pass. This is the "Green" phase.
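
The "Code Style" bullets above rely on the sub-agent honoring the prompt. A minimal sketch of a check that could run before applying a worker's code follows. `uses_one_space_indent` is a hypothetical helper and not part of the repository. It is written with 1-space indentation to match the stated convention.

```python
import io
import tokenize


def uses_one_space_indent(source: str) -> bool:
 """Return True if every indentation level in `source` adds exactly one space."""
 # Stack of indentation widths for the currently open blocks.
 widths = [0]
 for tok in tokenize.generate_tokens(io.StringIO(source).readline):
  if tok.type == tokenize.INDENT:
   # tok.string holds the full leading whitespace of the indented line.
   if "\t" in tok.string or len(tok.string) != widths[-1] + 1:
    return False
   widths.append(len(tok.string))
  elif tok.type == tokenize.DEDENT:
   widths.pop()
 return True
```

Continuation lines inside brackets produce no `INDENT` tokens, so they are not checked. Source that cannot be tokenized raises an exception instead of returning `False`.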
@@ -16,12 +16,13 @@ To ensure proper environment handling and logging, you MUST NOT call the `gemini
 ## 1. The Tier 3 Worker (Execution)
 When performing code modifications or implementing specific requirements:
 1. **Pre-Delegation Checkpoint:** For dangerous or non-trivial changes, ALWAYS stage your changes (`git add .`) or commit before delegating to a Tier 3 Worker. If the worker fails or runs `git restore`, you will lose all prior AI iterations for that file if it wasn't staged/committed.
-2. **DO NOT** perform large code writes yourself.
-3. **DO** construct a single, highly specific prompt with a clear objective.
-4. **DO** spawn a Tier 3 Worker.
-   *Command:* `uv run python scripts/mma_exec.py --role tier3-worker "Implement [SPECIFIC_INSTRUCTION] in [FILE_PATH]. Follow TDD and return success status or code changes."`
-5. **Handling Repeated Failures:** If a Tier 3 Worker fails multiple times on the same task, it may lack the necessary capability. You must track failures and retry with `--failure-count <N>` (e.g., `--failure-count 2`). This tells `mma_exec.py` to escalate the sub-agent to a more powerful reasoning model (like `gemini-3-flash`).
-6. The Tier 3 Worker is stateless and has tool access for file I/O.
+2. **Code Style Enforcement:** You MUST explicitly remind the worker to "use exactly 1-space indentation for Python code" in your prompt to prevent them from breaking the established codebase style.
+3. **DO NOT** perform large code writes yourself.
+4. **DO** construct a single, highly specific prompt with a clear objective.
+5. **DO** spawn a Tier 3 Worker.
+   *Command:* `uv run python scripts/mma_exec.py --role tier3-worker "Implement [SPECIFIC_INSTRUCTION] in [FILE_PATH]. Use 1-space indentation. Follow TDD and return success status or code changes."`
+6. **Handling Repeated Failures:** If a Tier 3 Worker fails multiple times on the same task, it may lack the necessary capability. You must track failures and retry with `--failure-count <N>` (e.g., `--failure-count 2`). This tells `mma_exec.py` to escalate the sub-agent to a more powerful reasoning model (like `gemini-3-flash`).
+7. The Tier 3 Worker is stateless and has tool access for file I/O.
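
The spawn and retry steps in the list above could be sketched roughly as follows. `build_worker_command`, `run_with_escalation`, and `STYLE_REMINDER` are hypothetical names, and placing `--failure-count` before the prompt is an assumption about how `mma_exec.py` parses its arguments. The worker call is injected as a callable so the escalation logic can be exercised without spawning a real sub-agent.

```python
from typing import Callable

STYLE_REMINDER = "Use 1-space indentation."


def build_worker_command(instruction: str, failure_count: int = 0) -> list[str]:
 """Build the argv list for one Tier 3 Worker call (hypothetical helper)."""
 cmd = ["uv", "run", "python", "scripts/mma_exec.py", "--role", "tier3-worker"]
 if failure_count > 0:
  # Assumption: the flag may precede the prompt on the command line.
  cmd += ["--failure-count", str(failure_count)]
 # Always append the style reminder so the prompt never omits it.
 cmd.append(f"{instruction} {STYLE_REMINDER}")
 return cmd


def run_with_escalation(
 run_worker: Callable[[list[str]], bool],
 instruction: str,
 max_attempts: int = 3,
) -> int:
 """Retry until run_worker reports success.

 Returns the failure count that was passed on the successful attempt,
 or -1 if every attempt failed.
 """
 for failure_count in range(max_attempts):
  if run_worker(build_worker_command(instruction, failure_count)):
   return failure_count
 return -1
```

In practice `run_worker` might wrap `subprocess.run(cmd).returncode == 0`, assuming `mma_exec.py` signals failure through its exit code. The diff does not confirm that it does.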
 ## 2. The Tier 4 QA Agent (Diagnostics)
 If you run a test or command that fails with a significant error or large traceback:
@@ -126,8 +126,42 @@ def get_dependencies(filepath: str) -> list[str]:
  print(f"Error getting dependencies for {filepath}: {e}")
  return []
 import os
 import subprocess
 import json
 # Mock Response Definitions
 MOCK_PLANNING_RESPONSE = {
 "status": "success",
 "message": "Mock response for planning task.",
 "data": {
  "task_type": "planning",
  "details": "Mocked plan generated."
 }
 }
 MOCK_GENERIC_RESPONSE = {
 "status": "success",
 "message": "Mock response from the agent.",
 "data": {
  "task_type": "generic_mock",
  "details": "This is a generic mock response."
 }
 }
 def execute_agent(role: str, prompt: str, docs: list[str], debug: bool = False, failure_count: int = 0) -> str:
 model = get_model_for_role(role, failure_count)
 # --- NEW MOCK HANDLING LOGIC ---
 if model == 'mock':
  # The 'prompt' argument here represents the user's task/command text.
  if "Epic Initialization" in prompt or "Sprint Planning" in prompt:
   return json.dumps(MOCK_PLANNING_RESPONSE)
  else:
   return json.dumps(MOCK_GENERIC_RESPONSE)
 # --- END NEW MOCK HANDLING LOGIC ---
 # Advanced Context: Dependency skeletons for Tier 3
 injected_context = ""
 # Whitelist of modules that sub-agents have "unfettered" (full) access to.
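
The mock branch added in this hunk can be exercised in isolation. The sketch below copies the two response dictionaries from the diff and lifts the prompt check into a standalone function. `mock_response_for` is a hypothetical name used only for illustration; in the commit the logic lives inline in `execute_agent`.

```python
import json

MOCK_PLANNING_RESPONSE = {
 "status": "success",
 "message": "Mock response for planning task.",
 "data": {"task_type": "planning", "details": "Mocked plan generated."},
}

MOCK_GENERIC_RESPONSE = {
 "status": "success",
 "message": "Mock response from the agent.",
 "data": {"task_type": "generic_mock", "details": "This is a generic mock response."},
}


def mock_response_for(prompt: str) -> str:
 """Return the JSON string the mock model would produce for `prompt`."""
 # Same substring test as the diff: planning prompts get the planning payload.
 if "Epic Initialization" in prompt or "Sprint Planning" in prompt:
  return json.dumps(MOCK_PLANNING_RESPONSE)
 return json.dumps(MOCK_GENERIC_RESPONSE)
```

The match is a case-sensitive substring test, so a prompt containing "sprint planning" in lowercase would fall through to the generic response.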
@@ -163,6 +197,7 @@ def execute_agent(role: str, prompt: str, docs: list[str], debug: bool = False,
 if role in ['tier3', 'tier3-worker']:
  system_directive = "STRICT SYSTEM DIRECTIVE: You are a stateless Tier 3 Worker (Contributor). " \
  "Your goal is to implement specific code changes or tests based on the provided task. " \
  "CRITICAL CODE STYLE RULE: ALL Python code MUST use exactly 1 SPACE for indentation. DO NOT use 4 spaces or tabs. " \
  "You have access to tools for reading and writing files (e.g., read_file, write_file, replace), " \
  "codebase investigation (discovered_tool_py_get_code_outline, discovered_tool_py_get_skeleton, discovered_tool_py_find_usages, discovered_tool_py_get_imports, discovered_tool_py_check_syntax, discovered_tool_get_tree), " \
  "version control (discovered_tool_get_git_diff), and web tools (discovered_tool_web_search, discovered_tool_fetch_url). " \