# Track Specification: Advanced Tier 4 QA Auto-Patching (tier4_auto_patching_20260306)

## Overview

Elevate Tier 4 from log summarizer to auto-patcher. When verification tests fail, Tier 4 generates a unified diff patch. GUI displays side-by-side diff; user clicks Apply Patch to resume pipeline.

## Current State Audit

### Already Implemented (DO NOT re-implement)

#### Tier 4 Analysis (ai_client.py)

- **`run_tier4_analysis(stderr: str) -> str`**: Analyzes error, returns summary
- **Prompt**: Uses `mma_prompts.PROMPTS["tier4_error_triage"]`
- **Output**: Text analysis, no code generation

#### Error Interception (shell_runner.py)

- **`run_powershell()`**: Accepts `qa_callback` parameter
- **On failure**: Calls `qa_callback(stderr)` and appends to output
- **Integrated**: `ai_client._run_script()` passes `qa_callback`

#### MCP Tools (mcp_client.py)

- **`set_file_slice()`**: Replace line range in file
- **`py_update_definition()`**: Replace class/function via AST
- **`edit_file()`**: String replacement in file
- **No diff generation or patch application**

### Gaps to Fill (This Track's Scope)

- Tier 4 doesn't generate patches
- No diff visualization in GUI
- No patch application mechanism
- No rollback capability

## Architectural Constraints

### Safe Preview

- Patches MUST be previewed before application
- User MUST see exactly what will change
- No automatic application without approval

### Atomic Application

- Patch applies all changes or none
- If partial application fails, rollback

### Rollback Support

- Backup created before patch
- User can undo applied patch
- Backup stored temporarily

## Architecture Reference

### Key Integration Points

| File | Lines | Purpose |
|------|-------|---------|
| `src/ai_client.py` | ~700-750 | `run_tier4_analysis()` |
| `src/shell_runner.py` | 50-100 | `run_powershell()` with qa_callback |
| `src/mcp_client.py` | 300-350 | `set_file_slice()`, `edit_file()` |
| `src/gui_2.py` | 2400-2500 | Confirmation dialogs pattern |

### Proposed Patch Workflow

```
1. Test fails → stderr captured
2. Tier 4 analyzes → generates unified diff
3. GUI shows diff viewer with Apply/Reject buttons
4. User clicks Apply:
   a. Backup original file(s)
   b. Apply patch via subprocess or difflib
   c. Verify patch applied cleanly
   d. If fails, restore from backup
5. Pipeline resumes with patched code
```

### Unified Diff Format

```diff
--- a/src/target_file.py
+++ b/src/target_file.py
@@ -10,5 +10,6 @@
 def existing_function():
- old_line
+ new_line
+ additional_line
```

## Functional Requirements

### FR1: Patch Generation

- Tier 4 prompt enhanced to generate unified diff
- Output format: standard `diff -u` format
- Include file path in diff header
- Multiple files supported

### FR2: Diff Viewer GUI

- Side-by-side or unified view
- Color-coded additions (green) and deletions (red)
- Line numbers visible
- Scrollable for large diffs

### FR3: Apply Button

- Creates backup: `file.py.backup`
- Applies patch: `patch -p1 < diff.patch` or Python difflib
- Verifies success
- Shows confirmation or error

### FR4: Rollback

- Restore from backup if patch fails
- Manual rollback button after successful patch
- Backup deleted after explicit user action

## Non-Functional Requirements

| Requirement | Constraint |
|-------------|------------|
| Patch Generation | <5s for typical errors |
| Diff Rendering | <100ms for 100-line diff |
| Backup Storage | Temp dir, cleaned on exit |

## Testing Requirements

### Unit Tests

- Test diff generation format
- Test patch application logic
- Test backup/rollback

### Integration Tests (via `live_gui` fixture)

- Trigger test failure, verify patch generated
- Apply patch, verify file changed correctly
- Rollback, verify file restored

## Out of Scope

- Automatic patch application (always requires approval)
- Patch conflict resolution (reject if conflict)
- Multi-file patch coordination

## Acceptance Criteria

- [ ] Tier 4 generates valid unified diff on test failure
- [ ] GUI displays readable side-by-side diff
- [ ] User can approve/reject patch
- [ ] Approved patches applied correctly
- [ ] Rollback available on failure
- [ ] Backup files cleaned up
- [ ] 1-space indentation maintained
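
## Sketch: Unified Diff Format (Non-Normative)

FR1 asks for standard `diff -u` output with the file path in the header. As a reference for the "Test diff generation format" unit test, the sketch below shows what that format looks like when produced with Python's standard `difflib.unified_diff`. The helper name `make_unified_diff` and the sample file contents are assumptions for illustration; in this track the diff itself is expected to come from the Tier 4 prompt, not from this function.

```python
import difflib


def make_unified_diff(old_text: str, new_text: str, rel_path: str) -> str:
    """Hypothetical helper: build a `diff -u` style patch for one file.

    The a/ and b/ prefixes match the header style shown in the
    "Unified Diff Format" section, so the patch suits `patch -p1`.
    """
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=f"a/{rel_path}",
        tofile=f"b/{rel_path}",
    )
    return "".join(diff)


# Sample input using 1-space indentation.
old_source = "def f():\n return 1\n"
new_source = "def f():\n return 2\n"
patch_text = make_unified_diff(old_source, new_source, "src/target_file.py")
print(patch_text)
```

A format test could compare the header lines and hunk markers of a Tier 4 response against output of this shape.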
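
## Sketch: Diff Line Classification for the Viewer (Non-Normative)

FR2 calls for color-coded additions and deletions. One possible first step, independent of the GUI toolkit, is to tag each line of the patch with its kind and let the renderer map kinds to colors. The function name `classify_diff_lines` and the kind labels are assumptions, and the prefix check is a simplification: a removed line whose content itself begins with `-- ` would be mistaken for a header, so a stricter parser would track hunk line counts instead.

```python
def classify_diff_lines(patch_text: str) -> list[tuple[str, str]]:
    """Hypothetical helper: tag each patch line as header/hunk/add/del/context."""
    tagged = []
    for line in patch_text.splitlines():
        # File headers must be checked before the generic +/- prefixes.
        if line.startswith(("--- ", "+++ ")):
            kind = "header"
        elif line.startswith("@@"):
            kind = "hunk"
        elif line.startswith("+"):
            kind = "add"      # rendered green per FR2
        elif line.startswith("-"):
            kind = "del"      # rendered red per FR2
        else:
            kind = "context"
        tagged.append((kind, line))
    return tagged


sample_patch = (
    "--- a/src/target_file.py\n"
    "+++ b/src/target_file.py\n"
    "@@ -1,2 +1,2 @@\n"
    " def f():\n"
    "- return 1\n"
    "+ return 2\n"
)
for kind, line in classify_diff_lines(sample_patch):
    print(f"{kind:8} {line}")
```

Keeping classification separate from rendering would let the unit tests cover it without the `live_gui` fixture.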
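
## Sketch: Backup, Apply, Rollback (Non-Normative)

The Apply step in the proposed workflow (backup, apply, verify, restore on failure) and the manual rollback in FR4 can be sketched for a single file as follows. This is a minimal sketch under stated assumptions: `apply_with_backup` and `rollback` are hypothetical names, and `apply_fn` is a stand-in for whichever patch mechanism is chosen (`patch -p1` via subprocess or a difflib-based applier). Only the `file.py.backup` naming and the temp-dir storage come from the requirements above.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def apply_with_backup(target: Path, backup_dir: Path,
                      apply_fn: Callable[[str], str]) -> Path:
    """Back up `target`, write apply_fn(original) to it, restore on failure.

    Returns the backup path so the caller can offer a manual rollback.
    """
    backup = backup_dir / f"{target.name}.backup"
    shutil.copy2(target, backup)  # backup is taken before any write
    try:
        patched = apply_fn(target.read_text(encoding="utf-8"))
        target.write_text(patched, encoding="utf-8")
    except Exception:
        shutil.copy2(backup, target)  # leave the file as it was
        raise
    return backup


def rollback(target: Path, backup: Path) -> None:
    """Restore `target` from a backup made by apply_with_backup."""
    shutil.copy2(backup, target)


# Usage: everything lives in a temp dir that is removed on exit.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    target = work / "target_file.py"
    target.write_text("def f():\n return 1\n", encoding="utf-8")
    backup_dir = work / "backups"
    backup_dir.mkdir()

    backup = apply_with_backup(
        target, backup_dir, lambda src: src.replace("return 1", "return 2")
    )
    after_apply = target.read_text(encoding="utf-8")
    rollback(target, backup)
    after_rollback = target.read_text(encoding="utf-8")
```

The sketch covers one file only; since multi-file patch coordination is out of scope, all-or-nothing behavior here means the file is either fully patched or restored from its backup.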