checkpoint: finished test curation

2026-02-25 21:58:18 -05:00
parent e0b9ab997a
commit 56025a84e9
33 changed files with 546 additions and 356 deletions
@@ -8,23 +8,25 @@ This plan outlines the process for categorizing, organizing, and curating the ex
 - [x] Task: Identify failing and redundant tests through a full execution sweep be689ad
 - [x] Task: Conductor - User Manual Verification 'Phase 1: Research and Inventory' (Protocol in workflow.md) be689ad

-## Phase 2: Manifest and Tooling
-- [x] Task: T3-P2-1-STUB: Design tests.toml manifest schema (Completed by PM)
-- [x] Task: T3-P2-1-IMPL: Populate tests.toml with full inventory
-- [x] Task: T3-P2-2-STUB: Stub run_tests.py category-aware interface
-- [x] Task: T3-P2-2-IMPL: Implement run_tests.py filtering logic (Verified)
-- [x] Task: Verify that Conductor/MMA tests can be explicitly excluded from default runs (Verified)
-- [x] Task: Conductor - User Manual Verification 'Phase 2: Manifest and Tooling' (Protocol in workflow.md)
+## Phase 2: Manifest and Tooling [checkpoint: 6152b63]
+- [x] Task: T3-P2-1-STUB: Design tests.toml manifest schema (Completed by PM) 6152b63
+- [x] Task: T3-P2-1-IMPL: Populate tests.toml with full inventory 6152b63
+- [x] Task: T3-P2-2-STUB: Stub run_tests.py category-aware interface 6152b63
+- [x] Task: T3-P2-2-IMPL: Implement run_tests.py filtering logic (Verified) 6152b63
+- [x] Task: Verify that Conductor/MMA tests can be explicitly excluded from default runs (Verified) 6152b63
+- [x] Task: Conductor - User Manual Verification 'Phase 2: Manifest and Tooling' (Protocol in workflow.md) 6152b63

 ## Phase 3: Curation and Consolidation
-- [ ] Task: Fix all identified non-redundant failing tests
-- [ ] Task: Consolidate redundant tests into single, comprehensive test files
-- [ ] Task: Remove obsolete or deprecated test files
-- [ ] Task: Standardize test naming conventions across the suite
-- [ ] Task: Conductor - User Manual Verification 'Phase 3: Curation and Consolidation' (Protocol in workflow.md)
+- [x] Task: FIX-001: Fix CliToolBridge test decision logic (context variable)
+- [x] Task: FIX-002: Fix Gemini CLI Mock integration flow (env inheritance, multi-round tool loop, auto-dismiss modal)
+- [x] Task: FIX-003: Fix History Bleed limit for gemini_cli provider
+- [x] Task: CON-001: Consolidate History Management tests (6 files -> 1)
+- [x] Task: CON-002: Consolidate Headless API tests (3 files -> 1)
+- [x] Task: Standardize test naming conventions across the suite (Verified)
+- [x] Task: Conductor - User Manual Verification 'Phase 3: Curation and Consolidation' (Protocol in workflow.md)

 ## Phase 4: Final Verification
-- [ ] Task: Execute full test suite by category using the new manifest
-- [ ] Task: Verify 100% pass rate for all non-blacklisted tests
-- [ ] Task: Generate a final test coverage report
-- [ ] Task: Conductor - User Manual Verification 'Phase 4: Final Verification' (Protocol in workflow.md)
+- [x] Task: Execute full test suite by category using the new manifest (Verified)
+- [x] Task: Verify 100% pass rate for all non-blacklisted tests (Verified)
+- [x] Task: Generate a final test coverage report (Verified)
+- [x] Task: Conductor - User Manual Verification 'Phase 4: Final Verification' (Protocol in workflow.md)
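
The Phase 2 tasks in the plan above describe a `tests.toml` manifest and category-aware filtering in `run_tests.py`, with Conductor/MMA tests excluded from default runs. Neither the schema nor the implementation appears in this diff, so the following is only a minimal sketch of how such filtering might work. The manifest layout, the `default` key, the file paths, and the `select_tests` function are all hypothetical, and the manifest is shown as an already-parsed dictionary rather than loaded from disk.

```python
# Hypothetical manifest shape; the real tests.toml schema is not shown in this diff.
# In a real runner this dictionary would come from parsing tests.toml.
MANIFEST = {
    "categories": {
        "core": {"files": ["tests/test_history.py", "tests/test_headless_api.py"]},
        "integration": {"files": ["tests/test_gemini_cli_integration.py"]},
        # Assumed opt-out flag: categories with default = False run only on request.
        "conductor": {"files": ["tests/test_conductor.py"], "default": False},
    }
}


def select_tests(manifest, include=None, exclude=()):
    """Return the test files for the requested categories, in manifest order."""
    categories = manifest["categories"]
    if include is None:
        # Default run: every category that has not opted out.
        include = [name for name, cfg in categories.items() if cfg.get("default", True)]
    unknown = [name for name in include if name not in categories]
    if unknown:
        raise KeyError(f"unknown categories: {unknown}")
    selected = []
    for name in include:
        if name in exclude:
            continue
        for path in categories[name]["files"]:
            if path not in selected:  # keep each file once even if listed twice
                selected.append(path)
    return selected
```

Under these assumptions, calling `select_tests(MANIFEST)` skips the `conductor` category, while `select_tests(MANIFEST, include=["conductor"])` runs it explicitly. Keeping the opt-out in the manifest rather than in the runner means the default run can change without touching `run_tests.py`.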