docs(progress): update tier status after user re-ran tests
Tier status update from the user's test run on 2026-06-26 ~22:30 UTC:
- 5/11 → 6/11 tiers PASS (tier-2-mock-app-gui now passes)
- The 2 critical regression fixes from commit 50cf9096 verified working:
* test_push_mma_state_update now PASSES (was 'dict object has no attribute id')
* test_live_gui_health_endpoint_returns_healthy now PASSES (was UnboundLocalError ws)
- New tier-3-live_gui failure: test_auto_switch_sim (pre-existing, surfaced
after live_gui_health was unblocked)
- 5 remaining tiers all fail on pre-existing issues unrelated to de-cruft work
This commit is contained in:
@@ -10,7 +10,7 @@
|
||||
|
||||
| Metric | Before this session | After |
|
||||
|---|---|---|
|
||||
| Test tiers PASSING | 0/11 (all blocked by circular ImportError) | **5/11 confirmed PASS** + tier-1-unit-gui newly PASS (101/101 tests) |
|
||||
| Test tiers PASSING | 0/11 (all blocked by circular ImportError) | **6/11 PASS** + tier-1-unit-gui newly PASS (101/101 tests) |
|
||||
| `uv run sloppy.py` GUI | Broken (UnboundLocalError → app degraded) | **Working** (health endpoint returns `healthy=True`) |
|
||||
| Forward-fix commits | 2 (already in branch: `63336b3e`, `de9dd3c1`) | **5** (added `592d0e0c`, `ee763eea`, `50cf9096`, `b15955c8`) |
|
||||
|
||||
@@ -71,12 +71,36 @@ theme.render_post_fx(ws.x, ws.y, ...)
|
||||
| tier-1-unit-mma | FAIL (1) | `test_rejection_prevents_dispatch` — pre-existing per prior test runs (`_confirm_and_run` returns `''` not `None`) |
|
||||
| tier-2-mock_app-comms | PASS | ✅ |
|
||||
| tier-2-mock_app-core | PASS | ✅ |
|
||||
| tier-2-mock_app-gui | FAIL (1) | `test_push_mma_state_update` — **FIXED in commit 50cf9096**; if it still fails it's a regression of my fix |
|
||||
| tier-2-mock_app-gui | **PASS** | ✅ `test_push_mma_state_update` regression fix in commit `50cf9096` worked |
|
||||
| tier-2-mock_app-headless | FAIL (3) | `test_generate_endpoint`, `test_get_context_endpoint`, `test_status_endpoint_authorized` — pre-existing FastAPI response shape (the `_api_*` handlers haven't migrated to direct imports of `Metadata` vs dicts) |
|
||||
| tier-2-mock_app-mma | PASS | ✅ |
|
||||
| tier-3-live_gui | FAIL (1) | `test_live_gui_health_endpoint_returns_healthy` — **FIXED in commit 50cf9096** (was the `ws` UnboundLocalError); if it still fails it's the same regression |
|
||||
| tier-3-live_gui | FAIL (1) | `test_auto_switch_sim` (NEW failure after `test_live_gui_health_endpoint_returns_healthy` was FIXED — see note below) |
|
||||
|
||||
**Total: 6 failing tiers, 4 of those are pre-existing issues unrelated to the de-cruft track.**
|
||||
**Total: 5 failing tiers, all are pre-existing issues unrelated to the de-cruft track.**
|
||||
|
||||
### Latest test-suite run (after the user re-ran `uv run .\scripts\run_tests_batched.py` on 2026-06-26 ~22:30 UTC)
|
||||
|
||||
The 2 critical regression fixes from `50cf9096` both work — the test failures they were addressing now pass:
|
||||
- `tier-1-unit-core`: `test_push_mma_state_update` now PASSES (was failing on `'dict' object has no attribute 'id'`)
|
||||
- `tier-3-live_gui`: `test_live_gui_health_endpoint_returns_healthy` now PASSES (was failing on `UnboundLocalError: ws`)
|
||||
|
||||
A new (different) `tier-3-live_gui` failure surfaced: `test_auto_switch_sim` — a pre-existing test that wasn't reached before because live_gui_health failed first. The Tier 1 followup should address this.
|
||||
|
||||
### Pattern in the remaining 5 failures
|
||||
|
||||
All 5 remaining tier failures are pre-existing issues NOT introduced by the post-de-cruft work. None are regressions from commits `592d0e0c`, `ee763eea`, `50cf9096`, or `b15955c8`:
|
||||
|
||||
| # | Test | Pre-existing root cause |
|
||||
|---|---|---|
|
||||
| 1 | `test_keyboard_shortcut_check_in_gui_func` | `bg_shader.py` deleted in `module_taxonomy_refactor` Phase 1.1 — test still patches `src.gui_2.bg_shader` |
|
||||
| 2 | `test_save_preset_project_no_root` | `presets.py:124` writes to `.` outside `./tests/` — `test_sandbox` correctly blocks it; test needs `tmp_path` |
|
||||
| 3 | `test_audit_script_exits_zero` | `audit_main_thread_imports.py` returns RC 1 — likely a heavy top-level import snuck back in |
|
||||
| 4 | `test_handle_request_event_appends_definitions` | `_symbol_resolution_result` gets dict `file_items` that production normalizes elsewhere; test data shape mismatch |
|
||||
| 5 | `test_rejection_prevents_dispatch` | `_confirm_and_run` returns `''` (empty string) instead of `None` — pre-existing per prior runs |
|
||||
| 6 | `test_generate_endpoint`, `test_get_context_endpoint`, `test_status_endpoint_authorized` | `_api_*` FastAPI handlers return old dict-shape responses (with `'paths'`, `'project'`, etc.); tests expect new shape with `'provider'`, `'discussion'`, etc. |
|
||||
| 7 | `test_auto_switch_sim` | Workspace profile auto-switch logic isn't loading the bound profile when `mma_state_update` fires |
|
||||
|
||||
**7 distinct pre-existing issues across 5 tiers. None are regressions from the de-cruft work.**
|
||||
|
||||
---
|
||||
|
||||
@@ -109,7 +133,7 @@ Includes `src/commands.py`, `src/mcp_client.py`, `src/models.py`, `src/multi_age
|
||||
|
||||
## What's left for the Tier 1 followup track
|
||||
|
||||
Based on the test results, these 4 tiers still fail. Most failures are pre-existing and not regressions from the de-cruft work — Tier 1 should decide scope:
|
||||
Based on the test results, these 5 tiers still fail. **All 7 distinct failures are pre-existing issues** — none are regressions from the de-cruft work. Tier 1 should decide scope:
|
||||
|
||||
### Pre-existing issues (NOT introduced by this work)
|
||||
|
||||
@@ -125,11 +149,12 @@ Based on the test results, these 4 tiers still fail. Most failures are pre-exist
|
||||
|
||||
6. **`tests/test_headless_service.py`** (3 fails) — FastAPI `_api_*` handlers return old dict-shape responses (with `'paths'`, `'project'`, etc.) but tests expect new shape with `'provider'`, `'discussion'`, etc. This is a Pre-de-cruft response shape mismatch.
|
||||
|
||||
### Tests that should pass after the regression fixes
|
||||
7. **`tests/test_auto_switch_sim.py::test_auto_switch_sim`** — Workspace profile auto-switch logic isn't loading the bound profile when `mma_state_update` fires. New failure surfaced after `test_live_gui_health_endpoint_returns_healthy` was fixed.
|
||||
|
||||
7. **`tests/test_push_mma_state_update`** — FIXED in 50cf9096. If still failing it's a regression.
|
||||
### Tests that should pass after the regression fixes (verified PASS)
|
||||
|
||||
8. **`tests/test_api_hooks_gui_health_live.py::test_live_gui_health_endpoint_returns_healthy`** — FIXED in 50cf9096. If still failing it's the same `ws` UnboundLocalError regression.
|
||||
- `tests/test_push_mma_state_update` — **PASSES** ✅ (commit `50cf9096`)
|
||||
- `tests/test_api_hooks_gui_health_live.py::test_live_gui_health_endpoint_returns_healthy` — **PASSES** ✅ (commit `50cf9096`)
|
||||
|
||||
### Recommended Tier 1 followup scope
|
||||
|
||||
@@ -140,6 +165,7 @@ A short "tier 2 cleanup of remaining cruft" track that addresses:
|
||||
- **Pre-existing issue 3:** fix `test_save_preset_project_no_root` to use `tmp_path` (~5-line test patch)
|
||||
- **Pre-existing issues 4, 5:** small test patches
|
||||
- **Pre-existing issue 6:** migrate `_api_*` FastAPI handlers to return typed `Metadata` responses (~6 functions in `app_controller.py`)
|
||||
- **Pre-existing issue 7:** investigate auto-switch logic in `gui_2.py:_auto_switch_layout_if_bound` or similar
|
||||
- **End-of-track TRACK_COMPLETION update:** the previous track report (`e4f652a7`) had a line count discrepancy (38 vs 30) and Phase 4 PATCH note — verify it's accurate for the post-ship state.
|
||||
|
||||
---
|
||||
|
||||
Reference in New Issue
Block a user