Private
Public Access
0
0

docs(progress): update tier status after user re-ran tests

Tier status update from the user's test run on 2026-06-26 ~22:30 UTC:
- 5/11 → 6/11 tiers PASS (tier-2-mock-app-gui now passes)
- The 2 critical regression fixes from commit 50cf9096 verified working:
  * test_push_mma_state_update now PASSES (was 'dict object has no attribute id')
  * test_live_gui_health_endpoint_returns_healthy now PASSES (was UnboundLocalError ws)
- New tier-3-live_gui failure: test_auto_switch_sim (pre-existing, surfaced
  after live_gui_health was unblocked)
- 5 remaining tiers all fail on pre-existing issues unrelated to de-cruft work
This commit is contained in:
2026-06-26 23:24:37 -04:00
parent b2dfa34dea
commit eb2f2d49cd
@@ -10,7 +10,7 @@
| Metric | Before this session | After |
|---|---|---|
| Test tiers PASSING | 0/11 (all blocked by circular ImportError) | **5/11 confirmed PASS** + tier-1-unit-gui newly PASS (101/101 tests) |
| Test tiers PASSING | 0/11 (all blocked by circular ImportError) | **6/11 PASS** + tier-1-unit-gui newly PASS (101/101 tests) |
| `uv run sloppy.py` GUI | Broken (UnboundLocalError → app degraded) | **Working** (health endpoint returns `healthy=True`) |
| Forward-fix commits | 2 (already in branch: `63336b3e`, `de9dd3c1`) | **5** (added `592d0e0c`, `ee763eea`, `50cf9096`, `b15955c8`) |
@@ -71,12 +71,36 @@ theme.render_post_fx(ws.x, ws.y, ...)
| tier-1-unit-mma | FAIL (1) | `test_rejection_prevents_dispatch` — pre-existing per prior test runs (`_confirm_and_run` returns `''` not `None`) |
| tier-2-mock_app-comms | PASS | ✅ |
| tier-2-mock_app-core | PASS | ✅ |
| tier-2-mock_app-gui | FAIL (1) | `test_push_mma_state_update` **FIXED in commit 50cf9096**; if it still fails it's a regression of my fix |
| tier-2-mock_app-gui | **PASS** | ✅ `test_push_mma_state_update` regression fix in commit `50cf9096` worked |
| tier-2-mock_app-headless | FAIL (3) | `test_generate_endpoint`, `test_get_context_endpoint`, `test_status_endpoint_authorized` — pre-existing FastAPI response shape (the `_api_*` handlers haven't migrated to direct imports of `Metadata` vs dicts) |
| tier-2-mock_app-mma | PASS | ✅ |
| tier-3-live_gui | FAIL (1) | `test_live_gui_health_endpoint_returns_healthy` **FIXED in commit 50cf9096** (was the `ws` UnboundLocalError); if it still fails it's the same regression |
| tier-3-live_gui | FAIL (1) | `test_auto_switch_sim` (NEW failure after `test_live_gui_health_endpoint_returns_healthy` was FIXED — see note below) |
**Total: 6 failing tiers, 4 of those are pre-existing issues unrelated to the de-cruft track.**
**Total: 5 failing tiers, all are pre-existing issues unrelated to the de-cruft track.**
### Latest test-suite run (after the user re-ran `uv run .\scripts\run_tests_batched.py` on 2026-06-26 ~22:30 UTC)
The 2 critical regression fixes from `50cf9096` both work — the test failures they were addressing now pass:
- `tier-1-unit-core`: `test_push_mma_state_update` now PASSES (was failing on `'dict' object has no attribute 'id'`)
- `tier-3-live_gui`: `test_live_gui_health_endpoint_returns_healthy` now PASSES (was failing on `UnboundLocalError: ws`)
A new (different) `tier-3-live_gui` failure surfaced: `test_auto_switch_sim` — a pre-existing test that wasn't reached before because live_gui_health failed first. The Tier 1 followup should address this.
### Pattern in the remaining 5 failures
All 5 remaining tier failures are pre-existing issues NOT introduced by the post-de-cruft work. None are regressions from commits `592d0e0c`, `ee763eea`, `50cf9096`, or `b15955c8`:
| # | Test | Pre-existing root cause |
|---|---|---|
| 1 | `test_keyboard_shortcut_check_in_gui_func` | `bg_shader.py` deleted in `module_taxonomy_refactor` Phase 1.1 — test still patches `src.gui_2.bg_shader` |
| 2 | `test_save_preset_project_no_root` | `presets.py:124` writes to `.` outside `./tests/``test_sandbox` correctly blocks it; test needs `tmp_path` |
| 3 | `test_audit_script_exits_zero` | `audit_main_thread_imports.py` returns RC 1 — likely a heavy top-level import snuck back in |
| 4 | `test_handle_request_event_appends_definitions` | `_symbol_resolution_result` gets dict `file_items` that production normalizes elsewhere; test data shape mismatch |
| 5 | `test_rejection_prevents_dispatch` | `_confirm_and_run` returns `''` (empty string) instead of `None` — pre-existing per prior runs |
| 6 | `test_generate_endpoint`, `test_get_context_endpoint`, `test_status_endpoint_authorized` | `_api_*` FastAPI handlers return old dict-shape responses (with `'paths'`, `'project'`, etc.); tests expect new shape with `'provider'`, `'discussion'`, etc. |
| 7 | `test_auto_switch_sim` | Workspace profile auto-switch logic isn't loading the bound profile when `mma_state_update` fires |
**7 distinct pre-existing issues across 5 tiers. None are regressions from the de-cruft work.**
---
@@ -109,7 +133,7 @@ Includes `src/commands.py`, `src/mcp_client.py`, `src/models.py`, `src/multi_age
## What's left for the Tier 1 followup track
Based on the test results, these 4 tiers still fail. Most failures are pre-existing and not regressions from the de-cruft work Tier 1 should decide scope:
Based on the test results, these 5 tiers still fail. **All 7 distinct failures are pre-existing issues** — none are regressions from the de-cruft work. Tier 1 should decide scope:
### Pre-existing issues (NOT introduced by this work)
@@ -125,11 +149,12 @@ Based on the test results, these 4 tiers still fail. Most failures are pre-exist
6. **`tests/test_headless_service.py`** (3 fails) — FastAPI `_api_*` handlers return old dict-shape responses (with `'paths'`, `'project'`, etc.) but tests expect new shape with `'provider'`, `'discussion'`, etc. This is a Pre-de-cruft response shape mismatch.
### Tests that should pass after the regression fixes
7. **`tests/test_auto_switch_sim.py::test_auto_switch_sim`** — Workspace profile auto-switch logic isn't loading the bound profile when `mma_state_update` fires. New failure surfaced after `test_live_gui_health_endpoint_returns_healthy` was fixed.
7. **`tests/test_push_mma_state_update`** — FIXED in 50cf9096. If still failing it's a regression.
### Tests that should pass after the regression fixes (verified PASS)
8. **`tests/test_api_hooks_gui_health_live.py::test_live_gui_health_endpoint_returns_healthy`** — FIXED in 50cf9096. If still failing it's the same `ws` UnboundLocalError regression.
- `tests/test_push_mma_state_update`**PASSES** ✅ (commit `50cf9096`)
- `tests/test_api_hooks_gui_health_live.py::test_live_gui_health_endpoint_returns_healthy`**PASSES** ✅ (commit `50cf9096`)
### Recommended Tier 1 followup scope
@@ -140,6 +165,7 @@ A short "tier 2 cleanup of remaining cruft" track that addresses:
- **Pre-existing issue 3:** fix `test_save_preset_project_no_root` to use `tmp_path` (~5-line test patch)
- **Pre-existing issues 4, 5:** small test patches
- **Pre-existing issue 6:** migrate `_api_*` FastAPI handlers to return typed `Metadata` responses (~6 functions in `app_controller.py`)
- **Pre-existing issue 7:** investigate auto-switch logic in `gui_2.py:_auto_switch_layout_if_bound` or similar
- **End-of-track TRACK_COMPLETION update:** the previous track report (`e4f652a7`) had a line count discrepancy (38 vs 30) and Phase 4 PATCH note — verify it's accurate for the post-ship state.
---