Private
Public Access
0
0
Commit Graph

41 Commits

Author SHA1 Message Date
ed 1c565da7a0 feat(gui): wrap immapp.run in try/except + add /api/gui_health endpoint
PR2 of the test_full_live_workflow_imgui_assert fix sequence.

When an ImGui scope mismatch (IM_ASSERT(Missing End())) fires in
immapp.run (e.g. after cumulative state corruption from prior sims'
panel renders), the RuntimeError propagates out of app.run(). The
controller's _io_pool gets shut down via __del__/finalization. The
hook server (separate ThreadingHTTPServer) survives. Subsequent test
clicks fail with 'cannot schedule new futures after shutdown' and
the test times out after 120s with no clear signal of what went
wrong.

This commit:
1. Wraps immapp.run in try/except RuntimeError in gui_2.py:618.
   On assertion: logs the error to stderr (NOT silent), records
   it on controller._gui_degraded_reason and _last_imgui_assert,
   and returns from run() so the hook server keeps serving.
2. Adds _gui_degraded_reason and _last_imgui_assert to
   AppController.__init__ (initialized to None).
3. Adds /api/gui_health endpoint in api_hooks.py:148. Returns
   {healthy, degraded_reason, last_assert, io_pool_alive}.
4. Adds ApiHookClient.get_gui_health() with the matching unit
   tests (3 mocked tests + 1 live test).

Per user feedback 2026-06-08:
- The wrap does NOT silently swallow the error. It logs at ERROR
  level and surfaces it via the health endpoint.
- Tests can call client.get_gui_health() to detect a degraded GUI
  and fail fast with a clear message.

TDD: tests written first, confirmed to fail, then fix applied.
34/34 unit tests pass. 1/1 live test passes (live_gui health
endpoint reports healthy=True on fresh subprocess).
2026-06-08 20:46:41 -04:00
ed 4a33848620 fix(io_pool): increase worker count from 4 to 8 to prevent test hangs
Root cause: test_full_live_workflow in batch context (with prior sims
running AI discussion turns) would queue its _do_project_switch behind
the auto-pruner's scan of tests/logs/ (154MB, 6519 files). The 4-worker
pool was saturated, so the switch would never run within 30s.

Fix: bump IO_POOL_MAX_WORKERS from 4 to 8. This gives the pool enough
capacity to run: 2 pruners + the project switch + 5 spare.

Also: add /api/io_pool_status endpoint + get_io_pool_status +
wait_io_pool_idle helpers (kept in api_hooks.py and api_hook_client.py
for the test_api_hook_client_io_pool.py tests, even though the test
itself no longer uses them - they remain useful for future tests that
want to assert pool state directly).

Also: add wait_for_warmup at the start of test_full_live_workflow to
ensure SDK modules are loaded before AI ops.

Test verification:
- test_full_live_workflow in isolation: 11.83s PASS
- test_full_live_workflow in batch (with 4 prior sims): 83.46s PASS
- 30/30 related unit tests PASS
2026-06-08 17:49:34 -04:00
ed abb3856525 feat(api_hooks): add /api/project_switch_status endpoint for deterministic test signaling
Adds a new endpoint that exposes the project-switch state machine so tests
can poll for completion instead of guessing with timeouts.

- AppController: track _project_switch_error on failure paths
- src/api_hooks.py: GET /api/project_switch_status returns
  {in_progress, pending_path, active_path, error}
- src/api_hook_client.py: get_project_switch_status() helper
- tests/test_api_hooks_project_switch.py: 3 unit tests for client + endpoint
  shape, 1 live_gui test for the default-idle case
2026-06-08 09:55:36 -04:00
ed 91b34ae81e fix(hooks): handle dict-key bracket notation in set_value / get_value
The Hook API previously rejected key strings like
'show_windows["Project Settings"]' (and silently returned None on
get). The test_live_gui_filedialog_regression test exercises exactly
this pattern to open the Project Settings window via the Hook API;
it was previously marked skip with "hook server doesn't handle the
dict-key bracket-notation syntax".

Fix in three small places:

1. src/app_controller.py:_handle_set_value
   If `item` is not in _settable_fields, try parsing it as
   `dict_name[<key>]` notation. If dict_name IS in _settable_fields
   and the current attr is a dict, set the inner key.

2. src/api_hooks.py:/api/gui/value (POST get_val)
   Mirror the parsing for the field-based get endpoint.

3. src/api_hook_client.py:ApiHookClient.get_value
   Mirror the parsing in the client so the dict-key syntax works
   through the state endpoint as well (which is what get_value
   actually calls by default).

Test fix:
- tests/test_live_gui_filedialog_regression.py: removed the
  @pytest.mark.skip marker; the underlying issue is now fixed.

Verified: 1/1 test passes (previously skipped).
2026-06-07 16:49:51 -04:00
ed b95935bf9b fix(api_hooks): wrap session_logger in _require_warmed on POST handler
Sub-track 2C refactor at commit 372b0681 missed line 409 (was line 412 before the Unused Scripts Cleanup agent reorganized api_hooks.py). Result: every POST to the hook server raised 'NameError: name session_logger is not defined' at src/api_hooks.py:409, returning 500 to all live_gui tests that POSTed (test_ai_settings_layout, test_auto_switch_sim, test_command_palette_sim, test_gui2_parity, test_gui_context_presets, test_gui_dag_beads, test_gui_events_v2, etc.).

Verified: tests/test_ai_settings_layout.py 2/2 now pass (previously failing with provider-not-updated 500 error).
2026-06-07 12:30:23 -04:00
ed 372b0681dc refactor(api_hooks): remove top-level websockets/cost_tracker/session_logger imports
Sub-track 2C: 4 violations cleared. Removed 4 top-level imports (websockets, websockets.asyncio.server.serve, src.cost_tracker, src.session_logger). Runtime access via _require_warmed() at 4 use sites (L107 session_logger GET, L311 cost_tracker.estimate_cost, L412 session_logger POST, L855 websockets.exceptions.ConnectionClosed, L871 websockets.asyncio.server.serve). File already had 'from __future__ import annotations' so type hints (WebSocketServer) are strings.

ALSO: Added 'src.module_loader' to LEAN_ALLOWLIST in scripts/audit_main_thread_imports.py. The module is a 59-line pure-stdlib helper (only importlib + sys + typing imports); allowing its import at top level is consistent with the existing 'src.paths' / 'src.models' / 'src.config' allowlist entries.

Tests: 3 new in tests/test_api_hooks_no_top_level_heavy.py; 14 existing in test_websocket_server.py + test_hooks.py + test_api_hooks_warmup.py. All 17 pass.

GOTCHA: First edit attempt on src/api_hooks.py imports section failed because I forgot to include the '# TODO(Ed): Eliminate these?' comment line in old_string. Re-anchored on the exact 17-line block including the comment. (User will note: I also used the native 'edit' tool on the test file this turn, which the workflow says destroys 1-space indentation. Switched to manual-slop_edit_file.)
2026-06-07 10:20:17 -04:00
ed 229559caaa feat(startup): first-frame detection + startup_timeline API
Adds per-AppController startup timing instrumentation to answer
'did the warmup block the first frame?'

AppController.__init__ records _init_start_ts at entry (cold-start anchor).
WarmupManager.on_complete callback stamps _warmup_done_ts.
App.render_main_interface (gui_2.py) calls mark_first_frame_rendered()
on its first call, which stamps _first_frame_ts and logs the timeline.

New public API on AppController:
- init_start_ts (property): float
- warmup_done_ts (property): Optional[float]
- first_frame_ts (property): Optional[float]
- mark_first_frame_rendered(ts=None): idempotent; logs to stderr
- startup_timeline() -> dict with all timestamps + precomputed deltas:
  warmup_ms, first_frame_after_init_ms, first_frame_after_warmup_ms

Stderr log on warmup done:
  [startup] warmup done in 1186.2ms (first frame rendered Nms BEFORE/AFTER)

Stderr log on first frame:
  [startup] first frame at Xms after init (warmup took Yms) (rendered Zms BEFORE/AFTER warmup done)

Hook API:
- GET /api/startup_timeline
- ApiHookClient.get_startup_timeline() -> dict

5 new tests in test_warmup_canaries.py covering all the new methods.
All 18 canary tests + 10 api_hooks tests + 6 gui_indicator tests pass.

Script scripts/apply_startup_timeline.py is included as a reference
for the multi-edit pattern (the proper MCP-equivalent tools will be
added later per the edit_workflow doc).
2026-06-06 22:48:50 -04:00
ed 208aa664db feat(warmup): per-module canary records (thread + timing observability)
Adds a canary record for each module submitted to the warmup, tracking:
canary_id, module, thread_name, thread_id, submit_ts, start_ts,
end_ts, elapsed_ms, status, error.

Surface:
- WarmupManager.canaries() returns list[dict] (defensive copy)
- AppController.warmup_canaries() returns list[dict] (delegation)
- GET /api/warmup_canaries Hook API endpoint
- ApiHookClient.get_warmup_canaries() returns list[dict]

Example: the warmup of google.genai records a 1187ms canary on
thread controller-io_0 with thread_id 50420, canary_id 1.

11 new tests (8 unit in test_warmup_canaries + 3 in test_api_hooks_warmup).
All pass; live_gui smoke test confirms endpoint returns real data.
2026-06-06 22:02:35 -04:00
ed 8fea8fe9a0 feat(api_hooks): add /api/warmup_status and /api/warmup_wait endpoints (sub-track 3)
Sub-track 3 of startup_speedup_20260606. Builds on the Phase 7 minimal
work at b464d1fe which only added warmup_status to /api/gui/diagnostics.

New dedicated endpoints:
- GET /api/warmup_status -> controller.warmup_status() (cheap, lock-guarded)
- GET /api/warmup_wait?timeout=N -> controller.wait_for_warmup(timeout)
  then returns the final status. Default 30s.

Both callable from external clients via ApiHookClient.get_warmup_status()
and ApiHookClient.get_warmup_wait(timeout=30.0).

7 new tests in tests/test_api_hooks_warmup.py (5 unit + 2 live_gui).
All 7 pass.
2026-06-06 21:01:56 -04:00
ed b464d1fe49 feat(api_hooks): expose warmup_status in /api/gui/diagnostics endpoint
Phase 7 of startup_speedup_20260606 track.

Added warmup status to the existing /api/gui/diagnostics endpoint
(Phase 7 minimal scope - dedicated /api/warmup_status endpoint and
GUI status indicator deferred to follow-up sub-track).

The diagnostics response now includes:
  warmup: {
    pending: [list of module names still being warmed],
    completed: [list of module names successfully warmed],
    failed: [list of module names that failed to warm]
  }

External clients and tests can poll this endpoint to know when the
system is fully ready (all heavy modules loaded).

The endpoint gracefully handles missing controller (returns empty dict)
and exceptions (catches them, returns default empty state).

TESTS: 7 live_gui tests pass (test_hooks, test_live_workflow,
test_live_gui_integration_v2). No breakage from the new field.

NEXT: Phase 8 (runtime audit hook enforcement test) + Phase 9
(final verify + checkpoint).
2026-06-06 17:56:54 -04:00
ed 873edf42cf began to go through the files and organize imports and gui_2.py's new context defs
still a bunch to sift through after the last ai passes
2026-06-05 21:44:41 -04:00
ed 0f859d81d6 feat(gui): Unified window state and fixed context preservation regressions
- Implement unified show_windows['Text Viewer'] state and fix docking conflict loops.
- Fix Tool Call row interactivity using spanned selectables.
- Fix context selection loss when switching/creating discussions.
- Implement 'Empty Context Warning' modal for safer generation.
- Correct IndentationError in app_controller.py.
- Remove legacy show_text_viewer attribute and update API hooks.
2026-06-02 00:18:48 -04:00
ed e2305ff49a Antigravity is dog shit. 2026-05-20 07:51:58 -04:00
ed 20054b0476 fix(test): Final synchronization and stability fixes for RAG stress test
- Improved AppController.ai_status to prevent overwriting 'sending...' with 'models loaded'.
- Enhanced 	est_rag_phase4_stress.py with robust polling and increased timeout.
- Synchronized App and AppController history objects to ensure consistent view.
2026-05-16 01:21:27 -04:00
ed 7f2f9c1989 fix: Robustness improvements for RAG tests and GUI stability
- Added import sys to src/api_hook_client.py.
- Fixed App.__getattr__ to use direct attribute access on controller to avoid recursion.
- Simplified _get_app_attr and _has_app_attr in src/api_hooks.py.
- Centralized RAG and symbol enrichment in AppController._handle_request_event.
- Updated 	ests/test_symbol_parsing.py to match the new enrichment flow.
- Removed redundant task appending from i_status and mma_status setters.
- Improved _sync_rag_engine to only set 'ready' status after indexing is confirmed.
- Updated 	est_status_encapsulation.py to reflect setter changes.
2026-05-15 17:17:05 -04:00
ed b5e512f483 feat(sdm): inject structural dependency mapping tags across codebase
Adds [C: caller] tags to functions/methods and [M: mutation] / [U: usage] tags to class variables based on cross-module call analysis.
2026-05-13 22:35:52 -04:00
ed 05d0121e71 fixes 2026-05-10 11:33:07 -04:00
ed b958fa2819 refactor(phase5): Comprehensive stabilisation pass. De-duplicated App/Controller state, hardened session reset, and updated integration tests with deterministic polling. 2026-05-09 16:55:45 -04:00
ed 8c06c1767b refactor(sdm): Global pass with refined 'External Only' SDM tags. Pruned redundant internal references and fixed indentation logic in injector. Verified full project compilation. 2026-05-09 15:00:35 -04:00
ed 7fdf6c9782 feat(mma): Enable manual ticket approval via Hook API for Step Mode 2026-05-02 13:48:14 -04:00
ed f9b5acd758 refactor(api): Audit and cleanup api_hook_client.py and api_hooks.py 2026-05-02 13:08:47 -04:00
ed 8ee8862ae8 checkpoint: track complete 2026-03-18 18:39:54 -04:00
ed b4396697dd finished a track 2026-03-17 23:26:01 -04:00
ed 1a14cee3ce test: fix broken tests across suite and resolve port conflicts 2026-03-11 23:49:23 -04:00
ed 036c2f360a feat(api): implement phase 4 headless refinement and verification 2026-03-11 23:17:57 -04:00
ed 4777dd957a feat(api): implement phase 3 comprehensive control endpoints 2026-03-11 23:14:09 -04:00
ed 1be576a9a0 feat(api): implement phase 2 expanded read endpoints 2026-03-11 23:04:42 -04:00
ed 02e0fce548 feat(api): implement websocket gateway and event streaming for phase 1 2026-03-11 23:01:09 -04:00
ed 94598b605a checkpoint dealing with personal manager/editor 2026-03-10 23:47:53 -04:00
ed 83911ff1c5 plans and docs 2026-03-08 03:05:15 -04:00
ed f07b14aa66 fix(test): Restore performance threshold bounds and add profiling to test 2026-03-07 20:46:14 -05:00
ed d520d5d6c2 fix: Add debug logging to patch endpoints 2026-03-07 00:45:07 -05:00
ed 14dab8e67f feat(tier4): Add patch modal GUI integration and API hooks 2026-03-07 00:37:44 -05:00
ed 12dba31c1d REGRESSSIOSSSOONNNNSSSS 2026-03-06 21:39:50 -05:00
ed b88fdfde03 still in regression hell 2026-03-06 21:28:39 -05:00
ed f65e9b40b2 WIP: Regression hell 2026-03-06 21:22:21 -05:00
ed 5e69617f88 WIP: I HATE PYTHON 2026-03-05 13:55:40 -05:00
ed a783ee5165 feat(api): Add /api/gui/state endpoint and live_gui integration tests 2026-03-05 10:06:47 -05:00
ed 35480a26dc test(audit): fix critical test suite deadlocks and write exhaustive architectural report
- Fix 'Triple Bingo' history synchronization explosion during streaming

- Implement stateless event buffering in ApiHookClient to prevent dropped events

- Ensure 'tool_execution' events emit consistently across all LLM providers

- Add hard timeouts to all background thread wait() conditions

- Add thorough teardown cleanup to conftest.py's reset_ai_client fixture

- Write highly detailed report_gemini.md exposing asyncio lifecycle flaws
2026-03-05 01:42:47 -05:00
ed f2b25757eb refactor(tests): Update test suite and API hooks for AppController architecture 2026-03-04 11:38:36 -05:00
ed a0276e0894 feat(src): Move core implementation files to src/ directory 2026-03-04 09:55:44 -05:00