manual_slop

Author	SHA1	Message	Date
ed	229559caaa	feat(startup): first-frame detection + startup_timeline API Adds per-AppController startup timing instrumentation to answer 'did the warmup block the first frame?' AppController.__init__ records _init_start_ts at entry (cold-start anchor). WarmupManager.on_complete callback stamps _warmup_done_ts. App.render_main_interface (gui_2.py) calls mark_first_frame_rendered() on its first call, which stamps _first_frame_ts and logs the timeline. New public API on AppController: - init_start_ts (property): float - warmup_done_ts (property): Optional[float] - first_frame_ts (property): Optional[float] - mark_first_frame_rendered(ts=None): idempotent; logs to stderr - startup_timeline() -> dict with all timestamps + precomputed deltas: warmup_ms, first_frame_after_init_ms, first_frame_after_warmup_ms Stderr log on warmup done: [startup] warmup done in 1186.2ms (first frame rendered Nms BEFORE/AFTER) Stderr log on first frame: [startup] first frame at Xms after init (warmup took Yms) (rendered Zms BEFORE/AFTER warmup done) Hook API: - GET /api/startup_timeline - ApiHookClient.get_startup_timeline() -> dict 5 new tests in test_warmup_canaries.py covering all the new methods. All 18 canary tests + 10 api_hooks tests + 6 gui_indicator tests pass. Script scripts/apply_startup_timeline.py is included as a reference for the multi-edit pattern (the proper MCP-equivalent tools will be added later per the edit_workflow doc).	2026-06-06 22:48:50 -04:00
ed	152605f5dc	feat(warmup): log canaries to stderr by default (with main-thread violation warning) Per module: prints a one-line summary to stderr when the import completes or fails: [warmup 1] google.genai on controller-io_0 (id=18636): 1218.6ms [warmup 2] anthropic on controller-io_1 (id=5500): 1148.3ms [warmup 3] openai on controller-io_2 (id=34376): 1144.2ms ... When the entire warmup completes, prints an aggregate: [warmup done] 9 modules: 9 completed (sum of per-module elapsed: 3591.7ms) If ANY canary ran on the main thread (main-thread-purity violation), the per-module line is tagged with [MAIN-THREAD] AND a final WARNING is printed: [warmup WARNING] N module(s) loaded on the MAIN THREAD: google.genai Default is log_to_stderr=True so production runs get the observability for free. Tests opt out via WarmupManager(pool, log_to_stderr=False) in the _build_warmup helper. 5 new tests (4 stderr logging + 1 quiet). All 13 canary tests pass. Use case: 'did my heavy import run on the GUI thread when it shouldn't have?' is now answered by grepping stderr for [warmup ...] [MAIN-THREAD] lines. No hook server required.	2026-06-06 22:15:24 -04:00
ed	208aa664db	feat(warmup): per-module canary records (thread + timing observability) Adds a canary record for each module submitted to the warmup, tracking: canary_id, module, thread_name, thread_id, submit_ts, start_ts, end_ts, elapsed_ms, status, error. Surface: - WarmupManager.canaries() returns list[dict] (defensive copy) - AppController.warmup_canaries() returns list[dict] (delegation) - GET /api/warmup_canaries Hook API endpoint - ApiHookClient.get_warmup_canaries() returns list[dict] Example: the warmup of google.genai records a 1187ms canary on thread controller-io_0 with thread_id 50420, canary_id 1. 11 new tests (8 unit in test_warmup_canaries + 3 in test_api_hooks_warmup). All pass; live_gui smoke test confirms endpoint returns real data.	2026-06-06 22:02:35 -04:00
ed	ae3b433e5e	refactor(models): lazy-load tomli_w (sub-track 2 partial) Sub-track 2 of startup_speedup_20260606. Removes the top-level 'import tomli_w' from src/models.py and moves it inside save_config(). tomli_w (~30ms cold load) is now loaded only when the user saves config, not on every src.models import. This drops the audit violation count from 63 to 62. Pydantic BaseModel (the other src/models.py violation) is left for a future sub-track: deferring a class base requires a metaclass or proxy pattern that's higher risk for the small (~50ms) saving. 3 new tests in tests/test_models_no_top_level_tomli_w.py: - tomli_w NOT in sys.modules after import src.models - save_config() still works (because tomli_w loads on-demand) - save_config() actually triggers the import on first call 17 existing model tests pass (test_persona_models, test_bias_models, test_context_presets_models, test_per_ticket_model, test_file_item_model).	2026-06-06 21:42:08 -04:00
ed	8957c9a5be	fix(conftest): register atexit handler for non-blocking pool shutdown Fixes the run_tests_batched.py hang that occurs after batch 4. The original conftest (commit `52ea2693`) stored _warmup_app_controller at module scope for the entire pytest session. When pytest exits, GC of the AppController triggers ThreadPoolExecutor.__del__ -> shutdown(wait=True). If warmup hasn't fully completed by then, the shutdown blocks indefinitely, causing the batched test runner to hang at the subprocess.run boundary. Fix: register an atexit handler that captures the _io_pool reference directly (default argument) and shuts it down with wait=False. The pool reference is captured by closure, surviving even after the AppController is GC'd. shutdown() is idempotent so the subsequent shutdown(wait=True) in __del__ is a no-op. This is part of sub-track 4 (warmup notification) cleanup; the conftest's wait_for_warmup behavior is preserved, only the exit-hang is fixed.	2026-06-06 21:35:05 -04:00
ed	f3d071e0c8	feat(gui): warmup status indicator + completion callback (sub-track 4) Sub-track 4 of startup_speedup_20260606. Adds per-frame GUI feedback during the AppController's background warmup: - render_warmup_status_indicator(app): module-level render fn called from render_main_interface. Shows 'Warming up... (N/M)' in warning color while pending, 'Imports: K failed' in error color on failure, or 'All imports ready (M modules)' in success color for 3 seconds after completion. Hidden otherwise. - _on_warmup_complete_callback(app, status): thread-safe callback registered with controller.on_warmup_complete() in App._post_init. Records timestamp + lock-protected toast list. - App._post_init: registers the callback. 6 new tests in tests/test_gui_warmup_indicator.py: - 2 importable-checks (function exists) - 3 callback-logic tests (timestamp, failures, thread-safety) - 1 live_gui smoke test (controller exposes warmup_status)	2026-06-06 21:29:03 -04:00
ed	8fea8fe9a0	feat(api_hooks): add /api/warmup_status and /api/warmup_wait endpoints (sub-track 3) Sub-track 3 of startup_speedup_20260606. Builds on the Phase 7 minimal work at `b464d1fe` which only added warmup_status to /api/gui/diagnostics. New dedicated endpoints: - GET /api/warmup_status -> controller.warmup_status() (cheap, lock-guarded) - GET /api/warmup_wait?timeout=N -> controller.wait_for_warmup(timeout) then returns the final status. Default 30s. Both callable from external clients via ApiHookClient.get_warmup_status() and ApiHookClient.get_warmup_wait(timeout=30.0). 7 new tests in tests/test_api_hooks_warmup.py (5 unit + 2 live_gui). All 7 pass.	2026-06-06 21:01:56 -04:00
ed	253e1798d1	refactor: migrate remaining ad-hoc threads to AppController.submit_io (Phase 6 complete) Phase 6 of startup_speedup_20260606 was partial: ~13 ad-hoc threading.Thread spawns remained in src/app_controller.py and 2 in src/gui_2.py. This commit migrates all of them to self.submit_io(...) (the shared _io_pool wrapper from Phase 2). ZERO new threading.Thread() spawns in src/ (excluding the 5 domain-specific threads already exempt per spec): - api_hooks.py:739 HookServer HTTP server (domain-specific) - api_hooks.py:818 WebSocketServer (domain-specific) - app_controller.py _loop_thread (asyncio event loop, DEDICATED) - multi_agent_conductor.py WorkerPool (domain-specific) - performance_monitor.py CPU monitor (continuous, domain-specific) Sites migrated (15 total): app_controller.py: - 1289 _task in _sync_rag_engine - 1480 _run in _rebuild_rag_index - 2078-2079 do_fetch in _fetch_models (dropped stored ref) - 2218-2219 queue_fallback in _run_event_loop - 2229 _handle_request_event in _process_event_queue - 2828-2833 _do_project_switch in _switch_project (stored as Future) - 3455 worker in _handle_md_only - 3477 worker in _handle_compress_discussion - 3516 worker in _handle_generate_send - 3784 _bg_task in _cb_plan_epic - 3825 _bg_task in _cb_accept_tracks - 3844 engine.run in _cb_start_track (track_id case) - 3855 engine.run in _cb_start_track (reload case) - 3866 _start_track_logic lambda in _cb_start_track (idx case) - 3939 engine.run in _start_track_logic gui_2.py: - 1129 _stats_worker in _update_context_file_stats - 3507 worker in _check_auto_refresh_context_preview Stored-ref migration (Phase 6 partial work): - self.models_thread (declared L960, assigned L2078): No external readers. Dropped the declaration and the assignment; replaced the .start() with self.submit_io(do_fetch). - self._project_switch_thread (declared L868, assigned L2828): Read by test_project_switch_persona_preset.py:21 for .is_alive() polling. 
The test's _wait_for_switch helper now uses the public is_project_stale() flag instead -- the Future from submit_io isn't directly exposed, but the in_progress flag already tracks lifecycle correctly. Dropped the declaration; replaced the .start() with self.submit_io(self._do_project_switch, path). Test impact: - test_project_switch_persona_preset.py::_wait_for_switch: Updated to poll ctrl.is_project_stale() instead of the _project_switch_thread attribute. The new API is cleaner (one public method instead of two coupled attributes) and works with the io_pool background-thread model. Effectiveness: - Per-spawn cost: ~1-5ms saved (thread creation) - 4 long-lived threads eliminated; all background work now shares the 4-worker _io_pool - When 4 long-lived threads were active simultaneously, the new pool backpressure causes them to queue; future work can be backpressured explicitly TESTS: 19+39 = 58 tests touching migrated code paths all pass. The 1 remaining failure (test_api_generate_blocked_while_stale: 'AppController' object has no attribute 'ui_global_preset_name') is pre-existing and unrelated to this work (per the user's note that they will address separately).	2026-06-06 20:19:50 -04:00
ed	52ea2693cf	test(conftest): use AppController.wait_for_warmup() to fix library import race The google-genai library has a known circular-import bug in its __init__.py chain: google.genai/__init__.py:21: from .client import Client -> from ._api_client import BaseApiClient -> from .types import HttpOptions When loaded fresh in a pytest process, the chain collides with itself and leaves google.genai in a 'partially initialized' state. Per the user spec (startup_speedup_20260606 spec.md:2.2 Layer 3): "the app controller should post to test clients or the user when its threads are warmed up with imports — that way the user knows 'hey you have the ui first, but now you have all the functionality.'" This is exactly what the warmup notification system does. Phase 2 (commit `1354679e`) added the WarmupManager + _io_pool, and the warmup list (state.toml) already includes 'google.genai'. The AppController.__init__ submits the warmup jobs to the _io_pool background thread. When the warmup completes, _warmup_done_event is set and registered on_warmup_complete callbacks fire. The previous conftest fix imported 'google.genai' DIRECTLY at conftest module load. That bypassed the whole notification mechanism. This commit fixes the oversight: - Reverts the direct `import google.genai` - Creates an AppController at conftest load time - Calls `wait_for_warmup(timeout=60.0)` to block until the background warmup completes - google.genai ends up in sys.modules via the warmup's `importlib.import_module` call (same end state, but now via the documented mechanism) The conftest's `from src.gui_2 import App` at line 27 is also a heavy synchronous import chain that runs in-process. By the time that line executes, the warmup is already in progress on the _io_pool. The wait_for_warmup() call after that line ensures the warmup completes before any test collects. The AppController is session-scoped (one per pytest process). If another fixture (e.g. 
live_gui) creates its own AppController that also runs warmup, the second controller's wait_for_warmup returns immediately because the modules are already in sys.modules. Cost: 60s timeout worst-case (typically completes in ~3s based on the baseline measurement). One-time per pytest process. Earlier alternatives I tried and rejected: - Direct `import google.genai` in conftest: bypasses the notification mechanism. User feedback: "you are falling back to your jank." - Source-level `genai = _require_warmed('google.genai')` + `.types`: fails the same way (the library bug is in the PARENT's __init__.py, not the leaf). The parent's __init__.py never completes in a fresh process; once it's in the "partially initialized" state in sys.modules, no caller pattern can fix it. - Revert the conftest change and skip these tests: not viable, the tests are real and important.	2026-06-06 19:23:52 -04:00
ed	8c4791d03f	fix(ai_client,module_loader): pre-existing bugs surfaced by Phase 3 refactor Three test failures identified by the batched test suite, all rooted in the Phase 3 lazy-import refactor of src/ai_client.py. FIX 1: UnboundLocalError in _ensure_gemini_client - _ensure_gemini_client had a latent bug: creds was assigned inside `if _gemini_client is None:` but used on the next line. When the client was already cached, the assignment was skipped and the next line raised UnboundLocalError. Moved the Client() construction inside the if block to match creds' scope. - This affected test_ai_cache_tracking.py and (downstream) test_gui_updates.py::test_telemetry_data_updates_correctly. FIX 2: Phase 3 removed top-level `import requests` from ai_client.py. - test_discussion_compression.py::test_discussion_compression_deepseek did `patch("src.ai_client.requests.post", ...)` which no longer works. - Updated the test to mock _require_warmed to return a fake requests module with `.post()`, matching the new lazy-import pattern. FIX 3: _require_warmed could not import dotted names like `google.genai.types` - The google-genai library has a self-referential __init__.py that does `from .client import Client` which transitively does `from .types import HttpOptions`. Importing `google.genai.types` FIRST (before the parent package is fully loaded) hit a "partially initialized module" circular import. - Enhanced _require_warmed to pre-import parent packages for dotted names: walks `name.split(".")` and imports each parent (if not in sys.modules) before the leaf import. O(n) extra imports per call on first use; subsequent calls are O(1) sys.modules hit. 
TESTS: - test_ai_cache_tracking.py: 2/2 PASS - test_discussion_compression.py: 4/4 PASS - 29/29 PASS across the sampled test files that were failing (test_subagent_summarization, test_tool_access_exclusion, test_tier4_interceptor, test_gui2_mcp, test_gui_updates, test_headless_service) ARCHITECTURAL NOTE: The _require_warmed enhancement is a small but important robustness fix. The google-genai library's __init__.py chain is a known source of fragility; the parent- pre-import pattern is the recommended workaround.	2026-06-06 18:30:44 -04:00
ed	61d21c70bb	refactor(app_controller): remove requests + tomli_w top-level imports; add main thread purity test Phase 8 of startup_speedup_20260606 track. Part 1: app_controller.py cleanup - Removed 'import requests' (was used in 2 places - lazy import added inside) - Removed 'import tomli_w' (dead import; never referenced in app_controller) - Migrated 2 threading.Thread spawns to use self.submit_io (the do_post closures in _handle_approve_ask and _handle_reject_ask) Part 2: Main thread purity enforcement test - tests/test_main_thread_purity.py: 7 tests verify that the 6 refactored files (ai_client, app_controller, commands, theme_2, markdown_helper, gui_2) have ZERO top-level imports from the heavy denylist: {google.genai, anthropic, openai, requests, google.genai.types, fastapi, fastapi.security.api_key, src.command_palette, src.theme_nerv, src.theme_nerv_fx, src.markdown_table, numpy, tkinter, tomli_w} This is the static enforcement (the runtime audit-hook test using sys.addaudithook is a follow-up). The test is RED before each refactor phase, GREEN after. If a future commit re-introduces a heavy import in one of these files, the test fails immediately in CI. TESTS: - 7/7 main thread purity tests PASS - 15/15 log + app controller tests still PASS (no breakage from removing requests/tomli_w imports)	2026-06-06 18:01:39 -04:00
ed	de6b85d2ad	refactor(gui_2): remove dead imports; lazy numpy/tkinter via _LazyModule proxy Phase 5D of startup_speedup_20260606 track. DEAD IMPORTS REMOVED (zero uses, safe to remove): - 'import tomli_w' (line 18) - never referenced anywhere in gui_2.py - 'from src import theme_nerv_fx as theme_fx' (line 59) - never referenced; the actual NERV FX objects are created in src/theme_2.py and accessed via render_post_fx() The theme_nerv_fx removal saves the full ~254ms import of src.theme_nerv_fx on the main thread. LAZY PROXY PATTERN for heavy feature-gated modules: - 'import numpy as np' (line 9) - used in 1 place (plot_lines) - 'from tkinter import filedialog, Tk' (lines 30, 34) - duplicates removed, 13 use sites now go through the proxy Added a _LazyModule class that defers module loading until first attribute access or call. The proxy is a transparent replacement: 'np.array(...)' and 'Tk()' continue to work unchanged. The import only fires on first use, then is cached in sys.modules for O(1) subsequent access. ARCHITECTURAL NOTE: This is a general-purpose pattern that can be used for any module that should not be in the main thread's import chain. The Phase 5A 'lazy registry proxy' was a similar idea but custom-tailored to one use case; _LazyModule is the general form. EFFECTIVENESS (estimated from baseline): - src.theme_nerv_fx removal: ~254ms saved - numpy deferral: ~65ms saved (when not plotting); 0ms saved if the user is using numpy (imgui_bundle transitively brings it in anyway) - tkinter deferral: small but real savings (tkinter is stdlib but still has import cost) Note that numpy and tkinter are still brought in transitively by imgui_bundle and other src.* modules. The test verifies the AST (top-level imports of gui_2.py) is clean; the runtime sys.modules check is too strict because of these transitive imports. 
TESTS: - tests/test_gui_2_no_top_level_heavy_imports.py: 5/5 PASS (all RED -> GREEN) - 13 gui tests sampled (gui_progress, gui_paths, gui_kill_button, gui_window_controls, gui_custom_window, gui_fast_render, gui_startup_smoke, gui2_layout, gui2_events): all PASS NEXT: Phase 6 (ad-hoc threads -> _io_pool), Phase 7 (warmup notification), Phase 8 (enforcement), Phase 9 (final verify + checkpoint).	2026-06-06 17:16:53 -04:00
ed	48c9649951	refactor(markdown_helper): remove top-level src.markdown_table import; use _require_warmed Phase 5C of startup_speedup_20260606 track. src/markdown_helper.py imported src.markdown_table at module level: from src.markdown_table import parse_tables, render_table Both parse_tables and render_table are only used inside MarkdownRenderer.render(). Removed the top-level import; the MarkdownRenderer.render() method now does: markdown_table = _require_warmed('src.markdown_table') parse_tables = markdown_table.parse_tables render_table = markdown_table.render_table at the top of its body, before any other logic. TESTS: - tests/test_markdown_helper_no_top_level_table.py: 3/3 PASS (all RED -> GREEN) - tests/test_markdown_table*.py (5 files) + test_markdown_helper_bullets.py + test_markdown_render_robust.py: 24/24 PASS (no breakage) EFFECTIVENESS: import src.markdown_helper no longer triggers src.markdown_table (~250ms). For renderers that never hit a GFM table, the import is never paid. For renderers that do, the warmup pre-loads it on _io_pool and the render() lookup is O(1). NEXT: Phase 5D - bulk refactor of src/gui_2.py feature-gated imports via scripts/audit_gui2_imports.py.	2026-06-06 16:58:32 -04:00
ed	69d098baaa	refactor(theme_2): remove top-level NERV theme imports; use _require_warmed Phase 5B of startup_speedup_20260606 track. src/theme_2.py had 3 top-level NERV imports: from src import theme_nerv from src.theme_nerv import DATA_GREEN from src.theme_nerv_fx import CRTFilter, AlertPulsing, StatusFlicker And 3 module-level FX object instantiations: _crt_filter = CRTFilter() _alert_pulsing = AlertPulsing() _status_flicker = StatusFlicker() ALL removed. The 3 use sites now lookup via _require_warmed: - apply() NERV branch: theme_nerv = _require_warmed('src.theme_nerv') - ai_text_color(): theme_nerv = _require_warmed('src.theme_nerv') (then uses theme_nerv.DATA_GREEN) - render_post_fx(): theme_nerv_fx = _require_warmed('src.theme_nerv_fx') (then creates FX objects locally per-call) The _status_flicker was instantiated but never used (dead code path; the StatusFlicker class is still importable via theme_nerv_fx but not auto-constructed in theme_2.py). TESTS: - tests/test_theme_2_no_top_level_nerv.py: 4/4 PASS (all RED -> GREEN) - tests/test_theme.py, test_theme_nerv.py, test_theme_nerv_fx.py, test_theme_models.py: 21/21 PASS (no breakage) EFFECTIVENESS: import src.theme_2 no longer triggers src.theme_nerv or src.theme_nerv_fx (~485ms combined). For users on default theme, these are NEVER loaded. For NERV users, the warmup pre-loads on _io_pool and the lookup is O(1). NEXT: Phase 5C (markdown table) follows same TDD pattern.	2026-06-06 16:55:20 -04:00
ed	78d3a1db1f	refactor(commands): use lazy registry proxy to defer src.command_palette import Phase 5A T5A.1-T5A.4 of startup_speedup_20260606 track. src/commands.py was importing src.command_palette at module load to create the CommandRegistry singleton. The 32 @registry.register decorators on the command functions needed this registry at import time. Approach: lazy registry proxy. The @registry.register decorator now just queues the function in a list; the real CommandRegistry is built on first access to any other registry attribute (.all, .get, etc.). By that time, all 32 decorators have run and the pending list is populated, so the real registration is complete in one pass. src/commands.py changes: - Removed 'from src.command_palette import CommandRegistry' - Added 'from src.module_loader import _require_warmed' - Added _LazyCommandRegistry class (proxy) - Added _get_real_registry() function (initializes on first access) - Replaced 'registry = CommandRegistry()' with 'registry = _LazyCommandRegistry()' - The 32 @registry.register decorators are unchanged (the proxy's register method returns the function unchanged after queueing it) EFFECTIVENESS: - 'import src.commands' no longer triggers src.command_palette (~244ms) - The warmup on AppController's _io_pool pre-loads src.command_palette on a background thread during startup - First access to registry.all() (e.g. from gui_2.py at palette open time) is O(1) - the warmup module is already in sys.modules TESTS: - tests/test_commands_no_top_level_command_palette.py: 4/4 PASS (3 RED, 1 green; now all green) - tests/test_command_palette.py: 13/13 PASS (no breakage) - tests/test_command_palette_sim.py: 7/7 PASS (live_gui tests, the full palette flow works end-to-end with the lazy proxy) ARCHITECTURAL NOTE: The lazy proxy is a minimal-change solution that preserves the public API. The 32 decorated functions don't need any changes; gui_2.py's 'from src.commands import registry' still works unchanged. 
The deferral is invisible to consumers. NEXT: Phase 5B (NERV theme) and 5C (markdown table) follow the same TDD pattern. 5D is the bulk refactor of src/gui_2.py feature-gated imports via the audit_gui2_imports.py script.	2026-06-06 16:48:04 -04:00
ed	3849d30441	refactor(app_controller): remove top-level fastapi imports; lift _require_warmed to shared module Phase 4 T4.1-T4.4 of startup_speedup_20260606 track. DEVIATION FROM ORIGINAL SPEC: spec.md said fastapi was in src/api_hooks.py but it was actually in src/app_controller.py (lines 17, 21). api_hooks.py uses stdlib http.server. Phase 4 target corrected to app_controller. LIFTED _require_warmed TO SHARED MODULE: created src/module_loader.py to avoid duplicating the lookup logic and the cross-module import smell (app_controller -> ai_client). src/ai_client.py re-exports it so the T3.1 test (which asserts hasattr(src.ai_client, '_require_warmed')) continues to work. src/app_controller.py changes: - Added 'from __future__ import annotations' (enables lazy type annotations; -> FastAPI return type now a forward reference) - Removed 'from fastapi import FastAPI, Depends, HTTPException' (line 17) - Removed 'from fastapi.security.api_key import APIKeyHeader' (line 21) - Added 'from src.module_loader import _require_warmed' (cross-module via shared utility, not via ai_client) - create_api(): added lookups at top of function body - 7 _api_* helper functions (_api_get_key, _api_generate, _api_stream, _api_confirm_action, _api_get_session, _api_delete_session, _api_get_context): added 'HTTPException = _require_warmed(...).HTTPException' at top of each function body EFFECTIVENESS: - import src.app_controller no longer triggers fastapi import (saves ~470ms in main thread; only loaded when --enable-test-hooks is set) - When --enable-test-hooks is set, the AppController's warmup pre-loads fastapi on the _io_pool, so create_api()'s lookup is O(1) TESTS: - tests/test_app_controller_no_top_level_fastapi.py: 4/4 PASS (was 3 RED + 1 pass) - tests/test_ai_client_no_top_level_sdk_imports.py: 9/9 still PASS (re-export works) - tests/test_app_controller_mcp.py, test_app_controller_offloading.py: pass - tests/test_headless_service.py: 10/11 PASS (1 pre-existing failure 
test_generate_endpoint is a circular-import issue in google.genai, reproduces identically on stashed pre-Phase-4 state - NOT a regression from this change) - tests/test_hooks.py: pass NEXT: Phase 5 (feature-gated GUI module imports - command palette, NERV theme, markdown table), then Phase 6 (ad-hoc threads -> _io_pool).	2026-06-06 16:34:46 -04:00
ed	51c054ece8	refactor(ai_client): remove top-level SDK imports; use _require_warmed Phase 3 T3.2 + T3.3 of startup_speedup_20260606 track. The 5 heavy SDKs (anthropic, google.genai, openai, google.genai.types, requests) are no longer imported at module level. Each function that needs them now calls _require_warmed(name) to get the module from sys.modules (populated by AppController's warmup on _io_pool). This is the load-bearing wall of the Main Thread Purity Invariant: heavy modules are never in the main thread's import chain. run_discussion_compression now uses _require_warmed for both google.genai.types (gemini branch) and requests (deepseek branch). tests/test_tier4_patch_generation.py was adapted: the 2 tests that mocked 'src.ai_client.types' (no longer a module-level attr) now mock 'src.ai_client._require_warmed' (the new public mechanism). T3.1 tests now pass (9/9). T3.3 breakage fixed. All 25 ai_client + tier4 tests pass.	2026-06-06 16:09:16 -04:00
ed	16780ec6d4	test(ai_client): TDD red phase - no top-level SDK imports allowed Phase 3 Task T3.1 of startup_speedup_20260606 track. 9 tests assert: - import src.ai_client does NOT trigger google.genai / anthropic / openai / requests / google.genai.types imports (the main thread must not load these on import; they're warmed on _io_pool) - _require_warmed(name) helper exists and is callable - _require_warmed returns the cached module if already in sys.modules - _require_warmed falls back to importlib for tests/dev where warmup didn't run - The static audit script does not see src/ai_client.py as a contributor of heavy-import violations All 9 tests are currently FAILING (RED). They will turn GREEN when T3.2 (the actual refactor of src/ai_client.py to remove top-level imports and add _require_warmed) lands. The implementation is held pending MCP client fix (per user instruction).	2026-06-06 15:11:13 -04:00
ed	1354679e33	feat(io_pool, warmup): add shared 4-thread pool + WarmupManager Phase 2 Tasks T2.1-T2.4 of the startup_speedup_20260606 track. NEW: src/io_pool.py make_io_pool() factory: 4-worker ThreadPoolExecutor with thread_name_prefix='controller-io'. The sanctioned way for any background work. Replaces ad-hoc threading.Thread() calls per the 'no new threads' rule. NEW: src/warmup.py WarmupManager: manages a list of modules to import on the shared pool. Public API: .submit(modules) - start warmup (call once) .status() - {pending, completed, failed} .is_done() - bool .wait(timeout) - block until done .on_complete(callback) - register completion callback .reset() - clear state Thread-safe (lock-guarded). 10 tests cover all paths. NEW: tests/test_io_pool.py (4 tests): - ThreadPoolExecutor returned - 4 workers - Threads named 'controller-io-*' - Jobs run in parallel (barrier test) NEW: tests/test_warmup.py (10 tests): - One job per module submitted - Initial pending list correct - Failed imports tracked - Done event set after all complete - wait() blocks until done - on_complete callback fires (and immediately if already done) - Modules actually end up in sys.modules - reset() clears state - Jobs run concurrently (not serially) All 14 tests pass. AppController integration is the next commit.	2026-06-06 14:47:02 -04:00
ed	6f9a3af201	feat(audit): add main-thread import graph audit + baseline measurements Phase 1, Tasks T1.2 + T1.4 of the startup_speedup_20260606 track. NEW: scripts/audit_main_thread_imports.py Static CI gate that AST-walks the import graph reachable from sloppy.py and fails (exit 1) if any heavy module is imported at the top of a main-thread-reachable file. Walks into if/elif/else and try/except branches (which run at import time) but skips function bodies (which only run when called). Allowlist: stdlib + the lean gui_2 skeleton (imgui_bundle, defer, src.imgui_scopes, src.theme_2, src.theme_models, src.paths, src.models, src.events). NEW: scripts/audit_gui2_imports.py Read-only analysis tool that lists every top-level and function-level import in src/gui_2.py, classified by location. Used in Phase 5D to identify which imports to remove. NEW: tests/test_audit_main_thread_imports.py 9 tests covering: --help exits 0, clean stdlib-only passes, heavy third-party fails, google.genai fails, transitive walks, function- body imports ignored, if-branch imports flagged, try-block imports flagged, file:line reported. All 9 pass. NEW: docs/reports/startup_baseline_20260606.txt 3-run median cold-start benchmark. Worst offenders: src.gui_2 (1770ms), simulation.user_agent (1517ms), google.genai (1001ms), openai (482ms), anthropic (441ms), imgui_bundle (255ms), src.theme_nerv* (485ms combined), src.markdown_table (243ms), src.command_palette (242ms). NEW: docs/reports/startup_audit_20260606.txt Audit output on the CURRENT codebase. Reports 67 violations across the main-thread import graph (incl. numpy in src/gui_2.py:9, tomli_w in src/gui_2.py:18, fastapi + requests in src/app_controller, tree_sitter_* in src/file_cache, pydantic in src/models, plus all the src.* subsystem imports that drag in heavy transitive deps). Phase 3-5 of the track will resolve these one by one. After Phase 3-5, this audit must exit 0 (no violations). 
Co-located reports in docs/reports/ per project convention; the other agent finished their work in docs/superpowers/ and is unrelated.	2026-06-06 14:22:18 -04:00
ed	5a85653654	feat(startup_profiler): add StartupProfiler for per-phase init timing Lightweight, in-memory profiler for AppController init phases. Used by the startup_speedup_20260606 track to measure where the time goes during boot (config hydration, hook server start, subsystem init, etc.). The profiler is exposed via /api/startup_profile (Phase 8 work) and the Diagnostics panel so the user can see the exact per-phase cost. Public API: StartupProfiler() - create .phase(name) - context manager .snapshot() - {phases: {name: {start_ts, duration_ms}}, total_ms, count} .reset() - clear recorded phases .enable() / .disable() - toggle recording Implementation: - dataclass with list of _Phase(name, start_ts, end_ts) - @contextmanager records wall-clock via time.perf_counter - records duration even if the body raises (try/finally) - snapshot is a copy, so consumers can't mutate the live state TDD: 5 tests in tests/test_startup_profiler.py cover: basic recording, total math, snapshot isolation, exception safety, empty state.	2026-06-06 13:57:26 -04:00
ed	ca254bac41	fix(imports): break models<->dag_engine circular dependency Track.get_executable_tickets (in models.py) called TrackDAG at runtime, forcing a top-level import of src.dag_engine into models.py and creating a 2-cycle that broke whichever module loaded second (Ticket was not yet defined when models.py loaded first; TrackDAG was not yet defined when dag_engine.py loaded first). Fix: hoist the method out of the Track dataclass and into a free function get_executable_tickets(track) in dag_engine.py. models.py no longer needs TrackDAG at all, so the cycle is one-directional (models -> dag_engine) and resolves cleanly in any import order. Tests updated: - tests/test_mma_models.py: import get_executable_tickets and call it instead of track.get_executable_tickets() (4 call sites) - tests/test_conductor_engine_v2.py: comment update Verified both import orders resolve cleanly: forward: import src.models; import src.dag_engine -> OK reverse: import src.dag_engine; import src.models -> OK 34 tests pass (test_mma_models, test_dag_engine, test_execution_engine, test_arch_boundary_phase3, test_track_state_schema).	2026-06-06 13:30:18 -04:00
r00tz	9e4fac496d	made local rag needs optional (prevents having to have torch / sentence-transformers if you never use local embedding)	2026-06-06 13:21:43 -04:00
ed	16412ad5f9	fix(rag): detect ChromaDB dim mismatch and recreate collection on provider switch	2026-06-06 11:26:47 -04:00
ed	26e0ced4d9	test(prior_session): refactor to narrow render_prior_session_view (50+ mocks -> 20)	2026-06-06 01:12:29 -04:00
ed	5692cbef56	test(workspace_profile): add str/bytes TOML serialization contract test	2026-06-05 20:14:39 -04:00
ed	c96bdb06ba	test(rag_phase4): handle None status before .lower() in error check	2026-06-05 12:38:47 -04:00
ed	970f198ca6	test(view_presets): mock persona_manager in fixture	2026-06-05 11:52:49 -04:00
ed	f829d1df17	test(prior_session): mock render_palette_modal, add ui_base_system_prompt fixture	2026-06-05 11:45:42 -04:00
ed	df43f158b9	test(gui_phase4): patch markdown_helper imgui/imgui_md to avoid IM_ASSERT	2026-06-05 10:33:38 -04:00
ed	38abf2312f	test(gui_progress): adapt to C_LBL/C_VAL function API + theme_2 mock	2026-06-05 10:25:25 -04:00
ed	465396675d	docs(themes): add authoring guide for TOML theme system	2026-06-04 23:16:21 -04:00
ed	1cb68e4e3f	feat(markdown): apply active theme syntax palette to code blocks	2026-06-04 23:13:33 -04:00
ed	df2e82a82d	feat(themes): add Solarized Dark/Light, Gruvbox Dark, Moss TOML themes	2026-06-04 23:10:16 -04:00
ed	e14b3c2ce0	feat(theme): load themes from TOML and apply syntax palette mapping	2026-06-04 22:59:59 -04:00
ed	e2f698c4a3	feat(theme-models): add ThemePalette/ThemeFile schema with TOML loader	2026-06-04 22:31:22 -04:00
ed	8d1fa18785	fix(project): Non-blocking project switch with stale-ui tint When switching projects, the previous implementation ran the entire save/load/refresh sequence on the main thread. With large project files or slow disks, this caused the UI to freeze for several seconds. Fix: - _switch_project now returns immediately after setting flags; the actual work runs in a daemon thread (_do_project_switch) - New is_project_stale() property returns True while a switch is queued or running; the GUI renders an amber/yellow tint overlay to signal the controller state lags the user's last click - AI ops are gated: _api_generate returns HTTP 409, _handle_generate_send and _handle_md_only early-return with ai_status feedback, all when is_project_stale() is true - Queued switches (clicking project A then B in rapid succession) are coalesced: B replaces A as the target; once A completes, B is triggered automatically via the finally branch in _do_project_switch - New state fields: _project_switch_in_progress, _project_switch_pending_path, _project_switch_thread, _project_switch_lock - AppController state class attributes use hasattr guard for _app to keep the controller usable standalone in tests/headless mode UX: - Render loop keeps drawing during the switch - User can still scroll, switch tabs, browse files - Amber tint + popup explains what's happening and that AI ops are paused - ai_status shows the target project name Tests: - _wait_for_switch helper added for the new async switch flow - All 7 existing switch tests updated to call _wait_for_switch - 2 new tests: - test_switch_project_non_blocking: verifies _switch_project returns in <0.2s and is_project_stale() is True during the switch - test_api_generate_blocked_while_stale: verifies _api_generate raises HTTPException(409) while a switch is in progress All 33 related tests pass.	2026-06-04 21:29:12 -04:00
ed	36f3292249	fix(project): Reload context_files from new project on project switch When switching projects, the previous project's context_files remained visible in the Context Composition panel because the controller's self.context_files list was not reloaded from the new project's TOML files.paths entry. Fix in _refresh_from_project: - After loading self.files from the project TOML, populate self.context_files with deep copies of those FileItem objects - Reset self._app.ui_selected_context_files to match the new project's auto_aggregate set - Guard the _app access with hasattr so the controller is usable standalone (in tests, headless mode, etc.) without an attached App Test: 1 new test in tests/test_project_switch_persona_preset.py - test_switch_project_resets_context_files: switches from project_a (forth + gte_hello files) to project_b (gencpp timing files) and asserts context_files contains ONLY project_b's files	2026-06-04 21:03:16 -04:00
ed	7df65dff14	fix(project): Create persona_manager in _load_active_project + handle missing context preset Two fixes for the regression introduced in `b92daef3` (and an additional hardening for the persona->context_preset stale-reference class of bug): 1. Regression: persona_manager was missing on first project load. _load_active_project creates preset_manager and tool_preset_manager but did not create persona_manager, so the new self.personas = self.persona_manager.load_all() line in _refresh_from_project raised AttributeError on app startup before the post-_load_active_project persona_manager creation could run. Fix: create self.persona_manager in _load_active_project alongside the other managers, so the manager is available when _refresh_from_project runs. 2. Stale reference: persona's context_preset field pointed to a preset (e.g. 'GTE') that no longer exists in the project, causing load_context_preset to raise KeyError and crash the persona selector panel (which triggered the cascading 'Missing End()' imgui assertion). Fix: wrap the load_context_preset call in render_persona_selector_panel with try/except KeyError, surface the error in app.ai_status, and clear app.ui_active_context_preset to keep the GUI state consistent. Tests: 2 new tests in tests/test_project_switch_persona_preset.py - test_load_active_project_creates_persona_manager (regression guard) - test_load_context_preset_missing_raises_keyerror (verifies the contract that load_context_preset raises for missing names; the GUI layer is now responsible for catching the error)	2026-06-04 20:45:55 -04:00
ed	b92daef34f	fix(project): Reload personas and validate active AI settings on project switch When switching projects, the previous project's project-specific persona and presets remained selected in the AI Settings panel because: 1. self.personas was not reloaded after switching project root 2. self.ui_active_persona / tool_preset / bias_profile / project_preset_name were not validated against the newly-loaded personas/presets Fix: - Reload self.personas from self.persona_manager in _refresh_from_project - Validate each active selection and reset to None/empty if it does not exist in the newly-loaded manager dictionaries - Push the active tool preset and bias profile to ai_client after the swap - Initialize self.ui_active_bias_profile in class attribute block (was only set later in __init__, causing AttributeError on direct attribute access) Tests: 4 new tests in tests/test_project_switch_persona_preset.py verify the reset behavior for persona, preset, tool preset, and global preset preservation.	2026-06-04 20:36:59 -04:00
Conductor	58cd759968	fix(markdown): strip blank between bullet and indented continuation paragraph ROOT CAUSE: imgui_md (mekhontsev/imgui_md) BLOCK_P does NOT call ImGui::NewLine() when m_list_stack is non-empty (verified in imgui_md.cpp). So a multi-paragraph list item like: - bullet text (long, wraps to 2 lines) continuation paragraph renders BOTH paragraphs at the same Y because the second BLOCK_P enters/exits without advancing the cursor. The continuation crashes into the previous paragraph's last wrapped line. FIX: Add MarkdownRenderer._normalize_list_continuations preprocessor that strips blank lines between a list item and its indented continuation. The continuation then becomes a lazy continuation of the first paragraph (single BLOCK_P in imgui_md, proper text wrapping, no overlap). Trade-off: users cannot have separate paragraphs within a single list item. Acceptable. Also: fixed a pre-existing bug in _normalize_nested_list_endings where a duplicate conditional caused the function to return empty string (the out.append(line) was inside the wrong scope). It was silently corrupting all list content since `fd5f4d0e`. TESTS: 23/23 markdown unit tests pass. 3 new tests for the new preprocessor covering: blank-strip case, blank-preservation case, simple-list passthrough.	2026-06-03 21:48:12 -04:00
Conductor	fd5f4d0eda	fix(markdown): strip backticks in table cells + add nested-list overlap workaround FIX 1 (src/markdown_table.py): Cells now use imgui_md.render(c) instead of imgui.text_wrapped(c). imgui_md uses MD4C which strips backtick-delimited inline code spans BEFORE rendering, so backticks no longer appear as literal characters in cell content. Side benefit: inline emphasis (foo, bar) now renders in cells too. FIX 2 (src/markdown_helper.py): Added MarkdownRenderer._normalize_nested_list_endings. Upstream imgui_md (mekhontsev/imgui_md) BLOCK_UL exit only calls ImGui::NewLine() for top-level list endings. For nested list endings, no NewLine is emitted, so the next text starts at the same Y as the last list item, causing visual overlap. The preprocessor inserts a blank line before any line that follows a list item with MORE indent than itself, forcing a paragraph break. Cannot fix the C++ from Python. Tests: - test_markdown_table_wrapped.py: updated to assert imgui_md.render is called for cell content (not imgui.text_wrapped). - test_markdown_helper_bullets.py: added 4 tests for the new preprocessors (nested-list blank insertion + bullet delimiter conversion + edge cases). 20/20 markdown unit tests pass. 1-space indentation throughout. KNOWN LIMITATIONS (cannot fix without forking imgui_md C++): - Inline code spans render as plain text (no monospace font in cells) - The ' * ' bullet delimiter has a Y-overlap bug upstream (workaround: pre-convert to '- ' via _normalize_bullet_delimiters) - Nested list ending overlap (workaround: insert blank line via _normalize_nested_list_endings)	2026-06-03 21:33:47 -04:00
Conductor	feed18eb0f	fix(markdown): remove table double-header + add imgui_md bullet workaround Table fix (src/markdown_table.py): - Add TableColumnFlags_.width_stretch to each table_setup_column call (was missing — columns had no width to wrap against, so text_wrapped couldn't grow row height → all rows squished together) - Remove the explicit for-h-in-headers: table_next_column + text_wrapped(h) loop. table_headers_row() already renders the header from the table_setup_column() names; the explicit loop was drawing it AGAIN on top → double-rendered header rows. Bullet fix (src/markdown_helper.py): - Revert _render_md_no_bullet_overlap → simple imgui_md.render(chunk); imgui.spacing() (the original `af0bbe97` approach). The complex workaround was stripping '- ' and rendering stripped text to imgui_md, which double-rendered '- 1. ...' content (imgui.bullet from my code + numbered list marker from imgui_md). - Add MarkdownRenderer._normalize_bullet_delimiters: regex-converts '* ' markers to '- ' before passing to imgui_md. This works around the upstream bug in mekhontsev/imgui_md BLOCK_LI where the '*' case calls ImGui::Bullet() without ImGui::SameLine(), causing the bullet to render on its own Y with the text on the next Y. The '-' case uses Text+SameLine which is correct. Cannot fix from Python (we can't subclass the C++ class) — pre-conversion is the cheapest fix. Tests: - test_markdown_table_wrapped.py: updated to assert new behavior (text_wrapped count == cell count, not header+cell). - test_markdown_table_columns.py: updated to assert exactly 6 table_next_column calls (cells only, not 9). - test_markdown_helper_bullets.py: rewrote for new public-API behavior (imgui_md.render called with the unstripped chunk). 16/16 markdown unit tests pass.	2026-06-03 21:14:16 -04:00
Conductor	919d28e950	test(markdown): add live_gui smoke test for markdown table + bullet rendering	2026-06-03 17:37:44 -04:00
Conductor	d15fdcdb05	fix(markdown): revert table to simple form with text_wrapped + add regression tests	2026-06-03 17:31:50 -04:00
ed	afa2f31e11	fix(markdown): add missing table_setup_column calls in render_table ROOT CAUSE: src/markdown_table.py:render_table was missing imgui.table_setup_column() calls. In ImGui, columns MUST be configured via table_setup_column before table_headers_row is called. Without it, the table has no defined columns, causing cells to render at overlapping Y positions. This manifested as text overlap in the Discussion Hub's read_mode entries (e.g., 'swc2 -> gte_sw' overlapping the line above it). FIX: Call imgui.table_setup_column(h, TableColumnFlags_.width_stretch) for each header BEFORE table_headers_row(). Each column now has a defined width (stretch = fills available space) and cells render correctly without overlap. Tests: - New test_markdown_table_columns.py asserts setup_column is called once per column and table_next_column is called for each cell. - 16/16 broad regression pass (test_markdown_table, test_markdown_table_render, test_markdown_render_robust, test_gen_send_empty_context, test_gui_fast_render)	2026-06-03 15:27:29 -04:00
ed	801321c125	fix(gui): remove ListClipper from render_prior_session_view (variable-height items) ROOT CAUSE: The ListClipper in render_prior_session_view was being tripped up by the variable heights of discussion entries (huge system prompts vs small tool results). When the first entry was very tall (system prompt), the clipper would compute the visible range assuming uniform item heights, leading to underflow/overflow on subsequent items. The user saw only the first ~8 entries with massive empty space below ('early clipping'). FIX: Replace the ListClipper with a direct for loop over app.prior_disc_entries. With 233 entries, performance is acceptable and each entry renders correctly. The user can still scroll the parent imscope.child window if content overflows. Tests: - Updated test_prior_session_no_clipping.py to set entries on app_instance.controller.prior_disc_entries (the App's __getattr__ proxies attribute reads to the controller, so the set must go to the controller directly). - 28/28 broad regression pass	2026-06-03 15:18:18 -04:00
ed	96b9b00c97	fix(gui): use imscope.child for comms_scroll (was inside conditional, leaving child open) ROOT CAUSE: render_comms_history_panel had imgui.end_child() nested INSIDE an 'if app._scroll_comms_to_bottom:' block at line 3758. When _scroll_comms_to_bottom was False (the common case), end_child was NOT called, leaving the comms_scroll child window open. This caused the imGui state to corrupt: tab_item.end_tab_item, tab_bar.end_tab_bar, and the outer window.end all saw that the child was still open (WithinEndChildID was set), triggering 'Must call EndChild() and not End()!' assertion. FIX: Convert the entire comms_scroll block to imscope.child (which uses Python's with statement for exception-safe end_child). The scroll-to-bottom logic is now correctly nested INSIDE the with block, and there's no manual end_child to forget. Tests: - Updated test_comms_scroll_no_clipping.py to check imscope.child instead of begin_child - 28/28 broad regression pass	2026-06-03 14:53:05 -04:00
ed	070c159f11	fix(gui): use imscope.child in render_heavy_text for exception safety ROOT CAUSE: render_heavy_text (called per comms panel entry) had manual begin_child/end_child pairs. If anything inside the child (especially markdown_helper.render) raised, end_child was skipped. The child window was left open, corrupting the imGui state. The corruption cascaded through tab_item.end_tab_item -> tab_bar.end_tab_bar -> window.end, triggering 'Must call EndChild() and not End()!' assertion. FIX: Convert the inner begin_child/end_child pair to imscope.child so the end_child is automatically called by Python's with statement, even on exception. Also convert prior_scroll to imscope.child for consistency. TESTS: - Existing test_comms_no_extraneous_pop.py: push/pop balance check - Updated test_prior_session_no_clipping.py to match new imscope.child signature - 28/28 broad regression pass	2026-06-03 14:38:26 -04:00
ed	228359679d	fix(gui): remove orphan pop_style_color in render_comms_history_panel ROOT CAUSE: In a previous fix (`df7bda6e` 'explicit child size for comms_scroll and prior_scroll'), the code that pushed a child_bg style color at the start of render_comms_history_panel was removed when the section was rewritten to use imgui.get_content_region_avail() for explicit child sizing. However, the matching pop_style_color at the end of the function (guarded by 'if app.is_viewing_prior_session') was left in place. RESULT: When viewing a prior session, the imscope.style_color in _gui_func pushes 1 color at the start of the frame, then the orphaned pop in render_comms_history_panel decrements the imGui style counter by 1, then _gui_func's imscope __exit__ tries to pop again — triggering IM_ASSERT 'PopStyleColor() too many times!'. This caused a cascade of imGui state corruption on every frame after loading a prior session log, manifesting as 'too many times' assertions on the next frame and 'Must call EndChild() and not End()' once the style stack underflowed. FIX: Remove the orphan pop_style_color at gui_2.py:3761. No matching push exists, so the pop is unconditionally wrong. TESTS: - New test_comms_no_extraneous_pop.py asserts push/pop balance in render_comms_history_panel when is_viewing_prior_session is True - 43/43 broad regression pass	2026-06-03 14:25:59 -04:00
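
Commits `96b9b00c97` and `070c159f11` above replace manual begin_child/end_child pairs with imscope.child so that end_child still runs when the body raises. The pattern can be sketched roughly as below. This is an illustration only: `FakeImgui`, `child`, and `render_entry` are hypothetical stand-ins that track nesting depth, not the project's imscope module or the imgui_bundle API.

```python
from contextlib import contextmanager


class FakeImgui:
    """Hypothetical stand-in that only tracks begin/end nesting depth."""

    def __init__(self):
        self.open_children = 0

    def begin_child(self, name):
        self.open_children += 1
        return True

    def end_child(self):
        if self.open_children == 0:
            raise RuntimeError("end_child called with no open child")
        self.open_children -= 1


@contextmanager
def child(imgui, name):
    """Open a child region and guarantee the matching end_child call."""
    visible = imgui.begin_child(name)
    try:
        yield visible
    finally:
        # Runs on normal exit and on exceptions, so the stack stays balanced.
        imgui.end_child()


def render_entry(imgui, fail=False):
    # Hypothetical renderer: the body may raise partway through a frame.
    with child(imgui, "comms_scroll"):
        if fail:
            raise ValueError("renderer failed mid-frame")


imgui = FakeImgui()
render_entry(imgui)
try:
    render_entry(imgui, fail=True)
except ValueError:
    pass
print(imgui.open_children)  # 0: no child left open after the failure
```

With a manual begin/end pair, the failing call would leave one child open, which is the state the 'Must call EndChild() and not End()!' assertion reports.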

610 Commits
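
The StartupProfiler described in commit `5a85653654` (a dataclass holding a list of phases, a context manager timed with time.perf_counter, try/finally recording, and a copied snapshot) could look something like the sketch below. It is reconstructed from the commit message alone, not from the repository. Two details are assumptions: `total_ms` is computed as the sum of phase durations, and a repeated phase name overwrites the earlier entry in the snapshot.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class _Phase:
    name: str
    start_ts: float
    end_ts: float


@dataclass
class StartupProfiler:
    _phases: List[_Phase] = field(default_factory=list)
    _enabled: bool = True

    def enable(self) -> None:
        self._enabled = True

    def disable(self) -> None:
        self._enabled = False

    def reset(self) -> None:
        self._phases.clear()

    @contextmanager
    def phase(self, name: str) -> Iterator[None]:
        if not self._enabled:
            yield
            return
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the phase even if the body raised.
            self._phases.append(_Phase(name, start, time.perf_counter()))

    def snapshot(self) -> Dict[str, object]:
        # Build fresh dicts so callers cannot mutate the live phase list.
        phases = {
            p.name: {"start_ts": p.start_ts,
                     "duration_ms": (p.end_ts - p.start_ts) * 1000.0}
            for p in self._phases
        }
        total = sum(entry["duration_ms"] for entry in phases.values())  # assumed definition
        return {"phases": phases, "total_ms": total, "count": len(phases)}


profiler = StartupProfiler()
with profiler.phase("config_hydration"):
    time.sleep(0.01)
try:
    with profiler.phase("hook_server_start"):
        raise RuntimeError("boom")
except RuntimeError:
    pass

snap = profiler.snapshot()
snap["phases"].clear()               # mutating the copy...
print(profiler.snapshot()["count"])  # ...leaves the profiler's two phases intact
```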