Commit Graph

2572 Commits

Author SHA1 Message Date
ed 922c5ad9ab feat(app_controller): wire _io_pool + warmup + 5 public delegation methods
Phase 2 Task T2.5 of the startup_speedup_20260606 track.

In AppController.__init__, right after the lock init (and before the
heavy subsystem construction that follows), create the shared _io_pool
and WarmupManager, then submit the warmup list. The warmup runs
concurrently with the rest of __init__, so by the time __init__
returns, the heavy modules are loaded (or in flight).

Changes:
  - Add imports: from src.io_pool import make_io_pool,
    from src.warmup import WarmupManager
  - In __init__, after the locks block, add:
      self._io_pool = make_io_pool()
      self._warmup = WarmupManager(self._io_pool)
      self._warmup.submit(self._compute_warmup_list())
  - Add _compute_warmup_list() method: returns ['google.genai',
    'anthropic', 'openai', 'requests', 'src.command_palette',
    'src.theme_nerv', 'src.theme_nerv_fx', 'src.markdown_table',
    'numpy'] always, plus ['fastapi', 'fastapi.security.api_key']
    if self.test_hooks_enabled
  - Add public delegation methods: warmup_status(), is_warmup_done(),
    wait_for_warmup(timeout), on_warmup(callback)
  - In shutdown(), add self._io_pool.shutdown(wait=False)

The warmup currently is a no-op for the heavy modules already imported
at the top of app_controller.py (fastapi, requests, etc. are
already in sys.modules). The infrastructure is in place; Phase 3 will
remove the top-level imports so the warmup actually does work.

Verified: all 18 tests pass (test_io_pool + test_warmup + existing
test_app_controller_mcp + test_app_controller_offloading).
2026-06-06 14:48:51 -04:00
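The module selection described for _compute_warmup_list() in commit 922c5ad9ab could look roughly like the sketch below. It is written as a free function for brevity (the commit describes a method reading self.test_hooks_enabled), and the constant names BASE_WARMUP and HOOK_WARMUP are hypothetical.

```python
# Modules the commit message lists as always warmed.
BASE_WARMUP = [
    "google.genai", "anthropic", "openai", "requests",
    "src.command_palette", "src.theme_nerv", "src.theme_nerv_fx",
    "src.markdown_table", "numpy",
]
# Modules warmed only when test hooks are enabled.
HOOK_WARMUP = ["fastapi", "fastapi.security.api_key"]


def compute_warmup_list(test_hooks_enabled: bool) -> list[str]:
    """Return the module names to submit to the warmup manager."""
    modules = list(BASE_WARMUP)  # copy so callers cannot mutate the constant
    if test_hooks_enabled:
        modules += HOOK_WARMUP
    return modules
```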
ed 1354679e33 feat(io_pool, warmup): add shared 4-thread pool + WarmupManager
Phase 2 Tasks T2.1-T2.4 of the startup_speedup_20260606 track.

NEW: src/io_pool.py
  make_io_pool() factory: 4-worker ThreadPoolExecutor with
  thread_name_prefix='controller-io'. The sanctioned way for any
  background work. Replaces ad-hoc threading.Thread() calls per
  the 'no new threads' rule.

NEW: src/warmup.py
  WarmupManager: manages a list of modules to import on the shared
  pool. Public API:
    .submit(modules)        - start warmup (call once)
    .status()               - {pending, completed, failed}
    .is_done()              - bool
    .wait(timeout)          - block until done
    .on_complete(callback)  - register completion callback
    .reset()                - clear state
  Thread-safe (lock-guarded). 10 tests cover all paths.

NEW: tests/test_io_pool.py (4 tests):
  - ThreadPoolExecutor returned
  - 4 workers
  - Threads named 'controller-io-*'
  - Jobs run in parallel (barrier test)

NEW: tests/test_warmup.py (10 tests):
  - One job per module submitted
  - Initial pending list correct
  - Failed imports tracked
  - Done event set after all complete
  - wait() blocks until done
  - on_complete callback fires (and immediately if already done)
  - Modules actually end up in sys.modules
  - reset() clears state
  - Jobs run concurrently (not serially)

All 14 tests pass. AppController integration is the next commit.
2026-06-06 14:47:02 -04:00
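A minimal sketch of the two pieces described in commit 1354679e33, assuming only the public API listed in the message. Internal names (_import_one, _finish, the private attributes) are hypothetical, reset() is omitted for brevity, and the real src/warmup.py may be structured differently.

```python
import importlib
import threading
from concurrent.futures import ThreadPoolExecutor


def make_io_pool() -> ThreadPoolExecutor:
    """Shared pool: 4 workers whose thread names start with 'controller-io'."""
    return ThreadPoolExecutor(max_workers=4, thread_name_prefix="controller-io")


class WarmupManager:
    """Imports a list of modules on the pool and tracks the outcome."""

    def __init__(self, pool: ThreadPoolExecutor):
        self._pool = pool
        self._lock = threading.Lock()
        self._done = threading.Event()
        self._pending, self._completed, self._failed = [], [], {}
        self._callbacks = []

    def submit(self, modules):
        names = list(modules)
        with self._lock:
            self._pending = list(names)
        if not names:
            self._finish()
        for name in names:  # one job per module
            self._pool.submit(self._import_one, name)

    def _import_one(self, name):
        error = None
        try:
            importlib.import_module(name)
        except Exception as exc:  # record the failure instead of killing the worker
            error = repr(exc)
        with self._lock:
            self._pending.remove(name)
            if error is None:
                self._completed.append(name)
            else:
                self._failed[name] = error
            finished = not self._pending
        if finished:
            self._finish()

    def _finish(self):
        with self._lock:
            self._done.set()
            callbacks, self._callbacks = self._callbacks, []
        for callback in callbacks:  # run outside the lock
            callback()

    def status(self):
        with self._lock:
            return {"pending": list(self._pending),
                    "completed": list(self._completed),
                    "failed": dict(self._failed)}

    def is_done(self):
        return self._done.is_set()

    def wait(self, timeout=None):
        return self._done.wait(timeout)

    def on_complete(self, callback):
        with self._lock:
            if not self._done.is_set():
                self._callbacks.append(callback)
                return
        callback()  # already done: fire immediately
```

Callbacks registered before completion run on whichever worker finishes last, so in this sketch they should not touch state that is only safe on the main thread.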
ed 7fdab70529 conductor(plan): write 4-phase implementation plan for test_batching_refactor_20260606
16 tasks across 4 phases, each with explicit Red-Green-Refactor TDD steps:
- Phase 1 (1.1-1.16): Library + dry-run. 20 unit tests across categorizer,
  batcher, plugin. New run_tests_batched.py has --plan/--audit only.
- Phase 2 (2.1-2.3): Shadow run via CI. Compare new vs old plan output.
- Phase 3 (3.1-3.4): Switch default. Full CLI with --tiers, --durations.
  Old script becomes .legacy. Update docs/guide_testing.md.
- Phase 4 (4.1-4.6): Populate registry, gitignore durations, delete
  legacy, archive track.

1-space indentation per project style guide. No placeholders. All
test code is concrete.
2026-06-06 14:24:39 -04:00
ed f9a0125847 conductor(plan): Phase 1 complete - baseline + audit infrastructure ready
Phase 1 of startup_speedup_20260606 track is done.

Tasks completed:
  T1.1 baseline benchmark        -> 6f9a3af2 (docs/reports/startup_baseline_20260606.txt)
  T1.2 audit_gui2_imports.py     -> 6f9a3af2 (scripts/ + audit results)
  T1.3 StartupProfiler           -> 5a856536 (src/ + 5 tests)
  T1.4 audit_main_thread_imports -> 6f9a3af2 (scripts/ + 9 tests)
  T1.5 plan update                -> this commit

Baseline numbers (3-run median, from scripts/benchmark_imports.py):
  src.gui_2                1770ms   (main-thread bottleneck)
  simulation.user_agent    1517ms
  google.genai             1001ms
  openai                    482ms
  anthropic                 441ms
  imgui_bundle              255ms   (KEEP - ImGui hot path)
  src.theme_nerv_fx         254ms
  src.theme_nerv            246ms
  src.markdown_table        243ms
  src.command_palette       242ms

Audit violations on current codebase: 67. These are the targets
for Phases 3-5 (remove top-level heavy imports to fix each one).

Next: Phase 2 (Job Pool + Warmup Foundation).
2026-06-06 14:24:20 -04:00
ed 6f9a3af201 feat(audit): add main-thread import graph audit + baseline measurements
Phase 1, Tasks T1.2 + T1.4 of the startup_speedup_20260606 track.

NEW: scripts/audit_main_thread_imports.py
  Static CI gate that AST-walks the import graph reachable from
  sloppy.py and fails (exit 1) if any heavy module is imported at the
  top of a main-thread-reachable file. Walks into if/elif/else and
  try/except branches (which run at import time) but skips function
  bodies (which only run when called). Allowlist: stdlib + the lean
  gui_2 skeleton (imgui_bundle, defer, src.imgui_scopes, src.theme_2,
  src.theme_models, src.paths, src.models, src.events).

NEW: scripts/audit_gui2_imports.py
  Read-only analysis tool that lists every top-level and function-level
  import in src/gui_2.py, classified by location. Used in Phase 5D to
  identify which imports to remove.

NEW: tests/test_audit_main_thread_imports.py
  9 tests covering: --help exits 0, clean stdlib-only passes, heavy
  third-party fails, google.genai fails, transitive walks, function-
  body imports ignored, if-branch imports flagged, try-block imports
  flagged, file:line reported. All 9 pass.

NEW: docs/reports/startup_baseline_20260606.txt
  3-run median cold-start benchmark. Worst offenders: src.gui_2
  (1770ms), simulation.user_agent (1517ms), google.genai (1001ms),
  openai (482ms), anthropic (441ms), imgui_bundle (255ms),
  src.theme_nerv* (485ms combined), src.markdown_table (243ms),
  src.command_palette (242ms).

NEW: docs/reports/startup_audit_20260606.txt
  Audit output on the CURRENT codebase. Reports 67 violations across
  the main-thread import graph (incl. numpy in src/gui_2.py:9,
  tomli_w in src/gui_2.py:18, fastapi + requests in src/app_controller,
  tree_sitter_* in src/file_cache, pydantic in src/models, plus all
  the src.* subsystem imports that drag in heavy transitive deps).
  Phase 3-5 of the track will resolve these one by one.

After Phase 3-5, this audit must exit 0 (no violations).

Co-located reports in docs/reports/ per project convention; the other
agent finished their work in docs/superpowers/ and is unrelated.
2026-06-06 14:22:18 -04:00
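The import-time walk described in commit 6f9a3af201 (descend into if/try branches, skip function bodies) can be sketched for a single source string as follows. The HEAVY set and both function names are hypothetical, relative imports are ignored, and the transitive walk from sloppy.py is left out.

```python
import ast

# Hypothetical denylist; the real script works from an allowlist instead.
HEAVY = {"numpy", "requests", "fastapi"}


def import_time_imports(source: str):
    """Return sorted (lineno, module) pairs for imports that run at import time."""
    found = []
    stack = list(ast.parse(source).body)
    while stack:
        node = stack.pop()
        # Function bodies only run when called, so do not descend into them.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
            continue
        if isinstance(node, ast.Import):
            found.extend((node.lineno, alias.name) for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.append((node.lineno, node.module))
        # if/elif/else, try/except, with, and class bodies all execute on import.
        stack.extend(ast.iter_child_nodes(node))
    return sorted(found)


def violations(source: str, heavy=HEAVY):
    """Filter the import-time imports down to those rooted in a heavy package."""
    return [(line, mod) for line, mod in import_time_imports(source)
            if mod.split(".")[0] in heavy]
```

A CI gate built on this would print each hit as file:line and exit 1 when the list is non-empty.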
ed 0553983ce9 conductor(spec): Clarify --audit --strict semantics in Section 4.3
Default --audit exits non-zero on hard errors only. --strict adds the
'multiple subsystems = probably cross-cutting' heuristic from Section 9
as a CI gate. Two modes, one flag.
2026-06-06 14:16:13 -04:00
ed cbfd78c51d conductor(tracks): Register test_batching_refactor_20260606 in registry 2026-06-06 14:14:11 -04:00
ed b7a9737443 conductor(track): Initialize test_batching_refactor_20260606 spec
Three-tier batching refactor: replace alphabetical 4-at-a-time batching with
fixture-class-isolated tiers (0 opt-in, 1 unit/xdist, 2 mock_app, 3 live_gui
in one session, H headless, P performance).

Hybrid classification: auto-infer from filename + AST fixture scan; hand-curated
tests/test_categories.toml overrides for cross-cutting and ambiguous files.

Opt-in per-test order control via [[files.X.test_order]] sub-tables, gated on
a conftest-loaded pytest plugin (no-op without entries).

Priority order: B (process isolation) > A (subsystem diagnostic) > C (speed).
2026-06-06 14:12:14 -04:00
ed 96158edd97 conductor(plan): mark T1.3 StartupProfiler complete (5a856536) 2026-06-06 13:59:02 -04:00
ed 5a85653654 feat(startup_profiler): add StartupProfiler for per-phase init timing
Lightweight, in-memory profiler for AppController init phases. Used by
the startup_speedup_20260606 track to measure where the time goes
during boot (config hydration, hook server start, subsystem init, etc.).

The profiler is exposed via /api/startup_profile (Phase 8 work) and
the Diagnostics panel so the user can see the exact per-phase cost.

Public API:
  StartupProfiler() - create
  .phase(name) - context manager
  .snapshot() - {phases: {name: {start_ts, duration_ms}}, total_ms, count}
  .reset() - clear recorded phases
  .enable() / .disable() - toggle recording

Implementation:
  - dataclass with list of _Phase(name, start_ts, end_ts)
  - @contextmanager records wall-clock via time.perf_counter
  - records duration even if the body raises (try/finally)
  - snapshot is a copy, so consumers can't mutate the live state

TDD: 5 tests in tests/test_startup_profiler.py cover: basic
recording, total math, snapshot isolation, exception safety, empty
state.
2026-06-06 13:57:26 -04:00
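The implementation notes in commit 5a85653654 suggest a shape like the sketch below. Field names follow the message; computing total_ms as the sum of phase durations is an assumption, as is keying the snapshot by phase name (a repeated name would overwrite the earlier entry here).

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class _Phase:
    name: str
    start_ts: float
    end_ts: float


@dataclass
class StartupProfiler:
    enabled: bool = True
    _phases: list = field(default_factory=list)

    @contextmanager
    def phase(self, name: str):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Runs even if the body raises, so the duration is still recorded.
            if self.enabled:
                self._phases.append(_Phase(name, start, time.perf_counter()))

    def snapshot(self) -> dict:
        # Build fresh dicts so callers cannot mutate the live state.
        phases = {
            p.name: {"start_ts": p.start_ts,
                     "duration_ms": (p.end_ts - p.start_ts) * 1000.0}
            for p in self._phases
        }
        total = sum(entry["duration_ms"] for entry in phases.values())
        return {"phases": phases, "total_ms": total, "count": len(phases)}

    def reset(self):
        self._phases.clear()

    def enable(self):
        self.enabled = True

    def disable(self):
        self.enabled = False
```

Usage would be `with profiler.phase("config_hydration"): ...` around each init step, where the phase name is whatever label the caller chooses.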
ed f2f5ee1197 conductor(plan): flip track from lazy-loading to proactive warmup
Architectural shift driven by user clarification: lazy-loading on first
use causes user-perceptible lag when the user-triggered action (e.g.
provider switch) propagates to a controller method that triggers the
first import. The fix is to pre-import heavy modules on a bg thread
at startup and have functions access them via _require_warmed().

Old design (rejected):
  - from google import genai inside _send_gemini (lazy on first call)
  - First user action that triggers this pays the cost; UI feels laggy

New design (this commit):
  - Top-level heavy imports REMOVED from main-thread-reachable files
  - AppController.__init__ submits warmup jobs to _io_pool (4 threads,
    named 'controller-io-N')
  - Each warmup worker imports its module and updates a thread-safe
    warmup_status dict
  - Functions access modules via _require_warmed(name), which assumes
    the module is in sys.modules (warmed at startup)
  - When all jobs complete, _warmup_done_event is set and registered
    on_warmup_complete callbacks fire
  - GUI shows status indicator + toast when warmup completes
  - Hook API exposes /api/warmup_status and /api/warmup_wait
  - Tests can call controller.wait_for_warmup() before exercising
    warmup-dependent functionality

Phase 2 now bundles job pool + warmup (T2.3+T2.4 add warmup tests +
implementation). Phases 3-5 do 'remove top-level imports' instead of
'lazy-load'. Phase 7 is the notification surface (Hook API + GUI).
Definition of Done includes warmup-completion criteria, the
'no function-body imports' check, and an end-to-end 'provider switch
is INSTANT' smoke test.

No code changes; this is a planning update only.
2026-06-06 13:45:05 -04:00
ed ca254bac41 fix(imports): break models<->dag_engine circular dependency
Track.get_executable_tickets (in models.py) called TrackDAG at
runtime, forcing a top-level import of src.dag_engine into models.py
and creating a 2-cycle that broke whichever module loaded second
(Ticket was not yet defined when models.py loaded first; TrackDAG
was not yet defined when dag_engine.py loaded first).

Fix: hoist the method out of the Track dataclass and into a free
function get_executable_tickets(track) in dag_engine.py. models.py
no longer needs TrackDAG at all, so the dependency is now one-directional
(dag_engine imports models) and resolves cleanly in any import order.

Tests updated:
- tests/test_mma_models.py: import get_executable_tickets and call
  it instead of track.get_executable_tickets() (4 call sites)
- tests/test_conductor_engine_v2.py: comment update

Verified both import orders resolve cleanly:
  forward:  import src.models; import src.dag_engine  -> OK
  reverse:  import src.dag_engine; import src.models  -> OK
34 tests pass (test_mma_models, test_dag_engine, test_execution_engine,
test_arch_boundary_phase3, test_track_state_schema).
2026-06-06 13:30:18 -04:00
r00tz 9e4fac496d made local RAG dependencies optional (avoids requiring torch / sentence-transformers if you never use local embedding) 2026-06-06 13:21:43 -04:00
ed 32e633b3ec conductor(plan): mark startup_speedup_20260606 track creation committed (cd4fb045) 2026-06-06 13:01:32 -04:00
ed cd4fb04541 conductor(track): create startup_speedup_20260606 track for sloppy.py startup latency
Fulfills the existing backlog entry at conductor/tracks.md:152
(2026-06-05 root-cause analysis of live_gui wait_for_server timeouts).

Main Thread Purity Invariant: the main thread (entering immapp.run())
must never import a module heavier than imgui_bundle and the lean
gui_2 skeleton. Enforced by:
  - static gate: scripts/audit_main_thread_imports.py (CI)
  - runtime hook: tests/test_main_thread_purity.py (sys.addaudithook)

Threading constraint: no new threading.Thread(...) calls in src/.
All background work goes through AppController._io_pool
(ThreadPoolExecutor, max_workers=4, thread_name_prefix='controller-io').

9 phases, 57 tasks: audit+baseline, job pool, lazy-load SDKs, lazy-load
FastAPI, lazy-load feature-gated GUI, migrate ad-hoc threads, runtime
enforcement, hook API + diagnostics, verify+checkpoint.

Expected savings: ~2000-2400ms off main-thread import cost.
Target: import src.ai_client < 50ms (from ~1800ms), live_gui fixtures
no longer time out at wait_for_server(timeout=15).
2026-06-06 12:57:20 -04:00
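The runtime half of the Main Thread Purity Invariant in commit cd4fb04541 relies on sys.addaudithook. One way such a hook might be written is sketched below; install_purity_hook and its parameters are hypothetical, and the sketch records violations rather than failing the import. CPython raises the "import" audit event when an import statement has to load a module that is not yet in sys.modules, with the module name as the first argument.

```python
import sys
import threading


def install_purity_hook(heavy, guarded_thread=None):
    """Record heavy modules first imported on the guarded thread.

    Returns the list that the hook appends to. Audit hooks cannot be
    removed once installed, so this is meant to be called once per process.
    """
    guarded = guarded_thread or threading.main_thread()
    violations = []

    def hook(event, args):
        if event != "import":
            return
        if threading.current_thread() is not guarded:
            return  # imports on pool workers are allowed
        module = args[0]
        if module.split(".")[0] in heavy:
            violations.append(module)

    sys.addaudithook(hook)
    return violations
```

A test in the spirit of tests/test_main_thread_purity.py could install the hook, start the application, and assert that the returned list stays empty.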
ed 2adf3274af add benchmark script 2026-06-06 12:47:41 -04:00
ed 311fde9a8b fixes 2026-06-06 12:44:07 -04:00
ed 9ccaf0594c some org on ai_client 2026-06-06 11:35:20 -04:00
ed 9d72d98b50 conductor(tracks): mark rag_phase4_stress_test_flake resolved (commit 16412ad5) 2026-06-06 11:29:03 -04:00
ed 16412ad5f9 fix(rag): detect ChromaDB dim mismatch and recreate collection on provider switch 2026-06-06 11:26:47 -04:00
ed 339b062913 more organization 2026-06-06 11:08:07 -04:00
ed 7d555361f9 more organization 2026-06-06 10:24:22 -04:00
ed 1c627bcc30 fix(docs): correct section order in guide_testing (patterns before See Also) + fix LF/CRLF 2026-06-06 09:34:38 -04:00
ed 0f742b1d5f conductor(workflow): add Indentation-Driven Class Method Visibility pitfall (2026-06-05) 2026-06-06 02:04:05 -04:00
ed e276bac093 docs(gui_2): add __getattr__/__setattr__ delegation pattern + indentation gotcha 2026-06-06 01:59:20 -04:00
ed 4ee22dedb9 docs(testing): add Narrow Test Paths + Indentation-Driven Method Visibility patterns 2026-06-06 01:53:25 -04:00
ed e7b8877f2a docs(readme): update for v2 completion (24 guides, 273 test files, 98.9% pass rate) 2026-06-06 01:42:45 -04:00
ed 5e0b6bbfd3 conductor(tracks): queue RAG test flake as new backlog item; mark prior_session complete 2026-06-06 01:35:21 -04:00
ed 008179360f conductor(index): v2 recently shipped, all 4 live_gui failures resolved 2026-06-06 01:30:03 -04:00
ed 9a3831897b conductor(tracks): mark live_gui_test_hardening_v2 complete (root cause was indent, not state sync) 2026-06-06 01:28:02 -04:00
ed 26e0ced4d9 test(prior_session): refactor to narrow render_prior_session_view (50+ mocks -> 20) 2026-06-06 01:12:29 -04:00
ed 11f8772401 docs(spec): live_gui_state_sync — REAL root cause is bad indent in _capture_workspace_profile 2026-06-06 01:08:07 -04:00
ed c4691a54b0 fking python 2026-06-06 01:05:00 -04:00
ed 6c541bc788 move track mds to tracks 2026-06-06 00:42:40 -04:00
ed e670fc1c3e more org 2026-06-06 00:40:07 -04:00
ed 053f5d867a some organization pass, still need to review a bunch 2026-06-06 00:21:36 -04:00
ed f8b0a1243d add note about hook helpers... 2026-06-05 23:03:45 -04:00
ed 7785f09fa9 Some organizing of the api_hook_client.py 2026-06-05 23:02:41 -04:00
ed 5c23ad190d conductor(tracks): link v2 to 4 sub-track specs and plans 2026-06-05 22:56:55 -04:00
ed 3e52f20d16 docs(spec+plan): undo_redo_lifecycle_fix (3-phase investigation: state-sync vs snapshot vs flake) 2026-06-05 22:49:16 -04:00
ed b692353e98 docs(spec+plan): wait_for_ready_test_pattern (replace time.sleep with polling) 2026-06-05 22:45:14 -04:00
ed 85cd34683a docs(spec+plan): prior_session_test_harden (refactor to narrow render_prior_session_view) 2026-06-05 22:41:46 -04:00
ed 9542c4c750 docs(spec+plan): live-gui state sync (App/Controller single source of truth) 2026-06-05 22:36:55 -04:00
ed aa56981c87 organizing (mostly aggregate.py) 2026-06-05 22:34:26 -04:00
ed 8b83c5d0b7 conductor(index): v2 active, v1 + regression_fixes now in recently-shipped 2026-06-05 22:12:34 -04:00
ed 70c18f92c3 conductor(tracks): mark v1 fragility_fixes complete, queue v2 (state sync + undo_redo + prior_session) 2026-06-05 22:09:30 -04:00
ed 873edf42cf began to go through the files and organize imports and gui_2.py's new context defs
still a bunch to sift through after the last ai passes
2026-06-05 21:44:41 -04:00
ed 1d89fcaf8a update readme 2026-06-05 21:33:06 -04:00
ed ed98481578 update readme with note 2026-06-05 21:32:46 -04:00
ed 1488e71568 docs: add Sentinel type contract note to 3 defer-not-catch sections 2026-06-05 20:31:38 -04:00