Private
Public Access
0
0
Commit Graph

4 Commits

Author SHA1 Message Date
ed 8d58d7fc46 fix(warmup): defer _done_event.set() until after callbacks fire
WarmupManager._record_success and _record_failure used to set
self._done_event.set() inside the with self._lock: block, BEFORE
calling the user-registered on_complete callbacks. This created
a race: a test thread calling mgr.wait() could observe
mgr.is_done() == True and proceed before the worker thread had
finished firing the callbacks. The mgr.on_complete caller would
then assert on state that the callback was supposed to mutate
(e.g. test_warmup_on_complete_callback_fires' `received` list).

Fix: move self._done_event.set() to AFTER the for cb in callbacks:
loop in both _record_success and _record_failure. The done event
is now set last, so wait() cannot return until all callbacks
have completed (or raised, which is swallowed by the try/except).

ALSO fix the previously-corrupted state of warmup.py (the result
of a misused set_file_slice edit that left orphaned code with no
def line for _record_failure). _record_failure is now a proper
class method with the def line restored.

ALSO fix tests/test_warmup.py:
  - test_warmup_on_complete_callback_fires: the test body was
    missing the pool/mgr setup. Added the missing lines.
  - test_warmup_done_event_set_after_all_complete: removed the
    racy `assert not mgr.is_done()` assertion that fires
    immediately after submit. On a fast machine, os/sys warmup
    completes in microseconds, so is_done() is already True
    by the time the assertion runs. The remaining assertion
    (`assert mgr.is_done()` after wait) still tests the
    semantic that the done event is set after completion.
  - Removed both `@pytest.mark.skip` markers; the underlying
    issues are now fixed in production code AND the tests.

Verified: 10/10 tests in tests/test_warmup.py pass (previously
2 skipped, 2 failed).
2026-06-07 16:02:30 -04:00
ed 152605f5dc feat(warmup): log canaries to stderr by default (with main-thread violation warning)
Per module: prints a one-line summary to stderr when the import
completes or fails:
  [warmup 1] google.genai on controller-io_0 (id=18636): 1218.6ms
  [warmup 2] anthropic on controller-io_1 (id=5500): 1148.3ms
  [warmup 3] openai on controller-io_2 (id=34376): 1144.2ms
  ...

When the entire warmup completes, prints an aggregate:
  [warmup done] 9 modules: 9 completed (sum of per-module elapsed: 3591.7ms)

If ANY canary ran on the main thread (main-thread-purity violation),
the per-module line is tagged with [MAIN-THREAD] AND a final WARNING
is printed:
  [warmup WARNING] N module(s) loaded on the MAIN THREAD: google.genai

Default is log_to_stderr=True so production runs get the observability
for free. Tests opt out via WarmupManager(pool, log_to_stderr=False)
in the _build_warmup helper.

5 new tests (4 stderr logging + 1 quiet). All 13 canary tests pass.

Use case: 'did my heavy import run on the GUI thread when it shouldnt
have?' is now answered by grepping stderr for [warmup ...] [MAIN-THREAD]
lines. No hook server required.
2026-06-06 22:15:24 -04:00
ed 208aa664db feat(warmup): per-module canary records (thread + timing observability)
Adds a canary record for each module submitted to the warmup, tracking:
canary_id, module, thread_name, thread_id, submit_ts, start_ts,
end_ts, elapsed_ms, status, error.

Surface:
- WarmupManager.canaries() returns list[dict] (defensive copy)
- AppController.warmup_canaries() returns list[dict] (delegation)
- GET /api/warmup_canaries Hook API endpoint
- ApiHookClient.get_warmup_canaries() returns list[dict]

Example: the warmup of google.genai records a 1187ms canary on
thread controller-io_0 with thread_id 50420, canary_id 1.

11 new tests (8 unit in test_warmup_canaries + 3 in test_api_hooks_warmup).
All pass; live_gui smoke test confirms endpoint returns real data.
2026-06-06 22:02:35 -04:00
ed 1354679e33 feat(io_pool, warmup): add shared 4-thread pool + WarmupManager
Phase 2 Tasks T2.1-T2.4 of the startup_speedup_20260606 track.

NEW: src/io_pool.py
  make_io_pool() factory: 4-worker ThreadPoolExecutor with
  thread_name_prefix='controller-io'. The sanctioned way for any
  background work. Replaces ad-hoc threading.Thread() calls per
  the 'no new threads' rule.

NEW: src/warmup.py
  WarmupManager: manages a list of modules to import on the shared
  pool. Public API:
    .submit(modules)        - start warmup (call once)
    .status()               - {pending, completed, failed}
    .is_done()              - bool
    .wait(timeout)          - block until done
    .on_complete(callback)  - register completion callback
    .reset()                - clear state
  Thread-safe (lock-guarded). 10 tests cover all paths.

NEW: tests/test_io_pool.py (4 tests):
  - ThreadPoolExecutor returned
  - 4 workers
  - Threads named 'controller-io-*'
  - Jobs run in parallel (barrier test)

NEW: tests/test_warmup.py (10 tests):
  - One job per module submitted
  - Initial pending list correct
  - Failed imports tracked
  - Done event set after all complete
  - wait() blocks until done
  - on_complete callback fires (and immediately if already done)
  - Modules actually end up in sys.modules
  - reset() clears state
  - Jobs run concurrently (not serially)

All 14 tests pass. AppController integration is the next commit.
2026-06-06 14:47:02 -04:00