docs(ai_client+api_hooks): lazy-loading + warmup endpoints (startup_speedup)
guide_ai_client.md:
- Add 'Module-Level Imports' section explaining that the 5 provider SDKs are NOT imported at module level; they're obtained via src.module_loader._require_warmed() after the WarmupManager loads them in the background. (Per startup_speedup_20260606: import src.ai_client went from ~1800ms to ~161ms.)

guide_api_hooks.md:
- Add 4 warmup endpoints to the endpoints table: /api/warmup_status, /api/warmup_wait?timeout=N, /api/warmup_canaries, /api/startup_timeline
- Add 'Warmup API' section with client methods + external script pattern (use get_warmup_wait() instead of time.sleep() race)
@@ -12,6 +12,12 @@ The module is a **stateful singleton** — all provider state is held in module-
---
## Module-Level Imports

> **Important:** The 5 provider SDKs are **NOT** imported at module level. Imports such as `google.genai`, `anthropic`, `openai`, and `fastapi` are heavy (~430-955ms each on cold load), so the module now obtains them through `src.module_loader._require_warmed("google.genai")` and similar calls once the `WarmupManager` has loaded them in the background. The module-level globals listed in the State section (`_gemini_client`, `_anthropic_client`, etc.) are typed as `Optional` because they are populated via `_require_warmed()` on first use, not at import time.

This change was part of the `startup_speedup_20260606` track (2026-06-06). It cut `import src.ai_client` from ~1800ms to ~161ms; the remaining cost is the bare module skeleton.
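The deferred-import pattern described above can be sketched roughly as follows. This is a minimal illustration only, not the project's `src.module_loader` or `WarmupManager`: the class `WarmupSketch` and its methods are hypothetical names, and stdlib modules stand in for the heavy SDKs.

```python
import importlib
import threading


class WarmupSketch:
    """Hypothetical stand-in for a warmup manager: imports modules on a background thread."""

    def __init__(self, module_names):
        self._names = list(module_names)
        self._loaded = {}   # module name -> imported module
        self._failed = {}   # module name -> ImportError raised during warmup
        self._lock = threading.Lock()
        self._done = threading.Event()

    def start(self):
        # Daemon thread so a slow import never keeps the process alive on exit.
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        for name in self._names:
            try:
                module = importlib.import_module(name)
                with self._lock:
                    self._loaded[name] = module
            except ImportError as exc:
                with self._lock:
                    self._failed[name] = exc
        self._done.set()

    def require_warmed(self, name, timeout=30.0):
        """Block until warmup has finished, then return the already-imported module."""
        if not self._done.wait(timeout):
            raise TimeoutError(f"warmup did not finish within {timeout}s")
        with self._lock:
            if name in self._failed:
                raise self._failed[name]
            return self._loaded[name]  # KeyError if the name was never registered


# Cheap stdlib modules play the role of the heavy provider SDKs here.
manager = WarmupSketch(["json", "decimal"])
manager.start()
json_module = manager.require_warmed("json")
```

The point of the design is that importing the module that owns `manager` stays cheap; the import cost is paid on the background thread and only surfaces if a caller asks for a module before warmup has finished.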
## Architecture
```
@@ -76,8 +76,27 @@ The server runs in a daemon thread. It stops when the process exits (or via `ser
| `GET` | `/api/performance` | Performance metrics (FPS, frame time) |
| `GET` | `/api/comms` | Communication log |
| `GET` | `/api/diagnostics` | Diagnostics state |
| `GET` | `/api/warmup_status` | Warmup progress snapshot: `{pending, completed, failed}` module lists |
| `GET` | `/api/warmup_wait?timeout=N` | Server-side blocking wait for warmup completion (up to N seconds) |
| `GET` | `/api/warmup_canaries` | Per-module import timing records (`canary_id`, `module`, `thread`, `elapsed_ms`, `status`) |
| `GET` | `/api/startup_timeline` | Startup phase breakdown: `init_start_ts`, `warmup_done_ts`, `first_frame_ts`, `warmup_ms`, `first_frame_after_init_ms`, `first_frame_after_warmup_ms` |

(Full endpoint list may grow; check the live server for the canonical list.)
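As a rough illustration of how the `/api/startup_timeline` fields in the table might relate to each other, the sketch below recomputes the three `*_ms` values from the three `*_ts` values. It assumes the timestamps are seconds on a common clock and that each `*_ms` field is a plain difference; neither assumption is confirmed here, and `derive_timeline_ms` is a hypothetical helper, not part of the server.

```python
def derive_timeline_ms(timeline):
    """Recompute the duration fields from the timestamp fields.

    Assumption: the *_ts values are seconds (floats) on the same clock.
    """
    init_start = timeline["init_start_ts"]
    warmup_done = timeline["warmup_done_ts"]
    first_frame = timeline["first_frame_ts"]
    return {
        "warmup_ms": (warmup_done - init_start) * 1000.0,
        "first_frame_after_init_ms": (first_frame - init_start) * 1000.0,
        "first_frame_after_warmup_ms": (first_frame - warmup_done) * 1000.0,
    }


# Made-up sample payload, for illustration only.
sample = {"init_start_ts": 100.0, "warmup_done_ts": 101.5, "first_frame_ts": 102.0}
derived = derive_timeline_ms(sample)
```

Under these assumptions, a negative `first_frame_after_warmup_ms` would mean the first frame was painted while warmup was still running.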
### Warmup API

The 4 warmup endpoints (added in `startup_speedup_20260606`) let external clients (test harnesses, scripts, the Command Palette's "Restart" actions) answer two questions:

1. **Is the app ready?** `get_warmup_status()` returns the current `{pending, completed, failed}` module lists. `is_warmup_done()` (via `wait_for_warmup`) blocks until all modules are done.
2. **Did the warmup block the first frame?** `get_startup_timeline()` returns the 3 phase breakdowns (AppController init, GUI bundle setup, first render) plus the critical gap between warmup completion and the first frame paint.

**Client methods** in `ApiHookClient` (`src/api_hook_client.py:312-348`):

| Method | Endpoint | Purpose |
|---|---|---|
| `get_warmup_status()` | `GET /api/warmup_status` | Returns `{pending, completed, failed}` |
| `get_warmup_wait(timeout=30.0)` | `GET /api/warmup_wait?timeout=N` | Server-side blocking wait |
| `get_warmup_canaries()` | `GET /api/warmup_canaries` | Per-module import timing |
| `get_startup_timeline()` | `GET /api/startup_timeline` | Phase breakdown dict |

**External script pattern:** A test or script that needs the app fully ready should call `client.get_warmup_wait(timeout=60)` before making any other API call. This replaces the old `time.sleep(N)` pattern, which raced against warmup. See `tests/test_api_hooks_warmup.py` for usage examples.
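A script following this pattern might look roughly like the sketch below. The method names `get_warmup_wait` and `get_warmup_status` come from the table above; everything else is an assumption. In particular, `wait_until_ready` is a hypothetical helper, the return value of `get_warmup_wait` is not relied on because its shape is not documented here, and `FakeHookClient` is a stand-in so the example runs without a live server.

```python
class FakeHookClient:
    """Stand-in exposing the two client methods used below; not the real ApiHookClient."""

    def __init__(self, status):
        self._status = status
        self.wait_calls = []

    def get_warmup_wait(self, timeout=30.0):
        # The real method blocks server-side; the fake only records the call.
        self.wait_calls.append(timeout)

    def get_warmup_status(self):
        return self._status


def wait_until_ready(client, timeout=60.0):
    """Block on the server-side wait, then confirm nothing is pending or failed."""
    client.get_warmup_wait(timeout=timeout)
    status = client.get_warmup_status()
    if status.get("pending") or status.get("failed"):
        raise RuntimeError(f"warmup incomplete: {status}")
    return status


ready_client = FakeHookClient({"pending": [], "completed": ["openai"], "failed": []})
ready_status = wait_until_ready(ready_client)
```

Checking the status after the wait means a timeout or a failed module surfaces as an explicit error instead of a confusing failure in a later API call.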
### Request/Response Format