diff --git a/docs/guide_ai_client.md b/docs/guide_ai_client.md
index e1655bcc..a9e8a683 100644
--- a/docs/guide_ai_client.md
+++ b/docs/guide_ai_client.md
@@ -12,6 +12,12 @@ The module is a **stateful singleton** — all provider state is held in module-
 
 ---
 
+## Module-Level Imports
+
+> **Important:** The provider SDKs and other heavy dependencies are **NOT** imported at module level. `import google.genai`, `import anthropic`, `import openai`, and `import fastapi` are heavy (~430-955ms each on cold load), so they are now obtained via `src.module_loader._require_warmed("google.genai")` and similar calls after the `WarmupManager` has loaded them in the background. The module-level globals listed in the State section (`_gemini_client`, `_anthropic_client`, etc.) are typed as `Optional` because they are populated via `_require_warmed()` on first use, not at import time.
+
+This change was part of the `startup_speedup_20260606` track (2026-06-06). It reduced the cost of `import src.ai_client` from ~1800ms to ~161ms; the remaining cost is the bare module skeleton.
+
 ## Architecture
 
 ```
diff --git a/docs/guide_api_hooks.md b/docs/guide_api_hooks.md
index eae42913..5378f26c 100644
--- a/docs/guide_api_hooks.md
+++ b/docs/guide_api_hooks.md
@@ -76,8 +76,27 @@ The server runs in a daemon thread. It stops when the process exits (or via `ser
 | `GET` | `/api/performance` | Performance metrics (FPS, frame time) |
 | `GET` | `/api/comms` | Communication log |
 | `GET` | `/api/diagnostics` | Diagnostics state |
+| `GET` | `/api/warmup_status` | Warmup progress snapshot: `{pending, completed, failed}` module lists |
+| `GET` | `/api/warmup_wait?timeout=N` | Server-side blocking wait for warmup completion (up to N seconds) |
+| `GET` | `/api/warmup_canaries` | Per-module import timing records (`canary_id`, `module`, `thread`, `elapsed_ms`, `status`) |
+| `GET` | `/api/startup_timeline` | Startup phase breakdown: `init_start_ts`, `warmup_done_ts`, `first_frame_ts`, `warmup_ms`, `first_frame_after_init_ms`, `first_frame_after_warmup_ms` |
 
-(Full endpoint list may grow; check the live server for the canonical list.)
+### Warmup API
+
+The 4 warmup endpoints (added in `startup_speedup_20260606`) let external clients (test harnesses, scripts, the Command Palette's "Restart" actions) answer two questions:
+1. **Is the app ready?** `get_warmup_status()` returns the current `{pending, completed, failed}` module lists, and `get_warmup_wait(timeout=N)` blocks server-side (up to N seconds) until warmup completes.
+2. **Did the warmup block the first frame?** `get_startup_timeline()` returns the 3 phase breakdowns (AppController init, GUI bundle setup, first render) plus the critical gap between warmup completion and first frame paint (`first_frame_after_warmup_ms`).
+
+**Client methods** in `ApiHookClient` (`src/api_hook_client.py:312-348`):
+
+| Method | Endpoint | Purpose |
+|---|---|---|
+| `get_warmup_status()` | `GET /api/warmup_status` | Returns `{pending, completed, failed}` |
+| `get_warmup_wait(timeout=30.0)` | `GET /api/warmup_wait?timeout=N` | Server-side blocking wait for warmup completion, up to `timeout` seconds |
+| `get_warmup_canaries()` | `GET /api/warmup_canaries` | Per-module import timing |
+| `get_startup_timeline()` | `GET /api/startup_timeline` | Phase breakdown dict |
+
+**External script pattern:** A test or script that needs the app fully ready should call `client.get_warmup_wait(timeout=60)` before any other API call. This replaces the old, race-prone `time.sleep(N)` pattern. See `tests/test_api_hooks_warmup.py` for usage examples.
 
 ### Request/Response Format