docs(ai_client+api_hooks): lazy-loading + warmup endpoints (startup_speedup)

guide_ai_client.md:
- Add 'Module-Level Imports' section explaining that the 5 provider SDKs are NOT imported at module level; they're obtained via src.module_loader._require_warmed() after the WarmupManager loads them in the background. (Per startup_speedup_20260606: import src.ai_client went from ~1800ms to ~161ms.)

guide_api_hooks.md:
- Add 4 warmup endpoints to the endpoints table: /api/warmup_status, /api/warmup_wait?timeout=N, /api/warmup_canaries, /api/startup_timeline
- Add 'Warmup API' section with client methods + external script pattern (use get_warmup_wait() instead of time.sleep() race)
@@ -12,6 +12,12 @@ The module is a **stateful singleton** — all provider state is held in module-
---
## Module-Level Imports
> **Important:** The 5 provider SDKs are **NOT** imported at module level. Heavy imports such as `google.genai`, `anthropic`, `openai`, and `fastapi` (~430-955ms each on cold load) are instead obtained via `src.module_loader._require_warmed("google.genai")` and similar calls, after the `WarmupManager` has loaded them in the background. The module-level globals listed in the State section (`_gemini_client`, `_anthropic_client`, etc.) are typed as `Optional` because they are populated via `_require_warmed()` on first use, not at import time.

This change was part of the `startup_speedup_20260606` track (2026-06-06). Before it, `import src.ai_client` took ~1800ms; after it, ~161ms. The remaining cost is the bare module skeleton.
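
The pattern described above can be illustrated with a minimal, self-contained sketch. It is not the actual `src.module_loader` implementation: `WarmupSketch`, its `require_warmed()` method, and `get_json()` are hypothetical names invented for this example, and stdlib modules stand in for the heavy provider SDKs. The idea it shows is that a background thread performs the imports while callers block only at first use.

```python
import importlib
import threading
from types import ModuleType
from typing import Dict, Iterable, Optional


class WarmupSketch:
    """Hypothetical stand-in for WarmupManager: imports modules on a background thread."""

    def __init__(self, names: Iterable[str]) -> None:
        self._names = list(names)
        self._modules: Dict[str, ModuleType] = {}
        self._done = threading.Event()
        self._thread = threading.Thread(target=self._load, daemon=True)

    def start(self) -> None:
        # Returns immediately; the imports run off the main thread.
        self._thread.start()

    def _load(self) -> None:
        try:
            for name in self._names:
                self._modules[name] = importlib.import_module(name)
        finally:
            # Always signal completion so waiters never hang on a failed import.
            self._done.set()

    def wait(self, timeout: Optional[float] = None) -> bool:
        # True once warmup has finished, False if the timeout expired first.
        return self._done.wait(timeout)

    def require_warmed(self, name: str, timeout: float = 30.0) -> ModuleType:
        # Block until warmup finishes, then hand back the already-imported module.
        if not self._done.wait(timeout):
            raise TimeoutError(f"warmup did not finish within {timeout}s")
        try:
            return self._modules[name]
        except KeyError:
            raise LookupError(f"{name!r} was not loaded by warmup") from None


# Stdlib modules are used here as stand-ins for the heavy SDKs.
manager = WarmupSketch(["json", "decimal"])
manager.start()

# Module-level global typed as Optional: empty at import time, filled on first use.
_json_module: Optional[ModuleType] = None


def get_json() -> ModuleType:
    global _json_module
    if _json_module is None:
        _json_module = manager.require_warmed("json")
    return _json_module
```

In this sketch, importing the module that defines `get_json()` costs almost nothing; the import cost is paid on the background thread, and a caller waits only if it needs the module before warmup has completed.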
## Architecture
```