
conductor(track): create startup_speedup_20260606 track for sloppy.py startup latency

Fulfills the existing backlog entry at conductor/tracks.md:152
(2026-06-05 root-cause analysis of live_gui wait_for_server timeouts).

Main Thread Purity Invariant: the main thread (entering immapp.run())
must never import a module heavier than imgui_bundle and the lean
gui_2 skeleton. Enforced by:
  - static gate: scripts/audit_main_thread_imports.py (CI)
  - runtime hook: tests/test_main_thread_purity.py (sys.addaudithook)

Threading constraint: no new threading.Thread(...) calls in src/.
All background work goes through AppController._io_pool
(ThreadPoolExecutor, max_workers=4, thread_name_prefix='controller-io').

9 phases, 57 tasks: audit+baseline, job pool, lazy-load SDKs, lazy-load
FastAPI, lazy-load feature-gated GUI, migrate ad-hoc threads, runtime
enforcement, hook API + diagnostics, verify+checkpoint.

Expected savings: ~2000-2400ms off main-thread import cost.
Target: import src.ai_client < 50ms (from ~1800ms), live_gui fixtures
no longer time out at wait_for_server(timeout=15).
2026-06-06 12:57:20 -04:00
parent 2adf3274af
commit cd4fb04541
5 changed files with 942 additions and 2 deletions
+3 -2
@@ -149,8 +149,9 @@ User review surfaced five outstanding UI issues, each previously attempted witho
## Remaining Backlog (Phases 3 & 4)
0. [ ] **Track: Sloppy.py Startup Speedup**
*Status: 2026-06-05 — Surfaced during regression_fixes_20260605 root-cause analysis. `sloppy.py --enable-test-hooks` startup latency has crept up; live_gui fixtures time out at `wait_for_server(timeout=15)`. Hypothesized cause: too much init work on the main thread (FastAPI hook server bring-up, log pruner retry loops, MCT startup). Plan: profile startup, move heavy init off the main thread to the controller's background thread pool, defer non-critical subsystems to lazy-init on first use. Spec/plan to follow.*
0. [~] **Track: Sloppy.py Startup Speedup**
*Link: [./tracks/startup_speedup_20260606/](./tracks/startup_speedup_20260606/), Spec: [./tracks/startup_speedup_20260606/spec.md](./tracks/startup_speedup_20260606/spec.md), Plan: [./tracks/startup_speedup_20260606/plan.md](./tracks/startup_speedup_20260606/plan.md)*
*Goal: Reduce `sloppy.py` startup time by ~2000-2400ms via (1) lazy-loading AI provider SDKs (`google.genai` 955ms, `anthropic` 430ms, `openai` 445ms) into the function that uses them, (2) lazy-loading `fastapi` in `HookServer` (~470ms), (3) lazy-loading feature-gated GUI modules (`command_palette` 244ms, `theme_nerv*` 485ms, `markdown_table` 250ms), (4) a shared `AppController._io_pool` for background work instead of per-task threads (heavy SDKs stay lazy-only and are never prefetched), (5) `StartupProfiler` + `/api/startup_profile` for measurement. Four-layer architecture: lazy in called function (load-bearing), shared job pool, no prefetch of heavy SDKs, worker-process (future). Target: `import src.ai_client` < 50ms (from ~1800ms), `import src.gui_2` < 500ms (from ~3000ms), `live_gui.wait_for_server(timeout=15)` no longer times out.*
0b. [x] **Track: rag_phase4_stress_test_flake_20260606** — fixed 16412ad5
*Status: 2026-06-06 — Surfaced during post-v2 verification. Resolved: real bug, NOT a test flake. Root cause: ChromaDB collection dimension mismatch across test runs. The persistent on-disk collection (`tests/artifacts/live_gui_workspace/.slop_cache/chroma_test_stress/`) was created by a previous run with Gemini embeddings (3072-dim); the current run uses local SentenceTransformers (384-dim). `index_file()` upserts silently corrupt the collection, then `search()` fails with `Collection expecting embedding with dimension of 3072, got 384` and the AI request never reaches 'done' status, timing out the 50*0.5s = 25s poll loop. Fix: `RAGEngine._init_vector_store` now calls `_validate_collection_dim` which inspects the first existing vector's dim, compares to the current provider's output, and recreates the collection on mismatch (with a stderr warning). Regression tests added: `test_rag_collection_dim_mismatch_recreates_collection` and `test_rag_collection_dim_match_preserves_collection` in `tests/test_rag_engine.py`. This also fixes a real user-facing bug: switching embedding providers in the GUI previously caused silent corruption. Commit 16412ad5.*
@@ -0,0 +1,70 @@
{
"track_id": "startup_speedup_20260606",
"name": "Sloppy.py Startup Speedup",
"initialized": "2026-06-06",
"owner": "tier2-tech-lead",
"priority": "high",
"status": "active",
"type": "refactor + performance",
"scope": {
"new_files": [
"src/startup_profiler.py",
"scripts/audit_main_thread_imports.py",
"scripts/audit_gui2_imports.py",
"tests/test_ai_client_lazy_imports.py",
"tests/test_hook_server_lazy_fastapi.py",
"tests/test_app_controller_io_pool.py",
"tests/test_command_palette_lazy.py",
"tests/test_theme_nerv_lazy.py",
"tests/test_markdown_helper_lazy.py",
"tests/test_main_thread_purity.py",
"tests/test_startup_profiler.py",
"tests/test_io_pool_endpoint.py"
],
"modified_files": [
"src/ai_client.py",
"src/api_hooks.py",
"src/app_controller.py",
"src/commands.py",
"src/command_palette.py",
"src/theme_2.py",
"src/theme_nerv.py",
"src/theme_nerv_fx.py",
"src/markdown_helper.py",
"src/markdown_table.py",
"src/gui_2.py",
"src/log_pruner.py",
"src/project_manager.py"
]
},
"blocked_by": [],
"blocks": [],
"estimated_phases": 9,
"spec": "spec.md",
"plan": "plan.md",
"architectural_invariant": "The main thread (the one that enters immapp.run()) must NEVER import a module heavier than imgui_bundle and the lean gui_2 skeleton. Enforced by scripts/audit_main_thread_imports.py (static CI gate) and tests/test_main_thread_purity.py (runtime audit-hook test).",
"threading_constraint": "NO new threading.Thread(...) calls in src/. All background work must go through AppController._io_pool (ThreadPoolExecutor, max_workers=4, thread_name_prefix='controller-io').",
"verification_criteria": [
"import src.ai_client < 50ms cold start (from ~1800ms)",
"import src.gui_2 < 500ms cold start (from ~3000ms)",
"import src.app_controller < 300ms cold start (from ~700ms)",
"uv run sloppy.py --enable-test-hooks reaches immapp.run() in < 1.5s",
"live_gui.wait_for_server(timeout=15) passes for all tests",
"scripts/audit_main_thread_imports.py exits 0 (no main-thread heavy imports)",
"tests/test_main_thread_purity.py passes (runtime audit hook confirms invariant)",
"No regressions in 273+ existing tests",
"ZERO new threading.Thread(...) calls in src/ (after Phase 6 migration)",
"Startup profile + io_pool status visible via /api/startup_profile and /api/io_pool_status"
],
"links": {
"backlog_entry": "conductor/tracks.md:152",
"benchmark_script": "scripts/benchmark_imports.py",
"audit_script": "scripts/audit_main_thread_imports.py",
"related_docs": [
"docs/guide_architecture.md",
"docs/guide_app_controller.md",
"docs/guide_hot_reload.md",
"docs/guide_testing.md"
]
}
}
@@ -0,0 +1,232 @@
# Plan: Sloppy.py Startup Speedup
**Track:** `startup_speedup_20260606`
**Spec:** [./spec.md](./spec.md)
**Status:** In progress
**Started:** 2026-06-06
---
## Phase 1: Audit + Benchmark + Foundation
- [ ] **T1.1** Capture baseline with `scripts/benchmark_imports.py --runs=3 --color=never > docs/startup_baseline_20260606.txt`
- [ ] **T1.2** Write `scripts/audit_gui2_imports.py` (AST walker): for each `import X` in `src/gui_2.py`, classify as `first-frame` (reachable from `main()` / `render_main_window` etc.) vs `feature-gated` (inside an `if/elif` branch that requires user action). Commit audit results to `docs/startup_audit_20260606.md`.
- [ ] **T1.3** Add `src/startup_profiler.py` with `StartupProfiler` class (context manager `phase(name)`). Wire into `AppController.__init__` and `App.__init__` at 8 major init points. (No new test; verify via manual run + diagnostics panel.) `[T1.3]`
- [ ] **T1.4** Write `scripts/audit_main_thread_imports.py` (static gate, fails CI). AST-walks the import graph reachable from `sloppy.py`, collects all top-level `import X` / `from X import Y`, compares against an allowlist. Exits non-zero with file:line:module on violation. Allowlist: `sys.stdlib_module_names` + the lean gui_2 skeleton list from `spec.md:2.1` (`imgui_bundle`, `defer`, `src.imgui_scopes`, `src.theme_2` (default theme only), `src.theme_models`, `src.paths`, `src.models`, `src.events`).
- [ ] **T1.5** Commit baseline + audit script: `git add . && git commit -m "conductor(startup): baseline measurements + main thread import audit script"` + git note
**Phase 1 checkpoint:** Baseline established. Static gate exists. All three import classes (first-frame, feature-gated, background-safe) documented.
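
A minimal sketch of what the `StartupProfiler` from T1.3 could look like, assuming it only needs a `phase(name)` context manager and a `snapshot()` method. The record fields (`phase`, `ms`, `pct`) and the phase names in the usage lines are assumptions for illustration, not taken from the spec.

```python
import time
from contextlib import contextmanager


class StartupProfiler:
    """Hypothetical sketch: records wall-clock duration of named startup phases."""

    def __init__(self):
        self._phases = []  # (name, seconds) in completion order

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record even if the phase raised, so failed phases still show up.
            self._phases.append((name, time.perf_counter() - start))

    def snapshot(self):
        total = sum(seconds for _, seconds in self._phases)
        return [
            {
                "phase": name,
                "ms": round(seconds * 1000, 3),
                "pct": round(100 * seconds / total, 1) if total else 0.0,
            }
            for name, seconds in self._phases
        ]


# Usage: wrap each init point; the sleeps stand in for real work.
profiler = StartupProfiler()
with profiler.phase("paths"):
    time.sleep(0.01)
with profiler.phase("presets"):
    time.sleep(0.02)
report = profiler.snapshot()
```

Recording in `finally` keeps a phase visible in the report even when its body raises, which is usually the phase worth looking at.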
---
## Phase 2: Job Pool Foundation (the "no new threads" rule)
The user constraint: no new `threading.Thread(...)` per task, per import, per
ad-hoc job. The codebase gets ONE shared `ThreadPoolExecutor` on `AppController`,
named `_io_pool`, used by any subsystem that needs background work.
- [ ] **T2.1 (Red)** `tests/test_app_controller_io_pool.py`:
- `test_app_controller_has_io_pool`: instantiate `AppController`, assert `hasattr(controller, '_io_pool')` and it's a `ThreadPoolExecutor`
- `test_io_pool_uses_named_threads`: submit a job, assert the executing thread name starts with `controller-io`
- `test_io_pool_size_is_4`: assert `_io_pool._max_workers == 4`
- `test_io_pool_shuts_down_on_close`: call `controller.shutdown()`, assert the pool is shut down
- Confirm FAIL (no `_io_pool` yet)
- [ ] **T2.2 (Green)** In `src/app_controller.py`:
- Add `from concurrent.futures import ThreadPoolExecutor` at top
- In `__init__`, after the asyncio loop starts and BEFORE the existing HookServer block: `self._io_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="controller-io")`
- In `shutdown()`: `App.shutdown` already exists for the GUI; make sure `AppController` has a matching `shutdown()` that calls `self._io_pool.shutdown(wait=False)`
- Add `controller.submit_io(fn, *args)` helper: `return self._io_pool.submit(fn, *args)` (with a docstring saying "use this instead of `threading.Thread` for new background work")
- [ ] **T2.3** Run T2.1 tests; confirm PASS
- [ ] **T2.4** Commit: `feat(app_controller): add shared _io_pool ThreadPoolExecutor` + git note
**Phase 2 checkpoint:** `AppController` owns a 4-thread named pool. `controller.submit_io(fn)` is the sanctioned way to do background work. Existing ad-hoc threads still exist (will be migrated in Phase 6).
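
The T2.2 wiring can be sketched as follows. `ControllerSketch` is a hypothetical stand-in for `AppController`; only the pool construction, `submit_io`, and `shutdown` follow the plan text, everything else is omitted.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ControllerSketch:
    """Stand-in for AppController; only the pool wiring is shown."""

    def __init__(self):
        self._io_pool = ThreadPoolExecutor(
            max_workers=4,
            thread_name_prefix="controller-io",
        )

    def submit_io(self, fn, *args):
        """Use this instead of threading.Thread for new background work."""
        return self._io_pool.submit(fn, *args)

    def shutdown(self):
        # wait=False returns immediately; jobs already queued still run.
        self._io_pool.shutdown(wait=False)


controller = ControllerSketch()
future = controller.submit_io(lambda: threading.current_thread().name)
worker_name = future.result(timeout=5)
controller.shutdown()
```

After `shutdown()`, any further `submit_io` call raises `RuntimeError`, which is one way `test_io_pool_shuts_down_on_close` could detect that the pool is closed.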
---
## Phase 3: Lazy-load AI Provider SDKs (TDD)
- [ ] **T3.1 (Red)** Write `tests/test_ai_client_lazy_imports.py`:
- `test_ai_client_does_not_import_genai_at_module_level`: spawn fresh subprocess, `import src.ai_client`, assert `'google.genai' not in sys.modules` (or `google.genai` in modules but `_gemini_client` is `None`)
- `test_ai_client_does_not_import_anthropic_at_module_level`
- `test_ai_client_does_not_import_openai_at_module_level`
- `test_ai_client_does_not_import_requests_at_module_level`
- Confirm tests FAIL (proves the imports are currently eager)
- [ ] **T3.2 (Green)** In `src/ai_client.py`:
- Remove `from google import genai` from top
- Remove `import anthropic` from top
- Remove `import openai` from top
- Remove `import requests` from top
- Add lazy imports inside `_send_gemini`, `_send_anthropic`, `_send_deepseek`, `_send_minimax`
- Provider client globals stay as `None` until first `_ensure_<provider>_client()` call
- [ ] **T3.3** Run existing `tests/test_ai_client.py`; fix any breakage. Most likely issue: tests that rely on top-level import side effects need a fixture that triggers lazy init.
- [ ] **T3.4** Re-run T3.1 tests, confirm PASS
- [ ] **T3.5** Commit: `git commit -m "refactor(ai_client): lazy-load provider SDKs to defer ~1800ms off main thread"` + git note
- [ ] **T3.6** Update `conductor/tracks.md` T3 row with SHA
**Phase 3 checkpoint:** `import src.ai_client` < 50ms cold. All 273 existing tests still pass.
---
## Phase 4: Lazy-load FastAPI in HookServer (TDD)
- [ ] **T4.1 (Red)** Write `tests/test_hook_server_lazy_fastapi.py`:
- `test_hook_server_does_not_import_fastapi_at_module_level`: subprocess test
- `test_hook_server_does_not_import_fastapi_security_at_module_level`
- Confirm FAIL
- [ ] **T4.2 (Green)** In `src/api_hooks.py`:
- Remove `from fastapi import ...` from top
- Remove `from fastapi.security.api_key import APIKeyHeader` from top
- Add lazy imports inside the methods that need them (FastAPI app construction, route registration)
- [ ] **T4.3** Run existing `tests/test_api_hooks.py`; fix breakage
- [ ] **T4.4** Confirm T4.1 tests PASS
- [ ] **T4.5** Commit: `git commit -m "refactor(api_hooks): lazy-load fastapi to defer ~470ms off main thread"` + git note
**Phase 4 checkpoint:** `from src.api_hooks import HookServer` does not import fastapi.
---
## Phase 5: Lazy-load Feature-gated GUI Modules (TDD per module)
### 5A: Command Palette
- [ ] **T5A.1 (Red)** `tests/test_command_palette_lazy.py`: assert that `from src.commands import COMMANDS` (or whichever import is eager today) does not import `src.command_palette`. Confirm FAIL.
- [ ] **T5A.2 (Green)** In `src/commands.py`: move `from src.command_palette import ...` inside the command functions that open the palette (`_open_command_palette`, `_toggle_command_palette`).
- [ ] **T5A.3** Run `tests/test_command_palette.py`; fix.
- [ ] **T5A.4** Commit: `refactor(commands): lazy-load command_palette to defer 244ms`
### 5B: NERV Theme
- [ ] **T5B.1 (Red)** `tests/test_theme_nerv_lazy.py`: assert that `from src.theme_2 import *` (or the equivalent eager import) does not import `src.theme_nerv` or `src.theme_nerv_fx`. Confirm FAIL.
- [ ] **T5B.2 (Green)** In `src/theme_2.py`: move `from src.theme_nerv import ...` and `from src.theme_nerv_fx import ...` inside `apply_nerv_theme()` (or whichever function activates the theme).
- [ ] **T5B.3** Run `tests/test_theme_2.py` and `tests/test_theme_nerv.py`; fix.
- [ ] **T5B.4** Commit: `refactor(theme): lazy-load nerv theme to defer 485ms off non-nerv path`
### 5C: Markdown Table
- [ ] **T5C.1 (Red)** `tests/test_markdown_helper_lazy.py`: `from src.markdown_helper import MarkdownRenderer` does not import `src.markdown_table`. Confirm FAIL.
- [ ] **T5C.2 (Green)** In `src/markdown_helper.py`: move `from src.markdown_table import ...` inside the table-detection branch of `render()`.
- [ ] **T5C.3** Run `tests/test_markdown_helper.py`; fix.
- [ ] **T5C.4** Commit: `refactor(markdown): lazy-load markdown_table to defer 250ms off non-table markdown`
### 5D: GUI module feature-gated imports
- [ ] **T5D.1** Run `scripts/audit_gui2_imports.py` (built in T1.2); collect list of feature-gated imports in `src/gui_2.py`
- [ ] **T5D.2** For each feature-gated import, apply the same TDD pattern (5A-5C). Group into 1-2 atomic commits per logical feature.
- [ ] **T5D.3** Run full GUI test suite; fix.
- [ ] **T5D.4** Commit per feature group
**Phase 5 checkpoint:** Feature-gated imports are lazy. Default-theme / non-palette / non-table path is lean.
---
## Phase 6: Migrate Ad-hoc Threads to `_io_pool`
The codebase has several ad-hoc `threading.Thread(...)` calls. Per the user
constraint, these should migrate to `controller.submit_io(fn)`. **This phase
audits and migrates them, but does NOT add new prefetch threads** (the heavy
SDKs are lazy-only per spec §2.2 Layer 3).
- [ ] **T6.1** Audit: `grep -rn "threading.Thread(" src/` to find all ad-hoc thread spawns. Document each in `state.toml` (a new `[ad_hoc_threads]` section).
- [ ] **T6.2** For each ad-hoc thread in `src/log_pruner.py`, `src/project_manager.py`, etc., refactor to use `controller.submit_io(fn)` instead. Wrap the callable body in a try/except that preserves the existing error logging: the pool stores an exception on the returned `Future`, where it stays silent unless something calls `.result()` or `.exception()`.
- [ ] **T6.3** Run full test suite; fix.
- [ ] **T6.4** Per-migration commit (or grouped by subsystem if 3+ threads in one file). Final commit: `refactor: migrate ad-hoc threads to AppController._io_pool` + git note.
**Phase 6 checkpoint:** `grep -rn "threading.Thread(" src/` shows ZERO new spawns after this phase (existing project scaffolding threads like `HookServer` and `MMA WorkerPool` are exempt — they're domain-specific).
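
A sketch of one such migration, assuming a fire-and-forget job whose failures must still reach the log. `logged` and `prune_logs` are hypothetical names; the real job bodies live in `src/log_pruner.py` and `src/project_manager.py` and are not reproduced here.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger("io_pool_sketch")


def logged(fn):
    """Wrap fn so a failure is logged inside the worker, then re-raised."""
    def wrapper(*args):
        try:
            return fn(*args)
        except Exception:
            log.exception("background job %s failed", getattr(fn, "__name__", fn))
            raise  # keep the exception on the Future as well
    return wrapper


def prune_logs(max_files):
    # Hypothetical job body for illustration only.
    if max_files < 0:
        raise ValueError("max_files must be >= 0")
    return f"kept {max_files} files"


pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="controller-io")

# BEFORE: threading.Thread(target=prune_logs, args=(10,), daemon=True).start()
# AFTER: same callable, shared pool, failure still logged.
ok = pool.submit(logged(prune_logs), 10)
bad = pool.submit(logged(prune_logs), -1)

ok_result = ok.result(timeout=5)
bad_error = bad.exception(timeout=5)
pool.shutdown(wait=True)
```

Re-raising after logging keeps both behaviours: callers that hold the `Future` can still inspect the error, and callers that drop it still get a log line.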
---
## Phase 7: Enforcement (Runtime Audit Hook)
The static gate (T1.4) catches known imports at audit time. This phase adds
empirical enforcement: a test that spawns `sloppy.py` and verifies NO heavy
import happens on the main thread at runtime.
- [ ] **T7.1 (Red)** `tests/test_main_thread_purity.py`:
- `test_headless_startup_no_heavy_imports_on_main`: spawn `uv run python sloppy.py --headless --enable-test-hooks` with a `sitecustomize.py` shim that installs `sys.addaudithook` to log every `import` event with the calling thread. The hook writes to a temp file as JSON-L.
- Wait for headless server ready (5s timeout via `ApiHookClient`).
- Read the audit log. Assert: no event with `thread_name == "MainThread"` for any module in the heavy denylist (`google.genai`, `anthropic`, `openai`, `fastapi`, `requests`, `numpy`, `tkinter`, `psutil`, `pydantic`, `tree_sitter_*`, `src.command_palette`, `src.theme_nerv`, `src.theme_nerv_fx`, `src.markdown_table`, `src.ai_client.send_*`-direct).
- Kill subprocess. Confirm FAIL (current state imports these on main).
- [ ] **T7.2** Once Phase 3-5 land and the static gate passes, this test should start passing. If it doesn't, debug and add more lazy imports.
- [ ] **T7.3** Wire `test_main_thread_purity.py` into CI as a gating test (it'll be slow, ~10s, so mark with `@pytest.mark.slow` and only run in batched CI).
- [ ] **T7.4** Commit: `test: empirical main-thread purity check via sys.audit hook` + git note
**Phase 7 checkpoint:** CI fails if a future commit re-introduces a heavy main-thread import.
---
## Phase 8: Hook API + Diagnostics
- [ ] **T8.1** Add `/api/startup_profile` endpoint in `src/api_hooks.py` returning `controller.startup_profiler.snapshot()`
- [ ] **T8.2** Register `startup_profile` in `_gettable_fields`
- [ ] **T8.3** Add a "Startup Profile" section to the Diagnostics panel (`src/gui_2.py` `_render_diagnostics` or similar). Show: phase name, duration, % of total.
- [ ] **T8.4** Add `/api/io_pool_status` endpoint returning `{max_workers, active_threads, queued, completed}` so the user can see the job pool is alive.
- [ ] **T8.5** Update `docs/guide_api_hooks.md` with both new endpoints.
- [ ] **T8.6** Tests: extend `tests/test_api_hooks.py` + new `tests/test_startup_profiler.py` + new `tests/test_io_pool_endpoint.py`.
- [ ] **T8.7** Commit: `feat(diagnostics): expose startup profile and io_pool status via Hook API` + git note
**Phase 8 checkpoint:** User can see per-phase startup cost + job-pool liveness in the GUI.
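
`ThreadPoolExecutor` has no public API for queue depth or completed-job counts, so the `/api/io_pool_status` payload from T8.4 needs its own bookkeeping. One possible shape, assuming a thin wrapper around the pool; `TrackedIoPool` is a hypothetical name, and `active_threads` here counts jobs currently executing rather than live OS threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TrackedIoPool:
    """Hypothetical wrapper that counts jobs so a status endpoint can report them."""

    def __init__(self, max_workers=4):
        self._max_workers = max_workers
        self._pool = ThreadPoolExecutor(
            max_workers=max_workers, thread_name_prefix="controller-io"
        )
        self._lock = threading.Lock()
        self._submitted = 0
        self._running = 0
        self._completed = 0

    def submit(self, fn, *args):
        def tracked():
            with self._lock:
                self._running += 1
            try:
                return fn(*args)
            finally:
                # Runs before the Future is resolved, so the counters are
                # already updated when a caller's .result() returns.
                with self._lock:
                    self._running -= 1
                    self._completed += 1

        with self._lock:
            self._submitted += 1
        return self._pool.submit(tracked)

    def status(self):
        with self._lock:
            return {
                "max_workers": self._max_workers,
                "active_threads": self._running,  # jobs currently executing
                "queued": self._submitted - self._running - self._completed,
                "completed": self._completed,
            }

    def shutdown(self):
        self._pool.shutdown(wait=False)


tracked_pool = TrackedIoPool()
futures = [tracked_pool.submit(pow, 2, n) for n in range(5)]
results = [f.result(timeout=5) for f in futures]
status = tracked_pool.status()
tracked_pool.shutdown()
```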
---
## Phase 9: Verify + Phase Checkpoint
- [ ] **T9.1** Re-run `scripts/benchmark_imports.py --runs=3`. Save to `docs/startup_after_20260606.txt`. Diff against T1.1 baseline; confirm:
- `import src.ai_client` < 50ms
- `import src.gui_2` < 500ms
- `import src.app_controller` < 300ms (includes `_io_pool` creation; should still be < 300ms)
- [ ] **T9.2** Re-run `scripts/audit_main_thread_imports.py` (T1.4). Confirm exit 0. No violations.
- [ ] **T9.3** Run `live_gui` test batch (per `conductor/workflow.md:147-150`: max 4 test files per batch, long timeout):
- `uv run pytest tests/test_live_gui_*.py --timeout=60 -v` in batches
- Confirm `wait_for_server(timeout=15)` does not time out
- [ ] **T9.4** Manual smoke:
- `uv run sloppy.py` (normal mode): time-to-first-frame
- `uv run sloppy.py --enable-test-hooks` (test mode): time-to-first-frame
- `uv run sloppy.py --headless` (headless): time-to-server-ready
- [ ] **T9.5** Phase checkpoint commit: `conductor(checkpoint): Phase 9 complete - sloppy.py startup speedup track` + git note with full verification report
- [ ] **T9.6** Update `conductor/tracks.md`: mark track complete, link to archived folder
**Phase 9 checkpoint:** All verification criteria in `spec.md:6` met.
---
## Definition of Done
- [ ] All Phase 1-9 tasks checked
- [ ] All tests pass (273+ existing + new TDD tests including `test_main_thread_purity`)
- [ ] `uv run ruff check .` and `uv run mypy --explicit-package-bases .` clean (per `mma-tier2-tech-lead` skill)
- [ ] `uv run python scripts/audit_main_thread_imports.py` exits 0
- [ ] `docs/startup_baseline_20260606.txt` and `docs/startup_after_20260606.txt` archived
- [ ] Phase 9 git note contains: baseline diff, audit script result, runtime audit hook result, full test batch results, manual smoke timings, file inventory
- [ ] Track moved to `conductor/tracks/archive/`
- [ ] **NO new `threading.Thread(...)` calls in `src/`** (verified by `grep -rn "threading.Thread(" src/`)
---
## Notes for Tier 3 Workers
- **Always use 1-space indentation for Python code.** Confirm via `uv run python -c "import ast; ..."` AST check if you do any class-body reorganization (the "Indentation-Driven Class Method Visibility" pitfall in `conductor/workflow.md`).
- **Test fixtures**: `isolate_workspace`, `reset_paths`, `reset_ai_client`, `vlogger`, `kill_process_tree`, `mock_app`, `live_gui` — see `docs/guide_testing.md`.
- **Subprocess tests for module-level imports**: spawn `uv run python -c "..."` and inspect `sys.modules` after the import. Pattern:
```python
import subprocess
import sys

result = subprocess.run(
 [sys.executable, "-c", "import sys; import src.ai_client; import json; print(json.dumps(sorted(sys.modules.keys())))"],
 capture_output=True, text=True, check=True,  # check=True: a crashed import must not pass the assert vacuously
)
assert 'google.genai' not in result.stdout
```
- **For new background work**: use `controller.submit_io(fn, *args)`, NOT `threading.Thread(target=fn).start()`. The user constraint is "no new threads."
- **Atomic commits per task.** No batching. If a task touches 3 files, commit all 3 in one commit but the commit message describes the task.
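- **`ThreadPoolExecutor` workers are non-daemon threads in Python 3.9+; they were daemon threads in 3.8 and earlier.** In both cases the interpreter joins them at exit, so a long-running job can delay process exit. Check `pyproject.toml` for `requires-python`. Either way, the pool is shut down on `AppController.shutdown()`.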
---
## Cross-References
- Spec: [./spec.md](./spec.md)
- Original backlog entry: `conductor/tracks.md:152`
- Benchmark tool: `scripts/benchmark_imports.py`
- Lazy pattern templates: `src/app_controller.py:241-271` (RAG + MMA)
- Threading constraints: `docs/guide_architecture.md:43-67`
- Architectural Invariant: `spec.md:2.1`
- Job pool spec: `spec.md:2.2 Layer 2`
- Hot reload constraints: `docs/guide_hot_reload.md:295-312`
@@ -0,0 +1,527 @@
# Track: Sloppy.py Startup Speedup
**Status:** Active
**Initialized:** 2026-06-06
**Owner:** Tier 2 Tech Lead
**Priority:** High (regression blocker — `live_gui` fixtures time out at `wait_for_server(timeout=15)`)
---
## 1. Problem Statement
`uv run sloppy.py --enable-test-hooks` startup latency has crept up. `live_gui` tests
time out at `wait_for_server(timeout=15)`. The root cause is **too much work on the main
thread before it enters `immapp.run()` and the GUI becomes interactive**:
- 5 AI provider SDKs (`google.genai`, `anthropic`, `openai`, `requests`, ...) eagerly
imported at `src/ai_client.py` module top-level, even though only one is the active
provider at runtime
- `imgui_bundle` transitively pulls `numpy` and 9 other heavy modules at the top of
`src/gui_2.py` and 9 sibling files
- NERV theme, command palette, markdown table extensions are loaded eagerly even
though they are feature-gated
- `AppController.__init__` does all subsystem construction synchronously on the
thread that will become the main GUI thread (path manager, presets, personas,
context presets, tool presets, history, workspace, RAG, hook server)
The architecture is already correct: AI calls go through the asyncio worker thread,
so the *call* is non-blocking. The *imports* are still synchronous on the main
thread, and that is what the user sees as "sloppy.py is slow to open."
### 1.1 Measurement Baseline (from `scripts/benchmark_imports.py`)
Cold-start subprocess timings, median of 3 runs, 85 unique import paths:
| module | time | files | classification |
|---|---:|---:|---|
| google.genai | ~955ms | 1 | **defer (provider SDK, default)** |
| openai | ~445ms | 1 | defer (provider SDK) |
| anthropic | ~430ms | 1 | defer (provider SDK) |
| src.markdown_table | ~250ms | 1 | defer (feature-gated) |
| src.theme_nerv | ~245ms | 1 | defer (feature-gated) |
| imgui_bundle | ~245ms | 10 | **KEEP (ImGui hot path)** |
| src.command_palette | ~244ms | 1 | defer (feature-gated) |
| src.theme_nerv_fx | ~240ms | 1 | defer (feature-gated) |
| fastapi (+ security.api_key) | ~470ms combined | 1 | defer (only `--enable-test-hooks` or web mode) |
| requests | ~92ms | 3 | defer (deepseek/minimax only) |
| numpy | ~65ms | 2 | keep (bg_shader; optional in gui_2) |
| pydantic | ~70ms | 1 | keep (models.py is loaded by everyone) |
| tree_sitter_* | ~25ms each | 1 | keep (file_cache) |
**Estimated main-thread import cost today (worst case, all paths):**
~2500-3000ms (1.0s SDKs + 1.0s web/fastapi + 0.5s GUI extras + ~0.5s transitives).
**Estimated main-thread import cost after this track:**
~500-600ms (`imgui_bundle` + lean `gui_2` + `pydantic` models). Net savings
~2000-2400ms.
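
For reference, a cold-start import timing of this kind can be taken with a fresh interpreter per sample. This is a minimal sketch under that assumption; it is not the contents of `scripts/benchmark_imports.py`, and `cold_import_ms` is a hypothetical helper name.

```python
import statistics
import subprocess
import sys

# Child program: time a single import with perf_counter and print milliseconds.
_CHILD = (
    "import sys, time, importlib;"
    "t0 = time.perf_counter();"
    "importlib.import_module(sys.argv[1]);"
    "print((time.perf_counter() - t0) * 1000)"
)


def cold_import_ms(module, runs=3):
    """Median import time of `module` across fresh interpreter processes."""
    samples = []
    for _ in range(runs):
        out = subprocess.run(
            [sys.executable, "-c", _CHILD, module],
            capture_output=True, text=True, check=True,
        )
        # Take the last line in case the imported module prints on import.
        samples.append(float(out.stdout.strip().splitlines()[-1]))
    return statistics.median(samples)


json_ms = cold_import_ms("json")
```

Each sample includes the module's transitive dependencies, so figures for modules that share a heavy dependency overlap and should not be summed naively.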
---
## 2. Approach
The architecture is already correct. The fix is **systematic application of the
lazy-load + shared-job-pool patterns** the codebase already uses for `RAGEngine`
(`get_rag_engine` in `src/app_controller.py:244-249`) and `MultiAgentConductor`
(`get_mma_conductor` in `src/app_controller.py:266-271`).
### 2.1 Architectural Invariant: Main Thread Purity
> **The main thread (the one that enters `immapp.run()`) must NEVER import a
> module heavier than `imgui_bundle` and the lean `gui_2` skeleton. Every heavy
> import is loaded by the asyncio worker thread, the AppController's shared
> job pool, or the MMA WorkerPool. This invariant is enforced by an audit
> script (CI gate) and a runtime audit-hook test that fails if a heavy import
> is observed on the main thread at startup.**
Concretely, the main thread's import chain is allowed to contain:
- All `import X` statements transitively reachable from `src/gui_2.py` whose
accumulated import time is < 50ms
- The modules: `imgui_bundle`, `defer`, `src.imgui_scopes`, `src.theme_2`
(default theme only), `src.theme_models`, `src.paths`, `src.models`,
`src.events`
- Anything in `sys.stdlib_module_names`
Everything else — provider SDKs, FastAPI, NERV theme, command palette, markdown
table extensions, the full `src.ai_client` provider list, `numpy`/`psutil`/
`tree_sitter_*` if used by lazy code paths — must be loaded by a background
mechanism that does not run on the main thread.
### 2.2 Four layers of protection
#### Layer 1 — Pure lazy loading (the load-bearing wall, non-negotiable)
Move heavy imports from module top-level into the function body that needs them:
```python
# BEFORE (src/ai_client.py, current)
from google import genai
import anthropic
import openai
# ... 5 provider SDKs loaded unconditionally

# AFTER
def _send_gemini(md_content, user_message, ...):
 from google import genai # 955ms, paid once, on the first call's thread
 ...

def _send_anthropic(...):
 import anthropic
 ...
```
**Main-thread cost: zero.** The first call still pays the import latency, but it happens on
the asyncio worker thread (per `guide_architecture.md:215-234`), so the main thread never
blocks on it; the only residual effect is the brief GIL contention described in §2.3.
#### Layer 2 — Shared job pool on AppController (no new threads per task)
The codebase already has these dedicated / shared threads:
- `AppController._loop_thread` — asyncio worker (**DEDICATED** to the AI event
loop, do not use for arbitrary work)
- `WorkerPool` (in `src/multi_agent_conductor.py`) — 4-thread pool for MMA
workers (**DEDICATED** to MMA, do not pollute with imports or I/O)
- `HookServer` thread — **DEDICATED** to the FastAPI server
- Ad-hoc `threading.Thread` calls — used for one-off tasks; the user wants to
**MINIMIZE** these
**User constraint:** no new daemon threads per import prefetch, per I/O task, per
log-prune. We add ONE shared `ThreadPoolExecutor` to `AppController` named
`_io_pool`, and any subsystem that needs background work submits jobs to it.
This includes:
- Initial RAG index warm-up (if applicable)
- Log pruning (currently a one-shot thread — refactor to use the pool)
- Disk-bound subsystem initialization (e.g., TOML re-read on persona switch)
- Any other ad-hoc I/O
```python
# In AppController.__init__
from concurrent.futures import ThreadPoolExecutor
self._io_pool = ThreadPoolExecutor(
 max_workers=4,
 thread_name_prefix="controller-io",
)
```
**Threads created by this track: at most 4** (the pool; `ThreadPoolExecutor` starts workers on demand, up to `max_workers`). Not 4+1 per job, not 1 per
import, not 1 per subsystem. Just up to 4 long-lived threads that all background work
shares. Future work that needs a bg thread should call `controller.submit_io(fn)`.
#### Layer 3 — NO prefetch of the heaviest SDKs (deliberate)
The original Phase 5 of this plan proposed an `import-prefetch` daemon thread that
warms `google.genai` (~955ms) on a background thread. **This has been explicitly
rejected** for the heavy SDKs, and the reasoning is sound:
- A 955ms import on a background thread holds the GIL for ~10-50ms at a time
during C extension init. Each hold stalls the main thread's render loop.
- The user pays 955ms total either way: prefetch = 955ms of background stutter
+ instant first call; lazy-only = 955ms of stutter on the first call only,
with the GUI fully interactive in between.
- Prefetching wastes the import cost when the user never uses that provider
(e.g., default is Gemini but the user actually only uses Anthropic).
**Rule: heavy SDKs (`google.genai`, `anthropic`, `openai`, `fastapi`) are
lazy-only, never prefetched.** Lighter modules (themes, command palette,
markdown table) MAY be optionally warmed on the `_io_pool` if profiling shows
they're commonly used, but it's not a hard requirement and the default is
"don't warm."
#### Layer 4 — Worker-process isolation (future, out of scope)
The codebase already runs `gemini_cli` and external MCP servers as subprocesses
for this exact reason. A future track could move `google.genai` / `anthropic` into
their own worker processes, communicating via the existing `SyncEventQueue`. This
track does NOT do this — Layer 1+2+3 is sufficient for the current problem.
### 2.3 Threading constraints (verified empirically)
The user's question: *"if I import in the app controller's thread, will it block
the GUI's thread?"* The answer is:
| Scenario | Blocks GUI? |
|---|---|
| Module top-level import of heavy X, then main imports X | **YES** (X's import is in main's chain) |
| Lazy import of X inside a function called from the asyncio thread | **NO** (asyncio thread blocks, not main) |
| Lazy import of X inside a function called from the main thread | **YES** (first call only; the function caller blocks) |
| `_io_pool` worker importing X while main thread renders | **NO direct block, but GIL contention causes micro-stutters** (~5-50ms each). Acceptable because the pool is capped at 4 threads. |
| `_io_pool` worker imports X; main thread later imports X (same module) | **YES** (main blocks on per-module import lock until worker finishes). This is why Layer 1 must come first. |
| Spawning a new `threading.Thread` for each import prefetch | **Wasteful** (thread creation ~1-5ms each; thread count explodes). Use the `_io_pool` instead. |
This means: **Layer 1 is non-negotiable.** Even with the `_io_pool`, if the
heavy import is also in the main thread's import chain, the main thread will
block on the import lock the moment it tries to use the module. Layer 1 removes
the heavy imports from the main thread's chain; Layer 2 reuses threads
efficiently; Layer 3 deliberately avoids prefetching the heaviest.
### 2.4 Enforcement: the "main thread purity" audit
Two enforcement mechanisms, both required:
#### Static: `scripts/audit_main_thread_imports.py` (CI gate)
1. AST-walk the import graph reachable from `sloppy.py` (the main entry).
For each `.py` file in the graph, collect top-level `import X` and
`from X import Y` statements.
2. Compare against an allowlist of "main-thread-safe" modules (stdlib +
`imgui_bundle` + the lean gui_2 skeleton list from §2.1). Any
non-allowlist import is a violation.
3. Exit non-zero with a clear message naming the file, line, and heavy module.
4. Run as part of CI (`uv run python scripts/audit_main_thread_imports.py`)
and as a pre-commit hook.
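Steps 1 to 3 could be sketched roughly as below. This is an assumption-laden outline, not the contents of `scripts/audit_main_thread_imports.py`: the function names and the sample allowlist are hypothetical, it inspects one source string instead of walking the whole graph, and it only looks at statements directly in the module body (imports nested in `if` or `try` blocks are ignored).

```python
import ast


def top_level_imports(source: str) -> list[tuple[int, str]]:
    """Return (line, root_module) for each import directly in the module body."""
    found = []
    for node in ast.parse(source).body:  # module body only, not function bodies
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.append((node.lineno, alias.name.split(".")[0]))
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as `from . import x`
            found.append((node.lineno, node.module.split(".")[0]))
    return found


def find_violations(source: str, allowlist: set[str]) -> list[tuple[int, str]]:
    """Return the top-level imports whose root module is not allowlisted."""
    return [(line, mod) for line, mod in top_level_imports(source) if mod not in allowlist]
```

A CI wrapper would print each violation with its file name and exit non-zero when the list is non-empty.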
#### Runtime: `tests/test_main_thread_purity.py` (TDD, empirical)
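1. Spawn `uv run python sloppy.py --headless --enable-test-hooks` as a
subprocess with a `sys.addaudithook` callback installed inside the child
process (a hook only observes the interpreter it runs in). The callback logs
every `import` event together with the calling thread.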
2. Wait for the headless server to be ready (or 5s timeout).
3. Read the audit log. Assert: every `import` event with
`threading.current_thread() is threading.main_thread()` was for a module in
the allowlist.
4. Kill the subprocess.
This is the empirical enforcement: it proves the invariant holds at runtime,
not just at static analysis time.
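The hook side of this test might look like the sketch below. It relies on the documented `import` audit event, whose first argument is the module name; the names `import_log`, `_record_import`, and `main_thread_violations` are hypothetical, and how the hook gets installed inside the child process (wrapper script, `sitecustomize`, or similar) is left open here.

```python
import sys
import threading

# (module_name, imported_on_main_thread) pairs, appended by the audit hook.
import_log: list[tuple[str, bool]] = []


def _record_import(event: str, args: tuple) -> None:
    # Audit hooks see every audit event, so filter cheaply and return fast.
    if event == "import":
        on_main = threading.current_thread() is threading.main_thread()
        import_log.append((args[0], on_main))


# Note: an audit hook cannot be removed once added; install it only in the
# process under test.
sys.addaudithook(_record_import)


def main_thread_violations(log: list[tuple[str, bool]], allowlist: set[str]) -> list[str]:
    """Return sorted module names imported on the main thread but not allowlisted."""
    return sorted(
        {name for name, on_main in log if on_main and name.split(".")[0] not in allowlist}
    )
```

The test would then assert that `main_thread_violations(import_log, allowlist)` is empty once the server reports ready.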
---
## 3. Architectural Changes
### 3.1 Per-file import plan
#### `src/ai_client.py` (the biggest win: ~1800ms)
Top-level today: `from google import genai`, `import anthropic`, `import openai`,
`import requests` (used by deepseek/minimax).
After:
- Drop `from google import genai` from top — lazy in `_send_gemini()`
- Drop `import anthropic` from top — lazy in `_send_anthropic()`
- Drop `import openai` from top — lazy in `_send_deepseek()` and `_send_minimax()`
- Drop `import requests` from top — lazy in those two providers' HTTP code
- Provider client objects (`_gemini_client`, `_anthropic_client`, etc.) stay as
module globals but are now `None` until first use
- The `_send_*` functions check their provider client is initialized and call a
new `_ensure_<provider>_client()` lazy initializer (extracted from the current
top-level logic)
**Result:** ~1800ms off the main thread. First AI call still pays it, but on
the asyncio worker.
#### `src/app_controller.py` (FastAPI in headless/web only)
Top-level today: `from fastapi import ...`, `from fastapi.security.api_key import ...`
(only needed if `--enable-test-hooks` or `--web-host`).
After:
- Drop these from top — lazy inside `HookServer.__init__` (which is itself lazy
in the controller: `if enable_test_hooks: from src.api_hooks import HookServer; ...`)
**Result:** ~470ms off the main thread for non-test, non-web launches. It also
matters for `live_gui` tests: they launch with `--enable-test-hooks`, so FastAPI
is still needed, but its import can be deferred until the asyncio loop is ready.
#### `src/commands.py` and `src/command_palette.py` (command palette lazy)
Top-level today: `from src.command_palette import ...` at `src/commands.py:1`.
After:
- Lazy in each `_*_command()` function in `src/commands.py` that actually
opens the palette
- The CommandRegistry decorator can keep module-level function references, but
the *body* of the command does the heavy import
**Result:** ~244ms off if user doesn't open palette during the first session.
#### `src/theme_2.py` and `src/theme_nerv.py` / `src/theme_nerv_fx.py` (NERV theme lazy)
Top-level today: NERV modules imported at `src/theme_2.py` module top.
After:
- Lazy in `apply_nerv_theme()` (the function that activates NERV)
- The default theme path stays lean (uses only `src/theme_2.py` + `src/theme_models.py`)
**Result:** ~485ms off if user doesn't pick NERV theme (the default path).
#### `src/markdown_helper.py` (markdown table lazy)
Top-level today: `from src.markdown_table import ...` at `src/markdown_helper.py:1`.
After:
- Lazy in `_render_table_block()` (or wherever GFM table detection happens)
- The first markdown render that hits a table pays the 250ms; later renders find
the module already cached in `sys.modules`
**Result:** ~250ms off startup; renders without tables (the typical case) never pay it.
#### `src/imgui_scopes.py`, `src/gui_2.py`, `src/bg_shader.py` (KEEP `imgui_bundle`)
These MUST keep `import imgui_bundle` at top — the ImGui render loop is the hot
path and needs the module on first frame. There is no way to defer this without
breaking the render loop.
What CAN be deferred inside `src/gui_2.py`:
- `import numpy` (only needed for `bg_shader`; the GUI itself doesn't need numpy
on the first frame)
- Other feature-gated imports
#### `src/gui_2.py` direct heavy imports (audit)
We will use AST to audit which `import X` statements at `src/gui_2.py` top-level
are reachable from the first-frame render path (`render_main_window`,
`render_main_menu_bar`, etc.) and which are feature-gated. Feature-gated ones
move inside the function that gates them.
### 3.2 Job pool scaffolding
New code in `src/app_controller.py`:
```python
from concurrent.futures import Future, ThreadPoolExecutor


class AppController:
    def __init__(self) -> None:
        # ... existing init; create the pool after the asyncio loop starts.
        self._io_pool = ThreadPoolExecutor(
            max_workers=4,
            thread_name_prefix="controller-io",
        )

    def submit_io(self, fn, *args, **kwargs) -> Future:
        """Submit a background job to the shared I/O pool.

        Use this instead of threading.Thread for new background work.
        Returns a concurrent.futures.Future. Callers can call .result() to
        block, or .add_done_callback() for fire-and-forget with error handling.
        """
        return self._io_pool.submit(fn, *args, **kwargs)
```
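In `AppController.shutdown()` (or wherever lifecycle cleanup lives):
`self._io_pool.shutdown(wait=False, cancel_futures=True)`. The call returns
immediately and drops jobs that have not started (`cancel_futures` needs
Python 3.9+). Note that since Python 3.9 the pool's workers are not daemon
threads: the interpreter joins them at exit, so a job that is still running
delays process exit until it finishes.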
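For the fire-and-forget case, a done-callback can surface exceptions that would otherwise sit unread on the `Future`. The sketch below is illustrative only: the module-level `pool`, `submit_logged`, and `_log_failure` are hypothetical names, not part of `AppController`.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor

log = logging.getLogger("controller-io")

# Stand-in for AppController._io_pool in this sketch.
pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="controller-io")


def _log_failure(fut: Future) -> None:
    """Done-callback: log the exception of a job nobody is waiting on."""
    if fut.cancelled():
        return
    exc = fut.exception()
    if exc is not None:
        log.error("background job failed", exc_info=exc)


def submit_logged(fn, *args, **kwargs) -> Future:
    """Submit a job and make sure a failure is at least logged."""
    fut = pool.submit(fn, *args, **kwargs)
    fut.add_done_callback(_log_failure)
    return fut
```

The callback runs on the worker thread that finished the job (or immediately on the submitting thread if the job is already done), so it should stay cheap.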
### 3.3 Startup timing instrumentation
Add `src/startup_profiler.py`:
```python
import time
from collections.abc import Iterator
from contextlib import contextmanager


class StartupProfiler:
    """Records wall-clock time spent in each named init phase.

    Cheap (no I/O). Stored on AppController.startup_profile for later inspection
    via the Hook API (`GET /api/startup_profile`) and the Diagnostics panel.
    """

    def __init__(self) -> None:
        # (name, start, duration_ms); start is a time.perf_counter() reading
        self._phases: list[tuple[str, float, float]] = []

    @contextmanager
    def phase(self, name: str) -> Iterator[None]:
        t0 = time.perf_counter()
        try:
            yield
        finally:
            # Record the phase even if the init step raises.
            self._phases.append((name, t0, (time.perf_counter() - t0) * 1000))
```
Used at every major init step in `AppController.__init__` and `App.__init__`.
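One possible way to turn the recorded `(name, start, duration_ms)` tuples into data for the Hook API and the Diagnostics panel is sketched below. The payload shape and the name `profile_payload` are assumptions for illustration; the spec does not define the response format. The total assumes phases do not nest.

```python
def profile_payload(phases: list[tuple[str, float, float]]) -> dict:
    """Summarize (name, start, duration_ms) tuples in start order."""
    ordered = sorted(phases, key=lambda p: p[1])
    return {
        # Sum of durations; only meaningful if phases are not nested.
        "total_ms": round(sum(d for _, _, d in ordered), 1),
        "phases": [{"name": n, "duration_ms": round(d, 1)} for n, _, d in ordered],
        "slowest": max(ordered, key=lambda p: p[2])[0] if ordered else None,
    }
```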
---
## 4. Phases
### Phase 1: Audit + Benchmark + Foundation (Day 1)
- T1.1: Run `scripts/benchmark_imports.py` and capture baseline
- T1.2: AST-audit every `import X` in `src/*.py` to map which is reachable
from the first-frame render path vs feature-gated
- T1.3: Add `StartupProfiler` to `src/app_controller.py` and instrument
current init
- T1.4: Add `scripts/audit_main_thread_imports.py` (static gate)
- T1.5: Commit baseline + audit script
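A cold-import measurement has to run in a fresh interpreter, since anything already in `sys.modules` would make the import look free. The helper below is a hypothetical sketch of that idea and is not a description of what `scripts/benchmark_imports.py` actually does.

```python
import subprocess
import sys


def cold_import_ms(module: str) -> float:
    """Time `import <module>` in a fresh interpreter, in milliseconds."""
    probe = (
        "import time\n"
        "t0 = time.perf_counter()\n"
        f"import {module}\n"
        "print((time.perf_counter() - t0) * 1000)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True, text=True, check=True,
    )
    # Take the last line in case the imported module prints during import.
    return float(result.stdout.strip().splitlines()[-1])
```

For per-module breakdowns, CPython's `-X importtime` flag is another option; the sketch above only reports the total for one import statement.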
### Phase 2: Job Pool Foundation (Day 1) — the "no new threads" rule
- T2.1 (TDD Red): Write `tests/test_app_controller_io_pool.py` asserting
`AppController` has a `_io_pool: ThreadPoolExecutor` with 4 workers, named
`controller-io_*` (`ThreadPoolExecutor` appends `_<n>` to the prefix)
- T2.2 (Green): Add `self._io_pool = ThreadPoolExecutor(max_workers=4,
thread_name_prefix="controller-io")` to `AppController.__init__`. Add
`submit_io(fn, *args, **kwargs)` helper. Wire shutdown into `controller.shutdown()`.
- T2.3: Verify T2.1 tests pass + full suite still passes
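The worker-count and naming assertion in T2.1 can be checked through the public API alone. A sketch, with `observed_worker_names` as a hypothetical helper: a barrier forces all expected workers to be busy at once, and each job reports its thread name. CPython names pool threads `<prefix>_<n>`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def observed_worker_names(pool: ThreadPoolExecutor, expected_workers: int = 4) -> set[str]:
    """Return the names of the threads that run `expected_workers` concurrent jobs."""
    # The barrier breaks after 5s if the pool has fewer workers than expected,
    # which surfaces as BrokenBarrierError from .result().
    barrier = threading.Barrier(expected_workers, timeout=5)

    def report() -> str:
        barrier.wait()  # hold each job until all workers are busy at once
        return threading.current_thread().name

    futures = [pool.submit(report) for _ in range(expected_workers)]
    return {f.result() for f in futures}
```

In the real test the pool would come from an `AppController` instance instead of being constructed directly.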
### Phase 3: Lazy-load AI provider SDKs (Day 2)
- T3.1 (TDD Red): Write `tests/test_ai_client_lazy_imports.py` asserting
`import src.ai_client` does NOT import any provider SDK
- T3.2 (Green): Move `from google import genai` / `import anthropic` /
`import openai` / `import requests` into their respective `_send_*` functions
- T3.3: Verify existing `tests/test_ai_client.py` still passes
- T3.4: Commit, re-run benchmark, expect `import src.ai_client` < 50ms
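The "does NOT import" assertions in T3.1 (and T4.1 later) are most reliable in a fresh interpreter, because the test process itself may already have loaded the SDK. One way to sketch that check, with `modules_loaded_by` as a hypothetical helper name:

```python
import subprocess
import sys


def modules_loaded_by(import_stmt: str, candidates: list[str]) -> list[str]:
    """Run `import_stmt` in a fresh interpreter and return which of
    `candidates` ended up in sys.modules."""
    probe = (
        f"{import_stmt}\n"
        "import sys\n"
        f"print(','.join(m for m in {candidates!r} if m in sys.modules))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True, text=True, check=True,
    )
    lines = result.stdout.strip().splitlines()
    return [m for m in (lines[-1].split(",") if lines else []) if m]
```

A lazy-import test could then assert that `modules_loaded_by("import src.ai_client", ["anthropic", "openai", "google.genai", "requests"])` returns an empty list, assuming the project root is the working directory.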
### Phase 4: Lazy-load FastAPI in `HookServer` (Day 2)
- T4.1 (TDD Red): Write `tests/test_hook_server_lazy_fastapi.py` asserting
`from src.api_hooks import HookServer` does NOT import fastapi
- T4.2 (Green): Move `from fastapi import ...` inside the methods that need them
- T4.3: Verify existing `tests/test_api_hooks.py` still passes
- T4.4: Commit
### Phase 5: Lazy-load feature-gated GUI modules (Day 3)
- T5.1: Lazy-load `src.command_palette` in `src/commands.py`
- T5.2: Lazy-load `src.theme_nerv` and `src.theme_nerv_fx` in `src/theme_2.py`
- T5.3: Lazy-load `src.markdown_table` in `src/markdown_helper.py`
- T5.4: Audit and lazy-load feature-gated imports in `src/gui_2.py`
- T5.5: Run all GUI tests; fix any circular imports
- T5.6: Commit per task
### Phase 6: Migrate ad-hoc threads to `_io_pool` (Day 4)
- T6.1: Audit: `grep -rn "threading.Thread(" src/` to find all ad-hoc
thread spawns (excluding `HookServer` and `WorkerPool` which are domain-specific)
- T6.2: Refactor each ad-hoc thread to use `controller.submit_io(fn)` instead
- T6.3: Per-migration commit
- T6.4: Final `grep -rn "threading.Thread(" src/` shows ZERO ad-hoc spawns
(only the excluded domain-specific `HookServer` and `WorkerPool` entries remain; every spawn from the T6.1 audit list is migrated and no new entries appear)
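As a complement to the grep, an AST pass avoids matches inside comments and strings and yields the file and line of each spawn. This is a hypothetical helper, not part of the track's scripts; it only recognizes the spellings `threading.Thread(...)` and a bare `Thread(...)`.

```python
import ast


def find_thread_spawns(source: str, filename: str = "<src>") -> list[dict]:
    """Return {"file", "line"} for each threading.Thread(...) / Thread(...) call."""
    spawns = []
    for node in ast.walk(ast.parse(source, filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_attr = (
            isinstance(func, ast.Attribute)
            and func.attr == "Thread"
            and isinstance(func.value, ast.Name)
            and func.value.id == "threading"
        )
        is_name = isinstance(func, ast.Name) and func.id == "Thread"
        if is_attr or is_name:
            spawns.append({"file": filename, "line": node.lineno})
    return sorted(spawns, key=lambda s: s["line"])
```

Each reported spawn would then be rewritten from `threading.Thread(target=fn).start()` to `controller.submit_io(fn)` and committed separately, per T6.2 and T6.3.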
### Phase 7: Enforcement — Runtime Audit Hook (Day 4)
- T7.1 (TDD Red): `tests/test_main_thread_purity.py` — spawn `sloppy.py
--headless --enable-test-hooks` with a `sys.addaudithook` shim, verify no
heavy import happens on the main thread
- T7.2: Once Phase 3-5 land, this test should start passing. Wire into CI.
- T7.3: Commit
### Phase 8: Hook API + Diagnostics (Day 5)
- T8.1: Add `/api/startup_profile` endpoint
- T8.2: Add `/api/io_pool_status` endpoint
- T8.3: Add to `_gettable_fields` and the Diagnostics panel
- T8.4: Document in `docs/guide_api_hooks.md`
- T8.5: Tests + commit
### Phase 9: Verify + Checkpoint (Day 5)
- T9.1: Re-run `scripts/benchmark_imports.py`; confirm `import src.ai_client`
is < 50ms and `import src.gui_2` is < 500ms (the §6 targets)
- T9.2: Re-run `scripts/audit_main_thread_imports.py`; exit 0
- T9.3: Run `tests/test_main_thread_purity.py`; pass
- T9.4: Run full `live_gui` test batch; `wait_for_server(timeout=15)` no
longer times out
- T9.5: Manual smoke test: `uv run sloppy.py` and
`uv run sloppy.py --enable-test-hooks` both feel snappier
- T9.6: Phase checkpoint commit with full verification report
---
## 5. Risks and Mitigations
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Lazy import inside a hot path adds latency on every call | Med | Med | Always gate the import with `sys.modules` check OR use module-level sentinel |
| First AI call on the asyncio thread blocks for ~955ms while `google.genai` imports | High | Low | The user already paid this latency budget; happens on the asyncio worker, not main. Document the expected first-call pause. |
| Lazy import surfaces circular import that was hidden by top-level ordering | Med | Med | Phase 1 audit catches this; defer each lazy import to the test phase |
| Test fixtures import the heavy module before main code, breaking assumptions | Low | Low | `reset_ai_client` and `isolate_workspace` fixtures already lazy-reset |
| Hot reload of a now-lazy module doesn't trigger | Low | Med | Update `HotReloader.HOT_MODULES` to register the lazy module's gate function |
| `_io_pool` worker importing a heavy module holds GIL and stutters GUI | Med | Low | The pool is capped at 4 threads; stutter is bounded; user sees responsive UI before any stutter |
| A future commit re-introduces a heavy import on the main thread | Med | High | Static gate (`audit_main_thread_imports.py`, CI) + runtime audit hook (`test_main_thread_purity.py`) catch this |
### Hot Reload consideration
`src/hot_reloader.py` registers modules at import time. Lazy-loaded modules
(imported inside functions) are NOT registered. The hot-reload workflow needs:
- Either: register the lazy module with a callback that forces a re-import via
`importlib.reload`
- Or: explicitly trigger the lazy import on hot-reload trigger
This is a small follow-up task; the lazy import itself doesn't break hot reload
(it just means you have to invoke the gate function once to materialize the
module before reload can take effect).
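The second option (trigger the lazy import on hot-reload) could be as small as the sketch below. `reload_lazy` is a hypothetical helper and says nothing about how `HotReloader` is actually structured.

```python
import importlib
import sys


def reload_lazy(module_name: str):
    """Reload a module if it is already loaded; otherwise import it now."""
    module = sys.modules.get(module_name)
    if module is None:
        # Never materialized: a plain import already picks up the new source.
        return importlib.import_module(module_name)
    return importlib.reload(module)
```

Importing eagerly on reload trades the lazy-load saving for predictability; skipping unloaded modules entirely would be the alternative design.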
---
## 6. Verification Criteria
The track is complete when:
- [ ] `import src.ai_client` cold start < 50ms (down from ~1800ms)
- [ ] `import src.gui_2` cold start < 500ms (down from ~3000ms)
- [ ] `import src.app_controller` cold start < 300ms (down from ~700ms)
- [ ] `uv run sloppy.py --enable-test-hooks` reaches `immapp.run()` in < 1.5s
- [ ] `live_gui.wait_for_server(timeout=15)` passes for all 273+ tests
- [ ] `scripts/audit_main_thread_imports.py` exits 0 (no heavy imports on main)
- [ ] `tests/test_main_thread_purity.py` passes (runtime audit hook confirms invariant)
- [ ] `scripts/benchmark_imports.py` shows no new red entries in the top-20
- [ ] First AI call latency on the asyncio thread is < 1500ms (it pays the SDK load once;
every later call skips the import). Main thread sees ZERO
of this cost.
- [ ] No regressions in the existing 272/273 passing tests
- [ ] `grep -rn "threading.Thread(" src/` shows ZERO new spawns after Phase 6
migration (only the existing project scaffolding threads like `HookServer`
and `WorkerPool` remain, and they're domain-specific)
- [ ] Startup profile + io_pool status visible in `/api/startup_profile`,
`/api/io_pool_status`, and the Diagnostics panel
---
## 7. Out of Scope
- Process-isolation of heavy SDKs (Layer 4 in §2.2) — future track
- `imgui_bundle` lazy loading — fundamentally impossible (ImGui hot path)
- Eliminating the main-thread import of the lean `gui_2` skeleton (~300ms unavoidable)
- `pydantic` lazy loading (used by `src/models.py` which is imported by 16 files;
the cost is already amortized and deferring it would cascade)
- Prefetch / warm-up of the heavy SDKs in the background (Layer 3 in §2.2 is
deliberately the "do nothing" layer; the user pays the import cost once on
first use, on the asyncio thread, not in the background)
---
## 8. Cross-References
- `conductor/tracks.md` line 152 — original backlog entry that this track fulfills
- `docs/guide_architecture.md:43-67` — thread domains (asyncio worker is the right
place for heavy work)
- `docs/guide_architecture.md:880-898` — Architectural Invariants (single-writer
principle; this track respects it)
- `docs/guide_app_controller.md:241-271` — existing `get_rag_engine` /
`get_mma_conductor` lazy patterns (the templates this track replicates)
- `docs/guide_hot_reload.md:295-312` — what is/isn't safe to hot-reload
(lazy-loaded modules need a small follow-up)
- `conductor/workflow.md` — TDD Red-Green-Refactor protocol + atomic per-task
commits + git notes
- `scripts/benchmark_imports.py`: the measurement tool used for the T1.1 baseline and the T9.1 re-run
@@ -0,0 +1,110 @@
# Track state for startup_speedup_20260606
# Updated by Tier 2 Tech Lead as tasks complete
[meta]
track_id = "startup_speedup_20260606"
name = "Sloppy.py Startup Speedup"
status = "active"
current_phase = 1
last_updated = "2026-06-06"
[phases]
phase_1 = { status = "in_progress", checkpoint_sha = "", name = "Audit + Benchmark + Foundation" }
phase_2 = { status = "pending", checkpoint_sha = "", name = "Job Pool Foundation (no new threads)" }
phase_3 = { status = "pending", checkpoint_sha = "", name = "Lazy-load AI provider SDKs" }
phase_4 = { status = "pending", checkpoint_sha = "", name = "Lazy-load FastAPI in HookServer" }
phase_5 = { status = "pending", checkpoint_sha = "", name = "Lazy-load feature-gated GUI modules" }
phase_6 = { status = "pending", checkpoint_sha = "", name = "Migrate ad-hoc threads to _io_pool" }
phase_7 = { status = "pending", checkpoint_sha = "", name = "Enforcement: runtime audit hook" }
phase_8 = { status = "pending", checkpoint_sha = "", name = "Hook API + Diagnostics" }
phase_9 = { status = "pending", checkpoint_sha = "", name = "Verify + Checkpoint" }
[tasks]
# Phase 1: Audit + Benchmark + Foundation
t1_1 = { status = "pending", commit_sha = "", description = "Capture baseline benchmark" }
t1_2 = { status = "pending", commit_sha = "", description = "Audit src/gui_2.py imports (first-frame vs feature-gated)" }
t1_3 = { status = "pending", commit_sha = "", description = "Add StartupProfiler and instrument init" }
t1_4 = { status = "pending", commit_sha = "", description = "Write scripts/audit_main_thread_imports.py (static CI gate)" }
t1_5 = { status = "pending", commit_sha = "", description = "Commit baseline + audit script" }
# Phase 2: Job Pool Foundation
t2_1 = { status = "pending", commit_sha = "", description = "Red: tests/test_app_controller_io_pool.py" }
t2_2 = { status = "pending", commit_sha = "", description = "Green: add _io_pool ThreadPoolExecutor + submit_io helper to AppController" }
t2_3 = { status = "pending", commit_sha = "", description = "Confirm T2.1 tests pass + full suite still passes" }
t2_4 = { status = "pending", commit_sha = "", description = "Commit T2" }
# Phase 3: Lazy-load AI Provider SDKs
t3_1 = { status = "pending", commit_sha = "", description = "Red: tests/test_ai_client_lazy_imports.py" }
t3_2 = { status = "pending", commit_sha = "", description = "Green: move provider SDK imports into _send_* funcs" }
t3_3 = { status = "pending", commit_sha = "", description = "Fix existing test_ai_client.py breakage" }
t3_4 = { status = "pending", commit_sha = "", description = "Confirm T3.1 tests PASS" }
t3_5 = { status = "pending", commit_sha = "", description = "Commit T3" }
t3_6 = { status = "pending", commit_sha = "", description = "Update tracks.md T3 row" }
# Phase 4: Lazy-load FastAPI
t4_1 = { status = "pending", commit_sha = "", description = "Red: tests/test_hook_server_lazy_fastapi.py" }
t4_2 = { status = "pending", commit_sha = "", description = "Green: move fastapi imports into HookServer methods" }
t4_3 = { status = "pending", commit_sha = "", description = "Fix existing test_api_hooks.py breakage" }
t4_4 = { status = "pending", commit_sha = "", description = "Confirm T4.1 tests PASS" }
t4_5 = { status = "pending", commit_sha = "", description = "Commit T4" }
# Phase 5A: Command Palette
t5a_1 = { status = "pending", commit_sha = "", description = "Red: tests/test_command_palette_lazy.py" }
t5a_2 = { status = "pending", commit_sha = "", description = "Green: lazy-load in src/commands.py" }
t5a_3 = { status = "pending", commit_sha = "", description = "Fix existing test_command_palette.py" }
t5a_4 = { status = "pending", commit_sha = "", description = "Commit T5A" }
# Phase 5B: NERV Theme
t5b_1 = { status = "pending", commit_sha = "", description = "Red: tests/test_theme_nerv_lazy.py" }
t5b_2 = { status = "pending", commit_sha = "", description = "Green: lazy-load in src/theme_2.py" }
t5b_3 = { status = "pending", commit_sha = "", description = "Fix existing test_theme_2.py + test_theme_nerv.py" }
t5b_4 = { status = "pending", commit_sha = "", description = "Commit T5B" }
# Phase 5C: Markdown Table
t5c_1 = { status = "pending", commit_sha = "", description = "Red: tests/test_markdown_helper_lazy.py" }
t5c_2 = { status = "pending", commit_sha = "", description = "Green: lazy-load in src/markdown_helper.py" }
t5c_3 = { status = "pending", commit_sha = "", description = "Fix existing test_markdown_helper.py" }
t5c_4 = { status = "pending", commit_sha = "", description = "Commit T5C" }
# Phase 5D: gui_2 feature-gated imports
t5d_1 = { status = "pending", commit_sha = "", description = "Run audit_gui2_imports.py and collect feature-gated list" }
t5d_2 = { status = "pending", commit_sha = "", description = "Apply TDD pattern per feature-gated import" }
t5d_3 = { status = "pending", commit_sha = "", description = "Run full GUI test suite; fix" }
t5d_4 = { status = "pending", commit_sha = "", description = "Commit per feature group" }
# Phase 6: Migrate ad-hoc threads
t6_1 = { status = "pending", commit_sha = "", description = "Audit threading.Thread( spawns; document each" }
t6_2 = { status = "pending", commit_sha = "", description = "Refactor each ad-hoc thread to use controller.submit_io" }
t6_3 = { status = "pending", commit_sha = "", description = "Run full test suite; fix" }
t6_4 = { status = "pending", commit_sha = "", description = "Commit per migration; final grep shows zero new spawns" }
# Phase 7: Enforcement - Runtime Audit Hook
t7_1 = { status = "pending", commit_sha = "", description = "Red: tests/test_main_thread_purity.py" }
t7_2 = { status = "pending", commit_sha = "", description = "Confirm test passes after Phase 3-5" }
t7_3 = { status = "pending", commit_sha = "", description = "Wire into CI as @pytest.mark.slow gating test" }
t7_4 = { status = "pending", commit_sha = "", description = "Commit T7" }
# Phase 8: Hook API + Diagnostics
t8_1 = { status = "pending", commit_sha = "", description = "Add /api/startup_profile endpoint" }
t8_2 = { status = "pending", commit_sha = "", description = "Add /api/io_pool_status endpoint" }
t8_3 = { status = "pending", commit_sha = "", description = "Add startup profile + io_pool status to Diagnostics panel" }
t8_4 = { status = "pending", commit_sha = "", description = "Update docs/guide_api_hooks.md" }
t8_5 = { status = "pending", commit_sha = "", description = "Tests for endpoints + profiler round-trip" }
t8_6 = { status = "pending", commit_sha = "", description = "Commit T8" }
# Phase 9: Verify + Checkpoint
t9_1 = { status = "pending", commit_sha = "", description = "Re-run benchmark; diff vs baseline" }
t9_2 = { status = "pending", commit_sha = "", description = "Re-run audit_main_thread_imports.py; exit 0" }
t9_3 = { status = "pending", commit_sha = "", description = "Run test_main_thread_purity.py; pass" }
t9_4 = { status = "pending", commit_sha = "", description = "Run live_gui test batch; confirm wait_for_server passes" }
t9_5 = { status = "pending", commit_sha = "", description = "Manual smoke (normal, test-hooks, headless modes)" }
t9_6 = { status = "pending", commit_sha = "", description = "Phase checkpoint commit + git note" }
t9_7 = { status = "pending", commit_sha = "", description = "Update tracks.md; archive track" }
[verification]
# To be filled at Phase 9
baseline_ai_client_ms = 0
after_ai_client_ms = 0
baseline_gui_2_ms = 0
after_gui_2_ms = 0
baseline_app_controller_ms = 0
after_app_controller_ms = 0
live_gui_passed = 0
live_gui_failed = 0
audit_main_thread_violations = 0
io_pool_max_workers = 4
io_pool_thread_name_prefix = "controller-io"
new_threading_thread_calls = 0
[ad_hoc_threads]
# Filled in Phase 6 T6.1 audit
# Format: {file = "src/foo.py", line = 42, current_target = "lambda", proposed_target = "controller.submit_io(...)"}