Plan: Sloppy.py Startup Speedup
Track: startup_speedup_20260606
Spec: ./spec.md
Status: In progress
Started: 2026-06-06
Phase 1: Audit + Benchmark + Foundation
- T1.1 Capture baseline with scripts/benchmark_imports.py --runs=3 --color=never > docs/startup_baseline_20260606.txt
- T1.2 Write scripts/audit_gui2_imports.py (AST walker): for each import X in src/gui_2.py, classify as first-frame (reachable from main()/render_main_window etc.) vs feature-gated (inside an if/elif branch that requires user action). Commit audit results to docs/startup_audit_20260606.md.
- T1.3 Add src/startup_profiler.py with StartupProfiler class (context manager phase(name)). Wire into AppController.__init__ and App.__init__ at 8 major init points. (No new test; verify via manual run + diagnostics panel.) [T1.3: 5a856536]
- T1.4 Write scripts/audit_main_thread_imports.py (static gate, fails CI). AST-walks the import graph reachable from sloppy.py, collects all top-level import X / from X import Y, compares against an allowlist. Exits non-zero with file:line:module on violation. Allowlist: sys.stdlib_module_names + the lean gui_2 skeleton list from spec.md:2.1 (imgui_bundle, defer, src.imgui_scopes, src.theme_2 (default theme only), src.theme_models, src.paths, src.models, src.events).
- T1.5 Commit baseline + audit script: git add . && git commit -m "conductor(startup): baseline measurements + main thread import audit script" + git note
Phase 1 checkpoint: Baseline established. Static gate exists. All three import classes (first-frame, feature-gated, background-safe) documented.
Phase 2: Job Pool + Warmup Foundation (the "no new threads" + "no lazy-loading" rules)
Two user constraints, addressed together:
- No new threading.Thread(...) per task, per import, per ad-hoc job.
- No lazy-loading in function bodies. Heavy imports are warmed on bg threads at startup, not loaded on first use.
The codebase gets ONE shared ThreadPoolExecutor on AppController named _io_pool, used for warmup AND any future background work.
- T2.1 (Red) tests/test_app_controller_io_pool.py:
  - test_app_controller_has_io_pool: instantiate AppController, assert hasattr(controller, '_io_pool') and it's a ThreadPoolExecutor
  - test_io_pool_uses_named_threads: submit a job, assert the executing thread name starts with controller-io
  - test_io_pool_size_is_4: assert _io_pool._max_workers == 4
  - test_io_pool_shuts_down_on_close: call controller.shutdown(), assert the pool is shut down
  - Confirm FAIL (no _io_pool yet)
- T2.2 (Green) In src/app_controller.py:
  - Add from concurrent.futures import ThreadPoolExecutor and import importlib at top
  - In __init__, after the asyncio loop starts and BEFORE the existing HookServer block: self._io_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="controller-io")
  - Add warmup state: self._warmup_lock, self._warmup_done_event, self._warmup_status (dict with pending/completed/failed lists), self._warmup_callbacks
  - Call self._submit_warmup_jobs() at the end of __init__
  - In shutdown() (already exists in App.shutdown for the GUI; ensure the AppController has a matching shutdown that calls self._io_pool.shutdown(wait=False))
- T2.3 (Red) tests/test_warmup_mechanism.py:
  - test_warmup_jobs_submitted_on_init: after AppController.__init__, assert len(controller.warmup_status()['pending']) > 0
  - test_warmup_jobs_complete_within_timeout: call controller.wait_for_warmup(timeout=10.0), assert True
  - test_warmup_status_reflects_completion: after wait_for_warmup, assert controller.is_warmup_done() == True and len(warmup_status()['pending']) == 0
  - test_warmup_callback_fires_on_completion: register a callback via controller.on_warmup_complete(cb), assert it was called once warmup is done
  - test_warmup_does_not_block_init: time __init__ with a 4-worker pool, assert it returns in < 200ms even though warmup takes longer
  - Confirm FAIL (no warmup yet)
- T2.4 (Green) Implement _submit_warmup_jobs(), _compute_warmup_list(), _warmup_one(), warmup_status(), is_warmup_done(), wait_for_warmup(), on_warmup_complete() per spec §3.2. Warmup list includes: google.genai, anthropic, openai, requests, src.command_palette, src.theme_nerv, src.theme_nerv_fx, src.markdown_table, numpy. Conditionally adds fastapi, fastapi.security.api_key if enable_test_hooks or web_host is set.
- T2.5 Run T2.1 and T2.3 tests; confirm PASS
- T2.6 Commit: feat(app_controller): add _io_pool + proactive warmup mechanism + git note
Phase 2 checkpoint: AppController owns a 4-thread named pool. Warmup jobs are submitted in __init__ and complete in the background. controller.wait_for_warmup(), controller.warmup_status(), and controller.on_warmup_complete(cb) are the public API. Main thread does NOT block waiting for warmup.
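A minimal sketch of the pool-plus-warmup shape described in T2.2 and T2.4, under stated assumptions: `WarmupController` is a hypothetical stand-in for the relevant slice of AppController, the module list is passed in rather than computed by `_compute_warmup_list()`, and the callback receiving the final status dict is an assumption (the plan does not state the callback signature). Spec §3.2 remains the authority for the real behaviour.

```python
import importlib
import threading
from concurrent.futures import ThreadPoolExecutor


class WarmupController:
    """Hypothetical stand-in for the warmup-related parts of AppController."""

    def __init__(self, modules):
        names = list(modules)
        self._io_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="controller-io")
        self._warmup_lock = threading.Lock()
        self._warmup_done_event = threading.Event()
        self._warmup_status = {"pending": list(names), "completed": [], "failed": []}
        self._warmup_callbacks = []
        if not names:
            self._warmup_done_event.set()
        for name in names:  # submit and return immediately; __init__ never waits
            self._io_pool.submit(self._warmup_one, name)

    def _warmup_one(self, name):
        try:
            importlib.import_module(name)
            bucket = "completed"
        except Exception:  # a failed warmup is recorded, not raised
            bucket = "failed"
        with self._warmup_lock:
            self._warmup_status["pending"].remove(name)
            self._warmup_status[bucket].append(name)
            finished = not self._warmup_status["pending"]
            callbacks = list(self._warmup_callbacks) if finished else []
            if finished:
                self._warmup_done_event.set()
        for cb in callbacks:  # run callbacks outside the lock
            cb(self.warmup_status())

    def warmup_status(self):
        with self._warmup_lock:
            return {key: list(value) for key, value in self._warmup_status.items()}

    def is_warmup_done(self):
        return self._warmup_done_event.is_set()

    def wait_for_warmup(self, timeout=None):
        return self._warmup_done_event.wait(timeout)

    def on_warmup_complete(self, cb):
        with self._warmup_lock:
            already_done = self._warmup_done_event.is_set()
            if not already_done:
                self._warmup_callbacks.append(cb)
        if already_done:  # late registration still gets exactly one call
            cb(self.warmup_status())

    def shutdown(self):
        self._io_pool.shutdown(wait=False)


if __name__ == "__main__":
    controller = WarmupController(["json", "decimal", "no_such_module_xyz"])
    controller.wait_for_warmup(timeout=10.0)
    print(controller.warmup_status())
    controller.shutdown()
```

Setting the event and snapshotting the callback list under the same lock is what keeps a callback registered during the final job from being either dropped or called twice. Note that `wait_for_warmup()` can return slightly before a callback has finished running on the worker thread.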
Phase 3: Remove top-level heavy imports from src/ai_client.py (TDD)
The current src/ai_client.py has from google import genai etc. at the top,
which puts the main thread in the import chain. Phase 3 removes these and
swaps to _require_warmed(name).
- T3.1 (Red) Write tests/test_ai_client_no_top_level_sdk_imports.py:
  - test_ai_client_does_not_import_genai_at_module_level: spawn fresh subprocess, import src.ai_client, assert 'google.genai' not in sys.modules (warmup hasn't run in this subprocess)
  - test_ai_client_does_not_import_anthropic_at_module_level
  - test_ai_client_does_not_import_openai_at_module_level
  - test_ai_client_does_not_import_requests_at_module_level
  - Confirm tests FAIL (proves the imports are currently eager)
- T3.2 (Green) In src/ai_client.py:
  - Add import sys, importlib, threading at top
  - Remove from google import genai, import anthropic, import openai, import requests from top
  - Add _require_warmed(name) helper: returns sys.modules[name] or raises RuntimeError
  - Each _send_* function calls _require_warmed("google.genai") etc. instead of using the module directly
  - Provider client globals stay as None until first _send_* initializes them via _ensure_<provider>_client() (extracted from current top-level logic, uses the warmed module)
- T3.3 Run existing tests/test_ai_client.py; fix any breakage. Tests that relied on top-level import side effects need a fixture that warms the modules (or a fallback for test mode).
- T3.4 Re-run T3.1 tests, confirm PASS
- T3.5 Commit: refactor(ai_client): remove top-level SDK imports; use _require_warmed + git note
- T3.6 Update conductor/tracks.md T3 row with SHA
Phase 3 checkpoint: import src.ai_client < 50ms cold. When run inside an AppController whose warmup has completed, _send_* functions find the SDKs in sys.modules and execute instantly.
Phase 4: Remove top-level FastAPI imports from src/api_hooks.py (TDD)
Same pattern as Phase 3, for the FastAPI imports.
- T4.1 (Red) Write tests/test_hook_server_no_top_level_fastapi.py:
  - test_hook_server_does_not_import_fastapi_at_module_level: subprocess test
  - test_hook_server_does_not_import_fastapi_security_at_module_level
  - Confirm FAIL
- T4.2 (Green) In src/api_hooks.py:
  - Remove from fastapi import ..., from fastapi.security.api_key import ... from top
  - Add _require_warmed(name) calls inside the methods that need them (FastAPI app construction, route registration)
- T4.3 Run existing tests/test_api_hooks.py; fix breakage (similar fallback strategy as Phase 3)
- T4.4 Confirm T4.1 tests PASS
- T4.5 Commit: refactor(api_hooks): remove top-level fastapi imports; use _require_warmed + git note
Phase 4 checkpoint: from src.api_hooks import HookServer does not import fastapi. The HookServer is fully constructed only after AppController's warmup has loaded fastapi (or after _require_warmed("fastapi") triggers the import in test mode).
Phase 5: Remove top-level imports for feature-gated GUI modules (TDD per module)
5A: Command Palette
- T5A.1 (Red) tests/test_command_palette_no_top_level_import.py: from src.commands import COMMANDS does not import src.command_palette. Confirm FAIL.
- T5A.2 (Green) In src/commands.py: remove from src.command_palette import ... from top. The command functions (_open_command_palette, _toggle_command_palette) call _require_warmed("src.command_palette") to access the module.
- T5A.3 Run tests/test_command_palette.py; fix.
- T5A.4 Commit: refactor(commands): remove top-level command_palette import; use _require_warmed
5B: NERV Theme
- T5B.1 (Red) tests/test_theme_nerv_no_top_level_import.py: from src.theme_2 import * does not import src.theme_nerv or src.theme_nerv_fx. Confirm FAIL.
- T5B.2 (Green) In src/theme_2.py: remove from src.theme_nerv import ... and from src.theme_nerv_fx import ... from top. apply_nerv_theme() (or whichever function activates the theme) calls _require_warmed("src.theme_nerv") and _require_warmed("src.theme_nerv_fx").
- T5B.3 Run tests/test_theme_2.py and tests/test_theme_nerv.py; fix.
- T5B.4 Commit: refactor(theme): remove top-level nerv theme imports; use _require_warmed
5C: Markdown Table
- T5C.1 (Red) tests/test_markdown_helper_no_top_level_import.py: from src.markdown_helper import MarkdownRenderer does not import src.markdown_table. Confirm FAIL.
- T5C.2 (Green) In src/markdown_helper.py: remove from src.markdown_table import ... from top. The table-detection branch of render() calls _require_warmed("src.markdown_table").
- T5C.3 Run tests/test_markdown_helper.py; fix.
- T5C.4 Commit: refactor(markdown): remove top-level markdown_table import; use _require_warmed
5D: GUI module feature-gated imports
- T5D.1 Run scripts/audit_gui2_imports.py (built in T1.2); collect list of feature-gated imports in src/gui_2.py
- T5D.2 For each feature-gated import, apply the same TDD pattern (5A-5C). Group into 1-2 atomic commits per logical feature.
- T5D.3 Run full GUI test suite; fix.
- T5D.4 Commit per feature group
Phase 5 checkpoint: All heavy imports removed from main-thread-reachable source files. Default-theme / non-palette / non-table path is lean. Warmup pre-loads all of them in the background.
Phase 6: Migrate Ad-hoc Threads to _io_pool
The codebase has several ad-hoc threading.Thread(...) calls. Per the user
constraint, these should migrate to controller.submit_io(fn).
- T6.1 Audit: grep -rn "threading.Thread(" src/ to find all ad-hoc thread spawns. Document each in state.toml (a new [ad_hoc_threads] section).
- T6.2 For each ad-hoc thread in src/log_pruner.py, src/project_manager.py, etc., refactor to use controller.submit_io(fn) instead. Wrap the callable body in a try/except (the pool's default behavior is to surface exceptions via the Future; preserve existing error logging).
- T6.3 Run full test suite; fix.
- T6.4 Per-migration commit (or grouped by subsystem if 3+ threads in one file). Final commit: refactor: migrate ad-hoc threads to AppController._io_pool + git note.
Phase 6 checkpoint: grep -rn "threading.Thread(" src/ shows ZERO new spawns after this phase (existing project scaffolding threads like HookServer and MMA WorkerPool are exempt — they're domain-specific).
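The migration target in T6.2 can be illustrated with a small sketch. `IoPoolOwner` is a hypothetical stand-in for AppController, and the logger name is an assumption; only `submit_io(fn, *args)` and the 4-worker `controller-io` pool come from the plan. The wrapper logs the failure (preserving the error logging an ad-hoc thread body would have had) and re-raises so the Future still carries the exception.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor

log = logging.getLogger("controller-io")  # assumed logger name


class IoPoolOwner:
    """Hypothetical stand-in for the part of AppController that owns _io_pool."""

    def __init__(self):
        self._io_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="controller-io")

    def submit_io(self, fn, *args) -> Future:
        def _run():
            try:
                return fn(*args)
            except Exception:
                # Log here so failures stay visible even if nobody inspects the Future.
                log.exception("background job %r failed", getattr(fn, "__name__", fn))
                raise
        return self._io_pool.submit(_run)

    def shutdown(self):
        self._io_pool.shutdown(wait=False)


if __name__ == "__main__":
    owner = IoPoolOwner()
    # Before: threading.Thread(target=pow, args=(2, 5)).start()
    future = owner.submit_io(pow, 2, 5)
    print(future.result(timeout=5))  # 32
    owner.shutdown()
```

Unlike a bare thread, the returned Future gives callers a handle for results and errors, which is also useful in tests that need to wait for a migrated job.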
Phase 7: Warmup Notification (Hook API + GUI)
The user said: "the app controller should post to test clients or the user when its threads are warmed up with imports — that way the user knows 'hey you have the ui first, but now you have all the functionality.'" This phase implements the notification surfaces.
7A: Hook API endpoints
- T7A.1 (Red) tests/test_api_hooks_warmup.py:
  - test_warmup_status_endpoint: hit GET /api/warmup_status, assert response has pending/completed/failed keys
  - test_warmup_wait_endpoint: hit GET /api/warmup_wait?timeout=10, assert response includes the completion state
  - Confirm FAIL (endpoints don't exist yet)
- T7A.2 (Green) In src/api_hooks.py:
  - Add GET /api/warmup_status returning controller.warmup_status()
  - Add GET /api/warmup_wait accepting ?timeout=N (default 30s), calling controller.wait_for_warmup(timeout) then returning the final status
  - Register warmup_status in _gettable_fields so the existing Hook API client can fetch it
- T7A.3 Run T7A.1 tests; confirm PASS
- T7A.4 Commit: feat(api_hooks): add /api/warmup_status and /api/warmup_wait + git note
7B: GUI status indicator + toast
- T7B.1 In src/gui_2.py (in the status bar render function), poll controller.warmup_status() once per frame. While pending is non-empty: show "Warming up... (N/M)" text. When pending is empty AND failed is empty: show "All imports ready" with a green dot. When failed is non-empty: show "Imports: N failed" with a yellow dot.
- T7B.2 Register a callback via controller.on_warmup_complete(cb) that:
  - On transition to done (with no failures): queue a toast notification "All providers ready (M modules)" via the existing toast system
  - On transition to done (with failures): queue a warning toast "Warmup finished with N failures — see Diagnostics"
- T7B.3 Update docs/guide_gui_2.md (or wherever status bar is documented) to describe the new indicator
- T7B.4 Commit: feat(gui_2): warmup status indicator + completion toast + git note
Phase 7 checkpoint: Tests can poll /api/warmup_status to know when the system is fully ready. The GUI shows progress during startup and a toast when complete.
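The status-bar rules in T7B.1 reduce to a pure function of the warmup status dict, which keeps them testable without a GUI. In this sketch the function name `warmup_indicator` is hypothetical, and reading "N/M" as "finished jobs / total jobs" is an assumption; the label strings and dot colours are taken from T7B.1.

```python
from typing import Optional


def warmup_indicator(status: dict) -> tuple[str, Optional[str]]:
    """Map a warmup status dict to (label, dot_colour); colour is None while warming."""
    pending = status["pending"]
    completed = status["completed"]
    failed = status["failed"]
    if pending:
        finished = len(completed) + len(failed)  # assumed meaning of N
        total = finished + len(pending)          # assumed meaning of M
        return f"Warming up... ({finished}/{total})", None
    if failed:
        return f"Imports: {len(failed)} failed", "yellow"
    return "All imports ready", "green"


if __name__ == "__main__":
    print(warmup_indicator({"pending": ["numpy"], "completed": ["openai"], "failed": []}))
    print(warmup_indicator({"pending": [], "completed": ["openai"], "failed": ["numpy"]}))
    print(warmup_indicator({"pending": [], "completed": ["openai", "numpy"], "failed": []}))
```

The render function would then only need to call this once per frame and draw the returned label and dot.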
Phase 8: Enforcement (Runtime Audit Hook)
The static gate (T1.4) catches known imports at audit time. This phase adds
empirical enforcement: a test that spawns sloppy.py and verifies NO heavy
import happens on the main thread at runtime.
- T8.1 (Red) tests/test_main_thread_purity.py:
  - test_headless_startup_no_heavy_imports_on_main: spawn uv run python sloppy.py --headless --enable-test-hooks with a sitecustomize.py shim that installs sys.addaudithook to log every import event with the calling thread. The hook writes to a temp file as JSON-L.
  - Wait for headless server ready (5s timeout via ApiHookClient).
  - Read the audit log. Assert: no event with thread_name == "MainThread" for any module in the heavy denylist (google.genai, anthropic, openai, fastapi, requests, numpy, tkinter, psutil, pydantic, tree_sitter_*, src.command_palette, src.theme_nerv, src.theme_nerv_fx, src.markdown_table).
  - Kill subprocess. Confirm FAIL (current state imports these on main).
- T8.2 Once Phase 3-5 land and the static gate passes, this test should start passing. If it doesn't, debug and add more top-level import removals.
- T8.3 Wire test_main_thread_purity.py into CI as a gating test (it'll be slow, ~10s, so mark with @pytest.mark.slow and only run in batched CI).
- T8.4 Commit: test: empirical main-thread purity check via sys.audit hook + git note
Phase 8 checkpoint: CI fails if a future commit re-introduces a heavy main-thread import.
Phase 9: Verify + Phase Checkpoint
- T9.1 Re-run scripts/benchmark_imports.py --runs=3. Save to docs/startup_after_20260606.txt. Diff against T1.1 baseline; confirm:
  - import src.ai_client < 50ms
  - import src.gui_2 < 500ms
  - import src.app_controller < 300ms (includes _io_pool creation; should still be < 300ms)
- T9.2 Re-run scripts/audit_main_thread_imports.py (T1.4). Confirm exit 0. No violations.
- T9.3 Run tests/test_warmup_mechanism.py (T2.3); confirm warmup completes and notifications fire
- T9.4 Run live_gui test batch (per conductor/workflow.md:147-150: max 4 test files per batch, long timeout):
  - uv run pytest tests/test_live_gui_*.py --timeout=60 -v in batches
  - Confirm wait_for_server(timeout=15) does not time out
  - Optionally: tests can call controller.wait_for_warmup() before exercising functionality that depends on warmed modules
- T9.5 Manual smoke:
  - uv run sloppy.py (normal mode): time-to-first-frame, observe "Warming up... (N/M)" status, then "All imports ready" toast
  - uv run sloppy.py --enable-test-hooks (test mode): same observations, plus /api/warmup_status returns completed
  - uv run sloppy.py --headless (headless): time-to-server-ready
  - Verify a user action that switches provider (or other warmup-dependent operation) is INSTANT, not 1s-delayed
- T9.6 Phase checkpoint commit: conductor(checkpoint): Phase 9 complete - sloppy.py startup speedup track + git note with full verification report
- T9.7 Update conductor/tracks.md: mark track complete, link to archived folder
Phase 9 checkpoint: All verification criteria in spec.md:6 met. User can switch providers with zero perceptible lag because warmup already loaded the SDK.
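For spot-checking a single T9.1 budget outside the benchmark script, a cold import can be timed in a fresh interpreter. This is a rough sketch and not a replacement for scripts/benchmark_imports.py: the helper names are hypothetical, a single run is noisy, and the budgets dict in the demo uses a stdlib module purely so the example runs anywhere.

```python
import subprocess
import sys


def cold_import_ms(module: str) -> float:
    """Time `import module` in a fresh interpreter, in milliseconds."""
    code = (
        "import time; t0 = time.perf_counter(); "
        f"import {module}; "
        "print((time.perf_counter() - t0) * 1000.0)"
    )
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, check=True
    )
    # Take the last line in case the imported module prints something itself.
    return float(result.stdout.strip().splitlines()[-1])


def check_budgets(budgets_ms: dict) -> dict:
    """Return {module: (measured_ms, within_budget)} for each budget entry."""
    report = {}
    for module, budget in budgets_ms.items():
        measured = cold_import_ms(module)
        report[module] = (measured, measured < budget)
    return report


if __name__ == "__main__":
    # In the project this would be e.g. {"src.ai_client": 50, "src.gui_2": 500,
    # "src.app_controller": 300}, run from the repository root.
    for module, (ms, ok) in check_budgets({"json": 500}).items():
        print(f"{module}: {ms:.1f} ms {'OK' if ok else 'OVER BUDGET'}")
```

Because each measurement starts a new process, nothing imported by an earlier measurement can make a later one look faster.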
Definition of Done
- All Phase 1-9 tasks checked
- All tests pass (273+ existing + new TDD tests including test_main_thread_purity and test_warmup_mechanism)
- uv run ruff check . and uv run mypy --explicit-package-bases . clean (per mma-tier2-tech-lead skill)
- uv run python scripts/audit_main_thread_imports.py exits 0
- docs/startup_baseline_20260606.txt and docs/startup_after_20260606.txt archived
- Phase 9 git note contains: baseline diff, audit script result, runtime audit hook result, full test batch results, manual smoke timings, file inventory
- Track moved to conductor/tracks/archive/
- NO new threading.Thread(...) calls in src/ (verified by grep -rn "threading.Thread(" src/)
- NO import X statements in function bodies for heavy modules, verified by grep -rn "^\s*import \(google\|anthropic\|openai\|fastapi\|src\.command_palette\|src\.theme_nerv\|src\.markdown_table\)" src/
- Warmup completion notification works: GUI shows toast, Hook API returns completed, controller.is_warmup_done() returns True within 10s of startup
- User action latency is zero for warmup-dependent operations: manual smoke test switching providers / opening palette / rendering NERV is instant
Notes for Tier 3 Workers
- Always use 1-space indentation for Python code. Confirm via uv run python -c "import ast; ..." AST check if you do any class-body reorganization (the "Indentation-Driven Class Method Visibility" pitfall in conductor/workflow.md).
- Test fixtures: isolate_workspace, reset_paths, reset_ai_client, vlogger, kill_process_tree, mock_app, live_gui (see docs/guide_testing.md).
- Subprocess tests for module-level imports: spawn a fresh interpreter (uv run python -c "...", or sys.executable as in the pattern here) and inspect sys.modules after the import. Pattern:

      result = subprocess.run(
       [sys.executable, "-c", "import sys; import src.ai_client; import json; print(json.dumps(sorted(sys.modules.keys())))"],
       capture_output=True, text=True
      )
      assert 'google.genai' not in result.stdout

- For new background work: use controller.submit_io(fn, *args), NOT threading.Thread(target=fn).start(). The user constraint is "no new threads."
- Atomic commits per task. No batching. If a task touches 3 files, commit all 3 in one commit but the commit message describes the task.
- The _io_pool workers are non-daemon threads in Python 3.9+ (ThreadPoolExecutor workers were daemon threads in 3.8 and earlier); in both cases the interpreter joins outstanding workers at exit. Check pyproject.toml for requires-python. Either way, the pool is shut down on AppController.shutdown().
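The subprocess pattern in the notes above can be wrapped in a small helper so each "no top-level import" test is a one-line assertion. This is a sketch with a hypothetical helper name, `modules_loaded_by`; it parses the JSON output into a set, so the assertion is an exact membership test rather than a substring match on stdout, and `check=True` makes a crashed subprocess fail the test instead of passing vacuously on empty output.

```python
import json
import subprocess
import sys


def modules_loaded_by(import_stmt: str) -> set[str]:
    """Run `import_stmt` in a fresh interpreter and return the names in sys.modules."""
    code = f"import sys, json; {import_stmt}; print(json.dumps(sorted(sys.modules)))"
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, check=True
    )
    # Last line only, in case the imported module prints during import.
    return set(json.loads(result.stdout.strip().splitlines()[-1]))


if __name__ == "__main__":
    loaded = modules_loaded_by("import json")
    print("json" in loaded, "colorsys" in loaded)
```

A T3.1-style test would then read `assert "google.genai" not in modules_loaded_by("import src.ai_client")`, run from the repository root so that `src` is importable.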
Cross-References
- Spec: ./spec.md
- Original backlog entry: conductor/tracks.md:152
- Benchmark tool: scripts/benchmark_imports.py
- Lazy pattern templates: src/app_controller.py:241-271 (RAG + MMA)
- Threading constraints: docs/guide_architecture.md:43-67
- Architectural Invariant: spec.md:2.1
- Job pool spec: spec.md:2.2 Layer 2
- Hot reload constraints: docs/guide_hot_reload.md:295-312