manual_slop

Author	SHA1	Message	Date
ed	515a302967	conductor(checkpoint): Phase 5A-5C complete - feature-gated imports lazy (commands, theme_2, markdown_helper)	2026-06-06 17:01:17 -04:00
ed	32edad0a4b	conductor(plan): Mark Phase 5A-5C complete (commands, theme_2, markdown_helper lazy imports)	2026-06-06 17:01:05 -04:00
ed	48c9649951	refactor(markdown_helper): remove top-level src.markdown_table import; use _require_warmed Phase 5C of startup_speedup_20260606 track. src/markdown_helper.py imported src.markdown_table at module level: from src.markdown_table import parse_tables, render_table Both parse_tables and render_table are only used inside MarkdownRenderer.render(). Removed the top-level import; the MarkdownRenderer.render() method now does: markdown_table = _require_warmed('src.markdown_table') parse_tables = markdown_table.parse_tables render_table = markdown_table.render_table at the top of its body, before any other logic. TESTS: - tests/test_markdown_helper_no_top_level_table.py: 3/3 PASS (all RED -> GREEN) - tests/test_markdown_table*.py (5 files) + test_markdown_helper_bullets.py + test_markdown_render_robust.py: 24/24 PASS (no breakage) EFFECTIVENESS: import src.markdown_helper no longer triggers src.markdown_table (~250ms). For renderers that never hit a GFM table, the import is never paid. For renderers that do, the warmup pre-loads it on _io_pool and the render() lookup is O(1). NEXT: Phase 5D - bulk refactor of src/gui_2.py feature-gated imports via scripts/audit_gui2_imports.py.	2026-06-06 16:58:32 -04:00
ed	cbc3b075a0	conductor(track): Initialize data_oriented_error_handling_20260606 Track + metadata + state + tracks.md registration for the Fleury-pattern error handling refactor. Key design decisions (per user approval): - Option A for _send_<vendor>() handling: rename to _send_<vendor>_result() and change return type to Result[str] (contained to internal callers). - send() is marked @typing_extensions.deprecated; send_result() is the new public API. - ProviderError exception is FULLY REPLACED by ErrorInfo dataclass (a value, not an exception). - 5 phases: foundation, mcp_client, ai_client, rag_engine, deprecation+archive. - Post-tracks baseline check (Phase 1 Task 1.1) verifies the 3 pending tracks have merged before proceeding. - 9 Open Questions, 7 Risks, 5 verification criteria, follow-up track public_api_migration_20260606 planned in spec §12.1. Blocked by: startup_speedup_20260606, test_batching_refactor_20260606, qwen_llama_grok_integration_20260606. Blocks: public_api_migration_20260606.	2026-06-06 16:58:22 -04:00
ed	69d098baaa	refactor(theme_2): remove top-level NERV theme imports; use _require_warmed Phase 5B of startup_speedup_20260606 track. src/theme_2.py had 3 top-level NERV imports: from src import theme_nerv from src.theme_nerv import DATA_GREEN from src.theme_nerv_fx import CRTFilter, AlertPulsing, StatusFlicker And 3 module-level FX object instantiations: _crt_filter = CRTFilter() _alert_pulsing = AlertPulsing() _status_flicker = StatusFlicker() ALL removed. The 3 use sites now lookup via _require_warmed: - apply() NERV branch: theme_nerv = _require_warmed('src.theme_nerv') - ai_text_color(): theme_nerv = _require_warmed('src.theme_nerv') (then uses theme_nerv.DATA_GREEN) - render_post_fx(): theme_nerv_fx = _require_warmed('src.theme_nerv_fx') (then creates FX objects locally per-call) The _status_flicker was instantiated but never used (dead code path; the StatusFlicker class is still importable via theme_nerv_fx but not auto-constructed in theme_2.py). TESTS: - tests/test_theme_2_no_top_level_nerv.py: 4/4 PASS (all RED -> GREEN) - tests/test_theme.py, test_theme_nerv.py, test_theme_nerv_fx.py, test_theme_models.py: 21/21 PASS (no breakage) EFFECTIVENESS: import src.theme_2 no longer triggers src.theme_nerv or src.theme_nerv_fx (~485ms combined). For users on default theme, these are NEVER loaded. For NERV users, the warmup pre-loads on _io_pool and the lookup is O(1). NEXT: Phase 5C (markdown table) follows same TDD pattern.	2026-06-06 16:55:20 -04:00
ed	494f68f9d9	conductor(spec): Add 'Coordination with Pending Tracks' section (§10) This track executes after startup_speedup, test_batching_refactor, and qwen_llama_grok_integration land. Section 10 documents the expected post-tracks codebase state and answers 6 critical coordination questions: - Q1: Existing _send_<vendor>() functions (returning str) are renamed to _send_<vendor>_result() and changed to return Result[str] (Option A: clean rename, contained to internal callers). - Q2: send_openai_compatible in src/openai_compatible.py STAYS as-is (it raises at the SDK boundary; correct per Fleury). The new _send_<vendor>_result() functions catch and convert to ErrorInfo. - Q3: Deprecation warning on send() will produce Python warnings in tests; filterwarnings in conftest.py silences them during transition. - Q4: The except ProviderError clauses in src/ai_client.py become dead code after the refactor and are removed in Phase 3. - Q5: ProviderError is FULLY REPLACED by ErrorInfo (a value, not an exception). ProviderError removed entirely; ErrorInfo is the new error type. - Q6: ProviderError.ui_message() moves to ErrorInfo.ui_message(). Phase 1 also adds a baseline verification task to confirm the 3 pending tracks have merged before proceeding. Also renumbered Out of Scope (11) and See Also (12) sections to preserve monotonic section numbers.	2026-06-06 16:54:25 -04:00
ed	78d3a1db1f	refactor(commands): use lazy registry proxy to defer src.command_palette import Phase 5A T5A.1-T5A.4 of startup_speedup_20260606 track. src/commands.py was importing src.command_palette at module load to create the CommandRegistry singleton. The 32 @registry.register decorators on the command functions needed this registry at import time. Approach: lazy registry proxy. The @registry.register decorator now just queues the function in a list; the real CommandRegistry is built on first access to any other registry attribute (.all, .get, etc.). By that time, all 32 decorators have run and the pending list is populated, so the real registration is complete in one pass. src/commands.py changes: - Removed 'from src.command_palette import CommandRegistry' - Added 'from src.module_loader import _require_warmed' - Added _LazyCommandRegistry class (proxy) - Added _get_real_registry() function (initializes on first access) - Replaced 'registry = CommandRegistry()' with 'registry = _LazyCommandRegistry()' - The 32 @registry.register decorators are unchanged (the proxy's register method returns the function unchanged after queueing it) EFFECTIVENESS: - 'import src.commands' no longer triggers src.command_palette (~244ms) - The warmup on AppController's _io_pool pre-loads src.command_palette on a background thread during startup - First access to registry.all() (e.g. from gui_2.py at palette open time) is O(1) - the warmup module is already in sys.modules TESTS: - tests/test_commands_no_top_level_command_palette.py: 4/4 PASS (3 RED, 1 green; now all green) - tests/test_command_palette.py: 13/13 PASS (no breakage) - tests/test_command_palette_sim.py: 7/7 PASS (live_gui tests, the full palette flow works end-to-end with the lazy proxy) ARCHITECTURAL NOTE: The lazy proxy is a minimal-change solution that preserves the public API. The 32 decorated functions don't need any changes; gui_2.py's 'from src.commands import registry' still works unchanged. The deferral is invisible to consumers. NEXT: Phase 5B (NERV theme) and 5C (markdown table) follow the same TDD pattern. 5D is the bulk refactor of src/gui_2.py feature-gated imports via the audit_gui2_imports.py script.	2026-06-06 16:48:04 -04:00
ed	16291234ff	conductor(plan): Record Phase 4 checkpoint SHA `883682c1`	2026-06-06 16:37:27 -04:00
ed	883682c1c2	conductor(checkpoint): Phase 4 complete - fastapi no longer in main-thread import chain	2026-06-06 16:36:31 -04:00
ed	a0ff1bde91	conductor(plan): Mark Phase 4 complete - app_controller fastapi import removal + _require_warmed lift	2026-06-06 16:36:20 -04:00
ed	3849d30441	refactor(app_controller): remove top-level fastapi imports; lift _require_warmed to shared module Phase 4 T4.1-T4.4 of startup_speedup_20260606 track. DEVIATION FROM ORIGINAL SPEC: spec.md said fastapi was in src/api_hooks.py but it was actually in src/app_controller.py (lines 17, 21). api_hooks.py uses stdlib http.server. Phase 4 target corrected to app_controller. LIFTED _require_warmed TO SHARED MODULE: created src/module_loader.py to avoid duplicating the lookup logic and the cross-module import smell (app_controller -> ai_client). src/ai_client.py re-exports it so the T3.1 test (which asserts hasattr(src.ai_client, '_require_warmed')) continues to work. src/app_controller.py changes: - Added 'from __future__ import annotations' (enables lazy type annotations; -> FastAPI return type now a forward reference) - Removed 'from fastapi import FastAPI, Depends, HTTPException' (line 17) - Removed 'from fastapi.security.api_key import APIKeyHeader' (line 21) - Added 'from src.module_loader import _require_warmed' (cross-module via shared utility, not via ai_client) - create_api(): added lookups at top of function body - 7 _api_* helper functions (_api_get_key, _api_generate, _api_stream, _api_confirm_action, _api_get_session, _api_delete_session, _api_get_context): added 'HTTPException = _require_warmed(...).HTTPException' at top of each function body EFFECTIVENESS: - import src.app_controller no longer triggers fastapi import (saves ~470ms in main thread; only loaded when --enable-test-hooks is set) - When --enable-test-hooks is set, the AppController's warmup pre-loads fastapi on the _io_pool, so create_api()'s lookup is O(1) TESTS: - tests/test_app_controller_no_top_level_fastapi.py: 4/4 PASS (was 3 RED + 1 pass) - tests/test_ai_client_no_top_level_sdk_imports.py: 9/9 still PASS (re-export works) - tests/test_app_controller_mcp.py, test_app_controller_offloading.py: pass - tests/test_headless_service.py: 10/11 PASS (1 pre-existing failure test_generate_endpoint is a circular-import issue in google.genai, reproduces identically on stashed pre-Phase-4 state - NOT a regression from this change) - tests/test_hooks.py: pass NEXT: Phase 5 (feature-gated GUI module imports - command palette, NERV theme, markdown table), then Phase 6 (ad-hoc threads -> _io_pool).	2026-06-06 16:34:46 -04:00
ed	7fb13fbf4b	conductor(plan): Record Phase 3 checkpoint SHA + mark T3.6 complete	2026-06-06 16:13:35 -04:00
ed	056358f230	conductor(checkpoint): Phase 3 complete - ai_client heavy SDK imports removed	2026-06-06 16:12:17 -04:00
ed	8905c26bff	conductor(plan): Mark Phase 3 complete - ai_client SDK import removal done	2026-06-06 16:11:14 -04:00
ed	51c054ece8	refactor(ai_client): remove top-level SDK imports; use _require_warmed Phase 3 T3.2 + T3.3 of startup_speedup_20260606 track. The 5 heavy SDKs (anthropic, google.genai, openai, google.genai.types, requests) are no longer imported at module level. Each function that needs them now calls _require_warmed(name) to get the module from sys.modules (populated by AppController's warmup on _io_pool). This is the load-bearing wall of the Main Thread Purity Invariant: heavy modules are never in the main thread's import chain. run_discussion_compression now uses _require_warmed for both google.genai.types (gemini branch) and requests (deepseek branch). Tests/test_tier4_patch_generation.py adapted: the 2 tests that mocked 'src.ai_client.types' (no longer a module-level attr) now mock 'src.ai_client._require_warmed' (the new public mechanism). T3.1 tests now pass (9/9). T3.3 breakage fixed. All 25 ai_client + tier4 tests pass.	2026-06-06 16:09:16 -04:00
ed	ca35b3ef48	fix(opencode): Remove invalid MCP tools block, add timeout/env, grant subagent access The 46-entry mcp.manual-slop.tools block added in commit `30281843` was invalid per the v1.16.2 schema (McpLocalConfig has additionalProperties: false) and was being silently dropped. Also adds proper MCP server configuration and subagent permission grants. Changes: opencode.json: - Remove the silently-dropped mcp.manual-slop.tools block (46 entries) - Add timeout: 30000 (default 5000 is fragile) - Add environment block with PYTHONPATH, GIT_TERMINAL_PROMPT, GCM_INTERACTIVE, GIT_ASKPASS, HOME so mcp_env.toml values are injected into the MCP server process - Top-level 'tools' block intentionally omitted: schema only accepts boolean values (enable/disable), not description objects. Tool descriptions come from the MCP server's list_tools response (mcp_client.MCP_TOOL_SPECS). .opencode/agents/{tier1-orchestrator,tier2-tech-lead,tier3-worker,tier4-qa,explore}.md: - Add 'manual-slop_*': allow to each agent's permission block so subagents can use the 46 MCP tools (previously defaulted to deny in some permission schemas) general.md: no change (no permission block, defaults to allow all) Verified: - opencode.json is now schema-valid (no more 'Expected boolean' errors) - Both MCP servers connected: MiniMax (2 tools), manual-slop (46 tools) - manual-slop MCP server startup: ~651ms (well under 30s timeout) - All MCP tests pass: test_mcp_config.py + test_mcp_perf_tool.py = 4/4 - Subagent permission blocks confirmed in 'opencode debug config' output	2026-06-06 15:44:52 -04:00
ed	9eed60238a	conductor(plan): mark T3.1 RED done; T3.2 holding for MCP fix (`16780ec6`)	2026-06-06 15:16:02 -04:00
ed	16780ec6d4	test(ai_client): TDD red phase - no top-level SDK imports allowed Phase 3 Task T3.1 of startup_speedup_20260606 track. 9 tests assert: - import src.ai_client does NOT trigger google.genai / anthropic / openai / requests / google.genai.types imports (the main thread must not load these on import; they're warmed on _io_pool) - _require_warmed(name) helper exists and is callable - _require_warmed returns the cached module if already in sys.modules - _require_warmed falls back to importlib for tests/dev where warmup didn't run - The static audit script does not see src/ai_client.py as a contributor of heavy-import violations All 9 tests are currently FAILING (RED). They will turn GREEN when T3.2 (the actual refactor of src/ai_client.py to remove top-level imports and add _require_warmed) lands. The implementation is held pending MCP client fix (per user instruction).	2026-06-06 15:11:13 -04:00
ed	b17cbbdeca	conductor(plan): write 6-phase implementation plan for qwen_llama_grok_integration_20260606 ~30 tasks across 6 phases, each with explicit Red-Green-Refactor TDD steps: - Phase 1 (1.1-1.8): Capability matrix framework (src/vendor_capabilities.py) + shared OpenAI-compatible helper (src/openai_compatible.py). 13 unit tests. - Phase 2 (2.1-2.8): Qwen via DashScope native SDK. 5 unit tests. - Phase 3 (3.1-3.7): Grok (xAI) + Llama (Ollama + OpenRouter + custom URL) via shared helper. 8 unit tests. - Phase 4 (4.1-4.3): MiniMax refactor (_send_minimax from ~250 -> ~50 lines). Safety net: existing tests/test_minimax_provider.py. - Phase 5 (5.1-5.5): 9 capability-driven UX adaptations in src/gui_2.py. Manual smoke test for all 3 new vendors. - Phase 6 (6.1-6.4): Update docs/guide_ai_client.md + guide_models.md. Archive the track. Data-oriented design: shared helper is the algorithm on normalized data; _send_<vendor>() entry points are thin boundary adapters. 1-space indentation per project style guide. No placeholders. All test code is concrete. Self-review at end confirms spec coverage (every section of spec.md mapped to a task).	2026-06-06 15:06:30 -04:00
ed	97daaff29b	conductor(spec): Fix Qwen-Audio matrix entry consistency (vision=false, audio deferred) The capability matrix v1 has no 'audio' field (audio_input is deferred to v2). Qwen-Audio's vision flag was incorrectly marked true. Changed to false and clarified that v1 uses Qwen-Audio as text-only; audio attachment UI is hidden via the absent audio capability check.	2026-06-06 14:58:03 -04:00
ed	055430a75a	conductor(tracks): Register qwen_llama_grok_integration_20260606 in registry (item 0d)	2026-06-06 14:56:55 -04:00
ed	7c1d597ef1	conductor(track): Initialize qwen_llama_grok_integration_20260606 spec Three new vendors + capability matrix framework + MiniMax refactor: Capability matrix v1 (7 features): vision, tool_calling, caching, streaming, model_discovery, context_window, cost_tracking. Audio and server-side code execution deferred to a follow-up track. Qwen via DashScope native SDK: Qwen-Turbo, Qwen-Plus, Qwen-Max, Qwen-Long (1M context), Qwen-VL-Plus/Max (vision), Qwen-Audio. Native API chosen over OpenAI-compatible mode to unlock Qwen-Audio, Qwen-Long custom chunking, and Qwen-VL-Max enhanced vision. Llama (OpenAI-compatible, multi-backend): Ollama (local, free), OpenRouter (cloud aggregator covering Together/Groq/Fireworks), custom URL escape hatch. Models: Llama 3.1 8B/70B/405B, 3.2 1B/3B, 3.2 11B/90B Vision, 3.3 70B. Grok via xAI (OpenAI-compatible): Grok-2, Grok-2-Vision, Grok-Beta. Shared OpenAI-compatible helper in src/openai_compatible.py processes a normalized request/response data structure; each _send_<vendor>() is a thin adapter at the boundary (data-oriented design per Fleury/Acton/Lottes). MiniMax refactor: ~250 lines reduced to ~50 by using the shared helper. Existing test_minimax_provider.py is the safety net. UX adaptation: 9 UI elements (screenshot, tools toggle, cache panel, stream progress, fetch models, token budget, cost panel) read from the matrix instead of hard-coding per-vendor branches. Out of scope (deferred): Anthropic/Gemini/DeepSeek migration to the matrix (separate track), audio input, server-side code execution, PDF input, batch API, fine-tuning. 6 phases planned: matrix+helper, Qwen, Grok+Llama, MiniMax refactor, UX adaptation, docs+archive.	2026-06-06 14:56:00 -04:00
ed	7eb743c6cb	conductor(plan): Phase 2 complete - io_pool + warmup foundation in place Phase 2 of startup_speedup_20260606 is done. Tasks: T2.1 (Red) tests/test_io_pool.py `1354679e` 4 tests T2.2 (Green) src/io_pool.py `1354679e` make_io_pool() factory T2.3 (Red) tests/test_warmup.py `1354679e` 10 tests T2.4 (Green) src/warmup.py `1354679e` WarmupManager T2.5 (Wire) AppController integration `922c5ad9` io_pool + warmup in __init__ + 5 public delegation methods T2.6 (Plan) this commit What now exists: - make_io_pool() returns a 4-worker ThreadPoolExecutor named 'controller-io-N' - WarmupManager class with submit/status/is_done/wait/on_complete/reset - AppController creates self._io_pool + self._warmup early in __init__ - Warmup is submitted immediately (jobs run concurrent with the rest of init) - Public API: controller.warmup_status(), controller.is_warmup_done(), controller.wait_for_warmup(timeout), controller.on_warmup_complete(cb) - controller._compute_warmup_list() returns 9 always + 2 conditional (fastapi) - shutdown() now also shuts down the io_pool Currently the warmup is a no-op for modules already imported at the top of app_controller.py (fastapi, requests). Phase 3 will remove those top-level imports; the warmup infrastructure will then start doing real work. 18/18 tests passing (4 io_pool + 10 warmup + 4 test_app_controller_*). Next: Phase 3 (remove top-level SDK imports from src/ai_client.py). Expected to fix ~3 audit violations (google.genai, anthropic, openai).	2026-06-06 14:52:04 -04:00
ed	922c5ad9ab	feat(app_controller): wire _io_pool + warmup + 5 public delegation methods Phase 2 Task T2.5 of the startup_speedup_20260606 track. In AppController.__init__, right after the lock init (and before the heavy subsystem construction that follows), create the shared _io_pool and WarmupManager, then submit the warmup list. The warmup runs concurrently with the rest of __init__, so by the time __init__ returns, the heavy modules are loaded (or in flight). Changes: - Add imports: from src.io_pool import make_io_pool, from src.warmup import WarmupManager - In __init__, after the locks block, add: self._io_pool = make_io_pool() self._warmup = WarmupManager(self._io_pool) self._warmup.submit(self._compute_warmup_list()) - Add _compute_warmup_list() method: returns ['google.genai', 'anthropic', 'openai', 'requests', 'src.command_palette', 'src.theme_nerv', 'src.theme_nerv_fx', 'src.markdown_table', 'numpy'] always, plus ['fastapi', 'fastapi.security.api_key'] if self.test_hooks_enabled - Add public delegation methods: warmup_status(), is_warmup_done(), wait_for_warmup(timeout), on_warmup(callback) - In shutdown(), add self._io_pool.shutdown(wait=False) The warmup currently is a no-op for the heavy modules already imported at the top of app_controller.py (fastapi, requests, etc. are already in sys.modules). The infrastructure is in place; Phase 3 will remove the top-level imports so the warmup actually does work. Verified: all 18 tests pass (test_io_pool + test_warmup + existing test_app_controller_mcp + test_app_controller_offloading).	2026-06-06 14:48:51 -04:00
ed	1354679e33	feat(io_pool, warmup): add shared 4-thread pool + WarmupManager Phase 2 Tasks T2.1-T2.4 of the startup_speedup_20260606 track. NEW: src/io_pool.py make_io_pool() factory: 4-worker ThreadPoolExecutor with thread_name_prefix='controller-io'. The sanctioned way for any background work. Replaces ad-hoc threading.Thread() calls per the 'no new threads' rule. NEW: src/warmup.py WarmupManager: manages a list of modules to import on the shared pool. Public API: .submit(modules) - start warmup (call once) .status() - {pending, completed, failed} .is_done() - bool .wait(timeout) - block until done .on_complete(callback) - register completion callback .reset() - clear state Thread-safe (lock-guarded). 10 tests cover all paths. NEW: tests/test_io_pool.py (4 tests): - ThreadPoolExecutor returned - 4 workers - Threads named 'controller-io-*' - Jobs run in parallel (barrier test) NEW: tests/test_warmup.py (10 tests): - One job per module submitted - Initial pending list correct - Failed imports tracked - Done event set after all complete - wait() blocks until done - on_complete callback fires (and immediately if already done) - Modules actually end up in sys.modules - reset() clears state - Jobs run concurrently (not serially) All 14 tests pass. AppController integration is the next commit.	2026-06-06 14:47:02 -04:00
ed	7fdab70529	conductor(plan): write 4-phase implementation plan for test_batching_refactor_20260606 16 tasks across 4 phases, each with explicit Red-Green-Refactor TDD steps: - Phase 1 (1.1-1.16): Library + dry-run. 20 unit tests across categorizer, batcher, plugin. New run_tests_batched.py has --plan/--audit only. - Phase 2 (2.1-2.3): Shadow run via CI. Compare new vs old plan output. - Phase 3 (3.1-3.4): Switch default. Full CLI with --tiers, --durations. Old script becomes .legacy. Update docs/guide_testing.md. - Phase 4 (4.1-4.6): Populate registry, gitignore durations, delete legacy, archive track. 1-space indentation per project style guide. No placeholders. All test code is concrete.	2026-06-06 14:24:39 -04:00
ed	f9a0125847	conductor(plan): Phase 1 complete - baseline + audit infrastructure ready Phase 1 of startup_speedup_20260606 track is done. Tasks completed: T1.1 baseline benchmark -> `6f9a3af2` (docs/reports/startup_baseline_20260606.txt) T1.2 audit_gui2_imports.py -> `6f9a3af2` (scripts/ + audit results) T1.3 StartupProfiler -> `5a856536` (src/ + 5 tests) T1.4 audit_main_thread_imports -> `6f9a3af2` (scripts/ + 9 tests) T1.5 plan update -> this commit Baseline numbers (3-run median, from scripts/benchmark_imports.py): src.gui_2 1770ms (main-thread bottleneck) simulation.user_agent 1517ms google.genai 1001ms openai 482ms anthropic 441ms imgui_bundle 255ms (KEEP - ImGui hot path) src.theme_nerv_fx 254ms src.theme_nerv 246ms src.markdown_table 243ms src.command_palette 242ms Audit violations on current codebase: 67. These are the targets for Phases 3-5 (remove top-level heavy imports to fix each one). Next: Phase 2 (Job Pool + Warmup Foundation).	2026-06-06 14:24:20 -04:00
ed	6f9a3af201	feat(audit): add main-thread import graph audit + baseline measurements Phase 1, Tasks T1.2 + T1.4 of the startup_speedup_20260606 track. NEW: scripts/audit_main_thread_imports.py Static CI gate that AST-walks the import graph reachable from sloppy.py and fails (exit 1) if any heavy module is imported at the top of a main-thread-reachable file. Walks into if/elif/else and try/except branches (which run at import time) but skips function bodies (which only run when called). Allowlist: stdlib + the lean gui_2 skeleton (imgui_bundle, defer, src.imgui_scopes, src.theme_2, src.theme_models, src.paths, src.models, src.events). NEW: scripts/audit_gui2_imports.py Read-only analysis tool that lists every top-level and function-level import in src/gui_2.py, classified by location. Used in Phase 5D to identify which imports to remove. NEW: tests/test_audit_main_thread_imports.py 9 tests covering: --help exits 0, clean stdlib-only passes, heavy third-party fails, google.genai fails, transitive walks, function- body imports ignored, if-branch imports flagged, try-block imports flagged, file:line reported. All 9 pass. NEW: docs/reports/startup_baseline_20260606.txt 3-run median cold-start benchmark. Worst offenders: src.gui_2 (1770ms), simulation.user_agent (1517ms), google.genai (1001ms), openai (482ms), anthropic (441ms), imgui_bundle (255ms), src.theme_nerv* (485ms combined), src.markdown_table (243ms), src.command_palette (242ms). NEW: docs/reports/startup_audit_20260606.txt Audit output on the CURRENT codebase. Reports 67 violations across the main-thread import graph (incl. numpy in src/gui_2.py:9, tomli_w in src/gui_2.py:18, fastapi + requests in src/app_controller, tree_sitter_* in src/file_cache, pydantic in src/models, plus all the src.* subsystem imports that drag in heavy transitive deps). Phase 3-5 of the track will resolve these one by one. After Phase 3-5, this audit must exit 0 (no violations). Co-located reports in docs/reports/ per project convention; the other agent finished their work in docs/superpowers/ and is unrelated.	2026-06-06 14:22:18 -04:00
ed	0553983ce9	conductor(spec): Clarify --audit --strict semantics in Section 4.3 Default --audit exits non-zero on hard errors only. --strict adds the 'multiple subsystems = probably cross-cutting' heuristic from Section 9 as a CI gate. Two modes, one flag.	2026-06-06 14:16:13 -04:00
ed	cbfd78c51d	conductor(tracks): Register test_batching_refactor_20260606 in registry	2026-06-06 14:14:11 -04:00
ed	b7a9737443	conductor(track): Initialize test_batching_refactor_20260606 spec Three-tier batching refactor: replace alphabetical 4-at-a-time batching with fixture-class-isolated tiers (0 opt-in, 1 unit/xdist, 2 mock_app, 3 live_gui in one session, H headless, P performance). Hybrid classification: auto-infer from filename + AST fixture scan; hand-curated tests/test_categories.toml overrides for cross-cutting and ambiguous files. Opt-in per-test order control via [[files.X.test_order]] sub-tables, gated on a conftest-loaded pytest plugin (no-op without entries). Priority order: B (process isolation) > A (subsystem diagnostic) > C (speed).	2026-06-06 14:12:14 -04:00
ed	96158edd97	conductor(plan): mark T1.3 StartupProfiler complete (`5a856536`)	2026-06-06 13:59:02 -04:00
ed	5a85653654	feat(startup_profiler): add StartupProfiler for per-phase init timing Lightweight, in-memory profiler for AppController init phases. Used by the startup_speedup_20260606 track to measure where the time goes during boot (config hydration, hook server start, subsystem init, etc.). The profiler is exposed via /api/startup_profile (Phase 8 work) and the Diagnostics panel so the user can see the exact per-phase cost. Public API: StartupProfiler() - create .phase(name) - context manager .snapshot() - {phases: {name: {start_ts, duration_ms}}, total_ms, count} .reset() - clear recorded phases .enable() / .disable() - toggle recording Implementation: - dataclass with list of _Phase(name, start_ts, end_ts) - @contextmanager records wall-clock via time.perf_counter - records duration even if the body raises (try/finally) - snapshot is a copy, so consumers can't mutate the live state TDD: 5 tests in tests/test_startup_profiler.py cover: basic recording, total math, snapshot isolation, exception safety, empty state.	2026-06-06 13:57:26 -04:00
ed	f2f5ee1197	conductor(plan): flip track from lazy-loading to proactive warmup Architectural shift driven by user clarification: lazy-loading on first use causes user-perceptible lag when the user-triggered action (e.g. provider switch) propagates to a controller method that triggers the first import. The fix is to pre-import heavy modules on a bg thread at startup and have functions access them via _require_warmed(). Old design (rejected): - from google import genai inside _send_gemini (lazy on first call) - First user action that triggers this pays the cost; UI feels laggy New design (this commit): - Top-level heavy imports REMOVED from main-thread-reachable files - AppController.__init__ submits warmup jobs to _io_pool (4 threads, named 'controller-io-N') - Each warmup worker imports its module and updates a thread-safe warmup_status dict - Functions access modules via _require_warmed(name), which assumes the module is in sys.modules (warmed at startup) - When all jobs complete, _warmup_done_event is set and registered on_warmup_complete callbacks fire - GUI shows status indicator + toast when warmup completes - Hook API exposes /api/warmup_status and /api/warmup_wait - Tests can call controller.wait_for_warmup() before exercising warmup-dependent functionality Phase 2 now bundles job pool + warmup (T2.3+T2.4 add warmup tests + implementation). Phases 3-5 do 'remove top-level imports' instead of 'lazy-load'. Phase 7 is the notification surface (Hook API + GUI). Definition of Done includes warmup-completion criteria, the 'no function-body imports' check, and an end-to-end 'provider switch is INSTANT' smoke test. No code changes; this is a planning update only.	2026-06-06 13:45:05 -04:00
ed	ca254bac41	fix(imports): break models<->dag_engine circular dependency Track.get_executable_tickets (in models.py) called TrackDAG at runtime, forcing a top-level import of src.dag_engine into models.py and creating a 2-cycle that broke whichever module loaded second (Ticket was not yet defined when models.py loaded first; TrackDAG was not yet defined when dag_engine.py loaded first). Fix: hoist the method out of the Track dataclass and into a free function get_executable_tickets(track) in dag_engine.py. models.py no longer needs TrackDAG at all, so the cycle is one-directional (models -> dag_engine) and resolves cleanly in any import order. Tests updated: - tests/test_mma_models.py: import get_executable_tickets and call it instead of track.get_executable_tickets() (4 call sites) - tests/test_conductor_engine_v2.py: comment update Verified both import orders resolve cleanly: forward: import src.models; import src.dag_engine -> OK reverse: import src.dag_engine; import src.models -> OK 34 tests pass (test_mma_models, test_dag_engine, test_execution_engine, test_arch_boundary_phase3, test_track_state_schema).	2026-06-06 13:30:18 -04:00
r00tz	9e4fac496d	made local rag needs optional (prevents having to have torch / sentence-transformers if you never use local embedding)	2026-06-06 13:21:43 -04:00
ed	32e633b3ec	conductor(plan): mark startup_speedup_20260606 track creation committed (`cd4fb045`)	2026-06-06 13:01:32 -04:00
ed	cd4fb04541	conductor(track): create startup_speedup_20260606 track for sloppy.py startup latency Fulfills the existing backlog entry at conductor/tracks.md:152 (2026-06-05 root-cause analysis of live_gui wait_for_server timeouts). Main Thread Purity Invariant: the main thread (entering immapp.run()) must never import a module heavier than imgui_bundle and the lean gui_2 skeleton. Enforced by: - static gate: scripts/audit_main_thread_imports.py (CI) - runtime hook: tests/test_main_thread_purity.py (sys.addaudithook) Threading constraint: no new threading.Thread(...) calls in src/. All background work goes through AppController._io_pool (ThreadPoolExecutor, max_workers=4, thread_name_prefix='controller-io'). 9 phases, 57 tasks: audit+baseline, job pool, lazy-load SDKs, lazy-load FastAPI, lazy-load feature-gated GUI, migrate ad-hoc threads, runtime enforcement, hook API + diagnostics, verify+checkpoint. Expected savings: ~2000-2400ms off main-thread import cost. Target: import src.ai_client < 50ms (from ~1800ms), live_gui fixtures no longer time out at wait_for_server(timeout=15).	2026-06-06 12:57:20 -04:00
ed	2adf3274af	add benchmark script	2026-06-06 12:47:41 -04:00
ed	311fde9a8b	fixes	2026-06-06 12:44:07 -04:00
ed	9ccaf0594c	some org on ai_client	2026-06-06 11:35:20 -04:00
ed	9d72d98b50	conductor(tracks): mark rag_phase4_stress_test_flake resolved (commit `16412ad5`)	2026-06-06 11:29:03 -04:00
ed	16412ad5f9	fix(rag): detect ChromaDB dim mismatch and recreate collection on provider switch	2026-06-06 11:26:47 -04:00
ed	339b062913	more organization	2026-06-06 11:08:07 -04:00
ed	7d555361f9	more organization	2026-06-06 10:24:22 -04:00
ed	1c627bcc30	fix(docs): correct section order in guide_testing (patterns before See Also) + fix LF/CRLF	2026-06-06 09:34:38 -04:00
ed	0f742b1d5f	conductor(workflow): add Indentation-Driven Class Method Visibility pitfall (2026-06-05)	2026-06-06 02:04:05 -04:00
ed	e276bac093	docs(gui_2): add __getattr__/__setattr__ delegation pattern + indentation gotcha	2026-06-06 01:59:20 -04:00
ed	4ee22dedb9	docs(testing): add Narrow Test Paths + Indentation-Driven Method Visibility patterns	2026-06-06 01:53:25 -04:00
ed	e7b8877f2a	docs(readme): update for v2 completion (24 guides, 273 test files, 98.9% pass rate)	2026-06-06 01:42:45 -04:00

2595 Commits