manual_slop

Author	SHA1	Message	Date
ed	32edad0a4b	conductor(plan): Mark Phase 5A-5C complete (commands, theme_2, markdown_helper lazy imports)	2026-06-06 17:01:05 -04:00
ed	cbc3b075a0	conductor(track): Initialize data_oriented_error_handling_20260606 Track + metadata + state + tracks.md registration for the Fleury-pattern error handling refactor. Key design decisions (per user approval): - Option A for _send_<vendor>() handling: rename to _send_<vendor>_result() and change return type to Result[str] (contained to internal callers). - send() is marked @typing_extensions.deprecated; send_result() is the new public API. - ProviderError exception is FULLY REPLACED by ErrorInfo dataclass (a value, not an exception). - 5 phases: foundation, mcp_client, ai_client, rag_engine, deprecation+archive. - Post-tracks baseline check (Phase 1 Task 1.1) verifies the 3 pending tracks have merged before proceeding. - 9 Open Questions, 7 Risks, 5 verification criteria, follow-up track public_api_migration_20260606 planned in spec §12.1. Blocked by: startup_speedup_20260606, test_batching_refactor_20260606, qwen_llama_grok_integration_20260606. Blocks: public_api_migration_20260606.	2026-06-06 16:58:22 -04:00
ed	494f68f9d9	conductor(spec): Add 'Coordination with Pending Tracks' section (§10) This track executes after startup_speedup, test_batching_refactor, and qwen_llama_grok_integration land. Section 10 documents the expected post-tracks codebase state and answers 6 critical coordination questions: - Q1: Existing _send_<vendor>() functions (returning str) are renamed to _send_<vendor>_result() and changed to return Result[str] (Option A: clean rename, contained to internal callers). - Q2: send_openai_compatible in src/openai_compatible.py STAYS as-is (it raises at the SDK boundary; correct per Fleury). The new _send_<vendor>_result() functions catch and convert to ErrorInfo. - Q3: Deprecation warning on send() will produce Python warnings in tests; filterwarnings in conftest.py silences them during transition. - Q4: The except ProviderError clauses in src/ai_client.py become dead code after the refactor and are removed in Phase 3. - Q5: ProviderError is FULLY REPLACED by ErrorInfo (a value, not an exception). ProviderError removed entirely; ErrorInfo is the new error type. - Q6: ProviderError.ui_message() moves to ErrorInfo.ui_message(). Phase 1 also adds a baseline verification task to confirm the 3 pending tracks have merged before proceeding. Also renumbered Out of Scope (11) and See Also (12) sections to preserve monotonic section numbers.	2026-06-06 16:54:25 -04:00
ed	16291234ff	conductor(plan): Record Phase 4 checkpoint SHA `883682c1`	2026-06-06 16:37:27 -04:00
ed	a0ff1bde91	conductor(plan): Mark Phase 4 complete - app_controller fastapi import removal + _require_warmed lift	2026-06-06 16:36:20 -04:00
ed	7fb13fbf4b	conductor(plan): Record Phase 3 checkpoint SHA + mark T3.6 complete	2026-06-06 16:13:35 -04:00
ed	8905c26bff	conductor(plan): Mark Phase 3 complete - ai_client SDK import removal done	2026-06-06 16:11:14 -04:00
ed	9eed60238a	conductor(plan): mark T3.1 RED done; T3.2 holding for MCP fix (`16780ec6`)	2026-06-06 15:16:02 -04:00
ed	b17cbbdeca	conductor(plan): write 6-phase implementation plan for qwen_llama_grok_integration_20260606 ~30 tasks across 6 phases, each with explicit Red-Green-Refactor TDD steps: - Phase 1 (1.1-1.8): Capability matrix framework (src/vendor_capabilities.py) + shared OpenAI-compatible helper (src/openai_compatible.py). 13 unit tests. - Phase 2 (2.1-2.8): Qwen via DashScope native SDK. 5 unit tests. - Phase 3 (3.1-3.7): Grok (xAI) + Llama (Ollama + OpenRouter + custom URL) via shared helper. 8 unit tests. - Phase 4 (4.1-4.3): MiniMax refactor (_send_minimax from ~250 -> ~50 lines). Safety net: existing tests/test_minimax_provider.py. - Phase 5 (5.1-5.5): 9 capability-driven UX adaptations in src/gui_2.py. Manual smoke test for all 3 new vendors. - Phase 6 (6.1-6.4): Update docs/guide_ai_client.md + guide_models.md. Archive the track. Data-oriented design: shared helper is the algorithm on normalized data; _send_<vendor>() entry points are thin boundary adapters. 1-space indentation per project style guide. No placeholders. All test code is concrete. Self-review at end confirms spec coverage (every section of spec.md mapped to a task).	2026-06-06 15:06:30 -04:00
ed	97daaff29b	conductor(spec): Fix Qwen-Audio matrix entry consistency (vision=false, audio deferred) The capability matrix v1 has no 'audio' field (audio_input is deferred to v2). Qwen-Audio's vision flag was incorrectly marked true. Changed to false and clarified that v1 uses Qwen-Audio as text-only; audio attachment UI is hidden via the absent audio capability check.	2026-06-06 14:58:03 -04:00
ed	055430a75a	conductor(tracks): Register qwen_llama_grok_integration_20260606 in registry (item 0d)	2026-06-06 14:56:55 -04:00
ed	7c1d597ef1	conductor(track): Initialize qwen_llama_grok_integration_20260606 spec Three new vendors + capability matrix framework + MiniMax refactor: Capability matrix v1 (7 features): vision, tool_calling, caching, streaming, model_discovery, context_window, cost_tracking. Audio and server-side code execution deferred to a follow-up track. Qwen via DashScope native SDK: Qwen-Turbo, Qwen-Plus, Qwen-Max, Qwen-Long (1M context), Qwen-VL-Plus/Max (vision), Qwen-Audio. Native API chosen over OpenAI-compatible mode to unlock Qwen-Audio, Qwen-Long custom chunking, and Qwen-VL-Max enhanced vision. Llama (OpenAI-compatible, multi-backend): Ollama (local, free), OpenRouter (cloud aggregator covering Together/Groq/Fireworks), custom URL escape hatch. Models: Llama 3.1 8B/70B/405B, 3.2 1B/3B, 3.2 11B/90B Vision, 3.3 70B. Grok via xAI (OpenAI-compatible): Grok-2, Grok-2-Vision, Grok-Beta. Shared OpenAI-compatible helper in src/openai_compatible.py processes a normalized request/response data structure; each _send_<vendor>() is a thin adapter at the boundary (data-oriented design per Fleury/Acton/Lottes). MiniMax refactor: ~250 lines reduced to ~50 by using the shared helper. Existing test_minimax_provider.py is the safety net. UX adaptation: 9 UI elements (screenshot, tools toggle, cache panel, stream progress, fetch models, token budget, cost panel) read from the matrix instead of hard-coding per-vendor branches. Out of scope (deferred): Anthropic/Gemini/DeepSeek migration to the matrix (separate track), audio input, server-side code execution, PDF input, batch API, fine-tuning. 6 phases planned: matrix+helper, Qwen, Grok+Llama, MiniMax refactor, UX adaptation, docs+archive.	2026-06-06 14:56:00 -04:00
ed	7eb743c6cb	conductor(plan): Phase 2 complete - io_pool + warmup foundation in place Phase 2 of startup_speedup_20260606 is done. Tasks: T2.1 (Red) tests/test_io_pool.py `1354679e` 4 tests T2.2 (Green) src/io_pool.py `1354679e` make_io_pool() factory T2.3 (Red) tests/test_warmup.py `1354679e` 10 tests T2.4 (Green) src/warmup.py `1354679e` WarmupManager T2.5 (Wire) AppController integration `922c5ad9` io_pool + warmup in __init__ + 5 public delegation methods T2.6 (Plan) this commit What now exists: - make_io_pool() returns a 4-worker ThreadPoolExecutor named 'controller-io-N' - WarmupManager class with submit/status/is_done/wait/on_complete/reset - AppController creates self._io_pool + self._warmup early in __init__ - Warmup is submitted immediately (jobs run concurrent with the rest of init) - Public API: controller.warmup_status(), controller.is_warmup_done(), controller.wait_for_warmup(timeout), controller.on_warmup_complete(cb) - controller._compute_warmup_list() returns 9 always + 2 conditional (fastapi) - shutdown() now also shuts down the io_pool Currently the warmup is a no-op for modules already imported at the top of app_controller.py (fastapi, requests). Phase 3 will remove those top-level imports; the warmup infrastructure will then start doing real work. 18/18 tests passing (4 io_pool + 10 warmup + 4 test_app_controller_*). Next: Phase 3 (remove top-level SDK imports from src/ai_client.py). Expected to fix ~3 audit violations (google.genai, anthropic, openai).	2026-06-06 14:52:04 -04:00
ed	7fdab70529	conductor(plan): write 4-phase implementation plan for test_batching_refactor_20260606 16 tasks across 4 phases, each with explicit Red-Green-Refactor TDD steps: - Phase 1 (1.1-1.16): Library + dry-run. 20 unit tests across categorizer, batcher, plugin. New run_tests_batched.py has --plan/--audit only. - Phase 2 (2.1-2.3): Shadow run via CI. Compare new vs old plan output. - Phase 3 (3.1-3.4): Switch default. Full CLI with --tiers, --durations. Old script becomes .legacy. Update docs/guide_testing.md. - Phase 4 (4.1-4.6): Populate registry, gitignore durations, delete legacy, archive track. 1-space indentation per project style guide. No placeholders. All test code is concrete.	2026-06-06 14:24:39 -04:00
ed	f9a0125847	conductor(plan): Phase 1 complete - baseline + audit infrastructure ready Phase 1 of startup_speedup_20260606 track is done. Tasks completed: T1.1 baseline benchmark -> `6f9a3af2` (docs/reports/startup_baseline_20260606.txt) T1.2 audit_gui2_imports.py -> `6f9a3af2` (scripts/ + audit results) T1.3 StartupProfiler -> `5a856536` (src/ + 5 tests) T1.4 audit_main_thread_imports -> `6f9a3af2` (scripts/ + 9 tests) T1.5 plan update -> this commit Baseline numbers (3-run median, from scripts/benchmark_imports.py): src.gui_2 1770ms (main-thread bottleneck) simulation.user_agent 1517ms google.genai 1001ms openai 482ms anthropic 441ms imgui_bundle 255ms (KEEP - ImGui hot path) src.theme_nerv_fx 254ms src.theme_nerv 246ms src.markdown_table 243ms src.command_palette 242ms Audit violations on current codebase: 67. These are the targets for Phases 3-5 (remove top-level heavy imports to fix each one). Next: Phase 2 (Job Pool + Warmup Foundation).	2026-06-06 14:24:20 -04:00
ed	0553983ce9	conductor(spec): Clarify --audit --strict semantics in Section 4.3 Default --audit exits non-zero on hard errors only. --strict adds the 'multiple subsystems = probably cross-cutting' heuristic from Section 9 as a CI gate. Two modes, one flag.	2026-06-06 14:16:13 -04:00
ed	cbfd78c51d	conductor(tracks): Register test_batching_refactor_20260606 in registry	2026-06-06 14:14:11 -04:00
ed	b7a9737443	conductor(track): Initialize test_batching_refactor_20260606 spec Three-tier batching refactor: replace alphabetical 4-at-a-time batching with fixture-class-isolated tiers (0 opt-in, 1 unit/xdist, 2 mock_app, 3 live_gui in one session, H headless, P performance). Hybrid classification: auto-infer from filename + AST fixture scan; hand-curated tests/test_categories.toml overrides for cross-cutting and ambiguous files. Opt-in per-test order control via [[files.X.test_order]] sub-tables, gated on a conftest-loaded pytest plugin (no-op without entries). Priority order: B (process isolation) > A (subsystem diagnostic) > C (speed).	2026-06-06 14:12:14 -04:00
ed	96158edd97	conductor(plan): mark T1.3 StartupProfiler complete (`5a856536`)	2026-06-06 13:59:02 -04:00
ed	f2f5ee1197	conductor(plan): flip track from lazy-loading to proactive warmup Architectural shift driven by user clarification: lazy-loading on first use causes user-perceptible lag when the user-triggered action (e.g. provider switch) propagates to a controller method that triggers the first import. The fix is to pre-import heavy modules on a bg thread at startup and have functions access them via _require_warmed(). Old design (rejected): - from google import genai inside _send_gemini (lazy on first call) - First user action that triggers this pays the cost; UI feels laggy New design (this commit): - Top-level heavy imports REMOVED from main-thread-reachable files - AppController.__init__ submits warmup jobs to _io_pool (4 threads, named 'controller-io-N') - Each warmup worker imports its module and updates a thread-safe warmup_status dict - Functions access modules via _require_warmed(name), which assumes the module is in sys.modules (warmed at startup) - When all jobs complete, _warmup_done_event is set and registered on_warmup_complete callbacks fire - GUI shows status indicator + toast when warmup completes - Hook API exposes /api/warmup_status and /api/warmup_wait - Tests can call controller.wait_for_warmup() before exercising warmup-dependent functionality Phase 2 now bundles job pool + warmup (T2.3+T2.4 add warmup tests + implementation). Phases 3-5 do 'remove top-level imports' instead of 'lazy-load'. Phase 7 is the notification surface (Hook API + GUI). Definition of Done includes warmup-completion criteria, the 'no function-body imports' check, and an end-to-end 'provider switch is INSTANT' smoke test. No code changes; this is a planning update only.	2026-06-06 13:45:05 -04:00
r00tz	9e4fac496d	made local rag needs optional (prevents having to have torch / sentence-transformers if you never use local embedding)	2026-06-06 13:21:43 -04:00
ed	32e633b3ec	conductor(plan): mark startup_speedup_20260606 track creation committed (`cd4fb045`)	2026-06-06 13:01:32 -04:00
ed	cd4fb04541	conductor(track): create startup_speedup_20260606 track for sloppy.py startup latency Fulfills the existing backlog entry at conductor/tracks.md:152 (2026-06-05 root-cause analysis of live_gui wait_for_server timeouts). Main Thread Purity Invariant: the main thread (entering immapp.run()) must never import a module heavier than imgui_bundle and the lean gui_2 skeleton. Enforced by: - static gate: scripts/audit_main_thread_imports.py (CI) - runtime hook: tests/test_main_thread_purity.py (sys.addaudithook) Threading constraint: no new threading.Thread(...) calls in src/. All background work goes through AppController._io_pool (ThreadPoolExecutor, max_workers=4, thread_name_prefix='controller-io'). 9 phases, 57 tasks: audit+baseline, job pool, lazy-load SDKs, lazy-load FastAPI, lazy-load feature-gated GUI, migrate ad-hoc threads, runtime enforcement, hook API + diagnostics, verify+checkpoint. Expected savings: ~2000-2400ms off main-thread import cost. Target: import src.ai_client < 50ms (from ~1800ms), live_gui fixtures no longer time out at wait_for_server(timeout=15).	2026-06-06 12:57:20 -04:00
ed	9d72d98b50	conductor(tracks): mark rag_phase4_stress_test_flake resolved (commit `16412ad5`)	2026-06-06 11:29:03 -04:00
ed	0f742b1d5f	conductor(workflow): add Indentation-Driven Class Method Visibility pitfall (2026-06-05)	2026-06-06 02:04:05 -04:00
ed	5e0b6bbfd3	conductor(tracks): queue RAG test flake as new backlog item; mark prior_session complete	2026-06-06 01:35:21 -04:00
ed	008179360f	conductor(index): v2 recently shipped, all 4 live_gui failures resolved	2026-06-06 01:30:03 -04:00
ed	9a3831897b	conductor(tracks): mark live_gui_test_hardening_v2 complete (root cause was indent, not state sync)	2026-06-06 01:28:02 -04:00
ed	6c541bc788	move track mds to tracks	2026-06-06 00:42:40 -04:00
ed	5c23ad190d	conductor(tracks): link v2 to 4 sub-track specs and plans	2026-06-05 22:56:55 -04:00
ed	8b83c5d0b7	conductor(index): v2 active, v1 + regression_fixes now in recently-shipped	2026-06-05 22:12:34 -04:00
ed	70c18f92c3	conductor(tracks): mark v1 fragility_fixes complete, queue v2 (state sync + undo_redo + prior_session)	2026-06-05 22:09:30 -04:00
ed	1488e71568	docs: add Sentinel type contract note to 3 defer-not-catch sections	2026-06-05 20:31:38 -04:00
ed	0e299140ca	conductor(tracks): register live_gui_fragility_fixes + queue prior_session_test_harden follow-up	2026-06-05 20:17:11 -04:00
ed	449a827a82	conductor(tracks): queue sloppy.py startup speedup as new backlog item	2026-06-05 18:53:01 -04:00
ed	dc691e3de0	docs(workflow): reframe live_gui fragility as authoring-side, not fixture bug	2026-06-05 18:43:58 -04:00
ed	71b0082bbf	docs(workflow): add Known Pitfalls section (defer-not-catch, theme bisect anchors, live_gui fragility)	2026-06-05 18:31:14 -04:00
ed	2f0c1eb3cc	conductor(index): mark regression_fixes active, add multi_themes recently shipped	2026-06-05 18:18:27 -04:00
ed	8663498725	conductor(tracks): register multi_themes ship and regression_fixes checkpoint	2026-06-05 18:12:03 -04:00
ed	db3490a70f	conductor(plan): document imgui save_ini crash root cause and fix	2026-06-05 15:12:23 -04:00
ed	b0c8589f68	conductor(plan): document root cause - imgui-bundle C-level crash blocks live_gui	2026-06-05 13:47:55 -04:00
ed	1c6919aafc	conductor(plan): update task status - 5 done, 6 deferred pending live_gui	2026-06-05 12:43:33 -04:00
ed	07d35c9d39	conductor(plan): regression fixes - 21 failures from full suite run	2026-06-05 10:10:29 -04:00
ed	06e305aba6	feat(theme): add tone mapping and fix missing palette colors	2026-06-04 23:44:43 -04:00
ed	cd24c43f8f	conductor(plan): theme + syntax modularization - 7-task plan	2026-06-04 22:20:58 -04:00
ed	ce211e76f8	straggler spec	2026-06-04 19:42:04 -04:00
ed	ba7733b365	conductor(plan): Mark context_first_message_fix task complete	2026-06-04 18:47:42 -04:00
ed	0d4fade5ed	fix(context): Only send context on first message in discussion Previously, context (files, screenshots) was always sent with every message, even on subsequent messages where the AI provider already had the context from the first message via its history mechanism. This change: - Detects if the discussion has any AI responses already - Only sends md_content (stable_md) on the first message - Subsequent messages pass empty string for md_content to avoid redundant sending - Context now properly goes in md_content parameter, not crammed into user_message The fix is in _api_generate() in src/app_controller.py	2026-06-04 18:43:39 -04:00
ed	11253e8d60	conductor(plan): UI Polish track - 5 phases, design spec + impl plan	2026-06-03 10:29:25 -04:00
ed	db177e4494	docs(api): correct endpoint /api/mma_status -> /api/gui/mma_status across docs	2026-06-03 00:56:32 -04:00
