manual_slop

Author	SHA1	Message	Date
ed	7c1d597ef1	conductor(track): Initialize qwen_llama_grok_integration_20260606 spec Three new vendors + capability matrix framework + MiniMax refactor: Capability matrix v1 (7 features): vision, tool_calling, caching, streaming, model_discovery, context_window, cost_tracking. Audio and server-side code execution deferred to a follow-up track. Qwen via DashScope native SDK: Qwen-Turbo, Qwen-Plus, Qwen-Max, Qwen-Long (1M context), Qwen-VL-Plus/Max (vision), Qwen-Audio. Native API chosen over OpenAI-compatible mode to unlock Qwen-Audio, Qwen-Long custom chunking, and Qwen-VL-Max enhanced vision. Llama (OpenAI-compatible, multi-backend): Ollama (local, free), OpenRouter (cloud aggregator covering Together/Groq/Fireworks), custom URL escape hatch. Models: Llama 3.1 8B/70B/405B, 3.2 1B/3B, 3.2 11B/90B Vision, 3.3 70B. Grok via xAI (OpenAI-compatible): Grok-2, Grok-2-Vision, Grok-Beta. Shared OpenAI-compatible helper in src/openai_compatible.py processes a normalized request/response data structure; each _send_<vendor>() is a thin adapter at the boundary (data-oriented design per Fleury/Acton/Lottes). MiniMax refactor: ~250 lines reduced to ~50 by using the shared helper. Existing test_minimax_provider.py is the safety net. UX adaptation: 9 UI elements (screenshot, tools toggle, cache panel, stream progress, fetch models, token budget, cost panel) read from the matrix instead of hard-coding per-vendor branches. Out of scope (deferred): Anthropic/Gemini/DeepSeek migration to the matrix (separate track), audio input, server-side code execution, PDF input, batch API, fine-tuning. 6 phases planned: matrix+helper, Qwen, Grok+Llama, MiniMax refactor, UX adaptation, docs+archive.	2026-06-06 14:56:00 -04:00
ed	7eb743c6cb	conductor(plan): Phase 2 complete - io_pool + warmup foundation in place Phase 2 of startup_speedup_20260606 is done. Tasks: T2.1 (Red) tests/test_io_pool.py `1354679e` 4 tests T2.2 (Green) src/io_pool.py `1354679e` make_io_pool() factory T2.3 (Red) tests/test_warmup.py `1354679e` 10 tests T2.4 (Green) src/warmup.py `1354679e` WarmupManager T2.5 (Wire) AppController integration `922c5ad9` io_pool + warmup in __init__ + 5 public delegation methods T2.6 (Plan) this commit What now exists: - make_io_pool() returns a 4-worker ThreadPoolExecutor named 'controller-io-N' - WarmupManager class with submit/status/is_done/wait/on_complete/reset - AppController creates self._io_pool + self._warmup early in __init__ - Warmup is submitted immediately (jobs run concurrent with the rest of init) - Public API: controller.warmup_status(), controller.is_warmup_done(), controller.wait_for_warmup(timeout), controller.on_warmup_complete(cb) - controller._compute_warmup_list() returns 9 always + 2 conditional (fastapi) - shutdown() now also shuts down the io_pool Currently the warmup is a no-op for modules already imported at the top of app_controller.py (fastapi, requests). Phase 3 will remove those top-level imports; the warmup infrastructure will then start doing real work. 18/18 tests passing (4 io_pool + 10 warmup + 4 test_app_controller_*). Next: Phase 3 (remove top-level SDK imports from src/ai_client.py). Expected to fix ~3 audit violations (google.genai, anthropic, openai).	2026-06-06 14:52:04 -04:00
ed	7fdab70529	conductor(plan): write 4-phase implementation plan for test_batching_refactor_20260606 16 tasks across 4 phases, each with explicit Red-Green-Refactor TDD steps: - Phase 1 (1.1-1.16): Library + dry-run. 20 unit tests across categorizer, batcher, plugin. New run_tests_batched.py has --plan/--audit only. - Phase 2 (2.1-2.3): Shadow run via CI. Compare new vs old plan output. - Phase 3 (3.1-3.4): Switch default. Full CLI with --tiers, --durations. Old script becomes .legacy. Update docs/guide_testing.md. - Phase 4 (4.1-4.6): Populate registry, gitignore durations, delete legacy, archive track. 1-space indentation per project style guide. No placeholders. All test code is concrete.	2026-06-06 14:24:39 -04:00
ed	f9a0125847	conductor(plan): Phase 1 complete - baseline + audit infrastructure ready Phase 1 of startup_speedup_20260606 track is done. Tasks completed: T1.1 baseline benchmark -> `6f9a3af2` (docs/reports/startup_baseline_20260606.txt) T1.2 audit_gui2_imports.py -> `6f9a3af2` (scripts/ + audit results) T1.3 StartupProfiler -> `5a856536` (src/ + 5 tests) T1.4 audit_main_thread_imports -> `6f9a3af2` (scripts/ + 9 tests) T1.5 plan update -> this commit Baseline numbers (3-run median, from scripts/benchmark_imports.py): src.gui_2 1770ms (main-thread bottleneck) simulation.user_agent 1517ms google.genai 1001ms openai 482ms anthropic 441ms imgui_bundle 255ms (KEEP - ImGui hot path) src.theme_nerv_fx 254ms src.theme_nerv 246ms src.markdown_table 243ms src.command_palette 242ms Audit violations on current codebase: 67. These are the targets for Phases 3-5 (remove top-level heavy imports to fix each one). Next: Phase 2 (Job Pool + Warmup Foundation).	2026-06-06 14:24:20 -04:00
ed	0553983ce9	conductor(spec): Clarify --audit --strict semantics in Section 4.3 Default --audit exits non-zero on hard errors only. --strict adds the 'multiple subsystems = probably cross-cutting' heuristic from Section 9 as a CI gate. Two modes, one flag.	2026-06-06 14:16:13 -04:00
ed	cbfd78c51d	conductor(tracks): Register test_batching_refactor_20260606 in registry	2026-06-06 14:14:11 -04:00
ed	b7a9737443	conductor(track): Initialize test_batching_refactor_20260606 spec Three-tier batching refactor: replace alphabetical 4-at-a-time batching with fixture-class-isolated tiers (0 opt-in, 1 unit/xdist, 2 mock_app, 3 live_gui in one session, H headless, P performance). Hybrid classification: auto-infer from filename + AST fixture scan; hand-curated tests/test_categories.toml overrides for cross-cutting and ambiguous files. Opt-in per-test order control via [[files.X.test_order]] sub-tables, gated on a conftest-loaded pytest plugin (no-op without entries). Priority order: B (process isolation) > A (subsystem diagnostic) > C (speed).	2026-06-06 14:12:14 -04:00
ed	96158edd97	conductor(plan): mark T1.3 StartupProfiler complete (`5a856536`)	2026-06-06 13:59:02 -04:00
ed	f2f5ee1197	conductor(plan): flip track from lazy-loading to proactive warmup Architectural shift driven by user clarification: lazy-loading on first use causes user-perceptible lag when the user-triggered action (e.g. provider switch) propagates to a controller method that triggers the first import. The fix is to pre-import heavy modules on a bg thread at startup and have functions access them via _require_warmed(). Old design (rejected): - from google import genai inside _send_gemini (lazy on first call) - First user action that triggers this pays the cost; UI feels laggy New design (this commit): - Top-level heavy imports REMOVED from main-thread-reachable files - AppController.__init__ submits warmup jobs to _io_pool (4 threads, named 'controller-io-N') - Each warmup worker imports its module and updates a thread-safe warmup_status dict - Functions access modules via _require_warmed(name), which assumes the module is in sys.modules (warmed at startup) - When all jobs complete, _warmup_done_event is set and registered on_warmup_complete callbacks fire - GUI shows status indicator + toast when warmup completes - Hook API exposes /api/warmup_status and /api/warmup_wait - Tests can call controller.wait_for_warmup() before exercising warmup-dependent functionality Phase 2 now bundles job pool + warmup (T2.3+T2.4 add warmup tests + implementation). Phases 3-5 do 'remove top-level imports' instead of 'lazy-load'. Phase 7 is the notification surface (Hook API + GUI). Definition of Done includes warmup-completion criteria, the 'no function-body imports' check, and an end-to-end 'provider switch is INSTANT' smoke test. No code changes; this is a planning update only.	2026-06-06 13:45:05 -04:00
r00tz	9e4fac496d	made local rag needs optional (prevents having to have torch / sentence-transformers if you never use local embedding)	2026-06-06 13:21:43 -04:00
ed	32e633b3ec	conductor(plan): mark startup_speedup_20260606 track creation committed (`cd4fb045`)	2026-06-06 13:01:32 -04:00
ed	cd4fb04541	conductor(track): create startup_speedup_20260606 track for sloppy.py startup latency Fulfills the existing backlog entry at conductor/tracks.md:152 (2026-06-05 root-cause analysis of live_gui wait_for_server timeouts). Main Thread Purity Invariant: the main thread (entering immapp.run()) must never import a module heavier than imgui_bundle and the lean gui_2 skeleton. Enforced by: - static gate: scripts/audit_main_thread_imports.py (CI) - runtime hook: tests/test_main_thread_purity.py (sys.addaudithook) Threading constraint: no new threading.Thread(...) calls in src/. All background work goes through AppController._io_pool (ThreadPoolExecutor, max_workers=4, thread_name_prefix='controller-io'). 9 phases, 57 tasks: audit+baseline, job pool, lazy-load SDKs, lazy-load FastAPI, lazy-load feature-gated GUI, migrate ad-hoc threads, runtime enforcement, hook API + diagnostics, verify+checkpoint. Expected savings: ~2000-2400ms off main-thread import cost. Target: import src.ai_client < 50ms (from ~1800ms), live_gui fixtures no longer time out at wait_for_server(timeout=15).	2026-06-06 12:57:20 -04:00
ed	9d72d98b50	conductor(tracks): mark rag_phase4_stress_test_flake resolved (commit `16412ad5`)	2026-06-06 11:29:03 -04:00
ed	0f742b1d5f	conductor(workflow): add Indentation-Driven Class Method Visibility pitfall (2026-06-05)	2026-06-06 02:04:05 -04:00
ed	5e0b6bbfd3	conductor(tracks): queue RAG test flake as new backlog item; mark prior_session complete	2026-06-06 01:35:21 -04:00
ed	008179360f	conductor(index): v2 recently shipped, all 4 live_gui failures resolved	2026-06-06 01:30:03 -04:00
ed	9a3831897b	conductor(tracks): mark live_gui_test_hardening_v2 complete (root cause was indent, not state sync)	2026-06-06 01:28:02 -04:00
ed	6c541bc788	move track mds to tracks	2026-06-06 00:42:40 -04:00
ed	5c23ad190d	conductor(tracks): link v2 to 4 sub-track specs and plans	2026-06-05 22:56:55 -04:00
ed	8b83c5d0b7	conductor(index): v2 active, v1 + regression_fixes now in recently-shipped	2026-06-05 22:12:34 -04:00
ed	70c18f92c3	conductor(tracks): mark v1 fragility_fixes complete, queue v2 (state sync + undo_redo + prior_session)	2026-06-05 22:09:30 -04:00
ed	1488e71568	docs: add Sentinel type contract note to 3 defer-not-catch sections	2026-06-05 20:31:38 -04:00
ed	0e299140ca	conductor(tracks): register live_gui_fragility_fixes + queue prior_session_test_harden follow-up	2026-06-05 20:17:11 -04:00
ed	449a827a82	conductor(tracks): queue sloppy.py startup speedup as new backlog item	2026-06-05 18:53:01 -04:00
ed	dc691e3de0	docs(workflow): reframe live_gui fragility as authoring-side, not fixture bug	2026-06-05 18:43:58 -04:00
ed	71b0082bbf	docs(workflow): add Known Pitfalls section (defer-not-catch, theme bisect anchors, live_gui fragility)	2026-06-05 18:31:14 -04:00
ed	2f0c1eb3cc	conductor(index): mark regression_fixes active, add multi_themes recently shipped	2026-06-05 18:18:27 -04:00
ed	8663498725	conductor(tracks): register multi_themes ship and regression_fixes checkpoint	2026-06-05 18:12:03 -04:00
ed	db3490a70f	conductor(plan): document imgui save_ini crash root cause and fix	2026-06-05 15:12:23 -04:00
ed	b0c8589f68	conductor(plan): document root cause - imgui-bundle C-level crash blocks live_gui	2026-06-05 13:47:55 -04:00
ed	1c6919aafc	conductor(plan): update task status - 5 done, 6 deferred pending live_gui	2026-06-05 12:43:33 -04:00
ed	07d35c9d39	conductor(plan): regression fixes - 21 failures from full suite run	2026-06-05 10:10:29 -04:00
ed	06e305aba6	feat(theme): add tone mapping and fix missing palette colors	2026-06-04 23:44:43 -04:00
ed	cd24c43f8f	conductor(plan): theme + syntax modularization - 7-task plan	2026-06-04 22:20:58 -04:00
ed	ce211e76f8	straggler spec	2026-06-04 19:42:04 -04:00
ed	ba7733b365	conductor(plan): Mark context_first_message_fix task complete	2026-06-04 18:47:42 -04:00
ed	0d4fade5ed	fix(context): Only send context on first message in discussion Previously, context (files, screenshots) was always sent with every message, even on subsequent messages where the AI provider already had the context from the first message via its history mechanism. This change: - Detects if the discussion has any AI responses already - Only sends md_content (stable_md) on the first message - Subsequent messages pass empty string for md_content to avoid redundant sending - Context now properly goes in md_content parameter, not crammed into user_message The fix is in _api_generate() in src/app_controller.py	2026-06-04 18:43:39 -04:00
ed	11253e8d60	conductor(plan): UI Polish track - 5 phases, design spec + impl plan	2026-06-03 10:29:25 -04:00
ed	db177e4494	docs(api): correct endpoint /api/mma_status -> /api/gui/mma_status across docs	2026-06-03 00:56:32 -04:00
ed	6ce119dffe	conductor(checkpoint): Fix markdown_helper.py for imgui-bundle >=1.92.801 complete	2026-06-03 00:54:07 -04:00
ed	7a34edf605	fixes	2026-06-03 00:47:40 -04:00
ed	79a12d2c3e	conductor(tracks): register Clean Install Test track with checkpoint `d14ae3b`	2026-06-03 00:33:13 -04:00
ed	d14ae3bd08	conductor(checkpoint): Clean install test complete	2026-06-03 00:31:55 -04:00
ed	573d289941	test(pytest): register clean_install marker for opt-in clone-and-verify test	2026-06-03 00:28:20 -04:00
ed	0309420ba1	conductor(checkpoint): Archive Completed Tracks (2026-05 to 2026-06) complete	2026-06-03 00:19:13 -04:00
ed	b87742ecba	conductor(tracks): fix 25 broken links in Phase 5/6/Hot Reload sections after archival	2026-06-03 00:17:38 -04:00
ed	56ea316afa	conductor(tracks): consolidate 'Earlier Archives' into 'Recent Completed Tracks (2026-05+)' with archive/ links	2026-06-03 00:14:45 -04:00
ed	594f14f943	conductor(archive): move 39 completed tracks (2026-05 to 2026-06) to archive/	2026-06-03 00:09:52 -04:00
ed	0ffeccc7d3	conductor(tracks): register 4 completed 2026-06-02 tracks with checkpoint SHAs	2026-06-03 00:02:15 -04:00
ed	f93dac7d8f	conductor(guidelines): add See Also section linking to per-file conventions	2026-06-02 23:53:31 -04:00
