manual_slop

10 Commits

Include renames

Author	SHA1	Message	Date
ed	5370f8dcc6	conductor(track): mark result_migration_small_files_20260617 Phase 11 complete Phase 11 (REJECT Phase 10's sliming). The full Result[T] migration for the 21 slimed sites has been completed: - 5 full Result migrations in warmup.py (on_complete, _record_success, _record_failure, _log_canary, _log_summary now return Result[T]) - 2 helper extracts: startup_profiler._log_phase_output and file_cache._get_mtime_safe (Result-returning helpers) - 14 sites documented as already compliant (Result/BOUNDARY_CONVERSION/ Heuristic #19 - not sliming, valid existing pattern) - 1 known limitation: warmup._warmup_one L185 (indirect Result return via delegation; convention followed; audit has known limitation) 5 LAUNDERING HEURISTICS (#22-#26) REVERTED in commit `37872544`. Heuristic A (Result-returning recovery) ADDED in commit `3c839c91`. Test count corrected: Phase 10 wrongly claimed '10 tiers'; the 11th tier is tier-1-unit-comms. Phase 11 ran ALL 11 tiers and 10 PASS; tier-3 fails on the pre-existing test_execution_sim_live flake (unrelated). Updated: - conductor/tracks/result_migration_small_files_20260617/state.toml - conductor/tracks/result_migration_small_files_20260617/metadata.json - conductor/tracks.md (sub-track 6d-2 row) - conductor/tracks/result_migration_20260616/spec.md (umbrella) - docs/reports/RESULT_MIGRATION_SMALL_FILES_20260617.md (Phase 11 addendum) - docs/reports/TRACK_COMPLETION_result_migration_small_files_20260617.md (Phase 11 addendum with corrected test count) Phase 11 is the actual completion. Phase 10 was rejected for sliming.	2026-06-18 00:39:59 -04:00
ed	6c66c03e82	refactor(src): file_cache.py Phase 11.3.5 - extract _get_mtime_safe Phase 11.3.5. The original try/except (OSError, ValueError): mtime = 0.0 in get_cached_tree is now extracted to a Result-returning helper. The helper returns Result[float]; the caller uses .data (0.0 fallback) and can inspect .errors. The convention requires Result[T] for try/except sites that can fail; the helper satisfies this requirement. Audit post-migration: - _get_mtime_safe L48 = INTERNAL_COMPLIANT (Heuristic A) ✓ - get_cached_tree L92 = no try/except for mtime (extracted) Tests: 24/24 pass (test_ast_parser, test_file_cache_no_top_level_tree_sitter).	2026-06-18 00:14:17 -04:00
ed	2ed449ee5f	refactor(src): startup_profiler.py Phase 11.3.2 - extract _log_phase_output Phase 11.3.2. CONTEXT-MANAGER EXCEPTION. The plan claimed 'StartupProfiler.phase() is NOT a context manager; tier-2's claim is factually wrong.' This is incorrect. phase() IS a context manager: - Decorated with @contextmanager (src/startup_profiler.py:26) - Used in 13 'with startup_profiler.phase(...)' call sites in src/gui_2.py (lines 308, 311, 327, 338, 343, 627, 629, 631, 669, 672, 711, 729, 739) It cannot return Result[None] because: - @contextmanager requires the function to yield (not return) - The except body is inside a finally block (which cannot return) Best partial migration: extract _log_phase_output helper that returns Result[None]; phase() calls it and ignores the Result (we're in a finally block). Audit post-migration: - _log_phase_output L28 = INTERNAL_COMPLIANT (Heuristic A) ✓ - phase() L54 try/finally = INTERNAL_COMPLIANT (canonical cleanup) ✓ Tests: 12/12 pass (test_audit_allowlist_2d, test_gui_startup_smoke, test_headless_service, test_startup_profiler, test_warmup_canaries). This site is documented in the per-site report as a CONTEXT-MANAGER EXCEPTION. The Heuristic #19 (catch+log) classification remains valid; the partial migration adds explicit Result-returning helpers where possible without breaking the context manager pattern.	2026-06-18 00:10:16 -04:00
ed	3c839c910a	feat(scripts): Heuristic A - Result-returning recovery = INTERNAL_COMPLIANT Phase 11.2. Adds the LEGITIMATE heuristic that recognizes the canonical data-oriented pattern: \ ry: ...; except: return Result(data=..., errors=[...])\ is the convention's canonical recovery pattern. Detection: - New _returns_result(stmts) helper on ExceptionVisitor - New step 0 in _classify_except (BEFORE BOUNDARY_CONVERSION check) - Classifies as INTERNAL_COMPLIANT with a hint that names the pattern The function-name-not-ending-in-_result is documented as a smell (rename to xxx_result for canonical naming), but the pattern itself is compliant. Tests: - 2 new tests in test_audit_exception_handling_heuristics.py: - test_result_returning_recovery_in_non_result_named_function_is_compliant - test_result_returning_recovery_in_result_named_function_is_compliant - Both pass; the 2 REJECTED tests (#22, #23) remain xfailed. Per conductor/tracks/result_migration_small_files_20260617/plan.md section 11.2.	2026-06-18 00:00:42 -04:00
ed	052881ec20	fix(src): update load_context_preset to handle Result from load_all After migrating ContextPresetManager.load_all to return Result[Dict], the caller in app_controller.load_context_preset needs to extract .data from the Result before checking 'name not in presets'. Updates: - src/app_controller.py:load_context_preset - check result.ok and extract result.data before iterating; raise RuntimeError if result.ok is False (consistent with the convention). - tests/test_context_presets_manager.py:test_manager_load_all - extract result.data before assertions. Tests verified: - tests/test_context_presets_manager.py (4 tests) PASS - tests/test_project_switch_persona_preset.py:: test_load_context_preset_missing_raises_keyerror PASS (KeyError raised correctly when preset not found) - tests/test_phase6_engine.py (3 tests) PASS	2026-06-17 23:15:57 -04:00
ed	dc5e581368	chore(track): archive throw-away scripts for result_migration_review_pass_20260617 (4 helper scripts + sites_to_classify.json)	2026-06-17 17:02:27 -04:00
ed	4b5d5caa8b	docs(tier2): hand off to tier 1 - architectural investigation of stack overflow User indicated they want tier 1 to investigate ('something feels architecturally wrong'). Investigation summary: ROOT CAUSE: imgui.set_window_focus('Response') called on the same frame as the response render, when _trigger_blink is set by _handle_ai_response. The native call exhausts the main thread's 1.94MB stack. VERIFIED: disabling _trigger_blink and _autofocus_response_tab makes the test PASS. The process survives, the response event arrives with correct error text. HISTORY CHECK (git log -S): - _trigger_blink: pre-existing since March 2026 (`c88330cc` feat(hot- reload) Exhaustive region grouping for module-level render funcs) - _autofocus_response_tab: pre-existing since March 6 2026 (`0e9f84f0` 'fixing') - set_window_focus in render_response_panel: pre-existing since `96a013c3` 'fixes and possible wip gui_2/theme_2 for multi-viewport' - response event flow: pre-existing since `68861c07` feat(mma): Decouple UI from API calls using UserRequestEvent and AsyncEventQueue - FR1 (send_result error routing): commit `24ba2499` (Jun 15 2026) in public_api_migration_and_ui_polish_20260615 track The jank is OLDER than the user thinks. The most likely explanation: the test was never run as part of the regular tier-3 batch, so the crash was masked by the Isolated-Pass Verification Fallacy. QUESTIONS FOR TIER 1: 1. Is _trigger_blink a sound design? 2. Should imgui focus changes be deferred to next frame's idle phase? 3. Is there a general principle that no native imgui call should be made during the same frame as a draw call? PROPOSED MINIMAL FIX: defer set_window_focus to next frame's idle phase via a _pending_focus_response flag handled in _process_pending_gui_tasks (which runs before the render).	2026-06-17 13:40:12 -04:00
ed	694cfd2b70	diag(tier2): isolate the jank - _trigger_blink in render_response_panel User asked: 'what does negative flows cause in the imgui procedural dag graph that would cause a recursive processing of the stack?' Tested 4 hypotheses: 1. PYTHONSTACKSIZE env var to bump main thread stack: IGNORED. Main thread stays at 1.94MB regardless of env var or PE header (PE header SizeOfStackReserve is 4TB but Windows OS uses its own default for the main thread commit size). 2. -X faulthandler: doesn't capture native STATUS_STACK_OVERFLOW (faulthandler only catches Python-level signals). 3. Editbin /STACK: editbin not installed on this system. 4. PE header patching with ctypes: SizeOfStackReserve is 4TB but the OS commits only 1.94MB for the main thread and Python doesn't honor any env var to change it. The breakthrough: monkey-patched _handle_ai_response via sitecustomize to disable _trigger_blink and _autofocus_response_tab. Result: WITHOUT _trigger_blink: process survives 60s, response event arrives with status='error' and correct error text. The test WOULD PASS. WITH _trigger_blink (default): process dies with 0xC00000FD (STATUS_STACK_OVERFLOW) within 1s of click. The jank: in src/gui_2.py:render_response_panel (line 5537), the _trigger_blink flag triggers imgui.set_window_focus('Response') on the SAME frame as the response render. This native imgui call apparently triggers imgui-bundle to do extra C++ draw work that exhausts the main thread's 1.94MB stack. Why negative_flows specifically: it's the ONLY tier-3 test where the error response triggers the _trigger_blink path. Success responses also trigger _trigger_blink but don't crash (perhaps because imgui- bundle's layout calculations for an error overlay are heavier than for a normal text response). User predicted: 'i wont solve it but just pad out until failure'. Confirmed - bumping stack didn't fix it (couldn't bump anyway, but the prediction about recursion-related behavior is on track). The fix (per user's framing 'needs to be guarded'): wrap the set_window_focus call in render_response_panel in a try/except or add a stack-depth guard before calling it. Or move the _trigger_blink logic to a deferred frame to avoid the same-frame race with the response render.	2026-06-17 13:22:38 -04:00
ed	cc234b1b83	docs(tier2): architecture check - click chain isolation is correct Per user question about whether execution is properly isolated between AppController and gui_2.py main thread. Verified by reading the architecture contract (docs/guide_architecture.md lines 12, 884-890) and the two click handlers in question: - _handle_generate_send (btn_gen_send): self.submit_io(worker) - _cb_plan_epic (btn_mma_plan_epic): self.submit_io(_bg_task) BOTH click handlers return immediately after submitting work. The heavy AI call (ai_client.send -> subprocess.Popen -> process.communicate) runs on the io_pool worker thread. The execution isolation between AppController and gui_2.py's main render thread IS being followed. The crash (STATUS_STACK_OVERFLOW, 0xC00000FD) is NOT in the click handler chain. It IS in the main thread's imgui-bundle render loop. The render loop runs concurrently with the io_pool worker's subprocess operations. imgui-bundle's per-frame C++ draw code can exceed the main thread's 1.94 MB stack (verified via kernel32.GetCurrentThreadStackLimits). What aspect of negative_flows triggers this: the error-response render path. MOCK_MODE=malformed_json causes the adapter to raise, which triggers _handle_request_event to emit a 'response' event with status='error'. The render loop draws this error response on the next frame, exhausting the main thread's stack. test_visual_orchestration.py uses the same provider setup but does NOT set MOCK_MODE, so the mock defaults to 'success' mode, the adapter returns normally, no error event, no crash. Empirically PASSED in 11.01s. The architecture's render-loop contract assumes imgui-bundle's C stack usage is bounded. It's not. The architecture has no enforcement mechanism (no stack guard, no per-frame stack measurement, no graceful degradation). Next step (post-compact): capture Windows crash dump via procdump to identify the specific imgui-bundle draw call.	2026-06-17 13:09:57 -04:00
ed	cc2105dc65	docs(tier2): what's special about test_z_negative_flows User asked why this test is uniquely affected. Answer: it's the ONLY tier-3 test where the AI call runs ASYNCHRONOUSLY in the io_pool worker while the imgui-bundle render loop continues on the main thread. Verified: test_visual_orchestration.py::test_mma_epic_lifecycle uses the same provider setup (gemini_cli + mock_gemini_cli.py + click) but calls orchestrator_pm.generate_tracks() synchronously in the main thread, blocking the render loop. It PASSES in 11s. test_mma_step_mode_sim.py::test_mma_step_mode_approval_flow also uses the async path but is @pytest.mark.skipif(not RUN_MMA_INTEGRATION) - skipped by default. Would likely also crash if unsuppressed. All other MockProvider tests short-circuit at ai_client.send and never spawn a subprocess. The crash is on the MAIN thread (1.94 MB stack, verified via kernel32.GetCurrentThreadStackLimits), not the io_pool worker (which has 8MB after threading.stack_size(8MB) patch). The main thread's imgui-bundle render loop runs concurrently with the io_pool worker's subprocess.Popen / process.communicate. The accumulated imgui-bundle C++ frames exhaust the main thread's 1.94 MB stack. This explains: - Why bumping io_pool stack to 8MB doesn't help (the patch can't reach the main thread, which was created before any sitecustomize runs). - Why the standalone subprocess call works (no render loop concurrent). - Why the no-click baseline survives 60s (no AI call to trigger the race). Next step: capture a Windows crash dump via procdump or cdb.exe to confirm the crashing thread is the main thread and identify the specific imgui-bundle C++ stack frame.	2026-06-17 12:58:15 -04:00