manual_slop

Private

Public Access

Author	SHA1	Message	Date
ed	cc234b1b83	docs(tier2): architecture check - click chain isolation is correct Per user question about whether execution is properly isolated between AppController and gui_2.py main thread. Verified by reading the architecture contract (docs/guide_architecture.md lines 12, 884-890) and the two click handlers in question: - _handle_generate_send (btn_gen_send): self.submit_io(worker) - _cb_plan_epic (btn_mma_plan_epic): self.submit_io(_bg_task) BOTH click handlers return immediately after submitting work. The heavy AI call (ai_client.send -> subprocess.Popen -> process.communicate) runs on the io_pool worker thread. The execution isolation between AppController and gui_2.py's main render thread IS being followed. The crash (STATUS_STACK_OVERFLOW, 0xC00000FD) is NOT in the click handler chain. It IS in the main thread's imgui-bundle render loop. The render loop runs concurrently with the io_pool worker's subprocess operations. imgui-bundle's per-frame C++ draw code can exceed the main thread's 1.94 MB stack (verified via kernel32.GetCurrentThreadStackLimits). What aspect of negative_flows triggers this: the error-response render path. MOCK_MODE=malformed_json causes the adapter to raise, which triggers _handle_request_event to emit a 'response' event with status='error'. The render loop draws this error response on the next frame, exhausting the main thread's stack. test_visual_orchestration.py uses the same provider setup but does NOT set MOCK_MODE, so the mock defaults to 'success' mode, the adapter returns normally, no error event, no crash. Empirically PASSED in 11.01s. The architecture's render-loop contract assumes imgui-bundle's C stack usage is bounded. It's not. The architecture has no enforcement mechanism (no stack guard, no per-frame stack measurement, no graceful degradation). Next step (post-compact): capture Windows crash dump via procdump to identify the specific imgui-bundle draw call.	2026-06-17 13:09:57 -04:00
ed	cc2105dc65	docs(tier2): what's special about test_z_negative_flows User asked why this test is uniquely affected. Answer: it's the ONLY tier-3 test where the AI call runs ASYNCHRONOUSLY in the io_pool worker while the imgui-bundle render loop continues on the main thread. Verified: test_visual_orchestration.py::test_mma_epic_lifecycle uses the same provider setup (gemini_cli + mock_gemini_cli.py + click) but calls orchestrator_pm.generate_tracks() synchronously in the main thread, blocking the render loop. It PASSES in 11s. test_mma_step_mode_sim.py::test_mma_step_mode_approval_flow also uses the async path but is @pytest.mark.skipif(not RUN_MMA_INTEGRATION) - skipped by default. Would likely also crash if unsuppressed. All other MockProvider tests short-circuit at ai_client.send and never spawn a subprocess. The crash is on the MAIN thread (1.94 MB stack, verified via kernel32.GetCurrentThreadStackLimits), not the io_pool worker (which has 8MB after threading.stack_size(8MB) patch). The main thread's imgui-bundle render loop runs concurrently with the io_pool worker's subprocess.Popen / process.communicate. The accumulated imgui-bundle C++ frames exhaust the main thread's 1.94 MB stack. This explains: - Why bumping io_pool stack to 8MB doesn't help (the patch can't reach the main thread, which was created before any sitecustomize runs). - Why the standalone subprocess call works (no render loop concurrent). - Why the no-click baseline survives 60s (no AI call to trigger the race). Next step: capture a Windows crash dump via procdump or cdb.exe to confirm the crashing thread is the main thread and identify the specific imgui-bundle C++ stack frame.	2026-06-17 12:58:15 -04:00
ed	86fc1c5477	Merge branch 'master' of C:\projects\manual_slop into tier2/send_result_to_send_20260616	2026-06-17 02:00:56 -04:00
ed	abf92a8b31	feat(tier2): add fetch_tier2_branch.ps1 - bridge from sandbox to main repo The Tier 2 sandbox blocks git push (and all other destructive git ops). After Tier 2 finishes a track, this script is the bridge: it fetches the tier2/<track> branch from the sandboxed clone (C:\projects\manual_slop_tier2) into the main repo (C:\projects\manual_slop), creating a local review/<track> branch so the working tree is untouched. Usage: pwsh -File scripts\\tier2\\fetch_tier2_branch.ps1 -TrackName send_result_to_send_20260616 Supports -WhatIf for dry-run. Does NOT push to origin (user's call).	2026-06-17 01:52:04 -04:00
ed	219b653a45	docs(tier2): add track completion report (final verification + handoff) End-of-track report following the same format as TRACK_COMPLETION_tier2_autonomous_sandbox_20260616.md. Documents: - 24-commit inventory (10 atomic renames + 14 plan/script commits) - All 6 phases completed, all 9 verification flags = true - Pre-existing failures (7 tests, all credentials.toml, confirmed against origin/master baseline where they also fail) - 2 surgical doc fixes in error_handling.md (deprecation section + line 204 contradiction) - Sandbox enforcement contracts held (4 of 4 hard bans + 4 of 4 secondary contracts) - User handoff instructions (fetch + diff + merge + per-commit review) The track is the first end-to-end test of the tier2_autonomous_sandbox; this report is the final deliverable for that test.	2026-06-17 01:22:57 -04:00
ed	c0e2051ec9	conductor(plan): Mark Phase 6 complete - all track tasks done Phase 6 tasks (t6_1, t6_2, t6_3) and the phase itself marked completed. All 16 task entries now have status=completed. All 6 phase entries now have status=completed. This is the final state.toml commit for the track.	2026-06-17 01:18:40 -04:00
ed	9a5d3b9c8c	conductor(plan): Mark Task 6.3 complete - register in tracks.md Added entry after the Tier 2 Autonomous Sandbox track (its parent dependency). Status: shipped 2026-06-17. Notes: 6 phases, 10 atomic rename commits, 37 files modified, 0 new/deleted. Test inventory: 100/101 pass in renamed files; 7 broader pre-existing failures all due to missing credentials.toml (confirmed against origin/master).	2026-06-17 01:18:02 -04:00
ed	5a58e1ceaf	conductor(plan): Mark Task 6.2 complete - metadata.json to status=shipped Track marked shipped 2026-06-17. All 6 verification criteria evaluated with PASS/EXCEEDED/READY status and notes. 7 pre-existing test failures documented with root cause and pre_existing_failures_remaining flag. Risk register updated: scope_creep=none, behavior_change=none, doc_drift=medium (error_handling.md deprecation section required surgical rewrite to historical note). No deferred_to_followup_tracks (this track completed cleanly).	2026-06-17 01:16:43 -04:00
ed	aad6deffcb	conductor(plan): Mark Task 6.1 complete - state.toml updated All 16 task entries now have status=completed and commit_sha. All 6 phases marked completed (phase_6 in_progress pending metadata+tracks.md). All 9 verification flags = true. All 6 enforcement_stack flags = true (sandbox contracts exercised). Added [notes] section documenting: - Phase 4 file count discrepancy (22 actual vs 24 spec) - error_handling.md deprecation section replacement - Pre-existing test failures (unrelated to track) - MCP edit_file unreliability + Python fallback	2026-06-17 01:15:33 -04:00
ed	d86131d951	conductor(plan): Mark Task 5.2 + 5.3 complete (Phase 5 verification) Final grep: 0 send_result in active code. 3 historical refs in error_handling.md (intentional, in the 'Historical deprecation' note). Test verification: 100/101 tests pass in the 26 files renamed by this track. 1 pre-existing failure in test_headless_service.py due to missing credentials.toml (verified against origin/master baseline where it also fails - unrelated to the rename).	2026-06-17 01:14:24 -04:00
ed	ea7d794a6b	conductor(plan): Mark Task 5.2 + 5.3 complete (Phase 5 verification done) Final grep: 0 send_result in active code. 3 historical refs in error_handling.md (intentional, in the 'Historical deprecation' note). Test verification: 100/101 tests pass in the 26 files renamed by this track. 1 pre-existing failure in test_headless_service.py due to missing credentials.toml (verified against origin/master baseline where it also fails - unrelated to the rename). 7 broader suite failures all pre-existing (all FileNotFoundError on credentials.toml, confirmed against origin/master baseline). Track verification: - git grep send_result: 0 in active code (3 historical intentional) - Full test suite: matches pre-rename baseline (7 pre-existing failures unrelated to the rename, 0 new regressions)	2026-06-17 01:13:25 -04:00
ed	5cc422b34b	conductor(plan): Mark Task 5.1 complete (Phase 5 docs done)	2026-06-17 00:51:07 -04:00
ed	9b5011231c	docs(ai_client): rename send_result to send in 3 current docs Doc consistency: guide_ai_client.md, guide_app_controller.md, and the error_handling styleguide now reference the new symbol name. Also fixes two consistency issues in error_handling.md introduced by the mechanical rename: 1. The 'Deprecation: send -> send_result' section (lines 623-642) was rewritten as a 'Historical deprecation (added 2026-06-15, reverted 2026-06-16)' note that points to the relevant track specs. 2. Line 204 (the 'Current State Audit' summary for src/ai_client.py) had a self-contradictory claim ('send() is the new public API; send() is @deprecated') after the rename. Updated to describe the canonical public API. Historical archives (conductor/tracks//spec.md, conductor/tracks//plan.md, docs/reports/*) are NOT modified - they document the 2026-06-15 public_api_migration decision and stay as historical record.	2026-06-17 00:50:36 -04:00
ed	d17d8743dd	conductor(plan): Mark Task 4.1 complete (Phase 4 done)	2026-06-17 00:45:44 -04:00
ed	ada9617308	test(ai_client): rename send_result to send in 22 remaining test files Batch rename of 22 test files. 62 references renamed total. The full test suite is now GREEN again, matching the pre-rename baseline from Task 1.1. Pure mechanical rename. No behavior change. Files affected: test_ai_cache_tracking, test_ai_client_cli, test_ai_client_result, test_api_events, test_context_pruner, test_deepseek_provider, test_gemini_cli_* (3 files), test_gui2_mcp, test_headless_* (2 files), test_live_gui_integration_v2, test_orchestration_logic, test_phase6_engine, test_rag_integration, test_run_worker_lifecycle_abort, test_spawn_interception_v2, test_symbol_parsing, test_tier4_interceptor, test_tiered_aggregation, test_token_usage. Note: spec estimated 24 files; actual is 22 (test_deprecation_warnings no longer exists, and 1 fewer file than spec's list). Refs: conductor/tracks/send_result_to_send_20260616/	2026-06-17 00:38:29 -04:00
ed	2f45bc4d68	conductor(plan): Mark Task 3.5 + 3.6 complete (Phase 3 done)	2026-06-17 00:35:32 -04:00
ed	53b35de5c6	conductor(plan): Mark Task 3.4 complete	2026-06-17 00:34:00 -04:00
ed	58fe3a9cb5	conductor(plan): Mark Task 3.3 complete	2026-06-17 00:33:00 -04:00
ed	6dbba46a25	conductor(plan): Mark Task 3.2 complete	2026-06-17 00:31:33 -04:00
ed	f0663fda6a	conductor(plan): Mark Task 3.1 complete	2026-06-17 00:29:54 -04:00
ed	3e2b4f74ba	test(ai_client): rename send_result to send in test_conductor_engine_v2 22 references renamed (mostly monkeypatch.setattr calls + comments). Test file state: GREEN. All 10 tests in this file now pass.	2026-06-17 00:29:21 -04:00
ed	d714d10fd4	conductor(plan): Mark Task 2.1 complete	2026-06-17 00:28:17 -04:00
ed	d87d909f7b	refactor(ai_client): rename send_result to send in 5 src/ call sites Renames 10 references across app_controller, conductor_tech_lead, mcp_client (docstring example), multi_agent_conductor, orchestrator_pm. 5 call sites in ai_client.send_result(...) -> ai_client.send(...) 3 print strings mentioning send_result 1 docstring comment (conductor_tech_lead) 1 docstring example (mcp_client) 'src.ai_client.send_result' -> 'src.ai_client.send' Test suite state: still red, but all src/-level call sites are now renamed. Remaining failures are in test files (mocks and patches that still reference send_result). Refs: conductor/tracks/send_result_to_send_20260616/	2026-06-17 00:27:47 -04:00
ed	4a59567939	conductor(plan): Mark Task 1.1 complete	2026-06-17 00:26:05 -04:00
ed	5351389fc0	refactor(ai_client): rename send_result to send (the impl, TDD red moment) The TDD red moment. The implementation is renamed but the call sites in src/, tests/, and docs still use send_result. Subsequent commits rename the call sites and progressively move the test suite back to green. 10 references renamed in src/ai_client.py: - 4 'Called by: send_result' docstring tags in private provider helpers - 1 function definition (def send_result -> def send) - 1 [C: ...] SDM tag referencing test function names - 2 monitor component names (start_component / end_component) - 2 error source strings (CONFIG + INTERNAL) Also adds scripts/tier2/apply_t1_1_edits.py - the helper script that applied the 10 edits. Kept in scripts/tier2/ as a record of the mechanical change pattern. Refs: conductor/tracks/send_result_to_send_20260616/	2026-06-17 00:23:16 -04:00
ed	cba5457b9d	feat(tier2): add run_tier2_sandboxed.ps1 launcher with restricted token (skeleton)	2026-06-16 19:49:47 -04:00
ed	a9be60ae50	feat(tier2): add setup_tier2_clone.ps1 bootstrap script with -WhatIf support	2026-06-16 19:47:06 -04:00
ed	796da0de60	feat(tier2): add run_track.py CLI with init/status/report modes + git fetch/switch	2026-06-16 19:27:08 -04:00
ed	73ab2778ca	feat(report): implement write_failure_report + 8 tests, 100% coverage	2026-06-16 19:13:30 -04:00
ed	190766fe03	feat(failcount): add default failcount.toml thresholds	2026-06-16 19:01:31 -04:00
ed	fc92e1aa74	feat(failcount): add FailcountState + FailcountConfig dataclasses + all stub functions	2026-06-16 18:59:38 -04:00
ed	9f2ff29c2e	feat(tier2): create scripts/tier2/ package	2026-06-16 18:57:09 -04:00
ed	b90d4bdd4e	feat(scripts): add --ci alias for --strict + CI-gate doc updates	2026-06-16 10:40:21 -04:00
ed	4521a7df96	feat(scripts): add --summary and --by-size modes to exception_handling audit	2026-06-16 09:41:20 -04:00
ed	9a04153abd	feat(scripts): add exception_handling audit script (10-category classification)	2026-06-16 09:06:25 -04:00
ed	125a226525	was called rest	2026-06-15 20:10:18 -04:00
ed	48b47d250c	oops	2026-06-15 20:04:35 -04:00
ed	4419922bce	review batch script	2026-06-15 20:02:36 -04:00
ed	c00161a13d	Adjust audi_line_count.py to take into account doc strings	2026-06-12 22:47:58 -04:00
ed	1577cca568	fix(audit): remove stale 'gemini_native' from deferred-vendors exclusion The previous exclusion list had 'gemini_native' which is NOT a real function name in src/ai_client.py. The actual function is _send_gemini_cli (already migrated to run_with_tool_loop via send_func + on_pre_dispatch in commit `4748d134`). The current deferred vendors are now correctly: - anthropic (uses anthropic SDK) - gemini (uses google-genai streaming) - deepseek (uses requests.post) These will be addressed in Phase 5 t5_6/7/8. When those ship, the DEFERRED_VENDORS frozenset should be emptied so the audit gates the migration. Verified: script still passes; gemini_cli's run_with_tool_loop usage is detected correctly.	2026-06-11 21:30:04 -04:00
ed	be5056051a	feat(audit): add scripts/audit_providers_source_of_truth.py Phase 2 task 2.4 (the script part). The script enforces: PROVIDERS is declared as a literal only in src/ai_client.py. The __getattr__ re-export in src/models.py is allowed (it lazy-imports, not a literal declaration). Catches the literal pattern 'PROVIDERS: List[str] = [' specifically, which the __getattr__ re-export does not match. OK: passes against current state where PROVIDERS is declared only in src/ai_client.py:56.	2026-06-11 16:44:59 -04:00
ed	7e4503f4e8	feat(audit): add scripts/audit_no_inline_tool_loops.py + state.toml Phase 1 progress Task 1.8 (the plan's numbering: 'Add audit script'). Audit checks that no _send_<vendor> in src/ai_client.py contains an inline 'for round_idx in range(MAX_TOOL_ROUNDS' loop. The audit excludes the 4 vendored-call-path vendors (anthropic, gemini, gemini_native, deepseek) which are documented in state.toml's deferred_work section as future work (they use their own SDKs and need separate per-vendor conversion to OpenAICompatibleRequest). state.toml: - t1_7 (Apply to 4 inline-loop vendors): completed for _send_gemini_cli only. Anthropic + Gemini + DeepSeek deferred. - t1_8 (Add audit script): in_progress. - t1_7 reuses commit `4748d134` (the send_func + on_pre_dispatch refactor that introduced the new helper pattern for vendored call paths). OK: audit passes against the current 4 OpenAI-compat vendors (minimax, grok, llama, qwen still uses _dashscope_call but has no inline loop) + gemini_cli.	2026-06-11 16:17:23 -04:00
ed	749120d239	feat(audit): flag hardcoded workspace and project-root paths in tests	2026-06-09 17:01:14 -04:00
ed	488ae04459	fix(run_tests_batched): detect batch failure from output when proc.returncode is wrong	2026-06-08 02:03:50 -04:00
ed	5c6eb620a1	fix(run_tests_batched): colorize non-xdist format (tests/... STATUS), filter 'Error during log pruning' noise	2026-06-08 01:54:56 -04:00
ed	272b7841ae	fix(run_tests_batched): filter xdist scheduling queue output (test paths without status prefix)	2026-06-08 01:51:07 -04:00
ed	a2d16541d0	fix(run_tests_batched): keep pytest's full -v output, only filter LogPruner/win errors, colorize per-test status	2026-06-08 01:49:39 -04:00
ed	21cb57b31d	fix(run_tests_batched): graceful xdist fallback, live progress streaming, ANSI colors, absolute default paths	2026-06-08 01:28:53 -04:00
ed	50f26f0d5c	chore: delete legacy run_tests_batched.py (was preserved for one cycle)	2026-06-08 01:15:12 -04:00
ed	e6ad2ecda2	chore: preserve old run_tests_batched.py as .legacy for one cycle	2026-06-08 00:59:49 -04:00

1 2 3 4 5