manual_slop

Private

Public Access

Author	SHA1	Message	Date
ed	7ea802ab80	refactor(orchestrator_pm): migrate to send_result() (G2, public_api_migration_and_ui_polish_20260615 Phase 1.2) Replaces deprecated ai_client.send(md_content='', user_message=user_message, enable_tools=False) with ai_client.send_result(...) and branches on result.ok. On error, logs the ui_message() and returns [] (the function returns a list of track definitions or [] on failure). The 3 tests in test_orchestrator_pm.py + 1 in test_orchestrator_pm_history.py now fail because they mock src.ai_client.send. These will be fixed in Phase 2.14-2.15 by mocking send_result instead.	2026-06-15 15:57:00 -04:00
ed	bbb3d59712	refactor(conductor_tech_lead): migrate to send_result() (G1, public_api_migration_and_ui_polish_20260615 Phase 1.1) Replaces deprecated ai_client.send(md_content='', user_message=user_message) with ai_client.send_result(...) and branches on result.ok. On error, logs the ui_message() and returns None (the function returns a list of ticket definitions or None on failure). The previous code called the @deprecated send() shim which silently returns '' on error. The empty string would then be passed to json.loads, causing JSONDecodeError and 3 retry attempts. The new code short-circuits on the first error and returns None immediately. This is the easiest of the 3 production migrations (2-arg call with no callbacks). See plan.md Phase 1.1. Test fixes for the production-affected mocks in test_conductor_tech_lead.py and test_orchestration_logic.py are in Phase 2.12 and Phase 2.13. NOTE: 4 tests now fail (3 in test_conductor_tech_lead.py + 1 in test_orchestration_logic.py) because they mock src.ai_client.send. These will be fixed in Phase 2.12/2.13 by mocking send_result instead.	2026-06-15 15:53:08 -04:00
ed	bb3b3056b4	conductor(plan): add 7 production-affected test mock files to Phase 2 The original Phase 2 covered 12 test files that call ai_client.send(...). Phase 1.1 implementation revealed 7 additional test files that mock ai_client.send (via patch()) for tests of the production code paths. When production migrates to send_result(), these mocks receive 0 calls and the tests fail with 'send was called 0 times'. Adding Phase 2.12-2.18 to cover: - test_conductor_tech_lead.py (3 mocks; breaks after Phase 1.1) - test_orchestration_logic.py (1 mock; breaks after Phase 1.1) - test_orchestrator_pm.py (3 mocks; pre-empt Phase 1.2) - test_orchestrator_pm_history.py (1 mock; pre-empt Phase 1.2) - test_phase6_engine.py (1 mock; pre-empt Phase 1.3) - test_run_worker_lifecycle_abort.py (1 mock; pre-empt Phase 1.3) - test_spawn_interception_v2.py (1 mock; pre-empt Phase 1.3) test_rag_integration.py mock migration deferred to RAG track (OOS1). Also adds state.toml for the track (7 phases, 28 tasks, audit fields).	2026-06-15 15:50:56 -04:00
ed	0c9086afda	conductor: register public_api_migration_and_ui_polish_20260615 in tracks.md + update UI Polish row	2026-06-15 15:27:04 -04:00
ed	55ff733df5	conductor(track): metadata.json for public_api_migration_and_ui_polish_20260615	2026-06-15 15:24:46 -04:00
ed	8ab71035d5	conductor(track): plan for public_api_migration_and_ui_polish_20260615 (7 phases, 28 tasks)	2026-06-15 15:23:19 -04:00
ed	3febdab42c	conductor(track): spec for public_api_migration_and_ui_polish_20260615 (3 prod + 12 test migrations + 2 UI Polish test fixes)	2026-06-15 15:20:44 -04:00
ed	431ebce2b9	completion report	2026-06-15 14:57:08 -04:00
ed	a8c8125118	conductor(track): mark doeh_test_thinking_cleanup_20260615 as completed	2026-06-15 14:49:59 -04:00
ed	cf5fdd3d62	docs(ai_client): add 2 follow-up notes for doeh_test_thinking_cleanup_20260615	2026-06-15 14:48:38 -04:00
ed	6edeb2b5a9	conductor(state): fix duplicate keys in ai_loop_regressions_20260614 state.toml	2026-06-15 14:29:07 -04:00
ed	e4a8a0bca1	test(thinking_trace): add test for <think> half-width marker (doeh cleanup Phase 4.2)	2026-06-15 14:26:32 -04:00
ed	4e97156e77	fix(thinking_parser): add <think> (half-width) marker support (doeh cleanup Phase 4.1)	2026-06-15 14:25:54 -04:00
ed	cb985f08ed	test(gemini): add regression tests for thinking-format extraction (doeh cleanup Phase 3.1)	2026-06-15 14:15:52 -04:00
ed	e9abadc867	fix(ai_client): extract Gemini thought=True parts and wrap in <thinking> tags for parse_thinking_trace	2026-06-15 14:10:43 -04:00
ed	81882c398e	test(headless_service): adapt test_generate_endpoint to send_result (doeh cleanup Phase 2.5)	2026-06-15 13:57:47 -04:00
ed	9e89d52607	test(ai_client_tool_loop): adapt mock to return Result[NormalizedResponse] (doeh cleanup Phase 2.4)	2026-06-15 13:54:57 -04:00
ed	dbdf9ba9e1	test(llama_native): adapt 4 tests to Result API (doeh cleanup Phase 2.3)	2026-06-15 13:52:38 -04:00
ed	439a0ac074	test(llama): adapt 3 tests to Result API (doeh cleanup Phase 2.2)	2026-06-15 13:25:31 -04:00
ed	d7e42a4a3d	test(grok): adapt 2 tests to Result API (doeh cleanup Phase 2.1)	2026-06-15 13:04:45 -04:00
ed	27d7a04fd3	conductor(plan): Mark Phase 1 (G1 critical regression fix) complete	2026-06-15 12:58:34 -04:00
ed	7b323e3e5f	fix(app_controller): restore context_to_send definition in _api_generate (CRITICAL regression from ai_loop_regressions_20260614)	2026-06-15 12:54:11 -04:00
ed	6f4bd75ef9	conductor: register doeh_test_thinking_cleanup_20260615 in tracks.md + mark ai_loop_regressions_20260614 shipped	2026-06-15 12:22:56 -04:00
ed	88bf04eb3d	conductor(track): metadata.json for doeh_test_thinking_cleanup_20260615	2026-06-15 12:21:16 -04:00
ed	304f469663	conductor(track): plan for doeh_test_thinking_cleanup_20260615 (TDD-style, 5 phases, 16 tasks)	2026-06-15 12:20:06 -04:00
ed	925e366cdd	conductor(track): spec for doeh_test_thinking_cleanup_20260615 (1 critical regression + 11 test mocks + 2 deferred bugs)	2026-06-15 12:17:51 -04:00
ed	515ef933a1	docs(report): add track completion report for ai_loop_regressions_20260614 In-depth handoff for Tier 1 review covering: - Executive summary with TL;DR - Goal & scope (planned vs delivered) - Per-phase delivery summary - Test coverage analysis (7 new + 2 adapted + 2 smoke) - Deferred items documentation (3 cross-references) - Pre-existing failures (14, verified not caused by this track) - Plan deviations (6 items, with rationale) - Post-ship risk register - Commit inventory with diff stat - 7 recommendations for the Tier 1 reviewer - Handoff checklist Working tree was clean before adding the report (no other changes to commit).	2026-06-15 11:32:33 -04:00
ed	e6afefdc66	conductor(plan): mark track complete (all 5 phases, 17 tasks done)	2026-06-15 11:25:32 -04:00
ed	010752229b	conductor(track): mark ai_loop_regressions_20260614 as completed Updates status: active -> completed, adds completed_at date, updates verification_criteria with the actual verification results. 7 regression tests pass; 14 pre-existing failures (parent track's state.toml [regressions_20260612]) are not caused by these changes.	2026-06-15 11:24:43 -04:00
ed	2489e3215b	docs(ai_client): add 2 follow-up notes for ai_loop_regressions_20260614 Adds 3 entries to the See Also section: 1. Gemini / Gemini CLI thinking-format compatibility (deferred from ai_loop_regressions_20260614) - investigate empirically 2. <think> (half-width) marker support in thinking_parser (deferred) 3. Public API Result Migration (planned, separate track public_api_migration_20260606) Each entry links to the corresponding spec section for traceability.	2026-06-15 11:21:58 -04:00
ed	10046293ae	test(ai_loop): add live_gui smoke test for FR3 thinking substrate (Phase 4.3) Mirrors the FR1 live_gui smoke test: the full end-to-end live_gui FR3 test would require mock injection into the live_gui subprocess. The mock-based regression coverage for FR3 is already in test_ai_loop_regressions_20260614.py::test_fr3_minimax_thinking_in_returned_text. This smoke test verifies the disc_entries field is exposed via the Hook API, establishing the integration substrate for follow-up work.	2026-06-15 11:04:46 -04:00
ed	5f4c347824	conductor(plan): mark Phase 4 (FR3 fix) complete	2026-06-15 10:58:45 -04:00
ed	f4a782d99f	fix(ai_loop): wrap MiniMax reasoning in <thinking> tags for parse_thinking_trace (FR3, Bug #3 ) Adds a new wrap_reasoning_in_text: bool = False keyword argument to run_with_tool_loop. When True and reasoning_content is non-empty, the returned text is prepended with <thinking>...</thinking> tags so thinking_parser.parse_thinking_trace can extract a ThinkingSegment for the discussion entry. The wrap is conditional (default False) so it doesn't break providers that already wrap inline (e.g. DeepSeek, which wraps at line 2117-2118 before run_with_tool_loop sees the response). _send_minimax now passes wrap_reasoning_in_text=bool(caps.reasoning). When caps.reasoning is True (M2.5/M2.7), the reasoning is wrapped in <thinking> tags. When False (M2/M2.1), the parameter is False and no wrap happens (avoids useless getattr on non-reasoning models). Also fixes a bug in the test_fr3_minimax_thinking_in_returned_text test mock: it was returning a raw MagicMock instead of a Result object, which caused the test to see auto-created MagicMock attributes instead of the expected text. Now wraps in Result(data=MagicMock(...)) and sets ai_client._model to ensure get_capabilities('minimax', _model) resolves to the M2.7 capabilities (reasoning=True).	2026-06-15 10:56:24 -04:00
ed	722b09b99b	conductor(plan): mark Phase 3 (FR2 fix) complete	2026-06-15 10:28:26 -04:00
ed	2b7b571a64	fix(ai_loop): replace dead ProviderError except clauses with send_result() pattern (FR2, Bug #1 ) Replaces 3 dead 'except ai_client.ProviderError' clauses (the class was removed in commit `64b787b8`) with the new send_result() + result.ok pattern. Removes the inner try/except block entirely (replaced by 'if not result.ok: raise HTTPException(502, ...)'). Sites fixed: - _api_generate: send() -> send_result() + result.ok branch - _handle_request_event (already fixed in FR1 commit `24ba2499`) AST scan via test_fr2_no_provider_error_in_source now passes: zero remaining references to ai_client.ProviderError in src/app_controller.py. The single remaining 'except Exception as e: import traceback; traceback.print_exc(); raise HTTPException(500, str(e))' is the legitimate outer except for unexpected in-flight errors. Added a one-line comment per the plan referencing the data-oriented error handling styleguide, so future migrations follow the same pattern.	2026-06-15 10:27:51 -04:00
ed	95288e4cb2	conductor(plan): mark Phase 2 (FR1 fix) complete	2026-06-15 09:42:44 -04:00
ed	2d1ff9e433	test(ai_loop): add live_gui smoke test for FR1 substrate (Phase 2.2) The full end-to-end live_gui FR1 test would require mock injection into the live_gui subprocess (patches in the test process do NOT propagate). The mock-based regression coverage for FR1 is already in: - tests/test_live_gui_integration_v2.py::test_user_request_error_handling (full controller flow with mock_app fixture) - tests/test_ai_loop_regressions_20260614.py::test_fr1_* (unit-level) This smoke test verifies the live_gui's ai_status field is reachable via the Hook API, establishing the integration substrate exists for follow-up work to add subprocess mock injection.	2026-06-15 09:41:39 -04:00
ed	25112f4157	test(live_gui): adapt test_user_request_* to new send_result() flow The 2 tests in test_live_gui_integration_v2.py were mocking the old ai_client.send() and asserting on the old error format. The FR1 fix migrated _handle_request_event to ai_client.send_result() and routes errors via ErrorInfo.ui_message() instead of f'ERROR: {e}'. Updated: - test_user_request_integration_flow: mock send_result instead of send - test_user_request_error_handling: mock send_result returning an error Result; assert new error format (just the message, no 'ERROR:' prefix) Per AGENTS.md 'do not skip tests just because they fail' -- adapted the tests to test the new (correct) behavior, not skipped or simplified.	2026-06-15 09:25:50 -04:00
ed	24ba249901	fix(ai_loop): route send_result() errors to Discussion Hub as error entries (FR1, Bug #2 ) Replaces deprecated ai_client.send() in _handle_request_event with send_result() and branches on result.ok. On error, the first ErrorInfo is routed to the event_queue as a 'response' with status='error', allowing _on_comms_entry to add it to the discussion history. The previous code called the @deprecated send() shim which silently returns '' on error. The empty string was then filtered out by _on_comms_entry (text_content.strip() check at line 3801), so users saw no discussion entry for failed AI requests. This also removes the dead 'except ai_client.ProviderError' clause at line 3692 (the class was removed in commit `64b787b8`). The 2 remaining dead clauses at lines 305, 313 are fixed in the next commit (FR2).	2026-06-15 09:22:47 -04:00
ed	9b280a43fb	conductor(plan): mark Phase 1 (TDD red) complete	2026-06-15 09:20:41 -04:00
ed	44dc90bca8	test(ai_loop): add FR1/FR2/FR3 tests for ai_loop_regressions_20260614 (TDD red) 3 bug groups, all reproducing documented regressions: - test_fr1_: error response becomes a discussion entry (Bug #2) - test_fr2_: no ProviderError references in src/app_controller.py (Bug #1) - test_fr3_*: MiniMax thinking mono rendering in returned text (Bug #3) 4 critical tests fail for the documented reasons; 3 sanity checks pass.	2026-06-15 09:18:07 -04:00
ed	52c01c6cbc	config	2026-06-15 09:01:53 -04:00
ed	f4c497b1e8	conductor: register ai_loop_regressions_20260614 in tracks.md (priority A, ready for Tier 2)	2026-06-15 00:48:12 -04:00
ed	acc294ae4e	conductor(track): metadata.json for ai_loop_regressions_20260614	2026-06-15 00:44:52 -04:00
ed	884e40b9d1	conductor(track): plan for ai_loop_regressions_20260614 (TDD-style, 5 phases, 17 tasks)	2026-06-15 00:41:57 -04:00
ed	7a4dcc9690	conductor(track): spec for ai_loop_regressions_20260614 (MiniMax/Gemini/Gemini CLI/DeepSeek)	2026-06-15 00:33:04 -04:00
ed	74e02485a1	files & media ux improvemetn with directory folding and file name vis	2026-06-14 23:29:43 -04:00
ed	ae8d01d0f7	add missing region start comment.	2026-06-14 22:43:55 -04:00
ed	2d51199699	fix(regression): for adding files in the files & media panel.	2026-06-14 22:43:42 -04:00
ed	dcdcaa92f6	tiny	2026-06-13 20:50:36 -04:00

1 2 3 4 5 ...

3222 Commits