manual_slop

Private

Public Access

Author	SHA1	Message	Date
ed	e40b122b1b	test(ai_client): delete obsolete test_deprecation_warnings.py (Phase 6.2) Per plan Task 6.3: both tests in test_deprecation_warnings.py are obsolete after the send() function was removed in Phase 6.1: - test_send_deprecated_warning_emitted_once_per_site: literally cannot run without ai_client.send (AttributeError) - test_send_result_does_not_emit_deprecation: trivially true after send() is removed (no deprecation source) The test_send_result_does_not_emit_deprecation regression test is preserved in tests/test_ai_client_result.py (added in Phase 2.7 as the renamed test). The pre-Phase-2.7 test_send_deprecated_emits_warning was deleted in Phase 2.7. Verification: pytest tests/test_deprecation_warnings.py reports 'ERROR: file or directory not found'.	2026-06-15 18:53:02 -04:00
ed	c50367c6d5	test(log_management_refresh): use rfind() to locate code (Phase 5.2, fixes 1 pre-existing failure) The test used src.find() which locates the first occurrence of 'Refresh Registry' in the comment block (line 2090 in src/gui_2.py), not the actual code (line 2111). The 400-char snippet window doesn't reach the code, so the assertion for 'load_registry' fails. Production code is already correct (in-place load_registry()) at src/gui_2.py:2111-2112 (user commit `df7bda6e`). This test just needs to use rfind() to locate the actual code, not the comment. Change: src.find(marker) -> src.rfind(marker) 1 test passes (was 1 pre-existing failure).	2026-06-15 18:27:40 -04:00
ed	f663a34f52	test(discussion_truncate): use rfind() to locate code (Phase 5.1, fixes 1 pre-existing failure) The test used src.find() which locates the first occurrence of 'Keep Pairs:' in the comment block (line 5113 in src/gui_2.py), not the actual code (line 5130). The 200-char snippet window only reaches the comment, so the assertions for set_next_item_width(140) and drag_int fail. Production code is already correct (set_next_item_width(140) + drag_int) at src/gui_2.py:5130-5131 (user commit `d0b06575`). This test just needs to use rfind() to locate the actual code, not the comment. Change: src.find(marker) -> src.rfind(marker) 1 test passes (was 1 pre-existing failure).	2026-06-15 18:21:58 -04:00
ed	effa24a7ae	test(symbol_parsing): mock send_result not send (Phase 4, fixes 2 pre-existing failures) The 2 tests in test_symbol_parsing.py mock src.ai_client.send but production now uses send_result (migrated by doeh_test_thinking_cleanup_20260615 commit `24ba2499`). Mocks receive 0 calls; tests fail with "send was called 0 times". Changes: - Replace patch(src.ai_client.send) with patch(src.ai_client.send_result) - Rename mock_send to mock_send_result - Set return_value=Result(data="mocked response") - Add "from src.result_types import Result" import All 2 tests in test_symbol_parsing.py pass (were 2 pre-existing failures).	2026-06-15 18:20:00 -04:00
ed	3be28cc524	test(qwen): adapt 2 tests to Result API (Phase 3, fixes 2 pre-existing failures) The _send_qwen() function returns Result[str] after the data_oriented_error_handling_20260606 refactor (commit `64d6ba2d`), but 2 tests in test_qwen_provider.py were asserting against the raw str type. They were 2 of the 10 pre-existing failures documented in the track spec. Changes (mirrors the doeh_test_thinking_cleanup_20260615 pattern for grok/llama/llama_native): - Replace assert result == "hi from qwen" with assert result.ok and result.data == "hi from qwen" - Replace assert "cat" in result.lower() with assert result.ok and "cat" in result.data.lower() - Add "from src.result_types import Result" import All 5 tests in test_qwen_provider.py now pass (was 3/5).	2026-06-15 18:05:45 -04:00
ed	4592618372	fix(orchestration_logic): migrate test_run_worker_lifecycle_blocked mock (Phase 2 follow-up) Phase 2.13 missed the test_run_worker_lifecycle_blocked test in test_orchestration_logic.py - it also mocked src.ai_client.send. The test was failing with "Worker send_result failed for T1: ... [Errno 2] No such file or directory: .beads_mock/beads.json" because the unmocked send_result fell through to the real provider which tried to read beads.json. Changes: - Replace patch(src.ai_client.send) with patch(src.ai_client.send_result) - Wrap mock return_value with Result(data="BLOCKED because of missing info") All 8 tests in test_orchestration_logic.py now pass.	2026-06-15 17:45:18 -04:00
ed	36962ef6b6	test(tier4_interceptor): migrate to send_result() (Phase 2.11) The test_ai_client_passes_qa_callback test calls ai_client.send() with qa_callback=lambda. The qa_callback is passed through to the provider function (_send_gemini). Per plan note: the test has complex callback setup; the Result handling needs the mock to return Result(data="ok") so the qa_callback passes through and the test succeeds. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert result.ok - Mock _send_gemini to return Result(data="ok") instead of relying on the default (which would call the real provider) - Add "from src.result_types import Result" import 7 tests pass (the migrated test_ai_client_passes_qa_callback was previously broken because the send() call hit the real provider and either failed or returned empty; the mock now provides a clean response).	2026-06-15 17:27:31 -04:00
ed	cfeb3cb3e0	test(gemini_cli_integration): migrate 2 sites to send_result() (Phase 2.10) Changes: - Rename ai_client.send(...) to ai_client.send_result(...) (2 sites) - Add assert result.ok (1 site; the second test only checks result is not None) - Add "from src.result_types import Result" import 2 tests pass.	2026-06-15 17:07:20 -04:00
ed	363fe91db0	test(deepseek): migrate 6 sites to send_result() (Phase 2.9) All 6 sites in test_deepseek_provider.py call ai_client.send(...). Each assertion pattern is slightly different (==, "in", call_args inspection); migration follows the same pattern: rename to send_result(), add assert result.ok, and use result.data for the response text. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) (6 sites) - Add assert result.ok (6 sites) - Replace result == "x" with result.data == "x" (or "x" in result.data) - Add "from src.result_types import Result" import 7 tests pass (1 unrelated test_deepseek_model_selection + 6 migrated).	2026-06-15 16:59:46 -04:00
ed	d9a79efa25	test(api_events): migrate 2 sites to send_result() (Phase 2.8) The test_send_emits_events_proper and test_send_emits_tool_events tests both call ai_client.send(). Migrating to send_result() + assert result.ok. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) (2 sites) - Add assert result.ok (2 sites) - Add "from src.result_types import Result" import 4 tests pass.	2026-06-15 16:57:53 -04:00
ed	0192978646	test(ai_client_result): migrate to send_result(); drop test_send_deprecated (Phase 2.7) Per plan Task 2.7: - DELETE test_send_deprecated_emits_warning (obsolete after Phase 6; send() is being removed) - RENAME test_send_extracts_data_from_result -> test_send_result_does_not_emit_deprecation (this is the regression test the plan said to KEEP; it now asserts the new API does not emit a deprecation warning, instead of testing the old behavior) - MIGRATE test_send_extracts_data_from_result (renamed to the above) - MIGRATE test_send_returns_empty_string_on_error_result -> test_send_result_returns_empty_data_with_error_on_auth_failure (asserts the Result has data="" and not ok) 5 tests pass (down from 6; the deleted test removed 1; the renamed test_send_extracts_data_from_result became test_send_result_does_not_emit_deprecation).	2026-06-15 16:55:30 -04:00
ed	1e2c34313c	test(token_usage): migrate to send_result() (Phase 2.6) The test_token_usage_tracking test calls ai_client.send() and verifies the comms log entry. Migrating to send_result() + assert result.ok. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert result.ok - Add "from src.result_types import Result" import 1 test passes.	2026-06-15 16:51:24 -04:00
ed	c59bac59f2	test(gui2_mcp): migrate to send_result() (Phase 2.5) The test_mcp_tool_call_is_dispatched test calls ai_client.send() and asserts the MCP dispatch function was called. Migrating to send_result() + assert result.ok. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert result.ok - Add "from src.result_types import Result" import 1 test passes.	2026-06-15 16:49:11 -04:00
ed	fe52024311	test(gemini_cli_parity_regression): migrate to send_result() (Phase 2.4) The test_send_invokes_adapter_send test calls ai_client.send() and asserts the return value. Migrating to send_result() with assert res.ok and res.data == "Hello from mock adapter". Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert res.ok before accessing res.data - Add "from src.result_types import Result" import 1 test passes.	2026-06-15 16:39:31 -04:00
ed	b4c9ebd963	test(gemini_cli_edge_cases): migrate to send_result() (Phase 2.3) The test_gemini_cli_loop_termination test calls ai_client.send() and asserts the return value. Migrating to send_result() with assert result.ok and result.data == "Final answer". Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert result.ok before accessing result.data - Add "from src.result_types import Result" import 3 tests pass.	2026-06-15 16:31:26 -04:00
ed	fab9196bea	test(ai_cache_tracking): migrate to send_result() (Phase 2.2) The test calls ai_client.send() but does not check the return value - it only verifies the side effect on gemini cache stats. Migrating to send_result() and asserting result.ok is enough. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert result.ok (the return value is unused) - Add "from src.result_types import Result" import 2 tests pass.	2026-06-15 16:28:20 -04:00
ed	ba0df1fa95	test(ai_client_cli): migrate to send_result() (Phase 2.1) Replaces the deprecated ai_client.send() call with ai_client.send_result() in the test. The mock for GeminiCliAdapter is unchanged (it is patched to return a dict that send_result unwraps internally). Changes: - Rename response = ai_client.send(...) to result = ai_client.send_result(...) - Add assert result.ok before accessing result.data - Add "from src.result_types import Result" import 1 test passes.	2026-06-15 16:26:06 -04:00
ed	16c6705b80	test(spawn_interception_v2): mock send_result not send (Phase 2.18, pre-empts Phase 1.3 regression) Phase 1.3 migrated run_worker_lifecycle to send_result(). The mock_ai_client fixture in test_spawn_interception_v2.py mocked src.ai_client.send and returned a string. The test_run_worker_lifecycle_approved test asserts on the call_args (user_message + md_content), which still works with the new mock name. Changes: - Replace patch(src.ai_client.send) with patch(src.ai_client.send_result) - Rename mock_send to mock_send_result - Wrap mock return_value with Result(data="Task completed") - Add "from src.result_types import Result" import All 3 tests in test_spawn_interception_v2.py pass.	2026-06-15 16:24:05 -04:00
ed	7a6ffd8954	test(run_worker_lifecycle_abort): mock send_result not send (Phase 2.17, pre-empts Phase 1.3 regression) Phase 1.3 migrated run_worker_lifecycle to send_result(). This test mocks src.ai_client.send and asserts it is NOT called (abort fires before the AI dispatch). Migrating the mock to send_result is purely for consistency and future-proofing; the test still passes either way. Changes: - Rename patch(src.ai_client.send) to patch(src.ai_client.send_result) - Rename mock_send to mock_send_result - Comment updated to reference send_result	2026-06-15 16:21:08 -04:00
ed	bb2add1249	test(phase6_engine): mock send_result not send (Phase 2.16, pre-empts Phase 1.3 regression) Phase 1.3 migrated src/multi_agent_conductor.py:591 (run_worker_lifecycle) to send_result(). The test_worker_streaming_intermediate test mocked src.ai_client.send, which would break once Phase 1.3 was applied. (Confirmed: test failed after Phase 1.3 commit.) Changes: - Replace patch(src.ai_client.send) with patch(src.ai_client.send_result) - Rename mock_send to mock_send_result - Wrap mock side_effect return with Result(data="DONE") - Add "from src.result_types import Result" import All 3 tests in test_phase6_engine.py pass.	2026-06-15 16:16:53 -04:00
ed	499762d8f0	test(orchestrator_pm_history): mock send_result not send (Phase 2.15, pre-empts Phase 1.2 regression) Phase 1.2 migrated src/orchestrator_pm.py:86 to send_result(). The test_generate_tracks_with_history test mocked src.ai_client.send, which would break once Phase 1.2 was applied. (Confirmed: test failed after Phase 1.2 commit.) Changes: - Replace @patch(src.ai_client.send) with @patch(src.ai_client.send_result) - Rename mock_send to mock_send_result - Wrap mock return_value with Result(data="[]") - Add "from src.result_types import Result" import All 3 tests in test_orchestrator_pm_history.py pass.	2026-06-15 16:15:06 -04:00
ed	e4a2a20469	test(orchestrator_pm): mock send_result not send (Phase 2.14, pre-empts Phase 1.2 regression) Phase 1.2 migrated src/orchestrator_pm.py:86 to send_result(). The 3 tests in TestOrchestratorPM mocked src.ai_client.send, which would break once Phase 1.2 was applied. (Confirmed: tests failed after Phase 1.2 commit.) Changes: - Replace @patch(src.ai_client.send) with @patch(src.ai_client.send_result) - Rename mock_send to mock_send_result throughout - Wrap mock return_value with Result(data=json.dumps(...)) - Add "from src.result_types import Result" import All 3 tests pass.	2026-06-15 16:10:47 -04:00
ed	953689c8b3	test(orchestration_logic): mock send_result not send (Phase 2.13, fixes Phase 1.1 regression) Phase 1.1 + 1.2 migrated the production code to send_result(). The test_generate_tracks and test_generate_tickets tests mocked src.ai_client.send, causing "send was called 0 times" failures. Changes: - Replace patch(src.ai_client.send) with patch(src.ai_client.send_result) - Wrap mock return_value with Result(data=mock_response) - Add "from src.result_types import Result" import All 8 tests in tests/test_orchestration_logic.py pass (2 migrated + 6 unaffected tests).	2026-06-15 16:08:04 -04:00
ed	488254527c	test(conductor_tech_lead): mock send_result not send (Phase 2.12, fixes Phase 1.1 regression) Phase 1.1 migrated src/conductor_tech_lead.py:68 from ai_client.send() to ai_client.send_result(). The 3 tests in TestConductorTechLead mocked src.ai_client.send which is no longer called by the production code, causing "send was called 0 times" failures. Changes: - Replace patch("src.ai_client.send") with patch("src.ai_client.send_result") - Wrap mock return_value with Result(data=...) and mock side_effect with Result(data=...) values - Add "from src.result_types import Result" import All 9 tests in tests/test_conductor_tech_lead.py pass (3 migrated + 6 unaffected topological sort tests).	2026-06-15 16:06:17 -04:00
ed	e4a8a0bca1	test(thinking_trace): add test for <think> half-width marker (doeh cleanup Phase 4.2)	2026-06-15 14:26:32 -04:00
ed	cb985f08ed	test(gemini): add regression tests for thinking-format extraction (doeh cleanup Phase 3.1)	2026-06-15 14:15:52 -04:00
ed	81882c398e	test(headless_service): adapt test_generate_endpoint to send_result (doeh cleanup Phase 2.5)	2026-06-15 13:57:47 -04:00
ed	9e89d52607	test(ai_client_tool_loop): adapt mock to return Result[NormalizedResponse] (doeh cleanup Phase 2.4)	2026-06-15 13:54:57 -04:00
ed	dbdf9ba9e1	test(llama_native): adapt 4 tests to Result API (doeh cleanup Phase 2.3)	2026-06-15 13:52:38 -04:00
ed	439a0ac074	test(llama): adapt 3 tests to Result API (doeh cleanup Phase 2.2)	2026-06-15 13:25:31 -04:00
ed	d7e42a4a3d	test(grok): adapt 2 tests to Result API (doeh cleanup Phase 2.1)	2026-06-15 13:04:45 -04:00
ed	10046293ae	test(ai_loop): add live_gui smoke test for FR3 thinking substrate (Phase 4.3) Mirrors the FR1 live_gui smoke test: the full end-to-end live_gui FR3 test would require mock injection into the live_gui subprocess. The mock-based regression coverage for FR3 is already in test_ai_loop_regressions_20260614.py::test_fr3_minimax_thinking_in_returned_text. This smoke test verifies the disc_entries field is exposed via the Hook API, establishing the integration substrate for follow-up work.	2026-06-15 11:04:46 -04:00
ed	f4a782d99f	fix(ai_loop): wrap MiniMax reasoning in <thinking> tags for parse_thinking_trace (FR3, Bug #3 ) Adds a new wrap_reasoning_in_text: bool = False keyword argument to run_with_tool_loop. When True and reasoning_content is non-empty, the returned text is prepended with <thinking>...</thinking> tags so thinking_parser.parse_thinking_trace can extract a ThinkingSegment for the discussion entry. The wrap is conditional (default False) so it doesn't break providers that already wrap inline (e.g. DeepSeek, which wraps at line 2117-2118 before run_with_tool_loop sees the response). _send_minimax now passes wrap_reasoning_in_text=bool(caps.reasoning). When caps.reasoning is True (M2.5/M2.7), the reasoning is wrapped in <thinking> tags. When False (M2/M2.1), the parameter is False and no wrap happens (avoids useless getattr on non-reasoning models). Also fixes a bug in the test_fr3_minimax_thinking_in_returned_text test mock: it was returning a raw MagicMock instead of a Result object, which caused the test to see auto-created MagicMock attributes instead of the expected text. Now wraps in Result(data=MagicMock(...)) and sets ai_client._model to ensure get_capabilities('minimax', _model) resolves to the M2.7 capabilities (reasoning=True).	2026-06-15 10:56:24 -04:00
ed	2d1ff9e433	test(ai_loop): add live_gui smoke test for FR1 substrate (Phase 2.2) The full end-to-end live_gui FR1 test would require mock injection into the live_gui subprocess (patches in the test process do NOT propagate). The mock-based regression coverage for FR1 is already in: - tests/test_live_gui_integration_v2.py::test_user_request_error_handling (full controller flow with mock_app fixture) - tests/test_ai_loop_regressions_20260614.py::test_fr1_* (unit-level) This smoke test verifies the live_gui's ai_status field is reachable via the Hook API, establishing the integration substrate exists for follow-up work to add subprocess mock injection.	2026-06-15 09:41:39 -04:00
ed	25112f4157	test(live_gui): adapt test_user_request_* to new send_result() flow The 2 tests in test_live_gui_integration_v2.py were mocking the old ai_client.send() and asserting on the old error format. The FR1 fix migrated _handle_request_event to ai_client.send_result() and routes errors via ErrorInfo.ui_message() instead of f'ERROR: {e}'. Updated: - test_user_request_integration_flow: mock send_result instead of send - test_user_request_error_handling: mock send_result returning an error Result; assert new error format (just the message, no 'ERROR:' prefix) Per AGENTS.md 'do not skip tests just because they fail' -- adapted the tests to test the new (correct) behavior, not skipped or simplified.	2026-06-15 09:25:50 -04:00
ed	44dc90bca8	test(ai_loop): add FR1/FR2/FR3 tests for ai_loop_regressions_20260614 (TDD red) 3 bug groups, all reproducing documented regressions: - test_fr1_: error response becomes a discussion entry (Bug #2) - test_fr2_: no ProviderError references in src/app_controller.py (Bug #1) - test_fr3_*: MiniMax thinking mono rendering in returned text (Bug #3) 4 critical tests fail for the documented reasons; 3 sanity checks pass.	2026-06-15 09:18:07 -04:00
ed	2e91cd7123	test(minimax): add client instantiation unit tests to catch credential and base URL regressions	2026-06-13 18:57:44 -04:00
ed	82f21d7f55	docs(ai_client): add SQLite-granularity docstrings to tool execution functions Also fixes return-type discrepancy in tests/test_ai_client_tool_loop.py mock by wrapping NormalizedResponse inside Result.	2026-06-13 18:05:12 -04:00
ed	94b9e2217a	test(ai_client): fix mocked gemini provider send function name to match implementation	2026-06-13 18:00:14 -04:00
ed	3aa7bdca99	Fix: Return NormalizedResponse from send_openai_compatible This resolves the issue where calling 'send_openai_compatible' discarded the NormalizedResponse details, resulting in an AttributeError when accessing 'raw_response' inside the tool loop.	2026-06-13 17:50:43 -04:00
ed	ee3c90b865	refactor(rag_engine): Result API + NilRAGState (_init_vector_store, _validate_collection_dim, _get_state)	2026-06-12 20:14:40 -04:00
ed	2222c31db3	test(rag_engine): add 4 red tests for Result API + NilRAGState	2026-06-12 20:14:01 -04:00
ed	64b787b881	refactor(ai_client): remove ProviderError class; ErrorInfo is the new error type	2026-06-12 19:41:41 -04:00
ed	73cf321cdf	feat(ai_client): mark send() @deprecated; rewire to call send_result()	2026-06-12 19:22:27 -04:00
ed	9f86b2bee3	feat(ai_client): add send_result() public API returning Result[str]	2026-06-12 19:01:50 -04:00
ed	1c99724670	test(ai_client): Add failing tests for send_result/deprecation/warning	2026-06-12 18:21:19 -04:00
ed	b144450bf9	test(mcp): add tests for _resolve_and_check_result and *_result tool variants	2026-06-12 18:07:16 -04:00
ed	7ccf835450	test(result_types): add red tests for Result, ErrorInfo, NilPath, NilRAGState	2026-06-12 16:29:22 -04:00
ed	d7c6d67f69	feat(ai_client): wire v2 matrix fields into old vendor send functions The matrix has v2 fields (reasoning, web_search, x_search) populated for the old vendors (minimax-M2.5/M2.7, grok-*), but the send functions didn't consult them. This commit makes the code path actually USE the matrix: _send_minimax: gate reasoning_extractor on caps.reasoning (was unconditional; now skipped for non-reasoning models to avoid useless getattr calls) _send_grok: populate OpenAICompatibleRequest.extra_body with search_parameters when caps.web_search or caps.x_search is True. caps.web_search -> {mode: auto}; caps.x_search -> {sources: [{type: x}]} per the xAI Live Search spec OpenAICompatibleRequest: added extra_body field. Wired through send_openai_compatible (passed as extra_body kwarg to client.chat.completions.create). Also fixed 2 latent bugs in _send_minimax surfaced by the new tests: the function was missing 'tools' variable (NameError) and 'stream_callback' parameter. These are pre-existing bugs masked by mock-based tests that don't exercise the actual call path. Also cancelled t5_6/7/8 (the invented 'deferred tool-loop conversion' work). The 3 vendors (anthropic, gemini, deepseek) use vendor-specific call paths. Their inline loops are NOT defects. The '3-5 days' / '1-2 weeks' estimates were made up by the agent. The audit script's DEFERRED_VENDORS exclusion is permanent. Tests: - 2 new grok tests: web_search and x_search populate extra_body correctly - 2 new minimax tests: reasoning_extractor used/omitted based on caps.reasoning - 122/122 vendor+tool+provider+import-isolation tests pass (no regressions; +4 new tests this commit) - 3 audit scripts pass	2026-06-11 22:27:42 -04:00
ed	c9135b0565	feat(gui): add v2 capability badges in provider panel Phase 5 t5_4 (UI adaptations for 11 v2 fields): the simplest honest adaptation — render small colored badges for the 11 v2 fields where the active vendor+model supports them. Each badge has a tooltip showing the field name. The 11 fields: reasoning, structured_output, code_execution, web_search, x_search, file_search, mcp_support, audio, video, grounding, computer_use A new module-level function _render_v2_capability_badges(caps) is added to src/gui_2.py (per the HARD RULE on no new src/<thing>.py files). It's called from render_provider_panel right after the existing '[Local]' badge (which uses the runtime override for caps.local). What this is NOT: a full UI for the 11 fields (per-field toggles, panels, attachment buttons). Those are design-heavy work and need their own track. This change gives the user visibility into which capabilities the active vendor+model supports, so they can make informed decisions about which prompts/features to use. For example, when the user selects qwen-audio, they'll see: Provider: qwen [Local] Capabilities [Audio] Which makes it obvious they can attach audio files. Tests: - 2 new tests in tests/test_vendor_capabilities.py: * All 11 v2 fields are present in the helper (drift guard) * Helper is a no-op on empty caps (no fields True) - 118/118 vendor+tool+provider+import-isolation tests pass (no regressions; +2 new tests this commit) - 3 audit scripts pass	2026-06-11 21:46:41 -04:00

1 2 3 4 5 ...