manual_slop

Private

Public Access

Author	SHA1	Message	Date
ed	3e17aa6c8b	test(tier2): add smoke e2e test (opt-in, double-gate TIER2_SANDBOX_TESTS+TIER2_SMOKE)	2026-06-16 22:26:04 -04:00
ed	5b6e7db174	test(tier2): add sandbox enforcement test (pre-push hook refuses push)	2026-06-16 20:25:44 -04:00
ed	5d150dc6e0	test(tier2): add bootstrap -WhatIf test (opt-in via TIER2_SANDBOX_TESTS)	2026-06-16 20:01:32 -04:00
ed	37eafc008e	test(tier2): add trivial smoke track for e2e test (force-added, fixture)	2026-06-16 19:57:36 -04:00
ed	9964ad3b3e	test(tier2): add 12 slash command + agent + config spec contract tests	2026-06-16 19:23:10 -04:00
ed	73ab2778ca	feat(report): implement write_failure_report + 8 tests, 100% coverage	2026-06-16 19:13:30 -04:00
ed	5ca8444f35	test(report): add report writer tests (red, opt-in via TIER2_SANDBOX_TESTS=1)	2026-06-16 19:10:22 -04:00
ed	2dbfaeb60e	test(failcount): add 13 unit tests + 6 coverage tests; 100% coverage achieved	2026-06-16 19:06:09 -04:00
ed	e646067a8a	test(failcount): add test_initial_state_zero (red)	2026-06-16 18:58:00 -04:00
ed	355811635d	fix(rag): handle None metadata in get_all_indexed_paths and non-empty numpy in dim check Two bugs in src/rag_engine.py were causing 'NoneType object has no attribute get' in the live_gui RAG tests (test_rag_phase4_final_verify, test_rag_phase4_stress): 1. _validate_collection_dim_result:148 Old: if not embeddings or len(embeddings) == 0: New: if embeddings is None or len(embeddings) == 0: The 'if not embeddings' check raises ValueError('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()') when 'embeddings' is a non-empty numpy array (which is the normal case after documents are upserted). The exception is caught by the outer 'except Exception' which returns a non-ok Result, causing __init__ to set self.collection = None. Subsequent 'get_all_indexed_paths()' then fails with 'NoneType has no attribute get' on self.collection.get(). 2. get_all_indexed_paths:334 Old: return list(set(m.get('path') for m in res['metadatas'] if m.get('path'))) New: return list(set(m['path'] for m in res['metadatas'] if m is not None and m.get('path'))) When chromadb returns 'metadatas=[None, ...]' (documents upserted without metadata), 'm.get('path')' fails with AttributeError on the first None element. Adds 'm is not None' guard. Both fixes are defensive: the conditions that trigger them (orphan docs without metadata, non-empty embeddings arrays) are normal valid states that the old code couldn't handle. New file: tests/test_rag_sync_none_error.py 3 unit tests covering both bugs: - test_dim_check_does_not_raise_on_non_empty_ndarray - test_get_all_indexed_paths_handles_none_metadata - test_get_all_indexed_paths_returns_paths_with_metadata Verified: - 3/3 focused tests pass - test_rag_phase4_final_verify.py::test_phase4_final_verify PASSES (was failing) - test_rag_phase4_stress.py::test_rag_large_codebase_verification_sim PASSES (was failing) - test_rag_visual_sim.py::test_rag_full_lifecycle_sim PASSES (still passing)	2026-06-16 00:09:02 -04:00
ed	e35b6a34ad	test(headless_verification): wrap mock return in Result(data=...) The test_headless_verification_full_run test in test_headless_verification.py mocked src.multi_agent_conductor.ai_client.send_result with a return_value of a raw string. The production code does 'if not result.ok:' which fails on raw strings with AttributeError. In xdist mode this caused a worker crash (gw0/gw11: 'node down: Not properly terminated') that hung the entire tier-1-unit-headless batch in the batched test runner (~50s+ per batch). The crash was the worker dying while pytest-master waited for it; the master never got a clean exit and the run was orphaned until the user's manual cancel. The test was missed in the original Phase 2 list (it was an xdist crash rather than a test logic failure) and in the 4 Phase 2 follow-up commits (which targeted the 4 specific test files the user reported during the run). Change: mock_send.return_value = 'Task completed successfully.' -> mock_send.return_value = Result(data='Task completed successfully.') Plus add the Result import. 2/2 tests in test_headless_verification.py now pass under xdist (was 1/2 + worker crash in xdist). Full headless batch (14 tests) completes in 18.7s.	2026-06-15 21:26:42 -04:00
ed	13f32f52e0	test(tiered_aggregation): wrap mock_send return in Result(data=...) (Phase 2 follow-up) The test_run_worker_lifecycle_uses_strategy test in test_tiered_aggregation.py mocked src.multi_agent_conductor.ai_client.send_result with a return_value of a raw string. The production code does "if not result.ok:" which fails on raw strings. 3/3 tests in test_tiered_aggregation.py pass (was 2/3).	2026-06-15 20:28:41 -04:00
ed	26e1b65298	test(rag_integration): wrap _send_gemini mock return in Result(data=...) The test_rag_integration test mocks the internal _send_gemini function to return a raw string. The production code in app_controller._handle_request_event now does 'if result.ok:' which fails on raw strings. Change: mock_provider.return_value = 'Mock AI Response' -> mock_provider.return_value = Result(data='Mock AI Response') Plus add the Result import. 1 test passes (was 1 pre-existing failure).	2026-06-15 20:27:07 -04:00
ed	58576fcba7	test(context_pruner): wrap send_result lambda in Result(data=...) (Phase 2 follow-up) The test_token_reduction_logging test in test_context_pruner.py mocked src.ai_client.send_result with a lambda that returned a raw string. The production code now does "if not result.ok:" which fails on raw strings. 1 test passes (was 1 pre-existing failure).	2026-06-15 20:25:44 -04:00
ed	64278d5313	test(conductor_engine_v2): wrap mock_send return values in Result(data=...) The 7 tests in test_conductor_engine_v2.py (already updated to mock src.ai_client.send_result) were still returning raw strings from the mocks. The production code in multi_agent_conductor.py now does "if not result.ok:" which fails on raw strings with AttributeError. Changes: - Add "from src.result_types import Result" import - Wrap all mock_send.return_value = "..." with Result(data="...") (4 sites) - Wrap MagicMock(return_value="...") with Result(data="...") (2 sites) - Wrap side_effect return with Result(data="Success") 10/10 tests pass (was 3/10).	2026-06-15 20:21:46 -04:00
ed	4910a703a7	more manual corrections	2026-06-15 19:41:33 -04:00
ed	f9832b07b3	manaul correction attempts	2026-06-15 19:14:22 -04:00
ed	e40b122b1b	test(ai_client): delete obsolete test_deprecation_warnings.py (Phase 6.2) Per plan Task 6.3: both tests in test_deprecation_warnings.py are obsolete after the send() function was removed in Phase 6.1: - test_send_deprecated_warning_emitted_once_per_site: literally cannot run without ai_client.send (AttributeError) - test_send_result_does_not_emit_deprecation: trivially true after send() is removed (no deprecation source) The test_send_result_does_not_emit_deprecation regression test is preserved in tests/test_ai_client_result.py (added in Phase 2.7 as the renamed test). The pre-Phase-2.7 test_send_deprecated_emits_warning was deleted in Phase 2.7. Verification: pytest tests/test_deprecation_warnings.py reports 'ERROR: file or directory not found'.	2026-06-15 18:53:02 -04:00
ed	c50367c6d5	test(log_management_refresh): use rfind() to locate code (Phase 5.2, fixes 1 pre-existing failure) The test used src.find() which locates the first occurrence of 'Refresh Registry' in the comment block (line 2090 in src/gui_2.py), not the actual code (line 2111). The 400-char snippet window doesn't reach the code, so the assertion for 'load_registry' fails. Production code is already correct (in-place load_registry()) at src/gui_2.py:2111-2112 (user commit `df7bda6e`). This test just needs to use rfind() to locate the actual code, not the comment. Change: src.find(marker) -> src.rfind(marker) 1 test passes (was 1 pre-existing failure).	2026-06-15 18:27:40 -04:00
ed	f663a34f52	test(discussion_truncate): use rfind() to locate code (Phase 5.1, fixes 1 pre-existing failure) The test used src.find() which locates the first occurrence of 'Keep Pairs:' in the comment block (line 5113 in src/gui_2.py), not the actual code (line 5130). The 200-char snippet window only reaches the comment, so the assertions for set_next_item_width(140) and drag_int fail. Production code is already correct (set_next_item_width(140) + drag_int) at src/gui_2.py:5130-5131 (user commit `d0b06575`). This test just needs to use rfind() to locate the actual code, not the comment. Change: src.find(marker) -> src.rfind(marker) 1 test passes (was 1 pre-existing failure).	2026-06-15 18:21:58 -04:00
ed	effa24a7ae	test(symbol_parsing): mock send_result not send (Phase 4, fixes 2 pre-existing failures) The 2 tests in test_symbol_parsing.py mock src.ai_client.send but production now uses send_result (migrated by doeh_test_thinking_cleanup_20260615 commit `24ba2499`). Mocks receive 0 calls; tests fail with "send was called 0 times". Changes: - Replace patch(src.ai_client.send) with patch(src.ai_client.send_result) - Rename mock_send to mock_send_result - Set return_value=Result(data="mocked response") - Add "from src.result_types import Result" import All 2 tests in test_symbol_parsing.py pass (were 2 pre-existing failures).	2026-06-15 18:20:00 -04:00
ed	3be28cc524	test(qwen): adapt 2 tests to Result API (Phase 3, fixes 2 pre-existing failures) The _send_qwen() function returns Result[str] after the data_oriented_error_handling_20260606 refactor (commit `64d6ba2d`), but 2 tests in test_qwen_provider.py were asserting against the raw str type. They were 2 of the 10 pre-existing failures documented in the track spec. Changes (mirrors the doeh_test_thinking_cleanup_20260615 pattern for grok/llama/llama_native): - Replace assert result == "hi from qwen" with assert result.ok and result.data == "hi from qwen" - Replace assert "cat" in result.lower() with assert result.ok and "cat" in result.data.lower() - Add "from src.result_types import Result" import All 5 tests in test_qwen_provider.py now pass (was 3/5).	2026-06-15 18:05:45 -04:00
ed	4592618372	fix(orchestration_logic): migrate test_run_worker_lifecycle_blocked mock (Phase 2 follow-up) Phase 2.13 missed the test_run_worker_lifecycle_blocked test in test_orchestration_logic.py - it also mocked src.ai_client.send. The test was failing with "Worker send_result failed for T1: ... [Errno 2] No such file or directory: .beads_mock/beads.json" because the unmocked send_result fell through to the real provider which tried to read beads.json. Changes: - Replace patch(src.ai_client.send) with patch(src.ai_client.send_result) - Wrap mock return_value with Result(data="BLOCKED because of missing info") All 8 tests in test_orchestration_logic.py now pass.	2026-06-15 17:45:18 -04:00
ed	36962ef6b6	test(tier4_interceptor): migrate to send_result() (Phase 2.11) The test_ai_client_passes_qa_callback test calls ai_client.send() with qa_callback=lambda. The qa_callback is passed through to the provider function (_send_gemini). Per plan note: the test has complex callback setup; the Result handling needs the mock to return Result(data="ok") so the qa_callback passes through and the test succeeds. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert result.ok - Mock _send_gemini to return Result(data="ok") instead of relying on the default (which would call the real provider) - Add "from src.result_types import Result" import 7 tests pass (the migrated test_ai_client_passes_qa_callback was previously broken because the send() call hit the real provider and either failed or returned empty; the mock now provides a clean response).	2026-06-15 17:27:31 -04:00
ed	cfeb3cb3e0	test(gemini_cli_integration): migrate 2 sites to send_result() (Phase 2.10) Changes: - Rename ai_client.send(...) to ai_client.send_result(...) (2 sites) - Add assert result.ok (1 site; the second test only checks result is not None) - Add "from src.result_types import Result" import 2 tests pass.	2026-06-15 17:07:20 -04:00
ed	363fe91db0	test(deepseek): migrate 6 sites to send_result() (Phase 2.9) All 6 sites in test_deepseek_provider.py call ai_client.send(...). Each assertion pattern is slightly different (==, "in", call_args inspection); migration follows the same pattern: rename to send_result(), add assert result.ok, and use result.data for the response text. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) (6 sites) - Add assert result.ok (6 sites) - Replace result == "x" with result.data == "x" (or "x" in result.data) - Add "from src.result_types import Result" import 7 tests pass (1 unrelated test_deepseek_model_selection + 6 migrated).	2026-06-15 16:59:46 -04:00
ed	d9a79efa25	test(api_events): migrate 2 sites to send_result() (Phase 2.8) The test_send_emits_events_proper and test_send_emits_tool_events tests both call ai_client.send(). Migrating to send_result() + assert result.ok. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) (2 sites) - Add assert result.ok (2 sites) - Add "from src.result_types import Result" import 4 tests pass.	2026-06-15 16:57:53 -04:00
ed	0192978646	test(ai_client_result): migrate to send_result(); drop test_send_deprecated (Phase 2.7) Per plan Task 2.7: - DELETE test_send_deprecated_emits_warning (obsolete after Phase 6; send() is being removed) - RENAME test_send_extracts_data_from_result -> test_send_result_does_not_emit_deprecation (this is the regression test the plan said to KEEP; it now asserts the new API does not emit a deprecation warning, instead of testing the old behavior) - MIGRATE test_send_extracts_data_from_result (renamed to the above) - MIGRATE test_send_returns_empty_string_on_error_result -> test_send_result_returns_empty_data_with_error_on_auth_failure (asserts the Result has data="" and not ok) 5 tests pass (down from 6; the deleted test removed 1; the renamed test_send_extracts_data_from_result became test_send_result_does_not_emit_deprecation).	2026-06-15 16:55:30 -04:00
ed	1e2c34313c	test(token_usage): migrate to send_result() (Phase 2.6) The test_token_usage_tracking test calls ai_client.send() and verifies the comms log entry. Migrating to send_result() + assert result.ok. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert result.ok - Add "from src.result_types import Result" import 1 test passes.	2026-06-15 16:51:24 -04:00
ed	c59bac59f2	test(gui2_mcp): migrate to send_result() (Phase 2.5) The test_mcp_tool_call_is_dispatched test calls ai_client.send() and asserts the MCP dispatch function was called. Migrating to send_result() + assert result.ok. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert result.ok - Add "from src.result_types import Result" import 1 test passes.	2026-06-15 16:49:11 -04:00
ed	fe52024311	test(gemini_cli_parity_regression): migrate to send_result() (Phase 2.4) The test_send_invokes_adapter_send test calls ai_client.send() and asserts the return value. Migrating to send_result() with assert res.ok and res.data == "Hello from mock adapter". Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert res.ok before accessing res.data - Add "from src.result_types import Result" import 1 test passes.	2026-06-15 16:39:31 -04:00
ed	b4c9ebd963	test(gemini_cli_edge_cases): migrate to send_result() (Phase 2.3) The test_gemini_cli_loop_termination test calls ai_client.send() and asserts the return value. Migrating to send_result() with assert result.ok and result.data == "Final answer". Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert result.ok before accessing result.data - Add "from src.result_types import Result" import 3 tests pass.	2026-06-15 16:31:26 -04:00
ed	fab9196bea	test(ai_cache_tracking): migrate to send_result() (Phase 2.2) The test calls ai_client.send() but does not check the return value - it only verifies the side effect on gemini cache stats. Migrating to send_result() and asserting result.ok is enough. Changes: - Rename ai_client.send(...) to ai_client.send_result(...) - Add assert result.ok (the return value is unused) - Add "from src.result_types import Result" import 2 tests pass.	2026-06-15 16:28:20 -04:00
ed	ba0df1fa95	test(ai_client_cli): migrate to send_result() (Phase 2.1) Replaces the deprecated ai_client.send() call with ai_client.send_result() in the test. The mock for GeminiCliAdapter is unchanged (it is patched to return a dict that send_result unwraps internally). Changes: - Rename response = ai_client.send(...) to result = ai_client.send_result(...) - Add assert result.ok before accessing result.data - Add "from src.result_types import Result" import 1 test passes.	2026-06-15 16:26:06 -04:00
ed	16c6705b80	test(spawn_interception_v2): mock send_result not send (Phase 2.18, pre-empts Phase 1.3 regression) Phase 1.3 migrated run_worker_lifecycle to send_result(). The mock_ai_client fixture in test_spawn_interception_v2.py mocked src.ai_client.send and returned a string. The test_run_worker_lifecycle_approved test asserts on the call_args (user_message + md_content), which still works with the new mock name. Changes: - Replace patch(src.ai_client.send) with patch(src.ai_client.send_result) - Rename mock_send to mock_send_result - Wrap mock return_value with Result(data="Task completed") - Add "from src.result_types import Result" import All 3 tests in test_spawn_interception_v2.py pass.	2026-06-15 16:24:05 -04:00
ed	7a6ffd8954	test(run_worker_lifecycle_abort): mock send_result not send (Phase 2.17, pre-empts Phase 1.3 regression) Phase 1.3 migrated run_worker_lifecycle to send_result(). This test mocks src.ai_client.send and asserts it is NOT called (abort fires before the AI dispatch). Migrating the mock to send_result is purely for consistency and future-proofing; the test still passes either way. Changes: - Rename patch(src.ai_client.send) to patch(src.ai_client.send_result) - Rename mock_send to mock_send_result - Comment updated to reference send_result	2026-06-15 16:21:08 -04:00
ed	bb2add1249	test(phase6_engine): mock send_result not send (Phase 2.16, pre-empts Phase 1.3 regression) Phase 1.3 migrated src/multi_agent_conductor.py:591 (run_worker_lifecycle) to send_result(). The test_worker_streaming_intermediate test mocked src.ai_client.send, which would break once Phase 1.3 was applied. (Confirmed: test failed after Phase 1.3 commit.) Changes: - Replace patch(src.ai_client.send) with patch(src.ai_client.send_result) - Rename mock_send to mock_send_result - Wrap mock side_effect return with Result(data="DONE") - Add "from src.result_types import Result" import All 3 tests in test_phase6_engine.py pass.	2026-06-15 16:16:53 -04:00
ed	499762d8f0	test(orchestrator_pm_history): mock send_result not send (Phase 2.15, pre-empts Phase 1.2 regression) Phase 1.2 migrated src/orchestrator_pm.py:86 to send_result(). The test_generate_tracks_with_history test mocked src.ai_client.send, which would break once Phase 1.2 was applied. (Confirmed: test failed after Phase 1.2 commit.) Changes: - Replace @patch(src.ai_client.send) with @patch(src.ai_client.send_result) - Rename mock_send to mock_send_result - Wrap mock return_value with Result(data="[]") - Add "from src.result_types import Result" import All 3 tests in test_orchestrator_pm_history.py pass.	2026-06-15 16:15:06 -04:00
ed	e4a2a20469	test(orchestrator_pm): mock send_result not send (Phase 2.14, pre-empts Phase 1.2 regression) Phase 1.2 migrated src/orchestrator_pm.py:86 to send_result(). The 3 tests in TestOrchestratorPM mocked src.ai_client.send, which would break once Phase 1.2 was applied. (Confirmed: tests failed after Phase 1.2 commit.) Changes: - Replace @patch(src.ai_client.send) with @patch(src.ai_client.send_result) - Rename mock_send to mock_send_result throughout - Wrap mock return_value with Result(data=json.dumps(...)) - Add "from src.result_types import Result" import All 3 tests pass.	2026-06-15 16:10:47 -04:00
ed	953689c8b3	test(orchestration_logic): mock send_result not send (Phase 2.13, fixes Phase 1.1 regression) Phase 1.1 + 1.2 migrated the production code to send_result(). The test_generate_tracks and test_generate_tickets tests mocked src.ai_client.send, causing "send was called 0 times" failures. Changes: - Replace patch(src.ai_client.send) with patch(src.ai_client.send_result) - Wrap mock return_value with Result(data=mock_response) - Add "from src.result_types import Result" import All 8 tests in tests/test_orchestration_logic.py pass (2 migrated + 6 unaffected tests).	2026-06-15 16:08:04 -04:00
ed	488254527c	test(conductor_tech_lead): mock send_result not send (Phase 2.12, fixes Phase 1.1 regression) Phase 1.1 migrated src/conductor_tech_lead.py:68 from ai_client.send() to ai_client.send_result(). The 3 tests in TestConductorTechLead mocked src.ai_client.send which is no longer called by the production code, causing "send was called 0 times" failures. Changes: - Replace patch("src.ai_client.send") with patch("src.ai_client.send_result") - Wrap mock return_value with Result(data=...) and mock side_effect with Result(data=...) values - Add "from src.result_types import Result" import All 9 tests in tests/test_conductor_tech_lead.py pass (3 migrated + 6 unaffected topological sort tests).	2026-06-15 16:06:17 -04:00
ed	e4a8a0bca1	test(thinking_trace): add test for <think> half-width marker (doeh cleanup Phase 4.2)	2026-06-15 14:26:32 -04:00
ed	cb985f08ed	test(gemini): add regression tests for thinking-format extraction (doeh cleanup Phase 3.1)	2026-06-15 14:15:52 -04:00
ed	81882c398e	test(headless_service): adapt test_generate_endpoint to send_result (doeh cleanup Phase 2.5)	2026-06-15 13:57:47 -04:00
ed	9e89d52607	test(ai_client_tool_loop): adapt mock to return Result[NormalizedResponse] (doeh cleanup Phase 2.4)	2026-06-15 13:54:57 -04:00
ed	dbdf9ba9e1	test(llama_native): adapt 4 tests to Result API (doeh cleanup Phase 2.3)	2026-06-15 13:52:38 -04:00
ed	439a0ac074	test(llama): adapt 3 tests to Result API (doeh cleanup Phase 2.2)	2026-06-15 13:25:31 -04:00
ed	d7e42a4a3d	test(grok): adapt 2 tests to Result API (doeh cleanup Phase 2.1)	2026-06-15 13:04:45 -04:00
ed	10046293ae	test(ai_loop): add live_gui smoke test for FR3 thinking substrate (Phase 4.3) Mirrors the FR1 live_gui smoke test: the full end-to-end live_gui FR3 test would require mock injection into the live_gui subprocess. The mock-based regression coverage for FR3 is already in test_ai_loop_regressions_20260614.py::test_fr3_minimax_thinking_in_returned_text. This smoke test verifies the disc_entries field is exposed via the Hook API, establishing the integration substrate for follow-up work.	2026-06-15 11:04:46 -04:00
ed	f4a782d99f	fix(ai_loop): wrap MiniMax reasoning in <thinking> tags for parse_thinking_trace (FR3, Bug #3 ) Adds a new wrap_reasoning_in_text: bool = False keyword argument to run_with_tool_loop. When True and reasoning_content is non-empty, the returned text is prepended with <thinking>...</thinking> tags so thinking_parser.parse_thinking_trace can extract a ThinkingSegment for the discussion entry. The wrap is conditional (default False) so it doesn't break providers that already wrap inline (e.g. DeepSeek, which wraps at line 2117-2118 before run_with_tool_loop sees the response). _send_minimax now passes wrap_reasoning_in_text=bool(caps.reasoning). When caps.reasoning is True (M2.5/M2.7), the reasoning is wrapped in <thinking> tags. When False (M2/M2.1), the parameter is False and no wrap happens (avoids useless getattr on non-reasoning models). Also fixes a bug in the test_fr3_minimax_thinking_in_returned_text test mock: it was returning a raw MagicMock instead of a Result object, which caused the test to see auto-created MagicMock attributes instead of the expected text. Now wraps in Result(data=MagicMock(...)) and sets ai_client._model to ensure get_capabilities('minimax', _model) resolves to the M2.7 capabilities (reasoning=True).	2026-06-15 10:56:24 -04:00

1 2 3 4 5 ...