Private
Public Access
0
0
Commit Graph

3811 Commits

Author SHA1 Message Date
ed e4a8a0bca1 test(thinking_trace): add test for <think> half-width marker (doeh cleanup Phase 4.2) 2026-06-15 14:26:32 -04:00
ed 4e97156e77 fix(thinking_parser): add <think> (half-width) marker support (doeh cleanup Phase 4.1) 2026-06-15 14:25:54 -04:00
ed cb985f08ed test(gemini): add regression tests for thinking-format extraction (doeh cleanup Phase 3.1) 2026-06-15 14:15:52 -04:00
ed e9abadc867 fix(ai_client): extract Gemini thought=True parts and wrap in <thinking> tags for parse_thinking_trace 2026-06-15 14:10:43 -04:00
ed 81882c398e test(headless_service): adapt test_generate_endpoint to send_result (doeh cleanup Phase 2.5) 2026-06-15 13:57:47 -04:00
ed 9e89d52607 test(ai_client_tool_loop): adapt mock to return Result[NormalizedResponse] (doeh cleanup Phase 2.4) 2026-06-15 13:54:57 -04:00
ed dbdf9ba9e1 test(llama_native): adapt 4 tests to Result API (doeh cleanup Phase 2.3) 2026-06-15 13:52:38 -04:00
ed 439a0ac074 test(llama): adapt 3 tests to Result API (doeh cleanup Phase 2.2) 2026-06-15 13:25:31 -04:00
ed d7e42a4a3d test(grok): adapt 2 tests to Result API (doeh cleanup Phase 2.1) 2026-06-15 13:04:45 -04:00
ed 27d7a04fd3 conductor(plan): Mark Phase 1 (G1 critical regression fix) complete 2026-06-15 12:58:34 -04:00
ed 7b323e3e5f fix(app_controller): restore context_to_send definition in _api_generate (CRITICAL regression from ai_loop_regressions_20260614) 2026-06-15 12:54:11 -04:00
ed 6f4bd75ef9 conductor: register doeh_test_thinking_cleanup_20260615 in tracks.md + mark ai_loop_regressions_20260614 shipped 2026-06-15 12:22:56 -04:00
ed 88bf04eb3d conductor(track): metadata.json for doeh_test_thinking_cleanup_20260615 2026-06-15 12:21:16 -04:00
ed 304f469663 conductor(track): plan for doeh_test_thinking_cleanup_20260615 (TDD-style, 5 phases, 16 tasks) 2026-06-15 12:20:06 -04:00
ed 925e366cdd conductor(track): spec for doeh_test_thinking_cleanup_20260615 (1 critical regression + 11 test mocks + 2 deferred bugs) 2026-06-15 12:17:51 -04:00
ed 515ef933a1 docs(report): add track completion report for ai_loop_regressions_20260614
In-depth handoff for Tier 1 review covering:
- Executive summary with TL;DR
- Goal & scope (planned vs delivered)
- Per-phase delivery summary
- Test coverage analysis (7 new + 2 adapted + 2 smoke)
- Deferred items documentation (3 cross-references)
- Pre-existing failures (14, verified not caused by this track)
- Plan deviations (6 items, with rationale)
- Post-ship risk register
- Commit inventory with diff stat
- 7 recommendations for the Tier 1 reviewer
- Handoff checklist

Working tree was clean before adding the report (no other changes to commit).
2026-06-15 11:32:33 -04:00
ed e6afefdc66 conductor(plan): mark track complete (all 5 phases, 17 tasks done) 2026-06-15 11:25:32 -04:00
ed 010752229b conductor(track): mark ai_loop_regressions_20260614 as completed
Updates status: active -> completed, adds completed_at date,
updates verification_criteria with the actual verification results.

7 regression tests pass; 14 pre-existing failures (parent track's
state.toml [regressions_20260612]) are not caused by these changes.
2026-06-15 11:24:43 -04:00
ed 2489e3215b docs(ai_client): add 2 follow-up notes for ai_loop_regressions_20260614
Adds 3 entries to the See Also section:
1. Gemini / Gemini CLI thinking-format compatibility (deferred from
   ai_loop_regressions_20260614) - investigate empirically
2. <think> (half-width) marker support in thinking_parser (deferred)
3. Public API Result Migration (planned, separate track public_api_migration_20260606)

Each entry links to the corresponding spec section for traceability.
2026-06-15 11:21:58 -04:00
ed 10046293ae test(ai_loop): add live_gui smoke test for FR3 thinking substrate (Phase 4.3)
Mirrors the FR1 live_gui smoke test: the full end-to-end live_gui FR3
test would require mock injection into the live_gui subprocess. The
mock-based regression coverage for FR3 is already in
test_ai_loop_regressions_20260614.py::test_fr3_minimax_thinking_in_returned_text.

This smoke test verifies the disc_entries field is exposed via the
Hook API, establishing the integration substrate for follow-up work.
2026-06-15 11:04:46 -04:00
ed 5f4c347824 conductor(plan): mark Phase 4 (FR3 fix) complete 2026-06-15 10:58:45 -04:00
ed f4a782d99f fix(ai_loop): wrap MiniMax reasoning in <thinking> tags for parse_thinking_trace (FR3, Bug #3)
Adds a new wrap_reasoning_in_text: bool = False keyword argument to
run_with_tool_loop. When True and reasoning_content is non-empty, the
returned text is prepended with <thinking>...</thinking> tags so
thinking_parser.parse_thinking_trace can extract a ThinkingSegment
for the discussion entry.

The wrap is conditional (default False) so it doesn't break providers
that already wrap inline (e.g. DeepSeek, which wraps at line 2117-2118
before run_with_tool_loop sees the response).

_send_minimax now passes wrap_reasoning_in_text=bool(caps.reasoning).
When caps.reasoning is True (M2.5/M2.7), the reasoning is wrapped in
<thinking> tags. When False (M2/M2.1), the parameter is False and
no wrap happens (avoids useless getattr on non-reasoning models).

Also fixes a bug in the test_fr3_minimax_thinking_in_returned_text
test mock: it was returning a raw MagicMock instead of a Result
object, which caused the test to see auto-created MagicMock attributes
instead of the expected text. Now wraps in Result(data=MagicMock(...))
and sets ai_client._model to ensure get_capabilities('minimax', _model)
resolves to the M2.7 capabilities (reasoning=True).
2026-06-15 10:56:24 -04:00
ed 722b09b99b conductor(plan): mark Phase 3 (FR2 fix) complete 2026-06-15 10:28:26 -04:00
ed 2b7b571a64 fix(ai_loop): replace dead ProviderError except clauses with send_result() pattern (FR2, Bug #1)
Replaces 3 dead 'except ai_client.ProviderError' clauses (the class was
removed in commit 64b787b8) with the new send_result() + result.ok
pattern. Removes the inner try/except block entirely (replaced by
'if not result.ok: raise HTTPException(502, ...)').

Sites fixed:
- _api_generate: send() -> send_result() + result.ok branch
- _handle_request_event (already fixed in FR1 commit 24ba2499)

AST scan via test_fr2_no_provider_error_in_source now passes: zero
remaining references to ai_client.ProviderError in src/app_controller.py.

The single remaining 'except Exception as e: import traceback;
traceback.print_exc(); raise HTTPException(500, str(e))' is the
legitimate outer except for unexpected in-flight errors.

Added a one-line comment per the plan referencing the data-oriented
error handling styleguide, so future migrations follow the same pattern.
2026-06-15 10:27:51 -04:00
ed 95288e4cb2 conductor(plan): mark Phase 2 (FR1 fix) complete 2026-06-15 09:42:44 -04:00
ed 2d1ff9e433 test(ai_loop): add live_gui smoke test for FR1 substrate (Phase 2.2)
The full end-to-end live_gui FR1 test would require mock injection into
the live_gui subprocess (patches in the test process do NOT propagate).
The mock-based regression coverage for FR1 is already in:
- tests/test_live_gui_integration_v2.py::test_user_request_error_handling
  (full controller flow with mock_app fixture)
- tests/test_ai_loop_regressions_20260614.py::test_fr1_*
  (unit-level)

This smoke test verifies the live_gui's ai_status field is reachable via
the Hook API, establishing the integration substrate exists for
follow-up work to add subprocess mock injection.
2026-06-15 09:41:39 -04:00
ed 25112f4157 test(live_gui): adapt test_user_request_* to new send_result() flow
The 2 tests in test_live_gui_integration_v2.py were mocking the old
ai_client.send() and asserting on the old error format. The FR1 fix
migrated _handle_request_event to ai_client.send_result() and routes
errors via ErrorInfo.ui_message() instead of f'ERROR: {e}'.

Updated:
- test_user_request_integration_flow: mock send_result instead of send
- test_user_request_error_handling: mock send_result returning an error
  Result; assert new error format (just the message, no 'ERROR:' prefix)

Per AGENTS.md 'do not skip tests just because they fail' -- adapted
the tests to test the new (correct) behavior, not skipped or simplified.
2026-06-15 09:25:50 -04:00
ed 24ba249901 fix(ai_loop): route send_result() errors to Discussion Hub as error entries (FR1, Bug #2)
Replaces deprecated ai_client.send() in _handle_request_event with
send_result() and branches on result.ok. On error, the first ErrorInfo
is routed to the event_queue as a 'response' with status='error',
allowing _on_comms_entry to add it to the discussion history.

The previous code called the @deprecated send() shim which silently
returns '' on error. The empty string was then filtered out by
_on_comms_entry (text_content.strip() check at line 3801), so users
saw no discussion entry for failed AI requests.

This also removes the dead 'except ai_client.ProviderError' clause at
line 3692 (the class was removed in commit 64b787b8). The 2 remaining
dead clauses at lines 305, 313 are fixed in the next commit (FR2).
2026-06-15 09:22:47 -04:00
ed 9b280a43fb conductor(plan): mark Phase 1 (TDD red) complete 2026-06-15 09:20:41 -04:00
ed 44dc90bca8 test(ai_loop): add FR1/FR2/FR3 tests for ai_loop_regressions_20260614 (TDD red)
3 bug groups, all reproducing documented regressions:
- test_fr1_*: error response becomes a discussion entry (Bug #2)
- test_fr2_*: no ProviderError references in src/app_controller.py (Bug #1)
- test_fr3_*: MiniMax thinking mono rendering in returned text (Bug #3)

4 critical tests fail for the documented reasons; 3 sanity checks pass.
2026-06-15 09:18:07 -04:00
ed 52c01c6cbc config 2026-06-15 09:01:53 -04:00
ed f4c497b1e8 conductor: register ai_loop_regressions_20260614 in tracks.md (priority A, ready for Tier 2) 2026-06-15 00:48:12 -04:00
ed acc294ae4e conductor(track): metadata.json for ai_loop_regressions_20260614 2026-06-15 00:44:52 -04:00
ed 884e40b9d1 conductor(track): plan for ai_loop_regressions_20260614 (TDD-style, 5 phases, 17 tasks) 2026-06-15 00:41:57 -04:00
ed 7a4dcc9690 conductor(track): spec for ai_loop_regressions_20260614 (MiniMax/Gemini/Gemini CLI/DeepSeek) 2026-06-15 00:33:04 -04:00
ed 74e02485a1 files & media ux improvemetn with directory folding and file name vis 2026-06-14 23:29:43 -04:00
ed ae8d01d0f7 add missing region start comment. 2026-06-14 22:43:55 -04:00
ed 2d51199699 fix(regression): for adding files in the files & media panel. 2026-06-14 22:43:42 -04:00
ed dcdcaa92f6 tiny 2026-06-13 20:50:36 -04:00
ed 5030bd848f ai client pass (in gemini region) 2026-06-13 20:49:37 -04:00
ed 94ab6dcc6f config 2026-06-13 19:02:24 -04:00
ed c87bb46e4f docs(reports): add MiniMax & OpenAI test regression analysis report 2026-06-13 18:57:56 -04:00
ed 2e91cd7123 test(minimax): add client instantiation unit tests to catch credential and base URL regressions 2026-06-13 18:57:44 -04:00
ed 286f952417 docs(reports): add SSDL Context Curation and Caching Pipeline report 2026-06-13 18:51:14 -04:00
ed 385538f477 docs(reports): add SSDL Conductor Engine DAG execution loop report 2026-06-13 18:49:01 -04:00
ed bcd7ee14cb docs(reports): add SSDL Discussion AI Turn Cycle report 2026-06-13 18:44:57 -04:00
ed cc8771b99b docs(reports): add track completion report for ai_client_docs_20260613 2026-06-13 18:29:21 -04:00
ed 4b4721ded4 Track: Complete track ai_client_docs_20260613 2026-06-13 18:27:50 -04:00
ed 1ccef1685f docs(ai_client): add detailed SQLite-granularity docstrings to secondary senders and context helpers 2026-06-13 18:27:46 -04:00
ed 20b5544d3e doc: add detailed SSDL docstrings to primary provider functions in ai_client.py 2026-06-13 18:08:04 -04:00