Private
Public Access
0
0

conductor(track): spec for public_api_migration_and_ui_polish_20260615 (3 prod + 12 test migrations + 2 UI Polish test fixes)

This commit is contained in:
2026-06-15 15:20:44 -04:00
parent 431ebce2b9
commit 3febdab42c
@@ -0,0 +1,585 @@
# Track Specification: Public API Migration + UI Polish Test Cleanup
**Track ID:** `public_api_migration_and_ui_polish_20260615`
**Status:** Active (spec approved 2026-06-15)
**Priority:** A (foundational; precedes `data_structure_strengthening_20260606`)
**Owner:** Tier 2 Tech Lead
**Type:** refactor + bugfix + test_cleanup + documentation
**Estimated effort:** 2-3 days Tier 2 work (16-24 hours)
**Parent tracks:** `data_oriented_error_handling_20260606` (shipped 2026-06-12), `ai_loop_regressions_20260614` (shipped 2026-06-15), `doeh_test_thinking_cleanup_20260615` (shipped 2026-06-15)
**Blocks:** `data_structure_strengthening_20260606` (cleaner `Result` API usage makes type-alias replacement easier), `mcp_architecture_refactor_20260606` (transitively)
---
## 0. TL;DR
This is a **stability track** that finishes the cleanup work started by `data_oriented_error_handling_20260606` and `doeh_test_thinking_cleanup_20260615`. Two concerns, one track:
1. **Public API Migration**: remove the deprecated `ai_client.send()` legacy wrapper; migrate 3 remaining production call sites + 12 test files to `send_result()`; fix 4 of the 10 pre-existing test failures (2 Qwen + 2 symbol_parsing) as a side effect of the migration.
2. **UI Polish Test Cleanup**: fix 2 broken test assertions in `test_discussion_truncate_layout.py` and `test_log_management_refresh.py` (the production code was already fixed by user commits `d0b06575` and `df7bda6e`; the tests use `find()` which locates the comment block instead of the actual code).
**Result:** 6 of 10 pre-existing test failures fixed. Remaining 4 RAG failures are deferred to the next track (a separate RAG subsystem track — out of scope for this one). Project reaches a stable state suitable for the `data_structure_strengthening_20260606` track.
---
## 1. Overview
### 1.1 Current State (as of 2026-06-15)
The `data_oriented_error_handling_20260606` track (shipped 2026-06-12) introduced the `Result[T, ErrorInfo]` pattern and `send_result()` as the new public API. The legacy `ai_client.send()` was marked `@deprecated` and routed through `send_result()` internally. Two follow-up tracks shipped fixes for the immediate user-blocking issues (`ai_loop_regressions_20260614`) and the easy test mock bugs (`doeh_test_thinking_cleanup_20260615`).
**As of 2026-06-15:**
- 3 production call sites of the deprecated `send()` remain in `src/`
- 12 test files use `ai_client.send()` directly
- 1 test file uses `_send_<vendor>()` with the new `Result` return type but the old assertion pattern (causing 2 of 10 pre-existing failures)
- 2 test files mock `ai_client.send` directly (causing 2 of 10 pre-existing failures)
- 2 UI Polish test files use `find()` to locate a comment block instead of the actual code (causing 2 of 10 pre-existing failures)
- 4 RAG test files fail (separate subsystem; deferred to a follow-up RAG track)
### 1.2 Gaps to Fill (this Track's Scope)
| Gap | Count | Type | Spec Section |
|---|---|---|---|
| Production `ai_client.send()` callers | 3 | refactor (deprecation removal) | §3.1 |
| Test files using `ai_client.send()` | 12 | refactor (deprecation removal) | §3.2 |
| Test files using `_send_<vendor>()` with old assertions | 1 | test fix (G3 in pre-existing failures) | §3.3 |
| Test files mocking `ai_client.send` | 1 | test fix (G4 in pre-existing failures) | §3.4 |
| UI Polish test bugs (`find()` not `rfind()`) | 2 | test fix (G6, G7 in pre-existing failures) | §3.5 |
| Deprecation marker + legacy `send()` function | 1 | refactor (deprecation removal) | §3.6 |
| `filterwarnings` conftest entry | 1 | housekeeping (deprecation removal) | §3.6 |
| `test_deprecation_warnings.py` | 1 file (2 tests) | delete (tests obsolete) | §3.6 |
| `docs/guide_ai_client.md` deprecation references | multiple | documentation | §3.7 |
| `conductor/product-guidelines.md` deprecation language | multiple | documentation | §3.7 |
### 1.3 Already Implemented (DO NOT re-implement)
Verified by code audit (2026-06-15) — the following already work and are NOT in this track's scope:
- **`send_result()` public API** — added in commit `9f86b2be` by `data_oriented_error_handling_20260606`
- **`_send_<vendor>()` returning `Result[str]`** — all 6 vendors (`_send_gemini`, `_send_gemini_cli`, `_send_grok`, `_send_minimax`, `_send_qwen`, `_send_llama`, `_send_llama_native`) already return `Result[str]` (refactored in commits `0282f9ff`, `943a21bf`, `e384afce`, `64d6ba2d`)
- **The 2 in-flight `_api_generate` and `_handle_request_event` migrations in `app_controller.py`** — already done by `doeh_test_thinking_cleanup_20260615` (commits `24ba2499` and `7b323e3e`)
- **`test_ai_client_result.py::test_send_result_does_not_emit_deprecation`** — passes; the deprecation warning filter works
- **11 test mock fixes from `doeh_test_thinking_cleanup_20260615`** — 29/29 tests in 5 files (`test_grok_provider`, `test_llama_provider`, `test_llama_ollama_native`, `test_ai_client_tool_loop_builder`, `test_headless_service`) now use the `Result` API
- **UI Polish Phase 1 (markdown tables) — `src/markdown_table.py`** — shipped by commit `79ac9210`
- **UI Polish Phase 2 (Keep Pairs input)** — code fix shipped by user commit `d0b06575` (`src/gui_2.py:5130-5131`); test bug remains (this track fixes)
- **UI Polish Phase 3 (Refresh Registry)** — code fix shipped by user commit `df7bda6e` (`src/gui_2.py:2111-2112`); test bug remains (this track fixes)
- **UI Polish Phase 4 (Vendor State tab)** — shipped by commit `3a864076` (`src/vendor_state.py`)
- **UI Polish Phase 5 (Files & Media directory tree)** — shipped by commit `74e02485` (`src/gui_2.py:render_files_and_media`)
---
## 2. Goals
### 2.1 Functional Goals
| ID | Goal | Acceptance Criterion |
|---|---|---|
| **G1** | Migrate `src/conductor_tech_lead.py:68` to `send_result()` + Result handling | `uv run pytest tests/test_conductor_tech_lead.py` (or nearest tests) passes; no regression in tier-2 dispatch |
| **G2** | Migrate `src/orchestrator_pm.py:86` to `send_result()` + Result handling | No regression in tier-1 dispatch tests |
| **G3** | Migrate `src/multi_agent_conductor.py:591` to `send_result()` + Result handling | `test_mma_concurrent_tracks_sim`, `test_mma_step_mode_sim`, `test_undo_redo_sim`, and 30+ MMA live_gui tests pass |
| **G4** | Migrate 12 test files using `ai_client.send()` to use `send_result()` | All migrated tests pass; no test calls `ai_client.send()` after this track |
| **G5** | Fix `test_qwen_provider.py` (2 tests) to use `Result` API assertion pattern | 2/2 tests pass; same approach as `doeh_test_thinking_cleanup_20260615` used for grok/llama |
| **G6** | Fix `test_symbol_parsing.py` (2 tests) to mock `send_result` not `send` | 2/2 tests pass |
| **G7** | Fix `test_discussion_truncate_layout.py` (1 test) to use `rfind()` not `find()` | 1/1 test passes |
| **G8** | Fix `test_log_management_refresh.py` (1 test) to use `rfind()` not `find()` | 1/1 test passes |
| **G9** | Remove `@deprecated` decorator + legacy `send()` function in `src/ai_client.py` | `ai_client.send` AttributeError if called; `rg "ai_client\.send\(" src/ tests/` returns 0 hits |
| **G10** | Delete `tests/test_deprecation_warnings.py` (2 obsolete tests) | File removed; no test imports or calls `ai_client.send` |
| **G11** | Remove `filterwarnings` entry in `pyproject.toml:46-47` | `rg "ignore:Use ai_client.send_result" pyproject.toml` returns 0 hits |
| **G12** | Update `docs/guide_ai_client.md` to remove deprecation references | No `@deprecated` mention; Result API section no longer notes "send() is deprecated" |
| **G13** | Update `conductor/product-guidelines.md` to remove deprecation language | No "send() is deprecated; use send_result()" in product guidelines |
### 2.2 Non-Functional Goals
| ID | Goal | Acceptance Criterion |
|---|---|---|
| **NF1** | Zero new test regressions | `uv run pytest tests/` shows 4 fewer failures than the pre-track baseline (10 - 6 = 4 remaining; all RAG) |
| **NF2** | All 28 production changes atomic per-task | 28 git commits; each commit is buildable + testable |
| **NF3** | All changes follow the project's 1-space indentation, no-comments, type-hinting rules | `uv run python -c "import ast; ast.parse(open('src/ai_client.py').read())"` succeeds; production code has zero `#` comments in changed lines |
| **NF4** | Per-commit git notes attached | `git log --format='%H %s' --grep="^public_api_migration_and_ui_polish_20260615" \| wc -l` matches task count |
| **NF5** | `doeh_test_thinking_cleanup_20260615` state.toml remains parseable | `python -c "import tomllib; tomllib.load(open('conductor/tracks/doeh_test_thinking_cleanup_20260615/state.toml','rb'))"` succeeds |
---
## 3. Per-File Design
### 3.1 Production call sites to migrate
**Why these 3 only:** `data_oriented_error_handling_20260606` spec §12.1 lists 5 production call sites. Two of the five (`src/app_controller.py:282` and `src/app_controller.py:3674`) were already migrated by `doeh_test_thinking_cleanup_20260615` (commits `7b323e3e` and `24ba2499`). One was a misidentification — `src/mcp_client.py:2274` is an MCP tool schema for `py_check_syntax`, not a `send()` call. The remaining 3 are real.
| File:Line | Current code | After this track | Difficulty |
|---|---|---|---|
| `src/conductor_tech_lead.py:68` | `response = ai_client.send(md_content="", user_message=user_message)` (2-arg) | `result = ai_client.send_result(md_content="", user_message=user_message); if not result.ok: <log warn + return None>; response = result.data` | Easy (2-arg call) |
| `src/orchestrator_pm.py:86` | `response = ai_client.send(md_content="", user_message=user_message, enable_tools=False)` (3-arg) | `result = ai_client.send_result(md_content="", user_message=user_message, enable_tools=False); if not result.ok: <log warn + return None>; response = result.data` | Easy (3-arg call) |
| `src/multi_agent_conductor.py:591` | `response = ai_client.send(md_content=..., user_message=..., base_dir=".", pre_tool_callback=..., qa_callback=..., patch_callback=..., stream_callback=...)` (8-arg, with 5 callbacks) | `result = ai_client.send_result(md_content=..., ...); if not result.ok: <log warn via comms + return per-ticket error>; response = result.data` | Hard (5 callbacks; per-ticket error handling needed in MMA) |
**MMA per-ticket error handling:** the existing `_handle_request_event` pattern in `app_controller.py:3674` (already migrated by `doeh_test_thinking_cleanup_20260615`) uses `raise HTTPException(status_code=502, detail=err.ui_message())`. The MMA worker does not have an HTTP layer; the per-ticket error should be:
- Logged to the comms log as `WARN/deprecated_send_with_errors` (or `WARN/worker_send_failed`)
- Returned via `worker_comms_callback` as a status entry (per `multi_agent_conductor.py:584` callback)
- The worker exits with a non-zero status so the DAG engine marks the ticket as failed
**Reference:** `conductor/tracks/data_oriented_error_handling_20260606/spec.md` §12.1 lines 677-688; `doeh_test_thinking_cleanup_20260615/spec.md` §3.1 (the G1 fix pattern at `src/app_controller.py:265-295` is the canonical reference for Result handling).
### 3.2 Test files using `ai_client.send()` to migrate
**Why 12 not 63:** the parent spec claimed "63 test files (verified 2026-06-11)". The current count (rg verified 2026-06-15) is **12 files with 20 call sites**. The discrepancy is because the spec was written when there were more legacy call sites; `data_oriented_error_handling_20260606` Phase 3 + `doeh_test_thinking_cleanup_20260615` Phase 2 already migrated the rest.
| File | Call sites | Migration pattern |
|---|---|---|
| `tests/test_ai_client_cli.py` | 1 | `response = ai_client.send(...)``result = ai_client.send_result(...); assert result.ok; response = result.data` |
| `tests/test_ai_cache_tracking.py` | 1 | Same pattern |
| `tests/test_ai_client_result.py` | 3 | One is the existing `test_send_deprecated_emits_warning` (will be DELETED in Phase 6); the other two are pre-existing tests that test `send()` directly — they need to be rewritten to test `send_result()` semantics |
| `tests/test_api_events.py` | 2 | Same pattern |
| `tests/test_deepseek_provider.py` | 6 | Same pattern (per-call-site migration; 6 commits is too many; consolidate to 1-2 commits) |
| `tests/test_gemini_cli_edge_cases.py` | 1 | Same pattern |
| `tests/test_gemini_cli_integration.py` | 2 | Same pattern |
| `tests/test_gemini_cli_parity_regression.py` | 1 | Same pattern |
| `tests/test_gui2_mcp.py` | 1 | Same pattern |
| `tests/test_tier4_interceptor.py` | 1 | Same pattern |
| `tests/test_token_usage.py` | 1 | Same pattern |
| **Total** | **20 call sites in 11 files** | The 12th file is `test_symbol_parsing.py` which mocks `send` not calls it; handled separately in §3.4 |
**Migration pattern (canonical):**
```python
# Before:
result = ai_client.send(md_content, user_message, base_dir)
assert result == "expected text"
# After:
result = ai_client.send_result(md_content, user_message, base_dir)
assert result.ok, f"send_result failed: {result.errors[0].ui_message() if result.errors else 'no error info'}"
assert result.data == "expected text"
```
**Special case: `test_ai_client_result.py`:** The current file has 3 tests for the deprecated `send()`. The track DELETES the `test_send_deprecated_emits_warning` test (send() is removed in Phase 6) and KEEPS the `test_send_result_does_not_emit_deprecation` test (it remains a regression test for the new API). The 3rd test (`test_send_result_does_not_emit_deprecation` is the 2nd) needs review — see the file directly.
**Test isolation:** Group migration by file. Per-file atomic commits preserve the file as a rollback unit. 11 files = 11 atomic commits (consolidate `test_ai_client_result.py` and `test_deepseek_provider.py` since they have multiple sites per file).
**Reference:** `doeh_test_thinking_cleanup_20260615/plan.md` Phase 2 (Tasks 2.1-2.5) for the exact migration pattern; `conductor/code_styleguides/error_handling.md` §3.1 (AND over OR pattern).
### 3.3 `test_qwen_provider.py` fix (2 tests)
**Current state (verified 2026-06-15):**
- `_send_qwen()` returns `Result[str]` (refactored by `data_oriented_error_handling_20260606` commit `64d6ba2d`)
- Tests at `tests/test_qwen_provider.py:17-19` and `:27-28` assert against raw `str`:
```python
result = ai_client._send_qwen("system", "user", ".", None, "", False, None, None, None)
assert result == "hi from qwen" # FAILS: result is Result(data="hi from qwen")
```
**Fix:** Mirror the pattern used by `doeh_test_thinking_cleanup_20260615` for `test_grok_provider`, `test_llama_provider`, `test_llama_ollama_native`:
```python
result = ai_client._send_qwen("system", "user", ".", None, "", False, None, None, None)
assert result.ok and result.data == "hi from qwen"
```
And for the image test:
```python
result = ai_client._send_qwen("system", "describe this image", ".", file_items, "", False, None, None, None)
assert result.ok and "cat" in result.data.lower()
```
**Why this approach and not renaming `_send_qwen` → `_send_qwen_result`:** the parent spec at line 611 planned the rename, but commit `64d6ba2d` only changed the return type (not the name). The function name `_send_qwen` is stable; only the return type changed. Migrating the tests to handle `Result` is the right scope for this track. A future "rename to `_send_qwen_result`" track could be planned separately if needed.
**Test isolation:** 1 atomic commit for both test fixes (per-file atomicity).
**Reference:** `doeh_test_thinking_cleanup_20260615/spec.md` §1.1 (G2-G11 test mock bugs); `doeh_test_thinking_cleanup_20260615/plan.md` Phase 2.1-2.3 (the grok/llama/llama_native patterns).
### 3.4 `test_symbol_parsing.py` fix (2 tests)
**Current state (verified 2026-06-15):**
- `tests/test_symbol_parsing.py:45,74` mock `src.ai_client.send`
- Production now calls `src.ai_client.send_result` (per the migration done by `doeh_test_thinking_cleanup_20260615` commit `24ba2499`)
- Mock receives 0 calls; test fails with `Expected 'send' to have been called once. Called 0 times.`
**Fix:**
```python
# Before:
with patch('src.ai_client.send') as mock_send:
...
mock_send.assert_called_once()
# After:
with patch('src.ai_client.send_result') as mock_send_result:
mock_send_result.return_value = Result(data="mocked response")
...
mock_send_result.assert_called_once()
```
**Test isolation:** 1 atomic commit for both test fixes (per-file atomicity).
**Reference:** `doeh_test_thinking_cleanup_20260615/plan.md` Task 2.7 (the headless_service `test_generate_endpoint` mock migration is the canonical reference).
### 3.5 UI Polish test fixes (2 tests)
**Current state (verified 2026-06-15):**
The UI Polish Five Issues track (`docs/superpowers/specs/2026-06-03-ui-polish-design.md`) has 5 phases. Per the code audit (2026-06-15):
| Phase | Status | Code location | Test status |
|---|---|---|---|
| 1. Markdown tables | SHIPPED (commit `79ac9210`) | `src/markdown_table.py` | passing |
| 2. Keep Pairs input | SHIPPED (user commit `d0b06575`) | `src/gui_2.py:5130-5131` (now `set_next_item_width(140)` + `drag_int`) | FAILING (test bug — see below) |
| 3. Refresh Registry | SHIPPED (user commit `df7bda6e`) | `src/gui_2.py:2111-2112` (in-place `load_registry()`) | FAILING (test bug — see below) |
| 4. Vendor State tab | SHIPPED (commit `3a864076`) | `src/vendor_state.py` | passing |
| 5. Files & Media directory tree | SHIPPED (commit `74e02485`) | `src/gui_2.py:render_files_and_media` | passing |
**Test bug in Phase 2 (`test_discussion_truncate_layout.py`):**
```python
def test_keep_pairs_input_uses_adequate_width():
src = inspect.getsource(gui_2)
marker = "Keep Pairs:"
idx = src.find(marker) # ← BUG: finds comment block at line 5113
assert idx != -1, "Could not find Keep Pairs label in gui_2.py"
snippet = src[idx:idx + 200] # ← snippet window doesn't reach line 5130
assert "set_next_item_width(80)" not in snippet, ... # passes (vacuously)
assert "set_next_item_width(140)" in snippet, ... # FAILS: snippet ends at the comment
assert "drag_int" in snippet, ... # FAILS: snippet ends at the comment
```
The first occurrence of "Keep Pairs:" is in a comment at line 5113 (in the docstring of `render_discussion_entry_controls`). The actual code is at line 5130. The 200-char snippet window only reaches into the docstring.
**Fix:** Use `rfind()` instead of `find()` to find the LAST occurrence (the actual code):
```python
def test_keep_pairs_input_uses_adequate_width():
src = inspect.getsource(gui_2)
marker = "Keep Pairs:"
idx = src.rfind(marker) # ← finds the code at line 5130
assert idx != -1, "Could not find Keep Pairs label in gui_2.py"
snippet = src[idx:idx + 200] # ← snippet now includes line 5130-5131
assert "set_next_item_width(80)" not in snippet, ...
assert "set_next_item_width(140)" in snippet, ... # passes
assert "drag_int" in snippet, ... # passes
```
**Test bug in Phase 3 (`test_log_management_refresh.py`):**
```python
def test_refresh_registry_button_calls_load_registry():
src = inspect.getsource(gui_2)
marker = "Refresh Registry"
idx = src.find(marker) # ← BUG: finds comment block at line 2090
assert idx != -1, "Could not find Refresh Registry button in gui_2.py"
snippet = src[idx:idx + 400] # ← snippet window doesn't reach line 2111
assert "load_registry" in snippet, ... # FAILS
```
The first occurrence of "Refresh Registry" is in a comment at line 2090. The actual code is at line 2111. The 400-char snippet window doesn't reach the code.
**Fix:** Same pattern — use `rfind()` to find the actual code:
```python
def test_refresh_registry_button_calls_load_registry():
src = inspect.getsource(gui_2)
marker = "Refresh Registry"
idx = src.rfind(marker) # ← finds the code at line 2111
assert idx != -1, "Could not find Refresh Registry button in gui_2.py"
snippet = src[idx:idx + 400]
assert "load_registry" in snippet, ... # passes
assert snippet.count("log_registry.LogRegistry(") <= 1, ... # passes
```
**Test isolation:** 1 atomic commit for both test fixes (per-file atomicity; they're both 1-character changes in the same test fixture style).
**Reference:** `docs/superpowers/specs/2026-06-03-ui-polish-design.md` §3.2 (Phase 2 design) and §3.3 (Phase 3 design).
### 3.6 Deprecation removal
**Files to modify:**
1. **`src/ai_client.py:2939-3040`** — Remove the `@deprecated` decorator on `def send(...)` and the entire function body. The function is replaced by `send_result()` (which already exists at `src/ai_client.py:3002`).
- Verify: `rg "def send\(" src/ai_client.py` returns 0 hits (only `def send_result(` should remain).
2. **`tests/test_deprecation_warnings.py`** — Delete the file. Both tests are obsolete:
- `test_send_deprecated_warning_emitted_once_per_site` — tests `send()`; can't run after `send()` is removed
- `test_send_result_does_not_emit_deprecation` — tests `send_result()` doesn't emit a deprecation; trivially true after `send()` is removed (no deprecation source)
3. **`pyproject.toml:46-47`** — Remove the `filterwarnings` entry:
```toml
filterwarnings = [
"ignore:Use ai_client.send_result.*:DeprecationWarning", # DELETE THIS LINE
]
```
- Verify: `rg "ignore:Use ai_client.send_result" pyproject.toml` returns 0 hits.
**Test isolation:** 1 atomic commit for the 3 changes (consecutive cleanup; the changes are meaningless without each other).
**Reference:** `conductor/tracks/data_oriented_error_handling_20260606/spec.md` §3.5 (deprecation strategy); `pyproject.toml:46-47` (current entry).
### 3.7 Documentation updates
**1. `docs/guide_ai_client.md` — remove deprecation references:**
Search for "deprecat" (case-insensitive) and remove:
- "Use ai_client.send_result() instead" mentions
- "The deprecated send() will be removed in..." warnings
- The entire deprecation warning table at the bottom of the `send_result` section
**2. `conductor/product-guidelines.md` — remove deprecation language:**
Search for "deprecat" (case-insensitive) and remove or update:
- "send() is deprecated" mentions
- "Use send_result()" instructions (the deprecation is being removed)
- Update the "Public API deprecation" section to mark as resolved
**Test isolation:** 1 atomic commit for the 2 doc updates (consecutive cleanup).
**Reference:** `conductor/product-guidelines.md` "Data-Oriented Error Handling > Public API deprecation" section (search for the heading; mark as RESOLVED).
---
## 4. Architecture Reference
### 4.1 The Result API (Fleury Pattern)
The `Result[T, ErrorInfo]` pattern from `conductor/code_styleguides/error_handling.md` is the foundation. This track is the **removal of the deprecation** that the data_oriented_error_handling track introduced; the new API is the permanent one.
**Key files:**
- `src/result_types.py` — `Result`, `ErrorInfo`, `ErrorKind`, `NilPath`, `NilRAGState`
- `src/ai_client.py:3002` — `def send_result(...)` (the permanent public API after this track)
- `src/ai_client.py:_send_<vendor>()` (6 vendors) — return `Result[str]`
**Per-call-site error handling pattern (canonical):**
```python
result = ai_client.send_result(md_content, user_message, base_dir, ...)
if not result.ok:
err = result.errors[0]
# call-site-specific error handling:
# - HTTP layer (app_controller:_api_generate): raise HTTPException(502, detail=err.ui_message())
# - GUI layer (app_controller:_handle_request_event): log to comms + add error entry
# - MMA worker (multi_agent_conductor): log to comms + return per-ticket error
# - Tier 1/2 sub-agents (orchestrator_pm, conductor_tech_lead): log warn + return None or empty
response = result.data
```
### 4.2 The deprecated send() function
Per `conductor/tracks/data_oriented_error_handling_20260606/spec.md` §3.5 lines 183-206, the `send()` function:
- Was added in `data_oriented_error_handling_20260606` Phase 3 (commit `73cf321c`)
- Wraps `send_result()` and unwraps the `Result` to return `str`
- Is marked `@deprecated` via `typing_extensions.deprecated` (Python 3.11+ backport)
- Emits a `DeprecationWarning` at runtime (cached per call site)
The `filterwarnings` entry in `pyproject.toml:46-47` silences the warning during the transition period. This track removes both the function and the filter entry.
### 4.3 Threading & Locking
The production call site migrations MUST preserve the existing locking:
- `multi_agent_conductor.py:591` — runs in a worker thread; the `set_comms_log_callback` and `set_current_tier` calls before the `send()` call MUST be preserved
- `orchestrator_pm.py:86` — runs in the orchestrator thread; lock acquisition patterns must be preserved
- `conductor_tech_lead.py:68` — runs in a sub-agent thread; the `set_custom_system_prompt` and `set_current_tier` calls before the `send()` call MUST be preserved
**Reference:** `docs/guide_ai_client.md` "Threading Model" section; `docs/guide_app_controller.md` "AI Loop Lifecycle" section.
### 4.4 The MMA per-ticket error handling
The MMA worker (`multi_agent_conductor.py:run_worker_lifecycle`) currently does NOT have per-ticket error handling — it expects `send()` to return a `str` (and raises an exception on internal errors which the worker catches). After this track, `send_result()` returns a `Result[str]` with the errors in `result.errors`. The migration must:
1. Check `result.ok` immediately after the call
2. If `!result.ok`:
- Log the error to the comms log via `worker_comms_callback` (status entry with `err.ui_message()`)
- Return a sentinel value that the DAG engine marks as failed (e.g., return `None` and the worker exits with non-zero status)
3. If `result.ok`: continue with `result.data` as before
**Reference:** `docs/guide_mma.md` "Worker Lifecycle" section; the `multi_agent_conductor.py:584` `worker_comms_callback` (already wired up).
---
## 5. Test Plan
### 5.1 Per-phase test verification
Each phase must pass targeted tests before moving to the next:
| Phase | Test command | Expected |
|---|---|---|
| 1 | `uv run pytest tests/test_conductor_tech_lead.py tests/test_orchestrator_pm.py tests/test_mma_concurrent_tracks_sim.py tests/test_mma_step_mode_sim.py tests/test_undo_redo_sim.py -v 2>&1 \| tee tests/artifacts/public_api_phase1.log` | All pass |
| 2 | `uv run pytest tests/test_ai_client_cli.py tests/test_ai_cache_tracking.py tests/test_ai_client_result.py tests/test_api_events.py tests/test_deepseek_provider.py tests/test_gemini_cli_*.py tests/test_gui2_mcp.py tests/test_tier4_interceptor.py tests/test_token_usage.py -v 2>&1 \| tee tests/artifacts/public_api_phase2.log` | All pass |
| 3 | `uv run pytest tests/test_qwen_provider.py -v` | 5/5 pass (2 of which were the pre-existing failures) |
| 4 | `uv run pytest tests/test_symbol_parsing.py -v` | 2/2 pass (which were the pre-existing failures) |
| 5 | `uv run pytest tests/test_discussion_truncate_layout.py tests/test_log_management_refresh.py -v` | 2/2 pass (which were the pre-existing failures) |
| 6 | `uv run pytest tests/test_deprecation_warnings.py -v 2>&1` (should fail — file is deleted) + `uv run rg "ai_client\.send\(" src/ tests/` (should return 0) | File deleted; 0 rg hits |
| 7 | `uv run pytest tests/ 2>&1 \| tee tests/artifacts/public_api_phase7_full.log` | 4 fewer failures than pre-track (10 - 6 = 4 RAG failures remain) |
### 5.2 Per-task TDD red verification
For each task that introduces a new test, the implementer MUST:
1. Verify the test FAILS as expected (red phase)
2. Implement the fix
3. Verify the test PASSES (green phase)
4. Commit
**Anti-pattern guard:** per `AGENTS.md` "Critical Anti-Patterns", no skipping tests just because they fail. If a test fails for an unexpected reason, the implementer MUST investigate before committing.
### 5.3 Test isolation
Per `docs/guide_testing.md` "Structural Testing Contract":
- No `unittest.mock.patch` on core infrastructure (event queues, `ai_client` internals, threading primitives) unless explicitly authorized
- All integration tests use `live_gui` fixture
- Test artifacts in `tests/artifacts/` or `tests/logs/` (gitignored)
This track's tests are mostly UNIT tests (no `live_gui` needed). The MMA migration test (Phase 1) MAY need `live_gui` for the worker dispatch path; verify by running targeted tests first.
---
## 6. Migration Strategy
### 6.1 The order matters
**Phase 1 must complete before Phase 6:**
- Phase 1 migrates the 3 production call sites to `send_result()`
- Phase 6 removes the legacy `send()` function
- If Phase 6 runs first, the production code (still using `send()`) crashes
**Phase 2 must complete before Phase 6:**
- Phase 2 migrates the 12 test files to `send_result()`
- Phase 6 removes the legacy `send()` function
- If Phase 6 runs first, the tests (still using `send()`) crash
**Phase 3, 4, 5 can run in any order after Phase 1** (they're independent test fixes).
**Phase 7 is the final sweep** (docs + tracks.md + full suite).
### 6.2 Per-commit safety
Each atomic commit must:
- Be buildable (`python -c "import src.ai_client"` succeeds)
- Pass its targeted tests
- Not introduce a regression in the previously-passing tests
- Have a clear commit message with the task number
The per-task commit pattern (per `conductor/workflow.md`):
```
fix(ai_client): migrate conductor_tech_lead.py:68 to send_result() (G1, public_api_migration_and_ui_polish_20260615 Phase 1.1)
```
The per-phase checkpoint pattern:
```
conductor(checkpoint): Phase 1 complete - 3 production call sites migrated
```
---
## 7. Out of Scope
### 7.1 Deferred to separate tracks
| ID | Item | Defer to | Why |
|---|---|---|---|
| OOS1 | 4 RAG test failures (test_rag_integration, test_rag_phase4_final_verify, test_rag_phase4_stress, test_rag_visual_sim) | RAG subsystem track (planned; not yet specced) | Pre-existing RAG subsystem issues; error is in RAG config lookup code, not AI client code. A partial fix was attempted in commit `16412ad5`; the remaining issue is a different code path. |
| OOS2 | The `_send_<vendor>()` → `_send_<vendor>_result()` rename per the data_oriented_error_handling spec §3.4 line 611 | Separate "private API rename" track (if needed) | Not blocking; tests work with current names. The function names are stable; only the return type changed. |
| OOS3 | The 23 lower-impact files with weak types (per `data_structure_strengthening_20260606/spec.md` §1 line 20) | `data_structure_strengthening_20260606` (the next major track after this) | That's exactly what data_structure_strengthening is for. |
| OOS4 | The 4 remaining UI Polish track phases that ARE NOT in this scope (none — all 5 are either shipped or addressed by this track's test fixes) | N/A | All 5 UI Polish phases are accounted for. |
| OOS5 | `live_gui_mock_injection_20260615` infrastructure | Separate infrastructure track | Not blocking. Recommended but not required. |
### 7.2 Explicitly NOT in this track
- **Renaming `_send_<vendor>()` to `_send_<vendor>_result()`** — not needed; tests work with current names after assertion pattern fix
- **Adding TypedDict / @dataclass schemas** — that's data_structure_strengthening's scope
- **MMA per-ticket Result returns (per `data_oriented_error_handling_20260606/spec.md` §12.1 line 677 "Adds any new public API surface needed (e.g., per-ticket Result returns in the MMA conductor)")** — the MMA worker already gets `Result[str]` from `send_result()`; the existing `worker_comms_callback` already handles per-ticket status updates. The spec's mention of "per-ticket Result returns" was speculative; the current Result-based flow is sufficient.
- **Removing the `filterwarnings` for the `Optional[T]` ban** — the `audit_optional_in_3_files.py` audit (per `data_oriented_error_handling_20260606/spec.md`) is unrelated to this track's deprecation removal.
---
## 8. Risks & Mitigations
| ID | Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|---|
| **R1** | `multi_agent_conductor.py:591` migration breaks MMA worker dispatch (5 callbacks) | Medium | High | TDD red first: verify a known MMA test fails before the fix; verify it passes after. The existing `doeh_test_thinking_cleanup_20260615` G1 fix pattern is the canonical reference for Result handling. |
| **R2** | Removing `send()` breaks a test that imports it indirectly | Low | Medium | Run `rg "ai_client\.send\(" src/ tests/` before AND after Phase 6 to confirm 0 hits. |
| **R3** | `pyproject.toml` filterwarnings removal causes test suite to fail with `DeprecationWarning` (e.g., from another library) | Low | Low | The filter was added in `data_oriented_error_handling_20260606` specifically to silence `send()` deprecation; no other deprecation in the codebase is silenced by it. Verified by checking the rg history. |
| **R4** | UI Polish test fixes (`find()` → `rfind()`) mask a real production bug | Low | Medium | The production code at `src/gui_2.py:5130-5131` and `:2111-2112` was already verified to have the correct values (`set_next_item_width(140)` + `drag_int` and in-place `load_registry()`). The test bug is just the search logic. |
| **R5** | Qwen test fix uses a different pattern than grok/llama/llama_native | Low | Low | The plan uses the same `assert result.ok and result.data == "x"` pattern as `doeh_test_thinking_cleanup_20260615` (commits `d7e42a4a`, `439a0ac0`, `dbdf9ba9`). |
| **R6** | `test_deprecation_warnings.py` deletion is misinterpreted as "deleting tests instead of fixing them" | Low | Low | Both tests in the file are obsolete after `send()` removal. The first test (test_send_deprecated) literally cannot run without `send()`. The second test (test_send_result_does_not_emit_deprecation) is trivially true. Document in the commit message. |
| **R7** | The 4 RAG test failures get introduced or regressed during this track | Low | Medium | Run full test suite in Phase 7 and compare to the pre-track baseline. The 4 RAG failures are documented as "pre-existing" with their defer-to track recorded. |
---
## 9. Verification Criteria (definition of "done")
The track is DONE when **ALL** of the following are true:
1. **G1-G3 production migrations complete**: 3 call sites use `send_result()`; no `ai_client.send(` in `src/`
2. **G4 test migration complete**: 12 test files use `send_result()`; no `ai_client.send(` in `tests/`
3. **G5 Qwen test fix complete**: `test_qwen_provider.py` 5/5 pass
4. **G6 symbol_parsing test fix complete**: `test_symbol_parsing.py` 2/2 pass
5. **G7-G8 UI Polish test fixes complete**: `test_discussion_truncate_layout.py` 1/1 + `test_log_management_refresh.py` 1/1 pass
6. **G9 deprecation removed**: `@deprecated` decorator and `send()` function gone from `src/ai_client.py`
7. **G10 test_deprecation_warnings.py deleted**: file does not exist
8. **G11 filterwarnings removed**: no `ignore:Use ai_client.send_result` in `pyproject.toml`
9. **G12-G13 docs updated**: no `@deprecated` or "send is deprecated" mentions in `docs/guide_ai_client.md` or `conductor/product-guidelines.md`
10. **NF1 no regressions**: full test suite has 4 RAG failures remaining (down from 10); no new failures
11. **NF2 per-task commits**: ~28 atomic commits with clear messages
12. **NF3 style preserved**: 1-space indentation, no comments, type hints in all changed code
13. **NF4 per-commit git notes**: all 28 commits have git notes summarizing the task
14. **NF5 doeh state.toml parseable**: `tomllib.load()` succeeds (unchanged from previous track; sanity check)
15. **Final state**: 1280 + 6 newly-passing = 1286 tests pass; 4 RAG failures documented as deferred
**Test count math:**
- Pre-track baseline: 1280 pass + 4 skip + 10 fail (verified 2026-06-15)
- After this track: 1286 pass + 4 skip + 4 fail (6 newly-passing: 2 Qwen + 2 symbol_parsing + 1 truncate + 1 refresh)
- The 4 remaining failures are all RAG subsystem; deferred to the next track
---
## 10. Execution Order & Dependencies
**No external blockers.** This track can start immediately after the Tier 1 review approves the spec.
**Execution order (the plan):**
1. Phase 1 (production migration) — 1 day
2. Phase 2 (test migration, 12 files) — 1 day
3. Phase 3 (Qwen test fix) — 1 hour (can be combined with Phase 2)
4. Phase 4 (symbol_parsing test fix) — 30 min (can be combined with Phase 2)
5. Phase 5 (UI Polish test fixes) — 30 min (independent)
6. Phase 6 (deprecation removal) — 30 min (MUST be after Phases 1 + 2)
7. Phase 7 (docs + housekeep) — 1 hour (after Phase 6)
**Total:** 2-3 days Tier 2 work (the estimate accounts for the per-commit overhead + per-task git notes + 7 phase checkpoints).
**Followed by:** the user can start `data_structure_strengthening_20260606` track (already has spec, plan pending).
---
## 11. References
### Architecture docs
- `docs/guide_ai_client.md` — multi-provider LLM client; `send_result()` is the canonical public API
- `docs/guide_app_controller.md` — headless controller; `app_controller.py:_handle_request_event` was migrated by `doeh_test_thinking_cleanup_20260615`
- `docs/guide_mma.md` — 4-tier MMA orchestration; `multi_agent_conductor.py:run_worker_lifecycle` is the worker entry point
- `docs/guide_mcp_client.md` — MCP tool registry (note: `mcp_client.py:2274` was a misidentification in the parent spec)
- `docs/guide_testing.md` — `live_gui` fixture + structural testing contract
### Styleguides
- `conductor/code_styleguides/error_handling.md` — `Result[T]` pattern + the AND-over-OR convention
- `conductor/code_styleguides/data_oriented_design.md` — canonical DOD reference
- `conductor/product-guidelines.md` — 1-space indentation, no comments, type hints, SDM tags
### Parent tracks
- `conductor/tracks/data_oriented_error_handling_20260606/spec.md` §3.5 (deprecation strategy), §12.1 (follow-up scope)
- `conductor/tracks/data_oriented_error_handling_20260606/state.toml` — the parent track's state
- `conductor/tracks/doeh_test_thinking_cleanup_20260615/spec.md` — the previous track; the migration pattern reference
- `conductor/tracks/doeh_test_thinking_cleanup_20260615/plan.md` Phase 2 — exact test mock fix pattern (Tasks 2.1-2.5)
- `docs/reports/TRACK_COMPLETION_doeh_test_thinking_cleanup_20260615.md` — the 11 mock fixes that established the pattern
### UI Polish track
- `docs/superpowers/specs/2026-06-03-ui-polish-design.md` — the 5-phase UI Polish spec
- `docs/superpowers/plans/2026-06-03-ui-polish.md` — the 5-phase UI Polish plan
- User commits: `d0b06575` (Phase 2 code fix), `df7bda6e` (Phase 3 code fix)
- Track commits: `79ac9210` (Phase 1), `3a864076` (Phase 4), `74e02485` (Phase 5)
### Test files (the 12 + 1 to migrate, the 4 UI Polish fixes)
- 12 send() test files: `test_ai_client_cli`, `test_ai_cache_tracking`, `test_ai_client_result`, `test_api_events`, `test_deepseek_provider`, `test_gemini_cli_edge_cases`, `test_gemini_cli_integration`, `test_gemini_cli_parity_regression`, `test_gui2_mcp`, `test_tier4_interceptor`, `test_token_usage`, `test_symbol_parsing`
- 1 _send_ test file: `test_qwen_provider`
- 2 UI Polish test files: `test_discussion_truncate_layout`, `test_log_management_refresh`
- 1 file to delete: `test_deprecation_warnings`
### Production call sites (3 to migrate)
- `src/conductor_tech_lead.py:68`
- `src/orchestrator_pm.py:86`
- `src/multi_agent_conductor.py:591`
### Codebase locations
- `src/ai_client.py:2939-3040` — the deprecated `send()` function (to be deleted)
- `src/ai_client.py:3002` — the new `send_result()` public API (kept)
- `pyproject.toml:46-47` — the `filterwarnings` entry (to be deleted)
- `tests/test_deprecation_warnings.py` — the 2 obsolete tests (to be deleted)
- `docs/guide_ai_client.md` — deprecation references (to be removed)
- `conductor/product-guidelines.md` — deprecation language (to be removed)