The 6 error-classifier functions in ai_client.py, openai_compatible.py,
and qwen_adapter.py now return ErrorInfo (data-oriented) instead of
ProviderError. Each takes a source: str parameter for telemetry
provenance. ProviderError class is still used in production code paths
(Task 3.4) and will be removed in Task 3.7.
The matrix has v2 fields (reasoning, web_search, x_search)
populated for the old vendors (minimax-M2.5/M2.7, grok-*),
but the send functions didn't consult them. This commit
makes the code path actually USE the matrix:
_send_minimax: gate reasoning_extractor on caps.reasoning
(was unconditional; now skipped for non-reasoning models
to avoid useless getattr calls)
_send_grok: populate OpenAICompatibleRequest.extra_body with
search_parameters when caps.web_search or caps.x_search is
True. caps.web_search -> {mode: auto}; caps.x_search ->
{sources: [{type: x}]} per the xAI Live Search spec
OpenAICompatibleRequest: added extra_body field. Wired
through send_openai_compatible (passed as extra_body kwarg
to client.chat.completions.create).
Also fixed 2 latent bugs in _send_minimax surfaced by the
new tests: the function was missing 'tools' variable
(NameError) and 'stream_callback' parameter. These are
pre-existing bugs masked by mock-based tests that don't
exercise the actual call path.
Also cancelled t5_6/7/8 (the invented 'deferred tool-loop
conversion' work). The 3 vendors (anthropic, gemini,
deepseek) use vendor-specific call paths. Their inline
loops are NOT defects. The '3-5 days' / '1-2 weeks'
estimates were made up by the agent. The audit script's
DEFERRED_VENDORS exclusion is permanent.
Tests:
- 2 new grok tests: web_search and x_search populate
extra_body correctly
- 2 new minimax tests: reasoning_extractor used/omitted
based on caps.reasoning
- 122/122 vendor+tool+provider+import-isolation tests pass
(no regressions; +4 new tests this commit)
- 3 audit scripts pass
Green phase: src/openai_compatible.py now exists and all 6 Red-phase
tests in tests/test_openai_compatible.py pass.
Implementation (144 lines, 1-space indent, no comments):
Data structures:
- NormalizedResponse: frozen dataclass with text, tool_calls,
usage_input_tokens, usage_output_tokens, usage_cache_read_tokens,
usage_cache_creation_tokens, raw_response
- OpenAICompatibleRequest: regular dataclass with messages, model,
temperature=0.0, top_p=1.0, max_tokens=8192, tools=None,
tool_choice='auto', stream=False, stream_callback=None
Algorithms:
- send_openai_compatible(client, request, *, capabilities) -> NormalizedResponse
Dispatches to _send_blocking or _send_streaming based on request.stream.
Catches openai.OpenAIError and re-raises as classified ProviderError.
- _send_blocking: extracts message text + tool_calls, converts tool_calls
to dicts via _to_dict_tool_call, reads usage.prompt_tokens /
usage.completion_tokens (with int() coercion for MagicMock test compat).
- _send_streaming: iterates chunks, accumulates text parts, aggregates
tool_calls by index, fires stream_callback per text delta, reads
chunk.usage for final token counts.
- _classify_openai_compatible_error: maps RateLimitError -> 'rate_limit',
AuthenticationError/PermissionDeniedError -> 'auth', APIConnectionError
-> 'network', APIStatusError with 402/429/401-403/500-504 -> 'balance'/
'rate_limit'/'auth'/'network', BadRequestError -> 'quota', fallback
'unknown'. All use provider='openai_compatible'.
Fixed plan's code smell: removed the 'MagicMock_noop' forward-reference
class (defined after first use) and replaced with the cleaner Pythonic
pattern 'int(getattr(usage, prompt_tokens, 0) or 0)'. Real OpenAI SDK
always sets usage on responses; the defensive fallback was noise.
Function-level import of ProviderError inside _classify_openai_compatible_error
avoids any circular import risk.