06716252f1
Three additions to the spec, per the user's architectural correction
in this session:
1. NEW section 3.1.1: 'Architectural principle: Use the best API per
vendor' — explains why the OpenAI-compatible shim loses vendor-
specific features (xAI: prompt_cache_key, reasoning_effort, server-
side tools, cost_in_usd_ticks; Ollama: think param, images array,
thinking field, structured outputs) and states the principle:
'use each vendor's native SDK or REST API when one exists, falling
back to OpenAI-compatible only when no native option exists.'
Also notes that the capability matrix IS the aggregate tracker;
future native features go into the matrix, and the GUI filters
based on it (no per-vendor UI branches).
2. UPDATED section 4.3 (Grok): 'Grok via xAI (Native REST API)' — was
'OpenAI-Compatible'. Now specifies two native endpoints
(/v1/chat/completions and /v1/responses), the native features that
matter, the updated capability registry (caching=true for Grok
via prompt_cache_key), and a 'Phase 3 placeholder behavior' note
that this track's Phase 3 ships the OpenAI-compatible Grok as a
placeholder. The native refactor is deferred to follow-up B.
3. UPDATED section 13.1: added follow-up track B 'Native Vendor APIs
(post-OpenAI-compatible-placeholder)' which documents:
- Grok → xAI native REST
- Llama (Ollama) → native /api/chat
- Llama (Meta Llama API) → new 4th backend (deferred pending
verification of Meta's API spec; llama.developer.meta.com/docs/overview
returned 400 on fetch this session)
- Capability matrix expansion (web_search, x_search, code_execution,
file_search, mcp_support, reasoning_effort, structured_output)
- Test rewrites (mock requests.post instead of chat.completions.create)
This is a docs-only commit; no code changes. The Phase 3 Green work
continues with the OpenAI-compatible approach as planned in the
existing Red tests (t3.3 Grok + t3.14 Llama), and the follow-up track
B handles the native refactor when prioritized.