Private
Public Access
0
0
Files
manual_slop/conductor/tracks
ed 06716252f1 docs(spec): add 'best API per vendor' principle; mark xAI native as target; document follow-ups
Three additions to the spec, per the user's architectural correction
in this session:

1. NEW section 3.1.1: 'Architectural principle: Use the best API per
   vendor' — explains why the OpenAI-compatible shim loses vendor-
   specific features (xAI: prompt_cache_key, reasoning_effort, server-
   side tools, cost_in_usd_ticks; Ollama: think param, images array,
   thinking field, structured outputs) and states the principle:
   'use each vendor's native SDK or REST API when one exists, falling
   back to OpenAI-compatible only when no native option exists.'

   Also notes that the capability matrix IS the aggregate tracker;
   future native features go into the matrix, and the GUI filters
   based on it (no per-vendor UI branches).

2. UPDATED section 4.3 (Grok): 'Grok via xAI (Native REST API)' — was
   'OpenAI-Compatible'. Now specifies two native endpoints
   (/v1/chat/completions and /v1/responses), the native features that
   matter, the updated capability registry (caching=true for Grok
   via prompt_cache_key), and a 'Phase 3 placeholder behavior' note
   that this track's Phase 3 ships the OpenAI-compatible Grok as a
   placeholder. The native refactor is deferred to follow-up B.

3. UPDATED section 13.1: added follow-up track B 'Native Vendor APIs
   (post-OpenAI-compatible-placeholder)' which documents:
   - Grok → xAI native REST
   - Llama (Ollama) → native /api/chat
   - Llama (Meta Llama API) → new 4th backend (deferred pending
     verification of Meta's API spec; llama.developer.meta.com/docs/overview
     returned 400 on fetch this session)
   - Capability matrix expansion (web_search, x_search, code_execution,
     file_search, mcp_support, reasoning_effort, structured_output)
   - Test rewrites (mock requests.post instead of chat.completions.create)

This is a docs-only commit; no code changes. The Phase 3 Green work
continues with the OpenAI-compatible approach as planned in the
existing Red tests (t3.3 Grok + t3.14 Llama), and the follow-up track
B handles the native refactor when prioritized.
2026-06-11 01:49:36 -04:00
..
2026-06-06 00:42:40 -04:00