diff --git a/conductor/tracks/qwen_llama_grok_integration_20260606/spec.md b/conductor/tracks/qwen_llama_grok_integration_20260606/spec.md
index 6cf93f29..5f0ee28b 100644
--- a/conductor/tracks/qwen_llama_grok_integration_20260606/spec.md
+++ b/conductor/tracks/qwen_llama_grok_integration_20260606/spec.md
@@ -156,7 +156,7 @@ _qwen_history_lock: threading.Lock = threading.Lock()
 | `qwen-long` | false | true | false | 1,000,000 | $0.07 | $0.28 |
 | `qwen-vl-plus` | true | true | false | 131,072 | $0.21 | $0.63 |
 | `qwen-vl-max` | true | true | false | 32,768 | $0.50 | $1.50 |
-| `qwen-audio` | true (audio) | true | false | 32,768 | $0.10 | $0.30 |
+| `qwen-audio` | false | true | false | 32,768 | $0.10 | $0.30 |
 
 (Pricing from Alibaba Cloud DashScope public pricing as of 2026-06-06; update if needed.)
 
@@ -164,7 +164,7 @@ _qwen_history_lock: threading.Lock = threading.Lock()
 
 **Tool format translation:** DashScope uses a slightly different tool schema than OpenAI. The Qwen adapter translates from the normalized tool definitions (OpenAI-shaped) to DashScope's `tools: list[dict]` with `parameters: dict` schema.
 
-**Vision / audio:** Qwen-VL accepts image URLs or base64; Qwen-Audio accepts audio URLs or base64. The adapter handles the multipart encoding.
+**Vision / audio:** Qwen-VL accepts image URLs or base64; the adapter handles the multipart encoding for the OpenAI-compatible `image_url` content type. **Qwen-Audio in v1 is text-only** — the `audio_input` capability is deferred to v2 (see §3.3). Users can still select Qwen-Audio in v1 for text-only tasks; the audio attachment button is hidden via the (absent) audio capability check.
 
 **Error classification:** `_classify_qwen_error()` maps DashScope exceptions to `ProviderError` kinds (`quota`, `rate_limit`, `auth`, `balance`, `network`).
 
@@ -345,7 +345,7 @@ The GUI reads `get_capabilities(active_vendor, active_model)` once per render fr
 | UI Element | Behavior based on matrix |
 |---|---|
 | **Screenshot button** (Message panel) | Enabled iff `vision: true`. Tooltip explains why if disabled. |
-| **Audio attachment button** (Message panel) | **Deferred to v2.** Stub: always hidden in v1. |
+| **Audio attachment button** (Message panel) | **Deferred to v2.** Stub: always hidden in v1 (the `audio_input` capability is not in the v1 matrix; v1 has no audio UI at all). |
 | **Tools enabled toggle** (Message panel) | Enabled iff `tool_calling: true`. |
 | **Cache panel** (Operations Hub) | Visible iff `caching: true`. |
 | **Cache indicators** (Token budget) | Shown iff `caching: true`. |
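
Review note: the gating rules in the UI table could be sketched roughly as below. This is only an illustration, not the project's implementation. The `"qwen"` vendor key, the `CAPABILITIES` dict, the `UiState` class, and `ui_state_for()` are hypothetical names, and the mapping of the matrix's three boolean columns to `vision`, `tool_calling`, and `caching` is assumed from the keys named in the UI table. Only `get_capabilities` and `audio_input` come from the spec text.

```python
from dataclasses import dataclass

# Hypothetical excerpt of the v1 capability matrix. The column-to-key mapping
# (vision, tool_calling, caching) is an assumption, not confirmed by the spec.
CAPABILITIES = {
    ("qwen", "qwen-vl-max"): {"vision": True, "tool_calling": True, "caching": False},
    ("qwen", "qwen-audio"): {"vision": False, "tool_calling": True, "caching": False},
}


def get_capabilities(vendor: str, model: str) -> dict:
    """Return a copy of the capability row; unknown models get an empty row."""
    return dict(CAPABILITIES.get((vendor, model), {}))


@dataclass(frozen=True)
class UiState:
    """Hypothetical per-frame view of which UI elements are active."""
    screenshot_enabled: bool
    audio_button_visible: bool
    tools_toggle_enabled: bool
    cache_panel_visible: bool


def ui_state_for(vendor: str, model: str) -> UiState:
    caps = get_capabilities(vendor, model)
    return UiState(
        screenshot_enabled=caps.get("vision", False),
        # `audio_input` is not in the v1 matrix, so this lookup always
        # falls back to False and the button stays hidden.
        audio_button_visible=caps.get("audio_input", False),
        tools_toggle_enabled=caps.get("tool_calling", False),
        cache_panel_visible=caps.get("caching", False),
    )
```

Under these assumptions, selecting `qwen-audio` leaves the tools toggle enabled while the screenshot and audio buttons stay off, which matches the "text-only in v1" wording added in this change.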