conductor(spec): Fix Qwen-Audio matrix entry consistency (vision=false, audio deferred)

The capability matrix v1 has no 'audio' field (audio_input is deferred to v2).
Qwen-Audio's vision flag was incorrectly marked true. Changed to false and
clarified that v1 uses Qwen-Audio as text-only; audio attachment UI is
hidden via the absent audio capability check.
2026-06-06 14:58:03 -04:00
parent 055430a75a
commit 97daaff29b
@@ -156,7 +156,7 @@ _qwen_history_lock: threading.Lock = threading.Lock()
| `qwen-long` | false | true | false | 1,000,000 | $0.07 | $0.28 |
| `qwen-vl-plus` | true | true | false | 131,072 | $0.21 | $0.63 |
| `qwen-vl-max` | true | true | false | 32,768 | $0.50 | $1.50 |
-| `qwen-audio` | true (audio) | true | false | 32,768 | $0.10 | $0.30 |
+| `qwen-audio` | false | true | false | 32,768 | $0.10 | $0.30 |
(Pricing from Alibaba Cloud DashScope public pricing as of 2026-06-06; update if needed.)
@@ -164,7 +164,7 @@ _qwen_history_lock: threading.Lock = threading.Lock()
**Tool format translation:** DashScope uses a slightly different tool schema than OpenAI. The Qwen adapter translates from the normalized tool definitions (OpenAI-shaped) to DashScope's `tools: list[dict]` with `parameters: dict` schema.
-**Vision / audio:** Qwen-VL accepts image URLs or base64; Qwen-Audio accepts audio URLs or base64. The adapter handles the multipart encoding.
+**Vision / audio:** Qwen-VL accepts image URLs or base64; the adapter handles the multipart encoding for the OpenAI-compatible `image_url` content type. **Qwen-Audio in v1 is text-only**: the `audio_input` capability is deferred to v2 (see §3.3). Users can still select Qwen-Audio in v1 for text-only tasks; the audio attachment button is hidden via the (absent) audio capability check.
**Error classification:** `_classify_qwen_error()` maps DashScope exceptions to `ProviderError` kinds (`quota`, `rate_limit`, `auth`, `balance`, `network`).
@@ -345,7 +345,7 @@ The GUI reads `get_capabilities(active_vendor, active_model)` once per render fr
| UI Element | Behavior based on matrix |
|---|---|
| **Screenshot button** (Message panel) | Enabled iff `vision: true`. Tooltip explains why if disabled. |
-| **Audio attachment button** (Message panel) | **Deferred to v2.** Stub: always hidden in v1. |
+| **Audio attachment button** (Message panel) | **Deferred to v2.** Stub: always hidden in v1 (the `audio_input` capability is not in the v1 matrix; v1 has no audio UI at all). |
| **Tools enabled toggle** (Message panel) | Enabled iff `tool_calling: true`. |
| **Cache panel** (Operations Hub) | Visible iff `caching: true`. |
| **Cache indicators** (Token budget) | Shown iff `caching: true`. |
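
The gating rules in the table above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Capabilities` dataclass, the `_MATRIX` excerpt, the `"qwen"` vendor key, the `ui_state()` helper, and the conservative fallback for unknown models are all assumptions, and the meaning of the second and third matrix columns (`tool_calling`, `caching`) is inferred rather than confirmed. Only the name `get_capabilities(active_vendor, active_model)` and the per-element rules come from the spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    # v1 fields only; there is deliberately no `audio_input` field (deferred to v2).
    vision: bool
    tool_calling: bool
    caching: bool


# Hypothetical excerpt of the capability matrix; keys and column meanings are assumed.
_MATRIX = {
    ("qwen", "qwen-vl-max"): Capabilities(vision=True, tool_calling=True, caching=False),
    ("qwen", "qwen-audio"): Capabilities(vision=False, tool_calling=True, caching=False),
}


def get_capabilities(vendor: str, model: str) -> Capabilities:
    # Assumption: unknown models fall back to the most conservative capabilities.
    return _MATRIX.get((vendor, model), Capabilities(False, False, False))


def ui_state(caps: Capabilities) -> dict[str, bool]:
    """Derive per-element UI state from one capabilities record."""
    return {
        "screenshot_button_enabled": caps.vision,
        # The field is absent in v1, so this lookup always yields False (hidden).
        "audio_button_visible": getattr(caps, "audio_input", False),
        "tools_toggle_enabled": caps.tool_calling,
        "cache_panel_visible": caps.caching,
        "cache_indicators_shown": caps.caching,
    }


if __name__ == "__main__":
    print(ui_state(get_capabilities("qwen", "qwen-audio")))
```

Reading the audio flag with a default of `False` is one way to express "hidden via the absent capability check": selecting `qwen-audio` in v1 leaves the screenshot and audio buttons off while tool calling stays available, and adding an `audio_input` field in v2 would not require touching the gating code.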