Side concerns for Phase 3:
1. PROVIDERS: src/models.py:56 now includes 'grok' and 'llama' alongside
the 6 existing vendors. Centralized registry; gui_2.py and
app_controller.py import from here. State tasks t3.5 and t3.16
were scoped to gui_2.py/app_controller.py but the actual change
is at the centralized registry, per the project's single-source-of-
truth pattern (per src/models.py module docstring and the Phase 5
audit script audit_no_models_config_io.py which enforces that
PROVIDERS lives in models.py).
2. cost_tracker.py: added 11 regex pricing entries (3 Grok + 8 Llama):
Grok (per xAI public pricing):
- grok-2: 2.00 / 10.00
- grok-2-vision: 2.00 / 10.00
- grok-beta: 5.00 / 15.00
Llama (per Grok's consultation: pricing varies by backend; registry
entries represent the most common case):
- llama-3.1-8b-instant: 0.05 / 0.08 (Groq)
- llama-3.1-70b-versatile: 0.59 / 0.79 (Groq)
- llama-3.1-405b-reasoning: 3.00 / 3.00 (OpenRouter avg)
- llama-3.2-1b-preview: 0.04 / 0.04
- llama-3.2-3b-preview: 0.06 / 0.06
- llama-3.2-11b-vision-preview: 0.18 / 0.18
- llama-3.2-90b-vision-preview: 0.90 / 0.90
- llama-3.3-70b-specdec: 0.59 / 0.79 (Groq)
(all per 1M tokens, USD; matches the structure of existing entries;
note: 'llama-3.1', 'llama-3.2', 'llama-3.3' are regex patterns to
allow future model variants in the same family.)
Spot check:
- estimate_cost('grok-2', 1000, 500) = 0.007 (= 0.002 + 0.005)
- estimate_cost('llama-3.3-70b-specdec', 1000, 500) = 0.000985
3. SKIPPED t3.4 and t3.15 (credentials templates): no
credentials_template.toml exists in the project (Phase 2 established
this). The user maintains their own credentials.toml directly.
4. t3.6 and t3.17 (Grok/Llama models in capability registry) were
completed in Phase 1's initial population of 22 entries
(commit 6be04bc). Grok has 4 entries (1 wildcard + 3 models);
Llama has 9 entries (1 wildcard + 8 models). Grok-2-vision has
vision=True; Llama 3.2-11b/90b vision variants have vision=True.
Verification: 38/38 tests pass in batch.
Side concerns for Phase 2:
1. PROVIDERS: src/models.py:56 now includes 'qwen' alongside the existing
5 vendors. The other 4 references to PROVIDERS in src/gui_2.py and
src/app_controller.py import from this centralized list, so this
one edit propagates everywhere. State task t2.8 was scoped to
'gui_2.py and app_controller.py' but the actual change is at the
centralized registry, per the project's single-source-of-truth
pattern (per src/models.py module docstring and the Phase 5 audit
script audit_no_models_config_io.py which enforces that PROVIDERS
lives in models.py).
2. cost_tracker.py: added 7 regex pricing entries for the Qwen models
shipped in Phase 1's vendor_capabilities.py:
- qwen-turbo: 0.05 / 0.10
- qwen-plus: 0.40 / 1.20
- qwen-max: 2.00 / 6.00
- qwen-long: 0.07 / 0.28
- qwen-vl-plus: 0.21 / 0.63
- qwen-vl-max: 0.50 / 1.50
- qwen-audio: 0.10 / 0.30
(all per 1M tokens, USD; matches the structure of existing entries)
Spot check: estimate_cost('qwen-max', 1000, 500) = 0.005 (= 0.002 + 0.003)
3. SKIPPED t2.7 (credentials template): no credentials_template.toml
exists in the project. The only credentials file is the active
credentials.toml which the user maintains directly with their own
API keys. The plan's assumption of a template file does not match
the project's actual structure. Documented in the commit log
rather than modifying the user's actual credentials.toml with a
placeholder key (which would be inconsistent with the rest of
that file's pattern of real keys). When the user obtains a
DashScope API key, they can add a [qwen] section directly.
4. t2.9 (Qwen models in capability registry) was completed in Phase 1's
initial population of 22 entries (commit 6be04bc). The 8 qwen
entries (1 wildcard + 7 specific models) are in src/vendor_capabilities.py.
Verification: 30/30 tests pass in batch
(test_qwen_provider, test_minimax_provider, test_ai_client_no_top_level_sdk_imports,
test_vendor_capabilities, test_openai_compatible, test_cost_tracker)