Private
Public Access
0
0
Commit Graph

3 Commits

Author SHA1 Message Date
ed 0a9e277564 feat(capability_matrix): add 12 v2 fields to VendorCapabilities
The 7 v1 fields (vision, tool_calling, caching, streaming,
model_discovery, context_window, cost_tracking) plus 2 cost
fields and notes are now extended by 12 v2 fields:

  local, reasoning, structured_output, code_execution,
  web_search, x_search, file_search, mcp_support,
  audio, video, grounding, computer_use

All default to False. Registry entries continue to work
unchanged (backward compatible). t4_1 of Phase 4.

Tests:
- 12 parameterized 'default is False' tests
- 12 parameterized 'round-trip to True' tests
- 3 'local flag' tests: per-model, wildcard fallback,
  vendor isolation
- 3 pre-existing registry tests still pass
- 96/96 vendor+tool+provider+import-isolation tests pass
  (no regressions; +27 new tests this commit)
2026-06-11 20:24:30 -04:00
ed 6be04bc4f0 feat(vendor_capabilities): implement registry with initial 22-entry population
Green phase: src/vendor_capabilities.py now exists and all 3 Red-phase
tests in tests/test_vendor_capabilities.py pass.

Implementation:
- VendorCapabilities frozen dataclass with 12 fields (vendor, model, vision,
  tool_calling, caching, streaming, model_discovery, context_window,
  cost_tracking, cost_input_per_mtok, cost_output_per_mtok, notes)
- Module-level _REGISTRY dict keyed by (vendor, model)
- register() inserts/overwrites entries
- get_capabilities() returns specific entry if present, else vendor '*'
  default, else raises KeyError with 'No capabilities registered' message
- list_models_for_vendor() returns sorted model names for a vendor
  (excludes '*' wildcard)

Initial population (22 entries at module load):
- 1 minimax wildcard (cost: 0.20/0.20 per Mtok)
- 4 grok (1 wildcard + 3 models; grok-2-vision has vision=True)
- 9 llama (1 wildcard + 8 models; 11b/90b vision variants have vision=True)
- 8 qwen (1 wildcard + 7 models; qwen-vl-plus/max have vision=True;
  qwen-audio has notes='Text-only in v1; audio input deferred')

The plan's Task 1.3 listed 22 entries but included one impossible entry
(vendor='minimax', model='grok-2-latest'). Omitted; 21 entries shipped.

Test fix: test_fallback_to_vendor_default previously used model name
'llama-3.3-70b-specdec' which IS in the registry, so the specific entry
was returned (with default cost_tracking=True), not the wildcard. Fixed
by changing to 'llama-3.3-future-unregistered' (not in registry, so
fallback fires correctly).
2026-06-11 00:30:52 -04:00
ed 6fb6f8653c test(vendor_capabilities): red phase for registry lookup, fallback, unknown vendor
3 failing tests in tests/test_vendor_capabilities.py that establish the
core behaviors of the new VendorCapability matrix:

1. test_registry_lookup_known_model: registering and looking up a specific
   (vendor, model) entry returns the registered entry
2. test_fallback_to_vendor_default: looking up an unregistered model returns
   the vendor's '*' default entry
3. test_unknown_vendor_raises: looking up a vendor with no entries raises
   KeyError with a 'No capabilities registered' message

All 3 tests fail with ModuleNotFoundError: No module named
'src.vendor_capabilities' (confirmed via pytest). The implementation file
will be created in the next commit (Green phase).

The autouse _clean_registry fixture snapshots src.vendor_capabilities._REGISTRY
before each test and restores it after, providing test isolation for the
module-level state.
2026-06-11 00:19:00 -04:00