more refinements

2026-03-06 15:47:18 -05:00
parent fca40fd8da
commit 49ae811be9
2 changed files with 149 additions and 106 deletions
--- a/conductor/tracks/cost_token_analytics_20260306/spec.md
+++ b/conductor/tracks/cost_token_analytics_20260306/spec.md
@@ -8,59 +8,48 @@ Real-time cost tracking panel displaying cost per model, session totals, and bre
 ### Already Implemented (DO NOT re-implement)

 #### cost_tracker.py (src/cost_tracker.py)
-- **`MODEL_PRICING` dict**: Pricing per 1M tokens for all supported models
+- **`MODEL_PRICING` list**: List of (regex_pattern, rates_dict) tuples
  ```python
-  MODEL_PRICING: dict[str, dict[str, float]] = {
-   "gemini-2.5-flash-lite": {"input": 0.075, "output": 0.30},
-   "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
-   "gemini-3.1-pro-preview": {"input": 1.25, "output": 5.00},
-   "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},
-   "deepseek-v3": {"input": 0.27, "output": 1.10},
-   # ... more models
-  }
+  MODEL_PRICING = [
+      (r"gemini-2\.5-flash-lite", {"input_per_mtok": 0.075, "output_per_mtok": 0.30}),
+      (r"gemini-2\.5-flash", {"input_per_mtok": 0.15, "output_per_mtok": 0.60}),
+      (r"gemini-3-flash-preview", {"input_per_mtok": 0.15, "output_per_mtok": 0.60}),
+      (r"gemini-3\.1-pro-preview", {"input_per_mtok": 3.50, "output_per_mtok": 10.50}),
+      (r"claude-.*-sonnet", {"input_per_mtok": 3.0, "output_per_mtok": 15.0}),
+      (r"deepseek-v3", {"input_per_mtok": 0.27, "output_per_mtok": 1.10}),
+  ]
  ```
-- **`estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float`**: Calculate cost in USD
-- **Returns 0.0 for unknown models** - safe default
+- **`estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float`**: Uses regex match, returns 0.0 for unknown models

-#### Token Tracking in ai_client.py
-- **`_add_bleed_derived()`** (ai_client.py): Adds derived token counts to comms entries
-- **`get_history_bleed_stats()`**: Returns token statistics from history
-- **Gemini**: Token counts from API response (`usage_metadata`)
-- **Anthropic**: Token counts from API response (`usage`)
-- **DeepSeek**: Token counts from API response (`usage`)
-
-#### MMA Tier Usage Tracking
-- **`ConductorEngine.tier_usage`** (multi_agent_conductor.py): Tracks per-tier token usage
+#### MMA Tier Usage Tracking (multi_agent_conductor.py)
+- **`ConductorEngine.tier_usage`** already tracks per-tier token counts AND model:
  ```python
  self.tier_usage = {
-   "Tier 1": {"input": 0, "output": 0},
-   "Tier 2": {"input": 0, "output": 0},
-   "Tier 3": {"input": 0, "output": 0},
-   "Tier 4": {"input": 0, "output": 0},
+   "Tier 1": {"input": 0, "output": 0, "model": "gemini-3.1-pro-preview"},
+   "Tier 2": {"input": 0, "output": 0, "model": "gemini-3-flash-preview"},
+   "Tier 3": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"},
+   "Tier 4": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"},
  }
  ```
+- **Key insight**: The model name is already tracked per tier in `tier_usage[tier]["model"]`
+- Updated in `run_worker_lifecycle()` from comms_log token counts

 ### Gaps to Fill (This Track's Scope)
 - No GUI panel to display cost information
 - No session-level cost accumulation
-- No per-model breakdown visualization
-- No tier breakdown visualization
+- No per-tier cost breakdown in UI

 ## Architectural Constraints

 ### Non-Blocking Updates
 - Cost calculations MUST NOT block UI thread
-- Token counts are read from existing tracking - no new API calls
+- Token counts are read from existing tier_usage - no new tracking needed
 - Use cached values, update on state change events

 ### Cross-Thread Data Access
-- `tier_usage` is updated on asyncio worker thread
-- GUI reads via `_process_pending_gui_tasks` pattern
-- Already synchronized through MMA state updates
-
-### Memory Efficiency
-- Session cost is a simple float - no history array needed
-- Per-model costs can be dict: `{model_name: float}`
+- `tier_usage` is updated on worker threads
+- GUI reads via MMA state updates through `_pending_gui_tasks` pattern
+- Already synchronized through existing state update mechanism

 ## Architecture Reference

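The regex-based `MODEL_PRICING` lookup and the per-tier `tier_usage` structure described in the hunk above can be sketched together as follows. This is a minimal illustration, not the contents of `src/cost_tracker.py`: the spec only says `estimate_cost()` "uses regex match", so the first-match-wins loop, the use of `re.search`, and the `session_cost()` helper are all assumptions made for this example. The rates are copied from the new side of the diff.

```python
import re

# Stand-in for MODEL_PRICING as shown in the diff (USD per 1M tokens).
# Order matters under the assumed first-match-wins rule: the more specific
# "flash-lite" pattern must precede "flash", which would also match it.
MODEL_PRICING = [
    (r"gemini-2\.5-flash-lite", {"input_per_mtok": 0.075, "output_per_mtok": 0.30}),
    (r"gemini-2\.5-flash", {"input_per_mtok": 0.15, "output_per_mtok": 0.60}),
    (r"gemini-3-flash-preview", {"input_per_mtok": 0.15, "output_per_mtok": 0.60}),
    (r"gemini-3\.1-pro-preview", {"input_per_mtok": 3.50, "output_per_mtok": 10.50}),
    (r"claude-.*-sonnet", {"input_per_mtok": 3.0, "output_per_mtok": 15.0}),
    (r"deepseek-v3", {"input_per_mtok": 0.27, "output_per_mtok": 1.10}),
]


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD, or 0.0 if no pattern matches."""
    for pattern, rates in MODEL_PRICING:
        # Assumption: the first pattern found anywhere in the model name wins.
        if re.search(pattern, model):
            return (
                input_tokens * rates["input_per_mtok"]
                + output_tokens * rates["output_per_mtok"]
            ) / 1_000_000
    return 0.0


def session_cost(tier_usage: dict) -> float:
    """Hypothetical helper: sum the cost of every tier in a tier_usage dict."""
    return sum(
        estimate_cost(usage["model"], usage["input"], usage["output"])
        for usage in tier_usage.values()
    )
```

Because each tier entry already carries its own `model`, a session total under these assumptions needs no extra bookkeeping beyond the existing `tier_usage` dict.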
@@ -69,38 +58,72 @@ Real-time cost tracking panel displaying cost per model, session totals, and bre
 | File | Lines | Purpose |
 |------|-------|---------|
 | `src/cost_tracker.py` | 10-40 | `MODEL_PRICING`, `estimate_cost()` |
-| `src/ai_client.py` | ~500-550 | `_add_bleed_derived()`, `get_history_bleed_stats()` |
-| `src/multi_agent_conductor.py` | ~50-60 | `tier_usage` dict |
+| `src/multi_agent_conductor.py` | ~50-60 | `tier_usage` dict with input/output/model |
 | `src/gui_2.py` | ~2700-2800 | `_render_mma_dashboard()` - existing tier usage display |
-| `src/gui_2.py` | ~1800-1900 | `_render_token_budget_panel()` - potential location |

-### Existing MMA Dashboard Pattern
-The `_render_mma_dashboard()` method already displays tier usage in a table. Extend this pattern for cost display.
+### Cost Calculation Pattern
+```python
+from src import cost_tracker
+usage = engine.tier_usage["Tier 3"]
+cost = cost_tracker.estimate_cost(
+    usage["model"],      # Already tracked!
+    usage["input"],
+    usage["output"]
+)
+```

 ## Functional Requirements

 ### FR1: Session Cost Accumulation
-- Track total cost for the current session
+- Track total cost for the current session in App/AppController state
 - Reset on session reset
-- Store in `App` or `AppController` state
+- Sum of all tier costs

-### FR2: Per-Model Cost Display
-- Show cost broken down by model name
-- Group by provider (Gemini, Anthropic, DeepSeek)
-- Show token counts alongside costs
+### FR2: Per-Tier Cost Display
+- Show cost per MMA tier using existing `tier_usage[tier]["model"]` for model
+- Show input/output tokens alongside cost
+- Calculate using `cost_tracker.estimate_cost()`

-### FR3: Tier Breakdown Display
-- Show cost per MMA tier (Tier 1-4)
-- Use existing `tier_usage` data
-- Calculate cost using `cost_tracker.estimate_cost()`
-
-### FR4: Real-Time Updates
+### FR3: Real-Time Updates
 - Update cost display when MMA state changes
 - Hook into existing `mma_state_update` event handling
 - No polling - event-driven

 ## Non-Functional Requirements

+| Requirement | Constraint |
+|-------------|------------|
+| Frame Time Impact | <1ms when panel visible |
+| Memory Overhead | <1KB for session cost state |
+
+## Testing Requirements
+
+### Unit Tests
+- Test `estimate_cost()` with known model/token combinations
+- Test unknown model returns 0.0
+- Test session cost accumulation
+
+### Integration Tests (via `live_gui` fixture)
+- Verify cost panel displays after MMA execution
+- Verify session reset clears costs
+
+## Out of Scope
+- Historical cost tracking across sessions
+- Cost budgeting/alerts
+- Per-model aggregation (model already per-tier)
+
+## Acceptance Criteria
+- [ ] Cost panel displays in GUI
+- [ ] Per-tier cost shown with token counts
+- [ ] Tier breakdown uses existing tier_usage model field
+- [ ] Total session cost accumulates correctly
+- [ ] Panel updates on MMA state changes
+- [ ] Uses existing `cost_tracker.estimate_cost()`
+- [ ] Session reset clears costs
+- [ ] 1-space indentation maintained
+
+## Non-Functional Requirements
+
 | Requirement | Constraint |
 |-------------|------------|
 | Frame Time Impact | <1ms when panel visible |