refinement of upcoming tracks

2026-03-06 15:41:33 -05:00
parent 3ce6a2ec8a
commit fca40fd8da
24 changed files with 2388 additions and 391 deletions


@@ -1,21 +1,140 @@
# Track Specification: Cost & Token Analytics Panel (cost_token_analytics_20260306)
## Overview
A real-time cost tracking panel that displays cost per model, session totals, and a breakdown by tier. It uses the existing `cost_tracker.py`, which is implemented but has no GUI representation.
## Current State Audit
### Already Implemented (DO NOT re-implement)
#### cost_tracker.py (src/cost_tracker.py)
- **`MODEL_PRICING` dict**: Pricing per 1M tokens for all supported models
```python
MODEL_PRICING: dict[str, dict[str, float]] = {
 "gemini-2.5-flash-lite": {"input": 0.075, "output": 0.30},
 "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
 "gemini-3.1-pro-preview": {"input": 1.25, "output": 5.00},
 "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},
 "deepseek-v3": {"input": 0.27, "output": 1.10},
 # ... more models
}
```
- **`estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float`**: Calculate cost in USD
- **Returns 0.0 for unknown models** - safe default
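To make the per-1M-token arithmetic concrete, here is a minimal sketch of how a function with the `estimate_cost()` signature could compute its result. It is not the code in `src/cost_tracker.py`: the exact-key lookup and the `TOKENS_PER_PRICE_UNIT` constant are assumptions, and only two pricing entries are copied from the table above.

```python
# Illustrative sketch only; the real src/cost_tracker.py may differ.
MODEL_PRICING: dict[str, dict[str, float]] = {
 "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
 "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},
}

# Assumed constant: prices are quoted per 1M tokens.
TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
 """Return the estimated cost in USD, or 0.0 for an unknown model."""
 pricing = MODEL_PRICING.get(model)
 if pricing is None:
  # Unknown model: fall back to the safe default.
  return 0.0
 input_cost = input_tokens / TOKENS_PER_PRICE_UNIT * pricing["input"]
 output_cost = output_tokens / TOKENS_PER_PRICE_UNIT * pricing["output"]
 return input_cost + output_cost
```

Under these prices, 200,000 input tokens and 50,000 output tokens on `claude-3-5-sonnet` cost about 0.2 × 3.00 + 0.05 × 15.00 = 1.35 USD.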
#### Token Tracking in ai_client.py
- **`_add_bleed_derived()`** (ai_client.py): Adds derived token counts to comms entries
- **`get_history_bleed_stats()`**: Returns token statistics from history
- **Gemini**: Token counts from API response (`usage_metadata`)
- **Anthropic**: Token counts from API response (`usage`)
- **DeepSeek**: Token counts from API response (`usage`)
#### MMA Tier Usage Tracking
- **`ConductorEngine.tier_usage`** (multi_agent_conductor.py): Tracks per-tier token usage
```python
self.tier_usage = {
 "Tier 1": {"input": 0, "output": 0},
 "Tier 2": {"input": 0, "output": 0},
 "Tier 3": {"input": 0, "output": 0},
 "Tier 4": {"input": 0, "output": 0},
}
```
### Gaps to Fill (This Track's Scope)
- No GUI panel to display cost information
- No session-level cost accumulation
- No per-model breakdown visualization
- No tier breakdown visualization
## Architectural Constraints
- **Non-Blocking**: Cost calculations MUST NOT block UI thread.
- **Efficient Updates**: Updates SHOULD be throttled, with each update completing in <10ms.
### Non-Blocking Updates
- Cost calculations MUST NOT block UI thread
- Token counts are read from existing tracking - no new API calls
- Use cached values, update on state change events
### Cross-Thread Data Access
- `tier_usage` is updated on asyncio worker thread
- GUI reads via `_process_pending_gui_tasks` pattern
- Already synchronized through MMA state updates
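The hand-off between the worker thread and the GUI thread can be pictured as a lock-protected task list that the GUI drains once per frame. The sketch below is an assumption about the shape of that pattern, not the existing `_process_pending_gui_tasks` implementation; `PendingGuiTasks`, `push`, and `drain` are hypothetical names.

```python
import threading
from typing import Any


class PendingGuiTasks:
 """Hypothetical hand-off queue between a worker thread and the GUI thread."""

 def __init__(self) -> None:
  self._lock = threading.Lock()
  self._tasks: list[dict[str, Any]] = []

 def push(self, task: dict[str, Any]) -> None:
  # Called from the worker thread: enqueue a snapshot, never shared state.
  with self._lock:
   self._tasks.append(task)

 def drain(self) -> list[dict[str, Any]]:
  # Called once per frame on the GUI thread: swap the list out under the lock.
  with self._lock:
   tasks, self._tasks = self._tasks, []
  return tasks
```

A worker would push a copy of `tier_usage` (for example `{k: dict(v) for k, v in tier_usage.items()}`) so that later mutations on the worker side are not visible to the GUI mid-frame.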
### Memory Efficiency
- Session cost is a simple float - no history array needed
- Per-model costs can be a dict: `{model_name: float}`
## Architecture Reference
### Key Integration Points
| File | Lines | Purpose |
|------|-------|---------|
| `src/cost_tracker.py` | 10-40 | `MODEL_PRICING`, `estimate_cost()` |
| `src/ai_client.py` | ~500-550 | `_add_bleed_derived()`, `get_history_bleed_stats()` |
| `src/multi_agent_conductor.py` | ~50-60 | `tier_usage` dict |
| `src/gui_2.py` | ~2700-2800 | `_render_mma_dashboard()` - existing tier usage display |
| `src/gui_2.py` | ~1800-1900 | `_render_token_budget_panel()` - potential location |
### Existing MMA Dashboard Pattern
The `_render_mma_dashboard()` method already displays tier usage in a table. Extend this pattern for cost display.
## Functional Requirements
- **Cost Display**: Show real-time cost for current session.
- **Per-Model Breakdown**: Display cost grouped by model (Gemini, Anthropic, DeepSeek).
- **Tier Breakdown**: Show cost grouped by tier (Tier 1-4).
- **Session Totals**: Accumulate and display total session cost.
### FR1: Session Cost Accumulation
- Track total cost for the current session
- Reset on session reset
- Store in `App` or `AppController` state
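One possible shape for this state is a small dataclass holding a float total and a per-model dict, in line with the memory constraints above. `SessionCostState` and its methods are hypothetical names for illustration; where the object lives (`App` or `AppController`) is left open.

```python
from dataclasses import dataclass, field


@dataclass
class SessionCostState:
 """Hypothetical holder for session-level cost; not part of the existing code."""

 total_usd: float = 0.0
 per_model_usd: dict[str, float] = field(default_factory=dict)

 def add(self, model: str, cost_usd: float) -> None:
  # Accumulate into both the session total and the per-model bucket.
  self.total_usd += cost_usd
  self.per_model_usd[model] = self.per_model_usd.get(model, 0.0) + cost_usd

 def reset(self) -> None:
  # Called on session reset: drop all accumulated cost.
  self.total_usd = 0.0
  self.per_model_usd.clear()
```

Each completed API call would feed `add()` with the value returned by `estimate_cost()`, and the session reset path would call `reset()`.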
### FR2: Per-Model Cost Display
- Show cost broken down by model name
- Group by provider (Gemini, Anthropic, DeepSeek)
- Show token counts alongside costs
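Grouping by provider needs some mapping from model name to provider, which the spec does not define. The sketch below assumes a simple name-prefix rule inferred from the keys of `MODEL_PRICING`; `PROVIDER_PREFIXES`, `provider_for`, and `group_by_provider` are hypothetical, as is the row layout (a dict with `model`, `input`, `output`, and `cost` keys).

```python
from collections import defaultdict

# Assumed prefix-to-provider mapping, inferred from the model names in MODEL_PRICING.
PROVIDER_PREFIXES = {
 "gemini-": "Gemini",
 "claude-": "Anthropic",
 "deepseek-": "DeepSeek",
}


def provider_for(model: str) -> str:
 """Return the provider label for a model name, or "Unknown"."""
 for prefix, provider in PROVIDER_PREFIXES.items():
  if model.startswith(prefix):
   return provider
 return "Unknown"


def group_by_provider(rows: list[dict]) -> dict[str, list[dict]]:
 """Group per-model rows under their provider, keeping first-seen order."""
 grouped: dict[str, list[dict]] = defaultdict(list)
 for row in rows:
  grouped[provider_for(row["model"])].append(row)
 return dict(grouped)
```

Because each row keeps its token counts next to its cost, the panel can render both columns from the same grouped structure.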
### FR3: Tier Breakdown Display
- Show cost per MMA tier (Tier 1-4)
- Use existing `tier_usage` data
- Calculate cost using `cost_tracker.estimate_cost()`
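Since `tier_usage` stores only token counts, pricing a tier requires knowing which model that tier ran on. The sketch below assumes a hypothetical `tier_models` mapping supplies that, and takes the estimator as a parameter so it can be wired to `cost_tracker.estimate_cost()`; `tier_costs` itself is not an existing function.

```python
from typing import Callable


def tier_costs(
 tier_usage: dict[str, dict[str, int]],
 tier_models: dict[str, str],
 estimate_cost: Callable[[str, int, int], float],
) -> dict[str, float]:
 """Map each tier name to its estimated cost in USD."""
 costs: dict[str, float] = {}
 for tier, usage in tier_usage.items():
  # A tier with no configured model is looked up as "", an unknown model.
  model = tier_models.get(tier, "")
  costs[tier] = estimate_cost(model, usage["input"], usage["output"])
 return costs
```

With the documented behaviour of `estimate_cost()`, a tier whose model is missing or unknown shows a cost of 0.0 instead of raising.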
### FR4: Real-Time Updates
- Update cost display when MMA state changes
- Hook into existing `mma_state_update` event handling
- No polling - event-driven
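The event-driven requirement amounts to recomputing costs only inside the state-update handler and letting the render path read a cached result. A minimal sketch follows; `CostPanelCache` is a hypothetical class, and the assumption that the `mma_state_update` payload carries a `tier_usage` key is not confirmed by this spec.

```python
from typing import Callable


class CostPanelCache:
 """Hypothetical cache: recompute on state change, read cheaply every frame."""

 def __init__(self, compute: Callable[[dict], dict]) -> None:
  self._compute = compute  # maps tier_usage to the values the panel displays
  self._cached: dict = {}
  self.recompute_count = 0

 def on_mma_state_update(self, payload: dict) -> None:
  # Event handler: the only place where costs are recalculated.
  self._cached = self._compute(payload.get("tier_usage", {}))
  self.recompute_count += 1

 def render_values(self) -> dict:
  # Called every frame: returns the cached result without recalculating.
  return self._cached
```

Keeping the calculation out of `render_values()` is what lets the panel stay within the frame-time budget regardless of how often it is drawn.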
## Non-Functional Requirements
| Requirement | Constraint |
|-------------|------------|
| Frame Time Impact | <1ms when panel visible |
| Memory Overhead | <1KB for session cost state |
| Thread Safety | Read `tier_usage` via state updates only |
## Testing Requirements
### Unit Tests
- Test `estimate_cost()` with known model/token combinations
- Test unknown model returns 0.0
- Test session cost accumulation
### Integration Tests (via `live_gui` fixture)
- Verify cost panel displays after API call
- Verify costs update after MMA execution
- Verify session reset clears costs
### Structural Testing Contract
- Use real `cost_tracker` module - no mocking
- Test artifacts go to `tests/artifacts/`
## Out of Scope
- Historical cost tracking across sessions
- Cost budgeting/alerts
- Export cost reports
- API cost for web searches (no token counts available)
## Acceptance Criteria
- [ ] Cost panel displays in GUI
- [ ] Per-model cost shown with token counts
- [ ] Tier breakdown accurate using `tier_usage`
- [ ] Total session cost accumulates correctly
- [ ] Panel updates on MMA state changes
- [ ] Uses existing `cost_tracker.estimate_cost()`
- [ ] Session reset clears costs
- [ ] 1-space indentation maintained