Files

Ed_ fca40fd8da refinement of upcoming tracks

2026-03-06 15:41:33 -05:00

4.9 KiB

Raw Blame History

Track Specification: Cost & Token Analytics Panel (cost_token_analytics_20260306)

Overview

Real-time cost tracking panel displaying cost per model, session totals, and breakdown by tier. Uses existing cost_tracker.py which is implemented but has no GUI representation.

Current State Audit

Already Implemented (DO NOT re-implement)

cost_tracker.py (src/cost_tracker.py)

MODEL_PRICING dict: Pricing per 1M tokens for all supported models

MODEL_PRICING: dict[str, dict[str, float]] = {
 "gemini-2.5-flash-lite": {"input": 0.075, "output": 0.30},
 "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
 "gemini-3.1-pro-preview": {"input": 1.25, "output": 5.00},
 "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},
 "deepseek-v3": {"input": 0.27, "output": 1.10},
 # ... more models
}

estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float: Calculate cost in USD
Returns 0.0 for unknown models - safe default

Token Tracking in ai_client.py

_add_bleed_derived() (ai_client.py): Adds derived token counts to comms entries
get_history_bleed_stats(): Returns token statistics from history
Gemini: Token counts from API response (usage_metadata)
Anthropic: Token counts from API response (usage)
DeepSeek: Token counts from API response (usage)

MMA Tier Usage Tracking

ConductorEngine.tier_usage (multi_agent_conductor.py): Tracks per-tier token usage

self.tier_usage = {
 "Tier 1": {"input": 0, "output": 0},
 "Tier 2": {"input": 0, "output": 0},
 "Tier 3": {"input": 0, "output": 0},
 "Tier 4": {"input": 0, "output": 0},
}

Gaps to Fill (This Track's Scope)

No GUI panel to display cost information
No session-level cost accumulation
No per-model breakdown visualization
No tier breakdown visualization

Architectural Constraints

Non-Blocking Updates

Cost calculations MUST NOT block UI thread
Token counts are read from existing tracking - no new API calls
Use cached values, update on state change events

Cross-Thread Data Access

tier_usage is updated on asyncio worker thread
GUI reads via _process_pending_gui_tasks pattern
Already synchronized through MMA state updates

Memory Efficiency

Session cost is a simple float - no history array needed
Per-model costs can be dict: {model_name: float}

Architecture Reference

Key Integration Points

File	Lines	Purpose
`src/cost_tracker.py`	10-40	`MODEL_PRICING`, `estimate_cost()`
`src/ai_client.py`	~500-550	`_add_bleed_derived()`, `get_history_bleed_stats()`
`src/multi_agent_conductor.py`	~50-60	`tier_usage` dict
`src/gui_2.py`	~2700-2800	`_render_mma_dashboard()` - existing tier usage display
`src/gui_2.py`	~1800-1900	`_render_token_budget_panel()` - potential location

Existing MMA Dashboard Pattern

The _render_mma_dashboard() method already displays tier usage in a table. Extend this pattern for cost display.

Functional Requirements

FR1: Session Cost Accumulation

Track total cost for the current session
Reset on session reset
Store in App or AppController state

FR2: Per-Model Cost Display

Show cost broken down by model name
Group by provider (Gemini, Anthropic, DeepSeek)
Show token counts alongside costs

FR3: Tier Breakdown Display

Show cost per MMA tier (Tier 1-4)
Use existing tier_usage data
Calculate cost using cost_tracker.estimate_cost()

FR4: Real-Time Updates

Update cost display when MMA state changes
Hook into existing mma_state_update event handling
No polling - event-driven

Non-Functional Requirements

Requirement	Constraint
Frame Time Impact	<1ms when panel visible
Memory Overhead	<1KB for session cost state
Thread Safety	Read tier_usage via state updates only

Testing Requirements

Unit Tests

Test estimate_cost() with known model/token combinations
Test unknown model returns 0.0
Test session cost accumulation

Integration Tests (via `live_gui` fixture)

Verify cost panel displays after API call
Verify costs update after MMA execution
Verify session reset clears costs

Structural Testing Contract

Use real cost_tracker module - no mocking
Test artifacts go to tests/artifacts/

Out of Scope

Historical cost tracking across sessions
Cost budgeting/alerts
Export cost reports
API cost for web searches (no token counts available)

Acceptance Criteria

Cost panel displays in GUI
Per-model cost shown with token counts
Tier breakdown accurate using tier_usage
Total session cost accumulates correctly
Panel updates on MMA state changes
Uses existing cost_tracker.estimate_cost()
Session reset clears costs
1-space indentation maintained

4.9 KiB Raw Blame History