manual_slop/conductor/tracks/cost_token_analytics_20260306/spec.md
2026-03-06 15:41:33 -05:00

Track Specification: Cost & Token Analytics Panel (cost_token_analytics_20260306)

Overview

Real-time cost tracking panel that displays cost per model, session totals, and a breakdown by tier. It builds on the existing cost_tracker.py, which is implemented but has no GUI representation.

Current State Audit

Already Implemented (DO NOT re-implement)

cost_tracker.py (src/cost_tracker.py)

  • MODEL_PRICING dict: Pricing in USD per 1M tokens for all supported models
    MODEL_PRICING: dict[str, dict[str, float]] = {
     "gemini-2.5-flash-lite": {"input": 0.075, "output": 0.30},
     "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
     "gemini-3.1-pro-preview": {"input": 1.25, "output": 5.00},
     "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},
     "deepseek-v3": {"input": 0.27, "output": 1.10},
     # ... more models
    }
    
  • estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float: Calculates the cost in USD
  • Returns 0.0 for unknown models (a safe default)
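
The per-1M-token pricing and the 0.0 fallback can be sketched as follows. This is a minimal illustration, not the code in src/cost_tracker.py: it assumes exact-match model lookup and a simple linear formula, and it copies only two entries from MODEL_PRICING. It is indented with one space to match the track's convention.

```python
# Illustrative sketch only; the real implementation lives in src/cost_tracker.py.
MODEL_PRICING: dict[str, dict[str, float]] = {
 "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
 "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},
}

TOKENS_PER_PRICING_UNIT = 1_000_000  # prices are quoted per 1M tokens


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
 """Return the estimated cost in USD, or 0.0 for an unknown model."""
 pricing = MODEL_PRICING.get(model)  # assumption: exact-match lookup
 if pricing is None:
  return 0.0
 input_cost = input_tokens / TOKENS_PER_PRICING_UNIT * pricing["input"]
 output_cost = output_tokens / TOKENS_PER_PRICING_UNIT * pricing["output"]
 return input_cost + output_cost
```

Under these assumptions, 1,000,000 input tokens and 500,000 output tokens on gemini-2.5-flash cost roughly 0.15 + 0.30 = 0.45 USD. Because the result is a float, comparisons in tests should use a tolerance rather than exact equality.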

Token Tracking in ai_client.py

  • _add_bleed_derived() (ai_client.py): Adds derived token counts to comms entries
  • get_history_bleed_stats(): Returns token statistics from history
  • Gemini: Token counts from API response (usage_metadata)
  • Anthropic: Token counts from API response (usage)
  • DeepSeek: Token counts from API response (usage)

MMA Tier Usage Tracking

  • ConductorEngine.tier_usage (multi_agent_conductor.py): Tracks per-tier token usage
    self.tier_usage = {
     "Tier 1": {"input": 0, "output": 0},
     "Tier 2": {"input": 0, "output": 0},
     "Tier 3": {"input": 0, "output": 0},
     "Tier 4": {"input": 0, "output": 0},
    }
    

Gaps to Fill (This Track's Scope)

  • No GUI panel to display cost information
  • No session-level cost accumulation
  • No per-model breakdown visualization
  • No tier breakdown visualization

Architectural Constraints

Non-Blocking Updates

  • Cost calculations MUST NOT block the UI thread
  • Token counts are read from existing tracking; no new API calls are made
  • Use cached values and update them on state change events

Cross-Thread Data Access

  • tier_usage is updated on the asyncio worker thread
  • The GUI reads it via the _process_pending_gui_tasks pattern
  • Access is already synchronized through MMA state updates
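
The handoff described above might look roughly like the sketch below. PendingTaskQueue, post(), and drain() are hypothetical names introduced for illustration; the actual _process_pending_gui_tasks implementation in the codebase may differ. The point is that the worker posts a snapshot and the GUI thread consumes it, so the panel never reads tier_usage while it is being mutated.

```python
import copy
import threading
from typing import Any


class PendingTaskQueue:
 """Hypothetical stand-in for the pending-GUI-task handoff."""

 def __init__(self) -> None:
  self._lock = threading.Lock()
  self._pending: list[dict[str, Any]] = []

 def post(self, task: dict[str, Any]) -> None:
  """Called from the worker thread; stores a deep copy of the task."""
  with self._lock:
   self._pending.append(copy.deepcopy(task))

 def drain(self) -> list[dict[str, Any]]:
  """Called once per frame from the GUI thread; returns and clears tasks."""
  with self._lock:
   tasks, self._pending = self._pending, []
  return tasks
```

The deep copy is a deliberate choice: tier_usage is a nested dict, so a shallow copy would still share the inner per-tier dicts with the worker thread.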

Memory Efficiency

  • Session cost is a single float; no history array is needed
  • Per-model costs can be a dict: {model_name: float}

Architecture Reference

Key Integration Points

File                          Lines        Purpose
src/cost_tracker.py           10-40        MODEL_PRICING, estimate_cost()
src/ai_client.py              ~500-550     _add_bleed_derived(), get_history_bleed_stats()
src/multi_agent_conductor.py  ~50-60       tier_usage dict
src/gui_2.py                  ~2700-2800   _render_mma_dashboard() - existing tier usage display
src/gui_2.py                  ~1800-1900   _render_token_budget_panel() - potential location

Existing MMA Dashboard Pattern

The _render_mma_dashboard() method already displays tier usage in a table. Extend this pattern for cost display.

Functional Requirements

FR1: Session Cost Accumulation

  • Track total cost for the current session
  • Reset on session reset
  • Store in App or AppController state
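
One possible shape for this state is sketched below. SessionCostState is a hypothetical name, not an existing class; it pairs the single session total with the {model_name: float} dict mentioned under Memory Efficiency.

```python
from dataclasses import dataclass, field


@dataclass
class SessionCostState:
 """Hypothetical container for FR1; not an existing class in the codebase."""
 total_usd: float = 0.0
 per_model_usd: dict[str, float] = field(default_factory=dict)

 def add(self, model: str, cost_usd: float) -> None:
  """Accumulate the cost of one completed API call."""
  self.total_usd += cost_usd
  self.per_model_usd[model] = self.per_model_usd.get(model, 0.0) + cost_usd

 def reset(self) -> None:
  """Clear all accumulated costs; intended to run on session reset."""
  self.total_usd = 0.0
  self.per_model_usd.clear()
```

An instance could be held on App or AppController, with add() called after each response and reset() called from the existing session reset path.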

FR2: Per-Model Cost Display

  • Show cost broken down by model name
  • Group by provider (Gemini, Anthropic, DeepSeek)
  • Show token counts alongside costs
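
Grouping rows by provider could be done along these lines. The prefix-to-provider mapping, ModelUsage, provider_for(), and group_by_provider() are all assumptions made for this sketch; the codebase may already record the provider for each call, in which case that source should be preferred.

```python
from typing import NamedTuple


class ModelUsage(NamedTuple):
 input_tokens: int
 output_tokens: int
 cost_usd: float


# Assumed mapping from model-name prefix to provider label.
PROVIDER_PREFIXES: dict[str, str] = {
 "gemini": "Gemini",
 "claude": "Anthropic",
 "deepseek": "DeepSeek",
}


def provider_for(model: str) -> str:
 """Return the provider label for a model name, or "Other" if unmatched."""
 for prefix, provider in PROVIDER_PREFIXES.items():
  if model.startswith(prefix):
   return provider
 return "Other"


def group_by_provider(
 usage: dict[str, ModelUsage],
) -> dict[str, list[tuple[str, ModelUsage]]]:
 """Group per-model rows under their provider, sorted by model name."""
 grouped: dict[str, list[tuple[str, ModelUsage]]] = {}
 for model in sorted(usage):
  grouped.setdefault(provider_for(model), []).append((model, usage[model]))
 return grouped
```

Each row keeps its token counts next to the cost, so the panel can render one table section per provider without a second lookup.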

FR3: Tier Breakdown Display

  • Show cost per MMA tier (Tier 1-4)
  • Use existing tier_usage data
  • Calculate cost using cost_tracker.estimate_cost()
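
A sketch of the per-tier calculation follows. Note that tier_usage as shown earlier stores only token counts, so the model used by each tier has to come from somewhere else; the tier_models mapping here is a hypothetical input, and tier_costs() is an illustrative helper rather than an existing function. The pricing function is passed in so that the real cost_tracker.estimate_cost can be supplied unchanged.

```python
from typing import Callable

# Signature of cost_tracker.estimate_cost as described in this spec.
CostFn = Callable[[str, int, int], float]


def tier_costs(
 tier_usage: dict[str, dict[str, int]],
 tier_models: dict[str, str],
 estimate_cost: CostFn,
) -> dict[str, float]:
 """Return the estimated USD cost for each tier in tier_usage."""
 costs: dict[str, float] = {}
 for tier, counts in tier_usage.items():
  model = tier_models.get(tier, "")  # "" is passed through as an unknown model
  costs[tier] = estimate_cost(model, counts["input"], counts["output"])
 return costs
```

A tier with no configured model ends up at 0.0, provided the supplied function returns 0.0 for unknown models as estimate_cost() is documented to do.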

FR4: Real-Time Updates

  • Update cost display when MMA state changes
  • Hook into existing mma_state_update event handling
  • Event-driven, with no polling
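
The event-driven update can be pictured as a small cache that recomputes only when a state update arrives, while the render path reads the cached rows. CostPanelCache, the "tier_usage" payload key, and the compute_rows callback are assumptions for this sketch; the real mma_state_update payload may be shaped differently.

```python
from typing import Callable

Rows = dict[str, float]


class CostPanelCache:
 """Hypothetical cache that recomputes display rows only on state updates."""

 def __init__(self, compute_rows: Callable[[dict], Rows]) -> None:
  self._compute_rows = compute_rows
  self._rows: Rows = {}
  self.recompute_count = 0  # exposed only to make the behavior observable

 def on_mma_state_update(self, payload: dict) -> None:
  """Recompute cached rows; call this from the state-update handler."""
  tier_usage = payload.get("tier_usage")  # assumed payload key
  if tier_usage is None:
   return
  self._rows = self._compute_rows(tier_usage)
  self.recompute_count += 1

 def rows(self) -> Rows:
  """Cheap read for the per-frame render path; never recomputes."""
  return self._rows
```

With this split, the per-frame cost of the panel is a dict read plus drawing, which helps keep it within the frame-time budget.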

Non-Functional Requirements

Requirement         Constraint
Frame Time Impact   <1ms when panel visible
Memory Overhead     <1KB for session cost state
Thread Safety       Read tier_usage via state updates only

Testing Requirements

Unit Tests

  • Test estimate_cost() with known model/token combinations
  • Test unknown model returns 0.0
  • Test session cost accumulation

Integration Tests (via live_gui fixture)

  • Verify cost panel displays after API call
  • Verify costs update after MMA execution
  • Verify session reset clears costs

Structural Testing Contract

  • Use real cost_tracker module - no mocking
  • Test artifacts go to tests/artifacts/

Out of Scope

  • Historical cost tracking across sessions
  • Cost budgeting/alerts
  • Export cost reports
  • API cost for web searches (no token counts available)

Acceptance Criteria

  • Cost panel displays in GUI
  • Per-model cost shown with token counts
  • Tier breakdown accurate using tier_usage
  • Total session cost accumulates correctly
  • Panel updates on MMA state changes
  • Uses existing cost_tracker.estimate_cost()
  • Session reset clears costs
  • 1-space indentation maintained