manual_slop/conductor/tracks/cost_token_analytics_20260306/spec.md
2026-03-06 15:47:18 -05:00

Track Specification: Cost & Token Analytics Panel (cost_token_analytics_20260306)

Overview

A real-time cost tracking panel that displays cost per model, session totals, and a breakdown by tier. It uses the existing cost_tracker.py, which is implemented but has no GUI representation.

Current State Audit

Already Implemented (DO NOT re-implement)

cost_tracker.py (src/cost_tracker.py)

  • MODEL_PRICING list: List of (regex_pattern, rates_dict) tuples
    MODEL_PRICING = [
        (r"gemini-2\.5-flash-lite", {"input_per_mtok": 0.075, "output_per_mtok": 0.30}),
        (r"gemini-2\.5-flash", {"input_per_mtok": 0.15, "output_per_mtok": 0.60}),
        (r"gemini-3-flash-preview", {"input_per_mtok": 0.15, "output_per_mtok": 0.60}),
        (r"gemini-3\.1-pro-preview", {"input_per_mtok": 3.50, "output_per_mtok": 10.50}),
        (r"claude-.*-sonnet", {"input_per_mtok": 3.0, "output_per_mtok": 15.0}),
        (r"deepseek-v3", {"input_per_mtok": 0.27, "output_per_mtok": 1.10}),
    ]
    
  • estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float: Uses regex match, returns 0.0 for unknown models
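The lookup described above can be sketched as follows. This is a minimal illustration rather than the actual contents of src/cost_tracker.py: it assumes re.search is used, that the first matching pattern wins, and that rates apply per million tokens (as the key names suggest). Only two table entries are reproduced, and the 1-space indentation follows the project convention listed in the acceptance criteria.

```python
import re

# Two entries copied from the table above; rates are per million tokens.
MODEL_PRICING = [
 (r"gemini-2\.5-flash-lite", {"input_per_mtok": 0.075, "output_per_mtok": 0.30}),
 (r"gemini-2\.5-flash", {"input_per_mtok": 0.15, "output_per_mtok": 0.60}),
]

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
 """Return the estimated cost, or 0.0 when no pattern matches."""
 for pattern, rates in MODEL_PRICING:
  # Assumption: first match wins, so more specific patterns come first.
  if re.search(pattern, model):
   return (input_tokens * rates["input_per_mtok"]
    + output_tokens * rates["output_per_mtok"]) / 1_000_000
 return 0.0
```

Under the first-match assumption, list order matters: the "gemini-2.5-flash" pattern also matches "gemini-2.5-flash-lite", so the "-lite" entry has to appear before it to get the cheaper rate.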

MMA Tier Usage Tracking (multi_agent_conductor.py)

  • ConductorEngine.tier_usage already tracks per-tier token counts AND model:
    self.tier_usage = {
     "Tier 1": {"input": 0, "output": 0, "model": "gemini-3.1-pro-preview"},
     "Tier 2": {"input": 0, "output": 0, "model": "gemini-3-flash-preview"},
     "Tier 3": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"},
     "Tier 4": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"},
    }
    
  • Key insight: The model name is already tracked per tier in tier_usage[tier]["model"]
  • Updated in run_worker_lifecycle() from comms_log token counts

Gaps to Fill (This Track's Scope)

  • No GUI panel to display cost information
  • No session-level cost accumulation
  • No per-tier cost breakdown in UI

Architectural Constraints

Non-Blocking Updates

  • Cost calculations MUST NOT block the UI thread
  • Token counts are read from the existing tier_usage - no new tracking needed
  • Use cached values; recompute only on state change events

Cross-Thread Data Access

  • tier_usage is updated on worker threads
  • GUI reads via MMA state updates through _pending_gui_tasks pattern
  • Already synchronized through existing state update mechanism
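The hand-off between worker threads and the GUI thread could look roughly like the sketch below. The class name PendingGuiTasks, its methods, and the task dictionary shape are hypothetical stand-ins for the existing _pending_gui_tasks pattern, whose actual implementation is not shown in this spec.

```python
import copy
import threading

class PendingGuiTasks:
 """Hypothetical stand-in for the _pending_gui_tasks hand-off."""

 def __init__(self) -> None:
  self._lock = threading.Lock()
  self._tasks: list[dict] = []

 def push_usage(self, tier_usage: dict) -> None:
  # Worker thread: deep-copy so later mutations never reach the GUI.
  snapshot = copy.deepcopy(tier_usage)
  with self._lock:
   self._tasks.append({"action": "mma_state_update", "tier_usage": snapshot})

 def drain(self) -> list[dict]:
  # GUI thread: swap the list out under the lock, process outside it.
  with self._lock:
   tasks, self._tasks = self._tasks, []
  return tasks

tasks = PendingGuiTasks()
usage = {"Tier 3": {"input": 0, "output": 0, "model": "gemini-2.5-flash-lite"}}
worker = threading.Thread(target=tasks.push_usage, args=(usage,))
worker.start()
worker.join()
usage["Tier 3"]["input"] = 999  # worker-side mutation after the snapshot
drained = tasks.drain()
```

Because the GUI only ever sees snapshots, the cost panel never reads a tier_usage entry that a worker is halfway through updating.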

Architecture Reference

Key Integration Points

File                          Lines       Purpose
src/cost_tracker.py           10-40       MODEL_PRICING, estimate_cost()
src/multi_agent_conductor.py  ~50-60      tier_usage dict with input/output/model
src/gui_2.py                  ~2700-2800  _render_mma_dashboard() - existing tier usage display

Cost Calculation Pattern

from src import cost_tracker
usage = engine.tier_usage["Tier 3"]
cost = cost_tracker.estimate_cost(
    usage["model"],      # Already tracked!
    usage["input"],
    usage["output"]
)

Functional Requirements

FR1: Session Cost Accumulation

  • Track total cost for the current session in App/AppController state
  • Reset the total on session reset
  • The session total is the sum of all tier costs
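One way to hold the session total is sketched below. SessionCost and flat_rate are hypothetical names, and the sketch assumes the counts in tier_usage are cumulative for the session, so the total is recomputed from scratch on each update rather than incremented.

```python
from typing import Callable

class SessionCost:
 """Hypothetical holder for the session total kept in App/AppController."""

 def __init__(self, estimate: Callable[[str, int, int], float]) -> None:
  self._estimate = estimate
  self.total = 0.0

 def recompute(self, tier_usage: dict) -> float:
  # Assumption: counts are cumulative, so recompute instead of adding deltas.
  self.total = sum(
   self._estimate(u["model"], u["input"], u["output"])
   for u in tier_usage.values()
  )
  return self.total

 def reset(self) -> None:
  # Called from the session reset path.
  self.total = 0.0

def flat_rate(model: str, input_tokens: int, output_tokens: int) -> float:
 # Stand-in for cost_tracker.estimate_cost: 1.0 per million tokens.
 return (input_tokens + output_tokens) / 1_000_000

session = SessionCost(flat_rate)
usage = {
 "Tier 1": {"input": 500_000, "output": 500_000, "model": "m"},
 "Tier 2": {"input": 1_000_000, "output": 0, "model": "m"},
}
first = session.recompute(usage)
second = session.recompute(usage)  # same snapshot, same total
```

Recomputing from the snapshot keeps repeated state updates idempotent. If tier_usage turned out to hold per-call deltas instead, the total would need to be accumulated rather than replaced.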

FR2: Per-Tier Cost Display

  • Show cost per MMA tier using existing tier_usage[tier]["model"] for model
  • Show input/output tokens alongside cost
  • Calculate using cost_tracker.estimate_cost()
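The per-tier display data could be assembled as in the sketch below. TierCostRow, build_tier_rows, format_row, and flat_rate are illustrative names, not existing code, and the text layout is only a placeholder for whatever widgets the panel ends up using.

```python
from typing import Callable, NamedTuple

class TierCostRow(NamedTuple):
 tier: str
 model: str
 input_tokens: int
 output_tokens: int
 cost: float

def build_tier_rows(
 tier_usage: dict,
 estimate: Callable[[str, int, int], float],
) -> list[TierCostRow]:
 rows = []
 for tier in sorted(tier_usage):
  u = tier_usage[tier]
  # The model comes from the tier entry itself; nothing new is tracked.
  cost = estimate(u["model"], u["input"], u["output"])
  rows.append(TierCostRow(tier, u["model"], u["input"], u["output"], cost))
 return rows

def format_row(row: TierCostRow) -> str:
 # Plain-text layout; a real panel would hand these fields to GUI widgets.
 return (f"{row.tier} | {row.model} | in {row.input_tokens:,}"
  f" | out {row.output_tokens:,} | cost {row.cost:.4f}")

def flat_rate(model: str, input_tokens: int, output_tokens: int) -> float:
 # Stand-in for cost_tracker.estimate_cost: 1.0 per million tokens.
 return (input_tokens + output_tokens) / 1_000_000

usage = {"Tier 3": {"input": 1_500_000, "output": 500_000, "model": "gemini-2.5-flash-lite"}}
rows = build_tier_rows(usage, flat_rate)
line = format_row(rows[0])
```

In the application, cost_tracker.estimate_cost would be passed in place of flat_rate.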

FR3: Real-Time Updates

  • Update cost display when MMA state changes
  • Hook into existing mma_state_update event handling
  • No polling - event-driven
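The event-driven update can be illustrated with a small cache that is refreshed only when an mma_state_update arrives and is merely read on each frame. CostPanelCache, its method names, and the payload shape are assumptions made for this sketch.

```python
from typing import Callable

class CostPanelCache:
 """Hypothetical cache: refreshed per event, read per frame."""

 def __init__(self, estimate: Callable[[str, int, int], float]) -> None:
  self._estimate = estimate
  self.tier_costs: dict[str, float] = {}
  self.session_total = 0.0

 def on_mma_state_update(self, payload: dict) -> None:
  # Event path: runs once per state change, never once per frame.
  usage = payload.get("tier_usage", {})
  self.tier_costs = {
   tier: self._estimate(u["model"], u["input"], u["output"])
   for tier, u in usage.items()
  }
  self.session_total = sum(self.tier_costs.values())

 def frame_values(self) -> tuple[dict[str, float], float]:
  # Render path: returns cached numbers, performs no pricing lookups.
  return self.tier_costs, self.session_total

calls = []

def counting_estimate(model: str, input_tokens: int, output_tokens: int) -> float:
 # Stand-in for cost_tracker.estimate_cost that records each invocation.
 calls.append(model)
 return (input_tokens + output_tokens) / 1_000_000

cache = CostPanelCache(counting_estimate)
cache.on_mma_state_update(
 {"tier_usage": {"Tier 1": {"input": 2_000_000, "output": 0, "model": "m"}}}
)
for _ in range(100):  # simulate 100 rendered frames
 cache.frame_values()
```

Here the estimate function runs once per tier for the single event, no matter how many frames are rendered, which is what keeps the per-frame cost small.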

Non-Functional Requirements

Requirement        Constraint
Frame Time Impact  < 1 ms when panel visible
Memory Overhead    < 1 KB for session cost state
Thread Safety      Read tier_usage via state updates only

Testing Requirements

Unit Tests

  • Test estimate_cost() with known model/token combinations
  • Test unknown model returns 0.0
  • Test session cost accumulation

Integration Tests (via live_gui fixture)

  • Verify cost panel displays after API call
  • Verify costs update after MMA execution
  • Verify session reset clears costs

Structural Testing Contract

  • Use real cost_tracker module - no mocking
  • Test artifacts go to tests/artifacts/

Out of Scope

  • Historical cost tracking across sessions
  • Cost budgeting/alerts
  • Export cost reports
  • Per-model aggregation (model already per-tier)
  • API cost for web searches (no token counts available)

Acceptance Criteria

  • Cost panel displays in GUI
  • Per-tier cost shown with token counts
  • Tier breakdown uses existing tier_usage model field
  • Total session cost accumulates correctly
  • Panel updates on MMA state changes
  • Uses existing cost_tracker.estimate_cost()
  • Session reset clears costs
  • 1-space indentation maintained