refinement of upcoming tracks

2026-03-06 15:41:33 -05:00
parent 3ce6a2ec8a
commit fca40fd8da
24 changed files with 2388 additions and 391 deletions
@@ -1,32 +1,169 @@
 # Implementation Plan: Cost & Token Analytics Panel (cost_token_analytics_20260306)

-## Phase 1: Panel Setup
- [ ] Task: Initialize MMA Environment
- [ ] Task: Create cost panel structure
-    - WHERE: src/gui_2.py
-    - WHAT: New panel for cost display
-    - HOW: Add _render_cost_panel method
-    - SAFETY: Non-blocking updates
+> **Reference:** [Spec](./spec.md) | [Architecture Guide](../../../docs/guide_architecture.md)

-## Phase 2: Cost Calculations
- [ ] Task: Integrate cost_tracker
-    - WHERE: src/gui_2.py
-    - WHAT: Use cost_tracker.estimate_cost
-    - HOW: Call with model and token counts
-    - SAFETY: Cache expensive calculations
- [ ] Task: Track session totals
-    - WHERE: src/gui_2.py or app_controller
-    - WHAT: Accumulate cost over session
-    - HOW: Maintain running total
-    - SAFETY: Thread-safe updates
+## Phase 1: Foundation & Research
+Focus: Verify existing infrastructure

-## Phase 3: UI Implementation
- [ ] Task: Render cost breakdown
-    - WHERE: src/gui_2.py
-    - WHAT: Show per-model and per-tier costs
-    - HOW: imgui tables
-    - SAFETY: Handle zero/empty states
+- [ ] Task 1.1: Initialize MMA Environment
+    - Run `activate_skill mma-orchestrator` before starting

-## Phase 4: Verification
- [ ] Task: Test cost calculations
- [ ] Task: Conductor - Phase Verification
+- [ ] Task 1.2: Verify cost_tracker.py implementation
+    - WHERE: `src/cost_tracker.py`
+    - WHAT: Confirm `MODEL_PRICING` dict and `estimate_cost()` function
+    - HOW: Use `manual-slop_py_get_definition` on `estimate_cost`
+    - OUTPUT: Document exact MODEL_PRICING structure for reference
+
+- [ ] Task 1.3: Verify tier_usage in ConductorEngine
+    - WHERE: `src/multi_agent_conductor.py` lines ~50-60
+    - WHAT: Confirm tier_usage dict structure and update mechanism
+    - HOW: Use `manual-slop_py_get_code_outline` on ConductorEngine
+    - SAFETY: Note which thread updates tier_usage
+
+- [ ] Task 1.4: Review existing MMA dashboard
+    - WHERE: `src/gui_2.py` `_render_mma_dashboard()` method
+    - WHAT: Understand existing tier usage table pattern
+    - HOW: Read method to identify extension points
+    - OUTPUT: Note line numbers for table rendering
+
+## Phase 2: State Management
+Focus: Add cost tracking state to app
+
+- [ ] Task 2.1: Add session cost state
+    - WHERE: `src/gui_2.py` or `src/app_controller.py` in `__init__`
+    - WHAT: Add session-level cost tracking state
+    - HOW:
+      ```python
+      self._session_cost_total: float = 0.0
+      self._session_cost_by_model: dict[str, float] = {}
+      self._session_cost_by_tier: dict[str, float] = {
+       "Tier 1": 0.0, "Tier 2": 0.0, "Tier 3": 0.0, "Tier 4": 0.0
+      }
+      ```
+    - CODE STYLE: 1-space indentation
+
+- [ ] Task 2.2: Add cost update logic
+    - WHERE: `src/gui_2.py` in MMA state update handler
+    - WHAT: Calculate costs when tier_usage updates
+    - HOW:
+      ```python
+      def _update_costs_from_tier_usage(self, tier_usage: dict) -> None:
+       for tier, usage in tier_usage.items():
+        cost = cost_tracker.estimate_cost(
+         self.current_model, usage["input"], usage["output"]
+        )
+        self._session_cost_by_tier[tier] = cost
+       self._session_cost_total = sum(self._session_cost_by_tier.values())
+      ```
+    - SAFETY: Called from GUI thread via state update. Assuming `tier_usage` holds cumulative counts, the total is recomputed from the per-tier costs so repeated updates do not double count
+
+- [ ] Task 2.3: Reset costs on session reset
+    - WHERE: `src/gui_2.py` or `src/app_controller.py` reset handler
+    - WHAT: Clear cost state when session resets
+    - HOW: Set all cost values to 0.0 in reset function
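The Phase 2 state, update, and reset steps can be sketched together as one small container. This is only an illustration under stated assumptions: `SessionCosts` and its `estimate` parameter are hypothetical names (the plan keeps these fields directly on the app object and calls `cost_tracker.estimate_cost`), and `tier_usage` is assumed to hold cumulative token counts per tier.

```python
from dataclasses import dataclass, field
from typing import Callable

TIERS = ("Tier 1", "Tier 2", "Tier 3", "Tier 4")

# Signature assumed to match estimate_cost(model, input_tokens, output_tokens) -> float.
Estimator = Callable[[str, int, int], float]


@dataclass
class SessionCosts:
 # Hypothetical container for the session cost state described in Task 2.1.
 total: float = 0.0
 by_tier: dict[str, float] = field(default_factory=lambda: {t: 0.0 for t in TIERS})

 def update(self, tier_usage: dict[str, dict[str, int]], model: str, estimate: Estimator) -> None:
  for tier, usage in tier_usage.items():
   self.by_tier[tier] = estimate(model, usage["input"], usage["output"])
  # Recompute instead of accumulating, so the same snapshot applied twice is not double counted.
  self.total = sum(self.by_tier.values())

 def reset(self) -> None:
  # Task 2.3: return every cost value to 0.0.
  self.by_tier = {t: 0.0 for t in TIERS}
  self.total = 0.0
```

Recomputing the total from the per-tier values keeps `update` idempotent. If `tier_usage` instead carried per-call deltas, both the per-tier and total values would need to accumulate.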
+
+## Phase 3: Panel Implementation
+Focus: Create the GUI panel
+
+- [ ] Task 3.1: Create _render_cost_panel() method
+    - WHERE: `src/gui_2.py` after other render methods
+    - WHAT: New method to display cost information
+    - HOW:
+      ```python
+      def _render_cost_panel(self) -> None:
+       if not imgui.collapsing_header("Cost Analytics"):
+        return
+       
+       # Total session cost
+       imgui.text(f"Session Total: ${self._session_cost_total:.4f}")
+       
+       # Per-tier breakdown
+       if imgui.begin_table("tier_costs", 3):
+        imgui.table_setup_column("Tier")
+        imgui.table_setup_column("Tokens")
+        imgui.table_setup_column("Cost")
+        imgui.table_headers_row()
+        for tier, cost in self._session_cost_by_tier.items():
+         imgui.table_next_row()
+         imgui.table_set_column_index(0)
+         imgui.text(tier)
+         imgui.table_set_column_index(2)
+         imgui.text(f"${cost:.4f}")
+        imgui.end_table()
+       
+       # Per-model breakdown
+       if self._session_cost_by_model:
+        imgui.separator()
+        imgui.text("By Model:")
+        for model, cost in self._session_cost_by_model.items():
+         imgui.bullet_text(f"{model}: ${cost:.4f}")
+      ```
+    - CODE STYLE: 1-space indentation, no comments
+
+- [ ] Task 3.2: Integrate panel into main GUI
+    - WHERE: `src/gui_2.py` in `_gui_func()` or appropriate panel
+    - WHAT: Call `_render_cost_panel()` in layout
+    - HOW: Add near token budget panel or MMA dashboard
+    - SAFETY: None
+
+## Phase 4: Integration with MMA Dashboard
+Focus: Extend existing dashboard with cost column
+
+- [ ] Task 4.1: Add cost column to tier usage table
+    - WHERE: `src/gui_2.py` `_render_mma_dashboard()` 
+    - WHAT: Add "Est. Cost" column to existing tier usage table
+    - HOW:
+      - Change the column count passed to `imgui.begin_table()` from 3 to 4 and add a matching `imgui.table_setup_column()` call
+      - Add "Est. Cost" header
+      - Calculate cost per tier using current model
+      - Display with dollar formatting
+    - SAFETY: Handle missing tier_usage gracefully
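One way to keep the rendering code trivial is to build the four-column rows as plain strings first and handle missing data there. The sketch below is an assumption-laden illustration: `build_tier_rows` is a hypothetical helper, the existing three columns are assumed to be tier, input tokens, and output tokens, and `estimate` stands in for `cost_tracker.estimate_cost`.

```python
from typing import Callable, Optional

TIERS = ("Tier 1", "Tier 2", "Tier 3", "Tier 4")


def build_tier_rows(
 tier_usage: Optional[dict[str, dict[str, int]]],
 model: str,
 estimate: Callable[[str, int, int], float],
) -> list[tuple[str, str, str, str]]:
 # Hypothetical helper: returns (tier, input, output, est. cost) display strings.
 rows = []
 for tier in TIERS:
  # A missing dict or missing tier falls back to zero tokens rather than raising.
  usage = (tier_usage or {}).get(tier) or {}
  tokens_in = int(usage.get("input", 0))
  tokens_out = int(usage.get("output", 0))
  cost = estimate(model, tokens_in, tokens_out)
  rows.append((tier, str(tokens_in), str(tokens_out), f"${cost:.4f}"))
 return rows
```

The render method would then only loop over the rows and emit one `imgui.text()` per cell, which also makes the formatting testable without a GUI.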
+
+- [ ] Task 4.2: Display model name in table
+    - WHERE: `src/gui_2.py` `_render_mma_dashboard()`
+    - WHAT: Show which model was used for each tier
+    - HOW: Add "Model" column with model name
+    - SAFETY: May not know per-tier model - use current_model as fallback
+
+## Phase 5: Testing
+Focus: Verify all functionality
+
+- [ ] Task 5.1: Write unit tests for cost calculation
+    - WHERE: `tests/test_cost_panel.py` (new file)
+    - WHAT: Test cost accumulation logic
+    - HOW: Mock tier_usage, verify costs calculated correctly
+    - PATTERN: Follow `test_cost_tracker.py` as reference
+
+- [ ] Task 5.2: Write integration test
+    - WHERE: `tests/test_cost_panel.py`
+    - WHAT: Test with live_gui, verify panel displays
+    - HOW: Use `live_gui` fixture, trigger API call, check costs
+    - ARTIFACTS: Write to `tests/artifacts/`
+
+- [ ] Task 5.3: Conductor - Phase Verification
+    - Run: `uv run pytest tests/test_cost_panel.py tests/test_cost_tracker.py -v`
+    - Manual: Verify panel displays in GUI
+
+## Implementation Notes
+
+### Thread Safety
+- tier_usage is updated on the asyncio worker thread
+- GUI reads via `_process_pending_gui_tasks` - already synchronized
+- No additional locking needed
+
+### Cost Calculation Strategy
+- Use current model for all tiers (simplification)
+- Future: Track model per tier if needed
+- Unknown models return 0.0 cost (safe default)
+
+### Files Modified
+- `src/gui_2.py`: Add cost state, render methods
+- `src/app_controller.py`: Possibly add cost state (if using controller)
+- `tests/test_cost_panel.py`: New test file
+
+### Code Style Checklist
+- [ ] 1-space indentation throughout
+- [ ] CRLF line endings on Windows
+- [ ] No comments unless requested
+- [ ] Type hints on new state variables
+- [ ] Use existing `vec4` colors for consistency
@@ -1,21 +1,140 @@
 # Track Specification: Cost & Token Analytics Panel (cost_token_analytics_20260306)

 ## Overview
-Real-time cost tracking panel displaying cost per model, session totals, and breakdown by tier. Uses existing cost_tracker.py which is implemented but has no GUI.
+Real-time cost tracking panel displaying cost per model, session totals, and a breakdown by tier. Uses the existing `cost_tracker.py`, which is implemented but has no GUI representation.
+
+## Current State Audit
+
+### Already Implemented (DO NOT re-implement)
+
+#### cost_tracker.py (src/cost_tracker.py)
+- **`MODEL_PRICING` dict**: Pricing per 1M tokens for all supported models
+  ```python
+  MODEL_PRICING: dict[str, dict[str, float]] = {
+   "gemini-2.5-flash-lite": {"input": 0.075, "output": 0.30},
+   "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
+   "gemini-3.1-pro-preview": {"input": 1.25, "output": 5.00},
+   "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},
+   "deepseek-v3": {"input": 0.27, "output": 1.10},
+   # ... more models
+  }
+  ```
+- **`estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float`**: Calculate cost in USD
+- **Returns 0.0 for unknown models** - safe default
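For orientation, per-1M-token pricing implies arithmetic along the following lines. This is a minimal sketch, not the code in `src/cost_tracker.py`: `estimate_cost_sketch` and the two-entry `PRICING_PER_MILLION` table are stand-ins, and the real function may match model names differently (for example by prefix).

```python
# Illustrative stand-in for cost_tracker.estimate_cost; not the real implementation.
PRICING_PER_MILLION: dict[str, dict[str, float]] = {
 "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
 "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},
}


def estimate_cost_sketch(model: str, input_tokens: int, output_tokens: int) -> float:
 """Return the estimated USD cost, or 0.0 when the model has no pricing entry."""
 pricing = PRICING_PER_MILLION.get(model)
 if pricing is None:
  return 0.0
 # Prices are quoted per 1,000,000 tokens, so scale the weighted token sum down.
 weighted = input_tokens * pricing["input"] + output_tokens * pricing["output"]
 return weighted / 1_000_000
```

With these prices, 1,000,000 input tokens and 500,000 output tokens on `gemini-2.5-flash` come to $0.15 + $0.30 = $0.45.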
+
+#### Token Tracking in ai_client.py
+- **`_add_bleed_derived()`** (ai_client.py): Adds derived token counts to comms entries
+- **`get_history_bleed_stats()`**: Returns token statistics from history
+- **Gemini**: Token counts from API response (`usage_metadata`)
+- **Anthropic**: Token counts from API response (`usage`)
+- **DeepSeek**: Token counts from API response (`usage`)
+
+#### MMA Tier Usage Tracking
+- **`ConductorEngine.tier_usage`** (multi_agent_conductor.py): Tracks per-tier token usage
+  ```python
+  self.tier_usage = {
+   "Tier 1": {"input": 0, "output": 0},
+   "Tier 2": {"input": 0, "output": 0},
+   "Tier 3": {"input": 0, "output": 0},
+   "Tier 4": {"input": 0, "output": 0},
+  }
+  ```
+
+### Gaps to Fill (This Track's Scope)
+- No GUI panel to display cost information
+- No session-level cost accumulation
+- No per-model breakdown visualization
+- No tier breakdown visualization

 ## Architectural Constraints
- **Non-Blocking**: Cost calculations MUST NOT block UI thread.
- **Efficient Updates**: Updates SHOULD be throttled to <10ms latency.
+
+### Non-Blocking Updates
+- Cost calculations MUST NOT block UI thread
+- Token counts are read from existing tracking - no new API calls
+- Use cached values, update on state change events
+
+### Cross-Thread Data Access
+- `tier_usage` is updated on the asyncio worker thread
+- GUI reads via `_process_pending_gui_tasks` pattern
+- Already synchronized through MMA state updates
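The hand-off described above can be pictured as a queue that the worker fills and the GUI drains once per frame. The sketch below is a generic illustration of that pattern, not the project's `_process_pending_gui_tasks`: the names `pending`, `worker_publish`, and `drain_pending` are hypothetical.

```python
import queue
from typing import Callable

# Thread-safe hand-off between the worker thread and the GUI thread (hypothetical name).
pending: "queue.Queue[dict[str, dict[str, int]]]" = queue.Queue()


def worker_publish(tier_usage: dict[str, dict[str, int]]) -> None:
 # Runs on the worker thread. Publish a copy so the GUI never sees later mutations.
 pending.put({tier: dict(usage) for tier, usage in tier_usage.items()})


def drain_pending(apply: Callable[[dict[str, dict[str, int]]], None]) -> int:
 # Runs on the GUI thread once per frame. Never blocks; returns the number of snapshots applied.
 count = 0
 while True:
  try:
   snapshot = pending.get_nowait()
  except queue.Empty:
   return count
  apply(snapshot)
  count += 1
```

Because the GUI only touches snapshots it has dequeued, cost recalculation happens on the GUI thread without an extra lock, and `get_nowait()` keeps the frame from stalling when nothing is pending.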
+
+### Memory Efficiency
+- Session cost is a simple float - no history array needed
+- Per-model costs can be dict: `{model_name: float}`
+
+## Architecture Reference
+
+### Key Integration Points
+
+| File | Lines | Purpose |
+|------|-------|---------|
+| `src/cost_tracker.py` | 10-40 | `MODEL_PRICING`, `estimate_cost()` |
+| `src/ai_client.py` | ~500-550 | `_add_bleed_derived()`, `get_history_bleed_stats()` |
+| `src/multi_agent_conductor.py` | ~50-60 | `tier_usage` dict |
+| `src/gui_2.py` | ~2700-2800 | `_render_mma_dashboard()` - existing tier usage display |
+| `src/gui_2.py` | ~1800-1900 | `_render_token_budget_panel()` - potential location |
+
+### Existing MMA Dashboard Pattern
+The `_render_mma_dashboard()` method already displays tier usage in a table. Extend this pattern for cost display.

 ## Functional Requirements
- **Cost Display**: Show real-time cost for current session.
- **Per-Model Breakdown**: Display cost grouped by model (Gemini, Anthropic, DeepSeek).
- **Tier Breakdown**: Show cost grouped by tier (Tier 1-4).
- **Session Totals**: Accumulate and display total session cost.
+
+### FR1: Session Cost Accumulation
+- Track total cost for the current session
+- Reset on session reset
+- Store in `App` or `AppController` state
+
+### FR2: Per-Model Cost Display
+- Show cost broken down by model name
+- Group by provider (Gemini, Anthropic, DeepSeek)
+- Show token counts alongside costs
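Grouping by provider could be derived from the model name itself. The following is a rough sketch under an assumption the spec does not confirm: that model names start with `gemini`, `claude`, or `deepseek`, as the `MODEL_PRICING` keys listed earlier do. `PROVIDER_PREFIXES`, `provider_for`, and `group_costs_by_provider` are hypothetical names.

```python
# Assumed prefix-to-provider mapping, based only on the model names listed in MODEL_PRICING.
PROVIDER_PREFIXES: dict[str, str] = {
 "gemini": "Gemini",
 "claude": "Anthropic",
 "deepseek": "DeepSeek",
}


def provider_for(model: str) -> str:
 for prefix, provider in PROVIDER_PREFIXES.items():
  if model.startswith(prefix):
   return provider
 # Unrecognised names are kept visible instead of being dropped.
 return "Other"


def group_costs_by_provider(cost_by_model: dict[str, float]) -> dict[str, dict[str, float]]:
 grouped: dict[str, dict[str, float]] = {}
 for model, cost in cost_by_model.items():
  grouped.setdefault(provider_for(model), {})[model] = cost
 return grouped
```

If the application already records which provider served each call, using that record would be more reliable than inferring the provider from the name.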
+
+### FR3: Tier Breakdown Display
+- Show cost per MMA tier (Tier 1-4)
+- Use existing `tier_usage` data
+- Calculate cost using `cost_tracker.estimate_cost()`
+
+### FR4: Real-Time Updates
+- Update cost display when MMA state changes
+- Hook into existing `mma_state_update` event handling
+- No polling - event-driven
+
+## Non-Functional Requirements
+
+| Requirement | Constraint |
+|-------------|------------|
+| Frame Time Impact | <1ms when panel visible |
+| Memory Overhead | <1KB for session cost state |
+| Thread Safety | Read tier_usage via state updates only |
+
+## Testing Requirements
+
+### Unit Tests
+- Test `estimate_cost()` with known model/token combinations
+- Test unknown model returns 0.0
+- Test session cost accumulation
+
+### Integration Tests (via `live_gui` fixture)
+- Verify cost panel displays after API call
+- Verify costs update after MMA execution
+- Verify session reset clears costs
+
+### Structural Testing Contract
+- Use real `cost_tracker` module - no mocking
+- Test artifacts go to `tests/artifacts/`
+
+## Out of Scope
+- Historical cost tracking across sessions
+- Cost budgeting/alerts
+- Export cost reports
+- API cost for web searches (no token counts available)

 ## Acceptance Criteria
- [ ] Cost panel displays in GUI.
- [ ] Per-model cost shown correctly.
- [ ] Tier breakdown accurate.
- [ ] Total accumulates correctly.
- [ ] Uses existing cost_tracker.py functions.
+- [ ] Cost panel displays in GUI
+- [ ] Per-model cost shown with token counts
+- [ ] Tier breakdown accurate using `tier_usage`
+- [ ] Total session cost accumulates correctly
+- [ ] Panel updates on MMA state changes
+- [ ] Uses existing `cost_tracker.estimate_cost()`
+- [ ] Session reset clears costs
+- [ ] 1-space indentation maintained