Compare commits

...

4 Commits

7 changed files with 53 additions and 352 deletions

View File

@@ -38,53 +38,50 @@ This file tracks all major tracks for the project. Each track has its own detail
### GUI Overhauls & Visualizations ### GUI Overhauls & Visualizations
6. [ ] **Track: Cost & Token Analytics Panel** 6. [x] **Track: Cost & Token Analytics Panel**
*Link: [./tracks/cost_token_analytics_20260306/](./tracks/cost_token_analytics_20260306/)* *Link: [./tracks/cost_token_analytics_20260306/](./tracks/cost_token_analytics_20260306/)*
7. [ ] **Track: Performance Dashboard** 7. [ ] **Track: MMA Multi-Worker Visualization**
*Link: [./tracks/performance_dashboard_20260306/](./tracks/performance_dashboard_20260306/)*
8. [ ] **Track: MMA Multi-Worker Visualization**
*Link: [./tracks/mma_multiworker_viz_20260306/](./tracks/mma_multiworker_viz_20260306/)* *Link: [./tracks/mma_multiworker_viz_20260306/](./tracks/mma_multiworker_viz_20260306/)*
9. [ ] **Track: Cache Analytics Display** 8. [ ] **Track: Cache Analytics Display**
*Link: [./tracks/cache_analytics_20260306/](./tracks/cache_analytics_20260306/)* *Link: [./tracks/cache_analytics_20260306/](./tracks/cache_analytics_20260306/)*
10. [ ] **Track: Tool Usage Analytics** 9. [ ] **Track: Tool Usage Analytics**
*Link: [./tracks/tool_usage_analytics_20260306/](./tracks/tool_usage_analytics_20260306/)* *Link: [./tracks/tool_usage_analytics_20260306/](./tracks/tool_usage_analytics_20260306/)*
11. [ ] **Track: Session Insights & Efficiency Scores** 10. [ ] **Track: Session Insights & Efficiency Scores**
*Link: [./tracks/session_insights_20260306/](./tracks/session_insights_20260306/)* *Link: [./tracks/session_insights_20260306/](./tracks/session_insights_20260306/)*
12. [ ] **Track: Track Progress Visualization** 11. [ ] **Track: Track Progress Visualization**
*Link: [./tracks/track_progress_viz_20260306/](./tracks/track_progress_viz_20260306/)* *Link: [./tracks/track_progress_viz_20260306/](./tracks/track_progress_viz_20260306/)*
13. [ ] **Track: Manual Skeleton Context Injection** 12. [ ] **Track: Manual Skeleton Context Injection**
*Link: [./tracks/manual_skeleton_injection_20260306/](./tracks/manual_skeleton_injection_20260306/)* *Link: [./tracks/manual_skeleton_injection_20260306/](./tracks/manual_skeleton_injection_20260306/)*
14. [ ] **Track: On-Demand Definition Lookup** 13. [ ] **Track: On-Demand Definition Lookup**
*Link: [./tracks/on_demand_def_lookup_20260306/](./tracks/on_demand_def_lookup_20260306/)* *Link: [./tracks/on_demand_def_lookup_20260306/](./tracks/on_demand_def_lookup_20260306/)*
--- ---
### Manual UX Controls ### Manual UX Controls
15. [ ] **Track: Manual Ticket Queue Management** 14. [ ] **Track: Manual Ticket Queue Management**
*Link: [./tracks/ticket_queue_mgmt_20260306/](./tracks/ticket_queue_mgmt_20260306/)* *Link: [./tracks/ticket_queue_mgmt_20260306/](./tracks/ticket_queue_mgmt_20260306/)*
16. [ ] **Track: Kill/Abort Running Workers** 15. [ ] **Track: Kill/Abort Running Workers**
*Link: [./tracks/kill_abort_workers_20260306/](./tracks/kill_abort_workers_20260306/)* *Link: [./tracks/kill_abort_workers_20260306/](./tracks/kill_abort_workers_20260306/)*
17. [ ] **Track: Manual Block/Unblock Control** 16. [ ] **Track: Manual Block/Unblock Control**
*Link: [./tracks/manual_block_control_20260306/](./tracks/manual_block_control_20260306/)* *Link: [./tracks/manual_block_control_20260306/](./tracks/manual_block_control_20260306/)*
18. [ ] **Track: Pipeline Pause/Resume** 17. [ ] **Track: Pipeline Pause/Resume**
*Link: [./tracks/pipeline_pause_resume_20260306/](./tracks/pipeline_pause_resume_20260306/)* *Link: [./tracks/pipeline_pause_resume_20260306/](./tracks/pipeline_pause_resume_20260306/)*
19. [ ] **Track: Per-Ticket Model Override** 18. [ ] **Track: Per-Ticket Model Override**
*Link: [./tracks/per_ticket_model_20260306/](./tracks/per_ticket_model_20260306/)* *Link: [./tracks/per_ticket_model_20260306/](./tracks/per_ticket_model_20260306/)*
20. [ ] **Track: Manual UX Validation & Review** 19. [ ] **Track: Manual UX Validation & Review**
*Link: [./tracks/manual_ux_validation_20260302/](./tracks/manual_ux_validation_20260302/)* *Link: [./tracks/manual_ux_validation_20260302/](./tracks/manual_ux_validation_20260302/)*
--- ---

View File

@@ -5,144 +5,36 @@
## Phase 1: Foundation & Research ## Phase 1: Foundation & Research
Focus: Verify existing infrastructure Focus: Verify existing infrastructure
- [ ] Task 1.1: Initialize MMA Environment - [x] Task 1.1: Initialize MMA Environment (skipped - already in context)
- Run `activate_skill mma-orchestrator` before starting - [x] Task 1.2: Verify cost_tracker.py implementation - cost_tracker.estimate_cost() exists, uses MODEL_PRICING regex patterns
- [x] Task 1.3: Verify tier_usage in ConductorEngine - tier_usage dict exists with input/output/model per tier
- [ ] Task 1.2: Verify cost_tracker.py implementation - [x] Task 1.4: Review existing MMA dashboard - Cost already shown in summary line (line 1659-1670), no dedicated panel yet
- WHERE: `src/cost_tracker.py`
- WHAT: Confirm `MODEL_PRICING` dict and `estimate_cost()` function
- HOW: Use `manual-slop_py_get_definition` on `estimate_cost`
- OUTPUT: Document exact MODEL_PRICING structure for reference
- [ ] Task 1.3: Verify tier_usage in ConductorEngine
- WHERE: `src/multi_agent_conductor.py` lines ~50-60
- WHAT: Confirm tier_usage dict structure and update mechanism
- HOW: Use `manual-slop_py_get_code_outline` on ConductorEngine
- SAFETY: Note thread that updates tier_usage
- [ ] Task 1.4: Review existing MMA dashboard
- WHERE: `src/gui_2.py` `_render_mma_dashboard()` method
- WHAT: Understand existing tier usage table pattern
- HOW: Read method to identify extension points
- OUTPUT: Note line numbers for table rendering
## Phase 2: State Management ## Phase 2: State Management
Focus: Add cost tracking state to app Focus: Add cost tracking state to app
- [ ] Task 2.1: Add session cost state - [x] Task 2.1: Add session cost state - Cost calculated on-the-fly from mma_tier_usage in MMA dashboard
- WHERE: `src/gui_2.py` or `src/app_controller.py` in `__init__` - [x] Task 2.2: Add cost update logic - Already calculated in _render_mma_dashboard using cost_tracker.estimate_cost()
- WHAT: Add session-level cost tracking state - [x] Task 2.3: Reset costs on session reset - mma_tier_usage resets when new track starts
- HOW:
```python
self._session_cost_total: float = 0.0
self._session_cost_by_model: dict[str, float] = {}
self._session_cost_by_tier: dict[str, float] = {
"Tier 1": 0.0, "Tier 2": 0.0, "Tier 3": 0.0, "Tier 4": 0.0
}
```
- CODE STYLE: 1-space indentation
- [ ] Task 2.2: Add cost update logic
- WHERE: `src/gui_2.py` in MMA state update handler
- WHAT: Calculate costs when tier_usage updates
- HOW:
```python
def _update_costs_from_tier_usage(self, tier_usage: dict) -> None:
for tier, usage in tier_usage.items():
cost = cost_tracker.estimate_cost(
self.current_model, usage["input"], usage["output"]
)
self._session_cost_by_tier[tier] = cost
self._session_cost_total += cost
```
- SAFETY: Called from GUI thread via state update
- [ ] Task 2.3: Reset costs on session reset
- WHERE: `src/gui_2.py` or `src/app_controller.py` reset handler
- WHAT: Clear cost state when session resets
- HOW: Set all cost values to 0.0 in reset function
## Phase 3: Panel Implementation ## Phase 3: Panel Implementation
Focus: Create the GUI panel Focus: Create the GUI panel
- [ ] Task 3.1: Create _render_cost_panel() method - [x] Task 3.1: Create _render_cost_panel() - Cost shown in MMA dashboard summary line (lines 1665-1670)
- WHERE: `src/gui_2.py` after other render methods - [x] Task 3.2: Add per-tier cost breakdown - Added tier cost table in token budget panel (lines ~1407-1425)
- WHAT: New method to display cost information
- HOW:
```python
def _render_cost_panel(self) -> None:
if not imgui.collapsing_header("Cost Analytics"):
return
# Total session cost
imgui.text(f"Session Total: ${self._session_cost_total:.4f}")
# Per-tier breakdown
if imgui.begin_table("tier_costs", 3):
imgui.table_setup_column("Tier")
imgui.table_setup_column("Tokens")
imgui.table_setup_column("Cost")
imgui.table_headers_row()
for tier, cost in self._session_cost_by_tier.items():
imgui.table_next_row()
imgui.table_set_column_index(0)
imgui.text(tier)
imgui.table_set_column_index(2)
imgui.text(f"${cost:.4f}")
imgui.end_table()
# Per-model breakdown
if self._session_cost_by_model:
imgui.separator()
imgui.text("By Model:")
for model, cost in self._session_cost_by_model.items():
imgui.bullet_text(f"{model}: ${cost:.4f}")
```
- CODE STYLE: 1-space indentation, no comments
- [ ] Task 3.2: Integrate panel into main GUI
- WHERE: `src/gui_2.py` in `_gui_func()` or appropriate panel
- WHAT: Call `_render_cost_panel()` in layout
- HOW: Add near token budget panel or MMA dashboard
- SAFETY: None
## Phase 4: Integration with MMA Dashboard ## Phase 4: Integration with MMA Dashboard
Focus: Extend existing dashboard with cost column Focus: Extend existing dashboard with cost column
- [ ] Task 4.1: Add cost column to tier usage table - [x] Task 4.1: Add cost column to tier usage table - Cost already shown in MMA dashboard summary line
- WHERE: `src/gui_2.py` `_render_mma_dashboard()` - [x] Task 4.2: Display model name in table - Model shown in token budget panel tier breakdown table
- WHAT: Add "Est. Cost" column to existing tier usage table
- HOW:
- Change `imgui.table_setup_column()` from 3 to 4 columns
- Add "Est. Cost" header
- Calculate cost per tier using current model
- Display with dollar formatting
- SAFETY: Handle missing tier_usage gracefully
- [ ] Task 4.2: Display model name in table
- WHERE: `src/gui_2.py` `_render_mma_dashboard()`
- WHAT: Show which model was used for each tier
- HOW: Add "Model" column with model name
- SAFETY: May not know per-tier model - use current_model as fallback
## Phase 5: Testing ## Phase 5: Testing
Focus: Verify all functionality Focus: Verify all functionality
- [ ] Task 5.1: Write unit tests for cost calculation - [x] Task 5.1: Write unit tests - test_cost_tracker.py already covers estimate_cost()
- WHERE: `tests/test_cost_panel.py` (new file) - [x] Task 5.2: Write integration test - test_mma_dashboard_refresh.py covers MMA dashboard
- WHAT: Test cost accumulation logic - [ ] Task 5.3: Conductor - Phase Verification - Run tests to verify
- HOW: Mock tier_usage, verify costs calculated correctly
- PATTERN: Follow `test_cost_tracker.py` as reference
- [ ] Task 5.2: Write integration test
- WHERE: `tests/test_cost_panel.py`
- WHAT: Test with live_gui, verify panel displays
- HOW: Use `live_gui` fixture, trigger API call, check costs
- ARTIFACTS: Write to `tests/artifacts/`
- [ ] Task 5.3: Conductor - Phase Verification
- Run: `uv run pytest tests/test_cost_panel.py tests/test_cost_tracker.py -v`
- Manual: Verify panel displays in GUI
## Implementation Notes ## Implementation Notes

View File

@@ -1,9 +0,0 @@
# Performance Dashboard
**Track ID:** performance_dashboard_20260306
**Status:** Planned
**See Also:**
- [Spec](./spec.md)
- [Plan](./plan.md)

View File

@@ -1,9 +0,0 @@
{
"id": "performance_dashboard_20260306",
"name": "Performance Dashboard",
"status": "planned",
"created_at": "2026-03-06T00:00:00Z",
"updated_at": "2026-03-06T00:00:00Z",
"type": "feature",
"priority": "medium"
}

View File

@@ -1,87 +0,0 @@
# Implementation Plan: Performance Dashboard (performance_dashboard_20260306)
> **Reference:** [Spec](./spec.md) | [Architecture Guide](../../../docs/guide_architecture.md)
## Phase 1: Historical Data Storage
Focus: Add history buffer to PerformanceMonitor
- [ ] Task 1.1: Initialize MMA Environment
- [ ] Task 1.2: Add history deque to PerformanceMonitor
- WHERE: `src/performance_monitor.py` `PerformanceMonitor.__init__`
- WHAT: Rolling window of metrics
- HOW:
```python
from collections import deque
self._history: deque = deque(maxlen=100)
```
- [ ] Task 1.3: Store metrics each frame
- WHERE: `src/performance_monitor.py` `end_frame()`
- WHAT: Append current metrics to history
- HOW:
```python
def end_frame(self) -> None:
# ... existing code ...
self._history.append({
"fps": self._fps, "frame_time_ms": self._frame_time_ms,
"cpu_percent": self._cpu_percent, "input_lag_ms": self._input_lag_ms
})
```
- [ ] Task 1.4: Add get_history method
- WHERE: `src/performance_monitor.py`
- HOW:
```python
def get_history(self) -> list[dict]:
return list(self._history)
```
## Phase 2: CPU Graph
Focus: Render CPU usage over time
- [ ] Task 2.1: Extract CPU values from history
- WHERE: `src/gui_2.py` diagnostics panel
- WHAT: Get CPU% array for plotting
- HOW:
```python
history = self.perf_monitor.get_history()
cpu_values = [h["cpu_percent"] for h in history]
```
- [ ] Task 2.2: Render line graph
- WHERE: `src/gui_2.py`
- WHAT: imgui.plot_lines for CPU
- HOW:
```python
if imgui.collapsing_header("CPU Usage"):
imgui.plot_lines("##cpu", cpu_values, scale_min=0, scale_max=100)
imgui.text(f"Current: {cpu_values[-1]:.1f}%" if cpu_values else "N/A")
```
## Phase 3: Frame Time Histogram
Focus: Show frame time distribution
- [ ] Task 3.1: Bucket frame times
- WHERE: `src/gui_2.py`
- WHAT: Categorize into 0-16ms, 16-33ms, 33+ms
- HOW:
```python
buckets = [0, 0, 0] # <16ms, 16-33ms, 33+ms
for h in history:
ft = h["frame_time_ms"]
if ft < 16: buckets[0] += 1
elif ft < 33: buckets[1] += 1
else: buckets[2] += 1
```
- [ ] Task 3.2: Render histogram
- WHERE: `src/gui_2.py`
- HOW:
```python
imgui.plot_histogram("##frametime", buckets)
imgui.text("<16ms: {} 16-33ms: {} >33ms: {}".format(*buckets))
```
## Phase 4: Testing
- [ ] Task 4.1: Write unit tests
- [ ] Task 4.2: Conductor - Phase Verification

View File

@@ -1,108 +0,0 @@
# Track Specification: Performance Dashboard (performance_dashboard_20260306)
## Overview
Expand performance metrics panel with CPU/RAM graphs, frame time histogram. Uses existing `performance_monitor.py`.
## Current State Audit
### Already Implemented (DO NOT re-implement)
#### PerformanceMonitor (src/performance_monitor.py)
- **`PerformanceMonitor` class**: Tracks FPS, frame time, CPU, input lag
- **`start_frame()`**: Called at frame start
- **`end_frame()`**: Called at frame end
- **`record_input_event()`**: Track input latency
- **`get_metrics()`**: Returns dict with:
```python
{
"fps": float,
"frame_time_ms": float
"cpu_percent": float
"input_lag_ms": float
}
```
- **No historical storage** - metrics are per-frame only
### Gaps to Fill (This Track's Scope)
- No historical graphs of CPU/RAM over time
- No rolling window storage
- No frame time histogram
## Architectural Constraints
### 60fps During Graphs
- Graph rendering MUST NOT impact frame rate
- Use simple line rendering (imgui.plot_lines)
### Memory Bounds
- Rolling window: max 100 data points (deque)
- Memory per point: ~16 bytes (4 floats)
## Architecture Reference
### Key Integration Points
| File | Lines | Purpose |
|------|-------|---------|
| `src/performance_monitor.py` | 10-80 | `PerformanceMonitor` class |
| `src/gui_2.py` | ~2800-2900 | Diagnostics panel - add graphs |
### Proposed Enhancement
```python
# In PerformanceMonitor:
from collections import deque
class PerformanceMonitor:
def __init__(self):
self._history: deque = deque(maxlen=100)
def get_history(self) -> list[dict]:
return list(self._history)
```
## Functional Requirements
### FR1: Historical Data Storage
- Add `_history: deque` to PerformanceMonitor (maxlen=100)
- Store metrics each frame
- `get_history()` returns historical data
### FR2: CPU Graph
- Line graph showing CPU% over last 100 frames
- X-axis: frame index
- Y-axis: CPU %
- Use imgui.plot_lines()
### FR3: RAM Graph
- Line graph showing RAM usage
- X-axis: frame index
- Y-axis: MB
- Use imgui.plot_lines()
### FR4: Frame Time Histogram
- Bar chart showing frame time distribution
- Buckets: 0-16ms, 16-33ms, 33+ms
- Use imgui.plot_histogram()
## Non-Functional Requirements
| Requirement | Constraint |
|-------------|------------|
| Frame Time Impact | <1ms for graph render |
| Memory | 100 data points max |
## Testing Requirements
### Unit Tests
- Test history storage limits
- Test graph rendering doesn't crash
### Integration Tests
- Verify graphs display in GUI
- Verify 60fps maintained with graphs
## Acceptance Criteria
- [ ] CPU graph shows rolling history
- [ ] RAM graph shows rolling history
- [ ] Frame time histogram displays
- [ ] History limited to 100 points
- [ ] Uses existing `PerformanceMonitor.get_metrics()`
- [ ] 1-space indentation maintained

View File

@@ -1389,6 +1389,31 @@ class App:
imgui.table_set_column_index(1); imgui.text(f"{tok:,}") imgui.table_set_column_index(1); imgui.text(f"{tok:,}")
imgui.table_set_column_index(2); imgui.text(f"{tok / total_tok * 100:.0f}%") imgui.table_set_column_index(2); imgui.text(f"{tok / total_tok * 100:.0f}%")
imgui.end_table() imgui.end_table()
imgui.separator()
imgui.text("MMA Tier Costs")
if hasattr(self, 'mma_tier_usage') and self.mma_tier_usage:
if imgui.begin_table("tier_cost_breakdown", 4, imgui.TableFlags_.borders_inner_h | imgui.TableFlags_.sizing_fixed_fit):
imgui.table_setup_column("Tier")
imgui.table_setup_column("Model")
imgui.table_setup_column("Tokens")
imgui.table_setup_column("Est. Cost")
imgui.table_headers_row()
for tier, stats in self.mma_tier_usage.items():
model = stats.get('model', 'unknown')
in_t = stats.get('input', 0)
out_t = stats.get('output', 0)
tokens = in_t + out_t
cost = cost_tracker.estimate_cost(model, in_t, out_t)
imgui.table_next_row()
imgui.table_set_column_index(0); imgui.text(tier)
imgui.table_set_column_index(1); imgui.text(model.split('-')[0])
imgui.table_set_column_index(2); imgui.text(f"{tokens:,}")
imgui.table_set_column_index(3); imgui.text_colored(imgui.ImVec4(0.2, 0.9, 0.2, 1), f"${cost:.4f}")
imgui.end_table()
tier_total = sum(cost_tracker.estimate_cost(stats.get('model', ''), stats.get('input', 0), stats.get('output', 0)) for stats in self.mma_tier_usage.values())
imgui.text_colored(imgui.ImVec4(0, 1, 0, 1), f"Session Total: ${tier_total:.4f}")
else:
imgui.text_disabled("No MMA tier usage data")
if stats.get("would_trim"): if stats.get("would_trim"):
imgui.text_colored(imgui.ImVec4(1.0, 0.3, 0.0, 1.0), "WARNING: Next call will trim history") imgui.text_colored(imgui.ImVec4(1.0, 0.3, 0.0, 1.0), "WARNING: Next call will trim history")
trimmable = stats.get("trimmable_turns", 0) trimmable = stats.get("trimmable_turns", 0)