diff --git a/TASKS.md b/TASKS.md
index 6b6aedc..6a9ecaf 100644
--- a/TASKS.md
+++ b/TASKS.md
@@ -10,25 +10,17 @@
 ## Planned: Next Track
 
-### `mma_agent_focus_ux` (not yet created)
+### `mma_agent_focus_ux_20260302` (initialized — run after bleed cleanup)
 
 **Priority:** High
-**Depends on:** nothing
-**Origin:** User feedback 2026-03-02 — token viz is agent-agnostic; MMA observability panels (comms, tool calls, discussion history, token budget) show global/session-scoped data with no way to isolate a specific tier or agent.
+**Depends on:** `feature_bleed_cleanup_20260302` Phase 1 (dead comms panel removed)
+**Track dir:** `conductor/tracks/mma_agent_focus_ux_20260302/`
 
-**The Gap (audit-confirmed):**
-- `_comms_log` entries (gui_2.py:861–895) have no tier/agent tag — only `direction`, `type`, `payload`
-- `_tool_log` entries (gui_2.py:897–900) are `(script, result, ts)` — no tier tag
-- `mma_streams` dict uses `stream_id` (`"Tier 1"` etc.) — the **only** existing per-agent key
-- `_on_comms_entry` never attaches caller tier/agent context
-- Token stats are global (single `ai_client` provider, no per-tier history separation)
+**Audit-confirmed gaps:**
+- `ai_client._append_comms` emits entries with no `source_tier` key
+- `ai_client` has no `current_tier` module variable — no way for tiers to self-identify
+- `_tool_log` is `list[tuple[str,str,float]]` — no tier field, tuple must migrate to dict
+- `run_worker_lifecycle` replaces `comms_log_callback` but never stamps `source_tier`
+- `generate_tickets` (Tier 2) does NOT replace callback at all
+- No Focus Agent selector widget in Operations Hub
 
-**Intent:**
-1. Add a `source_tier` / `agent_id` field to comms log entries and tool log tuples at the point of emission
-2. Add a "Focus Agent" selector widget to the MMA Dashboard (None = global; Tier 1–4 = filtered)
-3. Filter `_render_comms_history_panel`, `_render_tool_calls_panel`, and `_render_discussion_panel` by the selected agent when focus is active
-4. Token budget panel: when a tier is focused, show token stats for that tier's model/history (requires per-tier history tracking in ai_client or conductor engine)
-5. Discussion history entries emitted by MMA workers (via `history_add` comms kind) already carry a `role` — use that to group by tier
-
-**Scope note:** Item 4 (per-tier token stats) is the most architectural — may warrant a sub-track or phased deferral.
-
-**To initialize:** Run `/conductor-new-track mma_agent_focus_ux` at start of next session after reading this file.
+**Scope:** Phase 1 (tier tagging) → Phase 2 (tool log dict migration) → Phase 3 (Focus Agent UI + filter). Per-tier token stats deferred to sub-track.
diff --git a/conductor/tracks/mma_agent_focus_ux_20260302/index.md b/conductor/tracks/mma_agent_focus_ux_20260302/index.md
new file mode 100644
index 0000000..8b02260
--- /dev/null
+++ b/conductor/tracks/mma_agent_focus_ux_20260302/index.md
@@ -0,0 +1,5 @@
+# Track mma_agent_focus_ux_20260302 Context
+
+- [Specification](./spec.md)
+- [Implementation Plan](./plan.md)
+- [Metadata](./metadata.json)
diff --git a/conductor/tracks/mma_agent_focus_ux_20260302/metadata.json b/conductor/tracks/mma_agent_focus_ux_20260302/metadata.json
new file mode 100644
index 0000000..36a1736
--- /dev/null
+++ b/conductor/tracks/mma_agent_focus_ux_20260302/metadata.json
@@ -0,0 +1,8 @@
+{
+  "track_id": "mma_agent_focus_ux_20260302",
+  "type": "feat",
+  "status": "new",
+  "created_at": "2026-03-02T00:00:00Z",
+  "updated_at": "2026-03-02T00:00:00Z",
+  "description": "Add per-tier agent focus to MMA observability panels: tag comms/tool log entries with source_tier at emission, then filter comms, tool, and discussion panels by selected agent."
+}
diff --git a/conductor/tracks/mma_agent_focus_ux_20260302/plan.md b/conductor/tracks/mma_agent_focus_ux_20260302/plan.md
new file mode 100644
index 0000000..1241901
--- /dev/null
+++ b/conductor/tracks/mma_agent_focus_ux_20260302/plan.md
@@ -0,0 +1,160 @@
+# Implementation Plan: MMA Agent Focus UX
+
+Architecture reference: [docs/guide_mma.md](../../../docs/guide_mma.md)
+
+**Prerequisite:** `feature_bleed_cleanup_20260302` Phase 1 must be complete (dead comms panel removed, line numbers stabilized).
+
+---
+
+## Phase 1: Tier Tagging at Emission
+Focus: Add `current_tier` context variable to `ai_client` and stamp it on every comms/tool entry at the point of emission. No UI changes — purely data layer.
+
+- [ ] Task 1.1: Add `current_tier` module variable to `ai_client.py`.
+  - **Location**: `ai_client.py` line 91 (beside `tool_log_callback`). Confirm with `get_file_slice(87, 95)`.
+  - **What**: Add `current_tier: str | None = None` as a module-level variable.
+  - **How**: Use `Edit` to insert after `tool_log_callback: Callable[[str, str], None] | None = None`.
+  - **Verify**: `grep -n "current_tier" ai_client.py` returns the new line.
+
+- [ ] Task 1.2: Stamp `source_tier` in `_append_comms`.
+  - **Location**: `ai_client._append_comms` (`ai_client.py:136-147`). Confirm with `py_get_definition`.
+  - **What**: Add `"source_tier": current_tier` as a key in the `entry` dict (after `"model"`).
+  - **How**: Use `Edit` to insert the key into the dict literal.
+  - **Note**: Add comment: `# current_tier is set/cleared by caller tiers; safe — ai_client.send() calls are serialized by the MMA engine executor.`
+  - **Verify**: Manually check the dict has `source_tier` key.
+
+- [ ] Task 1.3: Set/clear `current_tier` in `run_worker_lifecycle` (Tier 3).
+  - **Location**: `multi_agent_conductor.run_worker_lifecycle` (`multi_agent_conductor.py:224-354`). The `try:` block that calls `ai_client.send()` starts at line ~296. Confirm with `py_get_definition`.
+  - **What**: Before the `try:` block, add `ai_client.current_tier = "Tier 3"`. In the existing `finally:` block (which already restores `ai_client.comms_log_callback`), add `ai_client.current_tier = None`.
+  - **How**: Use `Edit` to insert before `try:` and inside `finally:`.
+  - **Verify**: After edit, `py_get_definition(run_worker_lifecycle)` shows both lines.
+
+- [ ] Task 1.4: Set/clear `current_tier` in `generate_tickets` (Tier 2).
+  - **Location**: `conductor_tech_lead.generate_tickets` (`conductor_tech_lead.py:6-48`). The `try:` block starts at line ~21. Confirm with `py_get_definition`.
+  - **What**: Before the `try:` block (before `response = ai_client.send(...)`), add `ai_client.current_tier = "Tier 2"`. In the existing `finally:` block (which restores `_custom_system_prompt`), add `ai_client.current_tier = None`.
+  - **How**: Use `Edit`.
+  - **Verify**: `py_get_definition(generate_tickets)` shows both lines.
+
+- [ ] Task 1.5: Migrate `_tool_log` from tuple to dict; update emission and storage.
+  - **Step A — `_on_tool_log`** (`gui_2.py:897-900`): Change to read `ai_client.current_tier` and pass it: `self._append_tool_log(script, result, ai_client.current_tier)`.
+  - **Step B — `_append_tool_log`** (`gui_2.py:1496-1503`): Change signature to `_append_tool_log(self, script: str, result: str, source_tier: str | None = None)`. Change `self._tool_log.append((script, result, time.time()))` to `self._tool_log.append({"script": script, "result": result, "ts": time.time(), "source_tier": source_tier})`.
+  - **Step C — type hint in `__init__`**: Change `self._tool_log: list[tuple[str, str, float]] = []` to `self._tool_log: list[dict] = []`.
+  - **How**: Use `Edit` for each step. Confirm with `py_get_definition` after each.
+  - **Verify**: `grep -n "_tool_log" gui_2.py` — all references confirmed; `_render_tool_calls_panel` still uses tuple destructure (fixed in Phase 2).
+
+- [ ] Task 1.6: Write tests for Phase 1.
+  - Confirm `ai_client._append_comms` produces entries with `source_tier` key (even if `None`).
+  - Confirm `_append_tool_log` stores a dict with `source_tier` key.
+  - Run `uv run pytest tests/ -x -q`.
+
+- [ ] Task 1.7: Conductor — User Manual Verification
+  - Launch app. Open a send in normal mode — confirm comms entries in Operations Hub > Comms History still render.
+  - (MMA run not required at this phase — data layer only.)
+
+---
+
+## Phase 2: Tool Log Reader Migration
+Focus: Update `_render_tool_calls_panel` to read dicts. No UI change — just fixes the access pattern before Phase 3 adds filter logic.
+
+- [ ] Task 2.1: Update `_render_tool_calls_panel` to use dict access.
+  - **Location**: `gui_2.py:2989-3039`. Confirm with `get_file_slice(2989, 3042)`.
+  - **What**: Replace `script, result, _ = self._tool_log[i_minus_one]` with:
+    ```python
+    entry = self._tool_log[i_minus_one]
+    script = entry["script"]
+    result = entry["result"]
+    ```
+  - All subsequent uses of `script` and `result` in the same loop body are unchanged.
+  - **How**: Use `Edit` targeting the destructure line.
+  - **Verify**: `py_check_syntax(gui_2.py)` passes; run tests.
+
+- [ ] Task 2.2: Write/run tests.
+  - Run `uv run pytest tests/ -x -q`. Confirm tool log panel simulation tests (if any) pass.
+
+- [ ] Task 2.3: Conductor — User Manual Verification
+  - Launch app. Generate a script send (or use existing tool call in history). Confirm "Tool Calls" tab in Operations Hub renders correctly.
+
+---
+
+## Phase 3: Focus Agent UI + Filter Logic
+Focus: Add the combo selector and filter the two log panels.
+
+- [ ] Task 3.1: Add `ui_focus_agent` state var to `App.__init__`.
+  - **Location**: `gui_2.py` `__init__`, after `self.active_tier: str | None = None` (line ~283 — confirm with `grep -n "self.active_tier" gui_2.py`).
+  - **What**: Insert `self.ui_focus_agent: str | None = None`.
+  - **How**: Use `Edit`.
+  - **Verify**: `grep -n "ui_focus_agent" gui_2.py` returns exactly 1 hit (the new line, before Tasks 3.2–3.4 add more).
+
+- [ ] Task 3.2: Add Focus Agent selector widget in Operations Hub.
+  - **Location**: `gui_2.py` `_gui_func`, Operations Hub block (line ~1774). Confirm with `get_file_slice(1774, 1792)`. Current content:
+    ```python
+    if imgui.begin_tab_bar("OperationsTabs"):
+    ```
+  - **What**: Insert immediately before `if imgui.begin_tab_bar("OperationsTabs"):`:
+    ```python
+    imgui.text("Focus Agent:")
+    imgui.same_line()
+    focus_label = self.ui_focus_agent or "All"
+    if imgui.begin_combo("##focus_agent", focus_label, imgui.ComboFlags_.width_fit_preview):
+        if imgui.selectable("All", self.ui_focus_agent is None)[0]:
+            self.ui_focus_agent = None
+        for tier in ["Tier 2", "Tier 3", "Tier 4"]:
+            if imgui.selectable(tier, self.ui_focus_agent == tier)[0]:
+                self.ui_focus_agent = tier
+        imgui.end_combo()
+    imgui.same_line()
+    if self.ui_focus_agent:
+        if imgui.button("x##clear_focus"):
+            self.ui_focus_agent = None
+    imgui.separator()
+    ```
+  - **Note**: Tier 1 omitted — Tier 1 (Claude Code) never calls `ai_client.send()`, so it produces no comms entries.
+  - **How**: Use `Edit`.
+
+- [ ] Task 3.3: Add filter logic to `_render_comms_history_panel`.
+  - **Location**: `gui_2.py` `_render_comms_history_panel` (after bleed cleanup, line ~3400). Confirm with `py_get_definition`.
+  - **What**: After the `log_to_render = self.prior_session_entries if self.is_viewing_prior_session else list(self._comms_log)` line, add:
+    ```python
+    if self.ui_focus_agent and not self.is_viewing_prior_session:
+        log_to_render = [e for e in log_to_render if e.get("source_tier") == self.ui_focus_agent]
+    ```
+  - Also add a `source_tier` label in the entry header row (after the `provider/model` text):
+    ```python
+    tier_label = entry.get("source_tier") or "main"
+    imgui.text_colored(C_SUB, f"[{tier_label}]")
+    imgui.same_line()
+    ```
+    Insert this after the `imgui.text_colored(C_LBL, f"{entry.get('provider', '?')}/{entry.get('model', '?')}")` line.
+  - **How**: Use `Edit` for each insertion.
+
+- [ ] Task 3.4: Add filter logic to `_render_tool_calls_panel`.
+  - **Location**: `gui_2.py:2989`. Confirm with `get_file_slice(2989, 3000)`.
+  - **What**: After `imgui.begin_child("scroll_area")` + clipper setup, change the render source:
+    - Replace `clipper.begin(len(self._tool_log))` with a pre-filtered list:
+      ```python
+      tool_log_filtered = self._tool_log if not self.ui_focus_agent else [
+          e for e in self._tool_log if e.get("source_tier") == self.ui_focus_agent
+      ]
+      ```
+    - Then `clipper.begin(len(tool_log_filtered))`.
+    - Inside the loop use `tool_log_filtered[i_minus_one]` instead of `self._tool_log[i_minus_one]`.
+  - **How**: Use `Edit`.
+
+- [ ] Task 3.5: Write tests for Phase 3.
+  - Test that `ui_focus_agent = "Tier 3"` filters out entries with `source_tier = "Tier 2"`.
+  - Run `uv run pytest tests/ -x -q`.
+
+- [ ] Task 3.6: Conductor — User Manual Verification
+  - Launch app. Open Operations Hub.
+  - Confirm "Focus Agent:" combo appears above tabs with options: All, Tier 2, Tier 3, Tier 4.
+  - With "All" selected: all entries show with `[main]` or `[Tier N]` labels in comms history.
+  - With "Tier 3" selected: comms history shows only entries tagged `source_tier = "Tier 3"`.
+  - Confirm "x" clear button resets to "All".
+
+---
+
+## Phase Completion Checkpoint
+After all phases pass manual verification:
+- Run `uv run pytest tests/ -x -q` one final time.
+- Commit: `feat(mma): per-tier agent focus — source_tier tagging + Focus Agent filter UI`
+- Update TASKS.md: move `mma_agent_focus_ux` from Planned to Active/Completed.
+- Update JOURNAL.md with What/Why/How/Issues/Result.
diff --git a/conductor/tracks/mma_agent_focus_ux_20260302/spec.md b/conductor/tracks/mma_agent_focus_ux_20260302/spec.md
new file mode 100644
index 0000000..ca20698
--- /dev/null
+++ b/conductor/tracks/mma_agent_focus_ux_20260302/spec.md
@@ -0,0 +1,95 @@
+# Track Specification: MMA Agent Focus UX
+
+## Overview
+All MMA observability panels (comms history, tool calls, discussion) display
+global/session-scoped data. When 4 tiers are running concurrently, their traffic
+is indistinguishable. This track adds a `source_tier` field to every comms and
+tool log entry at the point of emission, then adds a "Focus Agent" selector that
+filters the Operations Hub panels to show only one tier's traffic at a time.
+
+**Depends on:** `feature_bleed_cleanup_20260302` (Phase 1 removes the dead comms
+panel duplicate; this track extends the live panel at gui_2.py:~3400).
+
+## Current State Audit (as of 0ad47af)
+
+### Already Implemented (DO NOT re-implement)
+- **`ai_client._append_comms`** (`ai_client.py:136-147`): Emits entries with keys `ts`, `direction`, `kind`, `provider`, `model`, `payload`. No `source_tier` key.
+- **`ai_client.comms_log_callback`** (`ai_client.py:87`): Module-level `Callable | None`. Tier 3 workers temporarily replace it in `run_worker_lifecycle` (`multi_agent_conductor.py:224-354`); Tier 2 (`conductor_tech_lead.py:6-48`) does NOT replace it.
+- **`ai_client.tool_log_callback`** (`ai_client.py:91`): Module-level `Callable[[str,str],None] | None`. Never replaced by any tier — fires from whichever tier is active via `_run_script` (`ai_client.py:490-500`).
+- **`self._tool_log`** (`gui_2.py:__init__`): `list[tuple[str, str, float]]` — stored as `(script, result, timestamp)`. Destructured in `_render_tool_calls_panel` as `script, result, _ = self._tool_log[i_minus_one]`.
+- **`self._comms_log`** (`gui_2.py:__init__`): `list[dict]` — each entry is the raw dict from `_append_comms` plus `local_ts` stamped in `_on_comms_entry`.
+- **`self.active_tier`** (`gui_2.py:__init__`): `str | None` — set by `_push_mma_state_update` when the engine reports tier activity. Tracks the *current* active tier but is not stamped onto individual log entries.
+- **`run_worker_lifecycle` stream_id** (`multi_agent_conductor.py:299`): Uses `f"Tier 3 (Worker): {ticket.id}"` as `stream_id` for mma_streams. This is already tier+ticket scoped for the stream panels.
+- **`_render_comms_history_panel`** (`gui_2.py:~3435`): Renders `self._comms_log` entries with `direction`, `kind`, `provider`, `model` fields. No tier column or filter.
+- **`_render_tool_calls_panel`** (`gui_2.py:2989-3039`): Renders `self._tool_log` entries. No tier column or filter.
+- **`disc_entries`** (`gui_2.py:__init__`, `_render_discussion_panel:gui_2.py:2482-2685`): List of dicts with `role`, `content`, `collapsed`, `ts`. Role values include "User", "AI", "Tool", "Vendor API" — MMA workers inject via `history_add` kind with a `role` field.
+
+### Gaps to Fill (This Track's Scope)
+
+1. **No `source_tier` on comms entries**: `_append_comms` never reads tier context. Tier 3 callbacks call the old callback chain but don't stamp tier info. Result: all comms from all tiers are visually identical in `_render_comms_history_panel`.
+
+2. **No `source_tier` on tool log entries**: `_on_tool_log` receives `(script, result)` with no tier context. `_tool_log` is a flat list of tuples — no way to filter by tier.
+
+3. **No `current_tier` module variable in `ai_client`**: There is no mechanism for callers to declare which tier is currently active. Both `run_worker_lifecycle` and `generate_tickets` call `ai_client.send()` without setting any "who am I" context that `_append_comms` could read.
+
+4. **No Focus Agent UI widget**: No selector in Operations Hub or MMA Dashboard to choose a tier to filter on.
+
+5. **No filter logic in `_render_comms_history_panel` or `_render_tool_calls_panel`**: Both render all entries unconditionally.
+
+## Goals
+1. Add `current_tier: str | None` module-level variable to `ai_client.py`; `_append_comms` reads and includes it as `source_tier` on every entry.
+2. Set `ai_client.current_tier` in `run_worker_lifecycle` (Tier 3) and `generate_tickets` (Tier 2) around the `ai_client.send()` call; clear it in `finally`.
+3. Change `_tool_log` from `list[tuple]` to `list[dict]` to support `source_tier` field; update all three access sites.
+4. Add `self.ui_focus_agent: str | None = None` state var to `App.__init__`.
+5. Add a "Focus Agent" combo widget in the Operations Hub tab bar header (or above it in MMA Dashboard).
+6. Filter `_render_comms_history_panel` and `_render_tool_calls_panel` by `ui_focus_agent` when it is non-None.
+7. Defer per-tier token stats (item 4 of the TASKS.md intent) to a separate sub-track.
+
+## Functional Requirements
+
+### Phase 1 — Tier Tagging at Emission (ai_client.py + conductors)
+- `ai_client.py`: Add `current_tier: str | None = None` module-level variable after line 91 (beside `tool_log_callback`).
+- `ai_client._append_comms` (`ai_client.py:136-147`): Add `"source_tier": current_tier` to the entry dict (can be `None` for main-session calls).
+- `multi_agent_conductor.run_worker_lifecycle` (`multi_agent_conductor.py:224-354`): Before the `try:` block that calls `ai_client.send()` (line ~296), add `ai_client.current_tier = "Tier 3"`. In the `finally:` block, add `ai_client.current_tier = None`.
+- `conductor_tech_lead.generate_tickets` (`conductor_tech_lead.py:6-48`): In the `try:` block before `ai_client.send()`, add `ai_client.current_tier = "Tier 2"`. In `finally:`, add `ai_client.current_tier = None`.
+- `gui_2.py _on_tool_log` (`gui_2.py:897-900`): Capture `ai_client.current_tier` at call time and pass it along.
+- `gui_2.py _append_tool_log` (`gui_2.py:1496-1503`): Change stored format from `(script, result, time.time())` to dict: `{"script": script, "result": result, "ts": time.time(), "source_tier": source_tier}`.
+- Update `_on_tool_log` signature to accept and pass `source_tier`, OR read it directly from `ai_client.current_tier` inside `_append_tool_log`.
+
+### Phase 2 — Tool Log Reader Migration
+- `_render_tool_calls_panel` (`gui_2.py:2989-3039`): Replace `script, result, _ = self._tool_log[i_minus_one]` with dict access: `entry = self._tool_log[i_minus_one]`, `script = entry["script"]`, `result = entry["result"]`.
+- No filtering yet — just migrate readers so Phase 3 can add filter logic cleanly.
+
+### Phase 3 — Focus Agent UI + Filter Logic
+- `App.__init__`: Add `self.ui_focus_agent: str | None = None` after `self.active_tier` (line ~283).
+- `_render_tool_calls_panel` AND `_render_comms_history_panel`: At top of each method, derive `log_to_render` by filtering on `entry.get("source_tier") == self.ui_focus_agent` when `ui_focus_agent` is not None.
+- **Focus Agent selector widget**: In `_gui_func` Operations Hub block (line ~1774-1783), before the `imgui.begin_tab_bar("OperationsTabs")` call, add:
+  ```python
+  imgui.text("Focus Agent:")
+  imgui.same_line()
+  focus_label = self.ui_focus_agent or "All"
+  if imgui.begin_combo("##focus_agent", focus_label):
+      if imgui.selectable("All", self.ui_focus_agent is None)[0]:
+          self.ui_focus_agent = None
+      # Tier 1 is omitted: it never calls ai_client.send(), so it has no entries to filter.
+      for tier in ["Tier 2", "Tier 3", "Tier 4"]:
+          if imgui.selectable(tier, self.ui_focus_agent == tier)[0]:
+              self.ui_focus_agent = tier
+      imgui.end_combo()
+  ```
+- Show `source_tier` label in `_render_comms_history_panel` entry header row (after `provider/model` field).
+
+## Non-Functional Requirements
+- `current_tier` must be cleared in `finally` blocks — never left set after a send call.
+- Thread safety: `current_tier` is a module-level var. Because `ai_client.send()` calls are serialized (one tier at a time in the MMA engine's executor), race conditions are negligible. Document this assumption in a code comment.
+- No new Python package dependencies.
+- `_tool_log` dict format change must be handled as a breaking change — confirm no simulation tests directly inspect raw `_tool_log` tuples.
+
+## Architecture Reference
+- [docs/guide_architecture.md](../../../docs/guide_architecture.md): Threading model, event system
+- [docs/guide_mma.md](../../../docs/guide_mma.md): Worker lifecycle, tier context
+
+## Out of Scope
+- Per-tier token stats / token budget panel filtering (separate sub-track).
+- Discussion panel role-based tier filtering (the `role` values don't consistently map to tier names; out of scope here).
+- Tier 1 (Claude Code conductor) comms — Tier 1 never calls `ai_client.send()`.
+- Filtering the Tier 1–4 stream panels (already tier-scoped via `mma_streams` stream_id key).
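
The tagging and filtering flow described in the spec can be sketched in isolation, which may help when writing the Phase 1 and Phase 3 tests. This is a minimal sketch, not the project's code: `append_comms`, `tier_context`, `filter_by_focus`, and the module-level `comms_log` list are hypothetical stand-ins for `ai_client._append_comms`, the set/clear pattern around `ai_client.send()`, and the panel filter. Only the `source_tier` key, the `current_tier` variable, and the clear-to-`None` rule come from the spec.

```python
from __future__ import annotations

import time
from contextlib import contextmanager

# Hypothetical stand-ins for ai_client's module-level state.
current_tier: str | None = None
comms_log: list[dict] = []


def append_comms(direction: str, kind: str, payload: dict) -> dict:
    """Build a comms entry and stamp it with whichever tier is active right now."""
    entry = {
        "ts": time.time(),
        "direction": direction,
        "kind": kind,
        "payload": payload,
        "source_tier": current_tier,  # None means a main-session call
    }
    comms_log.append(entry)
    return entry


@contextmanager
def tier_context(tier: str):
    """Set current_tier for the duration of a send; always clear it afterwards.

    Assumes sends are serialized, as the spec states; this is not thread-safe.
    """
    global current_tier
    current_tier = tier
    try:
        yield
    finally:
        current_tier = None  # cleared even if the send raises


def filter_by_focus(entries: list[dict], focus_agent: str | None) -> list[dict]:
    """Return all entries when no focus is set, else only the focused tier's."""
    if focus_agent is None:
        return list(entries)
    return [e for e in entries if e.get("source_tier") == focus_agent]
```

Wrapping a send in `with tier_context("Tier 3"):` and then calling `filter_by_focus(comms_log, "Tier 3")` mirrors what the Focus Agent combo is meant to do in the comms and tool panels. The context manager is one way to guarantee the `finally` cleanup; the plan itself inserts the assignments directly around the existing `try:`/`finally:` blocks.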