chore(conductor): Archive track 'Fix Concurrent MMA Live GUI Tests'

2026-05-07 18:53:32 -04:00
parent b043d06771
commit 87bcd698bb
5 changed files with 0 additions and 4 deletions
@@ -1,475 +0,0 @@
# Concurrent MMA Tracks Bug: Comprehensive Technical Report
**Track:** `fix_concurrent_mma_tests_20260507`
**Issue:** When two MMA tracks are started concurrently, only Track B's worker appears in `mma_streams`. Track A's worker never manifests despite its engine being created and stored in `self.engines[track.id]`.
**Status:** Active investigation with multiple root causes identified.
---
## Executive Summary
The concurrent MMA track execution bug is a multi-faceted race condition involving:
1. **Global `comms_log_callback` race** (FIXED) - Thread-local storage now used
2. **`confirm_spawn` dialog container race** (PENDING) - Per-ticket isolation needed
3. **Global `_gemini_cli_adapter` singleton** (PENDING) - Per-call adapter or locking needed
---
## 1. Architecture Overview
### 1.1 MMA 4-Tier Architecture
```
Tier 1: Orchestrator (Strategic planning, epic → tracks)
Tier 2: Tech Lead (Track execution, ticket generation) - src/conductor_tech_lead.py
Tier 3: Worker (Stateless code implementation) - src/multi_agent_conductor.py:run_worker_lifecycle()
Tier 4: QA (Error analysis, stateless)
```
### 1.2 Concurrent Execution Flow
```python
# Test flow:
client.click('btn_mma_start_track', user_data=track_a_id) # Start Track A
time.sleep(0.5) # 500ms delay
client.click('btn_mma_start_track', user_data=track_b_id) # Start Track B
```
---
## 2. State Management Architecture
### 2.1 Per-Track Engine Isolation
Each track gets its own `ConductorEngine` stored in `app_controller.engines`:
```python
# src/app_controller.py:2920-2921
engine = multi_agent_conductor.ConductorEngine(self.active_track, self.event_queue, auto_queue=not self.mma_step_mode)
self.engines[self.active_track.id] = engine # Per-track engine storage
```
**State ownership:**
| Component | Structure | Scope |
|-----------|----------|-------|
| `app_controller.engines` | `Dict[track_id → Engine]` | Shared, per-track |
| `app_controller.active_track` | Single `Track` object | Current loaded track |
| `app_controller.mma_streams` | `Dict[stream_id → text]` | Shared, all tracks |
| `engine._active_workers` | `Dict[ticket_id → Thread]` | Per-engine |
| `engine._abort_events` | `Dict[ticket_id → Event]` | Per-engine |
| `engine.dag` | `TrackDAG` with ticket statuses | Per-engine |
| `engine.pool` | `WorkerPool` with concurrency | Per-engine |
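Read as code, the ownership rules above look roughly like this (illustration only; the attribute names are taken from the table, not quoted from the source):
```python
# Illustration of the ownership rules in the table (not an excerpt from the source).
engine = app_controller.engines[track_id]            # per-track ConductorEngine
worker_thread = engine._active_workers[ticket_id]    # per-engine worker thread registry
abort_event = engine._abort_events[ticket_id]        # per-engine abort signal
# mma_streams is shared across ALL tracks and is keyed by stream_id, not track_id:
text_so_far = app_controller.mma_streams["Tier 3 (Worker): ticket-a-1"]
```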
### 2.2 Thread-Local Callback Storage (FIXED)
**Original (BUGGY):**
```python
# src/ai_client.py:103 (GLOBAL - thread-unsafe)
comms_log_callback: Optional[Callable[[dict[str, Any]], None]] = None
```
**Fixed (THREAD-SAFE):**
```python
# src/ai_client.py:121-127
# (_local_storage is a module-level threading.local() instance)
def get_comms_log_callback() -> Optional[Callable[[dict[str, Any]], None]]:
    """Returns the comms log callback from thread-local storage."""
    return getattr(_local_storage, "comms_log_callback", None)

def set_comms_log_callback(cb: Optional[Callable[[dict[str, Any]], None]]) -> None:
    """Sets the comms log callback in thread-local storage."""
    _local_storage.comms_log_callback = cb
```
### 2.3 Worker Lifecycle State Flow
```python
# src/multi_agent_conductor.py:run_worker_lifecycle()
def run_worker_lifecycle(ticket, context, context_files, event_queue, engine, md_content):
    # 1. Set thread-local callback for THIS worker
    old_comms_cb = ai_client.get_comms_log_callback()
    ai_client.set_comms_log_callback(worker_comms_callback)  # Thread-local
    # 2. Set current tier (ALSO thread-local)
    ai_client.set_current_tier(f"Tier 3 (Worker): {ticket.id}")
    try:
        # 3. Call ai_client.send() - THIS is where stream_callback fires
        response = ai_client.send(
            md_content=md_content,
            user_message=user_message,
            stream_callback=stream_callback  # Created INSIDE run_worker_lifecycle
        )
    finally:
        # 4. Restore
        ai_client.set_comms_log_callback(old_comms_cb)
        ai_client.set_current_tier(None)
```
---
## 3. Root Causes Identified
### 3.1 Root Cause #1: Global `comms_log_callback` (FIXED)
**Problem:** The `comms_log_callback` was a module-level global variable. When two workers ran concurrently:
```python
# Timeline (BEFORE FIX):
T1: Worker A sets ai_client.comms_log_callback = callback_A
T2: Worker A releases _send_lock while waiting for subprocess
T3: Worker B sets ai_client.comms_log_callback = callback_B # OVERWRITES!
T4: Worker A's subprocess returns → uses callback_B (WRONG!)
```
**Fix Applied:**
- Changed `comms_log_callback` to use thread-local storage via `_local_storage.comms_log_callback`
- Each worker thread now has isolated callback storage
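For reference, a standalone demonstration of the thread-local pattern (simplified, not the project's code) showing that each thread only ever sees the callback it set itself:
```python
# Standalone demo of the thread-local callback pattern (not the project's actual code).
import threading

_local_storage = threading.local()

def set_cb(cb):
    _local_storage.comms_log_callback = cb

def get_cb():
    return getattr(_local_storage, "comms_log_callback", None)

def worker(name, results):
    set_cb(lambda entry: results.append((name, entry)))  # callback visible to this thread only
    get_cb()({"msg": f"from {name}"})                     # never picks up the other thread's callback

results = []
threads = [threading.Thread(target=worker, args=(n, results)) for n in ("A", "B")]
for t in threads: t.start()
for t in threads: t.join()
print(results)  # each entry is tagged with the thread that produced it
```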
### 3.2 Root Cause #2: `confirm_spawn` Dialog Container Race (PENDING)
**Problem:** The `confirm_spawn` function uses a single-element list as a dialog container:
```python
# src/multi_agent_conductor.py:358-372
def confirm_spawn(role, prompt, context_md, event_queue, ticket_id):
    dialog_container = [None]  # created fresh per call, but GUI handling is not isolated per ticket
    task = {
        "action": "mma_spawn_approval",
        "ticket_id": ticket_id,
        "dialog_container": dialog_container
    }
    _queue_put(event_queue, "mma_spawn_approval", task)
    # Worker waits here
    start = time.time()  # set before the wait loop (elided from the original excerpt)
    while dialog_container[0] is None and time.time() - start < 60:
        time.sleep(0.1)
    res = dialog_container[0].wait()
```
**Race Condition:**
1. Track A's worker calls `confirm_spawn` → `dialog_container_A = [None]`
2. Track B's worker calls `confirm_spawn` → `dialog_container_B = [None]`
3. Both tasks queue to `_pending_gui_tasks`
4. GUI processes Track B's dialog first → sets `dialog_container_B[0]`
5. Track A times out waiting for `dialog_container_A[0]` to be set
6. **Track A worker returns `"BLOCKED: Spawn rejected by user."` and NEVER calls `ai_client.send()`**
**Impact:** If `confirm_spawn` times out, the worker never reaches `ai_client.send()`, so no `stream_callback` is created, and the worker never appears in `mma_streams`.
### 3.3 Root Cause #3: Global Singleton `_gemini_cli_adapter` (PENDING)
**Problem:** The `GeminiCliAdapter` is a global singleton:
```python
# src/ai_client.py:97, 1192-1194
_gemini_cli_adapter: Optional[GeminiCliAdapter] = None

def _send_gemini_cli(...):
    global _gemini_cli_adapter
    if _gemini_cli_adapter is None:
        _gemini_cli_adapter = GeminiCliAdapter(binary_path="gemini")
```
When two workers use `gemini_cli` provider concurrently:
- Both call `adapter.send()` on the **same adapter instance**
- The adapter has `session_id` and `last_usage` instance state
- Sessions may interfere or overwrite each other
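A contrived illustration of the hazard with a stand-in class (not the real `GeminiCliAdapter`): two threads sharing one instance can observe each other's per-instance state.
```python
# Contrived stand-in showing how shared per-instance state gets clobbered under concurrency.
import threading
import time

class FakeAdapter:
    def __init__(self):
        self.session_id = None   # analogous to the adapter's session_id
        self.last_usage = None   # analogous to the adapter's last_usage

    def send(self, who):
        self.session_id = f"session-{who}"
        time.sleep(0.01)          # simulate the subprocess round-trip
        return self.session_id    # may now belong to the OTHER caller

shared = FakeAdapter()
results = {}
threads = [threading.Thread(target=lambda w=w: results.setdefault(w, shared.send(w)))
           for w in ("worker-a", "worker-b")]
for t in threads: t.start()
for t in threads: t.join()
print(results)  # one worker can end up reporting the other's session_id
```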
---
## 4. The `stream_callback` Path
### 4.1 Stream Callback Creation
The `stream_callback` is created **inside `run_worker_lifecycle`** and captures `ticket.id`:
```python
# src/multi_agent_conductor.py:528-530
def stream_callback(chunk: str) -> None:
    if event_queue:
        _queue_put(event_queue, 'mma_stream', {
            'stream_id': f'Tier 3 (Worker): {ticket.id}',  # Captures ticket.id!
            'text': chunk
        })
```
### 4.2 Stream Callback Flow
```
Worker Thread A (ticket-a-1)             Worker Thread B (ticket-b-1)
────────────────────────────             ────────────────────────────
A1. create stream_callback capturing ticket.id = "ticket-a-1"
B1. create stream_callback capturing ticket.id = "ticket-b-1"
A2. ai_client.send(..., stream_callback=stream_callback_A)
B2. ai_client.send(..., stream_callback=stream_callback_B)
A3. _send_gemini_cli() calls adapter.send(..., stream_callback=stream_callback_A)
B3. _send_gemini_cli() calls adapter.send(..., stream_callback=stream_callback_B)
A4. adapter.send() fires chunks → stream_callback_A(chunk)
B4. adapter.send() fires chunks → stream_callback_B(chunk)
A5. stream_callback_A does _queue_put(event_queue, 'mma_stream', {'stream_id': 'Tier 3 (Worker): ticket-a-1', ...})
B5. stream_callback_B does _queue_put(event_queue, 'mma_stream', {'stream_id': 'Tier 3 (Worker): ticket-b-1', ...})
```
### 4.3 mma_streams Population
```python
# src/app_controller.py:755-768
def _handle_mma_stream(self, payload):
    text = payload.get("text", "")
    stream_id = payload.get("stream_id")  # e.g., "Tier 3 (Worker): ticket-a-1"
    is_streaming = payload.get("status") == "streaming..."
    if stream_id:
        if is_streaming:
            if stream_id not in self.mma_streams:
                self.mma_streams[stream_id] = ""
            self.mma_streams[stream_id] += text  # ACCUMULATES chunks
        else:
            self.mma_streams[stream_id] = text  # Final response
```
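A simplified, standalone illustration of the accumulation behavior: as long as each worker reports its own `stream_id`, interleaved chunks land in separate keys.
```python
# Simplified illustration of the accumulation logic (standalone, outside the controller).
mma_streams = {}

def handle(payload):
    sid, text = payload["stream_id"], payload["text"]
    if payload.get("status") == "streaming...":
        mma_streams[sid] = mma_streams.get(sid, "") + text   # accumulate streaming chunks
    else:
        mma_streams[sid] = text                              # final response replaces the buffer

handle({"stream_id": "Tier 3 (Worker): ticket-a-1", "text": "A1", "status": "streaming..."})
handle({"stream_id": "Tier 3 (Worker): ticket-b-1", "text": "B1", "status": "streaming..."})
handle({"stream_id": "Tier 3 (Worker): ticket-a-1", "text": "A2", "status": "streaming..."})
print(mma_streams)
# {'Tier 3 (Worker): ticket-a-1': 'A1A2', 'Tier 3 (Worker): ticket-b-1': 'B1'}
```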
### 4.4 get_mma_workers API
```python
# src/api_hooks.py:250-255
elif self.path == "/api/mma/workers":
    mma_streams = _get_app_attr(app, "mma_streams", {})
    self.wfile.write(json.dumps({"workers": _serialize_for_api(mma_streams)}).encode("utf-8"))
```
The test polls `get_mma_workers()` and checks whether `"ticket-A-1"` and `"ticket-B-1"` each appear among the returned stream_ids.
---
## 5. Why Only Track B Appears
### 5.1 Hypothesis: `confirm_spawn` Timeout
If Track A's `confirm_spawn` times out:
1. Worker returns `"BLOCKED: Spawn rejected by user."`
2. `ai_client.send()` is NEVER called
3. `stream_callback` is NEVER created
4. No `mma_stream` events are ever queued
5. `"ticket-a-1"` never appears in `mma_streams`
### 5.2 Why B "Stays Longer"
Track B's dialog is processed first (due to queue ordering), so:
1. Track B's `confirm_spawn` succeeds
2. Track B calls `ai_client.send()`
3. Track B's `stream_callback` fires
4. Track B appears in `mma_streams`
5. Track A is still waiting for its dialog (or has timed out)
---
## 6. Detailed Fix Strategy
### 6.1 Fix #1: Thread-Local Callbacks (IMPLEMENTED ✓)
**Status:** DONE
**Files Modified:**
- `src/ai_client.py:103-127` - Added `get_comms_log_callback()` and `set_comms_log_callback()` using `_local_storage`
- `src/ai_client.py:215-217` - Changed `_append_comms()` to use thread-local callback
- `src/ai_client.py:1238-1249` - Changed direct callback invocation to use thread-local getter
- `src/multi_agent_conductor.py:532,551,565` - Changed to use `get_comms_log_callback()` and `set_comms_log_callback()`
- `src/app_controller.py:1522` - Changed to use `set_comms_log_callback()`
### 6.2 Fix #2: Per-Ticket Dialog Containers (PENDING)
**Required Change:** Make `confirm_spawn` dialog containers ticket-specific:
```python
# Option A: Use a threading.Event per ticket, keyed by ticket_id
_ticket_events: Dict[str, threading.Event] = {}

def confirm_spawn(ticket_id, ...):
    event = threading.Event()
    event.result = None  # attribute filled in by the GUI handler before set()
    _ticket_events[ticket_id] = event
    task = {"action": "mma_spawn_approval", "ticket_id": ticket_id, ...}
    _queue_put(event_queue, "mma_spawn_approval", task)
    event.wait(timeout=60)
    return event.result  # still None if the dialog timed out

# GUI handler sets the result for the matching ticket, then signals:
_ticket_events[ticket_id].result = (approved, prompt, context)
_ticket_events[ticket_id].set()
```
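Whatever shape the container takes, the fix also needs a timeout and cleanup story: the `_ticket_events` entry should be removed once the worker returns, and a `None` result after `event.wait(timeout=60)` should be reported as an explicit timeout rather than as `"BLOCKED: Spawn rejected by user."`, so Track A's failure mode is visible in the logs.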
### 6.3 Fix #3: Per-Call Gemini CLI Adapter (PENDING)
**Option A:** Create new adapter per send call
```python
def _send_gemini_cli(..., adapter=None):
    if adapter is None:
        adapter = GeminiCliAdapter(binary_path="gemini")  # fresh adapter per call
    return adapter.send(...)
```
**Option B:** Add locking around adapter.send()
```python
_gemini_cli_lock = threading.Lock()

def _send_gemini_cli(...):
    global _gemini_cli_adapter
    with _gemini_cli_lock:
        if _gemini_cli_adapter is None:
            _gemini_cli_adapter = GeminiCliAdapter(...)
        return _gemini_cli_adapter.send(...)
```
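Trade-off between the two options: Option B is the smaller change but serializes every `gemini_cli` send behind one lock, which undercuts concurrent worker execution; Option A keeps workers parallel at the cost of a fresh adapter (and its session state) per call. A per-worker adapter held in `threading.local()` would be a middle ground, though that option is not explored in this report.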
---
## 7. Test Verification Strategy
### 7.1 Current Test
```python
# tests/test_mma_concurrent_tracks_sim.py
def test_mma_concurrent_tracks_execution(live_gui):
    # 1. Setup mock provider
    client.set_value('current_provider', 'gemini_cli')
    client.set_value('gcli_path', f'"{sys.executable}" "{mock_path}"')
    # 2. Plan epic → generates Track A and Track B
    client.set_value('mma_epic_input', 'PATH: Epic Initialization')
    client.click('btn_mma_plan_epic')
    # Poll until len(proposed_tracks) >= 2
    # 3. Accept tracks
    client.click('btn_mma_accept_tracks')
    # Poll until len(tracks) >= 2
    # 4. Start Track A, then Track B
    client.click('btn_mma_start_track', user_data=track_a_id)
    time.sleep(0.5)
    client.click('btn_mma_start_track', user_data=track_b_id)
    # 5. Poll get_mma_workers() looking for both ticket-a-1 and ticket-b-1
    seen_a = seen_b = False
    for i in range(40):
        workers = client.get_mma_workers().get('workers', {})
        stream_ids = workers.keys()
        if any("ticket-A-1" in sid for sid in stream_ids):
            seen_a = True
        if any("ticket-B-1" in sid for sid in stream_ids):
            seen_b = True
        if seen_a and seen_b:
            break
        time.sleep(1)
    assert seen_a, "Worker from Track A never appeared"
    assert seen_b, "Worker from Track B never appeared"
```
### 7.2 Debug Points
Add debug logging at the following points (a minimal helper sketch follows the list):
1. `_cb_start_track` entry/exit
2. `engine.run()` loop iteration
3. `confirm_spawn` call and timeout
4. `stream_callback` invocation
5. `_handle_mma_stream` entry
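A minimal shape for that logging (a sketch only; the helper and logger names are placeholders, not existing project code):
```python
# Sketch of a trace helper for the debug points above (names are placeholders).
import logging
import threading
from typing import Optional

log = logging.getLogger("mma.trace")

def trace(event: str, ticket_id: Optional[str] = None, **fields) -> None:
    """Tag each line with the current thread so Track A and Track B interleave legibly."""
    extra = " ".join(f"{k}={v}" for k, v in fields.items())
    log.debug("[%s] %s ticket=%s %s", threading.current_thread().name, event, ticket_id, extra)

# Hypothetical call sites matching the list above:
# trace("_cb_start_track:enter", track_id=track_id)
# trace("confirm_spawn:timeout", ticket_id=ticket.id)
# trace("stream_callback:chunk", ticket_id=ticket.id, size=len(chunk))
```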
---
## 8. Related Files Reference
| File | Key Functions/Classes | Relevance |
|------|----------------------|-----------|
| `src/app_controller.py` | `_cb_start_track`, `_cb_load_track`, `engines`, `mma_streams`, `_handle_mma_stream` | Central coordinator |
| `src/multi_agent_conductor.py` | `run_worker_lifecycle`, `confirm_spawn`, `stream_callback`, `WorkerPool` | Worker execution |
| `src/ai_client.py` | `send`, `_send_gemini_cli`, `comms_log_callback`, `_append_comms` | AI client, callback dispatch |
| `src/project_manager.py` | `flat_config`, `load_track_state`, `save_track_state` | Project/track persistence |
| `src/aggregate.py` | `run()` | Context aggregation |
| `src/api_hooks.py` | `HookServer`, `/api/mma/workers` | Test interface |
| `src/api_hook_client.py` | `get_mma_workers()` | Test client |
| `src/gemini_cli_adapter.py` | `GeminiCliAdapter.send()` | CLI subprocess adapter |
| `tests/mock_concurrent_mma.py` | `main()` | Mock provider for tests |
---
## 9. Commit History for This Track
| Commit | Description |
|--------|-------------|
| `ac0b564` | fix(mma): Change self.engine to self.engines dict for concurrent track support |
| `6f2a2c2` | fix(gui): Initialize active_tickets in AppController.init_state |
| `f3585cb` | fix(mma): Use track.id instead of self.active_track.id in _start_track_logic |
| `b0a837d` | fix(mma): Add explicit reload logic when _cb_load_track fails in _cb_start_track |
| `cab733a` | debug: Add logging to _cb_start_track |
| `b44ee29` | conductor(plan): Mark task complete |
| `6f2f539` | conductor(track): Update spec and plan with complete bug analysis |
---
## 10. Open Questions
1. **Is `confirm_spawn` actually timing out for Track A?** Need to add logging to confirm.
2. **Is the Gemini CLI adapter causing session interference?** Need to verify with per-call adapters.
3. **Is `auto_queue=True` causing immediate execution before dialog can be shown?** The `auto_queue` parameter bypasses step mode but still calls `confirm_spawn` for HITL clutch.
---
## 11. Recommended Immediate Fixes
### Priority 1: Fix `confirm_spawn` Dialog Container
Make dialog containers ticket-specific using a dictionary keyed by `ticket_id` instead of a shared list.
### Priority 2: Lock the Shared Gemini CLI Adapter
Add a lock around `_gemini_cli_adapter.send()` to prevent session interference.
### Priority 3: Add Comprehensive Debug Logging
Log every major lifecycle event in both worker threads and the GUI to trace exactly where Track A diverges from Track B.
---
## 12. Phase Checklist
### Phase 1: Investigate Test Infrastructure
- [x] Task: Compare test_mma_concurrent_tracks_sim.py with working tests
- [x] Task: Check subprocess/port cleanup between live_gui fixture tests
- [x] Task: Verify get_mma_workers() API returns expected format
- [x] Task: Run isolated concurrent test with verbose debugging
### Phase 2: Identify Root Cause - MULTIPLE BUGS FOUND
- [x] Bug 1: `self.engine` single reference overwritten (FIXED in ac0b564)
- [x] Bug 2: `active_tickets` not initialized (FIXED in 6f2a2c2)
- [x] Bug 3: `_start_track_logic` uses `self.active_track.id` when None (FIXED in f3585cb)
- [x] Bug 4: `_cb_start_track` overwrites `active_track` (PARTIALLY FIXED in b0a837d)
### Phase 3: Implement Fix
- [x] Task 3.1: Change `self.engine: Optional[ConductorEngine]` to `self.engines: Dict[str, ConductorEngine]`
- [x] Task 3.2: Update all `self.engine` references to track-specific lookup
- [x] Task 3.3: Fix `_start_track_logic` to use `track.id` instead of `self.active_track.id`
- [x] Task 3.4: Add `self.active_tickets = []` in `init_state()`
- [x] Task 3.5: Add explicit reload path in `_cb_start_track` when load fails
- [ ] Task 3.6: Verify `_cb_load_track` properly loads tracks created by `_cb_accept_tracks`
### Phase 4: Fix Thread-Local Callback (IMPLEMENTED)
- [x] Task 4.1: Make comms_log_callback thread-local via _local_storage
- [x] Task 4.2: Update _append_comms to use thread-local getter
- [x] Task 4.3: Update direct callback invocations to use thread-local getter
- [x] Task 4.4: Update run_worker_lifecycle to use set_comms_log_callback
### Phase 5: Fix confirm_spawn Dialog (PENDING)
- [ ] Task 5.1: Make dialog_container per-ticket using ticket_id key
- [ ] Task 5.2: Update GUI handler to set correct dialog_container
### Phase 6: Fix Gemini CLI Adapter (PENDING)
- [ ] Task 6.1: Add lock around _gemini_cli_adapter.send() OR
- [ ] Task 6.2: Create per-call adapter
### Phase 7: Final Verification
- [ ] Task 7.1: Run concurrent MMA tests
- [ ] Task 7.2: Run full test suite to check for regressions
- [ ] Task 7.3: Remove debug logging and commit final fixes
---
*This report was generated during the fix_concurrent_mma_tests_20260507 track investigation.*
*Last updated: 2026-05-07*
@@ -1,10 +0,0 @@
{
"id": "fix_concurrent_mma_tests_20260507",
"title": "Fix Concurrent MMA Live GUI Tests",
"created": "2026-05-07",
"status": "in_progress",
"priority": "high",
"estimate": "medium",
"tags": ["testing", "mma", "concurrency", "integration-tests"],
"depends_on": []
}
@@ -1,42 +0,0 @@
# Implementation Plan: Fix Concurrent MMA Live GUI Tests
## Phase 1: Investigate Test Infrastructure
- [x] Task: Compare test_mma_concurrent_tracks_sim.py with working tests - Same fixture, same pattern
- [x] Task: Check subprocess/port cleanup between live_gui fixture tests - Not cleanup issue, single test also times out
- [x] Task: Verify get_mma_workers() API returns expected format - Returns {"workers": Dict[str, str]}
- [x] Task: Run isolated concurrent test with verbose debugging - Confirmed: only ONE worker appears when starting two tracks
## Phase 2: Identify Root Cause - MULTIPLE BUGS FOUND
- [x] Bug 1: `self.engine` single reference overwritten (FIXED in ac0b564)
- [x] Bug 2: `active_tickets` not initialized (FIXED in 6f2a2c2)
- [x] Bug 3: `_start_track_logic` uses `self.active_track.id` when None (FIXED in f3585cb)
- [x] Bug 4: `_cb_start_track` overwrites `active_track` (PARTIALLY FIXED in b0a837d)
## Phase 3: Implement Fix
- [x] Task 3.1: Change `self.engine: Optional[ConductorEngine]` to `self.engines: Dict[str, ConductorEngine]`
- [x] Task 3.2: Update all `self.engine` references to track-specific lookup
- [x] Task 3.3: Fix `_start_track_logic` to use `track.id` instead of `self.active_track.id`
- [x] Task 3.4: Add `self.active_tickets = []` in `init_state()`
- [x] Task 3.5: Add explicit reload path in `_cb_start_track` when load fails
- [ ] Task 3.6: Verify `_cb_load_track` properly loads tracks created by `_cb_accept_tracks`
## Phase 4: Debug Remaining Issue
- [ ] Task 4.1: Investigate why reload path in `_cb_start_track` doesn't bring back Track A worker
- [ ] Task 4.2: Add debug logging to `_cb_load_track` to trace failure
- [ ] Task 4.3: Verify track state is saved with correct ID when created by `_cb_accept_tracks`
- [ ] Task 4.4: Run test and verify both workers appear
## Phase 5: Final Verification
- [ ] Task 5.1: Run all concurrent MMA tests
- [ ] Task 5.2: Run full test suite to check for regressions
- [ ] Task 5.3: Remove debug logging and commit final fixes
## Commits So Far
- `ac0b564` - fix(mma): Change self.engine to self.engines dict for concurrent track support
- `6f2a2c2` - fix(gui): Initialize active_tickets in AppController.init_state
- `f3585cb` - fix(mma): Use track.id instead of self.active_track.id in _start_track_logic
- `b0a837d` - fix(mma): Add explicit reload logic when _cb_load_track fails in _cb_start_track
- `cab733a` - debug: Add logging to _cb_start_track
## Current Blocker
After commits 1-4, Track B's worker appears but Track A's never does. The reload path (b0a837d) is triggered for Track A but the worker doesn't become visible. Need to trace why `_cb_load_track` succeeds for Track B but fails for Track A, or why the engine thread doesn't produce visible output.
@@ -1,73 +0,0 @@
# Track: Fix Concurrent MMA Live GUI Tests
## Problem Statement
When starting two MMA tracks concurrently via `btn_mma_start_track`, only ONE worker appears instead of two. The test `test_mma_concurrent_tracks_stress_sim.py` correctly identifies this bug.
## Bugs Found (Multiple Layers)
### Bug 1: `self.engine` Single Reference (FIXED - commit ac0b564)
**Root Cause:** `self.engine` was a single ConductorEngine attribute. When second track started, it overwrote `self.engine`, orphaning first track's engine.
**Fix:** Changed to `self.engines: Dict[str, ConductorEngine]` keyed by track.id.
### Bug 2: `active_tickets` Not Initialized (FIXED - commit 6f2a2c2)
**Root Cause:** `AppController.init_state()` never initialized `self.active_tickets`. When the GUI tried to render `_render_mma_dashboard()`, the `__getattr__` delegation looked up `controller.active_tickets`, which didn't exist, raising `AttributeError: 'App' object has no attribute 'active_tickets'`.
**Fix:** Added `self.active_tickets = []` in `init_state()`.
### Bug 3: `_start_track_logic` Uses `self.active_track.id` When None (FIXED - commit f3585cb)
**Root Cause:** `_cb_accept_tracks` calls `_start_track_logic` in a loop for each proposed track. `_start_track_logic` used `self.engines[self.active_track.id]` but `self.active_track` was None (only set by `_cb_load_track`, not by accept logic).
**Fix:** Changed to `self.engines[track.id]` using the locally created track object.
### Bug 4: `_cb_start_track` Overwrites `active_track` (PARTIALLY FIXED - commit b0a837d)
**Root Cause:** When `btn_mma_start_track(track_a_id)` is called:
1. `_cb_load_track(track_a_id)` is called - it may fail if the track was never saved to disk
2. If it fails, `self.active_track` remains None or points to the wrong track
3. When `btn_mma_start_track(track_b_id)` is called next, it overwrites `self.active_track` with Track B
4. The engine for Track A is never created properly
**Fix:** Added explicit reload path when loaded track doesn't match requested track.
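The intent of that reload guard is roughly the following (a sketch of the shape only, not the code committed in b0a837d; the callback signature is assumed):
```python
# Sketch of the reload guard's intent (not the committed code; signature assumed).
def _cb_start_track(self, sender=None, app_data=None, user_data=None):
    track_id = user_data
    if self.active_track is None or self.active_track.id != track_id:
        self._cb_load_track(track_id)            # explicit reload of the requested track
    if self.active_track is None or self.active_track.id != track_id:
        return                                   # reload failed; do not start the wrong track
    self._start_track_logic(self.active_track)   # creates the engine, stores it in self.engines[track_id]
```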
## Current Status
After fixes 1-4, the log shows:
- Track A: `_cb_start_track` called, load fails, reload path triggered but track A worker never visible
- Track B: Works correctly, worker appears
- Test assertion: `seen_a=False, seen_b=True`
## Remaining Issue
The reload path in `_cb_start_track` (commit b0a837d) correctly identifies that load failed, but the reload itself may not be working. Need to verify:
1. `_cb_load_track(track_a_id)` is called with correct track ID
2. Track state is properly saved and can be loaded
3. `active_track` is correctly set after reload
## Architecture Notes
### API Flow for Starting Tracks
```
test clicks btn_mma_start_track(track_a_id)
  → api_hook_client.click('btn_mma_start_track', user_data=track_a_id)
  → POST /api/gui {"action": "click", "item": "btn_mma_start_track", "user_data": track_a_id}
  → _process_gui_task finds btn_mma_start_track in _clickable_actions
  → calls _cb_start_track(user_data=track_a_id)
      → calls _cb_load_track(track_id)
          → loads track state from disk
          → sets self.active_track
      → creates ConductorEngine
      → stores in self.engines[track_id]
      → starts engine thread
```
### Key Files
- `src/app_controller.py` - Contains all the bugs and fixes
- `src/multi_agent_conductor.py` - ConductorEngine, run loop
- `src/api_hooks.py` - API endpoint routing
- `src/gui_2.py` - UI rendering, `__getattr__` delegation
- `tests/test_mma_concurrent_tracks_sim.py` - The failing test
## Success Criteria
- [x] GUI starts without crash (active_tickets initialized)
- [x] `self.engines` dict properly stores multiple concurrent engines
- [ ] Two concurrent tracks produce two concurrent workers visible in `get_mma_workers()`
- [ ] No regression in single-track execution
- [ ] All concurrent tests pass