manual_slop

Author	SHA1	Message	Date
ed	d28e373e54	fix(mock_concurrent_mma): remove session_id fallback from worker check Root cause discovered after the user's batched test run revealed the stress test still failed when run after the execution test. The gemini_cli_adapter persists session_id across tests (singleton). The execution test set session_id to 'mock-worker-ticket-A-1' (from the worker call). When the stress test's epic call ran, it used --resume with that stale session_id. The mock's worker check had a session_id fallback: if 'You are assigned to Ticket' in prompt or session_id.startswith('mock-worker-'): ...worker response... The fallback incorrectly matched the stress test's epic call (which used the stale worker session_id), causing the mock to return a worker response instead of an epic response. The production's generate_tracks then failed to parse the response, returning 0 tracks. Fix: remove the session_id.startswith('mock-worker-') fallback. Route workers based on prompt content only. The session_id is for the production's session management, not for the mock's routing. This is a 'fix the test infrastructure' change (the mock is a test artifact, not production). The production's gemini_cli_adapter could also be fixed to reset session_id on reset_session(), but that's out of scope for this track. Verified: the failing test combination (execution test before stress test) was reproduced and the fix resolves it. The isolated stress test still passes (3 consecutive runs). Note: a separate issue was discovered where self.tracks is being replaced between track appends (different id(self.tracks) values in the diagnostic log). This causes the API to read 0 tracks after the accept. The root cause is unclear from this session's investigation; it appears to be a production code issue where the in-memory track state is being overwritten by a disk read from a different project path. This is documented as a follow-up.	2026-06-27 16:31:45 -04:00
ed	fad1755b7d	fix(mock_concurrent_mma): make epic branch a catch-all for non-empty prompts The stress test (tests/test_mma_concurrent_tracks_stress_sim.py) uses mma_epic_input='STRESS TEST: TRACK A AND TRACK B', which the mock's epic branch did NOT match (it only matched 'PATH: Epic Initialization'). The stress prompt fell to the Default branch which returns text (not JSON), and the production's orchestrator_pm.generate_tracks failed to parse it, returning 0 tracks. The test polled for proposed_tracks (60s timeout, never broke), clicked accept (no proposed_tracks to process), then asserted tracks >= 2 and found 0. Root cause: the mock's epic branch was a literal-substring check for a single test-specific prompt. It was not robust to other test prompts. Fix: restructure routing so that sprint and worker are checked first (more specific patterns), and ANY non-empty prompt that does not match those patterns is treated as an epic request (returns 2 tracks). Empty prompts fall to the Default branch. Verification: - test_mma_concurrent_tracks_execution: still PASSES (uses 'PATH: Epic Initialization' which matches the new catch-all since it doesn't contain sprint or worker patterns) - test_mma_concurrent_tracks_stress_sim: now PASSES (uses 'STRESS TEST: TRACK A AND TRACK B' which matches the new catch-all) - 3 consecutive PASS runs of both tests (13.94s, 14.81s, 14.13s) This is 'adjust the tests instead' per user directive - the mock is a test artifact, not production. The production's generate_tracks correctly returns [] for unparseable responses; the test mock should be robust enough to return valid JSON for any epic-like prompt.	2026-06-27 14:59:04 -04:00
ed	913aa48ca9	fix(mock_concurrent_mma): route sprints on prompt content not session_id The prior session_id-based routing (added in `635ca552`) had two bugs: 1. call_n literal matching (== 2, == 3) is fragile to test ordering: the file-based counter persists across tests in the same session, so call_n != 2 for the 1st sprint if a prior test ran. 2. session_id='mock-sprint-A' means 'this is a follow-up call after the 1st sprint returned mock-sprint-A', so the response should be sprint-B (2nd track tickets), not sprint-A. The prior code routed this to sprint-A, which means track-b's worker has stream id 'ticket-A-1' (not 'ticket-B-1') and the test's 'ticket-B-1' poll never finds it. Fix: route on prompt content. The production's conductor_tech_lead passes the track_brief (containing 'Track A Goal' or 'Track B Goal') in the user_message. The prompt is NOT empty in --resume mode (the gemini_cli_adapter passes the prompt as the first turn of the resumed session). The prompt-based routing is the original pre-635ca552 design and works correctly for any number of tracks (A, B, C) without depending on call ordering. Verified: 3 consecutive test runs PASS (7.81s, 8.90s, 7.95s) after the fix. The 'Worker from Track B never appeared' flakiness is gone.	2026-06-27 14:20:33 -04:00
ed	635ca5523d	fix(mma_concurrent_tracks): partial fix for production+mock regression This test was failing for multiple stacked reasons. Fixed the ones I could identify but the test still does not pass (the bg_task for the second track does not run, suggesting a deeper integration issue). Fixes: 1. src/app_controller.py: _start_track_logic_result and _cb_plan_epic both mutated the frozen ProjectContext dataclass returned by flat_config() via flat.setdefault('files', {})['paths'] = .... The flat_config() return type was changed from dict[str, Any] to a frozen @dataclass ProjectContext by cruft_elimination Phase 2 (in `0d2a9b5e`), but the consumers were never updated. Fix: call flat.to_dict() to get a mutable dict before mutation. 2. src/app_controller.py: _start_track_logic_result iterated over sorted_tickets_data expecting dicts but conductor_tech_lead.topological_sort() returns list[Ticket]. So t_data['id'] raised 'Ticket' object is not subscriptable. Fix: use Ticket attribute access (t_data.id, etc.). 3. tests/mock_concurrent_mma.py: The mock was not handling the --resume session-id case that the gemini_cli_adapter uses for subsequent calls. The mock's first call returns the epic, but the second call (--resume mock-epic) fell to the default case. Fix: parse --resume arg from sys.argv and route to per-track sprint-ticket response based on a persistent call counter. Known remaining issue: only one sprint-ticket mock call is observed in the test log; the second track's _start_track_logic does not appear to call the mock. Could be a deeper integration issue in the test sandbox or in the _cb_accept_tracks._bg_task loop. Test still fails at line 66.	2026-06-27 13:35:05 -04:00
ed	d4b4312dd2	chore: remove debug logging and fix closure bug in test hooks	2026-05-07 15:02:00 -04:00
ed	40f0c04a91	chore(conductor): Mark track 'Fix Concurrent MMA Live GUI Tests' as complete Fixes UI flickering between tracks in app_controller.py and an indentation bug in multi_agent_conductor.py that caused workers to crash silently.	2026-05-07 13:30:42 -04:00
ed	fbd03dc336	missing commits	2026-05-02 19:00:40 -04:00
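
The routing order that commits 913aa48ca9, fad1755b7d and d28e373e54 converge on can be sketched roughly as below. This is a minimal illustration, not the actual contents of tests/mock_concurrent_mma.py: the function name `route_mock_response` and the returned labels are hypothetical, and only the matched substrings ('Track A Goal', 'Track B Goal', 'You are assigned to Ticket') come from the commit messages. The `session_id` parameter is accepted but deliberately ignored, mirroring the removal of the `mock-worker-` fallback.

```python
def route_mock_response(prompt: str, session_id: str = "") -> str:
    """Choose a mock response kind from prompt content only.

    session_id is intentionally unused: the adapter may carry a stale
    value over from an earlier test, so it is not a reliable routing key.
    """
    text = prompt.strip()
    if not text:
        # Empty prompts fall through to the plain-text default response.
        return "default"
    # Specific patterns first: sprint briefs, then worker assignments.
    if "Track A Goal" in text:
        return "sprint-A"
    if "Track B Goal" in text:
        return "sprint-B"
    if "You are assigned to Ticket" in text:
        return "worker"
    # Any other non-empty prompt is treated as an epic request.
    return "epic"
```

With this ordering, both 'PATH: Epic Initialization' and 'STRESS TEST: TRACK A AND TRACK B' reach the epic branch, and an epic prompt sent with a leftover worker session id is still routed as an epic.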

7 Commits
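
The first fix in 635ca5523d (dict-style mutation of a frozen dataclass) can be reproduced in isolation. The sketch below is an assumption-laden stand-in: the real `ProjectContext` fields are not shown in the log, so the single `files` field, the `asdict`-based `to_dict()` body, and the example path value are all hypothetical. It only illustrates why the old call failed and why converting to a plain dict first avoids the problem.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass(frozen=True)
class ProjectContext:
    # Hypothetical field; the real class likely carries more.
    files: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        # Assumed implementation: return a mutable deep copy of the fields.
        return asdict(self)


ctx = ProjectContext()

# Old consumer pattern: a dataclass instance has no dict methods,
# so this raises AttributeError.
try:
    ctx.setdefault("files", {})["paths"] = []  # type: ignore[attr-defined]
    failure = None
except AttributeError as exc:
    failure = type(exc).__name__

# Fixed pattern: convert to a mutable dict, then mutate the copy.
flat = ctx.to_dict()
flat.setdefault("files", {})["paths"] = ["example/path.py"]  # hypothetical value
```

Because `asdict` copies nested dicts, the mutation affects only `flat`; the frozen `ctx` instance is left unchanged.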