manual_slop

Author	SHA1	Message	Date
ed	63fa181192	feat(sim): add pytest timeout(300) and tier_usage Stage 9 check Task 2.3: prevent infinite CI hangs with 300s hard timeout Task 3.2: non-blocking Stage 9 logs mma_tier_usage after Tier 3 completes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-01 14:24:05 -05:00
ed	08734532ce	test(mock): add standalone test for mock_gemini_cli routing 4 tests verify: epic prompt -> Track JSON, sprint prompt -> Ticket JSON with correct field names, worker prompt -> plain text, tool-result -> plain text. All pass in 0.57s. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-01 14:22:53 -05:00
ed	0593b289e5	fix(mock): correct sprint ticket format and add keyword detection - description/status/assigned_to fields now match parse_json_tickets expectations - Sprint planning branch also detects 'generate the implementation tickets' Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-01 14:21:21 -05:00
ed	f7e417b3df	fix(mma-exec): add --dangerously-skip-permissions for headless file writes Tier 3 workers need to read/write files in headless mode. Without this flag, all file tool calls are blocked waiting for interactive permission. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-01 14:20:38 -05:00
ed	36d464f82f	fix(mma-exec): strip ANTHROPIC_API_KEY from subprocess env to use subscription login When ANTHROPIC_API_KEY is set in the shell environment, claude --print routes through the API key instead of subscription auth. Stripping it forces the CLI to use subscription login for all Tier 3/4 delegation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-01 14:18:57 -05:00
ed	3f8ae2ec3b	fix(conductor): load Tier 2 role doc in startup, add Tier 3 failure protocol - Add step 1: read mma-tier2-tech-lead.md before any track work - Add explicit stop rule when Tier 3 delegation fails (credit/API error) Tier 2 must NOT silently absorb Tier 3 work as a fallback Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-01 14:09:23 -05:00
ed	5cacbb1151	conductor(plan): Mark task 3.2 complete — sim test PASSED Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-01 14:04:57 -05:00
ed	ce5b6d202b	fix(tier1): disable tools in generate_tracks, add enable_tools param to ai_client.send Tier 1 planning calls are strategic — the model should never use file tools during epic initialization. This caused JSON parse failures when the model tried to verify file references in the epic prompt. - ai_client.py: add enable_tools param to send() and _send_gemini() - orchestrator_pm.py: pass enable_tools=False in generate_tracks() - tests/visual_sim_mma_v2.py: remove file reference from test epic Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-01 14:04:44 -05:00
ed	c023ae14dc	conductor(plan): Update task 3.1 complete, 3.2 awaiting verification	2026-03-01 13:42:52 -05:00
ed	89a8d9bcc2	test(sim): Rewrite visual_sim_mma_v2 for real Gemini API with frame-sync fixes Uses gemini-2.5-flash-lite (real API, CLI quota exhausted). Adds _poll/_drain_approvals helpers, frame-sync sleeps after all state-changing clicks, proper stage transitions, and 120s timeouts for real API latency. Addresses simulation_hardening Issues 2 & 3.	2026-03-01 13:42:34 -05:00
ed	24ed309ac1	conductor(plan): Mark task 3.1 complete — Stage 8 assertions already correct	2026-03-01 13:26:15 -05:00
ed	0fe74660e1	conductor(plan): Mark Phase 2 complete, begin Phase 3	2026-03-01 13:25:24 -05:00
ed	a2097f14b3	fix(mma): Add Tier 1 and Tier 2 token tracking from comms log Task 2.2 of mma_pipeline_fix_20260301: _cb_plan_epic captures comms baseline before generate_tracks() and pushes mma_tier_usage['Tier 1'] update via custom_callback. _start_track_logic does same for generate_tickets() -> mma_tier_usage['Tier 2'].	2026-03-01 13:25:07 -05:00
ed	2f9f71d2dc	conductor(plan): Mark task 2.1 complete, begin 2.2	2026-03-01 13:22:34 -05:00
ed	3eefdfd29d	fix(mma): Replace token stats stub with real comms log extraction in run_worker_lifecycle Task 2.1 of mma_pipeline_fix_20260301: capture comms baseline before send(), then sum input_tokens/output_tokens from IN/response entries to populate engine.tier_usage['Tier 3'].	2026-03-01 13:22:15 -05:00
ed	d5eb3f472e	conductor(plan): Mark task 1.4 as complete, begin Phase 2	2026-03-01 13:20:10 -05:00
ed	c5695c6dac	test(mma): Add test verifying run_worker_lifecycle pushes response via _queue_put Task 1.4 of mma_pipeline_fix_20260301: asserts stream_id='Tier 3 (Worker): T1', event_name='response', text and status fields correct.	2026-03-01 13:19:50 -05:00
ed	130a36d7b2	conductor(plan): Mark tasks 1.1, 1.2, 1.3 as complete	2026-03-01 13:18:09 -05:00
ed	b7c283972c	fix(mma): Add diagnostic logging and remove unsafe asyncio.Queue else branches Tasks 1.1, 1.2, 1.3 of mma_pipeline_fix_20260301: - Task 1.1: Add [MMA] diagnostic print before _queue_put in run_worker_lifecycle; enhance except to include traceback - Task 1.2: Replace unsafe event_queue._queue.put_nowait() else branches with RuntimeError in run_worker_lifecycle, confirm_execution, confirm_spawn - Task 1.3: Verified run_in_executor positional arg order is correct (no change needed)	2026-03-01 13:17:37 -05:00
ed	cf7938a843	wrong archive location	2026-03-01 13:17:34 -05:00
ed	3d398f1905	remove main context	2026-03-01 10:26:01 -05:00
ed	52f3820199	conductor(gui_ux): Add Phase 6 — live streaming, per-tier model config, parallel DAG, auto-retry Addresses three gaps where Claude Code and Gemini CLI outperform Manual Slop's MMA during actual execution: 1. Live worker streaming: Wire comms_log_callback to per-ticket streams so users see real-time output instead of waiting for worker completion. 2. Per-tier model config: Replace hardcoded get_model_for_role with GUI dropdowns persisted to project TOML. 3. Parallel DAG execution: asyncio.gather for independent tickets (exploratory — _send_lock may block, needs investigation). 4. Auto-retry with escalation: flash-lite -> flash -> pro on BLOCKED, up to 2 retries (wires existing --failure-count mechanism into ConductorEngine). 7 new tasks across Phase 6, bringing total to 30 tasks across 6 phases. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-01 10:24:29 -05:00
ed	0b03b612b9	chore: Wire architecture docs into mma_exec.py and workflow delegation prompts mma_exec.py changes: - get_role_documents: Tier 1 now gets docs/guide_architecture.md + guide_mma.md (was: only product.md). Tier 2 gets same (was: only tech-stack + workflow). Tier 3 gets guide_architecture.md (was: only workflow.md — workers modifying gui_2.py had zero knowledge of threading model). Tier 4 gets guide_architecture.md (was: nothing). - Tier 3 system directive: Added ARCHITECTURE REFERENCE callout, CRITICAL THREADING RULE (never write GUI state from background thread), TASK FORMAT instruction (follow WHERE/WHAT/HOW/SAFETY from surgical tasks), and py_get_definition to tool list. - Tier 4 system directive: Added ARCHITECTURE REFERENCE callout and instruction to trace errors through thread domains documented in guide_architecture.md. conductor/workflow.md changes: - Red Phase delegation prompt: Replaced 'with a prompt to create tests' with surgical prompt format example showing WHERE/WHAT/HOW/SAFETY. - Green Phase delegation prompt: Replaced 'with a highly specific prompt' with surgical prompt format example with exact line refs and API calls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-01 10:16:38 -05:00
ed	4e2003c191	chore(gemini): Encode surgical methodology into all Gemini MMA skills Updates three Gemini skill files to match the Claude command methodology: mma-orchestrator/SKILL.md: - New Section 0: Architecture Fallback with links to all 4 docs/guide_*.md - New Surgical Spec Protocol (6-point mandatory checklist) - New Section 5: Cross-Skill Activation for tier transitions - Example 2 rewritten with surgical prompt (exact line refs + API calls) - New Example 3: Track creation with audit-first workflow - Added py_get_definition to tool usage guidance mma-tier1-orchestrator/SKILL.md: - Added Architecture Fallback and Surgical Spec Protocol summary - References activate_skill mma-orchestrator for full protocol mma-tier2-tech-lead/SKILL.md: - Added Architecture Fallback section - Added Surgical Delegation Protocol with WHERE/WHAT/HOW/SAFETY example Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-01 10:13:29 -05:00
ed	52a463d13f	conductor: Encode surgical spec methodology into Tier 1 skills for Claude and Gemini Distills what made this session's track specs high-quality into reusable methodology for both Claude and Gemini Tier 1 orchestrators: Key additions to conductor-new-track.md: - MANDATORY Step 2: Deep Codebase Audit before writing any spec - 'Current State Audit' section template (Already Implemented + Gaps) - 6 rules for writing worker-ready tasks (WHERE/WHAT/HOW/SAFETY) - Anti-patterns section (vague specs, no line refs, no audit, etc.) - Architecture doc fallback references Key additions to mma-tier1-orchestrator.md (Claude + Gemini): - 'The Surgical Methodology' section with 6 protocols - Spec template with REQUIRED sections (Current State Audit is mandatory) - Plan template with REQUIRED task format (file:line refs + API calls) - Root cause analysis requirement for fix tracks - Cross-track dependency mapping requirement - Added py_get_definition to Gemini's tool list (was missing) The core insight: the quality gap between this session's output and previous track specs came from (1) reading actual code before writing specs, (2) listing what EXISTS before what's MISSING, and (3) specifying exact locations and APIs in tasks so lesser models don't have to search or guess. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-01 10:08:25 -05:00
ed	458529fb13	chore(conductor): Add index.md to new tracks, archive completed/superseded tracks - Add index.md to mma_pipeline_fix, simulation_hardening, context_token_viz - Archive documentation_refresh_20260224 (superseded by `08e003a` rewrite) - Archive robust_live_simulation_verification (context distilled into simulation_hardening_20260301 spec) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-01 10:00:49 -05:00
ed	0d2b6049d1	conductor: Create 3 MVP tracks with surgical specs from full codebase analysis Three new tracks identified by analyzing product.md requirements against actual codebase state using 1M-context Opus with all architecture docs loaded: 1. mma_pipeline_fix_20260301 (P0, blocker): - Diagnoses why Tier 3 worker output never reaches mma_streams in GUI - Identifies 4 root cause candidates: positional arg ordering, asyncio.Queue thread-safety violation, ai_client.reset_session() side effects, token stats stub returning empty dict - 2 phases, 6 tasks with exact line references 2. simulation_hardening_20260301 (P1, depends on pipeline fix): - Addresses 3 documented issues from robust_live_simulation session compression - Mock triggers wrong approval popup, popup state desync, approval ambiguity - 3 phases, 9 tasks including standalone mock test suite 3. context_token_viz_20260301 (P2): - Builds UI for product.md primary use case #2 'Context & Memory Management' - Backend already complete (get_history_bleed_stats, 140 lines) - Token budget bar, proportion breakdown, trimming preview, cache status - 3 phases, 10 tasks Execution order: pipeline_fix -> simulation_hardening -> gui_ux (parallel w/ token_viz) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-01 09:58:34 -05:00
ed	d93f650c3a	conductor: Refine GUI UX track with full codebase knowledge, add doc references Rewrites comprehensive_gui_ux_20260228 spec and plan using deep analysis of the actual gui_2.py implementation (3078 lines). The previous spec asked to implement features that already exist (Track Browser, DAG tree, epic planning, approval dialogs, token table, performance monitor). The new spec: - Documents 15 already-implemented features with exact line references - Identifies 8 actual gaps (tier stream panels, DAG editing, cost tracking, conductor lifecycle forms, track-scoped discussions, approval indicators, track proposal editing, stream scrollability) - Rewrites all 5 phases with surgical task descriptions referencing exact gui_2.py line ranges, function names, and data structures - Each task specifies the precise imgui API calls to use - References docs/guide_architecture.md for threading constraints - References docs/guide_mma.md for Ticket/Track data structures Also adds architecture documentation fallback references to: - conductor/workflow.md (new principle #9) - conductor/product.md (new Architecture Reference section) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-01 09:51:37 -05:00
ed	08e003a137	docs: Complete documentation rewrite at gencpp/VEFontCache reference quality Rewrites all docs from Gemini's 330-line executive summaries to 1874 lines of expert-level architectural reference matching the pedagogical depth of gencpp (Parser_Algo.md, AST_Types.md) and VEFontCache-Odin (guide_architecture.md). Changes: - guide_architecture.md: 73 -> 542 lines. Adds inline data structures for all dialog classes, cross-thread communication patterns, complete action type catalog, provider comparison table, 4-breakpoint Anthropic cache strategy, Gemini server-side cache lifecycle, context refresh algorithm. - guide_tools.md: 66 -> 385 lines. Full 26-tool inventory with parameters, 3-layer MCP security model walkthrough, all Hook API GET/POST endpoints with request/response formats, ApiHookClient method reference, /api/ask synchronous HITL protocol, shell runner with env config. - guide_mma.md: NEW (368 lines). Fills major documentation gap — complete Ticket/Track/WorkerContext data structures, DAG engine algorithms (cycle detection, topological sort), ConductorEngine execution loop, Tier 2 ticket generation, Tier 3 worker lifecycle with context amnesia, token firewalling. - guide_simulations.md: 64 -> 377 lines. 8-stage Puppeteer simulation lifecycle, mock_gemini_cli.py JSON-L protocol, approval automation pattern, ASTParser tree-sitter vs stdlib ast comparison, VerificationLogger. - Readme.md: Rewritten with module map, architecture summary, config examples. - docs/Readme.md: Proper index with guide contents table and GUI panel docs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-01 09:44:50 -05:00
ed	bf4468f125	docs(conductor): Expert-level architectural documentation refresh	2026-03-01 09:19:48 -05:00
ed	7384df1e29	remove track from tracks	2026-03-01 09:09:04 -05:00
ed	e19b78e090	chore(conductor): Archive track 'Consolidate Temp/Test Cruft & Log Taxonomy'	2026-03-01 09:08:15 -05:00
ed	cfcfd33453	docs(conductor): Synchronize docs for track 'Consolidate Temp/Test Cruft & Log Taxonomy'	2026-03-01 09:07:39 -05:00
ed	bcbccf3cc4	don't use flash-lite for tier 3	2026-03-01 09:07:17 -05:00
ed	cb129d06cd	chore(conductor): Mark track 'Consolidate Temp/Test Cruft & Log Taxonomy' as complete	2026-03-01 09:07:04 -05:00
ed	68b9f9baee	conductor(plan): Mark Phase 4 and Track as complete	2026-03-01 09:06:55 -05:00
ed	7f95ebd85e	conductor(plan): Mark Phase 3 as complete [checkpoint: `61d513a`]	2026-03-01 09:06:19 -05:00
ed	61d513ad08	feat(migration): Add script to consolidate legacy logs and artifacts	2026-03-01 09:06:07 -05:00
ed	32f7a13fa8	conductor(plan): Mark Phase 2 as complete [checkpoint: `6326546`]	2026-03-01 09:03:15 -05:00
ed	6326546005	feat(taxonomy): Redirect logs and artifacts to dedicated sub-folders	2026-03-01 09:03:02 -05:00
ed	09bedbf4f0	conductor(plan): Mark Phase 1 as complete [checkpoint: `590293e`]	2026-03-01 08:59:15 -05:00
ed	590293e3d8	conductor(plan): Mark Phase 1 as complete	2026-03-01 08:59:07 -05:00
ed	fab109e31b	chore(conductor): Fix .gitignore corruption and add artifact/log dirs	2026-03-01 08:58:45 -05:00
ed	27e67df4e3	prep doc track.	2026-03-01 08:57:01 -05:00
ed	efaf4e98c4	chore(conductor): Add new track 'Consolidate Temp/Test Cruft & Log Taxonomy'	2026-03-01 08:49:19 -05:00
ed	26287215c5	get rid of cruft	2026-03-01 08:44:30 -05:00
ed	472966cb61	chore(conductor): Add new track 'Comprehensive Conductor & MMA GUI UX'	2026-03-01 08:43:15 -05:00
ed	332cc9da84	chore(conductor): Mark track 'Robust Live Simulation Verification' as complete	2026-03-01 08:37:23 -05:00
ed	da21ed543d	fix(mma): Unblock visual simulation - event routing, loop passing, adapter preservation Three independent root causes fixed: - gui_2.py: Route mma_spawn_approval/mma_step_approval events in _process_event_queue - multi_agent_conductor.py: Pass asyncio loop from ConductorEngine.run() through to thread-pool workers for thread-safe event queue access; add _queue_put helper - ai_client.py: Preserve GeminiCliAdapter in reset_session() instead of nulling it Test: visual_sim_mma_v2::test_mma_complete_lifecycle passes in ~8s Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-01 08:32:31 -05:00
ed	db32a874fd	ignore temp workspace	2026-02-28 23:02:22 -05:00
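Several of the commits above (`da21ed543d`, `b7c283972c`) concern putting events onto an `asyncio.Queue` from thread-pool workers: the loop is passed through to the worker, and direct `put_nowait()` calls from the worker thread are replaced with a `RuntimeError` when no loop is available. The log does not show the actual `_queue_put` helper, so the following is only a minimal sketch of that general pattern. The name `queue_put_threadsafe`, its signature, and the event payload are hypothetical and not taken from the repository.

```python
import asyncio


def queue_put_threadsafe(loop, queue, item):
    """Hypothetical helper: hand `item` to an asyncio.Queue from a worker thread.

    asyncio.Queue is not thread-safe, so the put is scheduled on the loop's
    own thread via call_soon_threadsafe instead of being called directly.
    """
    if loop is None:
        # Assumed behaviour, mirroring the "raise RuntimeError" choice in the log:
        # fail loudly rather than touch the queue from the wrong thread.
        raise RuntimeError("no event loop supplied; refusing unsafe queue access")
    loop.call_soon_threadsafe(queue.put_nowait, item)


async def demo():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    # Illustrative payload; the field names are assumptions.
    event = {"event": "response", "text": "done"}
    # Run the helper in the default thread pool, passing the loop explicitly.
    await loop.run_in_executor(None, queue_put_threadsafe, loop, queue, event)
    # The scheduled put runs on the loop thread; wait for it with a timeout.
    return await asyncio.wait_for(queue.get(), timeout=5)


result = asyncio.run(demo())
```

Passing the loop explicitly lets the worker avoid looking up an event loop from its own thread, where none is running.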

789 Commits