manual_slop

Author	SHA1	Message	Date
ed	f9bd8505c9	docs(tier2): workflow.md hard bans - AppData denied (no exception) Updated conductor/workflow.md §'Tier 2 Autonomous Sandbox' hard bans table. The 'File access outside Tier 2 clone + app-data dir' row now says: 'File access outside Tier 2 clone (AppData, Temp, Documents, etc. all denied at the OpenCode * level + targeted AppData\\\\ deny)'. Per the user's 2026-06-18 'NEVER USE APPDATA' directive. Refs: conductor/tracks/tier2_no_appdata_20260618	2026-06-18 14:41:26 -04:00
ed	54eb4740b3	conductor+layout: remove T-shirt size metric, regenerate stale layout Per user feedback 2026-06-17: - T-shirt size is not an acceptable sizing metric. Remove it from conductor/workflow.md (the policy file), conductor/tracks.md (the registry), and docs/reports/NEGATIVE_FLOWS_INVESTIGATION_20260617.md. - Regenerate manualslop_layout.ini to remove 83 stale window references that pointed to deleted/renamed windows (Projects, Files, Screenshots, Provider, System Prompts, Discussion History, Comms History, etc.). Layout now matches the windows registered in src/app_controller.py _default_windows (lines 1862-1886). Stale window count: 10 -> 3. T-shirt size removal details: - conductor/workflow.md: Removed the S/M/L/XL table, the replacement pattern row, and the 'reasonable effort' guard's reference. Scope (N files, M sites, N tasks) is the only effort dimension. - conductor/tracks.md: Removed the T-shirt column from the table header and removed T-shirt size mentions from the Fable track entry. - docs/reports/NEGATIVE_FLOWS_INVESTIGATION_20260617.md: Removed the T-shirt size mention in the follow-up track suggestion. Layout fix: - manualslop_layout.ini went from 17,360 bytes (102 windows, 83 stale) to 3,361 bytes (23 windows, all matching _default_windows). The stale window warning dropped from 10 windows to 3 (Message, Tool Calls, Response - these are in _default_windows but reference separate panels in the layout). Verification: layout fix did NOT fix the underlying stack overflow crash. After layout fix, the test still dies with rc=3221225725 (0xC00000FD). The user noted 'Something more fundamental is wrong.' Investigation continues; this commit only addresses the explicit ask (remove T-shirt, fix layout).	2026-06-17 12:23:03 -04:00
ed	07a0e66a19	docs(tier2): apply user feedback - 6 workflow conventions User feedback from the first sandbox run (send_result_to_send_20260616, 2026-06-17) identified 6 conventions Tier 2 must follow. Update the agent prompt template, slash command template, user guide, and workflow doc: 1. Test runner: ALWAYS use 'uv run python scripts/run_tests_batched.py' (NOT 'uv run pytest'). The batched runner provides tier filtering, parallelization (xdist), and a summary table that direct pytest lacks. 2. Default branch: this repo uses 'master', not 'main'. The Tier 2 slash command now does 'git fetch origin master' (was 'origin main'). 3. Line endings: preserve existing. This repo has a mix of CRLF and LF; a repo-wide LF standardization is a future track. 4. Throw-away scripts: write to 'scripts/tier2/artifacts/<track>/', NOT the base 'scripts/tier2/' directory. The base is reserved for production code; throw-away scripts are kept for archival but isolated per-track. 5. End-of-track report: write 'docs/reports/TRACK_COMPLETION_<track>.md' and update 'state.toml' to 'status=completed'. The user reads this to decide merge. Previously this was implicit; now it's explicit. 6. Run-time expectation: tracks are 1-4 hours. If context runs out, Tier 2 notes progress to disk and continues. The --resume flag picks up from the last completed task. Also updated the user guide with a 'Conventions' section and a troubleshooting entry for the resume flow. The verify-the-sandbox checklist now uses 'origin master' instead of 'origin main'.	2026-06-17 02:13:29 -04:00
ed	4cf885da90	docs(workflow+agents): add HARD BAN on day estimates + Tier 1 Track Initialization Rules section	2026-06-16 10:16:49 -04:00
ed	35c6cca134	docs: agent workflow docs + regular docs (v2.3 surfacing) Per user request 'use your remaining context to update agent workflow docs and then regular docs based on what was discussed in this report', this commit creates/updates 15 files derived from the v2.3 nagent review (the 12 new nagent additions + the 4 memory dimensions reframing + the cache strategy + the RAG discipline + the knowledge harvest pattern). Agent workflow docs (4 files): - AGENTS.md (UPDATE): add @import line to canonical DOD + 'Code Styleguides' section pointing to the 6 new styleguides + new 'Human-Facing Documentation' section pointing to ./docs/AGENTS.md - conductor/workflow.md (UPDATE): new section 'Additions (2026-06-12) - the 12 patterns from the latest nagent corpus' with TDD protocols for knowledge harvest, cache ordering, compaction, RAG discipline - conductor/product-guidelines.md (UPDATE): new sections 'Memory Dimensions (added 2026-06-12)' + 'See Also - Updated' with the 6-styleguide catalog - docs/AGENTS.md (NEW): the agent-facing mirror of docs/Readme.md (per the nagent CLAUDE.md pattern). 10 sections + the per-tier reading path + the 4 memory dimensions + the caching strategy + the knowledge harvest + the RAG discipline + the feature flags Regular docs (11 files): - 6 new styleguides (the convention catalog): * data_oriented_design.md: the canonical DOD reference (Tier 0/1/2; 3 defaults to reject; 8 core defaults; 7-question simplification pass; 10-question self-check; 4 memory dimensions in Manual Slop context) * agent_memory_dimensions.md: the 4 memory dims (curation / discussion / RAG / knowledge) + when to use each + the boundaries * rag_integration_discipline.md: the conservative-RAG rule (opt-in, complement, provenance, no mutation, feature-gated, graceful failure) * cache_friendly_context.md: stable-to-volatile context ordering + the cache TTL GUI contract + the byte-comparison test * knowledge_artifacts.md: the knowledge harvest pattern (category files, provenance, sha256 ledger, digest regeneration, 'delete to turn off') * feature_flags.md: file presence vs config flags vs CLI flags - 3 new project docs (the cross-cutting guides): * guide_agent_memory_dimensions.md: the cross-cutting guide on the 4 dims + the decision tree * guide_caching_strategy.md: caching across providers + stable-to-volatile ordering + cache TTL GUI + the byte-comparison test + the 5th provider (claude-code) * guide_knowledge_curation.md: the knowledge memory guide (4th dim) + the 5 category files + per-file notes + the digest + the ledger + the harvest workflow - 2 existing doc updates: * guide_mma.md: new sections 'Delegation as context management' + 'The 4 memory dimensions (the MMA scope)' * guide_ai_client.md: new section 'Cache strategy and the 12-layer model' + the 5th provider (claude-code) All files use the same style as the v2.3 review (the user's preferred format): 7-column tables, no JSON, SSDL shape tags, forth/array notation, file:line citations, ASCII sketches where useful. The human Readme files (Readme.md, docs/Readme.md) are NOT modified (per repeated user instruction). The 5th provider (claude-code) is documented in guide_ai_client.md + the data_oriented_design.md references the nagent pattern as the source of the canonical rules. The cross-references are bidirectional: the 6 styleguides reference the 3 project docs; the 3 project docs reference the 6 styleguides; the 2 doc updates reference both; AGENTS.md + ./docs/AGENTS.md provide the entry points.	2026-06-12 13:50:40 -04:00
ed	8919342b22	docs(workflow): link to error_handling.md styleguide from Code Style section	2026-06-11 23:32:48 -04:00
ed	51edbdef20	docs(workflow,agents): remove 'large files are bad' propaganda; add naming rule The user called out the LLM training data bias: 'small files are good, large files are bad.' This is wrong for production codebases. Unreal has 15K+ line files; OS kernels, game engines, compilers all routinely have 10K+ line files. File size is a non-issue. Cognitive load is managed via naming, regions, and navigation tools (the manual-slop MCP), NOT via file splitting. Updates: 1. AGENTS.md (master agent guidance): - Added 'File Size and Naming Convention' section - Added the hard rule: 'New namespaced src/<thing>.py files may only be created on the user's explicit request. If you find yourself about to create one, ASK FIRST.' - Defaults: helpers and sub-systems go in the parent module 2. conductor/workflow.md (Guiding Principles): - Removed 'Do NOT perform large file writes directly' from principle 7 (it was a delegating rule, but 'large file writes' carried the propaganda) - Added principle 8: 'File Naming Convention (HARD RULE)' that references AGENTS.md - Re-phrased principle 9 (Research-First) to clarify it's about navigation efficiency, not file size 3. conductor/code_styleguides/python.md: - Removed the 'extremely large files that violate the Anti-OOP rule by necessity' framing - Added the new rule about new src/<thing>.py files 4. .opencode/agents/tier3-worker.md and .opencode/agents/tier4-qa.md: - Re-phrased 'Do NOT read full large files' to 'Use skeleton tools to navigate any file regardless of size. File size is not a concern; the right tools are.' - Added the new rule about not creating new src/<thing>.py files unless user explicitly requests it 5. conductor/tracks/qwen_llama_grok_followup_20260611/plan.md: - Updated the 'Naming Convention' section to reference the new 'user explicit request' rule This is docs-only. No code changes. The rule is now codified: agents must ASK FIRST before creating new top-level src/ files.	2026-06-11 10:07:07 -04:00
ed	965e015709	docs(workflow): add 3 test-hell lessons to Known Pitfalls + Live_gui Test Fragility Known Pitfalls (new subsection): - HARD BAN: git checkout -- <file>, git restore, git reset (per AGENTS.md Critical Anti-Patterns; destroyed user in-progress edits twice on 2026-06-07; concrete 2026-06-10 incident: mma_tier_usage_reset_fix regression) Live_gui Test Fragility (2 new subsections): - Anti-pattern: push_event + time.sleep(N) + assert is a race. Fix: poll-until-state-visible with bounded retries. 5+ tests affected in 2026-06-10 batch-green wave. - Async setters need poll-for-state. mma_state_update and rag_* setters dispatch to _pending_gui_tasks queue; the setter returns before the GUI render loop processes the task. Assert immediately = race. Fix: poll via get_value with bounded retry.	2026-06-10 20:19:54 -04:00
ed	93ec28097c	docs(styleguide): add workspace_paths.md — hard rule for test workspace paths	2026-06-09 20:36:41 -04:00
conductor-tier2	631c40c9c4	docs(workflow): add Process Anti-Patterns section + Isolated-Pass rule Two additions to conductor/workflow.md §"Known Pitfalls": 1. Isolated-Pass Verification Fallacy (Added 2026-06-09) — the rule that a test passing in isolation but failing in batch is FAILING. The only verification that matters for live_gui tests is the batch run. This is the flip side of the existing "Live_gui Test Fragility (Authoring-Side)" rule. Cross-references that rule. 2. Process Anti-Patterns (Added 2026-06-09) — 8-rule summary list, with cross-reference to AGENTS.md for the full ruleset. The 8 patterns are: Deduction Loop, Report-Instead-of-Fix, Scope-Creep Track-Doc, Inherited-Cruft, Diagnostic Noise in Production, Premature Surrender, Verbose Commit Message, Isolated-Pass Verification Fallacy. Markdown only. No code modified. Cross-references AGENTS.md (the load-bearing agent doc) for the full text of each pattern.	2026-06-09 14:03:00 -04:00
ed	c9c5535889	docs(workflow): add Skip-Marker Policy section Per 2026-06-07 user feedback during test_suite cleanup: "if the intent is to annotate a known failure, fine. But that known failure must be addressed with priority." New section between "Per-Task Decision Protocol" and "Documentation Refresh Protocol" makes the policy explicit: - Skip markers are DOCUMENTATION, not avoidance - They're useful for opt-in integration tests, unimplemented features, or feature-flag-gated code - They're NOT useful for pre-existing failures, "I don't understand this" issues, or racy tests the agent doesn't want to debug - When adding a marker, MUST document the underlying issue AND what the fix would be - When the fix is in-session reachable, FIX IT INSTEAD of skipping — limited context is not an excuse Includes a 4-question review checklist before adding a skip. References the existing AGENTS.md "Use skip markers as excuse to AVOID" rule so the two policies don't drift.	2026-06-07 16:57:54 -04:00
ed	c073e42a7a	docs(workflow,agents): add 7 process improvements from planning session All additive; no breaking changes to existing content. Derived from gaps observed during the 2026-06-06 planning session (5 tracks spec'd + planned end-to-end). AGENTS.md (1 new section, 16 lines): - Compaction Recovery - explicit recovery path for a new agent picking up mid-track (read the digest, check state.toml, run audits, resume from next unchecked task). Cross-references the workflow-level 'Compaction Recovery' section. conductor/workflow.md (6 new sections, 145 lines): - Planning Session Workflow - documents the brainstorming -> spec -> plan flow used 5x this session; mandates spec approval before plan; notes the plan is the only artifact the implementer reads. - Track Dependencies and Execution Order - verify the blocked_by chain in metadata.json before starting; topological sort gives the recommended execution order (recorded in PLANNING_DIGEST). - State.toml Template - canonical structure (meta / blocked_by / blocks / phases / tasks / verification / track-specific) so future tracks have a consistent shape. - Per-Task Decision Protocol - small decisions (cosmetic) decide yourself; large decisions (architectural) STOP and report; regressions STOP and report. The boundary is 'does this require a new spec or plan update?'. - Documentation Refresh Protocol - after a track ships, identify affected guides (grep for renamed/moved symbols), update them, add new guides for new modules, add styleguides for new conventions. The 'post-tracks documentation' pattern is repeatable; tracks that only update code are incomplete. - Audit Script Policy - whenever a track introduces a new convention that can be statically checked, add an audit script in scripts/ with --help / --json / strict modes. The audit + CI gate pair is the convention-enforcement mechanism; 3 existing audits (audit_main_thread_imports, audit_weak_types, check_test_toml_paths) are the precedent. All sections reference existing project files (brainstorming skill, writing-plans skill, audit scripts, tracks.md, the existing 5 new tracks' spec.md files, PLANNING_DIGEST_20260606.md). No code changes. Documentation only. ~160 lines total added.	2026-06-06 21:22:40 -04:00
ed	0f742b1d5f	conductor(workflow): add Indentation-Driven Class Method Visibility pitfall (2026-06-05)	2026-06-06 02:04:05 -04:00
ed	1488e71568	docs: add Sentinel type contract note to 3 defer-not-catch sections	2026-06-05 20:31:38 -04:00
ed	dc691e3de0	docs(workflow): reframe live_gui fragility as authoring-side, not fixture bug	2026-06-05 18:43:58 -04:00
ed	71b0082bbf	docs(workflow): add Known Pitfalls section (defer-not-catch, theme bisect anchors, live_gui fragility)	2026-06-05 18:31:14 -04:00
ed	a615bbdaa0	conductor(workflow): add 8 new per-file guide references to docs fallback	2026-06-02 23:46:38 -04:00
ed	4c0114f296	docs(workflow): update architecture fallback to 2026-06-02 doc refresh, 45-tool inventory, full guide index	2026-06-02 20:32:34 -04:00
ed	34b1349c4f	WIP: cleaning up ai_client.py	2026-05-13 19:06:33 -04:00
ed	c4e1cca66b	progress on fixing up gui code	2026-05-12 15:20:34 -04:00
ed	ce9306d441	adjustments	2026-03-06 10:21:39 -05:00
ed	d575ebb471	adjustments	2026-03-06 10:18:16 -05:00
ed	c5418acbfe	redundant checklist...	2026-03-04 22:43:49 -05:00
ed	dccfbd8bb7	docs(post-mortem): Apply session start checklists and edit tool warnings From gui_decoupling_controller track post-mortem: workflow.md: - Add mandatory session start checklist (6 items) - Add code style section with 1-space indentation enforcement - Add native edit tool warning with MCP alternatives AGENTS.md: - Add critical native edit tool warning - Document MCP tool alternatives for file editing tier1-orchestrator.md: - Add session start checklist tier2-tech-lead.md: - Add session start checklist - Add tool restrictions section (allowed vs forbidden) - Add explicit delegation pattern tier3-worker.md: - Add task start checklist tier4-qa.md: - Add analysis start checklist	2026-03-04 22:42:52 -05:00
ed	2b15bfb1c1	docs: Update workflow rules, create new async tool track, and log journal	2026-03-03 01:49:04 -05:00
ed	6b2270f811	docs: Update core documentation with Structural Testing Contract	2026-03-03 01:13:03 -05:00
ed	e334cd0e7d	docs(workflow): Add Zero-Assertion Ban to TDD section	2026-03-02 19:42:26 -05:00
ed	b00d9ffa42	docs(workflow): Add State Auditing requirement to Research Phase	2026-03-02 19:41:52 -05:00
ed	b4de62f2e7	docs: Enforce strict atomic per-task commits for Tier 2 agents	2026-03-02 12:52:04 -05:00
ed	0b03b612b9	chore: Wire architecture docs into mma_exec.py and workflow delegation prompts mma_exec.py changes: - get_role_documents: Tier 1 now gets docs/guide_architecture.md + guide_mma.md (was: only product.md). Tier 2 gets same (was: only tech-stack + workflow). Tier 3 gets guide_architecture.md (was: only workflow.md — workers modifying gui_2.py had zero knowledge of threading model). Tier 4 gets guide_architecture.md (was: nothing). - Tier 3 system directive: Added ARCHITECTURE REFERENCE callout, CRITICAL THREADING RULE (never write GUI state from background thread), TASK FORMAT instruction (follow WHERE/WHAT/HOW/SAFETY from surgical tasks), and py_get_definition to tool list. - Tier 4 system directive: Added ARCHITECTURE REFERENCE callout and instruction to trace errors through thread domains documented in guide_architecture.md. conductor/workflow.md changes: - Red Phase delegation prompt: Replaced 'with a prompt to create tests' with surgical prompt format example showing WHERE/WHAT/HOW/SAFETY. - Green Phase delegation prompt: Replaced 'with a highly specific prompt' with surgical prompt format example with exact line refs and API calls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-01 10:16:38 -05:00
ed	d93f650c3a	conductor: Refine GUI UX track with full codebase knowledge, add doc references Rewrites comprehensive_gui_ux_20260228 spec and plan using deep analysis of the actual gui_2.py implementation (3078 lines). The previous spec asked to implement features that already exist (Track Browser, DAG tree, epic planning, approval dialogs, token table, performance monitor). The new spec: - Documents 15 already-implemented features with exact line references - Identifies 8 actual gaps (tier stream panels, DAG editing, cost tracking, conductor lifecycle forms, track-scoped discussions, approval indicators, track proposal editing, stream scrollability) - Rewrites all 5 phases with surgical task descriptions referencing exact gui_2.py line ranges, function names, and data structures - Each task specifies the precise imgui API calls to use - References docs/guide_architecture.md for threading constraints - References docs/guide_mma.md for Ticket/Track data structures Also adds architecture documentation fallback references to: - conductor/workflow.md (new principle #9) - conductor/product.md (new Architecture Reference section) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-01 09:51:37 -05:00
ed	27e67df4e3	prep doc track.	2026-03-01 08:57:01 -05:00
ed	cb0e14e1c0	Fixes to mma and conductor.	2026-02-28 21:59:28 -05:00
ed	ed56e56a2c	chore(mma): Checkpoint progress on visual simulation and UI refresh before sub-agent delegation	2026-02-28 21:41:46 -05:00
ed	db118f0a5c	updates to tools and mma skills	2026-02-28 07:51:02 -05:00
ed	7adacd06b7	checkpoint	2026-02-27 20:48:38 -05:00
ed	91693a5168	feat(mma): Refine tier roles, tool access, and observability	2026-02-26 08:31:19 -05:00
ed	5e256d1c12	docs(conductor): Update workflow with mma-exec and 4-tier model definitions	2026-02-25 20:23:25 -05:00
ed	63fd391dff	chore(conductor): Integrate strict MMA token firewalling and tiered delegation into core workflow	2026-02-25 13:29:16 -05:00
ed	e60eef5df8	docs(conductor): Synchronize docs for track 'MMA Tiered Architecture Verification'	2026-02-25 09:02:40 -05:00
ed	b255d4b935	conductor(checkpoint): Phase 1: Setup and Architecture complete	2026-02-24 23:54:15 -05:00
ed	462ed2266a	feat(conductor): Add run_subagent script for stable headless skill invocation	2026-02-24 23:17:45 -05:00
ed	10c5705748	docs(conductor): Add Token Firewalling and Model Switching Strategy	2026-02-24 22:45:17 -05:00
ed	5515a72cf3	update conductor files	2026-02-24 18:32:38 -05:00
ed	6d825e6585	wip: gemini doing gui_2.py catchup track	2026-02-23 21:07:06 -05:00
ed	db251a1038	conductor(checkpoint): Checkpoint end of Phase 1: Infrastructure & Core Utilities	2026-02-23 15:53:16 -05:00
ed	85fad6bb04	chore(conductor): Update workflow with API hook verification guidelines	2026-02-23 15:06:17 -05:00
ed	2ec1ecfd50	docs(workflow): Automate phase verification protocol with API hooks	2026-02-23 12:48:09 -05:00
ed	243a0cc5ca	trying out conductor	2026-02-23 10:51:24 -05:00

49 Commits