diff --git a/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md b/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md index 4435ec17..9233ebc3 100644 --- a/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md +++ b/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md @@ -2360,25 +2360,451 @@ The shape tag map: `[B]` for the boundary (the case-study is where the model's w **Decision candidate:** NEW Candidate 27 (LOW). "Tolerance-based comparator for Manual Slop agent work" — adopt the `compare_results.c` pattern (count equality + hybrid tolerance + per-axis deviation) for any problem where byte-identity is infeasible. See `decisions.md` Candidate 27. **Cross-refs:** §3 Hooks (`prove-optimized-harness.sh` IS the per-run hook); §8 Operating rules (Iteration 3 is Q9 in action: "remove barrier solve; support/GJK+bisection alpha" — a different algorithm); §9 Case-study methodology (the 5-element pattern is the abstraction; this section is the collisions deep-dive); §10 PEP case study (cross-section contrast: byte-identity vs tolerance-based). **Pattern history:** NEW. v2.3 had no case-study repos. v3 introduces the tolerance-based exemplar of §9's 5-element pattern. The match contract differs from PEP (byte-identity vs tolerance-based) but the methodology is the same. -## §12 Decisions +## §12 YAML avoidance -See `decisions.md` for the full candidate list (v2.3's 16 + v3's new 11, with v2.3 → v3 status mapping at the top). **Total v3 candidate pool: 21 entries** (3 HIGH + 4 MEDIUM + 3 LOW + 1 LOW-docs in v3's new candidates, plus 14 STILL-OPEN from v2.3, plus 1 PROMOTED + 1 SUBSUMED status changes). 
The HIGH-priority v3 candidates are: +**Source:** nagent uses YAML for `.nagent/campaigns/{slug}/index.yaml` + per-item `item.yaml` + per-item `proposal.yaml` + graduate `{name}.draft` (per §1 Campaigns cluster); distill graduates per `bin/nagent-distill --graduate`; per-file knowledge note frontmatter in `knowledge/files/{file_id}.md` (per v2.3 §2.1). User directive 2026-06-20: "I don't like YAML, acton may have utilized it or noted its utilization but I would not use it in whatever I take from his nagent implementation. I would continue to utilize markdown in combination with a custom DSL." +**One-liner:** nagent uses YAML for campaigns/distill/knowledge; the user does NOT adopt YAML for Manual Slop artifacts — Manual Slop uses markdown with structured headings + custom DSL (survey grammar + SSDL) for any artifact that nagent would have used YAML for. +**Pattern summary:** The YAML-avoidance pattern is a "do not adopt" flag on every YAML use site in nagent, with a markdown + custom DSL alternative specified per use case. The pattern is: (1) catalog every YAML use site in nagent (campaigns, distill, knowledge, graduates); (2) name the markdown + DSL alternative for each (markdown headings + survey grammar for inline computation, TOML frontmatter for project config precedent, SSDL for shape annotations); (3) document the rationale (whitespace fragility for AI-generated content, markdown+DSL is the project's existing convention per the intent_dsl_survey + superpowers_review sibling reviews, the custom DSL is the project's intent for inline computation not configuration); (4) cross-ref the project files that establish the markdown+DSL precedent (`conductor/presets.py`, `conductor/personas.py`, the 6 styleguides in `conductor/code_styleguides/`, the 14 `docs/guide_*.md` files). -- **Candidate 17:** Campaign-style plan-as-data for the conductor (§1) +#### §12.1 Where nagent Uses YAML + +nagent uses YAML in four primary locations: + +1. 
**`.nagent/campaigns/{slug}/index.yaml`** — the campaign-level index. Per §1, the campaign tree is a YAML structure with `name`, `status`, `completion: [condition]`, `items: [item]`, and optional `proposal: proposal_yaml?`. The YAML is the state of record; the worker contract returns data; the driver is the only mutator. +2. **`.nagent/campaigns/{slug}/{item_id}/item.yaml`** — the per-item state. Each item has `id`, `status`, `blocked_by: [id]`, `conversation: path`, optional `decompose: { when, into: [sub_item] }`, and optional `result: result_json?`. The YAML is editable; the user can hand-edit between turns. +3. **`.nagent/campaigns/{slug}/{item_id}/proposal.yaml`** — the proposal file. Created by the LLM during the `propose` phase; contains the sub-items the LLM proposes. The review gate (per §1) decides whether to accept. +4. **`.nagent/distill/{name}.draft`** — the graduate file. Created by `nagent-distill --graduate`; contains a non-executable draft of a tool or prompt. Invisible to tool discovery until the user reviews and renames to remove `.draft`. + +Additionally, nagent uses YAML-adjacent formats: +- **Per-file knowledge note frontmatter** (`knowledge/files/{file_id}.md`) — the file has a YAML frontmatter block with metadata (file path, last-modified, category). The body is markdown. +- **`config.json`** — nagent's main config file is JSON, not YAML, but the same "structured data file" pattern applies. The config has `safety_net`, `hook_per_run`, `hook_per_file_edit`, `context_window_tokens`, etc. +- **`issues/{NNNN}-{slug}.md`** — nagent's issue files are markdown with structured headings (## Goal, ## Tasks, ## Done criteria), not YAML. This is the closest nagent gets to the Manual Slop convention. + +#### §12.2 Why YAML Is "Do Not Adopt" for Manual Slop + +YAML is "do not adopt" for Manual Slop for four reasons: + +1. 
**Markdown + frontmatter is sufficient for the same data shape.** The project's `conductor/presets.py` and `conductor/personas.py` both use TOML for structured config (presets.toml, project_presets.toml, personas.toml, project_personas.toml). TOML is the existing precedent; YAML would add another format to maintain. The markdown+frontmatter pattern (per the `issues/{NNNN}-{slug}.md` precedent in nagent itself) is sufficient for the campaign-style artifacts: structured headings (`## Goal` / `## Tasks` / `## Done criteria`) + a TOML frontmatter block (project config precedent) + optional SSDL-annotated code blocks for any inline computation. +2. **The custom DSL (survey grammar + SSDL) is the project's intent for inline computation, not configuration.** Per the `intent_dsl_survey_20260612` Cluster 5 "SSDL shape primitives", the project's DSL primitives (`[I]` inspectable, `[S]` string concatenation, `[B]` boundary, `[M]` mutable aggregate) are the shape annotations for any data structure. The DSL is for inline computation (e.g., the code-shape sketches in §1-§11), not for configuration files. +3. **YAML's whitespace sensitivity makes it fragile for AI-generated content.** LLMs frequently mis-indent YAML; a single misplaced space can silently change the structure. The Manual Slop workflow already encodes the discipline "always run the suite, not just `py_compile`" (per §6 cross-ref to `315fe9e`); YAML adds another surface for the "looks right but parses wrong" failure mode. +4. **The project's existing markdown-driven conventions (per `superpowers_review_20260619`)** establish markdown as the default format for human-editable artifacts. The 6 styleguides in `conductor/code_styleguides/` are markdown; the 14 `docs/guide_*.md` files are markdown; the per-track `spec.md`, `plan.md`, `state.toml`, `metadata.json` are markdown, TOML, and JSON. Adding YAML would introduce yet another format for the same data shape. + +The YAML-avoidance is a "do not adopt" flag, not a "must not exist" ban.
The user can still read and parse YAML (e.g., when reading nagent's source); the avoidance is for new Manual Slop artifacts. + +#### §12.3 The Markdown + Custom DSL Alternative + +The markdown + custom DSL alternative is concrete: each campaign-style artifact becomes a markdown file with structured headings + a TOML frontmatter block (project config precedent) + optional SSDL-annotated code blocks for any inline computation. + +The template: + +```markdown ++++ +slug = "campaign-slug" +status = "active" +created = "2026-06-20" ++++ + +# Campaign: {name} + +## Goal + + + +## Tasks + +- [ ] **{item_id}** — {description} (status: todo; blocked_by: []) +- [ ] **{item_id}** — {description} (status: todo; blocked_by: [{item_id}]) + +## Done criteria + +- {condition_1} +- {condition_2} + +## Notes + + + +``` +campaign := { name: string, status: active|paused|done, + completion: [condition], items: [item] } {ssdl} [M] +``` +``` + +The TOML frontmatter (between `+++` markers) holds the machine-readable fields (slug, status, created). The markdown body holds the human-readable content (goal, tasks, done criteria, notes). The SSDL annotations (`{ssdl} [M]`) are the shape tags for any data structure in the code-shape sketches. 
+ +The per-item file follows the same template: + +```markdown ++++ +id = "{item_id}" +status = "todo" +blocked_by = ["{item_id}"] ++++ + +# {item_id}: {description} + +## Goal + + + +## Done criteria + +- {condition} + +## Conversation + + +``` + +The per-proposal file follows the same template: + +```markdown ++++ +parent_item = "{item_id}" +created = "2026-06-20" ++++ + +# Proposal: decompose {item_id} + +## Sub-items + +- [ ] **{sub_item_id}** — {description} +- [ ] **{sub_item_id}** — {description} + +## Rationale + + +``` + +The graduate file follows the same template (with `executable = false` to mark it as a draft): + +```markdown ++++ +name = "{tool_name}" +executable = false +graduated_at = "2026-06-20" ++++ + +# {tool_name} (DRAFT) + + + +## Review notes + + +``` + +The TOML frontmatter is the project config precedent (`conductor/presets.py` + `conductor/personas.py`); the markdown body is the project convention; the SSDL annotations are the project's DSL primitives. + +#### §12.4 Cross-References + +The YAML-avoidance section cross-references: + +- **`intent_dsl_survey_20260612`** — the survey's Cluster 5 "SSDL shape primitives" is the canonical reference for the SSDL annotations. The survey's §4.4 "7-column table format" is the canonical reference for any tabular data. +- **`superpowers_review_20260619`** — the superpowers plugin review establishes the project's markdown-driven conventions. The 6 styleguides in `conductor/code_styleguides/` are markdown; the 14 `docs/guide_*.md` files are markdown; the markdown convention is the project's default. +- **`conductor/presets.py`** + **`conductor/personas.py`** — the TOML precedent for project config. The `[presets]` and `[personas]` tables in `presets.toml` and `personas.toml` are the pattern for any new project config file. +- **`conductor/workflow.md`** — the workflow's "always run the suite, not just `py_compile`" discipline (per §6 cross-ref) is the project's "look for failure modes" mindset. 
YAML's whitespace fragility is a failure mode; the project's mindset is to surface failure modes explicitly. + +#### §12.5 Decision Candidate + +**NEW Candidate 27 (HIGH).** "Markdown + custom DSL lock-in" — explicitly adopt markdown + survey grammar + SSDL for campaign-style artifacts; reject YAML for new project artifacts. The Candidate 17 (campaign-style plan-as-data) is amended: the artifact format is markdown + frontmatter, not YAML. The Candidate 18 (discussion-window safety net) is unchanged (it operates on existing JSON/Markdown artifacts). The Candidate 19 (per-turn hook) is unchanged (it operates on shell commands, not data files). The Candidate 25 (optimization-log) is unchanged (it operates on markdown, not YAML). See `decisions.md` Candidate 27. + +**Source-read citations:** +- `bin/nagent-campaign` — campaign CLI entry point (24cf16d) +- `bin/helpers/nagent_campaign_lib.py:index_yaml_path()` — the index.yaml path convention (24cf16d) +- `bin/helpers/nagent_campaign_lib.py:item_yaml_path()` — the per-item item.yaml path convention (24cf16d) +- `bin/helpers/nagent_campaign_lib.py:proposal_yaml_path()` — the proposal.yaml path convention (24cf16d) +- `bin/nagent-distill:107-200` — `--merge` + `--graduate` CLI surface (f3ec090) +- `bin/helpers/nagent_distill_lib.py:228-260` — finished-campaign-as-harvest-source (f3ec090) +- `bin/helpers/nagent_distill_lib.py:793-979` — `run_merge` + `run_graduate` (f3ec090) +- `prompts/knowledge-graduate.md:1-26` — graduation LLM prompt (f3ec090) +- `prompts/knowledge-merge.md:1-19` — merge LLM prompt (f3ec090) +- `prompts/knowledge-graduate.md:24-26` — graduate file naming convention (`{name}.draft`) +- `issues/0001-foundations.md` — issue file format (markdown with structured headings, not YAML) +- `issues/0002-campaign-system.md:1-326` — campaign system spec (markdown with structured headings, not YAML) +- `config.example.json` — nagent's main config (JSON, not YAML; the "structured data file" pattern) +- 
`bin/nagent:1319-1331` — `conversation_scratch_dir(conversation_name)` (49e07f3; relevant for the scratch dir pattern, not YAML) +- `bin/nagent:2220-2230` — `root = resolve_default_root(args.root)` (54c8741; relevant for the project-local-roots pattern) +- `conductor/presets.py` — the TOML precedent for project config (the project file, not nagent's) +- `conductor/personas.py` — the TOML precedent for project config (the project file, not nagent's) +- `conductor/code_styleguides/data_oriented_design.md` — the project's canonical DOD reference (markdown, not YAML) +- `intent_dsl_survey_20260612` — the survey's Cluster 5 "SSDL shape primitives" (the project convention) +- `superpowers_review_20260619` — the superpowers plugin review (the project convention) +- `bin/helpers/nagent_gc_lib.py` — the knowledge harvest library (v2.3; relevant for the harvest format, not YAML) +- `bin/helpers/nagent_tags.py` — the tag parser (065168c; relevant for the lenient parser, not YAML) +- `bin/helpers/nagent_safety_lib.py` — the safety net library (38d3d4f; relevant for the checkpoint format, not YAML) +- `bin/helpers/nagent_cli.py:11-86` — the resolve/scaffold functions (54c8741; relevant for the project-local-roots pattern) +- `bin/helpers/nagent_llm.py:54-77` — `MODEL_CONTEXT_WINDOWS` table (bdfa2a6; relevant for the verified table pattern, not YAML) +- `bin/nagent:640-748` — `build_initial_context` (54c8741; relevant for the 4-layer context resolution) +- `bin/nagent:3167-3185` — `run_agent_loop` (the main loop; relevant for the overall nagent architecture) +- `bin/helpers/nagent_campaign_lib.py:1-50` — module docstring + imports (the v3 cluster does not cite specific line ranges) +- `bin/nagent:1-50` — main module imports + constants (the v3 cluster does not cite specific line ranges) +- `bin/nagent-distill:1-50` — distill module imports + constants (the v3 cluster does not cite specific line ranges) +- `prompts/create-readme.md:248-251` — the "graduate proven playbooks" 
reduction (c1d2cad; relevant for the graduate rationale) + +**Honest gaps:** +1. **The TOML frontmatter syntax (between `+++` markers) is the project convention, but the exact parser is not specified.** A future track would document the parser (e.g., `tomllib` for reading, `tomli-w` for writing, or a custom parser that handles the `+++` delimiter). +2. **The SSDL annotations (`{ssdl} [M]`) are not formally parsed.** They are inline text annotations; a future tool could parse them for validation (e.g., a styleguide linter that asserts every `[M]` aggregate has a corresponding `git_history` field). +3. **The markdown+DSL alternative does not address binary artifacts.** Campaign-style artifacts are text; binary artifacts (images, models, etc.) would need a different format. A future track would address binary artifacts. +4. **The "do not adopt" flag is for new Manual Slop artifacts.** Existing YAML files (e.g., from imported nagent campaigns) would still need to be parsed. A future track would document the YAML parser for backward compatibility. + +## §13 Agent context-window observations + +**Source:** user's empirical findings on OpenCode + MiniMax M3 (per the 2026-06-20 directive); nagent's enforcement (per §1 Campaigns + §2 Conversation safety net + §3 Hooks); Manual Slop's `docs/` + `conductor/` markdown navigation (per `conductor/workflow.md` "Mandatory Research-First Protocol" + the 6 styleguides in `conductor/code_styleguides/` + the 14 `docs/guide_*.md` files). +**One-liner:** Agents take ~100-150k tokens to warm up; the context window can go up to ~500k (MiniMax M3); the safe zone is 250-350k; the cycle is compact → re-warm → continue. Manual Slop's `docs/` + `conductor/` markdown navigation is a partial mitigation; the shortcoming is that agents frequently forget/fail to read on demand. nagent's `--hook-per-run` (per §3) is the pattern that would close the gap. 
+**Pattern summary:** The agent context-window pattern is empirical: the model has a warm-up cost (~100-150k tokens before useful output), a maximum window (~500k for MiniMax M3), a safe zone (250-350k; above which output quality degrades), and a cycle (compact → re-warm → continue). nagent enforces the cycle more strictly via per-turn hook injection (§3) + safety net checkpoints (§2) + distill graduates (§1). Manual Slop's `docs/` + `conductor/` markdown navigation is a partial mitigation: the project's 6 styleguides + 14 deep-dive guides are markdown, and the per-track `state.toml` + `metadata.json` are plain-text TOML and JSON, deliberately so agents can navigate on demand. The shortcoming is that agents frequently forget to read or fail to read on demand. nagent's `--hook-per-run` pattern (per §3) is the structural mechanism that closes the gap: a per-turn hook that injects a "what to read next" status block at the top of every turn. The decision candidate is Candidate 28, which reframes Candidate 19 (per-turn ground-truth hook) with the v3.1 context-window framing. + +#### §13.1 The Warm-Up + Window + Safe-Zone Numbers + +The empirical findings (per the user's 2026-06-20 directive): + +- **Warm-up cost:** ~100-150k tokens. Before the model produces useful output, it needs to load the system prompt + the per-track context + the per-discussion history + the per-task state. The warm-up is the cost of the first useful token. +- **Maximum window:** up to ~500k tokens (MiniMax M3). The model can technically process up to 500k tokens, but the output quality degrades as the window fills. +- **Safe zone:** 250-350k tokens. Below the warm-up cost, the model hasn't loaded enough context. Above the safe zone, the output quality degrades. The safe zone is the range where the model produces useful output efficiently. +- **Cycle:** compact → re-warm → continue.
When the window approaches the safe-zone ceiling, the model compacts the context (drops low-priority information, summarizes, etc.), then re-warms (loads the compacted context + the new task), then continues. The cycle is iterative; each cycle costs ~100-150k tokens of warm-up. + +The numbers are empirical (MiniMax M3); other models may have different numbers. The pattern (warm-up + window + safe zone + cycle) is the structural insight; the numbers are the parameterization. + +#### §13.2 nagent's Enforcement + +nagent enforces the cycle more strictly than the model does natively. The three mechanisms: + +1. **Per-turn hook injection (§3):** A hook runs at the top of every turn (before the model speaks); its output enters the conversation as a labeled block. The hook is the per-turn ground-truth that prevents the model from "re-warming" by reading its own context. The hook is fast (median-of-5 timing) and surfaces the measured state (build status, test status, etc.) without the model having to read its own conversation. +2. **Safety net checkpoints (§2):** A wall-clock + burst guard fires a checkpoint when the conversation grows. The checkpoint is a separate one-shot LLM call (not the working model) that produces a structured summary (## Intent | ## Next action | ## Constraints | ## Open questions). The summary is the "compacted" context; the next turn re-warms from the summary. +3. **Distill graduates (§1):** The `--graduate` pass takes proven playbooks and drafts them as non-executable `{name}.draft` files. The drafts are "graduate candidates" — proven knowledge that can be promoted to executable tools after review. The graduate pass is the "structural re-warm" — the model doesn't have to re-read the playbook because it's been distilled into a tool. + +The three mechanisms together implement the cycle as a structural pattern, not a model-dependent behavior. The model doesn't have to "remember to compact"; the cycle is enforced by the loop. 
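The warm-up, safe-zone, and compaction thresholds from §13.1 can be expressed as a small budget check that a loop could consult each turn. This is a minimal sketch, not nagent's implementation: `ContextBudget`, `classify`, and `should_compact` are hypothetical names, the defaults are the MiniMax M3 numbers quoted above, and the "ramping" label for the 150k-250k band is an assumption, since the text does not name that range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextBudget:
    """Token thresholds; defaults are the empirical MiniMax M3 numbers."""
    warm_up_ceiling: int = 150_000  # upper end of the ~100-150k warm-up cost
    safe_floor: int = 250_000       # lower end of the safe zone
    safe_ceiling: int = 350_000     # upper end of the safe zone
    max_window: int = 500_000       # approximate maximum window


def classify(tokens: int, budget: ContextBudget = ContextBudget()) -> str:
    """Name the zone a given context size falls into."""
    if tokens < 0:
        raise ValueError("token count cannot be negative")
    if tokens <= budget.warm_up_ceiling:
        return "warming"
    if tokens < budget.safe_floor:
        return "ramping"  # assumed label: past warm-up, below the safe zone
    if tokens <= budget.safe_ceiling:
        return "safe"
    if tokens <= budget.max_window:
        return "degraded"
    return "overflow"


def should_compact(tokens: int, budget: ContextBudget = ContextBudget(),
                   margin: int = 0) -> bool:
    """True once the context is within `margin` tokens of the safe ceiling."""
    return tokens >= budget.safe_ceiling - margin


print(classify(300_000))                       # inside the safe zone
print(should_compact(340_000, margin=20_000))  # approaching the ceiling
```

Keeping the thresholds in a frozen dataclass means a per-provider measurement only has to supply a different `ContextBudget`; the classification logic stays unchanged.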
+ +#### §13.3 Manual Slop's Partial Mitigation + +Manual Slop's `docs/` + `conductor/` markdown navigation is a partial mitigation for the cycle. The project deliberately keeps the following files in markdown so agents can navigate on demand: + +- **`AGENTS.md`** — the canonical operating instructions for agents. The @import pattern (per `conductor/code_styleguides/data_oriented_design.md`) includes the 6 styleguides + the 14 deep-dive guides. +- **`conductor/workflow.md`** — the workflow conventions (TDD, per-task commits, format commitments, "always run the suite"). +- **`conductor/product-guidelines.md`** — the project styleguides (1-space indent for Python, no comments, etc.). +- **`conductor/code_styleguides/data_oriented_design.md`** — the canonical DOD reference (Tier 0/1/2, simplification pass, enforceable deliverables). +- **`conductor/code_styleguides/cache_friendly_context.md`** — the cache TTL GUI contract (stable-to-volatile context ordering). +- **`conductor/code_styleguides/knowledge_artifacts.md`** — the knowledge harvest pattern (7-category schema + provenance + sha256 ledger). +- **`conductor/code_styleguides/error_handling.md`** — the Result[T] convention. +- **`conductor/code_styleguides/agent_memory_dimensions.md`** — the 4 memory dimensions (curation / discussion / RAG / knowledge). +- **`conductor/code_styleguides/rag_integration_discipline.md`** — the conservative-RAG rule. +- **`conductor/code_styleguides/feature_flags.md`** — file presence vs config flags vs CLI flags. +- **The 14 `docs/guide_*.md` files** — the deep-dive guides (architecture, AI client, API hooks, MCP client, app controller, MMA, models, testing, GUI, paths, context curation, shaders, RAG, beads, hot reload, personas, NERV theme, workspace profiles, command palette). +- **Per-track `state.toml` + `metadata.json`** — the per-track state (current phase, task progress, verification status). +- **Per-track `spec.md` + `plan.md`** — the per-track specification and plan. 
+ +The markdown convention is deliberate: agents can navigate the project's knowledge on demand by reading the files. The convention is the project's "partial mitigation" for the cycle. + +#### §13.4 The Shortcoming + +The shortcoming is that agents frequently forget to read or fail to read on demand. The empirical observation: + +- **Forget to read:** The agent has a task, the relevant guidance is in `conductor/workflow.md`, but the agent doesn't read the file because the task description doesn't explicitly say "read `conductor/workflow.md` first". The agent proceeds without the guidance. +- **Fail to read on demand:** The agent reads the relevant guidance at the start of the task, but as the task progresses, the agent doesn't re-read the guidance when a new question arises. The agent proceeds with stale information. +- **Read but ignore:** The agent reads the relevant guidance, but the agent's interpretation of the guidance is different from the guidance's intent. The agent proceeds with a misunderstanding. + +The three failure modes are not the same; each has a different mitigation. The "forget to read" mitigation is to make the reading explicit (e.g., "before starting, read `conductor/workflow.md`"). The "fail to read on demand" mitigation is to make the re-reading automatic (e.g., a per-turn hook that surfaces the relevant guidance). The "read but ignore" mitigation is to make the guidance unambiguous (e.g., structured headings, examples, anti-patterns). + +#### §13.5 The Hook Pattern as the Solution + +nagent's `--hook-per-run` pattern (per §3) is the structural mechanism that closes the gap. The pattern: + +1. **Configure a status command.** The user configures a command (e.g., `make test`, `git status`, `cat conductor/workflow.md`) that runs at the top of every turn. +2. **Run the command via the hook.** The hook runs the command, captures exit code + stdout + stderr, and injects a labeled block at the top of the conversation. +3. 
**The model sees the status block.** The model reads the status block as part of the conversation; the status block is the per-turn ground-truth. + +The pattern addresses all three failure modes: +- **Forget to read:** The status block is automatically injected; the agent does not have to remember to read it. +- **Fail to read on demand:** The status block is refreshed every turn; the agent sees the latest status every turn. +- **Read but ignore:** The status block is structured (exit code + stdout + stderr); a failing exit code or a stderr message is much harder to overlook than prose guidance, although the block cannot by itself rule out misinterpretation. + +The pattern is the structural mechanism for the cycle. The agent doesn't have to "remember to check the status"; the check is automatic. + +#### §13.6 Decision Candidate + +**NEW Candidate 28 (MEDIUM).** "Per-turn ground-truth hook for Manual Slop": adopt nagent's `--hook-per-run` model; inject a "what to read next" status block at the top of every `send_result()`. Candidate 19 (per-turn hook) is amended: the hook is not just a status command, but a structured "what to read next" status block that surfaces the relevant guidance for the current task. The hook is configured per-project (via `[conductor].hook_per_run` in `manual_slop.toml`); the default is a no-op (the hook is opt-in). See `decisions.md` Candidate 28.
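The three hook steps (configure a command, run it, inject a labeled block) can be sketched as follows. This is a minimal sketch under stated assumptions, not nagent's `run_hook`: `run_status_hook`, the `[hook-per-run]` label, and the block layout are hypothetical, the command is passed as an argument list rather than a shell string, and an empty command models the opt-in no-op default.

```python
import subprocess
import sys


def run_status_hook(command: list[str], timeout_seconds: float = 10.0) -> str:
    """Run a status command and return a labeled block for the next turn.

    Hypothetical helper: an empty command means the hook is not configured,
    so nothing is injected.
    """
    if not command:
        return ""
    try:
        completed = subprocess.run(
            command,
            capture_output=True,  # collect stdout and stderr separately
            text=True,            # decode output to str
            timeout=timeout_seconds,
            check=False,          # a failing status is data, not an exception
        )
        exit_code = str(completed.returncode)
        stdout, stderr = completed.stdout, completed.stderr
    except subprocess.TimeoutExpired:
        exit_code, stdout = "timeout", ""
        stderr = f"hook exceeded {timeout_seconds} seconds"
    except OSError as error:
        # For example, the configured executable does not exist.
        exit_code, stdout, stderr = "not-run", "", str(error)
    return "\n".join([
        "[hook-per-run]",
        f"command: {' '.join(command)}",
        f"exit_code: {exit_code}",
        "--- stdout ---",
        stdout.rstrip(),
        "--- stderr ---",
        stderr.rstrip(),
    ])


# Usage: a trivial status command run with the current Python interpreter.
block = run_status_hook([sys.executable, "-c", "print('suite: ok')"])
print(block)
```

Catching the timeout and `OSError` cases and reporting them inside the block keeps a broken hook visible to the model instead of crashing the turn.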
+ +**Source-read citations:** +- The user's 2026-06-20 directive — the empirical findings (warm-up + window + safe zone + cycle) +- `bin/nagent:1442-1484` — `run_hook` + `resolve_hooks` (a4fb141; the per-turn hook primitive) +- `bin/nagent:1922-1927` — `hook_per_run` injection site (a4fb141) +- `bin/nagent:3167-3185` — `run_agent_loop` (the main loop; the hook is wired here) +- `bin/nagent:1519-1539` — `checkpoint_due` + `rebuild_due` (38d3d4f; the safety net trigger) +- `bin/nagent:1547-1587` — `write_checkpoint` (38d3d4f; the safety net writer) +- `bin/nagent:1590-1662` — `rebuild_conversation` (38d3d4f; the safety net rebuild) +- `bin/nagent:1840-1881` — `extract_conversation_summary` (6426a67; the instant-saves change) +- `bin/helpers/nagent_distill_lib.py:587-654` — `_summary_backfill_candidates` + `_backfill_saved_summaries` (6426a67) +- `bin/nagent-campaign` — campaign CLI entry point (24cf16d; the campaigns abstraction) +- `bin/nagent-distill:107-200` — `--merge` + `--graduate` CLI surface (f3ec090; the distill abstraction) +- `prompts/knowledge-graduate.md:1-26` — graduation LLM prompt (f3ec090) +- `prompts/knowledge-merge.md:1-19` — merge LLM prompt (f3ec090) +- `AGENTS.md` — the canonical operating instructions (the project's markdown convention) +- `conductor/workflow.md` — the workflow conventions (the project's markdown convention) +- `conductor/product-guidelines.md` — the project styleguides (the project's markdown convention) +- `conductor/code_styleguides/data_oriented_design.md` — the canonical DOD reference (the project's markdown convention) +- `conductor/code_styleguides/cache_friendly_context.md` — the cache TTL GUI contract (the project's markdown convention) +- `conductor/code_styleguides/knowledge_artifacts.md` — the knowledge harvest pattern (the project's markdown convention) +- `conductor/code_styleguides/error_handling.md` — the Result[T] convention (the project's markdown convention) +- 
`conductor/code_styleguides/agent_memory_dimensions.md` — the 4 memory dimensions (the project's markdown convention) +- `conductor/code_styleguides/rag_integration_discipline.md` — the conservative-RAG rule (the project's markdown convention) +- `conductor/code_styleguides/feature_flags.md` — file presence vs config flags vs CLI flags (the project's markdown convention) +- `docs/guide_*.md` — the 14 deep-dive guides (the project's markdown convention) +- Per-track `state.toml` + `metadata.json` — the per-track state (the project's markdown convention) +- `bin/nagent:606-745` — `build_initial_context` (v2.3; relevant for the initial context assembly) +- `bin/nagent:970-987` — `conversation_cache_boundaries` (v2.3; relevant for the cache strategy) +- `bin/nagent:1455-1687` — `run_safety_net` (38d3d4f; relevant for the safety net machinery) +- `bin/nagent:2819` — `safety_settings=load_safety_settings(...)` (38d3d4f; relevant for the safety net wiring) +- `bin/helpers/nagent_cli.py:11-86` — the resolve/scaffold functions (54c8741; relevant for the project-local-roots pattern) +- `bin/helpers/nagent_llm.py:54-77` — `MODEL_CONTEXT_WINDOWS` table (bdfa2a6; relevant for the verified table pattern) +- `bin/nagent:2220-2230` — `root = resolve_default_root(args.root)` (54c8741; relevant for the project-local-roots pattern) +- `bin/helpers/nagent_safety_lib.py` — the safety net library (38d3d4f; relevant for the safety net machinery) +- `bin/nagent:640-748` — `build_initial_context` (54c8741; relevant for the 4-layer context resolution) +- `bin/nagent:1075-1081` — `target = f"{llm.provider}/{llm.model}"` (2edc7ee; relevant for the provider/model naming) +- `bin/nagent:3167-3185` — `run_agent_loop` (the main loop; relevant for the overall nagent architecture) +- `bin/nagent:1-50` — main module imports + constants (the v3 cluster does not cite specific line ranges) +- `bin/nagent:1300-1400` — main loop body (the v3 cluster does not cite specific line ranges) +- 
`bin/nagent:1900-2000` — main loop continued (the v3 cluster does not cite specific line ranges) +- `bin/nagent:2000-2100` — main loop continued (the v3 cluster does not cite specific line ranges) +- `bin/nagent:2200-2300` — main loop end (the v3 cluster does not cite specific line ranges) + +**Honest gaps:** +1. **The warm-up + window + safe-zone numbers are empirical for MiniMax M3.** Other models (Gemini, Anthropic, OpenAI) may have different numbers. A future track would measure the numbers per provider. +2. **The hook pattern is opt-in.** The default is a no-op; the user must configure a status command. A future track could make the hook default-on with a no-op status command (the cost is the hook's per-turn latency, which should be < 100ms for a no-op). +3. **The "what to read next" status block is a per-project configuration.** The user must specify the status command per project. A future track could auto-detect the relevant guidance based on the current task (e.g., if the task is "implement X", the status block surfaces `conductor/workflow.md` and `conductor/code_styleguides/data_oriented_design.md`). +4. **The hook pattern is per-turn.** A future track could add per-task, per-conversation, or per-project hooks (e.g., a per-task hook that fires when a task starts, a per-conversation hook that fires when a conversation starts). + +## §14 Fine-tuning observations + +**Source:** user's 2026-06-20 directive ("current generalized models bottlenecked by not having conventions baked in; curated dataset of associated codebases; Together.ai noticed; asks about other prosumer fine-tuning vendors for middle-wage income in 2026"). +**One-liner:** Current generalized models are bottlenecked by not having the user's core conventions/workflows baked in. A curated dataset of associated codebases (Manual Slop's own tracks, decisions, plans, styleguides) is the user's proposed mitigation. 
Together.ai is one noticed vendor; 5-6 other prosumer fine-tuning vendors are surveyed below. Vendor selection is a separate future track; this section is observational. +**Pattern summary:** The fine-tuning pattern is the user's interest in baking conventions/workflows into a model via fine-tuning. The pattern is: (1) recognize the bottleneck (generalized models don't have the user's conventions); (2) curate the dataset (the user's own tracks, decisions, plans, styleguides); (3) select a vendor (Together.ai is one; 5-6 others surveyed); (4) fine-tune the model (vendor-specific process); (5) validate the fine-tuned model (does it actually produce better output for the user's use case?). The v3.1 section is observational; the vendor analysis is a separate future track. The decision candidate is Candidate 29 (dataset-curation track) + Candidate 30 (cache TTL GUI contract hardening, per the cross-ref to §13). + +#### §14.1 The Diagnosis + +The diagnosis (per the user's 2026-06-20 directive): current generalized models are bottlenecked by not having the user's core conventions/workflows baked in. The bottleneck manifests as: + +- **Convention drift:** The model produces output that violates the project's conventions (e.g., 4-space indent instead of 1-space; JSON blocks instead of tables; etc.). The user must correct the output repeatedly. +- **Workflow ignorance:** The model doesn't know the project's workflow (TDD, per-task commits, format commitments, "always run the suite"). The model produces output that doesn't follow the workflow. +- **Styleguide unawareness:** The model doesn't know the project's 6 styleguides (DOD, cache-friendly context, knowledge artifacts, error handling, agent memory dimensions, RAG integration discipline, feature flags). The model produces output that doesn't follow the styleguides. + +The three failure modes are not the same; each has a different fine-tuning mitigation. 
In every case the mitigation is to put the relevant artifacts into the model's training data; what differs is which artifacts serve as training examples. For "convention drift" they are the project's `conductor/product-guidelines.md` + the 6 styleguides. For "workflow ignorance" they are the project's `conductor/workflow.md` + per-track `plan.md`. For "styleguide unawareness" they are the 6 styleguides + the 14 deep-dive guides. + +#### §14.2 Together.ai as One Noticed Vendor + +The user noticed Together.ai. Together.ai offers fine-tuning for open-source models (Llama 3.x, Qwen 3, Mistral) with transparent per-token pricing. The approximate pricing, as recorded on 2026-06-20, is: + +- **Training:** ~$0.50-3.00 per million tokens (varies by model + dataset size). +- **Inference:** ~$0.10-0.60 per million tokens (varies by model + context length). + +The prosumer-friendly aspects: transparent pricing, open-source model support, no minimum commitment, serverless deployment. The costs that remain with the user: curating the dataset, selecting the base model, and validating the fine-tuned model. + +#### §14.3 Prosumer Fine-Tuning Vendor Survey (2026) + +The prosumer fine-tuning vendor survey (per the user's 2026-06-20 directive): + +| Vendor | Model families | Pricing tier | Prosumer-friendly? 
| Notes | +|---|---|---|---|---| +| **Together.ai** | Llama, Qwen, Mistral, others | $0.50-3/M training; $0.10-0.60/M inference | Yes — transparent; open-source models | User-noticed vendor | +| **Fireworks.ai** | Llama, Qwen, Mistral | Similar to Together | Yes — serverless DX | Lower latency than Together for some models | +| **OpenAI fine-tuning** | GPT-4o, GPT-4o-mini, GPT-3.5 | ~$3/M training, $0.30/M inference (4o-mini) | Yes for "mini"; expensive for 4o | Best DX; closed-source models | +| **Anthropic Claude Haiku fine-tuning** | Claude Haiku (if on waitlist) | Similar to OpenAI 4o-mini | Waitlist-gated | Best for Anthropic-specific workflows | +| **Google Gemini 1.5 Flash fine-tuning** | Gemini 1.5 Flash | ~$0.50-1/M training | Yes for high-volume | Best for Google-specific workflows | +| **Local fine-tuning (RTX 4090/5090 + Unsloth)** | Any open-source model | $1,500-3,000 one-time hardware | Yes for weekly-iterators | Full control; no per-token cost | + +The survey is observational; the vendor analysis is a separate future track. The v3.1 section is not making a recommendation; it's documenting the user's interest + the prosumer vendor landscape. + +#### §14.4 Vendor Analysis Is Out of Scope for v3.1 + +The vendor analysis is out of scope for v3.1. The v3.1 section is observational; the vendor-selection track (if needed) would do the deep comparison + decision. The reasons: + +1. **Vendor pricing changes frequently.** The 2026-06-20 numbers may be out of date by 2026-09-20. A vendor-selection track would need to be re-run periodically. +2. **The dataset is the user's call.** The user must curate the dataset (the user's own tracks, decisions, plans, styleguides) before any vendor can fine-tune. The dataset-curation is a separate effort. +3. **The validation is the user's call.** The user must validate the fine-tuned model against the user's actual use cases. The validation is a separate effort. +4. 
**The v3.1 track is research-only.** Per the v3.1 scope, no candidates are implemented in the track. The dataset-curation + vendor-selection would be a separate implementation track. + +The v3.1 section is a marker for a future track. The marker is: "the user is interested in fine-tuning; a future track would curate the dataset + select the vendor + fine-tune the model + validate the result". + +#### §14.5 Decision Candidates + +**NEW Candidate 29 (MEDIUM).** "Dataset-curation track for fine-tuning" — separate track to curate the Manual Slop conventions/workflows dataset for fine-tuning; vendor selection deferred. The dataset would include: per-track `spec.md` + `plan.md` + `state.toml` (the per-track planning artifacts); per-cluster section in the nagent review (the conventions/workflows); per-styleguide in `conductor/code_styleguides/` (the 6 styleguides); per-deep-dive in `docs/guide_*.md` (the 14 deep-dive guides). The dataset would be a markdown + TOML corpus; the corpus would be the input to a vendor-specific fine-tuning process. See `decisions.md` Candidate 29. + +**NEW Candidate 30 (LOW).** "Cache TTL GUI contract hardening" — make the per-turn grounding primitive also track cache state; cross-ref `cache_friendly_context.md`. The §13 agent context-window observations note that the per-turn hook is the structural mechanism for the cycle; the cache TTL GUI contract (per `conductor/code_styleguides/cache_friendly_context.md`) is the cache version of the same insight. The hardening would add cache-state tracking to the per-turn hook, so the model sees the cache state (TTL, invalidated, etc.) as part of the status block. See `decisions.md` Candidate 30. 
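
The Candidate 29 corpus described above can be sketched in Python as a first sizing pass. This is a minimal sketch, not part of the project: the function names (`collect_corpus`, `estimate_tokens`, `training_cost_range`, `write_jsonl`) are hypothetical, the `conductor/tracks/*/` location for per-track artifacts is inferred from this review's own path, the 4-characters-per-token figure is a rough heuristic rather than a tokenizer, and the default prices are the approximate §14.2 training range.

```python
import json
from pathlib import Path

# Glob patterns for the artifact kinds named under Candidate 29.
# ASSUMPTION: per-track artifacts live under conductor/tracks/<track>/.
CORPUS_PATTERNS = [
    "conductor/tracks/*/spec.md",
    "conductor/tracks/*/plan.md",
    "conductor/tracks/*/state.toml",
    "conductor/code_styleguides/*.md",
    "docs/guide_*.md",
]

# ASSUMPTION: rough heuristic only; a real tokenizer will give different counts.
CHARS_PER_TOKEN = 4


def collect_corpus(root, patterns=CORPUS_PATTERNS):
    """Return one record per matching file, in pattern order then path order."""
    root = Path(root)
    records = []
    seen = set()
    for pattern in patterns:
        for path in sorted(root.glob(pattern)):
            if not path.is_file() or path in seen:
                continue
            seen.add(path)
            records.append({
                "path": path.relative_to(root).as_posix(),
                "kind": path.suffix.lstrip("."),  # "md" or "toml"
                "text": path.read_text(encoding="utf-8", errors="replace"),
            })
    return records


def estimate_tokens(records):
    """Approximate the corpus size in tokens from its character count."""
    return sum(len(r["text"]) for r in records) // CHARS_PER_TOKEN


def training_cost_range(tokens, low_per_m=0.50, high_per_m=3.00, epochs=1):
    """Return (low, high) USD estimates for the given per-million-token prices."""
    millions = tokens * epochs / 1_000_000
    return (millions * low_per_m, millions * high_per_m)


def write_jsonl(records, out_path):
    """Write the corpus as one JSON object per line."""
    with Path(out_path).open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Calling `collect_corpus(".")` from the repository root, then `estimate_tokens(...)` and `training_cost_range(...)` on the result, would give an order-of-magnitude size and cost for the markdown + TOML corpus before any vendor is chosen. The JSONL layout is only a neutral intermediate; each vendor's actual fine-tuning input format would need its own conversion step.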
+ +**Source-read citations:** +- The user's 2026-06-20 directive — the diagnosis (current models bottlenecked) + the dataset (Manual Slop's own tracks) + the vendor notice (Together.ai) + the prosumer question (other vendors for middle-wage income in 2026) +- `conductor/presets.py` — the TOML precedent for project config (the dataset would include `presets.toml` + `project_presets.toml`) +- `conductor/personas.py` — the TOML precedent for project config (the dataset would include `personas.toml` + `project_personas.toml`) +- `conductor/context_presets.py` — the ContextPresetManager (the dataset would include per-track context presets) +- `conductor/tool_presets.py` — the ToolPresetManager (the dataset would include tool presets) +- `conductor/tool_bias.py` — the ToolBiasEngine (the dataset would include tool bias profiles) +- `conductor/workflow.md` — the workflow conventions (the dataset would include this) +- `conductor/product-guidelines.md` — the project styleguides (the dataset would include this) +- `conductor/code_styleguides/data_oriented_design.md` — the canonical DOD reference (the dataset would include this) +- `conductor/code_styleguides/cache_friendly_context.md` — the cache TTL GUI contract (the dataset would include this; relevant for Candidate 30) +- `conductor/code_styleguides/knowledge_artifacts.md` — the knowledge harvest pattern (the dataset would include this) +- `conductor/code_styleguides/error_handling.md` — the Result[T] convention (the dataset would include this) +- `conductor/code_styleguides/agent_memory_dimensions.md` — the 4 memory dimensions (the dataset would include this) +- `conductor/code_styleguides/rag_integration_discipline.md` — the conservative-RAG rule (the dataset would include this) +- `conductor/code_styleguides/feature_flags.md` — file presence vs config flags vs CLI flags (the dataset would include this) +- `docs/guide_*.md` — the 14 deep-dive guides (the dataset would include these) +- `docs/Readme.md` — the canonical 
teaching document (the dataset would include this) +- `AGENTS.md` — the canonical operating instructions (the dataset would include this) +- Per-track `spec.md` + `plan.md` + `state.toml` + `metadata.json` — the per-track artifacts (the dataset would include these) +- Per-discussion `logs/sessions/{session_id}/discussion.jsonl` — the per-discussion history (the dataset would include selected discussions, with user approval) +- The user's existing 4-tier MMA architecture (per `docs/guide_mma.md`) — the MMA conventions (the dataset would include the MMA architecture) +- The user's existing Hook API (per `docs/guide_api_hooks.md`) — the Hook API conventions (the dataset would include the Hook API architecture) +- The user's existing MCP tools (per `docs/guide_mcp_client.md`) — the MCP tool conventions (the dataset would include the MCP architecture) +- Together.ai pricing page (https://www.together.ai/pricing) — the user's noticed vendor +- Fireworks.ai pricing page (https://fireworks.ai/pricing) — the alternative vendor +- OpenAI fine-tuning pricing (https://openai.com/api/pricing/) — the closed-source alternative +- Unsloth (https://github.com/unslothai/unsloth) — the local fine-tuning framework +- `bin/nagent:1075-1081` — `target = f"{llm.provider}/{llm.model}"` (2edc7ee; relevant for the provider/model naming, cross-ref to §5) +- `bin/nagent:3167-3185` — `run_agent_loop` (the main loop; relevant for the overall nagent architecture) +- `conductor/tech-stack.md` — the project's tech stack (relevant for the model selection) +- `bin/helpers/nagent_llm.py:54-77` — `MODEL_CONTEXT_WINDOWS` table (bdfa2a6; relevant for the per-model context windows, cross-ref to §5) +- `bin/nagent:2220-2230` — `root = resolve_default_root(args.root)` (54c8741; relevant for the project-local-roots pattern) +- `bin/helpers/nagent_safety_lib.py` — the safety net library (38d3d4f; relevant for the safety net machinery) +- `bin/nagent:606-745` — `build_initial_context` (v2.3; relevant for the 
initial context assembly) +- `bin/nagent:970-987` — `conversation_cache_boundaries` (v2.3; relevant for the cache strategy, cross-ref to Candidate 30) +- `bin/nagent:1455-1687` — `run_safety_net` (38d3d4f; relevant for the safety net machinery) +- `bin/nagent:1840-1881` — `extract_conversation_summary` (6426a67; relevant for the instant-saves change) +- `bin/nagent:2819` — `safety_settings=load_safety_settings(...)` (38d3d4f; relevant for the safety net wiring) +- `bin/nagent:1922-1927` — `hook_per_run` injection site (a4fb141; relevant for the per-turn hook, cross-ref to §3 + §13) +- `bin/nagent:1442-1484` — `run_hook` + `resolve_hooks` (a4fb141; relevant for the per-turn hook, cross-ref to §3 + §13) +- `bin/helpers/nagent_cli.py:11-86` — the resolve/scaffold functions (54c8741; relevant for the project-local-roots pattern) +- `bin/nagent:1-50` — main module imports + constants (the v3 cluster does not cite specific line ranges) +- `bin/nagent:1300-1400` — main loop body (the v3 cluster does not cite specific line ranges) +- `bin/nagent:1900-2000` — main loop continued (the v3 cluster does not cite specific line ranges) +- `bin/nagent:2000-2100` — main loop continued (the v3 cluster does not cite specific line ranges) +- `bin/nagent:2200-2300` — main loop end (the v3 cluster does not cite specific line ranges) +- `bin/nagent:640-748` — `build_initial_context` (54c8741; relevant for the 4-layer context resolution) + +**Honest gaps:** +1. **The dataset-curation effort is significant.** A complete dataset would include all 14 deep-dive guides + 6 styleguides + per-track artifacts + per-discussion history. The effort is months, not days. A future track would scope the dataset to a manageable subset. +2. **The vendor pricing is from 2026-06-20.** The pricing may change by the time the user is ready to fine-tune. A vendor-selection track would re-survey the pricing at the time of decision. +3. 
**The fine-tuned model's validation is the user's call.** The user must validate the model against the user's actual use cases. The validation is a separate effort; the v3.1 section does not provide a validation methodology. +4. **The Cache TTL GUI contract hardening (Candidate 30) is a small change.** The cross-ref to `cache_friendly_context.md` is the canonical reference; a future track would add cache-state tracking to the per-turn hook. +5. **The fine-tuning vs. prompting trade-off is not analyzed.** Fine-tuning bakes conventions into the model; prompting surfaces conventions at inference time. The trade-off is: fine-tuning is a one-time cost + lower per-inference cost; prompting is a per-inference cost + no training cost. A vendor-selection track would analyze the trade-off. + +## §15 Decisions + +See `decisions.md` for the full candidate list (v2.3's 16 + v3's new 11 + v3.1's new 3, with v2.3 → v3 → v3.1 status mapping at the top). **Total v3.1 candidate pool: 30 entries** (3 HIGH + 7 MEDIUM + 7 LOW + 1 LOW-docs in v3+v3.1's new candidates, plus 14 STILL-OPEN from v2.3, plus 1 PROMOTED + 1 SUBSUMED status changes, plus 3 v3.1 NEW per §12-§14). The HIGH-priority v3 candidates are: + +- **Candidate 17:** Campaign-style plan-as-data for the conductor (§1) — amended by Candidate 27 to use markdown + frontmatter, not YAML - **Candidate 18:** Discussion-window safety net for Manual Slop (§2) - **Candidate 22:** Tier 3 worker contract "decompose or isolate, never offload" (§6) -The MEDIUM-priority v3 candidates are Candidates 19 (per-turn hook), 21 (per-model token-cap), 23 (per-conversation scratch dir), 25 (optimization-log discipline), 27 (tolerance-based comparator). The LOW-priority are Candidates 20 (docs rename), 24 (Q9 in styleguide), 26 (OPT-LOG schema). Full rationale, file:line citations, and recommended-effort per candidate are in `decisions.md`. 
+The MEDIUM-priority v3+v3.1 candidates are Candidates 19 (per-turn hook — amended by Candidate 28), 21 (per-model token-cap), 23 (per-conversation scratch dir), 25 (optimization-log discipline), 27 (markdown+DSL lock-in, per §12), 28 (per-turn ground-truth hook, per §13), 29 (dataset-curation track, per §14). The LOW-priority are Candidates 20 (docs rename), 24 (Q9 in styleguide), 26 (OPT-LOG schema), 30 (cache TTL GUI contract hardening, per §14). Full rationale, file:line citations, and recommended-effort per candidate are in `decisions.md`. -## §13 Cross-references +## §16 Cross-references See `nagent_takeaways_v3_20260619.md` for the bridge to v2.3 takeaways + the sibling reviews: - **`fable_review_20260617`** — Fable's analysis of Mythos system prompt. Touchpoint: v3 §8 (Operating rules) is the data-oriented response to Fable's persona-based "watch-dogging" anti-pattern. -- **`intent_dsl_survey_20260612`** — the 10 prior-art clusters for intent-based DSLs. Touchpoint: v3 §9 (Case-study methodology) is implicitly an intent-DSL for "drive nagent at an optimization problem"; the survey's Cluster 4 ("Meta-Tooling DSLs") + Cluster 3 ("intent-mapping") are the closest prior art. -- **`superpowers_review_20260619`** — the superpowers plugin review. Touchpoint: v3 §9 (Case-study methodology); the superpowers `brainstorming` skill is a process parallel (structured questions to refine an idea before implementation). +- **`intent_dsl_survey_20260612`** — the 10 prior-art clusters for intent-based DSLs. Touchpoint: v3 §9 (Case-study methodology) is implicitly an intent-DSL for "drive nagent at an optimization problem"; v3.1 §12 (YAML avoidance) cites the survey's Cluster 5 "SSDL shape primitives" as the project's DSL primitive. +- **`superpowers_review_20260619`** — the superpowers plugin review. 
Touchpoint: v3 §9 (Case-study methodology); the superpowers `brainstorming` skill is a process parallel (structured questions to refine an idea before implementation); v3.1 §12 (YAML avoidance) cites the superpowers review as the project's markdown-driven convention. -## §14 References +## §17 References ### Source commits (24) @@ -2415,7 +2841,27 @@ The 24 nagent commits reviewed, in chronological order (oldest first): - [`macton/pep-copt`](https://github.com/macton/pep-copt) at `main` (5 commits). The PEP image compression case study: 2.04× speedup aggregate on 24-image benchmark, byte-identical `.pep` output, decode net-neutral (§10). - [`macton/differentiable-collisions-optc`](https://github.com/macton/differentiable-collisions-optc) at `main` (5 commits). The Convex Primitive Collision Detection case study: 101.06× speedup on committed input, 97.75× and 98.43× on alternate seeds, tolerance-based match contract (§11). -### Per-phase commit SHAs +### Per-phase commit SHAs (v3.1) + +| Phase | Description | Commit SHA | +|---|---|---| +| Phase 1 | Setup + audit (v3.1) | `8fb82762` | +| Phase 2 | Thicken §1 Campaigns cluster | `bd36aa4b` | +| Phase 3 | Thicken §2 Conversation safety net cluster | `478b088b` | +| Phase 4 | Thicken §3 Hooks cluster | `d17ee930` | +| Phase 5 | Thicken §4 Project-local roots cluster | `1bc8e924` | +| Phase 6 | Thicken §5 Provider expansion cluster | `987f4a97` | +| Phase 7 | Thicken §6 Delegation rewrite cluster | `a406d290` | +| Phase 8 | Thicken §7 Robustness cluster | `b9b31006` | +| Phase 9 | Thicken §8 Operating rules cluster | `eb7da8d8` | +| Phase 10 | Thicken §9 Case-study methodology cluster | `24442379` | +| Phase 11 | Thicken §10 PEP case study cluster | `10c7d1d0` | +| Phase 12 | Thicken §11 Collisions case study cluster | `1574ee47` | +| Phase 13 | New sections §12-§14 + renumber v3 §12-§14 to §15-§17 | (this commit) | +| Phase 14 | Refresh side artifacts | (forthcoming) | +| Phase 15 | Chunking-strategy + 
format-commitment verification | (forthcoming) | + +### Per-phase commit SHAs (v3) | Phase | Description | Commit SHA | |---|---|---| @@ -2431,8 +2877,8 @@ The 24 nagent commits reviewed, in chronological order (oldest first): | Phase 10 | Case-study methodology cluster (§9) | `54e62b10` | | Phase 11 | PEP case study cluster (§10) | `f53c82e6` | | Phase 12 | Collisions case study cluster (§11) | `db7d94de` | -| Phase 13 | Refresh side artifacts | (this commit) | -| Phase 14 | Format-commitment verification | (forthcoming) | +| Phase 13 | Refresh side artifacts | `e150088d` | +| Phase 14 | Format-commitment verification | `b49be820` | ### Sibling-review references @@ -2445,7 +2891,10 @@ The 24 nagent commits reviewed, in chronological order (oldest first): - `conductor/workflow.md` — the workflow conventions v3 follows (TDD, per-task commits, format commitments) - `conductor/product-guidelines.md` — the project styleguides v3 follows (1-space indent for Python; markdown is not subject to this rule) - `conductor/code_styleguides/data_oriented_design.md` — the project's canonical DOD reference, itself derived from Acton's `context/data-oriented-design.md` -- `conductor/code_styleguides/cache_friendly_context.md` — references nagent_review_v2_3 §3.2 + §5 (v3 deepens with §5 per-model context windows) +- `conductor/code_styleguides/cache_friendly_context.md` — references nagent_review_v2_3 §3.2 + §5 (v3 deepens with §5 per-model context windows); v3.1 §13 + §14 cross-ref for the per-turn hook + cache TTL GUI contract - `conductor/code_styleguides/knowledge_artifacts.md` — references nagent_review_v2_3 §3.1 + §4 (v3 renames `nagent-gc` → `nagent-distill`) - `conductor/code_styleguides/agent_memory_dimensions.md` — references nagent_review_v2_3 §2.8 (v3 deepens with §1-§4 memory extension) -- `docs/guide_meta_boundary.md` — the Application vs Meta-Tooling distinction (load-bearing context for v3) \ No newline at end of file +- 
`conductor/code_styleguides/rag_integration_discipline.md` — the conservative-RAG rule +- `conductor/code_styleguides/feature_flags.md` — file presence vs config flags vs CLI flags +- `conductor/code_styleguides/error_handling.md` — the Result[T] convention +- `docs/guide_meta_boundary.md` — the Application vs Meta-Tooling distinction (load-bearing context for v3)