
conductor(track): nagent_review_v3.1 §12-§14 new sections + renumber v3 §12-§14 to §15-§17

2026-06-20 11:34:40 -04:00
parent 1574ee47e4
commit 63b34eaef1
@@ -2360,25 +2360,451 @@ The shape tag map: `[B]` for the boundary (the case-study is where the model's w
**Decision candidate:** NEW Candidate 27 (LOW). "Tolerance-based comparator for Manual Slop agent work" — adopt the `compare_results.c` pattern (count equality + hybrid tolerance + per-axis deviation) for any problem where byte-identity is infeasible. See `decisions.md` Candidate 27.
**Cross-refs:** §3 Hooks (`prove-optimized-harness.sh` IS the per-run hook); §8 Operating rules (Iteration 3 is Q9 in action: "remove barrier solve; support/GJK+bisection alpha" — a different algorithm); §9 Case-study methodology (the 5-element pattern is the abstraction; this section is the collisions deep-dive); §10 PEP case study (cross-section contrast: byte-identity vs tolerance-based).
**Pattern history:** NEW. v2.3 had no case-study repos. v3 introduces the tolerance-based exemplar of §9's 5-element pattern. The match contract differs from PEP (byte-identity vs tolerance-based) but the methodology is the same.
## §12 Decisions
## §12 YAML avoidance
See `decisions.md` for the full candidate list (v2.3's 16 + v3's new 11, with v2.3 → v3 status mapping at the top). **Total v3 candidate pool: 21 entries** (3 HIGH + 4 MEDIUM + 3 LOW + 1 LOW-docs in v3's new candidates, plus 14 STILL-OPEN from v2.3, plus 1 PROMOTED + 1 SUBSUMED status changes). The HIGH-priority v3 candidates are:
**Source:** nagent uses YAML for `.nagent/campaigns/{slug}/index.yaml` + per-item `item.yaml` + per-item `proposal.yaml` + graduate `{name}.draft` (per §1 Campaigns cluster); distill graduates per `bin/nagent-distill --graduate`; per-file knowledge note frontmatter in `knowledge/files/{file_id}.md` (per v2.3 §2.1). User directive 2026-06-20: "I don't like YAML, acton may have utilized it or noted its utilization but I would not use it in whatever I take from his nagent implementation. I would continue to utilize markdown in combination with a custom DSL."
**One-liner:** nagent uses YAML for campaigns/distill/knowledge; the user does NOT adopt YAML for Manual Slop artifacts — Manual Slop uses markdown with structured headings + custom DSL (survey grammar + SSDL) for any artifact that nagent would have used YAML for.
**Pattern summary:** The YAML-avoidance pattern is a "do not adopt" flag on every YAML use site in nagent, with a markdown + custom DSL alternative specified per use case. The pattern is: (1) catalog every YAML use site in nagent (campaigns, distill, knowledge, graduates); (2) name the markdown + DSL alternative for each (markdown headings + survey grammar for inline computation, TOML frontmatter for project config precedent, SSDL for shape annotations); (3) document the rationale (whitespace fragility for AI-generated content, markdown+DSL is the project's existing convention per the intent_dsl_survey + superpowers_review sibling reviews, the custom DSL is the project's intent for inline computation not configuration); (4) cross-ref the project files that establish the markdown+DSL precedent (`conductor/presets.py`, `conductor/personas.py`, the 6 styleguides in `conductor/code_styleguides/`, the 14 `docs/guide_*.md` files).
- **Candidate 17:** Campaign-style plan-as-data for the conductor (§1)
#### §12.1 Where nagent Uses YAML
nagent uses YAML in four primary locations:
1. **`.nagent/campaigns/{slug}/index.yaml`** — the campaign-level index. Per §1, the campaign tree is a YAML structure with `name`, `status`, `completion: [condition]`, `items: [item]`, and optional `proposal: proposal_yaml?`. The YAML is the state of record; the worker contract returns data; the driver is the only mutator.
2. **`.nagent/campaigns/{slug}/{item_id}/item.yaml`** — the per-item state. Each item has `id`, `status`, `blocked_by: [id]`, `conversation: path`, optional `decompose: { when, into: [sub_item] }`, and optional `result: result_json?`. The YAML is editable; the user can hand-edit between turns.
3. **`.nagent/campaigns/{slug}/{item_id}/proposal.yaml`** — the proposal file. Created by the LLM during the `propose` phase; contains the sub-items the LLM proposes. The review gate (per §1) decides whether to accept.
4. **`.nagent/distill/{name}.draft`** — the graduate file. Created by `nagent-distill --graduate`; contains a non-executable draft of a tool or prompt. Invisible to tool discovery until the user reviews and renames to remove `.draft`.
Additionally, nagent uses YAML-adjacent formats:
- **Per-file knowledge note frontmatter** (`knowledge/files/{file_id}.md`) — the file has a YAML frontmatter block with metadata (file path, last-modified, category). The body is markdown.
- **`config.json`** — nagent's main config file is JSON, not YAML, but the same "structured data file" pattern applies. The config has `safety_net`, `hook_per_run`, `hook_per_file_edit`, `context_window_tokens`, etc.
- **`issues/{NNNN}-{slug}.md`** — nagent's issue files are markdown with structured headings (## Goal, ## Tasks, ## Done criteria), not YAML. This is the closest nagent gets to the Manual Slop convention.
#### §12.2 Why YAML Is "Do Not Adopt" for Manual Slop
YAML is "do not adopt" for Manual Slop for four reasons:
1. **Markdown + frontmatter is sufficient for the same data shape.** The project's `conductor/presets.py` and `conductor/personas.py` both use TOML for structured config (presets.toml, project_presets.toml, personas.toml, project_personas.toml). TOML is the existing precedent; YAML would be a third format. The markdown+frontmatter pattern (per the `issues/{NNNN}-{slug}.md` precedent in nagent itself) is sufficient for the campaign-style artifacts: structured headings (`## Goal` / `## Tasks` / `## Done criteria`) + a TOML frontmatter block (project config precedent) + optional SSDL-annotated code blocks for any inline computation.
2. **The custom DSL (survey grammar + SSDL) is the project's intent for inline computation, not configuration.** Per the `intent_dsl_survey_20260612` Cluster 5 "SSDL shape primitives", the project's DSL primitives (`[I]` inspectable, `[S]` string concatenation, `[B]` boundary, `[M]` mutable aggregate) are the shape annotations for any data structure. The DSL is for inline computation (e.g., the code-shape sketches in §1-§11), not for configuration files.
3. **YAML's whitespace sensitivity is fragile for AI-generated content.** LLMs frequently mis-indent YAML; a single space off can change the structure silently. The Manual Slop workflow already encodes the discipline "always run the suite, not just `py_compile`" (per §6 cross-ref to `315fe9e`); YAML adds another surface for the "looks right but parses wrong" failure mode.
4. **The project's existing markdown-driven conventions (per `superpowers_review_20260619`)** establish markdown as the default format for human-editable artifacts. The 6 styleguides in `conductor/code_styleguides/` and the 14 `docs/guide_*.md` files are markdown; the per-track `spec.md` and `plan.md` are markdown, with `state.toml` (TOML) and `metadata.json` (JSON) holding the per-track state. Adding YAML would introduce yet another format for a data shape the existing ones already cover.
The YAML-avoidance is a "do not adopt" flag, not a "must not exist" ban. The user can still read and parse YAML (e.g., when reading nagent's source); the avoidance is for new Manual Slop artifacts.
#### §12.3 The Markdown + Custom DSL Alternative
The markdown + custom DSL alternative is concrete: each campaign-style artifact becomes a markdown file with structured headings + a TOML frontmatter block (project config precedent) + optional SSDL-annotated code blocks for any inline computation.
The template:
````markdown
+++
slug = "campaign-slug"
status = "active"
created = "2026-06-20"
+++
# Campaign: {name}
## Goal
<one sentence: what the user is trying to achieve>
## Tasks
- [ ] **{item_id}** — {description} (status: todo; blocked_by: [])
- [ ] **{item_id}** — {description} (status: todo; blocked_by: [{item_id}])
## Done criteria
- {condition_1}
- {condition_2}
## Notes
<optional: inline code-shape sketch with SSDL annotations>
```
campaign := { name: string, status: active|paused|done,
completion: [condition], items: [item] } {ssdl} [M]
```
````
The TOML frontmatter (between `+++` markers) holds the machine-readable fields (slug, status, created). The markdown body holds the human-readable content (goal, tasks, done criteria, notes). The SSDL annotations (`{ssdl} [M]`) are the shape tags for any data structure in the code-shape sketches.
The per-item file follows the same template:
```markdown
+++
id = "{item_id}"
status = "todo"
blocked_by = ["{item_id}"]
+++
# {item_id}: {description}
## Goal
<one sentence: what this item is trying to achieve>
## Done criteria
- {condition}
## Conversation
<path to the conversation file>
```
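With `blocked_by` in each item's frontmatter, a driver can work out which items are ready to run. The sketch below assumes the frontmatter has already been parsed into dicts; `ready_items` is a hypothetical helper, and the `"done"` status value is an assumption because the templates only show `"todo"`.

```python
def ready_items(items: list[dict]) -> list[str]:
    """Return ids of `todo` items whose `blocked_by` entries are all done.

    `items` are parsed frontmatter dicts with `id`, `status`, `blocked_by`.
    """
    done = {item["id"] for item in items if item["status"] == "done"}
    return [
        item["id"]
        for item in items
        if item["status"] == "todo"
        and all(dep in done for dep in item.get("blocked_by", []))
    ]


items = [
    {"id": "parse", "status": "done", "blocked_by": []},
    {"id": "render", "status": "todo", "blocked_by": ["parse"]},
    {"id": "publish", "status": "todo", "blocked_by": ["render"]},
]
print(ready_items(items))  # ['render']
```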
The per-proposal file follows the same template:
```markdown
+++
parent_item = "{item_id}"
created = "2026-06-20"
+++
# Proposal: decompose {item_id}
## Sub-items
- [ ] **{sub_item_id}** — {description}
- [ ] **{sub_item_id}** — {description}
## Rationale
<why this decomposition; the LLM's reasoning>
```
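The checklist lines in the `## Sub-items` and `## Tasks` sections are regular enough to extract with a single pattern. This is a sketch under assumptions: the regular expression and `parse_checklist` are hypothetical, and the separator is matched as `\u2014`, the dash character used in the templates.

```python
import re

# Matches "- [ ] **id** <dash> description"; \u2014 is the dash in the templates.
# The pattern is an assumption, not a documented grammar.
ITEM_LINE = re.compile(
    r"^- \[(?P<mark>[ xX])\] \*\*(?P<id>[^*]+)\*\* \u2014 (?P<desc>.+)$"
)


def parse_checklist(body: str) -> list[dict]:
    """Extract checklist entries from a markdown body, ignoring other lines."""
    entries = []
    for line in body.splitlines():
        match = ITEM_LINE.match(line.strip())
        if match:
            entries.append({
                "id": match["id"],
                "description": match["desc"],
                "checked": match["mark"].lower() == "x",
            })
    return entries


sample = (
    "## Sub-items\n"
    "- [ ] **a1** \u2014 first step\n"
    "- [x] **a2** \u2014 second step\n"
    "## Rationale\n"
    "Smaller steps are easier to review.\n"
)
for entry in parse_checklist(sample):
    print(entry["id"], entry["checked"])
```

For campaign `## Tasks` lines, the trailing `(status: ...; blocked_by: [...])` text would land in `description` and need a second pass.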
The graduate file follows the same template (with `executable = false` to mark it as a draft):
```markdown
+++
name = "{tool_name}"
executable = false
graduated_at = "2026-06-20"
+++
# {tool_name} (DRAFT)
<the tool's prompt or code>
## Review notes
<what the user should check before promoting from draft>
```
The TOML frontmatter is the project config precedent (`conductor/presets.py` + `conductor/personas.py`); the markdown body is the project convention; the SSDL annotations are the project's DSL primitives.
#### §12.4 Cross-References
The YAML-avoidance section cross-references:
- **`intent_dsl_survey_20260612`** — the survey's Cluster 5 "SSDL shape primitives" is the canonical reference for the SSDL annotations. The survey's §4.4 "7-column table format" is the canonical reference for any tabular data.
- **`superpowers_review_20260619`** — the superpowers plugin review establishes the project's markdown-driven conventions. The 6 styleguides in `conductor/code_styleguides/` are markdown; the 14 `docs/guide_*.md` files are markdown; the markdown convention is the project's default.
- **`conductor/presets.py`** + **`conductor/personas.py`** — the TOML precedent for project config. The `[presets]` and `[personas]` tables in `presets.toml` and `personas.toml` are the pattern for any new project config file.
- **`conductor/workflow.md`** — the workflow's "always run the suite, not just `py_compile`" discipline (per §6 cross-ref) is the project's "look for failure modes" mindset. YAML's whitespace fragility is a failure mode; the project's mindset is to surface failure modes explicitly.
#### §12.5 Decision Candidate
**NEW Candidate 27 (HIGH).** "Markdown + custom DSL lock-in": explicitly adopt markdown + survey grammar + SSDL for campaign-style artifacts; reject YAML for new project artifacts. Candidate 17 (campaign-style plan-as-data) is amended: the artifact format is markdown + frontmatter, not YAML. Candidate 18 (discussion-window safety net) is unchanged (it operates on existing JSON/Markdown artifacts). Candidate 19 (per-turn hook) is unchanged (it operates on shell commands, not data files). Candidate 25 (optimization-log) is unchanged (it operates on markdown, not YAML). See `decisions.md` Candidate 27.
**Source-read citations:**
- `bin/nagent-campaign` — campaign CLI entry point (24cf16d)
- `bin/helpers/nagent_campaign_lib.py:index_yaml_path()` — the index.yaml path convention (24cf16d)
- `bin/helpers/nagent_campaign_lib.py:item_yaml_path()` — the per-item item.yaml path convention (24cf16d)
- `bin/helpers/nagent_campaign_lib.py:proposal_yaml_path()` — the proposal.yaml path convention (24cf16d)
- `bin/nagent-distill:107-200` — `--merge` + `--graduate` CLI surface (f3ec090)
- `bin/helpers/nagent_distill_lib.py:228-260` — finished-campaign-as-harvest-source (f3ec090)
- `bin/helpers/nagent_distill_lib.py:793-979` — `run_merge` + `run_graduate` (f3ec090)
- `prompts/knowledge-graduate.md:1-26` — graduation LLM prompt (f3ec090)
- `prompts/knowledge-merge.md:1-19` — merge LLM prompt (f3ec090)
- `prompts/knowledge-graduate.md:24-26` — graduate file naming convention (`{name}.draft`)
- `issues/0001-foundations.md` — issue file format (markdown with structured headings, not YAML)
- `issues/0002-campaign-system.md:1-326` — campaign system spec (markdown with structured headings, not YAML)
- `config.example.json` — nagent's main config (JSON, not YAML; the "structured data file" pattern)
- `bin/nagent:1319-1331` — `conversation_scratch_dir(conversation_name)` (49e07f3; relevant for the scratch dir pattern, not YAML)
- `bin/nagent:2220-2230` — `root = resolve_default_root(args.root)` (54c8741; relevant for the project-local-roots pattern)
- `conductor/presets.py` — the TOML precedent for project config (the project file, not nagent's)
- `conductor/personas.py` — the TOML precedent for project config (the project file, not nagent's)
- `conductor/code_styleguides/data_oriented_design.md` — the project's canonical DOD reference (markdown, not YAML)
- `intent_dsl_survey_20260612` — the survey's Cluster 5 "SSDL shape primitives" (the project convention)
- `superpowers_review_20260619` — the superpowers plugin review (the project convention)
- `bin/helpers/nagent_gc_lib.py` — the knowledge harvest library (v2.3; relevant for the harvest format, not YAML)
- `bin/helpers/nagent_tags.py` — the tag parser (065168c; relevant for the lenient parser, not YAML)
- `bin/helpers/nagent_safety_lib.py` — the safety net library (38d3d4f; relevant for the checkpoint format, not YAML)
- `bin/helpers/nagent_cli.py:11-86` — the resolve/scaffold functions (54c8741; relevant for the project-local-roots pattern)
- `bin/helpers/nagent_llm.py:54-77` — `MODEL_CONTEXT_WINDOWS` table (bdfa2a6; relevant for the verified table pattern, not YAML)
- `bin/nagent:640-748` — `build_initial_context` (54c8741; relevant for the 4-layer context resolution)
- `bin/nagent:3167-3185` — `run_agent_loop` (the main loop; relevant for the overall nagent architecture)
- `bin/helpers/nagent_campaign_lib.py:1-50` — module docstring + imports (the v3 cluster does not cite specific line ranges)
- `bin/nagent:1-50` — main module imports + constants (the v3 cluster does not cite specific line ranges)
- `bin/nagent-distill:1-50` — distill module imports + constants (the v3 cluster does not cite specific line ranges)
- `prompts/create-readme.md:248-251` — the "graduate proven playbooks" reduction (c1d2cad; relevant for the graduate rationale)
**Honest gaps:**
1. **The TOML frontmatter syntax (between `+++` markers) is the project convention, but the exact parser is not specified.** A future track would document the parser (e.g., `tomllib` for reading, `tomli-w` for writing, or a custom parser that handles the `+++` delimiter).
2. **The SSDL annotations (`{ssdl} [M]`) are not formally parsed.** They are inline text annotations; a future tool could parse them for validation (e.g., a styleguide linter that asserts every `[M]` aggregate has a corresponding `git_history` field).
3. **The markdown+DSL alternative does not address binary artifacts.** Campaign-style artifacts are text; binary artifacts (images, models, etc.) would need a different format. A future track would address binary artifacts.
4. **The "do not adopt" flag is for new Manual Slop artifacts.** Existing YAML files (e.g., from imported nagent campaigns) would still need to be parsed. A future track would document the YAML parser for backward compatibility.
## §13 Agent context-window observations
**Source:** user's empirical findings on OpenCode + MiniMax M3 (per the 2026-06-20 directive); nagent's enforcement (per §1 Campaigns + §2 Conversation safety net + §3 Hooks); Manual Slop's `docs/` + `conductor/` markdown navigation (per `conductor/workflow.md` "Mandatory Research-First Protocol" + the 6 styleguides in `conductor/code_styleguides/` + the 14 `docs/guide_*.md` files).
**One-liner:** Agents take ~100-150k tokens to warm up; the context window can go up to ~500k (MiniMax M3); the safe zone is 250-350k; the cycle is compact → re-warm → continue. Manual Slop's `docs/` + `conductor/` markdown navigation is a partial mitigation; the shortcoming is that agents frequently forget/fail to read on demand. nagent's `--hook-per-run` (per §3) is the pattern that would close the gap.
**Pattern summary:** The agent context-window pattern is empirical: the model has a warm-up cost (~100-150k tokens before useful output), a maximum window (~500k for MiniMax M3), a safe zone (250-350k; above which output quality degrades), and a cycle (compact → re-warm → continue). nagent enforces the cycle more strictly via per-turn hook injection (§3) + safety net checkpoints (§2) + distill graduates (§1). Manual Slop's `docs/` + `conductor/` markdown navigation is a partial mitigation: the project's 6 styleguides + 14 deep-dive guides are markdown, and the per-track `state.toml` + `metadata.json` are plain-text TOML and JSON, deliberately so agents can navigate on demand. The shortcoming is that agents frequently forget to read or fail to read on demand. nagent's `--hook-per-run` pattern (per §3) is the structural mechanism that closes the gap: a per-turn hook that injects a "what to read next" status block at the top of every turn. The decision candidate is Candidate 28, which reframes Candidate 19 (per-turn ground-truth hook) with the v3.1 context-window framing.
#### §13.1 The Warm-Up + Window + Safe-Zone Numbers
The empirical findings (per the user's 2026-06-20 directive):
- **Warm-up cost:** ~100-150k tokens. Before the model produces useful output, it needs to load the system prompt + the per-track context + the per-discussion history + the per-task state. The warm-up is the cost of the first useful token.
- **Maximum window:** up to ~500k tokens (MiniMax M3). The model can technically process up to 500k tokens, but the output quality degrades as the window fills.
- **Safe zone:** 250-350k tokens. Below the warm-up cost, the model hasn't loaded enough context. Above the safe zone, the output quality degrades. The safe zone is the range where the model produces useful output efficiently.
- **Cycle:** compact → re-warm → continue. When the window approaches the safe-zone ceiling, the model compacts the context (drops low-priority information, summarizes, etc.), then re-warms (loads the compacted context + the new task), then continues. The cycle is iterative; each cycle costs ~100-150k tokens of warm-up.
The numbers are empirical (MiniMax M3); other models may have different numbers. The pattern (warm-up + window + safe zone + cycle) is the structural insight; the numbers are the parameterization.
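The parameterization above can be written down as constants plus a classifier. This is a sketch only: the thresholds are the MiniMax M3 figures from this section, the `"ramping"` label for the band between warm-up and the safe zone is an assumption (the source does not name that band), and the headroom arithmetic assumes each cycle restarts at the warm-up cost.

```python
WARMUP_K = (100, 150)     # warm-up cost range, thousands of tokens
SAFE_ZONE_K = (250, 350)  # safe zone, thousands of tokens
MAX_WINDOW_K = 500        # approximate maximum window (MiniMax M3)


def zone(tokens_k: int) -> str:
    """Classify a context size given in thousands of tokens."""
    if tokens_k > MAX_WINDOW_K:
        raise ValueError("beyond the maximum window")
    if tokens_k < WARMUP_K[1]:
        return "warming"
    if tokens_k < SAFE_ZONE_K[0]:
        return "ramping"  # assumed label; not named in the source
    if tokens_k <= SAFE_ZONE_K[1]:
        return "safe"
    return "compact"  # past the safe-zone ceiling: time to compact


def headroom_per_cycle_k() -> tuple[int, int]:
    """Working room between a re-warm and the safe-zone ceiling (min, max)."""
    ceiling = SAFE_ZONE_K[1]
    return (ceiling - WARMUP_K[1], ceiling - WARMUP_K[0])


print(zone(120), zone(300), zone(400))
print(headroom_per_cycle_k())  # (200, 250)
```

Under these numbers each cycle leaves roughly 200-250k tokens of working room before the next compaction.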
#### §13.2 nagent's Enforcement
nagent enforces the cycle more strictly than the model does natively. The three mechanisms:
1. **Per-turn hook injection (§3):** A hook runs at the top of every turn (before the model speaks); its output enters the conversation as a labeled block. The hook supplies per-turn ground truth, so the model does not have to re-warm by re-reading its own conversation to learn the current state. The hook is fast (median-of-5 timing) and surfaces the measured state (build status, test status, etc.) directly.
2. **Safety net checkpoints (§2):** A wall-clock + burst guard fires a checkpoint when the conversation grows. The checkpoint is a separate one-shot LLM call (not the working model) that produces a structured summary (## Intent | ## Next action | ## Constraints | ## Open questions). The summary is the "compacted" context; the next turn re-warms from the summary.
3. **Distill graduates (§1):** The `--graduate` pass takes proven playbooks and drafts them as non-executable `{name}.draft` files. The drafts are "graduate candidates" — proven knowledge that can be promoted to executable tools after review. The graduate pass is the "structural re-warm" — the model doesn't have to re-read the playbook because it's been distilled into a tool.
The three mechanisms together implement the cycle as a structural pattern, not a model-dependent behavior. The model doesn't have to "remember to compact"; the cycle is enforced by the loop.
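Because the checkpoint summary has a fixed set of headings, a compacted context can be checked for shape before the next turn re-warms from it. A minimal sketch, assuming the four headings listed in mechanism 2; `missing_headings` is a hypothetical helper and not part of nagent.

```python
REQUIRED_HEADINGS = [
    "## Intent",
    "## Next action",
    "## Constraints",
    "## Open questions",
]


def missing_headings(summary: str) -> list[str]:
    """Return required headings that do not appear on a line of their own."""
    present = {line.strip() for line in summary.splitlines()}
    return [heading for heading in REQUIRED_HEADINGS if heading not in present]


summary = """## Intent
Port the comparator.
## Next action
Run the suite.
## Constraints
No new dependencies.
"""
print(missing_headings(summary))  # ['## Open questions']
```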
#### §13.3 Manual Slop's Partial Mitigation
Manual Slop's `docs/` + `conductor/` markdown navigation is a partial mitigation for the cycle. The project deliberately keeps the following files in plain text (markdown, plus TOML and JSON for per-track state) so agents can navigate on demand:
- **`AGENTS.md`** — the canonical operating instructions for agents. The @import pattern (per `conductor/code_styleguides/data_oriented_design.md`) includes the 6 styleguides + the 14 deep-dive guides.
- **`conductor/workflow.md`** — the workflow conventions (TDD, per-task commits, format commitments, "always run the suite").
- **`conductor/product-guidelines.md`** — the project styleguides (1-space indent for Python, no comments, etc.).
- **`conductor/code_styleguides/data_oriented_design.md`** — the canonical DOD reference (Tier 0/1/2, simplification pass, enforceable deliverables).
- **`conductor/code_styleguides/cache_friendly_context.md`** — the cache TTL GUI contract (stable-to-volatile context ordering).
- **`conductor/code_styleguides/knowledge_artifacts.md`** — the knowledge harvest pattern (7-category schema + provenance + sha256 ledger).
- **`conductor/code_styleguides/error_handling.md`** — the Result[T] convention.
- **`conductor/code_styleguides/agent_memory_dimensions.md`** — the 4 memory dimensions (curation / discussion / RAG / knowledge).
- **`conductor/code_styleguides/rag_integration_discipline.md`** — the conservative-RAG rule.
- **`conductor/code_styleguides/feature_flags.md`** — file presence vs config flags vs CLI flags.
- **The 14 `docs/guide_*.md` files** — the deep-dive guides (architecture, AI client, API hooks, MCP client, app controller, MMA, models, testing, GUI, paths, context curation, shaders, RAG, beads, hot reload, personas, NERV theme, workspace profiles, command palette).
- **Per-track `state.toml` + `metadata.json`** — the per-track state (current phase, task progress, verification status).
- **Per-track `spec.md` + `plan.md`** — the per-track specification and plan.
The markdown convention is deliberate: agents can navigate the project's knowledge on demand by reading the files. The convention is the project's "partial mitigation" for the cycle.
#### §13.4 The Shortcoming
The shortcoming is that agents frequently forget to read or fail to read on demand. The empirical observation:
- **Forget to read:** The agent has a task, the relevant guidance is in `conductor/workflow.md`, but the agent doesn't read the file because the task description doesn't explicitly say "read `conductor/workflow.md` first". The agent proceeds without the guidance.
- **Fail to read on demand:** The agent reads the relevant guidance at the start of the task, but as the task progresses, the agent doesn't re-read the guidance when a new question arises. The agent proceeds with stale information.
- **Read but ignore:** The agent reads the relevant guidance, but the agent's interpretation of the guidance is different from the guidance's intent. The agent proceeds with a misunderstanding.
The three failure modes are not the same; each has a different mitigation. The "forget to read" mitigation is to make the reading explicit (e.g., "before starting, read `conductor/workflow.md`"). The "fail to read on demand" mitigation is to make the re-reading automatic (e.g., a per-turn hook that surfaces the relevant guidance). The "read but ignore" mitigation is to make the guidance unambiguous (e.g., structured headings, examples, anti-patterns).
#### §13.5 The Hook Pattern as the Solution
nagent's `--hook-per-run` pattern (per §3) is the structural mechanism that closes the gap. The pattern:
1. **Configure a status command.** The user configures a command (e.g., `make test`, `git status`, `cat conductor/workflow.md`) that runs at the top of every turn.
2. **Run the command via the hook.** The hook runs the command, captures exit code + stdout + stderr, and injects a labeled block at the top of the conversation.
3. **The model sees the status block.** The model reads the status block as part of the conversation; the status block is the per-turn ground-truth.
The pattern closes all three failure modes:
- **Forget to read:** The status block is automatically injected; the agent can't forget to read it.
- **Fail to read on demand:** The status block is refreshed every turn; the agent sees the latest status every turn.
- **Read but ignore:** The status block is structured (exit code + stdout + stderr); a failing exit code or a stderr message leaves little room for misinterpretation.
The pattern is the structural mechanism for the cycle. The agent doesn't have to "remember to check the status"; the check is automatic.
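The three steps above can be sketched with the standard library. This is an illustration under assumptions: `run_status_hook` and the block layout are hypothetical and do not reproduce nagent's `run_hook` or its label format; the demo command uses the current Python interpreter so it runs anywhere.

```python
import subprocess
import sys


def run_status_hook(argv: list[str], timeout_s: float = 10.0) -> str:
    """Run a status command and format a labeled block for the conversation.

    The block layout is illustrative, not nagent's actual format.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
        code, out, err = proc.returncode, proc.stdout, proc.stderr
    except subprocess.TimeoutExpired:
        code, out, err = -1, "", f"hook timed out after {timeout_s}s"
    except OSError as exc:  # e.g. command not found
        code, out, err = -1, "", str(exc)
    return (
        f"[hook-per-run] exit={code}\n"
        f"--- stdout ---\n{out.rstrip()}\n"
        f"--- stderr ---\n{err.rstrip()}"
    )


block = run_status_hook([sys.executable, "-c", "print('3 tests passed')"])
print(block)
```

Catching the timeout and launch errors matters here: a hook that raises would stall the turn it is supposed to inform, so failures are reported inside the block instead.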
#### §13.6 Decision Candidate
**NEW Candidate 28 (MEDIUM).** "Per-turn ground-truth hook for Manual Slop" — adopt nagent's `--hook-per-run` model; inject a "what to read next" status block at the top of every `send_result()`. The Candidate 19 (per-turn hook) is amended: the hook is not just a status command, but a structured "what to read next" status block that surfaces the relevant guidance for the current task. The hook is configured per-project (via `[conductor].hook_per_run` in `manual_slop.toml`); the default is a no-op (the hook is opt-in). See `decisions.md` Candidate 28.
**Source-read citations:**
- The user's 2026-06-20 directive — the empirical findings (warm-up + window + safe zone + cycle)
- `bin/nagent:1442-1484` — `run_hook` + `resolve_hooks` (a4fb141; the per-turn hook primitive)
- `bin/nagent:1922-1927` — `hook_per_run` injection site (a4fb141)
- `bin/nagent:3167-3185` — `run_agent_loop` (the main loop; the hook is wired here)
- `bin/nagent:1519-1539` — `checkpoint_due` + `rebuild_due` (38d3d4f; the safety net trigger)
- `bin/nagent:1547-1587` — `write_checkpoint` (38d3d4f; the safety net writer)
- `bin/nagent:1590-1662` — `rebuild_conversation` (38d3d4f; the safety net rebuild)
- `bin/nagent:1840-1881` — `extract_conversation_summary` (6426a67; the instant-saves change)
- `bin/helpers/nagent_distill_lib.py:587-654` — `_summary_backfill_candidates` + `_backfill_saved_summaries` (6426a67)
- `bin/nagent-campaign` — campaign CLI entry point (24cf16d; the campaigns abstraction)
- `bin/nagent-distill:107-200` — `--merge` + `--graduate` CLI surface (f3ec090; the distill abstraction)
- `prompts/knowledge-graduate.md:1-26` — graduation LLM prompt (f3ec090)
- `prompts/knowledge-merge.md:1-19` — merge LLM prompt (f3ec090)
- `AGENTS.md` — the canonical operating instructions (the project's markdown convention)
- `conductor/workflow.md` — the workflow conventions (the project's markdown convention)
- `conductor/product-guidelines.md` — the project styleguides (the project's markdown convention)
- `conductor/code_styleguides/data_oriented_design.md` — the canonical DOD reference (the project's markdown convention)
- `conductor/code_styleguides/cache_friendly_context.md` — the cache TTL GUI contract (the project's markdown convention)
- `conductor/code_styleguides/knowledge_artifacts.md` — the knowledge harvest pattern (the project's markdown convention)
- `conductor/code_styleguides/error_handling.md` — the Result[T] convention (the project's markdown convention)
- `conductor/code_styleguides/agent_memory_dimensions.md` — the 4 memory dimensions (the project's markdown convention)
- `conductor/code_styleguides/rag_integration_discipline.md` — the conservative-RAG rule (the project's markdown convention)
- `conductor/code_styleguides/feature_flags.md` — file presence vs config flags vs CLI flags (the project's markdown convention)
- `docs/guide_*.md` — the 14 deep-dive guides (the project's markdown convention)
- Per-track `state.toml` + `metadata.json` — the per-track state (TOML + JSON, alongside the project's markdown convention)
- `bin/nagent:606-745` — `build_initial_context` (v2.3; relevant for the initial context assembly)
- `bin/nagent:970-987` — `conversation_cache_boundaries` (v2.3; relevant for the cache strategy)
- `bin/nagent:1455-1687` — `run_safety_net` (38d3d4f; relevant for the safety net machinery)
- `bin/nagent:2819` — `safety_settings=load_safety_settings(...)` (38d3d4f; relevant for the safety net wiring)
- `bin/helpers/nagent_cli.py:11-86` — the resolve/scaffold functions (54c8741; relevant for the project-local-roots pattern)
- `bin/helpers/nagent_llm.py:54-77` — `MODEL_CONTEXT_WINDOWS` table (bdfa2a6; relevant for the verified table pattern)
- `bin/nagent:2220-2230` — `root = resolve_default_root(args.root)` (54c8741; relevant for the project-local-roots pattern)
- `bin/helpers/nagent_safety_lib.py` — the safety net library (38d3d4f; relevant for the safety net machinery)
- `bin/nagent:640-748` — `build_initial_context` (54c8741; relevant for the 4-layer context resolution)
- `bin/nagent:1075-1081` — `target = f"{llm.provider}/{llm.model}"` (2edc7ee; relevant for the provider/model naming)
- `bin/nagent:3167-3185` — `run_agent_loop` (the main loop; relevant for the overall nagent architecture)
- `bin/nagent:1-50` — main module imports + constants (the v3 cluster does not cite specific line ranges)
- `bin/nagent:1300-1400` — main loop body (the v3 cluster does not cite specific line ranges)
- `bin/nagent:1900-2000` — main loop continued (the v3 cluster does not cite specific line ranges)
- `bin/nagent:2000-2100` — main loop continued (the v3 cluster does not cite specific line ranges)
- `bin/nagent:2200-2300` — main loop end (the v3 cluster does not cite specific line ranges)
**Honest gaps:**
1. **The warm-up + window + safe-zone numbers are empirical for MiniMax M3.** Other models (Gemini, Anthropic, OpenAI) may have different numbers. A future track would measure the numbers per provider.
2. **The hook pattern is opt-in.** The default is a no-op; the user must configure a status command. A future track could make the hook default-on with a no-op status command (the cost is the hook's per-turn latency, which should be < 100ms for a no-op).
3. **The "what to read next" status block is a per-project configuration.** The user must specify the status command per project. A future track could auto-detect the relevant guidance based on the current task (e.g., if the task is "implement X", the status block surfaces `conductor/workflow.md` and `conductor/code_styleguides/data_oriented_design.md`).
4. **The hook pattern is per-turn.** A future track could add per-task, per-conversation, or per-project hooks (e.g., a per-task hook that fires when a task starts, a per-conversation hook that fires when a conversation starts).
## §14 Fine-tuning observations
**Source:** user's 2026-06-20 directive ("current generalized models bottlenecked by not having conventions baked in; curated dataset of associated codebases; Together.ai noticed; asks about other prosumer fine-tuning vendors for middle-wage income in 2026").
**One-liner:** Current generalized models are bottlenecked by not having the user's core conventions/workflows baked in. A curated dataset of associated codebases (Manual Slop's own tracks, decisions, plans, styleguides) is the user's proposed mitigation. Together.ai is one noticed vendor; 5-6 other prosumer fine-tuning vendors are surveyed below. Vendor selection is a separate future track; this section is observational.
**Pattern summary:** The fine-tuning pattern is the user's interest in baking conventions/workflows into a model via fine-tuning. The pattern is: (1) recognize the bottleneck (generalized models don't have the user's conventions); (2) curate the dataset (the user's own tracks, decisions, plans, styleguides); (3) select a vendor (Together.ai is one; 5-6 others surveyed); (4) fine-tune the model (vendor-specific process); (5) validate the fine-tuned model (does it actually produce better output for the user's use case?). The v3.1 section is observational; the vendor analysis is a separate future track. The decision candidate is Candidate 29 (dataset-curation track) + Candidate 30 (cache TTL GUI contract hardening, per the cross-ref to §13).
#### §14.1 The Diagnosis
The diagnosis (per the user's 2026-06-20 directive): current generalized models are bottlenecked by not having the user's core conventions/workflows baked in. The bottleneck manifests as:
- **Convention drift:** The model produces output that violates the project's conventions (e.g., 4-space indent instead of 1-space; JSON blocks instead of tables). The user must correct the output repeatedly.
- **Workflow ignorance:** The model doesn't know the project's workflow (TDD, per-task commits, format commitments, "always run the suite"). The model produces output that doesn't follow the workflow.
- **Styleguide unawareness:** The model doesn't know the project's styleguides (DOD, cache-friendly context, knowledge artifacts, error handling, agent memory dimensions, RAG integration discipline, feature flags). The model produces output that doesn't follow the styleguides.
The three failure modes are not the same; each has a different fine-tuning mitigation. The "convention drift" mitigation is to bake the conventions into the model's training data (e.g., the project's `conductor/product-guidelines.md` + the 6 styleguides as training examples). The "workflow ignorance" mitigation is to bake the workflow into the model's training data (e.g., the project's `conductor/workflow.md` + per-track `plan.md` as training examples). The "styleguide unawareness" mitigation is to bake the styleguides into the model's training data (e.g., the 6 styleguides + the 14 deep-dive guides as training examples).
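Convention drift is the easiest of the three failure modes to measure mechanically. The sketch below is one hypothetical check for the indent example given above: it estimates the indent unit of a Python snippet as the greatest common divisor of its leading-space counts and flags anything other than the expected 1-space unit. The function names are illustrative, not part of Manual Slop or nagent, and the heuristic is knowingly crude (continuation lines, multi-line strings, and tabs will distort it). The sketch itself uses the 1-space indent it checks for.

```python
import math
from functools import reduce

def indent_unit(source: str) -> int:
 """Return the GCD of all leading-space counts, or 0 if nothing is indented."""
 widths = []
 for line in source.splitlines():
  stripped = line.lstrip(" ")
  if not stripped or stripped.startswith("#"):
   continue  # blank lines and comment lines carry no indent signal
  width = len(line) - len(stripped)
  if width:
   widths.append(width)
 return reduce(math.gcd, widths, 0)

def drifts(source: str, expected_unit: int = 1) -> bool:
 """True when indented code uses a unit other than the expected one."""
 return indent_unit(source) not in (0, expected_unit)
```

Run over a batch of model outputs before and after fine-tuning, the fraction of snippets for which `drifts` returns `True` gives one crude number for step (5) of the pattern summary.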
#### §14.2 Together.ai as One Noticed Vendor
The user noticed Together.ai. Together.ai offers fine-tuning for open-source models (Llama 3.x, Qwen 3, Mistral) with transparent per-token pricing. The pricing model is:
- **Training:** ~$0.50-3.00 per million tokens (varies by model + dataset size).
- **Inference:** ~$0.10-0.60 per million tokens (varies by model + context length).
The prosumer-friendly aspects: transparent pricing, open-source model support, no minimum commitment, serverless deployment. The cons: the user must curate the dataset + select the base model + validate the fine-tuned model.
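To put the training range in concrete terms, here is a minimal worked estimate. The per-million rates are the ones quoted above; the corpus size and epoch count are hypothetical placeholders, and the sketch assumes billing scales with corpus tokens multiplied by epochs, which should be checked against the vendor's current terms.

```python
def cost_range_usd(tokens: int, low_per_million: float, high_per_million: float) -> tuple[float, float]:
 """Scale a per-million-token price range to a token count."""
 millions = tokens / 1_000_000
 return (millions * low_per_million, millions * high_per_million)

TRAINING_RATE = (0.50, 3.00)  # USD per million tokens, range quoted in this section
corpus_tokens = 5_000_000  # hypothetical corpus size
epochs = 3  # hypothetical epoch count

low, high = cost_range_usd(corpus_tokens * epochs, *TRAINING_RATE)
print(f"training: ${low:.2f} to ${high:.2f}")  # prints "training: $7.50 to $45.00"
```

Under these assumptions a single training run costs tens of dollars at most, so the expensive part is the curation and validation work rather than the vendor bill.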
#### §14.3 Prosumer Fine-Tuning Vendor Survey (2026)
The prosumer fine-tuning vendor survey (per the user's 2026-06-20 directive):
| Vendor | Model families | Pricing tier | Prosumer-friendly? | Notes |
|---|---|---|---|---|
| **Together.ai** | Llama, Qwen, Mistral, others | $0.50-3/M training; $0.10-0.60/M inference | Yes — transparent; open-source models | User-noticed vendor |
| **Fireworks.ai** | Llama, Qwen, Mistral | Similar to Together | Yes — serverless DX | Lower latency than Together for some models |
| **OpenAI fine-tuning** | GPT-4o, GPT-4o-mini, GPT-3.5 | ~$3/M training, $0.30/M inference (4o-mini) | Yes for "mini"; expensive for 4o | Best DX; closed-source models |
| **Anthropic Claude Haiku fine-tuning** | Claude Haiku (if on waitlist) | Similar to OpenAI 4o-mini | Waitlist-gated | Best for Anthropic-specific workflows |
| **Google Gemini 1.5 Flash fine-tuning** | Gemini 1.5 Flash | ~$0.50-1/M training | Yes for high-volume | Best for Google-specific workflows |
| **Local fine-tuning (RTX 4090/5090 + Unsloth)** | Any open-source model | $1,500-3,000 one-time hardware | Yes for weekly-iterators | Full control; no per-token cost |
The survey is observational; the vendor analysis is a separate future track. The v3.1 section is not making a recommendation; it's documenting the user's interest + the prosumer vendor landscape.
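The table's own numbers allow a rough break-even between the local-hardware row and hosted training. The sketch below divides the one-time hardware cost by the hosted per-million training rate. It deliberately ignores electricity, the user's time, and inference cost, so it is a lower bound on the real break-even rather than a recommendation.

```python
def break_even_million_tokens(hardware_usd: float, hosted_usd_per_million: float) -> float:
 """Training volume (in millions of tokens) at which hardware cost equals hosted cost."""
 if hosted_usd_per_million <= 0:
  raise ValueError("hosted rate must be positive")
 return hardware_usd / hosted_usd_per_million

# Best case for local: cheapest hardware against the most expensive hosted rate.
earliest = break_even_million_tokens(1_500, 3.00)
# Worst case for local: priciest hardware against the cheapest hosted rate.
latest = break_even_million_tokens(3_000, 0.50)
print(f"break-even between {earliest:.0f}M and {latest:.0f}M training tokens")
```

With the quoted ranges this works out to between 500M and 6,000M training tokens, which is why the table marks local fine-tuning as suited to frequent iterators.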
#### §14.4 Vendor Analysis Is Out of Scope for v3.1
The vendor analysis is out of scope for v3.1. The v3.1 section is observational; the vendor-selection track (if needed) would do the deep comparison + decision. The reasons:
1. **Vendor pricing changes frequently.** The 2026-06-20 numbers may be out of date by 2026-09-20. A vendor-selection track would need to be re-run periodically.
2. **The dataset is the user's call.** The user must curate the dataset (the user's own tracks, decisions, plans, styleguides) before any vendor can fine-tune. Dataset curation is a separate effort.
3. **The validation is the user's call.** The user must validate the fine-tuned model against the user's actual use cases. The validation is a separate effort.
4. **The v3.1 track is research-only.** Per the v3.1 scope, no candidates are implemented in the track. The dataset-curation + vendor-selection would be a separate implementation track.
The v3.1 section is a marker for a future track. The marker is: "the user is interested in fine-tuning; a future track would curate the dataset + select the vendor + fine-tune the model + validate the result".
#### §14.5 Decision Candidates
**NEW Candidate 29 (MEDIUM).** "Dataset-curation track for fine-tuning" — separate track to curate the Manual Slop conventions/workflows dataset for fine-tuning; vendor selection deferred. The dataset would include: per-track `spec.md` + `plan.md` + `state.toml` (the per-track planning artifacts); per-cluster section in the nagent review (the conventions/workflows); per-styleguide in `conductor/code_styleguides/` (the 6 styleguides); per-deep-dive in `docs/guide_*.md` (the 14 deep-dive guides). The dataset would be a markdown + TOML corpus; the corpus would be the input to a vendor-specific fine-tuning process. See `decisions.md` Candidate 29.
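A minimal sketch of what the first step of such a curation track could look like: walk the repository, gather the markdown + TOML files, and emit one JSONL record per file. The styleguide and guide globs come from the paths named in this section; the `conductor/tracks/*/` location, the record fields, and the function names are assumptions made for illustration. Vendor-specific formatting (prompt/completion pairs, chat templates) is out of scope here.

```python
import json
from pathlib import Path

# Glob patterns relative to the repo root.
CORPUS_GLOBS = {
 "styleguide": "conductor/code_styleguides/*.md",
 "guide": "docs/guide_*.md",
 "track": "conductor/tracks/*/*",  # hypothetical location of per-track artifacts
}
TRACK_FILES = {"spec.md", "plan.md", "state.toml"}

def collect_corpus(root: Path) -> list[dict]:
 """Return one record per corpus file, grouped by kind and sorted by path."""
 records = []
 for kind, pattern in CORPUS_GLOBS.items():
  for path in sorted(root.glob(pattern)):
   if not path.is_file():
    continue
   if kind == "track" and path.name not in TRACK_FILES:
    continue  # skip anything that is not a planning artifact
   records.append({
    "kind": kind,
    "path": path.relative_to(root).as_posix(),
    "text": path.read_text(encoding="utf-8"),
   })
 return records

def write_jsonl(records: list[dict], out_path: Path) -> None:
 """Write records as one JSON object per line."""
 with out_path.open("w", encoding="utf-8") as handle:
  for record in records:
   handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Keeping `kind` and `path` on every record makes it possible to scope the dataset to a subset later without re-walking the tree.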
**NEW Candidate 30 (LOW).** "Cache TTL GUI contract hardening" — make the per-turn grounding primitive also track cache state; cross-ref `cache_friendly_context.md`. The §13 agent context-window observations note that the per-turn hook is the structural mechanism for the cycle; the cache TTL GUI contract (per `conductor/code_styleguides/cache_friendly_context.md`) is the cache version of the same insight. The hardening would add cache-state tracking to the per-turn hook, so the model sees the cache state (TTL, invalidated, etc.) as part of the status block. See `decisions.md` Candidate 30.
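As a rough illustration of the hardening, the sketch below appends a cache-state line to a per-turn status block. Every name here (`CacheState`, `cache_status_line`, `render_status_block`, the block layout) is hypothetical; neither nagent's hook machinery nor `cache_friendly_context.md` is known to define them. The only idea carried over from the text is that the model should see TTL and invalidation state alongside the "what to read next" guidance.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CacheState:
 created_at: float  # epoch seconds when the cached prefix was written
 ttl_seconds: float  # provider-side time to live
 invalidated: bool = False  # set when an edit to the prefix busts the cache

def cache_status_line(state: CacheState, now: float) -> str:
 """Summarize cache state in one line for the status block."""
 if state.invalidated:
  return "cache: invalidated"
 remaining = state.created_at + state.ttl_seconds - now
 if remaining <= 0:
  return "cache: expired"
 return f"cache: warm, {int(remaining)}s of TTL left"

def render_status_block(guidance: list[str], state: CacheState, now: float) -> str:
 """Build the per-turn status block: guidance lines first, cache state last."""
 lines = ["status:", *(f"read next: {item}" for item in guidance)]
 lines.append(cache_status_line(state, now))
 return "\n".join(lines)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the block deterministic and easy to test.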
**Source-read citations:**
- The user's 2026-06-20 directive — the diagnosis (current models bottlenecked) + the dataset (Manual Slop's own tracks) + the vendor notice (Together.ai) + the prosumer question (other vendors for middle-wage income in 2026)
- `conductor/presets.py` — the TOML precedent for project config (the dataset would include `presets.toml` + `project_presets.toml`)
- `conductor/personas.py` — the TOML precedent for project config (the dataset would include `personas.toml` + `project_personas.toml`)
- `conductor/context_presets.py` — the ContextPresetManager (the dataset would include per-track context presets)
- `conductor/tool_presets.py` — the ToolPresetManager (the dataset would include tool presets)
- `conductor/tool_bias.py` — the ToolBiasEngine (the dataset would include tool bias profiles)
- `conductor/workflow.md` — the workflow conventions (the dataset would include this)
- `conductor/product-guidelines.md` — the project styleguides (the dataset would include this)
- `conductor/code_styleguides/data_oriented_design.md` — the canonical DOD reference (the dataset would include this)
- `conductor/code_styleguides/cache_friendly_context.md` — the cache TTL GUI contract (the dataset would include this; relevant for Candidate 30)
- `conductor/code_styleguides/knowledge_artifacts.md` — the knowledge harvest pattern (the dataset would include this)
- `conductor/code_styleguides/error_handling.md` — the Result[T] convention (the dataset would include this)
- `conductor/code_styleguides/agent_memory_dimensions.md` — the 4 memory dimensions (the dataset would include this)
- `conductor/code_styleguides/rag_integration_discipline.md` — the conservative-RAG rule (the dataset would include this)
- `conductor/code_styleguides/feature_flags.md` — file presence vs config flags vs CLI flags (the dataset would include this)
- `docs/guide_*.md` — the 14 deep-dive guides (the dataset would include these)
- `docs/Readme.md` — the canonical teaching document (the dataset would include this)
- `AGENTS.md` — the canonical operating instructions (the dataset would include this)
- Per-track `spec.md` + `plan.md` + `state.toml` + `metadata.json` — the per-track artifacts (the dataset would include these)
- Per-discussion `logs/sessions/{session_id}/discussion.jsonl` — the per-discussion history (the dataset would include selected discussions, with user approval)
- The user's existing 4-tier MMA architecture (per `docs/guide_mma.md`) — the MMA conventions (the dataset would include the MMA architecture)
- The user's existing Hook API (per `docs/guide_api_hooks.md`) — the Hook API conventions (the dataset would include the Hook API architecture)
- The user's existing MCP tools (per `docs/guide_mcp_client.md`) — the MCP tool conventions (the dataset would include the MCP architecture)
- Together.ai pricing page (https://www.together.ai/pricing) — the user's noticed vendor
- Fireworks.ai pricing page (https://fireworks.ai/pricing) — the alternative vendor
- OpenAI fine-tuning pricing (https://openai.com/api/pricing/) — the closed-source alternative
- Unsloth (https://github.com/unslothai/unsloth) — the local fine-tuning framework
- `bin/nagent:1075-1081`: `target = f"{llm.provider}/{llm.model}"` (2edc7ee; relevant for the provider/model naming, cross-ref to §5)
- `bin/nagent:3167-3185`: `run_agent_loop` (the main loop; relevant for the overall nagent architecture)
- `conductor/tech-stack.md` — the project's tech stack (relevant for the model selection)
- `bin/helpers/nagent_llm.py:54-77`: `MODEL_CONTEXT_WINDOWS` table (bdfa2a6; relevant for the per-model context windows, cross-ref to §5)
- `bin/nagent:2220-2230`: `root = resolve_default_root(args.root)` (54c8741; relevant for the project-local-roots pattern)
- `bin/helpers/nagent_safety_lib.py` — the safety net library (38d3d4f; relevant for the safety net machinery)
- `bin/nagent:606-745`: `build_initial_context` (v2.3; relevant for the initial context assembly)
- `bin/nagent:970-987`: `conversation_cache_boundaries` (v2.3; relevant for the cache strategy, cross-ref to Candidate 30)
- `bin/nagent:1455-1687`: `run_safety_net` (38d3d4f; relevant for the safety net machinery)
- `bin/nagent:1840-1881`: `extract_conversation_summary` (6426a67; relevant for the instant-saves change)
- `bin/nagent:2819`: `safety_settings=load_safety_settings(...)` (38d3d4f; relevant for the safety net wiring)
- `bin/nagent:1922-1927`: `hook_per_run` injection site (a4fb141; relevant for the per-turn hook, cross-ref to §3 + §13)
- `bin/nagent:1442-1484`: `run_hook` + `resolve_hooks` (a4fb141; relevant for the per-turn hook, cross-ref to §3 + §13)
- `bin/helpers/nagent_cli.py:11-86` — the resolve/scaffold functions (54c8741; relevant for the project-local-roots pattern)
- `bin/nagent:1-50` — main module imports + constants (the v3 cluster does not cite specific line ranges)
- `bin/nagent:1300-1400` — main loop body (the v3 cluster does not cite specific line ranges)
- `bin/nagent:1900-2000` — main loop continued (the v3 cluster does not cite specific line ranges)
- `bin/nagent:2000-2100` — main loop continued (the v3 cluster does not cite specific line ranges)
- `bin/nagent:2200-2300` — main loop end (the v3 cluster does not cite specific line ranges)
- `bin/nagent:640-748`: `build_initial_context` (54c8741; relevant for the 4-layer context resolution)
**Honest gaps:**
1. **The dataset-curation effort is significant.** A complete dataset would include all 14 deep-dive guides + 6 styleguides + per-track artifacts + per-discussion history. The effort is months, not days. A future track would scope the dataset to a manageable subset.
2. **The vendor pricing is from 2026-06-20.** The pricing may change by the time the user is ready to fine-tune. A vendor-selection track would re-survey the pricing at the time of decision.
3. **The fine-tuned model's validation is the user's call.** The user must validate the model against the user's actual use cases. The validation is a separate effort; the v3.1 section does not provide a validation methodology.
4. **The Cache TTL GUI contract hardening (Candidate 30) is a small change.** The cross-ref to `cache_friendly_context.md` is the canonical reference; a future track would add cache-state tracking to the per-turn hook.
5. **The fine-tuning vs. prompting trade-off is not analyzed.** Fine-tuning bakes conventions into the model; prompting surfaces conventions at inference time. The trade-off is: fine-tuning is a one-time cost + lower per-inference cost; prompting is a per-inference cost + no training cost. A vendor-selection track would analyze the trade-off.
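The trade-off in gap 5 can be stated as a break-even condition. The symbols below are assumptions introduced for illustration: $C_{\text{train}}$ is the one-time training cost, $t_{\text{task}}$ the task tokens per request, $t_{\text{conv}}$ the convention tokens that prompting must prepend to each request, $p$ the base model's per-token inference price, $p'$ the fine-tuned model's per-token inference price, and $N$ the number of requests. Output tokens are ignored for simplicity.

```latex
\begin{align*}
C_{\text{prompt}}(N) &= N \, p \, (t_{\text{task}} + t_{\text{conv}}) \\
C_{\text{ft}}(N) &= C_{\text{train}} + N \, p' \, t_{\text{task}} \\
C_{\text{ft}}(N) < C_{\text{prompt}}(N)
  &\iff N > N^{*} = \frac{C_{\text{train}}}{p \, (t_{\text{task}} + t_{\text{conv}}) - p' \, t_{\text{task}}}
  \quad \text{(denominator positive)} \\
p' = p &\implies N^{*} = \frac{C_{\text{train}}}{p \, t_{\text{conv}}}
\end{align*}
```

If the denominator is zero or negative (the fine-tuned model's inference premium outweighs the saved convention tokens), fine-tuning never pays back on cost alone. A convention prefix served from a discounted prompt cache lowers the effective $p$ on $t_{\text{conv}}$ and pushes $N^{*}$ higher.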
## §15 Decisions
See `decisions.md` for the full candidate list (v2.3's 16 + v3's new 11 + v3.1's new 3, with v2.3 → v3 → v3.1 status mapping at the top). **Total v3.1 candidate pool: 30 entries** (3 HIGH + 7 MEDIUM + 7 LOW + 1 LOW-docs in v3+v3.1's new candidates, plus 14 STILL-OPEN from v2.3, plus 1 PROMOTED + 1 SUBSUMED status changes, plus 3 v3.1 NEW per §12-§14). The HIGH-priority v3 candidates are:
- **Candidate 17:** Campaign-style plan-as-data for the conductor (§1) — amended by Candidate 27 to use markdown + frontmatter, not YAML
- **Candidate 18:** Discussion-window safety net for Manual Slop (§2)
- **Candidate 22:** Tier 3 worker contract "decompose or isolate, never offload" (§6)
The MEDIUM-priority v3 candidates are Candidates 19 (per-turn hook), 21 (per-model token-cap), 23 (per-conversation scratch dir), 25 (optimization-log discipline), 27 (tolerance-based comparator). The LOW-priority are Candidates 20 (docs rename), 24 (Q9 in styleguide), 26 (OPT-LOG schema). Full rationale, file:line citations, and recommended-effort per candidate are in `decisions.md`.
The MEDIUM-priority v3+v3.1 candidates are Candidates 19 (per-turn hook — amended by Candidate 28), 21 (per-model token-cap), 23 (per-conversation scratch dir), 25 (optimization-log discipline), 27 (markdown+DSL lock-in, per §12), 28 (per-turn ground-truth hook, per §13), 29 (dataset-curation track, per §14). The LOW-priority are Candidates 20 (docs rename), 24 (Q9 in styleguide), 26 (OPT-LOG schema), 30 (cache TTL GUI contract hardening, per §14). Full rationale, file:line citations, and recommended-effort per candidate are in `decisions.md`.
## §13 Cross-references
## §16 Cross-references
See `nagent_takeaways_v3_20260619.md` for the bridge to v2.3 takeaways + the sibling reviews:
- **`fable_review_20260617`** — Fable's analysis of Mythos system prompt. Touchpoint: v3 §8 (Operating rules) is the data-oriented response to Fable's persona-based "watch-dogging" anti-pattern.
- **`intent_dsl_survey_20260612`** — the 10 prior-art clusters for intent-based DSLs. Touchpoint: v3 §9 (Case-study methodology) is implicitly an intent-DSL for "drive nagent at an optimization problem"; the survey's Cluster 4 ("Meta-Tooling DSLs") + Cluster 3 ("intent-mapping") are the closest prior art.
- **`superpowers_review_20260619`** — the superpowers plugin review. Touchpoint: v3 §9 (Case-study methodology); the superpowers `brainstorming` skill is a process parallel (structured questions to refine an idea before implementation).
- **`intent_dsl_survey_20260612`** — the 10 prior-art clusters for intent-based DSLs. Touchpoint: v3 §9 (Case-study methodology) is implicitly an intent-DSL for "drive nagent at an optimization problem"; v3.1 §12 (YAML avoidance) cites the survey's Cluster 5 "SSDL shape primitives" as the project's DSL primitive.
- **`superpowers_review_20260619`** — the superpowers plugin review. Touchpoint: v3 §9 (Case-study methodology); the superpowers `brainstorming` skill is a process parallel (structured questions to refine an idea before implementation); v3.1 §12 (YAML avoidance) cites the superpowers review as the project's markdown-driven convention.
## §14 References
## §17 References
### Source commits (24)
@@ -2415,7 +2841,27 @@ The 24 nagent commits reviewed, in chronological order (oldest first):
- [`macton/pep-copt`](https://github.com/macton/pep-copt) at `main` (5 commits). The PEP image compression case study: 2.04× speedup aggregate on 24-image benchmark, byte-identical `.pep` output, decode net-neutral (§10).
- [`macton/differentiable-collisions-optc`](https://github.com/macton/differentiable-collisions-optc) at `main` (5 commits). The Convex Primitive Collision Detection case study: 101.06× speedup on committed input, 97.75× and 98.43× on alternate seeds, tolerance-based match contract (§11).
### Per-phase commit SHAs
### Per-phase commit SHAs (v3.1)
| Phase | Description | Commit SHA |
|---|---|---|
| Phase 1 | Setup + audit (v3.1) | `8fb82762` |
| Phase 2 | Thicken §1 Campaigns cluster | `bd36aa4b` |
| Phase 3 | Thicken §2 Conversation safety net cluster | `478b088b` |
| Phase 4 | Thicken §3 Hooks cluster | `d17ee930` |
| Phase 5 | Thicken §4 Project-local roots cluster | `1bc8e924` |
| Phase 6 | Thicken §5 Provider expansion cluster | `987f4a97` |
| Phase 7 | Thicken §6 Delegation rewrite cluster | `a406d290` |
| Phase 8 | Thicken §7 Robustness cluster | `b9b31006` |
| Phase 9 | Thicken §8 Operating rules cluster | `eb7da8d8` |
| Phase 10 | Thicken §9 Case-study methodology cluster | `24442379` |
| Phase 11 | Thicken §10 PEP case study cluster | `10c7d1d0` |
| Phase 12 | Thicken §11 Collisions case study cluster | `1574ee47` |
| Phase 13 | New sections §12-§14 + renumber v3 §12-§14 to §15-§17 | (this commit) |
| Phase 14 | Refresh side artifacts | (forthcoming) |
| Phase 15 | Chunking-strategy + format-commitment verification | (forthcoming) |
### Per-phase commit SHAs (v3)
| Phase | Description | Commit SHA |
|---|---|---|
@@ -2431,8 +2877,8 @@ The 24 nagent commits reviewed, in chronological order (oldest first):
| Phase 10 | Case-study methodology cluster (§9) | `54e62b10` |
| Phase 11 | PEP case study cluster (§10) | `f53c82e6` |
| Phase 12 | Collisions case study cluster (§11) | `db7d94de` |
| Phase 13 | Refresh side artifacts | (this commit) |
| Phase 14 | Format-commitment verification | (forthcoming) |
| Phase 13 | Refresh side artifacts | `e150088d` |
| Phase 14 | Format-commitment verification | `b49be820` |
### Sibling-review references
@@ -2445,7 +2891,10 @@ The 24 nagent commits reviewed, in chronological order (oldest first):
- `conductor/workflow.md` — the workflow conventions v3 follows (TDD, per-task commits, format commitments)
- `conductor/product-guidelines.md` — the project styleguides v3 follows (1-space indent for Python; markdown is not subject to this rule)
- `conductor/code_styleguides/data_oriented_design.md` — the project's canonical DOD reference, itself derived from Acton's `context/data-oriented-design.md`
- `conductor/code_styleguides/cache_friendly_context.md` — references nagent_review_v2_3 §3.2 + §5 (v3 deepens with §5 per-model context windows)
- `conductor/code_styleguides/cache_friendly_context.md` — references nagent_review_v2_3 §3.2 + §5 (v3 deepens with §5 per-model context windows); v3.1 §13 + §14 cross-ref for the per-turn hook + cache TTL GUI contract
- `conductor/code_styleguides/knowledge_artifacts.md` — references nagent_review_v2_3 §3.1 + §4 (v3 renames `nagent-gc` to `nagent-distill`)
- `conductor/code_styleguides/agent_memory_dimensions.md` — references nagent_review_v2_3 §2.8 (v3 deepens with §1-§4 memory extension)
- `docs/guide_meta_boundary.md` — the Application vs Meta-Tooling distinction (load-bearing context for v3)
- `conductor/code_styleguides/rag_integration_discipline.md` — the conservative-RAG rule
- `conductor/code_styleguides/feature_flags.md` — file presence vs config flags vs CLI flags
- `conductor/code_styleguides/error_handling.md` — the Result[T] convention
- `docs/guide_meta_boundary.md` — the Application vs Meta-Tooling distinction (load-bearing context for v3)