From bd36aa4b6533bcb518825a4bd98627a35c09b5c8 Mon Sep 17 00:00:00 2001
From: Ed_
Date: Sat, 20 Jun 2026 10:56:26 -0400
Subject: [PATCH] =?UTF-8?q?conductor(track):=20nagent=5Freview=5Fv3.1=20th?=
 =?UTF-8?q?icken=20=C2=A71=20Campaigns=20cluster?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .../nagent_review_v3_20260619.md | 183 +++++++++++++++---
 1 file changed, 153 insertions(+), 30 deletions(-)

diff --git a/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md b/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md
index 2e9e1847..c942b584 100644
--- a/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md
+++ b/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md
@@ -19,10 +19,136 @@ v3 covers the **24-commit nagent evolution** between `eb6be32a` (v2.3 baseline,
 **Source:** nagent `24cf16d`, `199a36b`, `f3ec090`, `c1d2cad`, `6443d70`, `7a7e242` (`bin/nagent-campaign`, `bin/helpers/nagent_campaign_lib.py`, `bin/helpers/nagent_distill_lib.py:228-260` + `:793-979`, `bin/nagent-distill:107-200`, `prompts/campaign-decompose.md`, `prompts/campaign-item.md`, `prompts/knowledge-merge.md`, `prompts/knowledge-graduate.md`, `prompts/create-readme.md:248-251`, `issues/0002-campaign-system.md`, `issues/0004-conversation-safety-net.md`, `tests/test_nagent_campaign.py`, `tests/test_nagent_distill.py`, `README.md:474-484` + `:900-908`)
 **One-liner:** Plans become operable artifacts. The plan is data (YAML), the driver is deterministic code, the model's non-determinism is relocated and bounded to narrow judgments.
-**Pattern(s) vs v2.3:** NEW. v2.3 had the implicit "what to do next is the model's judgment, re-made every turn" loop. v3 makes the plan a first-class artifact: an inspectable, editable, durable spine that survives the conversation that created it.
EXTENDS v2.3 Pattern 1 ("durable work, disposable workers") — campaigns make "durable work" an explicit artifact instead of a process convention. EXTENDS v2.3 Pattern 3 ("conversations are editable state") — plans-as-artifact is a new editable dimension, parallel to conversations.
-**Manual Slop implications:** The conductor's `plan.md` could evolve toward a campaign-style `index.yaml` + per-task `task.yaml` + per-task `conversation` artifact set. The MMA WorkerPool's tier-3 workers already follow the spirit (structured result, no direct tree mutation) but lack a documented worker contract + review gate. The "plan changes pass a review gate, not a cap" invariant maps cleanly to the existing HITL flow — Manual Slop's gate is the modal confirm; nagent's gate is the `proposal.yaml` file with `auto_confirm_max_items`/`auto_confirm_max_depth` thresholds.
-**Decision candidate:** NEW Candidate 17 (HIGH). "Campaign-style plan-as-data for the conductor": add a `.conductor/campaigns/{slug}/` layout with `index.yaml` + per-task `task.yaml` + per-task conversation artifacts; add a deterministic driver (1 pass, then exit) that mirrors `nagent-campaign update`'s 6 phases. See `decisions.md` Candidate 17.
-**Cross-refs:** none direct (the §2 Conversation safety net cluster cross-references this one; the §9 Case-study methodology cluster cross-references the "open questions as text files" pattern).
+**Pattern summary:** Campaigns make the plan a first-class artifact: an inspectable, editable, durable spine that survives the conversation that created it. The artifact is a YAML tree on disk (`.nagent/campaigns/{slug}/index.yaml` + per-item `item.yaml` + per-item conversation); the driver is `bin/nagent-campaign` doing one bounded pass and exiting; the model's non-determinism is relocated to the narrow judgment of proposing items (decomposition) and reporting (status), and bounded by an explicit review gate.
This extends the "durable work, disposable workers" principle (v2.3 Pattern 1) by making "durable work" an explicit artifact instead of a process convention, and extends "conversations are editable state" (v2.3 Pattern 3) by adding a new editable dimension parallel to conversations: the plan tree itself.
+
+#### §1.1 What Campaigns Adds
+
+Campaigns introduce a new lifecycle boundary between planning and execution. Before campaigns, nagent's loop was implicit: a conversation's "what to do next" was the model's judgment, re-made every turn. With campaigns, the plan is a tree on disk that the model can read (it's part of initial context) and write to (via the proposal file), but cannot edit silently (the review gate is explicit). The four pieces of the campaigns abstraction are:
+
+1. **Artifact** — the YAML tree at `.nagent/campaigns/{slug}/index.yaml` (campaign-level) + per-item `item.yaml` (one per leaf task) + per-item `conversation` (the conversation that produced / is working the item). The artifact is the state of record; the conversation is ephemeral.
+2. **Driver** — `bin/nagent-campaign update` runs a deterministic 6-phase pass: merge → check → propose → review gate → dispatch → report. One pass, one exit. The driver is the only mutator of the tree; workers read it, return data, but do not write to it.
+3. **Invariants** — four load-bearing rules from `issues/0002-campaign-system.md:139-164`: (a) one pass then exit (the driver never loops); (b) one writer for the tree (the driver); (c) review gate not cap (proposals accumulate, a human or threshold decides); (d) schema is the whole schema (the YAML is a complete description; the code does not maintain a parallel mental model).
+4.
**Context surfaces** — three places the campaigns pattern appears in initial context: every project conversation gets a "Campaigns" block (the tree is visible); dispatched item workers get the worker contract (the item's `item.yaml` + the parent campaign's `index.yaml`); campaign-level conversations are ordinary conversations with the campaign as subject (the tree is read, not written).
+
+This decomposition is itself data-oriented — the campaign's behavior is the artifact's shape, not code branching on state. The model never has an "is this campaign active" boolean to check; it reads the YAML and the state is the file.
+
+#### §1.2 The Driver Phases
+
+The `update` command runs six phases. Each phase is either pure code over the tree or a bounded LLM call (`check` for LLM-judged conditions, `propose`, and `report`):
+
+1. **Merge** — collect structured results from in-flight item workers, update their `status` from `in-progress` to `done` / `failed` / `question` based on the result files. Pure code; no LLM call.
+2. **Check** — evaluate the `completion: [condition]` entries, running the executable ones as tests. For `condition` types that are LLM-judged (e.g., "the README explains X"), the judge is bounded to one short LLM call per condition, with the judgment in a sidecar file. No multi-turn model reasoning.
+3. **Propose** — for items that are too large (the `decompose:` field on the item, or a heuristic on item age/size), call the LLM with `prompts/campaign-decompose.md` to produce a `proposal.yaml` with sub-items. The LLM proposes; the user (or threshold) decides.
+4. **Review gate** — for `proposal.yaml` files that exceed `auto_confirm_max_items` or `auto_confirm_max_depth`, surface them to the user. Below the thresholds, auto-confirm. The gate is explicit: a `proposal.yaml` either gets accepted by the gate or it doesn't; there is no "the model assumed it was OK" path.
+5.
**Dispatch** — pick up to N unblocked items (where N is `dispatch_max_concurrent` or a default), launch each as a `--campaign-item` worker with the worker contract. Workers return data; they do not write the tree.
+6. **Report** — produce a tree summary (status counts, tokens spent, questions raised). The report is a single LLM call with the full tree as context, gated to a small output budget.
+
+A code-shape sketch using survey grammar (per the format commitment §5.1):
+
+```
+campaign := { name: string, status: active|paused|done,
+              completion: [condition], items: [item] }
+item := { id: string, status: todo|proposed|in-progress|done|failed|question,
+          blocked_by: [id], conversation: path,
+          decompose: { when: heuristic, into: [sub_item] } }
+update {slug} {
+  merge        // collect structured results, update statuses (pure code)
+  check        // run executable test: conditions; bounded judge for judge:
+  propose      // decompose big items -> proposal.yaml, status proposed
+  review_gate  // auto-confirm within thresholds; report scope of pending
+  dispatch     // bounded N unblocked items, each as --campaign-item worker
+  report       // tree summary + questions + tokens spent
+}
+```
+
+The `{ssdl}` shape tag for the campaign tree is `[M]` (mutable aggregate, hand-edited by humans) — the artifact is the state of record, the worker contract returns data, the driver is the only mutator. The lineage to v2.3's harvest pattern is direct: workers produce data (harvest-JSON in v2.3; `result.json` here), code merges into the tree (regenerate_digest in v2.3; driver merge phase here).
+
+#### §1.3 The Invariants
+
+From `issues/0002-campaign-system.md:139-164`, the four invariants that hold the abstraction together:
+
+1. **One pass then exit.** The driver never loops. It does one bounded pass and exits. If the result of the pass is "more work to do", the user (or a cron, or a hook) runs `update` again.
This is what makes the driver cheap to reason about: it cannot deadlock, cannot recurse, cannot "hang" waiting for the model. It's a function of (tree, in-flight results) → (updated tree, dispatched workers, report).
+2. **One writer for the tree.** The driver is the only thing that writes `.nagent/campaigns/{slug}/`. Workers read it, return data, do not write. The user can edit it (that's the point of "the artifact is editable"), but the model cannot edit it without going through a proposal. This eliminates the "two writers race on the same file" class of bugs.
+3. **Review gate not cap.** Proposals accumulate. A human (or a threshold) decides whether to accept them. The model never "assumes" a proposal is accepted; the gate is explicit. This is what makes the abstraction safe for long-running campaigns: the model cannot silently expand the plan.
+4. **Schema is the whole schema.** The YAML tree is a complete description of the campaign. The code does not maintain a parallel mental model (e.g., "we track active items in memory and the YAML is just a snapshot"). The YAML is the truth; the code is a function of the YAML.
+
+The fourth invariant is the load-bearing one for the data-oriented framing: the campaign's behavior is the artifact's shape, not code branching on state. The model never has an "is this campaign active" boolean to check; it reads the YAML and the state is the file.
+
+#### §1.4 Per-Commit Detail
+
+The six commits that built the campaigns subsystem, in dependency order:
+
+1. **`24cf16d` — Add the campaigns driver.** Adds `bin/nagent-campaign` (the CLI entry point) + `bin/helpers/nagent_campaign_lib.py` (the driver implementation, ~400 lines). Also adds the initial context block (`prompts/campaign-decompose.md` + `prompts/campaign-item.md`) so the model knows how to propose and dispatch. The 6-phase `update` command lands here.
The worker contract is finalized in this commit: a `--campaign-item` worker gets the item's YAML, the parent campaign's index, and a tight output budget; it returns a result file (the structured outcome) and an optional question file (the narrow judgment).
+2. **`199a36b` — Add the issue file that fully specifies the system.** Adds `issues/0002-campaign-system.md` (326 lines). This is the "long form spec as a file" pattern from v2.3 — the design is in the repo, not in a wiki or a chat. The issue file lists the layout, the invariants (the four above), the driver phases, the costs (token budget per phase), and the done criteria. This is the document the driver implementation in `24cf16d` was built to.
+3. **`f3ec090` — Wire the merge/graduate passes to the campaign lifecycle.** Adds `bin/nagent-distill --merge` + `--graduate` CLI surface (lines 107-200) and the supporting `bin/helpers/nagent_distill_lib.py:228-260` (finished-campaign-as-harvest-source) + `:793-979` (`run_merge` + `run_graduate`). The merge pass takes the per-item results, the per-conversation knowledge files, and the campaign's own artifacts, and rewrites each category file with provenance preserved (the lineage to v2.3's harvest is direct). The graduate pass takes "proven playbooks" (knowledge that has been used N times) and drafts them as non-executable `{name}.draft` files invisible to tool discovery until the user reviews them. The two prompts (`prompts/knowledge-merge.md` + `prompts/knowledge-graduate.md`) are short and tight: merge is 19 lines, graduate is 26.
+4. **`c1d2cad` — Update the README to teach the merge + graduate passes.** Adds `README.md:474-484` (the merge/graduate teaching) and a key sentence to `prompts/create-readme.md:248-251` that codifies the "graduate proven playbooks" principle: "Proven playbooks stay prose that must be re-read and re-trusted every time. Therefore: graduate them into self-describing tools and prompts — knowledge becomes capability, gated by review."
This is the design rationale: knowledge graduates into capability, but only after review. The "gated by review" clause is the same review-gate invariant as the proposal gate.
+5. **`6443d70` — Rework the conversation safety net issue file.** This is not strictly a campaigns commit, but it lands in the same window. Reworks `issues/0004-conversation-safety-net.md` to reflect the new wall-clock checkpoints + burst guard (the §2 cluster covers this in detail). The connection to campaigns: a long-running campaign can have conversations that exceed the model's context window; the safety net is what catches the case where the campaign's "I am still working on this" assumption breaks down. Also deletes `issues/0003-distill-passes.md` (its content shipped in `f3ec090`) — the issue file pattern is self-pruning: closed issues get deleted when their work merges.
+6. **`7a7e242` — File the deferred follow-ups as issue files.** Adds `issues/0001-retry-attempts-persist-raw-invalid-output.md` + `issues/0002-invalid-output-sidecars-are-never-collected.md`. Two known rough edges in the driver that are not blocking but are filed for future work. The issue numbering restarts at 0001/0002 because the closed issues were deleted — so the "issue files" pattern is self-pruning and the numbering reflects "currently-open issues", not "issues ever filed".
+
+#### §1.5 Manual Slop Implications
+
+The Manual Slop equivalents of the campaigns pattern are partial. The closest analog is the per-track `plan.md` + `state.toml` + `metadata.json` triplet in `conductor/tracks/{track_id}/`. The per-track `plan.md` is the editable plan; `state.toml` is the machine-readable progress; `metadata.json` is the spec-derived scope. But the Manual Slop analog lacks three of the four campaigns invariants:
+
+1. **No "one writer for the tree" guarantee.** The `plan.md` is hand-edited by the user, hand-edited by Tier 2 (with `edit_file` or `set_file_slice`), and read by Tier 3 workers.
There is no `bin/nagent-campaign` equivalent that mediates writes. The "two writers race" class of bugs is real (e.g., Tier 2 edits `plan.md` while Tier 3 worker is reading it).
+2. **No "one pass then exit" driver.** The MMA WorkerPool's `ConductorEngine` (in `src/multi_agent_conductor.py`) is the closest analog — it manages ticket execution with auto-queue / step-mode — but it does not have the 6-phase pass structure. It loops; the nagent driver does not.
+3. **No explicit review gate.** Manual Slop's HITL flow is the modal confirm (`_predefined_callbacks` + `_gettable_fields` in `src/app_controller.py`); nagent's gate is the `proposal.yaml` file with `auto_confirm_max_items`/`auto_confirm_max_depth` thresholds. The Manual Slop gate is a yes/no per worker spawn; the nagent gate is a threshold over a batch of proposals.
+
+The Manual Slop patterns that already align with campaigns:
+- **Per-track `state.toml`** (e.g., `conductor/tracks/nagent_review_20260608/state.toml`) is a partial `[M]` mutable aggregate. It has phase + task entries with `status` + `commit_sha` fields. The analog is partial: the `state.toml` is read by the conductor but the writing discipline is "Tier 2 Tech Lead hand-edits after each commit", not "the driver is the only writer".
+- **The `_predefined_callbacks` Hook API** (in `src/app_controller.py:531-617`) is the closest analog to the campaign's context surfaces. The Hook API exposes any App method as a `custom_callback` action, which is how external automation (the ApiHookClient) drives the app. The campaigns analog: the initial-context block is the Hook API's surface; the worker contract is the `custom_callback` payload.
+- **The MMA WorkerPool's tier-3 workers** (in `src/multi_agent_conductor.py` + `scripts/mma_exec.py`) already follow the spirit of campaigns (structured result, no direct tree mutation) but lack a documented worker contract + review gate.
The `WorkerPool` spawns workers with `mma_exec.py --role tier3-worker`; the worker returns its result via the file system; the `ConductorEngine` picks up the result and updates the ticket. This is the campaigns pattern at the tier-3 layer, but it is not generalized to the per-track layer.
+
+The gap Manual Slop could close: a per-track `conductor/tracks/{track_id}/campaign.yaml` + a `bin/conductor-campaign update` driver that does the 6-phase pass. The driver would: merge Tier 3 worker results into `state.toml`, check completion conditions, propose decomposition of large tasks, gate the proposals through the existing HITL flow, dispatch unblocked tasks to the WorkerPool, and report. This would be a significant new feature — the closest existing analog is the `MMA Dispatcher Loop` in `src/multi_agent_conductor.py:280-340`, but it's scoped to the MMA queue, not the per-track plan.
+
+**Note on YAML format (per the user's directive, expanded in v3.1 §12):** the campaigns artifact format is YAML. Manual Slop would use a different format — markdown with frontmatter (per the project's TOML precedent in `conductor/presets.py` + `conductor/personas.py`) or a custom DSL. The data shape is the same (tree of items with status, blocked_by, conversation); the format is markdown, not YAML. See v3.1 §12 for the full rationale.
+
+#### §1.6 Honest Gaps
+
+1. **The decompose prompt is not deep-dived.** `prompts/campaign-decompose.md` is the LLM prompt that proposes item decomposition. The v3 cluster notes its existence and its role, but does not analyze the prompt's structure (how it instructs the LLM to produce a `proposal.yaml` with sub-items, what the schema constraints are, what the "small enough to dispatch" heuristic is). A future v3.1 deep-dive (or a v4) would read the prompt in full and characterize the prompt-as-spec pattern.
+2.
**The worker contract is not deep-dived.** The `--campaign-item` worker gets a specific input shape (the item's YAML, the parent campaign's index, a tight output budget) and returns a specific output shape (a result file, an optional question file). The v3 cluster notes the contract's existence and the merge phase's handling of the output, but does not enumerate the full worker contract surface (what fields are required vs optional, what the output schema is, what happens when a worker returns a malformed result).
+3. **The judge condition type is not deep-dived.** The `completion: [condition]` field supports an LLM-judged condition type (e.g., "the README explains X"). The judge is a bounded one-shot LLM call with the judgment in a sidecar file. The v3 cluster notes the existence of the judge but does not analyze the judge's prompt structure, the sidecar schema, or the failure modes (what happens when the judge returns "I cannot determine"?).
+4. **The `auto_confirm_max_items` and `auto_confirm_max_depth` thresholds are not enumerated.** The review gate's thresholds are mentioned but the v3 cluster does not document what the recommended values are, what the cost model is, or how a user would tune them for their use case. A v4 would document the threshold tuning procedure.
+5. **The dispatch concurrency limit is not enumerated.** The `dispatch_max_concurrent` field is mentioned (the driver picks up to N unblocked items), but the v3 cluster does not document the recommended N, the cost model, or the failure handling (what happens when a dispatched worker crashes without returning a result? does the driver time out and re-dispatch? does the item stay `in-progress`?).
+6. **The interaction with the conversation safety net is not deep-dived.** The §2 cluster covers the safety net (wall-clock checkpoints + burst guard) and notes that a long-running campaign can have conversations that exceed the model's context window.
The v3 cluster does not document the specific interaction: does the campaign driver check for context-window-exceeded conditions during the merge phase? does the dispatch phase refuse to launch a worker when the context window is already full? does the report phase surface context-window warnings to the user? A v4 would map the safety net's hooks into the campaign driver's phases.
+
+#### §1.7 Code-Shape Sketch
+
+The campaign tree, in survey-grammar SSDL notation, with shape tags:
+
+```
+campaign := { name: string,  # [S] string concatenation
+              status: active|paused|done,  # [I] inspectable enum
+              completion: [condition],  # [M] mutable list
+              items: [item],  # [B] boundary (the dispatch list)
+              proposal: proposal_yaml? }  # [M] mutable, pending review
+
+item := { id: string,  # [S]
+          status: todo|proposed|in-progress|done|failed|question,  # [I]
+          blocked_by: [id],  # [B] dependency edge
+          conversation: path,  # [B] path to conversation file
+          decompose: { when: heuristic, into: [sub_item] }?,  # [M] optional
+          result: result_json? }  # [M] populated by merge phase
+
+condition := { type: executable|judge,  # [I]
+               spec: string,  # [S] the test or the judge prompt
+               satisfied: bool }  # [I] populated by check phase
+
+result_json := { status: done|failed|question,  # [I]
+                 summary: string,  # [S]
+                 question: question?
} # [M] optional
+
+update {slug} {  # driver entry point
+  merge        // collect result.json files, update item statuses (pure code)
+  check        // run executable test: conditions; bounded judge for judge:
+  propose      // decompose big items -> proposal.yaml, status proposed
+  review_gate  // auto-confirm within thresholds; report scope of pending
+  dispatch     // bounded N unblocked items, each as --campaign-item worker
+  report       // tree summary + questions + tokens spent
+}
+```
+
+The shape tag map: `[I]` for inspectable enums and booleans (the model's understanding is the file's value), `[S]` for string concatenations (the model's understanding is the file's content), `[B]` for boundaries (the model's understanding is the file's edge), `[M]` for mutable aggregates (the model's understanding is the file's state). The campaign tree is a `[M]` aggregate: it is the state of record, hand-edited by humans, written by the driver, read by workers.
+
 **Source-read citations:**
 - `bin/nagent-campaign` — new CLI entry point (24cf16d)
 - `bin/helpers/nagent_campaign_lib.py` — driver implementation (24cf16d)
@@ -34,34 +160,31 @@ v3 covers the **24-commit nagent evolution** between `eb6be32a` (v2.3 baseline,
 - `prompts/knowledge-merge.md:1-19` — merge LLM prompt (f3ec090)
 - `README.md:474-484` — merge + graduate teaching (c1d2cad)
 - `README.md:900-908` — `nagent-campaign` CLI examples (24cf16d)
-- `prompts/create-readme.md:248-251` — graduation reduction: "Proven playbooks stay prose that must be re-read and re-trusted every time. Therefore: graduate them into self-describing tools and prompts — knowledge becomes capability, gated by review."
(c1d2cad)
+- `prompts/create-readme.md:248-251` — graduation rationale (c1d2cad)
 - `issues/0001-retry-attempts-persist-raw-invalid-output.md` + `issues/0002-invalid-output-sidecars-are-never-collected.md` — two deferred follow-ups, filed as issue files (7a7e242)
-- `issues/0004-conversation-safety-net.md` (reworked at 6443d70) — wall-clock checkpoints + burst guard; the safety net that decomposition cannot bound
-**Honest gaps in this cluster:** The issue file at `issues/0003-distill-passes.md` was DELETED at `6443d70` because the distill-passes content shipped in `f3ec090`; the issue numbering for the deferred followups at `7a7e242` starts fresh at 0001/0002 — so the "issue files" pattern is self-pruning (closed issues get deleted when their work merges). The driver spec at `issues/0002-campaign-system.md:159-191` lists 6 driver phases (Merge → Check → Propose → Review gate → Dispatch → Report), but the implementation commit `24cf16d` adds `bin/nagent-campaign` + `bin/helpers/nagent_campaign_lib.py` (the actual driver); the prompt files for decomposition (`prompts/campaign-decompose.md`) and worker context (`prompts/campaign-item.md`) also land in `24cf16d`, but their LLM prompts are not deep-dived here. Per the user's §0 cluster-scheme honesty note, "the source-read pass may surface new clusters" — these prompts are candidates for a future v3.1 deep-dive.
-
-**Pattern deep-dive.** The campaigns abstraction is a four-piece composition: **artifact**, **driver**, **invariants**, **context surfaces**.
The artifact is the YAML tree (`.nagent/campaigns/{slug}/index.yaml` + per-item `item.yaml` + per-item `conversation`); the driver is `bin/nagent-campaign` doing one bounded pass and exiting; the invariants are the four load-bearing rules from `issues/0002-campaign-system.md:139-164` (one pass then exit; one writer for the tree; review gate not cap; schema is the whole schema); the context surfaces are the three places the campaigns pattern appears in initial context (every project conversation gets a Campaigns block; dispatched item workers get the worker contract; campaign-level conversations are ordinary conversations with the campaign as subject). This decomposition is itself data-oriented — the campaign's behavior is the artifact's shape, not code branching on state.
-
-The merge/graduate passes (f3ec090) extend the same idea to the knowledge store: knowledge files grow append-only until unreadable, so `--merge` rewrites each category file with provenance preserved; proven playbooks stay prose when they should become tools, so `--graduate` drafts them as non-executable `{name}.draft` files invisible to tool discovery until the user reviews them. The "nothing lands silently" property is load-bearing — drafts are deliberately not executable, so a graduate pass cannot accidentally expose a half-formed tool to a future conversation.
-
-A code-shape sketch using survey grammar (per the format commitment §5.1):
-
-```
-campaign := { name: string, status: active|paused|done,
-              completion: [condition], items: [item] }
-item := { id: string, status: todo|proposed|in-progress|done|failed|question,
-          blocked_by: [id], conversation: path }
-update {slug} {
-  merge        // collect structured results, update statuses (pure code)
-  check        // run executable test: conditions; bounded judge for judge:
-  propose      // decompose big items -> proposal.yaml, status proposed
-  review_gate  // auto-confirm within thresholds; report scope of pending
-  dispatch     // bounded N unblocked items, each as --campaign-item worker
-  report       // tree summary + questions + tokens spent
-}
-```
-
-**Honest gap (continued):** the `{ssdl}` shape tag for the campaign tree is best described as `[M]` (mutable aggregate, hand-edited by humans) — the artifact is the state of record, the worker contract returns data, the driver is the only mutator. The lineage to v2.3's harvest pattern is direct: workers produce data (harvest-JSON in v2.3; `result.json` here), code merges into the tree (regenerate_digest in v2.3; driver merge phase here).
+- `issues/0004-conversation-safety-net.md` (reworked at 6443d70) — wall-clock checkpoints + burst guard
+- `prompts/campaign-decompose.md:1-N` — decomposition LLM prompt (24cf16d)
+- `prompts/campaign-item.md:1-N` — worker contract prompt (24cf16d)
+- `bin/nagent-campaign:1-N` — CLI argument parsing + subcommand dispatch (24cf16d)
+- `bin/helpers/nagent_campaign_lib.py:update()` — the 6-phase driver entry (24cf16d)
+- `bin/helpers/nagent_campaign_lib.py:merge_phase()` — collect results, update statuses (24cf16d)
+- `bin/helpers/nagent_campaign_lib.py:check_phase()` — run conditions (24cf16d)
+- `bin/helpers/nagent_campaign_lib.py:propose_phase()` — decompose big items (24cf16d)
+- `bin/helpers/nagent_campaign_lib.py:review_gate_phase()` — threshold-based accept (24cf16d)
+- `bin/helpers/nagent_campaign_lib.py:dispatch_phase()` — bounded worker launch (24cf16d)
+- `bin/helpers/nagent_campaign_lib.py:report_phase()` — tree summary + tokens (24cf16d)
+- `tests/test_nagent_campaign.py` — driver unit tests (24cf16d)
+- `tests/test_nagent_distill.py:merge_*` + `:graduate_*` — merge/graduate tests (f3ec090)
+- `README.md:450-500` — campaigns teaching section (24cf16d + c1d2cad)
+- `README.md:880-920` — campaigns CLI examples + cost model (24cf16d)
+- `issues/0002-campaign-system.md:139-164` — the 4 invariants (199a36b)
+- `issues/0002-campaign-system.md:159-191` — the 6 driver phases (199a36b)
+- `issues/0002-campaign-system.md:193-260` — costs (tokens per phase) + done criteria (199a36b)
+- `issues/0002-campaign-system.md:262-326` — open questions + future work (199a36b)
+**Decision candidate:** NEW Candidate 17 (HIGH). "Campaign-style plan-as-data for the conductor": add a `.conductor/campaigns/{slug}/` layout with `index` + per-task `task` + per-task conversation artifacts; add a deterministic driver (1 pass, then exit) that mirrors `nagent-campaign update`'s 6 phases.
The artifact format is markdown + frontmatter, not YAML (per the v3.1 §12 YAML avoidance observation). See `decisions.md` Candidate 17.
+**Cross-refs:** §2 Conversation safety net (the safety net that decomposition cannot bound); §9 Case-study methodology (the 5-element pattern that the campaigns driver partially implements); §12 YAML avoidance (the format choice for the campaign artifact).
+**Pattern history:** NEW in v3. v2.3 had the implicit "what to do next is the model's judgment" loop. v3 makes the plan a first-class artifact.
 
 ## §2 Conversation safety net
 **Source:** nagent `38d3d4f`, `6426a67` (`bin/nagent:1455-1687` + `:1840-1881` + `:2463-2677` + `:2819`, `bin/helpers/nagent_distill_lib.py:587-654` + `:851-862`, `config.example.json:3-7`, `prompts/checkpoint-conversation.md`, `README.md:653-668` + `:323-332`, `issues/0004-conversation-safety-net.md`, `tests/test_nagent_safety.py`, `tests/test_nagent_distill.py`)