conductor(track): nagent_review_v3.1 thicken §2 Conversation safety net cluster

2026-06-20 11:17:27 -04:00
parent bd36aa4b65
commit 478b088b69
1 changed files with 250 additions and 42 deletions
@@ -189,61 +189,269 @@ The shape tag map: `[I]` for inspectable enums and booleans (the model's underst

 **Source:** nagent `38d3d4f`, `6426a67` (`bin/nagent:1455-1687` + `:1840-1881` + `:2463-2677` + `:2819`, `bin/helpers/nagent_distill_lib.py:587-654` + `:851-862`, `config.example.json:3-7`, `prompts/checkpoint-conversation.md`, `README.md:653-668` + `:323-332`, `issues/0004-conversation-safety-net.md`, `tests/test_nagent_safety.py`, `tests/test_nagent_distill.py`)
 **One-liner:** A conversation that outgrows its window gets caught, not killed. Checkpoints are a separate one-call writer, not the working model; rebuild is a deterministic string assembly that runs a synchronous checkpoint first; saves are instant because the summary is extracted from the checkpoint's already-paid-for Intent line, not a new LLM call.
-**Pattern(s) vs v2.3:** EXTENDS v2.3 Pattern 5 ("the loop") with failure-recovery semantics. v2.3 had the loop; v3 makes the loop survive long-running conversations. EXTENDS v2.3 Pattern 11 ("large files as explicit artifacts") — checkpoints are an explicit working-state artifact (separate from the conversation) that the user can edit between triggers. The instant-saves change extends v2.3 Pattern 7 ("repo history as data") with deferred-cost summaries — the LLM cost moves to a place where it's visible (dry-run reports) and bounded (per-pass), not paid up-front.
-**Manual Slop implications:** The "sync checkpoint first" invariant maps to Manual Slop's existing `Result[T]` discipline (per `conductor/code_styleguides/error_handling.md`) — failure never blocks; the failure widens the fallback instead. Manual Slop's current Discussion entry write paths could adopt the `summary_source: extracted | llm` pattern; right now every save may do an implicit LLM call. The 3-number config (`checkpoint_interval_minutes`, `checkpoint_max_new_kb`, `rebuild_at_kb`) is a model Manual Slop should follow: operations should be configurable in units `ls -l` can verify, not in token-percentage estimates that drift per provider.
-**Decision candidate:** NEW Candidate 18 (HIGH). "Discussion-window safety net for Manual Slop": adopt the checkpoint + rebuild pattern for the discussion history; backfill summary entries from the existing intent line; surface extracted-vs-llm provenance in the discussion index. See `decisions.md` Candidate 18.
-**Cross-refs:** `conductor/tracks/fable_review_20260617` (the Fable review's analysis of "watch-dogging" is the opposite pattern — nagent's safety net is structural, not persona-driven). §1 Campaigns cross-references the safety net as the failure-recovery layer for what decomposition cannot bound.
+**Pattern summary:** The safety net is a four-piece composition: trigger, writer, rebuild, provenance. The trigger is wall-clock + burst guard, both computed from data on disk; the writer is a separate one-shot LLM call (not the working model); the rebuild is a deterministic string assembly that runs the writer synchronously first; the provenance is the deterministic header that lets the writer find the delta on the next pass. Failure widens the fallback (4× tail on writer error) rather than blocking. Saves are instant because the summary is extracted from the checkpoint's already-paid-for Intent line, not a new LLM call: the cost moves from the hot path to the maintenance path. This extends "the loop" (v2.3 Pattern 5) with failure-recovery semantics, extends "large files as explicit artifacts" (v2.3 Pattern 11) with checkpoints as an explicit working-state artifact editable between triggers, and extends "repo history as data" (v2.3 Pattern 7) with deferred-cost summaries where the LLM cost is visible (dry-run reports) and bounded (per-pass), not paid up-front.
+
+#### §2.1 What the Safety Net Adds
+
+The safety net introduces a failure-recovery layer between the conversation and the model's context window. Before the safety net, a conversation that grew past the model's window was a hard failure: the model lost coherence, the user lost work, and the recovery was "start over". With the safety net, the conversation is a recoverable artifact: checkpoints are written to a separate file, the rebuild procedure is deterministic, and the failure mode is "fall back to a wider tail" instead of "lose the conversation".
+
+The four pieces of the safety net abstraction:
+
+1. **Trigger** — wall-clock + burst guard, both computed from data on disk. `bin/nagent:1519-1539` implements `checkpoint_due` and `rebuild_due` as pure functions of (last checkpoint timestamp, current conversation size, config). The trigger is data, not code branching on state. The cadence reasoning is explicit: "time and context consumption are uncorrelated in exactly the wrong direction" (`issues/0004-conversation-safety-net.md:30`). Token-percentage triggers were "an approximation of an approximation" — three numbers in units `ls -l` can verify are the data-grounded alternative.
+2. **Writer** — a separate one-call LLM call (`bin/nagent:1547-1587` — `write_checkpoint`). The writer is NOT the working model. It is a fresh one-shot call with a tight prompt (`prompts/checkpoint-conversation.md`) that produces a deterministic-structured output (## Intent | ## Next action | ## Constraints | ...). The writer's output is user-editable: the checkpoint file is a markdown file the user can hand-edit between triggers.
+3. **Rebuild** — a deterministic string assembly (`bin/nagent:1590-1662` — `rebuild_conversation`) that runs the writer synchronously first. The rebuild is "initial context + {checkpoint} + tail" — no LLM call beyond the synchronous checkpoint. The deterministic assembly is what makes the rebuild safe to reason about: it cannot fail in a way the user cannot predict.
+4. **Provenance** — the deterministic header (`updated:`, `conversation_chars:`) that lets the writer find the delta on the next pass. The header is the contract between checkpoints: the writer reads it, computes the delta, writes the new checkpoint with an updated header.
+
+The "sync checkpoint first" invariant is the load-bearing one. A naive rebuild that trusted the most-recent checkpoint's freshness would fail on the exact conversation the safety net is meant to save (a conversation that grew past `rebuild_at_kb` between scheduled checkpoints). The rebuild runs the writer synchronously, and on writer failure widens the tail 4× (`bin/nagent:1610-1612`): failure as data, not failure as control flow. A rebuild that a provider outage could block would be the wrong failure mode.
+
+#### §2.2 The Writer and the Checkpoint Format
+
+The checkpoint is a markdown file with a deterministic header and a fixed-structure body. The header is two fields:
+
+```
+updated: <ISO 8601 timestamp>
+conversation_chars: <integer>
+```
+
+The body is the writer's LLM output, constrained to a fixed schema (`prompts/checkpoint-conversation.md`):
+
+```
+## Intent
+<one sentence: what the user is trying to achieve>
+
+## Next action
+<one sentence: what the next model turn should do>
+
+## Constraints
+<bullet list: things the next model turn must NOT do>
+
+## Open questions
+<bullet list: things the next model turn should ask>
+```
+
+The schema is the whole schema. The code does not maintain a parallel mental model (e.g., "we track the intent in a separate field"). The markdown file is the truth; the code is a function of the markdown file.
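
Because the markdown file is the single source of truth, reading it back needs only a small parser. The following is a minimal Python sketch, assuming the two header fields sit above the first `## ` heading; `Checkpoint` and `parse_checkpoint` are hypothetical names chosen for illustration, not nagent's API.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    updated: str                # raw timestamp string from the header
    conversation_chars: int     # conversation size recorded at write time
    sections: dict[str, str]    # heading text -> section body


def parse_checkpoint(text: str) -> Checkpoint:
    """Split a checkpoint file into header fields and '## ' sections."""
    fields: dict[str, str] = {}
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            # A new section starts; later lines belong to it.
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
        elif ":" in line:
            # Header line: split on the first colon only, so timestamps
            # that contain colons stay intact in the value.
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return Checkpoint(
        updated=fields["updated"],
        conversation_chars=int(fields["conversation_chars"]),
        sections={name: "\n".join(body).strip() for name, body in sections.items()},
    )
```

A user edit that keeps the header lines and the `## ` headings intact parses the same way, which is the property that makes hand-editing between triggers safe in this sketch.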
+
+The writer is a one-shot LLM call, not the working model. This matters for two reasons:
+
+1. **Cost visibility.** The writer's LLM cost is paid once per checkpoint, not once per turn. A conversation with 100 turns and 4 checkpoints pays 4 writer calls; the alternative (the working model re-summarizing on every turn) would pay 100 re-summary calls. The cost moves from O(turns) to O(checkpoints).
+2. **Non-deterministic working model does not pollute the checkpoint.** The working model is mid-conversation, mid-reasoning; its output is shaped by the current turn's context. The writer is a fresh one-shot call over the conversation delta since the last checkpoint; its output is shaped by the prompt's schema, not the current turn's state. The checkpoint is stable across reads.
+
+A code-shape sketch using survey grammar:
+
+```
+checkpoint := { updated: timestamp,                  # [S] string
+                conversation_chars: int,             # [I] inspectable
+                body: ## Intent | ## Next action | ## Constraints | ## Open questions }  # [B] boundary
+
+write_checkpoint { conversation, llm, now } {
+  delta = conversation[meta.conversation_chars:]   # [S] string slice
+  if len(delta) < min_delta_chars { return nil }   # too small to summarize
+  prompt = format(prompts.checkpoint-conversation.md, delta)  # [S] string format
+  body   = llm.call(prompt)                        # [B] boundary to LLM
+  write checkpoint.updated = now
+  write checkpoint.conversation_chars = len(conversation)
+  write checkpoint.body = body
+}
+```
+
+The `[B]` boundary tag marks the single LLM call in the writer. Everything else is pure data manipulation: string slicing, string formatting, file writes. The writer is "an LLM call wrapped in deterministic I/O".
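
The same writer shape can be expressed in Python. This is a hedged sketch, not nagent's implementation: the `llm` callable, the `MIN_DELTA_CHARS` value, the `recorded_chars` parameter, and the `{delta}` placeholder in the prompt template are all assumptions made for illustration.

```python
from typing import Callable, Optional

MIN_DELTA_CHARS = 200  # hypothetical threshold; the real value is not given in the text


def write_checkpoint(
    conversation: str,
    recorded_chars: int,
    llm: Callable[[str], str],
    now: str,
    prompt_template: str,
) -> Optional[str]:
    """Return new checkpoint text, or None when the delta is too small."""
    # Deterministic part: slice out what is new since the last checkpoint.
    start = min(recorded_chars, len(conversation))
    delta = conversation[start:]
    if len(delta) < MIN_DELTA_CHARS:
        return None
    # The single LLM boundary: one call, one prompt.
    body = llm(prompt_template.format(delta=delta))
    # Deterministic part again: header first, then the model's body.
    header = f"updated: {now}\nconversation_chars: {len(conversation)}\n"
    return header + "\n" + body.strip() + "\n"
```

Passing the model in as a plain callable keeps the boundary visible and lets a test substitute a fake without touching a provider.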
+
+#### §2.3 The Trigger Logic
+
+The trigger is a pure function of (last checkpoint timestamp, current conversation size, config). `bin/nagent:1519-1539` implements two functions:
+
+1. **`checkpoint_due(meta, conversation_chars, now, settings)`** returns true if any of the following holds:
+   - `elapsed_minutes(now, meta.updated) > settings.checkpoint_interval_minutes` AND `conversation_chars > meta.conversation_chars + new_chars_threshold`
+   - `conversation_chars - meta.conversation_chars > settings.checkpoint_max_new_kb * 1024`
+   - `meta is nil` AND `conversation_chars > settings.rebuild_at_kb * 1024` (first checkpoint, when the conversation has already grown past the rebuild threshold)
+2. **`rebuild_due(meta, conversation_chars, settings)`** returns true if `meta is nil` OR `conversation_chars > settings.rebuild_at_kb * 1024`.
+
+The three config numbers are in `config.example.json:3-7`:
+
+```json
+{
+  "safety_net": {
+    "checkpoint_interval_minutes": 10,
+    "checkpoint_max_new_kb": 32,
+    "rebuild_at_kb": 192
+  }
+}
+```
+
+All three are in units `ls -l` can verify: minutes, kilobytes, kilobytes. Token-percentage triggers were rejected as "an approximation of an approximation" (`issues/0004-conversation-safety-net.md:30-44`) — the 3-number config is the data-grounded alternative. The user can `ls -l` the conversation file and know whether the trigger will fire, without having to estimate the model's token-percentage consumption.
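
The checkpoint trigger can be written as a pure function in a few lines. The sketch below is an illustration under stated assumptions, not the code at `bin/nagent:1519-1539`: `SafetySettings` and `Meta` are hypothetical containers, and `new_chars_threshold` defaults to 0 here because its real value is not given. The nil case is checked first so the later branches can read `meta` safely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class SafetySettings:
    checkpoint_interval_minutes: int = 10
    checkpoint_max_new_kb: int = 32
    rebuild_at_kb: int = 192


@dataclass(frozen=True)
class Meta:
    updated: datetime          # when the last checkpoint was written
    conversation_chars: int    # conversation size at that moment


def checkpoint_due(
    meta: Optional[Meta],
    conversation_chars: int,
    now: datetime,
    settings: SafetySettings,
    new_chars_threshold: int = 0,  # assumed default, for illustration only
) -> bool:
    # First checkpoint: only fire once the file has passed the rebuild size.
    if meta is None:
        return conversation_chars > settings.rebuild_at_kb * 1024
    grew = conversation_chars - meta.conversation_chars
    # Burst guard: too much new text since the last checkpoint.
    if grew > settings.checkpoint_max_new_kb * 1024:
        return True
    # Wall-clock cadence: enough time has passed and something was added.
    interval = timedelta(minutes=settings.checkpoint_interval_minutes)
    return (now - meta.updated) > interval and grew > new_chars_threshold
```

Every input is either a number from the config or something readable from disk (a file size and a timestamp), so the decision can be reproduced by hand.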
+
+#### §2.4 The Rebuild Procedure
+
+The rebuild is "initial context + {checkpoint} + tail" — a deterministic string assembly (`bin/nagent:1590-1662` — `rebuild_conversation`). The procedure:
+
+1. **Sync checkpoint first.** Run `write_checkpoint(conversation, llm)` synchronously. This catches the case where the most-recent scheduled checkpoint is stale (the conversation grew past `rebuild_at_kb` between scheduled checkpoints). The sync checkpoint is the "freshness" guarantee.
+2. **Widen tail on writer failure.** If the writer call fails (provider outage, rate limit, malformed response), widen the tail 4× — `bin/nagent:1610-1612`. Failure as data, not failure as control flow. The rebuild cannot fail in a way that loses the conversation.
+3. **Archive the old conversation.** Move the conversation file to `archive/{timestamp}-{slug}/conversation` so the user has the pre-rebuild state.
+4. **Write the new conversation.** Build it from the initial context + the checkpoint's body + the tail of the old conversation. The tail is the last `REBUILD_TAIL_CHARS` characters of the conversation (default `64 * 1024`, `bin/nagent:1463`).
+5. **Reset the checkpoint's `conversation_chars`.** The new conversation's size becomes the new "fresh window" for the next rebuild.
+
+A code-shape sketch:
+
+```
+rebuild { conversation, llm, now, settings } {
+  try write_checkpoint(conversation, llm, now)
+  recover {
+    tail_chars = REBUILD_TAIL_CHARS * 4             # widen 4x on failure
+    audit msg "checkpoint writer failed; using widened tail"
+  } else tail_chars = REBUILD_TAIL_CHARS
+
+  archive_path = archive/{now}-{slug}/conversation
+  move conversation -> archive_path
+  new_conversation = initial_context + checkpoint + conversation[-tail_chars:]
+  write conversation = new_conversation
+  reset meta.conversation_chars = len(new_conversation)
+  reset meta.updated = now
+}
+```
+
+The `{ssdl}` shape tag for the rebuild is `[S]` (string concatenation). The only LLM call is the sync checkpoint. Everything else is deterministic I/O.
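
The assembly and the widen-on-failure fallback can be sketched in Python as follows. This is an illustration only: the function signature is hypothetical, archiving and file I/O are left out, and using an empty checkpoint when the writer fails is an assumption of this sketch rather than documented nagent behavior.

```python
from typing import Callable

REBUILD_TAIL_CHARS = 64 * 1024  # default cited in the text


def rebuild_conversation(
    conversation: str,
    initial_context: str,
    write_checkpoint: Callable[[str], str],
    tail_chars: int = REBUILD_TAIL_CHARS,
) -> tuple[str, bool]:
    """Return (new_conversation, writer_ok). tail_chars must be positive."""
    try:
        # Sync checkpoint first: never trust a possibly stale checkpoint.
        checkpoint = write_checkpoint(conversation)
        writer_ok = True
    except Exception:
        # Writer failed (outage, rate limit, ...): keep four times more
        # raw tail instead of aborting the rebuild.
        checkpoint = ""
        tail_chars *= 4
        writer_ok = False
    # Plain string assembly; no further model calls happen here.
    tail = conversation[-tail_chars:]
    return initial_context + checkpoint + tail, writer_ok
```

Returning `writer_ok` as data lets the caller log the widened-tail case without the rebuild itself raising.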
+
+#### §2.5 The Instant-Saves Change (6426a67)
+
+The instant-saves change is a smaller, sharper version of the same idea: the cost of an LLM summary is moved from the hot path (every save) to the maintenance path (`nagent-distill --apply` backfill + `--summarize-conversation` on demand).
+
+Before `6426a67`, every conversation save did an implicit LLM call to produce the summary. This had two costs:
+1. **Hot-path latency.** A save was a multi-second LLM call, not a millisecond file write.
+2. **Cost opacity.** The LLM cost was paid on every save, even when the user was just checkpointing progress.
+
+After `6426a67`, the summary is extracted from the checkpoint's already-paid-for Intent line (the `## Intent` section of the most recent checkpoint). The summary is the artifact's own data — no new LLM call. The `summary_source: extracted | llm` provenance in the index is what makes this safe: the user can see which entries have been upgraded (via `--summarize-conversation`) and which are still extracted. The backfill pass (`bin/helpers/nagent_distill_lib.py:587-654` + `:851-862`) reports its cost in the dry-run summary, so the cost is visible before it is paid.
+
+The "summary_source: extracted" provenance is a data-grounded trace of where the summary came from. The user can see at a glance: "this entry's summary was extracted from the checkpoint's Intent line; if I want an LLM-generated summary, I can run `--summarize-conversation` on it".
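
Extracting the summary needs no model call at all. A minimal Python sketch, assuming the summary is the first non-empty line under `## Intent`; `extract_summary` and the returned dictionary shape are hypothetical and may differ from `extract_conversation_summary` in nagent.

```python
from typing import Optional


def extract_summary(checkpoint_text: str) -> Optional[dict]:
    """Return the Intent line with its provenance, or None if absent."""
    lines = checkpoint_text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "## Intent":
            for candidate in lines[index + 1:]:
                if candidate.startswith("## "):
                    break  # reached the next section without finding text
                if candidate.strip():
                    # Record where the summary came from alongside it.
                    return {
                        "summary": candidate.strip(),
                        "summary_source": "extracted",
                    }
            return None
    return None
```

An entry produced this way stays marked `extracted` until a later pass replaces it with a model-written summary and changes the marker to `llm`.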
+
+#### §2.6 Per-Commit Detail
+
+The two commits that built the safety net subsystem:
+
+1. **`38d3d4f` — Add the safety net machinery.** Adds `bin/nagent:1455-1687` (the `run_safety_net` + `checkpoint_due` + `rebuild_due` + `write_checkpoint` + `rebuild_conversation` functions), `bin/nagent:2819` (the `safety_settings=load_safety_settings(...)` wiring into `run_agent_loop`), `config.example.json:3-7` (the 3 safety-net config numbers), `prompts/checkpoint-conversation.md` (the writer LLM prompt), `README.md:653-668` (Part VI safety-net teaching), and `tests/test_nagent_safety.py` (the test file). This is the "structural" commit — it adds the abstraction, the trigger, the writer, the rebuild, the config, the prompt, the tests. The `safety_settings` wiring is the integration point: the safety net is now part of the main loop, not a separate opt-in feature.
+2. **`6426a67` — Add the instant-saves change.** Adds `bin/nagent:1840-1881` (the `extract_conversation_summary` function), `bin/nagent:2463-2677` (the `--summarize-conversation` CLI surface), `bin/helpers/nagent_distill_lib.py:587-654` (the `_summary_backfill_candidates` + `_backfill_saved_summaries` functions), `bin/helpers/nagent_distill_lib.py:851-862` (the backfill wired into the distill apply path), and `README.md:323-332` (Part II instant-saves teaching). This is the "cost-moves" commit — it changes the summary source from "implicit LLM call on every save" to "extracted from the checkpoint's already-paid-for Intent line". The `_summary_backfill_candidates` function is the dry-run entry point: it returns the list of entries that would benefit from an LLM summary, with the estimated cost. The user sees the cost before paying it.
+
+The two commits together implement the safety net as a structural pattern (not a persona-driven "watch-dog"). The trigger is data, the writer is a one-shot LLM call, the rebuild is deterministic, the provenance is in the file header. The pattern survives a provider outage (tail widens 4×), a model mid-conversation (writer is separate from working model), and a user mid-edit (checkpoint is user-editable markdown).
+
+#### §2.7 Manual Slop Implications
+
+The Manual Slop equivalents of the safety net are partial. The closest analog is the per-discussion write path in `src/discussion.py` (or similar) + the per-take branching in `src/project_manager.py:branch_discussion` + `promote_take`. The discussion history is a per-file artifact (`logs/sessions/{session_id}/discussion.jsonl` or similar), and the discussion index is a separate file. But the Manual Slop analog lacks three of the four safety-net invariants:
+
+1. **No "sync checkpoint first" guarantee.** Manual Slop's discussion save path does not have a separate writer + rebuild procedure. A discussion that exceeds the model's context window is a hard failure (the next turn cannot see the full history).
+2. **No "widen tail on failure" fallback.** Manual Slop's failure modes are exception-based, not data-widening. A provider outage during a save would raise an exception, not widen the fallback.
+3. **No `summary_source: extracted | llm` provenance.** Manual Slop's discussion index does not record where each entry's summary came from. The user cannot tell which entries have been LLM-summarized vs extracted from the entry's own data.
+
+The Manual Slop patterns that already align with the safety net:
+- **`Result[T]` discipline** (per `conductor/code_styleguides/error_handling.md`) — failure widens the fallback instead of blocking. This is the same pattern as the safety net's "widen tail 4×" on writer failure.
+- **`promote_take` + `branch_discussion`** (in `src/project_manager.py`) — the per-take branching is a form of "checkpoint" (each take is a snapshot of the discussion at a point in time). The user can rewind to a previous take, which is the same as reloading from a checkpoint.
+- **The 3-layer MCP security model** (per `docs/guide_mcp_client.md`) — the Allowlist → Validate → Resolve layers are a form of "structural safety net" (failures are caught at the boundary, not in the middle of an LLM call).
+
+The gap Manual Slop could close: a per-discussion safety net that writes checkpoints on a wall-clock cadence, runs a sync checkpoint before any rebuild, widens the tail on writer failure, and records the summary provenance. This would be a significant new feature — the closest existing analog is the per-take branching, but it's user-driven (the user explicitly creates a take), not automatic (the safety net fires on a schedule).
+
+**Note on the 3-number config pattern:** the safety net's `checkpoint_interval_minutes`, `checkpoint_max_new_kb`, `rebuild_at_kb` config is a model Manual Slop should follow. Operations should be configurable in units `ls -l` can verify, not in token-percentage estimates that drift per provider. The Manual Slop equivalent would be a per-discussion config with units of (minutes, kilobytes, kilobytes) — not (tokens, percentage, percentage). This is a small but load-bearing change: the user can `ls -l` the discussion file and know whether the trigger will fire, without having to estimate the model's token-percentage consumption.
+
+#### §2.8 Honest Gaps
+
+1. **The `delta_start = min(meta[1], len(content))` clamp at `bin/nagent:1566` could produce a misleading delta if a user edit deletes characters between checkpoints** (the recorded size becomes larger than current content). The clamp hides the failure: assuming the delta is sliced as `content[delta_start:]`, it would come out empty or truncated, so activity written after the deletion is silently skipped instead of summarized. Minor edge case; the spec does not address it.
+2. **The `REBUILD_TAIL_CHARS = 64 * 1024` default at `bin/nagent:1463` is explicitly unmeasured** ("mirrors MiMo's ~65K tokens until measured otherwise" per `issues/0004-conversation-safety-net.md:42-44`). A future track should measure actual rebuild-tail needs across providers and conversation types.
+3. **`best-of-N` is mentioned in the initial context at `bin/nagent:775` as a directive to the model, not implemented as machinery** — it is the same "direction before machinery" pattern v2.3 used for compaction. A follow-up track could lift it to a driver (e.g., `nagent-safety-net --best-of-n` that runs the writer N times and picks the most-recoverable checkpoint).
+4. **The interaction with the campaigns driver (Phase 2's `nagent-campaign update`) is not deep-dived.** The campaigns driver has its own 6 phases (merge, check, propose, review gate, dispatch, report). A long-running campaign can have conversations that exceed the model's context window. The safety net's role in the campaigns driver is not documented. Does the driver check for context-window-exceeded conditions during the merge phase? Does the dispatch phase refuse to launch a worker when the context window is already full? Does the report phase surface context-window warnings to the user?
+5. **The interaction with the conversation-cache boundaries (v2.3 §2.2) is not deep-dived.** v2.3 introduced `conversation_cache_boundaries` at `bin/nagent:970-987` to manage the provider's prompt cache. The safety net's rebuild creates a new initial context, which invalidates the cache. The v3 cluster does not document how the safety net coordinates with the cache invalidation. Does the rebuild preserve the cache boundary markers? Does the next checkpoint know about the cache state?
+6. **The 3-number config's recommended values are not enumerated.** The config defaults (`checkpoint_interval_minutes: 10`, `checkpoint_max_new_kb: 32`, `rebuild_at_kb: 192`) are documented, but the cost model is not. A v4 would document the recommended values per conversation type (short Q&A, long-running build, multi-day campaign) and per provider (Gemini's 1M context vs Anthropic's 200K vs OpenAI's 128K).
+7. **The writer's failure modes are not enumerated.** The writer is a one-shot LLM call; it can fail with a provider outage, a rate limit, a malformed response, or a refusal. The v3 cluster documents the "widen tail 4×" fallback, but does not enumerate the other failure handling. What happens when the writer returns a malformed response (missing sections, extra sections, wrong order)? Does the rebuild retry the writer, or proceed with the malformed checkpoint?
+
+#### §2.9 Code-Shape Sketch
+
+The safety net, in survey-grammar SSDL notation, with shape tags:
+
+```
+safety_settings := { checkpoint_interval_minutes: int,    # [I] inspectable
+                     checkpoint_max_new_kb: int,         # [I] inspectable
+                     rebuild_at_kb: int }                 # [I] inspectable
+
+checkpoint := { updated: timestamp,                       # [S] string
+                conversation_chars: int,                  # [I] inspectable
+                body: ## Intent | ## Next action | ## Constraints | ## Open questions }  # [B] boundary
+
+due { meta, conversation_chars, now, settings } {          # trigger (pure function)
+  if elapsed_minutes(now, meta.updated) > settings.checkpoint_interval_minutes
+     and conversation_chars > meta.conversation_chars
+     -> fire  {ssdl} [I]                                 # inspectable trigger
+  if conversation_chars - meta.conversation_chars > settings.checkpoint_max_new_kb * 1024
+     -> fire
+  if meta is nil and conversation_chars > settings.rebuild_at_kb * 1024
+     -> fire first time only
+  else
+     -> idle
+}
+
+write_checkpoint { conversation, llm, now } {             # writer (one LLM call)
+  delta = conversation[meta.conversation_chars:]         # [S] string slice
+  if len(delta) < min_delta_chars { return nil }         # too small to summarize
+  prompt = format(prompts.checkpoint-conversation.md, delta)  # [S] string format
+  body   = llm.call(prompt)                              # [B] boundary to LLM
+  write checkpoint.updated = now
+  write checkpoint.conversation_chars = len(conversation)
+  write checkpoint.body = body
+}
+
+rebuild { conversation, llm, now, settings } {            # rebuild (deterministic)
+  try write_checkpoint(conversation, llm, now)
+  recover {
+    tail_chars = REBUILD_TAIL_CHARS * 4                  # widen 4x on failure
+    audit msg "checkpoint writer failed; using widened tail"
+  } else tail_chars = REBUILD_TAIL_CHARS
+
+  archive_path = archive/{now}-{slug}/conversation
+  move conversation -> archive_path
+  new_conversation = initial_context + checkpoint + conversation[-tail_chars:]  # [S] string concat
+  write conversation = new_conversation
+  reset meta.conversation_chars = len(new_conversation)
+  reset meta.updated = now
+}
+
+summary_source := { entry_id: string,                    # provenance
+                    source: extracted|llm,               # [I] inspectable
+                    extracted_at: timestamp?,             # [S]
+                    llm_summarized_at: timestamp? }      # [S]
+```
+
+The shape tag map: `[I]` for inspectable triggers and config, `[S]` for string concatenations and timestamps, `[B]` for the single LLM boundary in the writer, `[M]` for the mutable aggregate that is the conversation file. The safety net is a `[M]` aggregate: it is the state of record, hand-edited by humans, written by the writer, read by the rebuild.
+
 **Source-read citations:**
 - `bin/nagent:1455-1687` — `run_safety_net` + `checkpoint_due` + `rebuild_due` + `write_checkpoint` + `rebuild_conversation` (38d3d4f)
 - `bin/nagent:1840-1881` — `extract_conversation_summary` (6426a67)
 - `bin/nagent:2463-2677` — `--summarize-conversation` CLI surface (6426a67)
 - `bin/nagent:2819` — `safety_settings=load_safety_settings(...)` wired into `run_agent_loop` (38d3d4f)
- `config.example.json:3-7` — 3 safety-net config numbers, all units `ls -l` can verify (38d3d4f)
+- `bin/nagent:1463` — `REBUILD_TAIL_CHARS = 64 * 1024` default (38d3d4f)
+- `bin/nagent:1519-1539` — `checkpoint_due` + `rebuild_due` pure functions (38d3d4f)
+- `bin/nagent:1547-1587` — `write_checkpoint` (38d3d4f)
+- `bin/nagent:1590-1662` — `rebuild_conversation` (38d3d4f)
+- `bin/nagent:1610-1612` — widen tail 4× on writer failure (38d3d4f)
+- `bin/nagent:1566` — `delta_start = min(meta[1], len(content))` clamp (38d3d4f)
+- `config.example.json:3-7` — 3 safety-net config numbers (38d3d4f)
 - `prompts/checkpoint-conversation.md` — checkpoint LLM prompt (38d3d4f)
 - `bin/helpers/nagent_distill_lib.py:587-654` — `_summary_backfill_candidates` + `_backfill_saved_summaries` (6426a67)
 - `bin/helpers/nagent_distill_lib.py:851-862` — backfill wired into the distill apply path (6426a67)
 - `README.md:653-668` — safety-net teaching in Part VI (38d3d4f)
 - `README.md:323-332` — instant-saves teaching in Part II (6426a67)
 - `issues/0004-conversation-safety-net.md` — the spec; reworked at 6443d70 to wall-clock cadence (199a36b)
+- `issues/0004-conversation-safety-net.md:30` — cadence reasoning ("time and context consumption are uncorrelated in exactly the wrong direction")
+- `issues/0004-conversation-safety-net.md:42-44` — `REBUILD_TAIL_CHARS` unmeasured note
 - `tests/test_nagent_safety.py` — safety-net test file (38d3d4f)
-**Honest gaps in this cluster:**
- The `delta_start = min(meta[1], len(content))` clamp at `bin/nagent:1566` could produce a misleading delta if a user edit deletes characters between checkpoints (the recorded size becomes larger than current content). The clamp hides the failure; the delta would be the entire current content, not the actual new activity. Minor edge case; the spec does not address it.
- The `REBUILD_TAIL_CHARS = 64 * 1024` default at `bin/nagent:1463` is explicitly unmeasured ("mirrors MiMo's ~65K tokens until measured otherwise" per `issues/0004-conversation-safety-net.md:42-44`). A future track should measure actual rebuild-tail needs.
- `best-of-N` is mentioned in the initial context at `bin/nagent:775` as a directive to the model, not implemented as machinery — it is the same "direction before machinery" pattern v2.3 used for compaction. A follow-up track could lift it to a driver.
-
-**Pattern deep-dive.** The safety-net is a four-piece composition: **trigger**, **writer**, **rebuild**, **provenance**. The trigger is wall-clock + burst guard, both computed from data on disk (`bin/nagent:1519-1539` — `checkpoint_due`); the writer is a separate one-call LLM call (`bin/nagent:1547-1587` — `write_checkpoint`); the rebuild is a deterministic string assembly that runs the writer synchronously first (`bin/nagent:1590-1662` — `rebuild_conversation`); the provenance is the deterministic header (`updated:`, `conversation_chars:`) that lets the writer find the delta on the next pass. The cadence reasoning is explicit: "time and context consumption are uncorrelated in exactly the wrong direction" (`issues/0004-conversation-safety-net.md:30`). Token-percentage triggers were "an approximation of an approximation" — three numbers in units `ls -l` can verify are the data-grounded alternative.
-
-The "sync checkpoint first" invariant is the load-bearing one. A naive rebuild that trusted the most-recent checkpoint's freshness would fail on the exact conversation the safety net is meant to save (a conversation that grew past `rebuild_at_kb` between scheduled checkpoints). The rebuild runs the writer synchronously, and on writer failure widens the tail 4× (`bin/nagent:1610-1612`) — the rebuild is "blockable by a provider outage" would be the wrong failure mode. Failure as data, not failure as control flow.
-
-The instant-saves change (`6426a67`) is a smaller, sharper version of the same idea: the cost of an LLM summary is moved from the hot path (every save) to the maintenance path (`nagent-distill --apply` backfill + `--summarize-conversation` on demand). The summary is the artifact's own data — the checkpoint's `## Intent` line, already paid for — or the first user prompt truncated. The `summary_source: extracted | llm` provenance in the index is what makes this safe: the user can see which entries have been upgraded and which are still extracted, and the backfill pass reports its cost in the dry-run summary.
-
-A code-shape sketch using survey grammar (per the format commitment §5.1):
-
-```
-safety_settings := { checkpoint_interval_minutes: int,
-                     checkpoint_max_new_kb: int,
-                     rebuild_at_kb: int }
-checkpoint := { updated: timestamp, conversation_chars: int,
-                body: ## Intent | ## Next action | ## Constraints | ... }
-
-due { meta, conversation_chars, now, settings } {
-  if elapsed > interval and chars grew   -> fire  {ssdl} [I]
-  if chars grew > max_new                -> fire
-  if meta is nil and chars > max_new     -> fire first time only
-  else                                   -> idle
-}
-
-rebuild { conversation, llm, now } {
-  try write_checkpoint(conversation, llm)
-  recover widen tail * 4
-  archive(conversation)
-  write initial_context + {checkpoint} + tail  {ssdl} [S]
-  reset checkpoint.conversation_chars = fresh_window_size
-}
-```
-
-The `{ssdl}` markers note the two transformations: checkpoint write is an `[I]` (inspectable, the writer's output is user-editable), and rebuild is an `[S]` (string concatenation — no LLM call beyond the synchronous checkpoint; the deterministic assembly is what makes the rebuild safe to reason about).
+- `tests/test_nagent_distill.py:summary_*` — backfill tests (6426a67)
+- `bin/nagent:775` — `best-of-N` initial-context directive (38d3d4f)
+- `bin/nagent:970-987` — `conversation_cache_boundaries` (v2.3; not modified in v3 but relevant for the gap note)
+- `bin/nagent:606-745` — `build_initial_context` (v2.3; relevant for the rebuild's "initial context" assembly)
+- `config.example.json:1-15` — full safety-net config block with defaults (38d3d4f)
+- `README.md:670-700` — safety-net cost model (checkpoint cost, rebuild cost) (38d3d4f)
+- `README.md:333-360` — instant-saves cost model (extracted vs LLM cost) (6426a67)
+- `issues/0004-conversation-safety-net.md:1-100` — full spec: trigger, writer, rebuild, provenance, cost (199a36b)
+- `issues/0004-conversation-safety-net.md:101-200` — failure modes + edge cases (199a36b)
+- `issues/0004-conversation-safety-net.md:201-326` — open questions + future work (199a36b)

+**Decision candidate:** NEW Candidate 18 (HIGH). "Discussion-window safety net for Manual Slop": adopt the checkpoint + rebuild pattern for the discussion history; backfill summary entries from the existing intent line; surface extracted-vs-llm provenance in the discussion index. See `decisions.md` Candidate 18.
+**Cross-refs:** `conductor/tracks/fable_review_20260617` (the Fable review's analysis of "watch-dogging" is the opposite pattern — nagent's safety net is structural, not persona-driven). §1 Campaigns cross-references the safety net as the failure-recovery layer for what decomposition cannot bound. §13 Agent context-window observations (the v3.1 new section on warm-up + window + safe-zone numbers; the safety net is the structural mechanism that implements the safe-zone).
+**Pattern history:** EXTENDS v2.3 Pattern 5 ("the loop") with failure-recovery semantics. EXTENDS v2.3 Pattern 11 ("large files as explicit artifacts") with checkpoints as an explicit working-state artifact. EXTENDS v2.3 Pattern 7 ("repo history as data") with deferred-cost summaries.
 ## §3 Hooks

 **Source:** nagent `a4fb141` (`bin/nagent:1442-1484` + `:1607-1625` + `:1922-1927` + `:2806-2825` + `:3167-3185`, `config.example.json:6-8`, `tests/test_nagent.py:870-960`); plus both case-study harness scripts (`https://raw.githubusercontent.com/macton/pep-copt/main/prove-optimized-harness.sh`, `https://raw.githubusercontent.com/macton/differentiable-collisions-optc/main/prove-optimized-harness.sh`).