conductor(track): nagent_review_v3.1 thicken §2 Conversation safety net cluster

2026-06-20 11:17:27 -04:00
parent bd36aa4b65
commit 478b088b69
@@ -189,61 +189,269 @@ The shape tag map: `[I]` for inspectable enums and booleans (the model's underst
**Source:** nagent `38d3d4f`, `6426a67` (`bin/nagent:1455-1687` + `:1840-1881` + `:2463-2677` + `:2819`, `bin/helpers/nagent_distill_lib.py:587-654` + `:851-862`, `config.example.json:3-7`, `prompts/checkpoint-conversation.md`, `README.md:653-668` + `:323-332`, `issues/0004-conversation-safety-net.md`, `tests/test_nagent_safety.py`, `tests/test_nagent_distill.py`)
**One-liner:** A conversation that outgrows its window gets caught, not killed. Checkpoints come from a separate one-shot writer, not the working model; rebuild is a deterministic string assembly that runs a synchronous checkpoint first; saves are instant because the summary is extracted from the checkpoint's already-paid-for Intent line, not a new LLM call.
**Pattern(s) vs v2.3:** EXTENDS v2.3 Pattern 5 ("the loop") with failure-recovery semantics. v2.3 had the loop; v3 makes the loop survive long-running conversations. EXTENDS v2.3 Pattern 11 ("large files as explicit artifacts"): checkpoints are an explicit working-state artifact (separate from the conversation) that the user can edit between triggers. The instant-saves change extends v2.3 Pattern 7 ("repo history as data") with deferred-cost summaries: the LLM cost moves to a place where it is visible (dry-run reports) and bounded (per-pass), not paid up-front.
**Manual Slop implications:** The "sync checkpoint first" invariant maps to Manual Slop's existing `Result[T]` discipline (per `conductor/code_styleguides/error_handling.md`) — failure never blocks; the failure widens the fallback instead. Manual Slop's current Discussion entry write paths could adopt the `summary_source: extracted | llm` pattern; right now every save may do an implicit LLM call. The 3-number config (`checkpoint_interval_minutes`, `checkpoint_max_new_kb`, `rebuild_at_kb`) is a model Manual Slop should follow: operations should be configurable in units `ls -l` can verify, not in token-percentage estimates that drift per provider.
**Decision candidate:** NEW Candidate 18 (HIGH). "Discussion-window safety net for Manual Slop": adopt the checkpoint + rebuild pattern for the discussion history; backfill summary entries from the existing intent line; surface extracted-vs-llm provenance in the discussion index. See `decisions.md` Candidate 18.
**Cross-refs:** `conductor/tracks/fable_review_20260617` (the Fable review's analysis of "watch-dogging" is the opposite pattern — nagent's safety net is structural, not persona-driven). §1 Campaigns cross-references the safety net as the failure-recovery layer for what decomposition cannot bound.
**Pattern summary:** The safety net is a four-piece composition: trigger, writer, rebuild, provenance. The trigger is wall-clock + burst guard, both computed from data on disk; the writer is a separate one-shot LLM call (not the working model); the rebuild is a deterministic string assembly that runs the writer synchronously first; the provenance is the deterministic header that lets the writer find the delta on the next pass. Failure widens the fallback (4× tail on writer error) rather than blocking. Saves are instant because the summary is extracted from the checkpoint's already-paid-for Intent line, not a new LLM call, so the cost moves from the hot path to the maintenance path. This extends the "the loop" principle (v2.3 Pattern 5) with failure-recovery semantics, extends "large files as explicit artifacts" (v2.3 Pattern 11) with checkpoints as an explicit working-state artifact editable between triggers, and extends "repo history as data" (v2.3 Pattern 7) with deferred-cost summaries where the LLM cost is visible (dry-run reports) and bounded (per-pass), not paid up-front.
#### §2.1 What the Safety Net Adds
The safety net introduces a failure-recovery layer between the conversation and the model's context window. Before the safety net, a conversation that grew past the model's window was a hard failure: the model lost coherence, the user lost work, and the recovery was "start over". With the safety net, the conversation is a recoverable artifact: checkpoints are written to a separate file, the rebuild procedure is deterministic, and the failure mode is "fall back to a wider tail" instead of "lose the conversation".
The four pieces of the safety net abstraction:
1. **Trigger** — wall-clock + burst guard, both computed from data on disk. `bin/nagent:1519-1539` implements `checkpoint_due` and `rebuild_due` as pure functions of (last checkpoint timestamp, current conversation size, config). The trigger is data, not code branching on state. The cadence reasoning is explicit: "time and context consumption are uncorrelated in exactly the wrong direction" (`issues/0004-conversation-safety-net.md:30`). Token-percentage triggers were "an approximation of an approximation" — three numbers in units `ls -l` can verify are the data-grounded alternative.
2. **Writer** — a separate one-shot LLM call (`bin/nagent:1547-1587`, `write_checkpoint`). The writer is NOT the working model. It is a fresh call with a tight prompt (`prompts/checkpoint-conversation.md`) that produces output with a fixed structure (## Intent | ## Next action | ## Constraints | ...). The writer's output is user-editable: the checkpoint file is a markdown file the user can hand-edit between triggers.
3. **Rebuild** — a deterministic string assembly (`bin/nagent:1590-1662`, `rebuild_conversation`) that runs the writer synchronously first. The rebuild is "initial context + {checkpoint} + tail", with no LLM call beyond the synchronous checkpoint. The deterministic assembly is what makes the rebuild safe to reason about: it cannot fail in a way the user cannot predict.
4. **Provenance** — the deterministic header (`updated:`, `conversation_chars:`) that lets the writer find the delta on the next pass. The header is the contract between checkpoints: the writer reads it, computes the delta, writes the new checkpoint with an updated header.
The "sync checkpoint first" invariant is the load-bearing one. A naive rebuild that trusted the most-recent checkpoint's freshness would fail on the exact conversation the safety net is meant to save (a conversation that grew past `rebuild_at_kb` between scheduled checkpoints). The rebuild runs the writer synchronously, and on writer failure widens the tail 4× (`bin/nagent:1610-1612`). This is failure as data, not failure as control flow: a rebuild that is "blockable by a provider outage" would be the wrong failure mode.
#### §2.2 The Writer and the Checkpoint Format
The checkpoint is a markdown file with a deterministic header and a fixed-structure body. The header is two fields:
```
updated: <ISO 8601 timestamp>
conversation_chars: <integer>
```
The body is the writer's LLM output, constrained to a fixed schema (`prompts/checkpoint-conversation.md`):
```
## Intent
<one sentence: what the user is trying to achieve>
## Next action
<one sentence: what the next model turn should do>
## Constraints
<bullet list: things the next model turn must NOT do>
## Open questions
<bullet list: things the next model turn should ask>
```
The schema is the whole schema. The code does not maintain a parallel mental model (e.g., "we track the intent in a separate field"). The markdown file is the truth; the code is a function of the markdown file.
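To make "the code is a function of the markdown file" concrete, here is a minimal sketch of a parser for the format above. It assumes the two header fields sit on their own lines before the first `## ` heading; the exact on-disk layout (blank lines, delimiters) is not shown in this section, and `Checkpoint` and `parse_checkpoint` are hypothetical names, not nagent's.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Checkpoint:
    updated: datetime
    conversation_chars: int
    sections: dict[str, str] = field(default_factory=dict)


def parse_checkpoint(text: str) -> Checkpoint:
    """Split a checkpoint file into its two header fields and its '## ' sections."""
    header: dict[str, str] = {}
    sections: dict[str, str] = {}
    current = None  # name of the section being collected; None while still in the header
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = ""
        elif current is None:
            # Header lines look like "key: value"; blank lines are skipped.
            key, sep, value = line.partition(":")
            if sep:
                header[key.strip()] = value.strip()
        else:
            sections[current] += line + "\n"
    return Checkpoint(
        updated=datetime.fromisoformat(header["updated"]),
        conversation_chars=int(header["conversation_chars"]),
        sections={name: body.strip() for name, body in sections.items()},
    )
```

Because the parser keeps no state of its own, a hand edit to the file between triggers is picked up on the next read with no extra bookkeeping.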
The writer is a one-shot LLM call, not the working model. This matters for two reasons:
1. **Cost visibility.** The writer's LLM cost is paid once per checkpoint, not once per turn. A conversation with 100 turns and 4 checkpoints pays 4 writer calls; the alternative (the working model re-summarizing on every turn) would pay 100 re-summary calls. The cost moves from O(turns) to O(checkpoints).
2. **Non-deterministic working model does not pollute the checkpoint.** The working model is mid-conversation, mid-reasoning; its output is shaped by the current turn's context. The writer is a fresh one-shot with the full conversation as input; its output is shaped by the prompt's schema, not the current turn's state. The checkpoint is stable across reads.
A code-shape sketch using survey grammar:
```
checkpoint := { updated: timestamp, # [S] string
conversation_chars: int, # [I] inspectable
body: ## Intent | ## Next action | ## Constraints | ## Open questions } # [B] boundary
write_checkpoint { conversation, llm, now } {
delta = conversation[meta.conversation_chars:] # [S] string slice
if len(delta) < min_delta_chars { return nil } # too small to summarize
prompt = format(prompts.checkpoint-conversation.md, delta) # [S] string format
body = llm.call(prompt) # [B] boundary to LLM
write checkpoint.updated = now
write checkpoint.conversation_chars = len(conversation)
write checkpoint.body = body
}
```
The `[B]` boundary tag marks the single LLM call in the writer. Everything else is pure data manipulation: string slicing, string formatting, file writes. The writer is "an LLM call wrapped in deterministic I/O".
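The same shape can be written as runnable Python. This is a sketch under stated assumptions, not the code at `bin/nagent:1547-1587`: the LLM is injected as a plain callable, the `{delta}` placeholder in the prompt template is hypothetical, and `MIN_DELTA_CHARS` is an illustrative value because the sketch above names `min_delta_chars` without giving one.

```python
from datetime import datetime, timezone
from typing import Callable, Optional

MIN_DELTA_CHARS = 200  # hypothetical floor; the source does not state a value


def write_checkpoint(
    conversation: str,
    last_chars: int,
    prompt_template: str,
    llm: Callable[[str], str],
    now: Optional[datetime] = None,
) -> Optional[str]:
    """Return new checkpoint text, or None when the delta is too small to summarize."""
    delta = conversation[last_chars:]                   # pure string slice
    if len(delta) < MIN_DELTA_CHARS:
        return None
    prompt = prompt_template.replace("{delta}", delta)  # pure string formatting
    body = llm(prompt)                                  # the single LLM boundary
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    # Deterministic header first, then the writer's body.
    return f"updated: {stamp}\nconversation_chars: {len(conversation)}\n\n{body.strip()}\n"
```

Injecting `llm` keeps the boundary visible in the signature and lets a test replace it with a stub, so everything around the call can be checked without a provider.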
#### §2.3 The Trigger Logic
The trigger is a pure function of (last checkpoint timestamp, current conversation size, config). `bin/nagent:1519-1539` implements two functions:
1. **`checkpoint_due(meta, conversation_chars, now, settings)`** — returns true if any of the following holds:
- `elapsed_minutes(now, meta.updated) > settings.checkpoint_interval_minutes` AND `conversation_chars > meta.conversation_chars + new_chars_threshold`
- `conversation_chars - meta.conversation_chars > settings.checkpoint_max_new_kb * 1024`
- `meta is nil` AND `conversation_chars > settings.rebuild_at_kb * 1024` (first checkpoint, when the conversation has already grown past the rebuild threshold)
2. **`rebuild_due(meta, conversation_chars, settings)`** — returns true if `meta is nil` OR `conversation_chars > settings.rebuild_at_kb * 1024`.
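The two predicates can be sketched as pure Python functions that follow the conditions listed above. This illustrates the described logic and is not the code at `bin/nagent:1519-1539`. `Settings` and `Meta` are hypothetical containers, and `new_chars_threshold` defaults to 0 here only because the section does not give its value.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Settings:
    checkpoint_interval_minutes: int = 10
    checkpoint_max_new_kb: int = 32
    rebuild_at_kb: int = 192


@dataclass(frozen=True)
class Meta:
    updated: datetime
    conversation_chars: int


def checkpoint_due(meta: Optional[Meta], conversation_chars: int, now: datetime,
                   settings: Settings, new_chars_threshold: int = 0) -> bool:
    if meta is None:
        # First checkpoint: only once the conversation is already past the rebuild size.
        return conversation_chars > settings.rebuild_at_kb * 1024
    grown = conversation_chars - meta.conversation_chars
    interval = timedelta(minutes=settings.checkpoint_interval_minutes)
    wall_clock = (now - meta.updated) > interval and grown > new_chars_threshold
    burst = grown > settings.checkpoint_max_new_kb * 1024
    return wall_clock or burst


def rebuild_due(meta: Optional[Meta], conversation_chars: int, settings: Settings) -> bool:
    return meta is None or conversation_chars > settings.rebuild_at_kb * 1024
```

Neither function reads a clock or a file; `now` and the sizes are arguments, which is what lets the trigger be tested with fixed values.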
The three config numbers are in `config.example.json:3-7`:
```json
{
"safety_net": {
"checkpoint_interval_minutes": 10,
"checkpoint_max_new_kb": 32,
"rebuild_at_kb": 192
}
}
```
All three are in units `ls -l` can verify: minutes, kilobytes, kilobytes. Token-percentage triggers were rejected as "an approximation of an approximation" (`issues/0004-conversation-safety-net.md:30-44`) — the 3-number config is the data-grounded alternative. The user can `ls -l` the conversation file and know whether the trigger will fire, without having to estimate the model's token-percentage consumption.
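The `ls -l` claim can be illustrated with a short sketch that converts the three numbers and compares one of them against the file size the OS reports. The helper names are hypothetical. One caveat is worth stating as an assumption: `os.path.getsize` returns bytes, while the header field is named `conversation_chars`, so the two agree exactly only for single-byte text.

```python
import json
import os


def thresholds_from_config(config_text: str) -> dict[str, int]:
    """Convert the three safety-net numbers into comparable units (seconds, bytes, bytes)."""
    net = json.loads(config_text)["safety_net"]
    return {
        "checkpoint_interval_s": net["checkpoint_interval_minutes"] * 60,
        "checkpoint_max_new_bytes": net["checkpoint_max_new_kb"] * 1024,
        "rebuild_at_bytes": net["rebuild_at_kb"] * 1024,
    }


def past_rebuild_size(conversation_path: str, config_text: str) -> bool:
    """True when the on-disk size (the number `ls -l` prints) exceeds rebuild_at_kb."""
    size = os.path.getsize(conversation_path)
    return size > thresholds_from_config(config_text)["rebuild_at_bytes"]
```

With the defaults above, the rebuild threshold works out to 192 × 1024 = 196,608 bytes and the burst guard to 32 × 1024 = 32,768 bytes of new content.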
#### §2.4 The Rebuild Procedure
The rebuild is "initial context + {checkpoint} + tail", a deterministic string assembly (`bin/nagent:1590-1662`, `rebuild_conversation`). The procedure:
1. **Sync checkpoint first.** Run `write_checkpoint(conversation, llm)` synchronously. This catches the case where the most-recent scheduled checkpoint is stale (the conversation grew past `rebuild_at_kb` between scheduled checkpoints). The sync checkpoint is the "freshness" guarantee.
2. **Widen tail on writer failure.** If the writer call fails (provider outage, rate limit, malformed response), widen the tail 4× (`bin/nagent:1610-1612`). Failure as data, not failure as control flow. The rebuild cannot fail in a way that loses the conversation.
3. **Archive the old conversation.** Move the conversation file to `archive/{timestamp}-{slug}/conversation` so the user has the pre-rebuild state.
4. **Write the new initial context.** Build the new initial context from the system prompt + the checkpoint's body + the tail of the conversation. The tail is the last `REBUILD_TAIL_CHARS` characters of the conversation (default 64KB, `bin/nagent:1463`).
5. **Reset the checkpoint's `conversation_chars`.** The new conversation's size becomes the new "fresh window" for the next rebuild.
A code-shape sketch:
```
rebuild { conversation, llm, now, settings } {
try write_checkpoint(conversation, llm, now)
recover {
tail_chars = REBUILD_TAIL_CHARS * 4 # widen 4x on failure
audit msg "checkpoint writer failed; using widened tail"
} else tail_chars = REBUILD_TAIL_CHARS
archive_path = archive/{now}/{slug}/conversation
move conversation -> archive_path
new_conversation = initial_context + checkpoint + conversation[-tail_chars:]
write conversation = new_conversation
reset meta.conversation_chars = len(new_conversation)
reset meta.updated = now
}
```
The `{ssdl}` shape tag for the rebuild is `[S]` (string concatenation). The only LLM call is the sync checkpoint. Everything else is deterministic I/O.
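A runnable Python sketch of the assembly step, kept in memory so the string logic is visible on its own. Archiving and the metadata reset are left out. Falling back to the previous checkpoint when the writer fails is an assumption of this sketch; the section only states that the tail widens 4×. The function and parameter names are illustrative, not nagent's signatures.

```python
from typing import Callable

REBUILD_TAIL_CHARS = 64 * 1024  # default cited in the text
WIDEN_FACTOR = 4                # tail multiplier used when the writer fails


def rebuild_conversation(
    conversation: str,
    initial_context: str,
    previous_checkpoint: str,
    write_checkpoint: Callable[[str], str],
    audit: Callable[[str], None] = print,
) -> str:
    """Assemble initial context + checkpoint + tail, widening the tail if the writer fails."""
    try:
        checkpoint = write_checkpoint(conversation)  # sync checkpoint first
        tail_chars = REBUILD_TAIL_CHARS
    except Exception as exc:                         # provider outage, rate limit, bad response
        checkpoint = previous_checkpoint             # assumption: may be stale or empty
        tail_chars = REBUILD_TAIL_CHARS * WIDEN_FACTOR
        audit(f"checkpoint writer failed ({exc}); using widened tail")
    tail = conversation[-tail_chars:]                # whole conversation if it is shorter
    return initial_context + checkpoint + tail
```

The `except` branch changes only two values, the checkpoint used and the tail width, and then rejoins the same assembly path, so a writer failure never prevents the function from returning a conversation.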
#### §2.5 The Instant-Saves Change (6426a67)
The instant-saves change is a smaller, sharper version of the same idea: the cost of an LLM summary is moved from the hot path (every save) to the maintenance path (`nagent-distill --apply` backfill + `--summarize-conversation` on demand).
Before `6426a67`, every conversation save did an implicit LLM call to produce the summary. This had two costs:
1. **Hot-path latency.** A save was a multi-second LLM call, not a millisecond file write.
2. **Cost opacity.** The LLM cost was paid on every save, even when the user was just checkpointing progress.
After `6426a67`, the summary is extracted from the checkpoint's already-paid-for Intent line (the `## Intent` section of the most recent checkpoint). The summary is the artifact's own data — no new LLM call. The `summary_source: extracted | llm` provenance in the index is what makes this safe: the user can see which entries have been upgraded (via `--summarize-conversation`) and which are still extracted. The backfill pass (`bin/helpers/nagent_distill_lib.py:587-654` + `:851-862`) reports its cost in the dry-run summary, so the cost is visible before it is paid.
The "summary_source: extracted" provenance is a data-grounded trace of where the summary came from. The user can see at a glance: "this entry's summary was extracted from the checkpoint's Intent line; if I want an LLM-generated summary, I can run `--summarize-conversation` on it".
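The extraction step can be sketched in a few lines of Python. This is an illustration of the idea, not the `extract_conversation_summary` function at `bin/nagent:1840-1881`: the function names, the `IndexEntry` shape, and the 120-character limit are assumptions. The fallback to the first user prompt is included on the assumption that a checkpoint may be missing or lack an Intent section.

```python
from typing import Optional, TypedDict


class IndexEntry(TypedDict):
    summary: str
    summary_source: str  # "extracted" or "llm"


def extract_intent(checkpoint_text: str) -> Optional[str]:
    """Return the first non-empty line under '## Intent', or None if there is none."""
    in_intent = False
    for line in checkpoint_text.splitlines():
        if line.startswith("## "):
            if in_intent:
                return None  # the Intent section ended without content
            in_intent = line[3:].strip().lower() == "intent"
        elif in_intent and line.strip():
            return line.strip()
    return None


def summary_for_save(checkpoint_text: str, first_user_prompt: str,
                     limit: int = 120) -> IndexEntry:
    """Build the index entry for a save without calling an LLM."""
    summary = extract_intent(checkpoint_text) or first_user_prompt.strip()[:limit]
    return {"summary": summary, "summary_source": "extracted"}
```

An on-demand upgrade would overwrite `summary` and set `summary_source` to `"llm"`, which is the distinction the index exposes to the user.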
#### §2.6 Per-Commit Detail
The two commits that built the safety net subsystem:
1. **`38d3d4f` — Add the safety net machinery.** Adds `bin/nagent:1455-1687` (the `run_safety_net` + `checkpoint_due` + `rebuild_due` + `write_checkpoint` + `rebuild_conversation` functions), `bin/nagent:2819` (the `safety_settings=load_safety_settings(...)` wiring into `run_agent_loop`), `config.example.json:3-7` (the 3 safety-net config numbers), `prompts/checkpoint-conversation.md` (the writer LLM prompt), `README.md:653-668` (Part VI safety-net teaching), and `tests/test_nagent_safety.py` (the test file). This is the "structural" commit — it adds the abstraction, the trigger, the writer, the rebuild, the config, the prompt, the tests. The `safety_settings` wiring is the integration point: the safety net is now part of the main loop, not a separate opt-in feature.
2. **`6426a67` — Add the instant-saves change.** Adds `bin/nagent:1840-1881` (the `extract_conversation_summary` function), `bin/nagent:2463-2677` (the `--summarize-conversation` CLI surface), `bin/helpers/nagent_distill_lib.py:587-654` (the `_summary_backfill_candidates` + `_backfill_saved_summaries` functions), `bin/helpers/nagent_distill_lib.py:851-862` (the backfill wired into the distill apply path), and `README.md:323-332` (Part II instant-saves teaching). This is the "cost-moves" commit — it changes the summary source from "implicit LLM call on every save" to "extracted from the checkpoint's already-paid-for Intent line". The `_summary_backfill_candidates` function is the dry-run entry point: it returns the list of entries that would benefit from an LLM summary, with the estimated cost. The user sees the cost before paying it.
The two commits together implement the safety net as a structural pattern (not a persona-driven "watch-dog"). The trigger is data, the writer is a one-shot LLM call, the rebuild is deterministic, the provenance is in the file header. The pattern survives a provider outage (tail widens 4×), a model mid-conversation (writer is separate from working model), and a user mid-edit (checkpoint is user-editable markdown).
#### §2.7 Manual Slop Implications
The Manual Slop equivalents of the safety net are partial. The closest analog is the per-discussion write path in `src/discussion.py` (or similar) + the per-take branching in `src/project_manager.py:branch_discussion` + `promote_take`. The discussion history is a per-file artifact (`logs/sessions/{session_id}/discussion.jsonl` or similar), and the discussion index is a separate file. But the Manual Slop analog lacks three of the four safety-net invariants:
1. **No "sync checkpoint first" guarantee.** Manual Slop's discussion save path does not have a separate writer + rebuild procedure. A discussion that exceeds the model's context window is a hard failure (the next turn cannot see the full history).
2. **No "widen tail on failure" fallback.** Manual Slop's failure modes are exception-based, not data-widening. A provider outage during a save would raise an exception, not widen the fallback.
3. **No `summary_source: extracted | llm` provenance.** Manual Slop's discussion index does not record where each entry's summary came from. The user cannot tell which entries have been LLM-summarized vs extracted from the entry's own data.
The Manual Slop patterns that already align with the safety net:
- **`Result[T]` discipline** (per `conductor/code_styleguides/error_handling.md`) — failure widens the fallback instead of blocking. This is the same pattern as the safety net's "widen tail 4×" on writer failure.
- **`promote_take` + `branch_discussion`** (in `src/project_manager.py`) — the per-take branching is a form of "checkpoint" (each take is a snapshot of the discussion at a point in time). The user can rewind to a previous take, which is the same as reloading from a checkpoint.
- **The 3-layer MCP security model** (per `docs/guide_mcp_client.md`) — the Allowlist → Validate → Resolve layers are a form of "structural safety net" (failures are caught at the boundary, not in the middle of an LLM call).
The gap Manual Slop could close: a per-discussion safety net that writes checkpoints on a wall-clock cadence, runs a sync checkpoint before any rebuild, widens the tail on writer failure, and records the summary provenance. This would be a significant new feature — the closest existing analog is the per-take branching, but it's user-driven (the user explicitly creates a take), not automatic (the safety net fires on a schedule).
**Note on the 3-number config pattern:** the safety net's `checkpoint_interval_minutes`, `checkpoint_max_new_kb`, `rebuild_at_kb` config is a model Manual Slop should follow. Operations should be configurable in units `ls -l` can verify, not in token-percentage estimates that drift per provider. The Manual Slop equivalent would be a per-discussion config with units of (minutes, kilobytes, kilobytes) — not (tokens, percentage, percentage). This is a small but load-bearing change: the user can `ls -l` the discussion file and know whether the trigger will fire, without having to estimate the model's token-percentage consumption.
#### §2.8 Honest Gaps
1. **The `delta_start = min(meta[1], len(content))` clamp at `bin/nagent:1566` could produce a misleading delta if a user edit deletes characters between checkpoints** (the recorded size becomes larger than current content). The clamp hides the failure; the delta would be the entire current content, not the actual new activity. Minor edge case; the spec does not address it.
2. **The `REBUILD_TAIL_CHARS = 64 * 1024` default at `bin/nagent:1463` is explicitly unmeasured** ("mirrors MiMo's ~65K tokens until measured otherwise" per `issues/0004-conversation-safety-net.md:42-44`). A future track should measure actual rebuild-tail needs across providers and conversation types.
3. **`best-of-N` is mentioned in the initial context at `bin/nagent:775` as a directive to the model, not implemented as machinery** — it is the same "direction before machinery" pattern v2.3 used for compaction. A follow-up track could lift it to a driver (e.g., `nagent-safety-net --best-of-n` that runs the writer N times and picks the most-recoverable checkpoint).
4. **The interaction with the campaigns driver (Phase 2's `nagent-campaign update`) is not deep-dived.** The campaigns driver has its own 6 phases (merge, check, propose, review gate, dispatch, report). A long-running campaign can have conversations that exceed the model's context window. The safety net's role in the campaigns driver is not documented: does the driver check for context-window-exceeded conditions during the merge phase? does the dispatch phase refuse to launch a worker when the context window is already full? does the report phase surface context-window warnings to the user?
5. **The interaction with the conversation-cache boundaries (v2.3 §2.2) is not deep-dived.** v2.3 introduced `conversation_cache_boundaries` at `bin/nagent:970-987` to manage the provider's prompt cache. The safety net's rebuild creates a new initial context, which invalidates the cache. The v3 cluster does not document how the safety net coordinates with the cache invalidation — does the rebuild preserve the cache boundary markers? does the next checkpoint know about the cache state?
6. **The 3-number config's recommended values are not enumerated.** The config defaults (`checkpoint_interval_minutes: 10`, `checkpoint_max_new_kb: 32`, `rebuild_at_kb: 192`) are documented, but the cost model is not. A v4 would document the recommended values per conversation type (short Q&A, long-running build, multi-day campaign) and per provider (Gemini's 1M context vs Anthropic's 200K vs OpenAI's 128K).
7. **The writer's failure modes are not enumerated.** The writer is a one-shot LLM call; it can fail with a provider outage, a rate limit, a malformed response, or a refusal. The v3 cluster documents the "widen tail 4×" fallback, but does not enumerate the other failure handling — what happens when the writer returns a malformed response (missing sections, extra sections, wrong order)? does the rebuild retry the writer, or proceed with the malformed checkpoint?
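For gap 7, the three malformed-response cases named above (missing sections, extra sections, wrong order) can at least be told apart mechanically. The following is a hypothetical check, not something the section says nagent performs; it assumes the four-section schema from §2.2 and says nothing about whether the rebuild should retry or proceed.

```python
EXPECTED_SECTIONS = ["Intent", "Next action", "Constraints", "Open questions"]


def classify_checkpoint_body(body: str) -> dict[str, object]:
    """Report missing, extra, and out-of-order '## ' sections in a writer response."""
    found = [line[3:].strip() for line in body.splitlines() if line.startswith("## ")]
    missing = [name for name in EXPECTED_SECTIONS if name not in found]
    extra = [name for name in found if name not in EXPECTED_SECTIONS]
    # Compare the known sections as they appear against the schema's order.
    known_as_found = [name for name in found if name in EXPECTED_SECTIONS]
    known_in_schema_order = [name for name in EXPECTED_SECTIONS if name in found]
    return {
        "missing": missing,
        "extra": extra,
        "wrong_order": known_as_found != known_in_schema_order,
    }
```

A result with all three fields empty or false would mean the response matches the schema; what to do otherwise is the open question the gap describes.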
#### §2.9 Code-Shape Sketch
The safety net, in survey-grammar SSDL notation, with shape tags:
```
safety_settings := { checkpoint_interval_minutes: int, # [I] inspectable
checkpoint_max_new_kb: int, # [I] inspectable
rebuild_at_kb: int } # [I] inspectable
checkpoint := { updated: timestamp, # [S] string
conversation_chars: int, # [I] inspectable
body: ## Intent | ## Next action | ## Constraints | ## Open questions } # [B] boundary
due { meta, conversation_chars, now, settings } { # trigger (pure function)
if elapsed_minutes(now, meta.updated) > settings.checkpoint_interval_minutes
and conversation_chars > meta.conversation_chars
-> fire {ssdl} [I] # inspectable trigger
if conversation_chars - meta.conversation_chars > settings.checkpoint_max_new_kb * 1024
-> fire
if meta is nil and conversation_chars > settings.rebuild_at_kb * 1024
-> fire first time only
else
-> idle
}
write_checkpoint { conversation, llm, now } { # writer (one LLM call)
delta = conversation[meta.conversation_chars:] # [S] string slice
if len(delta) < min_delta_chars { return nil } # too small to summarize
prompt = format(prompts.checkpoint-conversation.md, delta) # [S] string format
body = llm.call(prompt) # [B] boundary to LLM
write checkpoint.updated = now
write checkpoint.conversation_chars = len(conversation)
write checkpoint.body = body
}
rebuild { conversation, llm, now, settings } { # rebuild (deterministic)
try write_checkpoint(conversation, llm, now)
recover {
tail_chars = REBUILD_TAIL_CHARS * 4 # widen 4x on failure
audit msg "checkpoint writer failed; using widened tail"
} else tail_chars = REBUILD_TAIL_CHARS
archive_path = archive/{now}/{slug}/conversation
move conversation -> archive_path
new_conversation = initial_context + checkpoint + conversation[-tail_chars:] # [S] string concat
write conversation = new_conversation
reset meta.conversation_chars = len(new_conversation)
reset meta.updated = now
}
summary_source := { entry_id: string, # provenance
source: extracted|llm, # [I] inspectable
extracted_at: timestamp?, # [S]
llm_summarized_at: timestamp? } # [S]
```
The shape tag map: `[I]` for inspectable triggers and config, `[S]` for string concatenations and timestamps, `[B]` for the single LLM boundary in the writer, `[M]` for the mutable aggregate that is the conversation file. The safety net is an `[M]` aggregate: it is the state of record, hand-edited by humans, written by the writer, read by the rebuild.
**Source-read citations:**
- `bin/nagent:1455-1687` — `run_safety_net` + `checkpoint_due` + `rebuild_due` + `write_checkpoint` + `rebuild_conversation` (38d3d4f)
- `bin/nagent:1840-1881` — `extract_conversation_summary` (6426a67)
- `bin/nagent:2463-2677` — `--summarize-conversation` CLI surface (6426a67)
- `bin/nagent:2819` — `safety_settings=load_safety_settings(...)` wired into `run_agent_loop` (38d3d4f)
- `config.example.json:3-7` — 3 safety-net config numbers, all units `ls -l` can verify (38d3d4f)
- `bin/nagent:1463` — `REBUILD_TAIL_CHARS = 64 * 1024` default (38d3d4f)
- `bin/nagent:1519-1539` — `checkpoint_due` + `rebuild_due` pure functions (38d3d4f)
- `bin/nagent:1547-1587` — `write_checkpoint` (38d3d4f)
- `bin/nagent:1590-1662` — `rebuild_conversation` (38d3d4f)
- `bin/nagent:1610-1612` — widen tail 4× on writer failure (38d3d4f)
- `bin/nagent:1566` — `delta_start = min(meta[1], len(content))` clamp (38d3d4f)
- `prompts/checkpoint-conversation.md` — checkpoint LLM prompt (38d3d4f)
- `bin/helpers/nagent_distill_lib.py:587-654` — `_summary_backfill_candidates` + `_backfill_saved_summaries` (6426a67)
- `bin/helpers/nagent_distill_lib.py:851-862` — backfill wired into the distill apply path (6426a67)
- `README.md:653-668` — safety-net teaching in Part VI (38d3d4f)
- `README.md:323-332` — instant-saves teaching in Part II (6426a67)
- `issues/0004-conversation-safety-net.md` — the spec; reworked at 6443d70 to wall-clock cadence (199a36b)
- `issues/0004-conversation-safety-net.md:30` — cadence reasoning ("time and context consumption are uncorrelated in exactly the wrong direction")
- `issues/0004-conversation-safety-net.md:42-44` — `REBUILD_TAIL_CHARS` unmeasured note
- `tests/test_nagent_safety.py` — safety-net test file (38d3d4f)
**Pattern deep-dive.** The safety net is a four-piece composition: **trigger**, **writer**, **rebuild**, **provenance**. The trigger is wall-clock + burst guard, both computed from data on disk (`bin/nagent:1519-1539`, `checkpoint_due`); the writer is a separate one-shot LLM call (`bin/nagent:1547-1587`, `write_checkpoint`); the rebuild is a deterministic string assembly that runs the writer synchronously first (`bin/nagent:1590-1662`, `rebuild_conversation`); the provenance is the deterministic header (`updated:`, `conversation_chars:`) that lets the writer find the delta on the next pass. The cadence reasoning is explicit: "time and context consumption are uncorrelated in exactly the wrong direction" (`issues/0004-conversation-safety-net.md:30`). Token-percentage triggers were "an approximation of an approximation"; three numbers in units `ls -l` can verify are the data-grounded alternative.
The "sync checkpoint first" invariant is the load-bearing one. A naive rebuild that trusted the most-recent checkpoint's freshness would fail on the exact conversation the safety net is meant to save (a conversation that grew past `rebuild_at_kb` between scheduled checkpoints). The rebuild runs the writer synchronously, and on writer failure widens the tail 4× (`bin/nagent:1610-1612`); a rebuild that is "blockable by a provider outage" would be the wrong failure mode. Failure as data, not failure as control flow.
The instant-saves change (`6426a67`) is a smaller, sharper version of the same idea: the cost of an LLM summary is moved from the hot path (every save) to the maintenance path (`nagent-distill --apply` backfill + `--summarize-conversation` on demand). The summary is the artifact's own data — the checkpoint's `## Intent` line, already paid for — or the first user prompt truncated. The `summary_source: extracted | llm` provenance in the index is what makes this safe: the user can see which entries have been upgraded and which are still extracted, and the backfill pass reports its cost in the dry-run summary.
A code-shape sketch using survey grammar (per the format commitment §5.1):
```
safety_settings := { checkpoint_interval_minutes: int,
checkpoint_max_new_kb: int,
rebuild_at_kb: int }
checkpoint := { updated: timestamp, conversation_chars: int,
body: ## Intent | ## Next action | ## Constraints | ... }
due { meta, conversation_chars, now, settings } {
if elapsed > interval and chars grew -> fire {ssdl} [I]
if chars grew > max_new -> fire
if meta is nil and chars > max_new -> fire first time only
else -> idle
}
rebuild { conversation, llm, now } {
try write_checkpoint(conversation, llm)
recover widen tail * 4
archive(conversation)
write initial_context + {checkpoint} + tail {ssdl} [S]
reset checkpoint.conversation_chars = fresh_window_size
}
```
The `{ssdl}` markers note the two transformations: checkpoint write is an `[I]` (inspectable, the writer's output is user-editable), and rebuild is an `[S]` (string concatenation — no LLM call beyond the synchronous checkpoint; the deterministic assembly is what makes the rebuild safe to reason about).
- `tests/test_nagent_distill.py:summary_*` — backfill tests (6426a67)
- `bin/nagent:775` — `best-of-N` initial-context directive (38d3d4f)
- `bin/nagent:970-987` — `conversation_cache_boundaries` (v2.3; not modified in v3 but relevant for the gap note)
- `bin/nagent:606-745` — `build_initial_context` (v2.3; relevant for the rebuild's "initial context" assembly)
- `config.example.json:1-15` — full safety-net config block with defaults (38d3d4f)
- `README.md:670-700` — safety-net cost model (checkpoint cost, rebuild cost) (38d3d4f)
- `README.md:333-360` — instant-saves cost model (extracted vs LLM cost) (6426a67)
- `issues/0004-conversation-safety-net.md:1-100` — full spec: trigger, writer, rebuild, provenance, cost (199a36b)
- `issues/0004-conversation-safety-net.md:101-200` — failure modes + edge cases (199a36b)
- `issues/0004-conversation-safety-net.md:201-326` — open questions + future work (199a36b)
**Cross-refs:** `conductor/tracks/fable_review_20260617` (the Fable review's analysis of "watch-dogging" is the opposite pattern — nagent's safety net is structural, not persona-driven). §1 Campaigns cross-references the safety net as the failure-recovery layer for what decomposition cannot bound. §13 Agent context-window observations (the v3.1 new section on warm-up + window + safe-zone numbers; the safety net is the structural mechanism that implements the safe-zone).
**Pattern history:** EXTENDS v2.3 Pattern 5 ("the loop") with failure-recovery semantics. EXTENDS v2.3 Pattern 11 ("large files as explicit artifacts") with checkpoints as an explicit working-state artifact. EXTENDS v2.3 Pattern 7 ("repo history as data") with deferred-cost summaries.
## §3 Hooks
**Source:** nagent `a4fb141` (`bin/nagent:1442-1484` + `:1607-1625` + `:1922-1927` + `:2806-2825` + `:3167-3185`, `config.example.json:6-8`, `tests/test_nagent.py:870-960`); plus both case-study harness scripts (`https://raw.githubusercontent.com/macton/pep-copt/main/prove-optimized-harness.sh`, `https://raw.githubusercontent.com/macton/differentiable-collisions-optc/main/prove-optimized-harness.sh`).