# Mike Acton's nagent: A Deep-Dive Analysis vs Manual Slop

**Track:** `nagent_review_20260608`
**Date:** 2026-06-08 (revised with user corrections same day)
**Author:** Tier 2 Tech Lead (with significant user review on §3 and §6)
**Companion to:** `spec.md` (the track wrapper)

> **Important reading note.** This report applies the **Application vs Meta-Tooling distinction** (per `docs/guide_meta_boundary.md`) as the lens for every comparison. nagent is a Meta-Tooling reference; Manual Slop's Application AI is a *different kind of thing*. Where they share patterns (MMA workers, the tool-call loop, the 3-layer security model), the report says so. Where they don't, the report says so. The report deliberately avoids "nagent is better" / "Manual Slop is better" framings.
>
> **Revision note.** The first draft overstated gaps in Manual Slop's "editable discussion" and "per-file memory" features. The user caught this and pointed at the actual files (`FileItem`, `ContextPreset`, `aggregate.py`, `project_manager.branch_discussion`, `HistoryManager`). The corrections are now folded in. Specific corrections: §3 (verdict changed from PARTIAL to **PARITY (DIFFERENT FOCUS)**); §6 (verdict changed from DOMAIN MISMATCH to **MANUAL SLOP IS STRONGER IN THE CURATION DIMENSION**); §9 (verdict now notes the MMA vs 1:1 distinction explicitly per the user).

---

## 0. Reading guide

- **Sections 1-14** map 1:1 to nagent's 14 principles. Each has: nagent's claim, nagent's implementation, Manual Slop's equivalent, a verdict, and a domain tag.
- **Section 15** extracts the 6 actionable pitfalls and maps each to a future-track candidate.
- **Section 16** is the recommended reading path for engineers who haven't read nagent.

If you only have 10 minutes, read §3 (Conversations), §6 (Per-File Memory), §9 (Sub-Conversations), §10 (Controlled Writes), and §15 (the pitfalls list).

---

## 1. Durable work, disposable workers

**nagent's claim.** A Python process is a *worker*; the files are the *system*. Workers come and go; data stays. **"The agent is not the thing; the data is the thing."**

**nagent's implementation.** `bin/nagent` is a 700-line single-file loop. It reads `~/.nagent/conversations/<conversation_name>` (a plain text file) for the current conversation, appends to it after every action, and exits. The user types `nagent "investigate this"`. The CLI is a shell. The state is a file.

**Manual Slop's equivalent.** Manual Slop has two parallel systems:

1. **MMA workers are real subprocesses.** `multi_agent_conductor._spawn_worker` runs `mma_exec.py` via `subprocess.Popen` (per `docs/guide_multi_agent_conductor.md` §"Token Firewalling"). Each Tier 3 worker is a fresh Python process with **Context Amnesia** — `ai_client.reset_session()` at the start of `run_worker_lifecycle`. The subprocess is the disposable worker; the artifacts (track state, ticket results) are the system.

2. **The Application AI is *not* a disposable worker.** `gui_2.py:App` is a long-lived Qt/ImGui process. The user types a prompt, hits Enter, gets a response, *keeps the process running for hours*. The `app_state` dataclass is the long-lived worker. This is *intentional* for the Application domain: persona-driven conversations, snapshot-based undo, cross-discussion state — all require a long-running process.

**Verdict.** **PARTIAL** — nagent's pattern lives in the Meta-Tooling + MMA, but the Application deliberately has long-lived workers. The two coexist because they serve different needs: MMA is fire-and-forget per ticket; App is an interactive partner.

**Domain tag:** Both. MMA has it; App doesn't need it. *Future-track candidate: a stateless conversation-file pattern for the App (see §15.4).*

---

## 2. Text in, text out

**nagent's claim.** The smallest useful primitive is: file in, text out. `nagent-llm-text --file question.txt` reads a file, calls the LLM, prints plain text or JSON. Everything else in nagent is orchestration around this.

**nagent's implementation.** `bin/helpers/nagent_llm.py` (300 lines) provides `generate_text(message, provider, model) -> str` for 4 providers (openai, anthropic, google, cursor). Token accounting via provider usage metadata (with character-count fallback at 1 token per 4 chars). Provider churn is isolated in this file.

**Manual Slop's equivalent.** `src/ai_client.py:send(...) -> str` is the parallel. 5 providers (gemini, anthropic, deepseek, minimax, gemini_cli). Same `provider, model, usage` shape. Manual Slop wraps the string in a larger `(md_content, user_message, base_dir, file_items, ..., rag_engine) -> str` because the Application's text-in/text-out also needs tool calls, RAG injection, tier attribution, and patch-mode. But the *primitive* is the same.

**Verdict.** **PARITY.** nagent and Manual Slop both use text-in/text-out at the bottom. The Application's `send()` is a *strict superset* of nagent's `nagent-llm-text`, with provider churn still isolated to a single module.

**Domain tag:** Both. Meta-Tooling uses the same primitive via `mma_exec.py`'s `ai_client.send`.

---

## 3. Conversations are editable state

**nagent's claim.** The conversation file is not chat history. It is working state. Memory goes stale; therefore let people save, load, summarize, edit, branch, trim, copy, diff, version, and rewrite conversations. **"The conversation does not own its memory. The user does."**

**nagent's implementation.**
- `bin/nagent` exposes `--save-conversation <name>`, `--load-conversation <name>`, `--summarize`, `--edit-conversation <prompt>`. The latter **automates** one path: archive current file, run file-edit on the archive, load the result.
- Conversations are plain text files. The user can `cat`, `vim`, `git diff`, or `cp` them with no special tooling. The `<nagent-response>` body and `<nagent-shell-result>` body are just text in the file.
- The first draft of this section understated Manual Slop's editing capability. The corrected picture is below.

**Manual Slop's equivalent (corrected, with the full operation matrix).** Manual Slop's discussion editing lives at **three nested layers**, each with its own operations. The full enumeration:

**Layer A — Per-entry operations on `app.disc_entries: list[dict]`** (the discussion's typed message list). The renderer is `src/gui_2.py:3770 render_discussion_entry(...)`. Per entry, the user can:

| # | Operation | GUI control | Source code | What it does |
|---|---|---|---|---|
| A1 | **Edit content in place** | `imgui.input_text_multiline` on the entry body | `gui_2.py:3841` | The entry's `content` field is a fully editable multi-line text input. The user can rewrite an AI's response, fix a typo in their own prompt, paste in code from another source, etc. |
| A2 | **Toggle read/edit mode** | `[Edit]` / `[Read]` button | `gui_2.py:3799` | When in `[Read]` mode, the content is rendered as Markdown with syntax highlighting (`render_discussion_entry_read_mode` at `gui_2.py:3855`). When in `[Edit]` mode, the multi-line text input is shown. |
| A3 | **Toggle collapsed/expanded** | `+/-` button per entry | `gui_2.py:3789` | Collapsed entries show a 60-char preview (line 3822-3824). Expanded entries show full content. |
| A4 | **Change role** | Combo box from `app.disc_roles` | `gui_2.py:3793-3796` | The entry's `role` field is editable. The list `app.disc_roles` is itself user-managed (see B5). |
| A5 | **Insert entry before this one** | `Ins` button | `gui_2.py:3813` | `app.disc_entries.insert(index, {"role": "User", "content": "", "collapsed": True, "ts": project_manager.now_ts()})` |
| A6 | **Delete this entry** | `Del` button | `gui_2.py:3815-3816` | `if entry in app.disc_entries: app.disc_entries.remove(entry)`. The membership check matters — ImGui can re-render stale state, so the check guards against double-delete. |
| A7 | **Branch at this entry** | `Branch` button | `gui_2.py:3821` → `app._branch_discussion(index)` → `app_controller._branch_discussion:3503` → `project_manager.branch_discussion:429` | Creates a new Take named `<base>_take_<n>` and copies the history up to and including `index` into the new Take. The user is then switched to the new Take. |

The entry dict shape itself is open: `{"role": str, "content": str, "collapsed": bool, "ts": str, ...}` plus optional `thinking_segments` (for AI entries with `<thinking>` blocks, parsed by `src/thinking_parser.py`) and `usage` (for token accounting: input/output/cache). The user can also set per-entry `read_mode` (a render-time flag, not persisted).

**Layer B — Discussion-level operations** (the Take / discussion set). These are the second-tier controls, rendered at `src/gui_2.py:4239 render_discussion_entry_controls(...)` and the discussion selector at `gui_2.py:4330 render_discussion_selector(...)`:

| # | Operation | GUI control | Source code | What it does |
|---|---|---|---|---|
| B1 | **Append new entry** | `+ Entry` button | `gui_2.py:4240` | `app.disc_entries.append({...})` with the default role from `app.disc_roles[0]`. |
| B2 | **Collapse all / Expand all** | `-All` / `+All` buttons | `gui_2.py:4242-4246` | Bulk-set `collapsed` flag on every entry. |
| B3 | **Clear all** | `Clear All` button | `gui_2.py:4248` | `app.disc_entries.clear()`. |
| B4 | **Save (flush to project TOML)** | `Save` button | `gui_2.py:4250` | `app._flush_to_project(); app._flush_to_config(); app.save_config()`. |
| B5 | **Add/remove roles** | `Add` / `X` buttons under "Roles" | `gui_2.py:4317-4328` | `app.disc_roles.append(r)` / `app.disc_roles.pop(i)`. The role list is **user-managed at runtime** — they can add `"Context"`, `"Tool"`, `"Vendor API"`, or any custom role and assign it to any entry. |
| B6 | **Switch active discussion** | Discussion combo + Take tabs | `gui_2.py:4197, 4344, 4354` | `app._switch_discussion(name)`. The Takes group by base name (`name.split("_take_")[0]`) and render as nested tabs. |
| B7 | **Rename / Delete discussion** | `Rename` / `Delete` buttons | `gui_2.py:4291, 4293` | `app._rename_discussion(...)` / `app._delete_discussion(...)`. Cannot delete the last discussion (guarded at `app_controller.py:3543`). |
| B8 | **Promote Take to top-level** | `Promote` button in takes panel | `gui_2.py:4364` | `project_manager.promote_take(app.project, app.active_discussion, new_name)` — renames a Take (e.g. `T0_take_2`) to a fresh top-level discussion name. |
| B9 | **Per-role filter** | `ui_focus_agent` selector (system-wide) | `gui_2.py:4230-4234` | `display_entries = [e for e in app.disc_entries if e.get("role") == persona_name or e.get("role") == "User"]`. The filter follows the MMA persona focus. |
| B10 | **Truncate to N pairs** | `Truncate` button + `drag_int` | `gui_2.py:4254-4260` | `truncate_entries(app.disc_entries, app.ui_disc_truncate_pairs)` keeps the last `N` User/AI pairs (per `gui_2.py:175 truncate_entries(...)`). |
| B11 | **Compress (AI summarization)** | `Compress` button | `gui_2.py:4252` → `app_controller._handle_compress_discussion:3357` | Calls `ai_client.run_discussion_compression(disc_text)` and replaces the discussion with the LLM's compressed version. |

**Layer C — UI snapshot history (undo/redo).** The `HistoryManager` (`src/history.py:71`, `max_capacity=100`) and `UISnapshot` (`history.py:8-63`) provide Ctrl+Z / Ctrl+Y across the entire UI state — including `disc_entries`:

| # | Operation | Source code | What it does |
|---|---|---|---|
| C1 | **Take snapshot** | `gui_2.py:735 _take_snapshot` → `history.UISnapshot(...)` | `copy.deepcopy(self.disc_entries)` — a deep copy of the full entry list is captured. The snapshot also captures `ai_input`, `temperature`, `top_p`, `max_tokens`, `auto_add_history`, `files`, `context_files`, `screenshots`, all system prompts. |
| C2 | **Apply snapshot (undo/redo)** | `gui_2.py:754 _apply_snapshot` | Restores `self.disc_entries = snapshot.disc_entries` (and all the other fields). |
| C3 | **Change detection triggers snapshot** | `gui_2.py:1160, 1166-1167` | `if len(current.disc_entries) != len(self._last_ui_snapshot.disc_entries) or ...` — disc_entries content change pushes a new snapshot. |
| C4 | **Capacity-evict oldest** | `history.py:80-90 push()` | When the undo stack exceeds 100, the oldest is popped from the front. |
| C5 | **Jump to specific state** | `history.py:129 jump_to_undo(index, current_state, ...)` | Allows time-traveling to any past snapshot, not just the most recent. |

**Summary of editability.** Manual Slop provides:
- **Per-entry content edit** (A1, A2) — the AI's response text is fully editable in the GUI
- **Per-entry insert at any position** (A5) — the user can drop a new entry *between* two existing entries, not just append
- **Per-entry delete at any position** (A6)
- **Per-entry role change** (A4) — the user can re-label any entry as User, AI, Tool, Context, or any custom role
- **Per-entry branch** (A7) — creates a Take at any entry, not just at the end
- **Per-entry collapse/expand** (A3) — visual organization
- **Per-discussion full CRUD** (B1, B6, B7, B8) — append, switch, rename, delete, promote
- **Per-role set management** (B5) — the role list itself is user-editable
- **Bulk operations** (B2, B3, B10) — collapse/expand all, clear, truncate
- **AI-assisted compression** (B11) — summarize the whole discussion
- **Undo/redo across all of the above** (C1-C5) — Ctrl+Z / Ctrl+Y / jump-to-state

**What Manual Slop does NOT have.** The user cannot edit the **provider-side raw transcript** — the bytes inside the `ai_client._anthropic_history`, `ai_client._gemini_chat._history`, etc. process globals. These are reset on `ai_client.reset_session()`. nagent's "edit the conversation file" pattern operates at *this* layer, not the entry abstraction. The comms log (`comms.log`) is JSON-L and append-only, not user-editable from the GUI (it can be edited on disk in a text editor, but that's a different workflow).

**Verdict.** **PARITY (DIFFERENT FOCUS).** Both systems support comprehensive editing of the conversation-as-data. The difference is *what counts as "the conversation"*:
- nagent's "conversation" = the raw transcript text file (the bytes the LLM produced)
- Manual Slop's "conversation" = a typed entry list with role + content + metadata + optional thinking segments

Manual Slop's editing is **more granular and more pervasive** (per-entry content edit, per-entry insert/delete, per-entry role-change, per-entry branch, with undo/redo). nagent's editing is **deeper at the raw transcript layer** (edit the actual AI response text before it's been abstracted into a typed entry). Both are real; both are deliberate.

**Domain tag:** Application. The Application's typed-entry abstraction is intentional — the user thinks in "discussions" not "transcripts." The user can opt-in to the raw-transcript layer by editing `comms.log` on disk or by reading the TOML `discussions/<take_name>/history` field directly.

*Future-track candidate: optionally persist the raw transcript as a sibling file under each take (Candidate 10 in `decisions.md`), enabling the nagent-style "edit the actual AI response" workflow for users who want it.*

---

## 4. Visible output protocol

**nagent's claim.** Free-form model output is hard to execute. Use a visible protocol: `<nagent-read>`, `<nagent-file-read>`, `<nagent-shell>`, `<nagent-write>`, etc. The startup prompt lists the only tags the model may emit. The parser is strict: recognized tags and whitespace. Nothing else. **"If you cannot read the protocol, you cannot debug the system."**

**nagent's implementation.** `bin/nagent:TAG_PATTERNS` is a list of `(tag_type, compiled_regex)` tuples. `parse_response()` returns `None, error` if any non-whitespace text is found outside a known tag. The error message is appended to the conversation and the model is asked to retry (up to `MAX_FORMAT_RETRIES = 3`).

**Manual Slop's equivalent.** Manual Slop's Application AI uses **provider-native function calling** (Gemini `genai.types.FunctionDeclaration`, Anthropic `tool_use` blocks, etc.). This is *opaque*: the protocol is encoded in JSON the provider parses. The user cannot read a `function_call` from the comms log and reason about it without knowing the provider's schema.

The two approaches are **structurally different**:

| Aspect | nagent regex tags | Manual Slop function calling |
|---|---|---|
| Visibility | Plain text, inspectable in the conversation file | JSON blobs in provider-specific format |
| Per-provider portability | Same tags work across all 4 providers | Each provider has its own schema; mcp_client's 45 tools have 5 different per-provider formats |
| Provider capability ceiling | Whatever the model can emit as text | Native parallel tool calls, structured outputs, JSON-mode constraints |
| Debuggability | "Why didn't the model read the file?" → grep the conversation for the tag | "Why didn't the model call read_file?" → inspect the JSON response |

**Verdict.** **ARCHITECTURAL DIFFERENCE** — both are correct for their domain. The Application *wants* parallel tool calls, JSON-mode constraints, and provider-side caching. The Meta-Tooling *might want* nagent's regex tags for explicit debuggability.

**Domain tag:** Both. The Application's choice is right (modern providers all support function calling with parallel execution — see `docs/guide_ai_client.md` §"Async Tool Execution"). The Meta-Tooling *could* adopt nagent's regex-tag protocol for its own work — for example, by using `<read src/foo.py>` instead of a tool-call JSON. This is explicitly the difference between the "Application's internal AI" and the "Meta-Tooling that builds the Application" in `docs/guide_meta_boundary.md`.

*Future-track candidate: a Meta-Tooling-side DSL for compact tool calls (per the existing `docs/reports/PLANNING_DIGEST_20260606.md` reference to "an intent-based DSL" for "discovery" or "combinatorics").*

---

## 5. The loop (append, call, parse, act, append, repeat)

**nagent's claim.** "Agent behavior" is mostly: append, call, parse, act, append, repeat. Heavier systems add infrastructure around the same steps.

**nagent's implementation.** `bin/nagent:run_agent_loop` is a `while True` loop:
1. Append user prompt to conversation file
2. Send conversation file to LLM (via `nagent-llm-text --json`)
3. Append response to conversation file
4. If response contains action tags: run those actions, append results, continue loop
5. If response contains `<nagent-response>`: print and stop

**Manual Slop's equivalent.** Manual Slop has *three* parallel "loops":

1. **`src/ai_client.py:_send_<provider>`** — the per-provider tool-call loop. Up to `MAX_TOOL_ROUNDS + 2 = 12` iterations. Each round: call provider, parse function calls, dispatch, append tool results. Same shape as nagent.

2. **`src/multi_agent_conductor.py:ConductorEngine.run`** — the MMA loop. Per ticket: `ai_client.reset_session()` (Context Amnesia), build prompt, `loop.run_in_executor(None, run_worker_lifecycle, ...)`. Different scope (per ticket, not per user turn).

3. **`simulation/workflow_sim.py:WorkflowSimulator.run_discussion_turn_async`** — the 1:1 chat loop. Per user turn: build markdown, send, wait, append response. Different scope (per user turn, in the App).

All three have the same "append, call, parse, act, repeat" shape. They differ in *what gets appended* (per-provider history vs track state vs `disc_entries`).

**Verdict.** **PARITY.** The loop is the universal pattern. Manual Slop's three loops are at different layers (LLM, MMA, App). The lack of a *single* "the loop" file is a real cost — nagent's `run_agent_loop` is 50 lines, easy to reason about. Manual Slop's loops are 100-300 lines each, scattered.

*Future-track candidate: a single `src/llm_loop.py:run_loop(...)` function that all three callers use, with the dispatch and parse layers injected. (Not a high-priority refactor; the current separation is readable.)*

**Domain tag:** Both.

---

## 6. Per-file memory (curation, not conversation log)

**nagent's claim.** One conversation grows too large. Attach memory to artifacts. Work keeps coming back to the same files; give each file its own persistent local memory. **"When work orbits one artifact, store memory on that identity."**

**nagent's implementation.** `bin/helpers/nagent_file_edit_lib.py` provides:
- `file_id_for_path(path) -> "{st_dev}:{st_ino}"` — a stable file identity across renames (the inode is preserved).
- `file_index_path(root, pid) -> conversations/file-index-{pid}.json` — a JSON registry of `{file_id: {path, conversation}}`.
- `resolve_file_edit_conversation(root, pid, file_path) -> (name, resolved, file_id)` — gets or creates a per-file conversation.
- `nagent-file-edit --file src/foo.py "add validation"` — spawns a new nagent process with `--file_edit src/foo.py`, which loads the file's *previous* conversation as the initial context. After edits, the new file is appended to the same conversation.

The result: a per-file conversation log keyed by inode. Rename with same inode = same conversation. Pure path-based: nope, you'd collide across two repos on the same machine.

**Manual Slop's equivalent (corrected per user).** The first draft of this report marked this section as "DOMAIN MISMATCH" — claiming Manual Slop has no per-file memory. **This was wrong.**

Manual Slop *does* have a per-file memory concept. It's just **a different kind of memory**. Where nagent's per-file memory is a *conversation log* (what the LLM said about this file last time), Manual Slop's is a *curation config* (how to present this file in the AI's context window). The two are complementary, not equivalent.

The Manual Slop per-file memory:

```python
# src/models.py:510
@dataclass
class FileItem:
    path:               str                # the artifact identity (path-keyed, no inode)
    auto_aggregate:     bool = True       # include in auto-aggregation?
    force_full:         bool = False      # bypass aggregation with full content?
    view_mode:          str = 'full'      # full / skeleton / summary / sig / def / agg
    selected:           bool = False      # for batch operations
    ast_signatures:     bool = False      # only signatures
    ast_definitions:    bool = False      # only definitions
    ast_mask:           dict[str, str]    # per-symbol mask (from Structural File Editor)
    custom_slices:      list[dict]        # Fuzzy Anchor slices with tag+comment
    injected_at:        Optional[float]   # timestamp
```

Plus the **ContextPreset** (`src/models.py:909`): a *named, persisted set* of `FileItem`s, stored in the project's `manual_slop.toml`. Load a preset → restore the same per-file curation state. This is the per-file memory that survives across discussions.

The user pointed at this directly: *"we have the context composition we can directly control what's in memory at the start of a discussion."* That's the right framing. `aggregate.py:run` builds the initial markdown from `self.context_files` (the active preset's FileItems) + `aggregate.run(flat, aggregation_strategy=...)`. The user controls the per-file memory at discussion start.

What's *missing* is nagent's specific pattern: **a per-file conversation log keyed by inode.** Manual Slop does not have a "last investigation of this file" concept stored as a file. The closest analog is *commit history* (the discussion itself is git-linked, per `docs/guide_gui_2.md` §"Discussions Sub-Menu" "Git Commit Tracking"). But that's discussion-scoped, not file-scoped.

**Verdict.** **MANUAL SLOP IS STRONGER IN THE CURATION DIMENSION; nagent IS STRONGER IN THE CONVERSATION-LOG DIMENSION.** Both have a real per-file memory concept. Manual Slop's is "how do I render this file next time the AI sees it" (rich, with 9 fields, AST-aware); nagent's is "what did the LLM say about this file last time" (plain text, with stable inode identity). The two are not equivalent; they're different optimizations for different needs.

**Domain tag:** Application (for the curation config). The user-correction explicitly said: *"we have the context composition we can directly control what's in memory at the start of a discussion."* That confirms this is a real Application feature, not a gap.

*Future-track candidate: extending the per-file memory with a thin "last-investigation" log per file. A `~/.manual_slop/per_file/<file_id>.md` (file_id by inode, like nagent) that records the last time a discussion referenced this file, the questions asked, and the answers received. This is a Meta-Tooling-friendly addition because it's a plain file.*

---

## 7. Repository history as data

**nagent's claim.** A repo is not only the current tree. History is data too. Transform git history into editing context for a target file. Not vague "retrieval." Explicit transformation of historical artifacts into working input.

**nagent's implementation.** `bin/nagent:file_edit_history_and_summary_block(file_edit_path, ...)`:
- `git_file_history(repo_root, rel_path)` — `git log --follow --max-count=50` per file
- `summarize_new_file_commits(...)` — LLM call to one-line-summarize new commits
- `coedited_file_rows(repo_root, rel_path, commits)` — counts files in the same commits; labels high/medium/low co-edit rate
- `format_file_history(...)` — produces a `{file-history}` block with editors, step-by-step, co-edited files, summarized commits

**Manual Slop's equivalent (partial).** Manual Slop's `_reread_file_items` (in `ai_client.py`) does mtime-based *current* content re-reading with diff injection as `[SYSTEM: FILES UPDATED]`. It does *not* do git history injection.

The closest things Manual Slop has:
- **Git commit-linked discussion tracking** in the GUI: each discussion has a "Update Commit" button that stamps `git rev-parse HEAD` (per `docs/guide_gui_2.md` §"Discussions Sub-Menu").
- **`src/dag_engine.py`** tracks ticket-to-git-commit relationships, but for *MMA* workers, not for the AI's context.

**Verdict.** **PARTIAL.** Manual Slop has current-content diff injection (the easy half) but lacks historical-context injection (the harder half). nagent's `summarize_new_file_commits` would be a useful addition to the Manual Slop AI's context — especially for "explain what this file does" questions where the LLM is meeting the file fresh.

**Domain tag:** Application. *Future-track candidate: a `src/git_history.py` module that mirrors nagent's `file_edit_history_and_summary_block` and is invoked at discussion start (after `aggregate.py`).*

---

## 8. Historical coupling & artifact neighborhoods

**nagent's claim.** A file lives in a neighborhood of related artifacts. Files that change together in git history are hints: tests, headers, config, paired implementation. High co-edit rate means "look here maybe." Not "edit everything."

**nagent's implementation.** `coedited_file_rows(repo_root, rel_path, commits)`:
- Counts files in the same commits as the target
- Labels: high (>=50% co-edit), medium (>=20%), low
- Renders a `| file | commits together | P(other file changed | target file changed) |` table
- Guidance text: "Use these files as hints. Before editing, inspect high-likelihood co-edited files when the requested change may affect interfaces, tests, config, or paired code. Do not edit them unless the user request or evidence requires it."

**Manual Slop's equivalent.** None. Manual Slop has `py_get_hierarchy` (subclass scan) and `ts_c_*_get_*` AST tools, but **no tool that returns "files that historically co-edit with this file."** The closest is `derive_code_path` (call-graph trace), which is structural not historical.

**Verdict.** **GAP.** This is a real missing tool. nagent's framing — "hints, not commands" — is exactly the right level for a co-edit suggestion. A 50-line tool (`py_coedit_files(path) -> list[(path, count, likelihood)]`) would fill the gap.

**Domain tag:** Application. *Future-track candidate: a `py_coedited_files` MCP tool + `ts_c_coedited_files` for C/C++.*

---

## 9. Disposable sub-conversations

**nagent's claim.** Exploration creates noise. Spawn disposable workers. Sub-conversations are temporary nagent processes with isolated conversations. Their lifetime does not matter. The artifact they return matters.

**nagent's implementation.** `<nagent-conversation>` tag in the main loop's response:
- Parent appends `<nagent-conversation prompt="...">` to its conversation
- Parent spawns `nagent --invocation delegated --parent-conversation <name> --json` as a subprocess
- Child's `--json` output is parsed, rolled up into the parent's `recursive_input_tokens` / `recursive_output_tokens`
- Child has its own conversation file; no shared context except the explicit prompt
- Parent gets a concise artifact: the child's `<nagent-response>` content, plus token usage

**Manual Slop's equivalent (corrected per user).** The first draft of this report claimed **PARITY (stronger in some ways)**. The user corrected this:

> *"I don't know if I have disposable sub-conversations, I don't really have them for non-mma runs. I probably want to add that for just 1:1 discussions where I use a sub-agent manually for specific points."*

So the actual picture is:

| Layer | Sub-conversation support |
|---|---|
| **MMA Tier 3 / Tier 4** | **Yes.** `mma_exec.py` spawns a real subprocess per ticket with Context Amnesia. `ai_client.reset_session()` at start of `run_worker_lifecycle`. The Ticket output is the "distilled artifact" returned to the parent (`ConductorEngine`). Per the docs: *"Tier 3 worker is a fresh subprocess with a clean context window, receiving only the prompt and the relevant context slice."* |
| **1:1 main discussion** | **No.** The Application's chat loop has no sub-conversation mechanism. The user types a prompt, the AI responds, the loop continues. There's no way to "ask a sub-agent to investigate X and bring back the answer." |

The user is correct: this is a gap. The MMA pattern is the prototype. A future track could extract `MMA's run_worker_lifecycle` into a reusable `app.spawn_sub_conversation(prompt, allowed_tools=...)` method that the App can call from `pre_tool_callback` or from a new "investigate this" command.

**Verdict.** **PARITY for MMA; GAP for 1:1 discussions.** The MMA pattern is strong. The 1:1 chat has no equivalent. The user explicitly flagged this as a want.

**Domain tag:** Application (and possibly Meta-Tooling). *Future-track candidate: a `src/sub_conversation.py:SubConversationRunner` that the App can call to spawn disposable sub-agents on-demand during 1:1 discussions. Per the user: useful for "specific points" within a longer conversation.*

---

## 10. Controlled writes

**nagent's claim.** A loop that writes files needs explicit boundaries. nagent is a reference implementation with conventions, **not a sandbox**. Shell runs with your permissions. Structured writes are checked. That is not a security boundary. Do not pretend it is.

**nagent's implementation.**
- `validate_write_path(path, file_edit_path, ...)` — in main mode: path must be in `/tmp`, `/var/tmp`, or `$TMPDIR`. In file-edit mode: path must be the target file (or one of its split segments).
- Rejected writes append `<nagent-write-result status="error">` to the conversation.
- `<nagent-shell>` runs whatever the LLM wrote, with the user's permissions, in the user's working directory. **There is no shell sandbox.** This is explicit.

**Manual Slop's equivalent.** Manual Slop has a *much* stronger security model:

| nagent | Manual Slop |
|---|---|
| `validate_write_path`: in main mode, path must be in `/tmp`, `/var/tmp`, or `$TMPDIR` | `mcp_client._is_allowed`: in main mode, path must be in the allowlist (constructed from `file_items` + `extra_base_dirs`); history.toml and `*_history.toml` are *always* blocked |
| `execute_write` writes the file directly | `set_file_slice` / `edit_file` / `py_update_definition` route through AST or string-match for validation |
| `<nagent-shell>` runs the user's full shell, full permissions, no approval | `run_powershell(script, base_dir, qa_callback=...)` requires GUI modal approval (Execution Clutch), 60s timeout, `taskkill` cleanup, optional Tier 4 QA on failure |
| No per-tool allowlist | 3-layer security: `configure` (allowlist) → `_is_allowed` (path validation) → `_resolve_and_check` (resolution + symlink resolution) |
| No sandbox at all | PowerShell-only (no bash/cmd) by default; can be enabled in `[mcp_env.toml]` |

**Verdict.** **PARITY (STRONGER on Manual Slop's side).** Manual Slop's HITL-required shell execution + 3-layer allowlist is *dramatically* more secure than nagent's tmpdir check. The user explicitly chooses "less safety but more flexibility" with nagent, and "more safety but more friction" with Manual Slop.

**Domain tag:** Both. The Application needs Manual Slop's strict model. The Meta-Tooling could legitimately use nagent's looser model *because the human is in the loop* (the bridge script pops a GUI dialog).

---

## 11. Large files as explicit artifacts (split/patch)

**nagent's claim.** Big files exceed context. Split them. Do not pretend they fit. The split is a *data structure* with `index.json` and segment files; the patch is a unified diff; the source hash validates that nothing changed.

**nagent's implementation.**

The 4-file pipeline:
1. **`nagent-file-split <file> --output <dir> --split <type> [--summarize] [--refresh INDEX] [--target-bytes 32768] [--natural]`**:
   - `EXTENSION_MAP` covers 11 languages (txt, md, cpp, py, xml, js, ts, json, yaml, go, rs, java)
   - Per-language `SCORE_BY_TYPE` (no tree-sitter; regex + line-counting + brace/JSON/XML depth counters)
   - `py_score` rewards blank lines followed by `def`/`class`/`async def`
   - `cpp_score` uses `brace_depth` to find closing braces at depth 0
   - `json_score` uses `json_depth` to find closing `}`/`]` at depth 0
   - Writes `index.json` with `source_path`, `sourcesha256`, `source_size_bytes`, `source_line_count`, `split_type`, `target_bytes`, `natural`, `created_at`, `segment_count`, `segments[]`
   - Each segment is a separate file with `name-0001.py`, `name-0002.py`, etc.
   - `--summarize` flag spawns `nagent-file-summarize` per-segment subprocess
2. **User edits the segment files** (in place, via vim, etc.)
3. **`nagent-file-patch <index> [--patch PATH] [--dry-run] [--force]`**:
   - `validate_index(index, require_hash_match=not force)` — **strict** hash check; rejects if source changed
   - `merge_segments(segments) -> str` — concatenates segment contents in order
   - `make_unified_patch(source, original, updated)` — `difflib.unified_diff`
   - Writes the patch file; if `apply=True` and `changed=True`, writes the source
4. **`nagent-file-summarize <file> [--limit-word-count N] [--output DIR] [--json]`**:
   - Files > 64 KB cascade to `nagent-file-split --summarize` first
   - `summarize_content` retries up to `SUMMARY_MAX_ATTEMPTS = 2` if the LLM overshoots the word limit
   - `combined_summary_from_index` glues per-segment summaries into one

**Manual Slop's equivalent (different mechanism, same insight).** Manual Slop has all the *parts* of nagent's split/patch/summarize, but they live in different files and use different mechanisms:

| nagent | Manual Slop |
|---|---|
| `nagent-file-split` with per-language `SCORE_BY_TYPE` (regex + line counts + brace/JSON/XML depth) | `aggregate.py:build_file_items()` + `py_get_skeleton` (tree-sitter) + `ts_c_*_get_skeleton` (tree-sitter) + `outline_tool.py` |
| `index.json` with `source_path`, `sourcesha256`, `segments[]` | No explicit `index.json`. The "split" is implicit in `_reread_file_items` (mtime-based, not hash-based) and the `py_get_skeleton` tool returns the structural view on demand. |
| `nagent-file-patch` with strict `validate_index` (hash check) | `set_file_slice` / `edit_file` with `result of file.read_text()` pre-write validation. No hash-based pre-validation. |
| `nagent-file-summarize` with per-segment LLM call + retry | `run_subagent_summarization(file_path, content, is_code, outline) -> str` (in-process LLM call) |
| Combined `combined_summary_from_index` | No equivalent; `aggregate.build_markdown_no_history` builds a single markdown per call |
| `nagent-file-summarize` cascades to `nagent-file-split --summarize` for > 64 KB | `RAGEngine._chunk_code` cascades to chunking for Python (mtime-based invalidation, ChromaDB persistence) |

**Crucial difference: Manual Slop uses tree-sitter, nagent does not.** nagent's per-language scoring functions are *all regex-based* (`cpp_score` looks for closing braces at depth 0; `py_score` looks for blank lines followed by `def`/`class` keywords; no AST parsing). Manual Slop's `py_get_skeleton` and `ts_c_*_get_skeleton` use the tree-sitter library for actual AST traversal.

This is a trade-off. Tree-sitter is more accurate but requires a native dependency. nagent's approach works on any Python install with no compiled extensions. For the Application domain, tree-sitter is already a dependency (`file_cache.py`); for the Meta-Tooling, nagent's regex approach has appeal.

**Verdict.** **PARITY (DIFFERENT MECHANISM).** Both have the "split / patch / summarize as explicit data artifacts" insight. nagent uses subprocesses + per-language scoring + hash validation. Manual Slop uses tree-sitter + in-process calls + mtime validation. The key safety property — *"the patch operation validates the source hasn't changed"* — is done by nagent via SHA-256; Manual Slop does it implicitly by re-reading the file and string-matching. Manual Slop could adopt the explicit hash approach for stronger guarantees.

**Domain tag:** Both. *Future-track candidate: an explicit `src/split_lib.py` + `src/patch_lib.py` mirroring nagent's design, used by the Application for very-large-file scenarios (e.g., a 200KB legacy C file where skeleton + sig + def aggregation isn't enough).*

---

## 12. Tool discovery (self-describing executables)

**nagent's claim.** Tool capability should be explicit data too. No central registry. Tools describe themselves.

**nagent's implementation.** `bin/helpers/nagent_cli.py:collect_bin_tool_descriptions(bin_dir)`:
- Iterates every executable in `bin/`
- Runs each with `--description` (10s timeout per)
- Captures stdout, parses it
- Concatenates into a single "Available tools:\n\n<description 1>\n\n<description 2>\n..." block
- Inserts this block into the initial context

Each tool's `__main__` starts with:
```python
def exit_on_description(description: str) -> None:
    if "--description" in sys.argv:
        print(description)
        raise SystemExit(0)
```

So `nagent-file-split --description` prints "Split a large file into structure-aware segments..." and exits 0. The main `nagent` loop calls `collect_bin_tool_descriptions` once at startup.

**Manual Slop's equivalent.** None. The 45 MCP tools in `src/mcp_client.py` are dispatched by a flat if/elif chain in `dispatch()`:
```python
def dispatch(tool_name, tool_input):
    if tool_name.startswith("bd_"):
        return _dispatch_beads(tool_name, tool_input)
    if tool_name == "read_file":
        return _read_file(tool_input["path"])
    if tool_name == "py_get_skeleton":
        return _py_get_skeleton(tool_input["path"])
    # ... 45+ branches ...
    return f"ERROR: unknown tool: {tool_name}"
```

Adding a new tool requires:
1. Edit `dispatch()` to add the branch
2. Update the security allowlist in `_resolve_and_check` (if filesystem access)
3. Update the AI capability declaration in `get_tool_schemas()`
4. Add tests

nagent's approach: drop an executable in `bin/`, implement `exit_on_description`, done. The tool is auto-discovered.

The user (per the pushback): *"The tool use is kinda upfront, I want to add an intent based dsl to help with 'discovery' or combinatorics but no where near that ideation yet."* — so this is a known want, but low priority.

**Verdict.** **GAP (Application).** nagent's pattern is genuinely better here, but Manual Slop has 45 tools in production and a migration would be a big refactor. The win is real (extensibility) but the cost is also real (rewrite the dispatch layer).

**Domain tag:** Both. For the Meta-Tooling (the `scripts/` directory), nagent's pattern is more aligned with the external-agent usage model. For the Application, the existing `dispatch` if/elif is fine.

*Future-track candidate: a `mcp_architecture_refactor_20260606` (already on the board) would benefit from nagent's pattern. The "sub-MCP" extraction the planned refactor proposes is exactly the right scope for this — each sub-MCP could be its own self-describing module.*

---

## 13. Differences from frameworks

nagent's philosophical frame: framework-style systems hide state in object graphs and long-lived agent abstractions; nagent keeps everything as explicit files. The reframing table at the end of the nagent README is excellent:

| Common term | nagent framing |
|---|---|
| memory | editable artifact |
| retrieval | preserved work / historical context |
| agent | temporary transformation function |
| context | explicit input data |

This report's §2-§12 have been showing where Manual Slop *agrees* with nagent's reframings and where it *deliberately diverges*.

**Verdict.** The reframing is useful. The application can pick and choose which reframings to adopt per layer.

**Domain tag:** Both. This is the philosophical lens for the whole report.

---

## 14. Build your own

nagent's last section: *"The minimal system is not mystical. Small loop over explicit state."* The list of 12 buildable steps: `generate_text(file) -> str`, growing conversation document, initial context with the contract, output format + parser, handlers that append results to state, loop after actions, visible retry on malformed output, child loops for delegation, per-artifact memory, repository history → context blocks, split/index/patch for large files, save/load/edit/summarize for memory maintenance.

**Verdict.** Manual Slop *has* all 12 of these. Just in different files, with different names, and at a different scale.

**Domain tag:** Both. The 12-step list is a useful checklist for any future LLM-application track.

---

## 15. The 6 Pitfalls (Revised from 8, after User Corrections)

The first draft of this report had 8 pitfalls. The user-corrections on §3 and §6 collapsed 2 of them. The remaining 6:

### Pitfall 1: No structured output protocol in the Application AI

The Application uses opaque provider-native function calling. The user can read the conversation, but cannot read a `tool_call` from the comms log without knowing the provider's schema. nagent's regex-tag protocol is more debuggable for the Meta-Tooling. **Decision: not a problem for the Application (provider-native is the right choice). Worth borrowing for the Meta-Tooling.** **Domain tag:** Both. *Future-track candidate: an intent-based DSL for Meta-Tooling agent calls.*

### Pitfall 2: Provider-specific history is in process globals

`src/ai_client.py` has `_anthropic_history`, `_deepseek_history`, `_minimax_history` — 3 separate per-provider history lists, each with their own lock. Switching providers mid-session loses history. nagent's "single conversation file" model is provider-agnostic.

**Concrete change:** A future refactor toward a stateless `LLMClient` class with an explicit `Conversation` object (the transcript as a `list[Message]`) would let:
- Users save/load/replay conversations
- Provider switching doesn't lose history
- Tier 4 QA and Tier 3 workers share a common conversation format

**Domain tag:** Application. *Future-track candidate: a `src/conversation.py:Conversation` dataclass + `src/llm_client.py:LLMClient` stateless wrapper around the 5 providers.*

### Pitfall 3: RAG is not "history as data"

Manual Slop's RAG (`src/rag_engine.py`) is fuzzy and not auditable. nagent's git-history-driven context is exact and inspectable. RAG is useful but should be **additive**, not a replacement. The Application's `_reread_file_items` mtime-based diff injection is the "history as data" mechanism Manual Slop already has.

**The user's clarification:** *"RAG is an optional thing, doesn't have to be used. Would be cool to have a sub agent maybe prepare a rag chunks before I use them in a run."*

**Decision:** RAG stays. The user wants a *staging* workflow: a sub-agent prepares RAG chunks before a run, the chunks become the discussion's starting memory. This is consistent with the nagent-inspired sub-conversation pattern (§9).

**Domain tag:** Application. *Future-track candidate: a "RAG pre-staging" sub-conversation runner that pre-builds the index for a planned run.*

### Pitfall 4: The AI client is a stateful singleton with module-level globals

2,685-line `src/ai_client.py`. The module is the abstraction layer. To import it for testing, you trigger 5 provider SDKs' lazy imports. The unit tests are the only way to know what state is in flight.

This is the *opposite* of nagent's "files are the system; the process is a worker." nagent's `run_agent_loop` is 50 lines, stateless, testable. A future refactor toward a stateless `LLMClient` class would make `ai_client` parseable, testable, and saveable.

**Domain tag:** Application. *Future-track candidate: a `src/llm_client.py:LLMClient` class with explicit `Conversation`, `Provider`, `History` objects. Backwards-compatible with the current `ai_client.send()` API.*

### Pitfall 5: No non-MMA disposable sub-conversations

The MMA pattern is strong. The 1:1 chat has no equivalent. The user *explicitly* flagged this as a want: *"I probably want to add that for just 1:1 discussions where I use a sub-agent manually for specific points."*

**Decision:** Design `src/sub_conversation.py:SubConversationRunner` that the App can call to spawn disposable sub-agents on-demand during 1:1 discussions. Reuse MMA's subprocess pattern (`mma_exec.py` as the template). The sub-agent returns a concise artifact to the parent (nagent's pattern). Useful for "investigate this file" / "summarize this concept" / "look up this API" commands.

**Domain tag:** Application. *Future-track candidate: a `src/sub_conversation.py` + a GUI "Investigate…" button on the message panel.*

### Pitfall 6: Hard-coded tool discovery

The 45 MCP tools in `mcp_client.py:dispatch` are in a flat if/elif chain. nagent's `--description` self-describing executable pattern is more extensible.

**The user's position:** *"The tool use is kinda upfront, I want to add an intent based dsl to help with 'discovery' or combinatorics but no where near that ideation yet."*

**Decision:** Low priority. The `mcp_architecture_refactor_20260606` (already on the board) is the natural place to address this — sub-MCPs as self-describing modules.

**Domain tag:** Both. *Future-track candidate: subsumed by mcp_architecture_refactor_20260606.*

### Pitfalls removed by user-corrections

- **(removed)** Pitfall about "Conversation state is buried in module-level globals" — overstated. Manual Slop has editable UI state (Takes, UISnapshot, ContextPreset); it lacks editable *raw transcripts*, but that's a *different* design choice, not a gap. (See §3.)
- **(removed)** Pitfall about "per-file memory" — overstated. Manual Slop *does* have per-file memory in the curation dimension; what's missing is nagent's conversation-log dimension, which is a different optimization. (See §6.)

---

## 16. Recommended reading path for engineers

If you haven't read nagent, here's the priority:

1. **The README's first 3 sections** ("What It Looks Like", "Durable Work", "Text In Text Out") — the philosophy in 5 minutes.
2. **`bin/nagent:run_agent_loop()`** — the actual loop, 50 lines.
3. **`bin/helpers/nagent_file_split_lib.py:SCORE_BY_TYPE`** — the per-language scoring; shows what "structure-aware" can mean without tree-sitter.
4. **`bin/helpers/nagent_file_patch_lib.py:validate_index`** — the strict hash check; the safety property of nagent's split/patch workflow.
5. **`bin/helpers/nagent_file_summarize_lib.py:summarize_content`** — the retry-with-smaller-prompt pattern.
6. **`bin/helpers/nagent_cli.py:collect_bin_tool_descriptions`** — the tool-discovery pattern; 30 lines.

The README's 14 sections can be skimmed in 15 minutes if you have the context this report provides. Read in order 1-5 above for the implementation depth.

---

## Appendix A. Cross-reference table

| nagent file | Lines | Purpose | Manual Slop equivalent |
|---|---|---|---|
| `README.md` | ~1500 | 14-section teaching document | This report + `docs/guide_*.md` |
| `bin/nagent` | ~700 | Main loop, tag parser, sub-conversation runner | `src/ai_client.py:send` + `src/multi_agent_conductor.py:ConductorEngine.run` + `simulation/workflow_sim.py:WorkflowSimulator.run_discussion_turn_async` (3 separate loops) |
| `bin/nagent-llm-text` | ~50 | CLI wrapper for `nagent-llm.py` | (implicit; the Application calls `ai_client.send` directly) |
| `bin/nagent-llm-upload` | ~30 | File upload + LLM call | (not present; the Application's read tools handle files inline) |
| `bin/nagent-file-edit` | ~120 | Per-file subprocess wrapper | (not present; this is the gap that the user wants for 1:1 discussions) |
| `bin/nagent-file-split` | ~170 | Main split executable | (not present in this form; Manual Slop uses `aggregate.py` + tree-sitter) |
| `bin/nagent-file-patch` | ~80 | Main patch executable | (not present; Manual Slop uses `set_file_slice` / `edit_file` directly) |
| `bin/nagent-file-summarize` | ~100 | Main summarize executable | `src/ai_client.py:run_subagent_summarization` (in-process) |
| `bin/helpers/nagent_cli.py` | ~80 | `--description` pattern, `WaitSpinner` | (not present) |
| `bin/helpers/nagent_llm.py` | ~300 | 4 providers, token accounting | `src/ai_client.py:_send_<provider>` × 5 (in-process, with cross-provider state) |
| `bin/helpers/nagent_file_edit_lib.py` | ~170 | file-index by inode, `resolve_file_edit_conversation` | (not present) |
| `bin/helpers/nagent_file_split_lib.py` | ~400 | `SPLIT_TYPES` (11 langs), per-language scoring | `src/file_cache.py:ASTParser` (tree-sitter) + `src/aggregate.py:build_file_items` |
| `bin/helpers/nagent_file_patch_lib.py` | ~130 | strict hash validation, `make_unified_patch` | (not present; implicit mtime check) |
| `bin/helpers/nagent_file_summarize_lib.py` | ~110 | per-segment LLM call, retry-with-smaller-prompt | `src/ai_client.py:run_subagent_summarization` (in-process, no retry) |
| **Total nagent** | **~4000** | | **Manual Slop's analogous parts: ~5000+** (ai_client + multi_agent_conductor + mcp_client + aggregate + rag_engine + history + project_manager + tree-sitter-based tools) |

Manual Slop is *not* smaller than nagent; it's *larger* because it has a GUI, persistence, HITL dialogs, Hook API, and a real test harness. The architectures serve different scales.

---

## Appendix B. Citations

- nagent source: https://github.com/macton/nagent (all 11 source files read in full)
- Internal: `docs/Readme.md`, `docs/guide_architecture.md`, `docs/guide_ai_client.md`, `docs/guide_mma.md`, `docs/guide_tools.md`, `docs/guide_mcp_client.md`, `docs/guide_app_controller.md`, `docs/guide_meta_boundary.md`, `docs/guide_context_curation.md`, `docs/guide_personas.md`, `docs/guide_rag.md`, `docs/guide_gui_2.md`
- Internal source (selectively read for user-corrections): `src/models.py` (FileItem, ContextPreset), `src/context_presets.py`, `src/project_manager.py` (branch_discussion, promote_take), `src/aggregate.py`, `src/history.py`
- Mike Acton, "Data-Oriented Design and C++" (cppCon 2014) — referenced but not directly cited
- Ryan Fleury, "The Easiest Way To Handle Errors Is To Not Have Them" — cited via the `data_oriented_error_handling_20260606` track

---

*End of report. See `comparison_table.md` for the flat reference, `decisions.md` for the future-track candidates, and `spec.md` for the track wrapper.*