conductor-tier2 9cc51ca9af conductor(track): nagent review - deep-dive + 6 pitfalls + 10 actionable takeaways

Reference/analysis track. Produces 0 code changes.

Artifacts (conductor/tracks/nagent_review_20260608/):
- spec.md (240 lines) - track wrapper with Application/Meta-Tooling framing
- report.md (571 lines) - 14-section deep-dive; primary deliverable
- comparison_table.md (79 lines) - flat side-by-side reference
- decisions.md (286 lines) - 10 future-track candidates with priority matrix
- nagent_takeaways_20260608.md (363 lines) - 10 actionable patterns grounded
  in code (file:line refs into nagent source and Manual Slop source)
- metadata.json (132 lines) - structured metadata + verification criteria
- state.toml (113 lines) - per-task tracking + user-corrections log (7 entries)

14 nagent principles covered in report.md (durable work, text-in/text-out,
editable state, visible protocol, the loop, per-file memory, repo history,
neighborhoods, sub-conversations, controlled writes, large files, tool
discovery, framework differences, build your own).

6 pitfalls (revised from 8 after user-corrections):
1. No structured output protocol in Application AI (opaque function calling)
2. Provider-specific history in process globals (ai_client._anthropic_history
   + _deepseek_history + _minimax_history)
3. RAG is not 'history as data' (fuzzy, not auditable)
4. AI client is a stateful singleton (2,685-line ai_client.py)
5. No non-MMA disposable sub-conversations (1:1 gap; user-flagged want)
6. Hard-coded tool discovery (45-tool if/elif in mcp_client.py)

User-corrections applied (3 rounds, 7 total corrections recorded):
- Editable discussions: PARTIAL -> PARITY (DIFFERENT FOCUS) with full A1-A7
  per-entry + B1-B11 discussion-level + C1-C5 undo/redo operation matrix
- Per-file memory: DOMAIN MISMATCH -> MANUAL SLOP IS STRONGER IN
  CURATION DIMENSION (FileItem + ContextPreset vs nagent's inode-keyed
  conversation log; complementary, not equivalent)
- Sub-conversations: MMA has it; 1:1 does not -> 'PARITY for MMA; GAP for
  1:1 discussions' (user wants this)
- RAG: opt-in, not gap; user wants pre-staging via sub-conversation
- Personas: config bundling (can opt out via AI settings)
- Tool discovery: deferred (user has 'intent based DSL' idea but 'no where
  near that ideation yet')

10 actionable takeaways (separate from the 6 pitfalls - those are
diagnosis, these are prescription):
1. State visibility (UI inspector for in-process state)
2. Readable conversation log (text-greppable, not just JSON-L)
3. Sub-agents for 1:1 (HIGH priority - user-flagged)
4. File-identity over file-path (st_dev:st_ino rename-safe)
5. One loop shape visible in diagnostics
6. Visible retry on protocol failure
7. Meta-Tooling DSL (intent-based, deferred)
8. Self-describing tools (subsumed by mcp_architecture_refactor_20260606)
9. Single source of truth for disc_entries + provider history
10. Sub-agent return type constraint (bake into candidate #1 spec)

Domain classification: every recommendation tagged Application / Meta-Tooling
/ Both per docs/guide_meta_boundary.md. nagent lives in the Meta-Tooling
domain; Manual Slop's Application AI is a different kind of thing.

No code modified by this track (reference/analysis only). All 7 files
parse cleanly (JSON, TOML, Markdown). All internal cross-links resolve.
Track is 'active' awaiting human review; future-track candidates live in
decisions.md and nagent_takeaways_20260608.md.

2026-06-08 18:44:35 -04:00


Track: Mike Acton's nagent — Deep Dive on LLM Agent Architecture

Status: Active (spec approved 2026-06-08; revised 2026-06-08 with user-corrections)
Initialized: 2026-06-08
Owner: Tier 2 Tech Lead
Priority: Medium (architectural; informs future Application+Meta-Tooling decisions but is not a code refactor)

Revision note (2026-06-08): This spec was revised after the first draft in response to direct user corrections. Earlier versions overstated gaps in Manual Slop's "editable discussion" and "per-file memory" features; the corrections are folded into §3 and §4 below. Read report.md for the actual analysis; this spec.md is only the wrapper.

1. Overview

This track documents a deep-dive analysis of Mike Acton's macton/nagent reference implementation ("nagent" = "not-an-agent") and its implications for how Manual Slop should think about LLM-driven workflows.

nagent is a 14-section, ~1,500-line Python reference that operationalizes the philosophy "the agent is not the thing; the data is the thing." It provides a concrete, minimal counterpoint to the standard "agent framework" model. Its central claim: durable work matters more than durable processes; explicit artifacts beat opaque state.

The companion doc (report.md) is the deep-dive analysis itself — a 14-section comparison against Manual Slop's actual implementation, written for engineers (not marketing). This spec.md is the conductor/track wrapper: the design intent, the relationship to the Application vs Meta-Tooling split, the planned follow-up tracks, and the out-of-scope notes.

1.1 What this track produces

| Artifact | Purpose |
|---|---|
| `spec.md` | This file: the track design and scoping. |
| `report.md` | The 14-section deep-dive analysis. The primary deliverable. |
| `comparison_table.md` | A flat side-by-side table (one row per nagent principle) for quick reference. |
| `decisions.md` | Future-track candidates extracted from the analysis (each becomes a follow-up track if approved). |

1.2 Non-Goals

Not rewriting Manual Slop to use nagent. The architectures serve different domains (see §2).
Not replacing any existing track. This is a reference track — it informs future tracks but doesn't compete with them.
Not a comparison of "framework vs framework." nagent is a 1,500-line reference; Manual Slop is 13,000+ lines of production code with a real GUI, real persistence, real HITL. The comparison is philosophical, not "which is better."

2. The Application / Meta-Tooling Distinction (load-bearing context)

Per docs/guide_meta_boundary.md, Manual Slop lives in two distinct architectural domains. This distinction is critical for understanding the nagent comparison:

| Domain | Lives at | AI / HITL Model | Tooling |
|---|---|---|---|
| The Application (`manual_slop`) | `gui_2.py`, `ai_client.py`, `multi_agent_conductor.py`, `dag_engine.py` | A local GUI for orchestrating AI. The "Application AI" is a long-lived assistant that the user talks to over many turns. Strict HITL: every destructive action requires a GUI modal approval. | `manual_slop.toml [agent.tools]`: strict allowlist |
| The Meta-Tooling (us) | `scripts/mma_exec.py`, `conductor/`, `.agents/skills/`, the MCP tools in `mcp_client.py` when used by external agents | External agents (Gemini CLI, OpenCode, Claude Code) that build the Application. Each invocation is a fresh sub-agent. Token-firewalled. | Full mcp_client.py toolset, including mutation tools |

nagent lives in the Meta-Tooling domain. nagent is a reference for how external agents (the ones reading this conversation, the ones writing the code) should structure their own work.

Manual Slop's Application AI does not — and should not — look like nagent. The Application AI is a chatty, conversational, persona-driven, RAG-augmented, curation-rich assistant with a real GUI. It's a different kind of thing. Conflating the two is exactly the kind of "feature bleed" guide_meta_boundary.md warns against.

Every recommendation in report.md is qualified with which domain it applies to. The Application is the production code the user cares about; the Meta-Tooling is what we (the agents) use to build it.

3. Summary of the 14-Section Comparison

The full table is in comparison_table.md. Verdict summary:

| nagent Principle | Manual Slop Equivalent | Verdict |
|---|---|---|
| 1. Durable work, disposable workers | AppState snapshots + history branching (Takes); MMA workers are real subprocesses | PARTIAL: different domains; MMA has it, App doesn't need it |
| 2. Text in, text out | `ai_client.send()` returns `str`; `mcp_client.dispatch` returns `str` | PARITY |
| 3. Conversations are editable state | Discussion takes + branching + edit-in-place + UISnapshot history; `ContextPreset` for per-file view-mode memory | PARITY (DIFFERENT FOCUS): Manual Slop has this; focuses on editable UI state (per Take) and editable per-file curation (per FileItem), not editable conversation logs |
| 4. Visible output protocol | Uses provider-native function calling; the protocol is opaque to humans | ARCHITECTURAL DIFFERENCE: Application-side; correct trade-off |
| 5. The loop (append, call, parse, act, repeat) | `ai_client._send_*` tool-call loop, MMA `ConductorEngine.run`, `WorkflowSimulator.run_discussion_turn_async` | PARITY, but the loop is in multiple files, not a single small function |
| 6. Per-file memory (curation, not conversation log) | `FileItem` (path + view_mode + ast_mask + custom_slices); `ContextPreset` (saved set of FileItems); Fuzzy Anchor slices | MANUAL SLOP IS STRONGER IN THE CURATION DIMENSION; nagent's "file-edit conversation" pattern (one conversation log per file) is not present |
| 7. Repository history as data | `_reread_file_items` mtime-based diff injection; `git_commit_file_patch` per-file history summaries; no explicit "neighborhood" computation | PARITY (PARTIAL): diff injection is similar; the "neighborhood" computation is missing |
| 8. Historical coupling & artifact neighborhoods | n/a (no equivalent) | GAP: could be added as a new tool |
| 9. Disposable sub-conversations | MMA `mma_exec.py` Tier 3 workers are real subprocesses; non-MMA 1:1 discussions do NOT have disposable sub-conversations yet (per user) | GAP (Application): useful for 1:1 discussions; PARITY for MMA |
| 10. Controlled writes | MCP 3-layer security + Execution Clutch + Allowlist Construction + Path Validation + Resolution Gate | PARITY (STRONGER): Manual Slop's 3-layer is more thorough than nagent's tmpdir check |
| 11. Large files as explicit artifacts (split/patch) | `nagent-file-split`/`nagent-file-patch`/`nagent-file-summarize` with `index.json` + segment files + source hash validation; 32 KB target size; per-language natural splitters (no tree-sitter) | PARITY (DIFFERENT MECHANISM): both have the insight; nagent uses per-language scoring functions + subprocess isolation, Manual Slop uses tree-sitter + in-process `summarize.py` |
| 12. Tool discovery (self-describing executables) | Hard-coded `dispatch` if/elif chain in `mcp_client.py` | GAP (Application): could be added; useful for the Meta-Tooling domain |
| 13. Differences from frameworks | The philosophical frame | n/a |
| 14. Build your own | The reference's "minimal" claim is wrong for the Application | n/a for Application |

The full 14-row analysis with 6 (revised from 8) specific Manual Slop pitfalls is in report.md.
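
Rows 4 and 5 of the comparison (a visible output protocol, and the append/call/parse/act/repeat loop) can be illustrated together in a few lines. The following is only a sketch of the loop shape, not nagent's actual code: the `<run>` tag, `run_loop`, `fake_llm`, and `fake_act` are hypothetical names, and the model is a stub so the example runs without a provider.

```python
import re
from typing import Callable

# Hypothetical tag for illustration; nagent's real tag names are not reproduced here.
RUN_TAG = re.compile(r"<run>(.*?)</run>", re.DOTALL)


def run_loop(llm: Callable[[list[str]], str],
             act: Callable[[str], str],
             prompt: str,
             max_turns: int = 8) -> list[str]:
    """Append, call, parse, act, repeat. Returns the full transcript."""
    transcript = [f"user: {prompt}"]             # append
    for _ in range(max_turns):
        reply = llm(transcript)                   # call (text in, text out)
        transcript.append(f"assistant: {reply}")
        requests = RUN_TAG.findall(reply)         # parse the visible protocol
        if not requests:                          # no tags: the model is done
            break
        for request in requests:                  # act, then feed results back
            transcript.append(f"tool: {act(request.strip())}")
    return transcript


def fake_llm(transcript: list[str]) -> str:
    """Stub model: asks for one tool call, then answers."""
    if any(line.startswith("tool: ") for line in transcript):
        return "The answer is 4."
    return "Let me check. <run>2 + 2</run>"


def fake_act(request: str) -> str:
    """Stub tool: adds two integers written as 'a + b'."""
    left, right = request.split("+")
    return str(int(left) + int(right))


log = run_loop(fake_llm, fake_act, "What is 2 + 2?")
```

Because the protocol is plain text in the transcript, `log` can be printed, grepped, or edited by hand, which is the property the "visible output protocol" row is about.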

4. The Revised 6 Pitfalls (corrected)

Earlier versions of this list contained two errors that user-corrections caught:

REMOVED pitfall #3 ("Conversation state is buried in module-level globals"): the claim was overstated. Manual Slop has some editable-state infrastructure (HistoryManager with UISnapshot, discussion Takes/branching, ContextPreset save/load), but the raw conversation transcript still lives in the ai_client._provider_specific_history globals. In short, Manual Slop has editable UI state, not editable conversation transcripts. That distinction is now captured in §3 of the report.
REVISED pitfall #6 ("Per-file memory"): Manual Slop does have a per-file memory concept (FileItem + ContextPreset + custom_slices + ast_mask), but it is curation memory, not nagent's conversation-log memory. Manual Slop's concept is richer in the curation dimension and absent in the conversation-log dimension. That is a useful distinction.

The remaining 6 pitfalls, after corrections:

1. No structured output protocol in the Application AI (uses opaque function calling; nagent's regex tag protocol is the alternative for the Meta-Tooling). Domain: Application can stay opaque; Meta-Tooling should learn from it.
2. Provider-specific history is in process globals (5 separate per-provider lists with their own locks; switching providers mid-session loses history). Domain: Application. Future-track candidate.
3. RAG is not "history as data": RAG retrieval is fuzzy and not auditable, while nagent's git-history-driven context is exact and inspectable. RAG is useful but should be additive, not a replacement. Domain: Application. Coexists with nagent-style history.
4. The AI client is a stateful singleton with module-level globals (the 2,685-line ai_client.py is hard to follow without knowing that hidden state). A future refactor toward a stateless LLMClient class with explicit Conversation objects would let the App save/load/replay conversations as files. Domain: Application. Future-track candidate.
5. No non-MMA disposable sub-conversations: only MMA workers are real subprocesses, and the user explicitly noted that 1:1 discussions don't have sub-agents. nagent's <nagent-conversation> pattern (a sub-agent for bounded investigation) would be valuable for the Application. Domain: Application. Future-track candidate (user-flagged as a want).
6. Hard-coded tool discovery: the 45 MCP tools are in a flat if/elif chain in dispatch. nagent's --description self-describing executables pattern is more extensible. Domain: Both. Low priority.
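
The provider-history and stateful-singleton pitfalls above both point toward a single provider-neutral transcript that is passed in explicitly and can be written to disk. A minimal sketch of that direction follows. Everything in it is an assumption made for illustration: `Message`, `Conversation`, `send`, the JSON layout, and the stub providers are hypothetical and are not Manual Slop's or nagent's actual API.

```python
import json
from dataclasses import dataclass, field, asdict
from pathlib import Path
from typing import Callable


@dataclass
class Message:
    role: str   # "user" or "assistant"
    text: str


@dataclass
class Conversation:
    """Provider-neutral transcript; the only place history lives."""
    messages: list[Message] = field(default_factory=list)

    def save(self, path: Path) -> None:
        path.write_text(json.dumps([asdict(m) for m in self.messages], indent=2))

    @classmethod
    def load(cls, path: Path) -> "Conversation":
        return cls([Message(**m) for m in json.loads(path.read_text())])


# A "provider" here is just a function from transcript to reply text.
Provider = Callable[[list[Message]], str]


def send(conv: Conversation, provider: Provider, user_text: str) -> str:
    """Stateless send: all state is in `conv`, so the provider can change between calls."""
    conv.messages.append(Message("user", user_text))
    reply = provider(conv.messages)
    conv.messages.append(Message("assistant", reply))
    return reply


def provider_a(messages: list[Message]) -> str:
    return f"A saw {len(messages)} message(s)"


def provider_b(messages: list[Message]) -> str:
    return f"B saw {len(messages)} message(s)"


conv = Conversation()
send(conv, provider_a, "hello")
send(conv, provider_b, "still there?")   # different provider, same history
```

In this shape, switching providers mid-session does not lose history, because no provider owns any of it; a real client would still need a thin adapter per provider to translate `Message` lists into that provider's request format.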

Plus 2 domain-level recommendations that are not pitfalls per se:

Personas are config bundling (per user: "just bundles preparatory cruft — vendor/model, tools/permissions, and system prompts"). The user noted that you can completely opt out by just using AI settings directly. Domain: Application. Keep as-is; not a pitfall.
RAG is opt-in (per user: "doesn't have to be used"). Worth considering: a sub-agent that prepares RAG chunks before a run. Domain: Application. Future-track candidate.

5. What This Track Read (in full, before writing)

To avoid hand-waved claims, the report and this spec were written after reading all of:

nagent source (read in full)

README.md (~1,500 lines) — the 14-section "teaching document"
bin/nagent (~700 lines) — the main loop, tag parser, sub-conversation runner, git history + co-edit + summary integration
bin/helpers/nagent_llm.py (~300 lines) — provider dispatch, token accounting
bin/helpers/nagent_cli.py (~80 lines) — --description self-describing executable pattern, WaitSpinner
bin/helpers/nagent_file_edit_lib.py (~170 lines) — file-index by st_dev:st_ino, resolve_file_edit_conversation, is_split_segment_for_source
bin/helpers/nagent_file_split_lib.py (~400 lines) — SPLIT_TYPES (11 langs), per-language SCORE_BY_TYPE (no tree-sitter; regex + line counts + brace/JSON/XML depth), 32 KB default, source SHA-256 hashing
bin/helpers/nagent_file_patch_lib.py (~130 lines) — strict hash validation, make_unified_patch via difflib.unified_diff, apply_segment_patches writes the source
bin/helpers/nagent_file_summarize_lib.py (~110 lines) — per-segment LLM calls + retry-with-smaller-prompt (max 2 attempts), --limit-word-count validation, combined_summary_from_index
bin/nagent-file-edit (~120 lines) — per-file subprocess wrapper, default_pid = BASHPID or os.getppid()
bin/nagent-file-split (~170 lines) — main executable, --refresh INDEX mode for re-splitting without losing segment paths
bin/nagent-file-summarize (~100 lines) — main executable, cascades to nagent-file-split --summarize for files > 64 KB; uses positive_int CLI type (rejects 0)
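
The `st_dev:st_ino` file index noted for nagent_file_edit_lib.py is what makes per-file memory survive a rename. The sketch below illustrates the idea with the standard library only; it is not nagent's implementation, and `file_key` and `index` are hypothetical names.

```python
import os
import tempfile
from pathlib import Path


def file_key(path: Path) -> str:
    """Identity key built from device and inode numbers rather than the path string."""
    st = os.stat(path)
    return f"{st.st_dev}:{st.st_ino}"


with tempfile.TemporaryDirectory() as tmp:
    old_path = Path(tmp) / "notes.txt"
    old_path.write_text("hello\n")

    # Hypothetical per-file memory index keyed by identity, not by path.
    index = {file_key(old_path): "conversation log for this file"}

    new_path = Path(tmp) / "renamed.txt"
    os.rename(old_path, new_path)   # same filesystem, so the inode is preserved

    survived_rename = file_key(new_path) in index
```

One caveat worth keeping in mind: tools that save by writing a temporary file and renaming it over the original produce a new inode, and copying across filesystems does too, so an identity-keyed index needs a fallback for those cases.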

Manual Slop docs (read in full)

docs/Readme.md (434 lines) — docs index
docs/guide_architecture.md (989 lines) — threading model, cross-thread data structures
docs/guide_ai_client.md (424 lines) — multi-provider LLM client
docs/guide_mma.md (564 lines) — 4-tier MMA orchestration
docs/guide_tools.md (506 lines) — MCP tool inventory + Hook API
docs/guide_mcp_client.md (410 lines) — 45 tools + 3-layer security
docs/guide_app_controller.md (447 lines) — headless controller
docs/guide_meta_boundary.md (57 lines) — Application vs Meta-Tooling split
docs/guide_context_curation.md (303 lines) — Granular AST Control + Fuzzy Anchor Slices + AST Inspector
docs/guide_personas.md (307 lines) — Unified agent profile model
docs/guide_rag.md (411 lines) — RAG subsystem
docs/guide_gui_2.md (477 lines) — ImGui application (App/Controller state delegation, hot-reload, defer-not-catch)

Manual Slop source (selectively read, in service of the user-corrections)

src/models.py lines 510-559 (FileItem schema), 909-937 (ContextPreset schema)
src/context_presets.py (30 lines, full file) — the ContextPresetManager
src/project_manager.py lines 429-450 (branch_discussion, promote_take)
src/aggregate.py first 80 lines (context composition pipeline)
src/history.py (full file, 141 lines) — UISnapshot and the snapshot model

The user-corrections specifically drove a re-survey of FileItem + ContextPreset + aggregate.py + HistoryManager after the first draft overstated Manual Slop's gaps.

6. Architectural Reference

nagent source code: https://github.com/macton/nagent (read in full for this analysis)
nagent README: https://github.com/macton/nagent/blob/main/README.md (the 14-section "teaching document")
Mike Acton's data-oriented design talks: https://www.youtube.com/results?search_query=mike+acton+data+oriented (foundational; nagent is a specific application)
Ryan Fleury "errors are just cases": https://www.dgtlgrove.com/p/the-easiest-way-to-handle-errors (cited in data_oriented_error_handling_20260606; consistent with nagent's data-over-control-flow stance)
Internal: docs/guide_meta_boundary.md for the Application/Meta-Tooling split
Internal: docs/guide_architecture.md §"Thread Domains" for the cross-thread state-sync problem that nagent sidesteps by having no GUI

7. See Also

Internal Documentation

docs/Readme.md — Manual Slop documentation index
docs/guide_architecture.md — Threading model and provider dispatch
docs/guide_ai_client.md — The Application's LLM client
docs/guide_mma.md — 4-tier MMA orchestration
docs/guide_meta_boundary.md — The Application vs Meta-Tooling split
docs/guide_tools.md — MCP tool inventory and Hook API
docs/guide_mcp_client.md — 45 tools + 3-layer security
docs/guide_context_curation.md — Granular AST Control + Fuzzy Anchor Slices + AST Inspector
docs/guide_personas.md — Unified agent profile model
docs/guide_rag.md — RAG subsystem
docs/guide_gui_2.md — ImGui application

data_oriented_error_handling_20260606 — Already cites Acton by name. The Result[T] + ErrorInfo data model from this track is consistent with nagent's "data, not control flow" stance.
qwen_llama_grok_integration_20260606 — The "OpenAI-compatible shared helper" pattern is exactly nagent's "thin boundary adapter on a normalized data structure" approach.
mcp_architecture_refactor_20260606 — Already blocked by data_oriented_error_handling_20260606. The sub-MCP extraction (planned) will benefit from nagent's "small helper per concept" decomposition pattern.
data_structure_strengthening_20260606 — The type-alias work is consistent with nagent's "make the data shape explicit" stance. The audit script + NamedTuple work parallels nagent's split-index / patch-artifact approach.
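
The "data, not control flow" stance that links these tracks to nagent can be shown with a small example. The names `Result` and `ErrorInfo` come from the track description above, but their fields and the `parse_port` function below are assumptions made for illustration, not the track's actual definitions.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ErrorInfo:
    code: str      # hypothetical field: machine-readable error kind
    message: str   # hypothetical field: human-readable detail


@dataclass(frozen=True)
class Result(Generic[T]):
    value: Optional[T] = None
    error: Optional[ErrorInfo] = None

    @property
    def ok(self) -> bool:
        return self.error is None


def parse_port(text: str) -> Result[int]:
    """Return the failure as data instead of raising."""
    if not (text.isascii() and text.isdigit()):
        return Result(error=ErrorInfo("not_a_number", f"expected digits, got {text!r}"))
    port = int(text)
    if not 0 < port < 65536:
        return Result(error=ErrorInfo("out_of_range", f"port {port} is outside 1..65535"))
    return Result(value=port)


results = [parse_port(t) for t in ("8080", "http", "70000")]
error_codes = [r.error.code for r in results if not r.ok]
```

Here the failures sit in the same list as the successes and can be filtered, counted, or logged like any other data, which is the property the cited "errors as data" references argue for.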

External

Mike Acton, "Data-Oriented Design and C++" (CppCon 2014): the original DOD talk that nagent operationalizes
Ryan Fleury, "The Easiest Way To Handle Errors Is To Not Have Them" — Companion framework; same "errors as data" thesis
Timothy Lottes (@NOTimothyLottes) — Cited in the data_oriented_error_handling review; same "error codes are data" stance
Valigo (@valigotech) — Cited in the data_oriented_error_handling review; "exceptions mess with control flow in very weird ways"

8. Scope Boundaries

In Scope

The 14-section nagent philosophy
The 6 (revised) concrete pitfalls in Manual Slop
Mapping each pitfall to a future-track candidate (in decisions.md)
Application vs Meta-Tooling domain classification for every recommendation
The philosophical grounding for existing Manual Slop conventions (data-oriented, thread-disciplined, GUI-decoupled)

Out of Scope

Implementation work. This is a reference/analysis track. No code is being changed.
Replacing nagent in the Meta-Tooling. The Meta-Tooling is whatever the external agent (Gemini CLI, OpenCode) is. nagent is a reference example, not a competitor. It's worth reading for ideas, not adopting wholesale.
Building a new "data-oriented" track for Manual Slop. The data_oriented_error_handling_20260606 track already covers the data-vs-control-flow axis. This track is the philosophical foundation for that work; the implementation track is separate.
Comparing nagent to other LLM agent frameworks (LangChain, AutoGen, CrewAI, etc.). nagent is a specific small reference; those are different scales. This track is about nagent specifically.

Known Trade-offs (called out in the report)

Manual Slop's personas are a feature, not a bug, in the Application domain. A user-facing chatty assistant benefits from "persona = named configuration that the user can save and recall." nagent's "data, not personality" stance is correct for sub-agent invocations but wrong for long-lived assistant sessions. (Per user: personas are config bundling; the user can opt out by using AI settings directly.)
Manual Slop's RAG is a feature, not a bug, in the Application domain. RAG enables semantic search across large codebases. nagent's "git history → summaries" is exact but doesn't help when the user asks "how does the execution clutch work" and the relevant information is in guide_architecture.md (a doc, not source). RAG is opt-in.
Manual Slop's GUI is a feature, not a bug, for its domain. It enables the rich persona, curation, RAG, and snapshot UX. nagent explicitly has no GUI; the Application explicitly has a GUI. They serve different needs.
The "1,500-line reference" vs "13,000-line production" comparison is not fair. nagent is a teaching example. Manual Slop is a working tool. The right comparison is "nagent's principles vs Manual Slop's implementation," not "which codebase is better."

9. Verification Criteria

This is a reference/analysis track. The verification is:

report.md exists and covers all 14 nagent principles with a Manual Slop assessment for each
comparison_table.md exists as a flat side-by-side reference
decisions.md exists with future-track candidates (each is a separate conductor track to be specced independently)
Every "Manual Slop could learn from nagent here" recommendation is tagged with the domain (Application / Meta-Tooling / Both)
No code is being modified by this track
The companion doc is read by ≥1 person who is planning a future track (the report.md file is referenced by the relevant future-track specs)
(Post-correction) The report's verdicts on nagent §3 (Conversations are editable state) and §6 (Per-File Memory) are corrected per user feedback — the first draft overstated gaps

10. Status

Approved 2026-06-08 (initial); revised 2026-06-08 with user corrections. Ready for human review of report.md.

After human review of report.md, the decisions.md candidates will be evaluated:

High-priority items (e.g., stateless LLMClient class, non-MMA sub-conversations, RAG pre-staging) → new conductor tracks
Medium-priority items (e.g., self-describing MCP tools, conversation file persistence) → research spikes
Low-priority items → deferred until a specific Application need surfaces

The current data_oriented_error_handling_20260606 track and the future mcp_architecture_refactor_20260606 track are already philosophically aligned with nagent's principles; this track is the explicit reference to that alignment.
