manual_slop/conductor/tracks/nagent_review_20260608/spec.md
conductor-tier2 9cc51ca9af conductor(track): nagent review - deep-dive + 6 pitfalls + 10 actionable takeaways
Reference/analysis track. Produces 0 code changes.

Artifacts (conductor/tracks/nagent_review_20260608/):
- spec.md (240 lines) - track wrapper with Application/Meta-Tooling framing
- report.md (571 lines) - 14-section deep-dive; primary deliverable
- comparison_table.md (79 lines) - flat side-by-side reference
- decisions.md (286 lines) - 10 future-track candidates with priority matrix
- nagent_takeaways_20260608.md (363 lines) - 10 actionable patterns grounded
  in code (file:line refs into nagent source and Manual Slop source)
- metadata.json (132 lines) - structured metadata + verification criteria
- state.toml (113 lines) - per-task tracking + user-corrections log (7 entries)

14 nagent principles covered in report.md (durable work, text-in/text-out,
editable state, visible protocol, the loop, per-file memory, repo history,
neighborhoods, sub-conversations, controlled writes, large files, tool
discovery, framework differences, build your own).

6 pitfalls (revised from 8 after user-corrections):
1. No structured output protocol in Application AI (opaque function calling)
2. Provider-specific history in process globals (ai_client._anthropic_history
   + _deepseek_history + _minimax_history)
3. RAG is not 'history as data' (fuzzy, not auditable)
4. AI client is a stateful singleton (2,685-line ai_client.py)
5. No non-MMA disposable sub-conversations (1:1 gap; user-flagged want)
6. Hard-coded tool discovery (45-tool if/elif in mcp_client.py)

User-corrections applied (3 rounds, 7 total corrections recorded):
- Editable discussions: PARTIAL -> PARITY (DIFFERENT FOCUS) with full A1-A7
  per-entry + B1-B11 discussion-level + C1-C5 undo/redo operation matrix
- Per-file memory: DOMAIN MISMATCH -> MANUAL SLOP IS STRONGER IN
  CURATION DIMENSION (FileItem + ContextPreset vs nagent's inode-keyed
  conversation log; complementary, not equivalent)
- Sub-conversations: MMA has it; 1:1 does not -> 'PARITY for MMA; GAP for
  1:1 discussions' (user wants this)
- RAG: opt-in, not gap; user wants pre-staging via sub-conversation
- Personas: config bundling (can opt out via AI settings)
- Tool discovery: deferred (user has 'intent based DSL' idea but 'no where
  near that ideation yet')

10 actionable takeaways (separate from the 6 pitfalls - those are
diagnosis, these are prescription):
1. State visibility (UI inspector for in-process state)
2. Readable conversation log (text-greppable, not just JSON-L)
3. Sub-agents for 1:1 (HIGH priority - user-flagged)
4. File-identity over file-path (st_dev:st_ino rename-safe)
5. One loop shape visible in diagnostics
6. Visible retry on protocol failure
7. Meta-Tooling DSL (intent-based, deferred)
8. Self-describing tools (subsumed by mcp_architecture_refactor_20260606)
9. Single source of truth for disc_entries + provider history
10. Sub-agent return type constraint (bake into candidate #1 spec)

Domain classification: every recommendation tagged Application / Meta-Tooling
/ Both per docs/guide_meta_boundary.md. nagent lives in the Meta-Tooling
domain; Manual Slop's Application AI is a different kind of thing.

No code modified by this track (reference/analysis only). All 7 files
parse cleanly (JSON, TOML, Markdown). All internal cross-links resolve.
Track is 'active' awaiting human review; future-track candidates live in
decisions.md and nagent_takeaways_20260608.md.
2026-06-08 18:44:35 -04:00


Track: Mike Acton's nagent — Deep Dive on LLM Agent Architecture

Status: Active (spec approved 2026-06-08; revised 2026-06-08 with user-corrections)
Initialized: 2026-06-08
Owner: Tier 2 Tech Lead
Priority: Medium (architectural; informs future Application+Meta-Tooling decisions but is not a code refactor)

Revision note (2026-06-08): This spec was revised based on direct user corrections after the first draft. Earlier versions overstated gaps in Manual Slop's "editable discussion" and "per-file memory" features; the corrections are folded into §3 and §4 below. Read report.md for the actual analysis; this spec.md is the wrapper.


1. Overview

This track documents a deep-dive analysis of Mike Acton's macton/nagent reference implementation ("nagent" = "not-an-agent") and its implications for how Manual Slop should think about LLM-driven workflows.

nagent is a 14-section, ~1,500-line Python reference that operationalizes the philosophy "the agent is not the thing; the data is the thing." It provides a concrete, minimal counterpoint to the standard "agent framework" model. Its central claim: durable work matters more than durable processes; explicit artifacts beat opaque state.

The companion doc (report.md) is the deep-dive analysis itself — a 14-section comparison against Manual Slop's actual implementation, written for engineers (not marketing). This spec.md is the conductor/track wrapper: the design intent, the relationship to the Application vs Meta-Tooling split, the planned follow-up tracks, and the out-of-scope notes.

1.1 What this track produces

| Artifact | Purpose |
| --- | --- |
| spec.md | This file: the track design and scoping. |
| report.md | The 14-section deep-dive analysis. The primary deliverable. |
| comparison_table.md | A flat side-by-side table (one row per nagent principle) for quick reference. |
| decisions.md | Future-track candidates extracted from the analysis (each becomes a follow-up track if approved). |

1.2 Non-Goals

  • Not rewriting Manual Slop to use nagent. The architectures serve different domains (see §2).
  • Not replacing any existing track. This is a reference track — it informs future tracks but doesn't compete with them.
  • Not a comparison of "framework vs framework." nagent is a 1,500-line reference; Manual Slop is 13,000+ lines of production code with a real GUI, real persistence, real HITL. The comparison is philosophical, not "which is better."

2. The Application / Meta-Tooling Distinction (load-bearing context)

Per docs/guide_meta_boundary.md, Manual Slop lives in two distinct architectural domains. This distinction is critical for understanding the nagent comparison:

| Domain | Lives at | AI / HITL Model | Tooling |
| --- | --- | --- | --- |
| The Application (manual_slop) | gui_2.py, ai_client.py, multi_agent_conductor.py, dag_engine.py | A local GUI for orchestrating AI. The "Application AI" is a long-lived assistant that the user talks to over many turns. Strict HITL: every destructive action requires a GUI modal approval. | manual_slop.toml [agent.tools]: strict allowlist |
| The Meta-Tooling (us) | scripts/mma_exec.py, conductor/, .agents/skills/, the MCP tools in mcp_client.py when used by external agents | External agents (Gemini CLI, OpenCode, Claude Code) that build the Application. Each invocation is a fresh sub-agent. Token-firewalled. | Full mcp_client.py toolset, including mutation tools |

nagent lives in the Meta-Tooling domain. nagent is a reference for how external agents (the ones reading this conversation, the ones writing the code) should structure their own work.

Manual Slop's Application AI does not — and should not — look like nagent. The Application AI is a chatty, conversational, persona-driven, RAG-augmented, curation-rich assistant with a real GUI. It's a different kind of thing. Conflating the two is exactly the kind of "feature bleed" guide_meta_boundary.md warns against.

Every recommendation in report.md is qualified with which domain it applies to. The Application is the production code the user cares about; the Meta-Tooling is what we (the agents) use to build it.


3. Summary of the 14-Section Comparison

The full table is in comparison_table.md. Verdict summary:

| nagent Principle | Manual Slop Equivalent | Verdict |
| --- | --- | --- |
| 1. Durable work, disposable workers | AppState snapshots + history branching (Takes); MMA workers are real subprocesses | PARTIAL: different domains; MMA has it, App doesn't need it |
| 2. Text in, text out | ai_client.send() returns str; mcp_client.dispatch returns str | PARITY |
| 3. Conversations are editable state | Discussion takes + branching + edit-in-place + UISnapshot history; ContextPreset for per-file view-mode memory | PARITY (DIFFERENT FOCUS): Manual Slop has this; focuses on editable UI state (per Take) and editable per-file curation (per FileItem), not editable conversation logs |
| 4. Visible output protocol | Uses provider-native function calling; the protocol is opaque to humans | ARCHITECTURAL DIFFERENCE: Application-side; correct trade-off |
| 5. The loop (append, call, parse, act, repeat) | ai_client._send_* tool-call loop, MMA ConductorEngine.run, WorkflowSimulator.run_discussion_turn_async | PARITY, but the loop is in multiple files, not as a single small function |
| 6. Per-file memory (curation, not conversation log) | FileItem (path + view_mode + ast_mask + custom_slices); ContextPreset (saved set of FileItems); Fuzzy Anchor slices | MANUAL SLOP IS STRONGER IN THE CURATION DIMENSION; nagent's "file-edit conversation" pattern (one conversation log per file) is not present |
| 7. Repository history as data | _reread_file_items mtime-based diff injection; git_commit_file_patch per-file history summaries; no explicit "neighborhood" computation | PARITY (PARTIAL): diff injection is similar; the "neighborhood" computation is missing |
| 8. Historical coupling & artifact neighborhoods | n/a (no equivalent) | GAP: could be added as a new tool |
| 9. Disposable sub-conversations | MMA mma_exec.py Tier 3 workers are real subprocesses; non-MMA 1:1 discussions do NOT have disposable sub-conversations yet (per user) | GAP (Application): useful for 1:1 discussions; PARITY for MMA |
| 10. Controlled writes | MCP 3-layer security + Execution Clutch + Allowlist Construction + Path Validation + Resolution Gate | PARITY (STRONGER): Manual Slop's 3-layer is more thorough than nagent's tmpdir check |
| 11. Large files as explicit artifacts (split/patch) | nagent-file-split/nagent-file-patch/nagent-file-summarize with index.json + segment files + source hash validation; 32 KB target size; per-language natural splitters (no tree-sitter) | PARITY (DIFFERENT MECHANISM): both have the insight; nagent uses per-language scoring functions + subprocess isolation, Manual Slop uses tree-sitter + in-process summarize.py |
| 12. Tool discovery (self-describing executables) | Hard-coded dispatch if/elif chain in mcp_client.py | GAP (Application): could be added; useful for the Meta-Tooling domain |
| 13. Differences from frameworks | The philosophical frame | n/a |
| 14. Build your own | The reference's "minimal" claim is wrong for the Application | n/a for Application |

The full 14-row analysis with 6 (revised from 8) specific Manual Slop pitfalls is in report.md.
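
Principle 5 names the loop as "append, call, parse, act, repeat". The sketch below shows that shape as one small function, which is the property the verdict says Manual Slop's loop lacks. It is an illustration only: the `<run>` tag, `run_loop`, and the stub model and tool are hypothetical names, not nagent's actual tag set or Manual Slop's API.

```python
import re
from typing import Callable

# Hypothetical tag for illustration; not nagent's actual protocol.
RUN_TAG = re.compile(r"<run>(.*?)</run>", re.DOTALL)


def run_loop(
    prompt: str,
    call_model: Callable[[list[str]], str],
    act: Callable[[str], str],
    max_turns: int = 8,
) -> list[str]:
    """Append, call, parse, act, repeat. Returns the full transcript."""
    transcript = [f"user: {prompt}"]              # append
    for _ in range(max_turns):
        reply = call_model(transcript)            # call (text in, text out)
        transcript.append(f"assistant: {reply}")
        requests = RUN_TAG.findall(reply)         # parse the visible protocol
        if not requests:
            break                                 # no tool request: done
        for request in requests:
            result = act(request.strip())         # act
            transcript.append(f"tool: {result}")  # append, then repeat
    return transcript


# Stub model: asks for one tool call, then answers in plain text.
def fake_model(transcript: list[str]) -> str:
    if any(line.startswith("tool:") for line in transcript):
        return "The answer is 4."
    return "Let me check. <run>2 + 2</run>"


# Stub tool: adds two integers written as "a + b".
def fake_act(request: str) -> str:
    left, _, right = request.partition("+")
    return str(int(left) + int(right))


log = run_loop("What is 2 + 2?", fake_model, fake_act)
```

Because the transcript is a plain list of strings, every step the loop took can be read, diffed, or grepped after the fact, with no provider-specific state involved.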


4. The Revised 6 Pitfalls (corrected)

Earlier versions of this list contained two errors that user-corrections caught:

  • REMOVED pitfall #3 ("Conversation state is buried in module-level globals"), which was overstated. Manual Slop has some editable-state infrastructure (HistoryManager with UISnapshot, discussion Takes/branching, ContextPreset save/load), but the raw conversation transcript still lives in ai_client._provider_specific_history globals. In short, Manual Slop has editable UI state, not editable conversation transcripts. That distinction is now captured in §3 of the report.

  • REVISED pitfall #6 ("Per-file memory"). Manual Slop does have a per-file memory concept (FileItem + ContextPreset + custom_slices + ast_mask), but it is curation memory, not nagent's conversation-log memory. Manual Slop's concept is richer in the curation dimension but absent in the conversation-log dimension. That's a useful distinction.

The remaining 6 pitfalls, after corrections:

  1. No structured output protocol in the Application AI (uses opaque function calling; nagent's regex tag protocol is the alternative for the Meta-Tooling). Domain: the Application can stay opaque; the Meta-Tooling should learn from nagent here.
  2. Provider-specific history is in process globals (5 separate per-provider lists with their own locks; switching providers mid-session loses history). Domain: Application. Future-track candidate.
  3. RAG is not "history as data" — RAG retrieval is fuzzy and not auditable. nagent's git-history-driven context is exact and inspectable. RAG is useful but should be additive, not a replacement. Domain: Application. Coexists with nagent-style history.
  4. The AI client is a stateful singleton with module-level globals (the 2,685-line ai_client.py is hard to reason about without knowing its runtime state). A future refactor toward a stateless LLMClient class with explicit Conversation objects would let the App save/load/replay conversations as files. Domain: Application. Future-track candidate.
  5. No non-MMA disposable sub-conversations — only MMA workers are real subprocesses; the user explicitly noted that 1:1 discussions don't have sub-agents. nagent's <nagent-conversation> pattern (a sub-agent for bounded investigation) would be valuable for the Application. Domain: Application. Future-track candidate (user-flagged as a want).
  6. Hard-coded tool discovery — the 45 MCP tools are in a flat if/elif chain in dispatch. nagent's --description self-describing executables pattern is more extensible. Domain: both. Low priority.
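
Pitfalls 2 and 4 both point toward a stateless client with explicit Conversation objects. The sketch below shows one possible shape, assuming a provider is just a callable from messages to text. The `Conversation` and `LLMClient` classes here are hypothetical illustrations of the idea, not the planned design or anything in ai_client.py.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Callable

Message = dict[str, str]


@dataclass
class Conversation:
    """Provider-neutral transcript; the only place history lives."""
    messages: list[Message] = field(default_factory=list)

    def append(self, role: str, text: str) -> None:
        self.messages.append({"role": role, "text": text})

    def save(self, path: Path) -> None:
        path.write_text(json.dumps(asdict(self), indent=2), encoding="utf-8")

    @classmethod
    def load(cls, path: Path) -> "Conversation":
        return cls(**json.loads(path.read_text(encoding="utf-8")))


class LLMClient:
    """Wraps a provider adapter but keeps no history of its own."""

    def __init__(self, provider: Callable[[list[Message]], str]) -> None:
        self._provider = provider

    def send(self, conversation: Conversation, text: str) -> str:
        conversation.append("user", text)
        reply = self._provider(conversation.messages)
        conversation.append("assistant", reply)
        return reply


# Two stub providers stand in for real vendor adapters.
def provider_a(messages: list[Message]) -> str:
    return f"a saw {len(messages)} message(s)"


def provider_b(messages: list[Message]) -> str:
    return f"b saw {len(messages)} message(s)"


convo = Conversation()
first = LLMClient(provider_a).send(convo, "hello")
second = LLMClient(provider_b).send(convo, "again")  # switch providers mid-session

# Round-trip the transcript through a file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "conversation.json"
    convo.save(path)
    restored = Conversation.load(path)
```

Since the history travels with the Conversation rather than with the client, the second provider sees the first exchange, and the same transcript can be saved, reloaded, and replayed.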

Plus 2 domain-specific recommendations that are not pitfalls per se:

  • Personas are config bundling (per user: "just bundles preparatory cruft — vendor/model, tools/permissions, and system prompts"). The user noted that you can completely opt out by just using AI settings directly. Domain: Application. Keep as-is; not a pitfall.
  • RAG is opt-in (per user: "doesn't have to be used"). Worth considering: a sub-agent that prepares RAG chunks before a run. Domain: Application. Future-track candidate.

5. What This Track Read (in full, before writing)

To avoid hand-waved claims, the report and this spec were written after reading all of:

nagent source (read in full)

  • README.md (~1,500 lines) — the 14-section "teaching document"
  • bin/nagent (~700 lines) — the main loop, tag parser, sub-conversation runner, git history + co-edit + summary integration
  • bin/helpers/nagent_llm.py (~300 lines) — provider dispatch, token accounting
  • bin/helpers/nagent_cli.py (~80 lines) — --description self-describing executable pattern, WaitSpinner
  • bin/helpers/nagent_file_edit_lib.py (~170 lines) — file-index by st_dev:st_ino, resolve_file_edit_conversation, is_split_segment_for_source
  • bin/helpers/nagent_file_split_lib.py (~400 lines) — SPLIT_TYPES (11 langs), per-language SCORE_BY_TYPE (no tree-sitter; regex + line counts + brace/JSON/XML depth), 32 KB default, source SHA-256 hashing
  • bin/helpers/nagent_file_patch_lib.py (~130 lines) — strict hash validation, make_unified_patch via difflib.unified_diff, apply_segment_patches writes the source
  • bin/helpers/nagent_file_summarize_lib.py (~110 lines) — per-segment LLM calls + retry-with-smaller-prompt (max 2 attempts), --limit-word-count validation, combined_summary_from_index
  • bin/nagent-file-edit (~120 lines) — per-file subprocess wrapper, default_pid = BASHPID or os.getppid()
  • bin/nagent-file-split (~170 lines) — main executable, --refresh INDEX mode for re-splitting without losing segment paths
  • bin/nagent-file-summarize (~100 lines) — main executable, cascades to nagent-file-split --summarize for files > 64 KB; uses positive_int CLI type (rejects 0)
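
The st_dev:st_ino file index noted for nagent_file_edit_lib.py is what makes per-file state rename-safe. The sketch below illustrates the idea with the standard library only; `file_identity` and the `index` dict are hypothetical names, and nagent's real helper may differ in detail.

```python
import os
import tempfile
from pathlib import Path


def file_identity(path: Path) -> str:
    """Return a 'st_dev:st_ino' key that survives renames on one filesystem."""
    info = os.stat(path)
    return f"{info.st_dev}:{info.st_ino}"


with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old_name.py"
    old.write_text("print('hi')\n", encoding="utf-8")
    key_before = file_identity(old)

    # Rename within the same directory, so the inode is unchanged.
    new = old.rename(Path(tmp) / "new_name.py")
    key_after = file_identity(new)

    # State keyed by identity is still found under the new path.
    index = {key_before: "notes attached to this file"}
    notes = index.get(key_after)
```

One caveat: the key follows the inode, not the content. A copy gets a new identity, and so does a file saved by an editor that writes a temporary file and replaces the original.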

Manual Slop docs (read in full)

  • docs/Readme.md (434 lines) — docs index
  • docs/guide_architecture.md (989 lines) — threading model, cross-thread data structures
  • docs/guide_ai_client.md (424 lines) — multi-provider LLM client
  • docs/guide_mma.md (564 lines) — 4-tier MMA orchestration
  • docs/guide_tools.md (506 lines) — MCP tool inventory + Hook API
  • docs/guide_mcp_client.md (410 lines) — 45 tools + 3-layer security
  • docs/guide_app_controller.md (447 lines) — headless controller
  • docs/guide_meta_boundary.md (57 lines) — Application vs Meta-Tooling split
  • docs/guide_context_curation.md (303 lines) — Granular AST Control + Fuzzy Anchor Slices + AST Inspector
  • docs/guide_personas.md (307 lines) — Unified agent profile model
  • docs/guide_rag.md (411 lines) — RAG subsystem
  • docs/guide_gui_2.md (477 lines) — ImGui application (App/Controller state delegation, hot-reload, defer-not-catch)

Manual Slop source (selectively read, in service of the user-corrections)

  • src/models.py lines 510-559 (FileItem schema), 909-937 (ContextPreset schema)
  • src/context_presets.py (30 lines, full file) — the ContextPresetManager
  • src/project_manager.py lines 429-450 (branch_discussion, promote_take)
  • src/aggregate.py first 80 lines (context composition pipeline)
  • src/history.py (full file, 141 lines) — UISnapshot and the snapshot model

The user-corrections specifically drove a re-survey of FileItem + ContextPreset + aggregate.py + HistoryManager after the first draft overstated Manual Slop's gaps.


6. Architectural Reference


7. See Also

Internal Documentation

  • docs/Readme.md — Manual Slop documentation index
  • docs/guide_architecture.md — Threading model and provider dispatch
  • docs/guide_ai_client.md — The Application's LLM client
  • docs/guide_mma.md — 4-tier MMA orchestration
  • docs/guide_meta_boundary.md — The Application vs Meta-Tooling split
  • docs/guide_tools.md — MCP tool inventory and Hook API
  • docs/guide_mcp_client.md — 45 tools + 3-layer security
  • docs/guide_context_curation.md — Granular AST Control + Fuzzy Anchor Slices + AST Inspector
  • docs/guide_personas.md — Unified agent profile model
  • docs/guide_rag.md — RAG subsystem
  • docs/guide_gui_2.md — ImGui application
  • data_oriented_error_handling_20260606 — Already cites Acton by name. The Result[T] + ErrorInfo data model from this track is consistent with nagent's "data, not control flow" stance.
  • qwen_llama_grok_integration_20260606 — The "OpenAI-compatible shared helper" pattern is exactly nagent's "thin boundary adapter on a normalized data structure" approach.
  • mcp_architecture_refactor_20260606 — Already blocked by data_oriented_error_handling_20260606. The sub-MCP extraction (planned) will benefit from nagent's "small helper per concept" decomposition pattern.
  • data_structure_strengthening_20260606 — The type-alias work is consistent with nagent's "make the data shape explicit" stance. The audit script + NamedTuple work parallels nagent's split-index / patch-artifact approach.

External

  • Mike Acton, "Data-Oriented Design and C++" (CppCon 2014): the original DOD talk that nagent operationalizes
  • Ryan Fleury, "The Easiest Way To Handle Errors Is To Not Have Them" — Companion framework; same "errors as data" thesis
  • Timothy Lottes (@NOTimothyLottes) — Cited in the data_oriented_error_handling review; same "error codes are data" stance
  • Valigo (@valigotech) — Cited in the data_oriented_error_handling review; "exceptions mess with control flow in very weird ways"

8. Scope Boundaries

In Scope

  • The 14-section nagent philosophy
  • The 6 (revised) concrete pitfalls in Manual Slop
  • Mapping each pitfall to a future-track candidate (in decisions.md)
  • Application vs Meta-Tooling domain classification for every recommendation
  • The philosophical grounding for existing Manual Slop conventions (data-oriented, thread-disciplined, GUI-decoupled)

Out of Scope

  • Implementation work. This is a reference/analysis track. No code is being changed.
  • Replacing nagent in the Meta-Tooling. The Meta-Tooling is whatever the external agent (Gemini CLI, OpenCode) is. nagent is a reference example, not a competitor. It's worth reading for ideas, not adopting wholesale.
  • Building a new "data-oriented" track for Manual Slop. The data_oriented_error_handling_20260606 track already covers the data-vs-control-flow axis. This track is the philosophical foundation for that work; the implementation track is separate.
  • Comparing nagent to other LLM agent frameworks (LangChain, AutoGen, CrewAI, etc.). nagent is a specific small reference; those are different scales. This track is about nagent specifically.

Known Trade-offs (called out in the report)

  • Manual Slop's personas are a feature, not a bug, in the Application domain. A user-facing chatty assistant benefits from "persona = named configuration that the user can save and recall." nagent's "data, not personality" stance is correct for sub-agent invocations but wrong for long-lived assistant sessions. (Per user: personas are config bundling; the user can opt out by using AI settings directly.)
  • Manual Slop's RAG is a feature, not a bug, in the Application domain. RAG enables semantic search across large codebases. nagent's "git history → summaries" is exact but doesn't help when the user asks "how does the execution clutch work" and the relevant information is in guide_architecture.md (a doc, not source). RAG is opt-in.
  • Manual Slop's GUI is a feature, not a bug, for its domain. It enables the rich persona, curation, RAG, and snapshot UX. nagent explicitly has no GUI; the Application explicitly has a GUI. They serve different needs.
  • The "1,500-line reference" vs "13,000-line production" comparison is not fair. nagent is a teaching example. Manual Slop is a working tool. The right comparison is "nagent's principles vs Manual Slop's implementation," not "which codebase is better."

9. Verification Criteria

This is a reference/analysis track. The verification is:

  • report.md exists and covers all 14 nagent principles with a Manual Slop assessment for each
  • comparison_table.md exists as a flat side-by-side reference
  • decisions.md exists with future-track candidates (each is a separate conductor track to be specced independently)
  • Every "Manual Slop could learn from nagent here" recommendation is tagged with the domain (Application / Meta-Tooling / Both)
  • No code is being modified by this track
  • The companion doc is read by ≥1 person who is planning a future track (the report.md file is referenced by the relevant future-track specs)
  • (Post-correction) The report's verdicts on nagent §3 (Conversations are editable state) and §6 (Per-File Memory) are corrected per user feedback — the first draft overstated gaps
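
The existence and parse checks above are mechanical and could be scripted. The sketch below is one way to do that, assuming the track directory also holds the metadata.json and state.toml files named in the commit listing. `check_track` is a hypothetical helper, not an existing conductor script, and the file contents in the demo are placeholders.

```python
import json
import tempfile
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    tomllib = None  # the TOML check is skipped on older interpreters

REQUIRED = ["spec.md", "report.md", "comparison_table.md", "decisions.md"]
PARSERS = {"metadata.json": json.loads}
if tomllib is not None:
    PARSERS["state.toml"] = tomllib.loads


def check_track(track_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for name in REQUIRED:
        if not (track_dir / name).is_file():
            problems.append(f"missing: {name}")
    for name, parse in PARSERS.items():
        path = track_dir / name
        if not path.is_file():
            problems.append(f"missing: {name}")
            continue
        try:
            parse(path.read_text(encoding="utf-8"))
        except ValueError as exc:  # JSON and TOML decode errors subclass ValueError
            problems.append(f"unparseable: {name}: {exc}")
    return problems


# Demo against a throwaway directory with placeholder contents.
with tempfile.TemporaryDirectory() as tmp:
    track = Path(tmp)
    for name in REQUIRED:
        (track / name).write_text("# placeholder\n", encoding="utf-8")
    (track / "metadata.json").write_text('{"status": "active"}', encoding="utf-8")
    (track / "state.toml").write_text('status = "active"\n', encoding="utf-8")
    clean = check_track(track)

    (track / "report.md").unlink()
    (track / "metadata.json").write_text("{not json", encoding="utf-8")
    broken = check_track(track)
```

The remaining criteria (domain tagging, readership, corrected verdicts) need human judgment and are not covered by a script like this.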

10. Status

Approved 2026-06-08 (initial); revised 2026-06-08 with user corrections. Ready for human review of report.md.

After human review of report.md, the decisions.md candidates will be evaluated:

  • High-priority items (e.g., stateless LLMClient class, non-MMA sub-conversations, RAG pre-staging) → new conductor tracks
  • Medium-priority items (e.g., self-describing MCP tools, conversation file persistence) → research spikes
  • Low-priority items → deferred until a specific Application need surfaces

The current data_oriented_error_handling_20260606 track and the future mcp_architecture_refactor_20260606 track are already philosophically aligned with nagent's principles; this track is the explicit reference to that alignment.