conductor-tier2 9cc51ca9af conductor(track): nagent review - deep-dive + 6 pitfalls + 10 actionable takeaways

Reference/analysis track. Produces 0 code changes.

Artifacts (conductor/tracks/nagent_review_20260608/):
- spec.md (240 lines) - track wrapper with Application/Meta-Tooling framing
- report.md (571 lines) - 14-section deep-dive; primary deliverable
- comparison_table.md (79 lines) - flat side-by-side reference
- decisions.md (286 lines) - 10 future-track candidates with priority matrix
- nagent_takeaways_20260608.md (363 lines) - 10 actionable patterns grounded
  in code (file:line refs into nagent source and Manual Slop source)
- metadata.json (132 lines) - structured metadata + verification criteria
- state.toml (113 lines) - per-task tracking + user-corrections log (7 entries)

14 nagent principles covered in report.md (durable work, text-in/text-out,
editable state, visible protocol, the loop, per-file memory, repo history,
neighborhoods, sub-conversations, controlled writes, large files, tool
discovery, framework differences, build your own).

6 pitfalls (revised from 8 after user-corrections):
1. No structured output protocol in Application AI (opaque function calling)
2. Provider-specific history in process globals (ai_client._anthropic_history
   + _deepseek_history + _minimax_history)
3. RAG is not 'history as data' (fuzzy, not auditable)
4. AI client is a stateful singleton (2,685-line ai_client.py)
5. No non-MMA disposable sub-conversations (1:1 gap; user-flagged want)
6. Hard-coded tool discovery (45-tool if/elif in mcp_client.py)

User-corrections applied (3 rounds, 7 total corrections recorded):
- Editable discussions: PARTIAL -> PARITY (DIFFERENT FOCUS) with full A1-A7
  per-entry + B1-B11 discussion-level + C1-C5 undo/redo operation matrix
- Per-file memory: DOMAIN MISMATCH -> MANUAL SLOP IS STRONGER IN
  CURATION DIMENSION (FileItem + ContextPreset vs nagent's inode-keyed
  conversation log; complementary, not equivalent)
- Sub-conversations: MMA has it; 1:1 does not -> 'PARITY for MMA; GAP for
  1:1 discussions' (user wants this)
- RAG: opt-in, not gap; user wants pre-staging via sub-conversation
- Personas: config bundling (can opt out via AI settings)
- Tool discovery: deferred (user has 'intent based DSL' idea but 'no where
  near that ideation yet')

10 actionable takeaways (separate from the 6 pitfalls - those are
diagnosis, these are prescription):
1. State visibility (UI inspector for in-process state)
2. Readable conversation log (text-greppable, not just JSON-L)
3. Sub-agents for 1:1 (HIGH priority - user-flagged)
4. File-identity over file-path (st_dev:st_ino rename-safe)
5. One loop shape visible in diagnostics
6. Visible retry on protocol failure
7. Meta-Tooling DSL (intent-based, deferred)
8. Self-describing tools (subsumed by mcp_architecture_refactor_20260606)
9. Single source of truth for disc_entries + provider history
10. Sub-agent return type constraint (bake into candidate #1 spec)

Domain classification: every recommendation tagged Application / Meta-Tooling
/ Both per docs/guide_meta_boundary.md. nagent lives in the Meta-Tooling
domain; Manual Slop's Application AI is a different kind of thing.

No code modified by this track (reference/analysis only). All 7 files
parse cleanly (JSON, TOML, Markdown). All internal cross-links resolve.
Track is 'active' awaiting human review; future-track candidates live in
decisions.md and nagent_takeaways_20260608.md.

2026-06-08 18:44:35 -04:00

10 KiB


nagent vs Manual Slop: Comparison Table

Companion to: report.md
Date: 2026-06-08 (revised same day)
Source: nagent v1.0.0 (read 2026-06-08)

Flat side-by-side reference. One row per nagent principle. The reasoning behind each verdict and the full pitfall write-ups are in report.md.

Legend

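- Verdict values: PARITY (same shape), PARITY+ (Manual Slop is stronger), PARITY- (nagent is stronger), PARTIAL (one half, not the other), GAP (Manual Slop lacks the feature), DOMAIN MISMATCH (different scope). A parenthetical after PARITY, such as (DIFFERENT FOCUS), qualifies the verdict. ARCHITECTURAL DIFFERENCE (row 4) and N/A (row 13) are one-off verdicts explained in their own rows.
- Domain tags: APP = Application domain, MT = Meta-Tooling domain, BOTH.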

| # | nagent Principle (verbatim summary) | nagent Mechanism | Manual Slop Equivalent | Verdict | Domain | Action |
|---|---|---|---|---|---|---|
| 1 | Durable work, disposable workers. The agent is not the thing; the data is the thing. | `bin/nagent` 700-line single-file loop, conversation is a text file | MMA workers are real subprocesses with Context Amnesia; Application AI is long-lived by design | PARTIAL | BOTH | Future-track: stateless `LLMClient` class (§15.4) |
| 2 | Text in, text out. File in, text out is the smallest useful primitive. | `bin/nagent-llm-text` + `bin/helpers/nagent_llm.py` (4 providers) | `src/ai_client.py:send(...) -> str` (5 providers) | PARITY | BOTH | None |
| 3 | Conversations are editable state. The conversation file is not chat history; it is working state. | `bin/nagent` exposes `--save/load/edit/summarize`; text files are user-editable (vim/cat/diff/cp the raw transcript) | Discussion Takes + branching + per-entry edit (A1-A7 in report §3) + discussion-level CRUD (B1-B11) + role management (B5) + UI snapshot undo/redo (C1-C5) | PARITY (DIFFERENT FOCUS): Manual Slop edits abstracted typed entries (`disc_entries` is a `list[dict]` with role + content + ts + thinking_segments + usage). Both have comprehensive editing; Manual Slop's is more granular at the entry layer, nagent's is deeper at the raw-transcript layer. | APP | Future-track: optional raw-transcript persistence per Take (Candidate 10) |
| 4 | Visible output protocol. Teach the model an output format; use a visible, parseable protocol. | `TAG_PATTERNS` regex list; `parse_response` strict; `MAX_FORMAT_RETRIES = 3` | Provider-native function calling (Gemini, Anthropic, etc.) | ARCHITECTURAL DIFFERENCE: Application's choice is correct (parallel tool calls, JSON mode) | BOTH | Future-track: intent-based DSL for Meta-Tooling calls |
| 5 | The loop. Append, call, parse, act, append, repeat. | `bin/nagent:run_agent_loop()` 50 lines, single `while True` | Three parallel loops: `ai_client._send_*` (LLM), `ConductorEngine.run` (MMA), `WorkflowSimulator.run_discussion_turn_async` (App) | PARITY | BOTH | (Low priority) Future-track: extract a single `src/llm_loop.py:run_loop` |
| 6 | Per-file memory. Each file gets its own persistent local memory. | `file_id_for_path` (st_dev:st_ino); `conversations/file-index-{pid}.json`; `nagent-file-edit` per-file subprocess | `FileItem` (path + view_mode + ast_mask + custom_slices); `ContextPreset` (saved set of FileItems); Structural File Editor | PARITY (DIFFERENT KIND): Manual Slop's is curation memory (rich); nagent's is conversation log memory (plain text). Both real, both per-file, different optimization. | APP | Future-track: thin "last-investigation" log per file (Meta-Tooling-friendly) |
| 7 | Repository history as data. Turn git history into editing context. | `git_file_history` + `summarize_new_file_commits` + `coedited_file_rows` + `format_file_history` | `_reread_file_items` (mtime-based, diff injection); git-linked discussion tracking in GUI; no historical-context injection | PARTIAL: diff injection is similar; historical-context injection is missing | APP | Future-track: `src/git_history.py` mirroring nagent's `file_edit_history_and_summary_block` |
| 8 | Historical coupling & artifact neighborhoods. Files that change together are hints. | `coedited_file_rows` labels high/medium/low co-edit rate; guidance text "Use these files as hints. Do not edit unless the user request or evidence requires it." | None (closest: `py_get_hierarchy` is structural not historical) | GAP | APP | Future-track: `py_coedited_files` + `ts_c_coedited_files` MCP tools |
| 9 | Disposable sub-conversations. Exploration creates noise; spawn disposable workers. | `<nagent-conversation>` tag spawns `nagent --invocation delegated` as subprocess; isolated conversation file; recursive token rollup | MMA Tier 3/4 workers (real subprocesses); 1:1 main discussion has no sub-conversation mechanism | PARITY for MMA; GAP for 1:1 discussions | APP (and MT) | USER-FLAGGED WANT: Future-track `src/sub_conversation.py:SubConversationRunner` for 1:1 investigations |
| 10 | Controlled writes. A loop that writes files needs explicit boundaries. Not a sandbox; just conventions. | `validate_write_path`: main mode → tmpdir only; file-edit mode → target or segments; rejected writes append `<nagent-write-result status="error">` | `mcp_client._is_allowed` (3-layer: allowlist + path validation + resolution gate); `run_powershell` requires GUI modal approval; PowerShell-only by default; 60s timeout + `taskkill` cleanup; optional Tier 4 QA | PARITY+ (Manual Slop stronger): 3-layer security + HITL + sandbox is dramatically stricter than nagent's tmpdir check | APP (and MT) | None; current design is right |
| 11 | Large files as explicit artifacts. Split, edit segments, patch. | `nagent-file-split` (11 langs, regex + line counts + brace/JSON/XML depth); `nagent-file-patch` (strict hash validation); `nagent-file-summarize` (per-segment + retry); 32 KB default; index.json with `source_path`, `sourcesha256`, `segments[]` | `aggregate.py:build_file_items` + `py_get_skeleton` (tree-sitter) + `ts_c_*_get_skeleton` (tree-sitter); `set_file_slice` / `edit_file` (mtime validation, not hash); `run_subagent_summarization` (in-process, no retry); `RAGEngine._chunk_code` (mtime-based, ChromaDB) | PARITY (DIFFERENT MECHANISM): both have the insight; nagent uses per-language scoring functions + subprocess isolation + hash validation; Manual Slop uses tree-sitter + in-process + mtime validation | BOTH | Future-track: explicit `src/split_lib.py` + `src/patch_lib.py` mirroring nagent's design, with hash validation |
| 12 | Tool discovery. Tool capability should be explicit data. | `collect_bin_tool_descriptions` runs each `bin/* --description`; auto-builds "Available tools:" block for initial context | None (45 tools in `mcp_client.py:dispatch` if/elif chain) | GAP: nagent's pattern is genuinely better; current dispatch is fine but not extensible | BOTH (especially MT) | Future-track: subsumed by `mcp_architecture_refactor_20260606` (sub-MCPs as self-describing modules) |
| 13 | Differences from frameworks. The reframing table: memory→editable artifact, agent→temporary transformation function, context→explicit input data. | The philosophical frame | The applicable reframings: editable UI state, curated per-file memory, git history as data | N/A | BOTH | (Lens, not action) |
| 14 | Build your own. 12-step buildable list. | The reference | Manual Slop has all 12, in different files, at different scale | PARITY | BOTH | (Checklist) |
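Row 6 keys per-file memory on `st_dev:st_ino` rather than on the path. The sketch below illustrates why that key survives a rename. The function name is taken from the table, but its body here is an assumption (nagent's actual implementation is not shown in this document), and the `memory` dict is a hypothetical stand-in for a real index file.

```python
import os
import tempfile


def file_id_for_path(path: str) -> str:
    """Return a 'st_dev:st_ino' key (assumed shape; not nagent's actual code)."""
    st = os.stat(path)
    return f"{st.st_dev}:{st.st_ino}"


# Hypothetical in-memory index: file identity -> notes about that file.
memory: dict[str, list[str]] = {}

with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "old_name.py")
    new_path = os.path.join(tmp, "new_name.py")
    with open(old_path, "w", encoding="utf-8") as fh:
        fh.write("print('hello')\n")

    # Record a note under the file's identity, then rename the file.
    memory.setdefault(file_id_for_path(old_path), []).append("investigated imports")
    os.rename(old_path, new_path)

    # The path changed, but the identity key still finds the earlier note.
    notes_after_rename = memory.get(file_id_for_path(new_path), [])
```

One caveat on the design: the key only holds for renames within one filesystem. Tools that save by writing a new file and swapping it into place produce a new inode, so a path-based fallback may still be needed.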

The 6 Pitfalls (revised, after user-corrections)

See report.md §15 for full details. Quick reference:

| # | Pitfall | Domain | Future-track | User flag? |
|---|---|---|---|---|
| 1 | No structured output protocol in Application AI (opaque function calling) | BOTH | Intent-based DSL for Meta-Tooling | Implicit ("intent based DSL to help with discovery") |
| 2 | Provider-specific history in process globals (`_anthropic_history`, `_deepseek_history`, etc.) | APP | Stateless `LLMClient` class | No |
| 3 | RAG is not "history as data" (fuzzy, not auditable) | APP | RAG pre-staging sub-conversation | Yes ("Would be cool to have a sub agent maybe prepare a rag chunks before I use them in a run") |
| 4 | AI client is a stateful singleton with module-level globals (2,685-line file) | APP | Stateless `LLMClient` class (same as #2) | No |
| 5 | No non-MMA disposable sub-conversations | APP (and MT) | `src/sub_conversation.py:SubConversationRunner` | Yes ("I probably want to add that for just 1:1 discussions where I use a sub-agent manually for specific points") |
| 6 | Hard-coded tool discovery (45-tool if/elif chain) | BOTH | Subsumed by `mcp_architecture_refactor_20260606` | Implicit ("intent based DSL to help with discovery") |
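Pitfall #6 contrasts an if/elif dispatch chain with tool capability held as explicit data. A minimal sketch of the data-driven shape follows. Everything in it (the `TOOLS` registry, the `tool` decorator, the two example tools) is hypothetical and is not taken from `mcp_client.py` or from nagent. It only shows how one registry can serve both dispatch and the "Available tools:" block.

```python
from typing import Callable

# Hypothetical registry: tool name -> (description, handler).
TOOLS: dict[str, tuple[str, Callable[..., str]]] = {}


def tool(name: str, description: str):
    """Register a handler together with the description the model will see."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = (description, fn)
        return fn
    return decorator


@tool("read_file", "Return the text of a file at the given path.")
def read_file(path: str) -> str:
    with open(path, encoding="utf-8") as fh:
        return fh.read()


@tool("word_count", "Count whitespace-separated words in a string.")
def word_count(text: str) -> str:
    return str(len(text.split()))


def dispatch(name: str, **kwargs) -> str:
    """Look the tool up as data instead of walking an if/elif chain."""
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    _, handler = TOOLS[name]
    return handler(**kwargs)


def available_tools_block() -> str:
    """Build the 'Available tools:' text from the same registry."""
    lines = ["Available tools:"]
    lines += [f"- {name}: {desc}" for name, (desc, _) in sorted(TOOLS.items())]
    return "\n".join(lines)
```

Under this shape, adding a tool means adding one decorated function: the dispatcher and the prompt text pick it up without further edits.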

Pitfalls removed by user-corrections

- (removed) "Conversation state is buried in module-level globals": overstated. Manual Slop has editable UI state (Takes, UISnapshot, ContextPreset); the lack of editable raw transcripts is a different design choice, not a gap. See report.md §3.
- (removed) "No per-file memory": overstated. Manual Slop does have per-file memory in the curation dimension (FileItem + ContextPreset + Fuzzy Anchors); what's missing is nagent's conversation-log dimension, which is a different optimization. See report.md §6.

Future-track candidates — priority list

Ordered by user signal + implementation cost:

1. `src/sub_conversation.py:SubConversationRunner`: user-flagged as a want. Extract MMA's `mma_exec.py` pattern into a reusable App-callable class. Useful for 1:1 investigations. High priority. (Pitfall #5)
2. RAG pre-staging via sub-conversation: user-flagged as a want. A sub-agent pre-builds the RAG index for a planned run; the chunks become the discussion's starting memory. High priority. (Pitfall #3)
3. Stateless `LLMClient` class: would resolve Pitfalls #2 and #4 together. Backwards-compatible with `ai_client.send()`. ~2-3 phases of careful refactor. Medium priority.
4. Intent-based DSL for Meta-Tooling tool calls: user-noted as a want ("no where near that ideation yet"). Low priority, research spike.
5. Self-describing MCP tools (nagent §12 pattern): subsumed by `mcp_architecture_refactor_20260606`. Low priority on its own.
6. `src/git_history.py` for nagent §7 pattern: historical context injection. Medium priority, but only after #1-#2 are done.
7. Per-file conversation log (nagent §6 conversation dimension): Meta-Tooling-friendly addition. Low priority.
8. `py_coedited_files` / `ts_c_coedited_files` MCP tools (nagent §8): small, contained. Low priority.
9. Explicit `src/split_lib.py` + `src/patch_lib.py` (nagent §11): only needed if very-large-file scenarios emerge. Defer until needed.
10. Optional raw-transcript persistence per Take (nagent §3 conversation dimension): niche. Low priority.
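The co-edited-files candidate (nagent §8) rests on a simple calculation: of the commits that touched a target file, what fraction also touched each other file. A rough sketch under stated assumptions: the function names, the 0.6 / 0.3 thresholds, and the sample history are all illustrative, and none of them come from nagent or Manual Slop. In practice the per-commit file sets could be gathered from `git log --name-only`.

```python
from collections import Counter


def coedit_rates(commits: list[set[str]], target: str) -> dict[str, float]:
    """Fraction of commits touching `target` that also touched each other file."""
    touching = [files for files in commits if target in files]
    if not touching:
        return {}
    counts = Counter(f for files in touching for f in files if f != target)
    return {f: n / len(touching) for f, n in counts.items()}


def label(rate: float) -> str:
    # Thresholds are illustrative assumptions, not nagent's values.
    if rate >= 0.6:
        return "high"
    if rate >= 0.3:
        return "medium"
    return "low"


# Made-up history: each set is the files changed by one commit.
history = [
    {"src/a.py", "src/b.py"},
    {"src/a.py", "src/b.py", "docs/x.md"},
    {"src/a.py", "src/b.py"},
    {"src/a.py", "src/c.py"},
    {"src/c.py"},
]

rates = coedit_rates(history, "src/a.py")
labels = {f: label(r) for f, r in rates.items()}
```

Here `src/b.py` appears in three of the four commits that touch `src/a.py`, so it lands in the "high" bucket, while `src/c.py` and `docs/x.md` each appear once and land in "low".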
