# Future-Track Candidates: nagent Review Follow-ups

**Companion to:** `report.md` (deep-dive), `comparison_table.md` (flat reference), `nagent_takeaways_20260608.md` (actionable patterns)

**Date:** 2026-06-08

**Source:** nagent v1.0.0 deep-dive review (see `report.md`)

This document is the bridge from "what nagent teaches us" to "what Manual Slop should do about it." Each candidate is a *future* conductor track (not this one). The candidates are *not* committed: they emerge from the analysis, but each is a separate scoping exercise.

**For an actionable, code-grounded read of these candidates** (with the "what to do today, not just the future track" framing), see `nagent_takeaways_20260608.md`, which maps each candidate to specific patterns, design constraints, and small UX wins that don't need a new track.

---

## Decision-making framework

For each candidate:

- **Why it matters**: what pitfall or capability gap does it address?
- **What it would do**: concrete description
- **Where it would live**: Application or Meta-Tooling
- **Dependency on existing tracks**: is anything already on the board?
- **Effort estimate**: small / medium / large
- **User signal**: has the user expressed want/don't-want/neutral?
- **Recommended priority**: high / medium / low

The candidates are listed in priority order, which weights user signal most heavily (the user is the product owner for the Application; the analysis is just a reference).

---

## Candidate 1: `src/sub_conversation.py:SubConversationRunner`

**User signal:** **EXPLICIT WANT** ("I probably want to add that for just 1:1 discussions where I use a sub-agent manually for specific points.")

**Why it matters.** nagent's §9 pattern (disposable sub-conversations via ``) is the cleanest way to handle "investigate this without polluting the main discussion." Manual Slop has it for MMA (`mma_exec.py` is a real subprocess) but not for 1:1 discussions. The user is asking for this.

**What it would do.** A `SubConversationRunner` class that the App can call during a 1:1 discussion:

- `await runner.spawn(prompt: str, *, allowed_tools: list[str] = None, system_prompt: str = None) -> SubConversationResult`
- The runner spawns a fresh Python process (reusing the MMA pattern: `mma_exec.py` template with `--invocation user`, `--parent-conversation `, isolated `~/.manual_slop/sub_conversations/`)
- The sub-process runs to completion (or times out)
- Result returns: a concise artifact (the sub-agent's `` block) + token usage + exit code
- The App inserts the result into the active discussion as a "User" role entry (so the parent LLM sees it on the next turn)
- Cleanup: sub-conversation folder is auto-archived after 7 days (consistent with `log_pruner.py`)

**Where it lives.** Application. Possibly Meta-Tooling too (the `scripts/` directory could use the same primitive).

**Depends on.** None directly. Could leverage MMA's `mma_exec.py` as a starting template. The `public_api_migration_20260606` follow-up track is unrelated.

**Effort.** **Medium.** 2-3 phases: (1) extract reusable subprocess skeleton from MMA, (2) add 1:1-specific context injection, (3) add GUI controls ("Investigate…" button, optional command-palette command).

**Recommended priority.** **HIGH** (user-flagged).

---

## Candidate 2: RAG pre-staging via sub-conversation

**User signal:** **EXPLICIT WANT** ("Would be cool to have a sub agent maybe prepare a rag chunks before I use them in a run.")

**Why it matters.** Manual Slop's RAG (`src/rag_engine.py`) indexes files on the fly at discussion start. For large projects, indexing can take 30+ seconds (per `tests/test_rag_phase4_stress.py`). The user wants a "prep" workflow: before starting a long discussion, fire off a sub-conversation that pre-indexes everything, so the discussion starts instantly.

This is also consistent with nagent's "data preparation is an explicit, visible step" philosophy (§1, §7).
The RAG chunks are artifacts; preparing them is a transformation; the transformation can be a sub-conversation.

**What it would do.** A "Pre-stage RAG" command in the GUI (or in `commands.py`):

- Spawns a sub-conversation with the prompt: "Index all files in [project] for RAG. Use the index_file tool on every file in the context. Report top-K queries at the end."
- The sub-conversation runs `rag_engine.index_file()` on each tracked file (uses the same `ChromaDB` backend, with mtime-based invalidation)
- Returns a concise summary: "Indexed N files. Top-K for 'execution clutch': [file1, file2, file3]."
- The main discussion starts with the index already warm; `RAGEngine.search()` is fast

**Where it lives.** Application. The sub-conversation runner is the same primitive as Candidate 1; the staging logic is `RAGEngine` integration.

**Depends on.** Candidate 1 (sub-conversation runner). Could be done as a feature within Candidate 1's track.

**Effort.** **Small to medium.** The sub-conversation runner is the heavy lift (Candidate 1). The RAG-staging prompt is ~30 lines.

**Recommended priority.** **HIGH** (user-flagged; cheap given Candidate 1).

---

## Candidate 3: Stateless `LLMClient` class

**Why it matters.** `src/ai_client.py` is 2,685 lines of stateful singleton with module-level globals for every provider's history. nagent's `bin/helpers/nagent_llm.py` is 300 lines of stateless dispatch. A refactor toward a stateless `LLMClient(provider, model, conversation)` class would:

- Make `ai_client` parseable (no implicit state to track)
- Make tests deterministic (each test gets a fresh client)
- Enable conversation save/load (the `Conversation` object is the transcript)
- Enable provider switching without losing history

This is a *big* refactor but a high-leverage one. Pitfalls #2 and #4 are both solved.

**What it would do.** A new `src/llm_client.py`:

```python
from __future__ import annotations  # signatures may name Conversation before it is defined
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator

# Message, Tool, and Event are placeholder types to be defined with this module.
@dataclass
class Conversation:
    messages: list[Message]  # role + content + tool_calls + tool_results
    metadata: dict
    def to_dict(self) -> dict: ...
    @classmethod
    def from_dict(cls, data: dict) -> Conversation: ...  # alternate constructor
    def save(self, path: Path) -> None: ...
    @classmethod
    def load(cls, path: Path) -> Conversation: ...  # alternate constructor

class LLMClient:
    def __init__(self, provider: str, model: str, api_key: str | None = None): ...
    def send(self, conversation: Conversation, *, tools: list[Tool] | None = None) -> Conversation: ...
    def stream_send(self, conversation: Conversation, *, tools: list[Tool] | None = None) -> Iterator[Event]: ...
```

Backwards-compat: `ai_client.send(...)` becomes a thin wrapper that constructs a default `Conversation` from the current state and calls the new class.

**Where it lives.** Application (the AI client is the Application's main AI entry point).

**Depends on.** The `data_oriented_error_handling_20260606` track is independent but related: both push toward the data-oriented principles. The `public_api_migration_20260606` follow-up track would benefit from the new `Conversation` class.

**Effort.** **Large.** 3-5 phases: (1) introduce `Conversation` dataclass, (2) per-provider `LLMClient.send`, (3) migration of existing `ai_client.send` callers, (4) deprecate module-level globals, (5) remove. ~2000+ lines of refactor.

**Recommended priority.** **MEDIUM.** High value, but the existing stateful singleton works. Defer until a concrete Application need forces it (e.g., the user wanting to save/replay conversations).

---

## Candidate 4: Intent-based DSL for Meta-Tooling tool calls

**User signal:** **EXPLICIT WANT** ("The tool use is kinda upfront, I want to add an intent based dsl to help with 'discovery' or combinatorics but no where near that ideation yet.")

**Why it matters.** nagent's §4 regex-tag protocol is more debuggable than Manual Slop's function-calling.
The Meta-Tooling (the external agents that build the Application) could benefit from a more compact, inspectable tool-call format. The existing JSON function-calling format forces the user to read verbose `{"name": "...", "args": {...}}` blobs.

**What it would do.** An intent-based DSL that the Meta-Tooling can use in its own work. Examples (per the user's "discovery" or "combinatorics" hint):

- ``: intent: read this symbol
- ``: intent: semantic search the workspace
- ``: intent: surgical line-range edit
- ``: intent: run a specific test
- ``: intent: dependency trace

These are read by the external agent (Gemini CLI, OpenCode), not by Manual Slop's Application AI. The Application's function-calling format stays the same (correct for its domain).

**Where it lives.** Meta-Tooling. Documented in `docs/`; taught via the conductor convention; the external agent emits the DSL, the bridge script (`cli_tool_bridge.py`) translates to actual `mcp_client.py` tool calls.

**Depends on.** None directly. The `mcp_architecture_refactor_20260606` may produce tools that are easier to call via DSL (atomic, composable).

**Effort.** **Research spike, not implementation.** The user said "no where near that ideation yet." This is a design exercise, not a code change.

**Recommended priority.** **LOW** (user explicitly deferred).

---

## Candidate 5: Self-describing MCP tools (nagent §12 pattern)

**Why it matters.** Manual Slop's 45 MCP tools are dispatched by a flat if/elif in `mcp_client.py:dispatch`. Adding a tool requires edits in 4 places (dispatch, security allowlist, capability declaration, tests). nagent's `--description` self-describing executable pattern is more extensible: drop an executable, it auto-appears.

**What it would do.** Each sub-MCP (or each tool) emits a `--description` block on `--help`. The `dispatch` function introspects via `mcp_client.get_tool_schemas()` and includes the descriptions in the AI's initial context automatically.

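
One way to picture the self-describing idea inside a single Python process is a registry where each tool carries its own description, so dispatch and the context listing both derive from the same declaration. This is a minimal sketch under stated assumptions: `ToolSpec`, `tool`, `describe_tools`, and the `echo` tool are hypothetical names, not existing `mcp_client.py` APIs, and the security allowlist and capability declaration are not modeled here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical record: one tool, its description, and its handler."""
    name: str
    description: str
    handler: Callable[..., str]


_REGISTRY: dict[str, ToolSpec] = {}


def tool(name: str, description: str):
    """Decorator that registers a handler together with its description."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        if name in _REGISTRY:
            raise ValueError(f"duplicate tool name: {name}")
        _REGISTRY[name] = ToolSpec(name, description, fn)
        return fn
    return decorator


def describe_tools() -> str:
    """Render one line per registered tool, sorted by name, for the initial context."""
    specs = sorted(_REGISTRY.values(), key=lambda spec: spec.name)
    return "\n".join(f"- {spec.name}: {spec.description}" for spec in specs)


def dispatch(name: str, **kwargs) -> str:
    """Look the tool up in the registry instead of walking an if/elif chain."""
    spec = _REGISTRY.get(name)
    if spec is None:
        raise KeyError(f"unknown tool: {name}")
    return spec.handler(**kwargs)


@tool("echo", "Return the given text unchanged.")
def echo(text: str) -> str:
    return text
```

With this shape, adding a tool is one decorated function; whether the allowlist and tests can be derived the same way is a question for the refactor track.
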
**Where it lives.** Application (the dispatch layer). The Meta-Tooling already has self-describing (via `claude_tool_bridge.py`); this is the Application-side equivalent.

**Depends on.** The `mcp_architecture_refactor_20260606` is the natural place: the sub-MCPs would each be self-describing modules.

**Effort.** **Medium** (subsumed by mcp_architecture_refactor_20260606). Not a separate track.

**Recommended priority.** **LOW** (subsumed).

---

## Candidate 6: `src/git_history.py` (nagent §7 pattern)

**Why it matters.** Manual Slop's `_reread_file_items` does current-content diff injection. nagent's `file_edit_history_and_summary_block` does *historical* content injection: `git log --follow ` per file, LLM-summarized, plus co-edit neighborhood. For "explain this file" questions, the LLM is meeting the file fresh; git history would give it crucial context (who touched it last, why, what's nearby).

**What it would do.** A `src/git_history.py:file_edit_history_and_summary_block(file_path, repo_root, provider, model, config_path, previous_initial_context=None) -> str` that:

- Calls `git log --follow --max-count=50 --date=short --format=...` per file
- Counts co-edited files per commit
- LLM-summarizes new commits (with cache for unchanged history)
- Renders a `{file-history}` block with editors, step-by-step, co-edited files, summarized commits
- Called from `aggregate.py:run` at discussion start, after the file is added to context

**Where it lives.** Application (it's part of the AI's initial context).

**Depends on.** None directly. The `data_oriented_error_handling_20260606` is independent. The `rag_engine.py` already has a `sourcesha256` field and mtime-based invalidation, which is the same pattern.

**Effort.** **Medium.** 2 phases: (1) git history + co-edit, (2) LLM summarization with cache. ~300-500 lines.

**Recommended priority.** **MEDIUM** (high value, but only after Candidates 1-2 are done).

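
The "counts co-edited files per commit" step can be sketched separately from the LLM summarization. The helper names below (`commits_touching`, `files_in_commit`, `coedit_counts`) are hypothetical, not nagent's or Manual Slop's actual functions. The sketch assumes a two-pass approach: list the commits that touched the file, then list each commit's changed paths, since `git log --follow` restricts its own output to the followed file.

```python
import subprocess
from collections import Counter


def commits_touching(path: str, repo_root: str, max_count: int = 50) -> list[str]:
    """Return hashes of commits that touched `path`, following renames."""
    out = subprocess.run(
        ["git", "log", "--follow", f"--max-count={max_count}", "--format=%H", "--", path],
        cwd=repo_root, capture_output=True, text=True, check=True,
    ).stdout
    return out.split()


def files_in_commit(sha: str, repo_root: str) -> list[str]:
    """Return the paths changed by one commit (the empty format drops the header)."""
    out = subprocess.run(
        ["git", "show", "--name-only", "--format=", sha],
        cwd=repo_root, capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line.strip()]


def coedit_counts(target: str, commit_files: dict[str, list[str]]) -> list[tuple[str, int]]:
    """Count how often each other file appears in the commits that touched `target`.

    `commit_files` maps commit hash -> changed paths; every commit in it is
    assumed to have touched the target (possibly under an older name).
    """
    counts: Counter[str] = Counter()
    for files in commit_files.values():
        counts.update(f for f in set(files) if f != target)
    # Most frequent first; ties broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Keeping `coedit_counts` free of subprocess calls means it can be unit-tested with a plain dict; the same function could back the co-edit table discussed under Candidate 8.
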
---

## Candidate 7: Per-file conversation log (nagent §6 conversation dimension)

**Why it matters.** Manual Slop's per-file memory is the *curation* kind. nagent's is the *conversation log* kind. The user has the curation already; the conversation log is missing. The user's correction made this clear: the two are *different optimizations*, not equivalent.

**What it would do.** A thin `~/.manual_slop/per_file/.md` per file (file_id by `st_dev:st_ino` for stability across renames, like nagent). Updated each time a discussion references the file. Format:

```markdown
# src/foo.py (file_id: 12345:67890)
Last referenced: 2026-06-08T12:34:56 (Discussion: "refactor auth")

## 2026-06-08T12:34:56 - "how does the validation work?"
AI response: ...
(User) followup: "what about edge cases?"

## 2026-06-05T... - "explain the parser"
AI response: ...
```

When the user opens a new discussion with the file in context, the per-file log is injected as a `{per-file-history}` block.

**Where it lives.** Application (the per-file log is the App's memory). The Meta-Tooling doesn't need this, since sub-agent invocations are already short-lived.

**Depends on.** None. Could be added in a small follow-up to Candidate 3 (the `Conversation` object becomes the per-file log).

**Effort.** **Small** if done as a thin layer on top of the `Conversation` class. **Medium** if done before Candidate 3 (no `Conversation` object to leverage).

**Recommended priority.** **LOW** (niche feature).

---

## Candidate 8: `py_coedited_files` / `ts_c_coedited_files` MCP tools (nagent §8)

**Why it matters.** nagent's `coedited_file_rows` produces a "files that historically co-edit with this file" table. Manual Slop has `py_get_hierarchy` (subclass scan) but no historical co-edit tool. Useful for "if I edit this file, what should I also look at?".

**What it would do.** Two new MCP tools:

- `py_coedited_files(path: str) -> list[{path, commits_together, likelihood}]`: runs `git log --follow `, counts files in each commit, labels high/medium/low
- `ts_c_coedited_files(path: str) -> list[{path, commits_together, likelihood}]`: same, for C/C++

Returns a table. Used in the initial context as `{file-neighborhood}`.

**Where it lives.** Application (initial context injection).

**Depends on.** None. Small, contained.

**Effort.** **Small.** ~200 lines + tests. The git-log is already in `aggregate.py`; this is a new tool that uses the same primitives.

**Recommended priority.** **LOW** (small but niche). Worth bundling with Candidate 6 if that gets done.

---

## Candidate 9: Explicit `src/split_lib.py` + `src/patch_lib.py` (nagent §11)

**Why it matters.** Manual Slop doesn't have an explicit split/patch pipeline. For very large files (>50 KB), the current `aggregate.py` + tree-sitter approach works for *reading* (skeleton, summary) but not for *patching* (no explicit segment/hash model).

**What it would do.** Mirror nagent's design:

- `src/split_lib.py`: per-language natural splitters, `index.json` with `source_path`, `sourcesha256`, `segments[]`
- `src/patch_lib.py`: strict `validate_index` (hash check), `make_unified_patch`, `apply_segment_patches`
- `src/summarize_lib.py`: per-segment LLM call + retry-with-smaller-prompt

**Where it lives.** Application (the AI is the consumer). The Meta-Tooling already has nagent if it wants this.

**Depends on.** None. Self-contained.

**Effort.** **Medium.** 2 phases: split/patch, then summarize. ~500 lines.

**Recommended priority.** **DEFER UNTIL NEEDED.** No current 1:1 use case requires explicit split/patch. If a future file is genuinely too large for tree-sitter to handle inline, this rises to the same (high) priority as Candidate 2.

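
To make the segment/hash model concrete, here is a minimal sketch of building an index and refusing to patch against a stale one. Only `source_path`, `sourcesha256`, `segments`, and `validate_index` come from the description above. The per-segment fields (`start_line`, `end_line`, `sha256`) and the fixed-size line windows are assumptions standing in for the per-language natural splitters.

```python
import hashlib


def sha256_text(text: str) -> str:
    """Hex SHA-256 of the UTF-8 encoding of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_index(source_path: str, text: str, max_lines: int = 200) -> dict:
    """Split `text` into fixed-size line windows and record a hash per segment.

    Fixed windows are a placeholder; a real splitter would cut at natural
    boundaries such as function or class definitions.
    """
    lines = text.splitlines(keepends=True)
    segments = []
    for start in range(0, len(lines), max_lines):
        chunk = lines[start:start + max_lines]
        segments.append({
            "start_line": start + 1,           # 1-indexed, inclusive
            "end_line": start + len(chunk),    # 1-indexed, inclusive
            "sha256": sha256_text("".join(chunk)),
        })
    return {
        "source_path": source_path,
        "sourcesha256": sha256_text(text),
        "segments": segments,
    }


def validate_index(index: dict, current_text: str) -> None:
    """Raise ValueError if the file changed since the index was built."""
    if index["sourcesha256"] != sha256_text(current_text):
        raise ValueError(
            f"{index['source_path']} changed since it was indexed; re-split before patching"
        )
```

The strict check is the point of the design: a segment patch computed against old line ranges is rejected up front instead of being applied to shifted content.
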
---

## Candidate 10: Optional raw-transcript persistence per Take (nagent §3 conversation dimension)

**Why it matters.** nagent's "edit the conversation file" pattern is foreign to Manual Slop because the App stores abstracted entries (`disc_entries`), not raw transcripts. The user-edit feature in the GUI does edit individual entries, but the underlying log of `function_call` / `tool_result` blocks is implicit.

**What it would do.** Optionally, when a take is snapshotted to TOML (`project_manager.save_project`), also persist the raw transcript to a sibling file `discussions//transcript.jsonl`. The GUI gets a "View Raw Transcript" button. Optional "Edit Raw Transcript" mode that re-parses and re-aggregates.

**Where it lives.** Application. Optional: user can toggle per-project.

**Depends on.** None. Could be a small follow-up to Candidate 3 (`Conversation` class).

**Effort.** **Small.** ~150 lines + tests. Persist the existing `comms.log` in a structured way.

**Recommended priority.** **LOW** (niche feature, opt-in only).

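
The persistence half of this candidate is small enough to sketch. The record shape (`type`, `name`, `content` keys) and the function names are assumptions for illustration; the actual `comms.log` structure is not shown in this document.

```python
import json
from pathlib import Path


def append_transcript(path: Path, records: list[dict]) -> None:
    """Append one JSON object per line, creating parent directories as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_transcript(path: Path) -> list[dict]:
    """Load every non-blank line back into a dict, preserving order."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Append-only JSONL keeps each take's snapshot cheap and lets a "View Raw Transcript" view stream the file line by line; an edit mode would still need the re-parse and re-aggregate step described above.
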
---

## Summary table

| # | Candidate | User signal | Priority | Effort | Domain |
|---|---|---|---|---|---|
| 1 | `SubConversationRunner` (1:1 sub-convos) | **Explicit want** | **HIGH** | Medium | App + MT |
| 2 | RAG pre-staging via sub-conversation | **Explicit want** | **HIGH** | Small to medium (depends on #1) | App |
| 3 | Stateless `LLMClient` class | (none) | Medium | Large | App |
| 4 | Intent-based DSL for Meta-Tooling | Explicit but deferred | Low | Research | MT |
| 5 | Self-describing MCP tools | Implicit | Low (subsumed) | Medium | App + MT |
| 6 | `src/git_history.py` (nagent §7) | (none) | Medium | Medium | App |
| 7 | Per-file conversation log | (none) | Low | Small | App |
| 8 | `py_/ts_c_coedited_files` tools | (none) | Low (bundle with #6) | Small | App |
| 9 | Explicit `split_lib.py` / `patch_lib.py` | (none) | Defer until needed | Medium | App |
| 10 | Raw-transcript persistence per Take | (none) | Low | Small | App |

---

## Recommended next steps

1. **Spec and build Candidate 1 first**: it's the highest-priority user-flagged want, and Candidate 2 builds on it.
2. **Combine Candidate 2 with Candidate 1's track**: same primitive, different prompt.
3. **Hold Candidates 3-10 for future scoping**: each is a separate conductor track when the corresponding need surfaces.

The current `nagent_review_20260608` track itself produces no code; it's the reference. Candidates 1 and 2 will be the first *implementation* tracks informed by it.
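
As a starting point for the Candidate 1 spec, the core of the spawn contract (fresh process, run to completion or time out, return a concise result) might look like the sketch below. It is synchronous, where the candidate describes an awaitable `spawn`, and it omits token usage, tool allowlists, and the `mma_exec.py` arguments. The `spawn` signature and the `SubConversationResult` fields here are assumptions, not the final API.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class SubConversationResult:
    """Hypothetical result shape: the artifact text plus process status."""
    artifact: str
    exit_code: int
    timed_out: bool


def spawn(argv: list[str], prompt: str, timeout_s: float = 300.0) -> SubConversationResult:
    """Run `argv` as a fresh process, feed it `prompt` on stdin, and capture stdout.

    `argv` is whatever command launches the sub-agent; it is passed in so the
    sketch does not guess at the real entry point.
    """
    try:
        proc = subprocess.run(
            argv, input=prompt, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        return SubConversationResult(artifact="", exit_code=-1, timed_out=True)
    return SubConversationResult(
        artifact=proc.stdout.strip(), exit_code=proc.returncode, timed_out=False
    )


if __name__ == "__main__":
    # Stand-in "sub-agent": a child Python process that upper-cases its prompt.
    child = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
    print(spawn(child, "summarize the auth module"))
```

An async variant could wrap the same call in `asyncio.to_thread` or use `asyncio.create_subprocess_exec`; which fits better depends on how the App's event loop is structured.
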