# Future-Track Candidates: nagent Review Follow-ups

**Companion to:** `report.md` (deep-dive), `comparison_table.md` (flat reference), `nagent_takeaways_20260608.md` (actionable patterns)
**Date:** 2026-06-08
**Source:** nagent v1.0.0 deep-dive review (see `report.md`)

This document is the bridge from "what nagent teaches us" to "what Manual Slop should do about it." Each candidate is a *future* conductor track (not this one). The candidates are *not* committed — they emerge from the analysis but each is a separate scoping exercise.

**For an actionable, code-grounded read of these candidates** (with the "what to do today, not just the future track" framing), see `nagent_takeaways_20260608.md` — it maps each candidate to specific patterns, design constraints, and small UX wins that don't need a new track.

---

## Decision-making framework

For each candidate:

- **Why it matters** — what pitfall or capability gap does it address?
- **What it would do** — concrete description
- **Where it would live** — Application or Meta-Tooling
- **Dependency on existing tracks** — is anything already on the board?
- **Effort estimate** — small / medium / large
- **User signal** — has the user expressed want/don't-want/neutral?
- **Recommended priority** — high / medium / low

The candidates are listed in priority order, which weights user signal most heavily (the user is the product owner for the Application; the analysis is just a reference).

---

## Candidate 1: `src/sub_conversation.py:SubConversationRunner`

**User signal:** **EXPLICIT WANT** ("I probably want to add that for just 1:1 discussions where I use a sub-agent manually for specific points.")

**Why it matters.** nagent's §9 pattern (disposable sub-conversations via `<nagent-conversation>`) is the cleanest way to handle "investigate this without polluting the main discussion." Manual Slop has it for MMA (`mma_exec.py` is a real subprocess) but not for 1:1 discussions. The user is asking for this.

**What it would do.** A `SubConversationRunner` class that the App can call during a 1:1 discussion:
- `await runner.spawn(prompt: str, *, allowed_tools: list[str] | None = None, system_prompt: str | None = None) -> SubConversationResult`
- The runner spawns a fresh Python process (reusing the MMA pattern: `mma_exec.py` template with `--invocation user`, `--parent-conversation <active_discussion_id>`, isolated `~/.manual_slop/sub_conversations/<name>`)
- The sub-process runs to completion (or times out)
- Result returns: a concise artifact (the sub-agent's `<response>` block) + token usage + exit code
- The App inserts the result into the active discussion as a "User" role entry (so the parent LLM sees it on the next turn)
- Cleanup: sub-conversation folder is auto-archived after 7 days (consistent with `log_pruner.py`)
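
The spawn, wait-or-time-out, and extract-the-`<response>` steps above could be sketched roughly as follows. This is a minimal sketch, not the MMA implementation: `base_cmd` is a hypothetical stand-in for however `mma_exec.py` ends up being invoked, the prompt is assumed to be passed as the final argument, and token usage, `allowed_tools`, and folder isolation are left out.

```python
from __future__ import annotations

import asyncio
import re
import sys
from dataclasses import dataclass

# Assumption: the sub-agent wraps its final answer in <response>...</response>.
RESPONSE_RE = re.compile(r"<response>(.*?)</response>", re.DOTALL)


@dataclass
class SubConversationResult:
    response: str          # concise artifact handed back to the parent discussion
    exit_code: int | None  # None when the process was killed on timeout
    timed_out: bool


class SubConversationRunner:
    """Hypothetical sketch: one disposable subprocess per spawn() call."""

    def __init__(self, base_cmd: list[str], timeout_s: float = 120.0):
        # base_cmd is an assumption, e.g. [sys.executable, "mma_exec.py", ...].
        self.base_cmd = base_cmd
        self.timeout_s = timeout_s

    async def spawn(self, prompt: str) -> SubConversationResult:
        proc = await asyncio.create_subprocess_exec(
            *self.base_cmd, prompt,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        try:
            stdout, _ = await asyncio.wait_for(proc.communicate(), self.timeout_s)
        except asyncio.TimeoutError:
            proc.kill()        # do not leave the sub-process running
            await proc.wait()
            return SubConversationResult("", None, True)
        text = stdout.decode("utf-8", errors="replace")
        match = RESPONSE_RE.search(text)
        # Fall back to the full output if no <response> block was emitted.
        response = match.group(1).strip() if match else text.strip()
        return SubConversationResult(response, proc.returncode, False)
```

A caller would run `result = await runner.spawn("summarize the auth flow")` and insert `result.response` into the active discussion only when `result.timed_out` is false.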

**Where it lives.** Application. Possibly Meta-Tooling too (the `scripts/` directory could use the same primitive).

**Depends on.** None directly. Could leverage MMA's `mma_exec.py` as a starting template. The `public_api_migration_20260606` follow-up track is unrelated.

**Effort.** **Medium.** 2-3 phases: (1) extract reusable subprocess skeleton from MMA, (2) add 1:1-specific context injection, (3) add GUI controls ("Investigate…" button, optional command-palette command).

**Recommended priority.** **HIGH** — user-flagged.

---

## Candidate 2: RAG pre-staging via sub-conversation

**User signal:** **EXPLICIT WANT** ("Would be cool to have a sub agent maybe prepare a rag chunks before I use them in a run.")

**Why it matters.** Manual Slop's RAG (`src/rag_engine.py`) indexes files on the fly at discussion start. For large projects, indexing can take 30+ seconds (per `tests/test_rag_phase4_stress.py`). The user wants a "prep" workflow: before starting a long discussion, fire off a sub-conversation that pre-indexes everything, so the discussion starts instantly.

This is also consistent with nagent's "data preparation is an explicit, visible step" philosophy (§1, §7). The RAG chunks are artifacts; preparing them is a transformation; the transformation can be a sub-conversation.

**What it would do.** A "Pre-stage RAG" command in the GUI (or in `commands.py`):
- Spawns a sub-conversation with the prompt: "Index all files in [project] for RAG. Use the index_file tool on every file in the context. Report top-K queries at the end."
- The sub-conversation runs `rag_engine.index_file()` on each tracked file (uses the same `ChromaDB` backend, with mtime-based invalidation)
- Returns a concise summary: "Indexed N files. Top-K for 'execution clutch': [file1, file2, file3]."
- The main discussion starts with the index already warm; `RAGEngine.search()` is fast
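
The mtime-based invalidation mentioned above is what keeps a repeated pre-staging pass cheap. A minimal sketch of that loop, assuming a hypothetical `index_file` callable that takes a path (the real `rag_engine.index_file()` signature is not shown in this document) and an in-memory dict standing in for whatever the ChromaDB-backed index records:

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def prestage(
    files: list[Path],
    index_file: Callable[[Path], None],  # assumption: stand-in for rag_engine.index_file
    seen_mtimes: dict[str, int],         # assumption: persisted between runs elsewhere
) -> list[Path]:
    """Index only files whose mtime changed since the last pass; return those files."""
    indexed: list[Path] = []
    for path in files:
        mtime = path.stat().st_mtime_ns
        if seen_mtimes.get(str(path)) == mtime:
            continue  # unchanged since last pass: the index entry is still warm
        index_file(path)
        seen_mtimes[str(path)] = mtime
        indexed.append(path)
    return indexed
```

The returned list gives the sub-conversation what it needs for its "Indexed N files" summary.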

**Where it lives.** Application. The sub-conversation runner is the same primitive as Candidate 1; the staging logic is `RAGEngine` integration.

**Depends on.** Candidate 1 (sub-conversation runner). Could be done as a feature within Candidate 1's track.

**Effort.** **Small to medium.** The sub-conversation runner is the heavy lift (Candidate 1). The RAG-staging prompt is ~30 lines.

**Recommended priority.** **HIGH** — user-flagged; cheap given Candidate 1.

---

## Candidate 3: Stateless `LLMClient` class

**Why it matters.** `src/ai_client.py` is 2,685 lines of stateful singleton with module-level globals for every provider's history. nagent's `bin/helpers/nagent_llm.py` is 300 lines of stateless dispatch. A refactor toward a stateless `LLMClient(provider, model, conversation)` class would:

- Make `ai_client` parseable (no implicit state to track)
- Make tests deterministic (each test gets a fresh client)
- Enable conversation save/load (the `Conversation` object is the transcript)
- Enable provider switching without losing history

This is a *big* refactor but a high-leverage one: it would resolve both Pitfall #2 and Pitfall #4.

**What it would do.** A new `src/llm_client.py`:
```python
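from __future__ import annotations  # Message, Tool, Event remain forward references

from dataclasses import dataclass
from pathlib import Path
from typing import Iterator

@dataclass
class Conversation:
    messages: list[Message]  # role + content + tool_calls + tool_results
    metadata: dict

    def to_dict(self) -> dict: ...
    @classmethod
    def from_dict(cls, data: dict) -> Conversation: ...  # alternate constructor
    def save(self, path: Path) -> None: ...
    @classmethod
    def load(cls, path: Path) -> Conversation: ...  # alternate constructor

class LLMClient:
    # Holds only connection settings; all history lives in the Conversation passed in.
    def __init__(self, provider: str, model: str, api_key: str | None = None): ...
    def send(self, conversation: Conversation, *, tools: list[Tool] | None = None) -> Conversation: ...
    def stream_send(self, conversation: Conversation, *, tools: list[Tool] | None = None) -> Iterator[Event]: ...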
```

Backwards-compat: `ai_client.send(...)` becomes a thin wrapper that constructs a default `Conversation` from the current state and calls the new class.
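
To make the "the `Conversation` object is the transcript" idea concrete, here is a minimal sketch of a JSON round trip plus a stateless send. The two-field `Message` and the `echo_send` function are illustrative assumptions (no provider is called); the point is only that `send` returns a new `Conversation` and leaves its input untouched.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Message:
    role: str     # assumption: tool_calls / tool_results omitted for brevity
    content: str


@dataclass
class Conversation:
    messages: list[Message] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        return asdict(self)  # recurses into the nested Message dataclasses

    @classmethod
    def from_dict(cls, data: dict) -> Conversation:
        messages = [Message(**m) for m in data["messages"]]
        return cls(messages, dict(data.get("metadata", {})))

    def save(self, path: Path) -> None:
        path.write_text(json.dumps(self.to_dict(), indent=2), encoding="utf-8")

    @classmethod
    def load(cls, path: Path) -> Conversation:
        return cls.from_dict(json.loads(path.read_text(encoding="utf-8")))


def echo_send(conversation: Conversation) -> Conversation:
    """Stand-in for LLMClient.send: builds a new Conversation, never mutates the input."""
    reply = Message("assistant", f"echo: {conversation.messages[-1].content}")
    return Conversation(conversation.messages + [reply], dict(conversation.metadata))
```

Because nothing is stored on the client, switching providers mid-discussion would amount to passing the same `Conversation` to a different client.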

**Where it lives.** Application (the AI client is the Application's main AI entry point).

**Depends on.** The `data_oriented_error_handling_20260606` track is independent but related — both push toward the data-oriented principles. The `public_api_migration_20260606` follow-up track would benefit from the new `Conversation` class.

**Effort.** **Large.** 3-5 phases: (1) introduce `Conversation` dataclass, (2) per-provider `LLMClient.send`, (3) migration of existing `ai_client.send` callers, (4) deprecate module-level globals, (5) remove them. Roughly 2,000 or more lines of refactoring.

**Recommended priority.** **MEDIUM.** High value, but the existing stateful singleton works. Defer until a concrete Application need forces it (e.g., the user wanting to save/replay conversations).

---

## Candidate 4: Intent-based DSL for Meta-Tooling tool calls

**User signal:** **EXPLICIT WANT** ("The tool use is kinda upfront, I want to add an intent based dsl to help with 'discovery' or combinatorics but no where near that ideation yet.")

**Why it matters.** nagent's §4 regex-tag protocol is more debuggable than Manual Slop's function-calling. The Meta-Tooling (the external agents that build the Application) could benefit from a more compact, inspectable tool-call format. The existing JSON function-calling format forces the user to read verbose `{"name": "...", "args": {...}}` blobs.

**What it would do.** An intent-based DSL that the Meta-Tooling can use in its own work. Examples (per the user's "discovery" or "combinatorics" hint):
- `<read src/foo.py:MyClass.method>` — intent: read this symbol
- `<search "execution clutch">` — intent: semantic search the workspace
- `<edit src/foo.py:42-50:new code>` — intent: surgical line-range edit
- `<test tests/test_foo.py::test_bar>` — intent: run a specific test
- `<discover what calls X>` — intent: dependency trace

These are read by the external agent (Gemini CLI, OpenCode), not by Manual Slop's Application AI. The Application's function-calling format stays the same (correct for its domain).
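
Since this candidate is still a research spike, the following is only an exploratory sketch of how a bridge script might pull such tags out of agent output. The verb list is taken from the examples above; the grammar (one verb, then free text up to the closing `>`) is an assumption, and it would not cope with an `<edit ...>` payload that itself contains `>`.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumption: verbs are limited to the five examples listed above.
INTENT_RE = re.compile(r"<(read|search|edit|test|discover)\s+([^<>]+)>")


@dataclass
class Intent:
    verb: str      # e.g. "read"
    argument: str  # raw text after the verb, left for a verb-specific parser


def parse_intents(text: str) -> list[Intent]:
    """Return every intent tag found in the text, in order of appearance."""
    return [Intent(m.group(1), m.group(2).strip()) for m in INTENT_RE.finditer(text)]
```

Each `Intent` would then be mapped to a concrete tool call by the bridge; that mapping is deliberately left out here.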

**Where it lives.** Meta-Tooling. Documented in `docs/`; taught via the conductor convention; the external agent emits the DSL, the bridge script (`cli_tool_bridge.py`) translates to actual `mcp_client.py` tool calls.

**Depends on.** None directly. The `mcp_architecture_refactor_20260606` may produce tools that are easier to call via DSL (atomic, composable).

**Effort.** **Research spike, not implementation.** The user said "no where near that ideation yet." This is a design exercise, not a code change.

**Recommended priority.** **LOW** — user explicitly deferred.

---

## Candidate 5: Self-describing MCP tools (nagent §12 pattern)

**Why it matters.** Manual Slop's 45 MCP tools are dispatched by a flat if/elif in `mcp_client.py:dispatch`. Adding a tool requires edits in 4 places (dispatch, security allowlist, capability declaration, tests). nagent's `--description` self-describing executable pattern is more extensible: drop an executable, it auto-appears.

**What it would do.** Each sub-MCP (or each tool) emits a `--description` block on `--help`. The `dispatch` function introspects via `mcp_client.get_tool_schemas()` and includes the descriptions in the AI's initial context automatically.
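
One way to picture the difference from a flat if/elif is an in-process registry, where a tool declares its own description at the point of definition. Note that this swaps nagent's self-describing *executables* for a Python decorator, purely for illustration; the `tool` decorator, `TOOLS` table, and the two sample tools are all hypothetical.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical registry: tool name -> (description, implementation).
TOOLS: dict[str, tuple[str, Callable[..., str]]] = {}


def tool(description: str):
    """Register the decorated function under its own name, with its description."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[fn.__name__] = (description, fn)
        return fn
    return register


@tool("Count whitespace-separated words in the text.")
def word_count(text: str) -> str:
    return str(len(text.split()))


@tool("Upper-case the text.")
def shout(text: str) -> str:
    return text.upper()


def dispatch(name: str, **args) -> str:
    """Look the tool up in the registry instead of walking an if/elif chain."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name][1](**args)


def describe_tools() -> str:
    """Render one line per tool, suitable for the AI's initial context."""
    return "\n".join(f"{name}: {desc}" for name, (desc, _) in sorted(TOOLS.items()))
```

Adding a tool then touches one place (its definition); the security allowlist and tests mentioned above would still need their own handling.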

**Where it lives.** Application (the dispatch layer). The Meta-Tooling already has self-describing tools (via `claude_tool_bridge.py`); this is the Application-side equivalent.

**Depends on.** The `mcp_architecture_refactor_20260606` is the natural place — the sub-MCPs would each be self-describing modules.

**Effort.** **Medium** (subsumed by mcp_architecture_refactor_20260606). Not a separate track.

**Recommended priority.** **LOW** — subsumed.

---

## Candidate 6: `src/git_history.py` (nagent §7 pattern)

**Why it matters.** Manual Slop's `_reread_file_items` does current-content diff injection. nagent's `file_edit_history_and_summary_block` does *historical* content injection: `git log --follow <file>` per file, LLM-summarized, plus co-edit neighborhood. For "explain this file" questions, the LLM is meeting the file fresh — git history would give it crucial context (who touched it last, why, what's nearby).

**What it would do.** A `src/git_history.py:file_edit_history_and_summary_block(file_path, repo_root, provider, model, config_path, previous_initial_context=None) -> str` that:
- Calls `git log --follow --max-count=50 --date=short --format=...` per file
- Counts co-edited files per commit
- LLM-summarizes new commits (with cache for unchanged history)
- Renders a `{file-history}` block with editors, step-by-step, co-edited files, summarized commits
- Called from `aggregate.py:run` at discussion start, after the file is added to context
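
The first phase (collect history, render a block) could look roughly like this. The tab-separated `--format` string is an assumption chosen for easy parsing, since the exact format is not specified above; co-edit counting, LLM summarization, and caching are omitted, and the rendered layout is illustrative only.

```python
from __future__ import annotations

import subprocess
from dataclasses import dataclass

# Assumption: short hash, author, date, subject, separated by tabs (%x09).
LOG_FORMAT = "%h%x09%an%x09%ad%x09%s"


@dataclass
class Commit:
    sha: str
    author: str
    date: str
    subject: str


def git_log_argv(file_path: str, max_count: int = 50) -> list[str]:
    return ["git", "log", "--follow", f"--max-count={max_count}",
            "--date=short", f"--format={LOG_FORMAT}", "--", file_path]


def parse_log(output: str) -> list[Commit]:
    """Parse one commit per line; lines that do not have four fields are skipped."""
    commits = []
    for line in output.splitlines():
        parts = line.split("\t", 3)
        if len(parts) == 4:
            commits.append(Commit(*parts))
    return commits


def read_history(file_path: str, repo_root: str) -> list[Commit]:
    """Run git in repo_root and parse the result (requires git on PATH)."""
    done = subprocess.run(git_log_argv(file_path), cwd=repo_root,
                          capture_output=True, text=True, check=True)
    return parse_log(done.stdout)


def render_history_block(file_path: str, commits: list[Commit]) -> str:
    editors = sorted({c.author for c in commits})
    lines = [f"{{file-history}} {file_path}", "editors: " + ", ".join(editors)]
    lines += [f"- {c.date} {c.sha} {c.subject}" for c in commits]
    return "\n".join(lines)
```

Keeping `parse_log` separate from the subprocess call means the parsing and rendering can be tested without a repository.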

**Where it lives.** Application (it's part of the AI's initial context).

**Depends on.** None directly. The `data_oriented_error_handling_20260606` track is independent. `rag_engine.py` already has a `sourcesha256` field and mtime-based invalidation; the summary cache could follow the same pattern.

**Effort.** **Medium.** 2 phases: (1) git history + co-edit, (2) LLM summarization with cache. ~300-500 lines.

**Recommended priority.** **MEDIUM** — high value, but only after Candidates 1-2 are done.

---

## Candidate 7: Per-file conversation log (nagent §6 conversation dimension)

**Why it matters.** Manual Slop's per-file memory is the *curation* kind. nagent's is the *conversation log* kind. The user has the curation already; the conversation log is missing. The user's correction made this clear: the two are *different optimizations*, not equivalent.

**What it would do.** A thin `~/.manual_slop/per_file/<file_id>.md` per file (file_id by `st_dev:st_ino` for stability across renames, like nagent). Updated each time a discussion references the file. Format:
```markdown
# src/foo.py (file_id: 12345:67890)
Last referenced: 2026-06-08T12:34:56 (Discussion: "refactor auth")

## 2026-06-08T12:34:56 - "how does the validation work?"
AI response: ...
(User) followup: "what about edge cases?"

## 2026-06-05T... - "explain the parser"
AI response: ...
```

When the user opens a new discussion with the file in context, the per-file log is injected as a `{per-file-history}` block.
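
The reason for keying on `st_dev:st_ino` is that the identifier follows the file through a rename on the same filesystem. A minimal sketch, assuming a hypothetical `root` directory in place of `~/.manual_slop` and replacing the colon in the file name with an underscore (an assumption, since colons are not valid in Windows file names):

```python
from __future__ import annotations

import os
from pathlib import Path


def file_id(path: Path) -> str:
    """Device and inode numbers; stable across renames within one filesystem."""
    st = os.stat(path)
    return f"{st.st_dev}:{st.st_ino}"


def log_path(root: Path, source: Path) -> Path:
    # Assumption: ':' is swapped for '_' so the name is valid on every platform.
    return root / "per_file" / f"{file_id(source).replace(':', '_')}.md"


def append_entry(root: Path, source: Path, timestamp: str,
                 question: str, response: str) -> Path:
    """Append one discussion excerpt to the file's log, creating it on first use."""
    target = log_path(root, source)
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        target.write_text(f"# {source} (file_id: {file_id(source)})\n", encoding="utf-8")
    with target.open("a", encoding="utf-8") as fh:
        fh.write(f'\n## {timestamp} - "{question}"\nAI response: {response}\n')
    return target
```

One caveat worth scoping: editors that save by writing a new file and replacing the old one will change the inode, so the log would start fresh in that case.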

**Where it lives.** Application (the per-file log is the App's memory). The Meta-Tooling doesn't need this — sub-agent invocations are already short-lived.

**Depends on.** None. Could be added in a small follow-up to Candidate 3 (the `Conversation` object becomes the per-file log).

**Effort.** **Small** if done as a thin layer on top of the `Conversation` class. **Medium** if done before Candidate 3 (no `Conversation` object to leverage).

**Recommended priority.** **LOW** — niche feature.

---

## Candidate 8: `py_coedited_files` / `ts_c_coedited_files` MCP tools (nagent §8)

**Why it matters.** nagent's `coedited_file_rows` produces a "files that historically co-edit with this file" table. Manual Slop has `py_get_hierarchy` (subclass scan) but no historical co-edit tool. Useful for "if I edit this file, what should I also look at?".

**What it would do.** Two new MCP tools:
- `py_coedited_files(path: str) -> list[{path, commits_together, likelihood}]` — runs `git log --follow <path>`, counts files in each commit, labels high/medium/low
- `ts_c_coedited_files(path: str) -> list[{path, commits_together, likelihood}]` — same, for C/C++

Returns a table. Used in the initial context as `{file-neighborhood}`.
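
The counting and labeling step can be sketched independently of git. The input here is assumed to be one list of changed paths per commit (however that is collected), and the 50% / 20% thresholds for high / medium / low are illustrative assumptions, not values taken from nagent.

```python
from __future__ import annotations

from collections import Counter


def coedited_rows(target: str, commits: list[list[str]]) -> list[dict]:
    """Rank files by how often they changed in the same commit as `target`."""
    touching = [files for files in commits if target in files]
    counts = Counter(f for files in touching for f in set(files) if f != target)
    rows = []
    for path, together in counts.most_common():
        ratio = together / len(touching)
        # Assumption: thresholds are placeholders to be tuned on real history.
        likelihood = "high" if ratio >= 0.5 else "medium" if ratio >= 0.2 else "low"
        rows.append({"path": path, "commits_together": together, "likelihood": likelihood})
    return rows
```

For example, if `a.py` appears in four commits and `b.py` accompanies it in three of them, `b.py` is labeled `high` under these thresholds.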

**Where it lives.** Application (initial context injection).

**Depends on.** None. Small, contained.

**Effort.** **Small.** ~200 lines + tests. The git-log is already in `aggregate.py`; this is a new tool that uses the same primitives.

**Recommended priority.** **LOW** — small but niche. Worth bundling with Candidate 6 if that gets done.

---

## Candidate 9: Explicit `src/split_lib.py` + `src/patch_lib.py` (nagent §11)

**Why it matters.** Manual Slop doesn't have an explicit split/patch pipeline. For very large files (>50 KB), the current `aggregate.py` + tree-sitter approach works for *reading* (skeleton, summary) but not for *patching* (no explicit segment/hash model).

**What it would do.** Mirror nagent's design:
- `src/split_lib.py` — per-language natural splitters, `index.json` with `source_path`, `sourcesha256`, `segments[]`
- `src/patch_lib.py` — strict `validate_index` (hash check), `make_unified_patch`, `apply_segment_patches`
- `src/summarize_lib.py` — per-segment LLM call + retry-with-smaller-prompt
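
The core of the segment/hash model is that a patch is refused when the source no longer matches the hash recorded at split time. A minimal sketch under stated assumptions: fixed-size line chunks stand in for the per-language natural splitters, the segment fields `start` / `end` are hypothetical, and only the `source_path`, `sourcesha256`, and `segments` keys come from the description above.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_index(path: Path, lines_per_segment: int = 200) -> dict:
    """Split into fixed-size line ranges (a stand-in for language-aware splitting)."""
    line_count = len(path.read_text(encoding="utf-8").splitlines())
    segments = [
        {"start": i + 1, "end": min(i + lines_per_segment, line_count)}  # 1-based, inclusive
        for i in range(0, line_count, lines_per_segment)
    ]
    return {"source_path": str(path), "sourcesha256": sha256_of(path), "segments": segments}


def validate_index(index: dict) -> None:
    """Raise if the source changed after the index was built."""
    current = sha256_of(Path(index["source_path"]))
    if current != index["sourcesha256"]:
        raise ValueError("source changed since it was split; re-split before patching")
```

Calling `validate_index` before applying any segment patch is what would make the check strict: a stale index fails loudly instead of patching the wrong lines.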

**Where it lives.** Application (the AI is the consumer). The Meta-Tooling already has nagent if it wants this.

**Depends on.** None. Self-contained.

**Effort.** **Medium.** 2 phases: split/patch, then summarize. ~500 lines.

**Recommended priority.** **DEFER UNTIL NEEDED.** No current 1:1 use case requires explicit split/patch. If a future file is genuinely too large for tree-sitter to handle inline, this moves up to the same priority as Candidate 2.

---

## Candidate 10: Optional raw-transcript persistence per Take (nagent §3 conversation dimension)

**Why it matters.** nagent's "edit the conversation file" pattern is foreign to Manual Slop because the App stores abstracted entries (`disc_entries`), not raw transcripts. The user-edit feature in the GUI does edit individual entries, but the underlying log of `function_call` / `tool_result` blocks is implicit.

**What it would do.** Optionally, when a take is snapshotted to TOML (`project_manager.save_project`), also persist the raw transcript to a sibling file `discussions/<take_name>/transcript.jsonl`. The GUI gets a "View Raw Transcript" button. Optional "Edit Raw Transcript" mode that re-parses and re-aggregates.
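
A JSON Lines file keeps this cheap: one event per line, appended as it happens, readable without loading the TOML snapshot. The sketch below assumes events are plain dicts; their keys (`type`, `name`, and so on) are hypothetical, and only the `discussions/<take_name>/transcript.jsonl` location comes from the description above.

```python
from __future__ import annotations

import json
from pathlib import Path


def transcript_path(project_dir: Path, take_name: str) -> Path:
    return project_dir / "discussions" / take_name / "transcript.jsonl"


def append_event(path: Path, event: dict) -> None:
    """Append one raw event (e.g. a function_call or tool_result) as a single line."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, ensure_ascii=False) + "\n")


def read_transcript(path: Path) -> list[dict]:
    """Load every event in order; a missing file reads as an empty transcript."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

A "View Raw Transcript" button would only need `read_transcript`; the edit mode is the harder part, since edited events have to be re-aggregated into entries.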

**Where it lives.** Application. Optional — user can toggle per-project.

**Depends on.** None. Could be a small follow-up to Candidate 3 (`Conversation` class).

**Effort.** **Small.** ~150 lines + tests. Persist the existing `comms.log` in a structured way.

**Recommended priority.** **LOW** — niche feature, opt-in only.

---

## Summary table

| # | Candidate | User signal | Priority | Effort | Domain |
|---|---|---|---|---|---|
| 1 | `SubConversationRunner` (1:1 sub-convos) | **Explicit want** | **HIGH** | Medium | App + MT |
| 2 | RAG pre-staging via sub-conversation | **Explicit want** | **HIGH** | Small to medium (depends on #1) | App |
| 3 | Stateless `LLMClient` class | (none) | Medium | Large | App |
| 4 | Intent-based DSL for Meta-Tooling | Explicit but deferred | Low | Research | MT |
| 5 | Self-describing MCP tools | Implicit | Low (subsumed) | Medium | BOTH |
| 6 | `src/git_history.py` (nagent §7) | (none) | Medium | Medium | App |
| 7 | Per-file conversation log | (none) | Low | Small | App |
| 8 | `py_/ts_c_coedited_files` tools | (none) | Low (bundle with #6) | Small | App |
| 9 | Explicit `split_lib.py` / `patch_lib.py` | (none) | Defer until needed | Medium | App |
| 10 | Raw-transcript persistence per Take | (none) | Low | Small | App |

---

## Recommended next steps

1. **Spec and build Candidate 1 first** — it's the highest-priority user-flagged want, and Candidate 2 builds on it.
2. **Combine Candidate 2 with Candidate 1's track** — same primitive, different prompt.
3. **Hold Candidates 3-10 for future scoping** — each is a separate conductor track when the corresponding need surfaces.

The current `nagent_review_20260608` track itself produces no code; it's the reference. Candidates 1 and 2 will be the first *implementation* tracks informed by it.