Per user request 'use your remaining context to update agent workflow
docs and then regular docs based on what was discussed in this report',
this commit creates/updates 15 files derived from the v2.3 nagent
review (the 12 new nagent additions + the 4 memory dimensions
reframing + the cache strategy + the RAG discipline + the knowledge
harvest pattern).
Agent workflow docs (4 files):
- AGENTS.md (UPDATE): add @import line to canonical DOD + 'Code
Styleguides' section pointing to the 6 new styleguides + new
'Human-Facing Documentation' section pointing to ./docs/AGENTS.md
- conductor/workflow.md (UPDATE): new section 'Additions (2026-06-12) - the
12 patterns from the latest nagent corpus' with TDD protocols for
knowledge harvest, cache ordering, compaction, RAG discipline
- conductor/product-guidelines.md (UPDATE): new sections 'Memory
Dimensions (added 2026-06-12)' + 'See Also - Updated' with the
6-styleguide catalog
- docs/AGENTS.md (NEW): the agent-facing mirror of docs/Readme.md
(per the nagent CLAUDE.md pattern). 10 sections + the per-tier
reading path + the 4 memory dimensions + the caching strategy +
the knowledge harvest + the RAG discipline + the feature flags
Regular docs (11 files):
- 6 new styleguides (the convention catalog):
* data_oriented_design.md: the canonical DOD reference (Tier
0/1/2; 3 defaults to reject; 8 core defaults; 7-question
simplification pass; 10-question self-check; 4 memory
dimensions in Manual Slop context)
* agent_memory_dimensions.md: the 4 memory dims (curation /
discussion / RAG / knowledge) + when to use each + the
boundaries
* rag_integration_discipline.md: the conservative-RAG rule
(opt-in, complement, provenance, no mutation, feature-gated,
graceful failure)
* cache_friendly_context.md: stable-to-volatile context
ordering + the cache TTL GUI contract + the byte-comparison
test
* knowledge_artifacts.md: the knowledge harvest pattern
(category files, provenance, sha256 ledger, digest
regeneration, 'delete to turn off')
* feature_flags.md: file presence vs config flags vs CLI flags
- 3 new project docs (the cross-cutting guides):
* guide_agent_memory_dimensions.md: the cross-cutting guide on
the 4 dims + the decision tree
* guide_caching_strategy.md: caching across providers +
stable-to-volatile ordering + cache TTL GUI + the byte-
comparison test + the 5th provider (claude-code)
* guide_knowledge_curation.md: the knowledge memory guide (4th
dim) + the 5 category files + per-file notes + the digest +
the ledger + the harvest workflow
- 2 existing doc updates:
* guide_mma.md: new sections 'Delegation as context management'
+ 'The 4 memory dimensions (the MMA scope)'
* guide_ai_client.md: new section 'Cache strategy and the 12-
layer model' + the 5th provider (claude-code)
All files use the same style as the v2.3 review (the user's preferred
format): 7-column tables, no JSON, SSDL shape tags, forth/array
notation, file:line citations, ASCII sketches where useful. The
human Readme files (Readme.md, docs/Readme.md) are NOT modified
(per repeated user instruction).
The 5th provider (claude-code) is documented in guide_ai_client.md.
data_oriented_design.md references the nagent pattern as the source
of the canonical rules.
The cross-references are bidirectional: the 6 styleguides reference
the 3 project docs; the 3 project docs reference the 6 styleguides;
the 2 doc updates reference both; AGENTS.md + ./docs/AGENTS.md
provide the entry points.
RAG Integration Discipline
Status: Styleguide; codifies when and how to wire RAG (the opt-in, semantic-search memory dimension) into Manual Slop features.
Date: 2026-06-12
Cross-refs: conductor/code_styleguides/agent_memory_dimensions.md §3; conductor/code_styleguides/data_oriented_design.md §9; docs/guide_rag.md.
What this is. RAG is the opt-in, semantic-search memory dimension. It's useful (semantic search across large codebases; concept-level discovery; cross-file pattern matching grep can't do). It's also fuzzy (vector similarity, not exact) and opaque (the vector store is not user-editable). The discipline: be conservative about when to wire it in. The wrong shape for the right question is a common mistake.
0. The 6 rules (the one-glance table)
| # | Rule | Why |
|---|---|---|
| 1 | RAG is opt-in. Default-off in new projects | Most features don't need it; the cost of unnecessary RAG is the embedding-provider round trip + the storage cost |
| 2 | RAG complements; it never replaces | Curation / Discussion / Knowledge are the durable, user-editable dimensions; RAG is the fuzzy, semantic search |
| 3 | RAG results display with provenance | The user needs to know which file and which chunk produced the result |
| 4 | RAG never mutates state | No auto-injection of RAG results into disc_entries; no auto-update of FileItem; no auto-write to disk |
| 5 | RAG integration is feature-gated | A feature must explicitly request RAG in its scope; RAG is not the default for "give me context" |
| 6 | RAG failure is graceful | A failed search returns Result.empty or an empty list; never crashes the request |
1. RAG is opt-in (Rule 1)
The default is OFF. A new project opens with rag_enabled = false. The user opts in via the AI Settings panel.
The rationale. RAG is not free:
- The embedding-provider round trip adds latency (200-500ms per call, per provider)
- The storage cost grows with the indexed corpus (per RAGConfig.chunk_size and chunk_overlap)
- The dim-mismatch fix at 16412ad5 shows that switching providers requires a full re-index (the existing collection is incompatible with the new provider's embedding dimension)
For a project that doesn't need semantic search (e.g., a small Python project with 20 files), RAG is overhead, not benefit.
The opt-in surface. Per the existing [ai_settings.toml] pattern:
- [X] Enable RAG checkbox
- Source: (project / global / none) radio
- Embedding provider: (gemini / local) dropdown
- Chunk size: integer (default 1000)
- Chunk overlap: integer (default 200)
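To make the chunk size and chunk overlap settings concrete, here is a minimal sketch of how overlapping chunk spans could be computed. It assumes a simple sliding window measured in the same unit as the settings; `chunk_spans` is a hypothetical helper for illustration, not the indexer's actual code.

```python
def chunk_spans(text_length: int, chunk_size: int = 1000,
                chunk_overlap: int = 200) -> list[tuple[int, int]]:
    """Return (offset, length) spans that cover text_length with overlap.

    Assumption: a plain sliding window; the real indexer may split differently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # each new chunk starts this far along
    spans: list[tuple[int, int]] = []
    offset = 0
    while offset < text_length:
        length = min(chunk_size, text_length - offset)
        spans.append((offset, length))
        if offset + length >= text_length:
            break  # the last chunk reached the end of the text
        offset += step
    return spans
```

With the defaults, a 3000-unit file yields four spans instead of the three it would produce with no overlap, which is one way the overlap setting feeds into the storage cost.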
The opt-out is also supported. rm -r ~/.manual_slop/.slop_cache/chroma_<provider>/ deletes the index (the target is a directory, so rm needs -r). Re-enabling requires a full re-index.
The opt-out via the AI Settings:
[ai_settings.rag]
enabled = false # default for new projects
The opt-in is explicit:
[ai_settings.rag]
enabled = true
source = "project"
embedding_provider = "gemini"
chunk_size = 1000
chunk_overlap = 200
2. RAG complements; it never replaces (Rule 2)
The 4 memory dimensions (per conductor/code_styleguides/agent_memory_dimensions.md):
| Dim | SSDL | Use when |
|---|---|---|
| Curation | [Q] | "How to render a file" |
| Discussion | o==> | "What was said in this chat" |
| RAG | [Q] | "What similar content exists" |
| Knowledge | o==> | "What we learned from past runs" |
The rule. RAG is the fuzzy semantic search dimension. It is NOT:
- A replacement for curation (use FileItem.view_mode + Fuzzy Anchors)
- A replacement for discussion (use disc_entries)
- A replacement for knowledge (use knowledge/digest.md)
The cross-cutting principle. When a feature asks "give me context," the answer is not "enable RAG." The answer is "which of the 4 dimensions is the right home?" — and the 4-dim decision tree is the test.
The "complement" examples:
- A new discussion opens: render the active preset's FileItems (curation) + the disc_entries (discussion) + the knowledge digest (knowledge). Optionally append {rag-context} if the user has opted in.
- The LLM asks "what's the execution clutch?": try knowledge first (the user has decided it's a durable concept). Try discussion second (search the prior entries for "clutch"). Try RAG third (semantic search across the indexed codebase). Curation fourth (the user has configured specific files).
- The user asks "where does X happen?": RAG is the natural shape for this question (semantic search). Use it.
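The concept-question ordering in the second example (knowledge, then discussion, then RAG, then curation) can be sketched as a first-hit lookup chain. The names `resolve_concept` and `concept_lookups` are hypothetical, and each lookup is assumed to be a callable that returns matching text or None.

```python
from typing import Callable, Optional

# A lookup takes the query and returns matching text, or None when it has nothing.
Lookup = Callable[[str], Optional[str]]


def resolve_concept(query: str,
                    lookups: list[tuple[str, Lookup]]) -> Optional[tuple[str, str]]:
    """Return (dimension, hit) from the first dimension that answers the query."""
    for dimension, lookup in lookups:
        hit = lookup(query)
        if hit:
            return dimension, hit
    return None


def concept_lookups(knowledge: Lookup, discussion: Lookup, rag: Lookup,
                    curation: Lookup, rag_enabled: bool) -> list[tuple[str, Lookup]]:
    """Order the dimensions for a concept question; RAG joins only when opted in."""
    ordered = [("knowledge", knowledge), ("discussion", discussion)]
    if rag_enabled:
        ordered.append(("rag", rag))
    ordered.append(("curation", curation))
    return ordered
```

Returning the dimension name alongside the hit keeps the answer's origin visible, and leaving RAG out of the chain when `rag_enabled` is false keeps the opt-in rule intact.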
3. Provenance required (Rule 3)
The principle. When RAG returns results, the user must be able to see which file and which chunk produced the result. No black boxes.
The RAG result shape (per RAGEngine.search):
from dataclasses import dataclass

@dataclass
class SearchResult:
    file_path: str     # the absolute path
    chunk_offset: int  # byte offset within the file
    chunk_length: int  # length in bytes
    content: str       # the matched text
    similarity: float  # the cosine similarity
The display in the LLM context (the {rag-context} block):
{rag-context}
## src/ai_client.py:512-768 (similarity: 0.87)
...content...
## src/aggregate.py:142-289 (similarity: 0.82)
...content...
{/rag-context}
The display in the GUI (the per-result tooltip):
[Anthropic cache-aware send]
File: src/ai_client.py:512-768
Similarity: 0.87
Click to jump to file
The provenance is not optional. If a result has no provenance, it doesn't go in the context.
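A minimal sketch of how the {rag-context} block could be rendered while enforcing the "no provenance, no context" rule. `render_rag_block` and `has_provenance` are hypothetical helpers. The sketch also assumes the displayed range is chunk_offset to chunk_offset + chunk_length, and it prints the path exactly as stored.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    file_path: str
    chunk_offset: int
    chunk_length: int
    content: str
    similarity: float


def has_provenance(result: SearchResult) -> bool:
    """A result is usable only if it names a file and a non-empty chunk."""
    return bool(result.file_path) and result.chunk_offset >= 0 and result.chunk_length > 0


def render_rag_block(results: list[SearchResult]) -> str:
    """Render results with provenance; return "" when nothing qualifies."""
    kept = [r for r in results if has_provenance(r)]
    if not kept:
        return ""
    lines = ["{rag-context}"]
    for r in kept:
        end = r.chunk_offset + r.chunk_length  # assumption: range is offset..offset+length
        lines.append(f"## {r.file_path}:{r.chunk_offset}-{end} (similarity: {r.similarity:.2f})")
        lines.append(r.content)
    lines.append("{/rag-context}")
    return "\n".join(lines)
```

Results without a file path or chunk span are dropped before rendering, so the block never contains text the user cannot trace back to a source.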
The cross-references. The dim-mismatch fix at 16412ad5 shows the kind of bug that happens when the RAG index loses provenance: switching providers silently corrupts the index because the embeddings have different dimensions. The provenance (file path + chunk offset) is what makes the index re-buildable.
4. RAG never mutates state (Rule 4)
The principle. RAG is a query dimension. It returns data; it does not write data.
The mutation rules:
- RAG results do NOT go into disc_entries
- RAG results do NOT update FileItem curation state
- RAG results do NOT write to disk
- RAG results do NOT trigger knowledge harvest
- RAG results do NOT modify the system prompt or persona
The exception (none). There is no feature that should mutate state from RAG results. If a feature wants to "remember" something from RAG, the user must explicitly say "add that to the discussion" (which appends a role: "User" entry to disc_entries) or "harvest that into knowledge" (which runs the harvest workflow).
The boundary in code:
# In ai_client.py:send() (the integration point)
def send(...):
    prompt = aggregate.build(...)
    if config.rag_enabled:
        results = rag_engine.search(prompt, k=N)
        prompt = append_rag_block(prompt, results)  # READ ONLY
    return self._send_<provider>(prompt, ...)
    # NO mutation of: disc_entries, FileItem, knowledge files
The mutation must happen in a different function, called explicitly by the user or the LLM with HITL approval.
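One way to check the read-only boundary in a test is to snapshot the state before the RAG path runs and compare afterwards. The sketch below assumes the relevant state can be represented as a plain dict; `run_read_only` and the two example functions are hypothetical.

```python
import copy
from typing import Any, Callable


def run_read_only(fn: Callable[[dict], Any], state: dict) -> Any:
    """Call fn(state) and raise if fn changed state in place."""
    before = copy.deepcopy(state)
    result = fn(state)
    if state != before:
        raise AssertionError("RAG path mutated application state")
    return result


def append_rag_text(state: dict) -> str:
    """Well-behaved: builds a new prompt string and touches nothing in state."""
    return state["prompt"] + "\n" + "\n".join(state["rag_results"])


def inject_into_discussion(state: dict) -> str:
    """Forbidden: appends RAG output to disc_entries as a side effect."""
    state["disc_entries"].append({"role": "AI", "content": state["rag_results"][0]})
    return state["prompt"]
```

Wrapping the RAG call site in `run_read_only` inside a test lets `append_rag_text` pass and makes `inject_into_discussion` fail loudly.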
5. Feature-gated integration (Rule 5)
The principle. A feature must explicitly request RAG in its scope. RAG is not the default for "give me context."
The gate. Every feature that uses RAG declares the dependency in its spec, plan, and changelog:
## Scope
- Feature X (uses RAG for semantic search)
- Feature Y (no RAG dependency; uses Curation + Discussion only)
## Dependencies
- RAG is required for Feature X; the user must opt-in via AI Settings
- Feature Y is independent of RAG
The runtime gate. The feature's code checks config.rag_enabled and behaves accordingly:
# In the feature's code
def feature_x(query: str) -> list[SearchResult]:
    if not config.rag_enabled:
        raise RAGNotEnabledError("Feature X requires RAG; opt in via AI Settings")
    return rag_engine.search(query, k=N)
The error message is explicit. The user knows why the feature isn't working.
The CLI surface (for testing and debugging):
$ python -m src.feature_x "execution clutch"
# Error: RAG not enabled. Enable via: [ai_settings.toml] rag.enabled = true
The audit trail. Every feature that uses RAG is logged in metadata.json for the feature's track: uses_rag: true.
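The audit-trail entry could be written with a small helper like the one below. Only the uses_rag key comes from the rule above; the layout of the rest of metadata.json is not specified here, so the sketch preserves whatever keys already exist. `mark_uses_rag` is a hypothetical name.

```python
import json
from pathlib import Path


def mark_uses_rag(track_dir: Path, uses_rag: bool) -> dict:
    """Set uses_rag in the track's metadata.json, keeping all other keys."""
    path = track_dir / "metadata.json"
    # Assumption: metadata.json holds a single JSON object; start empty if absent.
    meta = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    meta["uses_rag"] = uses_rag
    path.write_text(json.dumps(meta, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return meta
```

Because this writes to disk, it belongs in the track tooling and not on the RAG query path, which stays read-only.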
6. Graceful failure (Rule 6)
The principle. RAG failure is data, not an exception. A failed search returns an empty result; the request continues.
The failure modes (in priority order):
| Failure | Handling |
|---|---|
| RAG not enabled | Skip; no {rag-context} block; the request continues |
| ChromaDB not initialized | Skip; log a warning; the request continues |
| Embedding provider not available | Skip; log a warning; the request continues |
| Index missing (first run) | Skip; log a warning; the request continues |
| Search returns empty | Normal; no {rag-context} block; the request continues |
| Search times out | Return partial results; log a warning |
| Search raises an exception | Catch; log the exception; return empty; the request continues |
The failure is returned as a Result[T, ErrorInfo] value, not raised as an exception. Per the data_oriented_error_handling_20260606 convention.
# In the RAG engine
def search(self, query: str, k: int = 5) -> Result[list[SearchResult], ErrorInfo]:
    try:
        if not self._enabled:
            return Result(data=[], errors=[ErrorInfo(NOT_READY, "RAG not enabled")])
        if not self._collection:
            return Result(data=[], errors=[ErrorInfo(NOT_READY, "RAG not initialized")])
        results = self._collection.query(query, k=k)
        return Result(data=results, errors=[])
    except Exception as exc:
        return Result(data=[], errors=[ErrorInfo(INTERNAL, str(exc))])
The caller (ai_client.py:send) checks .ok and proceeds without RAG when the search failed or returned nothing:
rag_result = rag_engine.search(prompt, k=N)
if rag_result.ok and rag_result.data:
    prompt = append_rag_block(prompt, rag_result.data)
# else: proceed without RAG; the request doesn't fail
The user sees the warning in the comms log:
[RAG] search failed: ChromaDB not initialized
[RAG] request continues without RAG
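For readers who want to run the failure path in isolation, here is a self-contained sketch. The `Result` and `ErrorInfo` classes below are simplified stand-ins written for this example (the real definitions live with the data_oriented_error_handling_20260606 convention and may differ), and `safe_search` and `with_rag` are hypothetical helpers.

```python
from dataclasses import dataclass, field
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass
class ErrorInfo:
    kind: str
    message: str


@dataclass
class Result(Generic[T]):
    data: T
    errors: list[ErrorInfo] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        """True when no errors were recorded."""
        return not self.errors


def safe_search(search: Callable[[str, int], list[str]], query: str,
                k: int = 5) -> Result[list[str]]:
    """Run a search callable; convert any exception into an error value."""
    try:
        return Result(data=search(query, k))
    except Exception as exc:
        return Result(data=[], errors=[ErrorInfo("INTERNAL", str(exc))])


def with_rag(prompt: str, result: Result[list[str]], log: list[str]) -> str:
    """Append RAG text on success; on failure, log and return the prompt unchanged."""
    if result.ok and result.data:
        return prompt + "\n" + "\n".join(result.data)
    for err in result.errors:
        log.append(f"[RAG] search failed: {err.message}")
    if result.errors:
        log.append("[RAG] request continues without RAG")
    return prompt
```

An empty result with no errors is the normal case and logs nothing. A raised exception ends up as two log lines and an unchanged prompt.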
7. The wiring points (the where)
| Where in src/ | What it does | What it does NOT do |
|---|---|---|
| src/ai_client.py:send | The integration point; appends {rag-context} if enabled | Does not mutate state |
| src/aggregate.py:run | Builds the initial context; appends {rag-context} in the volatile layer | Does not query RAG directly |
| src/rag_engine.py:search | The semantic search; returns Result[list[SearchResult], ErrorInfo] | Does not write to the index |
| src/rag_engine.py:index_file | The indexer; called by RAGEngine._init_vector_store or by the harvest CLI | Does not run at LLM call time |
| src/ai_settings.toml (or GUI) | The opt-in surface | Does not trigger RAG automatically |
8. The forbidden patterns (the "don't do this" list)
| Pattern | Why it's forbidden |
|---|---|
| RAG as a replacement for curation | Curation is structural (per-file schema); RAG is semantic (fuzzy). Use curation for "how to render file X" |
| RAG as a replacement for discussion | Discussion is precise (the actual messages); RAG is fuzzy. Use discussion for "what was said" |
| RAG as a replacement for knowledge | Knowledge is durable (user-edited, provenance-aware); RAG is volatile (indexed, opaque). Use knowledge for "what we decided" |
| Auto-inject RAG results into disc_entries | This is a state mutation; it changes the conversation in a way the user didn't ask for |
| Auto-write RAG results to disk | Same; no mutation |
| Use RAG when the user hasn't opted in | RAG is opt-in; default-off in new projects |
| Crash the request when RAG fails | Graceful failure; the request continues |
| Use RAG for "show me the last thing the user said" | Use disc_entries (precise) |
| Use RAG for "show me what we decided last time" | Use the knowledge digest (durable) |
| Use RAG for "show me the file the user is editing" | Use FileItem (curation) |
9. The cross-references
- conductor/code_styleguides/agent_memory_dimensions.md §3: the RAG dim in context
- conductor/code_styleguides/data_oriented_design.md §1.2: "Design around a model of the world" (the underlying anti-pattern)
- conductor/code_styleguides/cache_friendly_context.md: where the 4 dims get injected in the cache strategy
- conductor/code_styleguides/knowledge_artifacts.md: the knowledge dim (the alternative for "what we decided")
- docs/guide_rag.md: the existing RAG deep-dive
- data_oriented_error_handling_20260606: the Result[T, ErrorInfo] pattern
- conductor/tracks/rag_phase4_stress_fix_20260606: the dim-mismatch fix at 16412ad5