docs: agent workflow docs + regular docs (v2.3 surfacing)

Per user request 'use your remaining context to update agent workflow docs and then regular docs based on what was discussed in this report', this commit creates/updates 15 files derived from the v2.3 nagent review (the 12 new nagent additions + the 4 memory dimensions reframing + the cache strategy + the RAG discipline + the knowledge harvest pattern).

Agent workflow docs (4 files):
- AGENTS.md (UPDATE): add @import line to canonical DOD + 'Code Styleguides' section pointing to the 6 new styleguides + new 'Human-Facing Documentation' section pointing to ./docs/AGENTS.md
- conductor/workflow.md (UPDATE): new section 'Additions (2026-06-12) - the 12 patterns from the latest nagent corpus' with TDD protocols for knowledge harvest, cache ordering, compaction, RAG discipline
- conductor/product-guidelines.md (UPDATE): new sections 'Memory Dimensions (added 2026-06-12)' + 'See Also - Updated' with the 6-styleguide catalog
- docs/AGENTS.md (NEW): the agent-facing mirror of docs/Readme.md (per the nagent CLAUDE.md pattern). 10 sections + the per-tier reading path + the 4 memory dimensions + the caching strategy + the knowledge harvest + the RAG discipline + the feature flags

Regular docs (11 files):
- 6 new styleguides (the convention catalog):
  * data_oriented_design.md: the canonical DOD reference (Tier 0/1/2; 3 defaults to reject; 8 core defaults; 7-question simplification pass; 10-question self-check; 4 memory dimensions in Manual Slop context)
  * agent_memory_dimensions.md: the 4 memory dims (curation / discussion / RAG / knowledge) + when to use each + the boundaries
  * rag_integration_discipline.md: the conservative-RAG rule (opt-in, complement, provenance, no mutation, feature-gated, graceful failure)
  * cache_friendly_context.md: stable-to-volatile context ordering + the cache TTL GUI contract + the byte-comparison test
  * knowledge_artifacts.md: the knowledge harvest pattern (category files, provenance, sha256 ledger, digest regeneration, 'delete to turn off')
  * feature_flags.md: file presence vs config flags vs CLI flags
- 3 new project docs (the cross-cutting guides):
  * guide_agent_memory_dimensions.md: the cross-cutting guide on the 4 dims + the decision tree
  * guide_caching_strategy.md: caching across providers + stable-to-volatile ordering + cache TTL GUI + the byte-comparison test + the 5th provider (claude-code)
  * guide_knowledge_curation.md: the knowledge memory guide (4th dim) + the 5 category files + per-file notes + the digest + the ledger + the harvest workflow
- 2 existing doc updates:
  * guide_mma.md: new sections 'Delegation as context management' + 'The 4 memory dimensions (the MMA scope)'
  * guide_ai_client.md: new section 'Cache strategy and the 12-layer model' + the 5th provider (claude-code)

All files use the same style as the v2.3 review (the user's preferred format): 7-column tables, no JSON, SSDL shape tags, forth/array notation, file:line citations, ASCII sketches where useful.

The human Readme files (Readme.md, docs/Readme.md) are NOT modified (per repeated user instruction). The 5th provider (claude-code) is documented in guide_ai_client.md, and data_oriented_design.md references the nagent pattern as the source of the canonical rules. The cross-references are bidirectional: the 6 styleguides reference the 3 project docs; the 3 project docs reference the 6 styleguides; the 2 doc updates reference both; AGENTS.md + ./docs/AGENTS.md provide the entry points.
2026-06-12 13:50:40 -04:00
parent d604a63e1f
commit 35c6cca134
15 changed files with 3460 additions and 1 deletions
@@ -0,0 +1,284 @@
+# RAG Integration Discipline
+
+**Status:** Styleguide; codifies when and how to wire RAG (the opt-in, semantic-search memory dimension) into Manual Slop features.
+**Date:** 2026-06-12
+**Cross-refs:** `conductor/code_styleguides/agent_memory_dimensions.md` §3; `conductor/code_styleguides/data_oriented_design.md` §9; `docs/guide_rag.md`.
+
+> **What this is.** RAG is the opt-in, semantic-search memory dimension. It's *useful* (semantic search across large codebases; concept-level discovery; cross-file pattern matching grep can't do). It's also *fuzzy* (vector similarity, not exact) and *opaque* (the vector store is not user-editable). The discipline: be conservative about when to wire it in. The wrong shape for the right question is a common mistake.
+
+---
+
+## 0. The 6 rules (the one-glance table)
+
+| # | Rule | Why |
+|---|---|---|
+| 1 | RAG is **opt-in**. Default-off in new projects | Most features don't need it; the cost of unnecessary RAG is the embedding-provider round trip + the storage cost |
+| 2 | RAG **complements**; it never **replaces** | Curation / Discussion / Knowledge are the durable, user-editable dimensions; RAG is the fuzzy, semantic search |
+| 3 | RAG results display with **provenance** | The user needs to know which file and which chunk produced the result |
+| 4 | RAG **never mutates state** | No auto-injection of RAG results into `disc_entries`; no auto-update of `FileItem`; no auto-write to disk |
+| 5 | RAG integration is **feature-gated** | A feature must explicitly request RAG in its scope; RAG is not the default for "give me context" |
+| 6 | RAG failure is **graceful** | A failed search returns `Result.empty` or an empty list; never crashes the request |
+
+---
+
+## 1. RAG is opt-in (Rule 1)
+
+**The default is OFF.** A new project opens with `rag_enabled = false`. The user opts in via the AI Settings panel.
+
+**The rationale.** RAG is not free:
+- The embedding-provider round trip adds latency (200-500ms per call, per provider)
+- The storage cost grows with the indexed corpus (per `RAGConfig.chunk_size` and `chunk_overlap`)
+- The dim-mismatch fix at `16412ad5` shows that switching providers requires a full re-index (the existing collection is incompatible with the new provider's embedding dimension)
+
+For a project that doesn't *need* semantic search (e.g., a small Python project with 20 files), RAG is overhead, not benefit.
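+
+The re-index requirement in the last bullet can be made explicit with a guard like the one below. This is a minimal sketch, not Manual Slop's actual API: `IndexMeta`, `needs_reindex`, and the dimension values are hypothetical names and illustrative numbers chosen for the example.
+
+```python
+from dataclasses import dataclass
+from typing import Optional
+
+
+@dataclass(frozen=True)
+class IndexMeta:
+    """Hypothetical record kept next to an index: who built it and at what width."""
+    provider: str
+    embedding_dim: int
+
+
+def needs_reindex(stored: Optional[IndexMeta], provider: str, embedding_dim: int) -> bool:
+    """Return True when the existing index cannot serve the requested provider."""
+    if stored is None:
+        return True   # no index yet (first run)
+    # Vectors of different widths cannot be compared, so any change means re-index.
+    return stored.provider != provider or stored.embedding_dim != embedding_dim
+
+
+old = IndexMeta(provider="gemini", embedding_dim=768)   # illustrative dimension only
+print(needs_reindex(old, "gemini", 768))   # same provider, same width
+print(needs_reindex(old, "local", 384))    # switching providers
+```
+
+Running the check before a search keeps the cost of a provider switch visible to the user instead of surfacing it as a query-time error.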
+
+**The opt-in surface.** Per the existing `[ai_settings.toml]` pattern:
+- `[X] Enable RAG` checkbox
+- Source: `(project / global / none)` radio
+- Embedding provider: `(gemini / local)` dropdown
+- Chunk size: integer (default 1000)
+- Chunk overlap: integer (default 200)
+
+**The opt-out is also supported.** `rm -r ~/.manual_slop/.slop_cache/chroma_<provider>/` deletes the index. Re-enabling requires a full re-index.
+
+**The opt-out via the AI Settings:**
+```toml
+[ai_settings.rag]
+enabled = false   # default for new projects
+```
+
+**The opt-in is explicit:**
+```toml
+[ai_settings.rag]
+enabled = true
+source = "project"
+embedding_provider = "gemini"
+chunk_size = 1000
+chunk_overlap = 200
+```
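+
+One way to read the two TOML fragments above so that "missing" always means "off" is sketched here. It assumes Python 3.11+ for the standard-library `tomllib`; `RagSettings` and `load_rag_settings` are hypothetical names (not the project's `RAGConfig`), and the defaults for `source` and `embedding_provider` are assumptions, since only the `enabled`, `chunk_size`, and `chunk_overlap` defaults are stated in this section.
+
+```python
+import tomllib   # standard library since Python 3.11
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class RagSettings:
+    """Hypothetical mirror of the [ai_settings.rag] table."""
+    enabled: bool = False               # Rule 1: default-off
+    source: str = "none"                # assumed default
+    embedding_provider: str = "gemini"  # assumed default
+    chunk_size: int = 1000
+    chunk_overlap: int = 200
+
+
+def load_rag_settings(toml_text: str) -> RagSettings:
+    """Parse TOML text; a missing table or key falls back to the defaults above."""
+    table = tomllib.loads(toml_text).get("ai_settings", {}).get("rag", {})
+    # Ignore unknown keys so a newer config file does not break an older reader.
+    known = {k: v for k, v in table.items() if k in RagSettings.__dataclass_fields__}
+    return RagSettings(**known)
+
+
+print(load_rag_settings("").enabled)   # an empty config leaves RAG off
+print(load_rag_settings('[ai_settings.rag]\nenabled = true\nsource = "project"\n'))
+```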
+
+---
+
+## 2. RAG complements; it never replaces (Rule 2)
+
+**The 4 memory dimensions** (per `conductor/code_styleguides/agent_memory_dimensions.md`):
+
+| Dim | SSDL | Use when |
+|---|---|---|
+| Curation | `[Q]` | "How to render a file" |
+| Discussion | `o==>` | "What was said in this chat" |
+| **RAG** | `[Q]` | **"What similar content exists"** |
+| Knowledge | `o==>` | "What we learned from past runs" |
+
+**The rule.** RAG is the *fuzzy semantic search* dimension. It is NOT:
+- A replacement for curation (use `FileItem.view_mode` + Fuzzy Anchors)
+- A replacement for discussion (use `disc_entries`)
+- A replacement for knowledge (use `knowledge/digest.md`)
+
+**The cross-cutting principle.** When a feature asks "give me context," the answer is *not* "enable RAG." The answer is "which of the 4 dimensions is the right home?" — and the 4-dim decision tree is the test.
+
+**The "complement" examples:**
+- A new discussion opens: render the active preset's `FileItem`s (curation) + the `disc_entries` (discussion) + the knowledge digest (knowledge). *Optionally* append `{rag-context}` if the user has opted in.
+- The LLM asks "what's the execution clutch?": try knowledge first (the user has decided it's a durable concept). Try discussion second (search the prior entries for "clutch"). Try RAG third (semantic search across the indexed codebase). Curation fourth (the user has configured specific files).
+- The user asks "where does X happen?": RAG is the *natural* shape for this question (semantic search). Use it.
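+
+The "which dimension is the right home?" test can be approximated in code. The sketch below is derived only from the "Use when" column of the table above; it is not the 4-dim decision tree from `agent_memory_dimensions.md`, and the question-kind labels are hypothetical.
+
+```python
+from enum import Enum
+from typing import Optional
+
+
+class Dim(Enum):
+    CURATION = "curation"
+    DISCUSSION = "discussion"
+    RAG = "rag"
+    KNOWLEDGE = "knowledge"
+
+
+# Question kind -> home dimension, taken from the "Use when" column.
+# The keys are hypothetical labels, not identifiers from the codebase.
+HOME = {
+    "how_to_render_a_file": Dim.CURATION,
+    "what_was_said": Dim.DISCUSSION,
+    "what_similar_content_exists": Dim.RAG,
+    "what_we_learned": Dim.KNOWLEDGE,
+}
+
+
+def pick_dimension(question_kind: str, rag_enabled: bool) -> Optional[Dim]:
+    """Route a question to its home dimension; RAG is eligible only when opted in."""
+    dim = HOME.get(question_kind)
+    if dim is Dim.RAG and not rag_enabled:
+        return None   # the user has not opted in, so there is no RAG answer
+    return dim
+
+
+print(pick_dimension("what_was_said", rag_enabled=True))
+print(pick_dimension("what_similar_content_exists", rag_enabled=False))
+```
+
+The point of the lookup is that RAG is one row among four: enabling it never changes where the other three kinds of question are answered.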
+
+---
+
+## 3. Provenance required (Rule 3)
+
+**The principle.** When RAG returns results, the user must be able to see *which file* and *which chunk* produced the result. No black boxes.
+
+**The RAG result shape** (per `RAGEngine.search`):
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class SearchResult:
+    file_path: str           # the absolute path
+    chunk_offset: int        # byte offset within the file
+    chunk_length: int        # length in bytes
+    content: str             # the matched text
+    similarity: float        # the cosine similarity
+```
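+
+To show where `chunk_offset` and `chunk_length` could come from, here is a sketch of fixed-size chunking with overlap, using the default `chunk_size` of 1000 and `chunk_overlap` of 200 from §1. `chunk_spans` is a hypothetical helper, and chunking by byte count is an assumption; the real indexer may split differently.
+
+```python
+def chunk_spans(total_bytes: int, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[tuple[int, int]]:
+    """Return (chunk_offset, chunk_length) pairs covering total_bytes with overlap."""
+    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
+        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
+    step = chunk_size - chunk_overlap   # how far each chunk start advances
+    spans = []
+    offset = 0
+    while offset < total_bytes:
+        spans.append((offset, min(chunk_size, total_bytes - offset)))
+        if offset + chunk_size >= total_bytes:
+            break   # this chunk already reaches the end of the file
+        offset += step
+    return spans
+
+
+print(chunk_spans(2500))   # [(0, 1000), (800, 1000), (1600, 900)]
+```
+
+Because each span is a plain offset and length, the chunk can be re-read from the file at any time, which is what lets a result point back at its source.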
+
+**The display in the LLM context** (the `{rag-context}` block):
+
+```
+{rag-context}
+## src/ai_client.py:512-768 (similarity: 0.87)
+...content...
+
+## src/aggregate.py:142-289 (similarity: 0.82)
+...content...
+{/rag-context}
+```
+
+**The display in the GUI** (the per-result tooltip):
+
+```
+[Anthropic cache-aware send]
+File: src/ai_client.py:512-768
+Similarity: 0.87
+Click to jump to file
+```
+
+**The provenance is not optional.** If a result has no provenance, it doesn't go in the context.
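+
+A renderer that enforces this rule might look like the sketch below. `render_rag_block` is a hypothetical name, and reading the `512-768` range in the display as start and end byte offsets is an assumption made for the example.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class SearchResult:   # same fields as the result shape above
+    file_path: str
+    chunk_offset: int
+    chunk_length: int
+    content: str
+    similarity: float
+
+
+def render_rag_block(results: list[SearchResult]) -> str:
+    """Build the {rag-context} block; results without provenance are dropped."""
+    sections = []
+    for r in results:
+        if not r.file_path or r.chunk_offset < 0 or r.chunk_length <= 0:
+            continue   # no provenance, so it does not go in the context
+        end = r.chunk_offset + r.chunk_length
+        header = f"## {r.file_path}:{r.chunk_offset}-{end} (similarity: {r.similarity:.2f})"
+        sections.append(header + "\n" + r.content)
+    if not sections:
+        return ""   # nothing usable: emit no block at all
+    return "{rag-context}\n" + "\n\n".join(sections) + "\n{/rag-context}"
+
+
+good = SearchResult("src/ai_client.py", 512, 256, "...content...", 0.87)
+orphan = SearchResult("", 0, 10, "text with no source", 0.99)
+print(render_rag_block([good, orphan]))   # only the first result is rendered
+```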
+
+**The cross-references.** The dim-mismatch fix at `16412ad5` shows why provenance matters: switching providers invalidates the index because the embeddings have different dimensions, and the stored provenance (file path + chunk offset) is what makes the index re-buildable.
+
+---
+
+## 4. RAG never mutates state (Rule 4)
+
+**The principle.** RAG is a *query* dimension. It returns data; it does not write data.
+
+**The mutation rules:**
+- RAG results **do NOT** go into `disc_entries`
+- RAG results **do NOT** update `FileItem` curation state
+- RAG results **do NOT** write to disk
+- RAG results **do NOT** trigger knowledge harvest
+- RAG results **do NOT** modify the system prompt or persona
+
+**The exception (none).** There is no feature that should mutate state from RAG results. If a feature wants to "remember" something from RAG, the user must explicitly say "add that to the discussion" (which appends a `role: "User"` entry to `disc_entries`) or "harvest that into knowledge" (which runs the harvest workflow).
+
+**The boundary in code:**
+
+```python
+# In ai_client.py:send() (the integration point)
+def send(self, ...):
+    prompt = aggregate.build(...)
+    if config.rag_enabled:
+        rag_result = rag_engine.search(prompt, k=N)   # returns a Result (see §6)
+        if rag_result.ok and rag_result.data:
+            prompt = append_rag_block(prompt, rag_result.data)   # READ ONLY
+    # NO mutation of: disc_entries, FileItem, knowledge files
+    return self._send_<provider>(prompt, ...)
+```
+
+**The mutation must happen in a different function, called explicitly by the user or the LLM with HITL approval.**
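+
+The separate, explicitly invoked path could be as small as the sketch below. `promote_to_discussion` is a hypothetical function, and the `content` key is an assumed entry field; only the `role: "User"` part comes from the rule above.
+
+```python
+def promote_to_discussion(disc_entries: list[dict], text: str, approved: bool) -> list[dict]:
+    """Return a new entry list with `text` appended as a User entry, only if approved."""
+    if not approved:
+        return disc_entries   # no approval: state is left exactly as it was
+    # Build a new list instead of appending in place, so the caller decides
+    # when (and whether) the new state replaces the old one.
+    return disc_entries + [{"role": "User", "content": text}]
+
+
+entries = [{"role": "User", "content": "what's the execution clutch?"}]
+unchanged = promote_to_discussion(entries, "snippet found via RAG", approved=False)
+updated = promote_to_discussion(entries, "snippet found via RAG", approved=True)
+print(len(unchanged), len(updated))   # 1 2
+```
+
+Keeping this function out of the search path means a RAG query can never reach it by accident; something has to call it on purpose.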
+
+---
+
+## 5. Feature-gated integration (Rule 5)
+
+**The principle.** A feature must explicitly request RAG in its scope. RAG is not the default for "give me context."
+
+**The gate.** Every feature that uses RAG declares the dependency in its spec, plan, and changelog:
+
+```markdown
+## Scope
+- Feature X (uses RAG for semantic search)
+- Feature Y (no RAG dependency; uses Curation + Discussion only)
+
+## Dependencies
+- RAG is required for Feature X; the user must opt-in via AI Settings
+- Feature Y is independent of RAG
+```
+
+**The runtime gate.** The feature's code checks `config.rag_enabled` and behaves accordingly:
+
+```python
+# In the feature's code
+def feature_x(query: str) -> Result[list[SearchResult], ErrorInfo]:
+    if not config.rag_enabled:
+        raise RAGNotEnabledError("Feature X requires RAG; opt in via AI Settings")
+    return rag_engine.search(query, k=N)
+```
+
+**The error message is explicit.** The user knows why the feature isn't working.
+
+**The CLI surface** (for testing and debugging):
+```bash
+$ python -m src.feature_x "execution clutch"
+# Error: RAG not enabled. Enable via: [ai_settings.toml] rag.enabled = true
+```
+
+**The audit trail.** Every feature that uses RAG is logged in `metadata.json` for the feature's track: `uses_rag: true`.
+
+---
+
+## 6. Graceful failure (Rule 6)
+
+**The principle.** RAG failure is data, not an exception. A failed search returns an empty result; the request continues.
+
+**The failure modes** (in priority order):
+
+| Failure | Handling |
+|---|---|
+| RAG not enabled | Skip; no `{rag-context}` block; the request continues |
+| ChromaDB not initialized | Skip; log a warning; the request continues |
+| Embedding provider not available | Skip; log a warning; the request continues |
+| Index missing (first run) | Skip; log a warning; the request continues |
+| Search returns empty | Normal; no `{rag-context}` block; the request continues |
+| Search times out | Return partial results; log a warning |
+| Search raises an exception | Catch; log the exception; return empty; the request continues |
+
+**The failure is a `Result[T, ErrorInfo]` value, not a raised exception.** Per the `data_oriented_error_handling_20260606` convention.
+
+```python
+# In the RAG engine
+def search(self, query: str, k: int = 5) -> Result[list[SearchResult], ErrorInfo]:
+    try:
+        if not self._enabled:
+            return Result(data=[], errors=[ErrorInfo(NOT_READY, "RAG not enabled")])
+        if self._collection is None:
+            return Result(data=[], errors=[ErrorInfo(NOT_READY, "RAG not initialized")])
+        results = self._collection.query(query, k=k)
+        return Result(data=results, errors=[])
+    except Exception as exc:
+        return Result(data=[], errors=[ErrorInfo(INTERNAL, str(exc))])
+```
+
+**The caller** (`ai_client.py:send`) checks `.ok` and proceeds without RAG when the search failed or returned nothing:
+
+```python
+rag_result = rag_engine.search(prompt, k=N)
+if rag_result.ok and rag_result.data:
+    prompt = append_rag_block(prompt, rag_result.data)
+# else: proceed without RAG; the request doesn't fail
+```
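+
+The snippets above rely on `Result`, `ErrorInfo`, and an `.ok` flag that are defined elsewhere. The sketch below gives minimal stand-ins so the failure path can be exercised in isolation. These are assumptions, not the types from `data_oriented_error_handling_20260606`: the real `Result` takes two type parameters, and `NOT_READY` / `INTERNAL` may be defined differently.
+
+```python
+from dataclasses import dataclass, field
+from typing import Callable, Generic, TypeVar
+
+T = TypeVar("T")
+
+NOT_READY = "NOT_READY"   # hypothetical error kinds
+INTERNAL = "INTERNAL"
+
+
+@dataclass(frozen=True)
+class ErrorInfo:
+    kind: str
+    message: str
+
+
+@dataclass
+class Result(Generic[T]):
+    data: T
+    errors: list[ErrorInfo] = field(default_factory=list)
+
+    @property
+    def ok(self) -> bool:
+        """True when no errors were recorded."""
+        return not self.errors
+
+
+def safe_search(search_fn: Callable[[str], list[str]], query: str) -> Result[list[str]]:
+    """Call search_fn; turn any exception into an empty, error-carrying Result."""
+    try:
+        return Result(data=search_fn(query))
+    except Exception as exc:
+        return Result(data=[], errors=[ErrorInfo(INTERNAL, str(exc))])
+
+
+def broken(query: str) -> list[str]:
+    raise RuntimeError("ChromaDB not initialized")
+
+
+failed = safe_search(broken, "execution clutch")
+print(failed.ok, failed.data, failed.errors[0].message)
+```
+
+With this shape the caller needs no `try`/`except`: a failed search is simply a `Result` whose `data` is empty and whose `errors` can be written to the comms log.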
+
+**The user sees the warning** in the comms log:
+```
+[RAG] search failed: ChromaDB not initialized
+[RAG] request continues without RAG
+```
+
+---
+
+## 7. The wiring points (the where)
+
+| Where in `src/` | What it does | What it does NOT do |
+|---|---|---|
+| `src/ai_client.py:send` | The integration point; appends `{rag-context}` if enabled | Does not mutate state |
+| `src/aggregate.py:run` | Builds the initial context; appends `{rag-context}` in the volatile layer | Does not query RAG directly |
+| `src/rag_engine.py:search` | The semantic search; returns `Result[list[SearchResult], ErrorInfo]` | Does not write to the index |
+| `src/rag_engine.py:index_file` | The indexer; called by `RAGEngine._init_vector_store` or by the harvest CLI | Does not run at LLM call time |
+| `src/ai_settings.toml` (or GUI) | The opt-in surface | Does not trigger RAG automatically |
+
+---
+
+## 8. The forbidden patterns (the "don't do this" list)
+
+| Pattern | Why it's forbidden |
+|---|---|
+| RAG as a *replacement* for curation | Curation is structural (per-file schema); RAG is semantic (fuzzy). Use curation for "how to render file X" |
+| RAG as a *replacement* for discussion | Discussion is precise (the actual messages); RAG is fuzzy. Use discussion for "what was said" |
+| RAG as a *replacement* for knowledge | Knowledge is durable (user-edited, provenance-aware); RAG is volatile (indexed, opaque). Use knowledge for "what we decided" |
+| Auto-inject RAG results into `disc_entries` | This is a state mutation; it changes the conversation in a way the user didn't ask for |
+| Auto-write RAG results to disk | Same; no mutation |
+| Use RAG when the user hasn't opted in | RAG is opt-in; default-off in new projects |
+| Crash the request when RAG fails | Graceful failure; the request continues |
+| Use RAG for "show me the last thing the user said" | Use `disc_entries` (precise) |
+| Use RAG for "show me what we decided last time" | Use the knowledge digest (durable) |
+| Use RAG for "show me the file the user is editing" | Use `FileItem` (curation) |
+
+---
+
+## 9. The cross-references
+
+- `conductor/code_styleguides/agent_memory_dimensions.md` §3 — the RAG dim in context
+- `conductor/code_styleguides/data_oriented_design.md` §1.2 — "Design around a model of the world" (the underlying anti-pattern)
+- `conductor/code_styleguides/cache_friendly_context.md` — where the 4 dims get injected in the cache strategy
+- `conductor/code_styleguides/knowledge_artifacts.md` — the knowledge dim (the alternative for "what we decided")
+- `docs/guide_rag.md` — the existing RAG deep-dive
+- `data_oriented_error_handling_20260606` — the `Result[T, ErrorInfo]` pattern
+- `conductor/tracks/rag_phase4_stress_fix_20260606` — the dim-mismatch fix at `16412ad5`