ed 35c6cca134 docs: agent workflow docs + regular docs (v2.3 surfacing)

Per user request 'use your remaining context to update agent workflow
docs and then regular docs based on what was discussed in this report',
this commit creates/updates 15 files derived from the v2.3 nagent
review (the 12 new nagent additions + the 4 memory dimensions
reframing + the cache strategy + the RAG discipline + the knowledge
harvest pattern).

Agent workflow docs (4 files):
- AGENTS.md (UPDATE): add @import line to canonical DOD + 'Code
  Styleguides' section pointing to the 6 new styleguides + new
  'Human-Facing Documentation' section pointing to ./docs/AGENTS.md
- conductor/workflow.md (UPDATE): new section 'Additions (2026-06-12)
  - the 12 patterns from the latest nagent corpus' with TDD
  protocols for knowledge harvest, cache ordering, compaction, RAG
  discipline
- conductor/product-guidelines.md (UPDATE): new sections 'Memory
  Dimensions (added 2026-06-12)' + 'See Also - Updated' with the
  6-styleguide catalog
- docs/AGENTS.md (NEW): the agent-facing mirror of docs/Readme.md
  (per the nagent CLAUDE.md pattern). 10 sections + the per-tier
  reading path + the 4 memory dimensions + the caching strategy +
  the knowledge harvest + the RAG discipline + the feature flags

Regular docs (11 files):
- 6 new styleguides (the convention catalog):
  * data_oriented_design.md: the canonical DOD reference (Tier
    0/1/2; 3 defaults to reject; 8 core defaults; 7-question
    simplification pass; 10-question self-check; 4 memory
    dimensions in Manual Slop context)
  * agent_memory_dimensions.md: the 4 memory dims (curation /
    discussion / RAG / knowledge) + when to use each + the
    boundaries
  * rag_integration_discipline.md: the conservative-RAG rule
    (opt-in, complement, provenance, no mutation, feature-gated,
    graceful failure)
  * cache_friendly_context.md: stable-to-volatile context
    ordering + the cache TTL GUI contract + the byte-comparison
    test
  * knowledge_artifacts.md: the knowledge harvest pattern
    (category files, provenance, sha256 ledger, digest
    regeneration, 'delete to turn off')
  * feature_flags.md: file presence vs config flags vs CLI flags
- 3 new project docs (the cross-cutting guides):
  * guide_agent_memory_dimensions.md: the cross-cutting guide on
    the 4 dims + the decision tree
  * guide_caching_strategy.md: caching across providers +
    stable-to-volatile ordering + cache TTL GUI + the byte-
    comparison test + the 5th provider (claude-code)
  * guide_knowledge_curation.md: the knowledge memory guide (4th
    dim) + the 5 category files + per-file notes + the digest +
    the ledger + the harvest workflow
- 2 existing doc updates:
  * guide_mma.md: new sections 'Delegation as context management'
    + 'The 4 memory dimensions (the MMA scope)'
  * guide_ai_client.md: new section 'Cache strategy and the 12-
    layer model' + the 5th provider (claude-code)

All files use the same style as the v2.3 review (the user's preferred
format): 7-column tables, no JSON, SSDL shape tags, forth/array
notation, file:line citations, ASCII sketches where useful. The
human Readme files (Readme.md, docs/Readme.md) are NOT modified
(per repeated user instruction).

The 5th provider (claude-code) is documented in guide_ai_client.md
+ the data_oriented_design.md references the nagent pattern as the
source of the canonical rules.

The cross-references are bidirectional: the 6 styleguides reference
the 3 project docs; the 3 project docs reference the 6 styleguides;
the 2 doc updates reference both; AGENTS.md + ./docs/AGENTS.md
provide the entry points.

2026-06-12 13:50:40 -04:00


RAG Integration Discipline

Status: Styleguide; codifies when and how to wire RAG (the opt-in, semantic-search memory dimension) into Manual Slop features.
Date: 2026-06-12
Cross-refs: conductor/code_styleguides/agent_memory_dimensions.md §3; conductor/code_styleguides/data_oriented_design.md §9; docs/guide_rag.md.

What this is. RAG is the opt-in, semantic-search memory dimension. It is useful: semantic search across large codebases, concept-level discovery, and cross-file pattern matching that grep can't do. It is also fuzzy (vector similarity, not exact match) and opaque (the vector store is not user-editable). The discipline: be conservative about when to wire it in. Picking the wrong shape for the right question is a common mistake.

0. The 6 rules (the one-glance table)

#	Rule	Why
1	RAG is opt-in. Default-off in new projects	Most features don't need it; the cost of unnecessary RAG is the embedding-provider round trip + the storage cost
2	RAG complements; it never replaces	Curation / Discussion / Knowledge are the durable, user-editable dimensions; RAG is the fuzzy, semantic search
3	RAG results display with provenance	The user needs to know which file and which chunk produced the result
4	RAG never mutates state	No auto-injection of RAG results into `disc_entries`; no auto-update of `FileItem`; no auto-write to disk
5	RAG integration is feature-gated	A feature must explicitly request RAG in its scope; RAG is not the default for "give me context"
6	RAG failure is graceful	A failed search returns `Result.empty` or an empty list; never crashes the request

1. RAG is opt-in (Rule 1)

The default is OFF. A new project opens with rag_enabled = false. The user opts in via the AI Settings panel.

The rationale. RAG is not free:

The embedding-provider round trip adds latency (200-500 ms per call, varying by provider)
The storage cost grows with the indexed corpus (per RAGConfig.chunk_size and chunk_overlap)
The dim-mismatch fix at 16412ad5 shows that switching providers requires a full re-index (the existing collection is incompatible with the new provider's embedding dimension)

For a project that doesn't need semantic search (e.g., a small Python project with 20 files), RAG is overhead, not benefit.
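To make the storage cost concrete, the following is a rough estimate of how many chunks one file produces. It assumes a plain sliding-window chunker whose stride is chunk_size minus chunk_overlap; that is an assumption about how RAGConfig is applied, and `estimate_chunks` is a hypothetical helper, not part of the codebase.

```python
import math


def estimate_chunks(text_len: int, chunk_size: int = 1000, chunk_overlap: int = 200) -> int:
    """Estimate the chunk count for an assumed sliding-window chunker."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if text_len <= 0:
        return 0
    if text_len <= chunk_size:
        return 1
    stride = chunk_size - chunk_overlap  # new material covered by each extra chunk
    return 1 + math.ceil((text_len - chunk_size) / stride)


print(estimate_chunks(10_000))  # one 1000-unit chunk, then 9000 more units at stride 800
```

Under these assumptions, a file of 10,000 units (in whatever unit chunk_size counts) yields 13 chunks with the defaults, and each chunk would need its own embedding and storage entry.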

The opt-in surface. Per the existing [ai_settings.toml] pattern:

[X] Enable RAG checkbox
Source: (project / global / none) radio
Embedding provider: (gemini / local) dropdown
Chunk size: integer (default 1000)
Chunk overlap: integer (default 200)

Opting out is also supported: rm -r ~/.manual_slop/.slop_cache/chroma_<provider>/ deletes the index (the target is a directory, so rm needs -r). Re-enabling requires a full re-index.

The opt-out via the AI Settings:

[ai_settings.rag]
enabled = false   # default for new projects

The opt-in is explicit:

[ai_settings.rag]
enabled = true
source = "project"
embedding_provider = "gemini"
chunk_size = 1000
chunk_overlap = 200

2. RAG complements; it never replaces (Rule 2)

The 4 memory dimensions (per conductor/code_styleguides/agent_memory_dimensions.md):

Dim	SSDL	Use when
Curation	`[Q]`	"How to render a file"
Discussion	`o==>`	"What was said in this chat"
RAG	`[Q]`	"What similar content exists"
Knowledge	`o==>`	"What we learned from past runs"

The rule. RAG is the fuzzy semantic search dimension. It is NOT:

A replacement for curation (use FileItem.view_mode + Fuzzy Anchors)
A replacement for discussion (use disc_entries)
A replacement for knowledge (use knowledge/digest.md)

The cross-cutting principle. When a feature asks "give me context," the answer is not "enable RAG." The answer is "which of the 4 dimensions is the right home?" — and the 4-dim decision tree is the test.

The "complement" examples:

A new discussion opens: render the active preset's FileItems (curation) + the disc_entries (discussion) + the knowledge digest (knowledge). Optionally append {rag-context} if the user has opted in.
The LLM asks "what's the execution clutch?": try knowledge first (the user has decided it's a durable concept). Try discussion second (search the prior entries for "clutch"). Try RAG third (semantic search across the indexed codebase). Curation fourth (the user has configured specific files).
The user asks "where does X happen?": RAG is the natural shape for this question (semantic search). Use it.
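The ordered fallback in the second example can be sketched as a list of lookups tried in sequence. Everything here is illustrative: `resolve` and the dict-backed lookups are hypothetical stand-ins for the real dimension queries, and the RAG lookup is only present in the list when the user has opted in.

```python
from typing import Callable, Optional

Lookup = Callable[[str], Optional[str]]


def resolve(term: str, lookups: list[tuple[str, Lookup]]) -> Optional[tuple[str, str]]:
    """Try each dimension in order; return (dimension, answer) for the first hit."""
    for name, lookup in lookups:
        answer = lookup(term)
        if answer:
            return name, answer
    return None


# Stand-in data sources (hypothetical content).
knowledge = {"execution clutch": "durable note from the knowledge digest"}
discussion = {"clutch": "prior chat entry mentioning the clutch"}
rag_index = {"execution clutch": "semantically similar chunk"}

rag_enabled = True
order: list[tuple[str, Lookup]] = [
    ("knowledge", knowledge.get),
    ("discussion", discussion.get),
]
if rag_enabled:  # RAG joins the chain only on opt-in
    order.append(("rag", rag_index.get))

print(resolve("execution clutch", order))
```

Keeping the order in data rather than in nested conditionals makes it easy to see, and to test, that RAG is consulted after the durable dimensions and not instead of them.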

3. Provenance required (Rule 3)

The principle. When RAG returns results, the user must be able to see which file and which chunk produced the result. No black boxes.

The RAG result shape (per RAGEngine.search):

from dataclasses import dataclass

@dataclass
class SearchResult:
    file_path: str       # the absolute path
    chunk_offset: int    # byte offset within the file
    chunk_length: int    # length in bytes
    content: str         # the matched text
    similarity: float    # the cosine similarity

The display in the LLM context (the {rag-context} block):

{rag-context}
## src/ai_client.py:512-768 (similarity: 0.87)
...content...

## src/aggregate.py:142-289 (similarity: 0.82)
...content...
{/rag-context}

The display in the GUI (the per-result tooltip):

[Anthropic cache-aware send]
File: src/ai_client.py:512-768
Similarity: 0.87
Click to jump to file

The provenance is not optional. If a result has no provenance, it doesn't go in the context.
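A sketch of a renderer that enforces this rule by dropping any result that lacks provenance before the block is built. `format_rag_block` and `has_provenance` are hypothetical helpers, and rendering the range as offset to offset + length is an assumption about what the numbers in the header mean.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    file_path: str
    chunk_offset: int
    chunk_length: int
    content: str
    similarity: float


def has_provenance(r: SearchResult) -> bool:
    """A result is usable only if it names a file and a non-empty chunk."""
    return bool(r.file_path) and r.chunk_offset >= 0 and r.chunk_length > 0


def format_rag_block(results: list[SearchResult]) -> str:
    """Render the {rag-context} block; return '' when nothing qualifies."""
    kept = [r for r in results if has_provenance(r)]
    if not kept:
        return ""
    sections = []
    for r in kept:
        end = r.chunk_offset + r.chunk_length
        header = f"## {r.file_path}:{r.chunk_offset}-{end} (similarity: {r.similarity:.2f})"
        sections.append(f"{header}\n{r.content}")
    return "{rag-context}\n" + "\n\n".join(sections) + "\n{/rag-context}"


hits = [
    SearchResult("src/ai_client.py", 512, 256, "...content...", 0.87),
    SearchResult("", 0, 10, "orphan chunk", 0.91),  # no file path: dropped
]
print(format_rag_block(hits))
```

Filtering in the renderer means a high-similarity hit with no source never reaches the LLM context, regardless of how the caller ranks results.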

The cross-references. The dim-mismatch fix at 16412ad5 shows why provenance matters: switching providers silently corrupts the index because the embeddings have different dimensions, and the only recovery is a rebuild. The provenance (file path + chunk offset) is what makes the index re-buildable from the source files.

4. RAG never mutates state (Rule 4)

The principle. RAG is a query dimension. It returns data; it does not write data.

The mutation rules:

RAG results do NOT go into disc_entries
RAG results do NOT update FileItem curation state
RAG results do NOT write to disk
RAG results do NOT trigger knowledge harvest
RAG results do NOT modify the system prompt or persona

The exception (none). There is no feature that should mutate state from RAG results. If a feature wants to "remember" something from RAG, the user must explicitly say "add that to the discussion" (which appends a role: "User" entry to disc_entries) or "harvest that into knowledge" (which runs the harvest workflow).

The boundary in code:

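# In ai_client.py:send() (the integration point); signature elided
def send(self, *args, **kwargs):
    prompt = aggregate.build(*args, **kwargs)
    if config.rag_enabled:
        # search returns Result[list[SearchResult], ErrorInfo] (see Rule 6)
        rag_result = rag_engine.search(prompt, k=N)
        if rag_result.data:
            prompt = append_rag_block(prompt, rag_result.data)   # READ ONLY
    # NO mutation of: disc_entries, FileItem, knowledge files
    return self._send_to_provider(prompt)   # stands in for the per-provider _send_<provider> call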

The mutation must happen in a different function, called explicitly by the user or the LLM with HITL approval.
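One way to keep this boundary honest is a test helper that snapshots the state before the RAG path runs and compares afterwards. This is a sketch under assumptions: `assert_read_only` and both example paths are hypothetical, and the state dict is a simplified stand-in for the real `disc_entries` / `FileItem` structures.

```python
import copy


def assert_read_only(fn, state: dict):
    """Call fn(state); raise AssertionError if state was modified in place."""
    before = copy.deepcopy(state)
    result = fn(state)
    if state != before:
        raise AssertionError(f"{fn.__name__} mutated state")
    return result


def read_only_rag_path(state: dict) -> str:
    # Builds a new prompt string; touches nothing in state.
    return state["prompt"] + "\n" + "chunk text"


def mutating_rag_path(state: dict) -> str:
    # The forbidden pattern: auto-injecting a RAG hit into disc_entries.
    state["disc_entries"].append({"role": "User", "content": "chunk text"})
    return state["prompt"]


state = {"prompt": "q", "disc_entries": []}
print(assert_read_only(read_only_rag_path, state))
```

Running the same helper against `mutating_rag_path` raises, which is the behavior a regression test for Rule 4 would rely on.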

5. Feature-gated integration (Rule 5)

The principle. A feature must explicitly request RAG in its scope. RAG is not the default for "give me context."

The gate. Every feature that uses RAG declares the dependency in its spec, plan, and changelog:

## Scope
- Feature X (uses RAG for semantic search)
- Feature Y (no RAG dependency; uses Curation + Discussion only)

## Dependencies
- RAG is required for Feature X; the user must opt-in via AI Settings
- Feature Y is independent of RAG

The runtime gate. The feature's code checks config.rag_enabled and behaves accordingly:

# In the feature's code
def feature_x(query: str) -> Result[list[SearchResult], ErrorInfo]:
    if not config.rag_enabled:
        raise RAGNotEnabledError("Feature X requires RAG; opt in via AI Settings")
    # rag_engine.search returns a Result, not a bare list (see Rule 6)
    return rag_engine.search(query, k=N)

The error message is explicit. The user knows why the feature isn't working.
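If several features share this gate, the check can be factored into a decorator so the error message stays uniform. This is only a sketch: `requires_rag` is a hypothetical helper, and the flag is passed in as a callable here so the example stays self-contained instead of reading the real `config.rag_enabled`.

```python
import functools


class RAGNotEnabledError(RuntimeError):
    """Raised when a RAG-dependent feature runs without opt-in."""


def requires_rag(is_enabled):
    """Gate a feature on an is_enabled() callable that is evaluated at call time."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if not is_enabled():
                raise RAGNotEnabledError(
                    f"{fn.__name__} requires RAG; opt in via AI Settings"
                )
            return fn(*args, **kwargs)
        return wrapper
    return decorate


settings = {"rag_enabled": False}  # stand-in for the project's config object


@requires_rag(lambda: settings["rag_enabled"])
def feature_x(query: str) -> list[str]:
    return [f"hit for {query}"]


settings["rag_enabled"] = True
print(feature_x("execution clutch"))
```

Evaluating the flag at call time, not at import time, means toggling the setting in the GUI takes effect without reloading the feature.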

The CLI surface (for testing and debugging):

$ python -m src.feature_x "execution clutch"
# Error: RAG not enabled. Enable via ai_settings.toml: [ai_settings.rag] enabled = true

The audit trail. Every feature that uses RAG is logged in metadata.json for the feature's track: uses_rag: true.

6. Graceful failure (Rule 6)

The principle. RAG failure is data, not an exception. A failed search returns an empty result; the request continues.

The failure modes (in priority order):

Failure	Handling
RAG not enabled	Skip; no `{rag-context}` block; the request continues
ChromaDB not initialized	Skip; log a warning; the request continues
Embedding provider not available	Skip; log a warning; the request continues
Index missing (first run)	Skip; log a warning; the request continues
Search returns empty	Normal; no `{rag-context}` block; the request continues
Search times out	Return partial results; log a warning
Search raises an exception	Catch; log the exception; return empty; the request continues

The failure is returned as a Result[T, ErrorInfo] value, not raised as an exception. Per the data_oriented_error_handling_20260606 convention.

# In the RAG engine
def search(self, query: str, k: int = 5) -> Result[list[SearchResult], ErrorInfo]:
    try:
        if not self._enabled:
            return Result(data=[], errors=[ErrorInfo(NOT_READY, "RAG not enabled")])
        if not self._collection:
            return Result(data=[], errors=[ErrorInfo(NOT_READY, "RAG not initialized")])
        results = self._collection.query(query, k=k)
        return Result(data=results, errors=[])
    except Exception as exc:
        return Result(data=[], errors=[ErrorInfo(INTERNAL, str(exc))])

The caller (ai_client.py:send) inspects the result and proceeds without RAG when the search failed or came back empty:

rag_result = rag_engine.search(prompt, k=N)
if rag_result.ok and rag_result.data:
    prompt = append_rag_block(prompt, rag_result.data)
# else: proceed without RAG; the request doesn't fail

The user sees the warning in the comms log:

[RAG] search failed: ChromaDB not initialized
[RAG] request continues without RAG
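The whole path can be exercised end to end with a small self-contained sketch. The `Result` and `ErrorInfo` types below are simplified stand-ins for the project's own (the real Result takes two type parameters), and `safe_search` and `build_prompt` are hypothetical names; the log lines mirror the ones shown above.

```python
from dataclasses import dataclass, field
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass
class ErrorInfo:
    kind: str
    message: str


@dataclass
class Result(Generic[T]):
    data: T
    errors: list[ErrorInfo] = field(default_factory=list)


def safe_search(search_fn: Callable[[str], list[str]], query: str, log: list[str]) -> Result[list[str]]:
    """Convert any search failure into an empty Result plus log lines."""
    try:
        return Result(data=search_fn(query))
    except Exception as exc:
        log.append(f"[RAG] search failed: {exc}")
        log.append("[RAG] request continues without RAG")
        return Result(data=[], errors=[ErrorInfo("INTERNAL", str(exc))])


def build_prompt(prompt: str, search_fn: Callable[[str], list[str]], log: list[str]) -> str:
    result = safe_search(search_fn, prompt, log)
    if result.data:  # empty on failure, so the prompt passes through unchanged
        prompt += "\n{rag-context}\n" + "\n".join(result.data) + "\n{/rag-context}"
    return prompt


def broken_search(query: str) -> list[str]:
    raise RuntimeError("ChromaDB not initialized")


comms_log: list[str] = []
print(build_prompt("where does X happen?", broken_search, comms_log))
print(comms_log)
```

Because the caller branches on `data` alone, a timeout that returns partial results with a warning attached would still contribute what it found.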

7. The wiring points (the where)

Where in `src/`	What it does	What it does NOT do
`src/ai_client.py:send`	The integration point; appends `{rag-context}` if enabled	Does not mutate state
`src/aggregate.py:run`	Builds the initial context; appends `{rag-context}` in the volatile layer	Does not query RAG directly
`src/rag_engine.py:search`	The semantic search; returns `Result[list[SearchResult], ErrorInfo]`	Does not write to the index
`src/rag_engine.py:index_file`	The indexer; called by `RAGEngine._init_vector_store` or by the harvest CLI	Does not run at LLM call time
`src/ai_settings.toml` (or GUI)	The opt-in surface	Does not trigger RAG automatically

8. The forbidden patterns (the "don't do this" list)

Pattern	Why it's forbidden
RAG as a replacement for curation	Curation is structural (per-file schema); RAG is semantic (fuzzy). Use curation for "how to render file X"
RAG as a replacement for discussion	Discussion is precise (the actual messages); RAG is fuzzy. Use discussion for "what was said"
RAG as a replacement for knowledge	Knowledge is durable (user-edited, provenance-aware); RAG is volatile (indexed, opaque). Use knowledge for "what we decided"
Auto-inject RAG results into `disc_entries`	This is a state mutation; it changes the conversation in a way the user didn't ask for
Auto-write RAG results to disk	Same; no mutation
Use RAG when the user hasn't opted in	RAG is opt-in; default-off in new projects
Crash the request when RAG fails	Graceful failure; the request continues
Use RAG for "show me the last thing the user said"	Use `disc_entries` (precise)
Use RAG for "show me what we decided last time"	Use the knowledge digest (durable)
Use RAG for "show me the file the user is editing"	Use `FileItem` (curation)

9. The cross-references

conductor/code_styleguides/agent_memory_dimensions.md §3 — the RAG dim in context
conductor/code_styleguides/data_oriented_design.md §1.2 — "Design around a model of the world" (the underlying anti-pattern)
conductor/code_styleguides/cache_friendly_context.md — where the 4 dims get injected in the cache strategy
conductor/code_styleguides/knowledge_artifacts.md — the knowledge dim (the alternative for "what we decided")
docs/guide_rag.md — the existing RAG deep-dive
data_oriented_error_handling_20260606 — the Result[T, ErrorInfo] pattern
conductor/tracks/rag_phase4_stress_fix_20260606 — the dim-mismatch fix at 16412ad5
