# The 4 Memory Dimensions

**Status:** Styleguide; codifies the 4 memory dimensions of the Manual Slop conversation data.
**Date:** 2026-06-12
**Cross-refs:** `conductor/code_styleguides/data_oriented_design.md` §9; `docs/guide_agent_memory_dimensions.md`; `conductor/tracks/nagent_review_20260608/nagent_review_v2_3_20260612.md` §2.8.

> **What this is.** The conversation data has 4 distinct memory dimensions. Each lives at a different layer; each serves a different purpose. Using the wrong shape for a layer is a common mistake. This styleguide names the 4, names the boundaries between them, and gives the rule for which one to use when.

---

## 0. The 4 dimensions (the one-glance table)

| # | Dim | Where it lives | What it stores | How it's edited | How it's queried | SSDL |
|---|---|---|---|---|---|---|
| 1 | **Curation** | `FileItem` + `ContextPreset` + Fuzzy Anchors | *How to render a file* in the AI's context window | Structural File Editor; project TOML | Implicit in `aggregate.py:run` at discussion start | `[Q]` |
| 2 | **Discussion** | `app.disc_entries` + branching + UISnapshot | *What was said* in the conversation | GUI `[Edit]` mode; `[Branch]`; undo/redo | `build_markdown` renders as prior context | `o==>` |
| 3 | **RAG** | `src/rag_engine.py` (ChromaDB) | *Semantic fingerprints* of indexed files | (opaque vector store) | `RAGEngine.search()` at LLM call time | `[Q]` |
| 4 | **Knowledge** | `~/.manual_slop/knowledge/*.md` + per-file + digest + ledger | *Durable learnings* from past sessions | Plain markdown edit | Bounded digest as stable prefix | `o==>` |

---

## 1. Curation memory (per-file, per-discussion, structural)

**The shape.** Per-file curation config: `path`, `auto_aggregate`, `force_full`, `view_mode` (`full / skeleton / summary / sig / def / agg`), `ast_signatures`, `ast_definitions`, `ast_mask`, `custom_slices` (Fuzzy Anchors). A `ContextPreset` is a named, persisted set of `FileItem`s. Both persist in the project TOML.
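As an illustration of that shape, here is a minimal sketch of the two records as dataclasses. The field names come from the list above; the types, defaults, and the `files` attribute name are assumptions, and the real schema in `src/models.py` may differ.

```python
from dataclasses import dataclass, field

# View modes as listed in the shape description above.
VIEW_MODES = ("full", "skeleton", "summary", "sig", "def", "agg")


@dataclass
class FileItem:
    """Hypothetical mirror of the per-file curation config (types are assumed)."""
    path: str
    auto_aggregate: bool = True
    force_full: bool = False
    view_mode: str = "full"
    ast_signatures: list[str] = field(default_factory=list)
    ast_definitions: list[str] = field(default_factory=list)
    ast_mask: list[str] = field(default_factory=list)
    custom_slices: list[dict] = field(default_factory=list)  # Fuzzy Anchors

    def __post_init__(self) -> None:
        # Reject view modes that the shape description does not list.
        if self.view_mode not in VIEW_MODES:
            raise ValueError(f"unknown view_mode: {self.view_mode!r}")


@dataclass
class ContextPreset:
    """A named set of FileItems; 'files' is an assumed attribute name."""
    name: str
    files: list[FileItem] = field(default_factory=list)


preset = ContextPreset(
    name="review",
    files=[FileItem(path="src/aggregate.py", view_mode="skeleton")],
)
```

Both records hold only scalars and flat lists, which is what lets them round-trip through the project TOML without custom serialization.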

**The query model.** "When discussion X opens, render file Y per its curation memory." Implicit in `aggregate.py:run` at discussion start. The user doesn't query the curation memory directly; they *configure* it.

**The right tool.** The Structural File Editor (per `docs/guide_context_curation.md`). AST-aware slices, Fuzzy Anchor slices, view-mode picker. The file's `FileItem` is the UI surface.

**The wrong tool.** Storing curation state in `disc_entries` (it's not conversational). Storing curation state in the RAG index (it's structural, not semantic). Storing curation state in the knowledge digest (it's per-discussion, not durable).

**The codepath** (SSDL):

```
[Q:discussion starts]
   │
   ▼
[Q:which ContextPreset is active?]
   │
   ├── preset N ──► [I:load ContextPreset N's FileItems]
   │
   ▼
[loop: each FileItem]
   │
   ├──► [Q:FileItem.view_mode?]
   │     │
   │     ├── full ──► [I:read full file]
   │     ├── skeleton ──► [I:py_get_skeleton / ts_c_get_skeleton]
   │     ├── summary ──► [I:run_subagent_summarization]
   │     ├── sig ──► [I:py_get_skeleton (signatures only)]
   │     ├── def ──► [I:py_get_skeleton (definitions only)]
   │     └── agg ──► [I:py_get_skeleton (children only)]
   │
   ├──► [Q:FileItem.ast_mask?]
   │     │
   │     └── yes ──► [I:apply ast_mask to the rendered view]
   │
   ├──► [Q:FileItem.custom_slices?]
   │     │
   │     └── yes ──► [I:apply custom_slices to the rendered view]
   │
   └──► [I:append to aggregate markdown]
```
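The loop above can be sketched as a dispatch table plus a chain of optional post-steps. This is a simplified illustration, not the code in `aggregate.py`: the renderers are stand-ins (the real ones are `py_get_skeleton` and friends), the post-steps are shared across files rather than per-`FileItem`, and the `### path` section heading is an assumption.

```python
from typing import Callable

Renderer = Callable[[str], str]


def render_item(text: str, view_mode: str,
                renderers: dict[str, Renderer],
                post_steps: list[Renderer]) -> str:
    """Pick the renderer for view_mode, then run the optional post-steps in order."""
    if view_mode not in renderers:
        raise ValueError(f"unknown view_mode: {view_mode!r}")
    rendered = renderers[view_mode](text)
    for step in post_steps:  # stand-ins for ast_mask, then custom_slices
        rendered = step(rendered)
    return rendered


def build_aggregate(items: list[tuple[str, str, str]],
                    renderers: dict[str, Renderer],
                    post_steps: list[Renderer]) -> str:
    """items are (path, view_mode, text) triples; returns one markdown string."""
    sections = []
    for path, view_mode, text in items:
        body = render_item(text, view_mode, renderers, post_steps)
        sections.append(f"### {path}\n{body}")  # heading format is an assumption
    return "\n\n".join(sections)


def keep_headers(text: str) -> str:
    """Toy skeleton renderer: keep only top-level def/class lines."""
    return "\n".join(
        line for line in text.splitlines() if line.startswith(("def ", "class "))
    )


# Stand-in renderers; only two of the six view modes are modelled here.
RENDERERS: dict[str, Renderer] = {"full": lambda text: text, "skeleton": keep_headers}
```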

**The shape rule.** Curation is per-file, per-discussion, structural. Edited at the Structural File Editor. Persisted in TOML. The file's `FileItem` is the single source of truth for "how do I render this file in the AI's context."

---

## 2. Discussion memory (per-discussion, conversational, multi-turn)

**The shape.** `app.disc_entries: list[dict]` where each entry is `{"role": str, "content": str, "collapsed": bool, "ts": str, ...}` plus optional `thinking_segments` and `usage` (token accounting). The discussion is rendered as prior context for the LLM by `build_markdown` (per `src/aggregate.py`), which returns a single `str` in the codepath below.
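A minimal sketch of one entry, using only the required keys named above. The helper `make_entry` is hypothetical, and the ISO-8601 UTC timestamp is an assumption about the `ts` format.

```python
from datetime import datetime, timezone


def make_entry(role: str, content: str) -> dict:
    """Build one disc_entries record with the required keys from the shape above."""
    return {
        "role": role,
        "content": content,
        "collapsed": False,
        "ts": datetime.now(timezone.utc).isoformat(),  # timestamp format is assumed
    }


# The discussion itself is just an ordered list of such records.
disc_entries: list[dict] = []
disc_entries.append(make_entry("User", "hello"))
disc_entries.append(make_entry("AI", "hi"))
```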

**The query model.** "What did the user say? What did the AI say? In what order?" The discussion is the *prior context* for the next LLM call. The user can edit, insert, delete, role-change, and branch at any entry (A1-A7 per-entry operations per the nagent review v1 §3).

**The right tool.** The Discussion Hub panel. Per-entry `[Edit]`, `[Read]`, `[+/-]`, `Ins`, `Del`, `[Branch]`, role combo. The undo/redo stack (UISnapshot) and the Take/branching/compact system.

**The wrong tool.** Storing discussion state in the RAG index (it's temporal, not semantic). Storing discussion state in the knowledge digest (it's per-discussion, not durable). Storing discussion state in a FileItem (it's not per-file).

**The codepath** (SSDL):

```
[Q:user types prompt + hits Enter]
   │
   ▼
[I:append new entry to disc_entries]    (role: "User")
   │
   ▼
[Q:which ContextPreset is active?]
   │
   ├── preset N ──► [I:render FileItems per curation memory]
   │
   ▼
[I:aggregate.build_markdown(preset, discussion) -> str]
   │
   ▼
[I:ai_client.send(aggregate_text, history)]
   │
   ▼
[I:append new entry to disc_entries]    (role: "AI", content: response)
   │
   ▼
[Q:user pressed Edit on an entry?]
   │
   ├── yes ──► [I:update disc_entries[i].content]
   │
   ▼
[Q:user pressed Branch on an entry?]
   │
   ├── yes ──► [I:project_manager.branch_discussion(index) -> new Take]
   │
   ▼
[Q:user pressed Undo?]
   │
   ├── yes ──► [I:history.UISnapshot.pop() -> restore previous state]
   │
   ▼
[Q:user pressed Compact?]
   │
   ├── yes ──► [I:ai_client.run_discussion_compaction(discussion)]    (Candidate 11)
   │
   ▼
[T:render Discussion Hub panel from disc_entries]
```
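The Edit, Branch, and Undo steps above can be sketched as plain operations on the entry list. `SnapshotStack` is a hypothetical stand-in for the undo stack, not the real `UISnapshot`, and whether a branch includes the entry it was taken at is an assumption (this sketch includes it).

```python
import copy


class SnapshotStack:
    """Hypothetical stand-in for the undo stack; stores deep copies of the list."""

    def __init__(self) -> None:
        self._snapshots: list[list[dict]] = []

    def push(self, entries: list[dict]) -> None:
        self._snapshots.append(copy.deepcopy(entries))

    def pop(self) -> "list[dict] | None":
        # Returns the most recent snapshot, or None when there is nothing to undo.
        return self._snapshots.pop() if self._snapshots else None


def edit_entry(entries: list[dict], index: int, content: str,
               undo: SnapshotStack) -> None:
    """Snapshot first, then mutate, so the edit can be undone."""
    undo.push(entries)
    entries[index]["content"] = content


def branch_at(entries: list[dict], index: int) -> list[dict]:
    """Return an independent copy of entries[0..index] as the start of a new take."""
    return copy.deepcopy(entries[: index + 1])
```

Deep copies keep a take and its parent discussion independent, so editing one never leaks into the other.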

**The shape rule.** Discussion is per-discussion, conversational, multi-turn. Edited per-entry. Persisted in TOML via `_flush_to_project`. The `disc_entries` list is the single source of truth for "what was said in this discussion."

---

## 3. RAG memory (opt-in, semantic, fuzzy)

**The shape.** ChromaDB vector store; per-file `FileItem`-like records with embeddings. `RAGEngine.search(query, k=N)` returns the top-N most-similar chunks. Persisted in `tests/artifacts/.slop_cache/chroma_<embedding_provider>/`.

**The query model.** "Given a query, return similar content from the indexed corpus." Semantic similarity, fuzzy. Provenance is limited to the source location (file path, chunk offset). No user-editable content.

**The right tool.** `RAGEngine.search()` at LLM call time (the `rag_*` results injected into the LLM prompt). The `[X] Enable RAG` toggle in AI Settings. The `RAGConfig` (embedding provider, chunk size, chunk overlap, source selection).

**The wrong tool.** Using RAG as a *replacement* for the other 3 dimensions. Using RAG results for state mutation (the integration discipline prohibits this). Using RAG for "show me the last thing the user said" (use Discussion memory). Using RAG for "show me what we decided last time" (use Knowledge memory).

**The codepath** (SSDL):

```
[Q:ai_client.send() is called]
   │
   ▼
[Q:is RAG enabled?]
   │
   ├── no ──► [T:skip]
   │
   ▼
[Q:which RAG source? (project / global / none)]
   │
   ├── project ──► [I:RAGEngine.index_file(path) for each tracked file in project]
   ├── global ──► [I:RAGEngine.index_file(path) for each file in ~/.manual_slop/knowledge/]
   └── none ──► [T:skip]
   │
   ▼
[Q:RAG engine initialized?]
   │
   ├── no ──► [I:RAGEngine._init_embedding_provider()]   (lazy init, may download)
   │
   ▼
[I:RAGEngine.search(query, k=N) -> list[SearchResult]]
   │
   ▼
[I:append "{rag-context}" block to aggregate markdown]
   │
   ▼
[I:ai_client.send() continues with augmented prompt]
```
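The gate and the injection step can be sketched without touching ChromaDB at all. Everything below is illustrative: `SearchResult` is a hypothetical record (the real return type of `RAGEngine.search()` may differ), and the line format inside the `{rag-context}` block is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    """Hypothetical search hit carrying its provenance (path + chunk offset)."""
    path: str
    offset: int
    text: str
    score: float


def augment_prompt(aggregate: str, results: list[SearchResult],
                   enabled: bool = False) -> str:
    """Return the aggregate with a rag-context block appended; default-off."""
    if not enabled or not results:
        return aggregate  # RAG is opt-in: nothing is added unless asked for
    lines = ["{rag-context}"]
    for hit in results:
        # Every injected chunk is prefixed with where it came from.
        lines.append(f"[{hit.path}@{hit.offset}] {hit.text}")
    return aggregate + "\n\n" + "\n".join(lines)
```

The function returns a new string and never writes back to any store, which matches the "no mutation" part of the shape rule.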

**The shape rule.** RAG is opt-in. Default-off. Complements the other dimensions; never replaces. Provenance is required (file path, chunk offset). No mutation. See `conductor/code_styleguides/rag_integration_discipline.md` for the full rule.

---

## 4. Knowledge memory (per-project, durable, provenance-aware)

**The shape.** A markdown tree at `~/.manual_slop/knowledge/`:

| File | Format | What it stores |
|---|---|---|
| `knowledge/facts.md` | `- {statement} {provenance}` | Durable statements about systems, repos, tools |
| `knowledge/decisions.md` | `- {statement} {reason}` | Decisions that were made |
| `knowledge/questions.md` | `- {question}` | Unanswered questions |
| `knowledge/playbooks.md` | `- **{name}**: {steps}` | Reusable command sequences |
| `knowledge/tasks.md` | `- {task}` (## Open / ## Done) | Open and done tasks |
| `knowledge/files/{file_id}.md` | `- {note} {provenance}` | Per-file notes (keyed by inode) |
| `knowledge/digest.md` | bounded 4KB | The projected digest (injected as `{knowledge}` block) |
| `knowledge/ledger.json` | `{entries: {sha256: {status, at, items}}}` | The harvest audit log |
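The ledger row gives the shape `{entries: {sha256: {status, at, items}}}` but not what is hashed or what `items` holds. A minimal sketch, assuming the key is the SHA-256 of the harvested session text, `at` is an ISO-8601 UTC timestamp, and `items` is a count of harvested statements; `record_harvest` is a hypothetical helper, not part of the proposed harvest CLI.

```python
import hashlib
import json
from datetime import datetime, timezone


def record_harvest(ledger: dict, session_text: str, status: str, items: int) -> str:
    """Insert or overwrite one ledger entry and return its key."""
    # Assumption: the ledger is keyed by the hash of the session content,
    # so harvesting the same session twice lands on the same entry.
    key = hashlib.sha256(session_text.encode("utf-8")).hexdigest()
    ledger.setdefault("entries", {})[key] = {
        "status": status,
        "at": datetime.now(timezone.utc).isoformat(),
        "items": items,
    }
    return key


ledger: dict = {}
key = record_harvest(ledger, "session A transcript", "ok", 3)
serialized = json.dumps(ledger, indent=2)  # what would be written to ledger.json
```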

**The query model.** "Given past sessions, what durable knowledge should I inject into the current discussion?" The answer is the `{knowledge}` block in the initial context, regenerated from the category files (newest first), bounded to 4KB.
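A sketch of the bounded projection, assuming "4KB" means 4096 bytes of UTF-8 and that the category-file items arrive already sorted newest first. The function name and the wording of the truncation note are hypothetical.

```python
DIGEST_LIMIT = 4096  # bytes; reading "4KB" as 4096 is an assumption
TRUNCATION_NOTE = "\n\n_(digest truncated to fit the 4KB budget)_\n"


def build_digest(items: list[str], limit: int = DIGEST_LIMIT) -> str:
    """Project bullet lines (newest first) into a digest of at most `limit` bytes."""
    full = "\n".join(items) + "\n" if items else ""
    if len(full.encode("utf-8")) <= limit:
        return full
    # Reserve room for the visible note, then keep whole lines while they fit.
    budget = limit - len(TRUNCATION_NOTE.encode("utf-8"))
    kept: list[str] = []
    used = 0
    for line in items:
        size = len(line.encode("utf-8")) + 1  # +1 for the joining newline
        if used + size > budget:
            break
        kept.append(line)
        used += size
    return "\n".join(kept) + TRUNCATION_NOTE
```

Cutting on line boundaries keeps every surviving item intact, and because the input is newest first, the oldest items are the ones dropped.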

**The right tool.** The harvest CLI (`python -m src.knowledge_harvest`) for the harvest; the plain text editor (vim, nano, the GUI) for the category files. The "Knowledge" panel in the GUI for browse/edit/prune.

**The wrong tool.** Treating the knowledge digest as state (it's a projection; the category files are the state). Letting the digest grow unbounded (4KB cap; truncate with a visible note). Treating the per-file notes as a replacement for FileItem curation (different dimensions; both are useful).

**The codepath** (SSDL):

```
[Q:discussion starts]
   │
   ▼
[Q:knowledge digest exists? (knowledge/digest.md)]
   │
   ├── no ──► [T:skip]
   │
   ▼
[Q:digest within 4KB budget?]
   │
   ├── yes ──► [I:read digest]
   │
   ├── no ──► [I:read digest (truncated with note)]
   │
   ▼
[Q:aggregate.py:run is at the stable prefix position]
   │
   ▼
[I:append "{knowledge}" block to initial context]
   │
   ▼
[Q:per-file knowledge for files in scope?]
   │
   ├── yes ──► [I:append "{file-knowledge}" per FileItem]
   │
   ▼
[T:continue rendering aggregate]
```

**The shape rule.** Knowledge is per-project, durable, provenance-aware. Edited by the user (plain markdown). The category files are the source of truth; the digest is a projection. See `conductor/code_styleguides/knowledge_artifacts.md` for the full harvest workflow.

---

## 5. The boundaries (when NOT to mix)

| Don't store... | In... | Because... |
|---|---|---|
| Discussion state | `FileItem` (curation) | Discussion is per-discussion, not per-file |
| File curation | `disc_entries` (discussion) | Curation is per-file structural, not conversational |
| Semantic search results | `disc_entries` (discussion) | RAG is fuzzy; the discussion is precise |
| A long conversation | the knowledge digest (knowledge) | The digest is bounded (4KB); the conversation is unbounded |
| A "this is the current state" fact | the RAG index (RAG) | RAG is semantic; state is precise |
| Per-file notes | the discussion context | The notes should follow the file, not the discussion |
| Per-discussion summary | the knowledge digest | The digest is *cross*-discussion, not per-discussion |
| LLM-derived curation | the FileItem schema | LLM outputs are untrusted; the FileItem is user-edited |
| Untrusted LLM output | the knowledge category files | The harvest prompt has retry + graceful failure, but its output is still unverified; the category files are *user-editable*, so review and corrections are first-class |

**The discipline.** When designing a new feature, ask: which of the 4 dimensions is the *natural* home? Don't reach for RAG because "it's there"; reach for the dimension whose shape matches the data.

---

## 6. The cross-cutting principle (the "data is the thing")

All 4 dimensions share one principle: **the data is the thing, not the agent.** Each dimension has:
- A flat shape (no object graphs; structs of structs of scalars)
- Durable storage (TOML, ChromaDB, markdown; not Python objects)
- A user-editable surface (the Structural File Editor, the Discussion Hub, the RAG toggle, the category files)
- A query model that returns "data, not control flow" (per `data_oriented_error_handling_20260606`)

The wrong shape for the right question is a common mistake. The right question is "which of the 4 dimensions is this?" — not "is there a tool that does X?"

---

## 7. The decision tree (the 1-question test)

When a feature needs *some* memory, ask this single question:

```
Q: What is the *data* (not the operation) the feature needs?
   │
   ├── "How to render a file"          ──► Curation (FileItem)
   ├── "What was said in this chat"     ──► Discussion (disc_entries)
   ├── "What similar content exists"    ──► RAG (RAGEngine.search)
   └── "What we learned from past runs" ──► Knowledge (knowledge/digest.md)
```

Pick the matching dimension. If the feature needs 2+ dimensions, use 2+ dimensions — but be explicit about which is the *primary* (the one that holds the *answer*) and which is *secondary* (the one that provides *context*).
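One way to make the primary/secondary choice explicit is to record it as data in the feature's design note. The sketch below is purely illustrative; `Dimension` and `MemoryPlan` are hypothetical names, and the enum values simply repeat the right-hand side of the decision tree.

```python
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    CURATION = "FileItem"
    DISCUSSION = "disc_entries"
    RAG = "RAGEngine.search"
    KNOWLEDGE = "knowledge/digest.md"


@dataclass(frozen=True)
class MemoryPlan:
    """Hypothetical design record: which dimension answers, which add context."""
    primary: Dimension
    secondary: tuple[Dimension, ...] = ()

    def __post_init__(self) -> None:
        # A dimension cannot both hold the answer and merely provide context.
        if self.primary in self.secondary:
            raise ValueError("primary dimension must not also be secondary")


# Example: the answer lives in the discussion; knowledge only adds context.
plan = MemoryPlan(primary=Dimension.DISCUSSION, secondary=(Dimension.KNOWLEDGE,))
```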

---

## 8. The implementation cross-references (the file:line map)

For Manual Slop's current state:

| Dim | Where in `src/` | Line range | What to look at |
|---|---|---|---|
| Curation | `src/models.py` | 510-559 | `FileItem` schema |
| Curation | `src/models.py` | 909-937 | `ContextPreset` schema |
| Curation | `src/context_presets.py` | (small) | `ContextPresetManager` |
| Curation | `src/aggregate.py` | (518 lines) | `build_file_items`, `build_markdown` |
| Discussion | `src/gui_2.py` | 3770-3853 | `render_discussion_entry` (A1-A7) |
| Discussion | `src/gui_2.py` | 4239-4260 | `render_discussion_entry_controls` (B1-B11) |
| Discussion | `src/history.py` | 8-71 | `UISnapshot`, `HistoryManager` (C1-C5) |
| Discussion | `src/project_manager.py` | 429+ | `branch_discussion`, `promote_take` |
| RAG | `src/rag_engine.py` | 1-384 | The RAG engine + ChromaDB |
| Knowledge | (NEW) `src/knowledge_store.py` | (proposed) | The knowledge store |
| Knowledge | (NEW) `src/knowledge_harvest_cli.py` | (proposed) | The harvest CLI |

---

## 9. The cross-references

- `conductor/code_styleguides/data_oriented_design.md` §9 — the 4-dim table in the canonical DOD
- `conductor/code_styleguides/rag_integration_discipline.md` — the conservative-RAG rule
- `conductor/code_styleguides/knowledge_artifacts.md` — the knowledge harvest pattern
- `conductor/code_styleguides/cache_friendly_context.md` — the cache strategy (where the 4 dims get injected)
- `docs/guide_agent_memory_dimensions.md` — the user-facing cross-cutting guide
- `docs/guide_context_curation.md` — the existing curation deep-dive
- `docs/guide_rag.md` — the existing RAG deep-dive
- `conductor/tracks/nagent_review_20260608/nagent_review_v2_3_20260612.md` §2.8 — the nagent-origin pattern that informed the knowledge dim