diff --git a/docs/guide_rag.md b/docs/guide_rag.md
index 137637ec..87e245c5 100644
--- a/docs/guide_rag.md
+++ b/docs/guide_rag.md
@@ -73,7 +73,8 @@ class RAGEngine:
 
 **Internal state**:
 - `embedding_provider: BaseEmbeddingProvider` — set by `_init_embedding_provider`
-- `vector_store` — a ChromaDB `Collection` (or a stub for tests)
+- `client: chromadb.PersistentClient` — the chroma client (or the string `"mock"` in mock mode)
+- `collection: chromadb.Collection` — the actual collection (or `"mock"` in mock mode)
 - `chunk_size: int` — character count per chunk
 - `chunk_overlap: int` — overlap between adjacent chunks
 
@@ -125,22 +126,32 @@ The heavy dependencies (`sentence_transformers`, `google.genai`, `chromadb`) are
 
 ### Vector Store
 
-ChromaDB is the default persistent vector store. The store is created at `/.rag/chroma/` by default (configurable via `RAGConfig.vector_store_path`).
+ChromaDB is the default persistent vector store. The store is created at `/.slop_cache/chroma_/` (auto-generated from `VectorStoreConfig.collection_name`, default `"manual_slop"`). The `.slop_cache` location is intentional — it co-locates the chroma index with the existing per-project cache layout.
 
 ```python
 def _init_vector_store(self):
-    if self.config.vector_store_backend == "chromadb":
-        client = chromadb.PersistentClient(path=...)
-        self.vector_store = client.get_or_create_collection(name=...)
+    vs_config = self.config.vector_store
+    if vs_config.provider == 'chroma':
+        db_path = os.path.abspath(os.path.join(
+            self.base_dir, ".slop_cache", f"chroma_{vs_config.collection_name}"
+        ))
+        os.makedirs(db_path, exist_ok=True)
+        chromadb, Settings = _get_chromadb()
+        self.client = chromadb.PersistentClient(path=db_path)
+        self.collection = self.client.get_or_create_collection(name=vs_config.collection_name)
+        self._validate_collection_dim()
+    elif vs_config.provider == 'mock':
+        self.client = "mock"
+        self.collection = "mock"
     else:
-        raise NotImplementedError(...)
+        raise ValueError(f"Unknown vector store provider: {vs_config.provider}")
 ```
 
-**Backends**:
-- `chromadb` (default) — local persistent, single-process
-- *Future*: External RAG Bridge via MCP (e.g., a remote vector database server)
+**Backends** (`VectorStoreConfig.provider`):
+- `chroma` (default for real use) — local persistent, single-process
+- `mock` — no-op collection (for tests / RAG-disabled paths)
 
-The `_search_mcp` method is a placeholder for the future external bridge integration; current local-only mode uses `vector_store.query()` directly.
+The `mcp_server` + `mcp_tool` fields in `VectorStoreConfig` are placeholders for the future External RAG Bridge via MCP (e.g., a remote vector database server); not yet implemented.
 
 ### Chunking Strategies
 
@@ -198,6 +209,7 @@ When a project is loaded with RAG enabled, the `RAGEngine` is populated by index
 1. Project load: AppController reads [rag] section from manual_slop.toml
 2. AppController constructs RAGEngine(config)
 3. RAGEngine._init_vector_store() creates/loads ChromaDB collection
+   - Calls _validate_collection_dim() to detect/recover from dim mismatch
 4. For each tracked file (parallelized):
    a. Read content
    b. Choose chunker based on extension and config
@@ -210,6 +222,16 @@ When a project is loaded with RAG enabled, the `RAGEngine` is populated by index
 
 **Incremental Updates**: When a file's `mtime` changes (detected by `pathlib.Path.stat().st_mtime`), `delete_documents_by_path()` is called first, then the file is re-indexed. This is critical for the auto-sync flow (see Configuration below).
 
+**Path resolution resilience**: `index_file()` falls back to `os.getcwd()` if the `base_dir`-relative path doesn't exist. This handles batched test conditions where the subprocess CWD differs from the project root (e.g., a test chdir'ing into `tests/artifacts/live_gui_workspace_*/` for fixture isolation). Without the fallback, indexing silently skipped files in those conditions.
+
+### Dimension Mismatch Protection
+
+`_init_vector_store()` calls `_validate_collection_dim()` after creating the collection. The validation inspects the dimension of the first existing vector and compares it to the output dimension of the current embedding provider. On mismatch (e.g., the user switched from Gemini 3072-dim to local 384-dim, or vice versa, or a prior run populated the collection with a different model), the chroma directory is wiped via `shutil.rmtree` (with the client closed first to release file handles) and the collection is recreated with the correct dimension.
+
+**Why this exists:** Without validation, dim-mismatched upserts silently corrupt the collection. The next `search()` raises `chromadb.errors.InvalidDimensionError: Collection expecting embedding with dimension of X, got Y`, the AI request never reaches `'done'` status, and the live_gui test's polling times out after 50 × 0.5 s = 25 s. This pattern was the dominant cause of `tier-3-live_gui` failures in the 2026-06-08 to 2026-06-10 window.
+
+Regression tests in `tests/test_rag_engine.py`: `test_rag_collection_dim_mismatch_recreates_collection`, `test_rag_collection_dim_match_preserves_collection`.
+
 ### Query Flow
 
 When `ai_client.send(rag_engine=engine)` is called:
@@ -262,33 +284,41 @@ RAG is configured via the project's `manual_slop.toml`:
 [rag]
 enabled = true
 embedding_provider = "gemini"  # or "local"
 chunk_size = 1000
 chunk_overlap = 200
-ast_chunking_enabled = true
-vector_store_backend = "chromadb"
-vector_store_path = ".rag/chroma"  # relative to project base_dir
-auto_index_on_load = true
-auto_sync_interval_seconds = 60  # background re-indexing
-top_k = 5
+
+[rag.vector_store]
+provider = "chroma"              # "chroma" | "mock"
+collection_name = "manual_slop"  # the chroma subdir under .slop_cache/
+url = ""                         # future: external HTTP vector store
+api_key = ""                     # future: external HTTP auth
+mcp_server = ""                  # future: MCP-based external RAG bridge
+mcp_tool = ""                    # future: tool name on the MCP server
 ```
 
-### `RAGConfig` Schema (`src/models.py`)
+### `RAGConfig` + `VectorStoreConfig` Schema (`src/models.py`)
 
 ```python
+@dataclass
+class VectorStoreConfig:
+    provider: str  # "chroma" | "mock"
+    url: Optional[str] = None  # future: external HTTP
+    api_key: Optional[str] = None  # future: external HTTP auth
+    collection_name: str = "manual_slop"
+    mcp_server: Optional[str] = None  # future: MCP bridge
+    mcp_tool: Optional[str] = None  # future: MCP tool name
+
 @dataclass
 class RAGConfig:
-    enabled: bool = False
-    embedding_provider: str = "gemini"  # "local" | "gemini"
-    chunk_size: int = 1000
-    chunk_overlap: int = 200
-    ast_chunking_enabled: bool = True
-    vector_store_backend: str = "chromadb"
-    vector_store_path: str = ".rag/chroma"
-    auto_index_on_load: bool = True
-    auto_sync_interval_seconds: int = 60
-    top_k: int = 5
+    enabled: bool = False
+    vector_store: VectorStoreConfig = field(default_factory=lambda: VectorStoreConfig(provider='mock'))
+    embedding_provider: str = 'gemini'  # "gemini" | "local"
+    chunk_size: int = 1000
+    chunk_overlap: int = 200
 ```
 
+> **Removed fields** (moved to other systems or not yet implemented): `ast_chunking_enabled` lives in `ChunkingConfig` (not in `RAGConfig`); `vector_store_backend`/`vector_store_path` replaced by nested `VectorStoreConfig`; `auto_index_on_load`/`auto_sync_interval_seconds`/`top_k` are runtime parameters set by the controller, not persisted in `RAGConfig`.
+
 ### Behavior When Disabled
 
 If `enabled = false` (the default), `RAGEngine` is never constructed. `ai_client.send()` receives `rag_engine=None` and the integration is a no-op. The lazy-loading of `chromadb`, `sentence_transformers`, and `google.genai` is also skipped, so there is zero overhead for projects that don't use RAG.