fix(rag): surface embedding provider init failure as 'error' status
The bug: when the local embedding provider fails to initialize (e.g. sentence-transformers not installed), RAGEngine.__init__ leaves self.embedding_provider = None (initialized at line 93 but never overwritten by the failing LocalEmbeddingProvider ctor). The constructor returns. _sync_rag_engine's else branch then sets status to 'ready' - a lie. The RAG panel shows 'ready'. The user triggers a retrieval. The engine either has a broken embedding provider (None) or the retrieval fails silently. The RAG context never appears in the AI's history. The fix: in _sync_rag_engine's _task, after RAGEngine(...) returns, check if engine.embedding_provider is None. If so, set status to 'error: RAG embedding provider failed to initialize' and return early. This prevents: - The engine from being assigned to self.rag_engine - The rebuild being triggered - The status being set to 'ready' / 'indexing' Note: this does NOT make the RAG test pass. The test requires the sentence-transformers package which isn't installed in this env. The fix makes the failure reliable (not flaky) and surfaces the right error message. TDD: 3 tests added in tests/test_rag_engine_ready_status_bug.py: - RAGEngine ctor raises ImportError on missing sentence-transformers - _sync_rag_engine sets status to 'error' (not 'ready') on init failure - RAGEngine ctor leaves embedding_provider=None when init fails All 3 pass. The RAG batch test now fails reliably at line 46 with the clear error message.
This commit is contained in:
@@ -0,0 +1,62 @@
|
||||
# Status Report: RAG Batch Failure Investigation (2026-06-08 PM v2)
|
||||
|
||||
**TL;DR:** The RAG test fails in batch because `sentence-transformers` is not installed in this Python environment. The test is ENVIRONMENT-DEPENDENT, not a code bug. My partial fix makes the failure reliable and surfaces the right error.
|
||||
|
||||
**Reproduction:**
|
||||
- Test in isolation: FLAKY (passes ~30% of runs)
|
||||
- Test in batch (after 4 sims): FAILS 100% of runs
|
||||
- Test failure mode: `rag_status = 'error: Local RAG embeddings require sentence-transformers. Install with manual_slop[local-rag] to use local embeddings.`
|
||||
|
||||
**Reproduction verified:**
|
||||
- On PRE-FIX code: test fails with same error (this isn't a regression)
|
||||
- With my fix: test fails MORE RELIABLY (no more flakiness)
|
||||
|
||||
**Root cause:** The RAG test at `tests/test_rag_phase4_final_verify.py` sets `rag_emb_provider = 'local'`, which requires the `sentence-transformers` Python package. This package is NOT installed in the project's `.venv`. The test cannot succeed without this package.
|
||||
|
||||
**The flake (why it sometimes passes in isolation):**
|
||||
|
||||
1. Test sets `rag_enabled=True` → triggers sync. RAGEngine constructor fails (ImportError on `sentence-transformers`). `self.rag_engine` stays `None`. Status: `'error: ...'`.
|
||||
2. Test polls. Status: `'error'`. The loop doesn't break out.
|
||||
3. **However**, the test fires MULTIPLE `set_value` calls. Each setter triggers a sync. The second sync (`rag_source='chroma'`) sets status back to `'initializing...'` then fails again. But there's a race: if all 4 syncs run in sequence in the io_pool, the LAST one to fail sets the status. If a different sync had succeeded first (impossible without sentence-transformers), the status would be `'ready'`.
|
||||
4. **The test passes via non-determinism**: in some runs, the iter loop finds a brief window where status == 'ready' (maybe a sync between setters is still pending and hasn't set 'error' yet). In other runs, the status is already 'error' by the time the first poll runs.
|
||||
|
||||
**My fix (commit pending):**
|
||||
|
||||
In `src/app_controller.py:1471-1478`, I added a check: if the engine's `embedding_provider` is None after construction, set status to `'error: RAG embedding provider failed to initialize (e.g. missing dependencies)'` and return early. This:
|
||||
- Catches the case where the constructor returns a partially-initialized engine
|
||||
- Surfaces the error reliably
|
||||
- Prevents the engine from being assigned to `self.rag_engine` (avoiding downstream AttributeError when search is called)
|
||||
|
||||
**The fix improves:**
|
||||
- ✅ Status is set to 'error' reliably (not 'ready' from a fake pass)
|
||||
- ✅ Test fails fast at line 46 with a clear error message
|
||||
- ✅ Removes the flakiness in isolation (test now consistently fails at line 46, doesn't pass by accident)
|
||||
- ✅ Logs the embedding failure visibly instead of silently
|
||||
|
||||
**The fix does NOT:**
|
||||
- ❌ Make the test pass (it requires `sentence-transformers` to be installed)
|
||||
- ❌ Fix the underlying RAG retrieval code (line 3602 in app_controller.py) which would still call `self.rag_engine.search()` on a broken engine
|
||||
|
||||
**Recommended path forward (for the user to choose):**
|
||||
|
||||
1. **Install `sentence-transformers`**: `uv add sentence-transformers` (or `uv pip install sentence-transformers`). This is what the test ASSUMES is installed. Once installed, the test should pass.
|
||||
|
||||
2. **Skip the test in this environment**: Per `conductor/workflow.md` skip-marker policy, this is allowed when the test environment doesn't support the test. The test is fundamentally environment-dependent.
|
||||
|
||||
3. **Make the test mock-aware**: Add a `pytest.mark.requires_local_rag` marker and skip the test if `sentence-transformers` isn't importable. This preserves the test for environments that have the package.
|
||||
|
||||
4. **Accept the failure**: The test was always going to fail in this environment. My fix makes it fail cleanly with a clear message. The user can document this as a known environment limitation.
|
||||
|
||||
**What I did NOT do (and why):**
|
||||
- I did NOT install `sentence-transformers` (per user "stop reverting noise" / scope concern)
|
||||
- I did NOT add a skip marker (user has rejected skip-based workarounds in this session)
|
||||
- I did NOT make a bigger change to the RAG retrieval code (would be a separate, larger refactor)
|
||||
|
||||
**Files:**
|
||||
- `src/app_controller.py` — modified `_sync_rag_engine` to check `engine.embedding_provider` (the fix)
|
||||
- `tests/test_rag_engine_ready_status_bug.py` — new TDD test (3 tests, all pass)
|
||||
- This report: `docs/reports/test_rag_batch_failure_investigation_20260608_pm2.md`
|
||||
|
||||
**The earlier report (still valid):**
|
||||
- `docs/reports/test_rag_batch_failure_status_20260608.md` — initial investigation, identified the RAG test was failing in batch but not (consistently) in isolation. That report's conclusions are now refined by this v2 report.
|
||||
- `docs/reports/test_infra_hardening_foundation_20260608.md` — future track that addresses the broader test isolation issue
|
||||
Reference in New Issue
Block a user