The fix in 644d88ab changed the recovery path from client.delete_collection
to shutil.rmtree (chromadb 1.5.x delete_collection is broken on corrupted
state). The test still asserted the old behavior.
The wipe path called self._init_vector_store() which re-invoked
_validate_collection_dim, causing infinite recursion (RecursionError)
when the dim mismatch test ran with the mock embedding provider.
Re-initialize the vector store INLINE after the rmtree wipe so the
fresh collection is created without going through the validator
again.
When the existing collection has embeddings from a different
embedding provider (e.g. Gemini 3072-dim vs local 384-dim), the
prior approach of calling client.delete_collection() fails with
'RustBindingsAPI object has no attribute bindings' in chromadb 1.5.x
when the underlying state is corrupted. rmtree is reliable and
re-creates a fresh empty collection.
Also fixes:
- 'The truth value of an empty array is ambiguous' on numpy 2.x
by using try/except around len() instead of truthiness check
- WinError 32 on rmtree by closing the chroma client first
Verified: tests/test_rag_phase4_final_verify.py passes in isolation
in 7.75s after this fix. The test still fails in batch context due
to a separate io_pool race condition (multiple _sync_rag_engine
calls collide when the test sets rag_enabled, rag_source, and
rag_emb_provider in sequence). The race is in app_controller.py
and is out of scope for this defensive fix.
Note: tests/test_rag_engine.py has explicit unit tests for
test_rag_collection_dim_mismatch_recreates_collection and
test_rag_collection_dim_match_preserves_collection which
exercise this code path.
One addition to conductor/code_styleguides/python.md §8
"AI-Agent Specific Conventions":
- **No diagnostic noise in production code (Added
2026-06-09).** `sys.stderr.write(f"[XYZ_DIAG] ...") lines
in src/*.py are technical debt. The right place for
one-time investigation output is tests/artifacts/<test>.diag.log
(a log file) or a standalone /tmp/diag_<name>.py script.
If you must instrument production code, the diag lines
are part of the same atomic commit as the fix.
- **Test files ARE allowed to be diagnostic.** The rule
applies to src/*.py only; tests/test_*.py may use
print(..., file=sys.stderr) freely.
Markdown only. No code modified.
Two additions to conductor/workflow.md §"Known Pitfalls":
1. **Isolated-Pass Verification Fallacy (Added 2026-06-09)** —
the rule that a test passing in isolation but failing in
batch is FAILING. The only verification that matters for
live_gui tests is the batch run. This is the flip side of
the existing "Live_gui Test Fragility (Authoring-Side)"
rule. Cross-references that rule.
2. **Process Anti-Patterns (Added 2026-06-09)** — 8-rule
summary list, with cross-reference to AGENTS.md for the
full ruleset. The 8 patterns are: Deduction Loop,
Report-Instead-of-Fix, Scope-Creep Track-Doc,
Inherited-Cruft, Diagnostic Noise in Production, Premature
Surrender, Verbose Commit Message, Isolated-Pass
Verification Fallacy.
Markdown only. No code modified. Cross-references
AGENTS.md (the load-bearing agent doc) for the full text
of each pattern.