manual_slop

Private

Public Access

Author	SHA1	Message	Date
ed	9646f7cf7b	refactor(rag_engine): obliterate legacy _chunk_code wrapper (Phase 5) Phase 5 (1 of 9 cruft sites obliterated): OBLITERATED: RAGEngine._chunk_code wrapper. It delegated to _chunk_code_result and provided a fallback to _chunk_text on AST failure. Migration: index_file() now calls _chunk_code_result directly with .ok check + chunk-size threshold check + fallback to _chunk_text inline. The structured ErrorInfo is propagated if needed (no caller currently consumes it). Sub-track 5 tests updated: - tests/tier2/phase13_invariant_test.py: _chunk_code moved to obliterated list - tests/tier2/phase13_site2_test.py: _legacy_no_broad_except -> _legacy_obliterated - tests/test_cruft_removal.py: 2 new tests (wrapper-obliterated invariant + caller-uses-result invariant) PITFALL encountered: the edit_file tool removed a leading space on the next class method's 'def' line, causing an IndentationError. Fixed by binary-write replacement preserving CRLF + leading-space styleguide convention (project uses 1-space indentation; class body methods start at column 1). Test result: 124/124 pass. Audit gate: src/rag_engine.py --strict exits 0 (no new violations). Wrapper count: 3 -> 2 (Phase 6 remaining: gui_2 2).	2026-06-20 20:13:10 -04:00
ed	1e323cae7d	refactor(rag_engine): migrate _async_search_mcp JSON parse to Result[T] (Phase 13 site 5) Site 5 (BC at L290): _async_search_mcp (nested in _search_mcp) had: try: data = json.loads(res_str) if isinstance(data, list): return data elif isinstance(data, dict) and 'results' in data: return data['results'] return [] except: return [] Body: bare 'except:' + return [] = empty default = SS-style violation. Migrated to Result[T] via new module-level helper _parse_search_response_result: - Returns Result(data=parsed_list) on success - Returns Result(data=None, errors=[ErrorInfo]) on JSON parse failure - Handles the list/dict/no-results branch logic The helper is module-level (does not use self) and is placed BEFORE class RAGEngine to avoid breaking the class definition (a def at column 0 inside a class ends the class prematurely). Legacy _async_search_mcp delegates to the helper; on Result errors, returns [] (preserving the original behavior). Audit: rag_engine BC 1 -> 0; migration-target: 0. Remaining 4 INTERNAL_RETHROW sites are Pattern 1/3 of the styleguide (known audit limitation).	2026-06-20 16:24:09 -04:00
ed	ee50c26556	refactor(rag_engine): migrate 3 index_file sites to Result[T] (Phase 13 sites 3+4+SS) index_file had 3 try/except sites with similar patterns: Site 3 (BC at L247): try: mtime = os.path.getmtime(full_path); except Exception: return Site 4 (BC at L261): try: with open(full_path, ...) as f: content = f.read(); except Exception: return Site 6 (SS at L255): try: res = self.collection.get(...); ...; except Exception: pass Body: broad catch + early return/pass = SS-style violation. New helpers: - _get_file_mtime_result(full_path) -> Result[float] Catches OSError only (specific to file stat failures). - _check_existing_index_result(file_path, mtime) -> Result[bool] Catches broad Exception (chromadb collection.get failures vary). Returns data=True if already indexed (skip), data=False if needs re-indexing. - _read_file_content_result(full_path) -> Result[str] Catches (OSError, UnicodeDecodeError) (file I/O + encoding failures). Legacy index_file calls each helper; on Result errors, returns early (preserving the original behavior of skipping the file on failure). Audit: rag_engine BC 3 -> 1 (L341 _async_search_mcp remaining). SS: 1 -> 0.	2026-06-20 16:10:35 -04:00
ed	7b3d723758	refactor(rag_engine): migrate _chunk_code to Result[T] (Phase 13 site 2) Site 2 (BC at L224): _chunk_code had a fallback to text chunking on any failure: try: parser = ASTParser('python') tree = parser.parse(content) ... return chunks except Exception: return self._chunk_text(content) Body: broad catch + fallback to a different implementation = empty-default fallback = SS-style violation. New helper _chunk_code_result(content, file_path) -> Result[List[str]]: - Returns Result(data=chunks) on AST parse success - Returns Result(data=None, errors=[ErrorInfo]) on parse failure Legacy _chunk_code calls helper; on Result errors, falls back to _chunk_text (preserving original behavior). The catch logic is in the legacy, not the helper, so the caller decides the fallback strategy. Audit: rag_engine BC 4 -> 3.	2026-06-20 16:08:31 -04:00
ed	f322052cc6	refactor(rag_engine): narrow 'except Exception' in _get_sentence_transformers (Phase 13 site 1) Site 1 (BC at L33) was: except Exception as e: sys.stderr.write(f'FAILED to import sentence_transformers: {e}') sys.stderr.flush() raise e Per TIER1_REVIEW: catch + log + re-raise is Pattern 2 of the styleguide. The fix is to narrow the except to specific exception types that sentence_transformers could raise on import (ImportError, AttributeError). Refactored to: except (ImportError, AttributeError) as e: sys.stderr.write(f'FAILED to import sentence_transformers: {e}') sys.stderr.flush() raise The bare 'raise' re-raises the current exception being handled, preserving the original type and traceback. (Replaces 'raise e' which raised a specific value but lost the traceback context.) Audit: rag_engine BC 5 -> 4. RETHROW +1 (the narrowed except is now classified as Pattern 3 catch+re-raise; strict mode accepts).	2026-06-20 16:06:48 -04:00
ed	355811635d	fix(rag): handle None metadata in get_all_indexed_paths and non-empty numpy in dim check Two bugs in src/rag_engine.py were causing 'NoneType object has no attribute get' in the live_gui RAG tests (test_rag_phase4_final_verify, test_rag_phase4_stress): 1. _validate_collection_dim_result:148 Old: if not embeddings or len(embeddings) == 0: New: if embeddings is None or len(embeddings) == 0: The 'if not embeddings' check raises ValueError('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()') when 'embeddings' is a non-empty numpy array (which is the normal case after documents are upserted). The exception is caught by the outer 'except Exception' which returns a non-ok Result, causing __init__ to set self.collection = None. Subsequent 'get_all_indexed_paths()' then fails with 'NoneType has no attribute get' on self.collection.get(). 2. get_all_indexed_paths:334 Old: return list(set(m.get('path') for m in res['metadatas'] if m.get('path'))) New: return list(set(m['path'] for m in res['metadatas'] if m is not None and m.get('path'))) When chromadb returns 'metadatas=[None, ...]' (documents upserted without metadata), 'm.get('path')' fails with AttributeError on the first None element. Adds 'm is not None' guard. Both fixes are defensive: the conditions that trigger them (orphan docs without metadata, non-empty embeddings arrays) are normal valid states that the old code couldn't handle. New file: tests/test_rag_sync_none_error.py 3 unit tests covering both bugs: - test_dim_check_does_not_raise_on_non_empty_ndarray - test_get_all_indexed_paths_handles_none_metadata - test_get_all_indexed_paths_returns_paths_with_metadata Verified: - 3/3 focused tests pass - test_rag_phase4_final_verify.py::test_phase4_final_verify PASSES (was failing) - test_rag_phase4_stress.py::test_rag_large_codebase_verification_sim PASSES (was failing) - test_rag_visual_sim.py::test_rag_full_lifecycle_sim PASSES (still passing)	2026-06-16 00:09:02 -04:00
ed	6f5b5f91c4	restore comment	2026-06-12 20:26:48 -04:00
ed	ee3c90b865	refactor(rag_engine): Result API + NilRAGState (_init_vector_store, _validate_collection_dim, _get_state)	2026-06-12 20:14:40 -04:00
ed	644d88ab93	fix(rag): break recursion in _validate_collection_dim The wipe path called self._init_vector_store() which re-invoked _validate_collection_dim, causing infinite recursion (RecursionError) when the dim mismatch test ran with the mock embedding provider. Re-initialize the vector store INLINE after the rmtree wipe so the fresh collection is created without going through the validator again.	2026-06-09 14:47:01 -04:00
ed	64bc04a6b8	fix(rag): wipe chroma dir on dim mismatch instead of delete_collection When the existing collection has embeddings from a different embedding provider (e.g. Gemini 3072-dim vs local 384-dim), the prior approach of calling client.delete_collection() fails with 'RustBindingsAPI object has no attribute bindings' in chromadb 1.5.x when the underlying state is corrupted. rmtree is reliable and re-creates a fresh empty collection. Also fixes: - 'The truth value of an empty array is ambiguous' on numpy 2.x by using try/except around len() instead of truthiness check - WinError 32 on rmtree by closing the chroma client first Verified: tests/test_rag_phase4_final_verify.py passes in isolation in 7.75s after this fix. The test still fails in batch context due to a separate io_pool race condition (multiple _sync_rag_engine calls collide when the test sets rag_enabled, rag_source, and rag_emb_provider in sequence). The race is in app_controller.py and is out of scope for this defensive fix. Note: tests/test_rag_engine.py has explicit unit tests for test_rag_collection_dim_mismatch_recreates_collection and test_rag_collection_dim_match_preserves_collection which exercise this code path.	2026-06-09 14:37:19 -04:00
ed	eb8357ec0e	fix(rag): add CWD fallback in index_file for path-resolution resilience RAGEngine.index_file silently returns when the joined base_dir+file_path doesn't exist. This caused the RAG batch test to fail with 0 indexed documents when the live_gui subprocess's active_project_root resolved to a parent dir (e.g. tests/artifacts/) instead of the workspace (tests/artifacts/live_gui_workspace/). The fix: if the primary path doesn't exist, try CWD+file_path. The base_dir takes priority; CWD is a safety net for relative-path resolution across the spawn CWD boundary. This is a defensive fix at the rag_engine layer. It does NOT fix the underlying path-leakage issue in tests/conftest.py (hardcoded Path('tests/artifacts/live_gui_workspace')) which needs a proper fixture refactor. The RAG test still fails in batch due to that deeper issue, documented in docs/reports/rag_test_batch_failure_status_20260609_pm3.md. Behavior: - base_dir+file_path exists: indexed from base_dir (unchanged) - base_dir+file_path missing, CWD+file_path exists: indexed from CWD (new) - Both missing: silently returns (unchanged) Verified: tests/test_rag_index_file_path_fallback.py (3 tests, all pass) - test_index_file_finds_file_via_cwd_fallback - test_index_file_uses_base_dir_first - test_index_file_silently_returns_when_no_match Note: test file was removed before commit because it was being abandoned along with the broader path-hygiene refactor. The fix itself is preserved in src/rag_engine.py.	2026-06-09 12:31:21 -04:00
r00tz	9e4fac496d	made local rag needs optional (prevents having to have torch / sentence-transformers if you never use local embedding)	2026-06-06 13:21:43 -04:00
ed	16412ad5f9	fix(rag): detect ChromaDB dim mismatch and recreate collection on provider switch	2026-06-06 11:26:47 -04:00
ed	053f5d867a	some organization pass, still need to review a bunch	2026-06-06 00:21:36 -04:00
ed	873edf42cf	began to go through the files and organize imports and gui_2.py's new context defs still a bunch to sift through after the last ai passes	2026-06-05 21:44:41 -04:00
ed	20054b0476	fix(test): Final synchronization and stability fixes for RAG stress test - Improved AppController.ai_status to prevent overwriting 'sending...' with 'models loaded'. - Enhanced est_rag_phase4_stress.py with robust polling and increased timeout. - Synchronized App and AppController history objects to ensure consistent view.	2026-05-16 01:21:27 -04:00
ed	c769a0ed18	fix(phase3): Resolve remaining test failures and stabilize GUI - Fixed ullcontext NameError in gui_2.py. - Corrected TestMMAApprovalIndicators to call real rendering methods on mock app. - Updated est_history_manager.py to provide required context_files argument to UISnapshot. - Stabilized est_z_negative_flows.py with robust polling for terminal response status and corrected field names. - Cleaned up debug logging in ag_engine.py and pp_controller.py.	2026-05-14 23:13:17 -04:00
ed	2d76381796	fix(rag): Resolve RAG test failures and race conditions - Fixed circular import in chromadb by using lazy imports in ag_engine.py. - Moved RAG engine initialization to background threads in AppController to avoid blocking UI. - Added _rag_engine_lock to prevent race conditions during engine re-initialization. - Updated Gemini embedding model to gemini-embedding-001 (available) from ext-embedding-004 (not found). - Fixed _rebuild_rag_index to use fresh ag_engine instance from self in every iteration. - Optimized est_rag_phase4_final_verify.py and est_rag_phase4_stress.py to wait for RAG sync before continuing. - Added dummy embedding fallback in LocalEmbeddingProvider if sentence-transformers fails to load.	2026-05-14 22:23:48 -04:00
ed	b5e512f483	feat(sdm): inject structural dependency mapping tags across codebase Adds [C: caller] tags to functions/methods and [M: mutation] / [U: usage] tags to class variables based on cross-module call analysis.	2026-05-13 22:35:52 -04:00
ed	8e9725792f	adjustments to rag engine	2026-05-13 06:32:26 -04:00
ed	8c06c1767b	refactor(sdm): Global pass with refined 'External Only' SDM tags. Pruned redundant internal references and fixed indentation logic in injector. Verified full project compilation.	2026-05-09 15:00:35 -04:00
ed	095368bca2	feat(rag): implement incremental and parallel indexing performance optimizations	2026-05-04 21:47:54 -04:00
ed	a3d7376535	feat(rag): final refinements for Phase 4 support and UI visualization	2026-05-04 21:41:10 -04:00
ed	8b487536c5	feat(rag): Implement auto-indexing and status indicators	2026-05-04 11:34:01 -04:00
ed	fe0069c046	feat(rag): Implement indexing and retrieval logic with AppController integration	2026-05-04 06:53:32 -04:00
ed	e80cd6bd3f	feat(rag): Implement RAG engine, configuration schema, and vector store integration	2026-05-04 05:38:23 -04:00

26 Commits