AI Client Decoupling - Attempted and Reverted
Date: 2026-05-13 Status: REVERTED
Summary
An attempt was made to decouple the AI client library imports from the main GUI application to reduce startup time. The core issue was slow startup due to heavy SDK imports (google.genai, anthropic, chromadb). The decoupling was only partially implemented and ultimately determined to be unnecessary since the actual bottleneck was RAG initialization, not AI SDK imports.
What Was Attempted
1. Created ai_client_stub.py
A lightweight stub module that provides a minimal interface to AI client functionality without importing heavy SDKs. The stub was intended to route all AI calls to a separate AI server process.
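A minimal sketch of what such a stub could look like. The function name `generate` and the docstring are illustrative assumptions, not the project's actual `ai_client` API:

```python
# ai_client_stub.py (illustrative sketch, not the project's actual code):
# expose the same entry points as the real ai_client, but avoid importing
# any heavy SDKs; route the work to a separate AI server process instead.

def generate(prompt: str, model: str = "default") -> str:
    """Forward a generation request to the AI server process.

    The real ai_client would import google.genai / anthropic at this
    point; the stub keeps the import graph lightweight and delegates.
    """
    # In the attempted design this would have been a JSON-RPC call to a
    # spawned subprocess; that transport was never wired up, which is
    # exactly why the decoupling was reverted.
    raise NotImplementedError("AI server transport was never wired up")
```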
2. Module Replacement Pattern in sloppy.py
```python
# Route all ai_client imports to ai_client_stub to avoid loading heavy SDKs
import os
import sys

if os.environ.get("AI_SERVER_ENABLED"):
    from src import ai_client_stub
    sys.modules["src.ai_client"] = ai_client_stub
```
3. Lazy Loading of RAG
Moved rag_engine import from module-level to lazy imports inside functions/setters.
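The lazy-import pattern, sketched here with a hypothetical accessor (`get_rag_engine` and the `module_name` parameter are illustrative; the parameter exists only to keep the sketch testable without the real codebase):

```python
import importlib

_rag_engine = None

def get_rag_engine(module_name: str = "src.rag_engine"):
    """Import the RAG engine on first use instead of at module load.

    Moving the import inside the function means the chromadb import
    chain is only paid when RAG is actually needed, not at startup.
    The result is cached so repeated calls return the same module.
    """
    global _rag_engine
    if _rag_engine is None:
        _rag_engine = importlib.import_module(module_name)
    return _rag_engine
```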
4. Async RAG Initialization
Moved RAG engine initialization to a background thread to prevent blocking the UI during startup.
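A sketch of the pattern, under the assumption that a `threading.Event` (or similar) guards first use; the class and method names are illustrative, not the project's actual `App` implementation:

```python
import threading
import time

class App:
    """Sketch: defer slow RAG initialization to a background thread."""

    def __init__(self):
        self.rag = None
        self._rag_ready = threading.Event()
        # Construction returns immediately; RAG loads in the background.
        threading.Thread(target=self._init_rag, daemon=True).start()

    def _init_rag(self):
        time.sleep(0.2)  # stand-in for the real 5+ second RAG/chromadb load
        self.rag = object()  # stand-in for the initialized engine
        self._rag_ready.set()

    def wait_for_rag(self, timeout=None):
        """Block until RAG is available, e.g. before the first AI query."""
        self._rag_ready.wait(timeout)
        return self.rag
```

Consumers that need RAG must check readiness (or block on the event) rather than assuming the engine exists right after construction.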
What Actually Fixed the Startup Issue
The primary startup bottleneck was RAG initialization (5+ seconds), not AI SDK imports.
Timeline of discovery:
- Initial timing showed ~1.4s for the `app_controller` import
- Further profiling revealed the `rag_engine` → `chromadb` import chain at module level
- Lazy loading of `rag_engine` reduced startup to ~0.4s
- Further profiling showed `init_state()` taking 5+ seconds
- Discovered `models.RAGConfig.from_dict()` was parsing with RAG enabled in config
- Making RAG initialization async reduced App() construction from 5.2s to 0.027s
Why Decoupling Was Not Fully Implemented
- Incomplete module replacement: The `sys.modules["src.ai_client"] = ai_client_stub` approach was fragile and not consistently applied; multiple modules still imported `ai_client` directly.
- AI server never properly utilized: The `ai_client_proxy` and server infrastructure existed but were never integrated. The proxy client was designed to spawn a subprocess and communicate via JSON-RPC, but it was never connected to actual AI calls.
- Wrong diagnosis: The real issue was RAG blocking the event loop, not AI SDK imports. Even if the decoupling had worked fully, it would not have addressed the primary bottleneck.
- Architectural complexity: The decoupling added significant complexity (stub modules, proxy client, server process, IPC mechanism) without proportional benefit.
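For reference, the abandoned proxy design amounted to something like the following sketch. The class name, method shape, and line-delimited JSON framing are assumptions; the real `ai_client_proxy.py` may have differed:

```python
import json
import subprocess

class AIClientProxy:
    """Sketch of a JSON-RPC-over-stdio proxy to a separate AI server process.

    Assumes the server reads one JSON request per line on stdin and
    writes one JSON response per line on stdout.
    """

    def __init__(self, server_cmd):
        self.proc = subprocess.Popen(
            server_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )
        self._id = 0

    def call(self, method: str, **params):
        """Send one request and block for its response."""
        self._id += 1
        request = {"jsonrpc": "2.0", "id": self._id,
                   "method": method, "params": params}
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        response = json.loads(self.proc.stdout.readline())
        return response.get("result")
```

Even in sketch form, the moving parts (subprocess lifecycle, framing, error propagation) illustrate why this was judged disproportionate to the benefit.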
Files Modified During Attempt
Created:
- `src/ai_client_stub.py`: Lightweight stub module

Modified:
- `sloppy.py`: Added AI_SERVER_ENABLED routing
- `src/app_controller.py`: Lazy rag_engine import, async RAG init
Files That Were Deleted
The following files were created during the attempt and subsequently deleted:
| File | Size | Purpose | Deleted |
|---|---|---|---|
| `src/ai_client_stub.py` | 14KB | Lightweight stub module for routing AI calls | ✅ (commit b2fdca0c) |
| `src/ai_client_proxy.py` | 3.6KB | Proxy client for spawning AI server subprocess | ✅ (commit b2fdca0c) |
Cleanup Commits
Commit b2fdca0c (2026-05-13)
remove(ai_client): delete unused stub and proxy files
Deleted:
- src/ai_client_stub.py
- src/ai_client_proxy.py
Fixed test imports to use ai_client instead of ai_client_stub.
Commit 4025a713 (2026-05-13)
revert(ai_client): remove incomplete decoupling, restore clean startup
The AI client decoupling was never properly implemented and added
unnecessary complexity. The actual startup bottleneck was RAG initialization
which is now handled via async initialization.
Report written to docs/reports/ai_decoupling_revert_report.md
Current State
After reverting the decoupling attempt:
| Metric | Time |
|---|---|
| App class load | 0.4s |
| App() construction | 0.027s |
| RAG initialization | Async (background thread) |
The application now starts quickly with RAG loading in the background.
Recommendations (Completed)
- ✅ Cleanup done: `ai_client_stub.py` and `ai_client_proxy.py` deleted, `sloppy.py` restored
- If an AI server is needed in the future: implement it properly as a separate concern, not as a module replacement hack
- Keep async RAG init: the background thread for RAG is a good pattern and should remain
- Profile before optimizing: the lesson learned is to profile before attempting architectural changes
Lessons Learned
- Measure first, optimize second - the actual bottleneck was discovered through profiling, not assumption
- Architectural changes should solve actual problems, not anticipated ones
- Partial decoupling is worse than no decoupling - it adds complexity without benefits
- The simplest fix is often correct - lazy imports and async initialization solved the problem without architectural overhaul