AI Client Decoupling - Attempted and Reverted

Date: 2026-05-13 Status: REVERTED

Summary

An attempt was made to decouple the AI client library imports from the main GUI application to reduce startup time. The core issue was slow startup due to heavy SDK imports (google.genai, anthropic, chromadb). The decoupling was only partially implemented and ultimately determined to be unnecessary since the actual bottleneck was RAG initialization, not AI SDK imports.

What Was Attempted

1. Created ai_client_stub.py

A lightweight stub module that provides a minimal interface to AI client functionality without importing heavy SDKs. The stub was intended to route all AI calls to a separate AI server process.
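A minimal sketch of what such a stub might have looked like (the real file was deleted; the `generate` function name and its signature are illustrative assumptions, not the actual interface):

```python
# Illustrative sketch of ai_client_stub.py (names are assumed).
# Importing this module pulls in none of the heavy SDKs
# (google.genai, anthropic, chromadb).

def generate(prompt: str, model: str = "default") -> str:
    """Stand-in for an AI call.

    The real stub was meant to forward the request to a separate
    AI server process; that routing was never wired up, so calls
    through the stub could never actually succeed.
    """
    raise NotImplementedError("AI server routing was never wired up")
```

The key property is that importing the stub is nearly free, which is what made it attractive as a drop-in replacement for the heavy `ai_client` module.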

2. Module Replacement Pattern in sloppy.py

# Route all ai_client imports to ai_client_stub to avoid loading heavy SDKs
import os
import sys

if os.environ.get("AI_SERVER_ENABLED"):
    from src import ai_client_stub
    sys.modules["src.ai_client"] = ai_client_stub

3. Lazy Loading of RAG

Moved rag_engine import from module-level to lazy imports inside functions/setters.
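The general shape of that lazy-import pattern, sketched with a cached accessor (here `json` stands in for `src.rag_engine`, since the real module is not available outside the project):

```python
import importlib

_rag_engine = None

def get_rag_engine(module_name="json"):
    """Import the heavy module only on first use, then cache it.

    "json" is a stand-in for src.rag_engine: the point is that the
    expensive import chain (rag_engine -> chromadb) is deferred until
    a caller actually needs RAG, instead of being paid at startup.
    """
    global _rag_engine
    if _rag_engine is None:
        _rag_engine = importlib.import_module(module_name)
    return _rag_engine
```

Callers that previously referenced the module-level import go through the accessor instead, so application startup never touches the heavy dependency chain.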

4. Async RAG Initialization

Moved RAG engine initialization to a background thread to prevent blocking the UI during startup.
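A minimal sketch of that pattern, assuming hypothetical class and attribute names (the short `sleep` stands in for the 5+ second RAG setup):

```python
import threading
import time

class App:
    """Sketch of async RAG initialization; names are illustrative."""

    def __init__(self):
        self.rag = None
        self._rag_ready = threading.Event()
        # Start the slow setup in the background so __init__ returns
        # immediately and the UI can come up right away.
        threading.Thread(target=self._init_rag, daemon=True).start()

    def _init_rag(self):
        time.sleep(0.05)  # stand-in for the 5+ second RAG engine setup
        self.rag = "rag-engine"
        self._rag_ready.set()

    def wait_for_rag(self, timeout=None):
        """Block only when a caller actually needs the RAG engine."""
        self._rag_ready.wait(timeout)
        return self.rag
```

Construction returns immediately; only code paths that genuinely need RAG pay the wait, which matches the 5.2s-to-0.027s improvement reported below.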

What Actually Fixed the Startup Issue

The primary startup bottleneck was RAG initialization (5+ seconds), not AI SDK imports.

Timeline of discovery:

  1. Initial timing showed ~1.4s for app_controller import
  2. Further profiling revealed the rag_engine → chromadb import chain at module level
  3. Lazy loading of rag_engine reduced startup to ~0.4s
  4. Further profiling showed init_state() taking 5+ seconds
  5. Discovered models.RAGConfig.from_dict() parsed the config with RAG enabled, triggering the full blocking initialization
  6. Making RAG initialization async reduced App() construction from 5.2s to 0.027s
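Measurements like those in the timeline above can be reproduced with a small `time.perf_counter` wrapper (the helper below is a sketch; the labels mirror the report's steps):

```python
import time

def timed(label, fn, *args, **kwargs):
    """Time one startup step and print the elapsed seconds.

    Steps like "import app_controller" or "App() construction"
    from the timeline can each be measured this way.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return result

# Import-level hotspots (like the rag_engine -> chromadb chain) are
# easier to find with CPython's built-in flag:
#   python -X importtime -c "import src.app_controller"
```

`python -X importtime` prints a cumulative-time tree of every import, which is how module-level import chains like this one are usually pinpointed.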

Why Decoupling Was Not Fully Implemented

  1. Incomplete module replacement: The sys.modules["src.ai_client"] = ai_client_stub approach was fragile and not consistently applied. Multiple modules still imported ai_client directly.

  2. AI Server never properly utilized: The ai_client_proxy and server infrastructure existed but was never properly integrated. The proxy client was designed to spawn a subprocess and communicate via JSON-RPC, but this was never connected to actual AI calls.

  3. Wrong diagnosis: The real issue was RAG blocking the event loop, not AI SDK imports. Even if decoupling worked fully, it wouldn't have addressed the primary bottleneck.

  4. Architectural complexity: The decoupling added significant complexity (stub modules, proxy client, server process, IPC mechanism) without proportional benefit.
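For reference, the proxy idea from point 2 — spawning a server subprocess and exchanging JSON-RPC messages over its pipes — might be sketched as follows (the real src/ai_client_proxy.py was deleted, so every name here is an assumption):

```python
import json
import subprocess

class AIClientProxy:
    """Sketch: spawn a server subprocess and exchange newline-delimited
    JSON-RPC messages over its stdin/stdout pipes."""

    def __init__(self, server_cmd):
        self.proc = subprocess.Popen(
            server_cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        self._id = 0

    def call(self, method, params):
        """Send one request and block for its response line."""
        self._id += 1
        request = {"jsonrpc": "2.0", "id": self._id,
                   "method": method, "params": params}
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())
```

Even in this minimal form, the moving parts (process lifetime, pipe framing, error propagation) illustrate the complexity the report cites as disproportionate to the benefit.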

Files Modified During Attempt

Created

  • src/ai_client_stub.py - Lightweight stub module

Modified

  • sloppy.py - Added AI_SERVER_ENABLED routing
  • src/app_controller.py - Lazy rag_engine import, async RAG init

Files That Were Deleted

The following files were created during the attempt and subsequently deleted:

  • src/ai_client_stub.py (14KB) - Lightweight stub module for routing AI calls - deleted in commit b2fdca0c
  • src/ai_client_proxy.py (3.6KB) - Proxy client for spawning AI server subprocess - deleted in commit b2fdca0c

Cleanup Commits

Commit b2fdca0c (2026-05-13)

remove(ai_client): delete unused stub and proxy files

Deleted:
- src/ai_client_stub.py
- src/ai_client_proxy.py

Fixed test imports to use ai_client instead of ai_client_stub.

Commit 4025a713 (2026-05-13)

revert(ai_client): remove incomplete decoupling, restore clean startup

The AI client decoupling was never properly implemented and added
unnecessary complexity. The actual startup bottleneck was RAG initialization
which is now handled via async initialization.

Report written to docs/reports/ai_decoupling_revert_report.md

Current State

After reverting the decoupling attempt:

  • App class load: 0.4s
  • App() construction: 0.027s
  • RAG initialization: async (background thread)

The application now starts quickly with RAG loading in the background.

Recommendations (Completed)

  1. Cleanup done: ai_client_stub.py and ai_client_proxy.py deleted; sloppy.py restored

  2. If AI server is needed in future: Implement properly as a separate concern, not as a module replacement hack

  3. Keep async RAG init: The background thread for RAG is a good pattern and should remain

  4. Profile before optimizing: The lesson learned is to profile before attempting architectural changes

Lessons Learned

  1. Measure first, optimize second - the actual bottleneck was discovered through profiling, not assumption
  2. Architectural changes should solve actual problems, not anticipated ones
  3. Partial decoupling is worse than no decoupling - it adds complexity without benefits
  4. The simplest fix is often correct - lazy imports and async initialization solved the problem without architectural overhaul