Before this change, app_controller imported rag_engine at module level, which pulled in chromadb (~0.45s) on every startup. Now rag_engine is imported only when RAG is enabled and actually needed, so startup no longer pays that import cost when RAG is unused.
Also fix the patches in test_discussion_takes_gui.py to use ai_client_stub.
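
For reference, the deferred-import pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual app_controller code: the AppController class, the rag_enabled flag, the get_rag_engine method, and the RAG_MODULE attribute are all hypothetical names, and the stdlib module colorsys stands in for rag_engine so the sketch runs without chromadb installed.

```python
import importlib


class AppController:
    """Hypothetical controller; the real app_controller may look different."""

    # Stand-in for "rag_engine" so this sketch runs without chromadb.
    RAG_MODULE = "colorsys"

    def __init__(self, rag_enabled: bool):
        self.rag_enabled = rag_enabled
        self._rag_engine = None  # nothing heavy is imported at startup

    def get_rag_engine(self):
        """Import the RAG module on first use, and only if RAG is enabled."""
        if not self.rag_enabled:
            return None
        if self._rag_engine is None:
            # The import cost is paid here, once, instead of at module load.
            self._rag_engine = importlib.import_module(self.RAG_MODULE)
        return self._rag_engine
```

One consequence of this pattern is that tests patching the module need to target the place where it is looked up at call time rather than a module-level name in the controller.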