Specification: Gemini CLI Headless Integration
Overview
This track integrates the gemini CLI as a headless backend provider for Manual Slop. This allows users to leverage their Gemini subscription and the CLI's advanced features (e.g., specialized sub-agents like codebase_investigator, structured JSON streaming, and robust session management) directly within the Manual Slop GUI.
Goals
- Add "Gemini CLI" as a selectable AI provider in Manual Slop.
- Support both persistent interactive sessions and one-off task-specific delegation (e.g., running gemini investigate).
- Implement a secure "BeforeTool" hook to ensure all CLI-initiated tool calls are intercepted and confirmed via the Manual Slop GUI.
- Capture and display the CLI's visually enriched output (via JSONL stream) within the existing discussion history.
Functional Requirements
1. Gemini CLI Provider Adapter
- Implementation: Create a GeminiCliAdapter class (or extend ai_client.py) that wraps the gemini CLI subprocess.
- Communication: Use --output-format stream-json to receive real-time updates (text chunks, tool calls, status).
- Session Management: Support session persistence by tracking the session ID and passing it to subsequent CLI calls.
- Authentication:
  - Provide a "Login to Gemini CLI" action in the GUI that triggers gemini login.
  - Support passing an API key via environment variables if configured in manual_slop.toml.
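A minimal sketch of the adapter's two core pieces, assuming the CLI emits one JSON object per line on stdout. Only --output-format stream-json comes from this spec; the --prompt and --resume flags, and the names build_command and iter_events, are placeholders for illustration and should be checked against the installed CLI.

```python
import json
from typing import Iterable, Iterator, List, Optional


def build_command(prompt: str, session_id: Optional[str] = None) -> List[str]:
    """Assemble the argv for one CLI turn.

    ASSUMPTION: "--prompt" and "--resume" stand in for however the CLI
    accepts a prompt and a previous session ID; verify before use.
    """
    cmd = ["gemini", "--output-format", "stream-json", "--prompt", prompt]
    if session_id is not None:
        cmd += ["--resume", session_id]
    return cmd


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per JSONL line, skipping blanks and non-JSON noise."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a stray log line mixed into stdout
        if isinstance(event, dict):
            yield event
```

In the adapter, iter_events could be fed the stdout of a subprocess.Popen(build_command(...), stdout=subprocess.PIPE, text=True) process, so each event reaches the GUI as soon as its line is written.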
2. GUI Intercepted Tool Execution
- Mechanism: Use the Gemini CLI's BeforeTool hook.
- Hook Helper: A small Python script, scripts/cli_tool_bridge.py, will be registered as the BeforeTool hook.
- IPC: This bridge script will communicate with Manual Slop's HookServer (extending it to support synchronous "ask" requests).
- Confirmation: When a tool is requested, the bridge blocks until the user confirms/denies the action in the GUI, returning the decision as JSON to the CLI.
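The confirmation flow above might look roughly like this inside scripts/cli_tool_bridge.py. This is a sketch under assumptions: the hook is taken to pass the tool request as JSON on stdin and read the decision from stdout, and the "decision"/"reason" keys are hypothetical. The ask_gui callable stands in for the synchronous "ask" request to HookServer, whose transport this spec does not define.

```python
import json
from typing import Callable, TextIO


def decide(request: dict, ask_gui: Callable[[dict], bool]) -> dict:
    """Block on the GUI and translate its answer into a hook decision.

    ASSUMPTION: the {"decision": ..., "reason": ...} shape is illustrative,
    not the CLI's documented hook schema.
    """
    try:
        approved = ask_gui(request)  # blocks until the user answers
    except Exception as exc:
        # GUI unreachable or crashed: fail closed so no tool runs unconfirmed.
        return {"decision": "deny", "reason": f"bridge error: {exc}"}
    if approved:
        return {"decision": "allow"}
    return {"decision": "deny", "reason": "rejected by user in Manual Slop"}


def run_bridge(stdin: TextIO, stdout: TextIO, ask_gui: Callable[[dict], bool]) -> None:
    """Read one tool request, ask the GUI, write the decision as one JSON line."""
    request = json.load(stdin)
    json.dump(decide(request, ask_gui), stdout)
    stdout.write("\n")
```

Failing closed on bridge errors keeps the acceptance criterion intact: a tool never executes without an explicit approval, even if the GUI is down.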
3. Visual & Telemetry Integration
- Rich Output: Parse the stream-json events to display markdown content and tool status in the GUI.
- Telemetry: Extract and display token usage and latency metrics provided by the CLI's result event.
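As an illustration of the telemetry step, the sketch below pulls usage numbers out of a parsed result event. The field names ("type", "stats", "input_tokens", "output_tokens", "duration_ms") are assumptions made for the example; the real event schema should be taken from the CLI's output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnTelemetry:
    input_tokens: Optional[int]
    output_tokens: Optional[int]
    latency_ms: Optional[int]


def extract_telemetry(event: dict) -> Optional[TurnTelemetry]:
    """Return telemetry for a result event, or None for any other event.

    ASSUMPTION: all keys read here are hypothetical placeholders.
    Missing fields become None rather than raising, so a schema change
    degrades the display instead of breaking the turn.
    """
    if event.get("type") != "result":
        return None
    stats = event.get("stats") or {}
    return TurnTelemetry(
        input_tokens=stats.get("input_tokens"),
        output_tokens=stats.get("output_tokens"),
        latency_ms=stats.get("duration_ms"),
    )
```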
Non-Functional Requirements
- Performance: The subprocess bridge should introduce minimal latency (<100ms overhead for communication).
- Reliability: Gracefully handle CLI crashes or timeouts by reporting errors in the GUI and allowing session resets.
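One way to meet the reliability requirement for one-off delegation is to turn every failure mode into a message the GUI can show. The helper name run_with_timeout and the (ok, message) return shape are illustrative choices, not part of this spec; a long-lived streaming session would need a Popen-based variant.

```python
import subprocess
import sys
from typing import Sequence, Tuple


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> Tuple[bool, str]:
    """Run cmd and return (ok, message) instead of raising.

    On timeout, subprocess.run kills the child before raising, so no
    orphaned CLI process is left behind.
    """
    try:
        proc = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout_s
        )
    except FileNotFoundError:
        return False, f"executable not found: {cmd[0]}"
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s}s"
    if proc.returncode != 0:
        return False, f"exited with code {proc.returncode}: {proc.stderr.strip()}"
    return True, proc.stdout


if __name__ == "__main__":
    # Demo with the current Python interpreter standing in for the CLI.
    print(run_with_timeout([sys.executable, "-c", "print('ok')"], 10))
```

The GUI can display the message on failure and offer a session reset, rather than leaving the discussion in a half-finished state.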
Acceptance Criteria
- User can select "Gemini CLI" in the Provider dropdown.
- User can successfully send messages and receive streamed responses from the CLI.
- Any tool call (PowerShell/MCP) initiated by the CLI triggers the standard Manual Slop confirmation modal.
- Tools only execute after user approval; rejection correctly notifies the CLI agent.
- Session history is maintained correctly across multiple turns when using the CLI provider.
Out of Scope
- Full terminal emulation (ANSI color support) within the GUI; the focus is on structured text and data.
- Migrating existing raw client_api sessions to CLI sessions.