diff --git a/conductor/tracks.md b/conductor/tracks.md
index f2dd595..df89276 100644
--- a/conductor/tracks.md
+++ b/conductor/tracks.md
@@ -35,4 +35,9 @@ This file tracks all major tracks for the project. Each track has its own detail
 
 ---
 
+- [ ] **Track: Make sure gemini cli behavior and feature set have full parity with regular direct gemini api usage in ai_client.py and elsewhere**
+*Link: [./tracks/gemini_cli_parity_20260225/](./tracks/gemini_cli_parity_20260225/)*
+
+---
+
 
diff --git a/conductor/tracks/gemini_cli_parity_20260225/index.md b/conductor/tracks/gemini_cli_parity_20260225/index.md
new file mode 100644
index 0000000..8d720ae
--- /dev/null
+++ b/conductor/tracks/gemini_cli_parity_20260225/index.md
@@ -0,0 +1,5 @@
+# Track gemini_cli_parity_20260225 Context
+
+- [Specification](./spec.md)
+- [Implementation Plan](./plan.md)
+- [Metadata](./metadata.json)
diff --git a/conductor/tracks/gemini_cli_parity_20260225/metadata.json b/conductor/tracks/gemini_cli_parity_20260225/metadata.json
new file mode 100644
index 0000000..3480679
--- /dev/null
+++ b/conductor/tracks/gemini_cli_parity_20260225/metadata.json
@@ -0,0 +1,8 @@
+{
+  "track_id": "gemini_cli_parity_20260225",
+  "type": "feature",
+  "status": "new",
+  "created_at": "2026-02-25T00:00:00Z",
+  "updated_at": "2026-02-25T00:00:00Z",
+  "description": "Make sure gemini cli behavior and feature set have full parity with regular direct gemini api usage in ai_client.py and elsewhere"
+}
\ No newline at end of file
diff --git a/conductor/tracks/gemini_cli_parity_20260225/plan.md b/conductor/tracks/gemini_cli_parity_20260225/plan.md
new file mode 100644
index 0000000..fcb7752
--- /dev/null
+++ b/conductor/tracks/gemini_cli_parity_20260225/plan.md
@@ -0,0 +1,26 @@
+# Implementation Plan: Gemini CLI Parity
+
+## Phase 1: Infrastructure & Common Logic
+- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
+- [ ] Task: Audit `gemini_cli_adapter.py` and `ai_client.py` for parity gaps
+- [ ] Task: Implement common logging utilities for CLI bridge observability
+- [ ] Task: Conductor - User Manual Verification 'Infrastructure & Common Logic' (Protocol in workflow.md)
+
+## Phase 2: Token Counting & Safety Settings
+- [ ] Task: Write failing tests for token estimation in `GeminiCLIAdapter`
+- [ ] Task: Implement token counting parity in `GeminiCLIAdapter`
+- [ ] Task: Write failing tests for safety setting application in `GeminiCLIAdapter`
+- [ ] Task: Implement safety filter application in `GeminiCLIAdapter`
+- [ ] Task: Conductor - User Manual Verification 'Token Counting & Safety Settings' (Protocol in workflow.md)
+
+## Phase 3: Tool Calling Parity & System Instructions
+- [ ] Task: Write failing tests for system instruction usage in `GeminiCLIAdapter`
+- [ ] Task: Implement system instruction propagation in `GeminiCLIAdapter`
+- [ ] Task: Write failing tests for tool call/response mapping in `cli_tool_bridge.py`
+- [ ] Task: Synchronize tool call handling between bridge and `ai_client.py`
+- [ ] Task: Conductor - User Manual Verification 'Tool Calling Parity & System Instructions' (Protocol in workflow.md)
+
+## Phase 4: Final Verification & Performance Diagnostics
+- [ ] Task: Implement automated parity regression tests comparing CLI vs Direct API outputs
+- [ ] Task: Verify bridge latency and error handling robustness
+- [ ] Task: Conductor - User Manual Verification 'Final Verification & Performance Diagnostics' (Protocol in workflow.md)
diff --git a/conductor/tracks/gemini_cli_parity_20260225/spec.md b/conductor/tracks/gemini_cli_parity_20260225/spec.md
new file mode 100644
index 0000000..807928c
--- /dev/null
+++ b/conductor/tracks/gemini_cli_parity_20260225/spec.md
@@ -0,0 +1,27 @@
+# Specification: Gemini CLI Parity
+
+## Overview
+Achieve full functional and behavioral parity between the Gemini CLI integration (`gemini_cli_adapter.py`, `cli_tool_bridge.py`) and the direct Gemini API implementation (`ai_client.py`). This ensures that users leveraging the Gemini CLI as a headless backend provider experience the same level of capability, reliability, and observability as direct API users.
+
+## Functional Requirements
+- **Token Estimation Parity:** Implement accurate token counting for both input and output in the Gemini CLI adapter to match the precision of the direct API.
+- **Safety Settings Parity:** Enable full configuration and enforcement of Gemini safety filters when using the CLI provider.
+- **Tool Calling Parity:** Synchronize tool definition mapping, call handling, and response processing between the CLI bridge and the direct SDK.
+- **System Instructions Parity:** Ensure system prompts and instructions are consistently passed and handled across both providers.
+- **Bridge Robustness:** Enhance the `cli_tool_bridge.py` and adapter to improve latency, error handling (retries), and detailed subprocess observability.
+
+## Non-Functional Requirements
+- **Observability:** Detailed logging of CLI subprocess interactions for debugging.
+- **Performance:** Minimize the overhead introduced by the bridge mechanism.
+- **Maintainability:** Ensure that future changes to `ai_client.py` can be easily mirrored in the CLI adapter.
+
+## Acceptance Criteria
+- [ ] Token counts for identical prompts match within a 5% margin between CLI and Direct API.
+- [ ] Safety settings configured in the GUI are correctly applied to CLI sessions.
+- [ ] Tool calls from the CLI are successfully executed and returned via the bridge without loss of context.
+- [ ] System instructions are correctly utilized by the model when using the CLI.
+- [ ] Automated tests verify that responses and tool execution flows are identical for both providers.
+
+## Out of Scope
+- Performance optimizations for the `gemini` CLI binary itself.
+- Support for non-Gemini CLI providers in this track.
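
Note on the first acceptance criterion in `spec.md` (token counts within a 5% margin): the spec does not say which count is the baseline. A minimal sketch of how the Phase 2 failing tests might express the check, assuming the direct API count is the reference value. The helper name `within_margin` and its signature are hypothetical and do not appear in this change.

```python
def within_margin(cli_tokens: int, direct_tokens: int, margin: float = 0.05) -> bool:
    """Return True when the CLI count is within `margin` of the direct API count.

    Assumption: the direct API count is treated as the reference value;
    the spec only says the two counts must match "within a 5% margin".
    """
    if cli_tokens < 0 or direct_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if direct_tokens == 0:
        # No reference tokens: only an exact match counts as parity.
        return cli_tokens == 0
    # Relative difference measured against the direct API count.
    return abs(cli_tokens - direct_tokens) <= margin * direct_tokens


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from either provider.
    print(within_margin(1040, 1000))
    print(within_margin(1060, 1000))
```

Measuring against the direct API count keeps the check asymmetric on purpose: the CLI adapter is the implementation under test. If the track instead intends a symmetric comparison, the denominator could be the larger of the two counts.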