Private

Public Access

Files

T

conductor-tier2 161ebb0da6 docs(fix): correct nav link case + relative-path level

Gitea (and any case-sensitive filesystem) was rendering the [Top]
nav links in /docs as broken because of two bugs:

1. Case-sensitivity: 22 links used '../README.md' (all-uppercase)
   but the actual file is 'docs/Readme.md' (capital R, lowercase
   rest). 21 guide_*.md nav bars were affected, plus 1 internal
   cross-link in Readme.md itself. Works on Windows (case-
   insensitive) but broken on Linux/Gitea.

   Fix: 22 occurrences across 22 files changed
   '../README.md' -> '../Readme.md'

2. Wrong relative-path level: 16 links used '../../conductor/...'
   from 'docs/guide_*.md' to reach 'conductor/'. This goes up 2
   levels to 'projects/', which doesn't exist. The correct path
   from 'docs/guide_*.md' to 'conductor/' is 1 level up
   ('../conductor/...'). 12 unique patterns across 10 files
   affected.

   Fix: 16 occurrences across 10 files changed
   '../../conductor/' -> '../conductor/'

3. Bonus: 1 planned-guide link in guide_context_curation.md
   referenced a never-written 'guide_context_presets.md'. The
   ContextPreset schema is now fully covered in the new
   'guide_context_aggregation.md' (per the 2026-06-08 docs
   refresh). Fix: link target updated.

No content was changed, only link paths. 24 files, 37 link
replacements, 37 deletions.

Verification:
- All .md links in docs/ now resolve to existing files
  (validated by path-resolution check from each file's directory)
- The 3 new guides from the previous docs refresh commit
  (guide_discussions.md, guide_state_lifecycle.md,
  guide_context_aggregation.md) had the case bug inherited from
  guide_architecture.md's existing nav pattern; their top-of-file
  nav bars are now correct
- The 21 pre-existing guide nav bars that had the same bug
  (all 21 of them, except the 3 that used the correct case:
  guide_mma.md, guide_simulations.md, guide_tools.md) are now
  also fixed
- Inter-guide links (e.g. [Discussions](guide_discussions.md))
  were not affected; they were always correct because both the
  link text and the actual filename are lowercase

This is a docs-only fix. No code modified.

2026-06-08 19:51:55 -04:00

15 KiB

Raw Blame History

`src/ai_client.py` — Multi-Provider LLM Abstraction

Top | Architecture | Testing | MMA

Overview

src/ai_client.py (~116KB) is the unified LLM client for 5 providers. It abstracts the differences between providers (Gemini, Anthropic, DeepSeek, MiniMax, Gemini CLI) behind a single send() function.

The module is a stateful singleton — all provider state is held in module-level globals. There is no class wrapping; the module itself is the abstraction layer.

Architecture

┌─────────────────────────────────────────────────┐
│ ai_client.send(md_content, user_message, ...)    │
│                                                 │
│ 1. _send_lock.acquire() — serialize all calls   │
│ 2. Read _provider / _model                       │
│ 3. Route to provider-specific _send_<provider>() │
│ 4. Return str response                           │
└─────────────────┬───────────────────────────────┘
                  │ dispatches based on _provider
                  ▼
   ┌────────┬─────────┬────────┬──────────┐
   ▼        ▼         ▼        ▼          ▼
_gemini  _anthropic _deepseek _minimax  _gemini_cli
                                              (subprocess)

State

All state is module-level globals. The most important:

Variable	Type	Purpose
`_provider: str`	`"gemini" \| "anthropic" \| "deepseek" \| "minimax" \| "gemini_cli"`	Active provider
`_model: str`	`str`	Active model name
`_temperature: float`	`0.0`	Sampling temperature
`_top_p: float`	`1.0`	Nucleus sampling
`_max_tokens: int`	`8192`	Output token cap
`_history_trunc_limit: int`	`8000`	Char limit for truncating old tool outputs
`_send_lock`	`threading.Lock`	Serializes all send() calls
`_current_palette: str`	theme	Last-applied theme palette

Per-Provider State

_gemini_client: Optional[genai.Client] = None
_gemini_chat: Any = None
_gemini_cache: Any = None
_gemini_cache_md_hash: Optional[str] = None
_gemini_cache_created_at: Optional[float] = None
_gemini_cached_file_paths: list[str] = []

_anthropic_client: Optional[anthropic.Anthropic] = None
_anthropic_history: list[dict] = []
_anthropic_history_lock: threading.Lock = threading.Lock()

_deepseek_client: Any = None
_deepseek_history: list[dict] = []
_deepseek_history_lock: threading.Lock = threading.Lock()

_minimax_client: Any = None
_minimax_history: list[dict] = []
_minimax_history_lock: threading.Lock = threading.Lock()

_gemini_cli_adapter: Optional[GeminiCliAdapter] = None

The Public API

`send(...)` — The Main Entry Point

def send(
    md_content: str,
    user_message: str,
    base_dir: str = ".",
    file_items: list[dict] | None = None,
    discussion_history: str = "",
    stream: bool = False,
    pre_tool_callback: Optional[Callable] = None,
    qa_callback: Optional[Callable] = None,
    enable_tools: bool = True,
    stream_callback: Optional[Callable] = None,
    patch_callback: Optional[Callable] = None,
    rag_engine: Optional[Any] = None,
) -> str:

Returns the model's response as a string. All provider calls go through here.

Parameters:

md_content — the system prompt + context (markdown)
user_message — the user's message
base_dir — for MCP tool filesystem operations
file_items — files in the context (deprecated path; usually empty)
discussion_history — legacy parameter
stream / stream_callback — for streaming responses
pre_tool_callback — called before each tool execution (HITL gate)
qa_callback — called when an error occurs (Tier 4 integration)
enable_tools — whether to enable PowerShell + MCP tools
patch_callback — Tier 4 patch generation hook
rag_engine — optional RAG engine for context augmentation

Provider Switching

from src import ai_client
ai_client.set_provider("gemini", "gemini-3-flash-preview")
ai_client.set_provider("anthropic", "claude-3-5-sonnet-latest")
ai_client.set_provider("deepseek", "deepseek-chat")
ai_client.set_provider("minimax", "grok-2-latest")
ai_client.set_provider("gemini_cli", "gemini-2.0-flash")

Parameter Setters

ai_client.set_model_params(temp=0.7, max_tok=4096, top_p=0.9, trunc_limit=4000)

Session Management

ai_client.reset_session()  # Clears all provider state, history, cache

Event Hooks

from src import ai_client

# Confirmation hook (called before destructive tool execution)
ai_client.confirm_and_run_callback = my_gui_callback

# Comms log hook (called on every API call)
ai_client.comms_log_callback = my_logging_callback

# Tool log hook (called on every tool completion)
ai_client.tool_log_callback = my_tool_logging_callback

# Event emitter (for any subscriber)
ai_client.events.on("my_event", my_handler)

Comms Log

ai_client._append_comms(direction, kind, payload)  # Add entry
ai_client.get_comms_log()  # Read all
ai_client.clear_comms_log()  # Clear
ai_client.get_token_stats(md_content)  # Estimate token usage

Provider Error Taxonomy

class ProviderError(Exception):
    kind: str  # "quota" | "rate_limit" | "auth" | "balance" | "network" | "unknown"
    provider: str
    original: Exception

    def ui_message(self) -> str:
        """Returns a user-friendly error message."""

ProviderError is raised by provider-specific _send_* functions on failure. The caller (typically app_controller.py) catches it and surfaces the error to the user via app.ai_status.

The Tool-Call Loop

All providers follow the same high-level pattern in _send_*:

def _send_<provider>(md_content, user_message, ...):
    for round in range(MAX_TOOL_ROUNDS + 2):  # up to 10 rounds
        response = provider_api_call(md_content, user_message, history, tools)
        comms_log(direction="IN", kind="response", payload=response)

        if not has_function_calls(response):
            return extract_text(response)

        for call in response.function_calls:
            if pre_tool_callback and pre_tool_callback(...) is rejected:
                return rejection_message
            tool_result = dispatch(call.name, call.args, base_dir)
            append_tool_result_to_history(call, tool_result)

        # Context refresh: re-read all tracked files (mtime check)
        _reread_file_items(file_items)

        # Truncate tool outputs at _history_trunc_limit
        truncate_tool_outputs(history)

        # Cumulative byte check
        if cumulative_tool_bytes > 500_000:
            inject_warning()

    return final_response

The constants:

MAX_TOOL_ROUNDS: int = 10 — max tool-call iterations per send()
_MAX_TOOL_OUTPUT_BYTES: int = 500_000 — cumulative tool output budget
_ANTHROPIC_CHUNK_SIZE: int = 120_000 — chars per Anthropic system text block
_ANTHROPIC_MAX_PROMPT_TOKENS: int = 180_000 — Anthropic prompt limit (200K minus headroom)
_GEMINI_MAX_INPUT_TOKENS: int = 900_000 — Gemini 1M window minus headroom

Provider-Specific Behaviors

Gemini (SDK)

Server-side cache: genai.CachedContent with TTL management
Cache rebuild at 90% TTL: proactive renewal
Cache hash: tracks content hash for invalidation
Cached file paths: tracks which files are in the active cache

Anthropic

Ephemeral prompt caching: 4 cache_control: ephemeral breakpoints
Breakpoints: system prompt, context chunks, tool def, conversation prefix
History trimming at 180K tokens: 2-phase (strip stale file refreshes, then drop turn pairs)
History repair: _repair_anthropic_history handles tool_result chain breaks

DeepSeek

Raw HTTP: uses requests.post directly (no SDK)
Streaming: supports streaming responses
History repair: _repair_deepseek_history for tool result chains

MiniMax

OpenAI-compatible endpoint: uses the openai SDK
History trimming: similar to Anthropic (drop turn pairs at threshold)
History repair: _repair_minimax_history

Gemini CLI

Subprocess adapter: GeminiCliAdapter in src/gemini_cli_adapter.py
Persistent session: CLI maintains its own session ID
JSONL output protocol: parses streaming JSONL from the CLI subprocess
Full feature parity: tool calls, streaming, usage metadata

History Trimming Strategies

Gemini (40% threshold)

if total_in > _GEMINI_MAX_INPUT_TOKENS * 0.4:
    while len(hist) > 4 and total_in > _GEMINI_MAX_INPUT_TOKENS * 0.3:
        hist.pop(0)  # Assistant
        hist.pop(0)  # User

Anthropic (180K limit)

_trim_anthropic_history(system_blocks, history) — two-phase:

Strip stale [SYSTEM: FILES UPDATED] blocks
Drop oldest turn pairs (preserving tool_result chains)

MiniMax

Same pattern as Anthropic (similar 180K limit).

DeepSeek

No built-in trimming (relies on the caller to keep history short).

Caching Strategies

Gemini Server-Side Cache

_gemini_cache_md_hash: Optional[str] = None  # Hash of cached content
_gemini_cache_created_at: Optional[float] = None  # Monotonic time

The cache decision is a 3-way branch on each _send_gemini call:

Hash changed: delete old, rebuild with new content
Cache age > 90% of TTL (3240s of 3600s): proactive renewal
No cache exists: create new if token count >= 2048, otherwise inline

Anthropic Cache (4-Breakpoint System)

[System prompt]─breakpoint 1
[Context chunks]─breakpoint 2
[Tool definitions]─breakpoint 3
[Last user message]─breakpoint 4

Before placing breakpoint 4, all existing cache_control is stripped to prevent exceeding the 4-breakpoint limit.

Context Refresh Mechanism

After the last tool call in each round, _reread_file_items(file_items) checks mtimes:

For each file item: compare Path.stat().st_mtime against stored mtime
If unchanged: pass through as-is
If changed: re-read content, store old_content for diffing, update mtime
Changed files are diffed via _build_file_diff_text:
- Files ≤ 200 lines: emit full content
- Files > 200 lines with old_content: emit difflib.unified_diff
Diff is appended to the last tool's output as [SYSTEM: FILES UPDATED]\n\n{diff}
Stale [FILES UPDATED] blocks are stripped from older history turns by _strip_stale_file_refreshes

This is the "agent always sees current code" mechanism.

Subagent Summarization

For Tier 4: when an error occurs, qa_callback may be invoked to get a Tier 4 AI summary of the traceback. The summary is injected back into the worker's history as a hint.

def run_tier4_analysis(stderr: str) -> str:
    """Stateless Tier 4 QA analysis of an error message."""
    # Uses a dedicated system prompt for error triage
    # Returns analysis text (root cause, suggested fix)
    # Does NOT modify any code — analysis only

For Tier 4 patch generation:

def run_tier4_patch_generation(error: str, file_context: str) -> str:
    """Generate a unified diff patch from an error and file context."""
    # Returns the patch as a string
    # The caller (typically the patch modal) presents it for human review

Public API Quick Reference

Function	Purpose
`send(...)`	The main entry point — call the active provider
`set_provider(provider, model)`	Switch active provider and model
`get_provider() -> str`	Get the active provider name
`set_model_params(temp, max_tok, trunc_limit, top_p)`	Update generation params
`set_custom_system_prompt(prompt)`	Set the per-session system prompt override
`set_base_system_prompt(prompt)`	Set the foundational base prompt (advanced)
`set_use_default_base_prompt(use: bool)`	Toggle whether the base prompt is included
`set_project_context_marker(marker)`	Set the project-specific context tag
`reset_session()`	Clear all provider state
`get_comms_log()`	Read the in-memory comms log
`clear_comms_log()`	Clear the in-memory comms log
`get_token_stats(md_content)`	Estimate token usage for the given content
`cleanup()`	Tear down (delete Gemini caches, etc.)
`get_current_palette() -> str`	Get the current theme palette name
`list_models(provider) -> list[str]`	List available models for a provider
`run_tier4_analysis(stderr) -> str`	Tier 4 error analysis
`run_tier4_patch_generation(error, file_context) -> str`	Tier 4 patch generation
`run_subagent_summarization(file_path, content, is_code, outline) -> str`	AI summary of a file
`run_discussion_compression(text) -> str`	AI compression of a long discussion

Thread Safety

_send_lock: threading.Lock — serializes all provider calls. No two send() calls run concurrently.
Per-provider history locks (_anthropic_history_lock, etc.) — guard the history list mutations.
The EventEmitter (in src/events.py) is thread-safe for subscribe/emit.

Testing

Unit Tests (no real API calls)

def test_set_provider():
    from src import ai_client
    ai_client.set_provider("anthropic", "claude-3-5-sonnet-latest")
    assert ai_client.get_provider() == "anthropic"
    ai_client.reset_session()  # Cleanup

Mocked Tests

from unittest.mock import patch

def test_send_routes_to_provider(monkeypatch):
    with patch.object(ai_client, "_send_anthropic", return_value="mocked") as m:
        ai_client.set_provider("anthropic", "claude-3-5-sonnet-latest")
        result = ai_client.send("system", "user")
        assert result == "mocked"
        m.assert_called_once()
    ai_client.reset_session()

Integration (real API)

Gated by env var (e.g., RUN_REAL_AI_TESTS=1). Hits the real API. Not in default CI.

15 KiB Raw Blame History

src/ai_client.py — Multi-Provider LLM Abstraction

Overview

Architecture

State

Per-Provider State

The Public API

send(...) — The Main Entry Point

Provider Switching

Parameter Setters

Session Management

Event Hooks

Comms Log

Provider Error Taxonomy

The Tool-Call Loop

Provider-Specific Behaviors

Gemini (SDK)

Anthropic

DeepSeek

MiniMax

Gemini CLI

History Trimming Strategies

Gemini (40% threshold)

Anthropic (180K limit)

MiniMax

DeepSeek

Caching Strategies

Gemini Server-Side Cache

Anthropic Cache (4-Breakpoint System)

Context Refresh Mechanism

Subagent Summarization

Public API Quick Reference

Thread Safety

Testing

Unit Tests (no real API calls)

Mocked Tests

Integration (real API)

See Also

15 KiB

Raw Blame History

`src/ai_client.py` — Multi-Provider LLM Abstraction

`send(...)` — The Main Entry Point