Private

Public Access

Files

T

ed 1a739ecef5 conductor(spec+plan): phase2_4_5_call_site_completion_20260621 + code_path_audit pre-flight adjustments + Phase 3 analysis

PHASE 2/4/5 FOLLOW-UP TRACK (Tier 1 decided SHINK to 6a + 6b + 6d):
- Phase 6a: Fix HookServer.broadcast() callers (app_controller.py + events.py + gui_2.py)
  Adds tests/test_websocket_broadcast_regression.py with no-TypeError assertion
- Phase 6b: Complete _send_grok/_send_minimax/_send_llama OpenAICompatibleRequest migration
- Phase 6d: Update those 3 senders' NormalizedResponse to use UsageStats

Total: ~16 atomic commits, ~3 hours Tier 2 work. Unblocks code_path_audit_20260607.

CODE_PATH_AUDIT_20260607 PRE-FLIGHT ADJUSTMENTS (per handoffs):
- Add 2 new actions: provider_history_append + websocket_broadcast
- Add 5 micro-benchmarks: NormalizedResponse.__init__, WebSocketMessage.__init__,
  UsageStats.__init__, ProviderHistory.lock, ToolSpec.__init__
- Add no-TypeError-errors-on-any-thread assertion (backs test_websocket_broadcast_regression.py)
- Add 89 fat-struct sites from ANY_TYPE_AUDIT_20260621.md as instrumented targets
- BLOCKER: phase2_4_5_call_site_completion_20260621 (broadcast() TypeError)

PHASE 3 HYPOTHETICAL ANALYSIS (separate doc):
docs/reports/PHASE3_HYPOTHETICAL_PROMOTION.md - dataclass definitions (already on tier2 branch),
per-provider codepath catalog (112 sites), qualitative cost estimation (~+1-2ms per session,
~+8-15us per _send_anthropic turn). Input for the audit; the audit quantifies the cost.

REGISTRATION:
conductor/tracks.md updated: new row 27 (follow-up), new row 28 (parent any_type_componentization),
row 17 (code_path_audit) updated with pre-flight adjustments note.

Files:
- conductor/tracks/phase2_4_5_call_site_completion_20260621/spec.md (NEW; 633 lines)
- conductor/tracks/phase2_4_5_call_site_completion_20260621/plan.md (NEW; 7 phases, 23 tasks)
- conductor/tracks/phase2_4_5_call_site_completion_20260621/metadata.json (NEW; 8.8KB)
- conductor/tracks/phase2_4_5_call_site_completion_20260621/state.toml (NEW; 11.8KB)
- docs/reports/PHASE3_HYPOTHETICAL_PROMOTION.md (NEW; 380 lines; qualitative cost analysis)
- conductor/tracks/code_path_audit_20260607/spec.md (MODIFIED; +93 lines Pre-Flight Adjustments)
- conductor/tracks.md (MODIFIED; +35 lines: 3 new entries + 1 stale row fix)

2026-06-21 18:32:02 -04:00

14 KiB

Raw Blame History

Phase 3 Hypothetical Promotion: `ProviderHistory` Migration Analysis

Date: 2026-06-21 Author: Tier 1 Orchestrator Status: Hypothetical — this is the analysis the deferred Phase 3 work would look like, NOT a track spec Input: docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (Tier 2's runtime cost framing) + src/provider_state.py (the dataclass already on the tier2 branch)

1. Purpose

Phase 3 (provider_state.ProviderHistory call-site migration in src/ai_client.py) was deferred from any_type_componentization_20260621 because:

It's the highest-risk phase (112 call sites across 6 senders)
The cost depends on whether each site is in a hot path, cold path, or init path
code_path_audit_20260607 is the right tool to quantify that cost before refactoring

This document presents what the migration would look like — the approximate dataclasses, the call-site catalog, and a qualitative cost estimation of each codepath. The actual numbers will come from the audit. This document is the what; the audit produces the cost.

2. The Dataclass (already exists on `tier2/any_type_componentization_20260621` branch)

# src/provider_state.py:25-44 (verbatim from branch)
@dataclass
class ProviderHistory:
 messages: list[HistoryMessage] = field(default_factory=list)
 lock: threading.Lock = field(default_factory=threading.Lock)

 def append(self, message: HistoryMessage) -> None:
 with self.lock:
 self.messages.append(message)

 def get_all(self) -> list[HistoryMessage]:
 with self.lock:
 return list(self.messages)

 def replace_all(self, messages: list[HistoryMessage]) -> None:
 with self.lock:
 self.messages = list(messages)

 def clear(self) -> None:
 with self.lock:
 self.messages = []

# src/provider_state.py:47-69 (verbatim from branch)
_PROVIDER_HISTORIES: dict[str, ProviderHistory] = {
 "anthropic": ProviderHistory(),
 "deepseek": ProviderHistory(),
 "minimax": ProviderHistory(),
 "qwen": ProviderHistory(),
 "grok": ProviderHistory(),
 "llama": ProviderHistory(),
}

def get_history(provider: str) -> ProviderHistory:
 if provider not in _PROVIDER_HISTORIES:
 raise KeyError(f"Unknown provider: {provider!r}")
 return _PROVIDER_HISTORIES[provider]

def clear_all() -> None:
 for h in _PROVIDER_HISTORIES.values():
 h.clear()

def providers() -> tuple[str, ...]:
 return tuple(_PROVIDER_HISTORIES.keys())

Properties that hold:

@dataclass (NOT frozen=True) — the message list and lock are mutable; this is correct.
default_factory=list for messages — each ProviderHistory gets its own list.
default_factory=threading.Lock for lock — each ProviderHistory gets its own lock instance.
The 4-method interface encapsulates the lock; consumers never see it.

This is already on the tier2 branch. What Phase 3 does is migrate the consumers.

3. The Hypothetical Migration

The migration replaces direct module-global access (_anthropic_history, _anthropic_history_lock) with the typed accessor (get_history("anthropic")).

3.1 Mechanical Translation Rules

Current	Hypothetical (typed)	Lock needed?
`_anthropic_history` (read)	`get_history("anthropic").get_all()`	Yes (returns copy under lock)
`_anthropic_history` (write ref)	`get_history("anthropic").messages`	Only inside `with h.lock:`
`_anthropic_history.append(m)`	`get_history("anthropic").append(m)`	Encapsulated
`len(_anthropic_history)`	`len(get_history("anthropic").messages)`	No (length is atomic in CPython)
`for m in _anthropic_history:`	`for m in get_history("anthropic").get_all():`	Yes
`with _anthropic_history_lock:`	`with get_history("anthropic").lock:`	Same
`_anthropic_history = []`	`get_history("anthropic").clear()`	Encapsulated

3.2 Pattern Categories (per `HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md` §1)

Category	Sites	Path role
`_<provider>_history.append(message)`	6	Hot — called per LLM turn
`len(_<provider>_history)` / `_<provider>_history[-1]` / iteration	~40	Hot — called per LLM turn for trimming, tool-history cache breakpoint, strip_cache_controls
`with _<provider>_history_lock:`	~30	Mixed — per-turn append is Hot; `reset_session` is Cold
`global _<provider>_history` declarations	4	N/A — module-level, no runtime cost
`_strip_cache_controls(_<provider>_history)` + `_repair_<provider>_history()` + `_add_history_cache_breakpoint()` + `_trim_<provider>_history()`	~30	Hot for Anthropic (cache controls); Mixed for others

3.3 Per-Provider Site Count (measured from current `src/ai_client.py`)

Provider	history refs	lock refs	global decls	Total sites
anthropic	22	2	1	25
deepseek	13	6	1	20
minimax	15	5	1	21
qwen	7	4	1	12
grok	7	6	0	13
llama	12	9	0	21
Total	76	32	4	112

(Note: this 112 count is higher than the HANDOFF's "41" estimate, because the grep counts every reference including duplicates in helper functions. The migration work is the same either way — every reference gets touched — but the codepath catalog is richer.)

4. The Codepath Catalog (with Qualitative Cost Estimation)

This is the what the audit will quantify. Each codepath is tagged with path_role, call_frequency, and estimated qualitative cost delta (positive = slower, negative = faster, zero = no change).

4.1 `_send_anthropic` (L1407) — HOT per-LLM-turn

Codepaths inside _send_anthropic (per the grep):

Codepath	Path role	Per-call freq	Qualitative cost delta
`_strip_cache_controls(_anthropic_history)`	Hot (called once per send)	1× per LLM turn	+0.5-1μs (one extra dict lookup `get_history("anthropic")` per call)
`_repair_anthropic_history(_anthropic_history)`	Hot	1× per LLM turn	+0.5μs (same)
`_anthropic_history.append(...)` (user message)	Hot	1× per LLM turn	+0.5μs (method call vs. bare `.append()`)
`_add_history_cache_breakpoint(_anthropic_history)`	Hot	1× per LLM turn	+0.5μs (same)
`_trim_anthropic_history(system_blocks, _anthropic_history)`	Hot	1× per LLM turn	+0.5μs (one extra dict lookup)
`len(_anthropic_history)`	Hot	2-3× per LLM turn (used in token estimation)	+0.3μs per call (`.messages` attribute access vs. global var lookup)
`_estimate_prompt_tokens(system_blocks, _anthropic_history)`	Hot	1× per LLM turn	+1μs (the function takes a list; we pass `h.messages` under lock or `h.get_all()`; if the latter, that's a list copy — ~5μs for a 50-message history)
`for m in _anthropic_history:` (inside `_strip_cache_controls`)	Hot	1× per LLM turn (iteration over ~10-50 messages)	+5-10μs (list copy via `get_all()`; the bare global just iterates directly)

Per-turn overhead estimate: +8-15μs per _send_anthropic call. At ~50 turns per session, that's +400-750μs per session. Negligible vs LLM latency (typically 1-30 seconds).

Recommendation (subject to audit): Migrate, but use with h.lock: blocks for the hot paths inside _strip_cache_controls and _estimate_prompt_tokens to avoid the list-copy overhead of get_all().

4.2 `_send_deepseek` (L2167) — HOT per-LLM-turn

Similar pattern to _send_anthropic but simpler (no cache controls). Estimated per-turn overhead: +3-7μs. At 50 turns/session, +150-350μs/session.

4.3 `_send_minimax` (L2616) — HOT per-LLM-turn

Has _trim_minimax_history helper (L2484). Estimated per-turn overhead: +3-7μs. +150-350μs/session.

4.4 `_send_grok` (L2532) — HOT per-LLM-turn

No _trim or _repair helpers; simpler. Estimated per-turn overhead: +2-5μs. +100-250μs/session.

4.5 `_send_qwen` (L2771) — HOT per-LLM-turn

No helpers. Estimated per-turn overhead: +2-5μs. +100-250μs/session.

4.6 `_send_llama` (L2856) — HOT per-LLM-turn

Highest lock count (9 lock refs). Estimated per-turn overhead: +4-8μs. +200-400μs/session.

4.7 `cleanup()` (L454) — COLD per project-switch

Iterates over all 6 providers, calls clear() on each. Current code does with _<provider>_history_lock: _<provider>_history = [] 6 times. Hypothetical: clear_all() (already defined on branch) iterates and calls clear() once per provider.

Per-call cost: -2 to -5μs (negative — slight speedup because clear_all() is one function call vs. 6 inline blocks). Called once per project switch; negligible in absolute terms.

4.8 `reset_session()` (L461) — COLD per project-switch

Calls cleanup() (the cold path above). Total per-call cost: -2 to -5μs.

4.9 Init Path — `_PROVIDER_HISTORIES` dict construction at module load

One-time cost at module import. 6 ProviderHistory() instances each with default_factory=list + default_factory=threading.Lock. Total: ~10-15μs. Negligible.

5. Total Qualitative Cost Summary

Codepath	Path role	Est. overhead per call	Frequency	Total per session
`_send_anthropic`	Hot per turn	+8-15μs	~50 turns	+400-750μs
`_send_deepseek`	Hot per turn	+3-7μs	~50 turns	+150-350μs
`_send_minimax`	Hot per turn	+3-7μs	~50 turns	+150-350μs
`_send_grok`	Hot per turn	+2-5μs	~50 turns	+100-250μs
`_send_qwen`	Hot per turn	+2-5μs	~50 turns	+100-250μs
`_send_llama`	Hot per turn	+4-8μs	~50 turns	+200-400μs
`cleanup()` / `reset_session()`	Cold per project switch	-2-5μs	~1×	-2-5μs
Init (module load)	Once	+10-15μs	1×	+10-15μs
Total per session				~+1.1-2.4ms

Interpretation: Even at the upper bound (+2.4ms per session), this is 3+ orders of magnitude smaller than the LLM latency it lives alongside. The migration is type-safety for free in absolute runtime terms.

The actual audit will quantify these estimates. If the audit finds a >50μs delta per turn (e.g., from lock contention or get_all() list copies), the migration strategy changes (use with h.lock: blocks instead of get_all() to avoid copies).

6. The Risks (per `HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md` §1)

Risk	Likelihood	Impact	Mitigation
`get_history("anthropic").get_all()` copies the list per access; `_estimate_prompt_tokens` is called per turn and iterates the copy	Medium	+5-15μs per turn	Use `with h.lock: msg_list = h.messages` pattern in hot iteration sites
Lock contention: multiple `_send_<provider>` calls in parallel (rare but possible during batch sends)	Low	+1-10μs per turn under contention	The lock is per-provider; no cross-provider contention; benchmark will reveal
`getattr` lookup overhead for `get_history(...)` vs. global var	Low	+0.5μs per access	Could inline as a module-level constant if needed; unlikely worth the readability cost
The `_send_anthropic` cache-control helpers iterate the list; a copy doubles memory bandwidth	Medium	+10-30μs per turn if hot	Refactor to operate on `h.messages` under lock without copying
Forgotten call site (one of the 76 history refs missed)	Medium	Runtime AttributeError or NameError	Run tier-1-unit-core + tier-2-mock-app-core FULLY per the regression protocol

7. The Codepath Audit Additions (per `PROMPT_FOR_TIER_1.md` Decision 4)

Per Tier 1's sequencing decision, the code_path_audit_20260607 will instrument:

Action	Codepath	Measures
`provider_history_append`	`get_history(p).append(msg)` (or current `_anthropic_history.append(msg)`)	Per-turn append latency + lock acquire time
`websocket_broadcast`	`broadcast(WebSocketMessage(...))` (post-Phase 6a)	Per-broadcast overhead
`ai_message_lifecycle` (existing)	`_send_<provider>` end-to-end	Total per-turn latency delta pre/post Phase 3
`discussion_save_load` (existing)	`reset_session()` + project switch	Cold-path cost
`gui_startup` (existing)	`_PROVIDER_HISTORIES` init	One-time cost

8. Recommendation (subject to audit results)

If the audit confirms the qualitative estimates (+1-2ms per session; <50μs per turn):

Proceed with Phase 3 migration as planned (~10-15 commits).
Use with h.lock: blocks for hot iteration sites (_strip_cache_controls, _estimate_prompt_tokens) to avoid get_all() copies.
Run the 11-tier regression protocol per the follow-up track.

If the audit reveals a >50μs per-turn delta (e.g., lock contention >10μs):

Reconsider: do we even need to migrate the history aspect? It's list[Metadata] already typed.
Alternative: keep the module globals but rename them with a _HISTORY suffix and document the pattern; defer full ProviderHistory migration.

The audit decides. This analysis is the input to the audit, not the conclusion.

9. Open Questions

Should the ProviderHistory.messages be list[HistoryMessage] or list[dict[str, Any]]? Currently it's list[HistoryMessage] (= list[Metadata]). The legacy code uses list[Metadata] everywhere. The dataclass stays consistent with the type alias.
Should we add a __len__ method to ProviderHistory to avoid len(h.messages)?
- Pros: cleaner consumer code
- Cons: minor; only saves attribute access
Should _PROVIDER_HISTORIES be a MappingProxyType (read-only) for external code? Currently it's a regular dict; external code could mutate _PROVIDER_HISTORIES["anthropic"] = ProviderHistory(). Probably not worth the indirection.
Should get_history(p) validate p (raise on unknown)? Currently it raises KeyError. Could be Literal["anthropic", "deepseek", ...] for static type checking.

10. See Also

docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md — the original runtime cost framing
docs/handoffs/PROMPT_FOR_TIER_1.md — Tier 1's decision points
src/provider_state.py — the actual dataclass (already on tier2/any_type_componentization_20260621 branch)
conductor/tracks/any_type_componentization_20260621/spec.md — parent track spec
conductor/tracks/code_path_audit_20260607/spec.md — the audit that will quantify these estimates
conductor/tracks/phase2_4_5_call_site_completion_20260621/spec.md — the follow-up track that unblocks the audit

14 KiB Raw Blame History Unescape Escape

Phase 3 Hypothetical Promotion: ProviderHistory Migration Analysis

1. Purpose

2. The Dataclass (already exists on tier2/any_type_componentization_20260621 branch)

3. The Hypothetical Migration

3.1 Mechanical Translation Rules

3.2 Pattern Categories (per HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md §1)

3.3 Per-Provider Site Count (measured from current src/ai_client.py)

4. The Codepath Catalog (with Qualitative Cost Estimation)

4.1 _send_anthropic (L1407) — HOT per-LLM-turn

4.2 _send_deepseek (L2167) — HOT per-LLM-turn

4.3 _send_minimax (L2616) — HOT per-LLM-turn

4.4 _send_grok (L2532) — HOT per-LLM-turn

4.5 _send_qwen (L2771) — HOT per-LLM-turn

4.6 _send_llama (L2856) — HOT per-LLM-turn

4.7 cleanup() (L454) — COLD per project-switch

4.8 reset_session() (L461) — COLD per project-switch

4.9 Init Path — _PROVIDER_HISTORIES dict construction at module load