PHASE 2/4/5 FOLLOW-UP TRACK (Tier 1 decided SHINK to 6a + 6b + 6d): - Phase 6a: Fix HookServer.broadcast() callers (app_controller.py + events.py + gui_2.py) Adds tests/test_websocket_broadcast_regression.py with no-TypeError assertion - Phase 6b: Complete _send_grok/_send_minimax/_send_llama OpenAICompatibleRequest migration - Phase 6d: Update those 3 senders' NormalizedResponse to use UsageStats Total: ~16 atomic commits, ~3 hours Tier 2 work. Unblocks code_path_audit_20260607. CODE_PATH_AUDIT_20260607 PRE-FLIGHT ADJUSTMENTS (per handoffs): - Add 2 new actions: provider_history_append + websocket_broadcast - Add 5 micro-benchmarks: NormalizedResponse.__init__, WebSocketMessage.__init__, UsageStats.__init__, ProviderHistory.lock, ToolSpec.__init__ - Add no-TypeError-errors-on-any-thread assertion (backs test_websocket_broadcast_regression.py) - Add 89 fat-struct sites from ANY_TYPE_AUDIT_20260621.md as instrumented targets - BLOCKER: phase2_4_5_call_site_completion_20260621 (broadcast() TypeError) PHASE 3 HYPOTHETICAL ANALYSIS (separate doc): docs/reports/PHASE3_HYPOTHETICAL_PROMOTION.md - dataclass definitions (already on tier2 branch), per-provider codepath catalog (112 sites), qualitative cost estimation (~+1-2ms per session, ~+8-15us per _send_anthropic turn). Input for the audit; the audit quantifies the cost. REGISTRATION: conductor/tracks.md updated: new row 27 (follow-up), new row 28 (parent any_type_componentization), row 17 (code_path_audit) updated with pre-flight adjustments note. Files: - conductor/tracks/phase2_4_5_call_site_completion_20260621/spec.md (NEW; 633 lines) - conductor/tracks/phase2_4_5_call_site_completion_20260621/plan.md (NEW; 7 phases, 23 tasks) - conductor/tracks/phase2_4_5_call_site_completion_20260621/metadata.json (NEW; 8.8KB) - conductor/tracks/phase2_4_5_call_site_completion_20260621/state.toml (NEW; 11.8KB) - docs/reports/PHASE3_HYPOTHETICAL_PROMOTION.md (NEW; 380 lines; qualitative cost analysis) - conductor/tracks/code_path_audit_20260607/spec.md (MODIFIED; +93 lines Pre-Flight Adjustments) - conductor/tracks.md (MODIFIED; +35 lines: 3 new entries + 1 stale row fix)
14 KiB
Phase 3 Hypothetical Promotion: ProviderHistory Migration Analysis
Date: 2026-06-21
Author: Tier 1 Orchestrator
Status: Hypothetical — this is the analysis the deferred Phase 3 work would look like, NOT a track spec
Input: docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (Tier 2's runtime cost framing) + src/provider_state.py (the dataclass already on the tier2 branch)
1. Purpose
Phase 3 (provider_state.ProviderHistory call-site migration in src/ai_client.py) was deferred from any_type_componentization_20260621 because:
- It's the highest-risk phase (112 call sites across 6 senders)
- The cost depends on whether each site is in a hot path, cold path, or init path
code_path_audit_20260607is the right tool to quantify that cost before refactoring
This document presents what the migration would look like — the approximate dataclasses, the call-site catalog, and a qualitative cost estimation of each codepath. The actual numbers will come from the audit. This document is the what; the audit produces the cost.
2. The Dataclass (already exists on tier2/any_type_componentization_20260621 branch)
# src/provider_state.py:25-44 (verbatim from branch)
@dataclass
class ProviderHistory:
messages: list[HistoryMessage] = field(default_factory=list)
lock: threading.Lock = field(default_factory=threading.Lock)
def append(self, message: HistoryMessage) -> None:
with self.lock:
self.messages.append(message)
def get_all(self) -> list[HistoryMessage]:
with self.lock:
return list(self.messages)
def replace_all(self, messages: list[HistoryMessage]) -> None:
with self.lock:
self.messages = list(messages)
def clear(self) -> None:
with self.lock:
self.messages = []
# src/provider_state.py:47-69 (verbatim from branch)
_PROVIDER_HISTORIES: dict[str, ProviderHistory] = {
"anthropic": ProviderHistory(),
"deepseek": ProviderHistory(),
"minimax": ProviderHistory(),
"qwen": ProviderHistory(),
"grok": ProviderHistory(),
"llama": ProviderHistory(),
}
def get_history(provider: str) -> ProviderHistory:
if provider not in _PROVIDER_HISTORIES:
raise KeyError(f"Unknown provider: {provider!r}")
return _PROVIDER_HISTORIES[provider]
def clear_all() -> None:
for h in _PROVIDER_HISTORIES.values():
h.clear()
def providers() -> tuple[str, ...]:
return tuple(_PROVIDER_HISTORIES.keys())
Properties that hold:
@dataclass(NOTfrozen=True) — the message list and lock are mutable; this is correct.default_factory=listformessages— eachProviderHistorygets its own list.default_factory=threading.Lockforlock— eachProviderHistorygets its own lock instance.- The 4-method interface encapsulates the lock; consumers never see it.
This is already on the tier2 branch. What Phase 3 does is migrate the consumers.
3. The Hypothetical Migration
The migration replaces direct module-global access (_anthropic_history, _anthropic_history_lock) with the typed accessor (get_history("anthropic")).
3.1 Mechanical Translation Rules
| Current | Hypothetical (typed) | Lock needed? |
|---|---|---|
_anthropic_history (read) |
get_history("anthropic").get_all() |
Yes (returns copy under lock) |
_anthropic_history (write ref) |
get_history("anthropic").messages |
Only inside with h.lock: |
_anthropic_history.append(m) |
get_history("anthropic").append(m) |
Encapsulated |
len(_anthropic_history) |
len(get_history("anthropic").messages) |
No (length is atomic in CPython) |
for m in _anthropic_history: |
for m in get_history("anthropic").get_all(): |
Yes |
with _anthropic_history_lock: |
with get_history("anthropic").lock: |
Same |
_anthropic_history = [] |
get_history("anthropic").clear() |
Encapsulated |
3.2 Pattern Categories (per HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md §1)
| Category | Sites | Path role |
|---|---|---|
_<provider>_history.append(message) |
6 | Hot — called per LLM turn |
len(_<provider>_history) / _<provider>_history[-1] / iteration |
~40 | Hot — called per LLM turn for trimming, tool-history cache breakpoint, strip_cache_controls |
with _<provider>_history_lock: |
~30 | Mixed — per-turn append is Hot; reset_session is Cold |
global _<provider>_history declarations |
4 | N/A — module-level, no runtime cost |
_strip_cache_controls(_<provider>_history) + _repair_<provider>_history() + _add_history_cache_breakpoint() + _trim_<provider>_history() |
~30 | Hot for Anthropic (cache controls); Mixed for others |
3.3 Per-Provider Site Count (measured from current src/ai_client.py)
| Provider | history refs | lock refs | global decls | Total sites |
|---|---|---|---|---|
| anthropic | 22 | 2 | 1 | 25 |
| deepseek | 13 | 6 | 1 | 20 |
| minimax | 15 | 5 | 1 | 21 |
| qwen | 7 | 4 | 1 | 12 |
| grok | 7 | 6 | 0 | 13 |
| llama | 12 | 9 | 0 | 21 |
| Total | 76 | 32 | 4 | 112 |
(Note: this 112 count is higher than the HANDOFF's "41" estimate, because the grep counts every reference including duplicates in helper functions. The migration work is the same either way — every reference gets touched — but the codepath catalog is richer.)
4. The Codepath Catalog (with Qualitative Cost Estimation)
This is the what the audit will quantify. Each codepath is tagged with path_role, call_frequency, and estimated qualitative cost delta (positive = slower, negative = faster, zero = no change).
4.1 _send_anthropic (L1407) — HOT per-LLM-turn
Codepaths inside _send_anthropic (per the grep):
| Codepath | Path role | Per-call freq | Qualitative cost delta |
|---|---|---|---|
_strip_cache_controls(_anthropic_history) |
Hot (called once per send) | 1× per LLM turn | +0.5-1μs (one extra dict lookup get_history("anthropic") per call) |
_repair_anthropic_history(_anthropic_history) |
Hot | 1× per LLM turn | +0.5μs (same) |
_anthropic_history.append(...) (user message) |
Hot | 1× per LLM turn | +0.5μs (method call vs. bare .append()) |
_add_history_cache_breakpoint(_anthropic_history) |
Hot | 1× per LLM turn | +0.5μs (same) |
_trim_anthropic_history(system_blocks, _anthropic_history) |
Hot | 1× per LLM turn | +0.5μs (one extra dict lookup) |
len(_anthropic_history) |
Hot | 2-3× per LLM turn (used in token estimation) | +0.3μs per call (.messages attribute access vs. global var lookup) |
_estimate_prompt_tokens(system_blocks, _anthropic_history) |
Hot | 1× per LLM turn | +1μs (the function takes a list; we pass h.messages under lock or h.get_all(); if the latter, that's a list copy — ~5μs for a 50-message history) |
for m in _anthropic_history: (inside _strip_cache_controls) |
Hot | 1× per LLM turn (iteration over ~10-50 messages) | +5-10μs (list copy via get_all(); the bare global just iterates directly) |
Per-turn overhead estimate: +8-15μs per _send_anthropic call. At ~50 turns per session, that's +400-750μs per session. Negligible vs LLM latency (typically 1-30 seconds).
Recommendation (subject to audit): Migrate, but use with h.lock: blocks for the hot paths inside _strip_cache_controls and _estimate_prompt_tokens to avoid the list-copy overhead of get_all().
4.2 _send_deepseek (L2167) — HOT per-LLM-turn
Similar pattern to _send_anthropic but simpler (no cache controls). Estimated per-turn overhead: +3-7μs. At 50 turns/session, +150-350μs/session.
4.3 _send_minimax (L2616) — HOT per-LLM-turn
Has _trim_minimax_history helper (L2484). Estimated per-turn overhead: +3-7μs. +150-350μs/session.
4.4 _send_grok (L2532) — HOT per-LLM-turn
No _trim or _repair helpers; simpler. Estimated per-turn overhead: +2-5μs. +100-250μs/session.
4.5 _send_qwen (L2771) — HOT per-LLM-turn
No helpers. Estimated per-turn overhead: +2-5μs. +100-250μs/session.
4.6 _send_llama (L2856) — HOT per-LLM-turn
Highest lock count (9 lock refs). Estimated per-turn overhead: +4-8μs. +200-400μs/session.
4.7 cleanup() (L454) — COLD per project-switch
Iterates over all 6 providers, calls clear() on each. Current code does with _<provider>_history_lock: _<provider>_history = [] 6 times. Hypothetical: clear_all() (already defined on branch) iterates and calls clear() once per provider.
Per-call cost: -2 to -5μs (negative — slight speedup because clear_all() is one function call vs. 6 inline blocks). Called once per project switch; negligible in absolute terms.
4.8 reset_session() (L461) — COLD per project-switch
Calls cleanup() (the cold path above). Total per-call cost: -2 to -5μs.
4.9 Init Path — _PROVIDER_HISTORIES dict construction at module load
One-time cost at module import. 6 ProviderHistory() instances each with default_factory=list + default_factory=threading.Lock. Total: ~10-15μs. Negligible.
5. Total Qualitative Cost Summary
| Codepath | Path role | Est. overhead per call | Frequency | Total per session |
|---|---|---|---|---|
_send_anthropic |
Hot per turn | +8-15μs | ~50 turns | +400-750μs |
_send_deepseek |
Hot per turn | +3-7μs | ~50 turns | +150-350μs |
_send_minimax |
Hot per turn | +3-7μs | ~50 turns | +150-350μs |
_send_grok |
Hot per turn | +2-5μs | ~50 turns | +100-250μs |
_send_qwen |
Hot per turn | +2-5μs | ~50 turns | +100-250μs |
_send_llama |
Hot per turn | +4-8μs | ~50 turns | +200-400μs |
cleanup() / reset_session() |
Cold per project switch | -2-5μs | ~1× | -2-5μs |
| Init (module load) | Once | +10-15μs | 1× | +10-15μs |
| Total per session | ~+1.1-2.4ms |
Interpretation: Even at the upper bound (+2.4ms per session), this is 3+ orders of magnitude smaller than the LLM latency it lives alongside. The migration is type-safety for free in absolute runtime terms.
The actual audit will quantify these estimates. If the audit finds a >50μs delta per turn (e.g., from lock contention or get_all() list copies), the migration strategy changes (use with h.lock: blocks instead of get_all() to avoid copies).
6. The Risks (per HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md §1)
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
get_history("anthropic").get_all() copies the list per access; _estimate_prompt_tokens is called per turn and iterates the copy |
Medium | +5-15μs per turn | Use with h.lock: msg_list = h.messages pattern in hot iteration sites |
Lock contention: multiple _send_<provider> calls in parallel (rare but possible during batch sends) |
Low | +1-10μs per turn under contention | The lock is per-provider; no cross-provider contention; benchmark will reveal |
getattr lookup overhead for get_history(...) vs. global var |
Low | +0.5μs per access | Could inline as a module-level constant if needed; unlikely worth the readability cost |
The _send_anthropic cache-control helpers iterate the list; a copy doubles memory bandwidth |
Medium | +10-30μs per turn if hot | Refactor to operate on h.messages under lock without copying |
| Forgotten call site (one of the 76 history refs missed) | Medium | Runtime AttributeError or NameError | Run tier-1-unit-core + tier-2-mock-app-core FULLY per the regression protocol |
7. The Codepath Audit Additions (per PROMPT_FOR_TIER_1.md Decision 4)
Per Tier 1's sequencing decision, the code_path_audit_20260607 will instrument:
| Action | Codepath | Measures |
|---|---|---|
provider_history_append |
get_history(p).append(msg) (or current _anthropic_history.append(msg)) |
Per-turn append latency + lock acquire time |
websocket_broadcast |
broadcast(WebSocketMessage(...)) (post-Phase 6a) |
Per-broadcast overhead |
ai_message_lifecycle (existing) |
_send_<provider> end-to-end |
Total per-turn latency delta pre/post Phase 3 |
discussion_save_load (existing) |
reset_session() + project switch |
Cold-path cost |
gui_startup (existing) |
_PROVIDER_HISTORIES init |
One-time cost |
8. Recommendation (subject to audit results)
If the audit confirms the qualitative estimates (+1-2ms per session; <50μs per turn):
- Proceed with Phase 3 migration as planned (~10-15 commits).
- Use
with h.lock:blocks for hot iteration sites (_strip_cache_controls,_estimate_prompt_tokens) to avoidget_all()copies. - Run the 11-tier regression protocol per the follow-up track.
If the audit reveals a >50μs per-turn delta (e.g., lock contention >10μs):
- Reconsider: do we even need to migrate the history aspect? It's
list[Metadata]already typed. - Alternative: keep the module globals but rename them with a
_HISTORYsuffix and document the pattern; defer full ProviderHistory migration.
The audit decides. This analysis is the input to the audit, not the conclusion.
9. Open Questions
- Should the
ProviderHistory.messagesbelist[HistoryMessage]orlist[dict[str, Any]]? Currently it'slist[HistoryMessage](=list[Metadata]). The legacy code useslist[Metadata]everywhere. The dataclass stays consistent with the type alias. - Should we add a
__len__method toProviderHistoryto avoidlen(h.messages)?- Pros: cleaner consumer code
- Cons: minor; only saves attribute access
- Should
_PROVIDER_HISTORIESbe aMappingProxyType(read-only) for external code? Currently it's a regular dict; external code could mutate_PROVIDER_HISTORIES["anthropic"] = ProviderHistory(). Probably not worth the indirection. - Should
get_history(p)validatep(raise on unknown)? Currently it raisesKeyError. Could beLiteral["anthropic", "deepseek", ...]for static type checking.
10. See Also
docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md— the original runtime cost framingdocs/handoffs/PROMPT_FOR_TIER_1.md— Tier 1's decision pointssrc/provider_state.py— the actual dataclass (already ontier2/any_type_componentization_20260621branch)conductor/tracks/any_type_componentization_20260621/spec.md— parent track specconductor/tracks/code_path_audit_20260607/spec.md— the audit that will quantify these estimatesconductor/tracks/phase2_4_5_call_site_completion_20260621/spec.md— the follow-up track that unblocks the audit