This reverts commitf914b2bcd4, reversing changes made to7fef95cc87.
10 KiB
Tier 1 Prompt: Follow-up Track + Code-Path Audit Sequencing
From: Tier 2 Tech Lead (autonomous sandbox, any_type_componentization_20260621)
To: Tier 1 Orchestrator
Date: 2026-06-21
Status: Branch tier2/any_type_componentization_20260621 is at 24 commits, ready for review (not merge).
TL;DR (read this first)
Tier 2 ran any_type_componentization_20260621 and the result is reconnaissance-grade, not merge-grade. The track did 48 of 89 fat-struct promotions cleanly (Phase 1, 2, 4, 5), but deferred Phase 3 entirely and left one runtime bug that didn't surface in my targeted regression suite: WebSocketServer.broadcast() callers in src/app_controller.py and src/events.py still use the old (channel, payload) signature after Phase 5 changed it to (message: WebSocketMessage). This produces worker[queue_fallback] error: WebSocketServer.broadcast() takes 2 positional arguments but 3 were given spam in tier-2-mock-app-core.
Tier 1 should: (a) approve a ~15-commit follow-up track that closes the deferred work and the broadcast() bug, then (b) sequence code_path_audit_20260607 to use the follow-up's output as input.
Do not merge this branch yet. Use it as the spec input for the follow-up track.
Context: what happened in this track
Input artifact: docs/reports/ANY_TYPE_AUDIT_20260621.md identified 89 fat-struct sites across 5 candidates (mcp_tool_specs: 8, openai_schemas: 17, provider_state: 41, log_registry.Session: 7, api_hooks.WebSocketMessage: 16).
Output:
- 48 sites promoted: Phase 1 (
ToolSpec+ToolParameterregistry; 45 tools), Phase 2 (ChatMessage+UsageStats+ToolCall+ refactoredNormalizedResponse+OpenAICompatibleRequest), Phase 4 (Session+SessionMetadatawith backward-compat__getitem__), Phase 5 (WebSocketMessage+JsonValue). - 41 sites deferred: Phase 3 (
provider_state.ProviderHistorydataclass exists; the 27 call sites insrc/ai_client.py_send_<provider>functions remain on the legacy_anthropic_history/_deepseek_history/ etc. globals). - 2 new audit scripts:
scripts/audit_dataclass_coverage.py(CI gate; baseline = 207 → post-track = 200). - 1 styleguide update:
conductor/code_styleguides/type_aliases.md§12 "When to Promote TypeAlias to dataclass" (98 lines; the codified rule future agents will follow). - 1 end-of-track report:
docs/reports/TRACK_COMPLETION_any_type_componentization_20260621.md.
Code-path audit input doc: docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (commit 0fabeaf4). Tier 1 should read this BEFORE scoping code_path_audit_20260607.
Failure report doc: docs/handoffs/HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md (commit d7b6b229). Tier 1 should read this BEFORE scoping the follow-up track.
Tier 1 decision points
Decision 1: Approve the follow-up track?
Recommended scope (per HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md):
| Task | Scope | Est. commits |
|---|---|---|
Phase 6a: Fix WebSocketServer.broadcast() callers |
Grep src/ for \.broadcast\(; replace broadcast(channel, payload) with broadcast(WebSocketMessage(channel=, payload=)) in src/app_controller.py:_run_pending_tasks_once_result, src/events.py, src/gui_2.py. Add regression tests. |
4-6 |
Phase 6b: Complete t2_6 (OpenAICompatibleRequest callers in _send_grok, _send_minimax, _send_llama) |
Migrate the 3 remaining _send_<provider> functions in src/ai_client.py to construct OpenAICompatibleRequest(messages=[ChatMessage(...)], ...) instead of messages=[{"role": ..., "content": ...}] |
3-4 |
| Phase 6c: Complete Phase 3 (provider_state call-site migration) | Replace _anthropic_history / _anthropic_history_lock etc. in src/ai_client.py with provider_state.get_history('anthropic'). ~27 call sites. |
8-10 |
Phase 6d: Update _send_grok / _send_minimax / _send_llama callers to use new ChatMessage / UsageStats |
Migration of NormalizedResponse(text=..., usage_input_tokens=..., ...) to NormalizedResponse(text=..., usage=UsageStats(...)) in the 3 send functions. |
3-4 |
| Total | ~18-24 commits |
Tier 1 should decide: approve this scope, OR shrink (defer Phase 3 entirely to a separate track; do just Phase 6a + 6b + 6d to unblock the audit), OR expand (also include the cross-phase coupling fix: migrate OpenAICompatibleRequest.tools from list[dict[str, Any]] to list[ToolSpec]).
My recommendation: shrink. Phase 3 + cross-phase coupling are separate concerns. Do just Phase 6a + 6b + 6d (the code-path-honest part: every NormalizedResponse construction site uses the new API; every broadcast() caller uses the new signature). Defer Phase 3 + cross-phase coupling to their own tracks. This gives code_path_audit_20260607 a clean instrumented target.
Decision 2: Sequence code_path_audit_20260607 after the follow-up?
Yes. The audit's trace_action output will be polluted by worker[queue_fallback] error: WebSocketServer.broadcast() takes 2 positional arguments but 3 were given unless Phase 6a lands first. The audit's per-action profiling assumes no TypeError spam on the GUI thread; if the broadcast call site raises, the audit's timing data is contaminated.
Recommended sequencing:
T0: Tier 1 approves follow-up track (decision 1)
T1: Tier 2 implements Phase 6a + 6b + 6d (~3 hours, ~18 commits)
T2: Tier 2 runs tier-1-unit-core FULLY (no stop-on-failure)
T3: Tier 2 runs tier-3-live_gui FULLY (no stop-on-failure)
T4: Tier 1 reviews + merges follow-up track
T5: Tier 1 launches code_path_audit_20260607
T6: Tier 2 implements Phase 3 + cross-phase coupling (separate track, post-audit)
Decision 3: Adjust code_path_audit_20260607 per the handoff doc
The existing code_path_audit_20260607 spec (per ANY_TYPE_AUDIT_20260621.md §5) calls for per-action profiling. Tier 1 should ADD:
- The 5 micro-benchmarks listed in
HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md§7 (NormalizedResponse.init, WebSocketMessage.init, UsageStats.init, ProviderHistory.lock, ToolSpec.init). - A "no-TypeError-errors-on-any-thread" assertion: the audit should fail if any
worker[queue_fallback] error: WebSocketServer.broadcast()appears in the test output during the audit's per-action profiling. (Phase 6a's regression test should make this assertion.) - The 3 OpenAI-compatible providers (
grok,minimax,llama) — currently unprofiled — should be instrumented, since they're the hot paths Phase 6b will migrate.
Decision 4: Code-Path Audit pre-flight scope expansion
The existing code_path_audit_20260607 spec scopes 3 actions (ai_message_lifecycle, discussion_save_load, gui_startup). Tier 1 should ADD:
provider_history_append: every_send_<provider>path appends to history; the audit should measure per-turn latency.websocket_broadcast: the GUI thread broadcasts; the audit should measure broadcast throughput under load.
These are the hot paths Phase 3 + Phase 6a will touch. The audit's data will directly inform whether the Phase 3 + Phase 6a refactors are worth the cost.
The 4 documents Tier 1 should read (in this order)
docs/reports/ANY_TYPE_AUDIT_20260621.md(input artifact; the 89 sites and the 5-pattern taxonomy)docs/reports/TRACK_COMPLETION_any_type_componentization_20260621.md(what was done, what was deferred, the per-phase results table)docs/handoffs/HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md(test failure categorization; the 4-section follow-up scope; the micro-benchmarks)docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md(the 5-pattern taxonomy applied to runtime; the "the code is the agent debugger" framing; the recommendation not to merge this branch)
Total read time: ~45 minutes for Tier 1 to come up to speed.
What Tier 1 should NOT do
- Don't merge
tier2/any_type_componentization_20260621as-is. The 1 runtime bug (broadcast() insrc/app_controller.py) makes the branch not merge-grade. - Don't launch
code_path_audit_20260607before the follow-up track. The TypeError spam will pollute the audit's per-action profiling. - Don't try to fix Phase 3 + cross-phase coupling in the same track as the follow-up. Phase 3 is ~8-10 commits; cross-phase coupling is ~3-4 commits; combining them with the broadcast fix would balloon the follow-up to ~25 commits and exceed the 1-4 hour Tier 2 budget.
What Tier 1 SHOULD do (concrete first steps)
- Read the 4 documents above. (45 min)
- Decide on Decision 1 scope. (10 min — approve the shrunk 18-commit follow-up, OR the full 24-commit version)
- Create the follow-up track spec at
conductor/tracks/phase2_4_5_call_site_completion_2026MMDD/spec.mdreferencing this prompt + the 4 documents. - Adjust
code_path_audit_20260607spec to include the 5 micro-benchmarks + 2 new actions (provider_history_append,websocket_broadcast) + the "no-TypeError" assertion. - Launch the follow-up track via
/conductor:implement. - After follow-up completes and merges, launch
code_path_audit_20260607.
What Tier 2 is available for
Tier 2 can be re-invoked to implement the follow-up track. The handoff is in docs/handoffs/; the spec will be in conductor/tracks/.../spec.md. Same Tier 2 conventions apply:
- Read all 13
conductor/code_styleguides/*.mdbefore starting - Per-task commit + git note + state.toml update
- Throwaway scripts to
scripts/tier2/artifacts/<track-name>/ - Archive move is the user's job, not Tier 2's
Final note: the bigger vision
The user said: "We are nudging toward a much more interesting and compelling codebase to ideate this ai llm frontend towards something as novel as the rad debugger but for its domain."
The any_type_componentization_20260621 track is reconnaissance for that vision. The follow-up track is "make the codebase match the reconnaissance." code_path_audit_20260607 is "measure the runtime cost of every typed site so the agent debugger UI can read it losslessly." Together: typed code + measured paths + readable dataclasses = the foundation for an agent-debugger frontend.
Don't merge the branch. Use it as input.
— Tier 2