Private
Public Access
0
0
Files
manual_slop/docs/handoffs/PROMPT_FOR_TIER_1.md
T

10 KiB

Tier 1 Prompt: Follow-up Track + Code-Path Audit Sequencing

From: Tier 2 Tech Lead (autonomous sandbox, any_type_componentization_20260621) To: Tier 1 Orchestrator Date: 2026-06-21 Status: Branch tier2/any_type_componentization_20260621 is at 24 commits, ready for review (not merge).


TL;DR (read this first)

Tier 2 ran any_type_componentization_20260621 and the result is reconnaissance-grade, not merge-grade. The track did 48 of 89 fat-struct promotions cleanly (Phase 1, 2, 4, 5), but deferred Phase 3 entirely and left one runtime bug that didn't surface in my targeted regression suite: WebSocketServer.broadcast() callers in src/app_controller.py and src/events.py still use the old (channel, payload) signature after Phase 5 changed it to (message: WebSocketMessage). This produces worker[queue_fallback] error: WebSocketServer.broadcast() takes 2 positional arguments but 3 were given spam in tier-2-mock-app-core.

Tier 1 should: (a) approve a ~15-commit follow-up track that closes the deferred work and the broadcast() bug, then (b) sequence code_path_audit_20260607 to use the follow-up's output as input.

Do not merge this branch yet. Use it as the spec input for the follow-up track.


Context: what happened in this track

Input artifact: docs/reports/ANY_TYPE_AUDIT_20260621.md identified 89 fat-struct sites across 5 candidates (mcp_tool_specs: 8, openai_schemas: 17, provider_state: 41, log_registry.Session: 7, api_hooks.WebSocketMessage: 16).

Output:

  • 48 sites promoted: Phase 1 (ToolSpec + ToolParameter registry; 45 tools), Phase 2 (ChatMessage + UsageStats + ToolCall + refactored NormalizedResponse + OpenAICompatibleRequest), Phase 4 (Session + SessionMetadata with backward-compat __getitem__), Phase 5 (WebSocketMessage + JsonValue).
  • 41 sites deferred: Phase 3 (provider_state.ProviderHistory dataclass exists; the 27 call sites in src/ai_client.py _send_<provider> functions remain on the legacy _anthropic_history / _deepseek_history / etc. globals).
  • 2 new audit scripts: scripts/audit_dataclass_coverage.py (CI gate; baseline = 207 → post-track = 200).
  • 1 styleguide update: conductor/code_styleguides/type_aliases.md §12 "When to Promote TypeAlias to dataclass" (98 lines; the codified rule future agents will follow).
  • 1 end-of-track report: docs/reports/TRACK_COMPLETION_any_type_componentization_20260621.md.

Code-path audit input doc: docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (commit 0fabeaf4). Tier 1 should read this BEFORE scoping code_path_audit_20260607.

Failure report doc: docs/handoffs/HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md (commit d7b6b229). Tier 1 should read this BEFORE scoping the follow-up track.


Tier 1 decision points

Decision 1: Approve the follow-up track?

Recommended scope (per HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md):

Task Scope Est. commits
Phase 6a: Fix WebSocketServer.broadcast() callers Grep src/ for \.broadcast\(; replace broadcast(channel, payload) with broadcast(WebSocketMessage(channel=, payload=)) in src/app_controller.py:_run_pending_tasks_once_result, src/events.py, src/gui_2.py. Add regression tests. 4-6
Phase 6b: Complete t2_6 (OpenAICompatibleRequest callers in _send_grok, _send_minimax, _send_llama) Migrate the 3 remaining _send_<provider> functions in src/ai_client.py to construct OpenAICompatibleRequest(messages=[ChatMessage(...)], ...) instead of messages=[{"role": ..., "content": ...}] 3-4
Phase 6c: Complete Phase 3 (provider_state call-site migration) Replace _anthropic_history / _anthropic_history_lock etc. in src/ai_client.py with provider_state.get_history('anthropic'). ~27 call sites. 8-10
Phase 6d: Update _send_grok / _send_minimax / _send_llama callers to use new ChatMessage / UsageStats Migration of NormalizedResponse(text=..., usage_input_tokens=..., ...) to NormalizedResponse(text=..., usage=UsageStats(...)) in the 3 send functions. 3-4
Total ~18-24 commits

Tier 1 should decide: approve this scope, OR shrink (defer Phase 3 entirely to a separate track; do just Phase 6a + 6b + 6d to unblock the audit), OR expand (also include the cross-phase coupling fix: migrate OpenAICompatibleRequest.tools from list[dict[str, Any]] to list[ToolSpec]).

My recommendation: shrink. Phase 3 + cross-phase coupling are separate concerns. Do just Phase 6a + 6b + 6d (the code-path-honest part: every NormalizedResponse construction site uses the new API; every broadcast() caller uses the new signature). Defer Phase 3 + cross-phase coupling to their own tracks. This gives code_path_audit_20260607 a clean instrumented target.

Decision 2: Sequence code_path_audit_20260607 after the follow-up?

Yes. The audit's trace_action output will be polluted by worker[queue_fallback] error: WebSocketServer.broadcast() takes 2 positional arguments but 3 were given unless Phase 6a lands first. The audit's per-action profiling assumes no TypeError spam on the GUI thread; if the broadcast call site raises, the audit's timing data is contaminated.

Recommended sequencing:

T0:  Tier 1 approves follow-up track                  (decision 1)
T1:  Tier 2 implements Phase 6a + 6b + 6d            (~3 hours, ~18 commits)
T2:  Tier 2 runs tier-1-unit-core FULLY               (no stop-on-failure)
T3:  Tier 2 runs tier-3-live_gui FULLY                (no stop-on-failure)
T4:  Tier 1 reviews + merges follow-up track
T5:  Tier 1 launches code_path_audit_20260607
T6:  Tier 2 implements Phase 3 + cross-phase coupling (separate track, post-audit)

Decision 3: Adjust code_path_audit_20260607 per the handoff doc

The existing code_path_audit_20260607 spec (per ANY_TYPE_AUDIT_20260621.md §5) calls for per-action profiling. Tier 1 should ADD:

  1. The 5 micro-benchmarks listed in HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md §7 (NormalizedResponse.init, WebSocketMessage.init, UsageStats.init, ProviderHistory.lock, ToolSpec.init).
  2. A "no-TypeError-errors-on-any-thread" assertion: the audit should fail if any worker[queue_fallback] error: WebSocketServer.broadcast() appears in the test output during the audit's per-action profiling. (Phase 6a's regression test should make this assertion.)
  3. The 3 OpenAI-compatible providers (grok, minimax, llama) — currently unprofiled — should be instrumented, since they're the hot paths Phase 6b will migrate.

Decision 4: Code-Path Audit pre-flight scope expansion

The existing code_path_audit_20260607 spec scopes 3 actions (ai_message_lifecycle, discussion_save_load, gui_startup). Tier 1 should ADD:

  • provider_history_append: every _send_<provider> path appends to history; the audit should measure per-turn latency.
  • websocket_broadcast: the GUI thread broadcasts; the audit should measure broadcast throughput under load.

These are the hot paths Phase 3 + Phase 6a will touch. The audit's data will directly inform whether the Phase 3 + Phase 6a refactors are worth the cost.


The 4 documents Tier 1 should read (in this order)

  1. docs/reports/ANY_TYPE_AUDIT_20260621.md (input artifact; the 89 sites and the 5-pattern taxonomy)
  2. docs/reports/TRACK_COMPLETION_any_type_componentization_20260621.md (what was done, what was deferred, the per-phase results table)
  3. docs/handoffs/HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md (test failure categorization; the 4-section follow-up scope; the micro-benchmarks)
  4. docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (the 5-pattern taxonomy applied to runtime; the "the code is the agent debugger" framing; the recommendation not to merge this branch)

Total read time: ~45 minutes for Tier 1 to come up to speed.


What Tier 1 should NOT do

  • Don't merge tier2/any_type_componentization_20260621 as-is. The 1 runtime bug (broadcast() in src/app_controller.py) makes the branch not merge-grade.
  • Don't launch code_path_audit_20260607 before the follow-up track. The TypeError spam will pollute the audit's per-action profiling.
  • Don't try to fix Phase 3 + cross-phase coupling in the same track as the follow-up. Phase 3 is ~8-10 commits; cross-phase coupling is ~3-4 commits; combining them with the broadcast fix would balloon the follow-up to ~25 commits and exceed the 1-4 hour Tier 2 budget.

What Tier 1 SHOULD do (concrete first steps)

  1. Read the 4 documents above. (45 min)
  2. Decide on Decision 1 scope. (10 min — approve the shrunk 18-commit follow-up, OR the full 24-commit version)
  3. Create the follow-up track spec at conductor/tracks/phase2_4_5_call_site_completion_2026MMDD/spec.md referencing this prompt + the 4 documents.
  4. Adjust code_path_audit_20260607 spec to include the 5 micro-benchmarks + 2 new actions (provider_history_append, websocket_broadcast) + the "no-TypeError" assertion.
  5. Launch the follow-up track via /conductor:implement.
  6. After follow-up completes and merges, launch code_path_audit_20260607.

What Tier 2 is available for

Tier 2 can be re-invoked to implement the follow-up track. The handoff is in docs/handoffs/; the spec will be in conductor/tracks/.../spec.md. Same Tier 2 conventions apply:

  • Read all 13 conductor/code_styleguides/*.md before starting
  • Per-task commit + git note + state.toml update
  • Throwaway scripts to scripts/tier2/artifacts/<track-name>/
  • Archive move is the user's job, not Tier 2's

Final note: the bigger vision

The user said: "We are nudging toward a much more interesting and compelling codebase to ideate this ai llm frontend towards something as novel as the rad debugger but for its domain."

The any_type_componentization_20260621 track is reconnaissance for that vision. The follow-up track is "make the codebase match the reconnaissance." code_path_audit_20260607 is "measure the runtime cost of every typed site so the agent debugger UI can read it losslessly." Together: typed code + measured paths + readable dataclasses = the foundation for an agent-debugger frontend.

Don't merge the branch. Use it as input.

— Tier 2