manual_slop

ed/manual_slop

Private

Public Access

Fork 0

Commit Graph

Author SHA1 Message Date

Author	SHA1	Message	Date
ed	95a8fae234	docs(handoff): Tier 1 prompt - follow-up track + audit sequencing Synthesizes the 2 prior handoff docs into a ready-to-use Tier 1 brief: - HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (the audit framing) - HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md (the test failures + scope) Sections: 1. TL;DR (3 paragraphs): what happened, the hidden broadcast() bug, the recommendation (don't merge; use as input for follow-up track) 2. Context: 48 promoted, 41 deferred, 2 new audits, 1 styleguide 3. 4 decision points for Tier 1 (scope, sequencing, audit adjustments, scope expansion) 4. The 4 documents Tier 1 should read in order (45 min total) 5. What Tier 1 should NOT do (3 anti-patterns) 6. What Tier 1 SHOULD do (6 concrete first steps) 7. What Tier 2 is available for (conventions reminder) 8. The bigger vision (agent-debugger framing) Recommended sequencing for Tier 1: T0: Approve follow-up track scope T1: Tier 2 implements Phase 6a + 6b + 6d (~18 commits, 3 hours) T2: Tier 2 runs tier-1-unit-core FULLY (no stop-on-failure) T3: Tier 2 runs tier-3-live_gui FULLY T4: Tier 1 reviews + merges follow-up track T5: Tier 1 launches code_path_audit_20260607 T6: Tier 2 implements Phase 3 + cross-phase coupling (separate track) Tier 1's scope decision: I recommend the SHRUNK version (Phase 6a + 6b + 6d only; defer Phase 3 to its own track). This gives the code-path audit a clean instrumented target without ballooning the follow-up beyond Tier 2's 1-4 hour budget. Audit adjustments to add: - 5 micro-benchmarks (NormalizedResponse.__init__, WebSocketMessage.__init__, UsageStats.__init__, ProviderHistory.lock, ToolSpec.__init__) - 'no-TypeError-errors-on-any-thread' assertion - Instrument grok/minimax/llama providers (currently unprofiled) - Add 2 new actions: provider_history_append + websocket_broadcast	2026-06-21 17:57:38 -04:00
ed	b3ed4b1508	docs(handoff): test failure report for follow-up track scoping Categorizes the 12 test failures the user observed when running scripts/run_tests_batched.py after this track: - 10 failures (mine): Phase 2 NormalizedResponse API migration incomplete (state.toml t2_6 deferred task); FIXED in commit `30c8b263` - 3 failures (sandbox): test_audit_tier2_leaks.py flags sandbox files (mcp_paths.toml, opencode.json) as modified; NOT my fault - 1 failure (pre-existing): test_gui2_custom_callback_hook_works; live_gui test not touched by this track Hidden 12th failure: - worker[queue_fallback] error: WebSocketServer.broadcast() takes 2 positional arguments but 3 were given (appeared 6+ times during tier-2-mock-app-core but tests still passed; error logged on GUI thread from app_controller._run_pending_tasks_once_result). Phase 5 refactored broadcast(channel, payload) to broadcast(WebSocketMessage); I updated test_websocket_server.py but missed app_controller.py and events.py callers. Sections: 1. Executive summary (3 categories of failure) 2. Per-failure categorization (10 + 3 + 1) 3. Hidden 12th failure: WebSocket broadcast callers in app_controller 4. Phase 2 API migration status (8 sites; 5 done, 3 unverified) 5. Recommendations for follow-up track (~5 call sites + ~41 Phase 3) 6. Code-path audit input (5 micro-benchmarks to add) Follow-up track scope: ~15-20 commits, well-scoped. Should run BEFORE code_path_audit_20260607 because the worker[queue_fallback] TypeError spam will confuse the audit's runtime instrumentation.	2026-06-21 17:53:48 -04:00
ed	0fabeaf4ce	docs(handoff): Tier 2 -> Tier 1 input for code_path_audit_20260607 While running any_type_componentization_20260621, the Tier 2 agent performed a partial code-path audit + code normalization pass that wasn't in the original scope. This handoff document frames: 1. What was done (48 of 89 fat-struct sites promoted; 41 deferred) 2. The 5-pattern Any-type taxonomy (Patterns 3/4/5 correctly preserved; Patterns 1/2 promoted to dataclass/registry) 3. Recommended adjustments for code_path_audit_20260607: - Instrument the 89 fat-struct sites with hot/cold/init path tags - Compare pre/post refactor cost for the 48 promoted sites - Rank the 41 deferred Phase 3 sites by hot-path frequency - Report per-call cost deltas in microseconds 4. What was NOT done (no runtime profiling; no pre/post benchmarks) 5. Decision points for Tier 1 (merge / reject / cherry-pick) 6. The bigger vision: AI/LLM frontend debugger (rad-debugger analog) requires typed ProviderHistory, ToolSpec, Session, WebSocketMessage to step through the agent loop without losing type fidelity Recommendation: Don't merge this branch yet. Let code_path_audit_20260607 use it as a reconnaissance warm-up; drive the next refactor track from the audit's per-action cost data. The 4 newly-promoted dataclasses (mcp_tool_specs, openai_schemas, log_registry.Session, api_hooks.WebSocketMessage) are the typed-state foundation that the future debugger UI will read from. The 41 deferred Phase 3 sites are the last gap: per-turn history manipulation in src/ai_client.py needs typed state before the debugger can step through the agent loop losslessly. Length: 7 sections, 7 paragraphs of Tier 1 decision framing. Location: docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (new directory; complements docs/reports/ which is for reports vs handoffs which are cross-track input artifacts).	2026-06-21 17:14:22 -04:00

95a8fae234

docs(handoff): Tier 1 prompt - follow-up track + audit sequencing

Synthesizes the 2 prior handoff docs into a ready-to-use Tier 1 brief:
- HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (the audit framing)
- HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md (the test failures + scope)

Sections:
1. TL;DR (3 paragraphs): what happened, the hidden broadcast() bug,
   the recommendation (don't merge; use as input for follow-up track)
2. Context: 48 promoted, 41 deferred, 2 new audits, 1 styleguide
3. 4 decision points for Tier 1 (scope, sequencing, audit adjustments,
   scope expansion)
4. The 4 documents Tier 1 should read in order (45 min total)
5. What Tier 1 should NOT do (3 anti-patterns)
6. What Tier 1 SHOULD do (6 concrete first steps)
7. What Tier 2 is available for (conventions reminder)
8. The bigger vision (agent-debugger framing)

Recommended sequencing for Tier 1:
T0: Approve follow-up track scope
T1: Tier 2 implements Phase 6a + 6b + 6d (~18 commits, 3 hours)
T2: Tier 2 runs tier-1-unit-core FULLY (no stop-on-failure)
T3: Tier 2 runs tier-3-live_gui FULLY
T4: Tier 1 reviews + merges follow-up track
T5: Tier 1 launches code_path_audit_20260607
T6: Tier 2 implements Phase 3 + cross-phase coupling (separate track)

Tier 1's scope decision: I recommend the SHRUNK version (Phase 6a + 6b + 6d
only; defer Phase 3 to its own track). This gives the code-path audit a
clean instrumented target without ballooning the follow-up beyond Tier 2's
1-4 hour budget.

Audit adjustments to add:
- 5 micro-benchmarks (NormalizedResponse.__init__, WebSocketMessage.__init__,
  UsageStats.__init__, ProviderHistory.lock, ToolSpec.__init__)
- 'no-TypeError-errors-on-any-thread' assertion
- Instrument grok/minimax/llama providers (currently unprofiled)
- Add 2 new actions: provider_history_append + websocket_broadcast

2026-06-21 17:57:38 -04:00

b3ed4b1508

docs(handoff): test failure report for follow-up track scoping

Categorizes the 12 test failures the user observed when running
scripts/run_tests_batched.py after this track:

- 10 failures (mine): Phase 2 NormalizedResponse API migration
  incomplete (state.toml t2_6 deferred task); FIXED in commit 30c8b263
- 3 failures (sandbox): test_audit_tier2_leaks.py flags sandbox
  files (mcp_paths.toml, opencode.json) as modified; NOT my fault
- 1 failure (pre-existing): test_gui2_custom_callback_hook_works;
  live_gui test not touched by this track

Hidden 12th failure:
- worker[queue_fallback] error: WebSocketServer.broadcast() takes 2
  positional arguments but 3 were given (appeared 6+ times during
  tier-2-mock-app-core but tests still passed; error logged on
  GUI thread from app_controller._run_pending_tasks_once_result).
  Phase 5 refactored broadcast(channel, payload) to
  broadcast(WebSocketMessage); I updated test_websocket_server.py
  but missed app_controller.py and events.py callers.

Sections:
1. Executive summary (3 categories of failure)
2. Per-failure categorization (10 + 3 + 1)
3. Hidden 12th failure: WebSocket broadcast callers in app_controller
4. Phase 2 API migration status (8 sites; 5 done, 3 unverified)
5. Recommendations for follow-up track (~5 call sites + ~41 Phase 3)
6. Code-path audit input (5 micro-benchmarks to add)

Follow-up track scope: ~15-20 commits, well-scoped. Should run BEFORE
code_path_audit_20260607 because the worker[queue_fallback] TypeError
spam will confuse the audit's runtime instrumentation.

2026-06-21 17:53:48 -04:00

0fabeaf4ce

docs(handoff): Tier 2 -> Tier 1 input for code_path_audit_20260607

While running any_type_componentization_20260621, the Tier 2 agent
performed a partial code-path audit + code normalization pass that
wasn't in the original scope. This handoff document frames:

1. What was done (48 of 89 fat-struct sites promoted; 41 deferred)
2. The 5-pattern Any-type taxonomy (Patterns 3/4/5 correctly preserved;
   Patterns 1/2 promoted to dataclass/registry)
3. Recommended adjustments for code_path_audit_20260607:
   - Instrument the 89 fat-struct sites with hot/cold/init path tags
   - Compare pre/post refactor cost for the 48 promoted sites
   - Rank the 41 deferred Phase 3 sites by hot-path frequency
   - Report per-call cost deltas in microseconds
4. What was NOT done (no runtime profiling; no pre/post benchmarks)
5. Decision points for Tier 1 (merge / reject / cherry-pick)
6. The bigger vision: AI/LLM frontend debugger (rad-debugger analog)
   requires typed ProviderHistory, ToolSpec, Session, WebSocketMessage
   to step through the agent loop without losing type fidelity

Recommendation: Don't merge this branch yet. Let code_path_audit_20260607
use it as a reconnaissance warm-up; drive the next refactor track from
the audit's per-action cost data.

The 4 newly-promoted dataclasses (mcp_tool_specs, openai_schemas,
log_registry.Session, api_hooks.WebSocketMessage) are the typed-state
foundation that the future debugger UI will read from. The 41 deferred
Phase 3 sites are the last gap: per-turn history manipulation in
src/ai_client.py needs typed state before the debugger can step
through the agent loop losslessly.

Length: 7 sections, 7 paragraphs of Tier 1 decision framing.
Location: docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md
(new directory; complements docs/reports/ which is for reports vs
handoffs which are cross-track input artifacts).

2026-06-21 17:14:22 -04:00

3 Commits