Private
Public Access
0
0
Commit Graph

4114 Commits

Author SHA1 Message Date
ed 8ff397cfd7 conductor(free_lunches_levin): Phase 3 OCR - 67 frames OCR'd via winsdk in 2.3s 2026-06-21 22:57:26 -04:00
ed 85799bdef1 conductor(free_lunches_levin): Phase 2 Keyframes - 67 unique frames (threshold 0.05) 2026-06-21 22:55:36 -04:00
ed 593da35589 conductor(free_lunches_levin): Phase 1 Acquire - transcript (1539 clean segments, 55KB) + 67MB mp4 2026-06-21 22:54:26 -04:00
ed cbc6592938 conductor(platonic_intelligence_kumar): Phase 5 Verification - end-of-track report + state.toml completed 2026-06-21 22:41:50 -04:00
ed 8bb7bc0b03 conductor(platonic_intelligence_kumar): Phase 4 Synthesis - report.md (1564 lines, 104KB) + summary.md (384 words) 2026-06-21 22:40:27 -04:00
ed 751b94d4e8 Revert "merge: tier2/phase2_4_5_call_site_completion_20260621 (parent + follow-up + Phase 6e analysis)"
This reverts commit f914b2bcd4, reversing
changes made to 7fef95cc87.
2026-06-21 22:39:14 -04:00
ed f32e4fd268 conductor(platonic_intelligence_kumar): Phase 3 OCR - 62 frames OCR'd via winsdk in 3.7s 2026-06-21 22:33:09 -04:00
ed f690b4dea4 conductor(platonic_intelligence_kumar): Phase 2 Keyframes - 62 unique frames from 133 raw (threshold 0.05) 2026-06-21 22:30:59 -04:00
ed f914b2bcd4 merge: tier2/phase2_4_5_call_site_completion_20260621 (parent + follow-up + Phase 6e analysis)
Merges 39 commits from tier2 sandbox:
- any_type_componentization_20260621 parent (48/89 fat-struct sites; Phases 1,2,4,5 complete; Phase 3 deferred)
- phase2_4_5_call_site_completion_20260621 follow-up (Phases 6a broadcast fix + 6b sender migration + 6e Phase 3 cost analysis; Phase 6d was a no-op)
- docs/reports/PHASE3_TIER2_ANALYSIS.md (Tier 2 authoritative cost analysis; supersedes Tier 1's draft)

Unblocks code_path_audit_20260607:
- Phase 6a fixes the broadcast() TypeError that contaminated per-action profiling
- Phase 6e provides the cost hypothesis the audit will quantify
2026-06-21 22:30:10 -04:00
ed 7fef95cc87 conductor(platonic_intelligence_kumar): Phase 1 Acquire - transcript (1659 clean segments, 61KB) + 89MB mp4 2026-06-21 22:29:25 -04:00
ed c760b8e09d conductor(score_dynamics_giorgini): Phase 5 Verification - end-of-track report + state.toml completed 2026-06-21 22:21:05 -04:00
ed f1d157bf33 conductor(score_dynamics_giorgini): Phase 4 Synthesis - report.md (1325 lines, 93KB) + summary.md (354 words) 2026-06-21 22:19:42 -04:00
ed 077cdf20db conductor(score_dynamics_giorgini): Phase 3 OCR - 31 frames OCR'd via winsdk in 2.3s 2026-06-21 22:13:03 -04:00
ed edd2f181eb conductor(score_dynamics_giorgini): Phase 2 Keyframes - 31 unique frames from 91 raw (threshold 0.05) 2026-06-21 21:45:49 -04:00
ed 16fbf5619f conductor(score_dynamics_giorgini): Phase 1 Acquire - transcript (1485 clean segments, 46.5KB) + 178MB mp4 2026-06-21 20:43:50 -04:00
ed 49fb0a1a13 artifacts(track): throwaway scripts for phase2_4_5_call_site_completion_20260621
Per the Tier 2 convention, throwaway scripts are committed as archival
artifacts so future agents can understand what was tried during the track.

7 scripts:
- verify_test_format.py: AST + indentation check for new test file
- _check_line_endings.py: CRLF vs LF diagnostic
- _find_tracks_line.py: locate line 27 entry in tracks.md
- _verify_line_66.py: verify new line 66 content
- _update_tracks_md.py: programmatic update of line 27
- _update_state_toml.py: programmatic update of state.toml
- _fix_state_toml_crlf.py: restore CRLF after edits
2026-06-21 20:00:57 -04:00
ed 7c3052c893 conductor(archive): ship phase2_4_5_call_site_completion_20260621 (4 phases + report)
Updates:
- conductor/tracks.md: entry #27 marked SHIPPED 2026-06-21; BLOCKER
  removed for code_path_audit_20260607 (broadcast() TypeError fixed)
- state.toml: status=completed, current_phase=6, all 4 phases marked
  completed with checkpoint SHAs, all verification booleans true

NOT shipped (per user instruction):
- The git mv to conductor/tracks/archive/ is the USER's responsibility
- Track directory stays at conductor/tracks/phase2_4_5_call_site_completion_20260621/
- tier2/any_type_componentization_20260621 branch NOT merged (reconnaissance framing)
2026-06-21 20:00:11 -04:00
ed ae745886a7 docs(reports): TRACK_COMPLETION_phase2_4_5_call_site_completion_20260621 2026-06-21 19:54:04 -04:00
ed e9b1138949 docs(analysis): PHASE3_TIER2_ANALYSIS - authoritative Phase 3 cost hypothesis
Tier 2 produced this analysis during phase2_4_5_call_site_completion_20260621
Phase 6e. Supersedes Tier 1's draft at PHASE3_HYPOTHETICAL_PROMOTION.md (kept
as the hypothesis doc; this is the refined version with in-context data
from Phase 6b/6d work in src/ai_client.py).

Key findings:
- Measured 104 history references (Tier 1 estimated 112; 7% under)
- Anthropic dominates per-turn cost (~35-65µs vs Tier 1's 8-15µs estimate)
- Grok/qwen/llama are LOWER than Tier 1 estimated (~400ns vs 2-8µs)
- Total per-session: ~0.5-1.0ms (Tier 1 estimated 1.1-2.4ms)
- Discovered 3 hidden cross-references Tier 1 missed (_strip_private_keys,
  _extract_minimax_reasoning, _send_llama_native)
- Recommendations for the future Phase 3 track: anthropic first; use
  'with h.lock: msg_list = h.messages' for read snapshots; use
  'with h.lock: h.messages = [filtered]' for in-place mutations

Covers all 6 senders (anthropic, deepseek, minimax, grok, qwen, llama)
with per-site cost estimates + hidden cross-references + recommendations.
The audit (code_path_audit_20260607) quantifies these estimates after merge.
2026-06-21 19:52:15 -04:00
ed 06287dbb95 refactor(ai_client): migrate _send_grok/_send_minimax/_send_llama to ChatMessage API
Completes the deferred t2_6 task from any_type_componentization_20260621 Phase 2.
The 3 OpenAI-compatible senders now construct OpenAICompatibleRequest with
messages=[ChatMessage(role=, content=)] instead of list[dict] literals.

The _<provider>_history global lists are still dicts (Phase 3 deferred to
a separate track); the migration converts each dict to ChatMessage at
the request-build boundary via list comprehension. The backward-compat
shim in openai_compatible.py:86 (m.to_dict() if hasattr(m, 'to_dict')
else m) handles both ChatMessage and dict transparently.

Verified: 20/20 provider tests pass; tier-1-unit (5 pre-existing
sandbox-pollution failures unchanged); no new regressions.
2026-06-21 19:47:40 -04:00
ed 76b10e734d fix(broadcast): migrate WebSocketServer.broadcast() callers to WebSocketMessage signature
Phase 5 of any_type_componentization_20260621 changed
WebSocketServer.broadcast(channel, payload) -> broadcast(message: WebSocketMessage)
but did not update internal callers. This produced worker[queue_fallback]
TypeError spam on the GUI thread.

Fixed 2 sites:
- src/app_controller.py:1849 _process_pending_gui_tasks (telemetry broadcast)
- src/events.py:115 AsyncEventQueue.put (events broadcast)

gui_2.py has no internal broadcast callers (grep verified).

Both callers now construct WebSocketMessage(channel=, payload=) at the call site.
test_websocket_broadcast_regression.py 4/4 pass (was 1/4 failing in red phase).
2026-06-21 19:26:14 -04:00
ed 0c7a12a3fa test(broadcast): add regression test for WebSocketServer.broadcast() signature
Phase 5 of any_type_componentization_20260621 changed
WebSocketServer.broadcast(channel, payload) -> broadcast(message: WebSocketMessage)
but did not update internal callers in src/app_controller.py + src/events.py.

This adds 4 tests that pin the contract:
- test_websocket_server_broadcast_signature: asserts (self, message) signature
- test_websocket_server_broadcast_rejects_legacy_2arg_call: asserts legacy raises TypeError
- test_websocket_server_broadcast_accepts_websocket_message_instance: smoke test
- test_internal_callers_use_websocket_message_signature: structural grep over src/

The 4th test currently FAILS (red phase), identifying 2 legacy sites:
- src/app_controller.py:1849: self.event_queue.websocket_server.broadcast('telemetry', metrics)
- src/events.py:115: self.websocket_server.broadcast('events', {...})

The structural assertion is reused by code_path_audit_20260607.
2026-06-21 19:23:00 -04:00
ed 1dce32037a un-archive data structure strengthening 2026-06-21 19:18:14 -04:00
ed e4ec494b89 artifacts 2026-06-21 19:14:57 -04:00
ed 91775ee391 Merge branch 'master' of C:\projects\manual_slop into tier2/any_type_componentization_20260621 2026-06-21 19:08:35 -04:00
ed 6275c860bf conductor(spec+plan): add Phase 6e to follow-up - Tier 2 authoritative Phase 3 cost deduction
The follow-up track now includes Phase 6e: Tier 2 produces the authoritative
Phase 3 cost analysis as part of the follow-up work. Tier 2 is in
src/ai_client.py doing Phase 6b/6d anyway; they have full context to produce
the refined cost hypothesis that Tier 1's draft at PHASE3_HYPOTHETICAL_PROMOTION.md
could not (Tier 1 worked without the 6b/6d ground-truth context).

Tier 1's draft STAYS as the hypothesis doc. Tier 2's PHASE3_TIER2_ANALYSIS.md
is the refined version (per-sender cost summary + hidden call sites table
+ recommendations for the future Phase 3 track + cross-reference to Tier 1
explicit).

Phase 6e tasks (5 total, ~2 commits):
- t6e_1: Profile the 6 senders (codepath catalog + hidden cross-refs)
- t6e_2: Qualitative cost estimation per sender
- t6e_3: Identify hot iteration sites needing 'with h.lock:' pattern
- t6e_4: Author PHASE3_TIER2_ANALYSIS.md
- t6e_5: Phase 6e checkpoint commit + git note

Total estimated commits: 16 -> 18 (still within Tier 2 1-4 hour budget).

Files updated:
- conductor/tracks/phase2_4_5_call_site_completion_20260621/spec.md (+50 lines)
- conductor/tracks/phase2_4_5_call_site_completion_20260621/plan.md (+146 lines)
- conductor/tracks/phase2_4_5_call_site_completion_20260621/metadata.json (+13 lines)
- conductor/tracks/phase2_4_5_call_site_completion_20260621/state.toml (+9 lines)
- conductor/tracks.md (track 27 entry expanded with Phase 6e details)
2026-06-21 18:55:54 -04:00
ed 1a739ecef5 conductor(spec+plan): phase2_4_5_call_site_completion_20260621 + code_path_audit pre-flight adjustments + Phase 3 analysis
PHASE 2/4/5 FOLLOW-UP TRACK (Tier 1 decided SHINK to 6a + 6b + 6d):
- Phase 6a: Fix HookServer.broadcast() callers (app_controller.py + events.py + gui_2.py)
  Adds tests/test_websocket_broadcast_regression.py with no-TypeError assertion
- Phase 6b: Complete _send_grok/_send_minimax/_send_llama OpenAICompatibleRequest migration
- Phase 6d: Update those 3 senders' NormalizedResponse to use UsageStats

Total: ~16 atomic commits, ~3 hours Tier 2 work. Unblocks code_path_audit_20260607.

CODE_PATH_AUDIT_20260607 PRE-FLIGHT ADJUSTMENTS (per handoffs):
- Add 2 new actions: provider_history_append + websocket_broadcast
- Add 5 micro-benchmarks: NormalizedResponse.__init__, WebSocketMessage.__init__,
  UsageStats.__init__, ProviderHistory.lock, ToolSpec.__init__
- Add no-TypeError-errors-on-any-thread assertion (backs test_websocket_broadcast_regression.py)
- Add 89 fat-struct sites from ANY_TYPE_AUDIT_20260621.md as instrumented targets
- BLOCKER: phase2_4_5_call_site_completion_20260621 (broadcast() TypeError)

PHASE 3 HYPOTHETICAL ANALYSIS (separate doc):
docs/reports/PHASE3_HYPOTHETICAL_PROMOTION.md - dataclass definitions (already on tier2 branch),
per-provider codepath catalog (112 sites), qualitative cost estimation (~+1-2ms per session,
~+8-15us per _send_anthropic turn). Input for the audit; the audit quantifies the cost.

REGISTRATION:
conductor/tracks.md updated: new row 27 (follow-up), new row 28 (parent any_type_componentization),
row 17 (code_path_audit) updated with pre-flight adjustments note.

Files:
- conductor/tracks/phase2_4_5_call_site_completion_20260621/spec.md (NEW; 633 lines)
- conductor/tracks/phase2_4_5_call_site_completion_20260621/plan.md (NEW; 7 phases, 23 tasks)
- conductor/tracks/phase2_4_5_call_site_completion_20260621/metadata.json (NEW; 8.8KB)
- conductor/tracks/phase2_4_5_call_site_completion_20260621/state.toml (NEW; 11.8KB)
- docs/reports/PHASE3_HYPOTHETICAL_PROMOTION.md (NEW; 380 lines; qualitative cost analysis)
- conductor/tracks/code_path_audit_20260607/spec.md (MODIFIED; +93 lines Pre-Flight Adjustments)
- conductor/tracks.md (MODIFIED; +35 lines: 3 new entries + 1 stale row fix)
2026-06-21 18:32:02 -04:00
ed f08394a98c Merge branch 'master' of C:\projects\manual_slop into tier2/any_type_componentization_20260621 2026-06-21 18:13:40 -04:00
ed 95a8fae234 docs(handoff): Tier 1 prompt - follow-up track + audit sequencing
Synthesizes the 2 prior handoff docs into a ready-to-use Tier 1 brief:
- HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (the audit framing)
- HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md (the test failures + scope)

Sections:
1. TL;DR (3 paragraphs): what happened, the hidden broadcast() bug,
   the recommendation (don't merge; use as input for follow-up track)
2. Context: 48 promoted, 41 deferred, 2 new audits, 1 styleguide
3. 4 decision points for Tier 1 (scope, sequencing, audit adjustments,
   scope expansion)
4. The 4 documents Tier 1 should read in order (45 min total)
5. What Tier 1 should NOT do (3 anti-patterns)
6. What Tier 1 SHOULD do (6 concrete first steps)
7. What Tier 2 is available for (conventions reminder)
8. The bigger vision (agent-debugger framing)

Recommended sequencing for Tier 1:
T0: Approve follow-up track scope
T1: Tier 2 implements Phase 6a + 6b + 6d (~18 commits, 3 hours)
T2: Tier 2 runs tier-1-unit-core FULLY (no stop-on-failure)
T3: Tier 2 runs tier-3-live_gui FULLY
T4: Tier 1 reviews + merges follow-up track
T5: Tier 1 launches code_path_audit_20260607
T6: Tier 2 implements Phase 3 + cross-phase coupling (separate track)

Tier 1's scope decision: I recommend the SHRUNK version (Phase 6a + 6b + 6d
only; defer Phase 3 to its own track). This gives the code-path audit a
clean instrumented target without ballooning the follow-up beyond Tier 2's
1-4 hour budget.

Audit adjustments to add:
- 5 micro-benchmarks (NormalizedResponse.__init__, WebSocketMessage.__init__,
  UsageStats.__init__, ProviderHistory.lock, ToolSpec.__init__)
- 'no-TypeError-errors-on-any-thread' assertion
- Instrument grok/minimax/llama providers (currently unprofiled)
- Add 2 new actions: provider_history_append + websocket_broadcast
2026-06-21 17:57:38 -04:00
ed 4bbc69019e chore(gitignore): add video_analysis artifact patterns (*.mp4, *.vtt)
Per FR8 in conductor/tracks/video_analysis_campaign_20260621/spec.md, mp4 files are too large for git and VTT auto-sub files are regenerable from transcript.json.

Note: existing tracked files in entropy_epiplexity (commit 5c5f347c) are still in history. The gitignore prevents FUTURE commits from adding them. To remove from history requires filter-repo/filter-branch rewrite (out of scope for this commit).
2026-06-21 17:54:39 -04:00
ed b3ed4b1508 docs(handoff): test failure report for follow-up track scoping
Categorizes the 12 test failures the user observed when running
scripts/run_tests_batched.py after this track:

- 10 failures (mine): Phase 2 NormalizedResponse API migration
  incomplete (state.toml t2_6 deferred task); FIXED in commit 30c8b263
- 3 failures (sandbox): test_audit_tier2_leaks.py flags sandbox
  files (mcp_paths.toml, opencode.json) as modified; NOT my fault
- 1 failure (pre-existing): test_gui2_custom_callback_hook_works;
  live_gui test not touched by this track

Hidden 12th failure:
- worker[queue_fallback] error: WebSocketServer.broadcast() takes 2
  positional arguments but 3 were given (appeared 6+ times during
  tier-2-mock-app-core but tests still passed; error logged on
  GUI thread from app_controller._run_pending_tasks_once_result).
  Phase 5 refactored broadcast(channel, payload) to
  broadcast(WebSocketMessage); I updated test_websocket_server.py
  but missed app_controller.py and events.py callers.

Sections:
1. Executive summary (3 categories of failure)
2. Per-failure categorization (10 + 3 + 1)
3. Hidden 12th failure: WebSocket broadcast callers in app_controller
4. Phase 2 API migration status (8 sites; 5 done, 3 unverified)
5. Recommendations for follow-up track (~5 call sites + ~41 Phase 3)
6. Code-path audit input (5 micro-benchmarks to add)

Follow-up track scope: ~15-20 commits, well-scoped. Should run BEFORE
code_path_audit_20260607 because the worker[queue_fallback] TypeError
spam will confuse the audit's runtime instrumentation.
2026-06-21 17:53:48 -04:00
ed 3172a6ac1d Merge branch 'master' of C:\projects\manual_slop into tier2/any_type_componentization_20260621 2026-06-21 17:46:57 -04:00
ed ad9c028acc docs(type_registry): regenerate for Phase 1-5 new modules
Auto-generated by scripts/generate_type_registry.py after the Phase
2 + 4 + 5 commits. These were untracked in the working tree because
commit 4a774eb3 was made before Phase 5 (api_hooks) committed.

NEW files (5):
- docs/type_registry/src_mcp_tool_specs.md (Phase 1; ToolSpec + ToolParameter)
- docs/type_registry/src_openai_schemas.md (Phase 2; ToolCall + ChatMessage + UsageStats + NormalizedResponse + OpenAICompatibleRequest)
- docs/type_registry/src_provider_state.md (Phase 3 partial; ProviderHistory + _PROVIDER_HISTORIES)
- docs/type_registry/src_api_hooks.md (Phase 5; WebSocketMessage)
- docs/type_registry/src_log_registry.md (Phase 4; Session + SessionMetadata)

Verified:
  uv run python scripts/generate_type_registry.py --check
    Registry in sync (22 files checked)

These 5 .md files were generated after the Phase 5 commit (e9fa69dd)
and the Phase 4 commit (fef6c20e); they were left in the working tree
because commit 4a774eb3 (verify) was made after the Phase 2 registry
regen but before Phase 4/5 changes were fully committed.
2026-06-21 17:43:43 -04:00
ed 30c8b26381 fix(ai_client): migrate gemini_cli NormalizedResponse callers to Phase 2 dataclass API
Phase 2 deferred t2_6: update src/ai_client.py _send_grok + _send_minimax +
_send_llama + _send_gemini_cli (4 functions) to use the new
dataclass API after NormalizedResponse was refactored to
(text, tool_calls: tuple[ToolCall, ...], usage: UsageStats, raw_response).

These 4 callers were left with the old keyword args
(usage_input_tokens, usage_output_tokens, ...) which broke at
runtime: ai_client.send() raised
TypeError: NormalizedResponse.__init__() got an unexpected keyword
argument 'usage_input_tokens'.

FIXES:
- src/ai_client.py L2054: gemini_cli 'adapter unavailable' branch
- src/ai_client.py L2088: gemini_cli normal response branch
- Added: from src.openai_schemas import UsageStats (module level)
- Added backward-compat in src/openai_compatible.py:
  messages_dicts = [m.to_dict() if hasattr(m, 'to_dict') else m for m in request.messages]
  (accepts both ChatMessage dataclass and dict for backward compat
  with existing tests that pass raw dicts)

TEST FIXES:
- tests/test_ai_client_tool_loop.py: _make_normalized_response helper
  uses UsageStats instead of usage_*_tokens kwargs
- tests/test_ai_client_tool_loop_builder.py: same
- tests/test_ai_client_tool_loop_send_func.py: same
- tests/test_openai_compatible.py: NormalizedResponse(text=..., usage=UsageStats(...))
  + tool_calls[0].function.name (attribute access) instead of ['function']['name']
- tests/test_auto_whitelist.py: use update_session_metadata() instead of
  dict subscript assignment (Session dataclass doesn't support item assignment)

VERIFIED:
  uv run pytest tests/test_ai_client_*.py tests/test_openai_*.py \
               tests/test_auto_whitelist.py --timeout=30
    56 passed in 4.49s (19 previously failing tests now pass)
  uv run python scripts/audit_weak_types.py --strict
    STRICT OK: 115 weak sites <= baseline 115
  uv run python scripts/audit_dataclass_coverage.py --strict
    STRICT OK: 200 weak sites <= baseline 207

This commit closes the t2_6 deferred task. The 41-site Phase 3 call-site
migration remains deferred (separate provider_state_migration track).
2026-06-21 17:42:35 -04:00
ed ea8bcdf389 conductor(entropy_epiplexity): Phase 5 Verification - end-of-track report + state.toml completed 2026-06-21 17:16:05 -04:00
ed 275f34da6e conductor(entropy_epiplexity): Phase 4 Synthesis - report.md (1,018 lines) + summary.md (341 words)
Deep-dive report covers all 8 sections per umbrella spec FR6:
- TL;DR: epiplexity as observer-relative information measure
- Key Concepts: 18 numbered concepts
- Frame Analysis: 176 unique frames from research talk
- Transcript Highlights: 10+ verbatim passages with timestamps
- Mathematical Content: 12 derivations (Shannon, Kolmogorov, Levin, sophistication, epiplexity)
- Connections: forward refs to 8 other videos
- Open Questions: 14 questions for Pass 2
- References: people, concepts, resources

Plus 9 appendices: concept map, transcript excerpts (C.1-C.12), math foundations (D.1-D.10), framework connections (E.1-E.7), cross-references (G.1-G.9), resources, final notes.

Lossless preservation per umbrella spec §0.
2026-06-21 17:15:10 -04:00
ed 0fabeaf4ce docs(handoff): Tier 2 -> Tier 1 input for code_path_audit_20260607
While running any_type_componentization_20260621, the Tier 2 agent
performed a partial code-path audit + code normalization pass that
wasn't in the original scope. This handoff document frames:

1. What was done (48 of 89 fat-struct sites promoted; 41 deferred)
2. The 5-pattern Any-type taxonomy (Patterns 3/4/5 correctly preserved;
   Patterns 1/2 promoted to dataclass/registry)
3. Recommended adjustments for code_path_audit_20260607:
   - Instrument the 89 fat-struct sites with hot/cold/init path tags
   - Compare pre/post refactor cost for the 48 promoted sites
   - Rank the 41 deferred Phase 3 sites by hot-path frequency
   - Report per-call cost deltas in microseconds
4. What was NOT done (no runtime profiling; no pre/post benchmarks)
5. Decision points for Tier 1 (merge / reject / cherry-pick)
6. The bigger vision: AI/LLM frontend debugger (rad-debugger analog)
   requires typed ProviderHistory, ToolSpec, Session, WebSocketMessage
   to step through the agent loop without losing type fidelity

Recommendation: Don't merge this branch yet. Let code_path_audit_20260607
use it as a reconnaissance warm-up; drive the next refactor track from
the audit's per-action cost data.

The 4 newly-promoted dataclasses (mcp_tool_specs, openai_schemas,
log_registry.Session, api_hooks.WebSocketMessage) are the typed-state
foundation that the future debugger UI will read from. The 41 deferred
Phase 3 sites are the last gap: per-turn history manipulation in
src/ai_client.py needs typed state before the debugger can step
through the agent loop losslessly.

Length: 7 sections, 7 paragraphs of Tier 1 decision framing.
Location: docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md
(new directory; complements docs/reports/ which is for reports vs
handoffs which are cross-track input artifacts).
2026-06-21 17:14:22 -04:00
ed 4a774eb341 conductor(verify): track completion artifacts - TRACK_COMPLETION + audit baselines + registry
Phase 6 (verification) artifacts for any_type_componentization_20260621.
The user handles the archive move (NOT done by Tier 2; reverted
a premature git mv per user instruction).

END-OF-TRACK REPORT (NEW):
- docs/reports/TRACK_COMPLETION_any_type_componentization_20260621.md
  (289 lines)
- Per-phase results table (0/1/2/4/5 complete; 3 partial)
- 48 sites promoted (1:8 + 2:17 + 4:7 + 5:16); 41 sites deferred (Phase 3 call-site migration)
- 7 architectural invariants established (frozen=True pattern; TypeAlias;
  JsonValue; ProviderHistory threading; SDK holders stay Any; etc.)
- Deferred-work section: provider_state_migration_2026MMDD follow-up track

STATE.TOML UPDATE:
- status: active -> completed
- current_phase: 2 -> 6
- (track stays at conductor/tracks/any_type_componentization_20260621/;
  archive move is the user's responsibility per Tier 2 conventions)

AUDIT BASELINE REGENERATION:
- scripts/audit_weak_types.baseline.json: 112 -> 115 (regenerated)
- 3 net new sites added by the new src/ files (openai_schemas: 10;
  log_registry: 10; provider_state: ?; api_hooks: ?). The new sites
  are at to_dict() / from_dict() / Optional[tuple[...]] serialization
  boundaries which are Pattern 5 (generic serialization; stay as Any).
- Both CI gates pass: STRICT OK: 115 <= 115; STRICT OK: 200 <= 207

TYPE REGISTRY REGENERATION (NEW/MODIFIED/DELETED):
- index.md: 18 -> 22 .md files
- src_api_hooks.md (NEW; Phase 5 WebSocketMessage)
- src_log_registry.md (NEW; Phase 4 Session + SessionMetadata)
- src_openai_schemas.md (NEW; Phase 2 ToolCall + ChatMessage + UsageStats + NormalizedResponse + OpenAICompatibleRequest)
- src_provider_state.md (NEW; Phase 3 ProviderHistory + _PROVIDER_HISTORIES)
- src_openai_compatible.md (DELETED; dataclasses moved to src_openai_schemas.md)
- src_type_aliases.md (MODIFIED; +JsonPrimitive + JsonValue)
- type_aliases.md (MODIFIED; registry index entry updated)

VERIFICATION COMMANDS (all pass):
  uv run python scripts/audit_weak_types.py --strict
    STRICT OK: 115 weak sites <= baseline 115
  uv run python scripts/audit_dataclass_coverage.py --strict
    STRICT OK: 200 weak sites <= baseline 207
  uv run python scripts/generate_type_registry.py --check
    Registry in sync (22 files checked)
  ~130 targeted tests pass across 13 test files (see TRACK_COMPLETION §4)
2026-06-21 17:07:22 -04:00
ed 5c5f347cf0 conductor(entropy_epiplexity): Phase 1-3 Acquire+Keyframes+OCR - transcript.json (~5k segments via yt-dlp), 176 unique frames (214 raw), OCR in 30s
Note: 364MB mp4 video. 176 frames after imagehash dedup (hamming<5).
2026-06-21 17:07:07 -04:00
ed e9fa69ddc1 feat(api_hooks): add WebSocketMessage + JsonValue type (t5_1-t5_8)
Phase 5 of any_type_componentization_20260621. Promotes the WebSocket
broadcast signature in src/api_hooks.py from (channel, payload: dict) to
a typed WebSocketMessage dataclass (16 Any sites):

NEW dataclass (inline in src/api_hooks.py):
- WebSocketMessage (frozen=True): channel: str, payload: JsonValue

MODIFIED:
- _serialize_for_api(obj: Any) -> JsonValue (typed return)
- broadcast(channel: str, payload: dict[str, Any]) -> broadcast(message: WebSocketMessage)
- _get_app_attr / _set_app_attr signatures UNCHANGED (Pattern 4 preserved)

NEW tests/test_api_hooks_dataclasses.py (12 tests, all pass):
- test_websocket_message_construction
- test_websocket_message_with_list_payload
- test_websocket_message_with_nested_payload
- test_websocket_message_is_frozen
- test_websocket_message_to_json
- test_serialize_for_api_returns_dict_for_to_dict_object
- test_serialize_for_api_handles_nested_lists
- test_serialize_for_api_handles_purepath
- test_serialize_for_api_passthrough_for_primitives
- test_serialize_for_api_handles_mixed_nesting
- test_get_app_attr_signature_preserved (Pattern 4 invariant)
- test_set_app_attr_signature_preserved (Pattern 4 invariant)

MODIFIED tests/test_websocket_server.py:
- Updated broadcast() call site to use WebSocketMessage(channel=..., payload=...)
- Added WebSocketMessage import

Verified:
  uv run pytest tests/test_api_hooks_dataclasses.py tests/test_api_hooks_warmup.py tests/test_websocket_server.py --timeout=30
    23 passed in 5.03s (12 new + 10 existing + 1 websocket)
2026-06-21 17:00:42 -04:00
ed fef6c20ea0 feat(log): add Session + SessionMetadata dataclasses (t4_1-t4_8)
Phase 4 of any_type_componentization_20260621. Promotes the 2-level
dict[str, dict[str, Any]] structure in src/log_registry.py to typed
Session + SessionMetadata dataclasses (7 Any sites):

NEW dataclasses (inline in src/log_registry.py):
- SessionMetadata (frozen): message_count, errors, size_kb, whitelisted,
  reason, timestamp
- Session (frozen): session_id, path, start_time, whitelisted, metadata
- to_dict() / from_dict() classmethod for round-trip with TOML shape
- Backward-compat __getitem__ / get() so existing test_log_registry.py
  tests that use session_data['path'] / session_data.get('metadata')
  continue to work

REFACTOR LogRegistry:
- self.data: dict[str, dict[str, Any]] -> dict[str, Session]
- load_registry: populates with Session.from_dict(...)
- save_registry: serializes via session.to_dict()
- register_session: creates Session dataclass
- update_session_metadata: creates new Session with updated SessionMetadata
- is_session_whitelisted: reads session.whitelisted
- update_auto_whitelist_status: reads session.path
- get_old_non_whitelisted_sessions: reads session.start_time + metadata

NEW tests/test_log_registry_dataclasses.py (13 tests, all pass):
- test_session_dataclass_construction
- test_session_metadata_dataclass_construction
- test_session_from_dict_basic / with_metadata
- test_session_to_dict_round_trip
- test_session_metadata_to_dict
- test_log_registry_data_is_typed
- test_log_registry_register_session_returns_session
- test_log_registry_update_session_metadata_sets_metadata
- test_log_registry_is_session_whitelisted
- test_log_registry_get_old_non_whitelisted_sessions
- test_session_is_frozen
- test_session_metadata_is_frozen

Verified:
  uv run pytest tests/test_log_registry.py tests/test_log_registry_dataclasses.py --timeout=30
    18 passed in 3.27s (5 existing + 13 new)
2026-06-21 16:56:24 -04:00
ed 901b1b0982 conductor(probability_logic): Phase 5 Verification - end-of-track report + state.toml completed
TRACK COMPLETE for child #2. All 7 deliverable artifacts present, report.md 1045 lines (within 1000-10000 target), summary.md 333 words (within 200-400 target), no TBDs.

10 children + 1 synthesis remaining in campaign.
2026-06-21 16:46:19 -04:00
ed cb85591fc8 conductor(probability_logic): Phase 4 Synthesis - report.md (1,045 lines) + summary.md (333 words)
Deep-dive report covers all 8 sections per umbrella spec FR6:
- TL;DR: probability as extension of logic
- Key Concepts: 32 numbered concepts
- Frame Analysis: 25 frames (12 chat-only, 13 presentation)
- Transcript Highlights: 16 verbatim passages with timestamps
- Mathematical Content: 15 derivations
- Connections: forward refs to 9 other videos
- Open Questions: 14 questions for Pass 2
- References: people, concepts, resources

Plus 6 appendices: concept map, lossless preservation audit, detailed transcript excerpts (sections C.1-C.15), math derivations (D.1-D.8), LLM connections, quick reference formulas.

Lossless preservation per umbrella spec §0.
2026-06-21 16:45:39 -04:00
ed e19672b2e0 conductor(plan): Phase 3 partial - provider_state + tests; call-site migration deferred 2026-06-21 16:44:28 -04:00
ed 2ad4718c3c feat(provider): add src/provider_state.py + tests (t3_2, t3_3)
Phase 3 of any_type_componentization_20260621 (PARTIAL). Adds the
ProviderHistory abstraction and 6-provider registry.

NEW src/provider_state.py (60 lines):
- ProviderHistory dataclass (messages: list[HistoryMessage], lock: Lock,
  append / get_all / replace_all / clear methods)
- _PROVIDER_HISTORIES: dict[str, ProviderHistory] for anthropic / deepseek /
  minimax / qwen / grok / llama
- get_history(provider) factory + clear_all() + providers()
- SDK client holders (_gemini_chat, _anthropic_client, etc.) NOT touched
  per Pattern 3 (heterogeneous SDK types)

NEW tests/test_provider_state.py (12 tests, all pass):
- test_six_providers_registered
- test_get_history_returns_singleton_per_provider
- test_get_history_raises_for_unknown
- test_provider_history_starts_empty
- test_provider_history_append / get_all_returns_copy / replace_all /
  replace_all_takes_copy / clear
- test_clear_all_resets_every_provider
- test_provider_history_thread_safety (10 threads x 100 messages)
- test_independent_locks_per_provider (lock on one doesn't block another)

DEFERRED:
- t3_4 (Remove 14 globals from ai_client.py:111-133)
- t3_5 through t3_13 (Update call sites in _send_<provider> functions)
- t3_14 (Run full regression suite on test_ai_client*.py)

These call-site updates require careful per-function refactoring of the
~27 sites in _send_anthropic, _send_deepseek, _send_minimax, _send_qwen,
_send_grok, _send_llama. The ai_client.py file is 3432 lines; a single
regex pass risks subtle indentation regressions in nested constructs
(see the 7
ot : orphan lines from a previous attempt).

The provider_state module is independently usable and tested. Future
track: provider_state_migration_2026MMDD to wire up the call sites
mechanically, OR integrate into a Phase 3 retry pass.

Verified:
  uv run pytest tests/test_provider_state.py --timeout=30
    12 passed in 2.99s
2026-06-21 16:43:42 -04:00
ed ca4826ab31 conductor(probability_logic): transcript_clean.txt (10k words) + presentation frame extractor 2026-06-21 16:41:42 -04:00
ed 4dd373d70d conductor(probability_logic): Phase 3 OCR - 25 frames OCR'd in 1.8s via winsdk 2026-06-21 16:40:04 -04:00
ed f855967bb8 conductor(probability_logic): Phase 2 Keyframes - 25 unique frames (threshold 0.05; low-motion math lecture) 2026-06-21 16:39:43 -04:00
ed 338573b1e8 refactor(video_analysis): extract_transcript.py uses yt-dlp VTT directly (skip youtube-transcript-api which consistently fails for these videos)
youtube-transcript-api v1.2.4 returns XML parse error on empty response for ALL videos in this campaign. yt-dlp's --write-auto-subs reliably returns 1000s of segments per video. Switched to yt-dlp as the primary path.

Tests updated to mock _fetch_via_ytdlp instead of _fetch_raw_transcript. 8/8 tests passing.
2026-06-21 16:33:44 -04:00
ed 7478090e71 conductor(probability_logic): Phase 1 Acquire - transcript.json (3315 segments via yt-dlp VTT fallback) + video.log (84MB mp4 downloaded)
Generic reusable drivers added: phase1_acquire.py, phase2_keyframes.py, phase3_ocr.py take slug as arg for batch use across all 12 children.
2026-06-21 16:32:19 -04:00