manual_slop

Private

Public Access

Author	SHA1	Message	Date
ed	6dfd0e5a7e	test(broadcast): add regression test for WebSocketServer.broadcast() signature Phase 5 of any_type_componentization_20260621 changed WebSocketServer.broadcast(channel, payload) -> broadcast(message: WebSocketMessage) but did not update internal callers in src/app_controller.py + src/events.py. This adds 4 tests that pin the contract: - test_websocket_server_broadcast_signature: asserts (self, message) signature - test_websocket_server_broadcast_rejects_legacy_2arg_call: asserts legacy raises TypeError - test_websocket_server_broadcast_accepts_websocket_message_instance: smoke test - test_internal_callers_use_websocket_message_signature: structural grep over src/ The 4th test currently FAILS (red phase), identifying 2 legacy sites: - src/app_controller.py:1849: self.event_queue.websocket_server.broadcast('telemetry', metrics) - src/events.py:115: self.websocket_server.broadcast('events', {...}) The structural assertion is reused by code_path_audit_20260607.	2026-06-21 19:23:00 -04:00
ed	9a354ef3b2	artifacts	2026-06-21 19:14:57 -04:00
ed	5033b401e6	Merge branch 'master' of C:\projects\manual_slop into tier2/any_type_componentization_20260621	2026-06-21 19:08:35 -04:00
ed	6275c860bf	conductor(spec+plan): add Phase 6e to follow-up - Tier 2 authoritative Phase 3 cost deduction The follow-up track now includes Phase 6e: Tier 2 produces the authoritative Phase 3 cost analysis as part of the follow-up work. Tier 2 is in src/ai_client.py doing Phase 6b/6d anyway; they have full context to produce the refined cost hypothesis that Tier 1's draft at PHASE3_HYPOTHETICAL_PROMOTION.md could not (Tier 1 worked without the 6b/6d ground-truth context). Tier 1's draft STAYS as the hypothesis doc. Tier 2's PHASE3_TIER2_ANALYSIS.md is the refined version (per-sender cost summary + hidden call sites table + recommendations for the future Phase 3 track + cross-reference to Tier 1 explicit). Phase 6e tasks (5 total, ~2 commits): - t6e_1: Profile the 6 senders (codepath catalog + hidden cross-refs) - t6e_2: Qualitative cost estimation per sender - t6e_3: Identify hot iteration sites needing 'with h.lock:' pattern - t6e_4: Author PHASE3_TIER2_ANALYSIS.md - t6e_5: Phase 6e checkpoint commit + git note Total estimated commits: 16 -> 18 (still within Tier 2 1-4 hour budget). Files updated: - conductor/tracks/phase2_4_5_call_site_completion_20260621/spec.md (+50 lines) - conductor/tracks/phase2_4_5_call_site_completion_20260621/plan.md (+146 lines) - conductor/tracks/phase2_4_5_call_site_completion_20260621/metadata.json (+13 lines) - conductor/tracks/phase2_4_5_call_site_completion_20260621/state.toml (+9 lines) - conductor/tracks.md (track 27 entry expanded with Phase 6e details)	2026-06-21 18:55:54 -04:00
ed	1a739ecef5	conductor(spec+plan): phase2_4_5_call_site_completion_20260621 + code_path_audit pre-flight adjustments + Phase 3 analysis PHASE 2/4/5 FOLLOW-UP TRACK (Tier 1 decided SHINK to 6a + 6b + 6d): - Phase 6a: Fix HookServer.broadcast() callers (app_controller.py + events.py + gui_2.py) Adds tests/test_websocket_broadcast_regression.py with no-TypeError assertion - Phase 6b: Complete _send_grok/_send_minimax/_send_llama OpenAICompatibleRequest migration - Phase 6d: Update those 3 senders' NormalizedResponse to use UsageStats Total: ~16 atomic commits, ~3 hours Tier 2 work. Unblocks code_path_audit_20260607. CODE_PATH_AUDIT_20260607 PRE-FLIGHT ADJUSTMENTS (per handoffs): - Add 2 new actions: provider_history_append + websocket_broadcast - Add 5 micro-benchmarks: NormalizedResponse.__init__, WebSocketMessage.__init__, UsageStats.__init__, ProviderHistory.lock, ToolSpec.__init__ - Add no-TypeError-errors-on-any-thread assertion (backs test_websocket_broadcast_regression.py) - Add 89 fat-struct sites from ANY_TYPE_AUDIT_20260621.md as instrumented targets - BLOCKER: phase2_4_5_call_site_completion_20260621 (broadcast() TypeError) PHASE 3 HYPOTHETICAL ANALYSIS (separate doc): docs/reports/PHASE3_HYPOTHETICAL_PROMOTION.md - dataclass definitions (already on tier2 branch), per-provider codepath catalog (112 sites), qualitative cost estimation (~+1-2ms per session, ~+8-15us per _send_anthropic turn). Input for the audit; the audit quantifies the cost. REGISTRATION: conductor/tracks.md updated: new row 27 (follow-up), new row 28 (parent any_type_componentization), row 17 (code_path_audit) updated with pre-flight adjustments note. Files: - conductor/tracks/phase2_4_5_call_site_completion_20260621/spec.md (NEW; 633 lines) - conductor/tracks/phase2_4_5_call_site_completion_20260621/plan.md (NEW; 7 phases, 23 tasks) - conductor/tracks/phase2_4_5_call_site_completion_20260621/metadata.json (NEW; 8.8KB) - conductor/tracks/phase2_4_5_call_site_completion_20260621/state.toml (NEW; 11.8KB) - docs/reports/PHASE3_HYPOTHETICAL_PROMOTION.md (NEW; 380 lines; qualitative cost analysis) - conductor/tracks/code_path_audit_20260607/spec.md (MODIFIED; +93 lines Pre-Flight Adjustments) - conductor/tracks.md (MODIFIED; +35 lines: 3 new entries + 1 stale row fix)	2026-06-21 18:32:02 -04:00
ed	1b433fdb72	Merge branch 'master' of C:\projects\manual_slop into tier2/any_type_componentization_20260621	2026-06-21 18:13:40 -04:00
ed	43c47c66d7	docs(handoff): Tier 1 prompt - follow-up track + audit sequencing Synthesizes the 2 prior handoff docs into a ready-to-use Tier 1 brief: - HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (the audit framing) - HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md (the test failures + scope) Sections: 1. TL;DR (3 paragraphs): what happened, the hidden broadcast() bug, the recommendation (don't merge; use as input for follow-up track) 2. Context: 48 promoted, 41 deferred, 2 new audits, 1 styleguide 3. 4 decision points for Tier 1 (scope, sequencing, audit adjustments, scope expansion) 4. The 4 documents Tier 1 should read in order (45 min total) 5. What Tier 1 should NOT do (3 anti-patterns) 6. What Tier 1 SHOULD do (6 concrete first steps) 7. What Tier 2 is available for (conventions reminder) 8. The bigger vision (agent-debugger framing) Recommended sequencing for Tier 1: T0: Approve follow-up track scope T1: Tier 2 implements Phase 6a + 6b + 6d (~18 commits, 3 hours) T2: Tier 2 runs tier-1-unit-core FULLY (no stop-on-failure) T3: Tier 2 runs tier-3-live_gui FULLY T4: Tier 1 reviews + merges follow-up track T5: Tier 1 launches code_path_audit_20260607 T6: Tier 2 implements Phase 3 + cross-phase coupling (separate track) Tier 1's scope decision: I recommend the SHRUNK version (Phase 6a + 6b + 6d only; defer Phase 3 to its own track). This gives the code-path audit a clean instrumented target without ballooning the follow-up beyond Tier 2's 1-4 hour budget. Audit adjustments to add: - 5 micro-benchmarks (NormalizedResponse.__init__, WebSocketMessage.__init__, UsageStats.__init__, ProviderHistory.lock, ToolSpec.__init__) - 'no-TypeError-errors-on-any-thread' assertion - Instrument grok/minimax/llama providers (currently unprofiled) - Add 2 new actions: provider_history_append + websocket_broadcast	2026-06-21 17:57:38 -04:00
ed	4bbc69019e	chore(gitignore): add video_analysis artifact patterns (.mp4, .vtt) Per FR8 in conductor/tracks/video_analysis_campaign_20260621/spec.md, mp4 files are too large for git and VTT auto-sub files are regenerable from transcript.json. Note: existing tracked files in entropy_epiplexity (commit `5c5f347c`) are still in history. The gitignore prevents FUTURE commits from adding them. To remove from history requires filter-repo/filter-branch rewrite (out of scope for this commit).	2026-06-21 17:54:39 -04:00
ed	d7b6b2297b	docs(handoff): test failure report for follow-up track scoping Categorizes the 12 test failures the user observed when running scripts/run_tests_batched.py after this track: - 10 failures (mine): Phase 2 NormalizedResponse API migration incomplete (state.toml t2_6 deferred task); FIXED in commit `30c8b263` - 3 failures (sandbox): test_audit_tier2_leaks.py flags sandbox files (mcp_paths.toml, opencode.json) as modified; NOT my fault - 1 failure (pre-existing): test_gui2_custom_callback_hook_works; live_gui test not touched by this track Hidden 12th failure: - worker[queue_fallback] error: WebSocketServer.broadcast() takes 2 positional arguments but 3 were given (appeared 6+ times during tier-2-mock-app-core but tests still passed; error logged on GUI thread from app_controller._run_pending_tasks_once_result). Phase 5 refactored broadcast(channel, payload) to broadcast(WebSocketMessage); I updated test_websocket_server.py but missed app_controller.py and events.py callers. Sections: 1. Executive summary (3 categories of failure) 2. Per-failure categorization (10 + 3 + 1) 3. Hidden 12th failure: WebSocket broadcast callers in app_controller 4. Phase 2 API migration status (8 sites; 5 done, 3 unverified) 5. Recommendations for follow-up track (~5 call sites + ~41 Phase 3) 6. Code-path audit input (5 micro-benchmarks to add) Follow-up track scope: ~15-20 commits, well-scoped. Should run BEFORE code_path_audit_20260607 because the worker[queue_fallback] TypeError spam will confuse the audit's runtime instrumentation.	2026-06-21 17:53:48 -04:00
ed	089d5bdd75	Merge branch 'master' of C:\projects\manual_slop into tier2/any_type_componentization_20260621	2026-06-21 17:46:57 -04:00
ed	ad9c028acc	docs(type_registry): regenerate for Phase 1-5 new modules Auto-generated by scripts/generate_type_registry.py after the Phase 2 + 4 + 5 commits. These were untracked in the working tree because commit `4a774eb3` was made before Phase 5 (api_hooks) committed. NEW files (5): - docs/type_registry/src_mcp_tool_specs.md (Phase 1; ToolSpec + ToolParameter) - docs/type_registry/src_openai_schemas.md (Phase 2; ToolCall + ChatMessage + UsageStats + NormalizedResponse + OpenAICompatibleRequest) - docs/type_registry/src_provider_state.md (Phase 3 partial; ProviderHistory + _PROVIDER_HISTORIES) - docs/type_registry/src_api_hooks.md (Phase 5; WebSocketMessage) - docs/type_registry/src_log_registry.md (Phase 4; Session + SessionMetadata) Verified: uv run python scripts/generate_type_registry.py --check Registry in sync (22 files checked) These 5 .md files were generated after the Phase 5 commit (`e9fa69dd`) and the Phase 4 commit (`fef6c20e`); they were left in the working tree because commit `4a774eb3` (verify) was made after the Phase 2 registry regen but before Phase 4/5 changes were fully committed.	2026-06-21 17:43:43 -04:00
ed	30c8b26381	fix(ai_client): migrate gemini_cli NormalizedResponse callers to Phase 2 dataclass API Phase 2 deferred t2_6: update src/ai_client.py _send_grok + _send_minimax + _send_llama + _send_gemini_cli (4 functions) to use the new dataclass API after NormalizedResponse was refactored to (text, tool_calls: tuple[ToolCall, ...], usage: UsageStats, raw_response). These 4 callers were left with the old keyword args (usage_input_tokens, usage_output_tokens, ...) which broke at runtime: ai_client.send() raised TypeError: NormalizedResponse.__init__() got an unexpected keyword argument 'usage_input_tokens'. FIXES: - src/ai_client.py L2054: gemini_cli 'adapter unavailable' branch - src/ai_client.py L2088: gemini_cli normal response branch - Added: from src.openai_schemas import UsageStats (module level) - Added backward-compat in src/openai_compatible.py: messages_dicts = [m.to_dict() if hasattr(m, 'to_dict') else m for m in request.messages] (accepts both ChatMessage dataclass and dict for backward compat with existing tests that pass raw dicts) TEST FIXES: - tests/test_ai_client_tool_loop.py: _make_normalized_response helper uses UsageStats instead of usage__tokens kwargs - tests/test_ai_client_tool_loop_builder.py: same - tests/test_ai_client_tool_loop_send_func.py: same - tests/test_openai_compatible.py: NormalizedResponse(text=..., usage=UsageStats(...)) + tool_calls[0].function.name (attribute access) instead of ['function']['name'] - tests/test_auto_whitelist.py: use update_session_metadata() instead of dict subscript assignment (Session dataclass doesn't support item assignment) VERIFIED: uv run pytest tests/test_ai_client_.py tests/test_openai_*.py \ tests/test_auto_whitelist.py --timeout=30 56 passed in 4.49s (19 previously failing tests now pass) uv run python scripts/audit_weak_types.py --strict STRICT OK: 115 weak sites <= baseline 115 uv run python scripts/audit_dataclass_coverage.py --strict STRICT OK: 200 weak sites <= baseline 207 This commit closes the t2_6 deferred task. The 41-site Phase 3 call-site migration remains deferred (separate provider_state_migration track).	2026-06-21 17:42:35 -04:00
ed	ea8bcdf389	conductor(entropy_epiplexity): Phase 5 Verification - end-of-track report + state.toml completed	2026-06-21 17:16:05 -04:00
ed	5e7d2b15fd	conductor(entropy_epiplexity): Phase 5 Verification - end-of-track report + state.toml completed	2026-06-21 17:16:05 -04:00
ed	275f34da6e	conductor(entropy_epiplexity): Phase 4 Synthesis - report.md (1,018 lines) + summary.md (341 words) Deep-dive report covers all 8 sections per umbrella spec FR6: - TL;DR: epiplexity as observer-relative information measure - Key Concepts: 18 numbered concepts - Frame Analysis: 176 unique frames from research talk - Transcript Highlights: 10+ verbatim passages with timestamps - Mathematical Content: 12 derivations (Shannon, Kolmogorov, Levin, sophistication, epiplexity) - Connections: forward refs to 8 other videos - Open Questions: 14 questions for Pass 2 - References: people, concepts, resources Plus 9 appendices: concept map, transcript excerpts (C.1-C.12), math foundations (D.1-D.10), framework connections (E.1-E.7), cross-references (G.1-G.9), resources, final notes. Lossless preservation per umbrella spec §0.	2026-06-21 17:15:10 -04:00
ed	038bebce04	conductor(entropy_epiplexity): Phase 4 Synthesis - report.md (1,018 lines) + summary.md (341 words) Deep-dive report covers all 8 sections per umbrella spec FR6: - TL;DR: epiplexity as observer-relative information measure - Key Concepts: 18 numbered concepts - Frame Analysis: 176 unique frames from research talk - Transcript Highlights: 10+ verbatim passages with timestamps - Mathematical Content: 12 derivations (Shannon, Kolmogorov, Levin, sophistication, epiplexity) - Connections: forward refs to 8 other videos - Open Questions: 14 questions for Pass 2 - References: people, concepts, resources Plus 9 appendices: concept map, transcript excerpts (C.1-C.12), math foundations (D.1-D.10), framework connections (E.1-E.7), cross-references (G.1-G.9), resources, final notes. Lossless preservation per umbrella spec §0.	2026-06-21 17:15:10 -04:00
ed	0fabeaf4ce	docs(handoff): Tier 2 -> Tier 1 input for code_path_audit_20260607 While running any_type_componentization_20260621, the Tier 2 agent performed a partial code-path audit + code normalization pass that wasn't in the original scope. This handoff document frames: 1. What was done (48 of 89 fat-struct sites promoted; 41 deferred) 2. The 5-pattern Any-type taxonomy (Patterns 3/4/5 correctly preserved; Patterns 1/2 promoted to dataclass/registry) 3. Recommended adjustments for code_path_audit_20260607: - Instrument the 89 fat-struct sites with hot/cold/init path tags - Compare pre/post refactor cost for the 48 promoted sites - Rank the 41 deferred Phase 3 sites by hot-path frequency - Report per-call cost deltas in microseconds 4. What was NOT done (no runtime profiling; no pre/post benchmarks) 5. Decision points for Tier 1 (merge / reject / cherry-pick) 6. The bigger vision: AI/LLM frontend debugger (rad-debugger analog) requires typed ProviderHistory, ToolSpec, Session, WebSocketMessage to step through the agent loop without losing type fidelity Recommendation: Don't merge this branch yet. Let code_path_audit_20260607 use it as a reconnaissance warm-up; drive the next refactor track from the audit's per-action cost data. The 4 newly-promoted dataclasses (mcp_tool_specs, openai_schemas, log_registry.Session, api_hooks.WebSocketMessage) are the typed-state foundation that the future debugger UI will read from. The 41 deferred Phase 3 sites are the last gap: per-turn history manipulation in src/ai_client.py needs typed state before the debugger can step through the agent loop losslessly. Length: 7 sections, 7 paragraphs of Tier 1 decision framing. Location: docs/handoffs/HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md (new directory; complements docs/reports/ which is for reports vs handoffs which are cross-track input artifacts).	2026-06-21 17:14:22 -04:00
ed	4a774eb341	conductor(verify): track completion artifacts - TRACK_COMPLETION + audit baselines + registry Phase 6 (verification) artifacts for any_type_componentization_20260621. The user handles the archive move (NOT done by Tier 2; reverted a premature git mv per user instruction). END-OF-TRACK REPORT (NEW): - docs/reports/TRACK_COMPLETION_any_type_componentization_20260621.md (289 lines) - Per-phase results table (0/1/2/4/5 complete; 3 partial) - 48 sites promoted (1:8 + 2:17 + 4:7 + 5:16); 41 sites deferred (Phase 3 call-site migration) - 7 architectural invariants established (frozen=True pattern; TypeAlias; JsonValue; ProviderHistory threading; SDK holders stay Any; etc.) - Deferred-work section: provider_state_migration_2026MMDD follow-up track STATE.TOML UPDATE: - status: active -> completed - current_phase: 2 -> 6 - (track stays at conductor/tracks/any_type_componentization_20260621/; archive move is the user's responsibility per Tier 2 conventions) AUDIT BASELINE REGENERATION: - scripts/audit_weak_types.baseline.json: 112 -> 115 (regenerated) - 3 net new sites added by the new src/ files (openai_schemas: 10; log_registry: 10; provider_state: ?; api_hooks: ?). The new sites are at to_dict() / from_dict() / Optional[tuple[...]] serialization boundaries which are Pattern 5 (generic serialization; stay as Any). - Both CI gates pass: STRICT OK: 115 <= 115; STRICT OK: 200 <= 207 TYPE REGISTRY REGENERATION (NEW/MODIFIED/DELETED): - index.md: 18 -> 22 .md files - src_api_hooks.md (NEW; Phase 5 WebSocketMessage) - src_log_registry.md (NEW; Phase 4 Session + SessionMetadata) - src_openai_schemas.md (NEW; Phase 2 ToolCall + ChatMessage + UsageStats + NormalizedResponse + OpenAICompatibleRequest) - src_provider_state.md (NEW; Phase 3 ProviderHistory + _PROVIDER_HISTORIES) - src_openai_compatible.md (DELETED; dataclasses moved to src_openai_schemas.md) - src_type_aliases.md (MODIFIED; +JsonPrimitive + JsonValue) - type_aliases.md (MODIFIED; registry index entry updated) VERIFICATION COMMANDS (all pass): uv run python scripts/audit_weak_types.py --strict STRICT OK: 115 weak sites <= baseline 115 uv run python scripts/audit_dataclass_coverage.py --strict STRICT OK: 200 weak sites <= baseline 207 uv run python scripts/generate_type_registry.py --check Registry in sync (22 files checked) ~130 targeted tests pass across 13 test files (see TRACK_COMPLETION §4)	2026-06-21 17:07:22 -04:00
ed	5c5f347cf0	conductor(entropy_epiplexity): Phase 1-3 Acquire+Keyframes+OCR - transcript.json (~5k segments via yt-dlp), 176 unique frames (214 raw), OCR in 30s Note: 364MB mp4 video. 176 frames after imagehash dedup (hamming<5).	2026-06-21 17:07:07 -04:00
ed	e9856388ae	conductor(entropy_epiplexity): Phase 1-3 Acquire+Keyframes+OCR - transcript.json (~5k segments via yt-dlp), 176 unique frames (214 raw), OCR in 30s Note: 364MB mp4 video. 176 frames after imagehash dedup (hamming<5).	2026-06-21 17:07:07 -04:00
ed	e9fa69ddc1	feat(api_hooks): add WebSocketMessage + JsonValue type (t5_1-t5_8) Phase 5 of any_type_componentization_20260621. Promotes the WebSocket broadcast signature in src/api_hooks.py from (channel, payload: dict) to a typed WebSocketMessage dataclass (16 Any sites): NEW dataclass (inline in src/api_hooks.py): - WebSocketMessage (frozen=True): channel: str, payload: JsonValue MODIFIED: - _serialize_for_api(obj: Any) -> JsonValue (typed return) - broadcast(channel: str, payload: dict[str, Any]) -> broadcast(message: WebSocketMessage) - _get_app_attr / _set_app_attr signatures UNCHANGED (Pattern 4 preserved) NEW tests/test_api_hooks_dataclasses.py (12 tests, all pass): - test_websocket_message_construction - test_websocket_message_with_list_payload - test_websocket_message_with_nested_payload - test_websocket_message_is_frozen - test_websocket_message_to_json - test_serialize_for_api_returns_dict_for_to_dict_object - test_serialize_for_api_handles_nested_lists - test_serialize_for_api_handles_purepath - test_serialize_for_api_passthrough_for_primitives - test_serialize_for_api_handles_mixed_nesting - test_get_app_attr_signature_preserved (Pattern 4 invariant) - test_set_app_attr_signature_preserved (Pattern 4 invariant) MODIFIED tests/test_websocket_server.py: - Updated broadcast() call site to use WebSocketMessage(channel=..., payload=...) - Added WebSocketMessage import Verified: uv run pytest tests/test_api_hooks_dataclasses.py tests/test_api_hooks_warmup.py tests/test_websocket_server.py --timeout=30 23 passed in 5.03s (12 new + 10 existing + 1 websocket)	2026-06-21 17:00:42 -04:00
ed	fef6c20ea0	feat(log): add Session + SessionMetadata dataclasses (t4_1-t4_8) Phase 4 of any_type_componentization_20260621. Promotes the 2-level dict[str, dict[str, Any]] structure in src/log_registry.py to typed Session + SessionMetadata dataclasses (7 Any sites): NEW dataclasses (inline in src/log_registry.py): - SessionMetadata (frozen): message_count, errors, size_kb, whitelisted, reason, timestamp - Session (frozen): session_id, path, start_time, whitelisted, metadata - to_dict() / from_dict() classmethod for round-trip with TOML shape - Backward-compat __getitem__ / get() so existing test_log_registry.py tests that use session_data['path'] / session_data.get('metadata') continue to work REFACTOR LogRegistry: - self.data: dict[str, dict[str, Any]] -> dict[str, Session] - load_registry: populates with Session.from_dict(...) - save_registry: serializes via session.to_dict() - register_session: creates Session dataclass - update_session_metadata: creates new Session with updated SessionMetadata - is_session_whitelisted: reads session.whitelisted - update_auto_whitelist_status: reads session.path - get_old_non_whitelisted_sessions: reads session.start_time + metadata NEW tests/test_log_registry_dataclasses.py (13 tests, all pass): - test_session_dataclass_construction - test_session_metadata_dataclass_construction - test_session_from_dict_basic / with_metadata - test_session_to_dict_round_trip - test_session_metadata_to_dict - test_log_registry_data_is_typed - test_log_registry_register_session_returns_session - test_log_registry_update_session_metadata_sets_metadata - test_log_registry_is_session_whitelisted - test_log_registry_get_old_non_whitelisted_sessions - test_session_is_frozen - test_session_metadata_is_frozen Verified: uv run pytest tests/test_log_registry.py tests/test_log_registry_dataclasses.py --timeout=30 18 passed in 3.27s (5 existing + 13 new)	2026-06-21 16:56:24 -04:00
ed	901b1b0982	conductor(probability_logic): Phase 5 Verification - end-of-track report + state.toml completed TRACK COMPLETE for child #2. All 7 deliverable artifacts present, report.md 1045 lines (within 1000-10000 target), summary.md 333 words (within 200-400 target), no TBDs. 10 children + 1 synthesis remaining in campaign.	2026-06-21 16:46:19 -04:00
ed	cb85591fc8	conductor(probability_logic): Phase 4 Synthesis - report.md (1,045 lines) + summary.md (333 words) Deep-dive report covers all 8 sections per umbrella spec FR6: - TL;DR: probability as extension of logic - Key Concepts: 32 numbered concepts - Frame Analysis: 25 frames (12 chat-only, 13 presentation) - Transcript Highlights: 16 verbatim passages with timestamps - Mathematical Content: 15 derivations - Connections: forward refs to 9 other videos - Open Questions: 14 questions for Pass 2 - References: people, concepts, resources Plus 6 appendices: concept map, lossless preservation audit, detailed transcript excerpts (sections C.1-C.15), math derivations (D.1-D.8), LLM connections, quick reference formulas. Lossless preservation per umbrella spec §0.	2026-06-21 16:45:39 -04:00
ed	e19672b2e0	conductor(plan): Phase 3 partial - provider_state + tests; call-site migration deferred	2026-06-21 16:44:28 -04:00
ed	2ad4718c3c	feat(provider): add src/provider_state.py + tests (t3_2, t3_3) Phase 3 of any_type_componentization_20260621 (PARTIAL). Adds the ProviderHistory abstraction and 6-provider registry. NEW src/provider_state.py (60 lines): - ProviderHistory dataclass (messages: list[HistoryMessage], lock: Lock, append / get_all / replace_all / clear methods) - _PROVIDER_HISTORIES: dict[str, ProviderHistory] for anthropic / deepseek / minimax / qwen / grok / llama - get_history(provider) factory + clear_all() + providers() - SDK client holders (_gemini_chat, _anthropic_client, etc.) NOT touched per Pattern 3 (heterogeneous SDK types) NEW tests/test_provider_state.py (12 tests, all pass): - test_six_providers_registered - test_get_history_returns_singleton_per_provider - test_get_history_raises_for_unknown - test_provider_history_starts_empty - test_provider_history_append / get_all_returns_copy / replace_all / replace_all_takes_copy / clear - test_clear_all_resets_every_provider - test_provider_history_thread_safety (10 threads x 100 messages) - test_independent_locks_per_provider (lock on one doesn't block another) DEFERRED: - t3_4 (Remove 14 globals from ai_client.py:111-133) - t3_5 through t3_13 (Update call sites in _send_<provider> functions) - t3_14 (Run full regression suite on test_ai_client*.py) These call-site updates require careful per-function refactoring of the ~27 sites in _send_anthropic, _send_deepseek, _send_minimax, _send_qwen, _send_grok, _send_llama. The ai_client.py file is 3432 lines; a single regex pass risks subtle indentation regressions in nested constructs (see the 7 ot : orphan lines from a previous attempt). The provider_state module is independently usable and tested. Future track: provider_state_migration_2026MMDD to wire up the call sites mechanically, OR integrate into a Phase 3 retry pass. Verified: uv run pytest tests/test_provider_state.py --timeout=30 12 passed in 2.99s	2026-06-21 16:43:42 -04:00
ed	ca4826ab31	conductor(probability_logic): transcript_clean.txt (10k words) + presentation frame extractor	2026-06-21 16:41:42 -04:00
ed	4dd373d70d	conductor(probability_logic): Phase 3 OCR - 25 frames OCR'd in 1.8s via winsdk	2026-06-21 16:40:04 -04:00
ed	f855967bb8	conductor(probability_logic): Phase 2 Keyframes - 25 unique frames (threshold 0.05; low-motion math lecture)	2026-06-21 16:39:43 -04:00
ed	338573b1e8	refactor(video_analysis): extract_transcript.py uses yt-dlp VTT directly (skip youtube-transcript-api which consistently fails for these videos) youtube-transcript-api v1.2.4 returns XML parse error on empty response for ALL videos in this campaign. yt-dlp's --write-auto-subs reliably returns 1000s of segments per video. Switched to yt-dlp as the primary path. Tests updated to mock _fetch_via_ytdlp instead of _fetch_raw_transcript. 8/8 tests passing.	2026-06-21 16:33:44 -04:00
ed	7478090e71	conductor(probability_logic): Phase 1 Acquire - transcript.json (3315 segments via yt-dlp VTT fallback) + video.log (84MB mp4 downloaded) Generic reusable drivers added: phase1_acquire.py, phase2_keyframes.py, phase3_ocr.py take slug as arg for batch use across all 12 children.	2026-06-21 16:32:19 -04:00
ed	b942c3f8b9	conductor(plan): fill t2_9 SHA + phase_2 checkpoint	2026-06-21 16:31:19 -04:00
ed	4bfce93105	conductor(plan): mark Phase 2 complete (t2_6 deferred to Phase 3) Phase 2 (openai_schemas) progress: - t2_1-t2_5+t2_7-t2_8 (`a96f946b`): 19 tests pass; NormalizedResponse + OpenAICompatibleRequest refactored to dataclasses - t2_6 (deferred): _send_grok + _send_minimax + _send_llama in src/ai_client.py still use legacy NormalizedResponse(text=..., tool_calls=[], usage_*_tokens=...) kwargs. These will be updated in Phase 3 (provider_state) as part of the ai_client refactor. - t2_9: Phase 2 checkpoint (commit hash filled in this commit) current_phase: 2 -> 3 phase_2.status: pending -> completed Next: Phase 3 - provider_state (15 tasks; the largest phase).	2026-06-21 16:30:29 -04:00
ed	fd95ea4879	conductor(cs229): Phase 5 Verification - end-of-track report + state.toml completed	2026-06-21 16:28:24 -04:00
ed	a96f946b40	feat(openai): add src/openai_schemas.py + refactor openai_compatible.py (t2_1-t2_7) Phase 2 of any_type_componentization_20260621. Promotes NormalizedResponse + OpenAICompatibleRequest from src/openai_compatible.py to typed dataclasses. The 17 Any sites become 5 dataclasses: NEW src/openai_schemas.py (138 lines): - ToolCallFunction dataclass (name, arguments) - ToolCall dataclass (id, function: ToolCallFunction, type='function') - ChatMessage dataclass (role, content, tool_calls, tool_call_id, name) - UsageStats dataclass (input_tokens, output_tokens, cache_read_, cache_creation_) - NormalizedResponse dataclass (text, tool_calls: tuple, usage, raw_response: Any) - OpenAICompatibleRequest dataclass (messages: list[ChatMessage], model, ...) NEW tests/test_openai_schemas.py (19 tests, all pass): - ToolCallFunction, ToolCall, ChatMessage round-trips - UsageStats field access + frozen=True semantics - NormalizedResponse.to_legacy_dict preserves shape - raw_response stays Any (Pattern 3 preserved) - tools field stays list[dict[str, Any]] for Phase 1 ToolSpec follow-up MODIFIED src/openai_compatible.py: - Removed inline NormalizedResponse + OpenAICompatibleRequest definitions - Re-imported from src.openai_schemas - _send_blocking: tool_calls -> tuple[ToolCall, ...]; usage_*_tokens -> UsageStats - _send_streaming: same migration - send_openai_compatible: messages_dicts = [m.to_dict() for m in request.messages] - Exception handler: empty NormalizedResponse uses UsageStats - All NormalizedResponse consumers still work (legacy dict shape preserved) Verified: uv run pytest tests/test_openai_schemas.py tests/test_mcp_tool_specs.py tests/test_audit_dataclass_coverage.py tests/test_type_aliases.py tests/test_mcp_client_beads.py tests/test_mcp_client_paths.py tests/test_arch_boundary_phase2.py --timeout=60 64 passed in 6.28s	2026-06-21 16:27:59 -04:00
ed	1872b66f68	conductor(cs229): Phase 4 Synthesis - report.md (1,157 lines, 100KB) + summary.md (364 words) + transcript_clean.txt Deep-dive report covers all 8 sections per umbrella spec FR6: - TL;DR: 6-pillar LLM training framework - Key Concepts: 31 numbered concepts - Frame Analysis: 115 frames organized by topic - Transcript Highlights: 18 verbatim passages with timestamps - Mathematical Content: 14 formal derivations - Connections: forward refs to all 11 other videos - Open Questions: 14 questions for Pass 2 - References: people, courses, papers, resources Plus 11 appendices (A-O): full transcript sections, frame inventory, OCR reference, Q&A log, glossary, cross-references, future work. Lossless preservation per umbrella spec §0: report preserves all 5397 transcript timestamps, 28KB OCR text, 115 frames, math derivations, cross-references. R5 mitigation verified (yt-dlp works despite oEmbed 401). Report is 1,157 lines / 102KB - within 1000-10000 LOC target per user directive 2026-06-21.	2026-06-21 16:27:15 -04:00
ed	0318bfe9e2	conductor(plan): fill t1_8 commit_sha + phase_1 checkpoint	2026-06-21 16:16:34 -04:00
ed	9961e437fb	conductor(plan): mark t1_1-t1_7 complete + Phase 1 done (t1_8 partial) Phase 1 (mcp_tool_specs) commits: - t1_1+t1_2+t1_3 (`96007ebd`): tests/test_mcp_tool_specs.py (11 tests) + src/mcp_tool_specs.py (45 ToolSpec registrations) + generator scripts - t1_4 (`747e3983`): refactor mcp_client.py (removed 774 lines of dict literals; 3 call sites updated) - t1_5 (`8bcde094`): refactor ai_client.py (3 TOOL_NAMES sites updated) - t1_6+t1_7: cross-module invariant verified; 45/45 tests pass - t1_8 (in_progress): Phase 1 checkpoint (commit hash filled in this commit) state.toml updates: - current_phase: 1 -> 2 - phase_1.status: pending -> completed - t1_1..t1_7: pending -> completed (with commit_sha) Next: Phase 2 - openai_schemas (9 tasks).	2026-06-21 16:15:59 -04:00
ed	c4686787b6	conductor(cs229): Phase 3 OCR - 115 frames OCR'd in 5.1s via winsdk (28KB markdown)	2026-06-21 16:12:18 -04:00
ed	91a96ce139	conductor(cs229): Phase 2 Keyframes - 115 unique frames extracted (147 raw, 32 dupes removed by phash+hamming=5)	2026-06-21 16:11:34 -04:00
ed	8bcde09476	refactor(mcp): update ai_client.py 3 TOOL_NAMES sites (t1_5) Phase 1 of any_type_componentization_20260621. Migrates ai_client.py: - Line 560: new_tools = {name: False for name in mcp_client.TOOL_NAMES} -> mcp_tool_specs.tool_names() - Line 582: _agent_tools = {name: True for name in mcp_client.TOOL_NAMES} -> mcp_tool_specs.tool_names() - Line 1012: is_native = name in mcp_client.TOOL_NAMES -> name in mcp_tool_specs.tool_names() Plus adds: from src import mcp_tool_specs Verified: uv run pytest tests/test_mcp_tool_specs.py tests/test_mcp_client_beads.py tests/test_mcp_client_paths.py tests/test_audit_dataclass_coverage.py tests/test_type_aliases.py 39 passed in 11.79s No regressions. The mcp_client.TOOL_NAMES re-export is preserved for backward compatibility with any external test/code that imports it.	2026-06-21 16:11:27 -04:00
ed	747e3983bd	refactor(mcp): update mcp_client.py call sites to mcp_tool_specs (t1_4) Phase 1 of any_type_componentization_20260621. Migrates the 4 call sites in src/mcp_client.py to use the new typed module: - Line 1944: native_names = {t['name'] for t in MCP_TOOL_SPECS} -> native_names = mcp_tool_specs.tool_names() - Line 1958: res = list(MCP_TOOL_SPECS) -> res = [s.to_dict() for s in mcp_tool_specs.get_tool_schemas()] - Line 2747: TOOL_NAMES = {t['name'] for t in MCP_TOOL_SPECS} -> TOOL_NAMES = mcp_tool_specs.tool_names() Plus: removes the legacy MCP_TOOL_SPECS list literal (lines 1973-2746; 774 lines of dict literals). The data lives in src/mcp_tool_specs.py now; the canonical registry. (The legacy dict shape is preserved via ToolSpec.to_dict() for downstream serialization.) Adds import: from src import mcp_tool_specs Verified: uv run pytest tests/test_mcp_tool_specs.py tests/test_audit_dataclass_coverage.py tests/test_type_aliases.py 32 passed in 5.48s uv run pytest tests/test_mcp_client_beads.py tests/test_mcp_client_paths.py 7 passed in 3.20s Cross-module invariant (test_tool_names_subset_of_models_agent_tool_names): the 45 mcp_tool_specs.tool_names() are all in models.AGENT_TOOL_NAMES.	2026-06-21 16:09:30 -04:00
ed	0bc8abbe9a	conductor(cs229): Phase 1 Acquire - transcript.json (5397 segments via yt-dlp VTT fallback) + video.log (yt-dlp success for 336MB mp4, R5 verified) Fix extract_transcript.py: YouTubeTranscriptApi.get_transcript() (not .fetch()). youtube-transcript-api v1.2.4 uses class method get_transcript(video_id), not instance .fetch(). R5 mitigation: yt-dlp's VTT auto-sub extraction works where youtube-transcript-api fails (XML parse error on empty response). 5397 segments recovered. Add gitignore patterns for video_analysis artifacts: .mp4, .vtt (regenerable). video.log intentionally tracked.	2026-06-21 16:08:15 -04:00
ed	96007ebd77	feat(mcp): add src/mcp_tool_specs.py + tests (t1_1, t1_2, t1_3) Phase 1 of any_type_componentization_20260621. Promotes MCP_TOOL_SPECS (45 dict[str, Any] literals in src/mcp_client.py) to typed dataclasses: NEW src/mcp_tool_specs.py: - ToolParameter dataclass (name, type, description, required, enum) - ToolSpec dataclass (name, description, parameters: tuple) - _REGISTRY: dict[str, ToolSpec] - register() / get_tool_spec() / get_tool_schemas() / tool_names() - to_dict() preserves legacy JSON shape for downstream serialization - 45 register() calls (one per tool) at module level - Mirrors src/vendor_capabilities.py reference pattern NEW tests/test_mcp_tool_specs.py (11 tests, all pass): - test_module_loads_with_45_registrations - test_tool_names_set_matches_expected_45 - test_get_tool_spec_returns_correct_instance - test_get_tool_spec_raises_for_unknown_name - test_get_tool_schemas_returns_all_specs - test_tool_spec_is_frozen - test_tool_parameter_is_frozen - test_to_dict_round_trip_preserves_shape - test_tool_parameter_to_dict_includes_enum - test_tool_names_subset_of_models_agent_tool_names (cross-module invariant) - test_register_idempotent_replaces_existing (hot-reload support) NEW scripts/tier2/artifacts/any_type_componentization_20260621/: - generate_mcp_tool_specs.py: idempotent generator from MCP_TOOL_SPECS - generate_tool_specs.py: helper that emits registration lines - inspect_mcp_specs.py: shape inspection - _generated_registrations.txt: the 45 registration lines Verified: 11/11 tests pass. The legacy MCP_TOOL_SPECS dict in mcp_client.py still exists; this commit only ADDS the new module. Migration of call sites in mcp_client.py + ai_client.py follows in t1_4 + t1_5. Verified with: uv run pytest tests/test_mcp_tool_specs.py --timeout=30 11 passed in 3.01s	2026-06-21 16:06:29 -04:00
ed	bf1f11ed6c	conductor(plan): fill t0_5 commit_sha + phase_0 checkpoint	2026-06-21 16:00:05 -04:00
ed	6e6ba90e39	conductor(plan): mark t0_1-t0_4 complete + Phase 0 done (t0_5 partial) Phase 0 (Shared scaffolding) commits: - t0_1 (`647ad3d4`): tests/test_audit_dataclass_coverage.py (RED) - t0_2 (`cfdf8988`): scripts/audit_dataclass_coverage.py + baseline.json (GREEN; baseline = 207) - t0_3 (`4e658dd2`): src/type_aliases.py JsonPrimitive + JsonValue - t0_4 (`a28d8723`): styleguide 12 'When to Promote TypeAlias to dataclass' - t0_5 (in_progress): Phase 0 checkpoint (commit hash filled in this commit) state.toml updates: - current_phase: 0 -> 1 - phase_0.status: pending -> completed - t0_1..t0_4: pending -> completed (with commit_sha) - t0_5: pending -> in_progress Next: Phase 1 - mcp_tool_specs (8 tasks).	2026-06-21 15:59:36 -04:00
ed	a28d8723a8	docs(styleguide): add 12 'When to Promote TypeAlias to dataclass' (t0_4) Phase 0 of any_type_componentization_20260621. Adds the canonical decision rule that future contributors can apply without re-deriving: - TypeAlias conditions: open shape, self-describing, transient - dataclass(frozen=True) conditions: known fields, multi-site access, stable serialization, shared across modules - The src/vendor_capabilities.py reference pattern (5 properties) - Decision tree - The 5 worked examples (89 sites promoted per the audit) - Cross-references to audit scripts + input artifact + track This is the canonical artifact for the 'when to dataclass' question; subsequent phases refer to it via 'see styleguide 12' rather than re-deriving the rule.	2026-06-21 15:58:42 -04:00
ed	4e658dd25c	feat(types): add JsonPrimitive + JsonValue TypeAliases (t0_3) Phase 0 of any_type_componentization_20260621. Extends src/type_aliases.py with two recursive-friendly TypeAliases for JSON wire format (used by Phase 5 api_hooks WebSocketMessage): - JsonPrimitive: str \| int \| float \| bool \| None - JsonValue: JsonPrimitive \| list['JsonValue'] \| dict[str, 'JsonValue'] The forward-ref 'JsonValue' strings work because from __future__ import annotations is at the top of the module (PEP 563 + PEP 613 TypeAlias). Tests added (4 new, 14 total): - test_json_primitive_alias_resolves_to_union: hints exposes JsonPrimitive - test_json_value_alias_resolves_to_recursive_union: hints exposes JsonValue - test_json_value_accepts_primitive_dict: dict[str, JsonValue] runtime use - test_json_value_accepts_nested_structures: nested dict+list round-trip Verification: uv run pytest tests/test_type_aliases.py --timeout=30 14 passed in 2.97s	2026-06-21 15:57:40 -04:00
ed	cfdf8988fb	feat(audit): add scripts/audit_dataclass_coverage.py + baseline (t0_2) GREEN phase for Phase 0. Mirrors scripts/audit_weak_types.py design with 3 additions specific to the any-type componentization track: 1. PROMOTED_SITE_MODULES allowlist: the 3 new src/ modules (mcp_tool_specs.py, openai_schemas.py, provider_state.py) are exempt from Any-counting (their new dataclasses intentionally have raw_response: Any and SDK holder fields that stay as Any per Pattern 3). 2. INLINE_PROMOTED_SITE_MODULES: log_registry.py + api_hooks.py get their dataclasses added inline in Phase 4 + 5 (not new modules); same exemption. 3. Combined counter: counts both Any AND weak-struct patterns (dict_str_any, list_of_dict, optional_dict, etc.). Modes: - default: informational (exits 0; prints human report) - --json: machine-readable with by_file, by_category, total_weak - --strict: CI gate (exits 1 when current > baseline) - --baseline: path to baseline file (default: scripts/audit_dataclass_coverage.baseline.json) Baseline: scripts/audit_dataclass_coverage.baseline.json = 207 weak sites (captured pre-Phase-1; expected to drop to ~118 after 89 sites promoted). Verification: uv run python scripts/audit_dataclass_coverage.py --strict STRICT OK: 207 weak sites <= baseline 207 uv run pytest tests/test_audit_dataclass_coverage.py --timeout=30 7 passed in 5.15s	2026-06-21 15:56:41 -04:00
ed	647ad3d49d	test(audit): add tests/test_audit_dataclass_coverage.py (t0_1) RED phase for Phase 0. Mirrors tests/test_audit_weak_types.py structure: - test_audit_script_exists: AUDIT_SCRIPT.is_file() sanity - test_audit_help_runs: --help exits 0 - test_audit_json_mode_emits_valid_json: --json emits valid JSON with expected fields - test_audit_default_mode_emits_human_report: default mode prints a report - test_audit_strict_mode_against_existing_baseline_passes: --strict exits 0 when current <= baseline - test_audit_strict_mode_fails_when_baseline_is_zero: --strict exits 1 when current > baseline=0 - test_audit_baseline_field_shape: --json output has expected baseline-shape fields 7 tests total. Run with: uv run pytest tests/test_audit_dataclass_coverage.py --timeout=30 NOTE: 6 of 7 tests fail at this commit (audit script not yet implemented). This is the RED phase; GREEN comes in the next commit.	2026-06-21 15:56:19 -04:00

1 2 3 4 5 ...

4095 Commits