conductor(tracks): Register Pass 2 de-obfuscation campaign (row 29) + update Pass 1 §11.1

- tracks.md: new row 29 for the de-obfuscation campaign (priority A, research, awaits user samples)
- Pass 1 spec §11.1: superseded 2026-06-21; now points to the dedicated Pass 2 umbrella spec for the full handoff contract. The 'user must rediscover math encoding' action item is replaced by 'user provides 3-10 samples of past de-obfuscation notes; warmup derives the lexicon'
2026-06-23 00:08:35 -04:00
parent 256af96bf3
commit e768e98d5e
2 changed files with 21 additions and 8 deletions
@@ -65,6 +65,7 @@ Tracks that are unblocked and ready to start. Ordered by **dependency** (blocked
 | 26 | A (research) | [Video Analysis Campaign (12 videos, 5 clusters, Pass 1 of 3)](#track-video-analysis-campaign-20260621) | spec ✓, plan ✓, **14 folders scaffolded (1 umbrella + 12 children + 1 synthesis); Pass 1 of 3 (information extraction); awaiting Phase 0 tooling prerequisites (yt-dlp, cv2, imagehash install in repo venv)**; 12 children in execution order: CS229 → math foundations → Platonic/geometric → biological → CS336 → applied capstone; per-video target: 1000-10000 LOC markdown deep-dive report | (none — independent; **NEW 2026-06-21**; multi-track research campaign; 12 videos across 5 clusters (E: Stanford >1hr; A: math foundations; B: Platonic AI; C: biological/cognitive; D: applied); multi-pass handoff to Pass 2 (de-obfuscation via user's math encoding — USER must rediscover notation before Pass 2 starts) + Pass 3 (projection to applied domain — USER must articulate "own caveats" before Pass 3 starts); **lossless preservation directive**: Pass 1 artifacts must NOT be over-summarized (data cascades to Pass 2/3); **2 E-cluster videos failed oEmbed 401** (yt-dlp may still work; verify in Phase 1); reusable tooling: 5 TDD scripts in `scripts/video_analysis/` (download_video, extract_transcript, extract_keyframes, ocr_frames, synthesize_report) |
 | 27 | A | [Phase 2/4/5 Call-Site Completion (post any_type_componentization)](#track-phase2-4-5-call-site-completion-20260621) | spec ✓, plan ✓, metadata ✓, state ✓, **SHIPPED 2026-06-21** with all 4 phases complete (6a broadcast fix + 6b ChatMessage + 6d UsageStats no-op + 6e Phase 3 cost analysis); 5 atomic commits on tier2 branch; broadcast() TypeError fixed; 20/20 provider tests pass; all 3 audits --strict pass; unblocks `code_path_audit_20260607`; report at `docs/reports/TRACK_COMPLETION_phase2_4_5_call_site_completion_20260621.md` | any_type_componentization_20260621 (parent; shipped 2026-06-21 with 48/89 sites + 1 runtime bug) | (NEW 2026-06-21; bugfix + refactor + test-infrastructure + Tier 2 cost analysis; **Phase 6a COMPLETE**: fixed 2 broadcast() callers in `src/app_controller.py:1849` + `src/events.py:115` (gui_2.py had no callers, verified by grep); added `tests/test_websocket_broadcast_regression.py` 4/4 pass; **Phase 6b COMPLETE**: migrated `_send_grok` + `_send_minimax` + `_send_llama` to `ChatMessage` API; 20/20 provider tests pass; **Phase 6d NO-OP**: `NormalizedResponse` already uses `UsageStats` throughout `openai_compatible.py`; **Phase 6e COMPLETE**: produced `docs/reports/PHASE3_TIER2_ANALYSIS.md` (253 lines; Tier 2 authoritative version); measured 104 history sites (vs Tier 1 estimate 112); discovered 3 hidden cross-references (_strip_private_keys, _extract_minimax_reasoning, _send_llama_native); refined cost estimates: anthropic 35-65us/turn (Tier 1 said 8-15), grok/qwen/llama ~400ns (Tier 1 said 2-8us); **deferred**: Phase 3 call-site migration (104 sites in ai_client.py) -> separate track post-audit; cross-phase coupling -> separate track; `audit_tier2_leaks.py` sandbox-pollution -> infra track; **does NOT merge `tier2/any_type_componentization_20260621` branch** per Tier 2 reconnaissance framing; **does NOT archive `conductor/tracks/phase2_4_5_call_site_completion_20260621/`** - user handles that) |
 | 28 | A | [Any-Type Componentization (Promote dict[str, Any] to dataclass(frozen=True))](#track-any-type-componentization-promote-dictstr-any-to-dataclassfrozentrue) | spec ✓, plan ✓, metadata ✓, state ✓, **shipped 2026-06-21** with 48/89 fat-struct sites promoted (Phases 1, 2, 4, 5 complete); Phase 3 (`provider_state` call-site migration in `ai_client.py`) DEFERRED to a separate track; 1 runtime bug surfaced (`HookServer.broadcast()` callers in `app_controller.py` + `events.py`); not merged; reconnaissance for `code_path_audit_20260607`; tier2 branch at 24 commits | (none — independent; **NEW 2026-06-21**; refactor + ai-readability + type-safety; ships: 3 new modules (`src/mcp_tool_specs.py`, `src/openai_schemas.py`, `src/provider_state.py`); 2 new audit scripts (`scripts/audit_dataclass_coverage.py` + `--strict` mode); styleguide `conductor/code_styleguides/type_aliases.md` §12 "When to Promote TypeAlias to dataclass"; type-registry regenerated; 130+ tests pass; **input artifact**: `docs/reports/ANY_TYPE_AUDIT_20260621.md`; **handoff docs**: `docs/handoffs/PROMPT_FOR_TIER_1.md` + `HANDOFF_FOLLOWUP_TRACK_FROM_any_type_componentization.md` + `HANDOFF_CODE_PATH_AUDIT_FROM_any_type_componentization.md`) |
+| 29 | A (research) | [Video Analysis De-obfuscation Campaign (Pass 2 of 3)](#track-video-analysis-deob-20260621) | spec ✓, plan ✓, **5 folders scaffolded (1 umbrella + 1 warmup + 3 phase children); Pass 2 of 3 (de-obfuscation)**; **awaits USER action item**: gather 3-10 samples of past de-obfuscation notes into `video_analysis_deob_warmup_20260621/samples/`; warmup produces `report.md` + `prompt_template.md`; lexicon child refines; pilot child validates on 2 videos (`cs229_building_llms` + `entropy_epiplexity`); apply child applies to 10 + synthesis; multi-layer deliverable per video: translation + replacement + decoder | (none — independent; **NEW 2026-06-21**; multi-track research campaign; **de-obfuscation philosophy**: constructive type theory + Wildberger-style finitism + boundedness for knowledge + cycles/iteration explicit + etymology-aware; 4 verification criteria (lossless, bounded, constructively typed, etymology-cited); supersedes Pass 1 spec §11.1; consumed by Pass 3 (projection to applied domain, future user-led); **load-bearing directive**: Pass 1 artifacts must remain lossless because Pass 2 de-obfuscation consumes them as raw input) |

 **Note on numbering:** the legacy file used `0a`, `0b`, `0c`... and `0d`, `0e`, `0f`, `0g` for tracks created 2026-06-06+. This is the **git-blame sort order**, not a logical execution order. The new structure re-orders by dependency.

@@ -341,21 +341,33 @@ This track does not modify the manual_slop application architecture. It produces

 ## 11. Coordination with Future Passes (load-bearing)

-### 11.1 Pass 2 (de-obfuscation via user's math encoding notation) — handoff contract
+### 11.1 Pass 2 (de-obfuscation via user's constructive type-theoretic re-encoding) — handoff contract (superseded 2026-06-21)

-**Pass 2 will consume:**
+**This section is superseded by the dedicated Pass 2 track at [`conductor/tracks/video_analysis_deob_20260621/spec.md`](../video_analysis_deob_20260621/spec.md) (umbrella) + the warmup precursor at [`conductor/tracks/video_analysis_deob_warmup_20260621/spec.md`](../video_analysis_deob_warmup_20260621/spec.md).**
+
+**TL;DR.** Pass 2 is now a 5-folder campaign (1 warmup + 1 umbrella + 3 phase children), not a single "USER must rediscover" task. The user's "compress/decompress math info" encoding is now an **evidence-based lexicon** derived from the user's past de-obfuscation samples (collected via the warmup track), not a "user must rediscover" action item. The 3-layer deliverable per video (translation / replacement / decoder) replaces the single `deobfuscated/<slug>.md` originally described here.
+
+**Pass 2 will consume (unchanged):**
 - `transcript.json` (every child track's `artifacts/transcript.json`)
 - `frames/*.jpg` (every child track's `artifacts/frames/`)
 - `ocr.md` (every child track's `artifacts/ocr.md`)
-- `report.md` (every child track's deep-dive report)
+- `report.md` (every child track's deep-dive report — the primary input)
 - `summary.md` (every child track's summary)

-**Pass 2's input encoding (user action item — pre-Pass-2):**
-- The user must rediscover/redefine their "compress/decompress math info" encoding notation.
-- This may be referenced in `conductor/tracks/intent_dsl_survey_20260612/` and other DSL-related track work; the user has prior art but it needs to be located.
-- Without this encoding system, Pass 2 cannot start.
+**Pass 2's input (revised 2026-06-21):**
+- The user provides 3-10 samples of their past de-obfuscation notes in `samples/`. The warmup track (`video_analysis_deob_warmup_20260621/`) produces the initial lexicon + LLM prompt template.
+- The user's constructive type theory framing is documented at [`video_analysis_deob_20260621/spec.md`](../video_analysis_deob_20260621/spec.md) §1.1.

-**Pass 2 output:** a `deobfuscated/<slug>.md` per video + a `deobfuscated/synthesis.md` cross-cutting.
+**Pass 2 output (revised):** 3-layer deliverable per video + 3-layer deliverable for the cross-cutting synthesis. For each: `<slug>_translation.md` (side-by-side), `<slug>_deobfuscated.md` (replacement), `<slug>_decoder.md` (per-term decoder with form anchor + etymology + definition history).
+
+**5-folder structure:**
+- `video_analysis_deob_20260621/` — umbrella (coordinates the 3 phase children)
+- `video_analysis_deob_warmup_20260621/` — precursor (user samples → lexicon + prompt template)
+- `video_analysis_deob_lexicon_20260621/` — Phase 1 (refine the warmup's draft into a codified operational spec)
+- `video_analysis_deob_pilot_20260621/` — Phase 2 (apply to 2 videos: `cs229_building_llms` + `entropy_epiplexity`; capture refinements)
+- `video_analysis_deob_apply_20260621/` — Phase 3 (apply to the remaining 10 + synthesis)
+
+See the umbrella spec for the full design: §1 (philosophy), §2 (current state), §3 (goals), §4 (functional requirements), §6 (out of scope), §10 (user directives), §11 (see also).

 ### 11.2 Pass 3 (projection to applied domain) — handoff contract