conductor(deob_warmup): Update report.md v2 - 1.13 + 3 tier tables + 3.5 note + 10 per-language rendering

Design doc v2. Section 1.13 (Encoding-explicit) updated with placeholder scheme: float (general) / integer (general) / Scalar (linear/geo/tensor alg) / float64 (resolved). Tier tables in Sections 3.1, 3.2, 3.3, 3.4 updated: 5 wrong re-encodings removed (set/kind, function/procedure, parameter/argument, input/arg, proof/construction), plus a partial removal in 4.4. 4 template notations in 3.14 (B default, C++/Odin/Jai opt-in). 5 new entries added: 1.13 (<< / >>), 3.19 (Markov chain), 3.20 (PolyTimeAdversary), 4.25 (correlation), 4.26 (<< / >> with tolerance). Section 3.5 note added: pseudo sectr lang is incomplete and needs adapting (per user 2026-06-23). Section 10 added: per-language rendering pointer to lexicon.md §9. v1 state preserved in git history; v2 is the current state. 13 sections + 2 appendices.
2026-06-23 20:01:00 -04:00
parent 99bc1598d9
commit 86fe3ef53b
1 changed file with 68 additions and 31 deletions
@@ -141,27 +141,39 @@ This is the **operational definition** of the type-theoretic primitives in `Type
 2. **The 4-rule pattern (per Cluster 3)** should be extended to include a "compression history" section: for each type, document which axioms are "essentially true" (irreducible), which are "primitive compression" (can be removed at a cost), and which are "convention" (can be freely re-interpreted).
 3. **The "Linear dependence / associativity / commutativity as compression axioms" pattern (per Cluster 0, P33)** should be the canonical example: the deob-warmup should treat these as **opt-in / opt-out** per operation, not globally.

-### §1.13 Encoding-explicit (the user's 2026-06-23 refinement)
+### §1.13 Encoding-explicit (refined v2 per user 2026-06-23)

-**Source cluster:** Cluster 0 §0.6.2.7 (user refinement after Phase 1.5).
+**Source cluster:** Cluster 0 §0.6.2.7 (user refinement after Phase 1.5); v2 refinement per user 2026-06-23 (placeholder scheme).

 **Anchor (USER 2026-06-23, verbatim):** *"Quantity or scalar for value is fine but to keep in mind that if they are used, it should be associated with a finite encoding. Whereas the real number line for example is a classification of expressions that may resolve to any finite encoding of quantity resolution."*

+**Anchor (USER 2026-06-23, on `Scalar`):** *"no keep scalar its useful for linear alg, geo alg, tensor alg."*
+
+**Anchor (USER 2026-06-23, on ontology):** *"You can observe the shape of the procedure, not all possible result combinations or resolutions for a given metric utilized with that procedure."*
+
 **Take.** **"Quantity" or "scalar" is fine as the re-encoded form for a value** — they're not banned. **BUT they must be associated with a finite encoding** (e.g., `quantity : float64`, `scalar : int32`). The **"real number line" is not a value** — it's a **classification of expressions** that may resolve to any finite encoding of quantity resolution.

-**Consequence for the deob-warmup:**
-1. **Add a `Rule 5: Encoding-explicit` to the `prompt_template.md`.** Every value-bearing term must have an `encoding:` attribute. The encoding is the **bounded form** of the value; without the encoding, the value is **indefinite** (per §1.2).
-2. **Add an `encoding:` attribute to every Tier 1 and Tier 2 entry in the lexicon.** Default encoding: `float64` (~16 decimal digits).
-3. **Update the Boundedness rules (§3.6):** add a row for `Real` as a value (banned — it's a type-class) and `kind : Real` resolves to `quantity : float64` (allowed — the bounded form).
-4. **The encoding is the bounded form** (§1.1 made concrete): a `quantity : float64` is bounded to ~1.8 × 10^308; a `quantity : int32` is bounded to ±2.1 × 10^9; etc.
+**Consequence for the deob-warmup (v2 refinement):**
+1. **Rule 5: Encoding-explicit** uses placeholders, not committed resolutions. The principled defaults are:
+   - `float` (general unbounded float placeholder)
+   - `integer` (general unbounded integer placeholder)
+   - `Scalar` (placeholder with specific meaning in linear alg, geo alg, tensor alg)
+2. **`float64` is the principled resolved default ONLY when the user defines a target resolution** for the application. The v1 lexicon's blanket `float64` default was over-committing; v2 defers.
+3. **The placeholder vs resolved distinction** is the operational form of the ontology axiom: the *shape* of the procedure is observable; the *resolution* is a user-defined target.
+4. **`Scalar` is preserved** for linear/geo/tensor algebra (per user 2026-06-23). The choice between `float` and `Scalar` as a placeholder is domain-driven: `float` is general-purpose; `Scalar` is for linear/geo/tensor alg.

-**The encoding taxonomy** (per `prompt_template.md` Rule 5):
- `int8/16/32/64` (exact integers, bounded)
- `uint8/16/32/64` (exact unsigned integers, bounded)
- `float16/32/64/128` (floats, bounded; `float64` is the default)
+**The encoding taxonomy (v2, refined):**
+- `int8/16/32/64` (exact integers, bounded; resolved)
+- `uint8/16/32/64` (exact unsigned integers, bounded; resolved)
+- `float16/32/64/128` (floats, bounded; resolved)
+- `float` (general unbounded placeholder; v2)
+- `integer` (general unbounded placeholder; v2)
+- `Scalar` (linear/geo/tensor alg placeholder; v2)
 - `bigint` (arbitrary precision, exact)
 - `decimal64/128` (financial precision, bounded)

+**Per user 2026-06-23 (further clarification):** "I do like the encoding taxonomy table you have when picking a resolution matters though." The taxonomy table is preserved — the placeholder vs resolved distinction makes the taxonomy *more* useful, not less. The user can pick a resolution from the table when the application context demands it.
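
The placeholder vs resolved distinction above can be sketched in Python. This is a minimal illustration, not part of the lexicon: the `Quantity` class, the `resolve` helper, and the `SIGNED_INT_BITS` table are hypothetical names introduced here, and only the signed-integer and `float64` rows of the taxonomy are covered.

```python
import sys
from dataclasses import dataclass

# Placeholders named in §1.13: they describe the shape, not a resolution.
PLACEHOLDERS = frozenset({"float", "integer", "Scalar"})

# Hypothetical lookup table: signed integer encodings -> bit width.
SIGNED_INT_BITS = {"int8": 8, "int16": 16, "int32": 32, "int64": 64}


@dataclass(frozen=True)
class Quantity:
    """A value paired with its encoding, i.e. quantity(<value>) : <encoding>."""
    value: float
    encoding: str

    @property
    def is_resolved(self) -> bool:
        # A quantity is resolved once its encoding is no longer a placeholder.
        return self.encoding not in PLACEHOLDERS


def resolve(q: Quantity, target: str) -> Quantity:
    """Commit a quantity to a bounded encoding picked from the taxonomy."""
    if target in PLACEHOLDERS:
        raise ValueError(f"{target!r} is a placeholder, not a resolution")
    if target in SIGNED_INT_BITS:
        bits = SIGNED_INT_BITS[target]
        lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        if q.value != int(q.value) or not lo <= q.value <= hi:
            raise OverflowError(f"{q.value} does not fit in {target}")
        return Quantity(int(q.value), target)
    if target == "float64":
        if abs(q.value) > sys.float_info.max:
            raise OverflowError(f"{q.value} exceeds the float64 range")
        return Quantity(float(q.value), target)
    raise ValueError(f"encoding not covered by this sketch: {target!r}")
```

For example, `resolve(Quantity(5, "integer"), "int32")` yields a resolved quantity, while resolving `2**31` to `int32` raises `OverflowError`, which is the bounded form made concrete. Rejecting a placeholder as a resolution target is a design choice of this sketch: it keeps the "user defines a target resolution" step explicit.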
+
 ---

 ## §2. Prior Art (the user's influences)
@@ -212,11 +224,11 @@ The 8 influences below are grounded in the cluster sub-reports.

 The lexicon is the heart of the de-obfuscation. It is organized in 4 tiers. The total is ~70 terms (after Phase 1 expansion), spanning the 10 cluster sub-reports.

-### §3.1 Tier 1: Core concepts (12 terms)
+### §3.1 Tier 1: Core concepts (13 terms, refined v2)

 | # | Conventional | Re-encoded | Etymology | Source cluster |
 |---|---|---|---|---|
-| 1.1 | `set` | `kind` | Old English *cynd* | Cluster 0, 4 |
+| 1.1 | `set` | **NO RE-ENCODING** (set is a data structure: HashSet, SortedSet, etc.; per user 2026-06-23) | Old English *settan* + *gesamd* | Cluster 0, 4 |
 | 1.2 | `∀` | `forall` | Latin *pro omnibus* | Cluster 2, 4 |
 | 1.3 | `∃` | `exists` | Latin *existere* | Cluster 4 |
 | 1.4 | `∧` | `and` | Old English *and* | Cluster 3 |
@@ -228,35 +240,36 @@ The lexicon is the heart of the de-obfuscation. It is organized in 4 tiers. The
 | 1.10 | `⊥` | `Bottom` | Greek *boussomai* | Cluster 3 |
 | 1.11 | `Notion` (ἔννοια) | `concept` | Greek *ἔννοια* ("having in mind") | Cluster 7 |
 | 1.12 | `Boundary/Term` (ὅρος) | `definitio` | Greek *ὅρος* | Cluster 7 |
+| 1.13 | `<<` / `>>` (NEW v2) | `<<` / `>>` (much less than / much more than) with `tolerance : float64` | Mathematical convention | NEW (v2) |

-### §3.2 Tier 2: Data-oriented pipeline terms (18 terms)
+### §3.2 Tier 2: Data-oriented pipeline terms (18 terms, refined v2)

 | # | Conventional | Re-encoded | Source cluster |
 |---|---|---|---|
-| 2.1 | `function` | `procedure` | Cluster 2, 4 |
-| 2.2 | `parameter` | `argument` | Cluster 2, 4 |
+| 2.1 | `function` | **NO RE-ENCODING** (function = declarative/math; procedure = imperative/CS; per user 2026-06-23) | Cluster 2, 4 |
+| 2.2 | `parameter` | **NO RE-ENCODING** (parameter = formal name; argument = actual value; per user 2026-06-23) | Cluster 2, 4 |
 | 2.3 | `return value` | `result` (or `this`) | Cluster 2 |
 | 2.4 | `definition` | `formation` | Cluster 3 |
-| 2.5 | `input` | `arg` | Cluster 4 |
+| 2.5 | `input` | **NO RE-ENCODING** (input = conceptual act; arg = formal name; per user 2026-06-23) | Cluster 4 |
 | 2.6 | `equation` | `relation` | Cluster 2 |
 | 2.7 | `property` | `property` | Cluster 2 |
 | 2.8 | `lemma` / `corollary` | `claim` (collapse both) | User-specific |
-| 2.9 | `proof` | `construction` | Cluster 0, 7 |
+| 2.9 | `proof` | **NO RE-ENCODING** (construction is a sub-type tag, applied when proof is constructive; per user 2026-06-23) | Cluster 0, 7 |
 | 2.10 | `witness` | `instance` | Cluster 4 |
 | 2.11 | `Attribute` (attributus) | `attribute` (extrinsic) | Cluster 7 |
 | 2.12 | `Property` (proprietas) | `property` (intrinsic) | Cluster 7 |
-| 2.13 | `Type/Genus` (γένος) | `kind` (sense 8) | Cluster 7 |
+| 2.13 | `Type/Genus` (γένος) | **NO RE-ENCODING** (Type/Genus/Kind are analogous; `kind` reserved for enumeration types: components, DAG nodes, fat structs; per user 2026-06-23) | Cluster 7 |
 | 2.14 | `static declaration` | `static { }` | Cluster 6, 9 |
 | 2.15 | `execution block` | `exe { }` | Cluster 6, 9 |
 | 2.16 | `meta-programming` | `CodeSector` | Cluster 9, P14 |
 | 2.17 | `import alias` | `using` (Haskell-style) | Cluster 9, P15 |
 | 2.18 | `assertion` | `'figure 1.9' ... assert -> ... = ...` | Cluster 9, P16 |

-### §3.3 Tier 3: Type-theoretic primitives (18 terms, expanded in Phase 1)
+### §3.3 Tier 3: Type-theoretic primitives (20 terms, refined v2)

 | # | Conventional | Re-encoded | Source cluster |
 |---|---|---|---|
-| 3.1 | `Type` (the meta-type) | `kind` | Cluster 3 |
+| 3.1 | `Type` (the meta-type) | **NO RE-ENCODING** (Type/Genus/Kind are analogous; `kind` reserved for enums; per user 2026-06-23) | Cluster 3 |
 | 3.2 | `Type of types` | `Kind` | Cluster 3 |
 | 3.3 | `Constructor` | `intro` / `construct` | Cluster 3 |
 | 3.4 | `Eliminator` | `elim` / `eliminate` | Cluster 3 |
@@ -269,27 +282,29 @@ The lexicon is the heart of the de-obfuscation. It is organized in 4 tiers. The
 | 3.11 | `Top` | `Top` (to be defined) | Phase 1 |
 | 3.12 | `Pair` (Sigma type) | `Pair<A, B>` with `Build<A>`, `Build<B>` projections | Cluster 3 (Phase 1) |
 | 3.13 | `Pair constructor` | `<M, N>` | Cluster 3 (Phase 1) |
-| 3.14 | `Dependent Function` (Pi type) | `Dependent<x : A>(B)` | Cluster 3 (Phase 1) |
+| 3.14 | `Dependent Function` (Pi type) | **4 notations (NEW v2, B default)**: `Dependent(B) <- depends(x : A)` (B, default) / `Dependent<B>` (C++, opt-in) / `Dependent[B, x : A]` (Odin, opt-in) / `Dependent[B, x : A]` (Jai, opt-in) | Cluster 3 (Phase 1) |
 | 3.15 | `Lambda` | `lambda.x.M` | Cluster 3 (Phase 1) |
 | 3.16 | `objects :` (carrier declaration) | `objects : m : A, n : B ;` | Cluster 3 (Phase 1, Pattern 6) |
 | 3.17 | `Sum` (Disjoint Sum) | `A + B` with `inl`/`inr` injections | Cluster 3 |
 | 3.18 | `Sum elimination` (BNF) | `match(M, N, O)` | Cluster 3 (Phase 1) |
+| 3.19 | `Markov chain` (R4, NEW v2) | `Markov<X, Y, Z> where X -> Y -> Z is a Markov chain` | entropy_epiplexity §5.2 |
+| 3.20 | `PolyTimeAdversary` (R6, NEW v2) | `PolyTimeAdversary : Type where forall A : PolyTimeAdversary, runtime(A) : Polynomial(security_parameter) : int64` | entropy_epiplexity §5.8 |

-### §3.4 Tier 4: AI-fuzzing tolerance terms (22 terms, expanded in Phase 1)
+### §3.4 Tier 4: AI-fuzzing tolerance terms (26 terms, refined v2)

 > **Reading guide.** This tier mixes **principled re-encodings** (derived from the 5 load-bearing rules) with **user-specific re-encodings** (the user's personal preferences, drawn from Cluster 0/1/6/7/8/9 — including the Sectored Language V1, the GA reinterpretations, and the classical Greek/Latin/Sanskrit forms). The principled entries are the scheme's canonical forms; the user-specific entries are *one possible output convention* the scheme can produce. The 4-layer output format (§6.2) is similarly optional. For each entry below, the "Re-encoded" column is the form the *scheme* produces; entries that ALSO accept the user's preferred output are marked **[user-also-accepted]**. Phase 1 (lexicon child) will formalize this distinction.

 | # | Conventional (fuzzy) | Re-encoded (precise) | Source cluster |
 |---|---|---|---|
 | 4.1 | "invent" | `construct` | Cluster 0 |
-| 4.2 | "real number" | `encodable quantity` (or `scalar` for grade-0) | Cluster 0, 8 |
+| 4.2 | "real number" | `quantity(<value>) : <encoding>` where `<encoding>` is `float` (placeholder, general) or `Scalar` (placeholder, linear/geo/tensor alg) or `float64` (resolved) | Cluster 0, 8 |
 | 4.3 | "imaginary number" | `bivector` (with scalar multiplier) | Cluster 0, 8 |
-| 4.4 | "function" | `procedure` or `transform` | Cluster 2 |
+| 4.4 | "function" | **NO RE-ENCODING** (clarify with etymology; function = declarative; procedure = imperative) | Cluster 2 |
 | 4.5 | "magic" | `unboxed` or `indefinite` | Cluster 0, 9 |
 | 4.6 | "natural number" | `Nat = Zero | Succ(Nat)` | Cluster 3 |
 | 4.7 | "smooth" | `infinitely-differentiable` | Cluster 2 |
 | 4.8 | "the limit exists" | `Limit(f, p) : L for some L` | Cluster 2 |
-| 4.9 | "transcendental number" | `template expression for producing a value at a given resolution` | Cluster 1 (Pattern 7), 0 (Cluster A, P2) |
+| 4.9 | "transcendental number" | `classification of expressions that resolve, at a given encoding resolution, to a specific sequence and that satisfy the defining trait of transcendence (not algebraic, i.e. not a root of any nonzero polynomial with rational coefficients); an algebraic irrational expression shares some but not all of these traits` | Cluster 1 (Pattern 7), 0 (Cluster A, P2) |
 | 4.10 | "dot product" | `length-projection product` (or `'scalar product'` per Sectored Language) | Cluster 1 (Pattern 6), 9 |
 | 4.11 | "cross product" | `wedge product` (3D) | Cluster 1 (Pattern 6), 8, 9 |
 | 4.12 | "anti-wedge" | `regressive product` / `contraction` / `interior product` | Cluster 1 (Pattern 6) |
@@ -299,16 +314,20 @@ The lexicon is the heart of the de-obfuscation. It is organized in 4 tiers. The
 | 4.16 | "straight line" | `Εὐθεῖα` (Greek) / `linea recta` (Latin) | Cluster 7 |
 | 4.17 | "kernel" (cross-domain) | `discrete subsystem that holds a continuous process up` | Cluster 0 (Cluster B, P8) |
 | 4.18 | "Bourbaki" | **FOIL** (cultural opponent) | Cluster 0, 9 |
-| 4.19 | "real" (in reals) | `kind : Real` (a type-class, NOT a value) resolves to `quantity : float64` | Cluster 0 (Cluster A, P2) + user 2026-06-23 |
-| 4.20 | "Pi" | `kind : Pi` (a type-class) resolves to `quantity : float64` (default encoding; or `quantity : float128` for high-precision) | Cluster 0 (Deep Math 2 §25) + user 2026-06-23 |
-| 4.21 | "quantity" (a value) | `quantity(<value>) : <encoding>` (e.g., `quantity(3.14) : float64`, `quantity(5) : int64`) | User 2026-06-23 — accepted as the re-encoded form for a value |
-| 4.22 | "scalar" (a value) | `scalar : <encoding>` (e.g., `scalar : float64`) | User 2026-06-23 — accepted as the re-encoded form for a value |
+| 4.19 | "real" (in reals) | `kind : Real` (a type-class) resolves to `float` (general placeholder) or `Scalar` (linear/geo/tensor alg placeholder) or `quantity : <encoding>` (resolved) | Cluster 0 (Cluster A, P2) + user 2026-06-23 |
+| 4.20 | "Pi" | `kind : Pi` (a type-class) resolves to `float` (general placeholder) or `Scalar` (linear/geo/tensor alg placeholder) or `quantity : <encoding>` (resolved) | Cluster 0 (Deep Math 2 §25) + user 2026-06-23 |
+| 4.21 | "quantity" (a value) | `quantity(<value>) : <encoding>` where `<encoding>` is `integer` (placeholder) or `int64` (resolved) for integers; `float` (placeholder) or `float64` (resolved) for reals | User 2026-06-23 — refined v2 |
+| 4.22 | "scalar" (a value) | `float` (general placeholder) or `Scalar` (linear/geo/tensor alg placeholder) or `scalar : <encoding>` (resolved) | User 2026-06-23 — refined v2 |
 | 4.23 | "Lengyel's Standard GA" | **FOIL** (per Cluster 0, Cluster B, P6) | Cluster 0 |
 | 4.24 | "Standard GA" (Hestenes, Dorst) | **FOIL** (Lengyel's Projective GA is the unifier) | Cluster 0 |
+| 4.25 | "correlation" (R1, NEW v2) | `correlation : <encoding>` where `<encoding>` is `float` (placeholder) or `float64` (resolved) | cs229 §2.6 |
+| 4.26 | "<< N" / ">> N" (much less than / much more than, NEW v2) | `weakly_coupled(a, b) : Prop` (predicate) OR `much_less(a, b, tolerance)` / `much_greater(a, b, tolerance)` (comparison, with `tolerance : float64`) | multiscale_hoffman §5.2, neural_dynamics_miller §5.10 |

 ### §3.5 Sectored Language operator terms (Phase 1, from Cluster 6 + Cluster 9)

 > **Reading guide.** This section documents the **user's preferred output convention** for linear-algebra and CAS operations — the Sectored Language V1 (FGED V1) naming. The de-obfuscation scheme does NOT require this convention. It is one example of how the scheme's principled re-encodings (e.g., `scalar product`, `magnitude`) can be realized in an executable form. Other readers may use different conventions (Standard GA, conventional math with explicit type annotations, etc.); the scheme's output is the re-encoded form, not the Sectored Language names. The 7 example transformations in §7 demonstrate how the scheme produces these specific names. **Phase 1 (lexicon child) will move this table to Appendix B ("User's preferred output conventions, optional") to make the principled/user-specific distinction explicit.**
+>
+> **Per user 2026-06-23:** "When it comes to the code psuedo sectr lang is not complete and prob needs adapting or further adjustments." The Sectored Language is a starting point; the user's actual code conventions (C11: raddbg / duffel / pikuma / forth bootslop; Python: manual_slop) take precedence. Pass 3 will adapt the Sectored Language as needed.

 | Conventional | Sectored Language name | Source |
 |---|---|---|
@@ -886,5 +905,23 @@ These are not blocking questions for the deob-warmup's shipping. They are open q

 ---

-*End of `report.md`. Total: 12 sections (including §11 Scope and Limits) + 2 appendices (A: Provenance). Spec FR4 structure: complete. ~700+ LOC main report + ~2,491 LOC cluster sub-reports (with §0.6.2.6 Phase 1.5 expansion in Cluster 0) = ~3,200+ LOC total. Phase 1 (lexicon child) will refine and extend the 31 unresolved items.*
+## §10. Per-language rendering for `<<` / `>>` (NEW v2)
+
+**See `lexicon.md` §9 for the full per-language rendering specification.** This section is a brief pointer.
+
+The `<<` / `>>` operators (much less than / much more than) cannot be rendered as-is in the target languages. In C11 and Python, `a << b` and `a >> b` are the bit-shift operators. In Forth, dialects that define `<<` / `>>` also use them for shifts (standard Forth spells the shift words `LSHIFT` / `RSHIFT`). Using the principled form directly would collide with these bit-shift tokens.
+
+**Resolution:** use named functions or operators in the target language. Per user 2026-06-23:
+
+| Principled form | C11 rendering | Python rendering | Forth rendering |
+|---|---|---|---|
+| `<<` (much less than) | `much_less(a, b, tolerance)` | `much_less(a, b, tolerance)` | `much_less` (named word) |
+| `>>` (much more than) | `much_greater(a, b, tolerance)` | `much_greater(a, b, tolerance)` | `much_greater` (named word) |
+| `<<` / `>>` (predicate) | `weakly_coupled(a, b, tolerance)` | `weakly_coupled(a, b, tolerance)` | `weakly_coupled` (named word) |
+
+The full §9 of `lexicon.md` (the lexicon child) documents the C11/Python/Forth renderings in detail, including example code.
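
A minimal Python sketch of the named-function rendering from the table above. The function names and the `tolerance` argument come from the table; the comparison rule itself (`|a| < tolerance * |b|`, with `0 < tolerance < 1`) is an assumption made for illustration, since this section does not define it. `lexicon.md` §9 remains the authoritative specification.

```python
def much_less(a: float, b: float, tolerance: float) -> bool:
    """Named rendering of `a << b`.

    Assumed semantics (not specified in this report): the magnitude of `a`
    is strictly below `tolerance` times the magnitude of `b`.
    """
    if not 0.0 < tolerance < 1.0:
        raise ValueError("tolerance must lie strictly between 0 and 1")
    return abs(a) < tolerance * abs(b)


def much_greater(a: float, b: float, tolerance: float) -> bool:
    """Named rendering of `a >> b`, defined here as the mirror of much_less."""
    return much_less(b, a, tolerance)


# The collision that motivates the named functions: on Python integers,
# `<<` and `>>` are already the bit-shift operators.
shifted = 1 << 3  # 8, a left shift, not a "much less than" comparison
```

With these definitions, `much_less(1e-6, 1.0, 1e-3)` holds and `much_less(0.5, 1.0, 1e-3)` does not. The strict inequality means two zero values are never "much less" than each other, which is one possible convention among several.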
+
+---
+
+*End of `report.md`. Total: 13 sections (including §10 Per-language rendering pointer, new in v2, and §11 Scope and Limits) + 2 appendices (A: Provenance). Spec FR4 structure: complete. ~700+ LOC main report + ~2,491 LOC cluster sub-reports (with §0.6.2.6 Phase 1.5 expansion in Cluster 0) = ~3,200+ LOC total. Phase 1 (lexicon child) will refine and extend the 31 unresolved items.*