diff --git a/conductor/tracks/video_analysis_deob_lexicon_20260621/lexicon.md b/conductor/tracks/video_analysis_deob_lexicon_20260621/lexicon.md index cfab29c2..06d8704f 100644 --- a/conductor/tracks/video_analysis_deob_lexicon_20260621/lexicon.md +++ b/conductor/tracks/video_analysis_deob_lexicon_20260621/lexicon.md @@ -48,9 +48,14 @@ The warmup's report.md §3.4 and §3.5 documented this distinction in reading-gu The LLM **MUST** produce the principled re-encoding. The LLM **MAY** produce the user-specific form as an additional output (per the verification checklist in `prompt_template.md`). -### §0.4 The encoding-explicit form (per user 2026-06-23) +### §0.4 The encoding-explicit form (per user 2026-06-23, refined v2) -Every value-bearing term has an `encoding:` attribute (per Rule 5, the new Rule 5 added in Phase 1.5 of the warmup). The encoding is the **bounded form** of the value. Default: `float64` (~16 decimal digits). See §7 for the encoding taxonomy. +Every value-bearing term has an `encoding:` attribute (per Rule 5). The encoding is the **bounded form** of the value. The **principled defaults** (v2) are: +- `float` (general unbounded float placeholder) or +- `integer` (general unbounded integer placeholder) or +- `Scalar` (placeholder with specific meaning in linear algebra, geometric algebra, tensor algebra; per user 2026-06-23 "useful for linear alg, geo alg, tensor alg"). + +`float64` is the principled resolved default **only when the user defines a target resolution** for the application. The v1 lexicon's blanket `float64` default was over-committing; v2 defers the resolution until the application context demands it. See §7 for the encoding taxonomy. See L7 in `V2_CHANGELOG.md` for the v1→v2 transition. --- @@ -121,32 +126,44 @@ The 4-rule pattern (per Cluster 3) is extended to include a "compression history **Selective compression (per Cluster 0, P41):** the "linear dependence / associativity / commutativity" axioms are **opt-in / opt-out per operation**, not globally. The LLM treats them as compression flags: `linear_dependence: on / off`, `associativity: on / off`, `commutativity: on / off`. This is the operational form of the lossless rule. -### §1.5 Rule 5: Encoding-explicit (per user 2026-06-23) +### §1.5 Rule 5: Encoding-explicit (per user 2026-06-23, refined 2026-06-23) **Statement.** Every value-bearing term has an `encoding:` attribute. The encoding is the **bounded form** of the value; without the encoding, the value is **indefinite** (per §1.1). -**Default encoding:** `float64` (~16 decimal digits). +**Principled encoding default (v2):** +- `float` (general unbounded float placeholder) +- `integer` (general unbounded integer placeholder) +- `Scalar` (placeholder with specific meaning in linear alg, geo alg, tensor alg; per user 2026-06-23) -**The encoding taxonomy (per warmup §11.2):** +The principled form **defers** the resolution choice until the user defines a target resolution for the application. This is the operational form of the ontology axiom (per user 2026-06-23): "you can observe the shape of the procedure, not all possible result combinations or resolutions." -| Encoding | Range | Precision | Use case | -|---|---|---|---| -| `int8` | -128 to 127 | exact | small integers, byte-level | -| `int16` | -32,768 to 32,767 | exact | short integers | -| `int32` | -2,147,483,648 to 2,147,483,647 | exact | standard integers | -| `int64` | -9.2 × 10^18 to 9.2 × 10^18 | exact | large integers, IDs | -| `uint8 / uint16 / uint32 / uint64` | 0 to 2^N-1 | exact | unsigned integers | -| `float16` | ±65,504 | ~3.3 decimal digits | graphics, ML | -| `float32` | ±3.4 × 10^38 | ~7.2 decimal digits | standard floats | -| `float64` (**DEFAULT**) | ±1.8 × 10^308 | ~15.9 decimal digits | standard scientific | -| `float128` | ±1.2 × 10^4932 | ~19.0 decimal digits | high-precision | -| `bigint` | arbitrary | exact | cryptographic | -| `decimal64` | ±9.9 × 10^384 | 16-19 decimal digits | financial | -| `decimal128` | ±9.9 × 10^6144 | 34-38 decimal digits | high-precision financial | +**`float64` is the principled default ONLY when the user defines a target resolution** (e.g., "the encoding matters for this code: `float64`"). The v1 lexicon's blanket `float64` default was over-committing to a specific resolution; the v2 defers until the application context demands it. See L7 in `V2_CHANGELOG.md` for the v1→v2 transition. -**The encoding is mandatory** for any value-bearing term. A `quantity` or `scalar` without an encoding is **indefinite** (per §1.1); a `quantity : float64` or `scalar : int32` is **bounded** (per §1.1). +**The encoding taxonomy (per warmup §11.2, refined v2):** -**The encoding is the bounded form** (§1.1 made concrete): a `quantity : float64` is bounded to ~1.8 × 10^308; a `quantity : int32` is bounded to ±2.1 × 10^9; etc. +| Encoding | Range | Precision | Use case | Status | +|---|---|---|---|---| +| `int8` | -128 to 127 | exact | small integers, byte-level | resolved | +| `int16` | -32,768 to 32,767 | exact | short integers | resolved | +| `int32` | -2,147,483,648 to 2,147,483,647 | exact | standard integers | resolved | +| `int64` | -9.2 × 10^18 to 9.2 × 10^18 | exact | large integers, IDs | resolved | +| `uint8 / uint16 / uint32 / uint64` | 0 to 2^N-1 | exact | unsigned integers | resolved | +| `integer` | unbounded until resolution chosen | exact | deferred integer resolution | **PLACEHOLDER** (v2, general purpose) | +| `float16` | ±65,504 | ~3.3 decimal digits | graphics, ML | resolved | +| `float32` | ±3.4 × 10^38 | ~7.2 decimal digits | standard floats | resolved | +| `float64` | ±1.8 × 10^308 | ~15.9 decimal digits | standard scientific | **RESOLVED DEFAULT** (when target resolution matters) | +| `float128` | ±1.2 × 10^4932 | ~19.0 decimal digits | high-precision | resolved | +| `float` | unbounded until resolution chosen | depends on resolution | deferred float resolution | **PLACEHOLDER** (v2, general purpose) | +| `Scalar` | unbounded | depends on resolution | linear alg, geo alg, tensor alg (per user 2026-06-23) | **PLACEHOLDER** (v2, domain-specific) | +| `bigint` | arbitrary | exact | cryptographic | resolved | +| `decimal64` | ±9.9 × 10^384 | 16-19 decimal digits | financial | resolved | +| `decimal128` | ±9.9 × 10^6144 | 34-38 decimal digits | high-precision financial | resolved | + +**The encoding is mandatory** for any value-bearing term. A `quantity` or `scalar` without an encoding is **indefinite** (per §1.1); a `quantity : float64` or `scalar : int32` or `Scalar : float64` is **bounded** (per §1.1). + +**The encoding is the bounded form** (§1.1 made concrete): a `quantity : float64` is bounded to ~1.8 × 10^308; a `quantity : int32` is bounded to ±2.1 × 10^9; a `Scalar` is bounded only when the user resolves the resolution (otherwise it is indefinite). + +**The ontology note (per user 2026-06-23, confirmed):** "You can observe the shape of the procedure, not all possible result combinations or resolutions for a given metric utilized with that procedure. If the hedging is unrelated to this I need clarification." **The hedging IS related.** The encoding-placeholder form (`Scalar`, `floatX`, `integer`) reflects this: the *shape* of the procedure (its type, its operation) is observable; the *resolution* (the specific encoding) is a user-defined target. The honest epistemic hedging pattern is the operational form of this distinction. --- @@ -154,13 +171,13 @@ The 4-rule pattern (per Cluster 3) is extended to include a "compression history The lexicon is organized in 4 tiers. Total: ~70 terms. Each entry has: tag, conventional form, principled re-encoding, etymology (1-line origin + 1-line history), form anchor, source cluster. -### §2.1 Tier 1: Core concepts (12 terms) +### §2.1 Tier 1: Core concepts (13 terms, refined v2) The principled core. These are the foundational type-theoretic primitives + the constructive type theory operators. | # | Conventional | Principled re-encoding | Etymology | Source cluster | |---|---|---|---|---| -| 1.1 | `set` | `kind` | Old English *cynd* ("kind, sort, nature") | Cluster 0, 4 | +| 1.1 | `set` | **NO RE-ENCODING** (clarify with etymology; set is a data structure, not an enumerable type) | Old English *settan* + *gesamd*; the user's domain has specific data structures `HashSet`, `SortedSet`, etc. | Cluster 0, 4 | | 1.2 | `∀` | `forall` | Latin *pro omnibus* ("for all") | Cluster 2, 4 | | 1.3 | `∃` | `exists` | Latin *existere* ("to stand out, to be") | Cluster 4 | | 1.4 | `∧` | `and` | Old English *and* | Cluster 3 | @@ -172,46 +189,55 @@ The principled core. These are the foundational type-theoretic primitives + the | 1.10 | `⊥` | `Bottom` | Greek *βύσμα* via *boussomai* ("to stop up") | Cluster 3 | | 1.11 | `Notion` (ἔννοια) | `concept` `[user-also-accepted]` | Greek *ἔννοια* ("having in mind"); the user's preferred form | Cluster 7 | | 1.12 | `Boundary/Term` (ὅρος) | `definitio` `[user-also-accepted]` | Greek *ὅρος* ("boundary, term, definition"); Latin *definitio* | Cluster 7 | +| 1.13 | `<<` / `>>` | `<<` / `>>` (much less than / much more than, with `tolerance` for the fuzzy relation) | Mathematical convention; the user's `tolerance` is the explicit epsilon | NEW (v2) | + +**Note on §2.1.1 (v2):** per user 2026-06-23, "Set is a specific data structure, not directly related to an enumerable type." The v1 re-encoding of `set` to `kind` was over-broad; v2 removes the re-encoding. The data structure `set` is a *specific implementation* (HashSet, SortedSet, etc.), distinct from the constructive type theory identification of sets/kinds/types. See L1 in `V2_CHANGELOG.md`. **Note on §2.1.11-12:** the principled form for these is the type-theoretic one (`Notion : type`, `Boundary : type` per Cluster 7, `Notiones.txt`). The `concept` and `definitio` forms are user-specific; the user has a working definition of `Type = "successful act of association"` (per Cluster 7, line 78) and uses these as the operational form. -### §2.2 Tier 2: Data-oriented pipeline terms (18 terms) +**Note on §2.1.13 (NEW v2):** per user 2026-06-23, the `<<` / `>>` operators are "much less than" / "much more than" with explicit `tolerance`. The v1 lexicon had no Tier 1 entry for these; they appeared in `apply_report.md` R9 as fuzzy patterns. v2 promotes them to Tier 1 (core comparison) AND Tier 4 (fuzzy with tolerance; see §2.4 #4.26) AND §9 (per-language rendering). Three placements, three roles. + +### §2.2 Tier 2: Data-oriented pipeline terms (18 terms, refined v2) The principled pipeline. These are the data-oriented + constructive type theory operators. | # | Conventional | Principled re-encoding | Source cluster | |---|---|---|---| -| 2.1 | `function` | `procedure` | Cluster 2, 4 | -| 2.2 | `parameter` | `argument` | Cluster 2, 4 | +| 2.1 | `function` | **NO RE-ENCODING** (clarify with etymology; function = declarative/math; procedure = imperative/CS) | Cluster 2, 4 | +| 2.2 | `parameter` | **NO RE-ENCODING** (clarify with etymology; parameter = formal name in definition; argument = actual value passed) | Cluster 2, 4 | | 2.3 | `return value` | `result` (or `this`) | Cluster 2 | | 2.4 | `definition` | `formation` | Cluster 3 | -| 2.5 | `input` | `arg` | Cluster 4 | +| 2.5 | `input` | **NO RE-ENCODING** (clarify with etymology; input = conceptual act; arg = formal name) | Cluster 4 | | 2.6 | `equation` | `relation` | Cluster 2 | | 2.7 | `property` | `property` | Cluster 2 | | 2.8 | `lemma` / `corollary` | `claim` (collapse both) | User-specific `[user-also-accepted]` | -| 2.9 | `proof` | `construction` | Cluster 0, 7 | +| 2.9 | `proof` | **NO RE-ENCODING** (construction is a sub-type tag, applied when the proof is constructive) | Cluster 0, 7 | | 2.10 | `witness` | `instance` | Cluster 4 | -| 2.11 | `Attribute` (attributus) | `attribute` (extrinsic) `[user-also-accepted]` | Cluster 7 | -| 2.12 | `Property` (proprietas) | `property` (intrinsic) `[user-also-accepted]` | Cluster 7 | -| 2.13 | `Type/Genus` (γένος) | `kind` (sense 8) `[user-also-accepted]` | Cluster 7 | +| 2.11 | `Attribute` (attributus) | `attribute` (extrinsic) `[user-also-accepted]` (NO re-encoding; Type/Attribute/Property/Genus are analogous) | Cluster 7 | +| 2.12 | `Property` (proprietas) | `property` (intrinsic) `[user-also-accepted]` (NO re-encoding; Type/Attribute/Property/Genus are analogous) | Cluster 7 | +| 2.13 | `Type/Genus` (γένος) | **NO RE-ENCODING** (Type/Genus/Kind are analogous; `kind` reserved for enumeration types: components, DAG nodes, fat structs) | Cluster 7 | | 2.14 | `static declaration` | `static { }` | Cluster 6, 9 | | 2.15 | `execution block` | `exe { }` | Cluster 6, 9 | | 2.16 | `meta-programming` | `CodeSector` | Cluster 9, P14 | | 2.17 | `import alias` | `using` (Haskell-style) | Cluster 9, P15 | | 2.18 | `assertion` | `'figure 1.9' ... assert -> ... = ...` | Cluster 9, P16 | +**Note on §2.2.1, §2.2.2, §2.2.5 (v2):** per user 2026-06-23, "A procedure is not necessarily a function, a function has a specific declarative sense it, a procedure is imperative. Parameters and arguments are also distinctly separate terms they are not analogous ... input and arg are also not the same." The v1 lexicon collapsed these distinctions; v2 removes the collapses and clarifies with native language + etymology. See L2, L3, L4 in `V2_CHANGELOG.md`. + +**Note on §2.2.9 (v2):** per user 2026-06-23, "a construction is a type of proof." The v1 lexicon collapsed `proof → construction`; v2 removes the collapse. When a proof is constructive, the `construction` tag is applied as a sub-type annotation, not as a replacement. See L5 in `V2_CHANGELOG.md`. + +**Note on §2.2.11-13 (v2):** per user 2026-06-23, "Type/Genus/Kind are analogous no need to reencode. Type and Type of types is analogous to kind. Kind is useful term to reserve for enumeration types, especially when used with components, DAG nodes, or fat structs." The v1 lexicon re-encoded Attribute → `attribute` (lowercase), Property → `property` (lowercase), Type/Genus → `kind` (lowercase). v2 removes these collapses; the principled form is to keep the conventional form (Type, Attribute, Property, Genus) and reserve `kind` (lowercase) for the enumeration case. See L8 in `V2_CHANGELOG.md`. + **Note on §2.2.8:** the principled form for `lemma` and `corollary` is to keep them as separate terms (per conventional math). The user collapses them to `claim` per Cluster 9 (Pattern 14). This is user-specific. -**Note on §2.2.11-13:** the principled form for these is the type-theoretic one (`Attribute : type`, `Property : type`, `Type/Kind : kind`). The `attribute (extrinsic)`, `property (intrinsic)`, `kind` forms are user-specific per Cluster 7, `Notiones.txt` line 78 + the **4 elements** (Notion / Attribute / Property / Type/Genus) pattern. - -### §2.3 Tier 3: Type-theoretic primitives (18 terms) +### §2.3 Tier 3: Type-theoretic primitives (20 terms, refined v2) The principled type-theoretic primitives. These are the **constructive type theory** (Per Martin-Löf tradition) operators that the user has operationalized in `TypeTheory.bp`. | # | Conventional | Principled re-encoding | Source cluster | |---|---|---|---| -| 3.1 | `Type` (the meta-type) | `kind` | Cluster 3 | -| 3.2 | `Type of types` | `Kind` | Cluster 3 | +| 3.1 | `Type` (the meta-type) | **NO RE-ENCODING** (Type/Genus/Kind are analogous; per user 2026-06-23, `kind` reserved for enumeration types) | Cluster 3 | +| 3.2 | `Type of types` | `Kind` (capital K; the type-of-types term in standard type theory) | Cluster 3 | | 3.3 | `Constructor` | `intro` / `construct` | Cluster 3 | | 3.4 | `Eliminator` | `elim` / `eliminate` | Cluster 3 | | 3.5 | `Computation rule` (value-level) | `comp` | Cluster 3 | @@ -223,29 +249,42 @@ The principled type-theoretic primitives. These are the **constructive type theo | 3.11 | `Top` | `Top` (to be defined — see §10) | Phase 1 | | 3.12 | `Pair` (Sigma type) | `Pair` with `Build`, `Build` projections | Cluster 3 (Phase 1) | | 3.13 | `Pair constructor` | `` | Cluster 3 (Phase 1) | -| 3.14 | `Dependent Function` (Pi type) | `Dependent(B)` | Cluster 3 (Phase 1) | +| 3.14 | `Dependent Function` (Pi type) | **4 notations** (B default, C++ opt-in, Odin opt-in, Jai): see template notation table below | Cluster 3 (Phase 1) | | 3.15 | `Lambda` | `lambda.x.M` | Cluster 3 (Phase 1) | | 3.16 | `objects :` (carrier declaration) | `objects : m : A, n : B ;` | Cluster 3 (Phase 1, Pattern 6) | | 3.17 | `Sum` (Disjoint Sum) | `A + B` with `inl`/`inr` injections | Cluster 3 | | 3.18 | `Sum elimination` (BNF) | `match(M, N, O)` | Cluster 3 (Phase 1) | +| 3.19 | `Markov chain` (R4, NEW v2) | `Markov where X -> Y -> Z` (a type-class predicate) | entropy_epiplexity §5.2 | +| 3.20 | `PolyTimeAdversary` (R6, NEW v2) | `PolyTimeAdversary : Type where forall A : PolyTimeAdversary, runtime(A) : Polynomial(security_parameter) : int64` | entropy_epiplexity §5.8 | + +**Template notation table for #3.14 (NEW v2, per user 2026-06-23):** + +| Notation | Form | Default? | Per-language context | +|---|---|---|---| +| **B (intent scripting DSL)** | `Dependent(B) <- depends(x : A)` | **DEFAULT** | The principled form per user 2026-06-23; explicit dependency declaration | +| **C++ template style** | `Dependent` | opt-in (C++ context) | C++ projects only; the user requests this when the C++ convention is preferred | +| **Odin parametric style** | `Dependent[B, x : A]` | opt-in (Odin context) | Odin projects only; the user requests this when the Odin convention is preferred | +| **Jai parametric style** | (Same as Odin) | opt-in (Jai context) | Jai projects only; the user requests this when the Jai convention is preferred | + +**Selection rule (per user 2026-06-23, "B as default"):** "Use B as default. Use c++ template language style in specific scenario when the user asks for it or if its better (same with jai/odin language have the lexicon support it if the user desires it)." **The 4-rule pattern (per Cluster 3, Pattern 2):** every "fully-defined" type has 4 rules — Introduction, Elimination, Computation, Uniqueness. The de-obfuscation checks for this pattern when de-obfuscating a type definition. The **type-correctness computation** (per Cluster 3, Pattern 4) is the user's contribution: for product types, the Computation rule splits into value-level (β-reduction) and type-level (type-correctness check). -### §2.4 Tier 4: AI-fuzzing tolerance terms (24 terms, expanded in Phase 1) +### §2.4 Tier 4: AI-fuzzing tolerance terms (26 terms, refined v2) > **Reading guide.** This tier mixes **principled re-encodings** (from the 5 rules) with **user-specific re-encodings** (the user's personal preferences). Entries with `[user-also-accepted]` ALSO accept the user's preferred form; entries with `(per user-also-accepted)` are PURELY user-specific (no principled form). | # | Conventional (fuzzy) | Principled re-encoding | User-specific form (optional) | Source cluster | |---|---|---|---|---| | 4.1 | "invent" | `construct` | | Cluster 0 | -| 4.2 | "real number" | `quantity() : ` (e.g., `quantity(3.14) : float64`) | `encodable quantity` (per Cluster 0) | Cluster 0, 8 | +| 4.2 | "real number" | `quantity() : ` where `` is `float` (placeholder, general) or `Scalar` (placeholder, linear/geo/tensor alg) or `float64` (resolved) | `encodable quantity` (per Cluster 0) | Cluster 0, 8 | | 4.3 | "imaginary number" | `bivector (with scalar multiplier)` | `bivector` (per Cluster 0, Pattern 2 — GA reinterpretation) | Cluster 0, 8 | -| 4.4 | "function" | `procedure` | `transform` (per Cluster 2) | Cluster 2 | +| 4.4 | "function" | **NO RE-ENCODING** (clarify with etymology; function = declarative; procedure = imperative) | `transform` (per Cluster 2, only when the user requests) | Cluster 2 | | 4.5 | "magic" | `unboxed` (the boxed/unboxed distinction) | `indefinite` (per Cluster 0, 9) | Cluster 0, 9 | | 4.6 | "natural number" | `Nat = Zero | Succ(Nat)` | | Cluster 3 | | 4.7 | "smooth" | `infinitely-differentiable` | | Cluster 2 | | 4.8 | "the limit exists" | `Limit(f, p) : L for some L` | | Cluster 2 | -| 4.9 | "transcendental number" | `template expression for producing a value at a given resolution` | | Cluster 1 (Pattern 7), 0 (Cluster A, P2) | +| 4.9 | "transcendental number" | `classification of expressions that resolve to a specific sequence consistent with the encoding resolution, fulfilling very specific traits (transcendence over algebraic); an algebraic expression that fulfills the term for irrationals shares some but not all traits` | | Cluster 1 (Pattern 7), 0 (Cluster A, P2) | | 4.10 | "dot product" | `length-projection product` (the principled form, Cluster 1 Pattern 6) | `'scalar product'` (per Sectored Language, Cluster 9) | Cluster 1, 9 | | 4.11 | "cross product" | `wedge product` (3D) | `'cross product'` (per Sectored Language, Cluster 9) | Cluster 1, 8, 9 | | 4.12 | "anti-wedge" | `regressive product` / `contraction` / `interior product` | | Cluster 1 (Pattern 6) | @@ -255,16 +294,26 @@ The principled type-theoretic primitives. These are the **constructive type theo | 4.16 | "straight line" | `Εὐθεῖα` (Greek) / `linea recta` (Latin) | | Cluster 7 | | 4.17 | "kernel" (cross-domain) | `discrete subsystem that holds a continuous process up` | | Cluster 0 (Cluster B, P8) | | 4.18 | "Bourbaki" | **FOIL** (cultural opponent) | | Cluster 0, 9 | -| 4.19 | "real" (in reals) | `kind : Real` resolves to `quantity : float64` | | Cluster 0 (Cluster A, P2) + user 2026-06-23 | -| 4.20 | "Pi" | `kind : Pi` resolves to `quantity : float64` (or `float128`) | | Cluster 0 (Deep Math 2 §25) + user 2026-06-23 | -| 4.21 | "quantity" (a value) | `quantity() : ` (e.g., `quantity(3.14) : float64`, `quantity(5) : int64`) | | User 2026-06-23 | -| 4.22 | "scalar" (a value) | `scalar : ` (e.g., `scalar : float64`) | | User 2026-06-23 | +| 4.19 | "real" (in reals) | `kind : Real` (a type-class) resolves to `float` (general placeholder) or `Scalar` (linear/geo/tensor alg placeholder) or `quantity : ` (resolved) | | Cluster 0 (Cluster A, P2) + user 2026-06-23 | +| 4.20 | "Pi" | `kind : Pi` (a type-class) resolves to `float` (general placeholder) or `Scalar` (linear/geo/tensor alg placeholder) or `quantity : ` (resolved) | | Cluster 0 (Deep Math 2 §25) + user 2026-06-23 | +| 4.21 | "quantity" (a value) | `quantity() : ` where `` is `integer` (placeholder) or `int64` (resolved) for integers; `float` (placeholder) or `float64` (resolved) for reals | | User 2026-06-23 | +| 4.22 | "scalar" (a value) | `float` (general placeholder) or `Scalar` (linear/geo/tensor alg placeholder) or `scalar : ` (resolved) | | User 2026-06-23 | | 4.23 | "Lengyel's Standard GA" | **FOIL** (per Cluster 0, Cluster B, P6) | | Cluster 0 | | 4.24 | "Standard GA" (Hestenes, Dorst) | **FOIL** (Lengyel's Projective GA is the unifier) | | Cluster 0 | +| 4.25 | "correlation" (R1, NEW v2) | `correlation : float64` (e.g., `correlation = 0.98`) | | cs229 §2.6 | +| 4.26 | "<< N" / ">> N" (much less than / much more than) | `weakly_coupled(a, b) : Prop` (predicate) OR `much_less(a, b, tolerance)` / `much_greater(a, b, tolerance)` (comparison, with explicit `tolerance : float64`) | | multiscale_hoffman §5.2, neural_dynamics_miller §5.10 (R9/R11) | -**Per user 2026-06-23:** "Quantity or scalar for value is fine but to keep in mind that if they are used, it should be associated with a finite encoding. Whereas the real number line for example is a classification of expressions that may resolve to any finite encoding of quantity resolution." +**Note on §2.4.2, §2.4.19-4.22 (v2):** per user 2026-06-23, the principled encoding default is `Scalar` or `floatX` (placeholder), not `float64`. `float64` is the principled default ONLY when the user defines a target resolution for the application. The taxonomy table (§1.5) documents the placeholder vs resolved distinction. See L7 in `V2_CHANGELOG.md`. -The encoding-explicit form is the operational form of §1.1 (form requires bounds): every value must have a bounded form. The user's contribution: the bound is the `encoding:`. +**Note on §2.4.4 (v2):** per user 2026-06-23, "A procedure is not necessarily a function, a function has a specific declarative sense it, a procedure is imperative." The v1 lexicon re-encoded `function → procedure`; v2 removes this. The `transform` user-specific form is opt-in only when the user requests it. See L2 in `V2_CHANGELOG.md`. + +**Note on §2.4.9 (v2):** per user 2026-06-23, "the transcendental number is a classification of expressions that resolve to a specific sequence that is consistent for the encoding resolution, and fullfills very specific traits. An algebraic expression that fullfills the term for irrationals also shares similar traits but is mising some that transcendental has." The v1 re-encoded transcendental as "template expression for producing a value at a given resolution"; v2 replaces with the classification form. See L6 in `V2_CHANGELOG.md`. + +**Note on §2.4.25 (R1, NEW v2):** correlation was the #1 refinement from the apply phase; v2 adds it to the encoding-explicit examples with `float64` as the principled encoding. + +**Note on §2.4.26 (NEW v2):** `<<` / `>>` with explicit `tolerance`. The `weakly_coupled(a, b)` predicate form is for "loose correlation" (two things are loosely related); the `much_less(a, b, tolerance)` / `much_greater(a, b, tolerance)` forms are for comparison. Per user 2026-06-23, "weakly_coupled(...) is good for c11, much_less and much_greater can be used as well." See <<1, <<2, <<3 in `V2_CHANGELOG.md`. + +**Per user 2026-06-23:** "Quantity or scalar for value is fine but to keep in mind that if they are used, it should be associated with a finite encoding. Whereas the real number line for example is a classification of expressions that may resolve to any finite encoding of quantity resolution." The v2 encoding form is the operational form of this: `Scalar` / `floatX` / `integer` as placeholders, with `float64` / `int64` / etc. only when the user defines a target resolution. --- @@ -673,7 +722,7 @@ When a single source fails (e.g., Wiktionary has no entry), the user tries multi --- -## §7. The Encoding-Explicit Rule (formal definition) +## §7. The Encoding-Explicit Rule (formal definition, refined v2) The encoding-explicit rule is the **third operational requirement**, derived from §1.5. @@ -681,42 +730,59 @@ The encoding-explicit rule is the **third operational requirement**, derived fro **Every value-bearing term has an `encoding:` attribute. The encoding is the bounded form of the value.** -Default encoding: `float64` (~16 decimal digits). +**Principled defaults (v2, refined):** +- `float` (general unbounded float placeholder) +- `integer` (general unbounded integer placeholder) +- `Scalar` (placeholder with specific meaning in linear alg, geo alg, tensor alg) -### §7.2 The encoding taxonomy +The placeholder defers the resolution until the user defines a target resolution for the application. `float64` is the principled resolved default ONLY when the user defines a target resolution. **The choice between `float` and `Scalar` as a placeholder is domain-driven:** `float` is the general-purpose placeholder; `Scalar` is the domain-specific placeholder for linear/geo/tensor algebra (per user 2026-06-23, "Scalar is useful for linear alg, geo alg, tensor alg"). -| Encoding | Range | Precision | Use case | +**The ontology note (per user 2026-06-23, confirmed):** "You can observe the shape of the procedure, not all possible result combinations or resolutions for a given metric utilized with that procedure." The encoding-placeholder form reflects this: the *shape* is observable (the value's type, its operation); the *resolution* is a user-defined target (the specific encoding). The honest epistemic hedging pattern is the operational form of this distinction. + +### §7.2 The encoding taxonomy (v2, with placeholder vs resolved distinction) + +| Encoding | Range | Precision | Use case | Status | +|---|---|---|---|---| +| `int8` | -128 to 127 | exact | small integers, byte-level | resolved | +| `int16` | -32,768 to 32,767 | exact | short integers | resolved | +| `int32` | -2,147,483,648 to 2,147,483,647 | exact | standard integers | resolved | +| `int64` | -9.2 × 10^18 to 9.2 × 10^18 | exact | large integers, IDs | resolved | +| `uint8 / uint16 / uint32 / uint64` | 0 to 2^N-1 | exact | unsigned integers | resolved | +| `integer` | unbounded until resolution chosen | exact | deferred integer resolution | **PLACEHOLDER** (v2, general) | +| `float16` | ±65,504 | ~3.3 decimal digits | graphics, ML | resolved | +| `float32` | ±3.4 × 10^38 | ~7.2 decimal digits | standard floats | resolved | +| `float64` | ±1.8 × 10^308 | ~15.9 decimal digits | standard scientific | **RESOLVED DEFAULT** (when target resolution matters) | +| `float128` | ±1.2 × 10^4932 | ~19.0 decimal digits | high-precision | resolved | +| `float` | unbounded until resolution chosen | depends on resolution | deferred float resolution | **PLACEHOLDER** (v2, general) | +| `Scalar` | unbounded | depends on resolution | linear alg, geo alg, tensor alg | **PLACEHOLDER** (v2, domain-specific) | +| `bigint` | arbitrary | exact | cryptographic | resolved | +| `decimal64` | ±9.9 × 10^384 | 16-19 decimal digits | financial | resolved | +| `decimal128` | ±9.9 × 10^6144 | 34-38 decimal digits | high-precision financial | resolved | + +### §7.3 Examples (v2, with placeholder forms) + +| Conventional | Re-encoded (placeholder) | Re-encoded (resolved) | Encoding | |---|---|---|---| -| `int8` | -128 to 127 | exact | small integers, byte-level | -| `int16` | -32,768 to 32,767 | exact | short integers | -| `int32` | -2,147,483,648 to 2,147,483,647 | exact | standard integers | -| `int64` | -9.2 × 10^18 to 9.2 × 10^18 | exact | large integers, IDs | -| `uint8 / uint16 / uint32 / uint64` | 0 to 2^N-1 | exact | unsigned integers | -| `float16` | ±65,504 | ~3.3 decimal digits | graphics, ML | -| `float32` | ±3.4 × 10^38 | ~7.2 decimal digits | standard floats | -| `float64` (**DEFAULT**) | ±1.8 × 10^308 | ~15.9 decimal digits | standard scientific | -| `float128` | ±1.2 × 10^4932 | ~19.0 decimal digits | high-precision | -| `bigint` | arbitrary | exact | cryptographic | -| `decimal64` | ±9.9 × 10^384 | 16-19 decimal digits | financial | -| `decimal128` | ±9.9 × 10^6144 | 34-38 decimal digits | high-precision financial | +| "real number" (general) | `kind : Real` resolves to `float` | `kind : Real` resolves to `quantity : float64` | `float` or `float64` | +| "real number" (linear/geo/tensor alg) | `kind : Real` resolves to `Scalar` | `kind : Real` resolves to `Scalar : float64` | `Scalar` or `Scalar : float64` | +| "Pi" | `kind : Pi` resolves to `float` | `kind : Pi` resolves to `quantity : float64` | `float` or `float64` | +| "the value is 5" (general) | `quantity(5) : integer` | `quantity(5) : int64` | `integer` or `int64` | +| "the value is 3.14" (general) | `quantity(3.14) : float` | `quantity(3.14) : float64` | `float` or `float64` | +| "the value is 3.14" (linear alg) | `Scalar(3.14)` | `Scalar(3.14) : float64` | `Scalar` or `Scalar : float64` | +| "the probability is 0.5" | `quantity(0.5) : float` | `quantity(0.5) : float64` | `float` or `float64` | +| "the matrix" (linear alg) | `Matrix : 3x3 of Scalar` | `Matrix : 3x3 of float64` | `Scalar` or `float64` | +| "the vector" (linear alg) | `Vector : n of Scalar` | `Vector : n of float64` | `Scalar` or `float64` | +| "the correlation is 0.98" (R1, NEW v2) | `correlation : float` | `correlation : float64` | `float` or `float64` | -### §7.3 Examples - -| Conventional | Re-encoded | Encoding | -|---|---|---| -| "real number" | `kind : Real` resolves to `quantity : float64` | `float64` | -| "Pi" | `kind : Pi` resolves to `quantity : float64` | `float64` | -| "the value is 5" | `quantity(5) : int64` | `int64` | -| "the value is 3.14" | `quantity(3.14) : float64` | `float64` | -| "the probability is 0.5" | `quantity(0.5) : float64` | `float64` | -| "the matrix" | `Matrix : 3x3 of float64` | (matrix + element encoding) | -| "the vector" | `Vector : n of float64` | (vector + element encoding) | - -### §7.4 Per user 2026-06-23 clarification +### §7.4 Per user 2026-06-23 clarification (refined v2) "Quantity or scalar for value is fine but to keep in mind that if they are used, it should be associated with a finite encoding. Whereas the real number line for example is a classification of expressions that may resolve to any finite encoding of quantity resolution." -The encoding is the operational form of §1.1 (form requires bounds): every value must have a bounded form. The user's contribution: the bound is the `encoding:`. +The v2 form adds: **"I do like the encoding taxonomy table you have when picking a resolution matters though."** The taxonomy table (§7.2) is preserved — the placeholder vs resolved distinction makes the taxonomy *more* useful, not less. The user can pick a resolution from the table when the application context demands it. + +**`Scalar` is preserved** (per user 2026-06-23, "keep scalar its useful for linear alg, geo alg, tensor alg"). The principled form distinguishes: +- `float` = general-purpose unbounded float placeholder +- `Scalar` = domain-specific placeholder for linear alg, geo alg, tensor alg (where the geometric meaning matters) --- @@ -745,7 +811,87 @@ The encoding is the operational form of §1.1 (form requires bounds): every valu --- -## §9. The 12 unresolved items (per warmup §A.3) — addressed +## §9. Per-language rendering (NEW v2) + +The `<<` / `>>` operators appear in 3 placements (Tier 1 #1.13, Tier 4 #4.26, this section). This section documents how they render in C11, Python, and Forth — the languages Pass 3 will use. + +**The conflict:** in C11, `a << b` and `a >> b` are bit-shift operators. In Python, the same. In Forth, `a b <<` is a shift. The principled form (`<<` / `>>` as "much less than" / "much more than") cannot be used as-is in these languages — there's a namespace collision with bit-shift. + +**Resolution:** use named functions or operators in the target language. The principled form (`<<` / `>>` with `tolerance`) is reserved for the abstract mathematical context (e.g., the lexicon, the type-theoretic spec). In code, the named functions are used. + +### §9.1 C11 rendering (per user 2026-06-23) + +Per user 2026-06-23, "weakly_coupled(...) is good for c11, much_less and much_greater can be used as well." + +| Principled form | C11 rendering | Notes | +|---|---|---| +| `<<` (much less than) | `much_less(a, b, tolerance)` | Comparison operator form; takes `tolerance : float64` | +| `>>` (much more than) | `much_greater(a, b, tolerance)` | Comparison operator form; takes `tolerance : float64` | +| `<< N` / `>> N` (predicate form) | `weakly_coupled(a, b, tolerance)` | Predicate form; for "loose correlation" (not a comparison) | + +**Example C11 code:** + +```c +#include + +typedef struct { float value; } Scalar; +typedef struct { Scalar a; Scalar b; float tolerance; } TolerancePair; + +bool much_less(Scalar a, Scalar b, float tolerance) { + return a.value < (b.value - tolerance); +} + +bool much_greater(Scalar a, Scalar b, float tolerance) { + return a.value > (b.value + tolerance); +} + +bool weakly_coupled(Scalar a, Scalar b, float tolerance) { + return fabs(a.value - b.value) < tolerance; +} +``` + +### §9.2 Python rendering + +Python uses the same named functions: + +| Principled form | Python rendering | Notes | +|---|---|---| +| `<<` | `much_less(a, b, tolerance)` | Same as C11 | +| `>>` | `much_greater(a, b, tolerance)` | Same as C11 | +| `<<` / `>>` (predicate) | `weakly_coupled(a, b, tolerance)` | Same as C11 | + +**Example Python code:** + +```python +def much_less(a: Scalar, b: Scalar, tolerance: float) -> bool: + return a.value < (b.value - tolerance) + +def much_greater(a: Scalar, b: Scalar, tolerance: float) -> bool: + return a.value > (b.value + tolerance) + +def weakly_coupled(a: Scalar, b: Scalar, tolerance: float) -> bool: + return abs(a.value - b.value) < tolerance +``` + +### §9.3 Forth rendering + +Forth has the bit-shift `<<` / `>>` operators as primitive words. The principled form is also rendered as named words: + +| Principled form | Forth rendering | Notes | +|---|---|---| +| `<<` | `much_less` (named word) | Forth's `<<` is bit-shift; named word avoids collision | +| `>>` | `much_greater` (named word) | Same | +| `<<` / `>>` (predicate) | `weakly_coupled` (named word) | Same | + +### §9.4 Selection rule (NEW v2) + +- The principled form (`<<` / `>>` with `tolerance`) is used in the lexicon, the type-theoretic spec, and abstract mathematical contexts. +- In code (C11, Python, Forth, Forth bootslop, etc.), the named functions are used to avoid the bit-shift collision. +- The user (or the LLM at their direction) can choose to use `<<` / `>>` in C11 with operator overloading if they want — this is opt-in, not the default. + +--- + +## §10. The 12 unresolved items (per warmup §A.3) — addressed (was §9) The 12 items below were flagged as "to be defined" in the warmup's `report.md` §A.3. This lexicon addresses each: