From 213e4994202923ab624563413140257541e63617 Mon Sep 17 00:00:00 2001 From: Ed_ Date: Fri, 12 Jun 2026 10:37:10 -0400 Subject: [PATCH] conductor(track): intent_dsl_survey v1.2 (rename + postfix + nagent fix) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three files changed: 1. report_v1.2.md (NEW, 1301 lines) — v1.2 of the report with: (a) Renamed arena { } to tape { } (better term; aligns syntax with the Lottes tape-drive metaphor). All 46 occurrences replaced; 3 awkward double-tape phrases cleaned up (heading 3.6, table cell, glossary entry). (b) Mixed postfix/infix notation for math (per user heuristic): - Strictly postfix for math primitives with precedence: + - * / ^, math indexing [], reducers sum/product. - Infix for structural ops (no precedence concern): :=, function calls, control flow (for/if), field access, block delimiters. - Heuristic: 'if the operator has precedence, postfix it; if it doesn't, infix it.' Mixed examples like 'result := Matrix(m.rows 1 -, m.columns 1 -)' are canonical. (c) nagent attribution corrected: previously said nagent is Jody Bruchon's; it is Mike Acton's (github.com/macton/nagent; per conductor/tracks/nagent_review_20260608/). Jofito stays correctly attributed to Jody Bruchon. (d) Added v1.2 changelog note at top + heuristic table at start of section 3. 2. report_v1.1.md — nagent attribution fix propagated (post-hoc correction; the original v1.1 commit had the same error in the glossary line 1671). 3. research/cluster_3_intent_mapping.md — nagent attribution fix in 2 places (header at line 188, body at line 190). Appendix A.3 (EBNF) and A.4 (Tier 1 vocab) retain v1.1 form pending a sync pass; noted in the v1.2 changelog at the top of the report. 
--- .../intent_dsl_survey_20260612/report_v1.1.md | 2 +- .../intent_dsl_survey_20260612/report_v1.2.md | 1819 +++++++++++++++++ .../research/cluster_3_intent_mapping.md | 4 +- 3 files changed, 1822 insertions(+), 3 deletions(-) create mode 100644 conductor/tracks/intent_dsl_survey_20260612/report_v1.2.md diff --git a/conductor/tracks/intent_dsl_survey_20260612/report_v1.1.md b/conductor/tracks/intent_dsl_survey_20260612/report_v1.1.md index 0a618d56..42060ada 100644 --- a/conductor/tracks/intent_dsl_survey_20260612/report_v1.1.md +++ b/conductor/tracks/intent_dsl_survey_20260612/report_v1.1.md @@ -1668,7 +1668,7 @@ That's 4 verbs total, plus the grammar. The placeholder track can demonstrate a **Meta-Tooling** — the external agents (Gemini CLI, OpenCode) used to build the Application. The DSL is the format these agents emit. Distinct from the Application's function-calling. -**nagent** — Jody Bruchon's autonomous coding agent framework. The `nagent_tags.py` parser is the inspiration for the DSL's structured-protocol idea (but the DSL rejects the XML angle-bracket notation). +**nagent** — Mike Acton's autonomous coding agent framework (`github.com/macton/nagent`; per `conductor/tracks/nagent_review_20260608/`). The `nagent_tags.py` parser is the inspiration for the DSL's structured-protocol idea (but the DSL rejects the XML angle-bracket notation). **O'Donnell** — John O'Donnell; creator of the IMGUI/MVC paradigm (per `johno.se/book/`). The DSL inherits 4 anchor claims from his work: widgets as method invocations, reads free, IEventTarget, no scene-graph abstractions. 
diff --git a/conductor/tracks/intent_dsl_survey_20260612/report_v1.2.md b/conductor/tracks/intent_dsl_survey_20260612/report_v1.2.md new file mode 100644 index 00000000..d6d1699e --- /dev/null +++ b/conductor/tracks/intent_dsl_survey_20260612/report_v1.2.md @@ -0,0 +1,1819 @@ +# Intent-Based Scripting Languages + +**Track:** `intent_dsl_survey_20260612` (initialized 2026-06-12) +**Date:** 2026-06-12 +**Location:** `conductor/tracks/intent_dsl_survey_20260612/report.md` (this file; moved from `docs/ideation/` per user instruction — the report is too closely related to the track to live in the general ideation folder) +**Author:** Tier 1 Orchestrator (sections 1, 3, 4, 5, 6, 7, Appendix); Tier 2 sub-agents (section 2 clusters 0-4, with research sub-reports at `research/cluster_*.md`) +> **v1.2 changes (2026-06-12):** (1) Renamed `arena { }` to `tape { }` (better term in hindsight; aligns syntax with the Lottes tape-drive metaphor). All 46 occurrences of `arena` updated; 3 awkward double-tape phrases cleaned up. (2) **Mixed postfix/infix notation introduced** (per user heuristic): strictly postfix for math primitives that have precedence (`+`, `-`, `*`, `/`, `^`, math indexing `[]`); infix for everything else (`:=`, function calls, control flow, field access, block delimiters). The rationale: postfix eliminates precedence ambiguity; infix is more familiar where precedence isn't an issue. The two mix freely: `result := a b +` is canonical (assignment infix, math postfix). (3) **nagent attribution corrected:** the glossary and Cluster 3 entry previously said nagent is Jody Bruchon's; it is actually Mike Acton's (`github.com/macton/nagent`; per `conductor/tracks/nagent_review_20260608/`). Jofito remains correctly attributed to Jody Bruchon. See TODO at the top of §3 for the full postfix heuristic table. Appendix A.3/A.4 retain their v1.1 form pending a sync pass. 
+ +**Status:** v1.1 (post-secondary-review correction; see `reportreview.md` for the review that produced this update) + +> **What this is.** A survey of intent-based scripting languages as a design philosophy, plus a proposed vocabulary (~40 verbs across 4 tiers) for a Meta-Tooling-facing intent DSL. The report is the foundation document for the user's nagent v2.2 (its "Future-Track Candidate #4" section) and for the future interpreter prototype (follow-up B track). +> +> **What this is NOT.** Not an interpreter, not a bridge script, not Application-side function-calling, not XML/JSON record formats. The DSL is Meta-Tooling-side per `docs/guide_meta_boundary.md` — the format external agents (Gemini CLI, OpenCode) emit when invoking `mcp_client.py` tools. The Application's provider-native function-calling stays unchanged. + +--- + +## 1. The "Intent-Based" Design Philosophy + +The DSL is grounded in four anchor claims. Each claim has a philosophical home and a specific design consequence for the vocab and grammar. + +### 1.1 Claim 1 — Intent-based means the user's words are declarative intent, not imperative commands + +Jofito (per its 2026 README update) calls itself an **"intent mapping engine"**: the user writes declarative intent (e.g., "find all pictures, filter out JPEGs, print the list"), and Jofito decomposes that intent into platform-optimal operations. From the Jofito README: *"jofito is a 'write the optimization once, reap the benefits everywhere' system that takes what the user wants to accomplish (intent) as input and decomposes it into operations that make the most sense for the current system."* (`https://codeberg.org/jbruchon/jofito`) + +The canonical Jofito example is `list = scandir("/path/here/", {filter !extension=jpg,jpeg}) : print(list)` — a single declarative expression that replaces `find . -type f | grep -v jpg | grep -v jpeg`. 
The DSL inherits this framing: the verbs in §4 are **intent verbs** (e.g., `scan` for "I want to read a source", `filter` for "I want to keep only what matches", `audit` for "I want to record what happened"), not imperative primitives. + +This is the *philosophical* anchor for the DSL: the user says *what they want*; the verbs are the way to say it; the bridge script and the MCP tools handle *how to do it*. The user's own math pseudocode (the `determinate`/`minor`/`matrix-transpose` snippets shared during spec review) operates at this declarative level — "here is the math, the verbs are the words." + +### 1.2 Claim 2 — The hardware is the truth + +The verbs must map to actual hardware/software stages, not abstract commands. The Onat/Lottes 2-register model (per `C:\projects\forth\bootslop\references\kyra_in-depth.md` and `X.com - Onat & Lottes Interaction 1.png.ocr.md`) gives the concrete hardware the DSL is mapped to: + +- **2-register stack (RAX/RDX)**: the DSL's `->` chain *maps* to RAX-passed data. Each verb in the chain is a "word" in Onat's sense (no args, no returns — the X.com thread at `X.com - Onat & Lottes Interaction 1.png.ocr.md:80-86` quotes Lottes: "I laugh when people say C is like assembly, they were missing what we did in assembly back then, which was all registers and globals and gotos, no stacks"). +- **Magenta pipe `|` (KYRA) → our `->`**: same definition-boundary semantics, retargeted to data flow. +- **Basic blocks `[ ]` (KYRA) → our `[ ]`**: compilation units; the parser produces a `[ ]` block per `->`-delimited stage. +- **Lambdas `{ }` (KYRA) → our `tape { }`**: tape-scoped blocks; the contents are pre-scattered into tape-drive regions (per the X.com thread at line 55-61, where Onat describes Lottes's "common arguments pushed onto the tape using store duplication when they are known... so it's preemptive scatter, so later at call time there is no argument gather"). + +The verbs are not arbitrary. 
Each Tier 2 verb (data pipeline) and Tier 3 verb (shell) has a direct hardware mapping; this is what makes the verbs *fast* on the targeted hardware. + +### 1.3 Claim 3 — The pipeline is immediate-mode + +Per John O'Donnell's IMGUI essay (`https://johno.se/book/imgui.html`): *"Widgets, logically, change from being objects to being method invocations."* The pipeline `scan -> filter -> print` is not a Pipeline object with state; it is a sequence of method calls. Once execution ends, the pipeline's state is gone. The next invocation is independent. + +This is the *paradigm* anchor for the DSL. It means: +- The parser doesn't need to track pipeline state across executions; each invocation is independent. +- The `->` chain has no "pipeline object" you can query, name, or pass around. The only way to "name" a chain is to wrap it in a function (`determinate(m, row) -> Scalar { ... }`). +- Verbs exist *only* when called. There is no implicit verb inventory. (This is why the DSL's "Everything" mode in the Command Palette is implementable as a search across *text*, not across a *registry of pipeline objects*.) + +O'Donnell's MVC essay (`https://johno.se/book/mvc.html`) extends this: *"Writes to Model are formalized through the addition of IEventTarget. This is a pure virtual interface that defines all possible state changes / events on a system wide level."* The DSL's `sandbox` verb is the IEventTarget boundary; the `audit` verb is the IEventTarget itself (see §6 Claim 9 and Claim 10). + +### 1.4 Claim 4 — The vocabulary IS the user surface + +CoSy (per `https://cosy.com/CoSy/Simplicity.html`): *"CoSy is a TimeStamped notebook/log created as an open vocabulary in Forth."* And: *"an extensive vocabulary evolved from APL via K, mainly slicing and dicing, searching & replacing, and applying verbs to each item in lists."* + +For the DSL, the **vocabulary** is the user surface — not the syntax, not the parser, not the runtime. 
For AI agents that emit the DSL, the vocab is the API. A model that knows the 40 verbs in §4 and the 14 grammar primitives in §3 can express any intent that the DSL supports. There is no separate "API documentation" — the verbs ARE the API. + +This is why the report devotes so much space to the vocab (§4) and so little to the syntax (§3). The syntax is trivial (RPN with a few delimiters); the vocabulary is the substance. + +### 1.5 The four claims together + +The four claims are not independent; they compose: + +- Claim 1 (intent-mapping) → the user expresses what they want; the verbs are the vocabulary. +- Claim 2 (hardware is the truth) → the verbs map to real data-oriented pipeline stages. +- Claim 3 (immediate-mode) → the verbs are method calls, not stateful objects; pipelines have no persistent state. +- Claim 4 (vocabulary is the user surface) → the 40-verb vocab is the API; the syntax is trivial. + +The composition is: a user expresses intent (Claim 1) using a verb (Claim 4) that maps to a hardware stage (Claim 2) in a single per-frame composition (Claim 3). The full report is a working-out of this composition. + +--- + +## 2. Prior Art Survey (8 Clusters) + +This section surveys the design lineage across 8 clusters. Each cluster: a "cluster claim" (what the DSL inherits from the cluster as a whole), then 1 sentence per entry, then specific "take" bullets that §3, §4, §5, and §6 reference. + +The detailed analysis for each cluster lives in the research sub-reports at `research/cluster_*.md` (relative to this file). This section is the executive summary; the sub-reports are the evidence. + +### Cluster 0 — Immediate-Mode Paradigm (philosophical anchor) + +**Cluster claim.** The DSL's *paradigm* — verbs as method calls, no persistent state, reads free, writes formalized — is the direct application of John O'Donnell's IMGUI/MVC framework to a Meta-Tooling context. (Per the full sub-report at `research/cluster_0_odonnell.md`.) 
+ +**Entry: John O'Donnell — IMGUI / The Pitch / MVC / IM-MVC roadmap.** `https://johno.se/book/imgui.html`, `https://johno.se/book/pitch.html`, `https://johno.se/book/immvc.html`, `https://johno.se/book/mvc.html`. Four interconnected pages laying out a unified paradigm: visualization is not inherently stateful; widgets are method invocations not objects; the "reads are free, writes are formalized" invariant via a single IEventTarget interface; the View must not expose scene-graph abstractions. + +**Take bullets (referenced by §5, §6):** +- *Anchor Claim 3 (IEventTarget as single event interface for all state changes):* *"Experience dictates that there only be a single IEventTarget interface that is responsible for all 'system events'."* — `mvc.html`, "Why only a single event interface" section. +- *Anchor Claim 4 (View must not expose scene-graph abstractions):* *"The corresponding interface should be of the form: `view::drawMesh(mesh, transform, anyOtherRenderState);`"* — `mvc.html`, "View" section. +- *"Writes to Model are formalized through the addition of IEventTarget. This is a pure virtual interface that defines all possible state changes / events on a system wide level."* — `mvc.html`, "Writing to Model state" section. +- *"What is a non-stateful view? Basically it is a procedural interface (as opposed to a collection of objects with methods), in essence very much to what DirectX 9 is."* — `pitch.html`, "MVC revisited" section. +- *"However, due to the rapide advances of GPU based rendering over the past 10+ years, this premise no longer holds."* — `pitch.html`, "However!" section. +- The 800,000-vertex single-draw-call empirical result at Jungle Peak (GeForce 6 hardware) — `pitch.html`, batch rendering section. 
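
The "reads are free, writes are formalized" invariant quoted in the take bullets can be sketched in a few lines of Python. This is an illustration only, not O'Donnell's code and not the project's implementation: `EventTarget`, `Model`, `AuditLog`, `broadcast`, and the two event names are hypothetical, chosen to show a single interface that owns every state change.

```python
from abc import ABC, abstractmethod


class EventTarget(ABC):
    """Hypothetical stand-in for IEventTarget: every state change is a method here."""

    @abstractmethod
    def on_item_added(self, item: str) -> None: ...

    @abstractmethod
    def on_item_removed(self, item: str) -> None: ...


class Model(EventTarget):
    def __init__(self) -> None:
        self._items: list[str] = []

    # Reads are free: any caller may query state, and gets an immutable copy.
    def items(self) -> tuple[str, ...]:
        return tuple(self._items)

    # Writes are formalized: state only changes through EventTarget methods.
    def on_item_added(self, item: str) -> None:
        self._items.append(item)

    def on_item_removed(self, item: str) -> None:
        self._items.remove(item)


class AuditLog(EventTarget):
    """A second implementer that records events instead of applying them."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, str]] = []

    def on_item_added(self, item: str) -> None:
        self.entries.append(("added", item))

    def on_item_removed(self, item: str) -> None:
        self.entries.append(("removed", item))


def broadcast(targets: list[EventTarget], event: str, item: str) -> None:
    """Deliver one named event to every target, in order."""
    for target in targets:
        getattr(target, event)(item)


model, log = Model(), AuditLog()
broadcast([model, log], "on_item_added", "scan")
broadcast([model, log], "on_item_added", "filter")
broadcast([model, log], "on_item_removed", "scan")
print(model.items(), log.entries)
```

Because the model and the log implement the same interface, recording what happened needs no extra hooks in the model. That is one plausible reading of why a single event interface is attractive for an audit-style verb.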
+ +### Cluster 1 — Concatenative (Forth family) + +**Cluster claim.** The DSL's *syntax* — postfix RPN, stack-passed arguments, no AST object — is the Forth tradition as refined by Onat Türkçüoğlu's KYRA (2-register stack, magenta pipe as definition boundary, basic blocks and lambdas, preemptive scatter) and Timothy Lottes's x68/5th (32-bit instruction granularity, annotation overlay, "register file as aliased global namespace"). Bob Armstrong's CoSy is the user's-vocabulary-as-the-surface model. (Per the full sub-report at `research/cluster_1_concatenative.md`.) + +**Entries:** + +- **Forth** (Chuck Moore, 1970). The canonical RPN stack-passing language; the colon-word/semicolon definition pattern; threaded code compilation; self-hosting via meta-compilation. `https://en.wikipedia.org/wiki/Forth_(programming_language)`. **Take:** the pure concatenative property — *"concatenation of two programs denotes the composition of the two functions they denote"* (Joy's formalization) — is the foundational claim. The DSL inherits the postfix syntax and the rejection of named lambda parameters (parameters are unnamed; they live on the stack). +- **ColorForth** (Chuck Moore, ~1990s). Color encodes semantics (define/compile/execute/variable). `https://en.wikipedia.org/wiki/ColorForth`. **Take:** the idea that visual/structural encoding can replace keywords, and the direct-mapped editor. +- **KYRA / VAMP** (Onat Türkçüoğlu, SVFIG 2025). 2-register stack (RAX/RDX); magenta pipe `|` as definition boundary emitting `RET + xchg rax, rdx`; basic blocks `[ ]` and lambdas `{ }` as compilation units; preemptive scatter. `C:\projects\forth\bootslop\references\kyra_in-depth.md`, `forth_day_2020_in-depth.md`. **Take:** the bracket operators (`[ ]`, `{ }`) and the tape-scoped blocks (`tape { }`). +- **x68 / 5th / "Ear" + "Toe"** (Timothy Lottes, 2007-2026). 
32-bit instruction granularity; annotation overlay; folded interpreter; "register file as aliased global namespace" (X.com thread, lines 95-103). `C:\projects\forth\bootslop\references\neokineogfx_in-depth.md`, `blog_in-depth.md`. **Take:** the 32-bit token encoding, the annotation overlay pattern, the folded-interpreter optimization. +- **Joy** (Manfred von Thun, 2001-2003). Purely functional concatenative; quotations as first-class values; combinator library (`map`, `filter`, `fold`, `binrec`, `primrec`, `linrec`). `https://en.wikipedia.org/wiki/Joy_(programming_language)`. **Take:** the quotation-as-first-class-value concept and the combinator library as the model for Tier 2 verbs. +- **CoSy** (Bob Armstrong, ongoing). TimeStamped notebook/log in Forth; all nouns are lists/trees with 3-cell headers `(Type Count refCount)`; modulo indexing; "extensive vocabulary evolved from APL via K." `https://cosy.com/CoSy/Simplicity.html`, `https://cosy.com/4thCoSy/`. **Take:** the open-vocabulary culture; the modulo indexing (forgiving of off-by-one AI errors); the 3-cell header as a universal data structure. + +**Section 5 grounding (per the cluster 1 synthesis).** The DSL's `->` pipeline, `[ ]`/`{ }` blocks, `tape { }` memory model, `scatter`/`gather` verbs, `map`/`filter`/`fold` combinators, modulo indexing, and the "no AST object" parsing strategy all have direct concatenative lineage. See `conductor/tracks/intent_dsl_survey_20260612/research/cluster_1_concatenative.md` §"Synthesis for Section 5" for the verb-by-verb mapping table. + +### Cluster 2 — Array Languages (APL lineage) + +**Cluster claim.** The DSL's *data model* — array as universal type, every verb vectorizes, multi-dimensional indexing — is the APL tradition as refined by K (ASCII-only with overloading), BQN (clean modern semantics with function trains), and Uiua (stack-based execution). 
The DSL inherits the *philosophy* (succinct expression of algorithms) but uses ASCII-compatible representation rather than APL's custom character set. (Per the full sub-report at `research/cluster_2_array.md`.) + +**Entries:** + +- **APL** (Kenneth Iverson, 1962; Turing Award 1979). The foundational array language; array as universal type; every glyph is a function; right-to-left evaluation with no precedence. `https://en.wikipedia.org/wiki/APL_(programming_language)`, `https://www.dyalog.com/`. **Take:** the array-as-universal-type principle and the right-to-left evaluation model. +- **K / q** (Arthur Whitney, KX Systems, 1993). ASCII-only with heavy context-sensitive overloading; first-class functions borrowed from Scheme; foundation of kdb+ in-memory columnar database. `https://en.wikipedia.org/wiki/K_(programming_language)`, `https://kx.com/`. **Take:** the context-sensitive operator philosophy and first-class functions. +- **BQN** (Marshall Lochbaum, 2020). Modernized APL with clean semantics; context-free grammar; function trains. `https://mlochbaum.github.io/BQN/`. **Take:** the train composition pattern as the most expressive tacit mechanism in the family. +- **Uiua** (Kai Schmidt, 2023). Stack-based execution; modern open-source development; online Pad for onboarding. `https://www.uiua.org/`, `https://github.com/uiua-lang/uiua`. **Take:** the stack-based execution model as a viable alternative to named parameters, and the modern onboarding-UX model. + +**Section 5 grounding (per the cluster 2 synthesis).** The DSL's `for x .. n` (mapping to APL's `⍳N` + reduce, BQN's `↕N`, K's `!R`) and `result[row, col]` (mapping to APL's multi-dim indexing, BQN's `⊏`, K's `@`) inherit directly from this cluster. See `conductor/tracks/intent_dsl_survey_20260612/research/cluster_2_array.md` §"Synthesis for the DSL" for the verb-by-verb mapping table. 
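
As a rough illustration of the two inherited constructs, the sketch below shows how `for x .. n` (iteration over `[0, n)`) and `result[row, col]` (multi-dimensional indexing) might look in Python. The `Matrix` class, its row-major storage, and the `transpose` helper are assumptions made for the example; only the `rows`/`columns` field names and the `Matrix(rows, cols)` call shape come from the report.

```python
class Matrix:
    """Hypothetical row-major matrix supporting `m[row, col]` indexing."""

    def __init__(self, rows: int, columns: int) -> None:
        self.rows = rows
        self.columns = columns
        self._cells = [0] * (rows * columns)

    def _offset(self, key: tuple[int, int]) -> int:
        # Map a (row, col) pair onto the flat cell list, with bounds checks.
        row, col = key
        if not (0 <= row < self.rows and 0 <= col < self.columns):
            raise IndexError(f"index {key} out of range")
        return row * self.columns + col

    def __getitem__(self, key: tuple[int, int]) -> int:
        return self._cells[self._offset(key)]

    def __setitem__(self, key: tuple[int, int], value: int) -> None:
        self._cells[self._offset(key)] = value


def transpose(m: Matrix) -> Matrix:
    result = Matrix(m.columns, m.rows)
    for row in range(m.rows):            # DSL: for row .. m.rows
        for col in range(m.columns):     # DSL: for col .. m.columns
            result[col, row] = m[row, col]
    return result


# Fill a 2x3 matrix with 0..5, then transpose it.
m = Matrix(2, 3)
for row in range(m.rows):
    for col in range(m.columns):
        m[row, col] = row * m.columns + col
t = transpose(m)

# Iota followed by a reduction: sum over the range [0, 5).
total = sum(range(5))
```

Python's `range(n)` gives the same half-open interval `[0, n)` as the DSL's range iteration, so the loops translate one-to-one.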
+ +### Cluster 3 — Intent-Mapping + +**Cluster claim.** The DSL's *use case* — a compact, intent-expressive scripting language that maps user intent to platform-optimal operations — is the Jofito tradition as the user has been exploring it. The pipe-coalescing optimization (find/grep/sort/unique collapse into one in-memory script) is the runtime efficiency claim. The nagent tag protocol is *mentioned and explicitly rejected* (no XML angle brackets) but the *structured-protocol idea* is retained. (Per the full sub-report at `research/cluster_3_intent_mapping.md`.) + +**Take bullets — minor v1.1 corrections:** +- **DSL `->` pipe operator:** jq's `|` pipe is the conceptual precedent for the DSL's `->` pipeline operator. The DSL replaces `|` with `->` to avoid conflict with shell usage and to make the DSL parseable without shell-aware lexing. (Per the sub-report's verbatim take bullets.) +- **v1.1 OCR-restoration:** the sub-report slightly misquoted Lottes by dropping "actually" in one place ("missing what we **actually** did in assembly back then"). v1.1 restores the full quote for accuracy. + +**Entries:** + +- **Jofito** (Jody Bruchon, 2023-2026). "Intent mapping engine" (per 2026 README update); tape allocation; leader/chaser thread model; pipe-coalescing. `https://codeberg.org/jbruchon/jofito`, `docs/transcripts/Ddme7DwMQBI_jofito_jody_bruchon.txt`. **Take:** the "intent mapping engine" framing is the DSL's *use case*; the leader/chaser pattern is the *implementation hint*; the tape allocation is the *memory model*. (Specifically: the DSL's `scan -> filter -> print` chain is directly inspired by Jofito's `scandir(...) : filter : print` predicate chain.) +- **jq** (Stephen Dolan, 2012-). JSON-path filter language; the `|` pipe operator (replaced by `->` in the DSL). `https://en.wikipedia.org/wiki/Jq_(programming_language)`, `https://jqlang.org/`. **Take:** the filter-as-expression style; `select(condition)`, `map`, `reduce`, `unique` as Tier 2 verb precedents. 
+- **nagent's tag protocol** (Mike Acton, `github.com/macton/nagent`; per `conductor/tracks/nagent_review_20260608/agent_review_v2_1_20260612.md:50`). XML-ish self-closing tags (``). **TAKEN:** the structured-protocol idea (named operation with typed attributes; LLM-emit-able; self-delimiting). **REJECTED:** the XML angle-bracket notation, per the user's direct instruction during the intent_dsl_survey_20260612 brainstorming session on 2026-06-12: *"ignore its record formats as they problably will be less xml/json based as I don't like them."* (The user said this in conversation; it is not in any project file.) The DSL must use a different notation that preserves the structured-protocol properties. +- **WebAssembly** (W3C, 2017-). Linear memory; sectioned binary format; structured control flow. `https://en.wikipedia.org/wiki/WebAssembly`. **Take (one paragraph):** the linear memory model is the modern reference for the "tape drive" argument-passing semantics that grounds the DSL's Tier 2 verbs. Wasm's streaming-parse design *suggests* a parsing strategy where verb names and signatures are validated early (cheap) and arguments are parsed on demand (deferred), though this is an inference, not an explicit recommendation from the Wasm spec. + +**Section 4 grounding (per the cluster 3 synthesis).** Each Tier 2 verb cites Jofito (for `scan`, `filter`, `tape`, `scatter`, `gather`, `pipe`) or jq (for `select`, `map`, `fold`, `sort`, `dedupe`, `group`); each Tier 3 verb cites either nagent's structured-protocol idea (for `read`, `edit`, `test`, `discover`) or Jofito's tool-replacement model (for `glob`, `exec`, `run`, `mcp`). See `conductor/tracks/intent_dsl_survey_20260612/research/cluster_3_intent_mapping.md` §"Synthesis for the DSL" for the verb-by-verb mapping table. + +### Cluster 4 — Meta-Tooling DSLs and Agent-Facing Languages + +**Cluster claim.** The DSL is *not the first* agent-facing language. 
The existing `mcp_dsl_20260606` placeholder, nagent's "Bridge DSL" idea, OpenAI's function-calling schema, and Anthropic's tool-use schema are the prior art. The DSL learns from all four and takes a different notation (per the user's XML/JSON rejection) but the same structural properties (compact, structured, LLM-emit-able). (Per the full sub-report at `research/cluster_4_meta_tooling_dsls.md`.) + +**Entries:** + +- **`mcp_dsl_20260606`** (Manual Slop placeholder; per `conductor/tracks/mcp_architecture_refactor_20260606/spec.md` §12.1 and `nagent_review_20260608/metadata.json:28`). APL/K/Cosy-inspired per-MCP compact dialect. The closest project-internal reference. **Take:** the per-MCP grammar organization; the 8x token-reduction target (80 → 10 tokens); the JSON path stays (backward compat); the DSL is opt-in per MCP. +- **nagent's Bridge DSL idea** (per `nagent_takeaways_20260608.md` line 216-230). The bridge between external agents and actual `mcp_client.py` tool calls. **Take:** the Application's function-calling stays; the bridge DSL is the format external agents emit. +- **OpenAI function-calling** (per `https://platform.openai.com/docs/guides/function-calling`). JSON Schema with `strict`, `required`, `additionalProperties: false`, `enum` constraints. The 5-step conversational loop. **Take:** schema rigor baseline; token cost is proportional to schema verbosity; the 8x reduction target; namespace grouping; fewer-capable-tools principle. +- **Anthropic tool-use** (per `https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/define-tools`). Flat structure with `name`, `description`, `input_schema`, `input_examples`; `strict` as guarantee; `tool_choice` control. **Take:** `input_examples` as a model for teaching the DSL; `tool_choice` maps to Tier 4 verb design (auto/any/forced); the flat structure is the right model for terseness. 
+ +**Section 4 grounding (per the cluster 4 synthesis).** The Tier 4 verbs map to the entries as follows: `fuzzy` ← nagent Bridge + MCP DSL; `try`/`recover` ← nagent Bridge + OpenAI; `sandbox` ← OpenAI + Anthropic; `audit` ← MCP DSL + nagent Bridge; `didyoumean` ← nagent Bridge + Anthropic; `span` ← MCP DSL + OpenAI; `offset` ← MCP DSL + OpenAI; `assumewide` ← OpenAI + Anthropic. See `conductor/tracks/intent_dsl_survey_20260612/research/cluster_4_meta_tooling_dsls.md` §"Synthesis for the DSL" for the full mapping. + +### Cluster 5 — SSDL Shape Primitives + +**Cluster claim.** The DSL's verbs are annotated with **SSDL shape tags** (per `docs/reports/computational_shapes_ssdl_digest_20260608.md` §1) so the reader can see at a glance whether a verb is a single instruction, a codepath, a wide codepath, a codecycle, a wide codecycle, or a codecycle graph. This is the meta-vocabulary that lets the report describe a verb's *shape* in one token. + +**The 6 SSDL primitives:** + +| # | Shape | One-line definition | SSDL symbol | +|---|---|---|---| +| 1 | **Instruction** | A single unit of computation. Reads data, writes data, or both. | `[I]` | +| 2 | **Codepath** | A sequential list of instructions that *terminates*. No loops. | `===>` | +| 3 | **Wide codepath** | A codepath whose execution *causes* several other codepaths to occur simultaneously. | `===>W===>` | +| 4 | **Codecycle** | A circular structure — a codepath that *repeats* at its first instruction after its last. | `o==>` | +| 5 | **Wide codecycle** | Multiple codecycles performing the same task simultaneously. | `oo==>oo` | +| 6 | **Codecycle graph** | Multiple codecycles + the data they read and write. | `boxes + arrows` | + +**The 7 modifiers:** + +| Modifier | SSDL | Meaning | +|---|---|---| +| `[T]` | terminator | The instruction that *ends* a codepath (return, exit, etc.) 
| +| `[B]` | branch | A point where control flow forks based on a condition | +| `[M]` | merge | A point where control flow re-converges | +| `[S]` | stateful | Marks an instruction that *mutates* persistent state | +| `[Q]` | query | Marks an instruction that reads persistent state | +| `[N]` | nil sentinel | A special value that satisfies "is this OK to use?" in all cases | +| `───` | data | A line representing data being read or written (not a codepath) | + +**How the DSL uses SSDL tags.** Each verb in §4 has a "Shape" column with an SSDL tag. For example, `sum` is `[I]` (single instruction); `for x .. n` is `o==>` (codecycle); `tape { }` is a sub-codepath scope; `pipe` is `===>W===>` (wide codepath, the chain can fan out); the entire DSL pipeline is a codecycle graph (multiple codecycles + the data they read and write). This lets the reader see the *shape* of a pipeline at a glance. + +### Cluster 6 — Project's Own Command DSL Precedents + +**Cluster claim.** The DSL is a *richer* superset of the project's existing 33 Command Palette commands (per `docs/guide_command_palette.md` and `src/commands.py`). The "Everything" mode in the Command Palette (per `guide_command_palette.md` line 383: *"search across commands, files, symbols, history, settings"*) is a near-term use case where the DSL's verbs can be the underlying format. The Command Palette is the user's existing vocabulary instinct; the DSL formalizes and extends it. 
+ +**8 representative commands across 6 categories** (the full 33 are in `docs/guide_command_palette.md`): + +| Category | Command | Title | Action | +|---|---|---|---| +| AI | `reset_session` | Reset Session | `ai_client.reset_session()` + clears logs + `_handle_reset_session()` | +| AI | `clear_discussion` | Clear Discussion | Empties `app.discussion_history` | +| AI | `add_all_files_to_context` | Add All Files To Context | `app._add_all_files_to_context()` | +| View | `toggle_text_viewer` | Toggle Text Viewer | `_toggle_window(app, "Text Viewer")` | +| Tools | `trigger_hot_reload` | Hot Reload | `HotReloader.reload("src.gui_2", app)` | +| Layout | `save_workspace_profile` | Save Workspace Profile | Opens the save-profile modal | +| Theme | `cycle_theme` | Cycle Theme | Cycles through `["10x Dark", "ImGui Light", "NERV"]` | +| Help | `show_command_palette_help` | Show Command Palette Help | Loads `docs/Readme.md` into the Text Viewer | + +**Take.** The DSL's verbs are a *richer* superset of these. Where the Command Palette has 33 imperative commands (each is a function with side effects), the DSL's Tier 2 verbs are declarative ("I want to scan, filter, print") and the Tier 4 verbs formalize the AI-fuzzing-tolerance aspects (audit, didyoumean) that the Command Palette cannot. The "Everything" mode in the Command Palette is the natural place where DSL verbs could appear as searchable entries. + +### Cluster 7 — Data-Oriented Error Handling Convention + +**Cluster claim.** The DSL's `try { ... } recover { ... }` envelope returns a `Result[T]` (with side-channel errors as `list[ErrorInfo]`), per the convention established by `conductor/tracks/data_oriented_error_handling_20260606/spec.md` §3.3. The 12 `ErrorKind` values are the canonical error vocabulary. The `Result[T]` dataclass is the data-oriented alternative to exception-based control flow. 
+ +**The 12 `ErrorKind` values** (per `data_oriented_error_handling_20260606/spec.md` §3.3): + +| Kind | Meaning | +|---|---| +| `NETWORK` | Network or connection error | +| `AUTH` | Authentication / API key error | +| `QUOTA` | Quota exhausted | +| `RATE_LIMIT` | Rate limited | +| `BALANCE` | Balance / billing error | +| `PERMISSION` | Permission denied (file system, etc.) | +| `NOT_FOUND` | Resource not found | +| `INVALID_INPUT` | Invalid input (parse failure, schema mismatch) | +| `NOT_READY` | System not ready (e.g., RAG not initialized) | +| `UNKNOWN` | Unknown error | +| `CONFIG` | Configuration error | +| `INTERNAL` | Internal error (e.g., SDK exception) | +| `PROVIDER_HISTORY_DIVERGED_FROM_UI` | (added 2026-06-08; per nagent_review Pitfall #4) | + +**The `Result[T]` dataclass signature** (per `data_oriented_error_handling_20260606/spec.md` §3.3): + +```python +@dataclass(frozen=True) +class Result(Generic[T]): + data: T + errors: list[ErrorInfo] = field(default_factory=list) + @property + def ok(self) -> bool: return not self.errors + def with_error(self, err: ErrorInfo) -> "Result[T]": ... + def with_errors(self, new_errors: list[ErrorInfo]) -> "Result[T]": ... + def with_data(self, new_data: T) -> "Result[T]": ... +``` + +**How the DSL uses the Result envelope.** The `try { ... } recover { ... }` block returns a `Result[T]` where `T` is the verb's return type. The `recover` block receives the `Result[T]` from the `try` and can inspect `.errors` to decide what to do. The `didyoumean` verb returns `Result[T, list[Suggestion]]` — the success case is the parse result, the failure case includes a list of suggested corrections. + +--- + +## 3. The Grammar +**Notation heuristic (v1.2 convention).** The grammar mixes postfix and infix styles based on whether precedence is an issue. 
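
One way to read the heuristic is as a tiny evaluator: split a statement at the infix `:=`, then run the right-hand side through a stack machine. The sketch below is an assumption-laden illustration, not the report's interpreter: `eval_postfix`, `run_statement`, and the `env` dictionary are hypothetical names, and only the five binary operators `+ - * / ^` are handled.

```python
import operator

# Binary math operators that the report treats as postfix (precedence-bearing).
BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "^": operator.pow,
}


def eval_postfix(tokens, env):
    """Evaluate a list of postfix tokens; names are looked up in `env`."""
    stack = []
    for tok in tokens:
        if tok in BINARY_OPS:
            right = stack.pop()   # top of stack is the right operand
            left = stack.pop()
            stack.append(BINARY_OPS[tok](left, right))
        elif tok in env:
            stack.append(env[tok])
        else:
            stack.append(float(tok))  # raises ValueError for unknown names
    if len(stack) != 1:
        raise ValueError(f"malformed postfix expression: {tokens}")
    return stack[0]


def run_statement(line, env):
    """Handle `name := <postfix expr>`: infix assignment, postfix math."""
    name, sep, expr = line.partition(":=")
    if not sep:
        raise ValueError("expected `name := expr`")
    value = eval_postfix(expr.split(), env)
    env[name.strip()] = value
    return value


env = {"a": 2, "b": 3, "c": 4}
run_statement("result := a b c * +", env)  # binds result to a + (b * c)
```

No precedence table appears anywhere in the evaluator. The order of the tokens alone decides that `b c *` is computed before the `+`, which is the property the heuristic relies on.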
+ +| Class | Style | Examples | Rationale | +|-------|-------|----------|-----------| +| Arithmetic | postfix | `a b +`, `a b c * +` | precedence-bearing | +| Comparison | postfix | `a b =`, `a b <` | precedence-bearing | +| Reducers | postfix | `arr sum`, `1 10 .. sum` | precedence-bearing | +| Math indexing | postfix | `i j a []`, `v i j a []:=` | precedence-bearing (read vs write) | +| Assignment `:=` | infix | `name := expr` | structural, no precedence | +| Function calls | infix | `f(x, y)`, `Matrix(rows, cols)` | structural, no precedence | +| Control flow | infix | `if cond { body }`, `for i .. n { body }` | structural, no precedence | +| Field access | infix | `m.rows`, `m.columns` | structural, no precedence | +| Block delimiters | infix | `tape { }`, `[ ]`, `{ }` | structural, no precedence | + +**Why this mix.** Postfix eliminates precedence ambiguity: infix `2 + 3 * 4` needs a precedence rule to be read correctly, while postfix `2 3 4 * +` does not. Infix is more familiar where precedence isn't a concern (you never write `name ?= expr` and need to know if `?=` binds tighter than `:=`). The two mix freely: `result := a b +` is canonical — the `:=` is infix (structural), the `a b +` is postfix (math). The body of every verb is a sequence of math operations (postfix) chained by `;`, with infix assignment and control flow as structural glue. + +**Heuristic summary.** *If the operator has precedence, postfix it. If it doesn't, infix it.* + +The grammar formalizes 14 primitives drawn from the user's math pseudocode (the `determinate`/`minor`/`matrix-transpose` snippets shared during spec review), plus 3 known ambiguity flags, plus precedence rules and AI-fuzzing tolerance rules. + +### 3.1 The 14 primitives + +| # | Symbol | Name | Signature / Syntax | Meaning | Source example (user pseudocode) | +|---|---|---|---|---|---| +| 1 | `name := value` | Local bind | `name := expr` | Stack-scoped local declaration | `m rows . 1 - m columns . 1 - Matrix result :=` | +| 2 | `stack { ... 
}` | Stack scope | `stack { decl1; decl2; ... }` | Block of stack-allocated locals | `stack { ... result :=; Scalar row_offset :=; Scalar col_offset := }` | +| 3 | `name: Type` | Annotation | `name: Type` | Type hint on a binding | `m : Matrix` | +| 4 | `func(args) -> Type { ... }` | Function def | `func(args) -> Type { body }` | Named function with return type | `determinate(m, row) -> Scalar { ... }` | +| 5 | `name(...) proc { ... }` | Procedure def | `name(args) proc { body }` | Void-returning function | `minor(m, row_omit, column_omit) -> Scalar proc { ... }` | +| 6 | `for x .. n` | Range iteration | `for x .. n { body }` | Iterate `x` over `[0, n)` | `for col .. m.columns` | +| 7 | `name[a, b]` | Bracket indexing | `name[i, j, k, ...]` | Multi-dim array access | `result[row - row_offset, col - col_offset]` | +| 8 | `if cond { ... }` | Conditional | `if cond { then-body }` | If-then (else inferred) | `if col = col_omit { ++ col_offset; continue; }` | +| 9 | `return value` | Return | `return expr` | Function exit with value | `return result` | +| 10 | `->` (between verbs) | Pipeline flow | `verb1 -> verb2 -> verb3` | Output of left → input of right | `filter -> (col != column_omit <- for col .. m.columns)` | +| 11 | `<-` (after verb) | Input binding | `result <- producer` | The thing on the right is the producer | `for col .. m.columns` produces; `col != column_omit` consumes | +| 12 | `=` (in `assert`) | Equality | `assert -> lhs = rhs` | Assert two expressions are equal | `assert -> product(...) = product(...)` | +| 13 | `{ }` | Body block | `{ body }` | Function/scope body | `{ ... 
}` | +| 14 | `[ ]` | Basic block | `[ my_stage ]` | Onat's compilation unit (no branching semantics) | (not in user pseudocode; from KYRA's basic blocks) | + +### 3.2 Ambiguity flags + +Per the user's note during spec review (*"Hopefully the above don't have too many logic errors that the use can't be clarified."*), three known ambiguities in the user's pseudocode are normalized in the report: + +- **`proc` modifier placement:** `minor(m, row_omit, column_omit) -> Scalar proc { ... }` — likely a *type qualifier* (the return type is "Scalar" + "proc"-ness means side-effecting). The report adopts the convention that `proc` is a postfix modifier indicating void-returning; the syntax is `name(args) proc { body }` (return type omitted) or `name(args) -> Type proc { body }` (return type explicit but ignored). +- **`++col_offset`:** likely `col_offset += 1`. The report formalizes this as `name += 1` (Python-style augmented assignment) and does not adopt the `++` operator. This avoids confusion between pre-increment and post-increment. +- **`m[row][column]` vs `m[row, col]`:** both appear in the user's snippets (line 24 `m[row][column]` is likely a typo for `m[row][col]`). The report adopts the comma-form (`name[a, b]`, multi-dim) throughout, since the C-style chained-bracket form doesn't compose with the user's existing matrix pseudocode. + +### 3.3 Precedence rules + +- **Left-to-right for `->` chains:** `a -> b -> c` parses as `(a -> b) -> c` (b's output becomes c's input). This is *not* the standard math convention for function composition (right-to-left) but it matches the user's pseudocode and the pipeline model. +- **`(` `)` for grouping:** explicit parentheses override the left-to-right default. `a -> (b -> c)` parses as `a -> X` where `X = (b -> c)`. +- **Stack-binding precedence:** `<-` binds tighter than `:=`. `producer expr <- result :=` parses as `producer (expr <-) result :=`: the `<-` consumes the producer into `expr` before `result :=` stores it. 
+- **No operator precedence for arithmetic:** `+`, `-`, `*`, `/`, `^` are all left-associative with equal precedence. `2 + 3 * 4` parses as `(2 + 3) * 4 = 20`. (This is the Smalltalk convention for binary operators. APL and K also have no precedence table, but they evaluate right-to-left, so the same expression would yield 14 there. If the user wants math precedence, the report can adopt explicit `(` `)`.) + +### 3.4 AI-fuzzing tolerance rules + +These are the rules that make the DSL workable for AI agents that may fuzz verb names, indent inconsistently, or offset line references. + +- **CoSy-style modulo indexing:** array indices wrap. `result[-1]` is equivalent to `result[result.len - 1]`. This forgives AI off-by-one errors in line references. (Per the CoSy Simplicity page: *"Indexing is modulo - like counting on your thumb & fingers : 0 1 2 3 4 0."*) +- **Structured recovery anchors via `{ }`:** the `{ }` block is a recovery unit. If the parser cannot parse the body, the entire block is replaced with `NIL` and the error is reported at the block level, not at the line level. +- **Line/offset independence:** the parser uses *token positions*, not raw line numbers. A token's position is `file:token-index` (e.g., `src/foo.py:42` means "the 42nd token in src/foo.py"), not `file:42` (which would be "line 42"). The mapping from token position to line number is a presentation concern, not a parse concern. This matches the project's existing FuzzyAnchor pattern (per `docs/guide_context_curation.md`). +- **Verb-name fuzzing tolerance:** the `didyoumean` verb (see §4 Tier 4) proposes corrections for ambiguous verb names. The parser's "best guess" recovery path is configurable: strict (reject on typo), lenient (auto-correct if Levenshtein distance ≤ 2), or fuzzy (parse the rest, log the typo). +- **Indentation tolerance:** indentation is *not* significant (per the user's explicit "ignore its record formats" instruction and the rejection of Python's indent-sensitive syntax). The parser uses a stack-based approach; the `{ }` and `[ ]` delimiters are the only structure-aware tokens. 
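Two of the tolerance rules in §3.4 (modulo indexing and the strict/lenient verb-name modes) are small enough to sketch in Python. This is only an illustration under stated assumptions: `KNOWN_VERBS`, `modulo_index`, and `resolve_verb` are hypothetical names, the verb list is a subset of Tier 2, and the "fuzzy" mode (parse the rest, log the typo) is left out.

```python
from typing import Optional, Sequence

# Hypothetical subset of the Tier 2 vocabulary, for illustration only.
KNOWN_VERBS = ("scan", "filter", "map", "fold", "sort", "group")


def modulo_index(items: Sequence, i: int):
    """CoSy-style wraparound: any integer index maps into range(len(items))."""
    if not items:
        raise IndexError("cannot index an empty sequence")
    return items[i % len(items)]


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def resolve_verb(token: str, mode: str = "lenient") -> Optional[str]:
    """Return the verb to dispatch, or None if the token is rejected."""
    if token in KNOWN_VERBS:
        return token
    if mode == "strict":
        return None  # strict: reject on any typo
    best = min(KNOWN_VERBS, key=lambda v: levenshtein(token, v))
    if mode == "lenient" and levenshtein(token, best) <= 2:
        return best  # lenient: auto-correct within edit distance 2
    return None
```

With these definitions, `resolve_verb("skan")` picks `scan` (edit distance 1), while the same token in strict mode is rejected. A real parser would also need a tie-breaking policy when two verbs are equally close; here the first match in `KNOWN_VERBS` wins.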
+ +### 3.5 Error envelope: `try { ... } recover { ... }` + +``` +try { + scan "src/foo.py" -> filter !exists -> print +} recover err { + audit "scan failed: " + err + return NIL +} +``` + +- The `try` block evaluates the pipeline. If the pipeline returns a `Result[T]` with `errors` non-empty, the `recover` block runs. +- The `recover` block receives the `Result[T]` as a parameter (named by the user; `err` is the default convention from the user's pseudocode). +- The `recover` block must return a `Result[T]` (or `NIL` to short-circuit). +- If the `recover` block itself returns a `Result[T]` with errors, those errors are appended to the outer `Result[T]`'s error list. (Per Fleury's "errors are data" pattern; per `data_oriented_error_handling_20260606/spec.md` §3.4.) + +### 3.6 Block composition: `[ ]` (KYRA basic blocks) vs `{ }` (body blocks) vs `tape { }` (memory regions) + +- **`[ ]`** is Onat's basic block (per `C:\projects\forth\bootslop\references\kyra_in-depth.md:56-57`): *"Basic blocks `[ ]` provide implicit begin/link/end jump targets for the JIT to resolve relative offsets within a limited scope."* In the DSL, `[ ]` is a *sequential operation block* — a chunk of code that the parser can compile and dispatch as a unit. It is *not* a scope (no new bindings); it is a *compilation unit*. +- **`{ }`** is a body block: function body, if/then body, recover body. It introduces a new lexical scope (new bindings are local to the block). +- **`tape { }`** is a tape-drive region: a `{ }` body that has been *pre-scattered* into a contiguous memory region. The contents are pre-placed; the JIT can emit the entire block as a single `xchg rax, rdx` boundary (per KYRA's magenta pipe semantics). + +The three are nested by the parser: `tape { foo := x; [ bar ]; baz }` is a tape region containing 2 sequential statements (the local bind and the basic block) and a trailing call. 
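The nesting rule for `tape { }`, `[ ]`, and `{ }` can be illustrated with a delimiter stack that splits a block body on `;` only at the top level. This is a minimal sketch, not the report's parser: `split_statements` is a hypothetical helper, and it ignores string literals that might contain delimiters.

```python
OPENERS = {"{": "}", "[": "]", "(": ")"}
CLOSERS = set(OPENERS.values())


def split_statements(body: str) -> list[str]:
    """Split a block body on `;`, ignoring separators inside nested delimiters."""
    statements, current, stack = [], [], []
    for ch in body:
        if ch in OPENERS:
            stack.append(OPENERS[ch])  # remember which closer we expect
        elif ch in CLOSERS:
            if not stack or stack.pop() != ch:
                raise ValueError(f"unbalanced delimiter {ch!r}")
        if ch == ";" and not stack:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    if stack:
        raise ValueError("unclosed delimiter")
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements
```

Applied to the body of the example, `split_statements("foo := x; [ bar ]; baz")` yields the local bind, the basic block, and the trailing call as three separate statements. A `;` inside a nested `{ }` stays with its enclosing statement.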
+ +--- + +## 4. The 4-Tier Vocab (~40 Verbs) + +Each verb: symbol, name, signature, one-line semantics, one example, "borrowed from" note, SSDL shape tag. Tier 2 and Tier 3 verbs also have a "maps to mcp_client tool" column. Tier 4 verbs have a "novel piece" note. + +### 4.1 Tier 1 — Math (~10 verbs) + +The Tier 1 verbs are drawn directly from the user's math pseudocode. + +| Symbol | Name | Signature | Semantics | Example | Borrowed from | Shape | +|---|---|---|---|---|---|---| +| `:=` | Local bind | `name := expr` | Stack-scoped local declaration | `m rows . 1 - m columns . 1 - Matrix result :=` | Forth (dictionary entries); Joy (quotations) | `[I]` | +| `stack { ... }` | Stack scope | `stack { decl1; decl2; ... }` | Block of stack-allocated locals | `stack { ... result :=; Scalar row_offset :=; Scalar col_offset := }` | Forth (colon definitions); KYRA (basic blocks) | `[I]` | +| `for x .. n` | Range iteration | `for x .. n { body }` | Iterate `x` over `[0, n)` | `for col .. 
m.columns` | APL `ιN`; K `!R`; BQN `↕N`; Uiua (stack iteration) | `o==>` | +| `+` | Add | `a b +` | Element-wise sum | `2 3 +` (yields 5) | All languages | `[I]` | +| `-` | Subtract | `a b -` | Element-wise difference | `5 2 -` (yields 3) | All languages | `[I]` | +| `*` | Multiply | `a b *` | Element-wise product | `2 3 *` (yields 6) | All languages | `[I]` | +| `/` | Divide | `a b /` | Element-wise division | `6 2 /` (yields 3) | All languages | `[I]` | +| `^` | Power | `a b ^` | Element-wise power | `2 10 ^` (yields 1024) | All languages | `[I]` | +| `sum` | Sum | `expr sum` | Sum of all elements | `1 10 .. sum` (yields 55) | APL `+/`; K `+/`; BQN `+´` | `[I]` | +| `product` | Product | `expr product` | Product of all elements | `1 5 .. product` (yields 120) | APL `×/`; K `*/`; BQN `×´` | `[I]` | +| `a[i, j]` | Bracket indexing | `name[i, j, ...]` | Multi-dim array access | `result[row - row_offset, col - col_offset]` | APL `result[2;3]`; BQN `⊏`; K `@` | `[Q]` (query) | +| `if/then` | Conditional | `if cond { then-body }` | If-then (else inferred) | `if col = col_omit { ++ col_offset; continue; }` | Forth (IF/THEN); CoSy (control flow) | `[B]` (branch) | + +**Total Tier 1: 12 verbs.** (Slightly over the 10 estimate; the verbs are tight enough that splitting them hurts readability.) + +### 4.2 Tier 2 — Data-Oriented Pipeline (~12 verbs) + +The Tier 2 verbs wrap the existing 45+ MCP tools (per `docs/guide_tools.md` §"Native Tool Inventory") with declarative intent expressions. They are the "imperative veneer" over the Jofito-style predicate chain. 
+ +| Symbol | Name | Signature | Semantics | Example | Maps to mcp_client tool | Borrowed from | Shape | +|---|---|---|---|---|---|---|---| +| `scan` | Scan | `scan path` | Read source (directory, file, URL); first verb in every pipeline | `scan "src/" -> filter !dir -> map ext` | `list_directory` + `search_files` + `read_file` | Jofito `scandir()` | `[I]` | +| `select` | Select | `select condition` | Keep records matching condition (jq-style filter) | `scan "src/" -> select .extension == ".py"` | (jq-style filter) | jq `select(condition)`; Joy `filter` | `===>` | +| `filter` | Filter | `filter predicate` | Keep records where predicate is true | `scan "src/" -> filter .size > 0` | (predicate on FileItem) | Jofito `{filter ...}` predicate | `===>` | +| `map` | Map | `map block` | Apply block to each record | `scan "src/" -> map ext` | (no direct equivalent) | jq `.[] \| .field`; Joy `map`; CoSy `' verb 'm` | `o==>` | +| `fold` | Fold | `fold init block` | Reduce to single value | `scan "src/" -> fold 0 { acc + .size }` | (no direct equivalent) | jq `reduce`; Joy `fold` | `o==>` | +| `sort` | Sort | `sort key` | Order records by key | `scan "src/" -> sort .name` | (no direct equivalent) | Joy `qsort`; jq `sort` | `[I]` | +| `group` | Group | `group key` | Bucket records by key | `scan "src/" -> group .extension` | (no direct equivalent) | jq `group_by`; CoSy APL-derived | `o==>` | +| `dedupe` | Dedupe | `dedupe` | Remove duplicates | `scan "src/" -> dedupe` | (no direct equivalent) | jq `unique`; CoSy | `[I]` | +| `tape { }` | Tape scope | `tape { body }` | Tape-drive region; pre-scatter contents | `tape { [ scan ]; [ filter ]; [ print ] }` | (compiler directive) | KYRA magenta pipe; Onat preemptive scatter | `o==>` | +| `scatter` | Scatter | `scatter workers` | Fork pipeline across `workers` cores | `scan "src/" -> scatter 4 -> filter` | (runtime hint) | Onat preemptive scatter; Lottes X.com thread line 55-61 | `===>W===>` | +| `gather` | Gather | `gather` | Collect 
scattered sub-streams | `scan "src/" -> scatter 4 -> filter -> gather` | (runtime hint) | Onat inverse of scatter | `[I]` | +| `pipe` | Pipe root | `pipe` | Explicit chain root (synonym for `->`) | `pipe [ scan, filter, print ]` | (no direct equivalent) | Jofito pipe coalescing (transcript:376-410) | `===>W===>` | + +**Total Tier 2: 12 verbs.** + +### 4.3 Tier 3 — Shell (~10 verbs) + +The Tier 3 verbs wrap existing MCP tools (per `docs/guide_tools.md` §"Native Tool Inventory") and provide the shell-scripting surface. They are the "imperative veneer" over the declarative Tier 2 pipeline. + +| Symbol | Name | Signature | Semantics | Example | Maps to mcp_client tool | Borrowed from | Shape | +|---|---|---|---|---|---|---|---| +| `exec` | Execute | `exec cmd` | Run shell command | `exec "find . -name '*.py'"` | `run_powershell` (shell_runner.py) | nagent tag protocol (structured protocol idea) | `[I]` | +| `open` | Open | `open path` | Open file/URL | `open "src/foo.py"` | `read_file` | nagent tag protocol | `[I]` | +| `read` | Read | `read path` | Read file content | `read "src/foo.py"` | `read_file` | nagent tag protocol | `[I]` | +| `write` | Write | `write path content` | Write file content | `write "src/foo.py" "new content"` | `set_file_slice` / `edit_file` | nagent tag protocol | `[I]` | +| `close` | Close | `close handle` | Close handle | `close file_handle` | (no direct equivalent; close is implicit in Python) | Forth `CLOSE-FILE`; bash `exec` | `[I]` | +| `path` | Path | `path` | Get current path (or `cd`) | `path` | (no direct equivalent; use `cwd`) | shell `pwd`; CoSy `path` | `[I]` | +| `env` | Env | `env var` | Get env var | `env HOME` | (no direct equivalent) | shell `echo $HOME` | `[I]` | +| `wait` | Wait | `wait ms` | Block for `ms` milliseconds | `wait 1000` | (no direct equivalent) | shell `sleep` | `o==>` | +| `poll` | Poll | `poll handle ms` | Poll handle with timeout | `poll file_handle 5000` | (no direct equivalent) | shell `read -t` | `o==>` | 
+| `cwd` | CWD | `cwd` | Get current working directory | `cwd` | (no direct equivalent) | shell `pwd` | `[I]` | + +**Total Tier 3: 10 verbs.** + +### 4.4 Tier 4 — AI-Fuzzing Tolerance (~8 verbs, the novel contribution) + +The Tier 4 verbs are what make the DSL workable for AI agents that may fuzz verb names, indent inconsistently, or offset line references. Each verb directly maps to one or more of the 4 anchor claims (especially Claim 3: IEventTarget, per Cluster 0). + +| Symbol | Name | Signature | Semantics | Example | Novel piece | Borrowed from | Shape | +|---|---|---|---|---|---|---|---| +| `fuzzy` | Fuzzy | `fuzzy expr` | Declare a parse-tolerance region; parser accepts near-matches | `fuzzy { scan "src/" -> filter .ext }` | Tolerance for AI verb-name fuzzing | nagent "discovery" intent (per `decisions.md:119,128`); SSDL "assume as much as possible" | `===>` | +| `try { ... } recover { ... }` | Try / Recover | `try { body } recover err { fallback }` | Returns `Result[T]`; on error, the `recover` block runs | `try { read "src/foo.py" } recover { read "src/Foo.py" }` | Error envelope as data (Fleury pattern) | `data_oriented_error_handling_20260606`; Wasm `try`/`catch` block/loop/if/end | `===>B===>` | +| `sandbox { ... 
}` | Sandbox | `sandbox { body }` | IEventTarget boundary; all writes in the block go through the formal event channel | `sandbox { write "tmp/x" "data" }` | O'Donnell's "reads free, writes formalized" invariant applied to the DSL | O'Donnell `mvc.html` "Writing to Model state" | `o==>` | +| `audit` | Audit | `audit msg` | Log the state change to a structured record; the IEventTarget itself | `audit "wrote tmp/x"` | Per-write audit log; full replay capability | O'Donnell `mvc.html` "Event callbacks"; nagent's self-describing tools | `[I]` | +| `didyoumean` | Did you mean | `didyoumean ambiguous` | Propose the closest matching verb(s) for an ambiguous input | `didyoumean "skan"` | Recovery primitive for AI typos | nagent Bridge DSL intent model; Anthropic `input_examples` | `[I]` | +| `span` | Span | `span intent` | Decompose a compound intent into a span of sub-MCP grammar tokens | `span "read foo.py:MyClass"` | Spans the `read_file` and `py_get_definition` tools | MCP DSL per-MCP grammar (`spec.md:456-465`); OpenAI namespace grouping | `[I]` | +| `offset` | Offset | `offset symbol` | Resolve a symbol to a file:line without requiring the model to specify the line | `offset "foo.py:MyClass.method"` | Implicit offset resolution | MCP DSL line-range notation; OpenAI "don't make the model fill known args" | `[Q]` | +| `assumewide` | Assume wide | `assumewide intent` | If the intent is broad or ambiguous, select the most-capable matching tool (the "fewer, more capable" heuristic) | `assumewide "refactor"` | Prefer broad-capability tools over narrow specialists | OpenAI "fewer than 20 functions"; Anthropic `tool_choice: tool` force-call | `===>W===>` | + +**Total Tier 4: 8 verbs.** + +**Total vocab: 12 + 12 + 10 + 8 = 42 verbs.** (~40 estimate; slightly over because Tier 1 is 12 instead of 10, but Tier 3 is 10 and Tier 4 is 8.) + +--- + +## 5. Hardware Mapping (4 Anchor Claims) + +The 4 anchor claims tie the vocab and grammar to actual hardware/software stages. 
+ +### 5.1 Claim 1 — Onat/Lottes, hardware + +The DSL's `->` pipeline, `[ ]`/`{ }` blocks, `tape { }` memory model, and `scatter`/`gather` verbs are direct descendants of KYRA/VAMP and x68. + +- **`->` pipeline:** inherits from Forth's postfix word chain, refined by KYRA's 2-register stack (RAX/RDX) as the minimal call convention. Per `C:\projects\forth\bootslop\references\kyra_in-depth.md:14` (*"The 2-Item Hardware Stack: To achieve hardware locality and GPU compatibility, KYRA strictly restricts the data stack to exactly two CPU registers: `RAX` (Top of Stack) and `RDX` (Next on Stack)"*). +- **`[ ]` sequential block:** inherits from KYRA's basic blocks `[ ]` with implicit begin/link/end jump targets. Per `kyra_in-depth.md:56-57` (*"Basic Blocks `[ ]`: These visually constrain the assembly output. They provide implicit begin, link (else), and end jump targets for the JIT to resolve relative offsets within a limited scope"*). +- **`{ }` lambda block:** inherits from KYRA's lambdas `{ }` that compile code elsewhere and leave an address in `RAX`. Per `kyra_in-depth.md:58-59` (*"Lambdas `{ }`: A lambda (colored Yellow `{`) does not execute inline. The JIT compiles the block of code elsewhere in the tape and leaves its executable memory address in `RAX`."*). +- **`tape { }`:** inherits from KYRA's magenta pipe `|` definition boundary (`RET` + `xchg rax, rdx`) as the entry/exit protocol for a memory region. Per `kyra_in-depth.md:24-27` (*"The Magenta Pipe Trick: Because the stack is just `RAX` and `RDX`, ensuring `RAX` is the active 'Top of Stack' before executing a word is vital. The `xchg rax, rdx` instruction compiles to a tiny 2-byte opcode: `48 92`. Definitions: There are no `begin` or `end` words. A magenta pipe token (`|`) implicitly signals the start of a new definition. The JIT reacts to this by: 1. Emitting a `RET` (`C3`) to close the *previous* definition. 2. Emitting `48 92` (`xchg rax, rdx`) to ensure proper stack alignment for the *new* definition."*). 
+- **`scatter`:** inherits from Onat's preemptive scatter — per `X.com - Onat & Lottes Interaction 1.png.ocr.md:59-61`: *"The key concept here is that 'common' arguments like the device are pushed onto the tape using store duplication when they are known (after device creation). So it's preemptive scatter, so later at call time there is no argument gather."* +- **`gather`:** the inverse of preemptive scatter — collect pre-scattered values from fixed memory slots. + +Lottes's specific framing at `X.com - Onat & Lottes Interaction 1.png.ocr.md:80-86`: *"I laugh when people say C is like assembly, they are missing what we **actually** did in assembly back then, which was all registers and globals and gotos, no stacks. It's radically different than good assembly."* The DSL's 2-register model + tape regions + magenta `->` are a direct application of this insight: don't pretend you have a memory stack when the hardware has registers. + +### 5.2 Claim 2 — O'Donnell, paradigm + +The DSL's pipeline is *immediate-mode in pipeline composition*. Each `->`-delimited stage is a method invocation, not a Pipeline object. The pipeline exists *only* while the DSL program is being executed; once execution ends, the pipeline's state is gone. + +Per O'Donnell at `https://johno.se/book/imgui.html`: *"Widgets, logically, change from being objects to being method invocations. As we shall see, this fundamentally changes how a client application approaches the implementation of user interfaces."* + +The DSL inherits this: `scan -> filter -> print` is not a pipeline object you can query, name, or pass around. The only way to "name" a chain is to wrap it in a function (`determinate(m, row) -> Scalar { ... }`). The function body IS the chain; the function name IS the chain's identity. There is no separate Pipeline class. + +This also means: the parser doesn't need to track pipeline state across executions. Each invocation of `determinate(m, row)` is independent. 
There is no "current pipeline" implicit state. The next call is fresh. + +### 5.3 Claim 3 — Forth/CoSy, syntax + +Concatenative syntax is immediate-mode in *tokenization* (whitespace-delimited, no precedence), in *evaluation* (each verb pops args, pushes results), and in *parsing* (no AST object retained after the parse — the parser emits JIT'd code directly per Onat's xchg model). + +- **Tokenization:** whitespace-delimited, no precedence table. Per `https://en.wikipedia.org/wiki/Forth_(programming_language)`: *"Forth's grammar has no official specification. Instead, it is defined by a simple algorithm. The interpreter reads a line of input from the user input device, which is then parsed for a word using spaces as a delimiter."* +- **Evaluation:** each verb pops args, pushes results. Per CoSy Simplicity: *"Words pass information to each other by pushing it on, or taking it off a `stack`."* +- **Parsing:** no AST object retained after parse. The parser emits directly. Per `data_oriented_error_handling_20260606/spec.md` §3.1 and the project's overall "data-oriented design" philosophy, parsing is data flow, not object construction. + +The DSL inherits all three. The parser reads whitespace-delimited tokens, evaluates each verb as a stack effect, and emits the result without retaining an AST. + +### 5.4 Claim 4 — APL/K, data + +Array languages are immediate-mode in *data representation*. There is no array-object header; values are passed by stack reference, not by handle. + +- **APL** (per `https://en.wikipedia.org/wiki/APL_(programming_language)`): *"APL has an array as the universal data type"* — scalar `5` is a 0-dimensional array; `4 5 6 7 + 4` propagates the addition across the vector. +- **K** (per `https://en.wikipedia.org/wiki/K_(programming_language)`): "kdb+ (built on K) processes billions of records at microsecond latency" — the array paradigm scales to production workloads. 
+- **BQN** (per `https://mlochbaum.github.io/BQN/`): the CBQN bytecode compiler confirms the paradigm can be compiled efficiently. + +The DSL's `for x .. n` range + `result[row, col]` indexing inherits the "no array object" property. The array is *the* universal type; every function operates on it; every function vectorizes. + +--- + +## 6. AI-Agent Properties (10 Claims) + +The 10 claims tie the DSL to the existing project's architecture so future tracks can build on it without re-deriving the design. + +### 6.1 Claim 1 — Domain = Meta-Tooling + +The DSL is **Meta-Tooling-side** per `docs/guide_meta_boundary.md` §"Domain 2: The Meta-Tooling". The Application's provider-native function-calling stays unchanged. The DSL is the format external agents (Gemini CLI, OpenCode) emit when invoking `mcp_client.py` tools. + +### 6.2 Claim 2 — Runtime path = external agent → DSL → bridge → MCP → optional Hook API approval + +Per `docs/guide_meta_boundary.md` §"The Inter-Domain Bridges": external agents (Gemini CLI) call the DSL via a bridge script (`scripts/cli_tool_bridge.py` analogue). The bridge script translates the DSL into `mcp_client.dispatch()` calls. The Hook API (`docs/guide_tools.md` §"The Hook API") surfaces HITL approval modals when the bridge detects a `sandbox { ... }` block. + +### 6.3 Claim 3 — 3-layer security + +The DSL's parser respects the existing 3-layer security model in `mcp_client.py` (per `docs/guide_tools.md` §"The MCP Bridge"). Every DSL statement that targets a tool outside the allowlist is rejected at parse time. The 3 layers are: allowlist construction, path validation, and resolution gate. The DSL does not bypass any of these. 
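As a rough illustration of the claim in §6.3, the sketch below shows what rejecting a statement at parse time could look like for two of the three layers (allowlist and path validation). Everything here is an assumption for illustration: `VERB_TOOLS`, `check_verbs`, and `path_is_inside` are hypothetical names, the verb-to-tool mapping is loosely taken from the "Maps to mcp_client tool" columns in §4, and the resolution gate is not modeled because the report does not describe it. The real checks live in `mcp_client.py`.

```python
from pathlib import Path

# Hypothetical verb -> tool mapping, loosely following the tables in section 4.
VERB_TOOLS = {
    "scan": {"list_directory", "search_files", "read_file"},
    "read": {"read_file"},
    "exec": {"run_powershell"},
}


def check_verbs(verbs, allowlist):
    """Return (verb, tool) pairs that would be rejected at parse time."""
    rejected = []
    for verb in verbs:
        # Verbs with no tool mapping (e.g. pure Tier 1 math) pass through here.
        for tool in sorted(VERB_TOOLS.get(verb, ())):
            if tool not in allowlist:
                rejected.append((verb, tool))
    return rejected


def path_is_inside(root: str, candidate: str) -> bool:
    """Path validation: the resolved candidate must stay under the resolved root."""
    root_path = Path(root).resolve()
    target = (root_path / candidate).resolve()
    return target == root_path or root_path in target.parents
```

For example, with an allowlist containing only the three read-side tools, `check_verbs(["scan", "exec"], ...)` reports `exec` as targeting `run_powershell`, and `path_is_inside` refuses a candidate such as `../etc/passwd` that escapes the project root.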
+ +### 6.4 Claim 4 — 4 memory dimensions + +The DSL does *not* replace any of the 4 memory dimensions (per `conductor/tracks/nagent_review_20260608/nagent_review_v2_1_20260612.md` §2.1): +- **Curation memory** (FileItem + ContextPreset + FuzzyAnchor) +- **Discussion memory** (disc_entries + branching + UISnapshot A1-A7) +- **RAG memory** (ChromaDB, opt-in) +- **Knowledge memory** (Candidate 11, the harvested durable learnings) + +The DSL is a *query format* for all 4, not a replacement. A `scan "src/foo.py"` is a curation-memory query; a `select .role == "User"` is a discussion-memory query; a `search "execution clutch"` is a RAG-memory query; a `read "knowledge/digest.md"` is a knowledge-memory query. + +### 6.5 Claim 5 — Stable-to-volatile cache ordering + +The DSL's `tape { }` blocks are cache-friendly per nagent v2.1 §2.2 stable-to-volatile ordering. The DSL's audit logs (Tier 4 `audit` verb) are a *stable* layer that can be cached across turns. The DSL's pipeline output (e.g., the output of `scan -> filter`) is a *volatile* layer appended per turn. + +### 6.6 Claim 6 — `Result[T]` envelope + +The DSL's `try { ... } recover { ... }` verb returns `Result[T]` per the convention established by `conductor/tracks/data_oriented_error_handling_20260606/spec.md` §3.3. The 12 `ErrorKind` values are the canonical error vocabulary. The `Result[T]` dataclass is the data-oriented alternative to exception-based control flow. + +### 6.7 Claim 7 — Command Palette 33 commands + +The DSL's verbs are a *richer* superset of the 33 Command Palette commands (per `docs/guide_command_palette.md` and `src/commands.py`). The "Everything" mode in the Command Palette (per `guide_command_palette.md` line 383: *"search across commands, files, symbols, history, settings"*) is a near-term use case where the DSL's verbs can be the underlying format. The user types `find "execution clutch"` instead of clicking on a result; the DSL parses the intent and dispatches to the right MCP tool. 
+ +### 6.8 Claim 8 — Hook API state fields + +The DSL's verbs that mutate state route through `_predefined_callbacks` (per `docs/guide_state_lifecycle.md` §"Hook API Surface"). The verbs that read state use `_gettable_fields`. The DSL never bypasses the Hook API; it's a *user* of the existing infrastructure. + +### 6.9 Claim 9 — O'Donnell's IEventTarget pattern as the `sandbox` verb + +The `sandbox { ... }` block in Tier 4 is the DSL's IEventTarget boundary. Per O'Donnell at `https://johno.se/book/mvc.html` "Writing to Model state": *"Writes to Model are formalized through the addition of IEventTarget. This is a pure virtual interface that defines all possible state changes / events on a system wide level."* In the DSL, `sandbox { ... }` declares: every state change in this block goes through a single auditable interface (the bridge script's HITL approval modal per `docs/guide_meta_boundary.md`). The `audit` verb is the IEventTarget itself: a write-verb that logs the state change to a structured record (timestamp, source, kind, payload — same shape as `guide_architecture.md` §"Telemetry & Auditing" `Comms Log` entries). + +Per the cluster 0 sub-report (per `cluster_0_odonnell.md` §"Connections" Connection 1): *"The `sandbox` verb isolates execution and enforces that all state observations by the sandboxed code are *reads* — they can occur freely against the const Model view. State mutations by sandboxed code, however, must be routed through the formal event channel."* + +### 6.10 Claim 10 — O'Donnell's "reads are free" claim as the rationale for cheap verbs + +Per O'Donnell at `https://johno.se/book/mvc.html` "Reading Model state": *"First of all, View and Controller may only access Model in a const fashion. This has numerous repercussions. Firstly, exposing central Model state as public is ok, as it can only be read. 
Also, only const methods may be called, so state changes cannot be made internally as a result of a bad function call."* + +The Tier 2 verbs (`scan`, `filter`, `map`, `fold`, `sort`, `group`, `dedupe`) are *read-only* and can be re-evaluated freely, multiple times per execution, in parallel stages, without audit. Only the moment the chain's output is consumed by a write-verb (`exec`, `write`, `assign`) triggers the HITL modal. This is why the bridge script can re-execute a read-only chain without human approval. + +Per the cluster 0 sub-report (per `cluster_0_odonnell.md` §"Connections" Connection 2): *"O'Donnell's 'reads are free' claim is the rationale for cheap Tier 2 verbs — they can be re-evaluated freely because they never mutate state, so they can be re-evaluated freely, multiple times per execution, in parallel stages, without audit."* + +--- + +## 7. Open Questions for Follow-up B (≥6) + +These open questions must be answered by the follow-up B track (interpreter prototype). Each question is a design decision the interpreter must make. + +1. **How does `tape { }` map to Onat's preemptive scatter?** Is the block itself a tape-drive region, or is `tape` a wrapper that allocates a tape for the block's contents? The interpreter must decide whether `tape { ... }` is a parser hint (the parser pre-scatters) or a runtime directive (the runtime allocates a tape). The implication: parser-time optimization vs runtime flexibility. + +2. **Where does "intent resolution" live?** Is it a per-verb option, a per-block modifier, or a global parser mode? The `fuzzy` verb declares a parse-tolerance region; is this a property of the verb, of the block, or of the whole program? The interpreter must decide how `fuzzy` composes with non-`fuzzy` verbs in the same chain. + +3. 
**How does `audit` interact with `comms.log`?** Per `docs/guide_architecture.md` §"Telemetry & Auditing", the existing 5 log streams are `comms.log` (JSON-L for API traffic), `toolcalls.log` (markdown for tool invocations), `apihooks.log` (HTTP hook invocations), `clicalls.log` (subprocess details), and `scripts/generated/_.ps1` (preserved scripts). Is the DSL's audit log a 6th stream, or does it fold into one of the existing 5? Recommendation: a 6th stream (`audit.log`) because the DSL's audit is verb-level (every verb), while the existing 5 streams are tool-level (specific call types). + +4. **Does `sandbox` produce `Result[T, ErrorInfo]` (the Fleury pattern) or a different envelope?** Per `data_oriented_error_handling_20260606/spec.md` §3.3, the canonical `Result[T]` is a dataclass with `data: T` and `errors: list[ErrorInfo]`. The `sandbox { ... }` block can either use this envelope or a different one (e.g., `SandboxResult` with `stdout: str`, `stderr: str`, `exit_code: int`, `errors: list[ErrorInfo]`). The interpreter must decide. + +5. **`didyoumean` recovery: parser feature or user-facing verb?** If parser feature, the parser auto-corrects on parse failure and the user never sees the typo. If user-facing verb, the parser logs the typo, the user writes `didyoumean ""`, and gets a suggestion. The interpreter must decide whether `didyoumean` is part of the parse path or part of the runtime path. + +6. **How does `for x .. n` interact with Tier 2's `filter`/`map`?** Is `for x .. n { body }` sugar for `[1, 2, ..., n] -> map { body }`? Or are they distinct (the for-loop has named variable, the pipeline has anonymous position)? The interpreter must decide whether the user's pseudocode `for col .. m.columns { body }` is syntactic sugar for the array-language `iota m.columns { ... }`. + +7. 
**How does `sandbox` map to Manual Slop's `pre_tool_callback` flow?** The `sandbox` block's audit log: separate JSON-L file, or fold into the existing `comms.log` + `toolcalls.log`? (This is the same question as #3, but specifically about the runtime path — what happens when a `sandbox { write "tmp/x" "data" }` is actually executed by the bridge script?) + +8. **Connection to `intent_dsl_for_meta_tooling_20260608_PLACEHOLDER`:** what's the minimum subset of the report's vocab that would let the placeholder track (a) write a bridge script and (b) demonstrate one round-trip end-to-end? The placeholder's per-MCP grammar design (per `mcp_architecture_refactor_20260606/spec.md` §12.1) needs at least 1 Tier 1 verb, 1 Tier 2 verb per sub-MCP, and 1 Tier 4 verb (probably `sandbox` or `audit`). The minimum subset: 1-3 verbs, plus the grammar. + +--- + + +--- + +## Appendix: Deep-Dives + +This appendix contains the in-depth elaborations referenced from each main section. The main report's §1-§7 are the executive summaries; this appendix is the reference material. Each subsection corresponds to a main section and contains extended discussion, formal specs, full reference tables, and edge cases that don't fit the main report's flow. + +--- + +### A.1 Section 1 Deep-Dive: The 4 Anchor Claims in Detail + +**Anchor 1 — Intent-mapping (Jofito heritage).** The "intent mapping engine" framing comes from Jofito's 2026 README update (per `https://codeberg.org/jbruchon/jofito`): *"I am generalizing it out to become an 'intent mapping engine' instead. I intend to replace coreutils, findutils, grep, and sed with 'scripted' commands of intent."* The key shift is from *imperative* to *declarative*: instead of `find . -type f | grep -v jpg | grep -v jpeg`, the user writes `list = scandir("/path/here/", {filter !extension=jpg,jpeg}) : print(list)`. The "filter !extension=..." is a *declarative predicate* — the user names the intent, the engine figures out the operations. 
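The contrast between naming an intent and spelling out operations can be sketched in Python. This is a minimal illustration, not Jofito's implementation: the `scandir` function and its `exclude_extensions` parameter are hypothetical stand-ins for the README's `scandir("/path/here/", {filter !extension=jpg,jpeg})` call, and `demo` exists only to exercise it on a throwaway directory.

```python
import os
import tempfile


def scandir(path, exclude_extensions=()):
    """Hypothetical stand-in for the Jofito call quoted above.

    The caller names the intent ("files whose extension is not in this set");
    how the directory is read is left to the function.
    """
    excluded = {ext.lower().lstrip(".") for ext in exclude_extensions}
    names = []
    with os.scandir(path) as entries:
        for entry in entries:
            if not entry.is_file():
                continue
            ext = os.path.splitext(entry.name)[1].lower().lstrip(".")
            if ext not in excluded:
                names.append(entry.name)
    return sorted(names)


def demo():
    """Build a temporary directory and apply the declarative filter to it."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("a.jpg", "b.jpeg", "c.txt", "d.png"):
            open(os.path.join(root, name), "w").close()
        return scandir(root, exclude_extensions=("jpg", "jpeg"))


print(demo())
```

The caller never mentions directory handles, loops, or string comparisons; swapping the body for a faster strategy would not change the call site, which is the property the "intent mapping" framing relies on.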
+ +The DSL's verbs (§4) are *intent verbs*. `scan` is "I want to read a source"; `filter` is "I want to keep only what matches"; `print` is "I want to write to stdout"; `audit` is "I want to record what happened". The bridge script and MCP tools handle *how to do it*. This is the philosophical core: the user says *what they want*, the verbs are the way to say it, the runtime handles the implementation. + +Counter-argument: the user might want fine-grained control. The DSL accommodates this through the `exec` verb (Tier 3) and the `tape { }` block (Tier 2) which let the user drop down to imperative code when needed. But the *default* mode is declarative. + +**Anchor 2 — Hardware is the truth (Onat/Lottes heritage).** The verbs must map to actual hardware/software stages. The 2-register stack (RAX/RDX), the magenta pipe `|`, the basic blocks `[ ]`, the lambdas `{ }`, and the preemptive scatter are not arbitrary design choices — they are *concrete mappings* to x86-64 hardware. The DSL's `->` pipeline is the magenta pipe retargeted to data flow; the `tape { }` block is the magenta-pipe definition boundary retargeted to memory regions; the `scatter` verb is the preemptive-scatter model from the X.com thread at lines 55-61. + +The "hardware is the truth" claim is stronger than "the hardware informs the design." It says: if a verb doesn't have a hardware stage to map to, it shouldn't be in the vocab. The Tier 1 math verbs (e.g., `sum`, `product`) map to SIMD horizontal-add instructions. The Tier 2 pipeline verbs map to thread-parallel leader/chaser execution. The Tier 4 audit verb maps to the comms.log write path (per `docs/guide_architecture.md` §"Telemetry & Auditing"). Every verb has a hardware story. + +**Anchor 3 — Immediate-mode (O'Donnell heritage).** "Widgets, logically, change from being objects to being method invocations" (per `https://johno.se/book/imgui.html` — "Immediate Mode applied" section). 
The DSL extends this to pipelines: `scan -> filter -> print` is not a Pipeline object you can query, name, or pass around. The only way to "name" a chain is to wrap it in a function (`determinate(m, row) -> Scalar { ... }`). The function body IS the chain; the function name IS the chain's identity. + +The implication for the interpreter (follow-up B track): there is no Pipeline class to construct. The parser reads tokens, evaluates verbs as stack effects, and emits the result. The runtime can be a simple recursive-descent evaluator or a register-based VM. The choice of evaluation strategy is open (per the open questions in §7), but the design is locked: no Pipeline object, ever. + +**Anchor 4 — Vocabulary is the user surface (CoSy heritage).** Per `https://cosy.com/CoSy/Simplicity.html`: *"CoSy is a TimeStamped notebook/log created as an open vocabulary in Forth."* And: *"an extensive vocabulary evolved from APL via K."* The vocabulary *is* the user surface. For AI agents that emit the DSL, the vocab is the API. The 40-verb vocab in §4 is the complete API surface; the syntax is trivial (RPN with a few delimiters); the runtime is implementation detail. + +The DSL does not need a separate "API documentation" page. The vocab tables in §4 ARE the API. The "Take bullets" column names the prior art for each verb. The "Maps to mcp_client tool" column (for T2/T3) names the underlying implementation. An AI agent that knows the 40 verbs can express any intent the DSL supports. + +**How the four claims compose.** The user expresses intent (Claim 1) using a verb (Claim 4) that maps to a hardware stage (Claim 2) in a single per-frame composition (Claim 3). None of the claims is independent: removing any one breaks the others. The DSL is the *intersection* of these four claims, not their union. + +The composition is not always harmonious. There are trade-offs: +- *Intent-mapping (Claim 1) vs. 
Immediate-mode (Claim 3):* declaring intent per-frame is verbose if the intent doesn't change between frames. The DSL's `determinate(m, row) -> Scalar { ... }` form lets the user wrap repeated intent in a function; the runtime caches the compiled verb chain but not the evaluation result. +- *Hardware-mapping (Claim 2) vs. Vocabulary-as-user-surface (Claim 4):* the hardware may not support all the verbs. The DSL accepts this; verbs that don't have a clean hardware mapping are not added. (The §7 open questions flag this for the interpreter.) +- *Intent-mapping (Claim 1) vs. Hardware-mapping (Claim 2):* the user's intent may not be efficiently implementable on the target hardware. The DSL accepts this trade-off; the bridge script can fall back to `exec` (Tier 3) when no Tier 2 verb applies. + +--- + +### A.2 Section 2 Deep-Dive: Prior Art Full Text + +This subsection provides the full text of each prior-art entry, beyond the 1-sentence summaries in the main report's §2. + +#### A.2.1 Cluster 0 — John O'Donnell IMGUI/MVC (full text) + +**What the work is.** John O'Donnell's in-progress book *Immediate Mode Model/View/Controller* (2007-2008) at `https://johno.se/book/imgui.html` + `pitch.html` + `immvc.html` + `mvc.html` lays out a unified paradigm for game UI and application architecture. Four interconnected pages serve distinct roles in the overall argument: + +- `imgui.html` — The canonical IMGUI essay: defines widgets-as-method-invocations, presents a complete C++ `Gui` class with buttons/radios/edit boxes/tree controls/combo boxes/sliders/drag-and-drop, and distinguishes deferred vs. direct display. This is the most concrete page — it has actual code for every widget type. +- `pitch.html` — "The Pitch": frames IMGUI as a paradigm shift, attacks the retained-mode premise in detail, introduces the Controller as the per-frame "programmer" of View, and argues that GPU advances have eliminated the performance justification for retained mode. 
+- `immvc.html` — The book roadmap: maps the six-chapter structure (IMGUI → MVC/E → Persistence), explicitly names `IEventTarget` as central to multiplayer and async design, traces the author's design journey from Ground Control via Josephine/GC2 to MVC/E. +- `mvc.html` — The MVC chapter proper: defines `Model` (const-only access), `View` (procedural, stateless), `Controller` (per-frame orchestrator), formalizes the "reads are free, writes are formalized" invariant via a single `IEventTarget` interface. + +The central claim across all four pages is that **visualization is not inherently stateful** — the dominant assumption in OOP toolkits (MFC widgets, Ogre scene graphs, HTML DOM) is a historical artifact, not a technical necessity. The DSL inherits this philosophical core. + +**Specific quotes (the 4 anchor claims the DSL borrows):** + +- **Anchor 1 (widgets as method invocations):** "**Widgets, logically, change from being objects to being method invocations.** As we shall see, this fundamentally changes how a client application approaches the implementation of user interfaces." (`imgui.html` — "Immediate Mode applied" section, exact bold text) + +- **Anchor 2 (reads free, writes formalized):** "A very useful addition to the straight MVC design proposed above is the addition of a 'write proxy / event target' that controls all changes of state in Model. The idea can be summed up as follows: *"Reads are free, writes are formalized."*" (`mvc.html` — Controller section, exact bold text) + +- **Anchor 3 (IEventTarget as single event interface):** "Writes to Model are formalized through the addition of IEventTarget. This is a pure virtual interface that defines all possible state changes / events on a system wide level." (`mvc.html` — "Writing to Model state" section, exact quote) And: "Experience dictates that there only be a single IEventTarget interface that is responsible for all 'system events'." 
(`mvc.html` — "Why only a single event interface" section, exact quote) + +- **Anchor 4 (no scene-graph abstractions in View):** "The corresponding interface should be of the form: `view::drawMesh(mesh, transform, anyOtherRenderState);`" (`mvc.html` — "View" section, exact code) + +**Empirical evidence (Jungle Peak):** "In DirectX9 is possible to render very large batches of primitives per draw call. At Jungle Peak we rendered 800 000+ vertices in a single call on nVidia GeForce 6 class hardware, with good performance. The meant a number of things, such as discarding the concept of camera culling." (`pitch.html` — batch rendering section, exact quote) + +**Shearing exception concept:** "There is a chance that the result of any given widget interaction changes some application state that controls the appearance of the user interface itself, and such discrepancies can result in parts of the user interface reflecting the 'old' state while some reflect the 'new' state. I call this 'frame shearing'... The main technique to utilize is to have any code that changes the appearance of the user interface generate a 'shearing exception' which breaks out of the method that generates the gui for the current frame and restarts the entire process for the current frame." (`imgui.html` — "Frame shearing" section, exact quote) + +**The DSL's specific O'Donnell-derived design choices:** + +1. **No widget object hierarchy.** A verb is a method call, not a stateful object. The execution context is created fresh at call time and torn down at return time. +2. **Reads free, writes formalized via `sandbox { }` block.** Every write-verb inside `sandbox { ... }` routes through the formal event channel (the bridge script's HITL approval modal). Reads (Tier 2 verbs) are unconstrained. +3. **IEventTarget = `audit` verb.** The audit verb is the single interface that all state mutations must route through. The audit log is the event trace; the verification Model is the replay target. +4. 
**View must not expose scene-graph abstractions.** The DSL's verbs are flat procedure calls — `scan`, `filter`, `map`, etc. There's no object-graph hierarchy; no `Mesh/Transform/SceneNode` abstractions. + +#### A.2.2 Cluster 1 — Concatenative (Forth family) full text + +**Forth (Chuck Moore, 1970).** The canonical RPN stack-passing language. Forth combines a compiler with an interactive shell where the programmer builds up a dictionary of *words* (subroutines), each consuming and producing values exclusively via an implicit data stack using Reverse Polish Notation (RPN). All syntactic elements — variables, operators, and control flow — are defined as words; there is no BNF grammar, no AST, and no separate compilation phase in the classic model. The defining structural feature is the colon-word/semicolon-definition pattern (`: foo ... ;`) that makes the dictionary the sole organizing principle of the program. + +**What the DSL inherits:** the pure concatenative property — *"concatenation of two programs denotes the composition of the two functions they denote"* (Joy's formalization, but the property is implicit in Forth). The DSL's postfix syntax and its rejection of named lambda parameters (parameters are unnamed; they live on the stack) are direct inheritances. The DSL does not inherit the memory-based data stack — modern hardware makes the register-file-as-global-namespace model more efficient (per Onat/Lottes, see below) — but the *syntax* of passing arguments implicitly through a stack is the DSL's core grammar. + +**Threaded code compilation (Forth heritage).** "Forth's grammar has no official specification. Instead, it is defined by a simple algorithm. The interpreter reads a line of input from the user input device, which is then parsed for a word using spaces as a delimiter." (`https://en.wikipedia.org/wiki/Forth_(programming_language)`) Classic Forth compiles to threaded code, which "can be interpreted faster than bytecode." 
Modern Forths (SwiftForth, VFX Forth, iForth) compile to native machine code, but the original model of threaded interpretation is directly ancestral to the JIT-based approaches in KYRA and x68. + +**Self-hosting (Forth heritage).** "The minimum definitions for such a Forth compiler are the words that fetch and store a byte, and the word that commands a Forth word to be executed." (Wikipedia) This bootstrap property — where the language is written in itself — is the ultimate expression of the concatenative property: the compiler is just another word in the dictionary. The DSL inherits this: the interpreter prototype (follow-up B track) should be self-hostable. The Tier 1 verbs (`:=`, `for .. n`, `if/then`, etc.) are sufficient to write the interpreter itself. + +**ColorForth (Chuck Moore, ~1990s).** Color encodes semantics. Wikipedia notes ColorForth's continued evolution: the language uses color (rather than keywords) to distinguish *define* (red), *compile* (green), *execute* (yellow), and *variable* (magenta). The DSL inherits the idea that *visual/structural encoding* can replace keywords. The Tier 4 `audit` verb, for example, could be color-coded in a future editor without changing the parser. + +**KYRA / VAMP (Onat Türkçüoğlu, SVFIG 2025).** Per `C:\projects\forth\bootslop\references\kyra_in-depth.md`: + +- **2-Item Hardware Stack:** *"The 2-Item Hardware Stack: To achieve hardware locality and GPU compatibility, KYRA strictly restricts the data stack to exactly two CPU registers: RAX (Top of Stack) and RDX (Next on Stack)."* (line 14, exact quote) +- **8.24 ms compilation speed:** "KYRA compiles its entire program (including a custom editor, Vulkan renderers, and FFMPEG integrations) in 8.24 milliseconds natively on Windows/Linux." (line 13, exact quote) +- **Magenta pipe `|` trick:** *"A magenta pipe token (|) implicitly signals the start of a new definition. The JIT reacts to this by: 1. Emitting a RET (C3) to close the previous definition. 2. 
Emitting 48 92 (xchg rax, rdx) to ensure proper stack alignment for the new definition."* (lines 24-26, exact quote) +- **Basic blocks `[ ]`:** *"Basic Blocks [ ]: These visually constrain the assembly output. They provide implicit begin, link (else), and end jump targets for the JIT to resolve relative offsets within a limited scope."* (line 57, exact quote) +- **Lambdas `{ }`:** *"Lambdas { }: A lambda (colored Yellow {) does not execute inline. The JIT compiles the block of code elsewhere in the tape and leaves its executable memory address in RAX."* (line 58, exact quote) +- **Tape drive (preemptive scatter):** *"Instead of a stack, data needed for complex API calls (like Vulkan initialization) is pre-scattered into these known global offsets using Red (Store) words, and then passed via a single pointer."* (line 43, exact quote) +- **24-Bit Indices + 16-word "Scrolls":** "Words are stored as 24-bit indices pointing to 8-byte cells. (Onat notes his next iteration moves to 32-bit indices + a separate 1-byte tag array, exactly matching Lottes's x68 annotation model)." (line 49, exact quote) + +**What the DSL inherits:** the bracket operators (`[ ]` and `{ }`); the tape-scoped block (`tape { }`); the magenta `->` as the entry/exit protocol for a memory region; the preemptive scatter model for `scatter`/`gather`. KYRA's "8.24 ms compilation" gives a design target for the DSL's interpreter: the parse + emit should be < 100ms even for a large DSL program. + +**x68 / 5th (Timothy Lottes, 2007-2026).** Per `C:\projects\forth\bootslop\references\neokineogfx_in-depth.md`: + +- **32-Bit Instruction Granularity:** *"32-Bit Instruction Granularity: Every x86-64 instruction is padded to exactly 4 bytes (or multiples of 4)."* (line 26, exact quote) Padding uses ignored prefixes (3E DS segment override) and multi-byte NOPs. Example: RET (C3) padded to C3 90 90 90. 
+- **Folded Interpreter (5-byte tail):** *"Lottes mitigates this by folding a tiny (5-byte) interpreter directly into the end of every compiled word."* (line 20, exact quote) By ending every word with its own fetch/dispatch logic (LODSD, lookup, JMP), the CPU's branch predictor gets unique slots for every transition. +- **Annotation Overlay (64-bit per 32-bit token):** *"For every 32-bit source word, there are 64 bits of annotation memory... 8 characters encoded in 7 bits each (56 bits total) acting as the human-readable Label/Note. 8-bit Tag. This tag dictates how the 32-bit value in memory is formatted in the editor (e.g., Hex Data, Absolute Address, Relative Address)."* (lines 36-38, exact quote) +- **Auto-Relinking:** "The editor dynamically recalculates CALL/JMP 32-bit relative offsets and 8-bit conditional jump offsets when tokens are inserted or deleted. The editor is the linker." (line 42, exact quote) +- **16-clock branch misprediction penalty:** "Standard Forth causes severe CPU pipeline stalls (averaging 16-clock stalls on architectures like Zen 2) due to constant branch misprediction when interpreting tags or navigating the dictionary lookup loop." (line 18, exact quote) +- **Source is the Dictionary:** "The 32-bit words are direct absolute memory pointers into the binary." (line 47, exact quote) +- **Lottes's preemptive scatter (different angle):** "Lottes diverges from strict zero-operand Forth by introducing 'preemptive scatter' arguments directly in the source stream. Instead of pushing to a data stack before calling, words can read ahead in the instruction stream. [RSI] points to the current word. [RSI+4], [RSI+8] can be fetched directly into registers (like RCX, RDX) within the word's implementation." 
(lines 45-50, exact quote) + +**What the DSL inherits:** the 32-bit token encoding; the annotation overlay pattern (per-token metadata); the auto-relinking / edit-time fixup (relevant for any future IDE that edits the DSL); the folded-interpreter optimization (relevant for the runtime's verb-dispatch path); Lottes's specific "register file as aliased global namespace" framing. + +**Joy (Manfred von Thun, 2001-2003).** Per `https://en.wikipedia.org/wiki/Joy_(programming_language)`: *"Joy is a concatenative programming language: 'The concatenation of two programs denotes the composition of the functions denoted by the two programs'."* (exact quote, "Mathematical purity" section). Purely functional concatenative; quotations (`[ ]`) are first-class values; combinator library (`map`, `filter`, `fold`, `binrec`, `primrec`, `linrec`). + +**What the DSL inherits:** the quotation-as-first-class-value concept (the DSL's `[ ]` block); the combinator library (the DSL's `map`, `filter`, `fold`); the mathematical rigor of the concatenative property. Joy's *no formal parameters* discipline is also inherited: `square == dup *` is the Joy idiom; the DSL's verbs operate on the implicit stack the same way. + +**Joy's combinator library as the Tier 2 model.** Joy's `map` combinator: *"expects an aggregate value on top of the stack, and it yields another aggregate of the same size. The elements of the new aggregate are computed by applying the quoted program to each element of the original aggregate."* (per the Joy tutorial at `http://joylang.org/`, archived) This is the direct model for the DSL's `map` verb — applying a block to each record. Similarly for `filter`, `fold`, `sort`. + +**CoSy (Bob Armstrong, ongoing).** Per `https://cosy.com/CoSy/Simplicity.html`: + +- **TimeStamped Notebook/Log:** *"CoSy is a TimeStamped notebook/log created as an open vocabulary in Forth."* (exact quote, first paragraph) +- **3-cell list header:** *"all nouns are lists, trees.
At the Forth level they have a 3 cell header (Type Count refCount). Type 0 is a list of lists. Simple lists (characters, numbers) are leaf nodes."* (exact quote, second-to-last paragraph) +- **Modulo indexing:** *"Indexing is modulo - like counting on your thumb & fingers: 0 1 2 3 4 0."* (exact quote, last paragraph) +- **APL-via-K vocabulary:** *"an extensive vocabulary evolved from APL via K, mainly slicing and dicing, searching & replacing, and applying verbs to each item in lists."* (exact quote, third paragraph) +- **Self-hosting:** *"The CoSy notebook environment itself is written in CoSy."* (exact quote, last paragraph) +- **Tick vs. quote:** ` returns the next word as a string; ' returns the address of the following word. The DSL's string-literal vs. verb-reference distinction is analogous. + +**What the DSL inherits:** the open-vocabulary culture; the modulo indexing (forgiving of off-by-one AI errors); the 3-cell list header as a universal data structure; the APL-derived data-manipulation vocabulary; the self-hosting bootstrap model. + +**Lottes's specific framing (X.com thread, 2025-04-30).** Per `C:\projects\forth\bootslop\references\X.com - Onat & Lottes Interaction 1.png.ocr.md`: + +- **"I laugh when people say C is like assembly, they are missing what we **actually** did in assembly back then, which was all registers and globals and gotos, no stacks. It's radically different than good assembly."** (lines 79-81, exact quote; v1.1 OCR-restored) +- **2-item data stack compromise:** *"2-item data stack is an interesting compromise. Something I never considered. I left off ripping out the data stack completely."* (line 41, Lottes on Onat's KYRA) +- **Tape drive / preemptive scatter:** *"You mentioned VK is most 'form filling' which I think is an accurate description. 
For most 'C' like APIs I like to just lay out all the arguments in memory like a tape drive in the order that functions get called and source that tape at runtime for the calls."* (lines 52-55, exact quote) +- **No-argument-gather model:** *"The key concept here is that 'common' arguments like the device are pushed onto the tape using store duplication when they are known (after device creation). So it's preemptive scatter, so later at call time there is no argument gather."* (lines 58-61, exact quote) +- **Data stacks are software artifacts:** *"Likely the majority of C/C++/OOP/bloatware is just shuffling data around in argument gather to support the concept of data stacks on HW that has no physical data stack."* (lines 64-66, exact quote) +- **Register file as aliased global namespace:** *"I do all my custom CPU side stuff more like treating the register file like a 'memory' of which the contents are aliased to different shared structures for different purposes across time... So the register file is more like an aliased global namespace. And 'functions' are free of arguments and free of returns. This way of working with the HW is way better and easier than the 'C' model."* (lines 96-103, exact quotes) +- **Functions without args/returns:** *"functions are free of arguments and free of returns."* (line 102, exact quote) + +**The DSL inherits:** the no-stack design philosophy; the tape-drive argument passing; the "register file as aliased global namespace" model; the free-of-args-and-returns function model. The DSL's `tape { }` block is the tape-drive region; the DSL's verbs are the "functions free of arguments and free of returns" (parameters are passed via the implicit pipeline context, not via named arguments). + +#### A.2.3 Cluster 2 — Array Languages (full text) + +**APL (Kenneth Iverson, 1962; Turing Award 1979).** The foundational array programming language. 
APL's radical thesis: **the multidimensional array is the universal data type** and **every glyph is a function**. Per `https://en.wikipedia.org/wiki/APL_(programming_language)`: *"APL has an array as the universal data type"* — scalar `5` is a 0-dimensional array; `4 5 6 7 + 4` propagates the addition across the vector. + +The dedicated character set is load-bearing: each of the 80+ glyphs maps to a primitive function or operator. `+/` is "plus over" (reduce), `⌽` is "rotate", `⍉` is "transpose", `⌊` is "floor" (monadic) or "minimum" (dyadic). Operators (higher-order functions) combine with glyphs: `+⌿` is "plus reduce" along the first axis, `⍉⌽` is "rotate then transpose". + +**What the DSL inherits:** the array-as-universal-type principle; the right-to-left evaluation model; the philosophy that "there is an appropriate notation for thought" (Iverson's Turing Award lecture, *"Notation as a Tool of Thought"*). The DSL doesn't adopt the custom character set — it uses ASCII-compatible representation — but the philosophy is identical: the verbs are dense, single-character (or short-word) symbols, and the type system is array-first. + +**K / q (Arthur Whitney, KX Systems, 1993).** Per `https://en.wikipedia.org/wiki/K_(programming_language)`: K is a "proprietary array processing programming language" commercialized by KX Systems. K distilled A+ (Whitney's earlier APL derivative) into a minimalist ASCII-only syntax where every ASCII symbol is *heavily overloaded* by context, and functions are first-class values borrowed from Scheme. The result is a language that can express financial algorithms in single lines that read as cryptic character streams to the uninitiated. + +Example: `2!!7!4` reads right-to-left: `7!4` is modulo (3), `!3` is enumeration (0 1 2), and `2!` is rotation (the list 0 1 2 rotated left twice gives 2 0 1). Three distinct uses of `!` in one expression — the extreme end of the overloading spectrum. kdb+ (built on K) processes billions of records at microsecond latency; the array paradigm scales to production workloads.
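The three meanings of `!` in `2!!7!4` can be traced step by step in Python. This is only a sketch of the right-to-left reading given above, not K semantics in general; the helper names `k_mod`, `k_enumerate`, and `k_rotate` are invented for illustration.

```python
def k_mod(x, y):
    """Dyadic `!` on two integers, as read above: 7!4 is 7 modulo 4."""
    return x % y


def k_enumerate(n):
    """Monadic `!` on an integer: the integers below n."""
    return list(range(n))


def k_rotate(n, xs):
    """Dyadic `!` with an integer on the left and a list on the right:
    rotate the list left by n positions."""
    if not xs:
        return []
    n %= len(xs)
    return xs[n:] + xs[:n]


# Evaluate 2!!7!4 right-to-left, one `!` at a time.
step1 = k_mod(7, 4)          # 7!4 -> 3
step2 = k_enumerate(step1)   # !3  -> [0, 1, 2]
result = k_rotate(2, step2)  # 2!  -> [2, 0, 1]
print(result)
```

Each helper corresponds to one reading of the same glyph; which one applies is decided only by the operand types and positions, which is what "heavily overloaded by context" means in practice.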
+ +**What the DSL inherits:** the context-sensitive operator philosophy (the DSL's `->` arrow has different meaning in different positions); first-class functions (the DSL's `try { ... } recover { ... }` block is a first-class function); the production-scale validation that the array paradigm is performant. + +**BQN (Marshall Lochbaum, 2020).** Per `https://mlochbaum.github.io/BQN/`: BQN is a "modernized APL" with cleaner semantics, context-free grammar, and function trains. BQN's train composition (e.g., `+´↕N` for sum-of-range) is the direct design precedent for the DSL's pipeline verb chaining. + +**What the DSL inherits:** the train composition pattern as the most expressive tacit composition mechanism; the clean modern semantics (BQN explicitly cleans up APL's "soup of glyphs"); the leading axis model (BQN's `⊏` Select, `⊑` Pick for multi-dim indexing). + +**Uiua (Kai Schmidt, 2023).** Per `https://www.uiua.org/`: Uiua is a "modern APL descendant with stack-based execution" — a fundamental departure from the argument-binding model of APL, K, and BQN. Uiua is pronounced "wee-wuh" and is a tacit array programming language implemented in **Rust** (98.7% of the codebase). The language was designed to make array programming more accessible, with an online Pad (REPL), editor extensions for VS Code and other editors, and a focus on onboarding story. + +Uiua uses a *stack* instead of named parameters: functions pop their arguments from the stack and push results. The language is "tacit" — functions do not have explicit parameters; they operate on the stack of values. Uiua's standout feature is its **onboarding story**: an online Pad at uiua.org that requires no installation, editor extensions with syntax highlighting, a Discord community, GitHub Sponsors page, and a detailed changelog. The language was designed with accessibility as a core goal, not an afterthought.
+ +**What the DSL inherits:** the stack-based execution model as a viable alternative to named parameters; the modern open-source development model (online REPL, editor extensions, Discord community); the focus on accessibility. The DSL's name conventions should follow Uiua's pattern: short, distinctive names that are easy to type and remember. + +#### A.2.4 Cluster 3 — Intent-Mapping (full text) + +**Jofito (Jody Bruchon, 2023-2026).** Per the Jofito README at `https://codeberg.org/jbruchon/jofito`: + +- **2026 UPDATE NOTE:** *"This tool was originally intended to act like a sort of 'SQL for managing filesystems' but I am generalizing it out to become an 'intent mapping engine' instead. I intend to replace coreutils, findutils, grep, and sed with 'scripted' commands of intent."* (exact quote) +- **"Write the optimization once, reap the benefits everywhere":** *"jofito is a 'write the optimization once, reap the benefits everywhere' system that takes what the user wants to accomplish (intent) as input and decomposes it into operations that make the most sense for the current system."* (exact quote) +- **Canonical example:** `list = scandir("/path/here/", {filter !extension=jpg,jpeg}) : print(list)` (exact code from the README) +- **Low-level optimization:** *"jofito can then take advantage of the low-level system call 'getdents64' to perform faster directory reads, SSE or AVX for finding the file extensions, and use the 'write' system call to output length-specified final strings."* (exact quote) +- **Length-prefixed strings + 4K-granular memory:** *"everything in jofito will be composed of length-prefixed strings and 4K-granular memory allocations performed through tape 'object allocators' that automatically track allocation use and handle garbage collection as a programmer-friendly mechanism, making semi-manual memory management a reality without complex GC object trees and the non-deterministic lag of other GC languages. 
Requiring everything under the hood to have a known length eliminates 80% of the bugs that C is scorned for by simply not allowing buffer overflows to happen."* (exact quote) + +**The leader/chaser thread model (per the Jofito video transcript, lines 193-269).** The transcript explains the Jofito runtime: predicates run as threads sharing a single memory tape. The scanner (leader) reads directory entries and stores them sequentially in the tape. The filter (chaser 1) trails behind the scanner, deallocating entries that don't match the predicate as it encounters them. The printer (chaser 2) trails behind the filter, outputting matching entries and freeing them as it goes. The critical insight (lines 270-285): "if you have three cores or threads on a machine, the directory scan can be happening, and this actually would be happening in bulk with some of my optimizations. Then the filtration of that scan will be happening in another thread or on another core at the same time... scanning, filtering, and printing can all happen on a modern machine with multiple cores simultaneously." + +**What the DSL inherits:** the "intent mapping engine" framing; the declarative predicate chain (the DSL's `scan -> filter -> print`); the leader/chaser parallel execution model (the DSL's `scatter workers`); the tape allocation with length-prefixed strings. + +**Pipe coalescing (per the transcript, lines 376-410).** When the Unix shell sees `find ... | grep ... | sort | uniq`, each utility is a separate process. Jofito's pipe coalescing detects when multiple utilities in a pipeline are all Jofito scripts and collapses them into a single in-memory script. *"find and grep see their part of a pipeline. Find and grep see their the same Jofedo executable... And then find is the head, so it's the coordinator. 
And all the subordinates down the pipeline reach out to the head and say, 'Hey, here's my script, here's my parameters, integrate me into you and I'll just become a hollow pipe that sends the final results down the line.' Thus, find and grep and sort and unique and whatever else your big long stupid pipeline might use all get collapsed by Jofedo if they're all Jofedo scripts instead of the actual binaries... into one unified Jofedo script in memory."* + +**What the DSL inherits:** the pipe coalescing concept — the DSL's `pipe` verb explicitly fuses a sub-pipeline into a single-pass execution plan. The interpreter prototype (follow-up B track) should implement this optimization: when a `scan -> filter -> print` chain is detected, compile it to a single fused verb call rather than executing stage by stage. + +**jq (Stephen Dolan, 2012-).** Per `https://jqlang.org/`: jq is a JSON-path filter language. The DSL replaces `|` with `->` to avoid conflict with shell usage and to make the DSL parseable without shell-aware lexing. The DSL's `select(condition)`, `map`, `fold` (via `reduce`), and `dedupe` (via `unique`) are all directly grounded in jq's filter vocabulary. + +**What the DSL inherits:** the filter-as-expression style (every filter is a composable expression that produces a value directly); the `select(condition)` filter pattern; the compositional pipeline model. jq's streaming parser (added in 1.5) is a precedent for the DSL's "verb names and signatures parsed first, arguments parsed on demand" strategy. + +**nagent's tag protocol.** Per `conductor/tracks/nagent_review_20260608/agent_review_v2_1_20260612.md:50`: *"The protocol is XML-ish, not XML — first matching close tag wins; no entity escaping."* The new `nagent_tags.py` is an explicit parser that replaces regex-based parsing. 
The DSL inherits the *idea* of a structured protocol (named operation with typed attributes; LLM-emit-able; self-delimiting) but **rejects** the XML angle-bracket notation per the user's direct instruction (intent_dsl_survey_20260612 brainstorming session, 2026-06-12): *"ignore its record formats as they problably will be less xml/json based as I don't like them."* + +**What the DSL inherits:** the *idea* of a compact, human-readable structured protocol; the concept of "structured protocol wins on debuggability, function calling wins on training" (per `nagent_takeaways_20260608.md:214`); the rejection of XML/JSON as the surface syntax. + +**WebAssembly (W3C, 2017-).** Per `https://en.wikipedia.org/wiki/WebAssembly`: *"Data in memory is stored in a large, growable array of bytes termed a linear memory. Linear memory is separate from the wasm module's call stack and code and the engine's memory."* One paragraph only: the linear memory model is the modern reference for the "tape drive" argument-passing semantics that grounds the DSL's Tier 2 verbs. Wasm's streaming-parse design *suggests* a parsing strategy where verb names and signatures are validated early (cheap) and arguments are parsed on demand (deferred), though this is an inference, not an explicit recommendation from the Wasm spec. + +#### A.2.5 Cluster 4 — Meta-Tooling DSLs (full text) + +**`mcp_dsl_20260606` placeholder.** Per `conductor/tracks/mcp_architecture_refactor_20260606/spec.md:456-465`: + +> **"MCP DSL Track"** (`mcp_dsl_20260606` or similar) — introduces a per-MCP compact dialect for tool calls, replacing or augmenting the JSON format. Inspired by the user's notes on APL/K/Cosy DSLs. Examples: +> - JSON: `{"name": "py_get_skeleton", "arguments": "{\"path\": \"/src/foo.py\"}"}` (~80 tokens per call) +> - DSL: `py k /src/foo.py` (~10 tokens per call, ~8x reduction) +> - A per-MCP grammar definition (`py_grammar.k`, `file_io_grammar.k`, etc.) 
could be authored and compiled to a parser +> - A per-MCP DSL → JSON converter at the dispatch boundary +> - Backward compat: the JSON path stays; the DSL is opt-in per MCP + +This is the closest project-internal reference for the DSL. It establishes the 8x token-reduction target, the per-MCP grammar design, the backward-compat constraint, and the "JSON stays, DSL is additive" principle. + +**What the DSL inherits:** the per-MCP grammar organization; the 8x token-reduction target; the backward-compat constraint (the JSON path stays); the per-MCP DSL → JSON converter at the dispatch boundary. The DSL's `sub_mcp` (per the spec, in `src/mcp_client_legacy.py`) is the deployment model. + +**nagent's Bridge DSL idea.** Per `conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md:216-230`: the bridge DSL is the format external agents emit. The Application's function-calling stays the same. The bridge DSL is what external agents emit when invoking `mcp_client.py` tools. The Application doesn't see the bridge DSL; the bridge script (`cli_tool_bridge.py`) translates. + +**What the DSL inherits:** the Application/bridge/Meta-Tooling split (the DSL is the bridge DSL, not the Application's function-calling); the `cli_tool_bridge.py` analogue is the runtime path for the DSL. + +**OpenAI function-calling.** Per `https://platform.openai.com/docs/guides/function-calling`: JSON Schema with `strict`, `required`, `additionalProperties: false`, and `enum` constraints. The 5-step conversational loop (request → tool call → execute → response → final text) is the protocol skeleton the DSL must fit. Best practices: "Write clear and detailed function names, parameter descriptions, and instructions. Apply software engineering best practices — make functions obvious and intuitive; use enums to make invalid states unrepresentable. Offload the burden from the model and use code where possible — don't make the model fill arguments you already know. 
Keep the number of initially available functions small — aim for fewer than 20 functions available at the start of a turn." + +**What the DSL inherits:** the schema rigor baseline (the DSL's `span`, `offset`, `didyoumean` verbs provide this discipline at the DSL level); the token cost awareness (the 8x reduction target is directly motivated by "Under the hood, functions are injected into the system message in a syntax the model has been trained on. This means callable function definitions count against the model's context limit and are billed as input tokens."); the "fewer than 20 functions" heuristic maps to the DSL's `assumewide` verb. + +**Anthropic tool-use.** Per `https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/define-tools`: the Anthropic tool-use schema is the second dominant 2026 baseline — structurally similar to OpenAI's but with key differences in philosophy and API shape. Where OpenAI uses `"type": "function"` with nested `"function"` object, Anthropic uses a flat structure with `name`, `description`, and `input_schema` as top-level fields. Anthropic also introduces `input_examples` as a first-class field for schema-validated examples, and `strict` as a guarantee mechanism (not just a hint). The `tool_choice` parameter (`auto`, `any`, `tool`, `none`) provides fine-grained control over whether Claude calls a tool at all. + +**What the DSL inherits:** the `input_examples` field as a model for teaching the DSL (the DSL's grammar can include concrete examples alongside the grammar definition); the `tool_choice` control maps to Tier 4 verb design (`fuzzy` ≈ `auto`, `try`/`recover` ≈ `any`, `assumewide` ≈ forcing broad); the `strict: true` guarantee is mirrored by the DSL's runtime validation at the dispatch boundary (rejected DSL statements don't reach `mcp_client.invoke()`). 
+ +#### A.2.6 Cluster 5 — SSDL Shape Primitives (full text) + +Per `docs/reports/computational_shapes_ssdl_digest_20260608.md` §1, the SSDL vocabulary is the meta-language used to annotate the verbs in the main report's §4. + +**The 6 SSDL primitives** (full table from the SSDL digest): + +| # | Shape | One-line definition | SSDL symbol | +|---|---|---|---| +| 1 | **Instruction** | A single unit of computation. Reads data, writes data, or both. | `[I]` | +| 2 | **Codepath** | A sequential list of instructions that *terminates*. No loops. | `===>` | +| 3 | **Wide codepath** | A codepath whose execution *causes* several other codepaths to occur simultaneously. | `===>W===>` | +| 4 | **Codecycle** | A circular structure — a codepath that *repeats* at its first instruction after its last. | `o==>` | +| 5 | **Wide codecycle** | Multiple codecycles performing the same task simultaneously. | `oo==>oo` | +| 6 | **Codecycle graph** | Multiple codecycles + the data they read and write. | `boxes + arrows` | + +**The 7 modifiers** (full table): + +| Modifier | SSDL | Meaning | +|---|---|---| +| `[T]` | terminator | The instruction that *ends* a codepath (return, exit, etc.) | +| `[B]` | branch | A point where control flow forks based on a condition | +| `[M]` | merge | A point where control flow re-converges | +| `[S]` | stateful | Marks an instruction that *mutates* persistent state | +| `[Q]` | query | Marks an instruction that reads persistent state | +| `[N]` | nil sentinel | A special value that satisfies "is this OK to use?" in all cases | +| `───` | data | A line representing data being read or written (not a codepath) | + +#### A.2.7 Cluster 6 — Project's own Command DSL Precedents (full text) + +The 33 Command Palette commands in `src/commands.py` (per `docs/guide_command_palette.md` and the file itself). 
The full list with categories (regenerated from `@registry.register` decorators): + +| Category | ID | Title | +|---|---|---| +| AI | `reset_session` | Reset Session | +| AI | `clear_discussion` | Clear Discussion | +| AI | `add_all_files_to_context` | Add All Files To Context | +| AI | `generate_md_only` | Generate MD Only | +| Project | `open_project` | Open Project | +| Project | `save_project` | Save Project | +| Project | `save_all` | Save All | +| View | `toggle_text_viewer` | Toggle Text Viewer | +| View | `toggle_diagnostics` | Toggle Diagnostics | +| View | `toggle_usage_analytics` | Toggle Usage Analytics | +| View | `toggle_context_preview` | Toggle Context Preview | +| View | `toggle_tier1_strategy` | Toggle Tier 1: Strategy | +| View | `toggle_tier2_tech_lead` | Toggle Tier 2: Tech Lead | +| View | `toggle_tier3_workers` | Toggle Tier 3: Workers | +| View | `toggle_tier4_qa` | Toggle Tier 4: QA | +| View | `toggle_external_tools` | Toggle External Tools | +| View | `toggle_shader_editor` | Toggle Shader Editor | +| View | `toggle_undo_redo_history` | Toggle Undo/Redo History | +| View | `toggle_command_palette` | Toggle Command Palette | +| View | `show_all_panels` | Show All Panels | +| View | `hide_all_panels` | Hide All Panels | +| View | `reset_layout` | Reset Layout | +| Layout | `save_workspace_profile` | Save Workspace Profile | +| Layout | `show_workspace_manager` | Show Workspace Manager | +| Tools | `trigger_hot_reload` | Hot Reload | +| Edit | `undo` | Undo | +| Edit | `redo` | Redo | +| Theme | `switch_to_dark_theme` | Switch To Dark Theme | +| Theme | `switch_to_light_theme` | Switch To Light Theme | +| Theme | `switch_to_nerv_theme` | Switch To NERV Theme | +| Theme | `cycle_theme` | Cycle Theme | +| Help | `show_documentation` | Show Documentation | +| Help | `show_command_palette_help` | Show Command Palette Help | + +The DSL's verbs are a *richer* superset. Each Command Palette command can be expressed as a DSL verb call. 
For example, `save_all` ↔ `write "config.toml" + write "project.toml" + audit "saved all"`. The DSL is more expressive but also more verbose; the Command Palette is the quick-access surface for the 1-click operations. + +#### A.2.8 Cluster 7 — Data-Oriented Error Handling Convention (full text) + +Per `conductor/tracks/data_oriented_error_handling_20260606/spec.md` §3.3. The DSL's `try { ... } recover { ... }` block returns a `Result[T]`. The full `Result[T]` and `ErrorInfo` dataclasses: + +```python +# Standard-library imports and the type variable the excerpt relies on +from dataclasses import dataclass, field +from enum import Enum +from typing import Generic, TypeVar + +T = TypeVar("T") + +class ErrorKind(str, Enum): + NETWORK = "network" + AUTH = "auth" + QUOTA = "quota" + RATE_LIMIT = "rate_limit" + BALANCE = "balance" + PERMISSION = "permission" + NOT_FOUND = "not_found" + INVALID_INPUT = "invalid_input" + NOT_READY = "not_ready" + UNKNOWN = "unknown" + CONFIG = "config" + INTERNAL = "internal" + PROVIDER_HISTORY_DIVERGED_FROM_UI = "provider_history_diverged_from_ui" + +@dataclass(frozen=True) +class ErrorInfo: + kind: ErrorKind + message: str + source: str = "" # which subsystem produced it + original: BaseException | None = None + +@dataclass(frozen=True) +class Result(Generic[T]): + data: T + errors: list[ErrorInfo] = field(default_factory=list) + @property + def ok(self) -> bool: return not self.errors +``` + +The DSL's verbs return `Result[T]`-wrapped values; the `try { ... } recover { ... }` block unwraps them. A `recover` block receives the `Result[T]` as a parameter and can inspect `.errors` to decide what to do. + +--- + +### A.3 Section 3 Deep-Dive: Formal Grammar Specification + +This subsection provides the formal EBNF grammar for the DSL. The main report's §3 has the 14 primitives table; this appendix has the complete formal spec, tokenizer rules, parser semantics, and edge cases. 
+ +#### A.3.1 EBNF Grammar (Extended Backus-Naur Form) + +```ebnf +(* ============================================================ *) +(* Intent-Based Scripting Language — Formal Grammar (EBNF) *) +(* ============================================================ *) + +program = { statement } ; + +statement = function_def + | procedure_def + | pipeline + | conditional + | loop + | return_stmt + | try_recover + | sandbox + | tape + | tape_basic + | assign + | assertion + | audit + | fuzzy + | didyoumean + | span + | offset + | assumewide ; + +(* --- Function definitions --- *) +function_def = identifier "(" [ arg_list ] ")" "->" type "{" program "}" ; +procedure_def = identifier "(" [ arg_list ] ")" "proc" "{" program "}" ; +arg_list = identifier { "," identifier } ; +type = scalar_type | array_type ; +scalar_type = "Scalar" | "Integer" | "String" | "Boolean" ; +array_type = "Vector" | "Matrix" | "[" scalar_type "]" ; + +(* --- Pipeline (the core of the DSL) --- *) +pipeline = expression { "->" expression } ; +expression = verb_call | literal | identifier | "(" expression ")" ; +verb_call = identifier [ arg_list ] ; +arg_list = literal | identifier | "(" expression ")" ; +literal = string_literal | number_literal ; + +(* --- Conditional --- *) +conditional = "if" expression "{" program "}" [ "else" "{" program "}" ] ; +loop = "for" identifier ".." 
expression "{" program "}" ; +return_stmt = "return" [ expression ] ; + +(* --- Block types (from Onat's brackets) --- *) +tape = "tape" "{" { tape_basic | assign | pipeline } "}" ; +tape_basic = "[" { statement } "]" ; +try_recover = "try" "{" program "}" "recover" identifier "{" program "}" ; +sandbox = "sandbox" "{" program "}" ; +fuzzy = "fuzzy" expression ; + +(* --- Local binding --- *) +assign = identifier ":=" expression ; + +(* --- Audit and introspection --- *) +audit = "audit" expression ; +assertion = "assert" "->" expression "=" expression ; +didyoumean = "didyoumean" string_literal ; +span = "span" expression ; +offset = "offset" expression ; +assumewide = "assumewide" expression ; + +(* --- Lexical --- *) +identifier = letter { letter | digit | "_" } ; +string_literal = '"' { character } '"' ; +number_literal = digit { digit } [ "." digit { digit } ] ; +letter = "A" | ... | "Z" | "a" | ... | "z" ; +digit = "0" | ... | "9" ; +character = ? any printable character except '"' ? ; +``` + +#### A.3.2 Precedence Table + +| Precedence | Operators | Associativity | Example | +|---|---|---|---| +| Highest | `()` (grouping) | N/A | `(a + b) * c` | +| High | `:=` (local bind) | None (statement-level) | `result := expr` | +| Medium | `<-` (input binding) | None (postfix on producer) | `for col <- .. m.columns` | +| Low | `->` (pipeline flow) | Left-to-right | `a -> b -> c` is `(a -> b) -> c` | +| Lowest | `{ }`, `[ ]` (block delimiters) | N/A (scoping) | `tape { [ scan ]; [ filter ] }` | + +**Key rule:** there is no operator precedence for arithmetic (`+`, `-`, `*`, `/`, `^`). All are left-associative with equal precedence. The user must use `()` for grouping. This is the APL/K convention: equality of precedence forces explicit grouping. + +#### A.3.3 Tokenizer Rules + +The tokenizer is whitespace-delimited, case-sensitive, and supports 3 quoting forms: + +1. 
**Bare word:** `scan`, `filter`, `for`, `read` — whitespace-delimited; reserved words are the 14 grammar primitives + the 42 verbs. +2. **String literal:** `"src/foo.py"` — double-quoted; supports `\"` and `\\` escapes; no multiline. +3. **Number literal:** `42`, `3.14`, `0.5` — digit-run with optional decimal point; no scientific notation; no hex/octal/binary. + +**Comment syntax:** `//` starts a line comment; `/* */` starts a block comment. (Inferred from the user's pseudocode which uses `//` for comments. The grammar above does not include comments; they are stripped by the tokenizer.) + +**Special characters:** +- `->` is parsed as a single token, not as `-` followed by `>`. +- `<-` is parsed as a single token, not as `<` followed by `-`. +- `:=` is parsed as a single token, not as `:` followed by `=`. +- `..` is parsed as a single token, not as `.` followed by `.`. +- `=` (in `assert -> a = b`) is parsed as a single token (equality, not assignment). + +#### A.3.4 Parser Semantics + +**Statement parsing:** the parser reads statements left-to-right until EOF. Statements are delimited by line breaks (significant) or `;` (explicit). The parser is *not* whitespace-significant within a line; indentation does not affect parsing. (This is explicitly opposite to Python's INDENT/DEDENT model; the user said "ignore its record formats as they problably will be less xml/json based as I don't like them" and explicitly does not want indent-sensitive syntax.) + +**Pipeline parsing:** `a -> b -> c` parses as `(a -> b) -> c`. The leftmost stage's output becomes the rightmost stage's input. Grouping with `()` overrides the default. + +**Function body parsing:** inside `{ }`, the body is a program. The function body's scope includes the formal parameters (which live on the stack) and any `:=` bindings within. The function returns the value of the last expression in the body, or `NIL` if the body is empty. 
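
The tokenizer rules in A.3.3 can be sketched as a single regular expression. This is a minimal illustration, not the follow-up interpreter: `TOKEN_RE` and `tokenize` are hypothetical names, the set of single-character operators is an assumption drawn from the verb reference, and reserved-word classification is left out.

```python
import re
from typing import List, Tuple

# Alternatives are tried in order, so comments and multi-character tokens win
# over the single-character punctuation they start with.
TOKEN_RE = re.compile(
    r'''
    (?P<comment>//[^\n]*|/\*.*?\*/)        # line and block comments (dropped)
  | (?P<string>"(?:\\["\\]|[^"\\\n])*")    # double-quoted, single-line, backslash escapes
  | (?P<number>\d+(?:\.\d+)?)              # digit run with optional decimal part
  | (?P<op>->|<-|:=|\.\.)                  # multi-character tokens, tried before punct
  | (?P<punct>[{}\[\]();,.=+\-*/^!<>])     # delimiters and single-character operators
  | (?P<word>[A-Za-z][A-Za-z0-9_]*)        # bare word: identifier, keyword, or verb
  | (?P<space>\s+)                         # whitespace only separates tokens
    ''',
    re.VERBOSE | re.DOTALL,
)


def tokenize(source: str) -> List[Tuple[str, str]]:
    """Return (kind, text) pairs; comments and whitespace are stripped."""
    tokens: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            # Covers unterminated strings and unknown characters.
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup not in ("comment", "space"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# Usage: `<-` and `..` come out as single tokens, as the rules above require.
print(tokenize("for col <- .. m.columns"))
```

Ordering the alternatives is the design choice that matters here: trying `->`, `<-`, `:=`, and `..` before the single-character class is what keeps `->` from being read as `-` followed by `>`.
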
+ +**Block composition:** `[ ]` is a *compilation unit* (per Onat's basic blocks). `{ }` is a *lexical scope* (per standard block scoping). `tape { }` is a *tape-drive region* (per KYRA's magenta pipe). `sandbox { }` is an *IEventTarget boundary* (per O'Donnell). These block forms are nested by the parser: `tape { foo := x; [ bar ]; baz }` is a tape region containing a local bind, a basic block, and a trailing call. + +#### A.3.5 Edge Cases and Disambiguation Rules + +1. **`:` in type annotations vs. `:=` assignment.** In `func(m : Matrix, row : Scalar)`, the `:` separates the parameter name from its type. Inside the body, `:=` is the assignment operator. There is no ambiguity because `:=` is tokenized as a single token, so a bare `:` always introduces a type. +2. **`<-` vs. `-` in pipeline.** `a -> b <- for col .. m.columns` is a single stage: read `b` from the producer `for col .. m.columns`. The parser tokenizes `<-` as a single token, not as `<` followed by `-`. +3. **`for x .. n` vs. `result[a, b]`.** `..` is a 2-character token for range; `,` is a 1-character token for indexing. The parser reads `for x .. 10` as one statement and `result[1, 2]` as one indexing expression. They are distinguishable by context. +4. **`return value` vs. `value` as last expression.** In a function body, the last expression's value is the return value. `return value` is an *explicit* early-return. The two are equivalent if the last expression is the only return path. +5. **`if cond { then } else { else }` vs. nested `if`.** `else` binds to the nearest `if` (standard grammar). Use explicit `{ }` nesting to override if needed. +6. **`tape { [ x ]; [ y ] }` vs. `tape { x; y }`.** The first is 2 basic blocks (compiled separately, dispatched in order). The second is 2 statements in the same scope. The first allows the JIT to emit separate instruction sequences; the second forces them to share a scope. 
+ +#### A.3.6 AI-Fuzzing Tolerance Rules (full) + +These rules are what make the DSL workable for AI agents that may fuzz verb names, indent inconsistently, or offset line references. They are the parser's *recovery protocol* when standard parsing fails. + +1. **Modulo indexing (from CoSy).** Array indices wrap. `result[-1]` is `result[result.len - 1]`. `result[100]` on a 5-element array is `result[100 mod 5] = result[0]`. This forgives off-by-one errors in line references. +2. **Structured recovery anchors (per Onat's brackets).** The `{ }` block is a recovery unit. If the parser cannot parse the body, the entire block is replaced with `NIL` and the error is reported at the block level, not at the line level. This avoids cascading parse failures. +3. **Line/offset independence.** The parser uses *token positions*, not raw line numbers. A token's position is `file:token-index`, not `file:line-number`. The mapping from token position to line number is a presentation concern, not a parse concern. (Matches the project's existing FuzzyAnchor pattern per `docs/guide_context_curation.md`.) +4. **Verb-name fuzzing tolerance.** The `didyoumean` verb proposes corrections for ambiguous verb names. The parser's "best guess" recovery path is configurable: strict (reject on typo), lenient (auto-correct if Levenshtein distance ≤ 2), or fuzzy (parse the rest, log the typo). The default is `lenient`. +5. **Indentation tolerance.** Indentation is *not* significant. The parser uses a stack-based approach; the `{ }` and `[ ]` delimiters are the only structure-aware tokens. (Explicit per the user's "ignore its record formats" instruction; the user does not want Python's indent-sensitive syntax.) +6. **Line continuation.** `\` at the end of a line continues the statement to the next line. This lets long pipelines be split across lines for readability. (Per standard concatenation-language convention.) +7. **Block-comment tolerance.** `/* ... 
*/` comments are stripped at tokenization time. If a comment contains an unmatched `*/` (e.g., the comment says `*/` literal), the tokenizer reports an error and exits. (Standard behavior; no special fuzzing tolerance.) +8. **String-literal tolerance.** Strings are delimited by `"`. If the AI emits an unterminated string, the tokenizer reports an error and the parser substitutes `NIL` for the entire statement (not cascading to the next statement). + +#### A.3.7 Error Envelope: `try { ... } recover { ... }` + +```ebnf +try_recover = "try" "{" program "}" "recover" identifier "{" program "}" ; +``` + +**Semantics:** +- The `try` block is evaluated. If it returns a `Result[T]` with `errors` non-empty, the `recover` block runs. +- The `recover` block receives the `Result[T]` as a parameter (named by the user; `err` is the default convention). +- The `recover` block must return a `Result[T]` (or `NIL` to short-circuit). +- If the `recover` block itself returns a `Result[T]` with errors, those errors are *appended* to the outer `Result[T]`'s error list. (Per Fleury's "errors are data" pattern; per `data_oriented_error_handling_20260606/spec.md` §3.4.) + +**Example:** +``` +try { + read "src/foo.py" -> filter !exists -> print +} recover err { + audit "read failed: " + err + return NIL +} +``` + +If `read "src/foo.py"` fails (e.g., file not found), the `recover` block runs, logs the error via `audit`, and returns `NIL`. The outer program continues. + +--- + +### A.4 Section 4 Deep-Dive: Full Vocab Reference + +This subsection provides the complete reference for all 42 verbs in the 4 tiers. The main report's §4 has the summary tables; this appendix has the full signatures, semantics, examples, edge cases, and MCP tool mappings. + +#### A.4.1 Tier 1 — Math (12 verbs, full reference) + +**`name := value` (Local bind).** +- Signature: `name := expr` +- Semantics: Binds `expr` to a local variable `name` in the current scope. 
Stack-scoped: the binding is destroyed when the enclosing `{ }` block ends. +- Returns: `NIL` (binding is a statement, not an expression) +- Edge cases: rebinding an existing name in the same scope is allowed (shadows the outer binding). Cross-scope reference is not allowed. +- Examples: + - `result := Matrix(m.rows 1 -, m.columns 1 -)` (mixed: assignment and function call are infix, math is postfix) + - `x := 42` + - `name : Type := value` (with type annotation) +- SSDL shape: `[I]` (single instruction, no control flow) + +**`stack { decl1; decl2; ... }` (Stack scope).** +- Signature: `stack { decl; decl; ... }` +- Semantics: A block that introduces a new lexical scope. All `:=` bindings within are destroyed when the block ends. +- Returns: the value of the last expression in the block +- Edge cases: empty `stack { }` returns `NIL`. Nested `stack { stack { ... } }` works correctly (inner bindings destroyed first). +- Examples: + - `stack { result := ...; row_offset := Scalar; col_offset := Scalar }` (per user's pseudocode; assignment is infix, math is postfix) + - `stack { temp := a b +; use temp }` (temp doesn't leak; `+` is postfix) +- SSDL shape: `[I]` (single instruction, no control flow) + +**`name : Type` (Annotation).** +- Signature: `name : Type` +- Semantics: Type hint on a binding. Currently informational only; the parser does not enforce type conformance. The follow-up interpreter (B track) may add optional type checking. +- Returns: N/A (annotation is a modifier on a binding) +- Edge cases: type names are the 5 scalar types + array types. Unknown type names are accepted but flagged. +- Examples: + - `m : Matrix` (per user's pseudocode line 24) + - `row_offset, col_offset : Scalar` (per user's pseudocode line 29) +- SSDL shape: N/A (modifier, not an instruction) + +**`func(args) -> Type { body }` (Function def).** +- Signature: `func_name "(" [arg1, arg2, ...] ")" "->" ReturnType "{" program "}"` +- Semantics: Defines a named function. 
The args are the formal parameters; they live on the implicit stack and are accessed positionally (not by name). The body's last expression is the return value. +- Returns: the function's compiled form (a callable). Functions are first-class values. +- Edge cases: zero-arg functions `func() -> T { body }` are allowed. Recursive functions work (the name is in scope within the body). +- Examples: + - `determinate(m, row) -> Scalar { ... }` (per user's pseudocode line 22) + - `factorial(n) -> Scalar { if n <= 1 { return 1 }; return n * factorial(n - 1) }` (recursive) +- SSDL shape: `[I]` (function definition is a single compile-time instruction) + +**`name(...) proc { body }` (Procedure def).** +- Signature: `func_name "(" [args] ")" "proc" "{" program "}"` +- Semantics: Defines a procedure (void-returning function). Same as function def but with no return value. +- Returns: NIL (procedures don't have return values) +- Edge cases: same as function def. The user's pseudocode has `minor(...) proc { ... }` which is interpreted as a procedure. +- Examples: + - `minor(m, row_omit, column_omit) -> Scalar proc { ... }` (per user's pseudocode line 31; note the `proc` placement ambiguity) + - `log(msg) proc { audit "log: " + msg }` +- SSDL shape: `[I]` + +**`for x .. n { body }` (Range iteration).** +- Signature: `for x .. upper { body }` +- Semantics: Iterates `x` from 0 to `upper - 1` (inclusive). For each `x`, evaluates `body`. Returns NIL. +- Returns: NIL (iteration is a statement) +- Edge cases: `for x .. 0 { body }` skips the body entirely (zero iterations). `for x .. 1 { body }` runs the body once with `x = 0`. +- Examples: + - `for col .. m.columns { body }` (per user's pseudocode line 33) + - `for i .. 10 { print i }` (prints 0-9) + - `for i .. 
arr.size { sum +:= arr[i] }` (sum a vector) +- SSDL shape: `o==>` (codecycle; loops back to the first instruction after the last) + +**`name[a, b, ...]` (Bracket indexing).** +- Signature: `name[i, j, k, ...]` +- Semantics: Multi-dimensional array access. Indices are 0-based. The DSL's CoSy-style modulo indexing (per the AI-fuzzing tolerance rules) wraps out-of-range indices. +- Returns: the element at the given index +- Edge cases: 1D `name[i]`, 2D `name[i, j]`, ND `name[i, j, k, ...]`. Negative indices wrap (e.g., `name[-1]` is the last element). +- Examples: + - `result[row - row_offset, col - col_offset]` (per user's pseudocode line 38) + - `m[row][col]` (per user's pseudocode line 24; see ambiguity flag #3 in the main report) +- SSDL shape: `[Q]` (query; reads but doesn't mutate) + +**`+`, `-`, `*`, `/`, `^` (Arithmetic).** +- Signatures: `a + b`, `a - b`, `a * b`, `a / b`, `a ^ b` +- Semantics: Element-wise arithmetic. All are left-associative with equal precedence. Use `()` for grouping. +- Returns: the result of the arithmetic +- Edge cases: `0 / 0` returns `NIL` and emits a `DIVIDE_BY_ZERO` error. `a ^ b` for non-integer `b` is `pow(a, b)`. `a ^ -1` is `1 / a`. +- Examples: + - `2 + 3` (yields 5) + - `m.rows - 1` (yields one less than the row count) + - `(row + col_offset) * 2` (yields the sum multiplied by 2) +- SSDL shape: `[I]` + +**`sum`, `product` (Reductions).** +- Signatures: `sum expr`, `product expr` +- Semantics: Reduce an array to a scalar. `sum` adds all elements; `product` multiplies all elements. +- Returns: the reduced scalar +- Edge cases: empty array returns 0 for `sum`, 1 for `product`. +- Examples: + - `sum 1..10` (yields 55; per APL `+/⍳10`) + - `product 1..5` (yields 120; per APL `×/⍳5`) + - `sum arr` (sum an arbitrary array) +- SSDL shape: `o==>` (codecycle; iterates over elements) + +**`if cond { then } [else { else }]` (Conditional).** +- Signature: `if expression "{" program "}" [ "else" "{" program "}" ]` +- Semantics: Evaluates `cond`. 
If truthy, runs `then`; else runs `else` (if provided). +- Returns: the value of the branch that ran +- Edge cases: `cond` must be a Boolean. Truthy/falsy rules: `0` is false, anything else is true. `NIL` is false. +- Examples: + - `if col = col_omit { ++ col_offset; continue; }` (per user's pseudocode line 34) + - `if n > 0 { return n * factorial(n - 1) } else { return 1 }` (recursive base case) +- SSDL shape: `[B]` (branch point) + +#### A.4.2 Tier 2 — Data-Oriented Pipeline (12 verbs, full reference) + +**`scan path` (Read source).** +- Signature: `scan path_expr` +- Semantics: Reads a source — directory, file, URL, or archive. Returns a stream of records (the FileItem list per `docs/guide_models.md`). +- Maps to: `list_directory` + `search_files` + `read_file` (per `mcp_client.py`) +- Examples: + - `scan "src/"` (all files in src/) + - `scan "src/*.py"` (all .py files) + - `scan url "https://..."` (web URL) +- SSDL shape: `[I]` + +**`select condition` (Keep matching).** +- Signature: `select condition` +- Semantics: Keeps records where `condition` is true. Mirrors jq's `select(condition)`. +- Maps to: jq-style filter on FileItem +- Examples: + - `scan "src/" -> select .extension == ".py"` + - `scan "src/" -> select .size > 1024` +- SSDL shape: `===>` (codepath) + +**`filter predicate` (Keep matching).** +- Signature: `filter predicate` +- Semantics: Same as `select`, but with a per-record predicate. Mirrors Jofito's `{filter ...}` predicate. +- Maps to: predicate on FileItem +- Examples: + - `scan "src/" -> filter .size > 0` + - `scan "src/" -> filter exists` +- SSDL shape: `===>` + +**`map block` (Apply to each).** +- Signature: `map block` +- Semantics: Applies `block` to each record. Returns a new stream of transformed records. 
+- Maps to: jq `.[] | .field`, Joy `map`, CoSy monadic `each` +- Examples: + - `scan "src/" -> map ext` (yields list of extensions) + - `scan "src/" -> map { x := .name; "src/" x + }` (full paths; math `+` is postfix, assignment is infix) +- SSDL shape: `o==>` (codecycle; iterates over records) + +**`fold init block` (Reduce to single).** +- Signature: `fold init_value block` +- Semantics: Reduces to a single value. `init_value` is the initial accumulator; `block` is the binary function. +- Maps to: jq `reduce`, Joy `fold` +- Examples: + - `scan "src/" -> fold 0 { acc + .size }` (total bytes) + - `scan "src/" -> fold "" { acc + .name + "\n" }` (concatenated names) +- SSDL shape: `o==>` + +**`sort key` (Order).** +- Signature: `sort key` +- Semantics: Orders records by `key`. Returns a new sorted stream. +- Maps to: Joy `qsort`, jq `sort` +- Examples: + - `scan "src/" -> sort .name` + - `scan "src/" -> sort .size desc` (descending) +- SSDL shape: `[I]` + +**`group key` (Bucket).** +- Signature: `group key` +- Semantics: Groups records by `key`. Returns a stream of sub-streams. +- Maps to: jq `group_by`, CoSy APL-derived +- Examples: + - `scan "src/" -> group .extension` (group by file extension) + - `scan "src/" -> group .directory` (group by parent directory) +- SSDL shape: `o==>` (codecycle) + +**`dedupe` (Remove duplicates).** +- Signature: `dedupe` +- Semantics: Removes duplicate records (by all fields). Returns a new stream. +- Maps to: jq `unique`, CoSy APL-derived +- Examples: + - `scan "src/" -> dedupe` + - `scan "src/" -> dedupe .name` (dedupe by name only) +- SSDL shape: `[I]` + +**`tape { body }` (Tape-drive region).** +- Signature: `tape { body }` +- Semantics: Declares a tape-drive memory region. The body's contents are pre-scattered into a contiguous buffer. The interpreter may emit special code for this (JIT optimization). 
+- Maps to: (compiler directive, not a tool call) +- Examples: + - `tape { [ scan ]; [ filter ]; [ print ] }` (3 basic blocks in a tape region) + - `tape { x := 42; y := x 2 *; use y }` (local tape; assignment is infix, math `*` is postfix) +- SSDL shape: `o==>` (codecycle; the region is a single execution unit) + +**`scatter workers` (Fork).** +- Signature: `scatter workers_expr` +- Semantics: Forks the pipeline across `workers_expr` cores. The leader/chaser model (per Jofito). +- Maps to: (runtime hint, not a tool call) +- Examples: + - `scan "src/" -> scatter 4 -> filter` (4-way parallel) + - `scan "src/" -> scatter auto -> map` (auto-detect core count) +- SSDL shape: `===>W===>` (wide codepath) + +**`gather` (Collect).** +- Signature: `gather` +- Semantics: Collects scattered sub-streams into a single stream. The inverse of `scatter`. +- Maps to: (runtime hint) +- Examples: + - `scan "src/" -> scatter 4 -> filter -> gather` +- SSDL shape: `[I]` + +**`pipe` (Chain root).** +- Signature: `pipe` +- Semantics: Explicit chain root. Synonym for `->`. Used when a pipeline is a top-level expression (not part of a larger expression). +- Maps to: (no tool call; this is a pipeline construct) +- Examples: + - `pipe [ scan, filter, print ]` (declarative list form; not the canonical syntax, future extension) + - `pipe` (single token; equivalent to starting a `->` chain) +- SSDL shape: `===>W===>` + +#### A.4.3 Tier 3 — Shell (10 verbs, full reference) + +**`exec cmd` (Execute).** +- Signature: `exec cmd_expr` +- Semantics: Runs a shell command. The escape hatch — when no Tier 2 verb applies. +- Maps to: `run_powershell` (per `docs/guide_shell_runner.md`) +- Examples: + - `exec "find . -name '*.py'"` + - `exec "ls -la"` +- SSDL shape: `[I]` + +**`open path` (Open).** +- Signature: `open path_expr` +- Semantics: Opens a file or URL in the default application. 
+- Maps to: `read_file` (initial) +- Examples: + - `open "src/foo.py"` + - `open "https://example.com"` +- SSDL shape: `[I]` + +**`read path` (Read).** +- Signature: `read path_expr` +- Semantics: Reads file content. Returns the file's text. +- Maps to: `read_file` +- Examples: + - `read "src/foo.py"` + - `read file.path` (where `file` is a FileItem) +- SSDL shape: `[I]` + +**`write path content` (Write).** +- Signature: `write path_expr content_expr` +- Semantics: Writes content to a file. Overwrites if exists. +- Maps to: `set_file_slice` or `edit_file` +- Examples: + - `write "src/foo.py" "new content"` + - `write file.path "updated text"` +- SSDL shape: `[I]` + +**`close handle` (Close).** +- Signature: `close handle_expr` +- Semantics: Closes a file handle. Mostly a no-op in Python (which auto-closes), but explicit for symmetry. +- Maps to: (no direct equivalent; close is implicit) +- Examples: + - `close file_handle` +- SSDL shape: `[I]` + +**`path` (Get path).** +- Signature: `path` +- Semantics: Returns the current path (or `cd`'s to it if an argument is given). +- Maps to: (no direct equivalent; use `cwd` for current directory) +- Examples: + - `path` (returns current path) + - `path "/some/dir"` (cd) +- SSDL shape: `[I]` + +**`env var` (Env var).** +- Signature: `env var_expr` +- Semantics: Returns the value of the environment variable `var`. +- Maps to: (no direct equivalent) +- Examples: + - `env HOME` (returns `/home/user` or similar) + - `env PATH` +- SSDL shape: `[I]` + +**`wait ms` (Wait).** +- Signature: `wait ms_expr` +- Semantics: Blocks for `ms_expr` milliseconds. Returns NIL. +- Maps to: (no direct equivalent; `shell sleep`) +- Examples: + - `wait 1000` (wait 1 second) + - `wait 50` (wait 50ms; useful for pacing a render loop) +- SSDL shape: `o==>` (codecycle) + +**`poll handle ms` (Poll).** +- Signature: `poll handle_expr ms_expr` +- Semantics: Polls `handle_expr` with `ms_expr` timeout. Returns true if data is available. 
+- Maps to: (no direct equivalent; `shell read -t`)
+- Examples:
+  - `poll file_handle 5000` (poll with 5s timeout)
+- SSDL shape: `o==>`
+
+**`cwd` (Current working dir).**
+- Signature: `cwd`
+- Semantics: Returns the current working directory.
+- Maps to: (no direct equivalent; `shell pwd`)
+- Examples:
+  - `cwd` (returns `/home/user/project` or similar)
+- SSDL shape: `[I]`
+
+#### A.4.4 Tier 4 — AI-Fuzzing Tolerance (8 verbs, full reference)
+
+**`fuzzy expr` (Fuzzy match).**
+- Signature: `fuzzy expression`
+- Semantics: Declares a parse-tolerance region. The parser accepts near-matches (Levenshtein distance ≤ 2) for verb names within the expression.
+- Maps to: (parser mode, not a tool call)
+- Examples:
+  - `fuzzy { scan "src/" -> filter .ext }` (accepts `skan` as a typo for `scan`)
+  - `fuzzy { skan "src" }` (accepts the typo)
+- SSDL shape: `===>`
+
+**`try { body } recover err { fallback }` (Try / Recover).**
+- Signature: `try "{" program "}" "recover" identifier "{" program "}"`
+- Semantics: Evaluates `body`. If it returns a `Result[T]` with errors, evaluates `fallback` with the `Result[T]` bound to `err`.
+- Maps to: (envelope; the verbs inside may map to MCP tools)
+- Examples:
+  - `try { read "src/foo.py" } recover err { read "src/Foo.py" }` (fall back to the capitalized filename)
+  - `try { read "src/foo.py" -> filter !exists -> print } recover err { audit "scan failed: " + err; return NIL }`
+- SSDL shape: `===>B===>` (codepath with branch)
+
+**`sandbox { body }` (Sandbox).**
+- Signature: `sandbox "{" program "}"`
+- Semantics: IEventTarget boundary. All writes in `body` go through the formal event channel (the bridge script's HITL approval modal).
+- Maps to: (envelope; same as try, but the semantics is "approval gate" not "error recovery") +- Examples: + - `sandbox { write "tmp/x" "data" }` (write goes through HITL approval) + - `sandbox { exec "rm -rf /" }` (destructive exec requires approval) +- SSDL shape: `o==>` (codecycle; the region is an audit unit) + +**`audit msg` (Audit log).** +- Signature: `audit message_expr` +- Semantics: Logs the state change to a structured record (timestamp, source, kind, payload). The IEventTarget itself. +- Maps to: (writes to `comms.log` or new `audit.log`) +- Examples: + - `audit "wrote tmp/x"` + - `audit "wrote: " + content.size + " bytes to " + path` +- SSDL shape: `[I]` + +**`didyoumean ambiguous` (Did you mean).** +- Signature: `didyoumean string_literal` +- Semantics: Proposes the closest matching verb(s) for an ambiguous input. Returns a list of suggestions. +- Maps to: (parser-level recovery verb; the result is suggestions, not a tool call) +- Examples: + - `didyoumean "skan"` (returns `["scan", "skin"]` or similar) +- SSDL shape: `[I]` + +**`span intent` (Span).** +- Signature: `span intent_expr` +- Semantics: Decomposes a compound intent into a span of sub-MCP grammar tokens. For example, `span "read foo.py:MyClass"` becomes `read_file(foo.py)` + `py_get_definition(MyClass)`. +- Maps to: (multi-tool dispatch; uses the sub-MCP grammar) +- Examples: + - `span "read foo.py:MyClass.method"` + - `span "edit src/foo.py:42-50 with new content"` +- SSDL shape: `[I]` + +**`offset symbol` (Offset).** +- Signature: `offset symbol_expr` +- Semantics: Resolves a symbol to a file:line without requiring the model to specify the line. For example, `offset "foo.py:MyClass.method"` returns the file:line of `MyClass.method` in `foo.py`. 
+- Maps to: (multi-tool dispatch; uses py_get_symbol_info or similar)
+- Examples:
+  - `offset "foo.py:MyClass.method"`
+- SSDL shape: `[Q]`
+
+**`assumewide intent` (Assume wide).**
+- Signature: `assumewide intent_expr`
+- Semantics: If the intent is broad or ambiguous, select the most-capable matching tool (the "fewer, more capable" heuristic).
+- Maps to: (multi-tool dispatch)
+- Examples:
+  - `assumewide "refactor"` (selects the broadest refactor tool)
+  - `assumewide "find"` (selects a broad search rather than a narrow query)
+- SSDL shape: `===>W===>`
+
+---
+
+### A.5 Section 5 Deep-Dive: Hardware Mapping Tables
+
+This subsection provides the detailed verb-to-hardware-stage mapping. The main report's §5 has the 4 anchor claims; this appendix has the full mapping tables.
+
+#### A.5.1 Register Allocation
+
+| Verb | Input Register | Output Register | Temp Registers | Notes |
+|---|---|---|---|---|
+| `+`, `-`, `*`, `/` | RAX (left), RDX (right) | RAX (result) | — | SIMD horizontal if array |
+| `^` | RAX (base), RDX (exp) | RAX (result) | — | pow() |
+| `sum`, `product` | RAX (array) | RAX (scalar) | RDX (accumulator) | Codecycle |
+| `for x .. n` | RDX (n) | RAX (loop counter) | — | x is in RDX; body uses it |
+| `name[i, j]` | RAX (base), RDX (i), RCX (j) | RAX (element) | — | Multi-dim |
+| `scan path` | RAX (path) | RAX (FileItem list) | RDX (handle) | mcp_client.list_directory |
+| `filter` | RAX (predicate), RDX (record) | RAX (bool) | — | One record at a time |
+| `map` | RAX (block), RDX (record) | RAX (transformed) | RCX (counter) | Codecycle |
+| `fold` | RAX (init), RDX (record) | RAX (accumulator) | — | Codecycle |
+| `tape { }` | — | — | R12D (base pointer for the tape) | Per KYRA's "tape drive" |
+| `sandbox { }` | — | — | — | IEventTarget boundary |
+| `audit` | RAX (msg) | — | — | Writes to audit log |
+
+**Note:** The above is a *design target* for the interpreter. The actual register allocation is implementation-defined. The interpreter (follow-up B track) may use a stack-based VM (with implicit registers) or a register-based VM (with explicit registers). The mapping above assumes the Onat/Lottes 2-register model.
+
+#### A.5.2 Memory Layout
+
+```
++--------------------------------------+
+| Code (compiled verbs)                | <- RAX holds the current verb's address
++--------------------------------------+
+| Stack (transient values)             | <- RAX/RDX for top 2; deeper values in memory
++--------------------------------------+
+| Tape block                           | <- Per `tape { }` block
++--------------------------------------+
+| Sandbox (audit-trail region)         | <- Per `sandbox { }` block
++--------------------------------------+
+| Globals (named constants, env)       | <- Lottes's "register file as aliased global namespace"
++--------------------------------------+
+```
+
+**Key invariant:** `tape { }` and `sandbox { }` blocks are *nested* scopes with their own memory regions. A nested `tape { }` inside a `sandbox { }` is a tape region inside an audit-trail region. The interpreter's memory manager must track these regions and free them when the block exits.
+
+#### A.5.3 FFI Bridge
+
+When a Tier 2 or Tier 3 verb calls into MCP (per the bridge script), the call goes through the FFI bridge:
+
+1. **Pass arguments** via the preemptive scatter model (per Onat's tape drive). The arguments are written to a contiguous memory region before the call.
+2. **Issue the call** to `mcp_client.dispatch()`. The dispatch function maps the DSL verb + args to the underlying MCP tool call.
+3. **Collect the result** into the tape region. The result is a string (per `mcp_client.py`'s tool signatures).
+4. **Parse the result** (if structured) and place it on the stack as the verb's return value.
+5. **Audit the call** (if inside a `sandbox { }` block). The audit log records: timestamp, source verb, args (stringified), result (stringified), duration, error info (if any).
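The five FFI bridge steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the bridge script itself: `call_verb`, `fake_dispatch`, `VERB_TO_TOOL`, and `AuditRecord` are hypothetical names, the report does not show the real signature of `mcp_client.dispatch()`, and a JSON string stands in for the contiguous argument region.

```python
import json
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical verb -> MCP tool table; the real mapping lives in the bridge script.
VERB_TO_TOOL: Dict[str, str] = {"scan": "list_directory", "read": "read_file"}


@dataclass
class AuditRecord:
    """Fields follow step 5: timestamp, verb, args, result, duration, error."""
    timestamp: float
    verb: str
    args: str
    result: str
    duration: float
    error: Optional[str]


def fake_dispatch(tool: str, args: Dict[str, Any]) -> str:
    """Hypothetical stand-in for mcp_client.dispatch(); always returns a string."""
    if tool == "list_directory":
        return json.dumps(["a.py", "b.py"])
    return "plain text result"


def call_verb(
    verb: str,
    args: Dict[str, Any],
    dispatch: Callable[[str, Dict[str, Any]], str],
    audit_log: Optional[List[AuditRecord]] = None,
) -> Tuple[Any, Optional[str]]:
    # Step 1: write all arguments into one contiguous buffer before the call.
    arg_buffer = json.dumps(args, sort_keys=True)
    started = time.monotonic()
    raw, value, error = "", None, None
    try:
        # Step 2: issue the call; unknown verbs fail here with KeyError.
        raw = dispatch(VERB_TO_TOOL[verb], json.loads(arg_buffer))
        # Steps 3-4: collect the string result and parse it if it is structured.
        try:
            value = json.loads(raw)
        except json.JSONDecodeError:
            value = raw
    except Exception as exc:
        error = f"{type(exc).__name__}: {exc}"
    # Step 5: audit only when a log is supplied (i.e. inside a sandbox block).
    if audit_log is not None:
        audit_log.append(
            AuditRecord(time.time(), verb, arg_buffer, raw,
                        time.monotonic() - started, error)
        )
    return value, error
```

Passing `audit_log=None` models a call made outside any `sandbox { }` block: the call still runs, but nothing is recorded.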
+ +The FFI bridge is per the existing project's design (per `docs/guide_tools.md` §"MCP Bridge"), adapted to the DSL's verb dispatch. The bridge script (`scripts/cli_tool_bridge.py` analogue) is the runtime path. + +--- + +### A.6 Section 6 Deep-Dive: AI-Agent Implementation Notes + +This subsection provides per-claim implementation notes. The main report's §6 has the 10 claims; this appendix has the detailed implementation guidance. + +#### A.6.1 Claim 1 — Domain = Meta-Tooling (implementation notes) + +**Files involved:** None directly. The claim is a *placement* decision, not a code change. The DSL doesn't add code to the Application; it adds code to the Meta-Tooling. + +**Where the claim is enforced:** +- `docs/guide_meta_boundary.md` — the Application vs Meta-Tooling split (read-only) +- `scripts/cli_tool_bridge.py` (or its successor) — the bridge script that translates DSL → MCP calls (read-only from the DSL's perspective) + +**No code changes for this claim.** The DSL is a new file format; the Application's function-calling is unchanged. + +#### A.6.2 Claim 2 — Runtime path (implementation notes) + +**Files involved:** +- `scripts/cli_tool_bridge.py` — NEW or MODIFIED: the bridge script that translates DSL → MCP tool calls. Detects `sandbox { }` blocks, surfaces HITL modals via Hook API. +- `docs/guide_meta_boundary.md` — UPDATED: documents the DSL as a Meta-Tooling format. +- `src/bridge_dsl/parser.py` — NEW: the DSL parser (this is the interpreter prototype from follow-up B). +- `src/bridge_dsl/translator.py` — NEW: translates DSL verbs to MCP tool calls. + +**Where the claim is enforced:** +- The bridge script is the only runtime that sees the DSL directly. The Application sees MCP tool calls (unchanged). +- The Hook API surfaces HITL modals for `sandbox { }` blocks (per `docs/guide_tools.md` §"The Hook API" §"`/api/ask` Protocol"). 
+ +#### A.6.3 Claim 3 — 3-layer security (implementation notes) + +**Files involved:** +- `src/bridge_dsl/parser.py` — NEW: enforces that every DSL verb maps to a tool inside the 3-layer allowlist. +- `src/mcp_client.py:configure()` — read-only reference for the 3-layer allowlist (per `docs/guide_tools.md` §"The MCP Bridge"). +- `tests/test_bridge_dsl_security.py` — NEW: verifies that DSL statements targeting tools outside the allowlist are rejected. + +**Where the claim is enforced:** +- The parser runs the 3-layer allowlist check (allowlist construction, path validation, resolution gate) at parse time. Statements that fail are rejected before any tool call is made. + +#### A.6.4 Claim 4 — 4 memory dimensions (implementation notes) + +**Files involved:** +- `src/bridge_dsl/curation_query.py` — NEW: handles DSL queries against the curation memory (FileItem + ContextPreset + FuzzyAnchor). +- `src/bridge_dsl/discussion_query.py` — NEW: handles DSL queries against the discussion memory. +- `src/bridge_dsl/rag_query.py` — NEW: handles DSL queries against the RAG memory. +- `src/bridge_dsl/knowledge_query.py` — NEW: handles DSL queries against the knowledge memory (Candidate 11). + +**Where the claim is enforced:** +- Each memory dimension has a dedicated query path. The DSL is a *unified query language* across all 4. No single query path replaces another. + +#### A.6.5 Claim 5 — Stable-to-volatile cache ordering (implementation notes) + +**Files involved:** +- `src/ai_client.py:_send_anthropic()` and `_send_gemini()` — read-only references for the 4-breakpoint cache system (per `docs/guide_architecture.md` §"Anthropic Cache Strategy" and "Gemini Cache Strategy"). +- `src/bridge_dsl/cache_strategy.py` — NEW: enforces stable-to-volatile ordering for DSL-injected context. + +**Where the claim is enforced:** +- The DSL's `tape { }` blocks are a *stable* layer that can be cached across turns. The DSL's pipeline output is a *volatile* layer appended per turn. 
The cache strategy respects this ordering. + +#### A.6.6 Claim 6 — `Result[T]` envelope (implementation notes) + +**Files involved:** +- `src/result_types.py` — read-only reference for the `Result[T]` and `ErrorInfo` convention (per `conductor/tracks/data_oriented_error_handling_20260606/spec.md` §3.3). +- `src/bridge_dsl/result.py` — NEW: wraps every DSL verb's return value in `Result[T]`. + +**Where the claim is enforced:** +- The `try { ... } recover { ... }` block unwraps the `Result[T]`. The 12 `ErrorKind` values are the canonical error vocabulary. + +#### A.6.7 Claim 7 — Command Palette 33 commands (implementation notes) + +**Files involved:** +- `src/commands.py` — read-only reference for the 33 existing commands. +- `src/bridge_dsl/command_palette.py` — NEW: optionally, maps DSL verbs to the existing 33 commands (for "Everything" mode). + +**Where the claim is enforced:** +- The Command Palette's "Everything" mode (per `docs/guide_command_palette.md` line 383) is a natural use case for DSL verbs. The 33 commands are the imperative one-liners; the DSL verbs are the declarative intent expressions. + +#### A.6.8 Claim 8 — Hook API state fields (implementation notes) + +**Files involved:** +- `src/app_controller.py:_predefined_callbacks` and `_gettable_fields` — read-only references for the Hook API surface (per `docs/guide_state_lifecycle.md` §"Hook API Surface"). +- `src/bridge_dsl/hook_integration.py` — NEW: routes DSL state-mutating verbs through the Hook API. + +**Where the claim is enforced:** +- DSL verbs that mutate state use the Hook API. The DSL is a *user* of the existing infrastructure, not a bypass. + +#### A.6.9 Claim 9 — IEventTarget as `sandbox` (implementation notes) + +**Files involved:** +- `src/bridge_dsl/sandbox.py` — NEW: implements the `sandbox { }` block as an IEventTarget boundary. +- `docs/guide_architecture.md` §"The Execution Clutch" — read-only reference for the HITL approval modal. 
+ +**Where the claim is enforced:** +- The `sandbox { }` block gates every write through the bridge script's HITL approval modal. The `audit` verb is the IEventTarget itself. + +#### A.6.10 Claim 10 — "reads are free" (implementation notes) + +**Files involved:** +- `src/bridge_dsl/pipeline.py` — NEW: the pipeline optimizer. Tier 2 verbs are marked read-only; the optimizer can re-evaluate them without re-approval. +- `src/bridge_dsl/hook_integration.py` — UPDATED: the HITL approval is triggered only at the *write* verb, not at the read verbs in the chain. + +**Where the claim is enforced:** +- The Tier 2 verbs are tagged as read-only in the parser. The runtime can re-execute a read-only chain without re-prompting the user. Only the moment the chain's output is consumed by a write verb does the HITL modal appear. + +--- + +### A.7 Section 7 Deep-Dive: Open Questions Discussion + +This subsection provides extended discussion of the 8 open questions. The main report's §7 has the questions; this appendix has the proposed solutions and trade-offs. + +#### A.7.1 Question 1 — How does `tape { }` map to Onat's preemptive scatter? + +**Proposed solution:** the `tape { }` block is a *parser-time hint* (not a runtime directive). The parser sees `tape { ... }` and emits code that pre-scatters the block's contents into a contiguous memory region before the block runs. This is the Onat model: arguments are pre-placed into fixed offsets before the call. + +**Trade-off:** parser-time optimization means the interpreter (follow-up B) must implement a 2-pass compilation: pass 1 identifies `tape { }` blocks; pass 2 emits the pre-scatter code. The alternative (runtime directive) is simpler but loses the JIT optimization. + +**Recommendation:** parser-time hint, 2-pass compilation. The interpreter should pre-scatter at compile time when possible. + +#### A.7.2 Question 2 — Where does "intent resolution" live? 
+ +**Proposed solution:** the `fuzzy` verb is a per-block modifier (not a per-verb option, not a global parser mode). The block-level modifier is the simplest semantic: the `fuzzy { ... }` block accepts Levenshtein-2 verb-name matches within the block; the surrounding program requires exact matches. + +**Trade-off:** per-verb option (e.g., `scan*` for fuzzy `scan`) is more fine-grained but more complex. Global parser mode is simplest but loses the ability to mix fuzzy and strict within a program. Per-block is the middle ground. + +**Recommendation:** per-block modifier. The `fuzzy { ... }` syntax is the same as `tape { ... }` and `sandbox { ... }` — a block-level keyword. The interpreter applies the fuzzy matching within the block only. + +#### A.7.3 Question 3 — How does `audit` interact with `comms.log`? + +**Proposed solution:** the DSL's audit log is a 6th stream (`audit.log`), separate from the existing 5 streams. The 5 existing streams are tool-level (specific call types); the DSL's audit is verb-level (every verb in the DSL). + +**Trade-off:** adding a 6th stream is simpler than folding; folding would require a unified schema that the existing streams don't have. The cost is one more file to maintain. + +**Recommendation:** 6th stream (`audit.log`). Format: JSON-L with the same fields as `comms.log` (timestamp, direction, kind, payload, source_tier) plus DSL-specific fields (verb, args-stringified, result-stringified, duration). + +#### A.7.4 Question 4 — Does `sandbox` produce `Result[T, ErrorInfo]`? + +**Proposed solution:** yes, the `sandbox { ... }` block returns `Result[T, ErrorInfo]` per the `data_oriented_error_handling_20260606` convention. The `errors` list contains the audit failures (e.g., "HITL approval denied", "tool outside allowlist"). + +**Trade-off:** the alternative is a `SandboxResult` envelope with `stdout`, `stderr`, `exit_code`, `errors`. This is more information but breaks the convention. The convention wins. 
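The single-envelope option discussed above can be sketched as follows. This is an illustration only: the field names on `Result` and `ErrorInfo`, the `run_sandbox` helper, and the error kind strings `DENIED` and `NOT_ALLOWED` are assumptions made for this example; the report refers to `src/result_types.py` and its 12 `ErrorKind` values without listing them.

```python
from dataclasses import dataclass, field
from typing import Callable, Generic, List, Optional, Set, Tuple, TypeVar

T = TypeVar("T")


@dataclass
class ErrorInfo:
    kind: str      # hypothetical kind label, not one of the project's ErrorKind values
    message: str


@dataclass
class Result(Generic[T]):
    value: Optional[T] = None
    errors: List[ErrorInfo] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # A result is clean only when no failure was recorded.
        return not self.errors


def run_sandbox(
    writes: List[Tuple[str, str]],
    approve: Callable[[str, str], bool],
    allowlist: Set[str],
) -> Result[List[str]]:
    """Gate each (tool, target) write; collect every failure in one errors list."""
    done: List[str] = []
    errors: List[ErrorInfo] = []
    for tool, target in writes:
        if tool not in allowlist:
            errors.append(ErrorInfo("NOT_ALLOWED", f"tool outside allowlist: {tool}"))
            continue
        if not approve(tool, target):
            errors.append(ErrorInfo("DENIED", f"HITL approval denied: {tool} {target}"))
            continue
        done.append(f"{tool} {target}")
    return Result(value=done, errors=errors)
```

The point of the sketch is that denied approvals and allowlist violations land in the same `errors` list, so a `recover` block needs to inspect only one envelope shape.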
+ +**Recommendation:** `Result[T, ErrorInfo]`. The `errors` list is the single envelope for all DSL failures. The interpreter (B track) can extend with sandbox-specific fields if needed. + +#### A.7.5 Question 5 — `didyoumean` recovery: parser feature or user-facing verb? + +**Proposed solution:** parser feature by default, with the user-facing verb as an opt-in for advanced cases. The parser auto-corrects on parse failure (Lenient mode is the default); the user never sees the typo unless they explicitly request `didyoumean` as a verb. + +**Trade-off:** making it parser-only is simpler (the user never thinks about it). Making it user-facing gives the user more control. The default should be parser-only; the verb is for advanced debugging. + +**Recommendation:** parser feature (default Lenient mode). The `didyoumean` verb is for advanced cases where the user wants to see the suggestions explicitly. + +#### A.7.6 Question 6 — How does `for x .. n` interact with `filter`/`map`? + +**Proposed solution:** `for x .. n { body }` is *sugar* for `[1, 2, ..., n] -> map { body }`. The interpreter desugars `for` into the pipeline form at parse time. The named variable `x` is bound to the position in the array (per APL's `ιN`). + +**Trade-off:** making them distinct (for-loop with named variable, pipeline with anonymous position) lets the user choose. The downside is two slightly different syntaxes for the same operation. Sugar is simpler. + +**Recommendation:** sugar. The interpreter desugars `for` to `[iota(n) -> map { body }]` internally. The user-facing syntax is the `for` form; the runtime representation is the pipeline form. + +#### A.7.7 Question 7 — How does `sandbox` map to `pre_tool_callback`? + +**Proposed solution:** the `sandbox { }` block in the DSL is compiled to a series of MCP tool calls, each wrapped by `pre_tool_callback`. The bridge script maps `sandbox` to a sequence of pre-tool-callback-wrapped calls. 
The HITL modal in the AppController (`_handle_approve_script` etc.) is the user-facing approval. + +**Trade-off:** the existing `pre_tool_callback` is per-tool-call, not per-block. The DSL's `sandbox` is a block. The interpreter must wrap each call in the block with the callback. This is a straightforward extension. + +**Recommendation:** `sandbox { }` → series of pre-tool-callback-wrapped calls. The bridge script handles the wrapping; the AppController's modal is unchanged. + +#### A.7.8 Question 8 — Connection to `intent_dsl_for_meta_tooling_20260608_PLACEHOLDER` + +**Proposed solution:** the minimum subset for the placeholder track is: +- **1 Tier 1 verb:** `for x .. n` (the most basic iteration) +- **2 Tier 2 verbs:** `scan` + `filter` (the minimum pipeline) +- **1 Tier 4 verb:** `sandbox` (the IEventTarget boundary) + +That's 4 verbs total, plus the grammar. The placeholder track can demonstrate a round-trip end-to-end with this subset. + +**Trade-off:** a larger subset (e.g., all 42 verbs) would be more expressive but harder to implement. A smaller subset (e.g., just `scan`) wouldn't demonstrate the pipeline. 4 verbs is the sweet spot. + +**Recommendation:** 4-verb minimum. The placeholder track implements these 4 verbs + the 14-primitive grammar, demonstrates a round-trip, and leaves the remaining 38 verbs for the interpreter prototype (follow-up B track) to implement. + +--- + +### A.8 Glossary + +**AI agent** — an LLM-based system that emits intent (DSL verbs) to invoke tools. The bridge script translates the intent into tool calls. Examples: Gemini CLI, OpenCode, Claude Code. + +**tape** — a memory region modeled on a tape drive, declared with `tape { }`. The block's contents are pre-scattered into a contiguous buffer for efficient execution. + +**bridge script** — the runtime that translates DSL verbs into MCP tool calls. The `scripts/cli_tool_bridge.py` analogue. 
The Application's function-calling is unchanged; the bridge is the Meta-Tooling-side runtime.
+
+**concatenative** — a programming language property where program concatenation denotes function composition. Forth, Joy, CoSy, KYRA, and x68 are all concatenative.
+
+**clusters** — the 8 prior-art groups in §2: 0 (Immediate-Mode Paradigm), 1 (Concatenative), 2 (Array), 3 (Intent-Mapping), 4 (Meta-Tooling DSLs), 5 (SSDL), 6 (Command Palette), 7 (Result convention).
+
+**codepath** (SSDL) — a sequential list of instructions that terminates; no loops. SSDL symbol: `===>`.
+
+**codecycle** (SSDL) — a circular structure; a codepath that loops back to its first instruction after its last. SSDL symbol: `o==>`.
+
+**DSL** — Domain-Specific Language. In this report, "DSL" refers specifically to the intent-based scripting language proposed in §3 and §4.
+
+**didyoumean** — a Tier 4 verb that proposes corrections for ambiguous verb names. Returns a list of suggestions.
+
+**fold** — a Tier 2 verb that reduces a record stream to a single value using an accumulator.
+
+**FuzzyAnchor** — the project's existing pattern for resilient line ranges that survive file modifications (per `docs/guide_context_curation.md`). The DSL's `offset` and `span` verbs build on this pattern.
+
+**IEventTarget** — O'Donnell's pattern for formalizing state changes (per `mvc.html`). A single interface that all writes route through. The DSL's `sandbox { }` block is the IEventTarget boundary; the `audit` verb is the IEventTarget itself.
+
+**immediate-mode** — a paradigm where widgets (or verbs) are method invocations, not stateful objects. Each call is independent; there is no "current widget" implicit state. The DSL inherits this paradigm.
+
+**intent-mapping** — the design philosophy of expressing what the user wants to accomplish (intent) rather than how to accomplish it (commands). Jofito's framing.
+ +**KYRA** — Onat Türkçüoğlu's 2-register Forth variant with magenta-pipe definition boundaries, basic blocks, and lambdas. The DSL's `->`, `[ ]`, `{ }` operators are direct descendants. + +**leader/chaser** — Jofito's parallel-execution model where predicates run as threads sharing a single memory tape. The DSL's `scatter` verb generalizes this. + +**Lottes** — Timothy Lottes; creator of x68 / 5th / "Ear" + "Toe". The DSL inherits the 32-bit token encoding, the annotation overlay, and the "register file as aliased global namespace" model. + +**Meta-Tooling** — the external agents (Gemini CLI, OpenCode) used to build the Application. The DSL is the format these agents emit. Distinct from the Application's function-calling. + +**nagent** — Mike Acton's autonomous coding agent framework (`github.com/macton/nagent`; per `conductor/tracks/nagent_review_20260608/`). The `nagent_tags.py` parser is the inspiration for the DSL's structured-protocol idea (but the DSL rejects the XML angle-bracket notation). + +**O'Donnell** — John O'Donnell; creator of the IMGUI/MVC paradigm (per `johno.se/book/`). The DSL inherits 4 anchor claims from his work: widgets as method invocations, reads free, IEventTarget, no scene-graph abstractions. + +**preemptive scatter** — the Onat/Lottes model of pre-placing arguments into fixed memory offsets before a call. The DSL's `tape { }` block and `scatter` verb build on this. + +**Result[T]** — the data-oriented error-handling envelope (per `data_oriented_error_handling_20260606`). The DSL's `try { ... } recover { ... }` block returns `Result[T]`. + +**sandbox** — a Tier 4 verb that declares an IEventTarget boundary. All writes inside `sandbox { ... }` go through the formal event channel (HITL approval). + +**SSDL** — Spec/Sketch Description Language, per `docs/reports/computational_shapes_ssdl_digest_20260608.md`. The DSL's verbs are annotated with SSDL shape tags (`[I]`, `===>`, `o==>`, `===>W===>`). 
+ +**tag protocol** — nagent's XML-ish self-closing tags for tool invocation. The DSL inherits the *idea* (named operation with typed attributes) but rejects the XML angle-bracket notation. + +**try/recover** — a Tier 4 verb envelope that returns `Result[T]`. The `recover` block runs when the `try` block returns errors. + +**view::drawMesh** — O'Donnell's example of a procedural View interface (per `mvc.html`). The DSL's verbs are flat procedure calls, not object-graph hierarchies. + +**x68 / 5th** — Timothy Lottes's source-less Forth variant. The DSL inherits the 32-bit token encoding, the annotation overlay, the folded interpreter, and the "register file as aliased global namespace" model. + +**X.com thread** — the 2025-04-30 Onat/Lottes X.com conversation (per `C:\projects\forth\bootslop\references\X.com - Onat & Lottes Interaction 1.png.ocr.md`). Source for many of the load-bearing quotes in §5. + +--- + +### A.9 Bibliography (Expanded) + +The bibliography is split into 4 categories: external prior art, project's own references, track-internal references, and sub-reports. Each entry has a URL/path + 1-line description + 1-line "key claim" summary. + +#### A.9.1 External Prior Art + +**Cluster 0 — Immediate-Mode Paradigm (the philosophical anchor):** +- **John O'Donnell, "IMGUI"** — `https://johno.se/book/imgui.html` (canonical IMGUI essay; widgets as method calls, frame shearing, deferred display). *Key claim:* "Widgets, logically, change from being objects to being method invocations." +- **John O'Donnell, "The Pitch"** — `https://johno.se/book/pitch.html` (paradigm shift; GPU advances; Controller as procedural composer). *Key claim:* 800,000 vertices in a single draw call at Jungle Peak. +- **John O'Donnell, "Immediate Mode MVC"** — `https://johno.se/book/immvc.html` (book roadmap; IEventTarget centrality). *Key claim:* the book structure maps the IMGUI → MVC/E → Persistence progression. 
+- **John O'Donnell, "MVC"** — `https://johno.se/book/mvc.html` (reads free / writes formalized; IEventTarget pattern; scene-graph prohibition). *Key claim:* "Writes to Model are formalized through the addition of IEventTarget." + +**Cluster 1 — Concatenative (Forth family; the syntax tradition):** +- **Forth** — `https://en.wikipedia.org/wiki/Forth_(programming_language)` (RPN, dictionary, colon-word, threaded code, self-hosting). *Key claim:* dictionary is the sole organizing principle; words are first-class. +- **ColorForth** — `https://en.wikipedia.org/wiki/ColorForth` (color-encoded semantics). *Key claim:* color replaces keywords. +- **KYRA/VAMP (Onat Türkçüoğlu)** — `C:\projects\forth\bootslop\references\kyra_in-depth.md` (2-register stack, magenta pipe, basic blocks, lambdas, FFI), `forth_day_2020_in-depth.md` (ColorForth + SPIR-V). *Key claim:* the magenta pipe `|` emits `RET` + `xchg rax, rdx` (48 92) as the definition boundary. +- **x68/5th (Timothy Lottes)** — `C:\projects\forth\bootslop\references\neokineogfx_in-depth.md` (folded interpreter, 32-bit granularity, annotation overlay), `blog_in-depth.md` (source-less evolution, "Ear"+"Toe"), `Architectural_Consolidation.md` (synthesis). *Key claim:* the annotation overlay (8 chars × 7 bits + 8-bit tag per 32-bit token) is the source-less model. +- **Onat/Lottes X.com thread** — `C:\projects\forth\bootslop\references\X.com - Onat & Lottes Interaction 1.png.ocr.md` (direct quotes on register file as aliased namespace, preemptive scatter, "no stacks"). *Key claim:* "data stacks on HW that has no physical data stack." +- **Joy** — `https://en.wikipedia.org/wiki/Joy_(programming_language)`, `http://joylang.org/` (purely functional concatenative, quotation as first-class values, combinator library). *Key claim:* "concatenation of two programs denotes the composition of the functions denoted by the two programs." 
+- **CoSy (Bob Armstrong)** — `https://cosy.com/CoSy/Simplicity.html` (TimeStamped notebook/log, 3-cell headers, modulo indexing, APL-via-K vocabulary), `https://cosy.com/4thCoSy/` (4thCoSy repo). *Key claim:* the open-vocabulary culture; the vocabulary IS the user surface. + +**Cluster 2 — Array (the data model):** +- **APL (Kenneth Iverson)** — `https://en.wikipedia.org/wiki/APL_(programming_language)`, `https://www.dyalog.com/`. *Key claim:* the array is the universal data type; every glyph is a function. +- **K / q (Arthur Whitney)** — `https://en.wikipedia.org/wiki/K_(programming_language)`, `https://kx.com/`. *Key claim:* ASCII-only with heavy context-sensitive overloading; kdb+ processes billions of records at microsecond latency. +- **BQN (Marshall Lochbaum)** — `https://mlochbaum.github.io/BQN/`. *Key claim:* function trains as the most expressive tacit composition mechanism in the family. +- **Uiua (Kai Schmidt)** — `https://www.uiua.org/`, `https://github.com/uiua-lang/uiua`. *Key claim:* stack-based execution as a viable alternative to named parameters; modern open-source onboarding. + +**Cluster 3 — Intent-Mapping (the use case):** +- **Jofito (Jody Bruchon)** — `https://codeberg.org/jbruchon/jofito` (README 2026 UPDATE NOTE: "intent mapping engine"), `docs/transcripts/Ddme7DwMQBI_jofito_jody_bruchon.txt` (full video transcript, 428 lines). *Key claim:* "jofito is a 'write the optimization once, reap the benefits everywhere' system that takes what the user wants to accomplish (intent) as input." +- **jq (Stephen Dolan)** — `https://en.wikipedia.org/wiki/Jq_(programming_language)`, `https://jqlang.org/`. *Key claim:* filter-as-expression style; `select(condition)` is the composable filter. +- **nagent's tag protocol** — `conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md` (lines 210-230 for the Bridge DSL), `decisions.md` (Candidate 4: Intent-based DSL for Meta-Tooling). 
*Key claim (REJECTED):* XML-ish self-closing tags — the DSL inherits the idea but rejects the angle-bracket notation per the user's instruction. +- **WebAssembly** — `https://en.wikipedia.org/wiki/WebAssembly`. *Key claim:* linear memory as the modern reference for tape-drive argument passing. + +**Cluster 4 — Meta-Tooling DSLs (the prior art in 2026):** +- **`mcp_dsl_20260606` placeholder** — `conductor/tracks/mcp_architecture_refactor_20260606/spec.md` §12.1 and §13.1 (per-MCP grammar, 8x token reduction, backward compat). *Key claim:* per-MCP grammar with backward-compat JSON path. +- **nagent's Bridge DSL** — `conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md` line 216-230. *Key claim:* the bridge between external agents and MCP tool calls; the format external agents emit. +- **OpenAI function-calling** — `https://platform.openai.com/docs/guides/function-calling`. *Key claim:* token cost is proportional to schema verbosity; "aim for fewer than 20 functions." +- **Anthropic tool-use** — `https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/define-tools`. *Key claim:* `input_examples` as a first-class schema field; `strict: true` as a guarantee. + +**Cluster 5 — SSDL:** +- **`docs/reports/computational_shapes_ssdl_digest_20260608.md` §1** — (6 primitives + 7 modifiers). *Key claim:* the meta-vocabulary for annotating verb shapes (`[I]`, `===>`, `o==>`, `===>W===>`). + +**Cluster 6 — Command Palette:** +- **`docs/guide_command_palette.md`** + **`src/commands.py`** (33 commands). *Key claim:* the user's existing vocabulary instinct; the DSL is a richer superset. + +**Cluster 7 — Result convention:** +- **`conductor/tracks/data_oriented_error_handling_20260606/spec.md` §3.3** (Result[T], ErrorInfo, 12 ErrorKind values). *Key claim:* errors are data, not control flow. + +#### A.9.2 Project's Own References + +**Existing tracks and reports:** +- **`conductor/tracks.md`** — active tracks registry (the new track is registered here as #23). 
+- **`conductor/workflow.md`** — the workflow rules (4-phase pattern, TDD, git notes). +- **`conductor/product.md`** — the product vision; the DSL's §6 aligns with the "AI-Optimized Compact Style" principles. +- **`conductor/tech-stack.md`** — the tech stack constraints; the DSL doesn't add dependencies. +- **`conductor/code_styleguides/`** — the styleguides (Python style, error handling, workspace paths, etc.). +- **`docs/Readme.md`** — the doc index. +- **`docs/ideation/ed_chunk_data_structures_20260523.md`** — the existing ideation doc; same style/format as this report. + +**Per-source-file guides:** +- **`docs/guide_architecture.md`** — threading model, event system, HITL, telemetry. *Key for:* Claim 5 (cache ordering) and Claim 9 (HITL). +- **`docs/guide_meta_boundary.md`** — Application vs Meta-Tooling split. *Key for:* Claim 1 (Domain = Meta-Tooling) and Claim 2 (Runtime path). +- **`docs/guide_tools.md`** — MCP Bridge security, 45 tools, Hook API, ApiHookClient. *Key for:* Claim 3 (3-layer security) and Claim 8 (Hook API). +- **`docs/guide_mma.md`** — 4-tier Multi-Model Architecture. *Key for:* the project's existing tier system; the DSL is a separate abstraction layer. +- **`docs/guide_context_aggregation.md`** — the 518-line `aggregate.py` pipeline (3 strategies, 7 view modes). *Key for:* the existing context-assembly pattern; the DSL's pipeline model is a *front-end* for the same. +- **`docs/guide_command_palette.md`** — 33 commands, fuzzy search, "Everything" mode. *Key for:* Claim 7 (Command Palette). +- **`docs/guide_rag.md`** — opt-in RAG (ChromaDB). *Key for:* Claim 4 (4 memory dimensions). +- **`docs/guide_state_lifecycle.md`** — undo/redo, HistoryManager, state delegation. *Key for:* Claim 8 (Hook API) and the `_settable_fields` map. +- **`docs/guide_testing.md`** — 251 test files, 7 conftest fixtures. *Key for:* the testing pattern for the interpreter prototype. +- **`docs/guide_personas.md`** — persona management. 
+- **`docs/guide_workspace_profiles.md`** — docking layout profiles. + +**Track-internal references:** +- **`conductor/tracks/data_oriented_error_handling_20260606/spec.md`** — the Result[T] convention. *Key for:* Claim 6 (Result envelope). +- **`conductor/tracks/nagent_review_20260608/nagent_review_v2_1_20260612.md`** — 4 memory dimensions, RAG integration discipline, stable-to-volatile cache ordering. *Key for:* Claim 4 and Claim 5. +- **`conductor/tracks/mcp_architecture_refactor_20260606/spec.md`** — the SubMCP architecture (the target the DSL maps to). *Key for:* §A.2.5 (mcp_dsl entry). +- **`conductor/tracks/code_path_audit_20260607/spec.md`** — the data-oriented pattern for static analysis. *Key for:* the project's overall "data-oriented" framing. + +**Reports:** +- **`docs/reports/computational_shapes_ssdl_digest_20260608.md`** — SSDL 6 primitives + 7 modifiers. *Key for:* §A.2.6 (Cluster 5). +- **`docs/reports/ascii_sketch_ux_workflow_20260608.md`** — the user's ideation workflow convention. + +#### A.9.3 Sub-Reports (the research basis for §2) + +The sub-reports are the deep analyses that the main report's §2 condenses. Each is a separate file with full text, citations, and synthesis tables. + +- **`research/cluster_0_odonnell.md`** (338 lines) — Cluster 0 synthesis; the 4 anchor claims with full O'Donnell context; the Connections section (Tier 4 verb → O'Donnell claim mappings). +- **`research/cluster_1_concatenative.md`** (209 lines) — Cluster 1 synthesis; 6 entries (Forth, ColorForth, KYRA, x68, Joy, CoSy) with full descriptions; the "Synthesis for Section 5" verb-to-entry mapping table. +- **`research/cluster_2_array.md`** (218 lines) — Cluster 2 synthesis; 4 entries (APL, K, BQN, Uiua) with full descriptions; the "Synthesis for the DSL" tier-1-verb-to-entry table. 
+- **`research/cluster_3_intent_mapping.md`** (241 lines) — Cluster 3 synthesis; 4 entries (Jofito, jq, nagent tag, Wasm) with full descriptions; the "Synthesis for the DSL" tier-2/3-verb-to-entry table. +- **`research/cluster_4_meta_tooling_dsls.md`** (313 lines) — Cluster 4 synthesis; 4 entries (mcp_dsl, nagent Bridge, OpenAI, Anthropic) with full descriptions; the "Synthesis for the DSL" tier-4-verb-to-entry table. + +#### A.9.4 Track-internal artifacts (for the intent_dsl_survey_20260612 track) + +- **`conductor/tracks/intent_dsl_survey_20260612/spec.md`** — the approved spec (789 lines). Source of truth for the track's design. +- **`conductor/tracks/intent_dsl_survey_20260612/plan.md`** — the executable plan (1,015 lines; 28 tasks). Source of truth for the track's execution. +- **`conductor/tracks/intent_dsl_survey_20260612/report.md`** — the v1.0 main report (417 lines). The single-doc delivery; this is what nagent v2.2 will reference. +- **`conductor/tracks/intent_dsl_survey_20260612/report_v1.1.md`** — the v1.1 report (this file). Post-secondary-review correction: the XML/JSON citation fix, the OCR-restored Lottes quote, the softened Wasm "streaming parse" inference. +- **`conductor/tracks/intent_dsl_survey_20260612/reportreview.md`** — the secondary review (this file's companion). Documents the verification pass. +- **`conductor/tracks/intent_dsl_survey_20260612/state.toml`** — state file. +- **`conductor/tracks/intent_dsl_survey_20260612/metadata.json`** — metadata. + +--- + +*End of Appendix. The v1.1 report is the final deliverable. 
The follow-up B track (interpreter prototype) and the placeholder track (`intent_dsl_for_meta_tooling_20260608_PLACEHOLDER`) consume this report.* + diff --git a/conductor/tracks/intent_dsl_survey_20260612/research/cluster_3_intent_mapping.md b/conductor/tracks/intent_dsl_survey_20260612/research/cluster_3_intent_mapping.md index 274b71c9..a7843178 100644 --- a/conductor/tracks/intent_dsl_survey_20260612/research/cluster_3_intent_mapping.md +++ b/conductor/tracks/intent_dsl_survey_20260612/research/cluster_3_intent_mapping.md @@ -185,9 +185,9 @@ This shows jq's functional composition style: `select(...) | [digits | digit] | --- -## Entry: nagent's Tag Protocol (Jody Bruchon, 2024–2025) +## Entry: nagent's Tag Protocol (Mike Acton, 2024–2025) -**What it is.** nagent is Jody Bruchon's autonomous coding agent framework. Its §4 "visible output protocol" uses a self-closing XML-ish tag format (e.g., ``) that the agent emits as text. A parser (`nagent_tags.py`) matches tags to handler functions (`execute_read`, etc.). The protocol is explicitly not XML — first matching close-tag wins, there is no entity escaping, and the tag format is designed for human readability and LLM emit-ability rather than for machine interchange fidelity. +**What it is.** nagent is Mike Acton's autonomous coding agent framework (`github.com/macton/nagent`). Its §4 "visible output protocol" uses a self-closing XML-ish tag format (e.g., ``) that the agent emits as text. A parser (`nagent_tags.py`) matches tags to handler functions (`execute_read`, etc.). The protocol is explicitly not XML — first matching close-tag wins, there is no entity escaping, and the tag format is designed for human readability and LLM emit-ability rather than for machine interchange fidelity. 
**What we explicitly reject (and what we take):** We **take** the idea of a compact, human-readable structured protocol for tool invocation — the `` surface syntax that external agents can emit without knowing the underlying function-call JSON schema. We **reject** the XML angle-bracket notation per the user's explicit instruction: "ignore its record formats as they problably will be less xml/json based as I don't like them." (`conductor/tracks/nagent_review_20260608/decisions.md:50` citing user signal).