This commit is contained in:
2026-02-21 14:18:15 -05:00
parent 6953e6b9b3
commit 67f8639ee7
5 changed files with 604 additions and 1467 deletions

View File

@@ -3,7 +3,7 @@
## Overview
`attempt_1` is a minimal C program that serves as a proof-of-concept for the "Lottes/Onat" sourceless ColorForth paradigm. It successfully integrates a visual editor, a live JIT compiler, and an execution environment into a single, cohesive Win32 application that links against the C runtime but avoids direct includes of standard headers, using manually declared functions instead.
The application presents a visual grid of 32-bit tokens and allows the user to navigate and edit them directly. On every keypress, the token array is re-compiled into x86-64 machine code and executed, with the results (register states and global memory) displayed instantly in the HUD.
The application presents a visual grid of 32-bit tokens rendered via `microui` floating panels and allows the user to navigate and edit them directly. On every keypress, the token array is re-compiled into x86-64 machine code and executed, with the results (register states and global memory) displayed instantly in the HUD.
## Core Concepts Implemented
@@ -17,42 +17,90 @@ The application presents a visual grid of 32-bit tokens and allows the user to n
3. **2-Register Stack & Global Memory:**
* The JIT compiler emits x86-64 that strictly adheres to Onat's `RAX`/`RDX` register stack.
* A `vm_globals` array is passed by pointer into the JIT'd code (via `RCX` on Win64), allowing instructions like `FETCH` and `STORE` to simulate the "tape drive" memory model.
* A `vm_globals` array (16 x `U8`) is passed by pointer into the JIT'd code via `RCX` (Win64 calling convention), held in `RBX` for the duration of execution.
* `vm_globals[14]` and `vm_globals[15]` serve as the `RAX` and `RDX` save/restore slots across JIT entry and exit.
* Indices 013 are available as the "tape drive" global memory for `FETCH`/`STORE` primitives.
4. **Handmade x86-64 JIT Emitter:**
* A small set of `emit8`/`emit32` functions write raw x86-64 opcodes into a `VirtualAlloc` block marked as executable (`PAGE_EXECUTE_READWRITE`).
* This buffer is cast to a C function pointer and called directly, bypassing the need for an external assembler like NASM or a complex library like Zydis for this prototype stage.
4. **Handmade x86-64 JIT Emitter with Named DSL:**
* A small set of `emit8`/`emit32`/`emit64` functions write raw x86-64 opcodes into a `VirtualAlloc` block marked `PAGE_EXECUTE_READWRITE`.
* All emission is done through a well-defined **x64 Emission DSL** (`#pragma region x64 Emission DSL`) consisting of:
* Named REX prefix constants (`x64_REX`, `x64_REX_R`, `x64_REX_B`, etc.).
* Named register encoding constants (`x64_reg_RAX`, `x64_reg_RDX`, etc.).
* ModRM and SIB composition macros (`x64_modrm(mod, reg, rm)`, `x64_sib(scale, index, base)`).
* Named opcode constants (`x64_op_MOV_reg_rm`, `x64_op_CALL_rel32`, etc.).
* Composite inline instruction helpers (`x64_XCHG_RAX_RDX()`, `x64_ADD_RAX_RDX()`, `x64_RET_IF_ZERO()`, `x64_FETCH()`, `x64_STORE()`, etc.).
* Prologue/Epilogue helpers (`x64_JIT_PROLOGUE()`, `x64_JIT_EPILOGUE()`).
* FFI helpers (`x64_FFI_PROLOGUE()`, `x64_FFI_MAP_ARGS()`, `x64_FFI_CALL_ABS(addr)`, `x64_FFI_EPILOGUE()`).
* **Raw magic bytes are forbidden** in `compile_and_run_tape` and `compile_action`. All emission uses the DSL.
5. **Modal Editor (Win32 GDI):**
* The UI is built with raw Win32 GDI calls defined in `duffle.h`.
* It features two modes: `Navigation` (gray cursor, arrow key movement) and `Edit` (orange cursor, text input).
5. **Modal Editor (Win32 GDI + microui):**
* The UI is built with `microui` rendered via raw Win32 GDI calls defined in `duffle.h`.
* It features two modes: `Navigation` (blue cursor, arrow key movement) and `Edit` (orange cursor, text input).
* The editor correctly handles token insertion, deletion (Vim-style backspace), tag cycling (Tab), and value editing, all while re-compiling and re-executing on every keystroke.
* Four floating panels: **ColorForth Source Tape**, **Compiler & Status**, **Registers & Globals**, **Print Log**.
6. **O(1) Dictionary & Visual Linking:**
* The dictionary relies on an edit-time visual linker. When the tape is modified, `relink_tape` resolves names to absolute source memory indices.
* The compiler resolves references in `O(1)` time instantly by indexing into an offset mapping table (`tape_to_code_offset`).
* The compiler resolves references in `O(1)` time by indexing into `tape_to_code_offset[65536]`.
7. **Implicit Definition Boundaries (Magenta Pipe):**
* Definitions implicitly cause the JIT to emit a `RET` to close the prior block, and an `xchg rax, rdx` to rotate the stack for the new block.
7. **Implicit Definition Boundaries (STag_Define):**
* A `STag_Define` token causes the JIT to:
1. Emit `RET` to close the prior block (via `x64_RET()`).
2. Emit a `JMP rel32` placeholder to skip over the new definition body.
3. Record the entry point in `tape_to_code_offset[i]`.
4. Emit `xchg rax, rdx` (via `x64_XCHG_RAX_RDX()`) as the definition's first instruction, rotating the 2-register stack.
8. **x68 Instruction Padding:**
* The JIT pads every logical block/instruction to exact 32-bit multiples using `0x90` (NOPs) to perfectly align with the visual token grid logic.
8. **Lambda Tag (STag_Lambda):**
* A `STag_Lambda` token compiles a code block out-of-line and leaves its absolute 64-bit address in `RAX` for use with `STORE` or `EXECUTE`.
* Implemented via `x64_MOV_RDX_RAX()` to save the prior TOS, a `mov rax, imm64` with a patched-in address, and a `JMP rel32` to skip the body.
9. **The FFI Bridge:**
* The system uses an FFI macro (`emit_ffi_dance`) to align the `RSP` stack to 16 bytes, allocate 32 bytes of shadow space, and map the 2-register data stack/globals into the Windows C-ABI (`RCX`, `RDX`, `R8`, `R9`) to safely call WinAPI functions (like `MessageBoxA`).
9. **x68 Instruction Padding:**
* `pad32()` pads every logical block/instruction to exact 32-bit multiples using `0x90` (NOPs), aligning with the visual token grid.
## What's Missing (TODO)
10. **The FFI Bridge:**
* `x64_FFI_PROLOGUE()` pushes `RDX`, aligns `RSP` to 16 bytes, and allocates 32 bytes of shadow space. * x64_FFI_MAP_ARGS() maps the 2-register stack and globals into Win64 ABI registers (RCX=RAX, R8=globals[0], R9=globals[1]). * x64_FFI_CALL_ABS(addr) loads the absolute 64-bit function address into R10 and calls it. * x64_FFI_EPILOGUE() restores RSP and pops RDX.
* **Saving/Loading (Persistence):** The tape and annotation arenas are purely in-memory and are lost when the program closes. Need to implement the self-modifying OS cartridge concept.
* **Expanded Instruction Set:** The JIT only knows a handful of primitives. It has no support for floating point or more complex branches.
* **Annotation Editing & Comments:** Typing into an annotation just appends characters up to 8 bytes. A proper text-editing cursor within the token is needed, and support for arbitrarily long comments should be implemented.
* **Tape Drive / Preemptive Scatter Logic:** Improve the FFI argument mapping to properly read from the "tape drive" memory slots instead of just mapping RAX/RDX to the first parameters.
Persistence (Cartridge Save/Load):
F1 saves the tape and annotation arenas (with metadata) to cartridge.bin via WriteFile.
F2 loads from cartridge.bin, re-runs relink_tape() and compile_and_run_tape() to restore full live state.
Primitive Instruction Set
## References Utilized
* **Heavily Utilized:**
* **Onat's Talks:** The core architecture (2-register stack, global memory tape, JIT philosophy) is a direct implementation of the concepts from his VAMP/KYRA presentations.
* **Lottes' Twitter Notes:** The 2-character mapped dictionary, `ret-if-signed` (`RET_IF_ZERO`), and annotation layer concepts were taken directly from his tweets.
* **User's `duffle.h` & `fortish-study`:** The C coding conventions (X-Macros, `FArena`, byte-width types, `ms_` prefixes) were adopted from these sources.
* **Lightly Utilized:**
* **Lottes' Blog:** Provided the high-level "sourceless" philosophy and inspiration.
* **Grok Searches:** Served to validate our understanding and provide parallels (like Wasm's linear memory), but did not provide direct implementation details.
```md
ID Name Emitted x86-64 (via DSL)
1 SWAP x64_XCHG_RAX_RDX()
2 MULT x64_IMUL_RAX_RDX()
3 ADD x64_ADD_RAX_RDX()
4 FETCH x64_FETCH() — mov rax, [rbx + rax*8]
5 DEC x64_DEC_RAX()
6 STORE x64_STORE() — mov [rbx + rax*8], rdx
7 RET_IF_Z x64_RET_IF_ZERO()
8 RETURN x64_RET()
9 PRINT FFI dance → ms_builtin_print
10 RET_IF_S x64_RET_IF_SIGN()
11 DUP x64_MOV_RDX_RAX()
12 DROP x64_MOV_RAX_RDX()
13 SUB x64_SUB_RAX_RDX()
14 EXECUTE x64_CALL_RAX()
```
## Whats Missing (TODO)
- DSL wrappers for forward jump placeholders: The JMP rel32 and CALL rel32 forward-jump patterns in compile_and_run_tape still use bare emit8(x64_op_JMP_rel32) + emit32(0) pairs. Dedicated x64_JMP_fwd_placeholder(U4* offset_out) and x64_patch_fwd(U4 offset) helpers should be added to the DSL to eliminate this last gap.
- Expanded Annotation Layer (Variable-Length Comments): The anno_arena strictly allocates 8 bytes per token. Arbitrarily long comment blocks need a separate indirection layer without disrupting the O(1) compile mapping.
- Expanded Instruction Set: No floating point. No multi-way branching beyond RET_IF_Z / RET_IF_S.
- Basic Block Jumps [ ]: Lottes-style scoped jump targets for structured control flow without an AST are not yet implemented.
- Tape Drive / Preemptive Scatter Improvements: The FFI argument mapping reads globals[0] and globals[1] for R8/R9. A proper scatter model that pre-places arguments into named slots before a call is not yet formalized.
- Self-Hosting Bootstrap: The editor and JIT are written in C. The long-term goal is to rewrite the core inside the custom language itself, discarding the C host.
## References Utilized
### Heavily Utilized:
- Onats Talks: The core architecture (2-register stack, global memory tape, JIT philosophy) is a direct implementation of the concepts from his VAMP/KYRA presentations.
Lottes Twitter Notes: The 2-character mapped dictionary, ret-if-signed (RET_IF_ZERO), and annotation layer concepts were taken directly from his tweets.
- Users duffle.h & fortish-study: The C coding conventions (X-Macros, FArena, byte-width types, ms_ prefixes) were adopted from these sources.
### Lightly Utilized:
- Lottes Blog: Provided the high-level “sourceless” philosophy and inspiration.
- Grok Searches: Served to validate our understanding and provide parallels (like Wasms linear memory), but did not provide direct implementation details.