forth_bootslop/attempt_1/attempt_1.md
2026-02-20 22:10:29 -05:00

Technical Outline: Attempt 1

Overview

attempt_1 is a minimal C program that serves as a proof of concept for the "Lottes/Onat" sourceless ColorForth paradigm. It integrates a visual editor, a live JIT compiler, and an execution environment into a single Win32 application. The program links against the C runtime but does not include the standard headers; the functions it needs are declared manually instead.

The application presents a visual grid of 32-bit tokens and allows the user to navigate and edit them directly. On every keypress, the token array is re-compiled into x86-64 machine code and executed, with the results (register states and global memory) displayed instantly in the HUD.
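
The recompile-on-keypress pass can be pictured as one linear walk over the token array that rewrites the code buffer from scratch. The following is a minimal sketch under stated assumptions, not the code from attempt_1: the `CodeBuf` struct, the `compile_tape` function, the `PRIM_*` values, and the `emit8` signature are hypothetical, each token is treated as a bare primitive ID (tag decoding is omitted), and the bytes go into an ordinary buffer rather than executable memory so the sketch stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical primitive IDs; the real token encoding may differ. */
enum { PRIM_SWAP = 1, PRIM_ADD = 2 };

typedef struct {
    uint8_t *buf; /* destination for machine code */
    size_t   cap; /* capacity in bytes */
    size_t   len; /* bytes emitted so far */
} CodeBuf;

/* Append one byte of machine code. */
static void emit8(CodeBuf *c, uint8_t b) {
    assert(c->len < c->cap);
    c->buf[c->len++] = b;
}

/* Full recompile: discard the previous output, walk the tape once,
   then close the code with RET. Returns the number of bytes emitted. */
static size_t compile_tape(const uint32_t *tape, size_t count, CodeBuf *c) {
    c->len = 0;
    for (size_t i = 0; i < count; i++) {
        switch (tape[i]) {
        case PRIM_SWAP: /* xchg rax, rdx */
            emit8(c, 0x48); emit8(c, 0x92);
            break;
        case PRIM_ADD:  /* add rax, rdx */
            emit8(c, 0x48); emit8(c, 0x01); emit8(c, 0xD0);
            break;
        default:        /* unknown tokens emit nothing in this sketch */
            break;
        }
    }
    emit8(c, 0xC3);     /* ret */
    return c->len;
}
```

In an editor loop, a function like this would run after every edit to the token array, and the resulting buffer would then be copied into (or emitted directly into) executable memory before being called.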

Core Concepts Implemented

  1. Sourceless Token Array (FArena tape):

    • The "source code" is a contiguous block of U4 (32-bit) integers allocated by VirtualAlloc and managed by the FArena from duffle.h.
    • Each token packs a 4-bit "Color" tag and a 28-bit payload into a single 32-bit word.
  2. Annotation Layer (FArena anno):

    • A parallel FArena of U8 (64-bit) integers stores an 8-character string for each corresponding token on the tape.
    • The UI renderer prefers to display this string, while the compiler only ever sees the 2-character ID packed into the 32-bit token. This implements Lottes' dictionary annotation strategy.
  3. 2-Register Stack & Global Memory:

    • The JIT compiler emits x86-64 that strictly adheres to Onat's RAX/RDX register stack.
    • A vm_globals array is passed by pointer into the JIT'd code (via RCX on Win64), allowing instructions like FETCH and STORE to simulate the "tape drive" memory model.
  4. Handmade x86-64 JIT Emitter:

    • A small set of emit8/emit32 functions write raw x86-64 opcodes into a VirtualAlloc block marked as executable (PAGE_EXECUTE_READWRITE).
    • This buffer is cast to a C function pointer and called directly, bypassing the need for an external assembler like NASM or a complex library like Zydis for this prototype stage.
  5. 2-Character Mapped Dictionary & Resolver:

    • The ID2(a, b) macro packs two characters into a 16-bit integer for use as a token's payload.
    • The JIT compiler maintains a simple array-based dictionary. On a : Define token, it records the ID and the current memory offset. On a ~ Call token, it looks up the ID and emits a relative 32-bit CALL instruction (0xE8).
    • It also correctly emits JMP instructions to skip over definition bodies during linear execution.
  6. Modal Editor (Win32 GDI):

    • The UI is built with raw Win32 GDI calls defined in duffle.h.
    • It features two modes: Navigation (gray cursor, arrow key movement) and Edit (orange cursor, text input).
    • The editor correctly handles token insertion, deletion (Vim-style backspace), tag cycling (Tab), and value editing, all while re-compiling and re-executing on every keystroke.
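
The token layout and the define/call resolution described above can be sketched as follows. This is an illustration under assumptions rather than the actual source: the tag is assumed to sit in the top 4 bits, the `ID2` byte order is a guess, and `TOKEN_PACK`, `TAG_DEFINE`, `TAG_CALL`, `Dict`, `dict_define`, `dict_find`, and `emit_call_rel32` are hypothetical names. Searching newest-first, so that a redefinition shadows an older entry, is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 4-bit tag in the top bits, 28-bit payload below. */
#define TOKEN_PACK(tag, payload) \
    ((((uint32_t)(tag)) << 28) | ((uint32_t)(payload) & 0x0FFFFFFFu))
#define TOKEN_TAG(t)     ((uint32_t)(t) >> 28)
#define TOKEN_PAYLOAD(t) ((uint32_t)(t) & 0x0FFFFFFFu)

/* Two characters packed into 16 bits (byte order is an assumption). */
#define ID2(a, b) ((uint32_t)(((uint8_t)(a) << 8) | (uint8_t)(b)))

enum { TAG_DEFINE = 1, TAG_CALL = 2 }; /* hypothetical tag values */

typedef struct { uint32_t id; uint32_t offset; } DictEntry;
typedef struct { DictEntry entries[64]; size_t count; } Dict;

/* Record that word `id` starts at byte `offset` in the code buffer. */
static void dict_define(Dict *d, uint32_t id, uint32_t offset) {
    assert(d->count < 64);
    d->entries[d->count].id = id;
    d->entries[d->count].offset = offset;
    d->count++;
}

/* Linear scan, newest entry first. Returns 1 and fills *offset_out on a hit. */
static int dict_find(const Dict *d, uint32_t id, uint32_t *offset_out) {
    for (size_t i = d->count; i-- > 0; ) {
        if (d->entries[i].id == id) {
            *offset_out = d->entries[i].offset;
            return 1;
        }
    }
    return 0;
}

/* Write CALL rel32 (0xE8) at code[at]. The displacement is measured from
   the end of the 5-byte instruction. Returns the offset after the call. */
static size_t emit_call_rel32(uint8_t *code, size_t at, uint32_t target) {
    int32_t  rel = (int32_t)target - (int32_t)(at + 5);
    uint32_t u   = (uint32_t)rel;
    code[at]     = 0xE8;
    code[at + 1] = (uint8_t)(u);       /* little-endian displacement */
    code[at + 2] = (uint8_t)(u >> 8);
    code[at + 3] = (uint8_t)(u >> 16);
    code[at + 4] = (uint8_t)(u >> 24);
    return at + 5;
}
```

A JMP rel32 (0xE9) that skips a definition body uses the same displacement rule; the difference is that its target is only known once the body has been emitted, so the 4 displacement bytes would be patched afterwards.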

What's Missing (TODO)

  • Saving/Loading: The tape and annotation arenas are purely in-memory and are lost when the program closes.
  • Expanded Instruction Set: The JIT only knows a handful of primitives (SWAP, MULT, ADD, FETCH, STORE, DEC, RET_IF_ZERO, PRINT). It has no support for floating point or more complex branches.
  • The FFI Bridge: The system needs a macro (like Onat's CCALL) to align the RSP stack to 16 bytes and map the 2-register data stack/globals into the Windows C-ABI (RCX, RDX, R8, R9) to call WinAPI safely from the JIT.
  • Implicit Definition Boundaries (Magenta Pipe): Definitions should not need explicit begin/end. A definition token should implicitly cause the JIT to emit a RET to close the prior block, and an xchg rax, rdx to rotate the stack for the new block.
  • x86-64 Instruction Padding: The JIT currently emits variable-length instructions (emit8). It needs to pad every logical block/instruction to an exact multiple of 32 bits, using ignored prefixes or NOPs, so that the machine code aligns with the visual token grid.
  • O(1) Dictionary & Visual Linking: The current dictionary is a simple string-matched array rebuilt on compile. It needs to transition to a true "visual linker" where visual tokens store the absolute source memory indices, resolving locations instantly at edit-time.
  • Annotation Editing: Typing into an annotation just appends characters. A proper text-editing cursor within the token is needed.
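
As one illustration of the padding item above, the simplest variant appends single-byte NOPs (0x90) until the emitted length is a multiple of 4 bytes. This is a sketch under assumptions, not a planned implementation: `pad_to_dword` is a hypothetical helper, and it uses plain NOPs rather than the ignored-prefix approach that the item also mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append 0x90 (NOP) bytes until `len` is a multiple of 4.
   `cap` is the buffer capacity. Returns the new length. */
static size_t pad_to_dword(uint8_t *code, size_t len, size_t cap) {
    while (len % 4 != 0) {
        assert(len < cap);
        code[len++] = 0x90;
    }
    return len;
}
```

Calling this after each logical instruction would, for example, turn the 3-byte `add rax, rdx` (48 01 D0) into a 4-byte slot, at the cost of executing one extra NOP.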

References Utilized

  • Heavily Utilized:
    • Onat's Talks: The core architecture (2-register stack, global memory tape, JIT philosophy) is a direct implementation of the concepts from his VAMP/KYRA presentations.
    • Lottes' Twitter Notes: The 2-character mapped dictionary, the ret-if-signed idea (adapted here as RET_IF_ZERO), and the annotation layer were taken directly from his tweets.
    • User's duffle.h & fortish-study: The C coding conventions (X-Macros, FArena, byte-width types, ms_ prefixes) were adopted from these sources.
  • Lightly Utilized:
    • Lottes' Blog: Provided the high-level "sourceless" philosophy and inspiration.
    • Grok Searches: Served to validate our understanding and provide parallels (like Wasm's linear memory), but did not provide direct implementation details.