In-Depth Analysis: Onat's Forth Day 2020 Presentation

This document provides an exhaustive breakdown of the technical specifics, screen visuals, and mechanical explanations from Onat Türkçüoğlu's "Preview of x64 & ColorForth & SPIR V" presentation at Forth Day 2020, synthesizing both the video transcript and the OCR analysis of the editor's visual state.


1. The Environment and Editor UI

Onat introduces a custom 3-pane UI built entirely from scratch in C and Vulkan. This editor serves as the primary IDE, compiler, and visual debugger.

Visual Layout (from OCR & Video)

  • The Three Panes: Left/center panes display the block-based, colorized Forth/macro tokens. The right pane displays live x86-64 assembly output (or SPIR-V binary data) that updates instantly as the user edits the source.
  • Color Semantics (Observed in OCR):
    • Cyan: Low-level x86-64 opcodes or API functions (mov, jmp, xorpd, CCALL1, ide_syscmd).
    • Yellow: Line numbers, specific execution tokens, or immediate jump labels/blocks.
    • Magenta: High-level struct definitions, bitwise layouts, and basic block delineations (Structs, vars, bits).
    • Red: Literal numbers (32, 64), format strings, or specific SPIR-V instruction IDs and properties.
    • Orange/Green: UI and control flow modifiers.
  • State Tracking: The editor treats code blocks as tracked state objects, which allows for native, robust Undo/Redo operations without relying on a traditional text file format.

2. O(1) Dictionary Lookup & "Compile-Time Call Graph"

Traditional Forth systems (and even Lottes's early systems) relied on hashing strings or linear searches to resolve words. Onat eliminated this overhead entirely.

  • Source Memory Mapping: Instead of hashing, the compiler allocates an extra 4 bytes per character in the visual block to store the exact source memory location of the currently compiled word.
  • Instant Resolution: Because the token itself points to its origin, "Jump to Definition" is instantaneous.
  • Execution Tracing: He demonstrates a command that instantly numbers every occurrence of a word across the codebase in the exact chronological order of execution. This provides a "compile-time call graph" without actually running the program, allowing the programmer to visualize the data flow statically.
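
The lookup scheme above can be sketched in C. This is a minimal illustration, not Onat's actual layout: the Cell struct, the NO_DEF sentinel, and both function names are hypothetical, and the numbering pass walks the grid in storage order, a simplification of the execution-order numbering shown in the talk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_DEF UINT32_MAX /* hypothetical sentinel: cell belongs to no word */

/* Hypothetical cell: one visible character plus the 4 extra bytes
   that record where the word it belongs to is defined. */
typedef struct {
    uint8_t  ch;       /* glyph shown in the grid */
    uint8_t  color;    /* color tag (meaning is editor-specific) */
    uint32_t def_cell; /* index of the first cell of the definition */
} Cell;

/* "Jump to definition" is a single array read: no hashing, no search. */
uint32_t jump_to_definition(const Cell *grid, size_t count, uint32_t cursor)
{
    assert(cursor < count);
    return grid[cursor].def_cell;
}

/* Number every reference to the word defined at `def`.
   order[i] receives a 1-based ordinal at the first cell of each
   reference and 0 everywhere else. Returns the reference count. */
uint32_t number_occurrences(const Cell *grid, size_t count,
                            uint32_t def, uint32_t *order)
{
    uint32_t n = 0;
    for (size_t i = 0; i < count; i++) {
        int refers = grid[i].def_cell == def;
        int starts = (i == 0) || grid[i - 1].def_cell != def;
        order[i] = 0;
        if (refers && starts && i != def) /* skip the definition itself */
            order[i] = ++n;
    }
    return n;
}
```

The cost of the design is memory: every character carries its own pointer, which is what makes resolution a constant-time read instead of a dictionary search.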

3. The High-Level x64 Macro Assembler

The core of the system is not a traditional Forth interpreter, but a high-level macro assembler that compiles words directly into x64 machine code.

  • Syntax & Abstraction:
    • The syntax is designed to be readable and fluid: AX to BX or CX + offset.
    • A "direction register" macro allows toggling the flow of data. For instance, the phrase "from AX to BX register, let's move an unsigned" emits a 32-bit mov ebx, eax.
    • Modifiers like long change the emission to a 64-bit mov rbx, rax.
  • Low-Level Control (OCR Insights): The OCR reveals exact x64 instructions embedded in the blocks:
    • xorpd xm15, xm15 and movups [rsi], xm15 show direct, native access to SSE/AVX registers for vectorized operations.
    • Macros like PUSH2 rsi, rdi and POP2 rsi, rdi are used instead of traditional C-style prologues/epilogues, maintaining tight control over the stack pointer and register preservation.
    • C-ABI Integration: The OCR shows words like CCALL1 ide_p and CCALL3 ide_syscmd. These suggest a custom FFI (Foreign Function Interface) macro set (CCALL0, CCALL1, CCALL2, CCALL3) that aligns RSP to 16 bytes and maps arguments to the C ABI registers (e.g., RCX, RDX, R8, R9 on Windows) before calling out to the C-based host/Vulkan engine.
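
The byte-level difference between the 32-bit and 64-bit moves can be sketched in C. The CodeBuf type and the emit8 and emit_mov_rr names are hypothetical, not taken from the talk; the encodings themselves are standard x86-64. The sketch covers only the eight legacy registers (no REX.R/REX.B extension) and does not model the direction-register toggle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size code buffer. */
typedef struct {
    uint8_t bytes[64];
    size_t  len;
} CodeBuf;

/* Register numbers as used in the ModRM byte. */
enum Reg { RAX = 0, RCX = 1, RDX = 2, RBX = 3,
           RSP = 4, RBP = 5, RSI = 6, RDI = 7 };

void emit8(CodeBuf *b, uint8_t v)
{
    assert(b->len < sizeof b->bytes);
    b->bytes[b->len++] = v;
}

/* Register-to-register move, read as "from src to dst".
   wide == 0 emits the 32-bit form (mov ebx, eax).
   wide != 0 adds a REX.W prefix for the 64-bit form (mov rbx, rax). */
void emit_mov_rr(CodeBuf *b, enum Reg src, enum Reg dst, int wide)
{
    if (wide)
        emit8(b, 0x48);                           /* REX.W prefix */
    emit8(b, 0x89);                               /* MOV r/m, r */
    emit8(b, (uint8_t)(0xC0 | (src << 3) | dst)); /* mod=11, reg=src, rm=dst */
}
```

Calling emit_mov_rr(&buf, RAX, RBX, 0) appends 89 C3, and the same call with wide set appends 48 89 C3, so a modifier like long only has to add one prefix byte.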

4. SPIR-V Generation

A significant portion of the presentation focuses on using this same macro-assembler foundation to generate SPIR-V (the intermediate representation for Vulkan compute/graphics shaders) entirely from scratch, replacing massive compiler toolchains like glslang.

  • x64 vs. SPIR-V Complexity: Onat notes that x64 assembly was actually less complicated to generate than SPIR-V.
    • x64 is a flat, linear instruction stream.
    • SPIR-V is strictly structured. A module must follow a fixed section order: Capabilities, Extensions, the Memory Model, Entry Points, Execution Modes, debug and annotation instructions, then Types, constants, and global variables, all before the function bodies that carry the actual logic.
  • SPIR-V Macros (OCR Insights): The OCR captures the exact implementation of the SPIR-V generator:
    • Words like opTypeInt 32, opTypeVector 4, opTypeFloat map directly to instruction opcodes in the SPIR-V specification (OpTypeInt, OpTypeVector, OpTypeFloat).
    • The addressing model is stated explicitly: PhysicalStorageBuffer64.
    • This suggests that the "sourceless" environment scales from raw CPU machine code to structured GPU bytecode by changing only the underlying byte-emission macros.
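
What the type-declaring words have to emit can be sketched in C. The SpvBuf type and all function names below are hypothetical; the magic number, the word-count/opcode packing, and the opcode values (OpTypeInt 21, OpTypeFloat 22, OpTypeVector 23) come from the SPIR-V specification. The result is a fragment, not a complete valid module: the capability, memory model, and entry point sections are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size buffer of 32-bit SPIR-V words. */
typedef struct {
    uint32_t words[64];
    size_t   len;
} SpvBuf;

void spv_word(SpvBuf *b, uint32_t w)
{
    assert(b->len < 64);
    b->words[b->len++] = w;
}

/* Every instruction starts with (word count << 16) | opcode. */
void spv_op(SpvBuf *b, uint16_t opcode, uint16_t word_count)
{
    spv_word(b, ((uint32_t)word_count << 16) | opcode);
}

/* Five-word module header. */
void spv_header(SpvBuf *b, uint32_t version, uint32_t id_bound)
{
    spv_word(b, 0x07230203u); /* magic number */
    spv_word(b, version);     /* e.g. 0x00010500 for SPIR-V 1.5 */
    spv_word(b, 0);           /* generator id (0 = unregistered) */
    spv_word(b, id_bound);    /* all result ids are below this value */
    spv_word(b, 0);           /* reserved schema word */
}

void op_type_int(SpvBuf *b, uint32_t id, uint32_t width, uint32_t is_signed)
{
    spv_op(b, 21, 4);         /* OpTypeInt */
    spv_word(b, id);
    spv_word(b, width);
    spv_word(b, is_signed);
}

void op_type_float(SpvBuf *b, uint32_t id, uint32_t width)
{
    spv_op(b, 22, 3);         /* OpTypeFloat */
    spv_word(b, id);
    spv_word(b, width);
}

void op_type_vector(SpvBuf *b, uint32_t id, uint32_t component_type,
                    uint32_t component_count)
{
    spv_op(b, 23, 4);         /* OpTypeVector */
    spv_word(b, id);
    spv_word(b, component_type);
    spv_word(b, component_count);
}
```

The emission side stays as simple as the x64 case (append words to a buffer); the extra difficulty Onat describes comes from the required ordering of sections and the bookkeeping of result ids.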

5. Key Takeaways for the bootslop Implementation

  1. Immediate x64 Access: The system shouldn't hide the CPU. It should expose it via macros (like CCALL) that handle the tedious parts of the ABI while letting the programmer write movups if they want to.
  2. Visual Over Text: The implementation of 4 extra bytes per character to store "source location" reinforces that the visual grid is the data structure. It's not text being parsed; it's a spatial array of tokens pointing to each other.
  3. The FFI Bridge: We will need a macro pattern equivalent to CCALL in our JIT emitter to talk to WinAPI functions without trashing the 2-item (RAX/RDX) stack or violating the 16-byte RSP alignment required by Windows.
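
One possible shape for the FFI bridge in takeaway 3 is sketched below in C. Everything about the design is an assumption rather than something shown in the talk: the emit_ccall1 name, the use of RBP to remember the old stack pointer, R10 as the scratch register for the target address, and the rule that the top stack item (RAX) becomes the first argument while the C return value replaces it. The instruction encodings are standard x86-64, and the 32 bytes reserved below the aligned RSP are the shadow space the Windows x64 convention expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size code buffer. */
typedef struct {
    uint8_t bytes[64];
    size_t  len;
} CodeBuf;

void emit_bytes(CodeBuf *b, const uint8_t *src, size_t n)
{
    assert(b->len + n <= sizeof b->bytes);
    memcpy(b->bytes + b->len, src, n);
    b->len += n;
}

/* Emit a one-argument call to a C function at `target` (Windows x64). */
void emit_ccall1(CodeBuf *b, uint64_t target)
{
    static const uint8_t before[] = {
        0x55,                   /* push rbp                             */
        0x48, 0x89, 0xE5,       /* mov  rbp, rsp   (remember old RSP)   */
        0x52,                   /* push rdx        (save 2nd stack item) */
        0x48, 0x83, 0xE4, 0xF0, /* and  rsp, -16   (16-byte alignment)  */
        0x48, 0x83, 0xEC, 0x20, /* sub  rsp, 32    (shadow space)       */
        0x48, 0x89, 0xC1,       /* mov  rcx, rax   (arg 1 = top item)   */
        0x49, 0xBA              /* mov  r10, imm64 (address follows)    */
    };
    static const uint8_t after[] = {
        0x41, 0xFF, 0xD2,       /* call r10                             */
        0x48, 0x8B, 0x55, 0xF8, /* mov  rdx, [rbp-8] (restore 2nd item) */
        0x48, 0x89, 0xEC,       /* mov  rsp, rbp   (undo alignment)     */
        0x5D                    /* pop  rbp                             */
    };
    uint8_t imm[8];
    for (int i = 0; i < 8; i++) /* little-endian 64-bit immediate */
        imm[i] = (uint8_t)(target >> (8 * i));

    emit_bytes(b, before, sizeof before);
    emit_bytes(b, imm, sizeof imm);
    emit_bytes(b, after, sizeof after);
}
```

RDX has to be saved explicitly because it is a volatile register under the Windows convention, so the callee is free to overwrite it. Restoring RSP from RBP undoes the alignment without needing to know how many bytes the and instruction removed.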