forth_bootslop/references/Architectural_Consolidation.md
2026-02-20 22:10:29 -05:00

Architectural Consolidation: Zero-Overhead Sourceless ColorForth

This document serves as the master blueprint for the research and curation phase, synthesizing the findings from Timothy Lottes, Onat Türkçüoğlu, and related high-performance minimalist systems.

1. Core Philosophy

  • Sourceless: The "source of truth" is a 32-bit token array, not a text file. No string parsing occurs at runtime.
  • Zero-Overhead: Instant iteration (<5ms compilation) by emitting machine code directly from tokens.
  • Bounded Complexity: Force complexity into data structures rather than code logic.
  • Hardware Locality: Treat the register file as a global namespace; minimize or eliminate the data stack.

2. Lottes' x68 Architecture (The Frontend/Editor)

  • 32-Bit Instruction Granularity: Every x86-64 instruction is padded to exactly 4 bytes (or multiples thereof) using ignored prefixes and multi-byte NOPs.
    • Example: RET (0xC3) -> C3 90 90 90.
  • Token Format: 32-bit words consisting of:
    • 28 Bits: Compressed name/string or value.
    • 4 Bits: Semantic Tag (Opcode, Abs Addr, Rel Addr, Immediate, etc.).
  • Annotation Overlay: A parallel memory layer (e.g., 64-bit per token) stores metadata for the editor (colors, names, formatting tags) without polluting the executable.
  • Tooling Recommendation: ImHex with a custom .hexpat pattern language can serve as the visual frontend for this annotation overlay.
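
The token layout and the 4-byte padding rule above can be sketched as follows. This is a minimal illustration, not Lottes' actual encoding: the tag numbering, the choice to put the tag in the top 4 bits, and the helper names (`pack_token`, `unpack_token`, `pad_to_granule`) are all assumptions. The padding helper also uses only single-byte NOPs (0x90) for simplicity, whereas the scheme described above also relies on ignored prefixes and multi-byte NOPs.

```python
TAG_BITS = 4
PAYLOAD_BITS = 28
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1

# Hypothetical tag values: the text names the tag kinds but not their numbering.
TAG_OPCODE, TAG_ABS_ADDR, TAG_REL_ADDR, TAG_IMMEDIATE = range(4)


def pack_token(tag: int, payload: int) -> int:
    """Pack a 4-bit tag and a 28-bit payload into one 32-bit token.

    Assumption: the tag occupies the high 4 bits.
    """
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag does not fit in 4 bits")
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 28 bits")
    return (tag << PAYLOAD_BITS) | payload


def unpack_token(token: int) -> tuple:
    """Split a 32-bit token back into (tag, payload)."""
    return token >> PAYLOAD_BITS, token & PAYLOAD_MASK


def pad_to_granule(code: bytes, granule: int = 4) -> bytes:
    """Pad an encoded instruction to a multiple of `granule` bytes with 0x90 NOPs."""
    pad = -len(code) % granule
    return code + b"\x90" * pad
```

With these helpers, `pad_to_granule(b"\xC3")` reproduces the `C3 90 90 90` example, and the annotation overlay could be modeled as a second list of the same length as the token array, indexed in parallel.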

3. Onat's VAMP/KYRA Architecture (The Runtime/Codegen)

  • 2-Item Register Stack: Uses RAX and RDX as a tiny, hardware-resident stack.
    • The Swap / Magenta Pipe: A definition boundary implicitly emits RET (to close the last block) followed by xchg rax, rdx (3 bytes as 48 87 C2, or 2 bytes in the short form 48 92) to rotate the "top of stack" for the new block.
  • Aliased Global Namespace: The CPU register file is treated as a shared, aliased memory space for functions.
  • Functions as Blocks: Words are "free of arguments and returns" in the traditional sense.
  • Preemptive Scatter ("Tape Drive"): Arguments are pre-placed into fixed, contiguous memory slots ("the tape") by the compiler/loader before execution. This eliminates "argument gathering" during function calls.
  • The FFI Dance (C-ABI Integration): To call OS APIs (like WinAPI or Vulkan), the hardware stack pointer (RSP) must be 16-byte aligned at the call site. Custom macros (like CCALL) must save state, align RSP, reserve the 32-byte shadow space that the Microsoft x64 convention requires, map the 2-register stack into the C-ABI argument registers (RCX, RDX, R8, R9 under that convention), execute the CALL, and restore RSP.
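
The pieces in this section can be modeled in a few lines. This is a toy sketch rather than Onat's implementation: `RegStack`, `scatter`, `definition_boundary`, and `align_rsp_for_call` are hypothetical names, the tape is modeled as a plain list, and the 32-byte shadow space is a Microsoft x64 convention detail assumed here for the Windows case.

```python
RET = bytes([0xC3])
XCHG_RAX_RDX = bytes([0x48, 0x92])  # REX.W prefix + short-form XCHG RAX, RDX (2 bytes)


def definition_boundary() -> bytes:
    """Bytes emitted at a definition boundary: close the old block, then rotate."""
    return RET + XCHG_RAX_RDX


class RegStack:
    """Toy model of the 2-item register stack held in RAX and RDX."""

    def __init__(self, rax: int = 0, rdx: int = 0) -> None:
        self.rax = rax
        self.rdx = rdx

    def swap(self) -> None:
        # Models the effect of xchg rax, rdx.
        self.rax, self.rdx = self.rdx, self.rax


def scatter(tape: list, slots: dict, values: dict) -> None:
    """Pre-place each argument into its fixed tape slot before execution."""
    for name, value in values.items():
        tape[slots[name]] = value


def align_rsp_for_call(rsp: int, shadow_space: int = 32) -> int:
    """Round RSP down to a 16-byte boundary, then reserve shadow space.

    The result stays 16-byte aligned because the shadow space is a multiple of 16.
    """
    aligned = rsp & ~0xF
    return aligned - shadow_space
```

A CCALL-style macro would save the original RSP before this adjustment and restore it after the CALL returns; the sketch only shows the address arithmetic.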

4. Implementation Components

  • Emitter: Zydis Encoder API. Zero-allocation, sub-5ms instruction generation.
  • Live Reload: Hot Runtime Linking (Fredriksson style). Atomic pointer swapping at main-loop "safe points" to patch code in-place.
  • Threading Model: Direct Threaded Code (DTC) for the initial dictionary/execution token (xt) baseline.
  • Wasm Parallels: WebAssembly's linear memory and binary sectioning provide a modern reference for the "tape drive" and fixed-offset load/store model.
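
The live-reload idea, swapping an entry pointer only at a main-loop safe point, can be sketched as below. This is a single-threaded model under stated assumptions: `HotTable`, `stage`, `safe_point`, and `run_frames` are hypothetical names, and in a native build the swap would need to be a genuinely atomic pointer store rather than a Python attribute assignment.

```python
class HotTable:
    """Holds the active entry point plus an optional staged replacement."""

    def __init__(self, entry) -> None:
        self.active = entry
        self.pending = None

    def stage(self, new_entry) -> None:
        # Called by the compiler side when freshly emitted code is ready.
        self.pending = new_entry

    def safe_point(self) -> None:
        # Called by the main loop only between frames, when no old code
        # is executing, so the swap cannot tear a running block.
        if self.pending is not None:
            self.active, self.pending = self.pending, None


def run_frames(table: HotTable, frames: int) -> list:
    """Run `frames` iterations, checking the safe point before each one."""
    results = []
    for _ in range(frames):
        table.safe_point()
        results.append(table.active())
    return results
```

Staging a new entry between two calls to `run_frames` changes what later frames execute, while a frame already in flight always finishes on the code it started with.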

5. Visual Semantics (ColorForth Mapping)

  • RED: Define new word (Dictionary entry).
  • GREEN: Compile word into current definition.
  • YELLOW/ORANGE: Immediate execution (Macros/Editor commands).
  • CYAN/BLUE: Variables, Addresses, Layout.
  • WHITE/DIM: Comments, Annotations, UI.
  • MAGENTA: Pointers, State modifiers.
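
As a rough illustration of how the first three colors drive behavior, here is a toy interpreter. It is a sketch under several assumptions: tokens are modeled as `(color, word)` pairs with readable names instead of 32-bit packed values, the data stack is an ordinary list rather than registers, and the function names are hypothetical. The remaining colors are treated as non-executing and skipped.

```python
RED, GREEN, YELLOW, WHITE = "red", "green", "yellow", "white"


def make_word(body: list, dictionary: dict):
    """Build a callable that runs each compiled item of `body` in order."""
    def call(stack: list) -> None:
        for item in body:
            if isinstance(item, int):
                stack.append(item)        # compiled literal
            else:
                dictionary[item](stack)   # compiled call, resolved by name
    return call


def run(tokens: list, primitives: dict) -> list:
    dictionary = dict(primitives)
    stack = []
    current = None  # body of the definition currently being compiled
    for color, word in tokens:
        if color == RED:
            # Define a new word and start collecting its body.
            current = []
            dictionary[word] = make_word(current, dictionary)
        elif color == GREEN:
            # Compile into the current definition.
            if current is None:
                raise ValueError("green token outside a definition")
            current.append(word)
        elif color == YELLOW:
            # Execute immediately: push a number or call a word.
            if isinstance(word, int):
                stack.append(word)
            else:
                dictionary[word](stack)
        # Any other color (e.g. WHITE comments) is skipped in this sketch.
    return stack
```

For example, defining `square` in red, compiling `dup` and `*` in green, and then executing `7 square` in yellow leaves the product on the stack.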

Curation Phase Status: COMPLETE

Ready for Strategy Phase: Pending Directive