@@ -10,6 +10,7 @@ This document outlines the strict C style and architectural conventions expected
|
||||
* Float: `F4`, `F8`
|
||||
* Boolean: `B1`, `B2`, `B4`, `B8` (use `true`/`false` primitives)
|
||||
* Strings/Chars: `UTF8` (for characters), `Str8` (for string slices)
|
||||
* **Fundamental Type Casts:** Strictly use the provided casting macros (e.g., `u8_(val)`, `u4_r(ptr)`, `s4_(val)`) instead of standard C-style cast syntax like `(U8)val`. Standard casts should only be used for complex types or when an appropriate macro isn't available.
|
||||
* **WinAPI Structs:** Only use `MS_` prefixed fundamental types (e.g., `MS_LONG`, `MS_DWORD`) *inside* WinAPI struct definitions (`MS_WNDCLASSA`, etc.) to maintain FFI compatibility. Do not use them in general application logic.
|
||||
|
||||
## 2. Declaration Wrappers & X-Macros
|
||||
@@ -18,8 +19,8 @@ This document outlines the strict C style and architectural conventions expected
|
||||
* `typedef Enum_(UnderlyingType, Name) { ... };`
|
||||
* **X-Macros:** Use X-Macros to tightly couple Enums with their corresponding string representations or metadata.
|
||||
```c
|
||||
-#define My_Tag_Entries()
-    X(Define, "Define")
+#define My_Tag_Entries() \
+    X(Define, "Define") \
|
||||
X(Call, "Call")
|
||||
```
|
||||
|
||||
@@ -43,19 +44,16 @@ This document outlines the strict C style and architectural conventions expected
|
||||
} MS_WNDCLASSA;
|
||||
```
|
||||
* **Multi-line Argument Alignment:** For long function signatures, place one argument per line with a single 4-space indent.
|
||||
* Example:
|
||||
```c
|
||||
WinAPI B4 ms_read_console(
|
||||
MS_Handle handle,
|
||||
UTF8*r buffer,
|
||||
U4 to_read,
|
||||
U4*r num_read,
|
||||
U8 reserved_input_control
|
||||
) asm("ReadConsoleA");
|
||||
```
|
||||
* **WinAPI Grouping:** Group foreign procedure declarations by their originating OS library (e.g., Kernel32, User32, GDI32) using comment headers.
|
||||
* **Brace Style:** Use Allman style (braces on a new line) for function bodies or control blocks (`if`, `for`, `switch`, etc.) that are large or complex. Smaller blocks may use K&R style.
|
||||
* **Conditionals:** Always place `else if` and `else` statements on a new line, un-nested from the previous closing brace.
|
||||
* **Conditionals & Control Flow:** Always place `else if` and `else` statements on a new line. Align control flow parentheses (e.g., between consecutive `while` and `if` blocks) vertically when possible for aesthetic uniformity:
|
||||
```c
|
||||
while (len < 8) len ++;
if    (len > 0) { ... }
|
||||
```
|
||||
* **Address-Of Operator:** Do insert a space between the address-of operator (`&`) and the variable name.
|
||||
* **Correct:** `& my_var`
|
||||
* **Incorrect:** `&my_var`
|
||||
|
||||
## 5. Memory Management
|
||||
* **Standard Library:** The C standard library is linked, but headers like `<stdlib.h>` or `<string.h>` should not be included directly. Required functions should be declared manually if needed, or accessed via compiler builtins.
|
||||
@@ -77,3 +75,4 @@ This document outlines the strict C style and architectural conventions expected
|
||||
X(Call, "Call", 0x00D6A454, "~")
|
||||
```
|
||||
* **Naming Conventions:** When using X-Macros for Tags, entry names should be PascalCase, and the Enum symbols should be prefixed with the Enum type name (e.g., `tmpl(STag, Define)` -> `STag_Define`).
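
A minimal sketch of the X-Macro pattern described above. It assumes a hypothetical `tmpl` token-pasting helper and a plain `enum` instead of the project's `Enum_` wrapper, whose definition is not shown in this diff. The entry list is expanded twice: once for the prefixed enum symbols and once for the matching string table, so the two cannot drift apart.

```c
#include <string.h>

/* Hypothetical helper: pastes type name and entry name (STag, Define -> STag_Define). */
#define tmpl(prefix, name) prefix##_##name

/* Single source of truth for the tag entries (PascalCase entry names). */
#define STag_Entries() \
    X(Define, "Define") \
    X(Call,   "Call")

/* First expansion: enum symbols prefixed with the enum type name. */
typedef enum STag {
#define X(name, str) tmpl(STag, name),
    STag_Entries()
#undef X
    STag_Count
} STag;

/* Second expansion: string table indexed by enum value. */
static const char* stag_names[] = {
#define X(name, str) str,
    STag_Entries()
#undef X
};

/* Returns the string for a tag, or "?" for an out-of-range value. */
static const char* stag_name(STag tag) {
    return (tag < STag_Count) ? stag_names[tag] : "?";
}
```

Adding a new tag then means adding one `X(...)` line; the enum and the table pick it up together.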
|
||||
|
||||
|
||||
28
GEMINI.md
@@ -38,3 +38,31 @@ Based on the curation in `./references/`, the resulting system MUST adhere to th
|
||||
4. **Preemptive Scatter ("Tape Drive"):** Function arguments are not pushed to a stack before a call. They are "scattered" into pre-allocated, contiguous global memory slots during compilation/initialization. The function simply reads from these known offsets, eliminating argument gathering overhead.
|
||||
5. **No `if/then` branches:** Rely on hardware flags through conditional returns (`ret-if-signed`) combined with factored calls, which removes the need for a complex AST parser.
|
||||
6. **No Dependencies:** C implementation must be minimal (`-nostdlib`), ideally running directly against OS APIs (e.g., WinAPI `VirtualAlloc`, `ExitProcess`, `GDI32` for rendering).
|
||||
|
||||
## Current Development Roadmap (attempt_1)
|
||||
|
||||
Here's a breakdown of the next steps to advance the `attempt_1` implementation towards a ColorForth derivative:
|
||||
|
||||
1. **Enhance Lexer/Parser/Compiler (JIT) in `main.c`:**
|
||||
* **Token Interpretation:** Refine the interpretation of the 28-bit payload based on the 4-bit color tag (e.g., differentiate between immediate values, dictionary IDs, and data addresses).
|
||||
* **Dictionary Lookup:** Improve the efficiency and scalability of dictionary lookups for custom words beyond the current linear search.
|
||||
* **New Word Definition:** Implement mechanisms for defining new Forth words directly within the editor, compiling them into the `code_arena`.
|
||||
|
||||
2. **Refine Visual Editor (`win_proc` in `main.c`):**
|
||||
* **Dynamic Colorization:** Ensure all rendered tokens accurately reflect their 4-bit color tags, updating dynamically with changes.
|
||||
* **Annotation Handling:** Implement more sophisticated display for token annotations, supporting up to 8 characters clearly without truncation or visual artifacts.
|
||||
* **Input Handling:** Improve text input for `STag_Data` (e.g., supporting full hexadecimal input, backspace functionality).
|
||||
* **Cursor Behavior:** Ensure the cursor accurately reflects the current editing position within the token stream.
|
||||
|
||||
3. **Expand Register-Only Stack Operations:**
|
||||
* Implement core Forth stack manipulation words (e.g., `DUP`, `DROP`, `OVER`, `ROT`) by generating appropriate x86-64 assembly instructions that operate solely on `RAX` and `RDX`.
|
||||
|
||||
4. **Develop `Tape Drive` Memory Management:**
|
||||
* Ensure all memory accesses (reads and writes) for Forth variables and data structures go through the `vm_globals` array and follow the "preemptive scatter" approach.
|
||||
|
||||
5. **Implement Control Flow without Branches:**
|
||||
* Leverage conditional returns and factored calls to create more complex control flow structures (e.g., `IF`/`ELSE`/`THEN` equivalents) without introducing explicit `jmp` instructions where not architecturally intended.
|
||||
|
||||
6. **Continuous Validation & Debugging:**
|
||||
* Enhance debugging output within the UI to provide clearer insight into VM state (RAX, RDX, global memory, log buffer) during execution.
|
||||
* Consider adding simple "tests" as Forth sequences within `tape_arena` to verify new features.
|
||||
|
||||
86
references/Video_Breakdowns.md
Normal file
@@ -0,0 +1,86 @@
|
||||
# Advanced Source-Less Programming & JIT Architecture: A Hardcore Technical Study
|
||||
|
||||
This document is a technical extraction of the mechanics, JIT compiler optimizations, and paradigms presented by Timothy Lottes and Onat Türkçüoğlu. The notes go beyond high-level theory and detail the x86-64 assembly generation rules, state-tracking mechanisms, and memory layouts required to implement a zero-overhead, source-less Forth environment.
|
||||
|
||||
---
|
||||
|
||||
## 1. The Lottes "x68" Paradigm: Editor as the OS
|
||||
|
||||
Lottes's approach fundamentally transforms the editor into a live, dynamic linker and machine-code orchestrator.
|
||||
|
||||
### 1.1 The Lexical Grid and 32-Bit Instruction Granularity
|
||||
In x68, the runtime contains *no parsing logic*. Code is a flat array of 32-bit tokens (4-bit tag, 28-bit payload).
|
||||
To make the x86-64 architecture fit this visual editor grid, Lottes forces all generated machine code to 32-bit boundaries:
|
||||
* **Instruction Padding:** Native instructions that are smaller than 4 bytes are padded.
|
||||
* *Example:* `RET` (`0xC3`) becomes `C3 90 90 90` (using three `NOP`s).
|
||||
* *Example:* `MOV` or `ADD` can use ignored segment overrides (like the `3E` DS prefix) or unnecessary `REX` prefixes to reach exactly 4 bytes.
|
||||
* **Auto-Relinking:** The editor implicitly acts as a linker. Because every instruction is 32 bits, 32-bit RIP-relative offsets for `CALL` (`E8`) and `JMP` (`E9`) are perfectly aligned. When the user inserts or deletes a token in the editor, the editor instantly recalculates and updates the raw binary relative offsets for all branch instructions.
|
||||
* **Shorthand Assembly UI:** The editor can decode these 32-bit blocks and display human-readable macro-assembly, e.g., mapping `add rcx, qword ptr [rdx + 0x8]` to the visual string `h + at i08`.
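
The padding rule can be sketched as follows. This is an illustration under assumptions, not Lottes's actual emitter: `emit_padded` and `SLOT_BYTES` are hypothetical names, and only trailing-`NOP` padding (the `RET` case above) is covered, not prefix-based padding.

```c
#include <stdint.h>
#include <stddef.h>

enum { SLOT_BYTES = 4, NOP_BYTE = 0x90 };

/* Copies an instruction of 1..4 bytes into `out` and fills the rest of the
   4-byte slot with NOPs. Returns 4 on success, or 0 if the encoding is
   empty or does not fit in one slot. */
static size_t emit_padded(uint8_t* out, const uint8_t* insn, size_t len) {
    if (len == 0 || len > SLOT_BYTES) return 0;
    for (size_t i = 0; i < SLOT_BYTES; i++)
        out[i] = (i < len) ? insn[i] : NOP_BYTE;
    return SLOT_BYTES;
}
```

Because every slot has the same size, a token index in the editor maps to a code offset by a single multiplication, which is what makes relinking on insert or delete cheap.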
|
||||
|
||||
### 1.2 ColorForth Semantic Tags & The State Machine
|
||||
The 4-bit color tag dictates how the editor/JIT interprets the 28-bit payload:
|
||||
* **White (Ignored):** Comments, formatting, or skipped words.
|
||||
* **Yellow (Immediate Execution):**
|
||||
* If a number: Append it to the data stack *during edit/compile time*.
|
||||
* If a word: Look it up in the dictionary and execute its associated code *immediately*.
|
||||
* **Red (Define):** Sets a word in the dictionary to point to the current compilation address (or TOS).
|
||||
* **Green (Compile):**
|
||||
* If a number: Emits machine code to push that number (e.g., `mov rax, imm`).
|
||||
* If a word: Looks it up in the *Macro* dictionary; if found, calls it (code generation). Otherwise, looks it up in the *Forth* dictionary and emits a `CALL` to it.
|
||||
* **Cyan/Blue (Defer Execution):** Looks up a word in the macro dictionary and appends a call to it. Used for macros that generate other macros.
|
||||
* **Magenta (Variable/Pointer):** Sets the dictionary value to point to the *next source token* in memory.
|
||||
* **The Transition Trigger:** A transition from Yellow (Execution) to Green (Compilation) causes the JIT to pop the current Top of Stack and emit a native machine-code instruction to push that value. (i.e., "Turning a computed number back into a program").
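
A small sketch of the 32-bit token layout may help here. The notes only state the 4-bit/28-bit split; placing the tag in the top four bits and the numeric values of `ColorTag` are assumptions made for illustration.

```c
#include <stdint.h>

/* Illustrative tag values; the real numeric assignments are not given in these notes. */
typedef enum ColorTag {
    Tag_White = 0, Tag_Yellow, Tag_Red, Tag_Green, Tag_Cyan, Tag_Magenta
} ColorTag;

#define PAYLOAD_BITS 28u
#define PAYLOAD_MASK ((1u << PAYLOAD_BITS) - 1u)

/* Packs a tag into the top 4 bits and a payload into the low 28 bits (assumed layout). */
static uint32_t token_pack(ColorTag tag, uint32_t payload) {
    return ((uint32_t)tag << PAYLOAD_BITS) | (payload & PAYLOAD_MASK);
}

static ColorTag token_tag(uint32_t token)     { return (ColorTag)(token >> PAYLOAD_BITS); }
static uint32_t token_payload(uint32_t token) { return token & PAYLOAD_MASK; }
```

With this layout the editor and the JIT can both dispatch on `token_tag(t)` without any text parsing.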
|
||||
|
||||
### 1.3 The 5-Byte Folded Interpreter
|
||||
To eliminate the pipeline stalls (branch mispredictions) caused by funnelling every dispatch through one shared `NEXT` routine in a threaded-code interpreter, Lottes suggests embedding a micro-interpreter at the *end of every word*:
|
||||
1. **`LODSD` (1 byte, `AD`):** Loads the next 32-bit token from `[RSI]` (the instruction pointer) into `EAX`, which zero-extends into `RAX`, and advances `RSI` by 4 (with the direction flag clear). Adding a `REX.W` prefix turns it into the 2-byte `LODSQ`, which loads 64 bits instead.
|
||||
2. **Lookup (2 bytes):** Uses a highly optimized hash or direct mapping to translate the token payload to a memory address.
|
||||
3. **Jump (2 bytes):** Emits an indirect jump (e.g., `JMP RAX`).
|
||||
*Result:* Every word transition has its own dedicated branch predictor slot in the CPU hardware, reducing average clock stalls from ~16 to near 0.
|
||||
|
||||
---
|
||||
|
||||
## 2. Onat's VAMP / KYRA: High-Performance Macro-Assembler
|
||||
|
||||
Onat's implementation provides a masterclass in eliminating the Forth data stack and leveraging x86-64 hardware registers optimally.
|
||||
|
||||
### 2.1 The 2-Register Stack & JIT State Tracking
|
||||
Traditional Forth maintains a data stack in RAM, requiring constant memory loads/stores. Onat eliminates this:
|
||||
* **The Stack is `RAX` and `RDX`.** No memory is used for parameter passing.
|
||||
* **The 1-Bit JIT Optimizer:** The JIT compiler maintains a single bit of state: `is_rax_tos` (Is RAX currently the Top of Stack?).
|
||||
* **Smart Compilation:**
|
||||
* If the user types a Cyan number (Immediate), the JIT checks `is_rax_tos`. If true, it emits `mov rax, imm`. If false, it emits `mov rdx, imm`.
|
||||
* Before compiling a `CALL`, the JIT knows which register the target function expects the TOS to be in. If the current JIT state mismatches the target's expectation, it automatically emits the 3-byte `xchg rax, rdx` (`48 87 C2`) instruction *before* the call.
|
||||
* This makes operations like `SWAP` virtually free—they often just flip the compiler's internal `is_rax_tos` boolean without emitting any machine code.
|
||||
* **Function Prologue/Epilogue:** Functions do not manually push to or pop from a return stack in memory; they rely on the native x86 `call` and `ret` instructions, with `RSP` serving only as the call stack.
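
The one-bit optimizer can be sketched as below. The `Jit` struct and function names are hypothetical; only the `is_rax_tos` flag and the `48 87 C2` encoding come from the notes above.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

typedef struct Jit {
    uint8_t code[64];
    size_t  len;
    bool    is_rax_tos;  /* the single bit of optimizer state */
} Jit;

/* Appends raw bytes to the code buffer, silently stopping when it is full. */
static void jit_emit(Jit* jit, const uint8_t* bytes, size_t n) {
    for (size_t i = 0; i < n && jit->len < sizeof jit->code; i++)
        jit->code[jit->len++] = bytes[i];
}

/* SWAP emits no machine code; only the compiler's view of the registers flips. */
static void jit_swap(Jit* jit) {
    jit->is_rax_tos = !jit->is_rax_tos;
}

/* Before a CALL: if the callee expects TOS in the other register,
   emit `xchg rax, rdx` and update the tracked state. */
static void jit_fix_tos(Jit* jit, bool callee_wants_rax_tos) {
    static const uint8_t xchg_rax_rdx[] = { 0x48, 0x87, 0xC2 };
    if (jit->is_rax_tos != callee_wants_rax_tos) {
        jit_emit(jit, xchg_rax_rdx, sizeof xchg_rax_rdx);
        jit->is_rax_tos = callee_wants_rax_tos;
    }
}
```

The exchange is only paid for when a mismatch actually reaches a call boundary; any number of `SWAP`s in between cost nothing at runtime.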
|
||||
|
||||
### 2.2 Global Preemptive Scatter (The "Tape Drive")
|
||||
Because the data stack is limited to two items, passing deep context on the stack is impossible, so all other state lives in global memory.
|
||||
* **Global Single-Register Base:** A single x86 register (e.g., `R12` or `R15`) is dedicated globally as the base pointer for all application memory (giving "gigabytes of state").
|
||||
* **Colors map to memory operations:**
|
||||
* **Green Tag (Read):** Emits `mov REG, [base_ptr + token_offset]`.
|
||||
* **Red Tag (Write):** Emits `mov [base_ptr + token_offset], REG`.
|
||||
* **FFI (Foreign Function Interface):** To call complex OS APIs (like Vulkan's `VkImageCreateInfo`), VAMP does not use C-struct bindings. It manually calculates byte offsets from the global base, emits instructions to write the struct data inline, aligns `RSP` for the OS calling convention, and calls the dynamic library pointer.
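
As an illustration of the read/write tags, the sketch below encodes the two `mov` forms against a fixed base register. It assumes `R12` as the base and always uses the 32-bit displacement form; `emit_global_access` and `StackReg` are hypothetical names, not VAMP's own.

```c
#include <stdint.h>
#include <stddef.h>

/* x86-64 register numbers for the two stack registers. */
typedef enum StackReg { Reg_RAX = 0, Reg_RDX = 2 } StackReg;

/* Emits `mov reg, [r12 + disp32]` (is_store == 0) or
   `mov [r12 + disp32], reg` (is_store != 0). Always writes 8 bytes.
   The displacement is sign-extended by the CPU, so offsets must stay below 2 GiB. */
static size_t emit_global_access(uint8_t* out, StackReg reg, uint32_t offset, int is_store) {
    out[0] = 0x49;                                     /* REX.W + REX.B (base register is r12) */
    out[1] = is_store ? 0x89 : 0x8B;                   /* MOV r/m64, r64  or  MOV r64, r/m64 */
    out[2] = (uint8_t)(0x84u | ((unsigned)reg << 3));  /* ModRM: mod=10 (disp32), rm=100 (SIB follows) */
    out[3] = 0x24;                                     /* SIB: no index, base = r12 */
    for (int i = 0; i < 4; i++)                        /* disp32, little-endian */
        out[4 + i] = (uint8_t)(offset >> (8 * i));
    return 8;
}
```

`R12` needs the extra SIB byte because its low three register bits collide with the ModRM escape value; a base such as `R15` would encode one byte shorter.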
|
||||
|
||||
### 2.3 Lexical Syntax and Color Semantics
|
||||
Onat uses a 24-bit dictionary index + 8-bit color tag. The semantics map directly to JIT actions:
|
||||
* **Magenta Pipe (`|`):** Defines the boundary of a function. The JIT encounters this, emits a `RET` (`C3`) to close the previous function, and records the current instruction pointer as the start address of the new function.
|
||||
* **White (Call):** Emits a relative `CALL` to the target. (If jumping to a dynamic address already in a register, it optimizes to `JMP RAX`).
|
||||
* **Yellow (Macro):** Executes the attached code *during JIT compilation*. Used for compiler directives, setting layouts, or emitting specialized instructions like `LOCK` prefixes.
|
||||
* **Blue (Comment):** Skipped by the JIT entirely.
|
||||
|
||||
### 2.4 Control Flow without ASTs
|
||||
VAMP abandons standard `IF/ELSE/THEN` parsing trees in favor of assembly-level basic blocks and lambdas.
|
||||
* **Lambdas `{ }`:** Defining a lambda simply compiles the block of code elsewhere and leaves its executable memory address on the stack (`RAX` or `RDX`).
|
||||
* **Conditionals via Global State:**
|
||||
1. A comparison (e.g., `>`) is executed.
|
||||
2. The result is written to a dedicated global variable (e.g., `condition` using a Red tag).
|
||||
3. The conditional jump word reads the `condition` variable, consumes the lambda's address from the stack, and emits `CMP condition, 0` followed by `JZ lambda_address`.
|
||||
* **Basic Blocks `[ ]`:** These constrain the scope of assembly generation. If a conditional within a block passes, execution falls through. If it fails, it jumps to the nearest closing `]`.
|
||||
|
||||
### 2.5 Live Debugging via Instruction Injection
|
||||
The most powerful UX feature of VAMP is its real-time data flow visualization.
|
||||
* The editor tracks the user's cursor position.
|
||||
* During JIT compilation, if the `compiler_instruction_ptr` equals the `editor_cursor_ptr`, the JIT injects a debug macro.
|
||||
* This macro emits instructions to copy the current state of `RAX` and `RDX` (the entire data stack) into a global circular buffer.
|
||||
* The UI reads this buffer, instantly displaying the exact runtime state of the program at the cursor's location, acting as an instant, zero-cost `printf`.
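
The mechanism can be modelled in plain C as below. This is a simulation under assumptions rather than the injected machine code itself: `TraceRing`, `trace_record`, and `trace_latest` are hypothetical names, and the buffer size is arbitrary.

```c
#include <stdint.h>

#define TRACE_SLOTS 4u  /* power of two, so the index wraps with a mask */

typedef struct TraceEntry { uint64_t rax, rdx; } TraceEntry;

typedef struct TraceRing {
    TraceEntry slots[TRACE_SLOTS];
    uint32_t   head;  /* total number of snapshots written so far */
} TraceRing;

/* Compile time: the debug macro is injected only at the token under the cursor. */
static int should_inject_trace(uint32_t compiler_instruction_ptr, uint32_t editor_cursor_ptr) {
    return compiler_instruction_ptr == editor_cursor_ptr;
}

/* Runtime: what the injected instructions would do with the two stack registers. */
static void trace_record(TraceRing* ring, uint64_t rax, uint64_t rdx) {
    TraceEntry* e = &ring->slots[ring->head & (TRACE_SLOTS - 1u)];
    e->rax = rax;
    e->rdx = rdx;
    ring->head++;
}

/* UI side: copies out the most recent snapshot; returns 0 if nothing was recorded. */
static int trace_latest(const TraceRing* ring, TraceEntry* out) {
    if (ring->head == 0) return 0;
    *out = ring->slots[(ring->head - 1u) & (TRACE_SLOTS - 1u)];
    return 1;
}
```

Because the whole data stack is two registers, one snapshot is just 16 bytes, which keeps the injected code short.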
|
||||