2026-02-21 14:18:15 -05:00
parent 6953e6b9b3
commit 67f8639ee7
5 changed files with 604 additions and 1467 deletions

@@ -44,25 +44,40 @@ Based on the curation in `./references/`, the resulting system MUST adhere to th
## Current Development Roadmap (attempt_1)
The prototype currently implements a functional WinAPI modal editor, a 2-register (`RAX`/`RDX`) JIT compiler with an `O(1)` visual linker, x68 32-bit instruction padding, implicit definition boundaries (Magenta Pipe), and an initial FFI Bridge (`emit_ffi_dance`).
The prototype currently implements:
- A functional WinAPI modal editor backed by `microui` for immediate-mode floating panels.
- A 2-register (`RAX`/`RDX`) JIT compiler with an `O(1)` visual linker (`tape_to_code_offset` table).
- x68-style 32-bit instruction padding via `pad32()` using `0x90` NOPs.
- Implicit definition boundaries (Magenta Pipe / `STag_Define`) emitting `JMP rel32` over the body and `xchg rax, rdx` at the entry point.
- An FFI Bridge (`x64_FFI_PROLOGUE`, `x64_FFI_MAP_ARGS`, `x64_FFI_CALL_ABS`, `x64_FFI_EPILOGUE`) for calling WinAPI functions safely from JIT'd code.
- Persistence via F1 (save) / F2 (load) to `cartridge.bin`.
- A Lambda tag (`STag_Lambda`) that compiles a code block out-of-line and leaves its address in `RAX`.
- A well-defined **x64 Emission DSL** (`#pragma region x64 Emission DSL`) with named REX prefixes, register encodings, ModRM/SIB composition macros, opcode constants, and composite instruction inline functions.
### x64 Emission DSL Discipline
All JIT code emission in `main.c` MUST use the x64 Emission DSL defined in the `#pragma region x64 Emission DSL` block. Raw magic bytes are forbidden. The allowed primitives are:
- **Composite helpers:** `x64_XCHG_RAX_RDX()`, `x64_MOV_RDX_RAX()`, `x64_MOV_RAX_RDX()`, `x64_ADD_RAX_RDX()`, `x64_SUB_RAX_RDX()`, `x64_IMUL_RAX_RDX()`, `x64_DEC_RAX()`, `x64_TEST_RAX_RAX()`, `x64_RET_IF_ZERO()`, `x64_RET_IF_SIGN()`, `x64_FETCH()`, `x64_STORE()`, `x64_CALL_RAX()`, `x64_RET()`.
- **Prologue/Epilogue:** `x64_JIT_PROLOGUE()`, `x64_JIT_EPILOGUE()`.
- **FFI:** `x64_FFI_PROLOGUE()`, `x64_FFI_MAP_ARGS()`, `x64_FFI_CALL_ABS(addr)`, `x64_FFI_EPILOGUE()`.
- **Raw emission only via named constants:** `emit8(x64_op_*)`, `emit8(x64_REX*)`, `emit8(x64_modrm(*))`, `emit32(val)`, `emit64(val)`.
- **Exception:** Forward jump placeholders (`JMP rel32`, `CALL rel32`) that have no composite helper may use `emit8(x64_op_JMP_rel32)` / `emit8(x64_op_CALL_rel32)` directly with a following `emit32(0)` placeholder, pending a dedicated DSL wrapper.
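As a rough illustration of how the named constants compose into instruction bytes, here is a minimal sketch. The uppercase constant names, the `CodeBuf` type, and this `x64_modrm` function are hypothetical stand-ins, not the definitions in `main.c`; the byte values are inferred from the raw bytes this commit replaces (`48 87 C2` for `xchg rax, rdx`, `48 89 D0` for `mov rax, rdx`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the DSL constants (values inferred, not copied). */
enum {
    X64_REX_W         = 0x48, /* assumed value of x64_REX: 64-bit operand size */
    X64_OP_XCHG       = 0x87, /* XCHG r/m64, r64 */
    X64_OP_MOV_RM_REG = 0x89, /* MOV  r/m64, r64 */
    X64_MOD_REG       = 3,    /* ModRM.mod = 11b: register-direct operand */
    X64_REG_RAX       = 0,
    X64_REG_RDX       = 2
};

/* ModRM byte layout: mod (2 bits) | reg (3 bits) | rm (3 bits). */
static uint8_t x64_modrm(uint8_t mod, uint8_t reg, uint8_t rm) {
    return (uint8_t)(((mod & 3u) << 6) | ((reg & 7u) << 3) | (rm & 7u));
}

typedef struct { uint8_t bytes[16]; size_t used; } CodeBuf;

static void emit8(CodeBuf* cb, uint8_t b) {
    assert(cb->used < sizeof cb->bytes);
    cb->bytes[cb->used++] = b;
}

/* xchg rax, rdx: reg field = RAX, rm field = RDX. */
static void emit_xchg_rax_rdx(CodeBuf* cb) {
    emit8(cb, X64_REX_W);
    emit8(cb, X64_OP_XCHG);
    emit8(cb, x64_modrm(X64_MOD_REG, X64_REG_RAX, X64_REG_RDX));
}

/* mov rax, rdx: for the "r/m, reg" opcode form the destination (RAX)
   goes in rm and the source (RDX) goes in reg. */
static void emit_mov_rax_rdx(CodeBuf* cb) {
    emit8(cb, X64_REX_W);
    emit8(cb, X64_OP_MOV_RM_REG);
    emit8(cb, x64_modrm(X64_MOD_REG, X64_REG_RDX, X64_REG_RAX));
}
```

The point of the sketch is that operand order matters for the `r/m, reg` opcode forms: swapping the `reg` and `rm` fields turns `mov rax, rdx` into `mov rdx, rax`, which is the kind of slip raw magic bytes make hard to see.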
Here is a breakdown of the next steps to advance the `attempt_1` implementation towards a complete ColorForth derivative:
1. ~~**Refine the FFI / Tape Drive Argument Scatter:**~~ (Completed via `PRIM_PRINT` updating to load R8/R9 from `vm_globals`)
* Currently, the FFI bridge only maps `RAX` and `RDX` to the C-ABI `RCX` and `RDX`.
* Implement "Preemptive Scatter" logic so the FFI bridge correctly reads subsequent arguments (e.g., `R8`, `R9`) directly from pre-defined offsets in the `vm_globals` tape drive instead of just zeroing them out.
1. ~~**Refine the FFI / Tape Drive Argument Scatter:**~~ (Completed)
2. ~~**Implement the Self-Modifying Cartridge (Persistence):**~~ (Completed via F1/F2 save/load)
3. ~~**Refine Visual Editor Interactions:**~~ (Completed via `microui` integration)
4. ~~**Audit and enforce x64 Emission DSL usage throughout `main.c`:**~~ (Completed — all raw magic bytes replaced with named DSL constants and composite helpers)
2. **Expanded Annotation Layer (Variable-Length Comments):**
* The current `anno_arena` strictly allocates 8 bytes (a `U8`) per token.
* Refactor the visual editor and annotation memory management to allow for arbitrarily long text blocks (comments) to be attached to specific tokens without disrupting the `O(1)` compilation mapping.
5. **Add DSL wrappers for forward jump placeholders:**
- `x64_JMP_fwd_placeholder(U4* offset_out)` — emits `E9 00000000` and writes the patch offset.
- `x64_patch_fwd(U4 offset)` — patches a previously emitted placeholder with the current code position.
- This will eliminate the last remaining raw `emit8`/`emit32` pairs in `compile_and_run_tape`.
3. ~~**Implement the Self-Modifying Cartridge (Persistence):**~~ (Completed via F1/F2 save/load)
* The tape and annotations are currently lost when the program closes.
* Move away from purely transient `VirtualAlloc` buffers to a memory-mapped file approach (or a manual Save/Load equivalent in WinAPI) to allow the "executable as source" to persist between sessions.
6. **Expanded Annotation Layer (Variable-Length Comments):**
- The current `anno_arena` strictly allocates 8 bytes (a `U8`) per token.
- Refactor the visual editor and annotation memory management to allow for arbitrarily long text blocks (comments) to be attached to specific tokens without disrupting the `O(1)` compilation mapping.
4. ~~**Refine Visual Editor Interactions:**~~ (Completed via `microui` integration)
* Implement a proper internal text-editing cursor within the `STag_Data` and `STag_Format` (annotation) tokens, rather than relying on backspace-truncation and appendage.
* Migrated to `microui` for immediate mode GUI floating panels, auto-layout token sizing (for a natural text look), and window resizing.
5. **Continuous Validation & Complex Control Flow:**
* Expand the primitive set to allow for more complex, AST-less control flow (e.g., handling Lambdas or specific Basic Block jumps).
7. **Continuous Validation & Complex Control Flow:**
- Expand the primitive set to allow for more complex, AST-less control flow (e.g., handling Basic Block jumps `[ ]`).
- Investigate adding a `RET_IF_ZERO` + tail-call pattern for loops without explicit branch instructions.
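The forward-jump wrappers proposed in the roadmap could look roughly like the sketch below. The two function names come from the roadmap item; everything else (the fixed-size `code` buffer, `code_used`, and the little-endian byte writes) is an assumption made for illustration and stands in for `code_arena`. The patch arithmetic mirrors the existing `current - (offset + 4)` expression in `compile_and_run_tape`.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  U1;
typedef uint32_t U4;

enum { x64_op_JMP_rel32 = 0xE9, CODE_CAP = 64 };

/* Stand-in for code_arena: a small fixed buffer with a write cursor. */
static U1 code[CODE_CAP];
static U4 code_used = 0;

static void emit8(U1 b) {
    assert(code_used < CODE_CAP);
    code[code_used++] = b;
}

/* Write a 32-bit value in little-endian order, as x86-64 expects. */
static void write32_at(U4 offset, U4 v) {
    assert(offset + 4 <= CODE_CAP);
    code[offset + 0] = (U1)(v);
    code[offset + 1] = (U1)(v >> 8);
    code[offset + 2] = (U1)(v >> 16);
    code[offset + 3] = (U1)(v >> 24);
}

/* Emit "E9 00 00 00 00" and report where the rel32 field starts. */
static void x64_JMP_fwd_placeholder(U4* offset_out) {
    emit8(x64_op_JMP_rel32);
    *offset_out = code_used;
    for (int i = 0; i < 4; i++) emit8(0);
}

/* Patch the rel32 so the jump lands at the current code position.
   rel32 is measured from the end of the JMP, i.e. offset + 4. */
static void x64_patch_fwd(U4 offset) {
    write32_at(offset, code_used - (offset + 4));
}
```

A caller would emit the placeholder, emit the body to be skipped, then call `x64_patch_fwd` with the saved offset. This sketch omits the `pad32()` calls that the real compiler places around each step.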

@@ -3,7 +3,7 @@
## Overview
`attempt_1` is a minimal C program that serves as a proof-of-concept for the "Lottes/Onat" sourceless ColorForth paradigm. It successfully integrates a visual editor, a live JIT compiler, and an execution environment into a single, cohesive Win32 application that links against the C runtime but avoids direct includes of standard headers, using manually declared functions instead.
The application presents a visual grid of 32-bit tokens and allows the user to navigate and edit them directly. On every keypress, the token array is re-compiled into x86-64 machine code and executed, with the results (register states and global memory) displayed instantly in the HUD.
The application presents a visual grid of 32-bit tokens rendered via `microui` floating panels and allows the user to navigate and edit them directly. On every keypress, the token array is re-compiled into x86-64 machine code and executed, with the results (register states and global memory) displayed instantly in the HUD.
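The 32-bit token layout can be sketched as follows. This is a function-style restatement of the `pack_token` / `unpack_tag` / `unpack_val` macros that appear elsewhere in this commit, assuming a 4-bit tag in the top nibble and a 28-bit value below it; the demo function and its values are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Tag lives in bits 28..31, value in bits 0..27. */
static uint32_t pack_token(uint32_t tag, uint32_t val) {
    return ((tag & 0x0Fu) << 28) | (val & 0x0FFFFFFFu);
}

static uint32_t unpack_tag(uint32_t token) { return (token >> 28) & 0x0Fu; }
static uint32_t unpack_val(uint32_t token) { return token & 0x0FFFFFFFu; }

/* Round-trip check: a token carries its tag and value independently,
   and values wider than 28 bits are truncated rather than corrupting the tag. */
static void token_roundtrip_demo(void) {
    uint32_t t = pack_token(2u, 5u);
    assert(t == 0x20000005u);
    assert(unpack_tag(t) == 2u);
    assert(unpack_val(t) == 5u);

    uint32_t wide = pack_token(1u, 0xFFFFFFFFu);
    assert(unpack_tag(wide) == 1u);
    assert(unpack_val(wide) == 0x0FFFFFFFu);
}
```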
## Core Concepts Implemented
@@ -17,42 +17,90 @@ The application presents a visual grid of 32-bit tokens and allows the user to n
3. **2-Register Stack & Global Memory:**
* The JIT compiler emits x86-64 that strictly adheres to Onat's `RAX`/`RDX` register stack.
* A `vm_globals` array is passed by pointer into the JIT'd code (via `RCX` on Win64), allowing instructions like `FETCH` and `STORE` to simulate the "tape drive" memory model.
* A `vm_globals` array (16 x `U8`) is passed by pointer into the JIT'd code via `RCX` (Win64 calling convention), held in `RBX` for the duration of execution.
* `vm_globals[14]` and `vm_globals[15]` serve as the `RAX` and `RDX` save/restore slots across JIT entry and exit.
* Indices 0-13 are available as the "tape drive" global memory for `FETCH`/`STORE` primitives.
4. **Handmade x86-64 JIT Emitter:**
* A small set of `emit8`/`emit32` functions write raw x86-64 opcodes into a `VirtualAlloc` block marked as executable (`PAGE_EXECUTE_READWRITE`).
* This buffer is cast to a C function pointer and called directly, bypassing the need for an external assembler like NASM or a complex library like Zydis for this prototype stage.
4. **Handmade x86-64 JIT Emitter with Named DSL:**
* A small set of `emit8`/`emit32`/`emit64` functions write raw x86-64 opcodes into a `VirtualAlloc` block marked `PAGE_EXECUTE_READWRITE`.
* All emission is done through a well-defined **x64 Emission DSL** (`#pragma region x64 Emission DSL`) consisting of:
* Named REX prefix constants (`x64_REX`, `x64_REX_R`, `x64_REX_B`, etc.).
* Named register encoding constants (`x64_reg_RAX`, `x64_reg_RDX`, etc.).
* ModRM and SIB composition macros (`x64_modrm(mod, reg, rm)`, `x64_sib(scale, index, base)`).
* Named opcode constants (`x64_op_MOV_reg_rm`, `x64_op_CALL_rel32`, etc.).
* Composite inline instruction helpers (`x64_XCHG_RAX_RDX()`, `x64_ADD_RAX_RDX()`, `x64_RET_IF_ZERO()`, `x64_FETCH()`, `x64_STORE()`, etc.).
* Prologue/Epilogue helpers (`x64_JIT_PROLOGUE()`, `x64_JIT_EPILOGUE()`).
* FFI helpers (`x64_FFI_PROLOGUE()`, `x64_FFI_MAP_ARGS()`, `x64_FFI_CALL_ABS(addr)`, `x64_FFI_EPILOGUE()`).
* **Raw magic bytes are forbidden** in `compile_and_run_tape` and `compile_action`. All emission uses the DSL.
5. **Modal Editor (Win32 GDI):**
* The UI is built with raw Win32 GDI calls defined in `duffle.h`.
* It features two modes: `Navigation` (gray cursor, arrow key movement) and `Edit` (orange cursor, text input).
5. **Modal Editor (Win32 GDI + microui):**
* The UI is built with `microui` rendered via raw Win32 GDI calls defined in `duffle.h`.
* It features two modes: `Navigation` (blue cursor, arrow key movement) and `Edit` (orange cursor, text input).
* The editor correctly handles token insertion, deletion (Vim-style backspace), tag cycling (Tab), and value editing, all while re-compiling and re-executing on every keystroke.
* Four floating panels: **ColorForth Source Tape**, **Compiler & Status**, **Registers & Globals**, **Print Log**.
6. **O(1) Dictionary & Visual Linking:**
* The dictionary relies on an edit-time visual linker. When the tape is modified, `relink_tape` resolves names to absolute source memory indices.
* The compiler resolves references in `O(1)` time instantly by indexing into an offset mapping table (`tape_to_code_offset`).
* The compiler resolves references in `O(1)` time by indexing into `tape_to_code_offset[65536]`.
7. **Implicit Definition Boundaries (Magenta Pipe):**
* Definitions implicitly cause the JIT to emit a `RET` to close the prior block, and an `xchg rax, rdx` to rotate the stack for the new block.
7. **Implicit Definition Boundaries (STag_Define):**
* A `STag_Define` token causes the JIT to:
1. Emit `RET` to close the prior block (via `x64_RET()`).
2. Emit a `JMP rel32` placeholder to skip over the new definition body.
3. Record the entry point in `tape_to_code_offset[i]`.
4. Emit `xchg rax, rdx` (via `x64_XCHG_RAX_RDX()`) as the definition's first instruction, rotating the 2-register stack.
8. **x68 Instruction Padding:**
* The JIT pads every logical block/instruction to exact 32-bit multiples using `0x90` (NOPs) to perfectly align with the visual token grid logic.
8. **Lambda Tag (STag_Lambda):**
* A `STag_Lambda` token compiles a code block out-of-line and leaves its absolute 64-bit address in `RAX` for use with `STORE` or `EXECUTE`.
* Implemented via `x64_MOV_RDX_RAX()` to save the prior TOS, a `mov rax, imm64` with a patched-in address, and a `JMP rel32` to skip the body.
9. **The FFI Bridge:**
* The system uses an FFI macro (`emit_ffi_dance`) to align the `RSP` stack to 16 bytes, allocate 32 bytes of shadow space, and map the 2-register data stack/globals into the Windows C-ABI (`RCX`, `RDX`, `R8`, `R9`) to safely call WinAPI functions (like `MessageBoxA`).
9. **x68 Instruction Padding:**
* `pad32()` pads every logical block/instruction to exact 32-bit multiples using `0x90` (NOPs), aligning with the visual token grid.
## What's Missing (TODO)
10. **The FFI Bridge:**
* `x64_FFI_PROLOGUE()` pushes `RDX`, aligns `RSP` to 16 bytes, and allocates 32 bytes of shadow space.
* `x64_FFI_MAP_ARGS()` maps the 2-register stack and globals into Win64 ABI registers (`RCX=RAX`, `R8=globals[0]`, `R9=globals[1]`).
* `x64_FFI_CALL_ABS(addr)` loads the absolute 64-bit function address into `R10` and calls it.
* `x64_FFI_EPILOGUE()` restores `RSP` and pops `RDX`.
* **Saving/Loading (Persistence):** The tape and annotation arenas are purely in-memory and are lost when the program closes. Need to implement the self-modifying OS cartridge concept.
* **Expanded Instruction Set:** The JIT only knows a handful of primitives. It has no support for floating point or more complex branches.
* **Annotation Editing & Comments:** Typing into an annotation just appends characters up to 8 bytes. A proper text-editing cursor within the token is needed, and support for arbitrarily long comments should be implemented.
* **Tape Drive / Preemptive Scatter Logic:** Improve the FFI argument mapping to properly read from the "tape drive" memory slots instead of just mapping RAX/RDX to the first parameters.
11. **Persistence (Cartridge Save/Load):**
* F1 saves the tape and annotation arenas (with metadata) to `cartridge.bin` via `WriteFile`.
* F2 loads from `cartridge.bin`, then re-runs `relink_tape()` and `compile_and_run_tape()` to restore full live state.
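The `pad32()` padding described in the list above can be sketched as below. The real `pad32()` body is not shown in this diff, so this is only an assumed shape: a NOP fill up to the next 4-byte boundary. `CodeBuf` and this two-argument `emit8` are hypothetical stand-ins for `code_arena` and the real emitter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { X64_NOP = 0x90, CODE_CAP = 64 };

typedef struct { uint8_t bytes[CODE_CAP]; size_t used; } CodeBuf;

static void emit8(CodeBuf* cb, uint8_t b) {
    assert(cb->used < CODE_CAP);
    cb->bytes[cb->used++] = b;
}

/* Fill with single-byte NOPs until the write cursor is a multiple of 4,
   so every logical instruction starts on a 32-bit boundary. */
static void pad32(CodeBuf* cb) {
    while ((cb->used & 3u) != 0) emit8(cb, X64_NOP);
}

/* Example: a 3-byte instruction is padded to 4 bytes, and an already
   aligned cursor is left untouched. */
static void pad32_demo(void) {
    CodeBuf cb = {0};
    emit8(&cb, 0x48); emit8(&cb, 0x87); emit8(&cb, 0xC2); /* xchg rax, rdx */
    pad32(&cb);
    assert(cb.used == 4);
    pad32(&cb);
    assert(cb.used == 4);
}
```

Under this scheme a 1-byte `RET` occupies a full 4-byte cell, which trades code density for a fixed mapping between tokens and aligned code offsets.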
## Primitive Instruction Set
```md
| ID | Name     | Emitted x86-64 (via DSL)               |
|----|----------|----------------------------------------|
| 1  | SWAP     | x64_XCHG_RAX_RDX()                     |
| 2  | MULT     | x64_IMUL_RAX_RDX()                     |
| 3  | ADD      | x64_ADD_RAX_RDX()                      |
| 4  | FETCH    | x64_FETCH(): mov rax, [rbx + rax*8]    |
| 5  | DEC      | x64_DEC_RAX()                          |
| 6  | STORE    | x64_STORE(): mov [rbx + rax*8], rdx    |
| 7  | RET_IF_Z | x64_RET_IF_ZERO()                      |
| 8  | RETURN   | x64_RET()                              |
| 9  | PRINT    | FFI dance → ms_builtin_print           |
| 10 | RET_IF_S | x64_RET_IF_SIGN()                      |
| 11 | DUP      | x64_MOV_RDX_RAX()                      |
| 12 | DROP     | x64_MOV_RAX_RDX()                      |
| 13 | SUB      | x64_SUB_RAX_RDX()                      |
| 14 | EXECUTE  | x64_CALL_RAX()                         |
```
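To make the two conditional-return rows concrete, here is a minimal sketch of the byte sequences they are expected to produce. The byte values are taken from the raw emissions this commit replaces (`48 85 C0 75 01 C3` and `48 85 C0 79 01 C3`); the uppercase constant names, `CodeBuf`, and the `emit_ret_if*` helpers are hypothetical and are not the DSL's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    X64_REX_W         = 0x48, /* assumed value of x64_REX */
    X64_OP_TEST       = 0x85, /* TEST r/m64, r64 */
    X64_MODRM_RAX_RAX = 0xC0, /* mod = 11b, reg = RAX, rm = RAX */
    X64_OP_JNZ_REL8   = 0x75,
    X64_OP_JNS_REL8   = 0x79,
    X64_OP_RET        = 0xC3
};

typedef struct { uint8_t bytes[16]; size_t used; } CodeBuf;

static void emit8(CodeBuf* cb, uint8_t b) {
    assert(cb->used < sizeof cb->bytes);
    cb->bytes[cb->used++] = b;
}

/* test rax, rax ; jcc +1 ; ret
   The jcc is the inverse of the return condition. Its rel8 of 1 is counted
   from the end of the 2-byte jump, so it hops over exactly the 1-byte RET. */
static void emit_ret_if(CodeBuf* cb, uint8_t skip_jcc) {
    emit8(cb, X64_REX_W);
    emit8(cb, X64_OP_TEST);
    emit8(cb, X64_MODRM_RAX_RAX);
    emit8(cb, skip_jcc);
    emit8(cb, 0x01);
    emit8(cb, X64_OP_RET);
}

/* Return when RAX == 0: JNZ skips the RET otherwise. */
static void emit_ret_if_zero(CodeBuf* cb) { emit_ret_if(cb, X64_OP_JNZ_REL8); }

/* Return when RAX < 0 (sign flag set): JNS skips the RET otherwise. */
static void emit_ret_if_sign(CodeBuf* cb) { emit_ret_if(cb, X64_OP_JNS_REL8); }
```

Each sequence is 6 bytes, so with 32-bit padding it would occupy 8 bytes of the code buffer.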
## What's Missing (TODO)
- **DSL wrappers for forward jump placeholders:** The `JMP rel32` and `CALL rel32` forward-jump patterns in `compile_and_run_tape` still use bare `emit8(x64_op_JMP_rel32)` + `emit32(0)` pairs. Dedicated `x64_JMP_fwd_placeholder(U4* offset_out)` and `x64_patch_fwd(U4 offset)` helpers should be added to the DSL to close this last gap.
- **Expanded Annotation Layer (Variable-Length Comments):** The `anno_arena` strictly allocates 8 bytes per token. Arbitrarily long comment blocks need a separate indirection layer that does not disrupt the `O(1)` compile mapping.
- **Expanded Instruction Set:** No floating point. No conditional control flow beyond `RET_IF_Z` / `RET_IF_S`.
- **Basic Block Jumps `[ ]`:** Lottes-style scoped jump targets for structured control flow without an AST are not yet implemented.
- **Tape Drive / Preemptive Scatter Improvements:** The FFI argument mapping reads `globals[0]` and `globals[1]` for `R8`/`R9`. A proper scatter model that pre-places arguments into named slots before a call is not yet formalized.
- **Self-Hosting Bootstrap:** The editor and JIT are written in C. The long-term goal is to rewrite the core inside the custom language itself, discarding the C host.
## References Utilized
* **Heavily Utilized:**
* **Onat's Talks:** The core architecture (2-register stack, global memory tape, JIT philosophy) is a direct implementation of the concepts from his VAMP/KYRA presentations.
* **Lottes' Twitter Notes:** The 2-character mapped dictionary, `ret-if-signed` (`RET_IF_ZERO`), and annotation layer concepts were taken directly from his tweets.
* **User's `duffle.h` & `fortish-study`:** The C coding conventions (X-Macros, `FArena`, byte-width types, `ms_` prefixes) were adopted from these sources.
* **Lightly Utilized:**
* **Lottes' Blog:** Provided the high-level "sourceless" philosophy and inspiration.
* **Grok Searches:** Served to validate our understanding and provide parallels (like Wasm's linear memory), but did not provide direct implementation details.
### Heavily Utilized:
- **Onat's Talks:** The core architecture (2-register stack, global memory tape, JIT philosophy) is a direct implementation of the concepts from his VAMP/KYRA presentations.
- **Lottes' Twitter Notes:** The 2-character mapped dictionary, `ret-if-signed` (`RET_IF_SIGN`), and annotation layer concepts were taken directly from his tweets.
- **User's `duffle.h` & `fortish-study`:** The C coding conventions (X-Macros, `FArena`, byte-width types, `ms_` prefixes) were adopted from these sources.
### Lightly Utilized:
- **Lottes' Blog:** Provided the high-level “sourceless” philosophy and inspiration.
- **Grok Searches:** Served to validate our understanding and provide parallels (like Wasm's linear memory), but did not provide direct implementation details.

@@ -426,27 +426,99 @@ internal void relink_tape(void) {
// Each maps directly to the emit8/emit32/emit64 calls in compile_action.
// Stack Machine Operations
#define x64_XCHG_RAX_RDX() do { emit8(x64_REX); emit8(x64_op_XCHG_rm_reg); emit8(x64_modrm_RAX_RDX); } while(0)
#define x64_MOV_RDX_RAX() do { emit8(x64_REX); emit8(x64_op_MOV_rm_reg); emit8(x64_modrm_RAX_RDX); } while(0) // DUP
#define x64_MOV_RAX_RDX() do { emit8(x64_REX); emit8(x64_op_MOV_rm_reg); emit8(x64_modrm_RDX_RAX); } while(0) // DROP
IA_ void x64_XCHG_RAX_RDX() { emit8(x64_REX); emit8(x64_op_XCHG_rm_reg); emit8(x64_modrm_RAX_RDX); }
IA_ void x64_MOV_RDX_RAX() { emit8(x64_REX); emit8(x64_op_MOV_rm_reg); emit8(x64_modrm_RAX_RDX); } // DUP
IA_ void x64_MOV_RAX_RDX() { emit8(x64_REX); emit8(x64_op_MOV_rm_reg); emit8(x64_modrm_RDX_RAX); } // DROP
// Arithmetic (2-register stack: op RAX with RDX, result in RAX)
#define x64_ADD_RAX_RDX() do { emit8(x64_REX); emit8(x64_op_ADD_rm_reg); emit8(x64_modrm_RAX_RDX); } while(0)
#define x64_SUB_RAX_RDX() do { emit8(x64_REX); emit8(x64_op_SUB_rm_reg); emit8(x64_modrm_RAX_RDX); } while(0)
#define x64_IMUL_RAX_RDX() do { emit8(x64_REX); emit8(x64_op_IMUL_reg_rm); emit8(x64_op_IMUL_reg_rm2); emit8(x64_modrm_RAX_RDX); } while(0)
#define x64_DEC_RAX() do { emit8(x64_REX); emit8(x64_op_UNARY); emit8(x64_modrm(x64_mod_reg, x64_ext_DEC, x64_reg_RAX)); } while(0)
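// For the "r/m, reg" opcode forms the destination (RAX) is ModRM.rm and the source (RDX) is ModRM.reg,
// matching the raw bytes these helpers replace (48 01 D0 / 48 29 D0).
IA_ void x64_ADD_RAX_RDX() { emit8(x64_REX); emit8(x64_op_ADD_rm_reg); emit8(x64_modrm_RDX_RAX); } // add rax, rdx
IA_ void x64_SUB_RAX_RDX() { emit8(x64_REX); emit8(x64_op_SUB_rm_reg); emit8(x64_modrm_RDX_RAX); } // sub rax, rdx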
IA_ void x64_IMUL_RAX_RDX() { emit8(x64_REX); emit8(x64_op_IMUL_reg_rm); emit8(x64_op_IMUL_reg_rm2); emit8(x64_modrm_RAX_RDX); }
IA_ void x64_DEC_RAX() { emit8(x64_REX); emit8(x64_op_UNARY); emit8(x64_modrm(x64_mod_reg, x64_ext_DEC, x64_reg_RAX)); }
// Flag Operations (for conditional returns)
#define x64_TEST_RAX_RAX() do { emit8(x64_REX); emit8(x64_op_TEST_rm_reg); emit8(x64_modrm_RAX_RAX); } while(0)
IA_ void x64_TEST_RAX_RAX() { emit8(x64_REX); emit8(x64_op_TEST_rm_reg); emit8(x64_modrm_RAX_RAX); }
// Conditional Returns (TEST must precede these)
// JNZ skips the RET if RAX != 0, so RET only fires when RAX == 0
#define x64_RET_IF_ZERO() do { x64_TEST_RAX_RAX(); emit8(x64_op_JNZ_rel8); emit8(0x01); emit8(x64_op_RET); } while(0)
// JNS skips the RET if RAX >= 0, so RET only fires when RAX < 0
#define x64_RET_IF_SIGN() do { x64_TEST_RAX_RAX(); emit8(x64_op_JNS_rel8); emit8(0x01); emit8(x64_op_RET); } while(0)
IA_ void x64_RET_IF_ZERO() { x64_TEST_RAX_RAX(); emit8(x64_op_JNZ_rel8); emit8(0x01); emit8(x64_op_RET); } // JNZ skips the RET if RAX != 0, so RET only fires when RAX == 0
IA_ void x64_RET_IF_SIGN() { x64_TEST_RAX_RAX(); emit8(x64_op_JNS_rel8); emit8(0x01); emit8(x64_op_RET); } // JNS skips the RET if RAX >= 0, so RET only fires when RAX < 0
// Tape Drive Memory (Preemptive Scatter via RBX base pointer)
#define x64_FETCH() do { emit8(x64_REX); emit8(x
IA_ void x64_FETCH() {
emit8(x64_REX);
emit8(x64_op_MOV_reg_rm);
emit8(x64_modrm_RAX_sib);
emit8(x64_sib_tape);
}
IA_ void x64_STORE() {
emit8(x64_REX);
emit8(x64_op_MOV_rm_reg);
emit8(x64_modrm_RDX_sib);
emit8(x64_sib_tape);
}
// Indirect call through RAX (EXECUTE primitive)
IA_ void x64_CALL_RAX() {
emit8(x64_op_UNARY);
emit8(x64_modrm(x64_mod_reg, x64_ext_CALL, x64_reg_RAX));
}
IA_ void x64_RET() { emit8(x64_op_RET); } // RET
// JIT Entry Prologue: save RBX, load vm_globals ptr from RCX, restore RAX/RDX state
// vm_globals[14] = RAX save slot (14 * 8 = 0x70)
// vm_globals[15] = RDX save slot (15 * 8 = 0x78)
#define x64_vm_rax_slot 0x70
#define x64_vm_rdx_slot 0x78
IA_ void x64_JIT_PROLOGUE() {
emit8(x64_op_PUSH_RBX);
emit8(x64_REX); emit8(x64_op_MOV_rm_reg); emit8(x64_modrm_RCX_RBX);
emit8(x64_REX); emit8(x64_op_MOV_reg_rm); emit8(x64_modrm_RAX_mem_disp8_RBX); emit8(x64_vm_rax_slot);
emit8(x64_REX); emit8(x64_op_MOV_reg_rm); emit8(x64_modrm_RDX_mem_disp8_RBX); emit8(x64_vm_rdx_slot);
}
// JIT Exit Epilogue: save RAX/RDX state back, restore RBX, return
IA_ void x64_JIT_EPILOGUE() {
emit8(x64_REX); emit8(x64_op_MOV_rm_reg); emit8(x64_modrm_RAX_mem_disp8_RBX); emit8(x64_vm_rax_slot);
emit8(x64_REX); emit8(x64_op_MOV_rm_reg); emit8(x64_modrm_RDX_mem_disp8_RBX); emit8(x64_vm_rdx_slot);
emit8(x64_op_POP_RBX);
emit8(x64_op_RET);
}
// Win64 FFI Dance: align RSP, allocate 32 bytes shadow space
// sub rsp, 40 (32 shadow + 8 to realign since we pushed RDX)
#define x64_ffi_shadow_space 0x28
IA_ void x64_FFI_PROLOGUE() {
emit8(x64_op_PUSH_RDX);
emit8(x64_REX); emit8(x64_op_ARITH_imm8);
emit8(x64_modrm(x64_mod_reg, x64_ext_SUB, x64_reg_RSP));
emit8(x64_ffi_shadow_space);
}
// Map 2-register stack and tape drive into Win64 ABI argument registers
// RCX = RAX (arg1 = Top of Stack)
// RDX = RDX (arg2 = Next of Stack, already in place)
// R8 = vm_globals[0]
// R9 = vm_globals[1]
#define x64_FFI_MAP_ARGS() \
do { \
emit8(x64_REX); emit8(x64_op_MOV_rm_reg); emit8(x64_modrm_RAX_RCX); \
emit8(x64_REX_R); emit8(x64_op_MOV_reg_rm); emit8(x64_modrm(x64_mod_mem, x64_reg_R8, x64_reg_RBX)); \
emit8(x64_REX_R); emit8(x64_op_MOV_reg_rm); emit8(x64_modrm(x64_mod_mem_disp8, x64_reg_R9, x64_reg_RBX)); emit8(0x08); \
} while(0)
// Load absolute 64-bit function address into R10 and call it
#define x64_FFI_CALL_ABS(abs_addr) \
do { \
emit8(x64_REX_B); emit8(x64_op_MOV_r10_imm64); \
emit32(u4_(u8_(abs_addr) & 0xFFFFFFFF)); \
emit32(u4_(u8_(abs_addr) >> 32)); \
emit8(0x41); emit8(x64_op_UNARY); /* 0x41 = REX.B without REX.W: extends ModRM.rm so the call targets R10 */ \
emit8(x64_modrm(x64_mod_reg, x64_ext_CALL, x64_reg_R10)); \
} while(0)
// Restore RSP and RDX after FFI call
#define x64_FFI_EPILOGUE() \
do { \
emit8(x64_REX); emit8(x64_op_ARITH_imm8); \
emit8(x64_modrm(x64_mod_reg, x64_ext_ADD, x64_reg_RSP)); \
emit8(x64_ffi_shadow_space); \
emit8(x64_op_POP_RDX); \
} while(0)
#pragma endregion x64 Emission DSL
internal void compile_action(U4 val)
@@ -454,83 +526,75 @@ internal void compile_action(U4 val)
if (val >= 0x10000) {
U4 p = val - 0x10000;
if (p == PRIM_SWAP) {
emit8(0x48); emit8(0x87); emit8(0xC2);
x64_XCHG_RAX_RDX();
pad32();
return;
} else if (p == PRIM_MULT) {
emit8(0x48); emit8(0x0F); emit8(0xAF); emit8(0xC2);
}
else if (p == PRIM_MULT) {
x64_IMUL_RAX_RDX();
pad32();
return;
} else if (p == PRIM_ADD) {
emit8(0x48); emit8(0x01); emit8(0xD0);
}
else if (p == PRIM_ADD) {
x64_ADD_RAX_RDX();
pad32();
return;
} else if (p == PRIM_SUB) {
emit8(0x48); emit8(0x29); emit8(0xD0);
}
else if (p == PRIM_SUB) {
x64_SUB_RAX_RDX();
pad32();
return;
} else if (p == PRIM_FETCH) {
emit8(0x48); emit8(0x8B); emit8(0x04); emit8(0xC3); // mov rax, [rbx + rax*8]
}
else if (p == PRIM_FETCH) {
x64_FETCH();
pad32();
return;
} else if (p == PRIM_DEC) {
emit8(0x48); emit8(0xFF); emit8(0xC8);
}
else if (p == PRIM_DEC) {
x64_DEC_RAX();
pad32();
return;
} else if (p == PRIM_STORE) {
emit8(0x48); emit8(0x89); emit8(0x14); emit8(0xC3); // mov [rbx + rax*8], rdx
}
else if (p == PRIM_STORE) {
x64_STORE();
pad32();
return;
} else if (p == PRIM_RET_Z) {
emit8(0x48); emit8(0x85); emit8(0xC0);
emit8(0x75); emit8(0x01);
emit8(0xC3);
}
else if (p == PRIM_RET_Z) {
x64_RET_IF_ZERO();
pad32();
return;
} else if (p == PRIM_RET_S) {
emit8(0x48); emit8(0x85); emit8(0xC0);
emit8(0x79); emit8(0x01);
emit8(0xC3);
}
else if (p == PRIM_RET_S) {
x64_RET_IF_SIGN();
pad32();
return;
} else if (p == PRIM_RET) {
emit8(0xC3);
}
else if (p == PRIM_RET) {
x64_RET();
pad32();
return;
} else if (p == PRIM_DUP) {
emit8(0x48); emit8(0x89); emit8(0xC2);
}
else if (p == PRIM_DUP) {
x64_MOV_RDX_RAX();
pad32();
return;
} else if (p == PRIM_DROP) {
emit8(0x48); emit8(0x89); emit8(0xD0);
}
else if (p == PRIM_DROP) {
x64_MOV_RAX_RDX();
pad32();
return;
}
else if (p == PRIM_EXECUTE) {
emit8(0XFF); emit8(0XD0);
x64_CALL_RAX();
pad32();
return;
}
else if (p == PRIM_PRINT) {
// FFI Dance: Save RDX, Align RSP (32 shadow + 8 align = 40)
emit8(0x52); // push rdx
emit8(0x48); emit8(0x83); emit8(0xEC); emit8(0x28); // sub rsp, 40
// Map arguments: RCX=RAX, RDX=RDX(already loaded), R8=Globals[0], R9=Globals[1]
emit8(0x48); emit8(0x89); emit8(0xC1); // mov rcx, rax
emit8(0x4C); emit8(0x8B); emit8(0x03); // mov r8, [rbx]
emit8(0x4C); emit8(0x8B); emit8(0x4B); emit8(0x08); // mov r9, [rbx+8]
// Load func ptr and call
emit8(0x49); emit8(0xBA); // mov r10, ...
U8 addr = u8_(& ms_builtin_print);
emit32(u4_(addr & 0xFFFFFFFF));
emit32(u4_(addr >> 32));
emit8(0x41); emit8(0xFF); emit8(0xD2); // call r10
// Restore
emit8(0x48); emit8(0x83); emit8(0xC4); emit8(0x28); // add rsp, 40
emit8(0x5A); // pop rdx
x64_FFI_PROLOGUE();
x64_FFI_MAP_ARGS();
x64_FFI_CALL_ABS(u8_(& ms_builtin_print));
x64_FFI_EPILOGUE();
pad32();
return;
}
@@ -540,22 +604,20 @@ internal void compile_action(U4 val)
U4 target = tape_to_code_offset[val];
pad32();
S4 rel32 = s4_(target) - s4_(code_arena.used + 5);
emit8(0xE8);
emit8(x64_op_CALL_rel32);
emit32(u4_(rel32));
pad32();
}
}
IA_ void compile_and_run_tape(void)
{
farena_reset(& code_arena);
log_count = 0;
gdi_log_count = 0;
emit8(0x53); // push rbx
emit8(0x48); emit8(0x89); emit8(0xCB); // mov rbx, rcx
emit8(0x48); emit8(0x8B); emit8(0x43); emit8(0x70); // mov rax, [rbx+0x70]
emit8(0x48); emit8(0x8B); emit8(0x53); emit8(0x78); // mov rdx, [rbx+0x78]
x64_JIT_PROLOGUE();
U4*r tape_ptr = u4_r(tape_arena.start);
U8*r anno_ptr = u8_r(anno_arena.start);
@@ -571,20 +633,19 @@ IA_ void compile_and_run_tape(void)
U4 tag = unpack_tag(tape_ptr[i]);
U4 val = unpack_val(tape_ptr[i]);
// NUDGE: Define what terminates blocks.
B4 is_terminator = (tag == STag_Define || tag == STag_Imm);
// Terminate lambdas first if needed
if (in_lambda && (is_terminator || tag == STag_Lambda)) {
emit8(0xC3); pad32(); // Terminate lambda with RET
x64_RET();
pad32();
U4 current = code_arena.used;
u4_r(code_arena.start + lambda_jmp_offset)[0] = current - (lambda_jmp_offset + 4);
in_lambda = false;
}
// Terminate definitions
if (in_def && is_terminator) {
emit8(0xC3); pad32(); // Terminate definition with RET
x64_RET();
pad32();
U4 current = code_arena.used;
u4_r(code_arena.start + def_jmp_offset)[0] = current - (def_jmp_offset + 4);
in_def = false;
@@ -593,18 +654,17 @@ IA_ void compile_and_run_tape(void)
if (tag == STag_Define)
{
pad32();
emit8(0xE9);
emit8(x64_op_JMP_rel32); // E9 = JMP rel32 (CALL rel32 is E8, so `x64_op_CALL_rel32 - 3` would emit E5)
def_jmp_offset = code_arena.used;
emit32(0); // Placeholder for jump distance
emit32(0);
pad32();
in_def = true;
tape_to_code_offset[i] = code_arena.used;
emit8(0x48); emit8(0x87); emit8(0xC2); // xchg rax, rdx
x64_XCHG_RAX_RDX();
pad32();
}
// NUDGE: Handle the new Lambda tag.
else if (tag == STag_Lambda)
{
char* name = (char*)& anno_ptr[i];
@@ -617,21 +677,23 @@ IA_ void compile_and_run_tape(void)
};
debug_log(str8("Compiling lambda: <name> (val: <val>)"), ktl_str8_from_arr(call_log_table));
// Outer function: Push lambda address into RAX
emit8(0x48); emit8(0x89); emit8(0xC2); // mov rdx, rax (save old rax)
emit8(0x48); emit8(0xB8); // mov rax, ... (64-bit immediate)
// mov rdx, rax (save old rax into rdx)
x64_MOV_RDX_RAX();
// mov rax, imm64 (placeholder for lambda body address)
emit8(x64_REX);
emit8(x64_op_MOV_rax_imm64);
U4 rax_imm_offset = code_arena.used;
emit64(0); // Placeholder for lambda address
emit64(0);
pad32();
// Outer function: Jump over lambda body
emit8(0xE9);
// jmp rel32 over lambda body
emit8(x64_op_JMP_rel32);
lambda_jmp_offset = code_arena.used;
emit32(0); // Placeholder for jump distance
emit32(0);
pad32();
in_lambda = true;
// Patch the mov rax, ... with the actual lambda body address
// Patch the mov rax, imm64 with the actual lambda body address
U8 lambda_addr = u8_(code_arena.start + code_arena.used);
u8_r(code_arena.start + rax_imm_offset)[0] = lambda_addr;
}
@@ -641,29 +703,29 @@ IA_ void compile_and_run_tape(void)
}
else if (tag == STag_Data)
{
emit8(0x48); emit8(0x89); emit8(0xC2);
emit8(0x48); emit8(0xC7); emit8(0xC0); emit32(val);
x64_MOV_RDX_RAX();
emit8(x64_REX);
emit8(x64_op_MOV_rm_imm32);
emit8(x64_modrm(x64_mod_reg, 0, x64_reg_RAX));
emit32(val);
pad32();
}
}
if (in_lambda) {
emit8(0xC3);
x64_RET();
pad32();
U4 current = code_arena.used;
u4_r(code_arena.start + lambda_jmp_offset)[0] = current - (lambda_jmp_offset + 4);
}
if (in_def) {
emit8(0xC3);
x64_RET();
pad32();
U4 current = code_arena.used;
u4_r(code_arena.start + def_jmp_offset)[0] = current - (def_jmp_offset + 4);
}
emit8(0x48); emit8(0x89); emit8(0x43); emit8(0x70); // mov [rbx+0x70], rax
emit8(0x48); emit8(0x89); emit8(0x53); emit8(0x78); // mov [rbx+0x78], rdx
emit8(0x5B); // pop rbx
emit8(0xC3); // ret
x64_JIT_EPILOGUE();
typedef void JIT_Func(U8* globals_ptr);
JIT_Func* func = (JIT_Func*)code_arena.start;
@@ -672,12 +734,8 @@ IA_ void compile_and_run_tape(void)
vm_rax = vm_globals[14];
vm_rdx = vm_globals[15];
char rax_hex[9];
u64_to_hex(vm_rax, rax_hex, 8);
rax_hex[8] = '\0';
char rdx_hex[9];
u64_to_hex(vm_rdx, rdx_hex, 8);
rdx_hex[8] = '\0';
char rax_hex[9]; u64_to_hex(vm_rax, rax_hex, 8); rax_hex[8] = '\0';
char rdx_hex[9]; u64_to_hex(vm_rdx, rdx_hex, 8); rdx_hex[8] = '\0';
KTL_Slot_Str8 post_jit_log_table[] = {
{ ktl_str8_key("rax"), str8(rax_hex) },
{ ktl_str8_key("rdx"), str8(rdx_hex) },
@@ -687,6 +745,7 @@ IA_ void compile_and_run_tape(void)
#undef r
#undef v
#undef expect
@@ -982,14 +1041,16 @@ S8 win_proc(void* hwnd, U4 msg, U8 wparam, S8 lparam)
case MS_WM_PAINT: {
mu_begin(&mu_ctx);
if (mu_begin_window(&mu_ctx, "ColorForth Source Tape", mu_rect(10, 10, 900, 480))) {
if (mu_begin_window(&mu_ctx, "ColorForth Source Tape", mu_rect(10, 10, 900, 480)))
{
U4*r tape_ptr = u4_r(tape_arena.start);
U8*r anno_ptr = u8_r(anno_arena.start);
S4 start_x = 5, start_y = 5, spacing_x = 6, spacing_y = 26;
S4 x = start_x, y = start_y;
for (U8 i = 0; i < tape_count; i++) {
for (U8 i = 0; i < tape_count; i++)
{
U4 t = tape_ptr[i];
U4 tag = unpack_tag(t);
U4 val = unpack_val(t);
@@ -1072,7 +1133,6 @@ S8 win_proc(void* hwnd, U4 msg, U8 wparam, S8 lparam)
mu_text(&mu_ctx, "[F5] Toggle Run | [PgUp/PgDn] Scroll");
mu_ctx.style->colors[MU_COLOR_TEXT] = mu_color(255, 255, 255, 255);
mu_text(&mu_ctx, jit_str);
if (tape_count > 0 && cursor_idx < tape_count) {
U4 cur_tag = unpack_tag(tape_ptr[cursor_idx]);
const char* tag_name = tag_names [cur_tag];
@@ -1090,7 +1150,6 @@ S8 win_proc(void* hwnd, U4 msg, U8 wparam, S8 lparam)
mu_ctx.style->colors[MU_COLOR_TEXT] = mu_color(230, 230, 230, 255);
mu_end_window(&mu_ctx);
}
if (mu_begin_window(&mu_ctx, "Registers & Globals", mu_rect(370, 500, 350, 200))) {
char state_str[64] = "RAX: 00000000 | RDX: 00000000";
u64_to_hex(vm_rax, state_str + 5, 8);
@@ -1111,7 +1170,6 @@ S8 win_proc(void* hwnd, U4 msg, U8 wparam, S8 lparam)
mu_ctx.style->colors[MU_COLOR_TEXT] = mu_color(230, 230, 230, 255);
mu_end_window(&mu_ctx);
}
if (mu_begin_window(&mu_ctx, "Print Log", mu_rect(730, 500, 250, 200))) {
mu_layout_row(&mu_ctx, 1, (int[]){-1}, 0);
mu_ctx.style->colors[MU_COLOR_TEXT] = mu_color(161, 186, 148, 255);
@@ -1181,6 +1239,7 @@ int main(void) {
mu_ctx.text_width = text_width_cb;
mu_ctx.text_height = text_height_cb;
// Factorial
{
scatter(pack_token(STag_Comment, 0), "INIT ");
scatter(pack_token(STag_Data, 5), 0);
@@ -1227,6 +1286,7 @@ int main(void) {
scatter(pack_token(STag_Imm, 0), "F_STEP ");
}
// Lambda test
{
scatter(pack_token(STag_Comment, 0), "LAMBDAS ");
scatter(pack_token(STag_Format, 0xA), 0);

@@ -1,888 +0,0 @@
#include "duffle.amd64.win32.h"
// --- Semantic Tags (Using X-Macros & Enum_) ---
#define Tag_Entries() \
X(Define, "Define", 0x0018AEFF, ":") \
X(Call, "Call", 0x00D6A454, "~") \
X(Data, "Data", 0x0094BAA1, "$") \
X(Imm, "Imm", 0x004AA4C2, "^") \
X(Comment, "Comment", 0x00AAAAAA, ".") \
X(Format, "Format", 0x003A2F3B, " ")
typedef Enum_(U4, STag) {
#define X(n, s, c, p) tmpl(STag, n),
Tag_Entries()
#undef X
STag_Count,
};
global U4 tag_colors[] = {
#define X(n, s, c, p) c,
Tag_Entries()
#undef X
};
global const char* tag_prefixes[] = {
#define X(n, s, c, p) p,
Tag_Entries()
#undef X
};
global const char* tag_names[] = {
#define X(n, s, c, p) s,
Tag_Entries()
#undef X
};
#define pack_token(tag, val) ((u4_(tag) << 28) | (u4_(val) & 0x0FFFFFFF))
#define unpack_tag(token) ( ((token) >> 28) & 0x0F)
#define unpack_val(token) ( (token) & 0x0FFFFFFF)
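The three macros above pack a 4-bit tag and a 28-bit value into one 32-bit token. A minimal standalone sketch of the same layout, written as functions over `<stdint.h>` types; the names `token_pack`, `token_tag`, and `token_val` are hypothetical and are not part of the source:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical function equivalents of pack_token / unpack_tag / unpack_val. */
static uint32_t token_pack(uint32_t tag, uint32_t val) {
    assert(tag < 16u);                         /* the tag must fit in 4 bits */
    return (tag << 28) | (val & 0x0FFFFFFFu);  /* the value is truncated to 28 bits */
}

static uint32_t token_tag(uint32_t token) { return (token >> 28) & 0x0Fu; }
static uint32_t token_val(uint32_t token) { return token & 0x0FFFFFFFu; }
```

Values wider than 28 bits are silently truncated, which matches the `& 0x0FFFFFFF` mask used when hex digits are typed into a `Data` token.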
#define TOKENS_PER_ROW 8
#define MODE_NAV 0
#define MODE_EDIT 1
global FArena tape_arena;
global FArena anno_arena;
global U8 cursor_idx = 0;
global U4 editor_mode = MODE_NAV;
global B4 mode_switch_now = false;
global FArena code_arena;
global U8 vm_rax = 0;
global U8 vm_rdx = 0;
global U8 vm_globals[16] = {0};
global B4 run_full = false;
global U8 log_buffer[16] = {0};
global U4 log_count = 0;
global S4 scroll_y_offset = 0;
// New GDI log
#define GDI_LOG_MAX_LINES 10
#define GDI_LOG_MAX_LINE_LEN 128
global char gdi_log_buffer[GDI_LOG_MAX_LINES][GDI_LOG_MAX_LINE_LEN] = {0};
global U4 gdi_log_count = 0;
internal void debug_log(Str8 fmt, KTL_Str8 table) {
// A static buffer for our log lines.
LP_ UTF8 console_log_buffer[1024];
mem_zero(u8_(console_log_buffer), 1024);
// Format the string.
Str8 result = str8_fmt_ktl_buf(slice_ut_arr(console_log_buffer), table, fmt);
// Also write to our GDI log buffer
if (gdi_log_count < GDI_LOG_MAX_LINES) {
U4 len_to_copy = result.len < GDI_LOG_MAX_LINE_LEN - 1 ? result.len : GDI_LOG_MAX_LINE_LEN - 1;
mem_copy(u8_(gdi_log_buffer[gdi_log_count]), u8_(result.ptr), len_to_copy);
gdi_log_buffer[gdi_log_count][len_to_copy] = '\0';
gdi_log_count++;
}
// Get stdout handle.
MS_Handle stdout_handle = ms_get_std_handle(MS_STD_OUTPUT);
// Write the formatted string.
ms_write_console(stdout_handle, result.ptr, (U4)result.len, nullptr, 0);
// Write a newline.
ms_write_console(stdout_handle, (UTF8 const*r)"\n", 1, nullptr, 0);
}
U8 ms_builtin_print(U8 val, U8 rdx_val, U8 r8_val, U8 r9_val) {
char hex1[9], hex2[9], hex3[9], hex4[9];
u64_to_hex(val, hex1, 8); hex1[8] = '\0';
u64_to_hex(rdx_val, hex2, 8); hex2[8] = '\0';
u64_to_hex(r8_val, hex3, 8); hex3[8] = '\0';
u64_to_hex(r9_val, hex4, 8); hex4[8] = '\0';
KTL_Slot_Str8 log_table[] = {
{ ktl_str8_key("v1"), str8(hex1) },
{ ktl_str8_key("v2"), str8(hex2) },
{ ktl_str8_key("v3"), str8(hex3) },
{ ktl_str8_key("v4"), str8(hex4) },
};
debug_log(str8("FFI PRINT -> RCX:<v1> RDX:<v2> R8:<v3> R9:<v4>"), ktl_str8_from_arr(log_table));
if (log_count < 16) log_buffer[log_count++] = val;
return val;
}
// Visual Linker & O(1) Dictionary
global U4 tape_to_code_offset[65536] = {0};
// --- WinAPI Persistence ---
#define MS_GENERIC_READ 0x80000000
#define MS_GENERIC_WRITE 0x40000000
#define MS_CREATE_ALWAYS 2
#define MS_OPEN_EXISTING 3
#define MS_FILE_ATTRIBUTE_NORMAL 0x80
#define MS_VK_F1 0x70
#define MS_VK_F2 0x71
WinAPI void* ms_create_file_a(char const* lpFileName, U4 dwDesiredAccess, U4 dwShareMode, void* lpSecurityAttributes, U4 dwCreationDisposition, U4 dwFlagsAndAttributes, void* hTemplateFile) asm("CreateFileA");
WinAPI B4 ms_write_file(void* hFile, void const* lpBuffer, U4 nNumberOfBytesToWrite, U4* lpNumberOfBytesWritten, void* lpOverlapped) asm("WriteFile");
WinAPI B4 ms_read_file(void* hFile, void* lpBuffer, U4 nNumberOfBytesToRead, U4* lpNumberOfBytesRead, void* lpOverlapped) asm("ReadFile");
WinAPI B4 ms_close_handle(void* hObject) asm("CloseHandle");
#define PRIM_SWAP 1
#define PRIM_MULT 2
#define PRIM_ADD 3
#define PRIM_FETCH 4
#define PRIM_DEC 5
#define PRIM_STORE 6
#define PRIM_RET_Z 7
#define PRIM_RET 8
#define PRIM_PRINT 9
#define PRIM_RET_S 10
#define PRIM_DUP 11
#define PRIM_DROP 12
#define PRIM_SUB 13
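// Every name is space-padded to 8 bytes so resolve_name_to_index can compare a full annotation cell.
global const char* prim_names[] = {
"",
"SWAP    ",
"MULT    ",
"ADD     ",
"FETCH   ",
"DEC     ",
"STORE   ",
"RET_IF_Z",
"RETURN  ",
"PRINT   ",
"RET_IF_S",
"DUP     ",
"DROP    ",
"SUB     "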
};
internal U4 resolve_name_to_index(const char* ref_name);
internal void relink_tape(void);
IA_ void compile_and_run_tape(void);
internal void save_cartridge(void) {
void* hFile = ms_create_file_a("cartridge.bin", MS_GENERIC_WRITE, 0, nullptr, MS_CREATE_ALWAYS, MS_FILE_ATTRIBUTE_NORMAL, nullptr);
if (hFile != (void*)-1) {
U4 written = 0;
ms_write_file(hFile, & tape_arena.used, 8, & written, nullptr);
ms_write_file(hFile, & anno_arena.used, 8, & written, nullptr);
ms_write_file(hFile, & cursor_idx, 8, & written, nullptr);
ms_write_file(hFile, (void*)tape_arena.start, (U4)tape_arena.used, & written, nullptr);
ms_write_file(hFile, (void*)anno_arena.start, (U4)anno_arena.used, & written, nullptr);
ms_close_handle(hFile);
}
}
internal void load_cartridge(void) {
void* hFile = ms_create_file_a("cartridge.bin", MS_GENERIC_READ, 0, nullptr, MS_OPEN_EXISTING, MS_FILE_ATTRIBUTE_NORMAL, nullptr);
if (hFile != (void*)-1) {
U4 read = 0;
ms_read_file(hFile, & tape_arena.used, 8, & read, nullptr);
ms_read_file(hFile, & anno_arena.used, 8, & read, nullptr);
ms_read_file(hFile, & cursor_idx, 8, & read, nullptr);
ms_read_file(hFile, (void*)tape_arena.start, (U4)tape_arena.used, & read, nullptr);
ms_read_file(hFile, (void*)anno_arena.start, (U4)anno_arena.used, & read, nullptr);
ms_close_handle(hFile);
relink_tape();
compile_and_run_tape();
}
}
IA_ void scatter(U4 token, const char* anno_str) {
if (tape_arena.used + sizeof(U4) <= tape_arena.capacity && anno_arena.used + sizeof(U8) <= anno_arena.capacity) {
U4 tag = unpack_tag(token);
U4 val = unpack_val(token);
if (anno_str && (tag == STag_Call || tag == STag_Imm)) {
val = resolve_name_to_index(anno_str);
}
U4*r ptr = u4_r(tape_arena.start + tape_arena.used);
ptr[0] = pack_token(tag, val);
tape_arena.used += sizeof(U4);
U8*r aptr = u8_r(anno_arena.start + anno_arena.used);
aptr[0] = 0;
if (anno_str) {
char* dest = (char*)aptr;
int i = 0; while(i < 8 && anno_str[i]) { dest[i] = anno_str[i]; i ++; }
}
anno_arena.used += sizeof(U8);
}
}
internal void emit8(U1 b) {
if (code_arena.used + 1 <= code_arena.capacity) {
u1_r(code_arena.start + code_arena.used)[0] = b;
code_arena.used += 1;
}
}
internal void emit32(U4 val) {
if (code_arena.used + 4 <= code_arena.capacity) {
u4_r(code_arena.start + code_arena.used)[0] = val;
code_arena.used += 4;
}
}
internal void pad32(void) {
while ((code_arena.used % 4) != 0) emit8(0x90);
}
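`pad32` keeps every emitted instruction group on a 4-byte boundary by filling with single-byte NOPs. A small sketch of the same idea against a fixed buffer; `CodeBuf`, `buf_emit8`, and `buf_pad32` are hypothetical stand-ins for `code_arena`, `emit8`, and `pad32`, and this version asserts on overflow where the original silently drops the byte:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size code buffer standing in for code_arena. */
typedef struct { uint8_t bytes[64]; size_t used; } CodeBuf;

static void buf_emit8(CodeBuf *b, uint8_t v) {
    assert(b->used < sizeof b->bytes);   /* assumption: overflow is a hard error here */
    b->bytes[b->used++] = v;
}

/* Append 0x90 (NOP) until `used` is a multiple of 4. */
static void buf_pad32(CodeBuf *b) {
    while (b->used % 4 != 0) buf_emit8(b, 0x90);
}
```

A 3-byte instruction such as `xchg rax, rdx` therefore occupies one 4-byte slot, and calling the pad routine again on an aligned buffer emits nothing.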
internal U4 resolve_name_to_index(const char* ref_name) {
U8 tape_count = tape_arena.used / sizeof(U4);
U4*r tape_ptr = u4_r(tape_arena.start);
U8*r anno_ptr = u8_r(anno_arena.start);
U8 prim_count = array_len(prim_names);
for (int p = 1; p < prim_count; p++) {
int match = 1;
for (int c = 0; c < 8; c++) {
char c1 = ref_name[c] ? ref_name[c] : ' ';
char c2 = prim_names[p][c] ? prim_names[p][c] : ' ';
if (c1 != c2) { match = 0; break; }
}
if (match) return p + 0x10000;
}
for (U8 j = 0; j < tape_count; j++) {
if (unpack_tag(tape_ptr[j]) == STag_Define) {
char* def_name = (char*)&anno_ptr[j];
int match = 1;
for (int c = 0; c < 8; c++) {
char c1 = ref_name[c] ? ref_name[c] : ' ';
char c2 = def_name[c] ? def_name[c] : ' ';
if (c1 != c2) { match = 0; break; }
}
if (match) return j;
}
}
return 0;
}
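The lookup above compares names over a fixed 8-byte cell and treats NUL as a space. The sketch below shows one way to do that comparison in isolation; `name8_equal` is a hypothetical helper, and unlike the loops above it stops reading a string once its terminator is seen, on the assumption that callers may pass literals shorter than 8 bytes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Compare two names over an 8-byte field, treating everything from the
   first NUL onward as spaces. Never reads past a string's terminator. */
static bool name8_equal(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    bool a_done = false, b_done = false;
    for (int i = 0; i < 8; i++) {
        if (!a_done && a[i] == '\0') a_done = true;
        if (!b_done && b[i] == '\0') b_done = true;
        char ca = a_done ? ' ' : a[i];
        char cb = b_done ? ' ' : b[i];
        if (ca != cb) return false;
    }
    return true;
}
```

One behavioral difference to be aware of: the original keeps reading after an embedded NUL, so a cell containing `AB\0CD` is compared as `AB CD`, while this sketch compares it as `AB` followed by spaces.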
internal void relink_tape(void) {
U8 tape_count = tape_arena.used / sizeof(U4);
U4*r tape_ptr = u4_r(tape_arena.start);
U8*r anno_ptr = u8_r(anno_arena.start);
for (U8 i = 0; i < tape_count; i++) {
U4 t = tape_ptr[i];
U4 tag = unpack_tag(t);
if (tag == STag_Call || tag == STag_Imm) {
char* ref_name = (char*)&anno_ptr[i];
U4 new_val = resolve_name_to_index(ref_name);
tape_ptr[i] = pack_token(tag, new_val);
}
}
}
internal void compile_action(U4 val)
{
if (val >= 0x10000) {
U4 p = val - 0x10000;
if (p == PRIM_SWAP) {
emit8(0x48); emit8(0x87); emit8(0xC2); // xchg rdx, rax
pad32();
return;
} else if (p == PRIM_MULT) {
emit8(0x48); emit8(0x0F); emit8(0xAF); emit8(0xC2); // imul rax, rdx
pad32();
return;
} else if (p == PRIM_ADD) {
emit8(0x48); emit8(0x01); emit8(0xD0); // add rax, rdx
pad32();
return;
} else if (p == PRIM_SUB) {
emit8(0x48); emit8(0x29); emit8(0xD0); // sub rax, rdx
pad32();
return;
} else if (p == PRIM_FETCH) {
emit8(0x48); emit8(0x8B); emit8(0x04); emit8(0xC3); // mov rax, [rbx + rax*8]
pad32();
return;
} else if (p == PRIM_DEC) {
emit8(0x48); emit8(0xFF); emit8(0xC8); // dec rax
pad32();
return;
} else if (p == PRIM_STORE) {
emit8(0x48); emit8(0x89); emit8(0x14); emit8(0xC3); // mov [rbx + rax*8], rdx
pad32();
return;
} else if (p == PRIM_RET_Z) {
emit8(0x48); emit8(0x85); emit8(0xC0); // test rax, rax
emit8(0x75); emit8(0x01); // jnz +1 (skip the ret when RAX != 0)
emit8(0xC3); // ret
pad32();
return;
} else if (p == PRIM_RET_S) {
emit8(0x48); emit8(0x85); emit8(0xC0); // test rax, rax
emit8(0x79); emit8(0x01); // jns +1 (skip the ret when the sign flag is clear)
emit8(0xC3); // ret
pad32();
return;
} else if (p == PRIM_RET) {
emit8(0xC3); // ret
pad32();
return;
} else if (p == PRIM_DUP) {
emit8(0x48); emit8(0x89); emit8(0xC2); // mov rdx, rax
pad32();
return;
} else if (p == PRIM_DROP) {
emit8(0x48); emit8(0x89); emit8(0xD0); // mov rax, rdx
pad32();
return;
} else if (p == PRIM_PRINT) {
// FFI Dance: Save RDX, Align RSP (32 shadow + 8 align = 40)
emit8(0x52); // push rdx
emit8(0x48); emit8(0x83); emit8(0xEC); emit8(0x28); // sub rsp, 40
// Map arguments: RCX=RAX, RDX=RDX(already loaded), R8=Globals[0], R9=Globals[1]
emit8(0x48); emit8(0x89); emit8(0xC1); // mov rcx, rax
emit8(0x4C); emit8(0x8B); emit8(0x03); // mov r8, [rbx]
emit8(0x4C); emit8(0x8B); emit8(0x4B); emit8(0x08); // mov r9, [rbx+8]
// Load func ptr and call
emit8(0x49); emit8(0xBA); // mov r10, ...
U8 addr = u8_(& ms_builtin_print);
emit32(u4_(addr & 0xFFFFFFFF));
emit32(u4_(addr >> 32));
emit8(0x41); emit8(0xFF); emit8(0xD2); // call r10
// Restore
emit8(0x48); emit8(0x83); emit8(0xC4); emit8(0x28); // add rsp, 40
emit8(0x5A); // pop rdx
pad32();
return;
}
}
if (val > 0 && val < 0x10000) {
U4 target = tape_to_code_offset[val];
pad32();
S4 rel32 = s4_(target) - s4_(code_arena.used + 5);
emit8(0xE8);
emit32(u4_(rel32));
pad32();
}
}
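The call path at the end of `compile_action` emits `E8 rel32`, where the displacement is measured from the end of the 5-byte instruction rather than its start. A minimal sketch of that arithmetic; `call_rel32` is a hypothetical helper name, and both offsets are assumed to be byte offsets into the same code buffer:

```c
#include <assert.h>
#include <stdint.h>

/* Displacement for a 5-byte `E8 rel32` CALL placed at call_offset.
   The CPU adds rel32 to the address of the next instruction, hence the +5. */
static int32_t call_rel32(uint32_t target_offset, uint32_t call_offset) {
    assert(call_offset <= UINT32_MAX - 5u);
    return (int32_t)((int64_t)target_offset - ((int64_t)call_offset + 5));
}
```

A definition compiled earlier in the buffer yields a negative displacement, which is the common case here because definitions precede their call sites on the tape.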
IA_ void compile_and_run_tape(void)
{
farena_reset(& code_arena);
log_count = 0;
gdi_log_count = 0;
emit8(0x53); // push rbx (callee-saved; also aligns RSP to 0 mod 16)
emit8(0x48); emit8(0x89); emit8(0xCB); // mov rbx, rcx (stable globals ptr for whole JIT session)
emit8(0x48); emit8(0x8B); emit8(0x43); emit8(0x70); // mov rax, [rbx+0x70]
emit8(0x48); emit8(0x8B); emit8(0x53); emit8(0x78); // mov rdx, [rbx+0x78]
U4*r tape_ptr = u4_r(tape_arena.start);
U8*r anno_ptr = u8_r(anno_arena.start);
B4 in_def = false;
U4 def_jmp_offset = 0;
U8 end_idx = run_full ? (tape_arena.used / sizeof(U4)) : (cursor_idx + 1);
for (U8 i = 0; i < end_idx; i++)
{
U4 tag = unpack_tag(tape_ptr[i]);
U4 val = unpack_val(tape_ptr[i]);
if (tag == STag_Define)
{
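if (in_def == false) {
pad32();
emit8(0xE9); // jmp rel32: top-level flow skips over the definition body
def_jmp_offset = code_arena.used; // operand is patched when the definition block ends
emit32(0);
pad32();
in_def = true;
} else {
emit8(0xC3); // ret: close the previous definition
pad32();
}
tape_to_code_offset[i] = code_arena.used; // entry point used by calls to this definition
emit8(0x48); emit8(0x87); emit8(0xC2); // xchg rdx, rax
pad32();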
}
else if (tag == STag_Call || tag == STag_Imm)
{
char* name = (char*)&anno_ptr[i];
char val_hex[9];
u64_to_hex(val, val_hex, 8);
val_hex[8] = '\0';
KTL_Slot_Str8 call_log_table[] = {
{ ktl_str8_key("name"), str8(name) },
{ ktl_str8_key("val"), str8(val_hex) },
};
debug_log(str8("Compiling call: <name> (val: <val>)"), ktl_str8_from_arr(call_log_table));
if (tag == STag_Imm && in_def) {
emit8(0xC3);
pad32();
U4 current = code_arena.used;
u4_r(code_arena.start + def_jmp_offset)[0] = current - (def_jmp_offset + 4);
in_def = false;
}
compile_action(val);
}
else if (tag == STag_Data) {
emit8(0x48); emit8(0x89); emit8(0xC2);
emit8(0x48); emit8(0xC7); emit8(0xC0); emit32(val);
pad32();
}
}
if (in_def) {
emit8(0xC3);
pad32();
U4 current = code_arena.used;
u4_r(code_arena.start + def_jmp_offset)[0] = current - (def_jmp_offset + 4);
}
emit8(0x48); emit8(0x89); emit8(0x43); emit8(0x70); // mov [rbx+0x70], rax
emit8(0x48); emit8(0x89); emit8(0x53); emit8(0x78); // mov [rbx+0x78], rdx
emit8(0x5B); // pop rbx
emit8(0xC3); // ret
typedef void JIT_Func(U8* globals_ptr);
JIT_Func* func = (JIT_Func*)code_arena.start;
func(vm_globals);
vm_rax = vm_globals[14];
vm_rdx = vm_globals[15];
char rax_hex[9];
u64_to_hex(vm_rax, rax_hex, 8);
rax_hex[8] = '\0';
char rdx_hex[9];
u64_to_hex(vm_rdx, rdx_hex, 8);
rdx_hex[8] = '\0';
KTL_Slot_Str8 post_jit_log_table[] = {
{ ktl_str8_key("rax"), str8(rax_hex) },
{ ktl_str8_key("rdx"), str8(rdx_hex) },
};
debug_log(str8("JIT finished. RAX: <rax>, RDX: <rdx>"), ktl_str8_from_arr(post_jit_log_table));
}
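`compile_and_run_tape` emits `JMP rel32` with a zero operand when a definition opens and fills the operand in once the end of the definition block is known. The backpatch step can be sketched on its own as follows; `patch_jmp_rel32` is a hypothetical helper, and it writes the operand byte by byte so the result does not depend on host endianness:

```c
#include <assert.h>
#include <stdint.h>

/* Patch the 4-byte operand of a `JMP rel32` (opcode E9). The operand starts
   at operand_offset; the jump should land on target_offset. */
static void patch_jmp_rel32(uint8_t *code, uint32_t operand_offset, uint32_t target_offset) {
    assert(target_offset >= operand_offset + 4u);  /* forward jumps only in this sketch */
    uint32_t rel = target_offset - (operand_offset + 4u);
    for (int i = 0; i < 4; i++) {
        code[operand_offset + i] = (uint8_t)((rel >> (8 * i)) & 0xFFu);  /* little-endian operand */
    }
}
```

The `+ 4` mirrors `current - (def_jmp_offset + 4)` in the source: the displacement is relative to the first byte after the operand.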
S8 win_proc(void* hwnd, U4 msg, U8 wparam, S8 lparam)
{
U8 tape_count = tape_arena.used / sizeof(U4);
U4*r tape_ptr = u4_r(tape_arena.start);
switch (msg) {
case MS_WM_CHAR: {
if (editor_mode != MODE_EDIT) return 0;
U4 t = tape_ptr[cursor_idx];
U4 tag = unpack_tag(t);
U4 val = unpack_val(t);
U1 c = u1_(wparam);
B4 should_skip = c < 32 || (c == 'e' && mode_switch_now);
if (should_skip) { mode_switch_now = false; return 0; }
if (tag == STag_Data) {
U4 digit = 16;
if (c >= '0' && c <= '9') digit = c - '0';
if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
if (digit < 16) {
val = ((val << 4) | digit) & 0x0FFFFFFF;
tape_ptr[cursor_idx] = pack_token(tag, val);
}
}
else if (tag != STag_Format) {
U8*r anno_ptr = u8_r(anno_arena.start);
char* anno_str = (char*) & anno_ptr[cursor_idx];
int len = 0;
while (len < 8 && anno_str[len] != '\0' && anno_str[len] != ' ') len ++;
if (len < 8) {
anno_str[len] = (char)c;
for (int i = len + 1; i < 8; i++) anno_str[i] = '\0';
if (tag == STag_Call || tag == STag_Imm || tag == STag_Define) {
U4 new_val = resolve_name_to_index(anno_str);
tape_ptr[cursor_idx] = pack_token(tag, new_val);
if (tag == STag_Define) relink_tape();
}
}
}
vm_rax = 0; vm_rdx = 0; mem_zero(u8_(vm_globals), sizeof(vm_globals));
compile_and_run_tape();
ms_invalidate_rect(hwnd, nullptr, true);
return 0;
}
case MS_WM_KEYDOWN: {
if (wparam == 0x45 && editor_mode == MODE_NAV) {
editor_mode = MODE_EDIT;
mode_switch_now = true;
ms_invalidate_rect(hwnd, nullptr, true);
return 0;
}
if (wparam == 0x1B && editor_mode == MODE_EDIT) {
editor_mode = MODE_NAV;
relink_tape();
ms_invalidate_rect(hwnd, nullptr, true);
return 0;
}
if (editor_mode == MODE_EDIT) {
if (wparam == MS_VK_BACK) {
U4 t = tape_ptr[cursor_idx];
U4 tag = unpack_tag(t);
U4 val = unpack_val(t);
if (tag == STag_Data) {
val = val >> 4;
tape_ptr[cursor_idx] = pack_token(tag, val);
}
else if (tag != STag_Format) {
U8*r anno_ptr = u8_r(anno_arena.start);
char* anno_str = (char*) & anno_ptr[cursor_idx];
int len = 0;
while (len < 8 && anno_str[len] != '\0' && anno_str[len] != ' ') len ++;
if (len > 0) {
anno_str[len - 1] = '\0';
if (tag == STag_Call || tag == STag_Imm || tag == STag_Define) {
U4 new_val = resolve_name_to_index(anno_str);
tape_ptr[cursor_idx] = pack_token(tag, new_val);
if (tag == STag_Define) relink_tape();
}
}
}
vm_rax = 0; vm_rdx = 0; mem_zero(u8_(vm_globals), sizeof(vm_globals));
compile_and_run_tape();
ms_invalidate_rect(hwnd, nullptr, true);
}
return 0;
}
if (wparam == MS_VK_RIGHT && cursor_idx + 1 < tape_count) cursor_idx ++; // avoids unsigned underflow when the tape is empty
if (wparam == MS_VK_LEFT && cursor_idx > 0) cursor_idx --;
if (wparam == MS_VK_UP) {
U8 line_start = cursor_idx;
while (line_start > 0 && unpack_tag(tape_ptr[line_start - 1]) != STag_Format) line_start--;
if (line_start > 0) {
U8 col = cursor_idx - line_start;
U8 prev_line_start = line_start - 1;
while (prev_line_start > 0 && unpack_tag(tape_ptr[prev_line_start - 1]) != STag_Format) prev_line_start--;
U8 prev_line_len = (line_start - 1) - prev_line_start;
cursor_idx = prev_line_start + (col < prev_line_len ? col : prev_line_len);
}
}
if (wparam == MS_VK_DOWN) {
U8 line_start = cursor_idx;
while (line_start > 0 && unpack_tag(tape_ptr[line_start - 1]) != STag_Format) line_start --;
U8 col = cursor_idx - line_start;
U8 next_line_start = cursor_idx;
while (next_line_start < tape_count && unpack_tag(tape_ptr[next_line_start]) != STag_Format) next_line_start ++;
if (next_line_start < tape_count) {
next_line_start ++;
U8 next_line_end = next_line_start;
while (next_line_end < tape_count && unpack_tag(tape_ptr[next_line_end]) != STag_Format) next_line_end ++;
U8 next_line_len = next_line_end - next_line_start;
cursor_idx = next_line_start + (col < next_line_len ? col : next_line_len);
}
}
if (wparam == MS_VK_PRIOR) { scroll_y_offset -= 100; if (scroll_y_offset < 0) scroll_y_offset = 0; }
if (wparam == MS_VK_NEXT) { scroll_y_offset += 100; }
if (wparam == MS_VK_F5) { run_full = !run_full; }
if (wparam == MS_VK_F1) { save_cartridge(); }
if (wparam == MS_VK_F2) { load_cartridge(); ms_invalidate_rect(hwnd, nullptr, true); }
if (wparam == MS_VK_TAB) {
U4 t = tape_ptr[cursor_idx];
U4 tag = (unpack_tag(t) + 1) % STag_Count;
tape_ptr[cursor_idx] = pack_token(tag, unpack_val(t));
}
else if (wparam == MS_VK_BACK)
{
U8 delete_idx = cursor_idx;
B4 is_shift = (ms_get_async_key_state(MS_VK_SHIFT) & 0x8000) != 0;
if (is_shift == false) {
if (cursor_idx > 0) {
delete_idx = cursor_idx - 1;
cursor_idx--;
}
else return 0;
}
if (tape_count > 0) {
U8*r anno_ptr = u8_r(anno_arena.start);
for (U8 i = delete_idx; i < tape_count - 1; i ++) {
tape_ptr[i] = tape_ptr[i + 1];
anno_ptr[i] = anno_ptr[i + 1];
}
tape_arena.used -= sizeof(U4);
anno_arena.used -= sizeof(U8);
}
relink_tape();
}
else if (wparam == MS_VK_SPACE || wparam == MS_VK_RETURN) {
B4 is_shift = (ms_get_async_key_state(MS_VK_SHIFT) & 0x8000) != 0;
U8 insert_idx = cursor_idx;
if (is_shift) insert_idx ++;
if (tape_arena.used + sizeof(U4) <= tape_arena.capacity && anno_arena.used + sizeof(U8) <= anno_arena.capacity) {
U8*r anno_ptr = u8_r(anno_arena.start);
for (U8 i = tape_count; i > insert_idx; i --) {
tape_ptr[i] = tape_ptr[i-1];
anno_ptr[i] = anno_ptr[i-1];
}
if (wparam == MS_VK_RETURN) {
tape_ptr[insert_idx] = pack_token(STag_Format, 0xA);
anno_ptr[insert_idx] = 0;
} else {
tape_ptr[insert_idx] = pack_token(STag_Comment, 0);
anno_ptr[insert_idx] = 0;
}
if (is_shift) cursor_idx ++;
tape_arena.used += sizeof(U4);
anno_arena.used += sizeof(U8);
}
}
vm_rax = 0; vm_rdx = 0;
mem_zero(u8_(vm_globals), sizeof(vm_globals));
compile_and_run_tape();
ms_invalidate_rect(hwnd, nullptr, true);
return 0;
}
case MS_WM_PAINT: {
MS_PAINTSTRUCT ps;
void* hdc = ms_begin_paint(hwnd, & ps);
void* hFont = ms_create_font_a(20, 0, 0, 0, 400, 0, 0, 0, 0, 0, 0, 0, 0, "Consolas");
void* hOldFont = ms_select_object(hdc, hFont);
ms_set_bk_mode(hdc, 1);
void* hBgBrush = ms_create_solid_brush(0x00222222);
ms_select_object(hdc, hBgBrush);
ms_rectangle(hdc, -1, -1, 3000, 3000);
void* hBrushEdit = ms_create_solid_brush(0x008E563B);
void* hBrushNav = ms_create_solid_brush(0x00262F3B);
S4 start_x = 40, start_y = 60, spacing_x = 110, spacing_y = 35;
S4 x = start_x, y = start_y;
U4*r tape_ptr = u4_r(tape_arena.start);
U8*r anno_ptr = u8_r(anno_arena.start);
for (U8 i = 0; i < tape_count; i++)
{
if (x >= start_x + (TOKENS_PER_ROW * spacing_x)) {
x = start_x; y += spacing_y;
}
S4 render_y = y - scroll_y_offset;
if (i == cursor_idx && render_y >= 30 && render_y < 500) {
ms_select_object(hdc, editor_mode == MODE_EDIT ? hBrushEdit : hBrushNav);
ms_rectangle(hdc, x - 5, render_y - 2, x + 95, render_y + 22);
}
if (render_y >= 30 && render_y < 500)
{
U4 t = tape_ptr[i];
U4 tag = unpack_tag(t);
U4 val = unpack_val(t);
U8 anno = anno_ptr[i];
if (tag == STag_Format && val == 0xA) {
ms_set_text_color(hdc, 0x00444444);
ms_text_out_a(hdc, x, render_y, "  \\n  ", 6);
x = start_x;
y += spacing_y;
}
else
{
U4 color = tag_colors[tag];
const char* prefix = tag_prefixes[tag];
ms_set_text_color(hdc, color);
if (editor_mode == MODE_EDIT && i == cursor_idx) {
ms_set_text_color(hdc, 0x001E1E1E);
}
char val_str[9];
if (tag == STag_Data) {
u64_to_hex(val, val_str, 6);
val_str[6] = '\0';
}
else
{
char* a_str = (char*) & anno;
for(int c=0; c<8; c++) {
val_str[c] = a_str[c] ? a_str[c] : ' ';
}
val_str[8] = '\0';
}
char out_buf[12];
out_buf[0] = prefix[0];
out_buf[1] = ' ';
mem_copy(u8_(out_buf + 2), u8_(val_str), 8);
out_buf[10] = '\0';
ms_text_out_a(hdc, x, render_y, out_buf, 10);
x += spacing_x;
}
}
else if (unpack_tag(tape_ptr[i]) == STag_Format && unpack_val(tape_ptr[i]) == 0xA) {
x = start_x;
y += spacing_y;
}
else {
x += spacing_x;
}
}
void* hHudBrush = ms_create_solid_brush(0x00141E23);
ms_select_object(hdc, hHudBrush);
ms_rectangle(hdc, -1, 500, 3000, 3000);
ms_rectangle(hdc, -1, -1, 3000, 40);
ms_set_text_color(hdc, 0x00AAAAAA);
ms_text_out_a(hdc, 40, 10, "x86-64 Machine Code Emitter | 2-Reg Stack | [F5] Toggle Run Mode | [PgUp/PgDn] Scroll", 85);
ms_set_text_color(hdc, 0x00FFFFFF);
char jit_str[64] = "Mode: Incremental | JIT Size: 0x000 bytes";
if (run_full) mem_copy(u8_(jit_str + 6), u8_("Full       "), 11); // padded to overwrite all 11 chars of "Incremental"
u64_to_hex(code_arena.used, jit_str + 32, 3);
ms_text_out_a(hdc, 40, 520, jit_str, 41);
char state_str[64] = "RAX: 00000000 | RDX: 00000000";
u64_to_hex(vm_rax, state_str + 5, 8);
u64_to_hex(vm_rdx, state_str + 21, 8);
ms_set_text_color(hdc, 0x0094BAA1);
ms_text_out_a(hdc, 40, 550, state_str, 29);
if (tape_count > 0 && cursor_idx < tape_count) {
U4 cur_tag = unpack_tag(tape_ptr[cursor_idx]);
const char* tag_name = tag_names [cur_tag];
U4 cur_color = tag_colors[cur_tag];
char semantics_str[64] = "Current Tag: ";
U4 name_len = 0;
while (tag_name[name_len]) {
semantics_str[13 + name_len] = tag_name[name_len];
name_len ++;
}
semantics_str[13 + name_len] = '\0';
ms_set_text_color(hdc, cur_color);
ms_text_out_a(hdc, 40, 580, semantics_str, 13 + name_len);
}
ms_set_text_color(hdc, 0x00C8C8C8);
ms_text_out_a(hdc, 400, 520, "Global Memory (Contiguous Array):", 33);
for (int i=0; i < 4; i ++) {
char glob_str[32] = "[0]: 00000000";
glob_str[1] = '0' + i;
u64_to_hex(vm_globals[i], glob_str + 5, 8);
ms_set_text_color(hdc, 0x00D6A454);
ms_text_out_a(hdc, 400, 550 + (i * 25), glob_str, 13);
}
ms_set_text_color(hdc, 0x00C8C8C8);
ms_text_out_a(hdc, 750, 520, "Print Log:", 10);
for (int i = 0; i<log_count && i < 4; i ++) {
char log_str[32] = "00000000";
u64_to_hex(log_buffer[i], log_str, 8);
ms_set_text_color(hdc, 0x0094BAA1);
ms_text_out_a(hdc, 750, 550 + (i * 25), log_str, 8);
}
ms_set_text_color(hdc, 0x00C8C8C8);
ms_text_out_a(hdc, 40, 650, "Debug Log:", 10);
for (U4 i = 0; i < gdi_log_count; i++) {
U4 len = 0;
while(len < GDI_LOG_MAX_LINE_LEN && gdi_log_buffer[i][len] != '\0') { // bounds check first, then read
len++;
}
ms_set_text_color(hdc, 0x00AAAAAA);
ms_text_out_a(hdc, 40, 670 + (i * 20), gdi_log_buffer[i], len);
}
ms_select_object(hdc, hOldFont);
ms_delete_object(hFont); // the font is created on every WM_PAINT, so release it here
ms_delete_object(hBgBrush);
ms_delete_object(hBrushEdit);
ms_delete_object(hBrushNav);
ms_delete_object(hHudBrush);
ms_end_paint(hwnd, & ps);
return 0;
}
case MS_WM_DESTROY: { ms_post_quit_message(0); return 0; }
}
return ms_def_window_proc_a(hwnd, msg, wparam, lparam);
}
int main(void) {
Slice tape_mem = slice_ut_(u8_(ms_virtual_alloc(nullptr, 64 * 1024, MS_MEM_COMMIT | MS_MEM_RESERVE, MS_PAGE_READWRITE)), 64 * 1024);
Slice anno_mem = slice_ut_(u8_(ms_virtual_alloc(nullptr, 64 * 1024, MS_MEM_COMMIT | MS_MEM_RESERVE, MS_PAGE_READWRITE)), 64 * 1024);
Slice code_mem = slice_ut_(u8_(ms_virtual_alloc(nullptr, 64 * 1024, MS_MEM_COMMIT | MS_MEM_RESERVE, MS_PAGE_EXECUTE_READWRITE)), 64 * 1024);
if (! tape_mem.ptr || ! anno_mem.ptr || ! code_mem.ptr) ms_exit_process(1);
farena_init(& tape_arena, tape_mem);
farena_init(& anno_arena, anno_mem);
farena_init(& code_arena, code_mem);
scatter(pack_token(STag_Comment, 0), "INIT ");
scatter(pack_token(STag_Data, 5), 0);
scatter(pack_token(STag_Data, 0), 0);
scatter(pack_token(STag_Imm, 0), "STORE ");
scatter(pack_token(STag_Data, 1), 0);
scatter(pack_token(STag_Data, 1), 0);
scatter(pack_token(STag_Imm, 0), "STORE ");
scatter(pack_token(STag_Format, 0xA), 0);
scatter(pack_token(STag_Define, 0), "F_STEP ");
scatter(pack_token(STag_Data, 0), 0);
scatter(pack_token(STag_Call, 0), "FETCH ");
scatter(pack_token(STag_Call, 0), "RET_IF_Z");
scatter(pack_token(STag_Format, 0xA), 0);
scatter(pack_token(STag_Data, 1), 0);
scatter(pack_token(STag_Call, 0), "FETCH ");
scatter(pack_token(STag_Data, 0), 0);
scatter(pack_token(STag_Call, 0), "FETCH ");
scatter(pack_token(STag_Call, 0), "MULT ");
scatter(pack_token(STag_Data, 1), 0);
scatter(pack_token(STag_Call, 0), "STORE ");
scatter(pack_token(STag_Format, 0xA), 0);
scatter(pack_token(STag_Data, 0), 0);
scatter(pack_token(STag_Call, 0), "FETCH ");
scatter(pack_token(STag_Call, 0), "DEC ");
scatter(pack_token(STag_Data, 0), 0);
scatter(pack_token(STag_Call, 0), "STORE ");
scatter(pack_token(STag_Data, 1), 0);
scatter(pack_token(STag_Call, 0), "FETCH ");
scatter(pack_token(STag_Call, 0), "PRINT ");
scatter(pack_token(STag_Format, 0xA), 0);
scatter(pack_token(STag_Imm, 0), "F_STEP ");
scatter(pack_token(STag_Imm, 0), "F_STEP ");
scatter(pack_token(STag_Imm, 0), "F_STEP ");
scatter(pack_token(STag_Imm, 0), "F_STEP ");
scatter(pack_token(STag_Imm, 0), "F_STEP ");
relink_tape();
run_full = true;
compile_and_run_tape();
run_full = false;
MS_WNDCLASSA wc;
mem_fill(u8_(& wc), 0, sizeof(wc));
wc.lpfnWndProc = win_proc;
wc.hInstance = ms_get_stock_object(0);
wc.lpszClassName = "ColorForthWindow";
wc.hbrBackground = ms_get_stock_object(4);
ms_register_class_a(& wc);
void* hwnd = ms_create_window_ex_a(0, wc.lpszClassName, "Sourceless Global Memory Explorer", MS_WS_OVERLAPPEDWINDOW | MS_WS_VISIBLE, 100, 100, 1100, 750, nullptr, nullptr, wc.hInstance, nullptr);
MS_MSG msg;
while (ms_get_message_a(& msg, nullptr, 0, 0)) { ms_translate_message(& msg); ms_dispatch_message_a(& msg); }
ms_exit_process(0);
return 0;
}


@@ -1,98 +0,0 @@
#pragma region OS
#pragma warning(push)
#pragma warning(disable: 4820)
#pragma comment(lib, "Kernel32.lib")
#pragma comment(lib, "Advapi32.lib")
#define MS_INVALID_HANDLE_VALUE ((MS_HANDLE)(S8)-1)
#define MS_ANYSIZE_ARRAY 1
#define MS_MEM_COMMIT 0x00001000
#define MS_MEM_RESERVE 0x00002000
#define MS_MEM_RELEASE 0x00008000
#define MS_MEM_LARGE_PAGES 0x20000000
#define MS_PAGE_READWRITE 0x04
#define MS_TOKEN_ADJUST_PRIVILEGES (0x0020)
#define MS_SE_PRIVILEGE_ENABLED (0x00000002L)
#define MS_TOKEN_QUERY (0x0008)
#define MS__TEXT(quote) L ## quote // r_winnt
#define MS_TEXT(quote) MS__TEXT(quote) // r_winnt
#define MS_SE_LOCK_MEMORY_NAME MS_TEXT("SeLockMemoryPrivilege")
typedef int MS_BOOL;
typedef unsigned long MS_DWORD;
typedef MS_DWORD* MS_PDWORD;
typedef void* MS_HANDLE;
typedef MS_HANDLE* MS_PHANDLE;
typedef long MS_LONG;
typedef S8 MS_LONGLONG;
typedef char const* MS_LPCSTR;
typedef unsigned short* MS_LPWSTR, *MS_PWSTR;
typedef void* MS_LPVOID;
typedef MS_DWORD* MS_LPDWORD;
typedef U8 MS_ULONG_PTR, *MS_PULONG_PTR;
typedef void const* MS_LPCVOID;
typedef struct MS_SECURITY_ATTRIBUTES *MS_PSECURITY_ATTRIBUTES, *MS_LPSECURITY_ATTRIBUTES;
typedef struct MS_OVERLAPPED *MS_LPOVERLAPPED;
typedef def_union(MS_LARGE_INTEGER) { struct { MS_DWORD LowPart; MS_LONG HighPart; } _; struct { MS_DWORD LowPart; MS_LONG HighPart; } u; MS_LONGLONG QuadPart; };
typedef def_struct(MS_FILE) { void* _Placeholder; };
typedef def_struct(MS_SECURITY_ATTRIBUTES) { MS_DWORD nLength; A4_B1 _PAD_; MS_LPVOID lpSecurityDescriptor; MS_BOOL bInheritHandle; };
typedef def_struct(MS_OVERLAPPED) { MS_ULONG_PTR Internal; MS_ULONG_PTR InternalHigh; union { struct { MS_DWORD Offset; MS_DWORD OffsetHigh; } _; void* Pointer; } _; MS_HANDLE hEvent; };
typedef struct MS_LUID* MS_PLUID;
typedef struct MS_LUID_AND_ATTRIBUTES* MS_PLUID_AND_ATTRIBUTES;
typedef struct MS_TOKEN_PRIVILEGES* MS_PTOKEN_PRIVILEGES;
typedef def_struct(MS_LUID) { MS_DWORD LowPart; MS_LONG HighPart; };
typedef def_struct(MS_LUID_AND_ATTRIBUTES) { MS_LUID Luid; MS_DWORD Attributes; };
typedef def_struct(MS_TOKEN_PRIVILEGES) { MS_DWORD PrivilegeCount; MS_LUID_AND_ATTRIBUTES Privileges[MS_ANYSIZE_ARRAY]; };
WinAPI MS_BOOL CloseHandle(MS_HANDLE hObject);
WinAPI MS_BOOL AdjustTokenPrivileges(MS_HANDLE TokenHandle, MS_BOOL DisableAllPrivileges, MS_PTOKEN_PRIVILEGES NewState, MS_DWORD BufferLength, MS_PTOKEN_PRIVILEGES PreviousState, MS_PDWORD ReturnLength);
WinAPI MS_HANDLE GetCurrentProcess(void);
WinAPI U8 GetLargePageMinimum(void);
WinAPI MS_BOOL LookupPrivilegeValueW(MS_LPWSTR lpSystemName, MS_LPWSTR lpName, MS_PLUID lpLuid);
WinAPI MS_BOOL OpenProcessToken(MS_HANDLE ProcessHandle, MS_DWORD DesiredAccess, MS_PHANDLE TokenHandle);
WinAPI MS_LPVOID VirtualAlloc(MS_LPVOID lpAddress, U8 dwSize, MS_DWORD flAllocationType, MS_DWORD flProtect);
WinAPI MS_BOOL VirtualFree (MS_LPVOID lpAddress, U8 dwSize, MS_DWORD dwFreeType);
#pragma warning(pop)
typedef def_struct(OS_Windows_State) { OS_SystemInfo system_info; };
global OS_Windows_State os__windows_info;
IA_ OS_SystemInfo* os_system_info(void) { return & os__windows_info.system_info; }
I_
void os__enable_large_pages(void) {
MS_HANDLE token;
if (OpenProcessToken(GetCurrentProcess(), MS_TOKEN_ADJUST_PRIVILEGES | MS_TOKEN_QUERY, &token))
{
MS_LUID luid;
if (LookupPrivilegeValueW(0, MS_SE_LOCK_MEMORY_NAME, &luid))
{
MS_TOKEN_PRIVILEGES priv;
priv.PrivilegeCount = 1;
priv.Privileges[0].Luid = luid;
priv.Privileges[0].Attributes = MS_SE_PRIVILEGE_ENABLED;
AdjustTokenPrivileges(token, 0, & priv, size_of(priv), 0, 0);
}
CloseHandle(token);
}
}
I_
void os_init(void) {
os__enable_large_pages();
OS_SystemInfo*R_ info = & os__windows_info.system_info;
info->target_page_size = (U8)GetLargePageMinimum();
}
// TODO(Ed): Large pages disabled for now... (not failing gracefully)
IA_ U8 os__vmem_reserve(U8 size, Opts_vmem*R_ opts) {
assert(opts != nullptr);
void*R_ result = VirtualAlloc(cast(void*R_, opts->base_addr), size
, MS_MEM_RESERVE
// |MS_MEM_COMMIT|(opts->no_large_pages == false ? MS_MEM_LARGE_PAGES : 0)
, MS_PAGE_READWRITE
);
return u8_(result);
}
IA_ B4 os__vmem_commit(U8 vm, U8 size, Opts_vmem*R_ opts) {
assert(opts != nullptr);
// if (opts->no_large_pages == false ) { return 1; }
B4 result = (VirtualAlloc(cast(MS_LPVOID, vm), size, MS_MEM_COMMIT, MS_PAGE_READWRITE) != 0);
return result;
}
I_ void os_vmem_release(U8 vm, U8 size) { VirtualFree(cast(MS_LPVOID, vm), 0, MS_MEM_RELEASE); }
#pragma endregion OS