diff --git a/GEMINI.md b/GEMINI.md
index b06ab54..712a593 100644
--- a/GEMINI.md
+++ b/GEMINI.md
@@ -44,28 +44,24 @@ Based on the curation in `./references/`, the resulting system MUST adhere to th
 
 ## Current Development Roadmap (attempt_1)
 
-Here's a breakdown of the next steps to advance the `attempt_1` implementation towards a ColorForth derivative:
+The prototype currently implements a functional WinAPI modal editor, a 2-register (`RAX`/`RDX`) JIT compiler with an `O(1)` visual linker, x68 32-bit instruction padding, implicit definition boundaries (Magenta Pipe), and an initial FFI Bridge (`emit_ffi_dance`).
 
-1. **Enhance Lexer/Parser/Compiler (JIT) in `main.c`:**
-    * **Token Interpretation:** Refine the interpretation of the 28-bit payload based on the 4-bit color tag (e.g., differentiate between immediate values, dictionary IDs, and data addresses).
-    * **Dictionary Lookup:** Improve the efficiency and scalability of dictionary lookups for custom words beyond the current linear search.
-    * **New Word Definition:** Implement mechanisms for defining new Forth words directly within the editor, compiling them into the `code_arena`.
+Here is a breakdown of the next steps to advance the `attempt_1` implementation towards a complete ColorForth derivative:
 
-2. **Refine Visual Editor (`win_proc` in `main.c`):**
-    * **Dynamic Colorization:** Ensure all rendered tokens accurately reflect their 4-bit color tags, updating dynamically with changes.
-    * **Annotation Handling:** Implement more sophisticated display for token annotations, supporting up to 8 characters clearly without truncation or visual artifacts.
-    * **Input Handling:** Improve text input for `STag_Data` (e.g., supporting full hexadecimal input, backspace functionality).
-    * **Cursor Behavior:** Ensure the cursor accurately reflects the current editing position within the token stream.
+1. **Refine the FFI / Tape Drive Argument Scatter:**
+    * Currently, the FFI bridge only maps `RAX` and `RDX` to the C-ABI `RCX` and `RDX`.
+    * Implement "Preemptive Scatter" logic so the FFI bridge correctly reads subsequent arguments (e.g., `R8`, `R9`) directly from pre-defined offsets in the `vm_globals` tape drive instead of just zeroing them out.
 
-3. **Expand Register-Only Stack Operations:**
-    * Implement core Forth stack manipulation words (e.g., `DUP`, `DROP`, `OVER`, `ROT`) by generating appropriate x86-64 assembly instructions that operate solely on `RAX` and `RDX`.
+2. **Expanded Annotation Layer (Variable-Length Comments):**
+    * The current `anno_arena` strictly allocates 8 bytes (a `U8`) per token.
+    * Refactor the visual editor and annotation memory management to allow for arbitrarily long text blocks (comments) to be attached to specific tokens without disrupting the `O(1)` compilation mapping.
 
-4. **Develop `Tape Drive` Memory Management:**
-    * Ensure all memory access (read/write) for Forth variables and data structures correctly utilize the `vm_globals` array and the "preemptive scatter" approach.
+3. **Implement the Self-Modifying Cartridge (Persistence):**
+    * The tape and annotations are currently lost when the program closes.
+    * Move away from purely transient `VirtualAlloc` buffers to a memory-mapped file approach (or a manual Save/Load equivalent in WinAPI) to allow the "executable as source" to persist between sessions.
 
-5. **Implement Control Flow without Branches:**
-    * Leverage conditional returns and factored calls to create more complex control flow structures (e.g., `IF`/`ELSE`/`THEN` equivalents) without introducing explicit `jmp` instructions where not architecturally intended.
+4. **Refine Visual Editor Interactions:**
+    * Implement a proper internal text-editing cursor within the `STag_Data` and `STag_Format` (annotation) tokens, rather than relying on backspace-truncation and appendage.
 
-6. **Continuous Validation & Debugging:**
-    * Enhance debugging output within the UI to provide clearer insight into VM state (RAX, RDX, global memory, log buffer) during execution.
-    * Consider adding simple "tests" as Forth sequences within `tape_arena` to verify new features.
+5. **Continuous Validation & Complex Control Flow:**
+    * Expand the primitive set to allow for more complex, AST-less control flow (e.g., handling Lambdas or specific Basic Block jumps).
diff --git a/attempt_1/attempt_1.md b/attempt_1/attempt_1.md
index c82d3c8..80b4072 100644
--- a/attempt_1/attempt_1.md
+++ b/attempt_1/attempt_1.md
@@ -13,7 +13,7 @@ The application presents a visual grid of 32-bit tokens and allows the user to n
 
 2. **Annotation Layer (`FArena` anno):**
     * A parallel `FArena` of `U8` (64-bit) integers stores an 8-character string for each corresponding token on the tape.
-    * The UI renderer prioritizes displaying this string, but the compiler only ever sees the 2-character ID packed into the 32-bit token, successfully implementing Lottes' dictionary annotation strategy.
+    * The UI renderer prioritizes displaying this string, but the compiler only ever sees the indices packed into the 32-bit token.
 
 3. **2-Register Stack & Global Memory:**
     * The JIT compiler emits x86-64 that strictly adheres to Onat's `RAX`/`RDX` register stack.
@@ -23,25 +23,30 @@ The application presents a visual grid of 32-bit tokens and allows the user to n
     * A small set of `emit8`/`emit32` functions write raw x86-64 opcodes into a `VirtualAlloc` block marked as executable (`PAGE_EXECUTE_READWRITE`).
     * This buffer is cast to a C function pointer and called directly, bypassing the need for an external assembler like NASM or a complex library like Zydis for this prototype stage.
 
-5. **2-Character Mapped Dictionary & Resolver:**
-    * The `ID2(a, b)` macro packs two characters into a 16-bit integer for use as a token's payload.
-    * The JIT compiler maintains a simple array-based dictionary. On a `: Define` token, it records the ID and the current memory offset. On a `~ Call` token, it looks up the ID and emits a relative 32-bit `CALL` instruction (`0xE8`).
-    * It also correctly emits `JMP` instructions to skip over definition bodies during linear execution.
-
-6. **Modal Editor (Win32 GDI):**
+5. **Modal Editor (Win32 GDI):**
     * The UI is built with raw Win32 GDI calls defined in `duffle.h`.
     * It features two modes: `Navigation` (gray cursor, arrow key movement) and `Edit` (orange cursor, text input).
     * The editor correctly handles token insertion, deletion (Vim-style backspace), tag cycling (Tab), and value editing, all while re-compiling and re-executing on every keystroke.
 
+6. **O(1) Dictionary & Visual Linking:**
+    * The dictionary relies on an edit-time visual linker. When the tape is modified, `relink_tape` resolves names to absolute source memory indices.
+    * The compiler resolves references in `O(1)` time instantly by indexing into an offset mapping table (`tape_to_code_offset`).
+
+7. **Implicit Definition Boundaries (Magenta Pipe):**
+    * Definitions implicitly cause the JIT to emit a `RET` to close the prior block, and an `xchg rax, rdx` to rotate the stack for the new block.
+
+8. **x68 Instruction Padding:**
+    * The JIT pads every logical block/instruction to exact 32-bit multiples using `0x90` (NOPs) to perfectly align with the visual token grid logic.
+
+9. **The FFI Bridge:**
+    * The system uses an FFI macro (`emit_ffi_dance`) to align the `RSP` stack to 16 bytes, allocate 32 bytes of shadow space, and map the 2-register data stack/globals into the Windows C-ABI (`RCX`, `RDX`, `R8`, `R9`) to safely call WinAPI functions (like `MessageBoxA`).
+
 ## What's Missing (TODO)
 
-* **Saving/Loading:** The tape and annotation arenas are purely in-memory and are lost when the program closes.
-* **Expanded Instruction Set:** The JIT only knows a handful of primitives (`SWAP`, `MULT`, `ADD`, `FETCH`, `STORE`, `DEC`, `RET_IF_ZERO`, `PRINT`). It has no support for floating point or more complex branches.
-* **The FFI Bridge:** The system needs a macro (like Onat's `CCALL`) to align the `RSP` stack to 16 bytes and map the 2-register data stack/globals into the Windows C-ABI (`RCX`, `RDX`, `R8`, `R9`) to call WinAPI safely from the JIT.
-* **Implicit Definition Boundaries (Magenta Pipe):** Definitions should not need explicit `begin`/`end`. A definition token should implicitly cause the JIT to emit a `RET` to close the prior block, and an `xchg rax, rdx` to rotate the stack for the new block.
-* **x68 Instruction Padding:** The JIT currently emits variable-length instructions (`emit8`). It needs to pad every logical block/instruction to exact 32-bit multiples using ignored prefixes or NOPs to perfectly align with the visual token grid.
-* **O(1) Dictionary & Visual Linking:** The current dictionary is a simple string-matched array rebuilt on compile. It needs to transition to a true "visual linker" where visual tokens store the absolute source memory indices, resolving locations instantly at edit-time.
-* **Annotation Editing:** Typing into an annotation just appends characters. A proper text-editing cursor within the token is needed.
+* **Saving/Loading (Persistence):** The tape and annotation arenas are purely in-memory and are lost when the program closes. Need to implement the self-modifying OS cartridge concept.
+* **Expanded Instruction Set:** The JIT only knows a handful of primitives. It has no support for floating point or more complex branches.
+* **Annotation Editing & Comments:** Typing into an annotation just appends characters up to 8 bytes. A proper text-editing cursor within the token is needed, and support for arbitrarily long comments should be implemented.
+* **Tape Drive / Preemptive Scatter Logic:** Improve the FFI argument mapping to properly read from the "tape drive" memory slots instead of just mapping RAX/RDX to the first parameters.
 
 ## References Utilized
 * **Heavily Utilized:**