notes
34
GEMINI.md
@@ -44,28 +44,24 @@ Based on the curation in `./references/`, the resulting system MUST adhere to th
|
|||||||
|
|
||||||
## Current Development Roadmap (attempt_1)
|
## Current Development Roadmap (attempt_1)
|
||||||
|
|
||||||
Here's a breakdown of the next steps to advance the `attempt_1` implementation towards a ColorForth derivative:
|
The prototype currently implements a functional WinAPI modal editor, a 2-register (`RAX`/`RDX`) JIT compiler with an `O(1)` visual linker, x68 32-bit instruction padding, implicit definition boundaries (Magenta Pipe), and an initial FFI Bridge (`emit_ffi_dance`).
|
||||||
|
|
||||||
1. **Enhance Lexer/Parser/Compiler (JIT) in `main.c`:**
|
Here is a breakdown of the next steps to advance the `attempt_1` implementation towards a complete ColorForth derivative:
|
||||||
* **Token Interpretation:** Refine the interpretation of the 28-bit payload based on the 4-bit color tag (e.g., differentiate between immediate values, dictionary IDs, and data addresses).
|
|
||||||
* **Dictionary Lookup:** Improve the efficiency and scalability of dictionary lookups for custom words beyond the current linear search.
|
|
||||||
* **New Word Definition:** Implement mechanisms for defining new Forth words directly within the editor, compiling them into the `code_arena`.
|
|
||||||
|
|
||||||
2. **Refine Visual Editor (`win_proc` in `main.c`):**
|
1. **Refine the FFI / Tape Drive Argument Scatter:**
|
||||||
* **Dynamic Colorization:** Ensure all rendered tokens accurately reflect their 4-bit color tags, updating dynamically with changes.
|
* Currently, the FFI bridge only maps `RAX` and `RDX` to the C-ABI `RCX` and `RDX`.
|
||||||
* **Annotation Handling:** Implement more sophisticated display for token annotations, supporting up to 8 characters clearly without truncation or visual artifacts.
|
* Implement "Preemptive Scatter" logic so the FFI bridge correctly reads subsequent arguments (e.g., `R8`, `R9`) directly from pre-defined offsets in the `vm_globals` tape drive instead of just zeroing them out.
|
||||||
* **Input Handling:** Improve text input for `STag_Data` (e.g., supporting full hexadecimal input, backspace functionality).
|
|
||||||
* **Cursor Behavior:** Ensure the cursor accurately reflects the current editing position within the token stream.
|
|
||||||
|
|
||||||
3. **Expand Register-Only Stack Operations:**
|
2. **Expanded Annotation Layer (Variable-Length Comments):**
|
||||||
* Implement core Forth stack manipulation words (e.g., `DUP`, `DROP`, `OVER`, `ROT`) by generating appropriate x86-64 assembly instructions that operate solely on `RAX` and `RDX`.
|
* The current `anno_arena` strictly allocates 8 bytes (a `U8`) per token.
|
||||||
|
* Refactor the visual editor and annotation memory management to allow for arbitrarily long text blocks (comments) to be attached to specific tokens without disrupting the `O(1)` compilation mapping.
|
||||||
|
|
||||||
4. **Develop `Tape Drive` Memory Management:**
|
3. **Implement the Self-Modifying Cartridge (Persistence):**
|
||||||
* Ensure all memory access (read/write) for Forth variables and data structures correctly utilize the `vm_globals` array and the "preemptive scatter" approach.
|
* The tape and annotations are currently lost when the program closes.
|
||||||
|
* Move away from purely transient `VirtualAlloc` buffers to a memory-mapped file approach (or a manual Save/Load equivalent in WinAPI) to allow the "executable as source" to persist between sessions.
|
||||||
|
|
||||||
5. **Implement Control Flow without Branches:**
|
4. **Refine Visual Editor Interactions:**
|
||||||
* Leverage conditional returns and factored calls to create more complex control flow structures (e.g., `IF`/`ELSE`/`THEN` equivalents) without introducing explicit `jmp` instructions where not architecturally intended.
|
* Implement a proper internal text-editing cursor within the `STag_Data` and `STag_Format` (annotation) tokens, rather than relying on backspace truncation and appending.
|
||||||
|
|
||||||
6. **Continuous Validation & Debugging:**
|
5. **Continuous Validation & Complex Control Flow:**
|
||||||
* Enhance debugging output within the UI to provide clearer insight into VM state (RAX, RDX, global memory, log buffer) during execution.
|
* Expand the primitive set to allow for more complex, AST-less control flow (e.g., handling Lambdas or specific Basic Block jumps).
|
||||||
* Consider adding simple "tests" as Forth sequences within `tape_arena` to verify new features.
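The roadmap refers to 32-bit tokens that carry a 4-bit color tag and a 28-bit payload. A minimal sketch of packing and unpacking such a token follows. It assumes the tag occupies the top 4 bits, which the notes do not state, and the `sketch_` names are hypothetical helpers, not functions from `main.c`.

```c
#include <stdint.h>

/* ASSUMPTION: tag lives in bits 31..28, payload in bits 27..0.
   The notes only say "4-bit color tag" and "28-bit payload". */
#define SKETCH_TAG_SHIFT    28u
#define SKETCH_PAYLOAD_MASK 0x0FFFFFFFu

/* Combine a tag and a payload into one 32-bit token (hypothetical helper). */
static inline uint32_t sketch_token_pack(uint32_t tag, uint32_t payload) {
    return ((tag & 0xFu) << SKETCH_TAG_SHIFT) | (payload & SKETCH_PAYLOAD_MASK);
}

/* Extract the 4-bit color tag. */
static inline uint32_t sketch_token_tag(uint32_t token) {
    return token >> SKETCH_TAG_SHIFT;
}

/* Extract the 28-bit payload; how it is interpreted depends on the tag. */
static inline uint32_t sketch_token_payload(uint32_t token) {
    return token & SKETCH_PAYLOAD_MASK;
}
```

With helpers like these, a "test" sequence placed in `tape_arena` could be checked by round-tripping each token through pack and unpack before it is compiled.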
|
|
||||||
|
|||||||
@@ -13,7 +13,7 @@ The application presents a visual grid of 32-bit tokens and allows the user to n
|
|||||||
|
|
||||||
2. **Annotation Layer (`FArena` anno):**
|
2. **Annotation Layer (`FArena` anno):**
|
||||||
* A parallel `FArena` of `U8` (64-bit) integers stores an 8-character string for each corresponding token on the tape.
|
* A parallel `FArena` of `U8` (64-bit) integers stores an 8-character string for each corresponding token on the tape.
|
||||||
* The UI renderer prioritizes displaying this string, but the compiler only ever sees the 2-character ID packed into the 32-bit token, successfully implementing Lottes' dictionary annotation strategy.
|
* The UI renderer prioritizes displaying this string, but the compiler only ever sees the indices packed into the 32-bit token.
|
||||||
|
|
||||||
3. **2-Register Stack & Global Memory:**
|
3. **2-Register Stack & Global Memory:**
|
||||||
* The JIT compiler emits x86-64 that strictly adheres to Onat's `RAX`/`RDX` register stack.
|
* The JIT compiler emits x86-64 that strictly adheres to Onat's `RAX`/`RDX` register stack.
|
||||||
@@ -23,25 +23,30 @@ The application presents a visual grid of 32-bit tokens and allows the user to n
|
|||||||
* A small set of `emit8`/`emit32` functions write raw x86-64 opcodes into a `VirtualAlloc` block marked as executable (`PAGE_EXECUTE_READWRITE`).
|
* A small set of `emit8`/`emit32` functions write raw x86-64 opcodes into a `VirtualAlloc` block marked as executable (`PAGE_EXECUTE_READWRITE`).
|
||||||
* This buffer is cast to a C function pointer and called directly, bypassing the need for an external assembler like NASM or a complex library like Zydis for this prototype stage.
|
* This buffer is cast to a C function pointer and called directly, bypassing the need for an external assembler like NASM or a complex library like Zydis for this prototype stage.
|
||||||
|
|
||||||
5. **2-Character Mapped Dictionary & Resolver:**
|
5. **Modal Editor (Win32 GDI):**
|
||||||
* The `ID2(a, b)` macro packs two characters into a 16-bit integer for use as a token's payload.
|
|
||||||
* The JIT compiler maintains a simple array-based dictionary. On a `: Define` token, it records the ID and the current memory offset. On a `~ Call` token, it looks up the ID and emits a relative 32-bit `CALL` instruction (`0xE8`).
|
|
||||||
* It also correctly emits `JMP` instructions to skip over definition bodies during linear execution.
|
|
||||||
|
|
||||||
6. **Modal Editor (Win32 GDI):**
|
|
||||||
* The UI is built with raw Win32 GDI calls defined in `duffle.h`.
|
* The UI is built with raw Win32 GDI calls defined in `duffle.h`.
|
||||||
* It features two modes: `Navigation` (gray cursor, arrow key movement) and `Edit` (orange cursor, text input).
|
* It features two modes: `Navigation` (gray cursor, arrow key movement) and `Edit` (orange cursor, text input).
|
||||||
* The editor correctly handles token insertion, deletion (Vim-style backspace), tag cycling (Tab), and value editing, all while re-compiling and re-executing on every keystroke.
|
* The editor correctly handles token insertion, deletion (Vim-style backspace), tag cycling (Tab), and value editing, all while re-compiling and re-executing on every keystroke.
|
||||||
|
|
||||||
|
6. **O(1) Dictionary & Visual Linking:**
|
||||||
|
* The dictionary relies on an edit-time visual linker. When the tape is modified, `relink_tape` resolves names to absolute source memory indices.
|
||||||
|
* The compiler resolves references in `O(1)` time by indexing into an offset mapping table (`tape_to_code_offset`).
|
||||||
|
|
||||||
|
7. **Implicit Definition Boundaries (Magenta Pipe):**
|
||||||
|
* Definitions implicitly cause the JIT to emit a `RET` to close the prior block, and an `xchg rax, rdx` to rotate the stack for the new block.
|
||||||
|
|
||||||
|
8. **x68 Instruction Padding:**
|
||||||
|
* The JIT pads every logical block/instruction to exact 32-bit multiples using `0x90` (NOPs) to perfectly align with the visual token grid logic.
|
||||||
|
|
||||||
|
9. **The FFI Bridge:**
|
||||||
|
* The system uses an FFI macro (`emit_ffi_dance`) to align the `RSP` stack to 16 bytes, allocate 32 bytes of shadow space, and map the 2-register data stack/globals into the Windows C-ABI (`RCX`, `RDX`, `R8`, `R9`) to safely call WinAPI functions (like `MessageBoxA`).
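The x68 padding described in item 8 can be sketched as follows. This is an illustration only: `SketchCodeBuf`, `sketch_emit8`, and `sketch_pad32` are hypothetical names, and the real `emit8` in `main.c` may have a different signature.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical code buffer; the prototype writes into a VirtualAlloc block. */
typedef struct {
    uint8_t *base;  /* start of the buffer */
    size_t   used;  /* bytes emitted so far */
    size_t   cap;   /* total capacity in bytes */
} SketchCodeBuf;

/* Append one byte; returns 0 if the buffer is full, 1 otherwise. */
static int sketch_emit8(SketchCodeBuf *cb, uint8_t byte) {
    if (cb->used >= cb->cap) return 0;
    cb->base[cb->used++] = byte;
    return 1;
}

/* Pad with 0x90 (NOP) until the emitted length is a multiple of 4 bytes,
   so each logical block occupies whole 32-bit slots. */
static int sketch_pad32(SketchCodeBuf *cb) {
    while (cb->used % 4 != 0) {
        if (!sketch_emit8(cb, 0x90)) return 0;
    }
    return 1;
}
```

For example, `xchg rax, rdx` encodes as the two bytes `48 92`, so padding it adds two NOPs to fill one 32-bit slot.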
|
||||||
|
|
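The `O(1)` resolution in item 6 amounts to a single array index followed by a relative `CALL`. The sketch below assumes a fixed-size table and a flat byte buffer. The `sketch_` names and the table capacity are hypothetical, and the x68 padding of the 5-byte instruction is left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

enum { SKETCH_TAPE_CAP = 64 };  /* hypothetical capacity */

/* Maps a tape index (resolved at edit time) to a code offset. */
static uint32_t sketch_tape_to_code_offset[SKETCH_TAPE_CAP];

/* Emit CALL rel32 (opcode 0xE8) at code[at], targeting the definition that
   starts at the given tape index. Returns the offset just past the call,
   or `at` unchanged if the index is out of range. */
static size_t sketch_emit_call(uint8_t *code, size_t at, uint32_t tape_index) {
    if (tape_index >= SKETCH_TAPE_CAP) return at;
    uint32_t target = sketch_tape_to_code_offset[tape_index]; /* O(1) lookup */
    /* rel32 is measured from the end of the 5-byte instruction. */
    int32_t  rel  = (int32_t)target - (int32_t)(at + 5);
    uint32_t bits = (uint32_t)rel;
    code[at + 0] = 0xE8;
    code[at + 1] = (uint8_t)(bits & 0xFF);         /* little-endian rel32 */
    code[at + 2] = (uint8_t)((bits >> 8) & 0xFF);
    code[at + 3] = (uint8_t)((bits >> 16) & 0xFF);
    code[at + 4] = (uint8_t)((bits >> 24) & 0xFF);
    return at + 5;
}
```

No name comparison happens at compile time in this scheme: the token payload already holds the index that `relink_tape` resolved when the tape was edited.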
||||||
## What's Missing (TODO)
|
## What's Missing (TODO)
|
||||||
|
|
||||||
* **Saving/Loading:** The tape and annotation arenas are purely in-memory and are lost when the program closes.
|
* **Saving/Loading (Persistence):** The tape and annotation arenas are purely in-memory and are lost when the program closes. The self-modifying OS cartridge concept still needs to be implemented.
|
||||||
* **Expanded Instruction Set:** The JIT only knows a handful of primitives (`SWAP`, `MULT`, `ADD`, `FETCH`, `STORE`, `DEC`, `RET_IF_ZERO`, `PRINT`). It has no support for floating point or more complex branches.
|
* **Expanded Instruction Set:** The JIT only knows a handful of primitives. It has no support for floating point or more complex branches.
|
||||||
* **The FFI Bridge:** The system needs a macro (like Onat's `CCALL`) to align the `RSP` stack to 16 bytes and map the 2-register data stack/globals into the Windows C-ABI (`RCX`, `RDX`, `R8`, `R9`) to call WinAPI safely from the JIT.
|
* **Annotation Editing & Comments:** Typing into an annotation just appends characters up to 8 bytes. A proper text-editing cursor within the token is needed, and support for arbitrarily long comments should be implemented.
|
||||||
* **Implicit Definition Boundaries (Magenta Pipe):** Definitions should not need explicit `begin`/`end`. A definition token should implicitly cause the JIT to emit a `RET` to close the prior block, and an `xchg rax, rdx` to rotate the stack for the new block.
|
* **Tape Drive / Preemptive Scatter Logic:** Improve the FFI argument mapping to properly read from the "tape drive" memory slots instead of just mapping `RAX`/`RDX` to the first parameters.
|
||||||
* **x68 Instruction Padding:** The JIT currently emits variable-length instructions (`emit8`). It needs to pad every logical block/instruction to exact 32-bit multiples using ignored prefixes or NOPs to perfectly align with the visual token grid.
|
|
||||||
* **O(1) Dictionary & Visual Linking:** The current dictionary is a simple string-matched array rebuilt on compile. It needs to transition to a true "visual linker" where visual tokens store the absolute source memory indices, resolving locations instantly at edit-time.
|
|
||||||
* **Annotation Editing:** Typing into an annotation just appends characters. A proper text-editing cursor within the token is needed.
|
|
||||||
|
|
||||||
## References Utilized
|
## References Utilized
|
||||||
* **Heavily Utilized:**
|
* **Heavily Utilized:**
|
||||||
|
|||||||