claude updates

2026-02-21 10:52:56 -05:00
parent 67c55a50ce
commit 68d0a5997f
2 changed files with 130 additions and 22 deletions

CLAUDE.md

@@ -2,11 +2,30 @@
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## AI Behavior Rules
- **Do not** create shell scripts, README files, or descriptive files unless explicitly instructed.
- **Do not** do anything beyond what was asked. Suggest extras in text; do not implement them.
- If a task is heavy, use sub-agents (codebase investigator, code editor, pattern analyzer, etc.).
- Screenshots are in `C:\Users\Ed\scoop\apps\sharex\current\ShareX\Screenshots\2026-02` — user will
specify which by last-modified. Manually pasted content goes in `./gallery`.
- Do not use `.gitignore` to infer file relevance for context.
- Goal is guided mentorship: validate architecture, give nudges, provide tactical help when asked.
The user is learning to build this system. Do not auto-generate finished solutions.
## Project Overview
**bootslop** is an experimental x86-64 Windows application: a sourceless, zero-overhead ColorForth-inspired programming environment. There is no human-readable source — the "source of truth" is a binary token array (the "tape"). It features a modal visual editor (GDI-based), real-time JIT compilation to x86-64 machine code, and cartridge-based persistence.
**bootslop** is an experimental x86-64 Windows application: a sourceless, zero-overhead
ColorForth-inspired programming environment. Inspired by Timothy Lottes' "x56-40" / source-less
programming series and Onat Türkçüoğlu's VAMP/KYRA register-stack architecture.
The canonical reference for architecture is `references/Architectural_Consolidation.md`. The coding conventions are in `CONVENTIONS.md`.
There is no human-readable source — the "source of truth" is a binary token array (the "tape").
It features a modal visual editor (GDI-based), real-time JIT compilation to x86-64 machine code,
and cartridge-based persistence.
Canonical architecture reference: `references/Architectural_Consolidation.md`
Coding conventions: `CONVENTIONS.md`
AI behavior and goal context: `GEMINI.md`
## Build
@@ -18,7 +37,13 @@ pwsh scripts/build.attempt_1.c.ps1
Output goes to `build/attempt_1.exe`. Run the exe manually — it opens a GUI window.
**Toolchain requirements:** `clang` and `lld-link.exe` on PATH. Targets amd64 Windows 11. Standard: C23. Flags: `-O0 -g -Wall -DBUILD_DEBUG=1`.
**Toolchain requirements:** `clang` and `lld-link.exe` on PATH. Targets amd64 Windows 11.
Compiler flags: `-std=c23 -O0 -g -Wall -DBUILD_DEBUG=1 -fno-exceptions -fdiagnostics-absolute-paths`
Linker flags: `/MACHINE:X64 /SUBSYSTEM:CONSOLE /DEBUG /INCREMENTAL:NO` + `kernel32.lib user32.lib gdi32.lib`
Note: `-nostdlib` / `-ffreestanding` are commented out in the build script — the CRT is currently
linked but `<stdlib.h>` / `<string.h>` must not be included directly.
No automated tests exist. Verification is interactive via the running GUI.
@@ -26,44 +51,122 @@ No automated tests exist. Verification is interactive via the running GUI.
All active source is in `attempt_1/`:
- **`main.c`** — The entire application (~850 lines). Contains: semantic tag definitions (X-macro), global VM state, the JIT compiler (`compile_action`, `compile_and_run`), the GDI renderer, keyboard input handling, and cartridge save/load (F1/F2).
- **`duffle.amd64.win32.h`** — The C DSL header. Defines all base types (`U1`–`U8`, `S1`–`S8`, `F4`, `F8`, `B1`–`B8`, `Str8`, `UTF8`), macros (`global`, `internal`, `LP_`, `I_`, `N_`), arena allocator (`FArena`, `farena_push`, `farena_reset`), string formatting, and raw WinAPI bindings.
- **`main.c`** — The entire application (~867 lines). Contains: semantic tag definitions (X-macro),
global VM state, the JIT compiler (`compile_action`, `compile_and_run_tape`), the GDI renderer,
keyboard input handling, and cartridge save/load (F1/F2).
- **`duffle.amd64.win32.h`** — The C DSL header. Defines all base types (`U1`–`U8`, `S1`–`S8`,
`F4`, `F8`, `B1`–`B8`, `Str8`, `UTF8`), macros (`global`, `internal`, `LP_`, `I_`, `N_`),
arena allocator (`FArena`, `farena_push`, `farena_reset`), string formatting, and raw WinAPI
bindings.
### Token / Tape Model
- Tokens are `U4` (32-bit): top 4 bits = semantic tag, lower 28 bits = value or annotation index.
- Tags are defined via X-macro `Tag_Entries()`: `Define`, `Call`, `Data`, `Imm`, `Comment`, `Format`.
- Two arenas: `tape_arena` (array of `U4` tokens) and `anno_arena` (array of `U8` — 8-char name slots for each token).
- Tags are defined via X-macro `Tag_Entries()`:
`Define` (`:`) · `Call` (`~`) · `Data` (`$`) · `Imm` (`^`) · `Comment` (`.`) · `Format` (` `)
- Two arenas: `tape_arena` (array of `U4` tokens) and `anno_arena` (array of `U8` — one 8-char
name slot per token, space-padded for name resolution).
- Helper macros: `pack_token(tag, val)`, `unpack_tag(token)`, `unpack_val(token)`.
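
The token layout above (4-bit tag, 28-bit value) and the space-padded 8-char name slots can be sketched in standalone C. This is a minimal illustration and not the code in `main.c`: the helpers are written as functions instead of macros, `pack_name8` is a hypothetical name, the byte order inside the `U8` slot is an assumption, and plain `<stdint.h>` typedefs and C casts stand in for the `duffle.amd64.win32.h` types and cast macros.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t U4; // stand-in for the DSL's U4
typedef uint64_t U8; // stand-in for the DSL's U8

enum { TAG_SHIFT = 28 };       // the tag lives in the top 4 bits
#define VAL_MASK 0x0FFFFFFFu   // the value lives in the lower 28 bits

// Function versions of the helper macros named above (assumed behavior).
static U4 pack_token(U4 tag, U4 val)
{
    assert(tag <= 0xF); // only 4 bits are available for the tag
    return (tag << TAG_SHIFT) | (val & VAL_MASK);
}
static U4 unpack_tag(U4 token) { return token >> TAG_SHIFT; }
static U4 unpack_val(U4 token) { return token & VAL_MASK; }

// Hypothetical helper: packs up to 8 characters into one U8 slot and pads the
// rest with spaces. Placing the first character in the lowest byte is an assumption.
static U8 pack_name8(char const* name)
{
    U8 slot  = 0;
    U4 ended = 0;
    for (U4 i = 0; i < 8; ++i)
    {
        if (!ended && name[i] == '\0') ended = 1; // stop reading at the terminator
        unsigned char c = ended ? ' ' : (unsigned char)name[i];
        slot |= (U8)c << (8 * i);
    }
    return slot;
}
```

With this padding, `pack_name8("ADD")` and `pack_name8("ADD     ")` produce the same slot, so a name comparison reduces to a single 64-bit equality test.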
### JIT Compiler
- `compile_action()` — incremental: emits x86-64 machine code into `code_arena` up to the cursor position. Runs on every keystroke for live feedback.
- `compile_and_run()` — full tape compilation + execution. Toggled by F5.
- The VM uses two global registers (`vm_rax`, `vm_rdx`) and 16 global memory cells (`vm_globals[16]`).
- 13 primitive operations: `SWAP`, `MULT`, `ADD`, `FETCH`, `STORE`, `DUP`, `DROP`, `SUB`, `DEC`, `PRINT`, `RET`, `RET_IF_Z`, `RET_IF_S`.
- Name resolution: `resolve_name_to_index()` matches 8-char space-padded annotations against primitives or prior `Define` tokens. After edits, `relink_tape()` re-resolves all `Call`/`Imm` references.
- `compile_action(val)` — emits x86-64 machine code for a single primitive or call. Called by
`compile_and_run_tape` for each token.
- `compile_and_run_tape()` (`IA_` always-inline) — resets `code_arena`, compiles the tape up to
`cursor_idx + 1` (incremental mode, `run_full == false`) or the full tape (`run_full == true`),
then immediately executes the generated code. Called on every relevant keystroke.
- **JIT prologue/epilogue:** The generated function takes `U8* globals_ptr` (= `vm_globals`).
Prologue loads `rax` from `globals_ptr[0x70/8]` = `vm_globals[14]` and `rdx` from
`globals_ptr[0x78/8]` = `vm_globals[15]`. Epilogue stores them back. `vm_rax` / `vm_rdx` are
synced from `vm_globals[14/15]` after execution.
- **The Magenta Pipe:** Every `Define` token emits a `JMP` (to skip over the function body for
inline execution flow) followed by `xchg rax, rdx` at the word entry point. This is the implicit
register-stack rotation at word boundaries — Onat's "magenta pipe".
- **O(1) linker:** `tape_to_code_offset[65536]` maps tape index → byte offset in `code_arena`.
Populated during `compile_and_run_tape` when a `Define` token is encountered.
- The VM uses two global registers (`vm_rax`, `vm_rdx`) and 16 global memory cells
(`vm_globals[16]`). No traditional Forth data stack in memory.
- **13 primitive operations:** `SWAP` · `MULT` · `ADD` · `FETCH` · `STORE` · `DUP` · `DROP` ·
`SUB` · `DEC` · `PRINT` · `RET` · `RET_IF_Z` · `RET_IF_S`
- **32-bit instruction granularity:** All emitted instructions are padded to 4-byte alignment via
NOP bytes (0x90). `pad32()` enforces this after every emit.
- Name resolution: `resolve_name_to_index()` matches 8-char space-padded annotations against
primitives first, then prior `Define` tokens. After edits, `relink_tape()` re-resolves all
`Call`/`Imm` references.
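
The 4-byte padding and the O(1) offset table described in the list above can be sketched as follows. This is a simplified illustration under stated assumptions: `CodeBuf`, `emit_bytes`, and `emit_define_entry` are hypothetical names, a fixed array replaces `code_arena`, and the skip-over `JMP` that precedes each word body is left out. The bytes `48 92` are the standard encoding of `xchg rax, rdx`.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  U1; // stand-in for the DSL's U1
typedef uint32_t U4; // stand-in for the DSL's U4

enum { CODE_CAP = 4096, TAPE_CAP = 65536 };

// Hypothetical fixed-size stand-in for code_arena.
typedef struct CodeBuf { U1 bytes[CODE_CAP]; U4 used; } CodeBuf;

// Tape index -> byte offset of the word's entry point in the code buffer.
static U4 tape_to_code_offset[TAPE_CAP];

static void emit_bytes(CodeBuf* code, U1 const* src, U4 len)
{
    assert(code->used + len <= CODE_CAP);
    for (U4 i = 0; i < len; ++i) code->bytes[code->used++] = src[i];
}

// Pads with NOP (0x90) until the write position is a multiple of 4 bytes.
static void pad32(CodeBuf* code)
{
    while (code->used % 4 != 0)
    {
        assert(code->used < CODE_CAP);
        code->bytes[code->used++] = 0x90;
    }
}

// Records the entry offset for the Define token at tape_idx, then emits the
// register rotation (xchg rax, rdx) padded to the 4-byte granularity.
static void emit_define_entry(CodeBuf* code, U4 tape_idx)
{
    U1 const xchg_rax_rdx[] = { 0x48, 0x92 };
    assert(tape_idx < TAPE_CAP);
    tape_to_code_offset[tape_idx] = code->used;
    emit_bytes(code, xchg_rax_rdx, 2);
    pad32(code);
}
```

A later `Call` that resolves to tape index `n` can then find its target with one array read, `tape_to_code_offset[n]`, with no symbol search.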
### Editor
- Two modes: `MODE_NAV` (navigate) / `MODE_EDIT` (insert tokens). Toggled with `i` / Escape.
- Tape renders as colored token boxes, 8 per row (`TOKENS_PER_ROW`), each showing a prefix character (from `tag_prefixes`) and a 6-char hex value or 8-char name.
- GDI double-buffered rendering. Scroll via arrow keys in NAV mode.
- Two modes: `MODE_NAV` (navigate) / `MODE_EDIT` (type into token). Toggled with `E` / `Escape`.
- **Key bindings (NAV mode):**
- `E` — enter MODE_EDIT
- Arrow keys — move cursor (Up/Down navigate by logical lines delimited by `Format` tokens)
- `Tab` — cycle the current token's tag through `STag_*` values
- `Space` — insert a new `Comment` token at cursor
- `Shift+Space` — insert a new `Comment` token after cursor
- `Return` — insert a `Format` (newline) token at cursor
- `Backspace` — delete token before cursor
- `Shift+Backspace` — delete token at cursor
- `PgUp` / `PgDn` — scroll viewport
- `F5` — toggle `run_full` (incremental ↔ full-tape JIT)
- `F1` — save cartridge to `cartridge.bin`
- `F2` — load cartridge from `cartridge.bin` and run
- **Key bindings (EDIT mode):**
- Hex digits (`0-9`, `a-f`) — shift into `Data` token value
- Any printable char — append to annotation name (up to 8 chars)
- `Backspace` — shift `Data` value right or trim annotation name
- `Escape` — exit to MODE_NAV, triggers `relink_tape()`
- Tape renders as colored token boxes, `TOKENS_PER_ROW` (8) per row, each showing a tag prefix
char and either a 6-char hex value (Data) or an 8-char annotation name.
- GDI rendering via `BeginPaint`/`EndPaint`. The HUD (status bar at bottom) shows RAX/RDX state,
global memory cells [0-3], print log, and debug log.
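
The EDIT-mode behavior for `Data` tokens (hex digits shift in, `Backspace` shifts right) might look like the sketch below. The function names are hypothetical, and it assumes one hex digit equals four bits, that the value is masked to the 28-bit field, and that the tag nibble is left untouched. The actual handling in `main.c` may differ.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t U4; // stand-in for the DSL's U4

#define VAL_MASK 0x0FFFFFFFu // lower 28 bits: value
#define TAG_MASK 0xF0000000u // upper 4 bits: semantic tag

// Returns 0..15 for '0'-'9' and 'a'-'f', or -1 for any other key.
static int hex_digit_value(char key)
{
    if (key >= '0' && key <= '9') return key - '0';
    if (key >= 'a' && key <= 'f') return key - 'a' + 10;
    return -1;
}

// Shifts one hex digit into the low end of the token's value; the tag is kept
// and digits pushed past bit 27 are dropped. Non-hex keys leave the token as-is.
static U4 data_push_hex(U4 token, char key)
{
    int digit = hex_digit_value(key);
    if (digit < 0) return token;
    assert(digit <= 0xF);
    U4 val = (((token & VAL_MASK) << 4) | (U4)digit) & VAL_MASK;
    return (token & TAG_MASK) | val;
}

// Backspace: drops the lowest hex digit by shifting the value right.
static U4 data_pop_hex(U4 token)
{
    return (token & TAG_MASK) | ((token & VAL_MASK) >> 4);
}
```

Typing `1` then `f` into a token whose value is zero leaves the value at `0x1F`, and one backspace brings it back to `0x1`.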
### Persistence
- Cartridge format: `[tape_arena.used : U8][anno_arena.used : U8][cursor_idx : U8]
[tape data][anno data]`
- On load: restores arenas, cursor, calls `relink_tape()` then `compile_and_run_tape()`.
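
The cartridge layout above can be sketched against an in-memory buffer. This is an illustration only: `CartridgeHeader`, `cartridge_write`, and `cartridge_read` are hypothetical names, the `used` fields are assumed to be byte counts, fields are stored in host byte order, and the real save/load path goes through WinAPI file calls that this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  U1; // stand-in for the DSL's U1
typedef uint64_t U8; // stand-in for the DSL's U8

// Three U8 fields in the order given by the cartridge format.
typedef struct CartridgeHeader { U8 tape_used; U8 anno_used; U8 cursor_idx; } CartridgeHeader;

static void copy_bytes(U1* dst, U1 const* src, U8 len)
{
    assert(len == 0 || (dst && src));
    for (U8 i = 0; i < len; ++i) dst[i] = src[i];
}

// Writes header, tape data, then annotation data. Returns the number of bytes
// written, or 0 if the output buffer is too small.
static U8 cartridge_write(U1* out, U8 out_cap, CartridgeHeader hdr, U1 const* tape, U1 const* anno)
{
    U8 total = sizeof hdr + hdr.tape_used + hdr.anno_used;
    if (total > out_cap) return 0;
    copy_bytes(out, (U1 const*)&hdr, sizeof hdr);
    copy_bytes(out + sizeof hdr, tape, hdr.tape_used);
    copy_bytes(out + sizeof hdr + hdr.tape_used, anno, hdr.anno_used);
    return total;
}

// Reads the header and yields pointers into `in` for the two payloads.
// Returns 1 on success, 0 if the buffer is shorter than the header claims.
static int cartridge_read(U1 const* in, U8 in_len, CartridgeHeader* hdr, U1 const** tape, U1 const** anno)
{
    if (in_len < sizeof *hdr) return 0;
    copy_bytes((U1*)hdr, in, sizeof *hdr);
    if (hdr->tape_used > in_len - sizeof *hdr) return 0;
    if (hdr->anno_used > in_len - sizeof *hdr - hdr->tape_used) return 0;
    *tape = in + sizeof *hdr;
    *anno = *tape + hdr->tape_used;
    return 1;
}
```

Validating both sizes against the buffer length before touching the payloads keeps a truncated or corrupt file from being copied into the arenas.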
## Current Development Roadmap
Status as of 2026-02-21:
1. **FFI / Tape Drive Argument Scatter** — the PRINT primitive manually aligns RSP and moves rax
into rcx before calling `ms_builtin_print`. R8/R9 args should come from pre-defined `vm_globals`
offsets ("preemptive scatter") rather than being zeroed.
2. **Variable-Length Annotations** — `anno_arena` is fixed at 8 bytes per token. Need a scheme
for longer comments without breaking the `O(1)` `tape_to_code_offset` mapping.
3. ~~**Cartridge Persistence**~~ — DONE (F1/F2 save/load via WinAPI `CreateFileA`/`WriteFile`).
4. **Editor Cursor Refinement** — proper in-token cursor for `Data` and annotation tokens, rather
than backspace-truncation and right-shift append.
5. **Control Flow Expansion** — lambdas or basic block jumps beyond the current conditional-return
primitives (`RET_IF_Z`, `RET_IF_S`).
## C DSL Conventions (from CONVENTIONS.md — strictly enforced)
**Types:** Never use `int`, `long`, `unsigned`, etc. Always use `U1`/`U2`/`U4`/`U8` (unsigned), `S1`/`S2`/`S4`/`S8` (signed), `F4`/`F8` (float), `B1`–`B8` (bool). Use cast macros (`u8_(val)`, `u4_(val)`) not C-style casts.
**Types:** Never use `int`, `long`, `unsigned`, etc. Always use `U1`/`U2`/`U4`/`U8` (unsigned),
`S1`/`S2`/`S4`/`S8` (signed), `F4`/`F8` (float), `B1`–`B8` (bool).
Use cast macros (`u8_(val)`, `u4_(val)`, `u4_r(ptr)`) — not C-style casts. Standard C casts only
for complex types where no macro exists.
**Naming:** `lower_snake_case` for functions/variables. `PascalCase` for types. WinAPI bindings prefixed with `ms_` using `asm("SymbolName")` — never declare raw WinAPI names.
**Naming:** `lower_snake_case` for functions/variables. `PascalCase` for types. WinAPI bindings
prefixed with `ms_` using `asm("SymbolName")` — never declare raw WinAPI names.
**const placement:** Always to the right: `char const*`, not `const char*`.
**Structs/Enums:** Use `typedef Struct_(Name) { ... };` and `typedef Enum_(UnderlyingType, Name) { ... };`.
**X-Macros:** Use for enums coupled with metadata (colors, prefixes, names). Entry names PascalCase, enum symbols use `tmpl(TypeName, Entry)` → `TypeName_Entry`.
**X-Macros:** Use for enums coupled with metadata (colors, prefixes, names). Entry names PascalCase,
enum symbols use `tmpl(TypeName, Entry)` → `TypeName_Entry`.
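
The X-macro rule can be illustrated with the six tags and prefix characters listed in the Token / Tape Model section. This is a sketch under assumptions: the `tmpl` definition here is a guess at a token-pasting macro, `tag_names` and `tag_prefix` are hypothetical, a plain `typedef enum` replaces the `Enum_` macro (whose definition is not shown), and the numeric tag values simply follow listing order.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t U4; // stand-in for the DSL's U4

// Assumed shape of tmpl(): paste type and entry names into TypeName_Entry.
#define tmpl(type, entry) type##_##entry

// One row per tag: entry name and its prefix character.
#define Tag_Entries() \
    X(Define,  ':')   \
    X(Call,    '~')   \
    X(Data,    '$')   \
    X(Imm,     '^')   \
    X(Comment, '.')   \
    X(Format,  ' ')

// Expansion 1: the enum symbols (STag_Define, STag_Call, ...).
typedef enum STag {
#define X(name, prefix) tmpl(STag, name),
    Tag_Entries()
#undef X
    STag_Count
} STag;

// Expansion 2: prefix characters, indexed by enum value.
static char const tag_prefixes[] = {
#define X(name, prefix) prefix,
    Tag_Entries()
#undef X
};

// Expansion 3: printable names, indexed by enum value.
static char const* const tag_names[] = {
#define X(name, prefix) #name,
    Tag_Entries()
#undef X
};

// Bounds-checked lookup of a tag's prefix character.
static char tag_prefix(U4 tag)
{
    assert(tag < STag_Count);
    return tag_prefixes[tag];
}
```

Because all three expansions come from one list, adding a tag means adding one `X(...)` row, and the enum, prefixes, and names cannot drift out of sync.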
**Memory:** Use `FArena` / `farena_push` / `farena_reset` — no raw malloc. Use `mem_fill`/`mem_copy` not memset/memcpy. Do not `#include <stdlib.h>` or `<string.h>`.
**Memory:** Use `FArena` / `farena_push` / `farena_reset` — no raw malloc. Use `mem_fill`/`mem_copy`
not memset/memcpy. Do not `#include <stdlib.h>` or `<string.h>`.
**Formatting:** Allman braces for complex blocks. Vertical alignment for struct fields and related declarations. Space between `&` and operand: `& my_var`. `else if` / `else` on new lines.
**Formatting:** Allman braces for complex blocks. Vertical alignment for struct fields and related
declarations. Space between `&` and operand: `& my_var`. `else if` / `else` on new lines. Align
consecutive `while`/`if` keywords vertically where possible.
**Storage class keywords:** `global` (= `static` at file scope), `internal` (= `static` for functions), `LP_` (= `static` inside a function), `I_` (inline), `N_` (noinline).
**Storage class keywords:** `global` (= `static` at file scope), `internal` (= `static` for
functions), `LP_` (= `static` inside a function), `I_` (inline), `N_` (noinline), `IA_`
(always-inline).
**Line length:** 120–160 characters per line in scripts.