diff --git a/.claude/settings.local.json b/.claude/settings.local.json index d93e597..393ec12 100644 --- a/.claude/settings.local.json +++ b/.claude/settings.local.json @@ -6,7 +6,12 @@ "Bash(tail:*)", "Bash(ls:*)", "Bash(sort:*)", - "Bash(dir:*)" + "Bash(dir:*)", + "Bash(printf %s\\\\n:*)", + "Bash(tee:*)", + "Bash(cmd.exe:*)", + "Bash(pwsh.exe:*)", + "Bash(echo No ANTHROPIC env vars found:*)" ] } } diff --git a/CLAUDE.md b/CLAUDE.md index 35d4c16..c7801e8 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -2,11 +2,30 @@ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. +## AI Behavior Rules + +- **Do not** create shell scripts, README files, or descriptive files unless explicitly instructed. +- **Do not** do anything beyond what was asked. Suggest extras in text; do not implement them. +- If a task is heavy, use sub-agents (codebase investigator, code editor, pattern analyzer, etc.). +- Screenshots are in `C:\Users\Ed\scoop\apps\sharex\current\ShareX\Screenshots\2026-02` — user will + specify which by last-modified. Manually pasted content goes in `./gallery`. +- Do not use `.gitignore` to infer file relevance for context. +- Goal is guided mentorship: validate architecture, give nudges, provide tactical help when asked. + The user is learning to build this system. Do not auto-generate finished solutions. + ## Project Overview -**bootslop** is an experimental x86-64 Windows application: a sourceless, zero-overhead ColorForth-inspired programming environment. There is no human-readable source — the "source of truth" is a binary token array (the "tape"). It features a modal visual editor (GDI-based), real-time JIT compilation to x86-64 machine code, and cartridge-based persistence. +**bootslop** is an experimental x86-64 Windows application: a sourceless, zero-overhead +ColorForth-inspired programming environment. 
Inspired by Timothy Lottes' "x56-40" / source-less +programming series and Onat Türkçüoğlu's VAMP/KYRA register-stack architecture. -The canonical reference for architecture is `references/Architectural_Consolidation.md`. The coding conventions are in `CONVENTIONS.md`. +There is no human-readable source — the "source of truth" is a binary token array (the "tape"). +It features a modal visual editor (GDI-based), real-time JIT compilation to x86-64 machine code, +and cartridge-based persistence. + +Canonical architecture reference: `references/Architectural_Consolidation.md` +Coding conventions: `CONVENTIONS.md` +AI behavior and goal context: `GEMINI.md` ## Build @@ -18,7 +37,13 @@ pwsh scripts/build.attempt_1.c.ps1 Output goes to `build/attempt_1.exe`. Run the exe manually — it opens a GUI window. -**Toolchain requirements:** `clang` and `lld-link.exe` on PATH. Targets amd64 Windows 11. Standard: C23. Flags: `-O0 -g -Wall -DBUILD_DEBUG=1`. +**Toolchain requirements:** `clang` and `lld-link.exe` on PATH. Targets amd64 Windows 11. + +Compiler flags: `-std=c23 -O0 -g -Wall -DBUILD_DEBUG=1 -fno-exceptions -fdiagnostics-absolute-paths` +Linker flags: `/MACHINE:X64 /SUBSYSTEM:CONSOLE /DEBUG /INCREMENTAL:NO` + `kernel32.lib user32.lib gdi32.lib` + +Note: `-nostdlib` / `-ffreestanding` are commented out in the build script — the CRT is currently +linked but `` / `` must not be included directly. No automated tests exist. Verification is interactive via the running GUI. @@ -26,44 +51,122 @@ No automated tests exist. Verification is interactive via the running GUI. All active source is in `attempt_1/`: -- **`main.c`** — The entire application (~850 lines). Contains: semantic tag definitions (X-macro), global VM state, the JIT compiler (`compile_action`, `compile_and_run`), the GDI renderer, keyboard input handling, and cartridge save/load (F1/F2). -- **`duffle.amd64.win32.h`** — The C DSL header. 
Defines all base types (`U1`–`U8`, `S1`–`S8`, `F4`, `F8`, `B1`–`B8`, `Str8`, `UTF8`), macros (`global`, `internal`, `LP_`, `I_`, `N_`), arena allocator (`FArena`, `farena_push`, `farena_reset`), string formatting, and raw WinAPI bindings. +- **`main.c`** — The entire application (~867 lines). Contains: semantic tag definitions (X-macro), + global VM state, the JIT compiler (`compile_action`, `compile_and_run_tape`), the GDI renderer, + keyboard input handling, and cartridge save/load (F1/F2). +- **`duffle.amd64.win32.h`** — The C DSL header. Defines all base types (`U1`–`U8`, `S1`–`S8`, + `F4`, `F8`, `B1`–`B8`, `Str8`, `UTF8`), macros (`global`, `internal`, `LP_`, `I_`, `N_`), + arena allocator (`FArena`, `farena_push`, `farena_reset`), string formatting, and raw WinAPI + bindings. ### Token / Tape Model - Tokens are `U4` (32-bit): top 4 bits = semantic tag, lower 28 bits = value or annotation index. -- Tags are defined via X-macro `Tag_Entries()`: `Define`, `Call`, `Data`, `Imm`, `Comment`, `Format`. -- Two arenas: `tape_arena` (array of `U4` tokens) and `anno_arena` (array of `U8` — 8-char name slots for each token). +- Tags are defined via X-macro `Tag_Entries()`: + `Define` (`:`) · `Call` (`~`) · `Data` (`$`) · `Imm` (`^`) · `Comment` (`.`) · `Format` (` `) +- Two arenas: `tape_arena` (array of `U4` tokens) and `anno_arena` (array of `U8` — one 8-char + name slot per token, space-padded for name resolution). - Helper macros: `pack_token(tag, val)`, `unpack_tag(token)`, `unpack_val(token)`. ### JIT Compiler -- `compile_action()` — incremental: emits x86-64 machine code into `code_arena` up to the cursor position. Runs on every keystroke for live feedback. -- `compile_and_run()` — full tape compilation + execution. Toggled by F5. -- The VM uses two global registers (`vm_rax`, `vm_rdx`) and 16 global memory cells (`vm_globals[16]`). 
-- 13 primitive operations: `SWAP`, `MULT`, `ADD`, `FETCH`, `STORE`, `DUP`, `DROP`, `SUB`, `DEC`, `PRINT`, `RET`, `RET_IF_Z`, `RET_IF_S`. -- Name resolution: `resolve_name_to_index()` matches 8-char space-padded annotations against primitives or prior `Define` tokens. After edits, `relink_tape()` re-resolves all `Call`/`Imm` references. +- `compile_action(val)` — emits x86-64 machine code for a single primitive or call. Called by + `compile_and_run_tape` for each token. +- `compile_and_run_tape()` (`IA_` always-inline) — resets `code_arena`, compiles the tape up to + `cursor_idx + 1` (incremental mode, `run_full == false`) or the full tape (`run_full == true`), + then immediately executes the generated code. Called on every relevant keystroke. +- **JIT prologue/epilogue:** The generated function takes `U8* globals_ptr` (= `vm_globals`). + Prologue loads `rax` from `globals_ptr[0x70/8]` = `vm_globals[14]` and `rdx` from + `globals_ptr[0x78/8]` = `vm_globals[15]`. Epilogue stores them back. `vm_rax` / `vm_rdx` are + synced from `vm_globals[14/15]` after execution. +- **The Magenta Pipe:** Every `Define` token emits a `JMP` (to skip over the function body for + inline execution flow) followed by `xchg rax, rdx` at the word entry point. This is the implicit + register-stack rotation at word boundaries — Onat's "magenta pipe". +- **O(1) linker:** `tape_to_code_offset[65536]` maps tape index → byte offset in `code_arena`. + Populated during `compile_and_run_tape` when a `Define` token is encountered. +- The VM uses two global registers (`vm_rax`, `vm_rdx`) and 16 global memory cells + (`vm_globals[16]`). No traditional Forth data stack in memory. +- **13 primitive operations:** `SWAP` · `MULT` · `ADD` · `FETCH` · `STORE` · `DUP` · `DROP` · + `SUB` · `DEC` · `PRINT` · `RET` · `RET_IF_Z` · `RET_IF_S` +- **32-bit instruction granularity:** All emitted instructions are padded to 4-byte alignment via + NOP bytes (0x90). `pad32()` enforces this after every emit. 
+- Name resolution: `resolve_name_to_index()` matches 8-char space-padded annotations against + primitives first, then prior `Define` tokens. After edits, `relink_tape()` re-resolves all + `Call`/`Imm` references. ### Editor -- Two modes: `MODE_NAV` (navigate) / `MODE_EDIT` (insert tokens). Toggled with `i` / Escape. -- Tape renders as colored token boxes, 8 per row (`TOKENS_PER_ROW`), each showing a prefix character (from `tag_prefixes`) and a 6-char hex value or 8-char name. -- GDI double-buffered rendering. Scroll via arrow keys in NAV mode. +- Two modes: `MODE_NAV` (navigate) / `MODE_EDIT` (type into token). Toggled with `E` / `Escape`. +- **Key bindings (NAV mode):** + - `E` — enter MODE_EDIT + - Arrow keys — move cursor (Up/Down navigate by logical lines delimited by `Format` tokens) + - `Tab` — cycle the current token's tag through `STag_*` values + - `Space` — insert a new `Comment` token at cursor + - `Shift+Space` — insert a new `Comment` token after cursor + - `Return` — insert a `Format` (newline) token at cursor + - `Backspace` — delete token before cursor + - `Shift+Backspace` — delete token at cursor + - `PgUp` / `PgDn` — scroll viewport + - `F5` — toggle `run_full` (incremental ↔ full-tape JIT) + - `F1` — save cartridge to `cartridge.bin` + - `F2` — load cartridge from `cartridge.bin` and run +- **Key bindings (EDIT mode):** + - Hex digits (`0-9`, `a-f`) — shift into `Data` token value + - Any printable char — append to annotation name (up to 8 chars) + - `Backspace` — shift `Data` value right or trim annotation name + - `Escape` — exit to MODE_NAV, triggers `relink_tape()` +- Tape renders as colored token boxes, `TOKENS_PER_ROW` (8) per row, each showing a tag prefix + char and either a 6-char hex value (Data) or an 8-char annotation name. +- GDI rendering via `BeginPaint`/`EndPaint`. The HUD (status bar at bottom) shows RAX/RDX state, + global memory cells [0-3], print log, and debug log. 
+ +### Persistence + +- Cartridge format: `[tape_arena.used : U8][anno_arena.used : U8][cursor_idx : U8] + [tape data][anno data]` +- On load: restores arenas, cursor, calls `relink_tape()` then `compile_and_run_tape()`. + +## Current Development Roadmap + +Status as of 2026-02-21: + +1. **FFI / Tape Drive Argument Scatter** — the PRINT primitive manually aligns RSP and moves rax + into rcx before calling `ms_builtin_print`. R8/R9 args should come from pre-defined `vm_globals` + offsets ("preemptive scatter") rather than being zeroed. +2. **Variable-Length Annotations** — `anno_arena` is fixed at 8 bytes per token. Need a scheme + for longer comments without breaking the `O(1)` `tape_to_code_offset` mapping. +3. ~~**Cartridge Persistence**~~ — DONE (F1/F2 save/load via WinAPI `CreateFileA`/`WriteFile`). +4. **Editor Cursor Refinement** — proper in-token cursor for `Data` and annotation tokens, rather + than backspace-truncation and right-shift append. +5. **Control Flow Expansion** — lambdas or basic block jumps beyond the current conditional-return + primitives (`RET_IF_Z`, `RET_IF_S`). ## C DSL Conventions (from CONVENTIONS.md — strictly enforced) -**Types:** Never use `int`, `long`, `unsigned`, etc. Always use `U1`/`U2`/`U4`/`U8` (unsigned), `S1`/`S2`/`S4`/`S8` (signed), `F4`/`F8` (float), `B1`–`B8` (bool). Use cast macros (`u8_(val)`, `u4_(val)`) not C-style casts. +**Types:** Never use `int`, `long`, `unsigned`, etc. Always use `U1`/`U2`/`U4`/`U8` (unsigned), +`S1`/`S2`/`S4`/`S8` (signed), `F4`/`F8` (float), `B1`–`B8` (bool). +Use cast macros (`u8_(val)`, `u4_(val)`, `u4_r(ptr)`) — not C-style casts. Standard C casts only +for complex types where no macro exists. -**Naming:** `lower_snake_case` for functions/variables. `PascalCase` for types. WinAPI bindings prefixed with `ms_` using `asm("SymbolName")` — never declare raw WinAPI names. +**Naming:** `lower_snake_case` for functions/variables. `PascalCase` for types. 
WinAPI bindings +prefixed with `ms_` using `asm("SymbolName")` — never declare raw WinAPI names. **const placement:** Always to the right: `char const*`, not `const char*`. **Structs/Enums:** Use `typedef Struct_(Name) { ... };` and `typedef Enum_(UnderlyingType, Name) { ... };`. -**X-Macros:** Use for enums coupled with metadata (colors, prefixes, names). Entry names PascalCase, enum symbols use `tmpl(TypeName, Entry)` → `TypeName_Entry`. +**X-Macros:** Use for enums coupled with metadata (colors, prefixes, names). Entry names PascalCase, +enum symbols use `tmpl(TypeName, Entry)` → `TypeName_Entry`. -**Memory:** Use `FArena` / `farena_push` / `farena_reset` — no raw malloc. Use `mem_fill`/`mem_copy` not memset/memcpy. Do not `#include ` or ``. +**Memory:** Use `FArena` / `farena_push` / `farena_reset` — no raw malloc. Use `mem_fill`/`mem_copy` +not memset/memcpy. Do not `#include ` or ``. -**Formatting:** Allman braces for complex blocks. Vertical alignment for struct fields and related declarations. Space between `&` and operand: `& my_var`. `else if` / `else` on new lines. +**Formatting:** Allman braces for complex blocks. Vertical alignment for struct fields and related +declarations. Space between `&` and operand: `& my_var`. `else if` / `else` on new lines. Align +consecutive `while`/`if` keywords vertically where possible. -**Storage class keywords:** `global` (= `static` at file scope), `internal` (= `static` for functions), `LP_` (= `static` inside a function), `I_` (inline), `N_` (noinline). +**Storage class keywords:** `global` (= `static` at file scope), `internal` (= `static` for +functions), `LP_` (= `static` inside a function), `I_` (inline), `N_` (noinline), `IA_` +(always-inline). + +**Line length:** 120–160 characters per line in scripts.
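
The token layout described under "Token / Tape Model" (top 4 bits = semantic tag, lower 28 bits = value or annotation index) can be sketched as below. This is only an illustration, not the code in `main.c`: the `STag_*` numbering, the `TOKEN_VAL_*` constants, and the use of `<stdint.h>` are assumptions made so the snippet compiles on its own. The real project takes `U4` from `duffle.amd64.win32.h` and generates tags through the `Tag_Entries()` X-macro.

```c
#include <stdint.h>

/* Stand-in for the U4 type from duffle.amd64.win32.h (assumption for a standalone sketch). */
typedef uint32_t U4;

/* Assumed tag numbering; the real values come from the Tag_Entries() X-macro. */
enum { STag_Define = 0, STag_Call, STag_Data, STag_Imm, STag_Comment, STag_Format };

/* Hypothetical constants: 28 value bits below a 4-bit tag. */
#define TOKEN_VAL_BITS 28u
#define TOKEN_VAL_MASK 0x0FFFFFFFu

/* Top 4 bits hold the tag; the lower 28 bits hold the value or annotation index. */
#define pack_token(tag, val) ((((U4)(tag)) << TOKEN_VAL_BITS) | (((U4)(val)) & TOKEN_VAL_MASK))
#define unpack_tag(token)    (((U4)(token)) >> TOKEN_VAL_BITS)
#define unpack_val(token)    (((U4)(token)) & TOKEN_VAL_MASK)
```

Under the assumed numbering, `pack_token(STag_Data, 0x123456)` yields `0x20123456`, and values wider than 28 bits are masked rather than allowed to spill into the tag.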