claude updates
CLAUDE.md (145 lines changed)
@@ -2,11 +2,30 @@
 
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 
+## AI Behavior Rules
+
+- **Do not** create shell scripts, README files, or descriptive files unless explicitly instructed.
+- **Do not** do anything beyond what was asked. Suggest extras in text; do not implement them.
+- If a task is heavy, use sub-agents (codebase investigator, code editor, pattern analyzer, etc.).
+- Screenshots are in `C:\Users\Ed\scoop\apps\sharex\current\ShareX\Screenshots\2026-02` — user will
+  specify which by last-modified. Manually pasted content goes in `./gallery`.
+- Do not use `.gitignore` to infer file relevance for context.
+- Goal is guided mentorship: validate architecture, give nudges, provide tactical help when asked.
+  The user is learning to build this system. Do not auto-generate finished solutions.
+
 ## Project Overview
 
-**bootslop** is an experimental x86-64 Windows application: a sourceless, zero-overhead ColorForth-inspired programming environment. There is no human-readable source — the "source of truth" is a binary token array (the "tape"). It features a modal visual editor (GDI-based), real-time JIT compilation to x86-64 machine code, and cartridge-based persistence.
+**bootslop** is an experimental x86-64 Windows application: a sourceless, zero-overhead
+ColorForth-inspired programming environment. Inspired by Timothy Lottes' "x56-40" / source-less
+programming series and Onat Türkçüoğlu's VAMP/KYRA register-stack architecture.
 
-The canonical reference for architecture is `references/Architectural_Consolidation.md`. The coding conventions are in `CONVENTIONS.md`.
+There is no human-readable source — the "source of truth" is a binary token array (the "tape").
+It features a modal visual editor (GDI-based), real-time JIT compilation to x86-64 machine code,
+and cartridge-based persistence.
+
+Canonical architecture reference: `references/Architectural_Consolidation.md`
+Coding conventions: `CONVENTIONS.md`
+AI behavior and goal context: `GEMINI.md`
 
 ## Build
 
@@ -18,7 +37,13 @@ pwsh scripts/build.attempt_1.c.ps1
 
 Output goes to `build/attempt_1.exe`. Run the exe manually — it opens a GUI window.
 
-**Toolchain requirements:** `clang` and `lld-link.exe` on PATH. Targets amd64 Windows 11. Standard: C23. Flags: `-O0 -g -Wall -DBUILD_DEBUG=1`.
+**Toolchain requirements:** `clang` and `lld-link.exe` on PATH. Targets amd64 Windows 11.
+
+Compiler flags: `-std=c23 -O0 -g -Wall -DBUILD_DEBUG=1 -fno-exceptions -fdiagnostics-absolute-paths`
+Linker flags: `/MACHINE:X64 /SUBSYSTEM:CONSOLE /DEBUG /INCREMENTAL:NO` + `kernel32.lib user32.lib gdi32.lib`
+
+Note: `-nostdlib` / `-ffreestanding` are commented out in the build script — the CRT is currently
+linked but `<stdlib.h>` / `<string.h>` must not be included directly.
 
 No automated tests exist. Verification is interactive via the running GUI.
 
@@ -26,44 +51,122 @@ No automated tests exist. Verification is interactive via the running GUI.
 
 All active source is in `attempt_1/`:
 
-- **`main.c`** — The entire application (~850 lines). Contains: semantic tag definitions (X-macro), global VM state, the JIT compiler (`compile_action`, `compile_and_run`), the GDI renderer, keyboard input handling, and cartridge save/load (F1/F2).
-- **`duffle.amd64.win32.h`** — The C DSL header. Defines all base types (`U1`–`U8`, `S1`–`S8`, `F4`, `F8`, `B1`–`B8`, `Str8`, `UTF8`), macros (`global`, `internal`, `LP_`, `I_`, `N_`), arena allocator (`FArena`, `farena_push`, `farena_reset`), string formatting, and raw WinAPI bindings.
+- **`main.c`** — The entire application (~867 lines). Contains: semantic tag definitions (X-macro),
+  global VM state, the JIT compiler (`compile_action`, `compile_and_run_tape`), the GDI renderer,
+  keyboard input handling, and cartridge save/load (F1/F2).
+- **`duffle.amd64.win32.h`** — The C DSL header. Defines all base types (`U1`–`U8`, `S1`–`S8`,
+  `F4`, `F8`, `B1`–`B8`, `Str8`, `UTF8`), macros (`global`, `internal`, `LP_`, `I_`, `N_`),
+  arena allocator (`FArena`, `farena_push`, `farena_reset`), string formatting, and raw WinAPI
+  bindings.
 
 ### Token / Tape Model
 
 - Tokens are `U4` (32-bit): top 4 bits = semantic tag, lower 28 bits = value or annotation index.
-- Tags are defined via X-macro `Tag_Entries()`: `Define`, `Call`, `Data`, `Imm`, `Comment`, `Format`.
-- Two arenas: `tape_arena` (array of `U4` tokens) and `anno_arena` (array of `U8` — 8-char name slots for each token).
+- Tags are defined via X-macro `Tag_Entries()`:
+  `Define` (`:`) · `Call` (`~`) · `Data` (`$`) · `Imm` (`^`) · `Comment` (`.`) · `Format` (` `)
+- Two arenas: `tape_arena` (array of `U4` tokens) and `anno_arena` (array of `U8` — one 8-char
+  name slot per token, space-padded for name resolution).
 - Helper macros: `pack_token(tag, val)`, `unpack_tag(token)`, `unpack_val(token)`.
 
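The token layout described above (4-bit tag, 28-bit value, one space-padded 8-character annotation slot per token) can be sketched roughly as follows. This is a minimal illustration, not the code in `main.c`. The helpers are written as inline functions rather than macros. The tag numbering (listing order, starting at 0), the `pack_anno` helper, and the byte order inside the annotation slot are all assumptions, and the `U4`/`U8` typedefs stand in for the ones in `duffle.amd64.win32.h`.

```c
#include <stdint.h>

// Stand-ins for the project's DSL types; the real ones come from duffle.amd64.win32.h.
typedef uint32_t U4;
typedef uint64_t U8;

// Assumed tag numbering: the order the tags are listed in, starting at 0.
enum { STag_Define, STag_Call, STag_Data, STag_Imm, STag_Comment, STag_Format };

#define TOKEN_VAL_BITS 28u
#define TOKEN_VAL_MASK ((1u << TOKEN_VAL_BITS) - 1u)

// Top 4 bits hold the tag, lower 28 bits hold the value or annotation index.
static inline U4 pack_token(U4 tag, U4 val) { return (tag << TOKEN_VAL_BITS) | (val & TOKEN_VAL_MASK); }
static inline U4 unpack_tag(U4 token)       { return token >> TOKEN_VAL_BITS; }
static inline U4 unpack_val(U4 token)       { return token & TOKEN_VAL_MASK; }

// Hypothetical helper: pack up to 8 characters into one U8 annotation slot,
// padding short names with spaces. First character goes in the lowest byte (assumed).
static inline U8 pack_anno(char const* name)
{
    U8 slot  = 0;
    U4 ended = 0;
    for (U4 i = 0; i < 8; ++i)
    {
        if (!ended && name[i] == '\0') ended = 1;   // stop reading at the terminator
        U8 ch = ended ? (U8)' ' : (U8)(unsigned char)name[i];
        slot |= ch << (8u * i);
    }
    return slot;
}
```

Because short names are padded to a full slot, comparing two annotations reduces to a single 64-bit equality test, which is presumably why the padding matters for name resolution.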
 ### JIT Compiler
 
-- `compile_action()` — incremental: emits x86-64 machine code into `code_arena` up to the cursor position. Runs on every keystroke for live feedback.
-- `compile_and_run()` — full tape compilation + execution. Toggled by F5.
-- The VM uses two global registers (`vm_rax`, `vm_rdx`) and 16 global memory cells (`vm_globals[16]`).
-- 13 primitive operations: `SWAP`, `MULT`, `ADD`, `FETCH`, `STORE`, `DUP`, `DROP`, `SUB`, `DEC`, `PRINT`, `RET`, `RET_IF_Z`, `RET_IF_S`.
-- Name resolution: `resolve_name_to_index()` matches 8-char space-padded annotations against primitives or prior `Define` tokens. After edits, `relink_tape()` re-resolves all `Call`/`Imm` references.
+- `compile_action(val)` — emits x86-64 machine code for a single primitive or call. Called by
+  `compile_and_run_tape` for each token.
+- `compile_and_run_tape()` (`IA_` always-inline) — resets `code_arena`, compiles the tape up to
+  `cursor_idx + 1` (incremental mode, `run_full == false`) or the full tape (`run_full == true`),
+  then immediately executes the generated code. Called on every relevant keystroke.
+- **JIT prologue/epilogue:** The generated function takes `U8* globals_ptr` (= `vm_globals`).
+  Prologue loads `rax` from `globals_ptr[0x70/8]` = `vm_globals[14]` and `rdx` from
+  `globals_ptr[0x78/8]` = `vm_globals[15]`. Epilogue stores them back. `vm_rax` / `vm_rdx` are
+  synced from `vm_globals[14/15]` after execution.
+- **The Magenta Pipe:** Every `Define` token emits a `JMP` (to skip over the function body for
+  inline execution flow) followed by `xchg rax, rdx` at the word entry point. This is the implicit
+  register-stack rotation at word boundaries — Onat's "magenta pipe".
+- **O(1) linker:** `tape_to_code_offset[65536]` maps tape index → byte offset in `code_arena`.
+  Populated during `compile_and_run_tape` when a `Define` token is encountered.
+- The VM uses two global registers (`vm_rax`, `vm_rdx`) and 16 global memory cells
+  (`vm_globals[16]`). No traditional Forth data stack in memory.
+- **13 primitive operations:** `SWAP` · `MULT` · `ADD` · `FETCH` · `STORE` · `DUP` · `DROP` ·
+  `SUB` · `DEC` · `PRINT` · `RET` · `RET_IF_Z` · `RET_IF_S`
+- **32-bit instruction granularity:** All emitted instructions are padded to 4-byte alignment via
+  NOP bytes (0x90). `pad32()` enforces this after every emit.
+- Name resolution: `resolve_name_to_index()` matches 8-char space-padded annotations against
+  primitives first, then prior `Define` tokens. After edits, `relink_tape()` re-resolves all
+  `Call`/`Imm` references.
 
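To make the 4-byte padding and the tape-index-to-offset table concrete, here is a small sketch under stated assumptions. `CodeBuf`, `emit_byte`, `emit_xchg_rax_rdx`, `emit_ret`, and `begin_define` are hypothetical names, and a fixed array stands in for `code_arena`. The `48 87 D0` encoding is one valid encoding of `xchg rax, rdx` and may not be the one the project emits. The `JMP` that skips a word body is left out for brevity.

```c
#include <stdint.h>

typedef uint8_t  U1;
typedef uint32_t U4;

// Hypothetical stand-in for code_arena: a fixed byte buffer with a write cursor.
// No bounds checking; the sketch assumes the caller stays within 256 bytes.
typedef struct CodeBuf { U1 bytes[256]; U4 used; } CodeBuf;

static void emit_byte(CodeBuf* code, U1 byte) { code->bytes[code->used++] = byte; }

// Pad with NOP (0x90) until the write cursor sits on a 4-byte boundary.
static void pad32(CodeBuf* code) { while (code->used & 3u) emit_byte(code, 0x90); }

// xchg rax, rdx encoded as REX.W 87 /r (48 87 D0): three bytes, so one NOP of padding.
static void emit_xchg_rax_rdx(CodeBuf* code)
{
    emit_byte(code, 0x48); emit_byte(code, 0x87); emit_byte(code, 0xD0);
    pad32(code);
}

// ret (C3): one byte, so three NOPs of padding.
static void emit_ret(CodeBuf* code) { emit_byte(code, 0xC3); pad32(code); }

// Tape index -> byte offset of a word's entry point. The source sizes this at 65536;
// a small table is enough for the sketch.
#define TAPE_SLOTS 16u
static U4 tape_to_code_offset[TAPE_SLOTS];

// Record where the word defined at tape_idx begins, then emit its entry rotation.
static void begin_define(CodeBuf* code, U4 tape_idx)
{
    tape_to_code_offset[tape_idx] = code->used;
    emit_xchg_rax_rdx(code);
}
```

With every emit ending on a 4-byte boundary, each recorded offset is a multiple of 4, and resolving a call target is a single array lookup by tape index.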
 ### Editor
 
-- Two modes: `MODE_NAV` (navigate) / `MODE_EDIT` (insert tokens). Toggled with `i` / Escape.
-- Tape renders as colored token boxes, 8 per row (`TOKENS_PER_ROW`), each showing a prefix character (from `tag_prefixes`) and a 6-char hex value or 8-char name.
-- GDI double-buffered rendering. Scroll via arrow keys in NAV mode.
+- Two modes: `MODE_NAV` (navigate) / `MODE_EDIT` (type into token). Toggled with `E` / `Escape`.
+- **Key bindings (NAV mode):**
+  - `E` — enter MODE_EDIT
+  - Arrow keys — move cursor (Up/Down navigate by logical lines delimited by `Format` tokens)
+  - `Tab` — cycle the current token's tag through `STag_*` values
+  - `Space` — insert a new `Comment` token at cursor
+  - `Shift+Space` — insert a new `Comment` token after cursor
+  - `Return` — insert a `Format` (newline) token at cursor
+  - `Backspace` — delete token before cursor
+  - `Shift+Backspace` — delete token at cursor
+  - `PgUp` / `PgDn` — scroll viewport
+  - `F5` — toggle `run_full` (incremental ↔ full-tape JIT)
+  - `F1` — save cartridge to `cartridge.bin`
+  - `F2` — load cartridge from `cartridge.bin` and run
+- **Key bindings (EDIT mode):**
+  - Hex digits (`0-9`, `a-f`) — shift into `Data` token value
+  - Any printable char — append to annotation name (up to 8 chars)
+  - `Backspace` — shift `Data` value right or trim annotation name
+  - `Escape` — exit to MODE_NAV, triggers `relink_tape()`
+- Tape renders as colored token boxes, `TOKENS_PER_ROW` (8) per row, each showing a tag prefix
+  char and either a 6-char hex value (Data) or an 8-char annotation name.
+- GDI rendering via `BeginPaint`/`EndPaint`. The HUD (status bar at bottom) shows RAX/RDX state,
+  global memory cells [0-3], print log, and debug log.
 
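The EDIT-mode behaviour for `Data` tokens (hex digits shift in, `Backspace` shifts out) might look roughly like the sketch below. The function names are hypothetical, and masking the result to the 28-bit token value field is an assumption about how overflow is handled.

```c
#include <stdint.h>

typedef uint32_t U4;

#define TOKEN_VAL_MASK 0x0FFFFFFFu   // lower 28 bits of a token

// Return the value of a hex digit character, or 16 if ch is not one.
static U4 hex_digit_value(char ch)
{
    if (ch >= '0' && ch <= '9') return (U4)(ch - '0');
    if (ch >= 'a' && ch <= 'f') return (U4)(ch - 'a') + 10u;
    return 16u;
}

// Typing a digit shifts the value left one nibble and appends the digit.
static U4 data_type_digit(U4 val, char ch)
{
    U4 digit = hex_digit_value(ch);
    if (digit > 15u) return val;                    // not a hex digit: value unchanged
    return ((val << 4) | digit) & TOKEN_VAL_MASK;   // oldest nibble falls off the top
}

// Backspace shifts the value right one nibble, dropping the newest digit.
static U4 data_backspace(U4 val) { return val >> 4; }
```

Typing `1` then `f` into an empty value yields `0x1f`, and one `Backspace` returns it to `0x1`. The roadmap's "Editor Cursor Refinement" item refers to replacing exactly this append-and-truncate model with an in-token cursor.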
+### Persistence
+
+- Cartridge format: `[tape_arena.used : U8][anno_arena.used : U8][cursor_idx : U8]
+  [tape data][anno data]`
+- On load: restores arenas, cursor, calls `relink_tape()` then `compile_and_run_tape()`.
+
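The cartridge layout above can be illustrated with an in-memory round trip. This sketch assumes the two `used` fields are byte counts and that the header is written in host byte order with no padding. `CartHeader`, `cart_write`, `cart_read`, and `copy_bytes` are hypothetical names; the actual save path goes through WinAPI file calls rather than a buffer.

```c
#include <stdint.h>

typedef uint8_t  U1;
typedef uint64_t U8;

// Header in the order given above: three U8 fields, followed by the two payloads.
typedef struct CartHeader { U8 tape_used; U8 anno_used; U8 cursor_idx; } CartHeader;

// Byte copy loop; the project itself would use mem_copy from its DSL header.
static void copy_bytes(U1* dst, U1 const* src, U8 count)
{
    for (U8 i = 0; i < count; ++i) dst[i] = src[i];
}

// Serialize into out; returns bytes written, or 0 if out is too small.
static U8 cart_write(U1* out, U8 out_cap, CartHeader hdr, U1 const* tape, U1 const* anno)
{
    U8 total = sizeof hdr + hdr.tape_used + hdr.anno_used;
    if (total > out_cap) return 0;
    copy_bytes(out, (U1 const*)&hdr, sizeof hdr);
    copy_bytes(out + sizeof hdr, tape, hdr.tape_used);
    copy_bytes(out + sizeof hdr + hdr.tape_used, anno, hdr.anno_used);
    return total;
}

// Parse the header and point tape/anno at the payloads inside in.
// Returns 1 on success, 0 if the buffer is shorter than the header claims.
static int cart_read(U1 const* in, U8 in_len, CartHeader* hdr, U1 const** tape, U1 const** anno)
{
    if (in_len < sizeof *hdr) return 0;
    copy_bytes((U1*)hdr, in, sizeof *hdr);
    if (hdr->tape_used > in_len - sizeof *hdr) return 0;
    if (hdr->anno_used > in_len - sizeof *hdr - hdr->tape_used) return 0;
    *tape = in + sizeof *hdr;
    *anno = *tape + hdr->tape_used;
    return 1;
}
```

Validating both lengths against the buffer size before touching the payloads keeps a truncated `cartridge.bin` from being read past its end.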
+## Current Development Roadmap
+
+Status as of 2026-02-21:
+
+1. **FFI / Tape Drive Argument Scatter** — the PRINT primitive manually aligns RSP and moves rax
+   into rcx before calling `ms_builtin_print`. R8/R9 args should come from pre-defined `vm_globals`
+   offsets ("preemptive scatter") rather than being zeroed.
+2. **Variable-Length Annotations** — `anno_arena` is fixed at 8 bytes per token. Need a scheme
+   for longer comments without breaking the `O(1)` `tape_to_code_offset` mapping.
+3. ~~**Cartridge Persistence**~~ — DONE (F1/F2 save/load via WinAPI `CreateFileA`/`WriteFile`).
+4. **Editor Cursor Refinement** — proper in-token cursor for `Data` and annotation tokens, rather
+   than backspace-truncation and right-shift append.
+5. **Control Flow Expansion** — lambdas or basic block jumps beyond the current conditional-return
+   primitives (`RET_IF_Z`, `RET_IF_S`).
+
 ## C DSL Conventions (from CONVENTIONS.md — strictly enforced)
 
-**Types:** Never use `int`, `long`, `unsigned`, etc. Always use `U1`/`U2`/`U4`/`U8` (unsigned), `S1`/`S2`/`S4`/`S8` (signed), `F4`/`F8` (float), `B1`–`B8` (bool). Use cast macros (`u8_(val)`, `u4_(val)`) not C-style casts.
+**Types:** Never use `int`, `long`, `unsigned`, etc. Always use `U1`/`U2`/`U4`/`U8` (unsigned),
+`S1`/`S2`/`S4`/`S8` (signed), `F4`/`F8` (float), `B1`–`B8` (bool).
+Use cast macros (`u8_(val)`, `u4_(val)`, `u4_r(ptr)`) — not C-style casts. Standard C casts only
+for complex types where no macro exists.
 
-**Naming:** `lower_snake_case` for functions/variables. `PascalCase` for types. WinAPI bindings prefixed with `ms_` using `asm("SymbolName")` — never declare raw WinAPI names.
+**Naming:** `lower_snake_case` for functions/variables. `PascalCase` for types. WinAPI bindings
+prefixed with `ms_` using `asm("SymbolName")` — never declare raw WinAPI names.
 
 **const placement:** Always to the right: `char const*`, not `const char*`.
 
 **Structs/Enums:** Use `typedef Struct_(Name) { ... };` and `typedef Enum_(UnderlyingType, Name) { ... };`.
 
-**X-Macros:** Use for enums coupled with metadata (colors, prefixes, names). Entry names PascalCase, enum symbols use `tmpl(TypeName, Entry)` → `TypeName_Entry`.
+**X-Macros:** Use for enums coupled with metadata (colors, prefixes, names). Entry names PascalCase,
+enum symbols use `tmpl(TypeName, Entry)` → `TypeName_Entry`.
 
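As an illustration of the X-macro convention, the tag list and its prefix characters from the Token / Tape Model section could be expressed along these lines. The definition of `tmpl` and the column layout of `Tag_Entries()` are assumptions (the real table also carries colors), and a plain `enum` is used in place of the project's `Enum_` macro so the sketch stands alone.

```c
// Hypothetical definition of tmpl(); the text only states that it yields TypeName_Entry.
#define tmpl(type, entry) type##_##entry

// One row per tag: PascalCase entry name and its prefix character.
#define Tag_Entries() \
    X(Define,  ':')   \
    X(Call,    '~')   \
    X(Data,    '$')   \
    X(Imm,     '^')   \
    X(Comment, '.')   \
    X(Format,  ' ')

// First expansion: the enum symbols STag_Define, STag_Call, and so on.
typedef enum STag {
#define X(name, prefix) tmpl(STag, name),
    Tag_Entries()
#undef X
    STag_Count
} STag;

// Second expansion: a parallel prefix table indexed by tag.
static char const tag_prefixes[STag_Count] = {
#define X(name, prefix) [tmpl(STag, name)] = prefix,
    Tag_Entries()
#undef X
};

// Third expansion: printable names, stringified from the same entries.
static char const* const tag_names[STag_Count] = {
#define X(name, prefix) [tmpl(STag, name)] = #name,
    Tag_Entries()
#undef X
};
```

Adding a tag then means adding one row to `Tag_Entries()`; the enum, the prefix table, and the name table cannot drift out of sync.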
-**Memory:** Use `FArena` / `farena_push` / `farena_reset` — no raw malloc. Use `mem_fill`/`mem_copy` not memset/memcpy. Do not `#include <stdlib.h>` or `<string.h>`.
+**Memory:** Use `FArena` / `farena_push` / `farena_reset` — no raw malloc. Use `mem_fill`/`mem_copy`
+not memset/memcpy. Do not `#include <stdlib.h>` or `<string.h>`.
 
-**Formatting:** Allman braces for complex blocks. Vertical alignment for struct fields and related declarations. Space between `&` and operand: `& my_var`. `else if` / `else` on new lines.
+**Formatting:** Allman braces for complex blocks. Vertical alignment for struct fields and related
+declarations. Space between `&` and operand: `& my_var`. `else if` / `else` on new lines. Align
+consecutive `while`/`if` keywords vertically where possible.
 
-**Storage class keywords:** `global` (= `static` at file scope), `internal` (= `static` for functions), `LP_` (= `static` inside a function), `I_` (inline), `N_` (noinline).
+**Storage class keywords:** `global` (= `static` at file scope), `internal` (= `static` for
+functions), `LP_` (= `static` inside a function), `I_` (inline), `N_` (noinline), `IA_`
+(always-inline).
 
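The storage class keywords read most easily next to a tiny example. The macro definitions below are guesses that match the stated equivalences, using clang/GCC attribute syntax; the actual definitions live in `duffle.amd64.win32.h` and may differ. The functions `twice` and `next_id` are purely illustrative.

```c
#include <stdint.h>

typedef uint32_t U4;

// Hypothetical definitions matching the equivalences stated above.
#define global   static
#define internal static
#define LP_      static
#define I_       inline
#define N_       __attribute__((noinline))
#define IA_      inline __attribute__((always_inline))

global U4 call_count = 0;   // file-scope state, private to this translation unit

// Always inlined into its callers.
internal IA_ U4 twice(U4 x) { return x * 2u; }

// Never inlined; keeps a counter that persists across calls.
internal N_ U4 next_id(void)
{
    LP_ U4 last_id = 0;     // function-local static, visible only inside next_id
    call_count += 1;
    last_id    += 1;
    return last_id;
}
```

The three spellings of `static` exist to say which meaning is intended at each site: private global, private function, or persistent local.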
 **Line length:** 120–160 characters per line in scripts.