# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## AI Behavior Rules

- **Do not** create shell scripts, README files, or descriptive files unless explicitly instructed.
- **Do not** do anything beyond what was asked. Suggest extras in text; do not implement them.
- If a task is heavy, use sub-agents (codebase investigator, code editor, pattern analyzer, etc.).
- Screenshots are in `C:\Users\Ed\scoop\apps\sharex\current\ShareX\Screenshots\2026-02` — user will
  specify which by last-modified. Manually pasted content goes in `./gallery`.
- Do not use `.gitignore` to infer file relevance for context.
- Goal is guided mentorship: validate architecture, give nudges, provide tactical help when asked.
  The user is learning to build this system. Do not auto-generate finished solutions.

## Project Overview

**bootslop** is an experimental x86-64 Windows application: a sourceless, zero-overhead
ColorForth-inspired programming environment. Inspired by Timothy Lottes' "x56-40" / source-less
programming series and Onat Türkçüoğlu's VAMP/KYRA register-stack architecture.

There is no human-readable source — the "source of truth" is a binary token array (the "tape").
It features a modal visual editor (GDI-based), real-time JIT compilation to x86-64 machine code,
and cartridge-based persistence.

Canonical architecture reference: `references/Architectural_Consolidation.md`
Coding conventions: `CONVENTIONS.md`
AI behavior and goal context: `GEMINI.md`

## Build

Two-stage build via PowerShell: compile with clang, link with lld-link.

```powershell
pwsh scripts/build.attempt_1.c.ps1
```

Output goes to `build/attempt_1.exe`. Run the exe manually — it opens a GUI window.

**Toolchain requirements:** `clang` and `lld-link.exe` on PATH. Targets amd64 Windows 11.

Compiler flags: `-std=c23 -O0 -g -Wall -DBUILD_DEBUG=1 -fno-exceptions -fdiagnostics-absolute-paths`
Linker flags: `/MACHINE:X64 /SUBSYSTEM:CONSOLE /DEBUG /INCREMENTAL:NO` + `kernel32.lib user32.lib gdi32.lib`

Note: `-nostdlib` / `-ffreestanding` are commented out in the build script — the CRT is currently
linked but `<stdlib.h>` / `<string.h>` must not be included directly.

No automated tests exist. Verification is interactive via the running GUI.

## Code Architecture

All active source is in `attempt_1/`:

- **`main.c`** — The entire application (~867 lines). Contains: semantic tag definitions (X-macro),
  global VM state, the JIT compiler (`compile_action`, `compile_and_run_tape`), the GDI renderer,
  keyboard input handling, and cartridge save/load (F1/F2).
- **`duffle.amd64.win32.h`** — The C DSL header. Defines all base types (`U1`–`U8`, `S1`–`S8`,
  `F4`, `F8`, `B1`–`B8`, `Str8`, `UTF8`), macros (`global`, `internal`, `LP_`, `I_`, `N_`),
  arena allocator (`FArena`, `farena_push`, `farena_reset`), string formatting, and raw WinAPI
  bindings.

### Token / Tape Model

- Tokens are `U4` (32-bit): top 4 bits = semantic tag, lower 28 bits = value or annotation index.
- Tags are defined via X-macro `Tag_Entries()`:
  `Define` (`:`) · `Call` (`~`) · `Data` (`$`) · `Imm` (`^`) · `Comment` (`.`) · `Format` (` `)
- Two arenas: `tape_arena` (array of `U4` tokens) and `anno_arena` (array of `U8` — one 8-char
  name slot per token, space-padded for name resolution).
- Helper macros: `pack_token(tag, val)`, `unpack_tag(token)`, `unpack_val(token)`.

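
The token layout above can be sketched as plain bit operations. This is a minimal illustration, assuming the tag occupies bits 28 to 31 and the value the low 28 bits; the `U4` typedef stands in for the one in `duffle.amd64.win32.h`, `TOKEN_VAL_BITS` and `TOKEN_VAL_MASK` are hypothetical names, and the real helpers in `main.c` are macros that may be written differently.

```c
#include <stdint.h>

typedef uint32_t U4; // stand-in for the DSL header's U4

#define TOKEN_VAL_BITS 28u                            // hypothetical name
#define TOKEN_VAL_MASK ((1u << TOKEN_VAL_BITS) - 1u)  // 0x0FFFFFFF

// Combine a 4-bit tag and a 28-bit value into one 32-bit token.
static inline U4 pack_token(U4 tag, U4 val)
{
    return ((tag & 0xFu) << TOKEN_VAL_BITS) | (val & TOKEN_VAL_MASK);
}

// Extract the semantic tag from the top 4 bits.
static inline U4 unpack_tag(U4 token) { return token >> TOKEN_VAL_BITS; }

// Extract the value or annotation index from the low 28 bits.
static inline U4 unpack_val(U4 token) { return token & TOKEN_VAL_MASK; }
```

Under these assumptions, packing tag index 2 with value `0x1234` gives `0x20001234`, and values wider than 28 bits are truncated rather than spilling into the tag.
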
### JIT Compiler

- `compile_action(val)` — emits x86-64 machine code for a single primitive or call. Called by
  `compile_and_run_tape` for each token.
- `compile_and_run_tape()` (`IA_` always-inline) — resets `code_arena`, compiles the tape up to
  `cursor_idx + 1` (incremental mode, `run_full == false`) or the full tape (`run_full == true`),
  then immediately executes the generated code. Called on every relevant keystroke.
- **JIT prologue/epilogue:** The generated function takes `U8* globals_ptr` (= `vm_globals`).
  Prologue loads `rax` from `globals_ptr[0x70/8]` = `vm_globals[14]` and `rdx` from
  `globals_ptr[0x78/8]` = `vm_globals[15]`. Epilogue stores them back. `vm_rax` / `vm_rdx` are
  synced from `vm_globals[14/15]` after execution.
- **The Magenta Pipe:** Every `Define` token emits a `JMP` (to skip over the function body for
  inline execution flow) followed by `xchg rax, rdx` at the word entry point. This is the implicit
  register-stack rotation at word boundaries — Onat's "magenta pipe".
- **O(1) linker:** `tape_to_code_offset[65536]` maps tape index → byte offset in `code_arena`.
  Populated during `compile_and_run_tape` when a `Define` token is encountered.
- The VM uses two global registers (`vm_rax`, `vm_rdx`) and 16 global memory cells
  (`vm_globals[16]`). No traditional Forth data stack in memory.
- **13 primitive operations:** `SWAP` · `MULT` · `ADD` · `FETCH` · `STORE` · `DUP` · `DROP` ·
  `SUB` · `DEC` · `PRINT` · `RET` · `RET_IF_Z` · `RET_IF_S`
- **32-bit instruction granularity:** All emitted instructions are padded to 4-byte alignment via
  NOP bytes (0x90). `pad32()` enforces this after every emit.
- Name resolution: `resolve_name_to_index()` matches 8-char space-padded annotations against
  primitives first, then prior `Define` tokens. After edits, `relink_tape()` re-resolves all
  `Call`/`Imm` references.

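
The 32-bit granularity rule and the offset table can be illustrated with a small emitter. This is a sketch under stated assumptions, not the code in `main.c`: `CodeBuf`, `emit8`, `emit_xchg_rax_rdx`, and `record_define` are hypothetical names, a fixed array stands in for `code_arena`, and the element type of `tape_to_code_offset` is assumed to be `U4`. Only the NOP padding byte, the `xchg rax, rdx` instruction, and the table size come from the description above.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t  U1;
typedef uint32_t U4;

// Hypothetical fixed-size buffer standing in for code_arena (size is a multiple of 4).
typedef struct CodeBuf { U1 bytes[256]; U4 used; } CodeBuf;

// Append one byte; silently ignored once the buffer is full.
static void emit8(CodeBuf* code, U1 byte)
{
    if (code->used < sizeof code->bytes) code->bytes[code->used++] = byte;
}

// Pad with NOP (0x90) until the write position is a multiple of 4 bytes.
static void pad32(CodeBuf* code)
{
    while (code->used & 3u) emit8(code, 0x90);
}

// Emit `xchg rax, rdx` (REX.W 87 /r = 48 87 D0), then pad to the next 4-byte slot.
static void emit_xchg_rax_rdx(CodeBuf* code)
{
    emit8(code, 0x48); emit8(code, 0x87); emit8(code, 0xD0);
    pad32(code);
}

#define TAPE_MAX 65536u
static U4 tape_to_code_offset[TAPE_MAX]; // tape index -> byte offset (element type assumed)

// Remember where the code for the Define token at `tape_idx` begins.
static void record_define(CodeBuf* code, U4 tape_idx)
{
    if (tape_idx < TAPE_MAX) tape_to_code_offset[tape_idx] = code->used;
}
```

Because every emit ends on a 4-byte boundary, each recorded offset is also a multiple of 4, so a call site can look up its target with a single array index.
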
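
Name matching on 8-char space-padded annotations reduces to comparing one 64-bit slot against another. The sketch below shows only that comparison step; `anno_pack` and `find_name_slot` are hypothetical helpers, the byte order within a slot is an assumption, and the two-stage search order (primitives first, then prior `Define` tokens) is left to the caller. It uses plain C casts for brevity, which the project conventions would replace with cast macros.

```c
#include <stdint.h>

typedef uint8_t  U1;
typedef uint32_t U4;
typedef uint64_t U8;
typedef int64_t  S8;

// Pack up to 8 characters into one U8 slot, space-padding the remainder.
// Character 0 lands in the lowest byte (assumed layout).
static U8 anno_pack(char const* name)
{
    U8 slot = 0;
    U4 idx  = 0;
    for (; idx < 8 && name[idx] != '\0'; ++idx) slot |= (U8)(U1)name[idx] << (idx * 8);
    for (; idx < 8; ++idx)                      slot |= (U8)' '           << (idx * 8);
    return slot;
}

// Return the index of the first slot equal to `wanted`, or -1 when none matches.
static S8 find_name_slot(U8 const* slots, U4 count, U8 wanted)
{
    for (U4 idx = 0; idx < count; ++idx)
    {
        if (slots[idx] == wanted) return (S8)idx;
    }
    return -1;
}
```

Space padding means `"ADD"` and `"ADD     "` pack to the same slot, so a name typed with fewer than 8 characters still matches its stored annotation.
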
### Editor

- Two modes: `MODE_NAV` (navigate) / `MODE_EDIT` (type into token). Toggled with `E` / `Escape`.
- **Key bindings (NAV mode):**
  - `E` — enter MODE_EDIT
  - Arrow keys — move cursor (Up/Down navigate by logical lines delimited by `Format` tokens)
  - `Tab` — cycle the current token's tag through `STag_*` values
  - `Space` — insert a new `Comment` token at cursor
  - `Shift+Space` — insert a new `Comment` token after cursor
  - `Return` — insert a `Format` (newline) token at cursor
  - `Backspace` — delete token before cursor
  - `Shift+Backspace` — delete token at cursor
  - `PgUp` / `PgDn` — scroll viewport
  - `F5` — toggle `run_full` (incremental ↔ full-tape JIT)
  - `F1` — save cartridge to `cartridge.bin`
  - `F2` — load cartridge from `cartridge.bin` and run
- **Key bindings (EDIT mode):**
  - Hex digits (`0-9`, `a-f`) — shift into `Data` token value
  - Any printable char — append to annotation name (up to 8 chars)
  - `Backspace` — shift `Data` value right or trim annotation name
  - `Escape` — exit to MODE_NAV, triggers `relink_tape()`
- Tape renders as colored token boxes, `TOKENS_PER_ROW` (8) per row, each showing a tag prefix
  char and either a 6-char hex value (Data) or an 8-char annotation name.
- GDI rendering via `BeginPaint`/`EndPaint`. The HUD (status bar at bottom) shows RAX/RDX state,
  global memory cells [0-3], print log, and debug log.

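
The EDIT-mode handling of `Data` tokens (hex digits shift in, `Backspace` shifts out) can be modeled as nibble shifts on the 28-bit value field. A small sketch, assuming overflow past 28 bits is simply dropped; `DATA_VAL_MASK`, `data_push_hex`, `data_pop_hex`, and `hex_digit_value` are hypothetical names, not functions from `main.c`.

```c
#include <stdint.h>

typedef uint32_t U4;

#define DATA_VAL_MASK 0x0FFFFFFFu // low 28 bits of a token (hypothetical name)

// Shift one hex digit (0-15) into the low end of the value, dropping bits above 28.
static U4 data_push_hex(U4 val, U4 digit)
{
    return ((val << 4) | (digit & 0xFu)) & DATA_VAL_MASK;
}

// Undo the most recent digit by shifting the value right one hex digit.
static U4 data_pop_hex(U4 val)
{
    return (val & DATA_VAL_MASK) >> 4;
}

// Map '0'-'9' and 'a'-'f' to 0-15; return 16 for anything else.
static U4 hex_digit_value(char ch)
{
    if (ch >= '0' && ch <= '9') return (U4)(ch - '0');
    if (ch >= 'a' && ch <= 'f') return (U4)(ch - 'a') + 10u;
    return 16u;
}
```

Typing `1` then `a` into an empty value yields `0x1A` in this model, and one `Backspace` returns it to `0x1`.
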
### Persistence

- Cartridge format: `[tape_arena.used : U8][anno_arena.used : U8][cursor_idx : U8][tape data][anno data]`
- On load: restores arenas, cursor, calls `relink_tape()` then `compile_and_run_tape()`.

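
The cartridge layout above is three `U8` header fields followed by the two raw arenas. The sketch below serializes that layout into a memory buffer so the byte offsets are easy to see. It is an illustration only: `cartridge_write` and `copy_bytes` are hypothetical names, the header is written in native byte order (little-endian on amd64), and the real save path goes through WinAPI file calls rather than a buffer.

```c
#include <stdint.h>

typedef uint8_t  U1;
typedef uint64_t U8;

// Byte-wise copy; stands in for the DSL header's mem_copy.
static void copy_bytes(U1* dst, U1 const* src, U8 size)
{
    for (U8 idx = 0; idx < size; ++idx) dst[idx] = src[idx];
}

// Write header + tape + annotations into `out`.
// Returns the number of bytes written, or 0 when `out` is too small.
static U8 cartridge_write(U1* out, U8 out_cap,
                          U1 const* tape, U8 tape_used,
                          U1 const* anno, U8 anno_used,
                          U8 cursor_idx)
{
    U8 const header[3] = { tape_used, anno_used, cursor_idx };
    U8 const total     = sizeof header + tape_used + anno_used;
    if (total > out_cap) return 0;
    copy_bytes(out,                             (U1 const*)header, sizeof header);
    copy_bytes(out + sizeof header,             tape,              tape_used);
    copy_bytes(out + sizeof header + tape_used, anno,              anno_used);
    return total;
}
```

With this layout the tape data always starts at byte 24, and the annotation data starts at byte `24 + tape_arena.used`, so a loader can size both arenas from the header before reading anything else.
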
## Current Development Roadmap

Status as of 2026-02-21:

1. **FFI / Tape Drive Argument Scatter** — the PRINT primitive manually aligns RSP and moves rax
   into rcx before calling `ms_builtin_print`. R8/R9 args should come from pre-defined `vm_globals`
   offsets ("preemptive scatter") rather than being zeroed.
2. **Variable-Length Annotations** — `anno_arena` is fixed at 8 bytes per token. Need a scheme
   for longer comments without breaking the `O(1)` `tape_to_code_offset` mapping.
3. ~~**Cartridge Persistence**~~ — DONE (F1/F2 save/load via WinAPI `CreateFileA`/`WriteFile`).
4. **Editor Cursor Refinement** — proper in-token cursor for `Data` and annotation tokens, rather
   than backspace-truncation and right-shift append.
5. **Control Flow Expansion** — lambdas or basic block jumps beyond the current conditional-return
   primitives (`RET_IF_Z`, `RET_IF_S`).

## C DSL Conventions (from CONVENTIONS.md — strictly enforced)

**Types:** Never use `int`, `long`, `unsigned`, etc. Always use `U1`/`U2`/`U4`/`U8` (unsigned),
`S1`/`S2`/`S4`/`S8` (signed), `F4`/`F8` (float), `B1`–`B8` (bool).
Use cast macros (`u8_(val)`, `u4_(val)`, `u4_r(ptr)`) — not C-style casts. Standard C casts only
for complex types where no macro exists.

**Naming:** `lower_snake_case` for functions/variables. `PascalCase` for types. WinAPI bindings
prefixed with `ms_` using `asm("SymbolName")` — never declare raw WinAPI names.

**const placement:** Always to the right: `char const*`, not `const char*`.

**Structs/Enums:** Use `typedef Struct_(Name) { ... };` and `typedef Enum_(UnderlyingType, Name) { ... };`.

**X-Macros:** Use for enums coupled with metadata (colors, prefixes, names). Entry names PascalCase,
enum symbols use `tmpl(TypeName, Entry)` → `TypeName_Entry`.

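
The X-macro convention can be sketched with the tag list from the Token / Tape Model section. This is an illustration under assumptions: the `tmpl` definition, the numeric order of the tags, and the `stag_prefix` / `stag_name` tables are hypothetical, color metadata is omitted, and a plain `enum` is used instead of the project's `Enum_` macro to keep the example standalone.

```c
#include <stdint.h>

typedef uint32_t U4;

// Token-pasting helper: tmpl(STag, Define) expands to STag_Define (assumed definition).
#define tmpl(type, entry) type ## _ ## entry

// One row per tag: entry name, prefix character.
#define Tag_Entries() \
    X(Define,  ':')   \
    X(Call,    '~')   \
    X(Data,    '$')   \
    X(Imm,     '^')   \
    X(Comment, '.')   \
    X(Format,  ' ')

// Expansion 1: the enum symbols, numbered in row order.
enum STag
{
#define X(name, prefix) tmpl(STag, name),
    Tag_Entries()
#undef X
    STag_Count,
};

// Expansion 2: prefix characters, indexed by enum value.
static char const stag_prefix[] =
{
#define X(name, prefix) prefix,
    Tag_Entries()
#undef X
};

// Expansion 3: printable names, indexed by enum value.
static char const* const stag_name[] =
{
#define X(name, prefix) #name,
    Tag_Entries()
#undef X
};
```

Adding a tag means adding one row to `Tag_Entries()`; the enum and both tables stay in step because they are all generated from the same list.
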
**Memory:** Use `FArena` / `farena_push` / `farena_reset` — no raw malloc. Use `mem_fill`/`mem_copy`
not memset/memcpy. Do not `#include <stdlib.h>` or `<string.h>`.

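
For readers unfamiliar with arena allocation, the idea behind `FArena` is a bump pointer over one fixed block. The sketch below reuses the names from the header for readability, but the struct layout and both signatures are assumptions; check `duffle.amd64.win32.h` for the real ones before relying on any detail here.

```c
#include <stdint.h>

typedef uint8_t  U1;
typedef uint64_t U8;

// Hypothetical layout; the real FArena may differ.
typedef struct FArena { U1* start; U8 capacity; U8 used; } FArena;

// Bump-allocate `size` bytes at an offset aligned to `align` (a power of two).
// Alignment is relative to `start`, so the backing block is assumed to be aligned itself.
// Returns 0 when the request does not fit.
static void* farena_push(FArena* arena, U8 size, U8 align)
{
    U8 const offset = (arena->used + (align - 1)) & ~(align - 1);
    if (offset > arena->capacity || size > arena->capacity - offset) return 0;
    arena->used = offset + size;
    return arena->start + offset;
}

// Release everything at once by rewinding the cursor; the backing memory is kept.
static void farena_reset(FArena* arena) { arena->used = 0; }
```

This is why `compile_and_run_tape()` can afford to reset `code_arena` on every keystroke: a reset is a single store, with no per-allocation bookkeeping to unwind.
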
**Formatting:** Allman braces for complex blocks. Vertical alignment for struct fields and related
declarations. Space between `&` and operand: `& my_var`. `else if` / `else` on new lines. Align
consecutive `while`/`if` keywords vertically where possible.

**Storage class keywords:** `global` (= `static` at file scope), `internal` (= `static` for
functions), `LP_` (= `static` inside a function), `I_` (inline), `N_` (noinline), `IA_`
(always-inline).

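
One plausible way to define the storage-class keywords is shown below. These definitions are assumptions made for illustration (in particular, pairing `static` with the inline variants and using the clang/gcc `__attribute__` syntax); the authoritative definitions are in `duffle.amd64.win32.h`. The functions `next_id` and `double_it` are hypothetical.

```c
#include <stdint.h>

typedef uint32_t U4;

// Assumed definitions; the real ones live in the DSL header.
#define global   static                                       // file-scope variable
#define internal static                                       // file-local function
#define LP_      static                                       // persistent local inside a function
#define I_       static inline                                // inline hint
#define N_       static __attribute__((noinline))             // never inline
#define IA_      static inline __attribute__((always_inline)) // always inline

global U4 call_count = 0;

// Hand out increasing ids; the counter survives between calls because it is LP_.
IA_ U4 next_id(void)
{
    LP_ U4 last_id = 0;
    call_count += 1;
    return ++last_id;
}

// An ordinary file-local helper.
internal U4 double_it(U4 value)
{
    return value * 2u;
}
```

The three `static` aliases compile to the same keyword; the separate names exist so a reader can tell at a glance whether a declaration is global state, a private function, or a persistent local.
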
**Line length:** 120–160 characters per line in scripts.