b389f1be98
Foundation research track. Produces a single markdown report at docs/ideation/2026-06-12-intent-based-scripting-languages.md surveying intent-based scripting languages and proposing a 4-tier vocab (~40 verbs) for a Meta-Tooling-facing intent DSL.
The report's 7 sections:
1. The 'intent-based' design philosophy (O'Donnell immediate-mode, Onat/Lottes hardware, CoSy open-vocab, Jofito intent-mapping)
2. Prior art across 8 clusters (0: IMGUI, 1: Concatenative, 2: Array, 3: Intent-mapping, 4: Meta-Tooling, 5: SSDL shapes, 6: Command Palette, 7: Result error handling)
3. The grammar (14 primitives formalized from user's pseudocode)
4. The 4-tier vocab (math, data pipeline, shell, AI-fuzzing tolerance)
5. Hardware mapping (4 anchor claims to Onat/Lottes/O'Donnell/APL-K)
6. AI-agent properties (10 claims tying to existing project architecture: Meta-Tooling domain, 3-layer security, 4 memory dimensions, stable-to-volatile cache, Result envelope, Command Palette 33 commands, Hook API, IEventTarget/sandbox, 'reads are free')
7. Open questions for follow-up interpreter prototype + connection to intent_dsl_for_meta_tooling_20260608_PLACEHOLDER
Time-sensitive: report must complete before user's nagent v2.2. No new src/ code, no new tests, no pyproject.toml changes. Pure research deliverable.
428 lines
14 KiB
Plaintext
As you can see, you guys have managed to buy a solid day of developer time for Jofido in under 24 hours. I am truly humbled by your support. In fact, I'm so humbled that Danny the dinosaur back here has now decided to become my high priest, which is why there's that creepy staff thing that I'm a little afraid of.

Anyway, in order to avoid getting murdered by the magic, I'm going to show you what I've done, just so that you can understand Jofido a little better. I have this nifty little diagram right here. Oh, no, no, Danny, please don't kill me for hiding you. But this is how we're going to be rust. It's pretty straightforward if you know what you're looking at. So, let me maybe pivot somewhere else, which is unfortunately going to force me to edit the video, and actually show you this diagram with a little bit more clarity.

All right, this is going to make for the most awkward presentation I've ever given anyone, ever.

So, what we've got here is the old way of doing things. This is your standard pipeline. By the way, excuse the stacking everything up on VCRs. I didn't know what else to do. I don't have a proper table here.

find . -type f | grep -e \.jpg$ -e \.png$

The backslashes are escapes for the dot.

What does this mean? Well, if we try to read it like a layman, it doesn't mean very much. Find whatever an f is. I can think of some things that start with f that remind me of things that I don't want to find. But anyway, and then a vertical symbol, and what is a grep? Who knows? And what are all these? I mean, I kind of know what jpg and png mean, but if I'm a layman, this is cryptic crap.

It's not just cryptic, it's inefficient beyond belief. So, here's what we've got.

You'll notice that we have arrows going down to this box called pipe buffer. That's because if you run find, with the current directory as the root for the find, returning only results that are type file, just as an example, and then pipe it, it has to shovel the output of that in as the input of grep. Grep is a general regular expression parser. It's a big fancy state machine that takes a while to spin up and is not all that fast at just simple globbing, which is the term used to refer to, basically, finding substrings in a string, except in reverse.

So, these grep expressions, which are the e's, say .jpg or .png, and the dollar sign is code for the end of the line. You have to know all of that to make this work. This essentially finds every single file, not directories, only actual files under the current directory, and then pipes that to grep to further reduce the results, so that the list only contains names with a jpg or png file extension at the end.

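To make the two grep expressions concrete, here is a minimal Python sketch of the same match. It uses Python's `re` module as a stand-in for grep, and the names `IMAGE_RE`, `LOOSE_RE`, and `is_image` are made up for this illustration.

```python
import re

# Same two expressions as the grep call: a literal dot, the extension,
# then "$" for end of line. The backslash keeps "." from meaning
# "any character".
IMAGE_RE = re.compile(r"\.jpg$|\.png$")

# Hypothetical variant with the escape forgotten, for comparison only.
LOOSE_RE = re.compile(r".jpg$|.png$")


def is_image(path: str) -> bool:
    """Return True when the path ends in .jpg or .png."""
    return IMAGE_RE.search(path) is not None


if __name__ == "__main__":
    for candidate in ["./pics/abc.jpg", "./xyz.png", "./bad.txt", "./abc.jpg.bak"]:
        print(candidate, is_image(candidate))
```

The unescaped variant would also accept a name such as `notajpg`, because its dot matches the letter `a`. That is the detail the speaker says you have to know to make the pipeline work.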
To do that, it has to jump through this pipe buffer. Now, the problem is some data will get kicked out of find, put into this intermediate buffer, and then pushed out of the intermediate buffer as the input of grep. Every single time you send stuff through a pipe, or a consumer consumes the stuff through the other end of the pipe, you have a context switch. Also, I didn't illustrate it here, but you also have a problem where, if the consumer isn't fast enough, the producer waits for the consumer, potentially running into a nasty time-sinking task of some sort along the way. But we're going to ignore that for now.

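The point about the producer waiting can be seen directly on a POSIX system. The sketch below assumes Linux or a similar system. It fills a pipe that nobody is reading and counts how many bytes fit before a write would have to block. `pipe_capacity` is a name invented for this example, and the exact number it returns depends on the operating system.

```python
import os


def pipe_capacity(chunk: bytes = b"x" * 4096) -> int:
    """Write into an unread pipe until the kernel buffer is full.

    Returns the number of bytes accepted before a write would block.
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking mode turns "the producer would wait" into an exception.
    os.set_blocking(write_fd, False)
    written = 0
    try:
        while True:
            written += os.write(write_fd, chunk)
    except BlockingIOError:
        # The intermediate buffer is full: a blocking producer would stall here
        # until the consumer drains some data from the other end.
        pass
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written


if __name__ == "__main__":
    print("bytes buffered before the producer would block:", pipe_capacity())
```

Once that buffer is full, a normal blocking producer such as find simply sleeps until grep reads from the other end.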
So, every time you do a context switch, you're basically throwing away your CPU state and trashing your caches, which makes everything run slower, because now all this stuff you're doing the work for here is no longer in main memory, or rather in the L1 cache, which is your CPU execution core's main memory. It gets thrown out and switched over to this one. You just keep bouncing back and forth. So, you're destroying your cache coherency by duplicating data, because the pipe buffer doesn't just magically drop itself into grep. It has to be fed through the interfaces that grep uses for input, be it fgets, which reads individual lines, or fread, or just plain read. But one way or the other, it gets kicked out of this, where there's usually some kind of output interface. Then it gets stored by proxy in a buffer. Then that same proxy is also kicking it out. So, there are all these switches between the contexts, and it wrecks your CPU performance. Now, it's also just generally inefficient and unreadable.

Grep is also a beast. And at the end of it, all we're doing is printing the list of files that match. Now, my solution, Jofido, Jody's file tool: we'll say scan directory. And this is sort of the C function format. I'm sorry I had to break things across lines, because I wrote large, but it ends over here.

So, scan directory: the first parameter is the same thing. It's dot. It's presented in double quotes so that we know it's a string. We know that it's actually meant to be text and not a variable name. That's important. But other than that oddity, this is the same.

But here's how it differs. Find does not have grep. Find can't do the "only match things against certain parameters" or "only match things that don't meet certain parameters". Scan directory, however, has this curly-brace filter that ends over here. Filter is a generic predicate that calls a particular kind of filtration on a string or list of strings, and then filters them as you want them. In this case, we would have a filter that filters extensions: JPG and PNG, corresponding to JPEG and Portable Network Graphics images.

It's much easier to read. We know we're scanning a directory. The dot's cryptic, but it's the current directory. I mean, you just kind of have to accept that degree of terminology here. Excuse the coughing.

Then this filter: what happens under the hood is scan directory alone can just start reading the directory contents, but filter runs in a parallel thread. And then you'll notice that's not the end of it. The last one is another predicate called print. The curly braces mean that it's a predicate. Basically, think of it as a modifier. And that's the end of the scan directory function.

Now, we don't have to have a big pipe buffer. We don't have to have an output buffer, a pipe buffer, and an input buffer, which is what's really going on here under the hood with the C library.

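The call shape described above, a directory string followed by a filter predicate and a print predicate, could be approximated in Python roughly as follows. This is only a sketch of the reading experience, not Jofido's actual syntax or implementation. `scan_directory`, `filter_extensions`, and `print_paths` are hypothetical names, and everything runs sequentially here, not in parallel threads.

```python
import os
from typing import Callable, Iterable, Iterator, List

# A predicate takes a stream of paths and returns a (possibly smaller) stream.
Predicate = Callable[[Iterable[str]], Iterator[str]]


def filter_extensions(*extensions: str) -> Predicate:
    """Build a predicate that keeps only paths with the given extensions."""
    suffixes = tuple("." + ext.lower() for ext in extensions)

    def predicate(paths: Iterable[str]) -> Iterator[str]:
        return (p for p in paths if p.lower().endswith(suffixes))

    return predicate


def print_paths(paths: Iterable[str]) -> Iterator[str]:
    """Terminal-style predicate: print each path as soon as it arrives."""
    for path in paths:
        print(path)
        yield path


def scan_directory(root: str, *predicates: Predicate) -> List[str]:
    """Walk `root` and feed every file path through the predicates in order."""

    def walk() -> Iterator[str]:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                yield os.path.join(dirpath, name)

    stream: Iterable[str] = walk()
    for predicate in predicates:
        stream = predicate(stream)
    return list(stream)


if __name__ == "__main__":
    scan_directory(".", filter_extensions("jpg", "png"), print_paths)
```

Because the predicates are chained lazily, each path is filtered and printed as soon as it is read, with no intermediate list between the stages.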
Instead, we're doing everything in-house. We do it all internal to Jofido. So, what we have is an arena. An arena is a kind of memory map where you just slam everything in, in order, and you allocate in large chunks. I don't want to go too far into it, but the bottom line is, as scan directory reads in these paths and stores them in the arena here, the filter predicate is chasing that arena. Rather than scan directory waiting for the filter to make a decision before it can continue scanning, these run in parallel.

If scan directory is faster than filter, then filter eventually has to catch up. But if filter is faster than scan directory, which is most likely, then filter catches up and just stops. It doesn't process any more until scan directory increments the size of this list, and that triggers filter. Its thread wakes up, sees that the increment's there, sees that the done flag for the operation it's supposed to filter hasn't been toggled, and bumps to the next item. So, in this way, we have a leader and a chaser.

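The leader and chaser arrangement can be sketched with ordinary threads and a condition variable. The code below is an illustration under several assumptions and is not Jofido's implementation. `SharedList` and `run_chain` are invented names. A rejected entry is simply overwritten with `None` as a stand-in for the detach step. A plain list of path strings stands in for the real directory scan. In CPython the threads take turns and do not run truly in parallel.

```python
import threading
from typing import List, Optional, Sequence, Tuple


class SharedList:
    """One shared list plus a progress counter and done flag per stage."""

    def __init__(self, stages: int) -> None:
        self.items: List[Optional[str]] = []
        self.cursor = [0] * stages      # how many items each stage has finished
        self.done = [False] * stages    # set when a stage will produce no more
        self.cond = threading.Condition()

    def advance(self, stage: int) -> None:
        with self.cond:
            self.cursor[stage] += 1
            self.cond.notify_all()      # wake any chaser waiting on this stage

    def finish(self, stage: int) -> None:
        with self.cond:
            self.done[stage] = True
            self.cond.notify_all()

    def wait_for(self, stage: int, index: int) -> bool:
        """Sleep until `stage` has finished item `index`; False if it never will."""
        with self.cond:
            while self.cursor[stage] <= index and not self.done[stage]:
                self.cond.wait()
            return self.cursor[stage] > index


def run_chain(paths: Sequence[str], extensions: Tuple[str, ...]) -> List[str]:
    shared = SharedList(stages=2)       # stage 0 = scan, stage 1 = filter
    printed: List[str] = []

    def scan() -> None:                 # the leader
        for path in paths:
            with shared.cond:
                shared.items.append(path)
            shared.advance(0)
        shared.finish(0)

    def filter_stage() -> None:         # chases scan
        i = 0
        while shared.wait_for(0, i):
            item = shared.items[i]
            if item is not None and not item.lower().endswith(extensions):
                shared.items[i] = None  # stand-in for detaching the entry
            shared.advance(1)
            i += 1
        shared.finish(1)

    def print_stage() -> None:          # chases filter
        i = 0
        while shared.wait_for(1, i):
            item = shared.items[i]
            if item is not None:
                printed.append(item)
            i += 1

    threads = [threading.Thread(target=t) for t in (scan, filter_stage, print_stage)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return printed


if __name__ == "__main__":
    print(run_chain(["xyz.png", "abc.jpg", "bad.txt"], (".jpg", ".png")))
```

Each chaser only ever looks at indexes its leader has already finished, so it stops when it catches up and resumes when the leader's counter moves.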
The chaser goes through, and that's what this blue arrow is here, and qualifies each one. This one's bad. Okay. So, what happens when filter finds this bad one? Scan directory has already moved past it. So, filter will deallocate this and detach it. There's a complicated way that I prevent deallocation of an object from the arena from causing an index mismatch, but all you need to know is that we can remove this item without the third chaser, or rather the second chaser, print here, having a problem where, "Oh no, there's an item that's gone, and now I see this is item three instead of four." We don't have that problem. Filter can immediately detach this.

And now, when print goes through, it will never hit this. See, each one of these follows in order. This is the most subordinate. This is the leader. So, print is chasing filter, which is chasing scan directory. We have a situation here where, if you have three cores or threads on a machine, the directory scan can be happening (and this would actually be happening in bulk with some of my optimizations), then the filtration of that scan will be happening in another thread or on another core at the same time, and will stop when it runs out of data and resume when more data is available. Then the subordinate here, same deal: it will stop when the filter doesn't have any more filtered items available and continue when it does. So, scanning, filtering, and printing can all happen simultaneously on a modern machine with multiple cores.

But the most important part is what happens when we have the scanner, the filter, and the printer all chasing one after the other. Say the scanner here has just loaded bad.txt into the list, the filter here has just qualified abc.jpg, and print has just printed xyz.png, right? So, assuming that the predicates here are fast enough, they're all kind of working in lockstep, which means that these items are still hot in the level one instruction and data caches as it's iterating through this list. So, rather than this situation where you have three separate lists in completely different places that are blowing out each other's L1 cache presence, our entire chain here is following one another.

And the best part of all of this, other than the fact that print can output immediately instead of waiting, is this part: arena objects are destroyed once they're terminal.

So, what makes an arena object terminal? Well, when filter filters out this, it can no longer be passed to any of the predicates that are subordinate to it, the ones that come later. So, print is not going to be able to print this. So, there's no more use for it. This object is officially dead. So, filter can say so. Filter can say, "Hey, this one's a no-no, kill it." And it gets killed and it gets marked as free in the arena.

But then, when print prints this one and this one and this one and this one in order, as it's printing them, it is the terminal predicate. It is the end of the line. Nothing happens with this after print, because we didn't assign the scan directory results to some variable to keep. So, once scan directory's done and print has completed, we don't need any of this anymore.

But we don't deallocate it in bulk at the end. As print chips away at the list, and is the tail end of this predicate chain: dump, dump, dump, dump. Once an item is no longer needed, it is freed up. Once enough arena items have been freed up, this entire arena page can be compacted.

And I don't want to go over it in this one, but maybe in the next video if you're interested. The way that the arena works is we actually have an indirection block, think of it as over here somewhere, so that these high-level primitives point to indirection blocks, but the low-level locations are pointed to by the indirection blocks. So, this sees the list it's outputting at a fixed location that points to a variable location. So, we can move these around all we want. We can garbage collect, as in free up memory and compact out the empty spaces, all day long. And none of these predicates or filters or actions or verbs or whatever you want to call them have any idea that that's going on right behind their backs.

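The indirection idea, a fixed location that points to a variable location, is commonly implemented as a handle table. The toy Python class below is a sketch of that general technique under my own assumptions, not a description of Jofido's arena. `Arena`, `alloc`, `get`, `free`, and `compact` are names chosen for the example.

```python
from typing import Dict, List


class Arena:
    """Toy arena: stable handles point at slots that compaction may move."""

    _FREE = object()  # marker for a slot whose item has been freed

    def __init__(self) -> None:
        self.slots: List[object] = []        # storage, filled in order
        self.handles: Dict[int, int] = {}    # handle -> current slot index
        self._next_handle = 0

    def alloc(self, value: object) -> int:
        """Store a value in the next slot and return a handle that never changes."""
        handle = self._next_handle
        self._next_handle += 1
        self.handles[handle] = len(self.slots)
        self.slots.append(value)
        return handle

    def get(self, handle: int) -> object:
        """Look the value up through the indirection table."""
        return self.slots[self.handles[handle]]

    def free(self, handle: int) -> None:
        """Mark the slot as free; the hole stays until the next compaction."""
        self.slots[self.handles.pop(handle)] = Arena._FREE

    def compact(self) -> None:
        """Squeeze out freed slots and repoint every live handle."""
        live = sorted(self.handles.items(), key=lambda pair: pair[1])
        self.slots = [self.slots[index] for _handle, index in live]
        self.handles = {handle: new for new, (handle, _old) in enumerate(live)}


if __name__ == "__main__":
    arena = Arena()
    first = arena.alloc("xyz.png")
    bad = arena.alloc("bad.txt")
    last = arena.alloc("abc.jpg")
    arena.free(bad)      # the filter rejects this entry
    arena.compact()      # storage shrinks and slots move
    print(arena.get(first), arena.get(last), len(arena.slots))
```

Callers hold only handles, so compaction can move the underlying slots without any of them noticing.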
Anyway, this is just a basic example of the kind of thing that I intend to do. This effectively replaces this find and grep chain, which is a pretty common one. I actually use this pretty often to find all of the pictures under a certain folder. So, this is not some academic example. This is real-world, working with your hands on the metal, you know, system administration. I need to find all the pictures underneath this folder and get a list of them. And this is a common thing to do, and there are steps along the way that make it a lot slower than it has to be. The longer you wait for one step to finish, the longer it takes everything down the pipeline to finish.

Also, something I haven't talked about, maybe a little teaser for you guys: I want to replace find and grep with Jofido primitives and scripts. Well, one of the solutions I have to, "Well, how are you going to integrate Jofido in like this and not lose the benefits of avoiding this pipe buffer?" is this. I've come up with some tech called pipe coalescing, where find and grep see they're part of a pipeline. Find and grep see they're the same Jofido executable. And then find is the head, so it's the coordinator. And all the subordinates down the pipeline reach out to the head and say, "Hey, here's my script, here's my parameters, integrate me into you and I'll just become a hollow pipe that sends the final results down the line."

Thus find and grep and sort and uniq and whatever else your big long stupid pipeline might use all get collapsed by Jofido (if they're all Jofido scripts instead of the actual binaries, that is) into one unified Jofido script in memory that then performs all these actions, and thus can optimize away cases where, for example, it would be wasteful to get certain information, and do it faster than you would ever be able to do it with a normal pipeline on your own.

Anyway, I don't want to talk anymore. I know I've hit almost 15 minutes on just this part, and I thought that this would be a good introduction to give you an idea of what we're doing here and why your funding of Jofido development is so important. This kind of logic is not something that just anybody can write. And even for me, it's not like this is necessarily easy. This is a lot of work and a lot of testing. So, look down: my Ko-fi will be in the description, possibly the pinned comment, and perhaps a link to the video that started all this, too. And thanks for your support. I hope to do you proud. Have a great day.