Files
forth_bootslop/references/ForthNotes/Variations.md
2026-02-19 16:16:24 -05:00

6.0 KiB
Raw Permalink Blame History

Variations

Source: https://muforth.dev/threaded-code-variations/

Threaded code: Variations muforth

Threaded code: Variations


Address lists: cfas or pfas?

In my introduction to threaded code I mentioned that colon word bodies consist of a list of cfas (code field addresses), but that sometimes it makes sense to use pfas instead. We are going to explore that idea here.

First, lets recap the definitions:

The address of a code field is called, unsurprisingly, a code field address usually abbreviated cfa.

The address of a parameter field is called, also unsurprisingly, a parameter field address usually abbreviated pfa.

In a DTC system, a cfa points to a code field, which contains machine code.

In an ITC system, a cfa points to a code field, which points to machine code.

The main reason for using cfas in address lists is that its the obvious thing to do. The cfa points to the code field, which we either jump to (DTC) or jump through (ITC). Why would we want to use anything else?

The answer is suggested not by the code of NEXT, but by NEST, VARIABLE, and CONSTANT the “runtime” routines for colon words, variables, and constants, respectively. Lets look at them again:

  DTC_NEST: Push IP onto call stack
            Set IP to W + sizeof(Jump)
            Execute NEXT
  ITC_NEST: Push IP onto call stack
            Set IP to W + address size
            Execute NEXT
  VARIABLE: Add to W the size of the code field (ie, calculate pfa)
            Push W onto the data stack (push pfa)
            Execute NEXT
  CONSTANT: Add to W the size of the code field (calculate pfa)
            Fetch value pointed to by W into W (fetch contents of parameter field)
            Push W onto the data stack
            Execute NEXT

Whether calculating the new IP in NEST, or calculating the address of the value in VARIABLE and CONSTANT, we are always adding an offset to W. Because of how NEXT works, W contains the cfa. We want the pfa.

Whenever a programmer creates a new data type that has an assembler runtime (using ;code) they have to do this same arithmetic on W. Why not do it one place?

The natural “one place” is in NEXT. So what would we change? Our aim is to exit NEXT with W containing a pfa instead of a cfa.

Lets look at DTC first. Here is DTC NEXT:

  NEXT:  Fetch into W the address pointed to by IP
         Increment IP by address size
         Jump to W

We have conflicting requirements here. For DTC we have to jump to the code field. We have no choice there. But somehow we want W to point to the parameter field one memory cell higher. How to do both?

If the address list contains cfas, then the first step of NEXT will put a cfa into W. Can we jump to W and increment it in one instruction? Not likely.

What about using pfas in the address list. Does this help? W will have the right value when we exit from NEXT, but can we jump to W minus an offset? It depends on the machine. And it might make NEXT longer possibly significantly longer.

On some machines the best way to do DTC NEXT doesnt involve W at all. For example, on the MSP430 the shortest DTC NEXT is one instruction:

  mov pc, (ip)+

This does the auto-increment of IP and the jump to the fetched cfa, but leaves to the runtimes the computation of the pfa. This approach requires that code fields contain calls rather than jumps in order to “capture” the pfa somewhere (it gets pushed onto the machines call stack).

DTC doesnt offer a lot of wiggle room. Maybe ITC is better. Lets take a look.

Here is ITC NEXT:

  NEXT:  Fetch into W the address pointed to by IP
         Increment IP by address size
         Fetch into X the address pointed to by W
         Jump to X

As we did with DTC, lets start with the case where address lists contain cfas. Just as with DTC NEXT, the first step of ITC NEXT loads a cfa into W. On a machine with auto-increment addressing we can have the “Fetch into X” step increment W so that when we get to NEST (or one of the other “runtimes”) W will contain the pfa.

It might take an extra cycle (it does on the MSP430) but because in every case except executing a code word we will have to take at least a cycle to do arithmetic on W, it might make sense to auto-increment instead. Its also less error-prone to do it, correctly, in one place. (This is what I have done in the MSP430 ITC kernel.)

If we dont have auto-increment (eg, most true RISC machines ARM (excepting v6-M) doesnt count ;-) but can use a negative offset in our memory access instructions, then it makes sense to use pfas in our address lists instead of cfas. If we do this, the first step of NEXT loads W with a pfa, which is what we want. But the “Fetch into X” step needs to load from the cfa instead, so we offset backwards by the size of an address. This sounds convoluted, but its actually the most efficient approach.

Here is that version of ITC NEXT:

  NEXT:  Fetch into W the address pointed to by IP (this is a pfa, not a cfa!)
         Increment IP by address size
         Fetch into X the address pointed to by (W - address size)
         Jump to X

This is what I have done for the RISC-V ITC NEXT.

You have to work with the strengths and against the weaknesses of your architecture not the other way around! There is no fixed answer. Using cfas is not always the “perfect and beautiful” approach. Nor is using pfas. It pays to tinker and draw pictures and pay attention to instruction cycle-times and sizes.


Send feedback on this page (last edited 2017 April 24)
Browse all pages
Return home