2031 lines
72 KiB
Markdown
# libforth

**Source:** https://github.com/howerj/libforth

libforth: A small Forth interpreter that can be used as a library, written in C99

% FORTH(1)
% Richard Howe
% November 2016

# NAME

forth - a forth interpreter

# SYNOPSIS

**forth** [**-s** file] [**-e** string] [**-l** file] [**-m** size] [**-VthvLSnx**] [**-**] [**files**]

# DESCRIPTION

A Forth interpreter built around a library, libforth, that implements a
complete Forth interpreter.

For a much better, albeit 16-bit only, Forth that is also very well documented
and (slightly more) standards compliant, please visit the more up to date
[Embed](https://github.com/howerj/embed) virtual machine. Also of interest is
the [Forth CPU](https://github.com/howerj/forth-cpu) I wrote, in VHDL, for an
FPGA.

This interpreter is available [here](https://github.com/howerj/libforth).

# OPTIONS

Command line switches must be given before any files, unless that switch takes
a file as an argument.

* -s file

  This saves the working memory of the Forth interpreter to a file,
  which can later be loaded with the "-l" option. A core file that has been
  invalidated will not be saved; invalidation occurs when an unrecoverable
  error has been detected that would prevent any recovery or meaningful
  execution with the current image.

* -e string

  Evaluate a Forth string passed in as an argument.

* -t

  After all the files have been read from and any core files have been loaded
  this will make the Forth interpreter read from [stdin](https://en.wikipedia.org/wiki/Standard_streams). If the "-s" or "-S"
  flags have been specified, the core file will be saved after [stdin](https://en.wikipedia.org/wiki/Standard_streams) has
  been read from and there is no more work to do.

* -h

  Print out a short help message and exit unsuccessfully.

* -v

  Turn verbose mode on; more information will be printed out, to [stderr](https://en.wikipedia.org/wiki/Standard_streams), about
  what is happening in the interpreter. Usually the interpreter is as silent as
  possible.

* -m size

  Specify the virtual machine's memory size in kilobytes, overriding the default
  memory size. This is mutually exclusive with "-l".

* -l file

  This option loads a Forth core file generated by the "-s" option of a
  previous run. This core file is not portable and must be loaded on the same
  platform on which it was generated. It can only be specified once per run of
  the interpreter.

* -L

  The same as "-l", however the default core file name is used, "forth.core", so
  an argument does not have to be provided.

* -S

  The same as "-s", however the default core file name is used, "forth.core", so
  an argument does not have to be provided.

* '-'

  Stop processing any more command line options and treat all arguments after
  this as files to be executed, if there are any.

* -V

  Print version and interpreter information and exit successfully.

* -f file

  Process a file immediately. This allows options and file arguments to be
  intermingled.

* -n

  If the line editing library is compiled into the executable, which is a compile
  time option, then when reading from [stdin](https://en.wikipedia.org/wiki/Standard_streams) this will use a [line editor](https://github.com/howerj/libline)
  to read in a line at a time. This option implies *-t*.

* -x

  Enable signal handling. Signal handling is experimental and not quite worked
  out yet. It is off by default as I find it annoying when programs catch signals
  when you really want the program to *die*. This cannot be enabled from within
  the Forth interpreter. This option should disappear once signal handling has
  been sorted out.

* file...

  If a file, or list of files, is given, read from them one after another
  and execute them. The dictionary and any stored Forth blocks will persist, as
  will values on the stack.

If no files are given to execute, [stdin](https://en.wikipedia.org/wiki/Standard_streams) will be read from.

## EXAMPLES

```
./forth
```

Execute any commands given from [stdin](https://en.wikipedia.org/wiki/Standard_streams).

```
./forth -t file1.4th file2.4th
```

Execute file "file1.4th", then "file2.4th", then read from [stdin](https://en.wikipedia.org/wiki/Standard_streams).

```
./forth file1.4th
```

Execute file "file1.4th".

```
./forth -S file1.4th
```

Execute file "file1.4th", then produce a "forth.core" save file.

```
./forth -S -l forth.core
```

Load a "forth.core" file, read from [stdin](https://en.wikipedia.org/wiki/Standard_streams) and execute any commands given,
then dump the new core file to "forth.core".

The interpreter returns zero on success and non-zero on failure.

# LICENSE

The Forth interpreter and the library that implements it are released under the
[MIT](https://opensource.org/licenses/MIT) license. Copyright (c) Richard Howe, 2016.

# EXIT STATUS

This program will return a non-zero value on failure, and zero on success.

# SEE ALSO

libforth(3)

# BUGS

If you find a bug, or would like to request a new feature, please Email me at:

```
howe.r.j.89 [ at ] gmail . com
```

The interpreter has not been battle hardened yet so there is likely behavior
that is non-standard (for no reason) or just outright incorrect.

# MANUAL

This small [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) interpreter is based on a de-obfuscated entry
into the [IOCCC](http://ioccc.org/winners.html) by *buzzard*. The entry described a [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29)-like
language from which this derives. You can use this library to
evaluate [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) strings or as an embeddable interpreter. Work
would need to be done to get useful information after doing those
evaluations, but the library works quite well.

*main.c* is simply a wrapper around one of the functions that implements
a simple [REPL](https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop).

This project implements a [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) interpreter library which can be embedded
in other projects; it is incredibly minimalistic, but usable. To build the
project a [C](https://en.wikipedia.org/wiki/C_%28programming_language%29) compiler and a copy of [Make](https://en.wikipedia.org/wiki/Make_%28software%29) are needed. For a list of
build options, type:

```
make help
```

Running:

```
make run
```

will build the interpreter and run it; it will then read from [stdin](https://en.wikipedia.org/wiki/Standard_streams).

To build the documentation other programs may be needed, such as [pandoc](http://pandoc.org/) and
the [markdown script](https://daringfireball.net/projects/markdown/), but these steps are optional.

[Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) is an odd language that has a loyal following, but it
is admittedly not the most practical of languages as it lacks nearly everything
the modern programmer wants in a language: safety, garbage collection,
modularity and clarity. It is however possible to implement a fully working
interpreter in one to two kilobytes of assembly. Those kilobytes can make a
functional and interactive programming environment, giving a high ratio of
utility to memory used.

From the [Wikipedia](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) article we can neatly summarize the language:

```
"Forth is an imperative stack-based computer programming language
and programming environment.

Language features include structured programming, reflection (the
ability to modify the program structure during program execution),
concatenative programming (functions are composed with juxtaposition)
and extensibility (the programmer can create new commands).

...

A procedural programming language without type checking, Forth features
both interactive execution of commands (making it suitable as a shell
for systems that lack a more formal operating system) and the ability
to compile sequences of commands for later execution."
```

Given the nature of the [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) language it does not make for a terribly good
embeddable scripting language, but it is simple to implement and can be fun
to use. This interpreter is based on a previous [IOCCC](http://ioccc.org/winners.html) entry in a file called
[buzzard.2.c](http://www.ioccc.org/1992/buzzard.2.c); it is a descendant of that file.

Before using and understanding this library/interpreter it is useful to check out
more literature on [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29), such as [Thinking Forth](http://thinking-forth.sourceforge.net/) by Leo Brodie for a
philosophy of the language, [Starting Forth](http://www.forth.com/starting-forth/) (by the same author), [Jonesforth](https://rwmj.wordpress.com/2010/08/07/jonesforth-git-repository/),
which is a specific implementation of the language in x86 assembly, and
[Gforth](https://www.gnu.org/software/gforth/), a more modern and portable implementation of the language.

It is important to realize that [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) is really more a philosophy and
collection of ideas than a specific reference implementation or standard.
It has been said that an intermediate [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) user is one who has implemented
a [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) interpreter, something which cannot be said about other languages,
nor is it possible given their complexity.

The saying "if you have seen one Forth implementation, you have seen one Forth
implementation" applies: nearly every single [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) implementation has its
own idea of how to go about things despite standardization efforts. In keeping
with this, this library has its own idiosyncrasies.

This implementation, written in [C](https://en.wikipedia.org/wiki/C_%28programming_language%29), can be thought of as a hybrid between a
fairly dumb stack based virtual machine with instructions such as "pop two
values off the stack, add them, and push the result" and a small
interpreter/compiler for the virtual machine. This simple kernel is then used
to build a more compliant and usable [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) implementation by defining
words that build upon those provided by the base system.

## Other documentation

Apart from this file there are other sources of information about the
project. The code itself is small enough to be comprehensible:

* [libforth.c](/howerj/libforth/blob/master/libforth.c) (contains the core interpreter)
* [libforth.h](/howerj/libforth/blob/master/libforth.h) (contains the API documentation)

And the Forth startup code:

* [forth.fth](/howerj/libforth/blob/master/forth.fth)

The startup code is well commented and shows how the core interpreter is
extended to a more functional [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) environment.

The source file [libforth.c](/howerj/libforth/blob/master/libforth.c) can be converted to a more readable webpage by
first converting the source to [markdown](https://daringfireball.net/projects/markdown/) with the [convert](/howerj/libforth/blob/master/convert) script, then converting
that to HTML in the usual fashion.

## Using the interpreter

*main.c* simply calls the function *main\_forth()* in *libforth.c*. This function
initializes a [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) environment and puts the user in a [REPL](https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop) where you
can issue commands and define words. See the manual pages for a list of command
line options and library calls. All commands are given using
[Reverse Polish Notation](https://en.wikipedia.org/wiki/Reverse_Polish_notation) (or RPN).

So:

```
2+(2*4)
```

Becomes:

```
4 2 * 2 +
```

And brackets are no longer needed. Numbers are pushed onto the variable
stack automatically and commands (such as '\*' and '+') take their operands
off the stack and push the result. Juggling variables on the stack becomes
easier over time. To pop a value from the stack and print it there is the
'.' word.

So:

```
2 2 + .
```

Prints:

```
4
```

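The stack discipline described above can be sketched in a few lines of C. This is only an illustration and not the code used in *libforth.c*: the function name `rpn_eval`, the `STACK_MAX` limit and the restriction to '+' and '\*' are all assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>

#define STACK_MAX 32 /* hypothetical limit, chosen for this sketch */

/* Evaluate a space delimited RPN expression made of integers, '+' and '*'.
 * Returns 0 and stores the top of the stack in *result on success,
 * or -1 on a malformed token, stack underflow or stack overflow. */
int rpn_eval(const char *expr, long *result)
{
	long stack[STACK_MAX];
	size_t depth = 0;
	char buf[128];
	char *token;

	if (strlen(expr) >= sizeof buf)
		return -1;
	strcpy(buf, expr);

	for (token = strtok(buf, " \t\n"); token; token = strtok(NULL, " \t\n")) {
		if (!strcmp(token, "+") || !strcmp(token, "*")) {
			long a, b;
			if (depth < 2)
				return -1; /* not enough operands on the stack */
			b = stack[--depth];
			a = stack[--depth];
			stack[depth++] = (*token == '+') ? a + b : a * b;
		} else {
			char *end;
			long n = strtol(token, &end, 10);
			if (*end != '\0' || depth >= STACK_MAX)
				return -1; /* not a number, or stack full */
			stack[depth++] = n; /* numbers are pushed automatically */
		}
	}
	if (depth == 0)
		return -1;
	*result = stack[depth - 1];
	return 0;
}
```

With this sketch, `rpn_eval("4 2 * 2 +", &r)` leaves 10 in `r`, the same value as the bracketed expression it replaces.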
The simplicity of the language allows for a small interpreter; the
loop looks something like this:

```
1) Read in a space delimited Forth WORD.
2) Is this WORD in the dictionary?
    FOUND) Are we in IMMEDIATE mode?
        IMMEDIATE-MODE) Execute WORD.
            goto 1;
        COMPILE-MODE) Compile WORD into the dictionary.
            goto 1;
    NOT-FOUND) Is this actually a number?
        YES) Are we in IMMEDIATE mode?
            IMMEDIATE-MODE) Push Number onto the stack.
                goto 1;
            COMPILE-MODE) Compile a literal number.
                goto 1;
        NO) Error! Handle error
            goto 1;
```

Given that we are reading in *space delimited words* it follows that the
above expression:

```
2 2 + .
```

Would not work if we did:

```
2 2+ .
```

Or:

```
2 2 +.
```

As "2+" and "+." would be parsed as words, which may or may not be defined
and, if they are, do not have the behavior that we want. This is more apparent
when we do any kind of string handling.

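The decision tree of the interpreter loop can be sketched as a single C function that classifies one token. This follows the simplified loop as written above, so it does not model immediate words being executed while compiling. The names `classify` and `action_t`, and the fixed three-word dictionary, are hypothetical and exist only for this example.

```c
#include <stdlib.h>
#include <string.h>

typedef enum {
	ACT_EXECUTE,         /* word found, command (immediate) mode */
	ACT_COMPILE_WORD,    /* word found, compile mode */
	ACT_PUSH_NUMBER,     /* number, command (immediate) mode */
	ACT_COMPILE_LITERAL, /* number, compile mode */
	ACT_ERROR            /* neither a known word nor a number */
} action_t;

/* Hypothetical stand-in for the dictionary: a fixed list of names. */
static const char *dictionary[] = { "+", "*", ".", NULL };

static int in_dictionary(const char *word)
{
	size_t i;
	for (i = 0; dictionary[i]; i++)
		if (!strcmp(dictionary[i], word))
			return 1;
	return 0;
}

/* Decide what the loop does with one space delimited token. */
action_t classify(const char *word, int compiling, long *number)
{
	char *end;
	if (in_dictionary(word))
		return compiling ? ACT_COMPILE_WORD : ACT_EXECUTE;
	*number = strtol(word, &end, 10);
	if (end != word && *end == '\0') /* the whole token was a number */
		return compiling ? ACT_COMPILE_LITERAL : ACT_PUSH_NUMBER;
	return ACT_ERROR;
}
```

In this sketch "2+" and "+." both end up as `ACT_ERROR`, because they are looked up as whole tokens and are neither in the dictionary nor valid numbers.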
## A Forth Word

The Forth execution model uses [Threaded Code](https://en.wikipedia.org/wiki/Threaded_code); the layout of a word
header follows from this.

A [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) word is defined in the dictionary and has a particular format that
varies between implementations. A dictionary is simply a linked list of
[Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) words; the dictionary is usually contiguous and can only grow. The
format for our [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) words is as follows:

Briefly:

* Word Header:
  * field <0 = Word Name (the name is stored before the main header)
  * field 0 = Previous Word
  * field 1 = Code Word (bits 0 - 6) | Hidden Flag (bit 7) | Word Name Offset (bits 8 - 14) | Compiling bit (bit 15)
  * field 2+ = Data field (if it exists).

And in more detail:

```
.----------------------------------------.
| Word Header                | Word Body |
.---------------.-----.------.-----------.
| NAME ...      | PWD | MISC | DATA ...  |
.---------------.-----.------.-----------.

 ____
 NAME = The name, or the textual representation, of a Forth
        word, it is a variable length field that is ASCII NUL
        terminated, the MISC field has an offset that points
        to the beginning of this field if taken off the PWD
        position (not value). The offset is in machine words,
        not characters.
 ___
 PWD  = A pointer to the previously declared word.
 ____
 MISC = A complex field that contains a CODE WORD, a
        "hide" bit and the offset from the PWD field to the
        beginning of NAME, as well as the compiling bit.
 ____
 DATA = This could be anything, but it is most likely to be
        a list of pointers to CODE WORDs of previously defined
        words if this optional DATA field is present.
```

All fields are aligned on the [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) virtual machine's word boundaries.

The MISC field is laid out as so:

```
.-------------------------------------------------------------------------------.
| <- Least Significant Bit                              Most Significant Bit -> |
.-------------------------------------------------------------------------------.
|  0 |  1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 |  9 | 10 | 11 | 12 | 13 | 14 | 15 |
.-------------------------------------------------------------------------------.
|            CODE WORD             | HD |           NAME OFFSET            | CB |
.-------------------------------------------------------------------------------.
 _________
 CODE WORD   = Bits 0-6 are a code word, this code word is always run
               regardless of whether we are in compiling or command
               mode
 __
 HD          = Bit 7 is the Hide Bit, if this is true then when
               compiling or executing words the word will be hidden from the
               search.
 ___________
 NAME OFFSET = Bits 8 to 14 are the offset to the word's name. To find the
               beginning of the word's name we take this value away from the
               position of this word's PWD header. This value is in
               machine words, and so the beginning of the NAME must be aligned
               to the virtual machine word boundaries and not character, or byte,
               aligned. The length of this field, and the size of the input buffer,
               limit the maximum size of a word.
 __
 CB          = Bit 15 is the compiling bit, if set this is a compiling word, if
               cleared it is an immediate word.
```

Depending on the virtual machine word size, or cell size, there may be more
bits above bit '15', the most significant bit, in the MISC field. These bits
are not used and should be set to zero.

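As a rough illustration of the MISC field, the following C sketch packs and unpacks the four sub-fields. It assumes the layout of bits 0-6 for the code word, bit 7 for the hide bit, bits 8-14 for the name offset and bit 15 for the compiling bit. The macro and function names are hypothetical and are not taken from *libforth.c*.

```c
#include <stdint.h>

/* Assumed layout: bits 0-6 code word, bit 7 hide bit,
 * bits 8-14 name offset, bit 15 compiling bit. */
#define CODE_MASK     0x7Fu
#define HIDE_BIT      (1u << 7)
#define OFFSET_SHIFT  8
#define OFFSET_MASK   0x7Fu
#define COMPILING_BIT (1u << 15)

/* Build a MISC value from its parts; out of range parts are truncated. */
uint16_t misc_pack(unsigned code, int hidden, unsigned name_offset, int compiling)
{
	return (uint16_t)((code & CODE_MASK)
		| (hidden ? HIDE_BIT : 0u)
		| ((name_offset & OFFSET_MASK) << OFFSET_SHIFT)
		| (compiling ? COMPILING_BIT : 0u));
}

/* Accessors for each sub-field of a MISC value. */
unsigned misc_code(uint16_t misc)        { return misc & CODE_MASK; }
int      misc_hidden(uint16_t misc)      { return (misc & HIDE_BIT) != 0; }
unsigned misc_name_offset(uint16_t misc) { return (misc >> OFFSET_SHIFT) & OFFSET_MASK; }
int      misc_compiling(uint16_t misc)   { return (misc & COMPILING_BIT) != 0; }
```

For example, a visible compiling word with code word 5 whose name starts 3 machine words before its PWD field would be packed as `misc_pack(5, 0, 3, 1)`, which is 0x8305 under the assumed layout.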
And the dictionary looks like this:

```
[ Special 'fake' word ]
           .
          /|\
           |
.-------.-----.----------------------.
| NAME  | PWD | Rest of the word ... |
.-------.-----.----------------------.
           .
          /|\
           |
         ~~~~~
The rest of the dictionary
         ~~~~~
           |
.-------.-----.----------------------.
| NAME  | PWD | Rest of the word ... |
.-------.-----.----------------------.
           .
          /|\
           |
.-------.-----.----------------------.
| NAME  | PWD | Rest of the word ... |
.-------.-----.----------------------.
           .
          /|\
           |
[ Previous Word Register ]
```

Searching of the dictionary starts from the *Previous Word Register* and ends
at a special 'fake' word.

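The search described above amounts to walking a linked list from the newest word back to the sentinel. The C sketch below shows the idea with a hypothetical `word_t` structure; the real interpreter keeps these fields as cells inside the virtual machine's memory rather than as C pointers, and here the 'fake' word is simply the entry whose `previous` pointer is NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of a dictionary entry, for illustration only. */
typedef struct word {
	const char  *name;     /* NAME */
	struct word *previous; /* PWD: previously defined word, NULL in the 'fake' word */
	int          hidden;   /* hide bit from the MISC field */
} word_t;

/* Walk from the most recently defined word back towards the 'fake' word,
 * skipping hidden words. Newer definitions shadow older ones. */
const word_t *dictionary_find(const word_t *latest, const char *name)
{
	const word_t *w;
	for (w = latest; w && w->previous; w = w->previous)
		if (!w->hidden && !strcmp(w->name, name))
			return w;
	return NULL; /* reached the special 'fake' word: not found */
}
```

Because the walk starts at the most recent definition, redefining a word leaves the old definition in place but makes the new one the first match.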
Defining words adds them to the dictionary. We can define words with the ':'
word like this:

```
: two-times 2 * ;
```

This defines the word "two-times", a word that takes a value from the stack,
multiplies it by two and pushes the result back onto the stack.

The word ':' performs multiple actions. It is an immediate word that reads in the
next space delimited word from the input stream and creates a header for that
word. It also switches the interpreter into compile mode: compiling words will
be compiled into that word definition instead of being executed, while immediate
words are executed as normal. ';' is also an immediate word; it compiles a special
word exit into the dictionary, which returns from a word call, and switches the
interpreter back into command mode. This type of behavior is typical of
[Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) implementations.

## Memory Map and Special Registers
|
||
|
||
This interpreter works by emulating an idealized machine, one built for
executing [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) directly. As such it has to make compromises and
treats certain sections of memory as being special, as shown below (numbers are
given in *hexadecimal* and are multiples of the virtual machine's word size,
which is either 16, 32 or 64 bits depending on compile time options).
|
||
|
||
Where the dictionary ends and the variable and return stacks begin depends on
|
||
how much memory was allocated to the interpreter (with a minimum of 2048
|
||
words), the default is 32768 words, and the following diagram assumes this:
|
||
|
||
```
.-----------------------------------------------.
| 0-3F      | 40-7BFF       |7C00-7DFF|7E00-7FFF|
.-----------------------------------------------.
| Registers | Dictionary... | V stack | R stack |
.-----------------------------------------------.

V stack = The Variable Stack
R stack = The Return Stack
```
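
As a rough sketch, the boundaries in the diagram can be computed from the core size. The 0x40 cell register area is taken from the diagram; sizing each stack as one sixty-fourth of the core is an assumption that happens to reproduce the diagram for the default 32768 cell core, and the names `memory_map` and `memory_map_for` are made up for this example.

```c
#include <stddef.h>

/* Region boundaries, in cells, for a core of a given size. */
struct memory_map {
	size_t dictionary_start; /* first cell after the registers */
	size_t vstack_start;     /* first cell of the variable stack */
	size_t rstack_start;     /* first cell of the return stack */
	size_t end;              /* one past the last valid cell */
};

static struct memory_map memory_map_for(size_t size)
{
	/* Assumption: each stack takes size/64 cells (0x200 by default). */
	const size_t stack_size = size / 64;
	struct memory_map m;
	m.dictionary_start = 0x40;
	m.vstack_start     = size - 2 * stack_size;
	m.rstack_start     = size - stack_size;
	m.end              = size;
	return m;
}
```

For the default core this gives a variable stack starting at 0x7C00 and a return stack starting at 0x7E00, matching the diagram.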
|
||
|
||
Each may be further divided into special sections:
|
||
|
||
### Registers
|
||
|
||
At the beginning of the Forth virtual machine's memory there is a section used
for registers. Modifying them arbitrarily can cause undefined behavior, which
will most likely cause the virtual machine to be terminated.
|
||
|
||
```
NAME            LOCATION      DESCRIPTION
                DECIMAL  HEX
                0-1      0-1  Unused
                2-5      2-5  Push integer word
DIC             6        6    Dictionary pointer
RSTK            7        7    Return stack pointer
STATE           8        8    Interpreter state; compile/command mode
BASE            9        9    Base conversion variable
PWD             10       A    Pointer to last defined word
SOURCE_ID       11       B    Input source selector (-1 = string input,
                              0 = file input)
SIN             12       C    String input pointer
SIDX            13       D    String input index (index into SIN)
SLEN            14       E    String input length (length of SIN)
START_ADDR      15       F    Pointer to start of VM
FIN             16       10   File input pointer
FOUT            17       11   File output pointer
STDIN           18       12   File pointer to stdin, if available
STDOUT          19       13   File pointer to stdout, if available
STDERR          20       14   File pointer to stderr, if available
ARGC            21       15   Count of arguments passed to program,
                              if available
ARGV            22       16   An array of pointers to NUL terminated
                              ASCII strings, if available, of ARGC
                              length
DEBUG           23       17   Turn debugging on/off if enabled
INVALID         24       18   If non zero, this interpreter is invalid
TOP             25       19   Stored version of top of stack
INSTRUCTION     26       1A   Stored version of instruction pointer
STACK_SIZE      27       1B   Size of the variable stack
ERROR_HANDLER   28       1C   Action to take on error
THROW           29       1D   Used for throw/catch
SIGNAL_HANDLER  30       1E   Used for signal handling
SCRATCH_X       31       1F   Scratch variable for the user
```
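
One way to picture the table is as a set of named indices into the start of the core's cell array. The sketch below is illustrative only: the `REG_` prefix (added to avoid clashing with other identifiers), the `cell_t` typedef and the `register_get` helper are assumptions, and libforth's own declarations may differ.

```c
#include <stdint.h>

typedef uintptr_t cell_t; /* assumption: the real cell width is a build option */

/* Register locations copied from the table above; cells 0-5 are omitted as
 * they are either unused or hold the 'push integer' word. */
enum forth_register {
	REG_DIC = 6,         REG_RSTK = 7,           REG_STATE = 8,
	REG_BASE = 9,        REG_PWD = 10,           REG_SOURCE_ID = 11,
	REG_SIN = 12,        REG_SIDX = 13,          REG_SLEN = 14,
	REG_START_ADDR = 15, REG_FIN = 16,           REG_FOUT = 17,
	REG_STDIN = 18,      REG_STDOUT = 19,        REG_STDERR = 20,
	REG_ARGC = 21,       REG_ARGV = 22,          REG_DEBUG = 23,
	REG_INVALID = 24,    REG_TOP = 25,           REG_INSTRUCTION = 26,
	REG_STACK_SIZE = 27, REG_ERROR_HANDLER = 28, REG_THROW = 29,
	REG_SIGNAL_HANDLER = 30, REG_SCRATCH_X = 31
};

/* Registers are ordinary cells at the start of the core's memory 'm'. */
static cell_t register_get(const cell_t *m, enum forth_register r)
{
	return m[r];
}
```

Reading the number base, for example, would then be `register_get(m, REG_BASE)`.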
|
||
|
||
Some registers will need more explaining.
|
||
|
||
* SIGNAL\_HANDLER
|
||
|
||
This register is used when a signal is caught. It is up to the C environment to
call *forth\_signal* from a signal handler to let the Forth interpreter know
that a signal has been caught.
|
||
|
||
* SCRATCH\_X
|
||
|
||
Scratch X is a variable that can be used by the user, be warned that other
|
||
words defined in the base library *might* use this register as well.
|
||
|
||
### Dictionary
|
||
|
||
Apart from the constraints that the dictionary begins after where the
|
||
registers are and before where V stack is there are no set demarcations
|
||
for each region, although currently the defined word region ends before
|
||
0x200 leaving room between that and 0x7BFF for user defined words.
|
||
|
||
```
.----------------------------------------------------------------.
| 40-???            | ???-???          | ???-7BFF                |
.----------------------------------------------------------------.
| Special read word | Interpreter word | Defined word ...        |
.----------------------------------------------------------------.

Special read word = A word called on entrance to the interpreter,
                    it calls itself recursively (as a tail call). This
                    word cannot be 'found', it does not have a name.
Interpreter word  = Any named (not 'invisible' ones) interpreter word
                    gets put here.
Defined word      = A list of words that have been defined with ':'
```
|
||
|
||
## Glossary of Forth words
|
||
|
||
Each word is also given with its effect on the variable stack, any other effects
|
||
are documented (including the effects on other stacks). Each entry looks like
|
||
this:
|
||
|
||
* word ( y -- z )
|
||
|
||
Where 'word' is the word being described and the contents between the
parentheses describe the stack effects. This word expects one number to be on
the stack, 'y', and returns a number to the stack, 'z'.
|
||
|
||
Some of the descriptions of the variables in the stack effect comment
|
||
have a meaning:
|
||
|
||
```
.---------.-------------------------------------------------.
| Name    | Meaning                                         |
.---------.-------------------------------------------------.
| addr    | An address in cells                             |
| c-addr  | Character address                               |
| r-addr  | A raw address                                   |
| file-id | A file id, used as a handle for file operations |
| ior     | An error status returned by file operations     |
| char    | Single byte / character                         |
| u       | An unsigned value                               |
| x       | A signed value                                  |
| c" xxx" | The word parses a word (not a stack effect)     |
.---------.-------------------------------------------------.
```
|
||
|
||
The addressing modes complicate certain operations of Forth words. A normal
Forth address, as passed to words like '!', '@', 'c@' and 'c!', is relative to
the start of the Forth core's memory, that is, to register zero. '!'
and '@' take their addresses in cells (*addr*), whereas 'c@' and 'c!' take
their addresses in characters (*c-addr*).
|
||
|
||
Raw addresses are however normal addresses, as understood by [C](https://en.wikipedia.org/wiki/C_%28programming_language%29) programs
and the computer. Operations on raw addresses can possibly result in undefined
behavior, and objects that exist outside of the Forth core cannot be
guaranteed to exist after a core file has been reloaded (that is, pointers to
objects outside of the Forth core should not be saved for further use). Raw
addresses are also in character, not cell, units.
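
The relationship between the three kinds of address can be sketched as follows. The `cell_t` typedef and both function names are assumptions made for the example; they are not taken from the libforth sources.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t cell_t; /* assumption: the real cell width is a build option */

/* Convert a cell address (as used by '@' and '!') into a character address
 * (as used by 'c@' and 'c!'). Both are relative to the start of the core. */
static size_t cell_to_char_address(size_t addr)
{
	return addr * sizeof(cell_t);
}

/* Convert a character address into a raw address, which is an ordinary C
 * pointer, given a pointer to the start of the core's memory. */
static unsigned char *char_to_raw_address(cell_t *core, size_t c_addr)
{
	return (unsigned char *)core + c_addr;
}
```

A raw address obtained this way is only meaningful while the core stays at the same place in memory, which is why such pointers should not be kept across a reload.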
|
||
|
||
In future revisions of Forth this memory addressing might be simplified, if a
|
||
simple way of doing this can be achieved.
|
||
|
||
### Internal words
|
||
|
||
There are three types of words.
|
||
|
||
#### 'Invisible' words
|
||
|
||
These invisible words have no name but are used to implement the Forth. They
|
||
are all *immediate* words.
|
||
|
||
* push ( -- x )
|
||
|
||
Push the next value in the instruction stream onto the variable stack, advancing
|
||
the instruction stream.
|
||
|
||
* const ( -- x )
|
||
|
||
Push the next location after this instruction, not the instruction stream, onto
|
||
the variable stack.
|
||
|
||
* run ( -- )
|
||
|
||
Save the current instruction stream pointer onto the return stack and set
the instruction stream pointer to point to the value after *run*.
|
||
|
||
#### Immediate words
|
||
|
||
These words are named and are *immediate* words.
|
||
|
||
* ':' ( c" xxx" -- )
|
||
|
||
Read in a new word from the input stream and compile it into the dictionary.
|
||
|
||
* 'immediate' ( -- )
|
||
|
||
Make the previously declared word immediate. Unlike in most Forth
implementations this is used after the word's name is given, not after the
final ';' has been reached.
|
||
|
||
So:
|
||
|
||
```
|
||
: word immediate ... ;
|
||
```
|
||
|
||
Instead of:
|
||
|
||
```
|
||
: word ... ; immediate
|
||
```
|
||
|
||
* '\' ( c" \n" -- )
|
||
|
||
A comment, ignore everything until the end of the line.
|
||
|
||
#### Compiling words
|
||
|
||
* 'read' ( c" xxx" -- )
|
||
|
||
*read* is a complex word that implements most of the input interpreter,
|
||
it reads in a [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) *word* (up to 31 characters), if this *word* is in
|
||
the *dictionary* it will either execute the word if we are in *command mode*
|
||
or compile a pointer to the executable section of the word if in *compile
|
||
mode*. If this *word* is not in the *dictionary* it is checked if it is a
|
||
number, if it is then in *command mode* we push this value onto the *variable
|
||
stack*, if in *compile mode* then we compile a *literal* into the *dictionary*.
|
||
If it is none of these we print an error message and attempt to read in a
|
||
new word.
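
The decision that *read* makes for each token can be sketched in C. This is a simplification under stated assumptions: the dictionary lookup is reduced to a `found` flag, numbers are parsed in base 10 only (the real interpreter consults the BASE register), and the names `read_decide`, `parse_number` and `enum action` are invented for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

enum action {
	EXECUTE_WORD, COMPILE_WORD, PUSH_NUMBER, COMPILE_LITERAL, REPORT_ERROR
};

/* Returns true, and stores the value, if the whole token is a number. */
static bool parse_number(const char *token, long *out)
{
	char *end = NULL;
	errno = 0;
	long v = strtol(token, &end, 10);
	if (errno != 0 || end == token || *end != '\0')
		return false;
	*out = v;
	return true;
}

/* 'found' is the result of a dictionary search for the token, 'immediate'
 * says whether the found word is immediate, 'compiling' is the current
 * interpreter state. */
static enum action read_decide(const char *token, bool found, bool immediate,
                               bool compiling, long *number)
{
	if (found)
		return (compiling && !immediate) ? COMPILE_WORD : EXECUTE_WORD;
	if (parse_number(token, number))
		return compiling ? COMPILE_LITERAL : PUSH_NUMBER;
	return REPORT_ERROR; /* neither a known word nor a number */
}
```

The dictionary is consulted before the number parser, so a word whose name looks like a number takes precedence over the number itself.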
|
||
|
||
* '@' ( address -- x )
|
||
|
||
Pop an address and push the value at that address onto the stack.
|
||
|
||
* '!' ( x address -- )
|
||
|
||
Given an address and a value, store that value at that address.
|
||
|
||
* 'c@' ( char-address -- char )
|
||
|
||
Pop a character address and push the character value at that address onto the
stack. Note that this access is not checked for being within range of the
virtual machine's memory, but it is still relative to the start address of
virtual machine memory.
|
||
|
||
* 'c!' ( char char-address -- )
|
||
|
||
Given a character address, store a character value at that address. Like 'c@',
the address is relative to the virtual machine's starting address.
|
||
|
||
* '-' ( x y -- z )
|
||
|
||
Pop two values, subtract 'y' from 'x' and push the result onto the stack.
|
||
|
||
* '+' ( x y -- z )
|
||
|
||
Pop two values, add 'y' to 'x' and push the result onto the stack.
|
||
|
||
* 'and' ( x y -- z )
|
||
|
||
Pop two values, compute the bitwise 'AND' of them and push the result on to
|
||
the stack.
|
||
|
||
* 'or' ( x y -- z )
|
||
|
||
Pop two values, compute the bitwise 'OR' of them and push the result on to
|
||
the stack.
|
||
|
||
* 'xor' ( x y -- z )
|
||
|
||
Pop two values, compute the bitwise 'XOR' of them and push the result on to
|
||
the stack.
|
||
|
||
* 'invert' ( x -- y )
|
||
|
||
Perform a bitwise negation on the top of the stack.
|
||
|
||
* 'lshift' ( x y -- z )
|
||
|
||
Pop two values, compute 'y' shifted by 'x' places to the left and push
|
||
the result on to the stack.
|
||
|
||
* 'rshift' ( x y -- z )
|
||
|
||
Pop two values, compute 'y' shifted by 'x' places to the right and push
|
||
the result on to the stack.
|
||
|
||
* '\*' ( x y -- z )
|
||
|
||
Pop two values, multiply them and push the result onto the stack.
|
||
|
||
* '/' ( x y -- z )
|
||
|
||
Pop two values, divide 'x' by 'y' and push the result onto the stack. If 'y'
is zero an error message is printed and 'x' and 'y' will remain on the
stack, but execution will continue on as normal.
|
||
|
||
* 'u<' ( x y -- z )
|
||
|
||
Pop two unsigned values, compare them (y < x) and push the result onto the
|
||
stack, the comparison will be unsigned.
|
||
|
||
* 'u>' ( x y -- z )
|
||
|
||
Pop two values, compare them (y > x) and push the result onto the stack. The
|
||
comparison will be unsigned.
|
||
|
||
* '\_exit' ( -- )
|
||
|
||
Pop the return stack and set the instruction stream pointer to that
|
||
value.
|
||
|
||
* 'exit' ( -- )
|
||
|
||
This does the same as **\_exit**. The reason there are two exits instead
of one is so that the word 'see', defined in **forth.fth**, can differentiate
between an exit that occurs within the definition of a word and one that occurs
at the end of a word's definition ('see' is a decompiler for Forth).
|
||
|
||
For example:
|
||
|
||
```
|
||
: test 0 begin dup 10 > if exit then dup . cr 1+ again ;
|
||
```
|
||
|
||
Gets Compiled to:
|
||
|
||
```
Address        Contents
               ._____._____._____._____.  <- Start of Word
  X            | 't' | 'e' | 's' | 't' |
               ._____._____._____._____.
  X+1          |  0  |  0  |  0  |  0  |  NUL terminates 'test' string
               ._____._____._____._____.
  X+2          | Previous Word Pointer |  AKA 'PWD' field
               ._______________________.
  X+3          | MISC Field            |  <- Execution Starts here
               ._______________________.
  X+4          | Literal               |  Literals are compiled as a pointer to
               ._______________________.  a 'literal' word and the literal in
  X+5          | 0                     |  the next field.
               ._______________________.
  X+6          | Pointer to 'dup'      |
               ._______________________.
  X+7          | literal               |
               ._______________________.
  X+8          | 10                    |
               ._______________________.
  X+9          | Pointer to '>'        |
               ._______________________.
  X+10         | Pointer to '?branch'  |  'if' gets compiled to '?branch'
               ._______________________.  and '2' so it jumps over 'exit'
  X+11         | 2                     |  if the previous test fails. This
               ._______________________.  is encoded as the jump destination
  X+12         | Pointer to 'exit'     |  less one as an increment happens
               ._______________________.  after the word is executed.
  X+13         | Pointer to 'dup'      |
               ._______________________.
  X+14         | Pointer to '.'        |
               ._______________________.
  X+15         | Pointer to 'cr'       |
               ._______________________.
  X+16         | Pointer to '1+'       |
               ._______________________.
  X+17         | Pointer to 'branch'   |
               ._______________________.
  X+18         | -12                   |
               ._______________________.
  X+19         | Pointer to '_exit'    |
               ._______________________.  <- End of Word
```
|
||
|
||
The decompiler knows that the end of a word is demarcated by a pointer to
|
||
**\_exit**, and that pointers to **exit** can occur within the body of the
|
||
definition.
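
The two jump offsets in the layout above (2 at X+11 and -12 at X+18) can be checked with a short calculation. The sketch assumes that an offset is applied relative to the cell that holds it, which is consistent with both values in the diagram; the function names are hypothetical.

```c
#include <stddef.h>

/* The cell after 'branch' or '?branch' holds an offset. Assumption: the
 * destination is the address of that operand cell plus the offset. */
static ptrdiff_t branch_target(ptrdiff_t operand_addr, ptrdiff_t offset)
{
	return operand_addr + offset;
}

/* '?branch' jumps when the popped flag is zero, otherwise execution carries
 * on with the cell after the operand. */
static ptrdiff_t conditional_branch_target(ptrdiff_t operand_addr,
                                           ptrdiff_t offset, long flag)
{
	return flag == 0 ? branch_target(operand_addr, offset)
	                 : operand_addr + 1;
}
```

Taking X as zero, the conditional jump at cell 11 lands on cell 13 (the second 'dup') when the test fails and falls through to cell 12 ('exit') otherwise, while the unconditional jump at cell 18 goes back to cell 6, the first 'dup' after the literal zero.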
|
||
|
||
* 'key' ( -- char )
|
||
|
||
Get a value from the input and put it onto the stack.
|
||
|
||
* '\_emit' ( char -- status )
|
||
|
||
Put a character to the output stream returning a success value.
|
||
|
||
* 'r>' ( -- x )
|
||
|
||
Pop a value from the return stack and push it to the variable stack.
|
||
|
||
* '>r' ( x -- )
|
||
|
||
Pop a value from the variable stack and push it to the return stack.
|
||
|
||
* 'branch' ( -- )
|
||
|
||
Jump unconditionally to the destination next in the instruction stream.
|
||
|
||
* '?branch' ( bool -- )
|
||
|
||
Pop a value from the variable stack, if it is zero then jump to the
destination next in the instruction stream, otherwise skip over it.
|
||
|
||
* '(.)' ( x -- status )
|
||
|
||
Pop a value from the variable stack and print it to the output either
as an ASCII decimal or hexadecimal value depending on the BASE register. A
return status is pushed onto the stack: greater than or equal to zero is a
success, negative is a failure. Failure can occur because of an invalid base in
the BASE register, or because the output could not be written to.
|
||
|
||
* ''' ( -- )
|
||
|
||
Push the next value in the instruction stream onto the variable stack
|
||
and advance the instruction stream pointer over it.
|
||
|
||
* ',' ( x -- )
|
||
|
||
Write a value into the dictionary, advancing the dictionary pointer.
|
||
|
||
* '=' ( x y -- z )
|
||
|
||
Pop two values, perform a test for equality and push the result.
|
||
|
||
* 'swap' ( x y -- y x )
|
||
|
||
Swap two values on the stack.
|
||
|
||
* 'dup' ( x -- x x )
|
||
|
||
Duplicate a value on the stack.
|
||
|
||
* 'drop' ( x -- )
|
||
|
||
Drop a value.
|
||
|
||
* 'over' ( x y -- x y x )
|
||
|
||
Duplicate the value that is next on the stack.
|
||
|
||
* 'find' ( -- execution-token )
|
||
|
||
Find a word in the dictionary pushing a pointer to that word onto the
|
||
variable stack.
|
||
|
||
* 'depth' ( -- depth )
|
||
|
||
Push the current stack depth onto the stack, the value is the depth of the
|
||
stack before the depth value was pushed onto the variable stack.
|
||
|
||
* 'sp@' ( -- addr )
|
||
|
||
Push the address of the stack pointer onto the stack, before **sp@** was
|
||
executed:
|
||
|
||
```
|
||
1 2 sp@ . . .
|
||
```
|
||
|
||
Prints:
|
||
|
||
```
|
||
2 2 1
|
||
```
|
||
|
||
* 'sp!' ( addr -- )
|
||
|
||
Set the address of the stack pointer.
|
||
|
||
* 'clock' ( -- x )
|
||
|
||
Push the difference between the startup time and now, in milliseconds. This
can be used for timing and implementing sleep functionality. The counter
will not increase while the interpreter is blocking and waiting for input,
although this is implementation dependent.
|
||
|
||
* 'evaluator' ( c-addr u 0 | file-id 0 1 -- x )
|
||
|
||
This word is a primitive used to implement 'evaluate' and 'include-file', it
|
||
takes a boolean to decide whether it will read from a file (1) or a string (0),
|
||
and then takes either a forth string, or a **file-id**.
|
||
|
||
* 'system' ( c-addr u -- status )
|
||
|
||
Execute a command with the system's command interpreter.
|
||
|
||
* 'raise' ( u -- ??? )
|
||
|
||
Raises a signal, what happens after the signal is raised is undefined.
|
||
|
||
* 'date' ( -- date )
|
||
|
||
Push the date onto the stack, the order of the values is:

```
Is day light savings time?
Days since start of year
Week day
Year
Month
Day of month
Hour
Minutes
Seconds (note, this can include leap seconds!)
```

The time is in UTC.
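
The nine values correspond closely to the fields of the C `struct tm`, so the word can be sketched with `gmtime`. This is an illustration, not the library's code: the function name `date_fields`, the output array, and the choice to report a four digit year and a month numbered from one are assumptions.

```c
#include <stddef.h>
#include <time.h>

/* Fill 'out' with the date values in the order listed above.
 * Returns 0 on success and -1 if the time could not be converted.
 * Note that gmtime() uses static storage and so is not thread safe. */
static int date_fields(time_t t, long out[9])
{
	const struct tm *utc = gmtime(&t);
	if (utc == NULL)
		return -1;
	out[0] = utc->tm_isdst;       /* is day light savings time? */
	out[1] = utc->tm_yday;        /* days since start of year */
	out[2] = utc->tm_wday;        /* week day, Sunday is zero */
	out[3] = utc->tm_year + 1900; /* assumption: full year */
	out[4] = utc->tm_mon + 1;     /* assumption: January is one */
	out[5] = utc->tm_mday;        /* day of month */
	out[6] = utc->tm_hour;
	out[7] = utc->tm_min;
	out[8] = utc->tm_sec;         /* may be 60 during a leap second */
	return 0;
}
```

Calling `date_fields(time(NULL), out)` fills `out` with the current UTC date.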
|
||
|
||
* 'memory-copy' ( r-addr1 r-addr2 u -- )
|
||
|
||
Operates on two raw addresses. Copy 'u' characters from r-addr2 to r-addr1.
|
||
|
||
* 'memory-locate' ( r-addr char u -- r-addr | 0 )
|
||
|
||
Locate a character in a block of memory 'u' characters wide, returning a pointer to
|
||
that character or zero if that address cannot be found.
|
||
|
||
* 'memory-set' ( r-addr char u -- )
|
||
|
||
Set 'u' characters of memory starting at 'r-addr' to 'char'.
|
||
|
||
* 'memory-compare' ( r-addr1 r-addr2 u -- x )
|
||
|
||
Compare two blocks of memory 'u' units wide.
|
||
|
||
* 'allocate' ( u -- r-addr status )
|
||
|
||
Allocate a block of memory.
|
||
|
||
* 'free' ( r-addr -- status )
|
||
|
||
Free a block of memory.
|
||
|
||
* 'getenv' ( c-addr u -- r-addr u )
|
||
|
||
Get an [environment variable](https://en.wikipedia.org/wiki/Environment_variable) given a string, it returns '0 0' if the
|
||
variable was not found.
|
||
|
||
##### File Access Words
|
||
|
||
The following compiling words are part of the File Access Word set. A few of
the fields in the stack comments need explaining: "file-id" refers to a
previously opened file as returned by "open-file", "ior" refers to a return
status provided by the file operations and "fam" is a file access method.
|
||
|
||
* 'close-file' ( file-id -- ior )
|
||
|
||
Close an already opened file.
|
||
|
||
* 'open-file' ( c-addr u fam -- file-id ior )
|
||
|
||
Open a file, given a Forth string (the 'c-addr' and the 'u' arguments), and a
file access method, which is defined within "forth.fth". Possible file access
methods are "r/o", "r/w" and "w/o" for read only, read-write and write only
respectively.
|
||
|
||
* 'delete-file' ( c-addr u -- ior )
|
||
|
||
Delete a file on the file system given a Forth string.
|
||
|
||
* 'read-file' ( c-addr u file-id -- ior )
|
||
|
||
Read in 'u' characters into 'c-addr' given a file identifier.
|
||
|
||
* 'write-file' ( c-addr u file-id -- ior )
|
||
|
||
Write 'u' characters from 'c-addr' to a given file identifier.
|
||
|
||
* 'file-position' ( file-id -- ud ior )
|
||
|
||
Get the file position offset from the beginning of the file given a file
|
||
identifier.
|
||
|
||
* 'reposition-file' ( ud file-id -- ior )
|
||
|
||
Reposition a file's offset relative to the beginning of the file given a file
identifier.
|
||
|
||
* 'flush-file' ( file-id -- ior )
|
||
|
||
Attempt to flush any buffered information written to a file.
|
||
|
||
* 'rename-file' ( c-addr1 u1 c-addr2 u2 -- ior )
|
||
|
||
Rename a file on the file system named by the first string ('c-addr1' and 'u1')
|
||
to the second string ('c-addr2' and 'u2').
|
||
|
||
* 'temporary-file' ( -- file-id ior )
|
||
|
||
Open up a new temporary file for writing and reading.
|
||
|
||
### Defined words
|
||
|
||
Defined words are ones which have been created with the ':' word, some words
|
||
get defined before the user has a chance to define their own to make their
|
||
life easier.
|
||
|
||
* 'state' ( -- addr )
|
||
|
||
Push the address of the register that controls the interpreter state onto
|
||
the stack, this value can be written to put the interpreter into compile
|
||
or command modes.
|
||
|
||
* ';' ( -- )
|
||
|
||
Write 'exit' into the dictionary and switch back into command mode.
|
||
|
||
* 'base' ( -- addr )
|
||
|
||
This pushes the address of a variable used for input and output conversion of
|
||
numbers, this address can be written to and read, valid numbers to write are 0
|
||
and 2 to 36 (*not* 1).
|
||
|
||
* 'pwd' ( -- pointer )
|
||
|
||
Pushes a pointer to the previously defined word onto the stack.
|
||
|
||
* 'h' ( -- pointer )
|
||
|
||
Push a pointer to the dictionary pointer register.
|
||
|
||
* 'r' ( -- pointer )
|
||
|
||
Push a pointer to the return stack pointer register.
|
||
|
||
* 'here' ( -- dictionary-pointer )
|
||
|
||
Push the current dictionary pointer (equivalent to "h @").
|
||
|
||
* '[' ( -- )
|
||
|
||
Immediately switch into command mode.
|
||
|
||
* ']' ( -- )
|
||
|
||
Switch into compile mode
|
||
|
||
* '>mark' ( -- location )
|
||
|
||
Write zero into the head of the dictionary and advance the dictionary pointer,
then push the address of the zero that was written. This is usually used
in a word definition that changes the control flow, after a branch for
example.
|
||
|
||
* ':noname' ( -- execution-token )
|
||
|
||
This creates a word header for a word without a name and switches to compile
mode, the usual ';' finishes the definition. It pushes an execution token onto
the stack that can be written into the dictionary and run, or executed directly.
|
||
|
||
* 'if' ( bool -- )
|
||
|
||
Begin an if-else-then statement. If the top of stack is true then we
|
||
execute all between the if and a corresponding 'else' or 'then', otherwise
|
||
we skip over it.
|
||
|
||
Abstract Examples:
|
||
|
||
```
|
||
: word ... bool if do-stuff ... else do-other-stuff ... then ... ;
|
||
|
||
: word ... bool if do-stuff ... then ... ;
|
||
```
|
||
|
||
And a concrete example:

```
: test-word if 2 2 + . cr else 3 3 * . cr then ;
0 test-word
9 # prints 9
1 test-word
4 # prints 4
```

This is a simple and contrived example.
|
||
|
||
* 'else' ( -- )
|
||
|
||
See 'if'.
|
||
|
||
* 'then' ( -- )
|
||
|
||
See 'if'.
|
||
|
||
* 'begin' ( -- )
|
||
|
||
This marks the beginning of a loop.
|
||
|
||
* 'until' ( bool -- )
|
||
|
||
Loop back to the corresponding 'begin' if the top of the stack is zero, continue
|
||
on otherwise.
|
||
|
||
* "')'" ( -- char )
|
||
|
||
Push the number representing the ')' character onto the stack.
|
||
|
||
* 'tab' ( -- )
|
||
|
||
Print a tab.
|
||
|
||
* 'cr' ( -- )
|
||
|
||
Prints a newline.
|
||
|
||
* '(' ( -- )
|
||
|
||
This will read the input stream until encountering a ')' character, it
|
||
is used for comments.
|
||
|
||
* 'allot' ( amount -- )
|
||
|
||
Allocate a number of cells in the dictionary.
|
||
|
||
* 'tuck' ( x y -- y x y )
|
||
|
||
The stack comment documents this word entirely.
|
||
|
||
* 'nip' ( x y -- y )
|
||
|
||
The stack comment documents this word entirely.
|
||
|
||
* 'rot' ( x y z -- z x y )
|
||
|
||
The stack comment documents this word entirely. This word rotates three items
|
||
on the variable stack.
|
||
|
||
* '-rot' ( x y z -- y z x )
|
||
|
||
The stack comment documents this word entirely. This word rotates three items
|
||
on the variable stack, in the opposite direction of "rot".
|
||
|
||
* 'emit' ( x -- )
|
||
|
||
Write a single character out to the output stream.
|
||
|
||
* '?' ( addr -- )
|
||
|
||
Print the contents of addr to the screen.
|
||
|
||
## Library of Forth words
|
||
|
||
The file [forth.fth](/howerj/libforth/blob/master/forth.fth) contains many defined words, however those words are
|
||
documented within that file and so as to avoid duplication will not be
|
||
mentioned here. This file is *not* loaded automatically, and so should be run
|
||
like this:
|
||
|
||
Unix:
|
||
|
||
```
|
||
./forth -t forth.fth
|
||
```
|
||
|
||
Windows
|
||
|
||
```
|
||
forth.exe -t forth.fth
|
||
```
|
||
|
||
## Glossary of Forth terminology
|
||
|
||
* Word vs Machine-Word
|
||
|
||
Usually in computing a 'word' refers to the natural length of an integer in a
machine; the term 'machine word' is used to invoke this specific meaning.
A word in [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) is more analogous to a function, but there
are different types of Forth words: *immediate* and *compiling* words,
*internal* and *defined* words and finally *visible* and *invisible* words.
|
||
|
||
The distinction between a machine word and a Forth word
|
||
can lead to some confusion.
|
||
|
||
* *The* dictionary
|
||
|
||
There is only one dictionary in a normal [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) implementation, it is a
|
||
data structure that can only grow in size (or at least it can in this
|
||
implementation) and holds all of the defined words.
|
||
|
||
* *The* stack
|
||
|
||
When we refer to a stack, or the stack, we mean the variable stack
unless otherwise stated (such as the return stack). The variable stack, or the
stack, holds the results of recent operations such as addition or subtraction.
|
||
|
||
* The return stack
|
||
|
||
Forth implementations are two (or more) stack machines. The second stack
is the return stack, which holds the usual function call return addresses as
well as temporary variables.
|
||
|
||
* Defined Words
|
||
|
||
A defined word is one that is not implemented directly by the interpreter but
has been created with the ':' word. It can be an *immediate* word, but does
not have to be.
|
||
|
||
* Compile mode
|
||
|
||
In this mode we *compile* words unless those words are *immediate* words,
if they are then we immediately execute them.
|
||
|
||
* Command mode
|
||
|
||
In this mode words are executed, regardless of whether they are *immediate*
or *compiling* words, and numbers are pushed on to the stack.
|
||
|
||
* A block.
|
||
|
||
A [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) block is a primitive way of managing persistent storage and this
version of the block interface is more primitive than most. A block is a
contiguous range of bytes, usually 1024 of them as in this instance, and
they can be written to or read from disk. Flushing of dirty blocks is not
performed in this implementation and must be done 'manually'.
|
||
|
||
## Porting this interpreter
|
||
|
||
The interpreter code is written in [C99](https://en.wikipedia.org/wiki/C99), and is written to be portable. However,
porting it to embedded platforms that lack a C standard library (which is most
of them) would mean replacing most of the C standard library functions used,
and implementing a new I/O mechanism for reading, printing and block storage.
|
||
|
||
The interpreter has been tested on the following platforms:
|
||
|
||
* Linux x86-64 with,
|
||
+ Debian Jessie (8.x)
|
||
+ GCC version 4.9.2
|
||
* Windows 7 x86-64 (not recently)
|
||
* Linux ARM 32-bit Little Endian (not recently)
|
||
* OSX Sierra 10.12.1 (tested by Rikard Lang).
|
||
|
||
And the different virtual machine word size options (32 and 64 bit machine
|
||
words) have been tested. There is no reason it should not also work on 16-bit
|
||
platforms.
|
||
|
||
libforth is also available as a [Linux Kernel Module](http://tldp.org/LDP/lkmpg/2.6/html/), on a branch of libforth,
see <https://github.com/howerj/libforth/tree/linux-kernel-module>. This
module is very experimental, and it is quite possible that it will make your
system unstable.
|
||
|
||
## Standards compliance
|
||
|
||
This Forth interpreter is in no way compliant with any of the standards
|
||
relating to Forth, such as [ANS Forth](http://lars.nocrew.org/dpans/dpans.htm), or previous Forth standardization
|
||
efforts. However attempts to support words and behavior typical of these
|
||
standards are made (although there is still a lot to do).
|
||
|
||
Some important deviations are:
|
||
|
||
* ok
|
||
|
||
'ok' is not printed after a successful command execution. This is for two
reasons: firstly because of limitations in the implementation, and secondly
because there is no reason for cluttering up the output window with this. The
implementation should be silent by default.
|
||
|
||
* Control structures
|
||
|
||
Various control structures in Forth behave slightly differently than in
|
||
normal Forths. This should be fixed, given time.
|
||
|
||
## forth.core magic
|
||
|
||
The [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29) interpreter's core files can be detected by the [file](https://linux.die.net/man/1/file) utility
by creating a [magic](https://linux.die.net/man/5/magic) file.
|
||
|
||
Save the following file on [Unix](/howerj/libforth/blob/master/www.opengroup.org/unix) systems to */etc/magic*:
|
||
|
||
```
# libforth core files start with 0xFF '4' 'T' 'H'
0    string  \xFF\x34\x54\x48  libforth core file
>4   byte    2                 16-bit
>4   byte    4                 32-bit
>4   byte    8                 64-bit
## File version, version 4 is current as of 2017/03/08
>5   byte    x                 version=[%d]
>5   byte    <4                ancient
>5   byte    4                 current
>5   byte    >4                futuristic
## Endianess test
>6   byte    0                 big-endian
>6   byte    1                 little-endian
>6   byte    >1                INVALID-ENDIANESS
## Size is stored as the base-2 logarithm of the size
>7   byte    x                 size=[2^%d]
## Extra tests could be added, such as whether the core file is still valid
```
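
The same eight byte layout can be checked from C. The sketch below follows only what the magic file states (the four magic bytes, then cell size, version, endianness and the base-2 logarithm of the size); the `core_header` struct and the `core_header_parse` function are hypothetical and are not part of the libforth API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct core_header {
	unsigned cell_size;  /* bytes per cell: 2, 4 or 8 */
	unsigned version;    /* file format version */
	bool little_endian;  /* byte 6: 0 = big, 1 = little */
	unsigned log2_size;  /* core size is 2 to the power of this value */
};

/* Returns true and fills 'out' if 'bytes' starts with a plausible header. */
static bool core_header_parse(const uint8_t *bytes, size_t length,
                              struct core_header *out)
{
	static const uint8_t magic[4] = { 0xFF, '4', 'T', 'H' };
	if (length < 8 || memcmp(bytes, magic, sizeof magic) != 0)
		return false;
	if (bytes[4] != 2 && bytes[4] != 4 && bytes[4] != 8)
		return false; /* unknown cell size */
	if (bytes[6] > 1)
		return false; /* invalid endianness marker */
	out->cell_size     = bytes[4];
	out->version       = bytes[5];
	out->little_endian = bytes[6] == 1;
	out->log2_size     = bytes[7];
	return true;
}
```

A caller would read the first eight bytes of a core file and pass them in; rejecting unknown cell sizes and endianness markers mirrors the tests in the magic file.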
|
||
|
||
## Coding Standards
|
||
|
||
The coding standards used for both the C and Forth code deviate from what is
|
||
normal, the Forth code mostly so. Nonetheless they should be abided by
|
||
regardless of what one may think of them. The C code is also written in a
|
||
literate style (although the documentation is generated from it, rather than
|
||
the code being generated from the documentation), this imposes extra
|
||
requirements on comments.
|
||
|
||
### C Coding Standards
|
||
|
||
The libforth Forth interpreter is primarily a library first and should be
|
||
thought of as such, it is possible that multiple instances of the Forth
|
||
interpreter are active at the same time, as such absolutely no global state can
|
||
be used within the Forth library. Be very careful in what functions you export,
|
||
by default all functions should be declared as static.
|
||
|
||
Global state can be used in the [main.c][] file which contains a wrapper
|
||
around libforth. This wrapper is used to make the **forth** executable, and
|
||
there is only ever one instance of the interpreter in use at a time.
|
||
|
||
The library is also written in [C99](https://en.wikipedia.org/wiki/C99) and made to be portable. Any
|
||
non-portable code should be placed in [main.c][] and should be optionally
|
||
compiled only if the correct macro for that functionality is defined. The
|
||
macros **\_WIN32** and **\_\_unix\_\_** are currently used to test for the
|
||
operating system in use for example.
|
||
|
||
The program [doxygen](https://en.wikipedia.org/wiki/Doxygen) is used to extract and build the API documentation for
libforth. At minimum all functions and macros that are exported by the library
should be documented using [doxygen](https://en.wikipedia.org/wiki/Doxygen) style markup.
|
||
|
||
All functions exported by [libforth.c](/howerj/libforth/blob/master/libforth.c) should be tested, and should be
|
||
prefixed with "forth\_". Unit tests against [libforth.c](/howerj/libforth/blob/master/libforth.c) belong in
|
||
[unit.c](/howerj/libforth/blob/master/unit.c). The same portability guidelines that apply to [libforth.c](/howerj/libforth/blob/master/libforth.c)
|
||
also apply to [unit.c](/howerj/libforth/blob/master/unit.c).
|
||
|
||
Try to use assertions and checks on code liberally, even if the check is
|
||
expensive to do. Expensive checks should be disabled by defining the
**NDEBUG** macro, just as **assert** is.
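
One way this could look, as a sketch with made-up names (`find` and `is_sorted` are not libforth functions): the costly precondition check and its helper are only compiled when **NDEBUG** is not defined.

```c
#include <assert.h>
#include <stddef.h>

#ifndef NDEBUG
/* Expensive O(n) check: is the array sorted in ascending order? */
static int is_sorted(const int *a, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (a[i - 1] > a[i])
			return 0;
	return 1;
}
#endif

/* Binary search over a sorted array; returns the index or -1 if absent. */
long find(const int *a, size_t n, int key)
{
	assert(a);
#ifndef NDEBUG
	/* Costly precondition check, compiled out along with assert(). */
	assert(is_sorted(a, n));
#endif
	size_t lo = 0, hi = n;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (a[mid] == key)
			return (long)mid;
		if (a[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}
```

Building with `-DNDEBUG` removes both the check and the helper, so a release build pays nothing for it.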
|
||
|
||
The Linux kernel coding standards are to be roughly followed. Lines should be
limited to 80 characters (this matters all the more given that documentation is
generated from the source).
|
||
|
||
The main exception to the rule on code indentation is the **forth\_run**
|
||
function, which mostly consists of a large switch statement that implements the
|
||
Forth virtual machine.
|
||
|
||
Below is a cut down version of the switch statement:
|
||
|
||
```
switch(w) { /* switches on instruction */
/* ...More instructions... */
case DEFINE:
	m[STATE] = 1; /* compile mode */
	if(forth_get_word(o, o->s) < 0)
		goto end;
	compile(o, RUN, (char*)o->s, true);
	break;
/* ...More instructions... */
case ADD: f = *S-- + f; break;
case AND: f = *S-- & f; break;
case OR:  f = *S-- | f; break;
case XOR: f = *S-- ^ f; break;
case INV: f = ~f;       break;
/* ...More instructions... */
}
```
|
||
|
||
Note that simple instructions like **ADD** and **OR** take up a single line;
having worked on the virtual machine for a while, I find this aids readability.
The **DEFINE** instruction is a lot longer and so is split across multiple
lines.
|
||
|
||
An [AWK](https://en.wikipedia.org/wiki/AWK) script, specifically [GAWK](https://www.gnu.org/software/gawk/), is used to turn the [C](https://en.wikipedia.org/wiki/C_%28programming_language%29) code into a
single [PDF](https://en.wikipedia.org/wiki/Portable_Document_Format) document, by first generating [markdown](https://daringfireball.net/projects/markdown/) from it. The script,
called [convert](/howerj/libforth/blob/master/convert), is simple. By default it indents any [C](https://en.wikipedia.org/wiki/C_%28programming_language%29) code,
which in [markdown](https://daringfireball.net/projects/markdown/) means it is treated as a code block. The command
"/\*\*" turns the indentation off and the command "\*\*/" turns it back on.
These double as [C](https://en.wikipedia.org/wiki/C_%28programming_language%29) comment delimiters, and also make [doxygen](https://en.wikipedia.org/wiki/Doxygen) look for
tags within the comments. The script only looks for comment commands at the
beginning of a line.
|
||
|
||
The following [C](https://en.wikipedia.org/wiki/C_%28programming_language%29) program:
|
||
|
||
```
/**
## A title using markdown syntax
This comment will not be indented in the resulting mark down
file and can be used to describe any of the following code.
**/
int main(void)
{
	printf("Hello, World\n"); /* a normal C comment */
	return 0;
}
```
|
||
|
||
Will be turned into:
|
||
|
||
```
## A title using markdown syntax
This comment will not be indented in the resulting mark down
file and can be used to describe any of the following code.
<NEW-LINE>
	int main(void)
	{
		printf("Hello, World\n"); /* a normal C comment */
		return 0;
	}
```
|
||
|
||
The comment blocks are used to explain the code in detail.
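
The toggling behaviour described above can be sketched in C, purely as an illustration of the logic. This is not the real script, which is written in AWK. Indenting with a single tab and emitting an empty line for the two marker lines are assumptions made for this sketch, as is the function name `convert_line`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Convert one source line (without its trailing newline) to markdown.
 * '*indent' holds the current mode: non-zero means lines are C code and
 * get indented. Returns what snprintf returns. */
int convert_line(const char *line, int *indent, char *out, size_t size)
{
	assert(line && indent && out);
	if (!strncmp(line, "/**", 3)) { /* comment opens: stop indenting */
		*indent = 0;
		return snprintf(out, size, "\n");
	}
	if (!strncmp(line, "**/", 3)) { /* comment closes: indent again */
		*indent = 1;
		return snprintf(out, size, "\n");
	}
	/* Markers are only recognised at the start of a line, as above. */
	return snprintf(out, size, "%s%s\n", *indent ? "\t" : "", line);
}
```

A caller would start with `indent` set to 1, since code is indented by default, and feed the file through one line at a time.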
|
||
|
||
### Forth Coding Standards
|
||
|
||
The way code is formatted is idiosyncratic for Forth; the main difference is
that tabs are used to indent code. Spaces and new lines are used to break up
longer word definitions. A short definition should take up only two lines,
like so:
|
||
|
||
```
: foo ( n -- n : add bar to 'n')
	bar + ;
```
|
||
|
||
All functions exposed to the user should be unit tested (which currently is not
|
||
the case, but that is not an excuse not to test them). Unit tests go in the
|
||
file [unit.fth](/howerj/libforth/blob/master/unit.fth).
|
||
|
||
Sections of related code should be enclosed with decoration that makes it
|
||
obvious that the code belongs together, like so:
|
||
|
||
```
( ============ The Foo Library ============ )
( The Foo library implements the standard
  words 'foo', 'bar' and 'foobar' )
( ... )
: foo ( n -- n )
	bar + ;
( ... )
( ============ The Foo Library ============ )
```
|
||
|
||
The standard way of making a word immediate in Forth is to place the word
|
||
**immediate** after a word definition. This is still possible to do in
libforth; however, the preferred method is to place it just after the name of
the word being defined:
|
||
|
||
```
: example immediate ; ( preferred way )
: example ; immediate ( traditional way )
```
|
||
|
||
I prefer the first way of making a word immediate as it implies that it is part
|
||
of the word definition and not something that acts on the word.
|
||
|
||
## Bugs
|
||
|
||
As mentioned in the standards compliance section, this Forth does things in a
|
||
non-standard way. Apart from that:
|
||
|
||
* Passing invalid pointers to instructions like **open-file** or **system** can
|
||
cause undefined behavior (your program will most likely crash). There is no
|
||
simple way to handle this (apart from not doing it).
|
||
* The core interpreter does not currently make use of the throw and catch
|
||
mechanism when handling certain errors (like division by zero), in effect there
|
||
are two error handlers. These mechanisms need unifying.
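
As a sketch of what a single error path could look like on the C side, under the assumption that it is built on `setjmp` and `longjmp`. The names `vm_t`, `vm_throw` and `vm_catch_divide` are hypothetical and this is not how libforth is currently written. The code -10 is the ANS Forth throw code for division by zero.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical interpreter instance; not libforth's real structure. */
typedef struct {
	jmp_buf *handler; /* innermost active catch, null if none */
	int error;        /* code of the most recent throw */
} vm_t;

/* Throw a non-zero error code to the innermost catch. */
static void vm_throw(vm_t *vm, int code)
{
	assert(vm && vm->handler && code != 0);
	vm->error = code;
	longjmp(*vm->handler, 1);
}

/* Example primitive that fails through the single error path. */
static long vm_divide(vm_t *vm, long a, long b)
{
	if (b == 0)
		vm_throw(vm, -10); /* does not return */
	return a / b;
}

/* Run a division under a catch; returns 0 or the thrown code. */
int vm_catch_divide(vm_t *vm, long a, long b, long *result)
{
	jmp_buf here;
	jmp_buf *previous = vm->handler;
	int code = 0;
	assert(vm && result);
	vm->handler = &here;
	if (setjmp(here) == 0)
		*result = vm_divide(vm, a, b);
	else
		code = vm->error;
	vm->handler = previous; /* restore the outer handler on both paths */
	return code;
}
```

Because the handler is saved and restored, catches can nest, and every primitive reports failure the same way instead of through a second mechanism.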
|
||
|
||
## To-Do
|
||
|
||
* A few environment variables could be used to specify start up files for the
|
||
interpreter and user specific startup files.
|
||
* Signal handling should be improved.
|
||
* Error handling could be improved - the latest word definition should be
erased if an error occurs before the terminating ';'. And trap handling
should be done in pure Forth, instead of as the hybrid it currently is.
|
||
* A series of simple Unix utilities, like in <https://github.com/howerj/multicall>
|
||
could be made in Forth, this would demonstrate that this Forth is usable
|
||
and practical. Tools like "cat", "tr", "grep", and various other text
|
||
formatting tools could be done in pure C, others like "kill", "mount" or "ls"
|
||
would require operating system dependent code to be added, this would need
|
||
to be contained somehow, so the core remains pure C.
|
||
* Allow the main memory to be resized. A Forth primitive MORE-CORE could be
|
||
added.
|
||
* Move the stack space so it is before the dictionary, this would mean we
|
||
could write to the end of memory without worrying about hitting the stack.
|
||
* Make a Debian package for the Forth interpreter.
|
||
* For a Forth only related "To-Do" list see the end of the file [forth.fth](/howerj/libforth/blob/master/forth.fth).
|
||
* A compiler for the virtual machine itself should be made, as a separate
|
||
program. This could be used to make a more advanced read-evaluate loop.
|
||
* Core files are currently not portable across machines of different word
sizes or endianness, which needs addressing, or ignoring conclusively.
|
||
* Use either liballocs or libffi for adding foreign function interfaces
|
||
to programs, see:
|
||
+ <https://github.com/stephenrkell/liballocs>
|
||
+ <https://sourceware.org/libffi/>
|
||
* The Forth could be made more standards compliant, especially when it comes
to I/O and addressing, but also in things like CASE statements.
|
||
* A document (part of this document) for the coding standards used in this
|
||
document needs to be created and added here. Things that will need
|
||
documenting are:
|
||
|
||
* What comments should look like (literate and doxygen comments)
|
||
* Tabs vs Spaces (use tabs)
|
||
* Levels of indentation
|
||
* Coding standards for Forth (stack comments, indentation, etcetera).
|
||
* And much more.
|
||
|
||
* Improve error handling by adding 'throw' and 'catch' to the virtual
|
||
machine as both C functions callable within the confines of 'libforth.c' and
|
||
as virtual machine instructions.
|
||
* To fix the mismatch that currently exists between C addresses and Forth
addresses, all modes of addressing should be made relative, or be fixed up by
**forth\_init**. Currently memory outside of the Forth address space can only
be accessed by copying it between the C and Forth address spaces with words
that accept real addresses (like **memory-copy**).
|
||
|
||
* Branches use relative addressing at the moment.
|
||
* Register values are either fixed up at initialization (as in the case
of **stdin**, **stdout** and a few other pointers) or need fixing (like the
variable stack pointer, the return stack pointer, the HERE pointer and more).
|
||
* How the constants are defined will need to be changed (this will be the
|
||
most difficult thing to fix).
|
||
* See: <https://en.wikipedia.org/wiki/Position-independent_code>
|
||
* And: <https://en.wikipedia.org/wiki/Relocation_%28computing%29>
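
The relative addressing idea in the last item can be sketched as follows. This is an illustration under assumptions, not libforth's code: `cell_t`, `image_t` and both function names are invented here. Forth addresses are kept as cell offsets from the base of memory and only turned into C pointers at the point of use, so an image stays valid after it is copied or loaded elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t cell_t; /* assumed cell type for this sketch */

/* Hypothetical image: 'm' is the base of Forth memory, 'size' is in cells. */
typedef struct {
	cell_t *m;
	size_t size;
} image_t;

/* Turn a relative (Forth) address into a real C pointer, bounds checked. */
cell_t *to_c_address(const image_t *i, cell_t forth_address)
{
	assert(i && i->m);
	if (forth_address >= i->size)
		return NULL;
	return i->m + forth_address;
}

/* Turn a C pointer inside the image back into a relative address. */
cell_t to_forth_address(const image_t *i, const cell_t *p)
{
	assert(i && p >= i->m && p < i->m + i->size);
	return (cell_t)(p - i->m);
}
```

A pointer stored inside the image as a relative address still resolves correctly after the whole image has been moved, which is what makes the scheme position independent.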
|
||
|
||
### Virtual Machine To-Do points
|
||
|
||
The virtual machine needs changing, in some ways drastically, to accommodate the
|
||
following behaviors:
|
||
|
||
1. Error handling should be purely Forth based.
|
||
|
||
Currently error handling and recovery is a weird mishmash between [C](https://en.wikipedia.org/wiki/C_%28programming_language%29) and
[Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29); more of the error handling should be moved to the [Forth](https://en.wikipedia.org/wiki/Forth_%28programming_language%29)
environment using the normal THROW/CATCH mechanisms (which should be turned
into virtual machine instructions and C functions, which can then be called to
make a throw occur).
|
||
|
||
2. The virtual machine should use character based addressing. Currently it uses
|
||
cell based addressing, which causes all kinds of confusion.
|
||
3. Words whose definition is not yet complete need to be hidden until the terminating ';'.
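
To make the difference in point 2 concrete, here is a small sketch of converting between the two addressing schemes. The type `cell_t` and the function names are assumptions for this example and are not taken from libforth.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t cell_t; /* assumed cell type for this sketch */

/* Cell address -> character address: scale by the cell size. */
cell_t cell_to_char(cell_t cell_address)
{
	return cell_address * sizeof(cell_t);
}

/* Character address -> cell address; the address must be cell aligned. */
cell_t char_to_cell(cell_t char_address)
{
	assert(char_address % sizeof(cell_t) == 0);
	return char_address / sizeof(cell_t);
}

/* Fetch one character given a character address into cell memory. */
unsigned char char_fetch(const cell_t *m, cell_t char_address)
{
	assert(m);
	return ((const unsigned char *)m)[char_address];
}
```

With character addressing every address can name a single byte, so words that work on strings no longer need a separate kind of address from words that work on cells.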
|
||
|
||
### Submodules
|
||
|
||
The [main.c][] file depends, optionally, on various submodules. These
|
||
submodules need improving.
|
||
|
||
#### libline
|
||
|
||
* libline needs porting to Windows
|
||
* The API needs improving so there is more control on whether or not raw mode
|
||
is turned on or off, whether a terminal is being read from or not, ...
|
||
|
||
### Experimental single binary
|
||
|
||
The build system can be used to build a single binary which contains the
|
||
contents of "forth.fth", with the command "make libforth". This is a
|
||
multistage build process that is experimental, so should not be relied upon. It
|
||
involves a bootstrapping process using the first forth executable to build a
|
||
second.
|
||
|
||
The process works like this:
|
||
|
||
1. The executable "forth" is made.
|
||
|
||
    make forth
|
||
|
||
or:
|
||
|
||
```
gcc -std=c99 main.c unit.c libforth.c -o forth
```
|
||
|
||
2. The executable is used to generate a core file containing the contents of
"forth.fth":
|
||
|
||
    ./forth -s forth.core
|
||
3. A forth word, *core2c* is used to generate a C file called "core.gen.c":
|
||
|
||
    ./forth -l forth.core -e 'c" forth.core" c" core.gen.c" core2c'
|
||
4. The forth program is recompiled with an extra define, which means that
|
||
initialization of a minimal forth environment is replaced with the core file we
|
||
just made:
|
||
|
||
    gcc -DUSE_BUILT_IN_CORE -std=c99 main.c unit.c libforth.c core.gen.c -o libforth
|
||
|
||
The new executable, *libforth*, behaves the same as *forth* but with a built in
|
||
core file.
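
To give an idea of what step 3 does, here is a C sketch of turning the bytes of a core file into a C array definition. The real *core2c* is a Forth word and its exact output is not shown here, so the function name `write_c_array`, the array name passed in and the layout of the generated text are all assumptions.

```c
#include <assert.h>
#include <stdio.h>

/* Write 'length' bytes as a C array definition called 'name' to 'out'.
 * Returns 0 on success, -1 on a write error. */
int write_c_array(FILE *out, const char *name,
		const unsigned char *data, size_t length)
{
	assert(out && name && data && length > 0);
	if (fprintf(out, "unsigned char %s[%lu] = {", name,
				(unsigned long)length) < 0)
		return -1;
	for (size_t i = 0; i < length; i++) /* eight bytes per line */
		if (fprintf(out, "%s0x%02x,", i % 8 ? " " : "\n\t",
					(unsigned)data[i]) < 0)
			return -1;
	return fprintf(out, "\n};\n") < 0 ? -1 : 0;
}
```

The generated file can then be compiled and linked like any other source file, which is what lets the second build start from a ready made core instead of a minimal environment.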
|
||
|
||
## Notes
|
||
|
||
* The compilation should result in a small executable, and when statically
linked against [musl](https://www.musl-libc.org/) under Linux (x86-64), the stripped executable is around
50 kB in size.
|
||
* It is quite possible to make Forth programs that corrupt memory that they
should not. This is not a design flaw in this interpreter but more part of the
Forth philosophy. If you want memory safety (and most of the time you should)
you should use a different language, or implementation.
|
||