A self-hosted Forth for my Z80 computer

Introduction

A while back, I decided to implement a self-hosted Forth interpreter/compiler for the ZI-28, my Z80-based computer. The previous version of the software consisted of a simple operating system written in assembly, with applications and games written in C. There were some things I was unhappy about. Writing the OS in assembly was slow and error-prone. Writing the applications in C was much faster and easier in comparison, but it required the use of z88dk, a big software suite that was difficult to set up and went against my goal of the whole project.

My goal was always to build a computer system from the ground up, allowing me to understand every part of the process. Ideally, this would include being able to build the entire software on the computer itself. While I could maybe see myself writing an assembler in Z80 assembly, building an entire C compiler from scratch seemed too ambitious. Rewriting the entire software in a different language wasn’t an option at the time, but after a long break from the project, starting on something fresh seemed more appealing than trying to understand my old assembly code.

I chose Forth for the rewrite. It offered enough abstractions to make writing software more pleasant and reliable than assembly, while still being relatively easy to implement from scratch. There are also some excellent resources, like Starting Forth explaining the basics of the language, or Jonesforth showing how to implement a complete interpreter. My goal was to implement a system that could read its own source code and compile it, all on the ZI-28 itself.

Assembler

Since I wanted my Forth to be completely self-hosted, I decided to also write an assembler in Forth. For development on the host computer, I used Gforth. As I was targeting a very constrained system, I didn’t go with traditional assembly syntax and instead chose a more Forth-like way of writing instructions, inspired by MMSForth. For example, ld a, b turns into A B LD,. Here, A and B are constants that get pushed onto the stack, and LD, is a Forth word that takes two register addresses and writes the resulting opcode to a buffer. I wanted to use 16-bit stack cells and a single cell per argument. This presented the problem that a register constant is indistinguishable from an immediate number, so I added separate instructions that deal with immediates, suffixed with a # (ld a, 42 becomes A 42 LD#,).

Thanks to Forth’s powerful metaprogramming features, implementing the basic instructions was straightforward. I created a defining word for each type of instruction. In the following example, the defining word NOARG gets created.

1
2
3
4
5
6
7
8


\ Creates an opcode with no arguments
: NOARG
    \ Create a word and store the provided opcode with it
    CREATE C,
    \ When the created word is called, retrieve the stored opcode and write it
    \ to the assembler output buffer
    DOES> C@ ASM-C,
;

This word can then be invoked like this:

1
2
3


\ Defines the word `NOP,`, which simply appends the opcode `0x00` to the
\ assembler output buffer when called.
$00 NOARG NOP,

Unfortunately, the Z80 instruction set is quite complex. There are a lot of instructions that have the same name but translate to different opcodes depending on the type of arguments. The worst of them is the ld instruction which has about 15 different variations, not even counting the ones loading immediate numbers. The MMSForth manual linked above puts into perspective how much more complex the Z80 instruction set is over the 8080 instruction set it is based on: the 8080 assembler uses 1.2 kilobytes of RAM, while the full Z80 assembler uses 3.5 kilobytes.

To make sure I had implemented all of the instructions correctly, I also created a test harness. I defined the words T{, -> and }T to allow me to compile an instruction and check that it outputs the correct opcode and is free of stack effects.

A single test case then looks like this:

1
2


\ Checks that the instruction `A B LD,` outputs 0x78
T{ A B LD, -> $78 }T

Meta-Compiler

Now that I had an assembler, I could start building some basic words for my own Forth. Some of the more complex ones like multiplication and division I copied from CamelForth, as I wasn’t really interested in coming up with efficient implementations on my own.

The beauty of Forth is that once you have a basic set of words, you can then build on those and even implement the interpreter itself using just Forth. Doing this while cross-compiling required some more effort. For a lot of the special compiling words, I had to implement a meta-variant that compiles to the target system instead. I defined the word :META as a counterpart to the standard : that defines new words. Calling :META essentially starts up a separate Forth compiler. It looks up words in the target output and whenever it encounters an immediate word, it calls the special meta-variant of it. This way, Forth words can be defined in the familiar way, only requiring :META instead of :.

Here’s an example for the word COUNT being defined using the meta-compiler:

1
2
3
4


( c-addr -- c-addr u )
:META COUNT
    DUP 1+ SWAP C@
;

One challenge I ran into was jump labels. While jumping back to a previously defined label is very easy, jumping forward is more difficult, as the destination address is only known later. My first approach to solve this was to simply compile the code twice, putting in placeholders in the first pass and filling them in during the second pass. This was easy to implement and worked fine on my laptop, but on the ZI-28, reading in and interpreting the entire source code twice would mean a significant slow-down.

My solution was to introduce the words FWDREF and RESOLVE which can be used like this:

1
2
3
4
5
6
7


\ This jump...
FWDREF mylabel JP#,

...

\ ... goes to here
RESOLVE mylabel

The way this works is that FWDREF first checks if a label with the provided name already exists. If it doesn’t, it creates it and stores the of the jump, putting in a placeholder value of zero. If the label already exists, it builds a linked list of all the jump origins going to that label. The RESOLVE word can then traverse the list and replace all the pointers with the current address.

Self-hosting

Once I implemented all the Forth words that my assembler and compiler required, I could simply run the same code that I wrote for Gforth on the ZI-28. I ported some of my original assembly code, like the keyboard interface and the SD card and FAT filesystem drivers.

I noticed that the interpreter was really slow. I had followed a pretty standard design of storing the address of the previous word whenever a new one is defined, thus building a linked list of words. The issue with this was that a lot of commonly used words were defined first and therefore at the end of the list, like DUP, SWAP, or +. Whenever one of those words was interpreted or compiled, more than 200 words had to be traversed and checked against.

To fix this issue, I implemented a hashmap. I opted for 256 entries, as that could be indexed nicely with an 8-bit hash. It wasn’t big enough to avoid collisions, but since each word already had a pointer field to the next word before, I could keep the logic to search through a linked list. The only difference was that instead of starting from a fixed point, it now hashed the word to be searched first to find the linked list of all words with that hash value.

The initial hashmap gets built during compilation and stored in ROM. On startup, it gets copied to RAM so that it can be modified as more words are defined by the user. This took up 512 bytes (256 entries of 16-bit pointers) each in ROM and RAM, not an insignificant amount on a system with a 64 kilobyte address space. This trade-off was worth it though in my eyes, as it resulted in a speed increase of more than two times.

Conclusion

At this point, I had achieved my original goal of having a system that can compile its own source code while running on the ZI-28. The only limitation was that I still had to copy the code onto the SD card manually and it couldn’t write the compiled firmware to the ROM by itself. I started writing a code editor in Forth so that the source code could be modified on the system itself. By this point however, my enthusiasm for the project was starting to fade, as I had already accomplished the most interesting part. It also became obvious that while Forth was easier to write than assembly, it was still a lot more challenging than C and offered little convenience or safety to the developer. In addition, while I did have a functional system that I could write code on, everything was still quite slow.

Writing my own self-hosted Forth was one of my most challenging but also interesting and fulfilling projects. Modern software tools and languages offer a lot of convenience and allow you to build complex systems very quickly. The downside however is that they build on so many layers of abstractions that it is almost impossible to fully understand them. This project offered me an opportunity to explore the minimum amount of complexity required to build an interpreter and compiler.

While I tried to highlight the most interesting challenges I faced and solutions I came up with in this article, obviously I still had to leave out a lot of the details. If you’re interested, the code can be found in the ZI-28 repository under the forth directory. It’s not documented, so feel free to reach out to me if you have any questions.