A tiny Forth I built in a week.

Overview

( Wrote a blog post about this here )

It was raining hard, a week ago.

And what could you possibly do on a rainy Saturday afternoon?

Well...

You can make a Forth interpreter/compiler from scratch...
...then put it inside a 1.5$ Blue Pill microcontroller...
...and finally, inside an Arduino UNO...
... within its tiny 2K RAM!

Click on the image to watch it blinking the LED of my Arduino:

Here's a video of it action, blinking my Arduino :-)

I haven't done anything even remotely close to this in decades...
I loved building it.

The rainy afternoon turned into a week-long hackfest (was looking forward every day to the post-work FORTH-tinkering in the afternoon...)

The result: a tiny, mini, micro Forth. In portable C++ :-)
It has...

  • basic arithmetic
  • star-slash (double-word accurate muldiv)
  • literals
  • constants
  • variables
  • direct memory access
  • string printing
  • reseting
  • comments
  • nested DO/LOOP
  • comparisons
  • nested IF/ELSE/THEN
  • ...and of course, functions (Forth words)

Here's an ascii-cast recording of it in action:

Recording of building and uploading on an Arduino UNO

Read the test scenario below to see my supported Forth constructs.

Portability, ArduinoSTL and Valgrind/AddressSanitizer checks

I meant it when I said "portable". Part of my reasoning was, that in addition to targeting multiple platforms (e.g. BluePill and Arduino) I wanted to be able to use Valgrind and AddressSanitizer to detect - in the host! - any issues I have with my memory handling.

Since I had embedded targets in mind, I tried ArduinoSTL - but it was too wasteful memory-wise. It also made the build process significantly slower. I therefore built my own memory pool, as well as list, tuple and string-like C++ templates. It was a nice challenge, re-inventing a tiny C++ STL...

And I understand STL a lot better now, after building small pieces of it myself :-)

Simulation / Debugging

I setup simulation via simavr. This tremendously improved my developing speed, since a simulator spawns and runs much faster than the real board. Due to the code being portable, debugging took place mostly in the host GDB; and after Valgrind and AddressSanitizer gave their blessing, I usually found out that the simulator (and the real board) worked fine as well.

BluePill vs Arduino UNO

Thanks to ArduinoSTL, I quickly reached the point of running inside the BluePill. The 1.5$ mini-monster has 10 times more SRAM than an Arduino UNO; so in a couple of days, I had a working branch.

The 1.5$ 'Beast'

But as said above, that wasn't nearly enough to make it work in my Arduino UNO. That required far more work (see below).

As for the BluePill, I should note that, as in all my other embedded targets, I prefer a development workflow that is based on normal bootloaders (not on programmers). I therefore burned the stm32duino bootloader on the BluePill, which allowed me to easily program it in subsequent iterations via the USB connection (and a simple make upload).

The same USB connection would then function as a serial port immediately afterwards - allowing me to interact with the newly uploaded Forth in the BluePill.

The screenshot below is from a tmux: on the left, the output from make upload; and on the right, I used picocom to interact with my mini-Forth over the serial port:

Compiling, uploading and testing

Memory - the final frontier

That covered the first two days.

But when I tried compiling for the Arduino UNO, I realised that the ArduinoSTL was not enough. I run out of memory...

So I built my own mini-STL, and tightly controlled all memory utilisation.

I also used macro-magic to move all strings to Flash at compile-time (see dprintf in the code)... And saved memory everywhere I could, re-using error messages across various operations - and storing the entire array of native operations in Flash.

Nothing flexes your coding muscles as much as optimising; whether it is for speed or for space. See the implementation of ".S" for example, where the (obvious) stack reversal code is also the most wasteful... Changing it to a slower but memory-preserving algorithm allowed me to use ".S" even when almost all my memory is full.

C++ vs C

I know that many developers hate C++. I even wrote a blog post about it.

And I understand why - they see code like this...

typename forward_list::box *forward_list::_freeList = NULL; ">
#include "mini_stl.h"

template
typename forward_list::box *forward_list::_freeList = NULL;

...and they start screaming - "what the hell is that", "incomprehensible madness", etc.

But there are very important benefits in using C++ - and templates in particular. You write less code, with no additional run-time or memory overhead compared to C, and with a lot more compile-time checks that watch your back (for things that would otherwise blow up in your face).

See my Optional for example, that emulates (badly) the optional type of Rust/OCaml/F#/Scala/Kotlin etc. It forces you to check your returned error codes:

Optional Forth::needs_a_number(const __FlashStringHelper *msg)
{
    if (_stack.empty())
        return error(emptyMsgFlash, msg);
    auto topVal = *_stack.begin();
    if (topVal._kind == StackNode::LIT)
        return topVal._u.intVal;
    else
        return FAILURE;
}

You can't "forget" to check the potential for a failure coded inside your returned value - because your code has to "unwrap" it. I could have done this better, but I chose to implement it via simple tuples (this was a one-weeks-afternoons hack, after all :-)

As for the template "magic" incantation above - it is true magic: My forward_list template is using free-lists to store the pop_front-ed elements and reuse them in subsequent allocations. I wanted these free-lists to be global (i.e. static members) because lists of the same type must re-use a single, commonly-shared free-list. The magic spell tells the compiler I want to instantiate these globals once, for each type T that I use in any lists in my code.

My Forth test scenario - including a FizzBuzz!

Yep, FizzBuzz - we are fully Turing complete. And would surely pass Joel's interview :-)

2 ?... " 1 2 > . ." Is 1 < 2 ?... " 1 2 < . ." Define pi at double-word precision... " : pi 355 113 */ ; ." Use definition to compute 10K times PI... " 10000 pi . ." Check: 23 mod 7... " 23 7 MOD . ." Defining 1st level function1... " : x2 2 * ; ." Defining 1st level function2... " : p4 4 + ; ." 2nd level word using both - must print 24... " 10 x2 p4 . ." Defining a variable with value 123... " 123 variable ot3 ." Printing variable's value... " ot3 @ . ." Defining The Constant (TM)... " 42 constant lifeUniverse ." Printing The Constant (TM)... " lifeUniverse . ." Setting the variable to The Constant (TM)... " lifeUniverse ot3 ! ." Printing variable's value... " ot3 @ . ." Setting the variable to hex 0x11... " $11 ot3 ! ." Printing variable's value... " ot3 @ . ." Setting the variable to binary 10100101... " %10100101 ot3 ! ." Printing variable's value... " ot3 @ . ." Defining helper... " : p5 5 U.R . ; ." Defining 3 times loop... " : x3lp 3 0 DO I p5 LOOP ; ." Calling loop... " x3lp ." Defining loop calling loop 2 times... " : x6lp 2 0 DO x3lp LOOP ; ." Nested-looping 2x3 times... " x6lp ." Inline: " : m 3 1 DO 3 1 DO CR J p5 I p5 ." = " J I * p5 LOOP LOOP ; ." Use inline loops with two indexes... " m ." Make multiples of 7 via DUP... " : m7s 10 0 DO DUP I * . LOOP DROP ; ." Print them and DROP the 7... " 7 m7s ." Reset... " RESET \ Time for Turing completeness... ." Let's do Fizz-Buzz! " \ Turing Completeness check... \ fizz ( n -- 0_or_1 n ) ." Define fizz... " : fizz DUP 3 MOD 0 = IF ." fizz " 1 ELSE 0 THEN SWAP ; \ buzz ( n -- 0_or_1 n ) ." Define buzz... " : buzz DUP 5 MOD 0 = IF ." buzz " 1 ELSE 0 THEN SWAP ; \ emitNum ( 0_or_1 0_or_1 n -- ) ." Define emitNum... " : emitNum ROT ROT + 0 = if . ELSE DROP THEN ; \ mainloop ( n -- ) ." Define mainloop... " : mainloop ." ( " fizz buzz emitNum ." ) " ; \ fb ( -- ) ." Define fizzbuzz... " : fb 37 1 DO I mainloop LOOP ; ." Run it! " fb ." Report memory usage... " .S ." All done! " ">
." Reset... " RESET
." Check comments... " \ Yes, we support the new-style comments :-)
." Computing simple addition of 3 + 4... " 3 4 + .
." Is 1 = 2 ?... " 1 2 = .
." Is 1 > 2 ?... " 1 2 > .
." Is 1 < 2 ?... " 1 2 < .
." Define pi at double-word precision... " : pi 355 113 */ ;
." Use definition to compute 10K times PI... " 10000 pi .
." Check: 23 mod 7... " 23 7 MOD .
." Defining 1st level function1... " : x2 2 * ;
." Defining 1st level function2... " : p4 4 + ;
." 2nd level word using both - must print 24... " 10 x2 p4 . 
." Defining a variable with value 123... " 123 variable ot3
." Printing variable's value... " ot3 @ .
." Defining The Constant (TM)... " 42 constant lifeUniverse
." Printing The Constant (TM)... " lifeUniverse .
." Setting the variable to The Constant (TM)... " lifeUniverse ot3 !
." Printing variable's value... " ot3 @ .
." Setting the variable to hex 0x11... " $11 ot3 !
." Printing variable's value... " ot3 @ .
." Setting the variable to binary 10100101... " %10100101 ot3 !
." Printing variable's value... " ot3 @ .
." Defining helper... " : p5 5 U.R . ;
." Defining 3 times loop... " : x3lp 3 0 DO I p5 LOOP ;
." Calling loop... " x3lp
." Defining loop calling loop 2 times... " : x6lp 2 0 DO x3lp LOOP ;
." Nested-looping 2x3 times... " x6lp
." Inline: " : m 3 1 DO 3 1 DO CR J p5 I p5 ." = " J I * p5 LOOP LOOP ;
." Use inline loops with two indexes... " m
." Make multiples of 7 via DUP... " : m7s 10 0 DO DUP I * . LOOP DROP ;
." Print them and DROP the 7... " 7 m7s
." Reset... " RESET
\ Time for Turing completeness...
." Let's do Fizz-Buzz! " \ Turing Completeness check...
\ fizz ( n -- 0_or_1 n )
." Define fizz... " : fizz DUP 3 MOD 0 = IF ." fizz " 1 ELSE 0 THEN SWAP ;
\ buzz ( n -- 0_or_1 n )
." Define buzz... " : buzz DUP 5 MOD 0 = IF ." buzz " 1 ELSE 0 THEN SWAP ;
\ emitNum ( 0_or_1 0_or_1 n -- )
." Define emitNum... " : emitNum ROT ROT + 0 = if . ELSE DROP THEN ;
\ mainloop ( n -- )
." Define mainloop... " : mainloop ." ( " fizz buzz emitNum ." ) " ;
\ fb ( -- )
." Define fizzbuzz... " : fb 37 1 DO I mainloop LOOP ;
." Run it! " fb
." Report memory usage... " .S
." All done! "

Automation

I am a strong believer in automation. The final form of my Makefile therefore has many rules - e.g. make arduino-sim - that automate various parts of the workflow.

Here's what they do:

  • arduino: Compiles the code for Arduino UNO - builds src/tmp/myforth.ino.{elf,hex}

  • arduino-sim: After building, launches the compiled mini-Forth in simduino.

  • upload: After building, uploads to an Arduino attached to the port configured inside config.mk.

  • terminal: After uploading, launches a picocom terminal with all appropriate settings to interact with my Forth.

  • x86: Builds for x86. Actually, should easily build for any native target (ARM, etc).

  • test-address-sanitizer: Uses the x86 binary to test the code, executing all steps of the scenario shown above. The binary is built with the address sanitizer enabled (to detect memory issues).

  • test-valgrind: Same, but with Valgrind.

  • test-simulator: Spawns simavr and sends the entire test scenario shown above to it - while showing the responses received from it.

  • test-arduino: Sends the entire test scenario shown above to an Arduino Uno connected to the port specified in config.mk and shows the responses received over that serial port.

  • blink-arduino: Sends the "hello word" of the HW world: a tiny Forth program blinking the Arduino's LED.

Another example of automation - the complete test scenario shown in the previous section, is not just an example in the documentation; it is extracted automatically from this README and fed into the Valgrind and AddressSanitizer tests... and also into the Python testing script that sends the data to the board in real-time.

DRY, folks.

Conclusion

I thoroughly enjoyed building this. I know full well that Forths are not supposed to be built in C++; they are supposed to be built in assembly, and also, utilise the Flash to store the user-compiled code at run-time.

But that wasn't the point of this - the point was to have fun and learn Forth.
And what better way to learn a language than to actually implement it! :-)

And... as a child of the 80s... I now know first-hand what Jupiter Ace was about :-)

Fork the code, and enjoy tinkering with it!
Thanassis.

You might also like...
Motion planner built upon Tesseract and Trajopt

motion_planner Motion planner built upon Tesseract and Trajopt The abb_example package is similar to the tesseract_ros_examples, but it contain more e

Simple text editor in C++ - Simple editor built upon kilo editor.

GUMBO editor Simple editor built upon kilo editor. Still big work in progress although this is just fun side project to learn more C/C++. From 0.0.2-

A Minimal Web Browser with Built-in Adblocker in Less Than 100 Lines of Code
A Minimal Web Browser with Built-in Adblocker in Less Than 100 Lines of Code

A Minimal QtWebEngine Web Browser with Adblocker How Does It Work This is a minimal network filter implementation using QWebEngineUrlRequestIntercepto

My first os built from scratch

Kernel project My first os built from scratch Contributors are welcome LICENSE TODO GDT IDT PS2 Keyboard PS2 Mouse PIT RTC Initrd Drawing on framebuff

Physical Tic-Tac-Toe smart board with PvP mode and two levels of AI. Built atop a custom PCB connected to an Arduino Mega 2560.

TicTacToe_SmartBoard The files in TicTacToe_SmartBoard are based on the files in https://wiki.illinois.edu/wiki/display/ECE110HLSF15/Tic-Tac-Toe+Smart

A ZX Spectrum-like library built for
A ZX Spectrum-like library built for "dos-like" by Mattias Gustavsson.

ZX-Like A ZX Spectrum-like library built for "dos-like" by Mattias Gustavsson. It allows for the creation of ZX Spectrum like screens for demos, games

This package provides localization in a pre-built map using ICP and odometry (or the IMU measurements).
This package provides localization in a pre-built map using ICP and odometry (or the IMU measurements).

Localization using ICP in a known map Overview This package localizes the lidar sensor in a given map using the ICP algorithm. It subscribes to lidar

Template library and blog that explain how JSI modules are built from scratch in React Native

react-native-jsi-template This is an example library that explains how anyone can build jsi modules from scratch in React Native. This code is written

Low level library to develop GBA games that can also be built for PC.

Universal GBA Library 1. Introduction This is a library for development of GBA games. It can be used to build actual GBA game ROMs, but it can also ta

Comments
  • compile problem on mac with clang 12.0.5

    compile problem on mac with clang 12.0.5

    Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
    Apple clang version 12.0.5 (clang-1205.0.22.11)
    Target: x86_64-apple-darwin20.5.0
    Thread model: posix
    InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
    

    make

    This error occurs on many places. Is there an fix?

    In file included from myforth.cpp:4:
    In file included from ../src/miniforth.h:39:
    ../src/compiled_node.h:53:11: error: union member '_string' has a non-trivial default constructor
            } _string;
    
    opened by wiiikiii 1
Releases(v1.1)
  • v1.1(Jul 9, 2021)

    I've done everything I wanted to do, to optimize the code and move all things I could into flash (program) space. The result is a Forth that can use 1.4K of the Arduino UNO's 2K SRAM. Not bad, especially if you consider that half of the remaining memory is basically reserved for run-time (CPU) stack - and this number can be tweaked.

    Lots of FORTH code can fit now... I proceeded with the "hello world" of the HW universe :-) https://www.youtube.com/watch?v=xePollbCzow

    Source code(tar.gz)
    Source code(zip)
  • v1.0(Jul 4, 2021)

    • Code cleaned up and commented.
    • Passes all tests (AddressSanitizer, Valgrind, and also, in both simavr and real Arduino UNO board).
    • All strings moved to Flash (via dprintf, __FlashStringHelper, etc)
    • No obvious places where memory/stack are wasted - at least I can't see any...
    • Blog post written ( https://www.thanassis.space/miniforth.html )
    Source code(tar.gz)
    Source code(zip)
Owner
Thanassis Tsiodras
Lead Software Engineer of a software services startup for more than a decade. Over the last 5 years, working for the European Space Agency.
Thanassis Tsiodras
A C++ implementation of the Forth programming language

onward A C++ implementation of the Forth programming language Build # Clone repository git clone https://github.com/tetsuo-cpp/onward.git cd onward/

Alex Cameron 3 Mar 6, 2022
A demonstration of various different techniques for implementing 'threaded code,' a technique used in Forth and in virtual machines like the JVM.

Threaded code is a technique used in the implementation of virtual machines (VMs). It avoids the overhead of calling subroutines repeatedly by 'thread

null 25 Nov 4, 2022
Learn C++ in the first week

Learn-C++-in-the-first-week This is a program I made for calculating quadratic functions, entering the quadratic term coefficients, the primary term c

Seksa Tianwei 1 Jan 27, 2022
built-in CMSIS-DAP debugger tailored especially for the RP2040 “Raspberry Pi Pico”

RP2040 has two ARM Cortex-M0+ cores, and the second core normally remains dormant. pico-debug runs on one core in a RP2040 and provides a USB CMSIS-DAP interface to debug the other core. No hardware is added; it is as if there were a virtual debug pod built-in.

null 272 Dec 30, 2022
CQC (Charmed Quark Controller) a commercial grade, full featured, software based automation system. CQC is built on our CIDLib C++ development system, which is also available here on GitHub.

The CQC Automation System What It Is CQC is a commercial quality, software based automation system, suitable for residential or commercial application

Dean Roddey 61 Dec 13, 2022
Single-header, ranges-compatible generator type built on C++20 coroutines

generator Single-header, ranges-compatible generator type built with C++20 coroutines. A generator allows implementing sequence producers which are te

Sy Brand 39 Dec 20, 2022
C64 Watch is a customized T-Watch 2020 that was inspired by the Commodore 64 computer. It features a C64 theme and a built-in BASIC interpreter.

C64 Watch C64 Watch is a customized T-Watch 2020 that was inspired by the Commodore 64 computer. It features a C64 theme and a built-in BASIC interpre

Nick Bild 30 Nov 26, 2022
A SDK with a built-in cheat for Garry's Mod.

GMod-SDK This is a module for Garry's Mod that works based on a SDK. I've spent the past few days reversing a few modules of the game, in order to get

null 70 Jan 3, 2023
A kernel level driver for Windows built to configure the Blue Screen Of Death

BSODConfigure A kernel level driver for Windows built to configure the Blue Screen Of Death. Go see the writeup at https://www.phasetw0.com/configurin

phasetw0 14 Dec 22, 2022
A Rideshare Simulation built in C++, using OpenStreetMap data

My Capstone project for the C++ Nanodegree at Udacity, a rideshare simulator. It extends the concurrency project based on a traffic simulation, as well as taking in parts of the earlier route planning project, in order to simulate ridesharing apps that are able to pick up passengers

Michael Virgo 13 Nov 8, 2022