I've seen this touched on several times across various threads, but if you can c...

nils-m-holm · on May 5, 2021

I am not sure what you mean by "custom forth", but threaded code ("bytecode") can be much denser than machine code, so it saves space, thereby allowing FORTH to run on systems with as little as 2K bytes of memory.

arethuza · on May 5, 2021

I found this fascinating page on threaded interpreters and Forth:

http://www.complang.tuwien.ac.at/forth/threaded-code.html

MaxBarraclough · on May 5, 2021

> threaded code ("bytecode") can be much denser than machine code

Is there somewhere with hard numbers on this? I couldn't find anything with a quick google.

lebuffon · on May 5, 2021

This very much depends on the hardware that Forth is running on, but the concept is very old. Interpreters of all types encode functions at a higher level than the hardware supports as single bytes or integers. You then write your program using those byte/integer codes as instructions, and a given program takes less space.

Byte code threaded programs are small but less speedy. (Open-Firmware is byte coded Forth)

Traditional Forth systems encode this virtual machine as lists of addresses. These lists are "interpreted" by a piece of code that is 2 to 3 instructions on modern computers, so it's pretty fast. (slightly bigger on an 8 bit CPU)
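
To make the "list of addresses" idea concrete, here is a minimal sketch in Python rather than assembly. It is an illustration only: Python function references stand in for the machine addresses a real Forth would store, and the names `run`, `lit`, `add`, `dup` and `mul` are made up for this example. The loop in `run` plays the role of the 2 to 3 instruction inner interpreter (usually called NEXT).

```python
# Toy threaded-code VM. A "thread" is a list of cells; each cell is either a
# reference to a primitive (standing in for an address) or inline literal data.

def run(thread, stack=None):
    """Inner interpreter: fetch the next cell, advance, execute it."""
    stack = [] if stack is None else stack
    ip = 0  # instruction pointer: an index into the thread
    while ip < len(thread):
        word = thread[ip]
        ip += 1
        ip = word(stack, thread, ip)  # each primitive returns the next ip
    return stack

def lit(stack, thread, ip):
    """Push the cell that follows this one, then skip over it."""
    stack.append(thread[ip])
    return ip + 1

def add(stack, thread, ip):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)
    return ip

def mul(stack, thread, ip):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)
    return ip

def dup(stack, thread, ip):
    stack.append(stack[-1])
    return ip

# Equivalent of the Forth phrase: 3 4 + DUP *
program = [lit, 3, lit, 4, add, dup, mul]
print(run(program))  # [49]
```

Each operation costs exactly one cell in the program, which is where the density comes from; the price is one indirect call per cell.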

This address-based VM code "can" be smaller on a 16-bit machine but might not be on a 32-bit machine until you raise the programming level to very high-level functions. This depends on the CPU. On RISC-V it's probably still true based on what I see of the instruction set. (not tested)

The typical Forth development cycle involves extending the language in the direction that one needs, writing higher-level functions, and combining them into even higher-level functions, such that at the end of the coding process you are using very few of these application-specific functions to write the program. The result is a space saving, because each function uses only one byte or integer in your program to reference it.
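
The arithmetic behind that saving can be sketched as follows. These are not measurements, just a back-of-the-envelope model under stated assumptions: every reference costs one cell, a definition costs its body plus one cell for the return, and dictionary header overhead is ignored. The function names are hypothetical.

```python
def inline_cells(body_len, uses):
    """Cells needed if the sequence is written out at every use site."""
    return body_len * uses

def factored_cells(body_len, uses):
    """Cells needed if the sequence is defined once and referenced by one cell.

    Assumes one extra cell for the definition's return (EXIT) and ignores
    the dictionary header, which a real system would also have to store.
    """
    return (body_len + 1) + uses

# A 6-cell sequence used 5 times:
print(inline_cells(6, 5))    # 30
print(factored_cells(6, 5))  # 12

# Used only once, factoring costs more than it saves:
print(inline_cells(6, 1))    # 6
print(factored_cells(6, 1))  # 8
```

Under this model the saving only appears once a definition is reused, which is why the payoff grows as the program is built from higher-level words.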

MaxBarraclough · on May 5, 2021

Thanks, but I already know that. I was wondering about hard numbers, and how it really pans out on modern systems.

To my knowledge, the demoscene folks never use Forth or anything like it, they just optimise their assembly and sometimes use compression.

lebuffon · on May 5, 2021

Sincere apology.

This paper by Anton Ertl who is one of the GForth maintainers might give more insight.

http://www.euroforth.org/ef99/ertl99.pdf

mschaef · on May 5, 2021

> I've seen this touched on several times across various threads, but if you can compile down to assembly, is there an advantage to use a custom forth as essentially the bytecode to compile down to?

Forth is compact and performant at runtime, and it can be implemented compactly too. This gives the language particular advantages in small and resource-constrained environments. (Just a few kilobytes, in fact, are enough.)

To wit:

Adobe PostScript is itself a variant of Forth that was, of course, originally designed to run embedded on printers. These printers originally weren't exactly resource constrained, but most of the resources they had available were dedicated to printing the page. (Memory was expensive at the time and, for PostScript, you needed enough to store every pixel on a page before printing.)

PostScript is also notably 'a bytecode to compile down to'. It's a fully Turing complete language (people can and do write it by hand), but it is mostly generated automatically by various drawing and page layout packages.

Another good example of the power of the Forth school is the later-model HP calculators. HP developed a custom language (RPL) that combined the execution model of Forth with a number of features of Lisp. (Note that this does not apply to all of HP's RPN calculators, just the later ones done in RPL.) There's a lot that can be said, but these were small machines with limited RAM and an outsized amount of functionality. It was a good engineering trade-off for the time.

Interestingly, the entire software stack for an RPL calculator was done in either machine code or in RPL. The only distinction between "User RPL" and the "System RPL" was whether or not the calculator had user-visible symbolic names for the various entry points. If the calculator knew the name of a symbol, you could enter it and request it be called. If it did not know the name of a given symbol, you needed specific development tools that did, and they usually ran externally. The process was also bi-directional - enter a program, and the calculator 'compiled' the text into a binary representation. Edit the program, and the calculator reconstituted the text (in pretty-printed form) based on the binary representation. It made programs nicely editable without having to carry around all the source text in addition to the runtime representation. All very cool and efficient.
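
The bi-directional idea can be sketched roughly like this in Python. It is not how RPL actually encodes objects: the token table, the `LITERAL` marker and the one-byte literal limit are all assumptions invented for the illustration.

```python
# Hypothetical token table; real RPL uses object addresses, not these codes.
TOKENS = {"DUP": 0x01, "SWAP": 0x02, "+": 0x03, "*": 0x04}
NAMES = {code: name for name, code in TOKENS.items()}
LITERAL = 0xFF  # assumed marker: the next byte is an integer literal (0-255)

def compile_text(source):
    """Turn whitespace-separated words into a compact byte string."""
    out = bytearray()
    for word in source.split():
        if word in TOKENS:
            out.append(TOKENS[word])
        else:
            value = int(word)  # raises ValueError for unknown words
            if not 0 <= value <= 255:
                raise ValueError(f"literal out of range: {word}")
            out += bytes([LITERAL, value])
    return bytes(out)

def decompile(binary):
    """Rebuild normalized source text from the byte string alone."""
    words = []
    i = 0
    while i < len(binary):
        code = binary[i]
        if code == LITERAL:
            words.append(str(binary[i + 1]))
            i += 2
        else:
            words.append(NAMES[code])
            i += 1
    return " ".join(words)

binary = compile_text("3   4 +\n   DUP *")
print(len(binary))        # 7
print(decompile(binary))  # 3 4 + DUP *
```

Only the binary form needs to be stored. The original spacing is lost, but the text the editor shows is regenerated on demand, which is the trade the calculator made. A word missing from the name table could still be executed but not displayed by name, roughly analogous to the User RPL versus System RPL distinction described above.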

> It's always impressed me just how much mileage a small team with Slava Pestov was able to get out of Factor.

I think most or all of that has to do with Slava Pestov and his general dedication and skill. (At least based on the high rate of progress he made early on, when it was just himself.)