Forth : Past, Present and Future
Abstract
A look at one possible future direction
for Forth, by analysing present and past systems.
Forth : Past
In 1386 Bishop Ralph Erghum installed a
clock in the tower of Salisbury Cathedral. It is a classic example of what
could be called "Forth design philosophy". Obviously, as it predates
the computer age by nearly six centuries it is not a computer program, but it
does have several properties which it shares with good Forth programs :
1. Elegance. The indefinable "wow" factor.
2. Simplicity. There is no unnecessary complexity.
3. Modularity. Each part has a specific function,
well separated from the other "modules".
4. Robustness. It has been running for over five
hundred years ( with a 72 year break ).
5. Modifiability. Changes can be made easily.
6. Appropriateness. It solves a specific problem,
for a given environment.
Description
The clock is made from wrought iron,
wood, stone and rope. Minimalist programmers will be pleased to learn that it
has no clock face, no hands and no pendulum. Its function is to chime the hours
on the Cathedral bell. In the 14th century a clock face would have been
superfluous, as no one could "tell the time".
In the 18th century, when accurate
timekeeping became necessary to determine longitude at sea, the original
"foliot and verge" escapement was considered too crude, and the
mechanism was changed to include a pendulum. This gives an order of magnitude
improvement in timekeeping stability. Only in the early 20th century was the
clock restored to its original form with an appreciation of its historic
value.
The design uses two stone weights to
provide motive power. One drives a constant velocity
"foliot and verge" escapement,
the other provides power to chime the bell. There is an on/off switch which
disconnects the gear trains so that the weights can be wound up again.
The clock is a state machine, and moves
through 12 distinct states, S, numbered 1 to 12. The state, S, is stored as the
angle of a 78 tooth gear wheel. This angle is 360n/78 degrees, where n is the
sum of the integers from 1 to S, the current state number. The number 78 is the
sum of the integers from 1 to 12, so the wheel completes exactly one turn every
twelve hours. The transition from one state to the next is
triggered at hourly intervals by the "foliot and verge" escapement (
"ticking" every 8 seconds ) geared down to give one rotation per
hour. Bishop Erghum obviously liked the latest in Hi-Tech gadgetry, as he
installed version 2.0 in Bath when he moved there...
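The state arithmetic described above can be checked with a few lines of code. This is only an illustrative sketch in Python: the function names are invented here, and the only facts taken from the text are the 78 teeth, the 12 states and the formula 360n/78.

```python
TEETH = 78  # teeth on the gear wheel, as stated in the text


def strokes_up_to(state: int) -> int:
    """n in the text: the sum of the integers from 1 to `state`."""
    return state * (state + 1) // 2


def wheel_angle(state: int) -> float:
    """Angle in degrees that encodes `state`, using 360n/78."""
    if not 1 <= state <= 12:
        raise ValueError("state must be between 1 and 12")
    return 360 * strokes_up_to(state) / TEETH


if __name__ == "__main__":
    for s in range(1, 13):
        print(f"S={s:2d}  n={strokes_up_to(s):2d}  angle={wheel_angle(s):7.2f}")
```

State 12 gives n = 78, which is a full 360 degrees, so the wheel is back where it started after twelve hourly transitions.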
Forth : Present
In the early 1970s Chuck Moore
developed the first Forth computer systems. In 1978 I started using the
microForth system for the 1802 processor. Apart from the detail that microForth
was retired from service after about 10 years, not 500, its design qualities
are the same as those of the Salisbury clock. The elegance became apparent almost
immediately when I typed 1 1 + . <cr> and saw a 2 appear.
The simplicity, modularity, robustness and modifiability took longer to
appreciate, as I came to understand how it all worked, but the appropriateness I
am only beginning to appreciate now.
Description
Forth is a computer programming language
which respects human understanding. That is, the programmer is the key element
in the Forth system. The language is an expression of his/her understanding. As
the understanding changes so does Forth, hence the ability to define new words,
structures and compiler functions is a key feature of all Forth systems.
To avoid getting bogged down in details,
such as direct/indirect/token threading, assemblers, etc, I will describe a
much simplified Forth computer. (Any real computer can provide a Forth computer
by means of a software virtual processor.)
A Forth computer is a general purpose
processing machine which contains memory, a (virtual) processor which follows a
Forth instruction set, and some input/output devices.
The memory may be thought of as a series
of bytes, each one having an address, starting at 0 and continuing up to
the maximum size of memory in the computer.
A program is put into memory ( by
another program ) and the processor steps through the instructions which
constitute the program.
Some of the instructions send or receive
data ( numbers or characters ) to or from the input/output devices. For
example, a key may be pressed and that character will appear on a display.
Other instructions allow the processor
to jump to a different part of the program. This allows the same sequence of
instructions to be used many times.
The words of the processor’s instruction set
are grouped into more useful words, each of which is then given its own name
by which the programmer can refer to it, and a token by which the
processor can refer to it. The name is a sequence of printable ASCII characters
with one or more spaces at each end. The list of names of Forth words is linked
together in the dictionary, which may be split into different vocabularies
so that the same name may be used for different words, each one being selected
according to which vocabulary is currently active. The token is often the
address of the word’s list of instructions, each of which is another Forth
token.
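The dictionary described above can be sketched as a set of linked lists, one per vocabulary. The sketch below is in Python rather than Forth and is an illustration only: the class names, the vocabulary names FORTH and EDITOR, the word DUMP and the token values are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    name: str                 # the name the programmer uses
    token: int                # the token the processor uses (here: an address)
    link: Optional["Entry"]   # previous entry in the same vocabulary


class Dictionary:
    def __init__(self):
        self.latest = {}        # vocabulary name -> most recent Entry
        self.active = "FORTH"   # hypothetical default vocabulary

    def create(self, name, token, vocabulary=None):
        """Link a new entry onto the end of a vocabulary's list."""
        vocabulary = vocabulary or self.active
        entry = Entry(name, token, self.latest.get(vocabulary))
        self.latest[vocabulary] = entry
        return entry

    def find(self, name):
        """Search the active vocabulary, newest entry first."""
        entry = self.latest.get(self.active)
        while entry is not None:
            if entry.name == name:
                return entry.token
            entry = entry.link
        return None


if __name__ == "__main__":
    d = Dictionary()
    d.create("DUMP", 100)                       # lives in FORTH
    d.create("DUMP", 200, vocabulary="EDITOR")  # same name, different word
    print(d.find("DUMP"))
    d.active = "EDITOR"
    print(d.find("DUMP"))
```

Which DUMP is found depends only on the active vocabulary, which is the behaviour the text describes.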
One particular Forth word is called QUIT,
the Forth text interpreter. QUIT receives characters and processes them :
if the characters are the name of a Forth word in the dictionary, the processor
jumps to that word and performs the associated sequence of instructions. If they
are not, it tries to convert the characters to a number, and if that fails it
jumps back to the start of QUIT. This allows words to be tried out by typing their
names.
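The behaviour of QUIT can be sketched in a few lines. The following Python is a simplified illustration, not the code of any real Forth system: the function names are invented, only the words + and . are provided, and the sketch assumes that a successfully converted number is pushed onto the parameter stack and that an unknown name is reported with a question mark.

```python
def make_dictionary(output):
    """A tiny hypothetical dictionary: name -> function acting on the stack."""
    def plus(stack):
        b = stack.pop()
        a = stack.pop()
        stack.append(a + b)

    def dot(stack):
        output.append(str(stack.pop()))   # "print" the top of the stack

    return {"+": plus, ".": dot}


def quit_line(line, dictionary, stack, output):
    """Process one line of input the way the text describes QUIT."""
    for name in line.split():             # names are separated by spaces
        word = dictionary.get(name)
        if word is not None:
            word(stack)                   # known word: perform it
            continue
        try:
            stack.append(int(name))       # otherwise try it as a number
        except ValueError:
            output.append(name + " ?")    # neither: give up on this line
            return                        # and go back to the start of QUIT


if __name__ == "__main__":
    out, stack = [], []
    quit_line("1 1 + .", make_dictionary(out), stack, out)
    print(out)
```

Typing 1 1 + . leaves the single string "2" in the output list, matching the first experiment described earlier.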
Two other important Forth words are : (colon),
which creates a new Forth word in the dictionary by compiling the tokens that
it requires ( starting with nest ), and ; (semicolon), which compiles unnest.
The return stack stores the return addresses of nested words.
The parameter stack stores data
which one word may pass to another, so that each word may be considered
separately from all the others.
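How the two stacks cooperate can be shown with a very small virtual processor. The Python below is a sketch under stated assumptions: the names Machine, define, dup and mul are invented for the example, a word's token is simply its address in memory, and nest is performed by the run loop rather than being compiled as the first token of each word.

```python
UNNEST = object()  # marker that ';' would compile at the end of a word


class Machine:
    def __init__(self):
        self.memory = []    # compiled token lists
        self.params = []    # parameter stack: data passed between words
        self.returns = []   # return stack: where to resume after a word
        self.ip = None      # address of the next token, None when stopped

    def define(self, tokens):
        """Stand in for ':' ... ';': compile the tokens, then UNNEST."""
        address = len(self.memory)
        self.memory.extend(tokens)
        self.memory.append(UNNEST)
        return address      # the new word's token is its address

    def run(self, address):
        self.returns.append(None)   # returning to None stops the machine
        self.ip = address
        while self.ip is not None:
            token = self.memory[self.ip]
            self.ip += 1
            if token is UNNEST:
                self.ip = self.returns.pop()   # unnest: resume the caller
            elif isinstance(token, int):
                self.returns.append(self.ip)   # nest: remember where we were
                self.ip = token
            else:
                token(self)                    # primitive instruction


def dup(m):
    m.params.append(m.params[-1])


def mul(m):
    b = m.params.pop()
    a = m.params.pop()
    m.params.append(a * b)


if __name__ == "__main__":
    m = Machine()
    square = m.define([dup, mul])          # like  : SQUARE DUP * ;
    fourth = m.define([square, square])    # a word built from another word
    m.params.append(3)
    m.run(fourth)
    print(m.params)
```

Running the hypothetical word fourth on 3 leaves 81 on the parameter stack and an empty return stack. Neither word needs to know anything about the other beyond what it finds on the parameter stack.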
Programming is the creation of the
program whose instructions perform a desired function. With the above features
Chuck Moore created an elegant, simple, modular, robust and easily modified
programming language which he called Forth.
Forth : Future
Every feature of the present day Forth
system is highly refined, and uses the simplest, most elegant structures
possible. The use of the parameter stack to isolate words from each other, the
return stack to allow nesting of words calling other words, and the linked list
dictionary are all examples of this approach. The text interpreter, QUIT, is
also the simplest possible way of allowing interpreted testing of words. So what
can be added?
Today’s Forth provides the solution to a
specific problem - how does a programmer control the operation of a computer?
Tomorrow’s Forth must provide a solution to a new problem - how do two programmers
control the operation of a computer?
The change from a single programmer to
two ( or more ) programmers working on the same problem sounds deceptively
simple, but several of the simple structures used in conventional Forth are now
no longer applicable.
Firstly, the name of a Forth word is no
longer a static entity. Instead of a simple string of ASCII characters we need
a point in a four dimensional "space". The four dimensions are :
1. Name ( same as before )
2. Version number ( bug fix / programmer ID )
3. Application ( any variations dependent on what
the program is doing )
4. Platform ( what hardware / software environment
the word runs in )
The extra three parameters are needed
because the single programmer who makes a bug fix or other modification
usually overwrites ( or possibly archives ) the old version, whereas with
multiple programmers it is possible to have two different versions in
existence simultaneously, in different physical locations.
Take as an example the Forth word SQRT,
which finds the square root of a number.
Let Alice and Bob be two programmers,
each with their own version of SQRT. Alice’s version of SQRT takes a positive
30 bit number and returns a 15 bit number, and is coded in 8051 assembler. Bob’s
version of SQRT takes a positive 32 bit number and returns a positive 16 bit
number, and is written in high level Forth. The two versions differ in both
Application ( 30 or 32 bit input ) and Platform ( 8051 or high level ).
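The bit widths quoted for the two versions are consistent: the integer square root of the largest 30 bit number fits in 15 bits, and that of the largest 32 bit number fits in 16 bits. A short check, written in Python for illustration only ( the function name is invented here, and math.isqrt stands in for either programmer's SQRT ) :

```python
import math


def result_bits(input_bits: int) -> int:
    """Bits needed for the integer square root of the largest input."""
    largest_input = (1 << input_bits) - 1
    return math.isqrt(largest_input).bit_length()


if __name__ == "__main__":
    for bits in (30, 32):
        print(bits, "bit input ->", result_bits(bits), "bit result")
```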
With just one programmer SQRT will only
be used for the current application, running on the current platform. With two
programmers life becomes much more complicated.
If Alice wants to use Bob’s version
because her application suddenly uses 32 bit values as input to SQRT, she will
have two versions of SQRT on her computer. Then Bob phones Alice to say that
he found a bug in the version of SQRT which he just sent her. Alice then has
three versions of SQRT, each differing in detail, but each performing the same
conceptual function.
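Alice's situation can be modelled by treating a name as a point with four coordinates and letting a lookup match on any subset of them. The Python below is a sketch only: the class and function names, the version labels ( "alice-1", "bob-1", "bob-2" ) and the stored descriptions are hypothetical, chosen to mirror the story above.

```python
from typing import NamedTuple


class WordName(NamedTuple):
    name: str
    version: str       # bug fix / programmer ID
    application: str   # what the program is doing
    platform: str      # hardware / software environment


def find(dictionary, name, version=None, application=None, platform=None):
    """Return every key matching the name and whichever coordinates are given."""
    wanted = (name, version, application, platform)
    return [
        key for key in dictionary
        if all(w is None or w == k for w, k in zip(wanted, key))
    ]


# The three versions of SQRT that end up on Alice's computer.
ALICES_WORDS = {
    WordName("SQRT", "alice-1", "30 bit input", "8051 assembler"): "original",
    WordName("SQRT", "bob-1", "32 bit input", "high level Forth"): "as sent",
    WordName("SQRT", "bob-2", "32 bit input", "high level Forth"): "bug fixed",
}

if __name__ == "__main__":
    print(len(find(ALICES_WORDS, "SQRT")))
    print(find(ALICES_WORDS, "SQRT", version="bob-2"))
```

Asking for SQRT by name alone returns all three entries, so the name on its own no longer identifies a word. At least one more coordinate is needed to pick out the version Alice actually wants.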
What has happened in the transition from
one programmer to several is that the simplifying assumption that the
connection between the programmer’s understanding and the program is
"tightly bound" no longer applies. We are now in the world of version
control and relational databases.
So what implications are there in this
new 4D name? The dictionary ( and hence QUIT ) must be moved from the compiler
to the editor. The compiler is much reduced, to the point where it may not be
necessary any more. Goodbye : ! The editor becomes a version control
tool, with built in archiving, Internet access and text comparison features.
Source text validation will be required to sift through the dross.
The return stack may also have to go.
This is a by-product of word-based version control - words need to be
categorised by level, so unlimited nesting is out.
It's Forth, but not as we know it! Or is
it? 8^)
Howerd Oakford, 1 Sep 97