What are the requirements for a language for the standard interpreter? Within the requirements, we can devise a variety of possible languages, but we will aim for something that is simple, and expressive for a wide range of programs, with particular emphasis on interpreters.
The first requirement is that the language must be able to express all Turing-computable functions. This is usually satisfied through common-sense in designing the language. It is possible to devise computationally complete languages that are poorly expressive, for example the Turing Machine [Turing 37]. However, we will provide a range of operators more expressive than the Turing Machine (and working on a basic type system which supports mapping of abstract types to it rather better than does the Turing Machine Tape). Here are some kinds of operators that we expect to find in a serious programming language (and even in a small example language):
Provision of operators of all these kinds is sufficient for the implementation of a wide variety of programming languages, through interpreters written in a reasonably expressive style. Support for algorithmic and functional styles is particularly good, and implementation of other styles of language, while not so succinct, is not particularly cumbersome. Declarative and pattern-matching languages are the worst fit to our model, and require pre-processing into a suitable procedure-based representation, as explained in section 8.8.
I have presented the base language as a form of Lisp, partly because, having simple syntax, it is a convenient notation for simple new procedural or functional languages, and partly because the semantics of the base language, and the operators available, are generally Lisp-like. This follows from Lisp's origins and its evolution toward language and other symbolic processing.
Our other major requirement for the base language is that it must support reification and reflection. This needs only two operators:
grand-jumpy-reifier
returns the tower's
state
grand-jumpy-reflector
sets the tower's state
An implementation for each of these is given in section 9.4.
Other reifiers and reflectors can be built on top of these, as explained in section 9.4, through use of the type system and the function application operator, for example the main operations on meta-continuations as described (under different names) in section 9.4:
grand-pushy-reifier
runs a function with the
tower's state as its argument
grand-pushy-reflector
runs a function in a
tower constructed from its argument.
As mentioned in section 5.4, the language for a closure must provide all the operators used in the expression of that closure. This is no new requirement introduced by the tower. The tower of levels with languages has only made it possible for this condition not to be satisfied. In a conventional language, all the operators in the language are always there. Only in a system where the language can be reflected into can this condition be broken.
Therefore, the base language must include at least all the operators needed to implement the standard evaluator. This is an extension of the idea of structural integrity, but applied to interpretation, rather than to the handling, of reified values. To be useful, it should also include a variety of operators typically useful in implementing operators of other languages, such as structure field accessors for the data types returned by reifiers.
It is not necessary for all operators of the base language to be shadowed by the meta-evaluator, but in practice (for efficiency) they all are. (Operators not included in the base language may also be shadowed. The only link between inclusion in the base language and being shadowed is that a computationally complete subset of the base language, and enough of the reflective operators to reify and reflect the entire tower state, must be shadowed.) [Smith And des Rivières 84b] contains a description of how to derive a minimal set of grounded operators.
Like those in other languages, many of the operators in the base language will need to evaluate all their arguments, but do not need to specify in what order to evaluate them. Arithmetic operators are a good example of this. In the explanation below, we will use
((lambda (a b c) (+ a b c)) 1 2 3)as an example.
The most concise way to implement such a family of operators is to split each operator into three parts:
evaluate-sub-expressions
(described in
section 7.7), which evaluates each
sub-expression of the expression being interpreted
for which the original operator was used. In the
example above, this will take the a b c
of
(+ a b c)
as the list of sub-expressions to
evaluate. Since the lambda construct makes local
bindings, a
, b
and c
are local
variables, that is, indices into the value list.
The value list contains 1 2 3
in the
positions a
, b
and c
, and so the
result of the sub-expression evaluation is (1
2 3)
.
prim+2
,
capable of doing the basic action on objects of
the relevant type, but not of evaluating their
arguments.
+
.
These first call evaluate-sub-expressions
,
and then map their corresponding primitive
operator along the result of this, and return the
result accumulated at the end of the mapping.
Thus, +
will calculate (through iteration,
here unrolled for explanation) (prim+2
(prim+2 1 2) 3)
as it accumulates its result.
In the experimental implementation Platypus
(see section pl89sect) the base language
and its shadows are set up by a group of Lisp
macros platypus-defprim
,
platypus-def-control-prim
,
platypus-def-lispy-prim
, and
platypus-def-lispy-expr-prim
, which both define
the code to be interpreted within the tower, and
name (or even define) the Lisp function to be used
as the shadow outside the tower.
Since the operators structured in this way call
more rudimentary operators such as
dyadic-add
, these more rudimentary ones may be
provided as operators in their own right; they may
be used directly for implementing some languages.
Since they do not evaluate their arguments, the
arguments must be fed to them in a fixed
manner---non-evaluation of arguments means they
cannot even be given through varying local
variable indices. The values to be processed must
be placed at the end of the values list---the top
of the stack---and the operator called. It removes
its arguments from the values list by treating it
as a stack and popping them from it, performs its
essential action, and puts any results onto the
end of the values list by pushing them as onto a
stack. This form of calling makes these operators
suitable for use directly in a FORTH-like language
such as PostScript---this is done in the
implementation of PostScript used here.
In practice, many more operators than strictly necessary may be provided as primitive (shadowed), for efficiency and to make better use of the richness of the substrate system. The operators provided in Platypus include flow control, evaluation, arithmetic, data structure manipulation, reification, reflection, and input and output.
In this section, we look at adding reflective operators to a language, taking Lisp as our example language. All of this applies to any other language used in our experimental system; Lisp is the most convenient example language.
The same reflective operators may be provided in any language. Furthermore, since we make all languages equivalent and transparently cross-callable, and interpretable by the same interpreter, by having common formats for program, language definition, and state, a program Pa written in language La may reify a sub-program Pb (perhaps a library routine called from Pa) written in Lb, and will receive it in the same form as that it would receive the representation of itself from a reifier. The operators used will be different, but if Pa does any analysis of the program, it may use the definitions of the operators of the language Lb (which is built into the closure of Pb) to find what each operator does.
Since Lisp's calls are the same as our non-reflective calls, a common way to add reflective features to a Lisp system is to add a special form that does a reflective call, in which a procedure is applied to some otherwise hidden part of the Lisp system (such as the environment or the stack) along with any other arguments which are given as normal in the calling form.
This gives us reifiers such as
call-with-environment(procedure args env)which is called as
(call-with-environment procedure arg1 arg2 ... argn)
but as a callee has an environment and a list of the original arguments supplied as its arguments, with the interpreter interposing the extra information. In effect, the interpreter executes
(call-with-environment procedure arg1 arg2 ... argn)as if it were
(apply procedure *current-environment* (list arg1 arg2 ... argn))
where *current-environment*
is a reifying
variable---a variable handled specially by the
evaluator, holding part of the information used by
the evaluator in evaluating the program---holding
the current environment.
Here is a sample piece using this style of reifier:
(defun print-environment (reified-environment arglist) "Print out the environment to a stream. This expects to be passed two arguments: the environment of its caller, and the argument list with which its caller thought it called it. That argument list should have one element, a stream to which to print." (format (car arglist) ; should be a stream "The caller's environment is S reified-environment))
(defun env-to-file (fn) "This uses a reifying call to pass its environment to a function which will print out that environment." (with-open-file (str fn :direction :output) (call-with-environment print-environment str)))
The general case of such a reifier is
call-with-level proc (level args)
, which is
called as (call-with-level proc arg1 arg2
... argn)
but as callee receives the level
at which it is running and the original argument
list, as if it were called as:
(call-with-level *current-level* proc args)
.
Since the call frame contains a level, the call frame (activation record) of such a reflective call is in effect a tower level, into which information has been reified from the calling level.
When using activation records as tower levels, the link to the lower levels is an ordinary local variable/parameter in the stack frame. The next lower level for an interpreter must be held in variables of the interpreter anyway, even in a non-reflective system, as the interpreter must store somewhere the program it is interpreting. In our experimental implementation, the link to the lower levels is always in the same place in the level: it is in the second argument position in the value list of the interpreter, just after the interpreter's evaluand.
In this form of reflective interpretation, every
call (transfer of meaning) to a lower level must
pass that lower level to itself as part of its
parameter list. This is easily accomplished, as
the calling interpreter has that information
available for its own use already, although
possibly in a different form. All that it needs to
do is include the data in the stack frame that it
builds, so that it appears as if given as an
argument to the call.
For reflection, one of a second range of special
forms, complementary to the first one, and
characterized by the form {\tt(eval-with- returns a cons cell which was constructed by the
Although these additions to Lisp can provide full
reflective facilities, they present no organized
model of non-reflective and reflective calls.
Also, they often (although not necessarily) work
in terms of the internals of a normal Lisp system,
which, meta-circularity notwithstanding, is not
particularly suited to manipulating language
elements as data values. For example, to represent
environments we might conveniently use a-lists,
but an abstract type for environment, with
appropriate operators (bind, unbind, lookup,
assign, boundp) would be more appropriate. This is
a consequence of the poor support for data
abstraction (that is, just cons cells!) that
classical Lisp provides.
It seems appropriate to find a model of
computation, and a type system to support it,
designed more specifically for reflective
evaluation, and not built around the
facilities of one particular language. Such a
system would have a data type representing the
state of a computation, in which type grand
reifiers would always return their results, and in
which grand reflectors would always take their
arguments. Each field of this type would also be
of a particular type, and these types would be few
and of simple well-defined characteristics. The type we will use to represent the state of a
computation is the interpretive closure, as
described in clochap, and the types
of its fields, as described in
typechap are interpretive closure,
environment, expression (or parse tree) and values
list.
Instead of the two special calls These jumpy operators differ from the reflective
operations described in section 9.4
in that they assign to the state rather than
bind the state. The pushy reflective
operators may be implemented on top of the jumpy
ones with the addition of a stack. The grand pushy reflector may be implemented as
the following macro, which takes something to
evaluate and a level in which to evaluate it, and
performs the evaluation in that context. The
variable A matching reifier is implemented by the following
macro, which takes a procedure to call and an
argument list, and splices onto the start of the
argument list the level within which it will call
that procedure:
As demonstrated by the procedures above, when used
on a conventional architecture (that does not
provide stack pushing as a primitive) these
assigning (jumpy) reflective operations are more
primitive than the binding (pushy) ones, in the
sense that they may be used as part of the
internals of the pushy operators. However, given
an interpreter outside the tower, the pushy
operators may also be implemented as primitively,
in that they are simply functions that produce or
consume an extra argument before calling a given
function, and where the meta-evaluator handles
that argument. (However, on a conventional
computer system, all reflective operators are
actually jumpy at the exact point of reflection,
as there is a point when control switches from one
context to another, regardless of whether the old
context has been pushed onto some kind of stack,
or otherwise stored.) Each of these two approaches to handling reified
execution state uses one of two kinds of
meta-continuations, as explained in
sections terminology and
in [Danvy And Malmkjær 88]: pushy
meta-continuations, which bind the tower state,
and jumpy meta-continuations, which assign
to the state. In activating a pushy continuation,
no information is lost, as the old state is kept
on a stack of continuations; while a jumpy
continuation discards information, simply
replacing the old state with the new one. These
terms may be used to describe both control flow
within one level (where pushy continuations are
procedures to be called, and jumpy continuations
are labels to be jumped to) and flow of meaning
between levels, where pushy and jumpy
meta-continuations are the two forms of reflection
that we have just described. Pushy and jumpy meta-continuations may be mixed
within one tower, and even within one level.
Viewed as tower-constructing devices, they have
rather different forms. A jumpy meta-continuation
destroys tower levels, by replacing a string of
them destructively with a shorter string, or
creates them, by replacing one level with a string
of several levels. A pushy meta-continuation cannot destroy or create
levels in the same tower in this sense, but starts
a new tower, orthogonal to the first. (This is as
described in section 4.5.) The
analogy for this in (non-meta) continuations is
that a jumpy continuation modifies a sequence of
actions within a procedure, while a pushy
continuation starts a fresh sequence in another
procedure. Just as tail-recursive pushy procedure
continuations can be transformed into iterative
continuations, tail-recursive pushy
meta-continuations can be transformed into
iterative metacontinuations.
Functions for these transformations are given in
[Danvy And Malmkjær 88]. From here on, we will use more systematic naming
for reflective operators. The reifier called The grand jumpy or pushy reifier and reflector are
the only reflective operations that we need.
However, most of our reflective actions will copy
the current tower, make some change, and make the
changed tower become the current one. To make this
kind of activity more convenient, and also to
avoid unnecessary work, we provide some more
specialized jumpy reflectors, for changing
individual components of the current tower. Changing one part of a level's state usually maps
to one operation in a typical programming
language. For example, alterations to the value list
can implement assignment to variables or binding
of variables; changing the continuation expression
implements a jump or a call. Assignment to the
interpreter has no equivalent in conventional
programming languages, although there are commonly
statements to make new procedure definitions
(usually statically) and to implement non-local
flow control (long jumps).
Providing separate reflectors for each part of the
state stored in the closure thereby models cleanly
the actions that a typical interpreter must
implement. We also provide a collection of
procedures for handling the reified values. These
not only make handling these values easier, but
also help to maintain consistency in the
tower---not a theoretical consideration, but
helpful in avoiding an easy way of crashing the
system. The reflectors (and, less importantly,
reifiers) which affect only part of the state are
as follows:
The reifiers above may be built on top of the This is more efficient than finding the whole
reified object by A requirement met implicitly by providing a set of
whole-tower reflective operators is that
reflective integrity should be preserved in going
through a level that uses this language. A
procedure shifted up to execute into such a level
can always return information back to its home
level.
Therefore, information may always be passed up and
down the tower by reflecting procedures to run in
different levels as long as the language at each
level has at least the facilities of the minimal
base language. This requirement must be met anyway
for other reasons: were a level not to include
facilities equivalent to those of the base
language, it could not form part of a integral
string of interpreters (this is a circular
argument) and so the integrity of the tower would
be lost at that point. This would make it into two
orthogonal towers, as described in
section 4.5. However, it would
still have structural integrity for reification
(as described later in this section), and so lower
levels would be able to access levels which they
are not interpreted by at all. Thus, such a tower
would no longer be grounded, as its connection
with the umbrella interpreter would no longer have
integrity---it would be two towers for purposes of
interpretation, but only one for reflection.
kindsofshift.> Structural integrity for reification, that is, for
access to remote levels through reifiers in one's
own level, is ensured through the structure of
each level, rather than through the language. Integrity of groundedness through this level is
met through the computational completeness of the
language. It is grounded because it can, in its
own right, compute anything that is computable.
Its groundedness does not depend on that of any
other levels.
The most important of the reflected data
manipulation procedures are one to insert a new
evaluator just above the base of the tower, and a
complementary one to remove an evaluator from just
above the base of the tower. Since an evaluator is an ordinary closure, it is
guaranteed that when inserted above an existing,
interpretable (grounded) evaluator, it can be
interpreted by the previous first evaluator, and
also that it can interpret the old application.
metaevaluator.)> Here are the procedures for inserting and deleting
evaluators from the end of the tower:
(where
is
whichever part of the context we wish to reflect),
calls a procedure given as one of its arguments,
with parts of the context surrounding it being
taken from its other arguments. For example, we
could have reflective operators such as
(eval-with-environment form environment)
,
which evaluates form
using environment
to provide any free variables needed by
form
, and (eval-with-arglistform arglist)
called funcall
in Lisp, or, to be as general
as possible, (eval-with-level form level)
.
For example, this function
(defun another-level-cons (d e)
(eval-with-level
'(cons d e)
(construct-funny-level)))
cons
operator of a different level.Jumpy and pushy reflection and reification
eval-with-level
and call-with-level
, which
always start new levels of interpretation, we can
use alternative, and simpler, forms of reifier and
reflector. In these operators, the state is
represented as a tower. The reifier, which is a
procedure taking no arguments, returns the state
of the system as its result, and the reflector
takes one argument, a tower, which replaces the
current tower.
*tower*
belongs to the level of the
current level's evaluator. The action of reading
it is a grand jumpy reifier, and writing it, a
grand jumpy reflector.
(defmacro eval-with-level (form level)
`(let ((old-levels (tower-levels *tower*)))
(setf (tower-levels *tower*)
(cons ,level old-levels))
(eval ',form) ; this is implicitly in
; the context of *tower*
(setf (tower-levels *tower*)
old-levels)))
(defmacro call-with-level (function &rest args)
`(funcall ,function
(car (tower-levels *tower*))
,@args))
eval-with-level
above will now be called
grand-pushy-reifier
, and the reflector
call-with-level
will be called
grand-pushy-reflector
. The jumpy reflective
operations, not given as named procedures above,
but used implicitly in-line, will be called
grand-jumpy-reifier
and
grand-jumpy-reflector
. Here is the code for the
two grand jumpy reflective
operators:
(proclaim '(special *tower*))
(defun grand-jumpy-reifier ()
*tower*)
(defun grand-jumpy-reflector (new-tower)
(setq *tower* new-tower))
Reifying and reflecting specific parts of a level
set_procedure_evaluator
and
procedure_evaluator
set_current_evaluator
and
current_evaluator
set_procedure_language
and procedure_language
set_current_language
and current_language
set_procedure_expression
and procedure_expression
set_current_expression
and current_expression
set_procedure_values
and procedure_values
set_current_values
and current_values
set_lexical_environment
and lexical_environment
set_dynamic_environment
and dynamic_environment
grand-jumpy-reifier
, and select a field of the
the resulting record. The reflectors are
different: they must find the part of the
structure that the corresponding reifier would
return, and modify just that part. In terms of
conventional programming language technology, this
is finding the left hand side value (abstract
address) of the reifier's result and then through
it assigning the new right hand side value.grand-jumpy-reifier
,
changing just one part and reflecting the whole
thing back in with grand-jumpy-reflector
. In
some forms of Lisp [Steele 84][section 7.2],
efficient code for these operations can be
generated automatically through the use of
defsetf
, producing an interface presented as
setf
forms.
Integrity
Inserting and removing levels of evaluation
(defun add-evaluator (evaluator)
(let* ((our-tower (grand-jumpy-reifier))
(new-level
(make-evaluator-interpretation-level
(car (level-call-record-stack our-tower))
evaluator)))
(push new-level
(level-call-record-stack our-tower))
(grand-jumpy-reflector our-tower)))
(defun remove-evaluator ()
(pop (level-call-record-stack (grand-jumpy-reifier))))
An operation on the tower that preserves its
integrity is one that replaces a string of levels
by another string of levels that also has
integrity. For the string to have integrity each
interpreter must be able to interpret its lower
neighbour, and so the new sequence must fit the
interpreters above and below it correctly:
which we can guarantee by taking those end interfaces, labelled Algol Language and Algol Program entirely into the operation, replacing not only the link between them but those levels themselves. This way, we never try to link levels which will be incompatible, but can always insert an extra level in between as a buffer, with the appropriate language definitions for the previous evaluator. (The ability to do this depends on an evaluator being provided written in the new language.) Unfortunately, this may add more levels of interpretation, and so perhaps should be avoided in practice, for efficiency.
In practice, common tower manipulations change
only one level, adding or removing an interpreter
between two that remain unchanged:
... an operation which is referentially transparent to all lower levels. Although the intensional meaning has been changed, the extensional meaning is still the same, because a transformation that preserves integrity is one that does not destroy the correctness of the previous interpretation.
An example of the usefulness of this is adding and removing tracing of evaluation, by adding and removing an evaluator that traces what is being processed.
To change
For example, all the operator closures of the language of a closure could be given the same evaluator, and then all activity in that language could be traced by tracing that evaluator. (To trace an evaluator i1 which is interpreted by an evaluator i2, we insert an evaluator it, which traces what it processes, between i1 and i2.)
Grouping closures in this manner can cause
harmless dimensional anomalies in the tower
structure. The closures that have been grouped
share a tower level because they have the same
evaluator; yet they may also be at distinct tower
levels for other reasons, such as one being part
of the interpreter of another. Thus, an evaluator
may exist at more than one level, and so the same
tower can have more than one number of levels
between the base and the umbrella, as shown in the
following diagram:
The same reification and reflection operators may be made into a part of any language; their action is always the same. Some languages will require some amount of packaging around the bare reflective operators, as they may not provide means for handling such data directly (they might have to present it in terms of a procedure library for handling reified data), or they may prefer to present the data in some form that matches the language's natural model of computation more closely.
As mentioned in sections rollup and unroll-circular-tower, the boring section of the tower is kept as a circular reference, which could in principle be followed indefinitely. If, however, we were to allow a program to do so via the reifiers, it would be hard to detect how many levels of the tower it had climbed before eventually changing something. So, to simplify the tracking of realized tower levels, we make the shadow versions of the the operators reifying and manipulating tower components to do some extra work that is invisible from within the tower (unless the program within the tower calls for reification of the meta-tower, as described in section 4.5. These are the extra actions needed:
closure
, compare the value
found with the standard evaluator (see
sections defstandint and unroll-circular-tower. If the value is the same
(eq
in Lisp terminology), a copy (one level deep,
not a tree copy) of it is returned instead, otherwise
the value is returned itself. The closure is also
lookup up in the tower's shadow maps, and if it is
found to be shadowed, a copy is returned likewise. In
the copy, the original
field points to the
closure of which it is a copy (see
section 4.4), and so the meta-evaluator
recognizes that this is still part of the boring
section of the tower, and so shadows its evaluation.
This copying ensures that the standard evaluator and
the shadowed operators can never be changed---it is
never actually held in any place reachable from any
variable of any program within the tower.
original
field of the closure to the closure itself,
so that it is no longer recognizable as a shadowed
closure if it originally was one, and when the
meta-evaluator evaluates the closure, it will do so by
interpretation instead of by shadowing.
The code used for doing these actions is part of the meta-evaluator, and so is presented in the meta-evaluator , in section 11.5.
There are a few operators that reflect down (or
up!) to the substrate language. The main one of
these is eval-in-cl
, which
takes an argument form that it passes to the
evaluator of the substrate language, which is the
eval
procedure of Common Lisp. The result of
the evaluation is then passed back as the result
of this Platypus operator. This operator was
provided so that the benchmark suite for Platypus
could also run the same tests in Common Lisp
automatically, for comparisons of the speeds. It
could also be used for a form of reflection, right
through to the substrate, to ask for compilation
of Lisp forms that could then be installed in the
shadow map to make new primitives. (This is
discussed further in section 13.2.)
There is also a break
operator, for use as a
breakpoint, that runs a read-eval-print loop in
the substrate language. When the user quits from
the loop, possibly having reflected some changes
into the system after reifying and displaying some
information, this operator returns.
The time-now
and input-output procedures are
also in some sense substrate system reflective
operations.
A language for use with the standard interpreter in the boring section of the tower must be powerful enough to support both the standard interpreter and the procedures that will run on it, which will typically be operators for other languages.
The implementation of the base language has two parts: the operators themselves, and their shadows, which are run at the next meta-level in the tower. (The last meta-tower is run in the substrate language on which the whole reflective system is built, and it is there that all operator definitions are eventually evaluated.)
The language should provide operations typically needed by interpreters, and those needed for reification and reflection. It is also desirable that the base language be reasonably expressive.
As well as the fundamental reifiers and reflectors, it is convenient to provide some jumpy reflectors that assign only part of the state; these are not only more efficient, but also more expressive of many common language features that they may be used to model.
In practice, we provide many more operators in the base language than are strictly necessary. ([Smith and des Rivières] explains how to work out which operators are necessary.)
Operators for reflection may be added to an existing language. With our model for mixed-language interpretation, the same operators will work for any language.
Reflective operators (reifiers and reflectors) are of two kinds, jumpy which move data between program-as-agent and program-as-subject without automatically creating new levels of interpretation, and pushy which create new levels either providing data from the program-as-subject or using it to create a new (or modified) program-as-subject. Jumpy operators are more primitive than pushy operators, in that (on a conventional architecture) they may be used to implement pushy operators, whereas, within one level of interpretation, pushy operators may not be used as the primitive on which jumpy operators may be built (other than by considerable wasted work).
One form of reflective operator is the grand reflective operator which reifies or reflects the entire state of the system. However, it is more efficient, as well as often more convenient, to reflect into just the part of the state required, and so reflectors that set only specific parts of the state are also worth providing in a practical system.
Reification of programs is homogenous between languages. The same reifiers (and reflectors) may be used in any language, and the values returned have closed into them all the linguistic information needed to understand the value in any way that might be required.
A fool hath no delight in understanding, but that his heart may discover itself.
Proverbs 18:2