Results

Overview of results

In researching this thesis, I developed three implementations of Platypus, at each stage having a different level of experience with reflective programming. The three systems have the same underlying principles, but different implementation technology. Each contributed some different things to my understanding of practical and theoretical procedural reflection, whilst all agreed on the major points.

Performance testing of Platypus took the form of running a set of simple benchmarks---a mixture of the Griss tests and Gabriel tests [Gabriel 85] in Lisp and a recursive picture ``Circle Limit'' in PostScript---on a variety of configurations of the system. The same benchmarks were also run on conventional Lisp and PostScript interpreters to provide a reference point.

The tests were run for each the two versions of the evaluator and meta-evaluator---that is, the versions without and with the boojum and snark functions. For each of these versions, the tests were run

with the evaluation system in its initial state (in which the operators used by the application are always shadowed directly);
with the evaluator changed, but to an identical one, thus forcing another level of interpretation to be added;
with parts of the evaluator and some of the operators changed to make the whole level use association-list based deep binding instead of Platypus's usual hash-table based shallow binding.

The expected results were that the performance of configuration in which the evaluator and operators used by the application were shadowed directly would be within an order of magnitude of the performance of the conventional Lisp and PostScript systems, and the versions in which the evaluator and operators used by the evaluator and operators were shadowed would be very much slower.

The design ideas

Before developing the first implementation, I spent some time in devising a way of representing a language interpreter in a manipulable form. My first attempt used re-write rules to describe a language, but it soon became clear that, as well as not being very expressive for describing most languages, they were also hard to manipulate for reasoning about and computing with languages.

I experimented briefly with a message-passing representation of closures in Lisp. I soon realized that, for manipulating languages, this was not much better than the re-write rules, although, with each operator being an object, it was some improvement.

Although the different possible representations for languages were the same in the computational power, they were very different in their expressiveness. Both of these forms lacked expressiveness in this problem domain. Expressiveness is hard to quantify, although for many pairs of expressions of the same idea, one is often clearly more expressive than the other. Yet this is a subjective evaluation, and it is difficult even to describe the parameters we use in deciding expressiveness. Conciseness is, perhaps, an important factor: a good expression of an idea is usually more concise than a poor one. This could be quantified objectively by the number of terms used in the expression, as a poor match between an idea and the language used to express the idea usually results in extra terms that are present only to make up for the poor match. (Consider, for example, using FORTRAN instead of Lisp for consed-list manipulation, or a command shell language such as sh for Fourier Transforms.)

Interpretive closures

My third way of structuring a language for representation divided it into operators and proccesor, each being represented as a procedure. The closure operation for procedures was extended to close over the language (environment of operators) and the evaluator, and thus this representation automatically provided interpretive towers.

This was the representation I took into the first implementation (written in Cambridge Lisp), which showed it to be basically suitable. I used the same representation in C-Platypus, and extended it slightly in Platypus89, while still keeping the same structure for the actual representation of languages---the changes were concerned with tower representation and interpretation.

Meta-towers

The idea of many-dimensioned towers, or meta-towers, was important in designing the system, and they are provided in Platypus89 as a compile-time option, making a slight change in the performance.

The slight loss in performance brought about by adding meta-towers appears because the operation call-meta-evaluator (used in the real, compiled, meta-evaluator) can no longer be simply a funcall in the substrate language, but must now switch on whether the meta-evaluator of the tower given as an argument to call-meta-evaluator is shadowed by the real one. For simplicity and efficiency, I used a different implementation of shadowing from that used in the normal towers; when the meta-evaluator has not been reflected into, it simply is the compiled one, and this is called with a Lisp funcall. Otherwise, a Platypus closure is there, and a level is launched to evaluate it. The selection here is done by a Lisp typecase form, although it would have been more consistent with the rest of Platypus for it to be done by an environment---another shadow map, in effect. The reason why the real meta-evaluator can be built into the tower at this point is that this is the direct, and singular grounding point of the tower. It would have been possible, although inelegant, to connect the meta-evaluator of an ordinary tower to that tower by putting it in place of the rolled-up boring section of the tower, and having the unrolling mechanism find the thing from which to unroll copies from elsewhere. The asymmetry of the direct connection seems more excusable in the light of the idea that, whereas reified data structures continue indefinitely outward, the real meta-evaluator must start from somewhere. This is not so much where the buck does not stop, as where it starts!

The development of the meta-tower system was closely linked with the theoretical development of the type system.

The type system

Platypus is a dynamically typed system, providing dynamic typing to the languages it supports. It would be much harder to implement dynamically typed languages on only a statically typed base, as static typing simply introduces a pre-execution check, after which evaluation of the program may procede with dynamic types (although all the values' types will be predictable). Implementing dynamic types on statically typed system requires the definition of a (static) type including a (dynamic) type field and a value field that can contain values of any type used in the system (and so must be of a union type---and thus not strictly statically typed). Languages with static typing may have their type-checking done when programs are read and converted into Platypus' expression representation.

In all the substrate languages (Cambridge Lisp, C, and Common Lisp), the type system provided was not powerful enough in itself to support Platypus' requirements directly. Common Lisp was the best here, lacking only the facility for the programmer to add new atomic (non-structured) types, which would have been useful for local variable indices; instead, string-characters (see sections charasindex and charindex) were used for this, as being a distinct type of integer that was already available.

The first two implementations had double type systems, using both the type system of the implementation language, and a separate type system for programs running in the tower. In the first implementation, this was very confusing, as the tower type system and the meta-evaluator type system were both dynamic, and some code in the meta-evaluator used one system, and some used the other. In C-Platypus, this was not so confusing, as C's type system is static, and the static types were useful for ensuring that data stored in the tower could not be confused with non-reifiable internal data of the meta-evaluator.

In the first two implementations, the type system of the implementation language was concealed, sometimes with some effort, from programs running in the tower.

In Platypus89, the type system of the substrate language is used within the tower, which made the implementation very much quicker and easier both to construct and to understand. Common Lisp's type system is sufficiently similar to that required by Platypus that this has proved to be a good strategy.

The uniform representation

Early on in designing Platypus, I decided that the representation of reified information handled by a program in the tower was to be the same as that used for the same information when reflected. (In some systems, this is not the case. For example, typical SmallTalk implementations have to construct reified stack frames with some effort when asked for a stack frame. In Platypus, reifiers neither modify values nor construct new ones, except to realize levels, as described in section 4.4---an activity which, from the application program's viewpoint, is not happening: it is backstage.) The success of this scheme depends on being able to evaluate efficiently using data representations that are suitable for use as reified data, as long as we generate all stack frames in heap. One representation is suitable for both uses. If we use conventional stacks, to avoid excessive garbage generation, the simplicity of our reifiers is lost, and a more active reifier is needed. Deciding which stack frames must be kept after returning from them, and reifying them then, is probably an appropriate solution, but may well be difficult to implement. This is discussed in [Moss 87]. In some cases, we know that we can use the substrate system's stack frames instead of allocating them in the heap, as explained in section 11.4.

Shadowing

Shadowing is an essential part of tower interpretation, and so appears in all the Platypus implementations. There are two approaches to implementing shadowing:

storing the shadow of a closure a field in the closure;
not storing the shadow of a closure a field in the closure, but instead recognizing shadowed closures from outside.

The second of these is better, since the shadowing information belongs on the outside of the tower, and so should not be part of any object within the tower.

I used the first of these in the Cambridge Lisp implementation of Platypus, and for C-Platypus switched to the second, to make sure that the tower remained hidden. (C-Platypus does not reveal higher dimensions of towers.) Having full control of the storage heap, it was possible to make a perfect hashing system from addresses of closures to their shadows, thus bringing the overhead of the mapping system down to only a little more than that of structure access, involving, at the machine code level, just 4 instructions: a bitwise and, an indexed load, a comparison, and another indexed load.

In Platypus89, I used a similar mapping system, although using Common Lisp's gethash (which is rather less efficient) instead of the perfect hashing system described above.

It may be possible to make a further short-cut for speed, using numbers stored in each closure, and doing a faster form of hashing based on the number, perhaps like the address hashing used in C-Platypus. This avoids hardwiring shadows into shadowed closures, while still providing very fast access from a closure to its shadow.

The implementations

Platypus-0

Platypus-0, written in Cambridge Lisp, was extremely slow; at 1.2 seconds to interpret


(cons 1 2)

(on an Orion-1, a machine of similar processing power to an early Vax), it showed that the ideas were viable, while leaving open-ended the question of whether an efficient implementation would be possible.

Platypus-0 provided a more sophisticated type system than the substrate Lisp, using conses of (type . value) to represent single values. This made the coding very confusing, as it was hard to keep track of whether a particular value was a (type . value) pair, or a simple Lisp value (perhaps with the type it had once had, removed). The very simple dynamic typing provided by Cambridge Lisp was not really powerful enough to be helpful here.

C-platypus

C-Platypus was written largely with speed in mind. Using an unusual hashing system to map between the tower and the meta-evaluator, it set out to show that it was possible to implement a tower efficiently with an absolutely pure reified heap without any pointers out from the heap to the the meta-evaluator.

This implementation was never sufficiently reliable to perform extensive timing tests, but it seemed to be capable of evaluating simple Lisp and FASL (a stack-based language oriented toward efficient construction of data structures, used for loading Lisp programs) code without noticeable inefficiency.

C-Platypus built some very complex data structures when starting. These were built entirely by hand-coded C routines, which were one of the major reasons for moving back to Lisp for continuing the work. The initialization code was very sensitive to changes, and difficult to maintain or re-organize. In conjunction with the use of a statically compiled implementation language, and the lack of higher dimensions of the tower, this made it impossible to extend the meta-evaluator once the tower had started to run. This is very much against the spirit of true reflective systems! The difficulty of initializing the system was one of the reasons for its being abandoned and superseded by a system built on a more powerful and flexible substrate language.

One very definite result of this stage of the work is that it is important to allow the building of the meta-evaluator incrementally, rather than providing all the pieces and putting them together at once. Retrospectively, in embarking on the C implementation of a tower, I should first have planned tools to extract the initialization code from comments or macros in the rest of the source code.

The implementation of C-Platypus included two languages, a Lisp-like one for general use (like the base language of Platypus89), and a FORTH-like language for use in program loading (in Lisp terminology, a FASL reader---a simple language for constructing data structures faster than can be done from a Lisp-like language, as its syntax and semantics are even simpler than those of Lisp).

C-Platypus made no provision for higher dimensions of the tower. Partly due to this, and partly due to the greater ease of development in Lisp, particularly concerning structuring and initializing the system, I proceeded to the third implementation, Platypus89. This left C-Platypus insufficiently reliable for timing results, so these are omitted from the result table given here.

Platypus89

Platypus89 was very much simpler to implement than the earlier systems, as well as being much more powerful. Common Lisp provides the combination of a reasonable type system (not quite present in C, and not present in Cambridge Lisp) with incremental compilation and the evaluation facilities of the language being made available at compilation time.

The technological contribution of Common Lisp made a great difference here, but perhaps not as much as the further understanding of reflective interpretation that I had developed while implementing C-Platypus, and while pondering it afterwards. Two conceptual advances were most significant here:

the mapping of types at successive levels of meta-towers;
an understanding of higher-dimensioned towers.

Although structurally the former is but part of the latter, it is the former that gave rise to the latter.

The implementation of Platypus89 includes a starter set of two languages: a dialect of Lisp, and a subset of PostScript. Both of these have all their operators defined directly in the base language of the system, although there is nothing to stop a language from using a mixture of shadowed and non-shadowed operators---indeed, for many languages this would be the case---it is desirable, for efficiency, for a language's input parser to use shadowed operators wherever it can; for example, many languages will have a two-way conditional, if... then... else..., for which the base language's if operator may be used; whereas FORTRAN's subroutine bodies with GOTOS, Including Three-Way Arithmetic GOTOs, and calculated GOTOs; or traditional Lisp's prog... go... return... forms, are obscure and high-level enough---away from the natural architecture of the system---not to match anything in the base language, and these would be implemented by non-shadowed operators, that are always interpreted by the meta-evaluator.

Performance of platypus89

To compare the performance of Platypus89 against other interpreters, I used as the main benchmarks:

the Griss Tests for Lisp---a set of simple benchmarks testing the performance of various parts of Lisp systems; I translated these from PSL (Portable Standard Lisp) to Platypus89's base language dialect of Lisp, and wrote a few macros for Common Lisp to enable it to run the tests from the same file.
circle limit, based on a picture by M. C. Escher [Escher], for PostScript, the only change required to the test file being renaming some tokens to avoid dependence on case (as Platypus89 uses a Lisp readtable and the Common Lisp reader system to read PostScript, it is case-insensitive, unlike real PostScript).

Lisp performance

The Griss tests consist of many separate tests, each of which does one thing many times. A table of the results for this is given below. The number of iterations for each test was chosen to bring the total time for the fastest version (usually the Common Lisp interpretive evaluator---the interpretive implementation of the substrate language) to around 10 seconds for each test, the timing resolution of the system being 10mS. The computer used for the tests was a Solbourne SPARC system (Sun-4 compatible) and the Common Lisp interpreter is that of Harlequin's LispWorks, which is also the substrate system on which Platypus89 was compiled and run for the tests.

The times are in seconds, and the column ``cl/pl'' is the ratio of speeds between Common Lisp (LispWorks) and Platypus (in the version not using the snark function), and the column ``cl/snark'' is the ratio of speeds between Common Lisp and the snark-based meta-evaluator.

It can be seen from this that, for these tests, Platypus89 is around one-half of the speed of a plain Lisp interpreter, or better---which is well within the one-tenth (see section 2) that I chose as my acceptance level.

I had expected the version using snark to be slower than the other, because of the extra procedure calls; at least one (evaluate-anything calling snark) at each evaluation. This did not prove to be the case; the most likely explanation is that making each of the most central routines smaller and simpler allows the substrate compiler to emit more efficient code, with a higher proportion of the live values being in CPU registers at any one time, and less traffic between CPU registers and the spill area (the part of the stack frame in which local values not currently in CPU registers are stored).

The complexity of snark and the meta-evaluator routines build around it is, as it turns out, closely matched to the number of registers available on the SPARC computers used to run it for the benchmarks.

Postscript performance

Since Platypus' PostScript implementation does not contain any actual graphics routines, but instead outputs distilled PostScript[1], it is harder to compare this with a conventional PostScript interpreter. To give a more realistic comparison, I compared it with still.ps, a program which redefines the graphics operators of PostScript to output distilled PostScript instead of drawing the picture itself. The PostScript system used for comparison is Harlequin's ScriptWorks, and is run on the same kind of computer as Platypus89 was for these tests---again, a Solbourne SPARC system. The results of this benchmark are shown below:

This gives a performance of 42% of that of the> ScriptWorks interpreter; as with the Lisp, this is well within the 10% aimed for at the start of the> project.

The commonly used PostScript benchmarks are not particularly relevant here, as they are chosen primarily to test the graphics system attached to the PostScript language, rather than the language itself. The program used here was chosen for its complex control flow; it uses a simple picture drawn with a graphic tools, with hand-written program for repeating that picture in a tessellating pattern.

Performance with extra levels of interpretation

Performance testing of the reflective facilities of Platypus89 is, of necessity, largely self-referential. Here are the results of adding one level of interpretation in various versions of the system:

The times are in seconds, and the column ``pl/int ratio'' gives the ratio of speeds between Platypus shadowing the application's evaluator and operators directly, and it interpreting them. Likewise, ``snark/boojum'' ratio gives the ratio of speeds for snark evaluating the benchmarks directly, and snark evaluating boojum evaluating the benchmarks, and so is equivalent to the pl/int ratio of the non-


snark

-based system.

Some of these figures seem anomalous, in that adding a level of interpretation improves the performance. This is not an impossibility; since a shadowed procedure and its shadow are two different extensions (implementations) of the same intension (specification) it is possible, for example, for a shadow to be less efficient than the interpreted procedure it shadows. In particular, the meta-evaluator has been optimized (by helping the substrate compiler by giving it more type information) especially heavily. However, the results are strange in places. They cannot be explained away simply as timing inconsistencies, since timing with a resolution of 10mS over a total of 10S gives an accuracy of 0.1% ---and some of the speed ratios are on the> mysterious side by twice this much.

A considerable loss in performance might be expected from adding levels of interpretation between the application and the meta-evaluator. However, since the meta-evaluator consists of several small procedures, of which not all need be changed for many practical uses of reflection, it need not all be interpreted, and so the overhead is lower than it would be for reflective architectures of coarser granularity of reflection. As the table shows, in practice the loss of efficiency is just a few percent. Experiments with interpretation of the entire evaluator showed a slowdown factor of around 40, so the fine-grained reflective changes yield a considerable advantage when reflection into the interpreter is brought into play, a the cost of extra procedure calls at all times, when compared with an evaluator using a single procedure with a typecase control structure to do this.

As may be expected, the boojum/snark-based system suffers less loss of performance on reflection, as less of the system is given another level of interpretation by each change. This is the positive side of the expected trade-off, but as explained on here, the negative side of the expected trade-off turned out to be another positive in the realized trade-off, and so, at least in this situation, the snark-based evaluator is the faster of the two.

Room for improvement?

An evaluator built around boojum and


snark

is highly parameterized, and thus inherently difficult for a compiler to optimize as well as it can an ad-hoc evaluator.

Changes to the levels of optimization allowed to the compiler, particularly in critical areas such as the evaluator itself, affected the performance by several percent. These changes include switching on or off various checks made by the compiled code, and adding type declarations that promise to the compiler that the expressions declared will be of specific types.

Although in general I expected code in which the same functionality was packed into fewer procedures (because of using fewer function calls), in practice smaller procedures allowed more effective use of CPU registers, and this version was slightly faster.

Time and space used

Profiling---where does the time go?

A statistical profile of the snark version of the evaluator, running the Griss tests, revealed that the following functions were on the top of the stack 1% of the time or more:>

The profile is taken by examining the stack on a regular timed interrupt. The third column indicates how many times the function was found anywhere on the stack (counting multiple times for multiple appearances) and the fourth shows this as a percentage---thus, something always present twice on the stack will show as being on the stack 200% of the time. The next two columns show the count and percentage for the procedure being on the top of the stack. Also noticeable in the profile was the system procedure


SYSTEM::DUMMY-STRUCTURE-ACCESSOR

, which implements access to defstructed data. It was called 166919089 times, but did not occupy the stack even 1% of the time, being called in such a> way that it does not show in its own right (as a machine-code subroutine, rather than as a conventional compiled procedure). gethash (disguised as lookup in the listings in section 11.1) occupies 11% of the time, which is much what I had expected.


snark

takes 35% of the time (including the time> spent in structure accessors, because of the calling structure of the substrate system---they do not appear on the stack in their own right). SYSTEM::ARG-IS-SAFE is part of the implementation of Common Lisp's apply---it checks that the last argument (the list of further arguments to the function) is not a circular list. (This has since been made more efficient, which would affect both the timings and the profiling; but the newer version of the substrate system inlines many functions that we are interested in the appearance of on the profile, and so is not used here.)

The functions named |frame-x| are shadow procedures for the operators x. These are all Lisp-like operators, which evaluate their arguments. The time for the evaluation of these arguments is included in the time for which their procedures are on the stack, since the argument evaluator is called from within the |frame- x| procedure. (See section 11.6 for details of how the |frame-x| procedures are defined.)

Code size and distribution

The Platypus89 system is implemented in around 5000 lines of Lisp, not including the large comments that make up the running text in sections that are presented in this thesis. Since the core of the system (as presented in section 7.3) is so simple, this leaves a lot to be accounted for. Where does all the bulk go?

A large piece of the system---about one third of the total---is the shadow procedures for the base language, and another one-sixth sets up the implementation of the in-tower equivalents of these---the shadowed procedures. Thus, about half of the system is the implementation of the base language, rather than of the generic language framework.

One-sixth of the code defines the data structures used in the tower, and accessor functions and macros for them. Another one-sixth defines syntactic extensions to Lisp, and defining forms for setting up procedures, operators and shadowing. About half of the remaining one sixth is the actual core of the Platypus89 evaluator, and the rest is a variety of miscellaneous support functions.

I consider this to be reasonably concise for an implementation of a language framework and reasonable subsets of two real programming languages, although it is in part parasitic on its substrate language, Common Lisp, of which, for example, the garbage collector is needed.

Further routines needed to make a standalone system

Platypus89, as it stands, relies on the substrate Lisp system for several runtime facilities, as well as for compiling the real meta-evaluator. In particular, it needs the storage allocator, the garbage collector, and the input/output system.

One thing not present in Platypus89 is a complete type definition system, such as defstruct in Common Lisp. This would have to be provided---but it can, of course, be provided by an application program and reflected in.

Platypus89 also depends on the substrate system for storage management, that is, allocation and garbage collection. C-Platypus provided these itself, which was complicated by the meta-evaluator being effectively in a separate address space from the evaluator, so the garbage collector (which was of the stop-and-copy variety) had to update the meta-evaluator's pointers into the heap as a separate action from normal block movements.

Changes needed to support non lisp-like languages

When I first wrote Platypus89, its Lisp-like base language was the only language which it ran. On adding PostScript, I changed some parts of the common meta-evaluator to make this easier. The most significant of these changes was the reversing of the order of the variables in each apparent frame of the stack, so that the first argument (from Lisp's point of view) is also the first argument from PostScript's point of view. This change was not connected at all with reflection or evaluation, but purely with mixed-language operation.

General results

Most procedural languages map readily onto reflective systems. The worst mismatch between conventional languages and a reflective interpreter is that it is natural to design reflective languages such that all calls are reflective---or rather, such that all procedures take the tower state as their argument, and return a new state---which does not map well to conventional procedure calls. With hindsight, I can see that this problem could have been tackled with towers and meta-towers.

Parameterized evaluators and meta-evaluators prove to be remarkably simple, particularly when each has a recurring part that may be isolated into a procedure in its own right. These procedures---representable as 4 and 12 lines of Lisp respectively---may be seen as refined representations of what evaluation and meta-evaluation are. Having been found, they seem intuitively very natural procedural realizations of these functions.

Debugging on a reflective system

A reflective system provides access to more information than a conventional system, and the amount of information displayed when tracing execution, getting backtraces and inspecting values turned out to be to voluminous for many purposes. However, debugging in terms of a specific language or situation can be done with more selective display of information.

The information provided is also in terms of Platypus' own model of execution, and may be at a lower level (in conventional terms, not reflective!) than an application programmer might expect. The raw data may need to be processed back into terms of the original language---which could be regarded as a form of decompilation, although not all of the data provided is program code. This decompilation will also reduce the bulk of data to be examined; for example, the insides of languages might not be shown when working in the context of a single language.

Decompilation of reified data should be quite general, and capable of producing sensible debugging printout into each language from any language wherever the equivalent facilities are available in both. This makes it possible, for example, for someone working in Lisp to debug a PostScript program without having to know PostScript syntax. The PostScript can be displayed as Lisp, and the interpreted definitions of the PostScript operators may be used as explanations of the PostScript language.

What debugging techniques were useful?

A major problem with debugging any of the Platypus implementations was the amount, or complexity, of the data being handled. A full backtrace of the stack, with all procedure arguments being printed, was too voluminous to be manageable, although there were a few occasions when it proved necessary. The bulk was partly because of the depth of the call stack, with many calls to each of the central evaluator routines active at any one time, and partly because of the size of the printed representation of each of the objects. The most useful part of a level to print out was the top entry of its call stack---a closure---and the most useful parts of a closure were its expression (printed as its procedure expression, with a marker pointing at the sub-expression that is currently the current expression) and its closure number, and the closure number of its evaluator. The closure number field was originally introduced for debugging, but subsequently also used for other things; it is a number that is issued from a counter every time a closure is allocated, and so identifies closures uniquely in their printed representation, making it visible to the programmer whether a closure is eq to some other closure.

At some stages of the work, it was useful to make the meta-evaluator print when it was doing a level-shift, and funcall-helper print when it was called. More general tracing than this was too voluminous to be helpful in most cases.

Backtraces of the meta-evaluator's (Common Lisp) stack were rarely useful, as the central meta-evaluator components called each other to such a depth of indirect recursion that it was difficult to work out, and to remember, what each invocation on the stack was there for.

A breakpoint operator, which used Common Lisp's break function, provided a read-eval-print loop in the substrate language, which was useful for investigating the tower contents interactively at various points in the tower's application's execution---this was often more convenient than bulky non-interactive printouts of the tower contents.

Occasionally it was useful to set a global variable in the substrate language to the value passed through a particular point in the meta-evaluator, so the last value that had appeared in that place could be inspected interactively at a breakpoint or after a crash.

An anecdotal result on resilience

On encountering a bug in gethash on the substrate Lisp system, which prevented the meta-evaluator from finding the shadow for an operator, the meta-evaluator went into deep recursion to interpret it from the shadowed definition (as distinct from the shadowing definition.) I spotted this while it was running, broke into the execution, patched around the


gethash

problem, and resumed execution from the break. The tower now shrunk back, shedding the unwanted levels of interpretation, and carried on as if nothing had been wrong.

Summary of results

Platypus has proved to be a practical interpretive system for Lisp-like languages, and promises to be able to run at speeds comparable to other, non-reflective, interpreted Lisp systems---it is already well within the factor of ten that I chose initially as a suitable limit for regarding this new interpretation technique as practical. There is certainly promise of being able to improve the performance, and I would expect to reach speeds similar to those of non-reflective, single-language interpreters. The usefulness of dynamic typing is clearly evident, and the appropriateness of implementing reflective interpretation by shadowing has been demonstrated.

Working with mixtures of two type systems, one for the substrate language and one for the tower, was particularly confusing, and this should either be avoided, or handled with careful planning, in future work in this area.

The amount of garbage generated is a potential problem, as all stack frames are, in principle, built on the heap. The problem is reduced by using the stack frames of the substrate system to store some of the data that might have otherwise needed more frames to be built for very short-term use.

Initializing the system is complicated by the number of cross-references and self-references that must be set up. It would be useful to have some program tools for generating (or gathering) the initialization code automatically from the rest of the source code.

It is possible to produce very compact versions of both the evaluator and the meta-evaluator, these being based on a pair of functions, boojum and snark, that appear to encapsulate the essence of evaluation and meta-evaluation respectively.

And I gave my heart my heart to seek and search out by wisdom concerning all the things that are done under heaven: this sore travail hath God given to the sons of man to be exercised with.

Ecclesiastes 1:13

Previous: The meta-evaluator
Up: A Lisp through the Looking-Glass
Next: History and future; related work; observations

Footnotes

Distilled PostScript is the result of running a PostScript program with the low-level drawing operators redefined to output the PostScript code needed to call them, with the appropriate arguments. All co-ordinate system transformations have been done by the time they are called, and loops in the input program do not appear in the output---they have been unrolled by this processing. Thus, a distilled PostScript program consists purely of a sequence of drawing operators---no transformations, and no flow control beyond the implicit sequential execution.

Submitted, Examined And Accepted: 1991

Contact me