Growing an Abstract Grammar: Teaching Language Engineering
Abstract grammars are neglected resources in language processor implementations. In the most favourable case they are used to format first-class program representations. In the worst case they serve as a temporary interface between compiler phases. But they can enable so much more … certainly in teaching.
In this presentation we report on a long-running experiment (>5 y.) to develop a language interpreter that is maximally supported by an extensible (abstract) grammar. The context of the experiment is an advanced course on Programming Language Engineering. The reference language is a simplified variation on Scheme, so: no objects in this story.
In this course abstract grammars serve as backbone for material ranging from formal language specifications to low-level implementation with an eye for optimisation. In order to do so, we require that an instance of an abstract grammar be first class, and that all of its attributes should be setable and getable from within any program that is associated with this instance. Depending on the level of detail at which its semantics are captured in the abstract grammar, this regulates the depth at which the program can reflect over its specification. Nothing new here, this is lisp and s-expressions, only more so.
A central idea to this notion of a rich abstract grammar, is a unified memory model. At a basic level no distinction is made between stacks, heaps, frames &c. concerning their residence in memory. This of course raises potential performance issues about memory management — but a sufficiently powerful garbage collector and various caching and inlining tactics go a long way in mitigating this concern. We will consider what it takes to explain memory models and garbage collection at a sufficient level of detail to investigate performance issues. We proceed with s-expressions and grow this in successive steps to describe the various structures employed by a language interpreter. Considering that the eval operation should ultimately map the grammar onto itself, the obvious ones are computational values that do not correspond with literals, such as closures and continuations. But with the introduction of lexical addressing, we should also include frames and environments; and if the interpretation strategy is based on a transformation into continuation passing style (as is the case here), structures resulting from lambda-lifting should be considered. However, the most interesting extensions to the abstract grammar are related to optimisations: tail call optimisation, inlining, prevalent function call patterns, &c.
This approach proved to be an interesting setting to expose graduate students to the vagaries low level language processor implementations. But it has also been suitable as a platform for sophisticated experiments with optimisations for language interpreters.
Mon 18 Jul
|13:50 - 14:50|
|14:50 - 15:20|
|Media Attached File Attached|