conclusion

This commit is contained in:
aarne
2004-09-20 11:51:44 +00:00
parent c8c9e517c1
commit 07464264da

View File

@@ -663,6 +663,60 @@ int main () { call ilt ; call ilt ;
\subsection{How to restore code generation by linearization}
SInce postprocessing is needed, we have not quite achieved
the goal of code generation as linearization.
If linearization is understood in the
sense of GF. In GF, linearization rules must be
compositional, and can only depend on parameters from
finite parameter sets. Hence it is not possible to encode
linearization with updates to and lookups from a symbol table,
as is usual in code generation.
Compositionality also prevents optimizations during linearization
by clever instruction selection, elimination of superfluous
labels and jumps, etc.
It would of course be possible to implement the compiler
back end in GF in the traditional way, as a noncompositional
function from the abstract syntax of C to a different abstract
syntax of JVM. The abstract syntax notation of GF permits
definitions of functions, and the GF interpreter can be used
for evaluating terms into normal form. Thus one could write
\begin{verbatim}
fun
transStm : Env -> Stm -> EnvInstr ;
def
transStm env (Decl typ rest) = ...
transStm env (Assign typ var exp rest) = ...
\end{verbatim}
This would be cumbersome in practice, because
GF does not have facilities like built-in lists and tuples,
or monads. Of course, the compiler could no longer be
inverted into a decompiler, in the way true linearization
can be inverted into a parser.
Yet another possibility is to change the abstract syntax
so that variables do not only carry a string with them but
also a relative address. This would certainly be possible
with dependent types; but it would clutter the abstract
syntax in a way that is hard to motivate when we are in
the business of describing the syntax of C.
Perhaps the key idea would be to hard-code some support
for symbol tables into the extension of GF tuned for
compiler construction. For instance, the linearization
of a binding could store, in addition to the variable
symbol field \verb6.$06, an integer-valued fiels \verb6.#06.
These fields correspond to an automatic renaming of variables
to \verb6x1, x2, x36,\ldots starting from the outermost one.
Linearization to C could then use the \verb6.$06 field, as
in this paper, and linearization to JVM the \verb6.#06 field.
\section{Related work}
The theoretical ideas behind our compiler experiment
@@ -709,45 +763,19 @@ desired for things like literals and precedences.
The parser generated by GF is not able to parse all
source programs, because some cyclic parse
rules (of the form $C ::= C$) are generated. Recovery from cyclic
rules is ongoing work in GF independently of this
rules (of the form $C ::= C$) are generated from our grammar.
Recovery from cyclic rules is ongoing work in GF independently of this
experiment.
For the time being, the interactive editor is the best way to
construct C programs using our grammar.
The most serious problem is that the idea of compilation
as linearization does not quite work in the generation
of JVM code, if linearization is understood in the
sense of GF. In GF, linearization rules must be
compositional, and can only depend on parameters from
finite parameter sets. Hence it is not possible to encode
linearization with updates to and lookups from a symbol table,
as is usual in code generation.
Compositionality also prevents optimizations during linearization
by clever instruction selection, elimination of superfluous
labels and jumps, etc.
It would of course be possible to implement the compiler
back end in GF in the traditional way, as a noncompositional
function from the abstract syntax of C to a different abstract
syntax of JVM. The abstract syntax notation of GF permits
definitions of functions, and the GF interpreter can be used
for evaluating terms into normal form. Thus one could write
\begin{verbatim}
fun
transStm : Env -> Stm -> EnvInstr ;
def
transStm env (Decl typ rest) = ...
transStm env (Assign typ var exp rest) = ...
\end{verbatim}
This would be cumbersome in practice, because
GF does not have facilities like built-in lists and tuples,
or monads. Of course, the compiler could no longer be
inverted into a decompiler, in the way true linearization
can be inverted into a parser.
The most serious difficulty with using GF as a compiler tool
is how to generate machine code by linearization if this depends on
an evolving symbol table mapping variables to addresses.
Since the compositional linearization model of GF does not
support this, we needed postprocessing to get real JVM code
from the linearization result. The question is this problem can
be solved by some simple and natural new feature of GF.