symbol table

2026-04-23 11:42:49 -06:00 · 2004-09-20 14:28:52 +00:00
parent 07464264da
commit 6afcb5009a
1 changed files with 41 additions and 26 deletions
--- a/examples/gfcc/complin.tex
+++ b/examples/gfcc/complin.tex
@@ -678,8 +678,30 @@ Compositionality also prevents optimizations during linearization
 by clever instruction selection, elimination of superfluous
 labels and jumps, etc.
-It would of course be possible to implement the compiler
+One way to achieve compositional JVM linearization would be
-back end in GF in the traditional way, as a noncompositional
+to change the abstract syntax
 so that variables do not only carry a string with them but
 also a relative address. This would certainly be possible
 with dependent types; but it would clutter the abstract
 syntax in a way that is hard to motivate when we are in
 the business of describing the syntax of C. The abstract syntax would
 have to, so to say, anticipate all demands of the compiler's
 target languages. 
 In fact, translation systems for natural
 languages have similar problems. For instance, to translate
 the English pronoun \eex{you} to German, you have to choose
 between \eex{du, ihr, Sie}; for Italian, there are four
 variants, and so on. All semantic distinctions
 made in any of the involved languages have to be present
 in the common abstract syntax. The usual solution to 
 this problem is \empha{transfer}: you do not just linearize
 the same syntax tree, but define a function that translates
 the trees of one language into the trees of another.
 Using transfer in the compiler
 back end is precisely what traditional compilers do.
 The transfer function in our case would be a noncompositional
 function from the abstract syntax of C to a different abstract
 syntax of JVM. The abstract syntax notation of GF permits
 definitions of functions, and the GF interpreter can be used
@@ -692,27 +714,20 @@ for evaluating terms into normal form. Thus one could write
    transStm env (Assign typ var exp rest) = ... 
 \end{verbatim}
 This would be cumbersome in practice, because
-GF does not have facilities like built-in lists and tuples, 
+GF does not have programming-language facilities 
-or monads. Of course, the compiler could no longer be
+like built-in lists and tuples, or monads. Of course, 
-inverted into a decompiler, in the way true linearization
+the compiler could no longer be inverted into a decompiler, 
-can be inverted into a parser.
+in the way true linearization can be inverted into a parser.
-Yet another possibility is to change the abstract syntax
+One more idea would be to hard-code some support
 so that variables do not only carry a string with them but
 also a relative address. This would certainly be possible
 with dependent types; but it would clutter the abstract
 syntax in a way that is hard to motivate when we are in
 the business of describing the syntax of C.
 Perhaps the key idea would be to hard-code some support
 for symbol tables into the extension of GF tuned for
-compiler construction. For instance, the linearization
+compiler construction. For instance, the concrete syntax of HOAS
-of a binding could store, in addition to the variable
+could not only keep track of variable symbols but also
-symbol field \verb6.$06, an integer-valued fiels \verb6.#06.
+assign a unique index to each symbol.
-These fields correspond to an automatic renaming of variables
+Linearization to C could then use the symbols, as
-to \verb6x1, x2, x36,\ldots starting from the outermost one.
+in this paper, and linearization to JVM could use 
-Linearization to C could then use the \verb6.$06 field, as
+the indexes.
-in this paper, and linearization to JVM the \verb6.#06 field.
+
@@ -753,7 +768,8 @@ semantics that is actually used in the implementation.
 \section{Conclusion}
-We managed to compile a large subset of C, and growing it
+We have managed to compile a representative 
 subset of C to JVM, and growing it
 does not necessarily pose any new kinds of problems. 
 Using HOAS and dependent types to describe the abstract
 syntax of C works fine, and defining the concrete syntax
@@ -765,17 +781,16 @@ The parser generated by GF is not able to parse all
 source programs, because some cyclic parse
 rules (of the form $C ::= C$) are generated from our grammar. 
 Recovery from cyclic rules is ongoing work in GF independently of this
-experiment. 
+experiment. For the time being, the interactive editor is the best way to
 For the time being, the interactive editor is the best way to
 construct C programs using our grammar.
 The most serious difficulty with using GF as a compiler tool
 is how to generate machine code by linearization if this depends on
-an evolving symbol table mapping variables to addresses.
+a symbol table mapping variables to addresses.
 Since the compositional linearization model of GF does not
 support this, we needed postprocessing to get real JVM code
 from the linearization result. The question is whether this problem can
-be solved by some simple and natural new feature of GF.
+be solved by some simple and natural extension of GF.