The *G-Machine*
===============

**********
Motivation
**********

Our initial model, the *Template Instantiator* (TI), was a very straightforward
solution to compilation, but its core design has a major Achilles' heel:
compilation is interleaved with evaluation. The heap nodes for supercombinators
hold uninstantiated expressions, i.e. raw ASTs straight from the parser. When a
supercombinator is found on the stack during evaluation, the template expression
is instantiated (compiled) on the spot.

.. math::
   \transrule
   { a_0 : a_1 : \ldots : a_n : s
   & d
   & h
   \begin{bmatrix}
        a_0 : \mathtt{NSupercomb} \; [x_1,\ldots,x_n] \; e
   \end{bmatrix}
   & g
   }
   { a_n : s
   & d
   & h'
   & g
   \\
   & \SetCell[c=3]{c}
   \text{where } h' = \mathtt{instantiateU} \; e \; a_n \; h \; g
   }

The process of instantiating a supercombinator goes something like this:

1. Augment the environment with bindings to the arguments.

2. Using the local augmented environment, instantiate the supercombinator body
   on the heap.

3. Remove the nodes applying the supercombinator to its arguments from the
   stack.

4. Push the address of the newly instantiated body onto the stack.

.. literalinclude:: /../../src/TI.hs
   :dedent:
   :start-after: -- >> [ref/scStep]
   :end-before: -- << [ref/scStep]
   :caption: src/TI.hs

Instantiating the supercombinator's body in this way is the root of our
Achilles' heel. Traversing a tree structure is a very non-linear task, unfit for
an assembly target. The goal of our new G-Machine is to compile a *linear
sequence of instructions* that instantiates the expression when executed.

**************************
Trees and Vines, in Theory
**************************

WIP. state transition rules

Core Transition Rules
---------------------

1. Look up a global by name and push its value onto the stack

   .. math::
      \gmrule
      { \mathtt{PushGlobal} \; f : i
      & s
      & h
      & m
      \begin{bmatrix}
           f : a
      \end{bmatrix}
      }
      { i
      & a : s
      & h
      & m
      }

2. Allocate an int node on the heap, and push the address of the newly created
   node onto the stack

   .. math::
      \gmrule
      { \mathtt{PushInt} \; n : i
      & s
      & h
      & m
      }
      { i
      & a : s
      & h
      \begin{bmatrix}
           a : \mathtt{NNum} \; n
      \end{bmatrix}
      & m
      }

3. Allocate an application node on the heap, applying the top of the stack to
   the address directly below it. The address of the application node is pushed
   onto the stack.

   .. math::
      \gmrule
      { \mathtt{MkAp} : i
      & f : x : s
      & h
      & m
      }
      { i
      & a : s
      & h
      \begin{bmatrix}
           a : \mathtt{NAp} \; f \; x
      \end{bmatrix}
      & m
      }

4. Push a function's argument onto the stack

   .. math::
      \gmrule
      { \mathtt{Push} \; n : i
      & a_0 : \ldots : a_n : s
      & h
      & m
      }
      { i
      & a_n : a_0 : \ldots : a_n : s
      & h
      & m
      }

5. Tidy up the stack after instantiating a supercombinator

   .. math::
      \gmrule
      { \mathtt{Slide} \; n : i
      & a_0 : \ldots : a_n : s
      & h
      & m
      }
      { i
      & a_0 : s
      & h
      & m
      }

6. If a number is on top of the stack, :code:`Unwind` leaves the machine in a
   halt state

   .. math::
      \gmrule
      { \mathtt{Unwind} : \nillist
      & a : s
      & h
      \begin{bmatrix}
           a : \mathtt{NNum} \; n
      \end{bmatrix}
      & m
      }
      { \nillist
      & a : s
      & h
      & m
      }

7. If an application is on top of the stack, :code:`Unwind` continues unwinding

   .. math::
      \gmrule
      { \mathtt{Unwind} : \nillist
      & a : s
      & h
      \begin{bmatrix}
           a : \mathtt{NAp} \; f \; x
      \end{bmatrix}
      & m
      }
      { \mathtt{Unwind} : \nillist
      & f : a : s
      & h
      & m
      }

8. When a supercombinator is on top of the stack (and the correct number of
   arguments has been provided), :code:`Unwind` sets up the stack and jumps to
   the supercombinator's code (:math:`\beta`-reduction)

   .. math::
      \gmrule
      { \mathtt{Unwind} : \nillist
      & a_0 : \ldots : a_n : s
      & h
      \begin{bmatrix}
           a_0 : \mathtt{NGlobal} \; n \; c \\
           a_1 : \mathtt{NAp} \; a_0 \; e_1 \\
           \vdots \\
           a_n : \mathtt{NAp} \; a_{n-1} \; e_n \\
      \end{bmatrix}
      & m
      }
      { c
      & e_1 : \ldots : e_n : a_n : s
      & h
      & m
      }

9. Pop the stack, and update the nth node to point to the popped address

   .. math::
      \gmrule
      { \mathtt{Update} \; n : i
      & e : f : a_1 : \ldots : a_n : s
      & h
      \begin{bmatrix}
           a_1 : \mathtt{NAp} \; f \; e \\
           \vdots \\
           a_n : \mathtt{NAp} \; a_{n-1} \; e_n
      \end{bmatrix}
      & m
      }
      { i
      & f : a_1 : \ldots : a_n : s
      & h
      \begin{bmatrix}
           a_n : \mathtt{NInd} \; e
      \end{bmatrix}
      & m
      }

10. Pop :math:`n` addresses off the stack.

    .. math::
       \gmrule
       { \mathtt{Pop} \; n : i
       & a_1 : \ldots : a_n : s
       & h
       & m
       }
       { i
       & s
       & h
       & m
       }

11. Follow indirections while unwinding

    .. math::
       \gmrule
       { \mathtt{Unwind} : \nillist
       & a : s
       & h
       \begin{bmatrix}
            a : \mathtt{NInd} \; a'
       \end{bmatrix}
       & m
       }
       { \mathtt{Unwind} : \nillist
       & a' : s
       & h
       & m
       }

12. Allocate uninitialised heap space

    .. math::
       \gmrule
       { \mathtt{Alloc} \; n : i
       & s
       & h
       & m
       }
       { i
       & a_1 : \ldots : a_n : s
       & h
       \begin{bmatrix}
            a_1 : \mathtt{NUninitialised} \\
            \vdots \\
            a_n : \mathtt{NUninitialised} \\
       \end{bmatrix}
       & m
       }

Extension Rules
---------------

1. A sneaky trick to enable sharing of :code:`NNum` nodes. We note that the
   global environment is a mapping of :code:`Name` objects (i.e. identifiers) to
   heap addresses. Strings of digits are not considered valid identifiers! We
   abuse this by modifying Core Rule 2 to update the global environment with the
   new node's address. Consider how this rule might impact garbage collection
   (remember that the environment is intended for *globals*).

   .. math::
      \gmrule
      { \mathtt{PushInt} \; n : i
      & s
      & h
      & m
      }
      { i
      & a : s
      & h
      \begin{bmatrix}
           a : \mathtt{NNum} \; n
      \end{bmatrix}
      & m
      \begin{bmatrix}
           n' : a
      \end{bmatrix}
      \\
      \SetCell[c=5]{c}
      \text{where $n'$ is the base-10 string rep. of $n$}
      }

2. For Extension Rule 1 to be effective, we must also take action when a number
   already exists in the environment:

   .. math::
      \transrule
      { \mathtt{PushInt} \; n : i
      & s
      & h
      & m
      \begin{bmatrix}
           n' : a
      \end{bmatrix}
      }
      { i
      & a : s
      & h
      & m
      \\
      \SetCell[c=5]{c}
      \text{where $n'$ is the base-10 string rep. of $n$}
      }

**************************
Evaluation: Slurping Vines
**************************

WIP.

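
Core Rules 1 through 8 can be read almost directly as a step function. The
following is a minimal sketch under stated assumptions, not the contents of
``src/GM.hs``: the names :code:`GmState`, :code:`alloc`, :code:`dispatch`,
:code:`unwind`, :code:`run`, and :code:`result` are hypothetical, the heap is
modelled as a plain :code:`Data.Map`, and :code:`Update`, :code:`Pop`,
:code:`Alloc`, and indirections are left out. The example program is the
hand-assembled code for ``id x = x; main = id 3``.

.. code-block:: haskell

   module Main where

   import qualified Data.Map as M

   type Addr = Int
   type Name = String

   data Instr
     = PushGlobal Name | PushInt Int | Push Int
     | MkAp | Slide Int | Unwind
     deriving (Show, Eq)

   data Node
     = NNum Int
     | NAp Addr Addr
     | NGlobal Int [Instr]        -- arity and code
     deriving (Show, Eq)

   -- The four components of a transition rule: code, stack, heap, globals.
   data GmState = GmState
     { gmCode  :: [Instr]
     , gmStack :: [Addr]
     , gmHeap  :: M.Map Addr Node
     , gmEnv   :: M.Map Name Addr
     }

   -- Assumption: nothing is ever freed, so the heap size is a fresh address.
   alloc :: Node -> M.Map Addr Node -> (Addr, M.Map Addr Node)
   alloc node h = (a, M.insert a node h)
     where a = M.size h

   -- Execute one instruction; Nothing means the code is empty (halt state).
   step :: GmState -> Maybe GmState
   step st = case gmCode st of
     []     -> Nothing
     i : is -> Just (dispatch i st { gmCode = is })

   dispatch :: Instr -> GmState -> GmState
   dispatch (PushGlobal f) st =                       -- rule 1
     st { gmStack = (gmEnv st M.! f) : gmStack st }
   dispatch (PushInt n) st =                          -- rule 2
     let (a, h') = alloc (NNum n) (gmHeap st)
     in st { gmStack = a : gmStack st, gmHeap = h' }
   dispatch MkAp st = case gmStack st of              -- rule 3
     f : x : s -> let (a, h') = alloc (NAp f x) (gmHeap st)
                  in st { gmStack = a : s, gmHeap = h' }
     _         -> error "MkAp: stack underflow"
   dispatch (Push n) st =                             -- rule 4
     st { gmStack = (gmStack st !! n) : gmStack st }
   dispatch (Slide n) st = case gmStack st of         -- rule 5
     a : s -> st { gmStack = a : drop n s }
     []    -> error "Slide: empty stack"
   dispatch Unwind st = unwind st

   unwind :: GmState -> GmState
   unwind st = case gmStack st of
     []    -> error "Unwind: empty stack"
     a : s -> case gmHeap st M.! a of
       NNum _  -> st { gmCode = [] }                           -- rule 6
       NAp f _ -> st { gmCode = [Unwind], gmStack = f : a : s } -- rule 7
       NGlobal n c                                             -- rule 8
         | length s < n -> error "Unwind: too few arguments"
         | otherwise ->
             -- Replace the n application nodes by their arguments,
             -- keeping the root of the redex (a_n) underneath.
             let args = [ x | NAp _ x <- map (gmHeap st M.!) (take n s) ]
             in st { gmCode = c, gmStack = args ++ drop n (a : s) }

   run :: GmState -> GmState
   run st = maybe st run (step st)

   -- The node on top of the stack, if any.
   result :: GmState -> Maybe Node
   result st = case gmStack st of
     a : _ -> Just (gmHeap st M.! a)
     []    -> Nothing

   initialState :: GmState
   initialState = GmState
     { gmCode  = [PushGlobal "main", Unwind]
     , gmStack = []
     , gmHeap  = M.fromList [(0, NGlobal 1 idCode), (1, NGlobal 0 mainCode)]
     , gmEnv   = M.fromList [("id", 0), ("main", 1)]
     }
     where
       idCode   = [Push 0, Slide 2, Unwind]
       mainCode = [PushInt 3, PushGlobal "id", MkAp, Slide 1, Unwind]

   main :: IO ()
   main = print (result (run initialState))  -- prints: Just (NNum 3)

Each clause of :code:`dispatch` and :code:`unwind` corresponds to one rule, so
extending the sketch with the remaining rules amounts to adding constructors
and clauses.
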
Laziness
--------

WIP.

* Instead of :code:`Slide (n+1); Unwind`, do :code:`Update n; Pop n; Unwind`

****************************
Compilation: Squashing Trees
****************************

WIP.

Notice that we do not keep a (local) environment at run-time. The environment
only exists at compile-time to map local names to stack indices. When compiling
a supercombinator, the arguments are enumerated from zero (the top of the
stack), and passed to :code:`compileR` as an environment.

.. literalinclude:: /../../src/GM.hs
   :dedent:
   :start-after: -- >> [ref/compileSc]
   :end-before: -- << [ref/compileSc]
   :caption: src/GM.hs

Of course, variables being indexed relative to the top of the stack means that
they will become inaccurate the moment we push or pop the stack a single time.
The way around this is quite simple: simply offset the stack when w

.. literalinclude:: /../../src/GM.hs
   :dedent:
   :start-after: -- >> [ref/compileC]
   :end-before: -- << [ref/compileC]
   :caption: src/GM.hs
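
To make the index problem concrete, here is a small self-contained sketch of
one common way to keep stack indices accurate: shift every index in the
environment by one for each value pushed while compiling an application. It is
an illustration under assumptions, not the code included above. The
:code:`Expr` type is cut down to variables, numbers, and applications, the
helper :code:`argOffset` is a hypothetical name, and the trailing
:code:`Slide (n+1); Unwind` is the non-lazy variant.

.. code-block:: haskell

   module Main where

   type Name = String

   -- Assumption: a reduced expression type, enough to show the offsetting.
   data Expr
     = Var Name
     | Num Int
     | App Expr Expr
     deriving (Show, Eq)

   data Instr
     = PushGlobal Name | PushInt Int | Push Int
     | MkAp | Slide Int | Unwind
     deriving (Show, Eq)

   -- Compile-time only: maps local names to stack indices.
   type Env = [(Name, Int)]

   -- Arguments are enumerated from zero, the top of the stack.
   compileSc :: (Name, [Name], Expr) -> (Name, Int, [Instr])
   compileSc (name, args, body) =
     (name, length args, compileR body (zip args [0 ..]))

   -- Build the body, then tidy the stack and continue unwinding.
   compileR :: Expr -> Env -> [Instr]
   compileR e env = compileC e env ++ [Slide (length env + 1), Unwind]

   compileC :: Expr -> Env -> [Instr]
   compileC (Var v) env = case lookup v env of
     Just n  -> [Push n]         -- local: an argument on the stack
     Nothing -> [PushGlobal v]   -- otherwise assume a global
   compileC (Num n) _ = [PushInt n]
   compileC (App f x) env =
     -- Compiling x leaves one extra address on the stack, so every local
     -- index seen while compiling f is one deeper than before.
     compileC x env ++ compileC f (argOffset 1 env) ++ [MkAp]

   -- Shift every stack index in the environment by k.
   argOffset :: Int -> Env -> Env
   argOffset k env = [ (v, n + k) | (v, n) <- env ]

   main :: IO ()
   main = print (compileSc ("twice", ["f", "x"], body))
     where body = App (Var "f") (App (Var "f") (Var "x"))
   -- prints: ("twice",2,[Push 1,Push 1,MkAp,Push 1,MkAp,Slide 3,Unwind])

In the output, :code:`f` starts at index 0 but is always fetched with
:code:`Push 1`, because each time it is compiled exactly one argument has
already been pushed above it.
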