rlp/docs/src/commentary/gm.rst
2023-12-01 14:43:40 -07:00

The *G-Machine*
===============
**********
Motivation
**********
Our initial model, the *Template Instantiator* (TI), was a very
straightforward solution to compilation, but its core design has a major
Achilles' heel: compilation is interleaved with evaluation. The heap nodes for
supercombinators hold uninstantiated expressions, i.e. raw ASTs straight from
the parser. When a supercombinator is found on the stack during evaluation, the
template expression is instantiated (compiled) on the spot.
.. math::
\transrule
{ a_0 : a_1 : \ldots : a_n : s
& d
& h
\begin{bmatrix}
a_0 : \mathtt{NSupercomb} \; [x_1,\ldots,x_n] \; e
\end{bmatrix}
& g
}
{ a_n : s
& d
& h'
& g
\\
& \SetCell[c=3]{c}
\text{where } h' = \mathtt{instantiateU} \; e \; a_n \; h \; g
}
The process of instantiating a supercombinator goes something like this:
1. Augment the environment with bindings to the arguments.
2. Using the local augmented environment, instantiate the supercombinator body
on the heap.
3. Remove the nodes applying the supercombinator to its arguments from the
stack.
4. Push the address to the newly instantiated body onto the stack.
.. literalinclude:: /../../src/TI.hs
:dedent:
:start-after: -- >> [ref/scStep]
:end-before: -- << [ref/scStep]
:caption: src/TI.hs
Instantiating the supercombinator's body in this way is the root of our
Achilles' heel. Traversing a tree structure is an inherently non-linear task,
ill-suited to an assembly target. The goal of our new G-Machine is to compile a
*linear sequence of instructions* which instantiates the expression when
executed.
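
To make the contrast concrete, here is a minimal sketch of what such a linear
sequence might look like. The constructor names follow the transition rules
given later in this chapter; the :code:`Instr` type and the :code:`kCode`
example are illustrative assumptions, not the definitions in
:code:`src/GM.hs`.

```haskell
-- Hypothetical instruction set, named after the transition rules in this
-- chapter. The real definitions in src/GM.hs may differ.
type Name = String

data Instr
    = PushGlobal Name
    | PushInt Int
    | Push Int
    | MkAp
    | Slide Int
    | Unwind
    deriving (Show, Eq)

type Code = [Instr]

-- Assumed compilation of the supercombinator `k x y = x` (arity 2):
-- push the first argument, slide away the two arguments and the redex root
-- (Slide (n+1) with n = 2), then continue unwinding.
kCode :: Code
kCode = [Push 0, Slide 3, Unwind]
```

Where the TI walks the body of :code:`k` as a tree every time it is applied,
the G-Machine would simply run these three instructions in order.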
**************************
Trees and Vines, in Theory
**************************
WIP. state transition rules
Core Transition Rules
---------------------
1. Look up a global by name and push its address onto the stack
.. math::
\gmrule
{ \mathtt{PushGlobal} \; f : i
& s
& h
& m
\begin{bmatrix}
f : a
\end{bmatrix}
}
{ i
& a : s
& h
& m
}
2. Allocate an int node on the heap, and push the address of the newly created
node onto the stack
.. math::
\gmrule
{ \mathtt{PushInt} \; n : i
& s
& h
& m
}
{ i
& a : s
& h
\begin{bmatrix}
a : \mathtt{NNum} \; n
\end{bmatrix}
& m
}
3. Allocate an application node on the heap, applying the top of the stack to
the address directly below it. The address of the application node is pushed
onto the stack.
.. math::
\gmrule
{ \mathtt{MkAp} : i
& f : x : s
& h
& m
}
{ i
& a : s
& h
\begin{bmatrix}
a : \mathtt{NAp} \; f \; x
\end{bmatrix}
& m
}
4. Push a function's argument (the address at stack offset :math:`n`) onto the
stack
.. math::
\gmrule
{ \mathtt{Push} \; n : i
& a_0 : \ldots : a_n : s
& h
& m
}
{ i
& a_n : a_0 : \ldots : a_n : s
& h
& m
}
5. Tidy up the stack after instantiating a supercombinator: keep the top
address and discard the :math:`n` addresses below it
.. math::
\gmrule
{ \mathtt{Slide} \; n : i
& a_0 : \ldots : a_n : s
& h
& m
}
{ i
& a_0 : s
& h
& m
}
6. If a number is on top of the stack, :code:`Unwind` leaves the machine in a
halt state
.. math::
\gmrule
{ \mathtt{Unwind} : \nillist
& a : s
& h
\begin{bmatrix}
a : \mathtt{NNum} \; n
\end{bmatrix}
& m
}
{ \nillist
& a : s
& h
& m
}
7. If an application is on top of the stack, :code:`Unwind` continues unwinding
.. math::
\gmrule
{ \mathtt{Unwind} : \nillist
& a : s
& h
\begin{bmatrix}
a : \mathtt{NAp} \; f \; x
\end{bmatrix}
& m
}
{ \mathtt{Unwind} : \nillist
& f : a : s
& h
& m
}
8. When a supercombinator is on top of the stack (and the correct number of
arguments have been provided), :code:`Unwind` sets up the stack and jumps to
the supercombinator's code (:math:`\beta`-reduction)
.. math::
\gmrule
{ \mathtt{Unwind} : \nillist
& a_0 : \ldots : a_n : s
& h
\begin{bmatrix}
a_0 : \mathtt{NGlobal} \; n \; c \\
a_1 : \mathtt{NAp} \; a_0 \; e_1 \\
\vdots \\
a_n : \mathtt{NAp} \; a_{n-1} \; e_n \\
\end{bmatrix}
& m
}
{ c
& e_1 : \ldots : e_n : a_n : s
& h
& m
}
9. Pop the stack, and overwrite the node at stack offset :math:`n` with an
indirection to the popped address
.. math::
\gmrule
{ \mathtt{Update} \; n : i
& e : f : a_1 : \ldots : a_n : s
& h
\begin{bmatrix}
a_1 : \mathtt{NAp} \; f \; e \\
\vdots \\
a_n : \mathtt{NAp} \; a_{n-1} \; e_n
\end{bmatrix}
& m
}
{ i
& f : a_1 : \ldots : a_n : s
& h
\begin{bmatrix}
a_n : \mathtt{NInd} \; e
\end{bmatrix}
& m
}
10. Pop :math:`n` addresses off the stack
.. math::
\gmrule
{ \mathtt{Pop} \; n : i
& a_1 : \ldots : a_n : s
& h
& m
}
{ i
& s
& h
& m
}
11. Follow indirections while unwinding
.. math::
\gmrule
{ \mathtt{Unwind} : \nillist
& a : s
& h
\begin{bmatrix}
a : \mathtt{NInd} \; a'
\end{bmatrix}
& m
}
{ \mathtt{Unwind} : \nillist
& a' : s
& h
& m
}
12. Allocate uninitialised heap space
.. math::
\gmrule
{ \mathtt{Alloc} \; n : i
& s
& h
& m
}
{ i
& a_1 : \ldots : a_n : s
& h
\begin{bmatrix}
a_1 : \mathtt{NUninitialised} \\
\vdots \\
a_n : \mathtt{NUninitialised} \\
\end{bmatrix}
& m
}
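
The core rules above translate almost line for line into a small interpreter.
What follows is a sketch under stated assumptions rather than the code in
:code:`src/GM.hs`: the record :code:`GmState`, the helper :code:`alloc` (which
uses the heap size as the next free address, valid only because nothing is ever
freed), and the use of :code:`Maybe` to signal that no rule applies are all
hypothetical. The heap side conditions of Rule 9 are not checked.

```haskell
import qualified Data.Map as M

type Addr = Int
type Name = String

data Instr
    = PushGlobal Name | PushInt Int | MkAp | Push Int | Slide Int
    | Unwind | Update Int | Pop Int | Alloc Int
    deriving (Show, Eq)

data Node
    = NNum Int
    | NAp Addr Addr
    | NGlobal Int [Instr]        -- arity and code
    | NInd Addr
    | NUninitialised
    deriving (Show, Eq)

-- The four components of a rule, left to right: i, s, h, m.
data GmState = GmState
    { gmCode  :: [Instr]
    , gmStack :: [Addr]
    , gmHeap  :: M.Map Addr Node
    , gmEnv   :: M.Map Name Addr
    } deriving (Show, Eq)

-- Assumption: addresses are 0 .. size-1 and nothing is freed.
alloc :: Node -> M.Map Addr Node -> (Addr, M.Map Addr Node)
alloc node h = (a, M.insert a node h)
  where a = M.size h

-- Perform one transition; Nothing means no rule applies.
step :: GmState -> Maybe GmState
step st@(GmState code s h m) = case code of
    [] -> Nothing
    PushGlobal f : i -> do                                   -- rule 1
        a <- M.lookup f m
        Just st { gmCode = i, gmStack = a : s }
    PushInt n : i ->                                         -- rule 2
        let (a, h') = alloc (NNum n) h
        in Just st { gmCode = i, gmStack = a : s, gmHeap = h' }
    MkAp : i -> case s of                                    -- rule 3
        f : x : s' ->
            let (a, h') = alloc (NAp f x) h
            in Just st { gmCode = i, gmStack = a : s', gmHeap = h' }
        _ -> Nothing
    Push n : i                                               -- rule 4
        | n >= 0, n < length s -> Just st { gmCode = i, gmStack = (s !! n) : s }
        | otherwise            -> Nothing
    Slide n : i -> case s of                                 -- rule 5
        a0 : rest | n >= 0, length rest >= n ->
            Just st { gmCode = i, gmStack = a0 : drop n rest }
        _ -> Nothing
    Update n : i -> case s of                                -- rule 9
        e : rest | n >= 0, n < length rest ->
            Just st { gmCode = i, gmStack = rest
                    , gmHeap = M.insert (rest !! n) (NInd e) h }
        _ -> Nothing
    Pop n : i                                                -- rule 10
        | n >= 0, length s >= n -> Just st { gmCode = i, gmStack = drop n s }
        | otherwise             -> Nothing
    Alloc n : i ->                                           -- rule 12
        let addrs = [M.size h .. M.size h + n - 1]
            h'    = foldr (\a -> M.insert a NUninitialised) h addrs
        in Just st { gmCode = i, gmStack = addrs ++ s, gmHeap = h' }
    Unwind : _ -> case s of                                  -- rules 6, 7, 8, 11
        a : s' -> unwind a s'
        []     -> Nothing
  where
    unwind a s' = case M.lookup a h of
        Just (NNum _)  -> Just st { gmCode = [] }
        Just (NAp f _) -> Just st { gmCode = [Unwind], gmStack = f : a : s' }
        Just (NInd a') -> Just st { gmCode = [Unwind], gmStack = a' : s' }
        Just (NGlobal n c)
            | length s' >= n -> do
                es <- mapM argOf (take n s')
                Just st { gmCode = c, gmStack = es ++ drop n (a : s') }
        _ -> Nothing
    argOf addr = case M.lookup addr h of
        Just (NAp _ e) -> Just e
        _              -> Nothing

-- Step until the code is exhausted or the machine is stuck.
run :: GmState -> GmState
run st
    | null (gmCode st) = st
    | otherwise        = maybe st run (step st)
```

As a quick check, suppose a global :code:`id` of arity 1 is compiled to
:code:`[Push 0, Slide 2, Unwind]`. Running
:code:`[PushInt 3, PushGlobal "id", MkAp, Unwind]` in this sketch should halt
with a single address on the stack, pointing at :code:`NNum 3`.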
Extension Rules
---------------
1. A sneaky trick to enable sharing of :code:`NNum` nodes. We note that the
global environment is a mapping of :code:`Name` objects (i.e. identifiers) to
heap addresses. Strings of digits are not considered valid identifiers! We
abuse this by modifying Core Rule 2 to update the global environment with the
new node's address. Consider how this rule might impact garbage collection
(remember that the environment is intended for *globals*).
.. math::
\gmrule
{ \mathtt{PushInt} \; n : i
& s
& h
& m
}
{ i
& a : s
& h
\begin{bmatrix}
a : \mathtt{NNum} \; n
\end{bmatrix}
& m
\begin{bmatrix}
n' : a
\end{bmatrix}
\\
\SetCell[c=5]{c}
\text{where $n'$ is the base-10 string rep. of $n$}
}
2. In order for Extension Rule 1 to be effective, we are also required to take
action when a number already exists in the environment:
.. math::
\transrule
{ \mathtt{PushInt} \; n : i
& s
& h
& m
\begin{bmatrix}
n' : a
\end{bmatrix}
}
{ i
& a : s
& h
& m
\\
\SetCell[c=5]{c}
\text{where $n'$ is the base-10 string rep. of $n$}
}
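
Taken together, the two extension rules amount to a lookup before allocation.
The sketch below shows one way this might be written; the function
:code:`pushInt`, the type synonyms, and the allocation strategy (heap size as
the next free address) are assumptions made for illustration and are not taken
from :code:`src/GM.hs`.

```haskell
import qualified Data.Map as M

type Addr = Int
type Name = String

data Node = NNum Int deriving (Show, Eq)

type Heap = M.Map Addr Node
type Env  = M.Map Name Addr

-- Returns the address to push, along with the (possibly updated) heap and
-- global environment.
pushInt :: Int -> Heap -> Env -> (Addr, Heap, Env)
pushInt n h m =
    case M.lookup key m of
        -- Extension Rule 2: the number is already in the environment.
        Just a  -> (a, h, m)
        -- Extension Rule 1: allocate, then record the address under n'.
        Nothing ->
            let a = M.size h   -- assumption: addresses are 0 .. size-1
            in (a, M.insert a (NNum n) h, M.insert key a m)
  where
    key = show n               -- the base-10 string representation of n
```

Pushing the same integer twice then yields the same address and leaves the heap
untouched the second time.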
**************************
Evaluation: Slurping Vines
**************************
WIP.
Laziness
--------
WIP.
* Instead of :code:`Slide (n+1); Unwind`, do :code:`Update n; Pop n; Unwind`
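
The two instruction sequences can be put side by side as follows. This is only
a sketch: the names :code:`slideEpilogue` and :code:`updateEpilogue` are
hypothetical, and :code:`n` is assumed to be the arity of the supercombinator
being compiled.

```haskell
data Instr = Slide Int | Update Int | Pop Int | Unwind
    deriving (Show, Eq)

-- Without updating: discard the arguments and the redex root.
slideEpilogue :: Int -> [Instr]
slideEpilogue n = [Slide (n + 1), Unwind]

-- With updating: overwrite the redex root with an indirection to the result,
-- then pop the arguments.
updateEpilogue :: Int -> [Instr]
updateEpilogue n = [Update n, Pop n, Unwind]
```

Because :code:`Update` leaves an :code:`NInd` node where the redex root was,
any other reference to that application sees the result instead of repeating
the reduction.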
****************************
Compilation: Squashing Trees
****************************
WIP.
Notice that we do not keep a (local) environment at run-time. The environment
only exists at compile-time to map local names to stack indices. When compiling
a supercombinator, the arguments are enumerated from zero (the top of the
stack), and passed to :code:`compileR` as an environment.
.. literalinclude:: /../../src/GM.hs
:dedent:
:start-after: -- >> [ref/compileSc]
:end-before: -- << [ref/compileSc]
:caption: src/GM.hs
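
The environment construction itself is tiny. As a sketch, assuming the
environment is an association list (the name :code:`argEnv` and the
:code:`Env` representation are hypothetical):

```haskell
type Name = String
type Env  = [(Name, Int)]   -- hypothetical: local name to stack index

-- Number the arguments from zero, the top of the stack, downwards.
argEnv :: [Name] -> Env
argEnv args = zip args [0 ..]
```

For a supercombinator with arguments :code:`x` and :code:`y`,
:code:`argEnv ["x", "y"]` gives :code:`[("x", 0), ("y", 1)]`.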
Of course, indexing variables relative to the top of the stack means that the
indices become inaccurate the moment we push to or pop from the stack even once.
The way around this is quite simple: simply offset the stack when w
.. literalinclude:: /../../src/GM.hs
:dedent:
:start-after: -- >> [ref/compileC]
:end-before: -- << [ref/compileC]
:caption: src/GM.hs
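
To illustrate the offsetting idea in isolation, here is a minimal sketch of a
compilation scheme for applications. The :code:`Expr` type, the helper
:code:`argOffset`, and this particular :code:`compileC` are assumptions made
for the example and may not match :code:`src/GM.hs`.

```haskell
type Name = String
type Env  = [(Name, Int)]   -- hypothetical: local name to stack index

data Expr = Var Name | Lit Int | App Expr Expr
    deriving (Show, Eq)

data Instr = PushGlobal Name | PushInt Int | Push Int | MkAp
    deriving (Show, Eq)

-- Shift every index by k to account for k extra items on the stack.
argOffset :: Int -> Env -> Env
argOffset k env = [ (v, i + k) | (v, i) <- env ]

compileC :: Env -> Expr -> [Instr]
compileC env (Var v) = case lookup v env of
    Just i  -> [Push i]         -- a local: copy it from the stack
    Nothing -> [PushGlobal v]   -- otherwise assume a global
compileC _ (Lit n) = [PushInt n]
compileC env (App f x) =
    -- The argument is pushed first, so the function is compiled with every
    -- index shifted by one; MkAp then finds f on top of x.
    compileC env x ++ compileC (argOffset 1 env) f ++ [MkAp]
```

With :code:`x` at index 0 and :code:`y` at index 1, compiling the application
:code:`y x` gives :code:`[Push 0, Push 2, MkAp]`: by the time :code:`y` is
pushed, the copy of :code:`x` has moved it one slot further from the top.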