docs -> doc

lol
2023-12-04 14:51:44 -07:00
parent cb6321fbf8
commit 6767bd0b4f
304 changed files with 29610 additions and 0 deletions
--- a/doc/src/commentary/gm.rst
+++ b/doc/src/commentary/gm.rst
@@ -0,0 +1,104 @@
+The *G-Machine*
+===============
+
+**********
+Motivation
+**********
+
+Our initial model, the *Template Instantiator* (TI) was a very
+straightforward solution to compilation, but its core design has a major
+Achilles' heel, being that Compilation is interleaved with evaluation -- The
+heap nodes for supercombinators hold uninstantiated expressions, i.e. raw ASTs
+straight from the parser. When a supercombinator is found on the stack during
+evaluation, the template expression is instantiated (compiled) on the spot.
+
+.. math::
+   \transrule
+   { a_0 : a_1 : \ldots : a_n : s
+   & d
+   & h
+   \begin{bmatrix}
+        a_0 : \mathtt{NSupercomb} \; [x_1,\ldots,x_n] \; e
+   \end{bmatrix}
+   & g
+   }
+   { a_n : s
+   & d
+   & h'
+   & g
+   \\
+   & \SetCell[c=3]{c}
+   \text{where } h' = \mathtt{instantiateU} \; e \; a_n \; h \; g
+   }
+
+The process of instantiating a supercombinator goes something like this
+
+1. Augment the environment with bindings to the arguments.
+
+2. Using the local augmented environment, instantiate the supercombinator body
+   on the heap.
+
+3. Remove the nodes applying the supercombinator to its arguments from the
+   stack.
+
+4. Push the address to the newly instantiated body onto the stack.
+
+.. literalinclude:: /../../src/TI.hs
+   :dedent:
+   :start-after: -- >> [ref/scStep]
+   :end-before: -- << [ref/scStep]
+   :caption: src/TI.hs
+
+Instantiating the supercombinator's body in this way is the root of our
+Achilles' heel. Traversing a tree structure is a very non-linear task unfit for
+an assembly target. The goal of our new G-Machine is to compile a *linear
+sequence of instructions* which instantiate the expression at execution.
+
+**************************
+Trees and Vines, in Theory
+**************************
+
+WIP.
+
+**************************
+Evaluation: Slurping Vines
+**************************
+
+WIP.
+
+Laziness
+--------
+
+WIP.
+
+* Instead of :code:`Slide (n+1); Unwind`, do :code:`Update n; Pop n; Unwind`
+
+****************************
+Compilation: Squashing Trees
+****************************
+
+WIP.
+
+Notice that we do not keep a (local) environment at run-time. The environment
+only exists at compile-time to map local names to stack indices. When compiling
+a supercombinator, the arguments are enumerated from zero (the top of the
+stack), and passed to :code:`compileR` as an environment.
+
+.. literalinclude:: /../../src/GM.hs
+   :dedent:
+   :start-after: -- >> [ref/compileSc]
+   :end-before: -- << [ref/compileSc]
+   :caption: src/GM.hs
+
+Of course, variables being indexed relative to the top of the stack means that
+they will become inaccurate the moment we push or pop the stack a single time.
+The way around this is quite simple: simply offset the stack when w
+
+.. literalinclude:: /../../src/GM.hs
+   :dedent:
+   :start-after: -- >> [ref/compileC]
+   :end-before: -- << [ref/compileC]
+   :caption: src/GM.hs
+
+
+
--- a/doc/src/commentary/layout-lexing.rst
+++ b/doc/src/commentary/layout-lexing.rst
@@ -0,0 +1,235 @@
+Lexing, Parsing, and Layouts
+============================
+
+The C-style languages of my previous experiences have all had quite trivial
+lexical analysis stages, peaking in complexity when I streamed tokens lazily in
+C. The task of tokenising a C-style language is very simple in description: you
+ignore all whitespace and point out what you recognise. If you don't recognise
+something, check if it's a literal or an identifier. Should it be neither,
+return an error.
+
+On paper, both lexing and parsing a Haskell-like language seem to pose a few
+greater challenges. Listed by ascending intimidation factor, some of the
+potential roadblocks on my mind before making an attempt were:
+
+* Operators; Haskell has not only user-defined infix operators, but user-defined
+  precedence levels and associativities. I recall using an algorithm that looked
+  up infix, prefix, postfix, and even mixfix operators up in a global table to
+  call their appropriate parser (if their precedence was appropriate, also
+  stored in the table). I never modified the table at runtime, however this
+  could be a very nice solution for Haskell.
+
+* Context-sensitive keywords; Haskell allows for some words to be used as identifiers in
+  appropriate contexts, such as :code:`family`, :code:`role`, :code:`as`.
+  Reading a note_ found in `GHC's lexer`_,
+  it appears that keywords are only considered in bodies for which their use is
+  relevant, e.g. :code:`family` and :code:`role` in type declarations,
+  :code:`as` after :code:`case`; :code:`if`, :code:`then`, and :code:`else` in
+  expressions, etc.
+
+* Whitespace sensitivity; While I was comfortable with the idea of a system
+  similar to Python's INDENT/DEDENT tokens, Haskell seemed to use whitespace to
+  section code in a way that *felt* different.
+
+.. _note: https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/coding-style#2-using-notes
+.. _GHC's lexer: https://gitlab.haskell.org/ghc/ghc/-/blob/master/compiler/GHC/Parser/Lexer.x#L1133
+
+After a bit of thought and research, whitespace sensitivity in the form of
+*layouts* as Haskell and I will refer to them as, are easily the scariest thing
+on this list -- however they are achievable!
+
+A Lexical Primer: Python
+************************
+
+We will compare and contrast with Python's lexical analysis. Much to my dismay,
+Python uses newlines and indentation to separate statements and resolve scope
+instead of the traditional semicolons and braces found in C-style languages (we
+may generally refer to these C-style languages as *explicitly-sectioned*).
+Internally during tokenisation, when the Python lexer begins a new line, they
+compare the indentation of the new line with that of the previous and apply the
+following rules:
+
+1. If the new line has greater indentation than the previous, insert an INDENT
+   token and push the new line's indentation level onto the indentation stack
+   (the stack is initialised with an indentation level of zero).
+
+2. If the new line has lesser indentation than the previous, pop the stack until
+   the top of the stack is greater than the new line's indentation level. A
+   DEDENT token is inserted for each level popped.
+
+3. If the indentation is equal, insert a NEWLINE token to terminate the previous
+   line, and leave it at that!
+
+Parsing Python with the INDENT, DEDENT, and NEWLINE tokens is identical to
+parsing a language with braces and semicolons. This is a solution pretty in line
+with Python's philosophy of the "one correct answer" (TODO: this needs a
+source). In developing our *layout* rules, we will follow in the pattern of
+translating the whitespace-sensitive source language to an explicitly sectioned
+language.
+
+But What About Haskell?
+***********************
+
+We saw that Python, the most notable example of an implicitly sectioned
+language, is pretty simple to lex. Why then am I so afraid of Haskell's layouts?
+To be frank, I'm far less scared after asking myself this -- however there are
+certainly some new complexities that Python needn't concern. Haskell has
+implicit line *continuation*: forms written over multiple lines; indentation
+styles often seen in Haskell are somewhat esoteric compared to Python's
+"s/[{};]//".
+
+.. code-block:: haskell
+
+   -- line continuation
+   something = this is a
+       single expression
+
+   -- an extremely common style found in haskell
+   data Python = Users
+       { are        :: Crying
+       , right      :: About
+       , now        :: Sorry
+       }
+
+   -- another formatting oddity
+   -- note that this is not a single
+   -- continued line! `look at`,
+   -- `this`, and `alignment` are all
+   -- separate expressions!
+   anotherThing = do look at
+                     this
+                     alignment
+
+But enough fear, lets actually think about implementation. Firstly, some
+formality: what do we mean when we say layout? We will define layout as the
+rules we apply to an implicitly-sectioned language in order to yield one that is
+explicitly-sectioned. We will also define indentation of a lexeme as the column
+number of its first character.
+
+Thankfully for us, our entry point is quite clear; layouts only appear after a
+select few keywords, (with a minor exception; TODO: elaborate) being :code:`let`
+(followed by supercombinators), :code:`where` (followed by supercombinators),
+:code:`do` (followed by expressions), and :code:`of` (followed by alternatives)
+(TODO: all of these terms need linked glossary entries). In order to manage the
+cascade of layout contexts, our lexer will record a stack for which each element
+is either :math:`\varnothing`, denoting an explicit layout written with braces
+and semicolons, or a :math:`\langle n \rangle`, denoting an implicitly laid-out
+layout where the start of each item belonging to the layout is indented
+:math:`n` columns.
+
+.. code-block:: haskell
+
+    -- layout stack: []
+    module M where -- layout stack: [∅]
+
+    f x = let -- layout keyword; remember indentation of next token
+              y = w * w -- layout stack: [∅, <10>]
+              w = x + x
+              -- layout ends here
+          in do -- layout keyword; next token is a brace!
+              { -- layout stack: [∅]
+                  print y;
+                  print x;
+              }
+
+Finally, we also need the concept of "virtual" brace tokens, which as far as
+we're concerned at this moment are exactly like normal brace tokens, except
+implicitly inserted by the compiler. With the presented ideas in mind, we may
+begin to introduce a small set of informal rules describing the lexer's handling
+of layouts, the first being:
+
+1. If a layout keyword is followed by the token '{', push :math:`\varnothing`
+   onto the layout context stack. Otherwise, push :math:`\langle n \rangle` onto
+   the layout context stack where :math:`n` is the indentation of the token
+   following the layout keyword. Additionally, the lexer is to insert a virtual
+   opening brace after the token representing the layout keyword.
+
+Consider the following observations from that previous code sample:
+
+* Function definitions should belong to a layout, each of which may start at
+  column 1.
+
+* A layout can enclose multiple bodies, as seen in the :code:`let`-bindings and
+  the :code:`do`-expression.
+
+* Semicolons should *terminate* items, rather than *separate* them.
+
+Our current focus is the semicolons. In an implicit layout, items are on
+separate lines each aligned with the previous. A naïve implementation would be
+to insert the semicolon token when the EOL is reached, but this proves unideal
+when you consider the alignment requirement. In our implementation, our lexer
+will wait until the first token on a new line is reached, then compare
+indentation and insert a semicolon if appropriate. This comparison -- the
+nondescript measurement of "more, less, or equal indentation" rather than a
+numeric value -- is referred to as *offside* by myself internally and the
+Haskell report describing layouts. We informally formalise this rule as follows:
+
+2. When the first token on a line is preceeded only by whitespace, if the
+   token's first grapheme resides on a column number :math:`m` equal to the
+   indentation level of the enclosing context -- i.e. the :math:`\langle n
+   \rangle` on top of the layout stack. Should no such context exist on the
+   stack, assume :math:`m > n`.
+
+We have an idea of how to begin layouts, delimit the enclosed items, and last
+we'll need to end layouts. This is where the distinction between virtual and
+non-virtual brace tokens comes into play. The lexer needs only partial concern
+towards closing layouts; the complete responsibility is shared with the parser.
+This will be elaborated on in the next section. For now, we will be content with
+naïvely inserting a virtual closing brace when a token is indented right of the
+layout.
+
+3. Under the same conditions as rule 2., when :math:`m < n` the lexer shall
+   insert a virtual closing brace and pop the layout stack.
+
+This rule covers some cases including the top-level, however, consider
+tokenising the :code:`in` in a :code:`let`-expression. If our lexical analysis
+framework only allows for lexing a single token at a time, we cannot return both
+a virtual right-brace and a :code:`in`. Under this model, the lexer may simply
+pop the layout stack and return the :code:`in` token. As we'll see in the next
+section, as long as the lexer keeps track of its own context (i.e. the stack),
+the parser will cope just fine without the virtual end-brace.
+
+Parsing Lonely Braces
+*********************
+
+When viewed in the abstract, parsing and tokenising are near-identical tasks yet
+the two are very often decomposed into discrete systems with very different
+implementations. Lexers operate on streams of text and tokens, while parsers
+are typically far less linear, using a parse stack or recursing top-down. A
+big reason for this separation is state management: the parser aims to be as
+context-free as possible, while the lexer tends to burden the necessary
+statefulness. Still, the nature of a stream-oriented lexer makes backtracking
+difficult and quite inelegant.
+
+However, simply declaring a parse error to be not an error at all
+counterintuitively proves to be an elegant solution our layout problem which
+minimises backtracking and state in both the lexer and the parser. Consider the
+following definitions found in rlp's BNF:
+
+.. productionlist:: rlp
+   VOpen   : `vopen`
+   VClose  : `vclose` | `error`
+
+A parse error is recovered and treated as a closing brace. Another point of note
+in the BNF is the difference between virtual and non-virtual braces (TODO: i
+don't like that the BNF is formatted without newlines :/):
+
+.. productionlist:: rlp
+   LetExpr : `let` VOpen Bindings VClose `in` Expr | `let` `{` Bindings `}` `in` Expr
+
+This ensures that non-virtual braces are closed explicitly.
+
+This set of rules is adequete enough to satisfy our basic concerns about line
+continations and layout lists. For a more pedantic description of the layout
+system, see `chapter 10
+<https://www.haskell.org/onlinereport/haskell2010/haskellch10.html>`_ of the
+2010 Haskell Report, which I heavily referenced here.
+
+References
+----------
+
+* `Python's lexical analysis
+  <https://docs.python.org/3/reference/lexical_analysis.html>`_
+
+* `Haskell syntax reference
+  <https://www.haskell.org/onlinereport/haskell2010/haskellch10.html>`_
--- a/doc/src/commentary/ti.rst
+++ b/doc/src/commentary/ti.rst
@@ -0,0 +1,6 @@
+The *Template Instantiator*
+====================================
+
+WIP. This will hopefully be expanded into a thorough walkthrough of the state
+machine.
+
--- a/doc/src/conf.py
+++ b/doc/src/conf.py
@@ -0,0 +1,61 @@
+# Configuration file for the Sphinx documentation builder.
+#
+# For the full list of built-in configuration values, see the documentation:
+# https://www.sphinx-doc.org/en/master/usage/configuration.html
+
+# -- Project information -----------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
+
+project = "rl'"
+copyright = '2023, madeleine sydney ślaga'
+author = 'madeleine sydney slaga'
+
+# -- General configuration ---------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
+
+extensions = ['sphinx.ext.imgmath']
+
+# templates_path = ['_templates']
+exclude_patterns = []
+
+# for the haskell girlies
+highlight_language = 'haskell'
+add_function_parentheses = False
+
+# -- Options for HTML output -------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
+
+html_theme = 'alabaster'
+# html_static_path = ['_static']
+
+# -- Options for LaTeX image math --------------------------------------------
+imgmath_latex_preamble = r'''
+\usepackage{amsmath}
+\usepackage{tabularray}
+
+\newcommand{\transrule}[2]
+    {\begin{tblr}{|rrrlc|}
+         \hline
+         & #1 \\
+         \implies & #2 \\
+         \hline
+    \end{tblr} }
+
+\newcommand{\gmrule}[2]
+    {\begin{tblr}{|rrrrll|}
+         \hline
+         & #1 \\
+         \implies & #2 \\
+         \hline
+    \end{tblr} }
+
+\newcommand{\nillist}{[\,]}
+'''
+
+imgmath_image_format = 'svg'
+imgmath_font_size = 14
+
+# helps with inlining:
+# https://www.sphinx-doc.org/en/master/usage/extensions/math.html#confval-imgmath_use_preview
+imgmath_use_preview = True
+
--- a/doc/src/glossary.rst
+++ b/doc/src/glossary.rst
@@ -0,0 +1,19 @@
+Glossary
+========
+
+Haskell and Haskell culture is infamous for using scary mathematical terms for
+simple ideas. Please excuse us, it's really fun :3.
+
+.. glossary::
+
+   supercombinator
+      An expression with no free variables. For most purposes, just think of a
+      top-level definition.
+
+   case alternative
+      An possible match in a case expression (TODO: example)
+
+   layout
+      The syntax used in rlp and Haskell that allows for implicitly sectioned
+      code using alignment and newlines.
+
--- a/doc/src/index.rst
+++ b/doc/src/index.rst
@@ -0,0 +1,26 @@
+Welcome to rlp's documentation!
+===============================
+
+Contents
+--------
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Index
+
+   glossary.rst
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Commentary
+   :glob:
+
+   commentary/*
+
+.. toctree::
+   :maxdepth: 1
+   :caption: References
+   :glob:
+
+   references/*
+
--- a/doc/src/references/gm-state-transitions.rst
+++ b/doc/src/references/gm-state-transitions.rst
@@ -0,0 +1,425 @@
+================================
+G-Machine State Transition Rules
+================================
+
+*********************
+Core Transition Rules
+*********************
+
+#. Lookup a global by name and push its value onto the stack
+
+   .. math::
+      \gmrule
+      { \mathtt{PushGlobal} \; f : i
+      & s
+      & d
+      & h
+      & m
+      \begin{bmatrix}
+           f : a
+      \end{bmatrix}
+      }
+      { i
+      & a : s
+      & d
+      & h
+      & m
+      }
+
+#. Allocate an int node on the heap, and push the address of the newly created
+   node onto the stack
+
+   .. math::
+      \gmrule
+      { \mathtt{PushInt} \; n : i
+      & s
+      & d
+      & h
+      & m
+      }
+      { i
+      & a : s
+      & d
+      & h
+      \begin{bmatrix}
+           a : \mathtt{NNum} \; n
+      \end{bmatrix}
+      & m
+      }
+
+#. Allocate an application node on the heap, applying the top of the stack to
+   the address directly below it. The address of the application node is pushed
+   onto the stack.
+
+   .. math::
+      \gmrule
+      { \mathtt{MkAp} : i
+      & f : x : s
+      & d
+      & h
+      & m
+      }
+      { i
+      & a : s
+      & d
+      & h
+      \begin{bmatrix}
+           a : \mathtt{NAp} \; f \; x
+      \end{bmatrix}
+      & m
+      }
+
+#. Push a function's argument onto the stack
+
+   .. math::
+      \gmrule
+      { \mathtt{Push} \; n : i
+      & a_0 : \ldots : a_n : s
+      & d
+      & h
+      & m
+      }
+      { i
+      & a_n : a_0 : \ldots : a_n : s
+      & d
+      & h
+      & m
+      }
+
+#. Tidy up the stack after instantiating a supercombinator
+
+   .. math::
+      \gmrule
+      { \mathtt{Slide} \; n : i
+      & a_0 : \ldots : a_n : s
+      & d
+      & h
+      & m
+      }
+      { i
+      & a_0 : s
+      & d
+      & h
+      & m
+      }
+
+#. If the top of the stack is in WHNF (currently this just means a number) is on
+   top of the stack, :code:`Unwind` considers evaluation complete. In the case
+   where the dump is **not** empty, the instruction queue and stack is restored
+   from the top.
+
+   .. math::
+      \gmrule
+      { \mathtt{Unwind} : \nillist
+      & a : s
+      & \langle i', s' \rangle : d
+      & h
+      \begin{bmatrix}
+           a : \mathtt{NNum} \; n
+      \end{bmatrix}
+      & m
+      }
+      { i'
+      & a : s'
+      & d
+      & h
+      & m
+      }
+
+#. Bulding on the previous rule, in the case where the dump **is** empty, leave
+   the machine in a halt state (i.e. with an empty instruction queue).
+
+   .. math::
+      \gmrule
+      { \mathtt{Unwind} : \nillist
+      & a : s
+      & \nillist
+      & h
+      \begin{bmatrix}
+           a : \mathtt{NNum} \; n
+      \end{bmatrix}
+      & m
+      }
+      { \nillist
+      & a : s
+      & \nillist
+      & h
+      & m
+      }
+
+#. Again, building on the previous rules, this rule makes the machine consider
+   unapplied supercombinators to be in WHNF
+
+   .. math::
+      \gmrule
+      { \mathtt{Unwind} : \nillist
+      & a_0 : \ldots : a_n : \nillist
+      & \langle i, s \rangle : d
+      & h
+      \begin{bmatrix}
+            a_0 : \mathtt{NGlobal} \; k \; c
+      \end{bmatrix}
+      & m
+      }
+      { i
+      & a_n : s
+      & d
+      & h
+      & m \\
+      \SetCell[c=2]{c}
+      \text{when $n < k$}
+      }
+
+#. If an application is on top of the stack, :code:`Unwind` continues unwinding
+
+   .. math::
+      \gmrule
+      { \mathtt{Unwind} : \nillist
+      & a : s
+      & d
+      & h
+      \begin{bmatrix}
+           a : \mathtt{NAp} \; f \; x
+      \end{bmatrix}
+      & m
+      }
+      { \mathtt{Unwind} : \nillist
+      & f : a : s
+      & d
+      & h
+      & m
+      }
+
+#. When a supercombinator is on top of the stack (and the correct number of
+   arguments have been provided), :code:`Unwind` sets up the stack and jumps to
+   the supercombinator's code (:math:`\beta`-reduction)
+
+   .. math::
+      \gmrule
+      { \mathtt{Unwind} : \nillist
+      & a_0 : \ldots : a_n : s
+      & d
+      & h
+      \begin{bmatrix}
+           a_0 : \mathtt{NGlobal} \; n \; c \\
+           a_1 : \mathtt{NAp} \; a_0 \; e_1 \\
+           \vdots \\
+           a_n : \mathtt{NAp} \; a_{n-1} \; e_n \\
+      \end{bmatrix}
+      & m
+      }
+      { c
+      & e_1 : \ldots : e_n : a_n : s
+      & d
+      & h
+      & m
+      }
+
+#. Pop the stack, and update the nth node to point to the popped address
+
+   .. math::
+      \gmrule
+      { \mathtt{Update} \; n : i
+      & e : f : a_1 : \ldots : a_n : s
+      & d
+      & h
+      \begin{bmatrix}
+           a_1 : \mathtt{NAp} \; f \; e \\
+           \vdots \\
+           a_n : \mathtt{NAp} \; a_{n-1} \; e_n
+      \end{bmatrix}
+      & m
+      }
+      { i
+      & f : a_1 : \ldots : a_n : s
+      & d
+      & h
+      \begin{bmatrix}
+           a_n : \mathtt{NInd} \; e
+      \end{bmatrix}
+      & m
+      }
+
+#. Pop the stack.
+
+   .. math::
+      \gmrule
+      { \mathtt{Pop} \; n : i
+      & a_1 : \ldots : a_n : s
+      & d
+      & h
+      & m
+      }
+      { i
+      & s
+      & d
+      & h
+      & m
+      }
+
+#. Follow indirections while unwinding
+
+   .. math::
+      \gmrule
+      { \mathtt{Unwind} : \nillist
+      & a : s
+      & d
+      & h
+      \begin{bmatrix}
+           a : \mathtt{NInd} \; a'
+      \end{bmatrix}
+      & m
+      }
+      { \mathtt{Unwind} : \nillist
+      & a' : s
+      & d
+      & h
+      & m
+      }
+
+#. Allocate uninitialised heap space
+
+   .. math::
+      \gmrule
+      { \mathtt{Alloc} \; n : i
+      & s
+      & d
+      & h
+      & m
+      }
+      { i
+      & a_1 : \ldots : a_n : s
+      & d
+      & h
+      \begin{bmatrix}
+           a_1 : \mathtt{NUninitialised} \\
+           \vdots \\
+           a_n : \mathtt{NUninitialised} \\
+      \end{bmatrix}
+      & m
+      }
+
+#. Evaluate the top of the stack to WHNF
+
+   .. math::
+      \gmrule
+      { \mathtt{Eval} : i
+      & a : s
+      & d
+      & h
+      & m
+      }
+      { \mathtt{Unwind} : \nillist
+      & a : \nillist
+      & \langle i, s \rangle : d
+      & h
+      & m
+      }
+
+#. Reduce a primitive binary operator :math:`*`.
+
+   .. math::
+      \gmrule
+      { * : i
+      & a_1 : a_2 : s
+      & d
+      & h
+      \begin{bmatrix}
+            a_1 : x \\
+            a_2 : y
+      \end{bmatrix}
+      & m
+      }
+      { i
+      & a' : s
+      & d
+      & h
+      \begin{bmatrix}
+           a' : (x * y)
+      \end{bmatrix}
+      & m
+      }
+
+#. Reduce a primitive unary operator :math:`\neg`.
+
+   .. math::
+      \gmrule
+      { \neg : i
+      & a : s
+      & d
+      & h
+      \begin{bmatrix}
+            a : x
+      \end{bmatrix}
+      & m
+      }
+      { i
+      & a' : s
+      & d
+      & h
+      \begin{bmatrix}
+           a' : (\neg x)
+      \end{bmatrix}
+      & m
+      }
+
+***************
+Extension Rules
+***************
+
+#. A sneaky trick to enable sharing of :code:`NNum` nodes. We note that the
+   global environment is a mapping of plain old strings to heap addresses.
+   Strings of digits are not considered valid identifiers, so putting them on
+   the global environment will never conflict with a supercombinator! We abuse
+   this by modifying Core Rule 2 to update the global environment with the new
+   node's address. Consider how this rule might impact garbage collection
+   (remember that the environment is intended for *globals*).
+
+   .. math::
+      \gmrule
+      { \mathtt{PushInt} \; n : i
+      & s
+      & d
+      & h
+      & m
+      }
+      { i
+      & a : s
+      & d
+      & h
+      \begin{bmatrix}
+           a : \mathtt{NNum} \; n
+      \end{bmatrix}
+      & m
+      \begin{bmatrix}
+           n' : a
+      \end{bmatrix}
+      \\
+      \SetCell[c=5]{c}
+      \text{where $n'$ is the base-10 string rep. of $n$}
+      }
+
+#. In order for the previous rule to be effective, we are also required to take
+   action when a number already exists in the environment:
+
+   .. math::
+      \gmrule
+      { \mathtt{PushInt} \; n : i
+      & s
+      & d
+      & h
+      & m
+      \begin{bmatrix}
+           n' : a
+      \end{bmatrix}
+      }
+      { i
+      & a : s
+      & d
+      & h
+      & m
+      \\
+      \SetCell[c=5]{c}
+      \text{where $n'$ is the base-10 string rep. of $n$}
+      }
+
--- a/doc/src/references/ti-state-transitions.rst
+++ b/doc/src/references/ti-state-transitions.rst
@@ -0,0 +1,334 @@
+============================================
+Template Instantiator State Transition Rules
+============================================
+
+Evaluation is complete when a single :code:`NNum` remains on the stack and the
+dump is empty.
+
+.. math::
+   \transrule
+   { a : \nillist
+   & \nillist
+   & h
+   \begin{bmatrix}
+        a : \mathtt{NNum} \; n
+   \end{bmatrix}
+   & g
+   }
+   { \mathtt{HALT}
+   }
+
+Dereference an indirection passed as an argument to a function.
+
+.. math::
+   \transrule
+        {a : s & d & h
+           \begin{bmatrix}
+               a : \mathtt{NAp} \; a_1 \; a_2 \\
+               a_2 : \mathtt{NInd} \; a_3
+           \end{bmatrix} & g}
+        {a : s & d & h[a : \mathtt{NAp} \; a_1 \; a_3] & g}
+
+Dereference an indirection on top of the stack.
+
+.. math::
+   \transrule
+        {p : s & d & h
+            \begin{bmatrix}
+                p : \mathtt{NInd} \; a
+            \end{bmatrix} & g}
+        {a : s & d & h & g}
+
+Perform a unary operation :math:`o(n)` with internal :code:`Prim` constructor
+:code:`O` on an argument in normal form.
+
+.. math::
+   \transrule
+        { f : a : s
+        & d
+        & h
+        \begin{bmatrix}
+            f : \mathtt{NPrim} \; \mathtt{O} \\
+            a : \mathtt{NAp} \; f \; x \\
+            x : \mathtt{NNum} \; n
+        \end{bmatrix}
+        & g
+        }
+        { a : s
+        & d
+        & h
+        \begin{bmatrix}
+            a : \mathtt{NNum} \; (o(n))
+        \end{bmatrix}
+        & g
+        }
+
+Evaluate the argument of a unary operation with internal :code:`Prim`
+constructor :code:`O`.
+
+.. math::
+   \transrule
+        { f : a : \nillist
+        & d
+        & h
+        \begin{bmatrix}
+            f : \mathtt{NPrim} \; \mathtt{O} \\
+            a : \mathtt{NAp} \; f \; x
+        \end{bmatrix}
+        & g
+        }
+        { x : \nillist
+        & (f : a : \nillist) : d
+        & h
+        & g
+        }
+
+Restore the stack when a sub-computation has completed.
+
+.. math::
+   \transrule
+   { a : \nillist
+   & s : d
+   & h
+   \begin{bmatrix}
+        a : \mathtt{NNum} \; n
+   \end{bmatrix}
+   & g
+   }
+   { s
+   & d
+   & h
+   & g
+   }
+
+Reduce a supercombinator and update the root with the :math:`\beta`-reduced form
+
+.. math::
+   \transrule
+   { a_0 : a_1 : \ldots : a_n : s
+   & d
+   & h
+   \begin{bmatrix}
+        a_0 : \mathtt{NSupercomb} \; [x_1,\ldots,x_n] \; e
+   \end{bmatrix}
+   & g
+   }
+   { a_n : s
+   & d
+   & h'
+   & g
+   \\
+   & \SetCell[c=3]{c}
+   \text{where } h' = \mathtt{instantiateU} \; e \; a_n \; h \; g
+   }
+
+Perform a binary operation :math:`o(x,y)` associated with internal :code:`Prim`
+constructor :code:`O` on two :code:`NNum` s both in normal form.
+
+.. math::
+   \transrule
+   { f : a_1 : a_2 : s
+   & d
+   & h
+   \begin{bmatrix}
+        f : \mathtt{NPrim} \; \mathtt{O} \\
+        a_1 : \mathtt{NAp} \; f \; (\mathtt{NNum} \; x) \\
+        a_2 : \mathtt{NAp} \; a_1 \; (\mathtt{NNum} \; y)
+   \end{bmatrix}
+   & g
+   }
+   { a_2 : s
+   & d
+   & h
+   \begin{bmatrix}
+        a_2 : \mathtt{NNum} \; (o(x,y))
+   \end{bmatrix}
+   & g
+   }
+
+In a conditional primitive, perform the reduction if the condition has been
+evaluated as True (:code:`NData 1 []`).
+
+.. math::
+   \transrule
+   { f : a_1 : a_2 : a_3 : s
+   & d
+   & h
+   \begin{bmatrix}
+        f : \mathtt{NPrim} \; \mathtt{IfP} \\
+        c : \mathtt{NPrim} \; (\mathtt{NData} \; 1 \; \nillist) \\
+        a_1 : \mathtt{NAp} \; f \; c \\
+        a_2 : \mathtt{NAp} \; a_1 \; x \\
+        a_3 : \mathtt{NAp} \; a_2 \; y
+   \end{bmatrix}
+   & g
+   }
+   { x : s
+   & d
+   & h
+   & g
+   }
+
+In a conditional primitive, perform the reduction if the condition has been
+evaluated as False (:code:`NData 0 []`).
+
+.. math::
+   \transrule
+   { f : a_1 : a_2 : a_3 : s
+   & d
+   & h
+   \begin{bmatrix}
+        f : \mathtt{NPrim} \; \mathtt{IfP} \\
+        c : \mathtt{NPrim} \; (\mathtt{NData} \; 0 \; \nillist) \\
+        a_1 : \mathtt{NAp} \; f \; c \\
+        a_2 : \mathtt{NAp} \; a_1 \; x \\
+        a_3 : \mathtt{NAp} \; a_2 \; y
+   \end{bmatrix}
+   & g
+   }
+   { y : s
+   & d
+   & h
+   & g
+   }
+
+
+In a conditional primitive, evaluate the condition.
+
+.. math::
+   \transrule
+   { f : a_1 : \nillist
+   & d
+   & h
+   \begin{bmatrix}
+        f : \mathtt{NPrim} \; \mathtt{IfP} \\
+        a_1 : \mathtt{NAp} \; f \; x
+   \end{bmatrix}
+   & g
+   }
+   { x : \nillist
+   & (f : a_1 : \nillist) : d
+   & h
+   & g
+   }
+
+Construct :code:`NData` out of a constructor and its arguments
+
+.. math::
+   \transrule
+   { c : a_1 : \ldots : a_n : \nillist
+   & d
+   & h
+   \begin{bmatrix}
+        c : \mathtt{NPrim} \; (\mathtt{ConP} \; t \; n) \\
+        a_1 : \mathtt{NAp} \; c \; x_1 \\
+        \vdots \\
+        a_n : \mathtt{NAp} \; a_{n-1} \; x_n
+   \end{bmatrix}
+   & g
+   }
+   { a_n : \nillist
+   & d
+   & h
+   \begin{bmatrix}
+        a_n : \mathtt{NData} \; t \; [x_1, \ldots, x_n]
+   \end{bmatrix}
+   & g
+   }
+
+Pairs
+-----
+
+Evaluate the first argument if necessary
+
+.. math::
+   \transrule
+   { c : a_1 : a_2 : \nillist
+   & d
+   & h
+   \begin{bmatrix}
+        c : \mathtt{NPrim} \; \mathtt{CasePairP} \\
+        p : \mathtt{NAp} \; \_ \: \_ \\
+        a_1 : \mathtt{NAp} \; c \; p \\
+        a_2 : \mathtt{NAp} \; a_2 \; f
+   \end{bmatrix}
+   & g
+   }
+   { p : \nillist
+   & (a_1 : a_2 : \nillist) : d
+   & h
+   & g
+   }
+
+Perform the reduction if the first argument is in normal form
+
+.. math::
+   \transrule
+   { c : a_1 : a_2 : s
+   & d
+   & h
+   \begin{bmatrix}
+        c : \mathtt{NPrim} \; \mathtt{CasePairP} \\
+        p : \mathtt{NData} \; 0 \; [x,y] \\
+        a_1 : \mathtt{NAp} \; c \; p \\
+        a_2 : \mathtt{NAp} \; a_1 \; f
+   \end{bmatrix}
+   & g
+   }
+   { a_1 : a_2 : s
+   & d
+   & h
+   \begin{bmatrix}
+        a_1 : \mathtt{NAp} \; f \; x \\
+        a_2 : \mathtt{NAp} \; a_1 \; y
+   \end{bmatrix}
+   & g
+   }
+
+Lists
+-----
+
+Evaluate the scrutinee
+
+.. math::
+   \transrule
+   { c : a_1 : a_2 : a_3 : \nillist
+   & d
+   & h
+   \begin{bmatrix}
+        c : \mathtt{NPrim} \; \mathtt{CaseListP} \\
+        a_1 : \mathtt{NAp} \; c \; x
+   \end{bmatrix}
+   & g
+   }
+   { x
+   & (a_1 : a_2 : a_3) : \nillist
+   & h
+   & g
+   }
+
+If the scrutinee is :code:`Nil`, perform the appropriate reduction.
+
+.. math::
+   \transrule
+   { c : a_1 : a_2 : a_3 : s
+   & d
+   & h
+   \begin{bmatrix}
+        c : \mathtt{NPrim} \; \mathtt{CaseListP} \\
+        p : \mathtt{NData} \; 1 \; \nillist \\
+        a_1 : \mathtt{NAp} \; c \; p \\
+        a_2 : \mathtt{NAp} \; p \; f_\text{nil} \\
+        a_3 : \mathtt{NAp} \; a_2 \; f_\text{cons}
+   \end{bmatrix}
+   & g
+   }
+   { a_3 : s
+   & d
+   & h
+   \begin{bmatrix}
+        a_3 : \mathtt{NAp} \; f_\text{nil}
+   \end{bmatrix}
+   & g
+   }
+