remove bad, incorrect, outdated docs

2024-02-13 13:20:39 -07:00
parent c57da862ae
commit ccc71a751c
3 changed files with 7 additions and 197 deletions
--- a/doc/src/commentary/gm.rst
+++ b/doc/src/commentary/gm.rst
@@ -63,52 +63,13 @@ an assembly target. The goal of our new G-Machine is to compile a *linear
 sequence of instructions* which, **when executed**, build up a graph
 representing the code.

-**************************
-Trees and Vines, in Theory
-**************************
-
-Rather than instantiating an expression at runtime -- traversing the AST and
-building a graph -- we want to compile all expressions at compile-time,
-generating a linear sequence of instructions which may be executed to build the
-graph.
-
-**************************
-Evaluation: Slurping Vines
-**************************
-
-WIP.
-
-Laziness
--------
-
-WIP.
-
-* Instead of :code:`Slide (n+1); Unwind`, do :code:`Update n; Pop n; Unwind`
-
-****************************
-Compilation: Squashing Trees
-****************************
-
-WIP.
-
-Notice that we do not keep a (local) environment at run-time. The environment
-only exists at compile-time to map local names to stack indices. When compiling
-a supercombinator, the arguments are enumerated from zero (the top of the
-stack), and passed to :code:`compileR` as an environment.
+*************
+The G-Machine
+*************

 .. literalinclude:: /../../src/GM.hs
   :dedent:
-   :start-after: -- >> [ref/compileSc]
-   :end-before: -- << [ref/compileSc]
-   :caption: src/GM.hs
-
-Of course, variables being indexed relative to the top of the stack means that
-they will become inaccurate the moment we push or pop the stack a single time.
-The way around this is quite simple: simply offset the stack when w
-
-.. literalinclude:: /../../src/GM.hs
-   :dedent:
-   :start-after: -- >> [ref/compileC]
-   :end-before: -- << [ref/compileC]
+   :start-after: -- >> [ref/Instr]
+   :end-before: -- << [ref/Instr]
   :caption: src/GM.hs

--- a/doc/src/commentary/layout-lexing.rst
+++ b/doc/src/commentary/layout-lexing.rst
@@ -62,159 +62,6 @@ braces and semicolons. In developing our *layout* rules, we will follow in the
 pattern of translating the whitespace-sensitive source language to an explicitly
 sectioned language.

-But What About Haskell?
-***********************
-
-Parsing Haskell -- and thus rl' -- is only slightly more complex than Python,
-but the design is certainly more sensitive. 
-
-.. code-block:: haskell
-
-   -- line folds
-   something = this is a
-       single expression
-
-   -- an extremely common style found in haskell
-   data Some = Data
-       { is    :: Presented
-       , in    :: This
-       , silly :: Style
-       }
-
-   -- another style oddity
-   -- note that this is not a single
-   -- continued line! `look at`,
-   -- `this odd`, and `alignment` are all
-   -- discrete items!
-   anotherThing = do look at
-                     this odd
-                     alignment
-
-But enough fear, let's actually think about implementation. Firstly, some
-formality: what do we mean when we say layout? We will define layout as the
-rules we apply to an implicitly-sectioned language in order to yield one that is
-explicitly-sectioned. We will also define indentation of a lexeme as the column
-number of its first character.
-
-Thankfully for us, our entry point is quite clear; layouts only appear after a
-select few keywords, (with a minor exception; TODO: elaborate) being :code:`let`
-(followed by supercombinators), :code:`where` (followed by supercombinators),
-:code:`do` (followed by expressions), and :code:`of` (followed by alternatives)
-(TODO: all of these terms need linked glossary entries). In order to manage the
-cascade of layout contexts, our lexer will record a stack for which each element
-is either :math:`\varnothing`, denoting an explicit layout written with braces
-and semicolons, or a :math:`\langle n \rangle`, denoting an implicitly laid-out
-layout where the start of each item belonging to the layout is indented
-:math:`n` columns.
-
-.. code-block:: haskell
-
-    -- layout stack: []
-    module M where -- layout stack: [∅]
-
-    f x = let -- layout keyword; remember indentation of next token
-              y = w * w -- layout stack: [∅, <10>]
-              w = x + x
-              -- layout ends here
-          in do -- layout keyword; next token is a brace!
-              { -- layout stack: [∅]
-                  print y;
-                  print x;
-              }
-
-Finally, we also need the concept of "virtual" brace tokens, which as far as
-we're concerned at this moment are exactly like normal brace tokens, except
-implicitly inserted by the compiler. With the presented ideas in mind, we may
-begin to introduce a small set of informal rules describing the lexer's handling
-of layouts, the first being:
-
-1. If a layout keyword is followed by the token '{', push :math:`\varnothing`
-   onto the layout context stack. Otherwise, push :math:`\langle n \rangle` onto
-   the layout context stack where :math:`n` is the indentation of the token
-   following the layout keyword. Additionally, the lexer is to insert a virtual
-   opening brace after the token representing the layout keyword.
-
-Consider the following observations from that previous code sample:
-
-* Function definitions should belong to a layout, each of which may start at
-  column 1.
-
-* A layout can enclose multiple bodies, as seen in the :code:`let`-bindings and
-  the :code:`do`-expression.
-
-* Semicolons should *terminate* items, rather than *separate* them.
-
-Our current focus is the semicolons. In an implicit layout, items are on
-separate lines each aligned with the previous. A naïve implementation would be
-to insert the semicolon token when the EOL is reached, but this proves unideal
-when you consider the alignment requirement. In our implementation, our lexer
-will wait until the first token on a new line is reached, then compare
-indentation and insert a semicolon if appropriate. This comparison -- the
-nondescript measurement of "more, less, or equal indentation" rather than a
-numeric value -- is referred to as *offside* both by me internally and by the
-Haskell report describing layouts. We informally formalise this rule as follows:
-
-2. When the first token on a line is preceded only by whitespace, if the
-   token's first grapheme resides on a column number :math:`m` equal to the
-   indentation level of the enclosing context -- i.e. the :math:`\langle n
-   \rangle` on top of the layout stack -- the lexer shall insert a semicolon.
-   Should no such context exist on the stack, assume :math:`m > n`.
-
-We have an idea of how to begin layouts, delimit the enclosed items, and last
-we'll need to end layouts. This is where the distinction between virtual and
-non-virtual brace tokens comes into play. The lexer needs only partial concern
-towards closing layouts; the complete responsibility is shared with the parser.
-This will be elaborated on in the next section. For now, we will be content with
-naïvely inserting a virtual closing brace when a token is indented left of the
-layout.
-
-3. Under the same conditions as rule 2., when :math:`m < n` the lexer shall
-   insert a virtual closing brace and pop the layout stack.
-
-This rule covers some cases, including the top level; however, consider
-tokenising the :code:`in` in a :code:`let`-expression. If our lexical analysis
-framework only allows for lexing a single token at a time, we cannot return both
-a virtual right-brace and an :code:`in`. Under this model, the lexer may simply
-pop the layout stack and return the :code:`in` token. As we'll see in the next
-section, as long as the lexer keeps track of its own context (i.e. the stack),
-the parser will cope just fine without the virtual end-brace.
-
-Parsing Lonely Braces
-*********************
-
-When viewed in the abstract, parsing and tokenising are near-identical tasks yet
-the two are very often decomposed into discrete systems with very different
-implementations. Lexers operate on streams of text and tokens, while parsers
-are typically far less linear, using a parse stack or recursing top-down. A
-big reason for this separation is state management: the parser aims to be as
-context-free as possible, while the lexer tends to bear the necessary
-statefulness. Still, the nature of a stream-oriented lexer makes backtracking
-difficult and quite inelegant.
-
-However, simply declaring a parse error to be not an error at all
-counterintuitively proves to be an elegant solution to our layout problem, which
-minimises backtracking and state in both the lexer and the parser. Consider the
-following definitions found in rlp's BNF:
-
-.. productionlist:: rlp
-   VOpen   : `vopen`
-   VClose  : `vclose` | `error`
-
-A parse error is recovered and treated as a closing brace. Another point of note
-in the BNF is the difference between virtual and non-virtual braces (TODO: i
-don't like that the BNF is formatted without newlines :/):
-
-.. productionlist:: rlp
-   LetExpr : `let` VOpen Bindings VClose `in` Expr | `let` `{` Bindings `}` `in` Expr
-
-This ensures that non-virtual braces are closed explicitly.
-
-This set of rules is adequate to satisfy our basic concerns about line
-continuations and layout lists. For a more pedantic description of the layout
-system, see `chapter 10
-<https://www.haskell.org/onlinereport/haskell2010/haskellch10.html>`_ of the
-2010 Haskell Report, which I heavily referenced here.
-
 References
 ----------

--- a/src/GM.hs
+++ b/src/GM.hs
@@ -93,6 +93,7 @@ data Key = NameKey Name
         | ConstrKey Tag Int
         deriving (Show, Eq)

+-- >> [ref/Instr]
 data Instr = Unwind
           | PushGlobal Name
           | PushConstr Tag Int
@@ -114,6 +115,7 @@ data Instr = Unwind
           | Print
           | Halt
           deriving (Show, Eq)
+-- << [ref/Instr]

 data Node = NNum Int
          | NAp Addr Addr