mirror of
https://github.com/GrammaticalFramework/gf-core.git
synced 2026-05-07 10:12:51 -06:00
updated gf-modules
This commit is contained in:
994
doc/gf-modules.txt
Normal file
994
doc/gf-modules.txt
Normal file
@@ -0,0 +1,994 @@
|
||||
The Module System of GF
|
||||
Aarne Ranta
|
||||
8/4/2005 - 5/7/2007
|
||||
|
||||
%!postproc(html): #SUB1 <sub>1</sub>
|
||||
%!postproc(html): #SUBk <sub>k</sub>
|
||||
%!postproc(html): #SUBi <sub>i</sub>
|
||||
%!postproc(html): #SUBm <sub>m</sub>
|
||||
%!postproc(html): #SUBn <sub>n</sub>
|
||||
%!postproc(html): #SUBp <sub>p</sub>
|
||||
%!postproc(html): #SUBq <sub>q</sub>
|
||||
|
||||
|
||||
% to compile: txt2tags --toc -thtml modulesystem.txt
|
||||
|
||||
|
||||
A GF grammar consists of a set of **modules**, which can be
|
||||
combined in different ways to build different grammars.
|
||||
There are several different **types of modules**:
|
||||
- ``abstract``
|
||||
- ``concrete``
|
||||
- ``resource``
|
||||
- ``interface``
|
||||
- ``instance``
|
||||
- ``incomplete concrete``
|
||||
|
||||
|
||||
We will go through the module types in this order, which is also
|
||||
their order of "importance" from the most basic to
|
||||
the more advanced ones.
|
||||
|
||||
This document presupposes knowledge of GF judgements and expressions, which can
|
||||
be gained from the [GF tutorial tutorial/gf-tutorial2.html]. It aims
|
||||
to give a systamatic description of the module system;
|
||||
some tutorial information is repeated to make the document
|
||||
self-contained.
|
||||
|
||||
|
||||
|
||||
|
||||
=The principal module types=
|
||||
|
||||
==Abstract syntax==
|
||||
|
||||
Any GF grammar that is used in an application
|
||||
will probably contain at least one module
|
||||
of the ``abstract`` module type. Here is an example of
|
||||
such a module, defining a fragment of propositional logic.
|
||||
```
|
||||
abstract Logic = {
|
||||
cat Prop ;
|
||||
fun Conj : Prop -> Prop -> Prop ;
|
||||
fun Disj : Prop -> Prop -> Prop ;
|
||||
fun Impl : Prop -> Prop -> Prop ;
|
||||
fun Falsum : Prop ;
|
||||
}
|
||||
```
|
||||
The **name** of this module is ``Logic``.
|
||||
|
||||
|
||||
|
||||
An ``abstract`` module defines an **abstract syntax**, which
|
||||
is a language-independent representation of a fragment of language.
|
||||
It consists of two kinds of **judgements**:
|
||||
- ``cat`` judgements telling what **categories** there are
|
||||
(types of abstract syntax trees)
|
||||
- ``fun`` judgements telling what **functions** there are
|
||||
(to build abstract syntax trees)
|
||||
|
||||
|
||||
There can also be ``def`` and ``data`` judgements in an
|
||||
abstract syntax.
|
||||
|
||||
|
||||
===Compilation of abstract syntax===
|
||||
|
||||
The GF grammar compiler expects to find the module ``Logic`` in a file named
|
||||
``Logic.gf``. When the compiler is run, it produces
|
||||
another file, named ``Logic.gfc``. This file is in the
|
||||
format called **canonical GF**, which is the "machine language"
|
||||
of GF. Next time that the module ``Logic`` is needed in
|
||||
compiling a grammar, it can be read from the compiled (``gfc``)
|
||||
file instead of the source (``gf``) file, unless the source
|
||||
has been changed after the compilation.
|
||||
|
||||
|
||||
==Concrete syntax==
|
||||
|
||||
In order for a GF grammar to describe a concrete language, the abstract
|
||||
syntax must be completed with a **concrete syntax** of it.
|
||||
For this purpose, we use modules of type ``concrete``: for instance,
|
||||
```
|
||||
concrete LogicEng of Logic = {
|
||||
lincat Prop = {s : Str} ;
|
||||
lin Conj a b = {s = a.s ++ "and" ++ b.s} ;
|
||||
lin Disj a b = {s = a.s ++ "or" ++ b.s} ;
|
||||
lin Impl a b = {s = "if" ++ a.s ++ "then" ++ b.s} ;
|
||||
lin Falsum = {s = ["we have a contradiction"]} ;
|
||||
}
|
||||
```
|
||||
The module ``LogicEng`` is a concrete syntax ``of`` the
|
||||
abstract syntax ``Logic``. The GF grammar compiler checks that
|
||||
the concrete is valid with respect to the abstract syntax ``of``
|
||||
which it is claimed to be. The validity requires that there has to be
|
||||
- a ``lincat`` judgement for each ``cat`` judgement, telling what the
|
||||
**linearization types** of categories are
|
||||
- a ``lin`` judgement for each ``fun`` judgement, telling what the
|
||||
**linearization functions** corresponding to functions are
|
||||
|
||||
|
||||
Validity also requires that the linearization functions defined by
|
||||
``lin`` judgements are type-correct with respect to the
|
||||
linearization types of the arguments and value of the function.
|
||||
|
||||
|
||||
|
||||
There can also be ``lindef`` and ``printname`` judgements in a
|
||||
concrete syntax.
|
||||
|
||||
|
||||
==Top-level grammar==
|
||||
|
||||
When a ``concrete`` module is successfully compiled, a ``gfc``
|
||||
file is produced in the same way as for ``abstract`` modules. The
|
||||
pair of an ``abstract`` and a corresponding ``concrete`` module
|
||||
is a **top-level grammar**, which can be used in the GF system to
|
||||
perform various tasks. The most fundamental tasks are
|
||||
- **linearization**: take an abstract syntax tree and find the corresponding string
|
||||
- **parsing**: take a string and find the corresponding abstract syntax
|
||||
trees (which can be zero, one, or many)
|
||||
|
||||
|
||||
In the current grammar, infinitely many trees and strings are recognized, although
|
||||
no very interesting ones. For example, the tree
|
||||
```
|
||||
Impl (Disj Falsum Falsum) Falsum
|
||||
```
|
||||
has the linearization
|
||||
```
|
||||
if we have a contradiction or we have a contradiction then we have a contradiction
|
||||
```
|
||||
which in turn can be parsed uniquely as that tree.
|
||||
|
||||
|
||||
===Compiling top-level grammars===
|
||||
|
||||
When GF compiles the module ``LogicEng`` it also has to compile
|
||||
all modules that it **depends** on (in this case, just ``Logic``).
|
||||
The compilation process starts with dependency analysis to find
|
||||
all these modules, recursively, starting from the explicitly imported one.
|
||||
The compiler then reads either ``gf`` or ``gfc`` files, in
|
||||
a dependency order. The decision on which files to read depends on
|
||||
time stamps and dependencies in a natural way, so that all and only
|
||||
those modules that have to be compiled are compiled. (This behaviour can
|
||||
be changed with flags, see below.)
|
||||
|
||||
|
||||
===Using top-level grammars===
|
||||
|
||||
To use a top-level grammar in the GF system, one uses the ``import``
|
||||
command (short name ``i``). For instance,
|
||||
```
|
||||
i LogicEng.gf
|
||||
```
|
||||
It is also possible to specify the imported grammar(s) on the command
|
||||
line when invoking GF:
|
||||
```
|
||||
gf LogicEng.gf
|
||||
```
|
||||
Various **compilation flags** can be added to both ways of compiling a module:
|
||||
- ``-src`` forces compilation form source files
|
||||
- ``-v`` gives more verbose information on compilation
|
||||
- ``-s`` makes compilation silent (except if it fails with an error message)
|
||||
|
||||
|
||||
A complete list of flags can be obtained in GF by ``help i``.
|
||||
|
||||
Importing a grammar makes it visible in GF's **internal state**. To see
|
||||
what modules are available, use the command ``print_options`` (``po``).
|
||||
You can empty the state with the command ``empty`` (``e``); this is
|
||||
needed if you want to read in grammars with a different abstract syntax
|
||||
than the current one without exiting GF.
|
||||
|
||||
|
||||
|
||||
Grammar modules can reside in different directories. They can then be found
|
||||
by means of a **search path**, which is a flag such as
|
||||
```
|
||||
-path=.:api/toplevel:prelude
|
||||
```
|
||||
given to the ``import`` command or the shell command invoking GF.
|
||||
(It can also be defined in the grammar file; see below.) The compiler
|
||||
writes every ``gfc`` file in the same directory as the corresponding
|
||||
``gf`` file.
|
||||
|
||||
The ``path`` is relative to the working directory ``pwd``, so that
|
||||
all directories listed are primarily interpreted as subdirectories of
|
||||
``pwd``. Secondarily, they are searched relative to the value of the
|
||||
environment variable ``GF_LIB_PATH``, which is by default set to
|
||||
``/usr/local/share/GF``.
|
||||
|
||||
Parsing and linearization can be performed with the ``parse``
|
||||
(``p``) and ``linearize`` (``l``) commands, respectively.
|
||||
For instance,
|
||||
```
|
||||
> l Impl (Disj Falsum Falsum) Falsum
|
||||
if we have a contradiction or we have a contradiction then we have a contradiction
|
||||
|
||||
> p -cat=Prop "we have a contradiction"
|
||||
Falsum
|
||||
```
|
||||
Notice that the ``parse`` command needs the parsing category
|
||||
as a flag. This necessary since a grammar can have several
|
||||
possible parsing categories ("entry points").
|
||||
|
||||
|
||||
|
||||
==Multilingual grammar==
|
||||
|
||||
One ``abstract`` syntax can have several ``concrete`` syntaxes.
|
||||
Here are two new ones for ``Logic``:
|
||||
```
|
||||
concrete LogicFre of Logic = {
|
||||
lincat Prop = {s : Str} ;
|
||||
lin Conj a b = {s = a.s ++ "et" ++ b.s} ;
|
||||
lin Disj a b = {s = a.s ++ "ou" ++ b.s} ;
|
||||
lin Impl a b = {s = "si" ++ a.s ++ "alors" ++ b.s} ;
|
||||
lin Falsum = {s = ["nous avons une contradiction"]} ;
|
||||
}
|
||||
|
||||
concrete LogicSymb of Logic = {
|
||||
lincat Prop = {s : Str} ;
|
||||
lin Conj a b = {s = "(" ++ a.s ++ "&" ++ b.s ++ ")"} ;
|
||||
lin Disj a b = {s = "(" ++ a.s ++ "v" ++ b.s ++ ")"} ;
|
||||
lin Impl a b = {s = "(" ++ a.s ++ "->" ++ b.s ++ ")"} ;
|
||||
lin Falsum = {s = "_|_"} ;
|
||||
}
|
||||
```
|
||||
The four modules ``Logic``, ``LogicEng``, ``LogicFre``, and
|
||||
``LogicSymb`` together form a **multilingual grammar**, in which
|
||||
it is possible to perform parsing and linearization with respect to any
|
||||
of the concrete syntaxes. As a combination of parsing and linearization,
|
||||
one can also perform **translation** from one language to another.
|
||||
(By **language** we mean the set of expressions generated by one
|
||||
concrete syntax.)
|
||||
|
||||
|
||||
===Using multilingual grammars===
|
||||
|
||||
Any combination of abstract syntax and corresponding concrete syntaxes
|
||||
is thus a multilingual grammar. With many languages and other enrichments
|
||||
(as described below), a multilingual grammar easily grows to the size of
|
||||
tens of modules. The grammar developer, having finished her job, can
|
||||
package the result in a **multilingual canonical grammar**, a file
|
||||
with the suffix ``.gfcm``. For instance, to compile the set of grammars
|
||||
described by now, the following sequence of GF commands can be used:
|
||||
```
|
||||
i LogicEng.gf
|
||||
i LogicFre.gf
|
||||
i LogicSymb.gf
|
||||
pm | wf logic.gfcm
|
||||
```
|
||||
The "end user" of the grammar only needs the file ``logic.gfcm`` to
|
||||
access all the functionality of the multilingual grammar. It can be
|
||||
imported in the GF system in the same way as ``.gf`` files. But
|
||||
it can also be used in the
|
||||
[Embedded Java Interpreter for GF http://www.cs.chalmers.se/~bringert/gf/gf-java.html]
|
||||
to build Java programs of which the multilingual grammar functionalities
|
||||
(linearization, parsing, translation) form a part.
|
||||
|
||||
In a multilingual grammar, the concrete syntax module names work as
|
||||
names of languages that can be selected for linearization and parsing:
|
||||
```
|
||||
> l -lang=LogicFre Impl Falsum Falsum
|
||||
si nous avons une contradiction alors nous avons une contradiction
|
||||
|
||||
> l -lang=LogicSymb Impl Falsum Falsum
|
||||
( _|_ -> _|_ )
|
||||
|
||||
> p -cat=Prop -lang=LogicSymb "( _|_ & _|_ )"
|
||||
Conj Falsum Falsum
|
||||
```
|
||||
The option ``-multi`` gives linearization to all languages:
|
||||
```
|
||||
> l -multi Impl Falsum Falsum
|
||||
if we have a contradiction then we have a contradiction
|
||||
si nous avons une contradiction alors nous avons une contradiction
|
||||
( _|_ -> _|_ )
|
||||
```
|
||||
Translation can be obtained by using a **pipe** from a parser
|
||||
to a linearizer:
|
||||
```
|
||||
> p -cat=Prop -lang=LogicSymb "( _|_ & _|_ )" | l -lang=LogicEng
|
||||
if we have a contradiction then we have a contradiction
|
||||
```
|
||||
|
||||
|
||||
|
||||
==Resource modules==
|
||||
|
||||
The ``concrete`` modules shown above would look much nicer if
|
||||
we used the main idea of functional programming: avoid repetitive
|
||||
code by using **functions** that capture repeated patterns of
|
||||
expressions. A collection of such functions can be a valuable
|
||||
**resource** for a programmer, reusable in many different
|
||||
top-level grammars. Thus we introduce the ``resource``
|
||||
module type, with the first example
|
||||
```
|
||||
resource Util = {
|
||||
oper SS : Type = {s : Str} ;
|
||||
oper ss : Str -> SS = \s -> {s = s} ;
|
||||
oper paren : Str -> Str = \s -> "(" ++ s ++ ")" ;
|
||||
oper infix : Str -> SS -> SS -> SS = \h,x,y ->
|
||||
ss (x.s ++ h ++ y.s) ;
|
||||
oper infixp : Str -> SS -> SS -> SS = \h,x,y ->
|
||||
ss (paren (infix h x y)) ;
|
||||
}
|
||||
```
|
||||
Modules of ``resource`` type have two forms of judgement:
|
||||
|
||||
- ``oper`` defining auxiliary operations
|
||||
- ``param`` defining parameter types
|
||||
|
||||
|
||||
A ``resource`` can be used in a ``concrete`` (or another
|
||||
``resource``) by ``open``ing it. This means that
|
||||
all operations (and parameter types) defined in the resource
|
||||
module become usable in module that opens it. For instance,
|
||||
we can rewrite the module ``LogicSymb`` much more concisely:
|
||||
```
|
||||
concrete LogicSymb of Logic = open Util in {
|
||||
lincat Prop = SS ;
|
||||
lin Conj = infixp "&" ;
|
||||
lin Disj = infixp "v" ;
|
||||
lin Impl = infixp "->" ;
|
||||
lin Falsum = ss "_|_" ;
|
||||
}
|
||||
```
|
||||
What happens when this variant of ``LogicSymb`` is
|
||||
compiled is that the ``oper``-defined constants
|
||||
of ``Util`` are **inlined** in the
|
||||
right-hand-sides of the judgements of ``LogicSymb``,
|
||||
and these expressions are **partially evaluated**, i.e.
|
||||
computed as far as possible. The generated ``gfc`` file
|
||||
will look just like the file generated for the first version
|
||||
of ``LogicSymb`` - at least, it will do the same job.
|
||||
|
||||
|
||||
Several ``resource`` modules can be ``open``ed
|
||||
at the same time. If the modules contain same names, the
|
||||
conflict can be resolved by **qualified** opening and
|
||||
reference. For instance,
|
||||
```
|
||||
concrete LogicSymb of Logic = open Util, Prelude in { ...
|
||||
} ;
|
||||
```
|
||||
(where ``Prelude`` is a standard library of GF) brings
|
||||
into scope two definitions of the constant ``SS``.
|
||||
To specify which one is used, you can write
|
||||
``Util.SS`` or ``Prelude.SS`` instead of just ``SS``.
|
||||
You can also introduce abbreviations to avoid long qualifiers, e.g.
|
||||
```
|
||||
concrete LogicSymb of Logic = open (U=Util), (P=Prelude) in { ...
|
||||
} ;
|
||||
```
|
||||
which means that you can write ``U.SS`` and ``P.SS``.
|
||||
|
||||
Judgements of ``param`` and ``oper`` forms may also be used
|
||||
in ``concrete`` modules, and they are then considered local
|
||||
to those modules, i.e. they are not exported.
|
||||
|
||||
|
||||
|
||||
===Compiling resource modules===
|
||||
|
||||
The compilation of a ``resource`` module differs
|
||||
from the compilation of ``abstract`` and
|
||||
``concrete`` modules because ``oper`` operations
|
||||
do not in general have values in ``gfc``. A ``gfc``
|
||||
file //is// generated, but it contains only
|
||||
``param`` judgements (also recall that ``oper``s
|
||||
are inlined in their top-level use sites, so it is not
|
||||
necessary to save them in the compiled grammar).
|
||||
However, since computing the operations over and over
|
||||
again can be time comsuming, and since type checking
|
||||
``resource`` modules also takes time, a third kind
|
||||
of file is generated for resource modules: a ``.gfr``
|
||||
file. This file is written in the GF source code notation,
|
||||
but it is type checked and type annotated, and ``oper``s
|
||||
are computed as far as possible.
|
||||
|
||||
|
||||
|
||||
If you look at any ``gfc`` or ``gfr`` file generated
|
||||
by the GF compiler, you see that all names have been replaced by
|
||||
their qualified variants. This is an important first step (after parsing)
|
||||
the compiler does. As for the commands in the GF shell, some output
|
||||
qualified names and some not. The difference does not always result
|
||||
from firm principles.
|
||||
|
||||
|
||||
===Using resource modules===
|
||||
|
||||
The typical use is through ``open`` in a
|
||||
``concrete`` module, which means that
|
||||
``resource`` modules are not imported on their own.
|
||||
However, in the developing and testing phase of grammars, it
|
||||
can be useful to evaluate ``oper``s with different
|
||||
arguments. To prevent them from being thrown away after inlining, the
|
||||
``-retain`` option can be used:
|
||||
```
|
||||
> i -retain Util.gf
|
||||
```
|
||||
The command ``compute_concrete`` (``cc``)
|
||||
can now be used for evaluating expressions that may contain
|
||||
operations defined in ``Util``:
|
||||
```
|
||||
> cc ss (paren "foo")
|
||||
{s = "(" ++ "foo" ++ ")"}
|
||||
```
|
||||
To find out what ``oper``s are available for a given type,
|
||||
the command ``show_operations`` (``so``) can be used:
|
||||
```
|
||||
> so SS
|
||||
Util.ss : Str -> SS ;
|
||||
Util.infix : Str -> SS -> SS -> SS ;
|
||||
Util.infixp : Str -> SS -> SS -> SS ;
|
||||
```
|
||||
|
||||
|
||||
|
||||
|
||||
==Inheritance==
|
||||
|
||||
The most characteristic modularity of GF lies in the division of
|
||||
grammars into ``abstract``, ``concrete``, and
|
||||
``resource`` modules. This permits writing multilingual
|
||||
grammar and sharing the maximum of code between different
|
||||
languages.
|
||||
|
||||
|
||||
In addition to this special kind of modularity, GF provides **inheritance**,
|
||||
which is familiar from other programming languages (in particular,
|
||||
object-oriented ones). Inheritance means that a module inherits all
|
||||
judgements from another module; we also say that it **extends**
|
||||
the other module. Inheritance is useful to divide big grammars into
|
||||
smaller units, and also to reuse the same units in different bigger
|
||||
grammars.
|
||||
|
||||
|
||||
|
||||
The first example of inheritance is for abstract syntax. Let us
|
||||
extend the module ``Logic`` to ``Arithmetic``:
|
||||
```
|
||||
abstract Arithmetic = Logic ** {
|
||||
cat Nat ;
|
||||
fun Even : Nat -> Prop ;
|
||||
fun Odd : Nat -> Prop ;
|
||||
fun Zero : Nat ;
|
||||
fun Succ : Nat -> Nat ;
|
||||
}
|
||||
```
|
||||
In parallel with the extension of the abstract syntax
|
||||
``Logic`` to ``Arithmetic``, we can extend
|
||||
the concrete syntax ``LogicEng`` to ``ArithmeticEng``:
|
||||
```
|
||||
concrete ArithmeticEng of Arithmetic = LogicEng ** open Util in {
|
||||
lincat Nat = SS ;
|
||||
lin Even x = ss (x.s ++ "is" ++ "even") ;
|
||||
lin Odd x = ss (x.s ++ "is" ++ "odd") ;
|
||||
lin Zero = ss "zero" ;
|
||||
lin Succ x = ss ("the" ++ "successor" ++ "of" ++ x.s) ;
|
||||
}
|
||||
```
|
||||
Another extension of ``Logic`` is ``Geometry``,
|
||||
```
|
||||
abstract Geometry = Logic ** {
|
||||
cat Point ;
|
||||
cat Line ;
|
||||
fun Incident : Point -> Line -> Prop ;
|
||||
}
|
||||
```
|
||||
The corresponding concrete syntax is left as exercise.
|
||||
|
||||
|
||||
===Multiple inheritance===
|
||||
|
||||
|
||||
Inheritance can be **multiple**, which means that a module
|
||||
may extend many modules at the same time. Suppose, for instance,
|
||||
that we want to build a module for mathematics covering both
|
||||
arithmetic and geometry, and the underlying logic. We then write
|
||||
```
|
||||
abstract Mathematics = Arithmetic, Geometry ** {
|
||||
} ;
|
||||
```
|
||||
We could of course add some new judgements in this module, but
|
||||
it is not necessary to do so. If no new judgements are added, the
|
||||
module body can be omitted:
|
||||
```
|
||||
abstract Mathematics = Arithmetic, Geometry ;
|
||||
```
|
||||
|
||||
The module ``Mathematics`` shows that it is possibe
|
||||
to extend a module already built by extension. The correctness
|
||||
criterion for extensions is that the same name
|
||||
(``cat``, ``fun``, ``oper``, or ``param``)
|
||||
may not be defined twice in the resulting union of names.
|
||||
That the names defined in ``Logic`` are "inherited twice"
|
||||
by ``Mathematics`` (via both ``Arithmetic`` and
|
||||
``Geometry``) is no violation of this rule; the usual
|
||||
problems of multiple inheritance do not arise, since
|
||||
the definitions of inherited constants cannot be changed.
|
||||
|
||||
|
||||
|
||||
===Restricted inheritance===
|
||||
|
||||
Inheritance can be **restricted**, which means that only some of
|
||||
the constants are inherited. There are two dual notations for this:
|
||||
```
|
||||
A [f,g]
|
||||
```
|
||||
meaning that //only// ``f`` and ``g`` are inherited from ``A``, and
|
||||
```
|
||||
A-[f,g]
|
||||
```
|
||||
meaning that //everything except// ``f`` is ``g`` are inherited from ``A``.
|
||||
|
||||
Constants that are not inherited may be redefined in the inheriting module.
|
||||
|
||||
|
||||
|
||||
|
||||
===Compiling inheritance===
|
||||
|
||||
Inherited judgements are not copied into the inheriting modules.
|
||||
Instead, an **indirection** is created for each inherited name,
|
||||
as can be seen by looking into the generated ``gfc`` (and
|
||||
``gfr``) files. Thus for instance the names
|
||||
```
|
||||
Mathematics.Prop Arithmetic.Prop Geometry.Prop Logic.Prop
|
||||
```
|
||||
all refer to the same category, declared in the module
|
||||
``Logic``.
|
||||
|
||||
|
||||
|
||||
===Inspecting grammar hierarchies===
|
||||
|
||||
The command ``visualize_graph`` (``vg``) shows the
|
||||
dependency graph in the current GF shell state. The graph can
|
||||
also be saved in a file and used e.g. in documentation, by the
|
||||
command ``print_multi -graph`` (``pm -graph``).
|
||||
|
||||
The ``vg`` command uses the free software packages Graphviz (commad ``dot``)
|
||||
and Ghostscript (command ``gv``).
|
||||
|
||||
|
||||
|
||||
==Reuse of top-level grammars as resources==
|
||||
|
||||
Top-level grammars have a straightforward translation to
|
||||
``resource`` modules. The translation concerns
|
||||
pairs of abstract-concrete judgements:
|
||||
```
|
||||
cat C ; ===> oper C : Type = T ;
|
||||
lincat C = T ;
|
||||
|
||||
fun f : A ; ===> oper f : A = t ;
|
||||
lin f = t ;
|
||||
```
|
||||
Due to this translation, a ``concrete`` module
|
||||
can be ``open``ed in the same way as a
|
||||
``resource`` module; the translation is done
|
||||
on the fly (it is computationally very cheap).
|
||||
|
||||
Modular grammar engineering often means that some grammarians
|
||||
focus on the semantics of the domain whereas others take care
|
||||
of linguistic details. Thus a typical reuse opens a
|
||||
linguistically oriented **resource grammar**,
|
||||
```
|
||||
abstract Resource = {
|
||||
cat S ; NP ; A ;
|
||||
fun PredA : NP -> A -> S ;
|
||||
}
|
||||
concrete ResourceEng of Resource = {
|
||||
lincat S = ... ;
|
||||
lin PredA = ... ;
|
||||
}
|
||||
```
|
||||
The **application grammar**, instead of giving linearizations
|
||||
explicitly, just reduces them to categories and functions in the
|
||||
resource grammar:
|
||||
```
|
||||
concrete ArithmeticEng of Arithmetic = LogicEng ** open ResourceEng in {
|
||||
lincat Nat = NP ;
|
||||
lin Even x = PredA x (regA "even") ;
|
||||
}
|
||||
```
|
||||
If the resource grammar is only capable of generating grammatically
|
||||
correct expressions, then the grammaticality of the application
|
||||
grammar is also guaranteed: the type checker of GF is used as
|
||||
grammar checker.
|
||||
To guarantee distinctions between categories that have
|
||||
the same linearization type, the actual translation used
|
||||
in GF adds to every linearization type and linearization
|
||||
a **lock field**,
|
||||
```
|
||||
cat C ; ===> oper C : Type = T ** {lock_C : {}} ;
|
||||
lincat C = T ;
|
||||
|
||||
fun f : C_1 ... C_n -> C ; ===> oper f : C_1 ... C_n -> C = \x_1,...,x_n ->
|
||||
lin f = t ; t x_1 ... x_n ** {lock_C = <>};
|
||||
```
|
||||
(Notice that the latter translation is type-correct because of
|
||||
record subtyping, which means that ``t`` can ignore the
|
||||
lock fields of its arguments.) An application grammarian who
|
||||
only uses resource grammar categories and functions never
|
||||
needs to write these lock fields herself. Having to do so
|
||||
serves as a warning that the grammaticality guarantee given
|
||||
by the resource grammar no longer holds.
|
||||
|
||||
**Note**. The lock field mechanism is experimental, and may be changed
|
||||
to a stronger abstraction mechnism in the future. This may result in
|
||||
hand-written lock fields ceasing to work.
|
||||
|
||||
|
||||
=Additional module types=
|
||||
|
||||
==Interfaces, instances, and incomplete grammars==
|
||||
|
||||
One difference between top-level grammars and ``resource``
|
||||
modules is that the former systematically separete the
|
||||
declarations of categories and functions from their definitions.
|
||||
In the reuse translation creating and ``oper`` judgement,
|
||||
the declaration coming from the ``abstract`` module is put
|
||||
together with the definition coming from the ``concrete``
|
||||
module.
|
||||
|
||||
|
||||
|
||||
However, the separation of declarations and definitions is so
|
||||
useful a notion that GF also has specific modules types that
|
||||
``resource`` modules into two parts. In this splitting,
|
||||
an ``interface`` module corresponds to an abstract syntax,
|
||||
in giving the declarations of operations (and parameter types).
|
||||
For instance, a generic markup interface would look as follows:
|
||||
```
|
||||
interface Markup = open Util in {
|
||||
oper Boldface : Str -> Str ;
|
||||
oper Heading : Str -> Str ;
|
||||
oper markupSS : (Str -> Str) -> SS -> SS = \f,r ->
|
||||
ss (f r.s) ;
|
||||
}
|
||||
```
|
||||
The definitions of the constants declared in an ``interface``
|
||||
are given in an ``instance`` module (which is always ``of``
|
||||
an interface, in the same way as a ``concrete`` is always
|
||||
``of`` an abstract). The following ``instance``s
|
||||
define markup in HTML and latex.
|
||||
```
|
||||
instance MarkupHTML of Markup = open Util in {
|
||||
oper Boldface s = "<b>" ++ s ++ "</b>" ;
|
||||
oper Heading s = "<h2>" ++ s ++ "</h2>" ;
|
||||
}
|
||||
|
||||
instance MarkupLatex of Markup = open Util in {
|
||||
oper Boldface s = "\\textbf{" ++ s ++ "}" ;
|
||||
oper Heading s = "\\section{" ++ s ++ "}" ;
|
||||
}
|
||||
```
|
||||
Notice that both ``interface``s and ``instance``s may
|
||||
``open`` ``resource``s (and also reused top-level grammars).
|
||||
An ``interface`` may moreover define some of the operations it
|
||||
declares; these definitions are inherited by all instances and cannot
|
||||
be changed in them. Inheritance by module extension
|
||||
is possible, as always, between modules of the same type.
|
||||
|
||||
|
||||
===Using an interface===
|
||||
|
||||
An ``interface`` or an ``instance``
|
||||
can be ``open``ed in
|
||||
a ``concrete`` using the same syntax as when opening
|
||||
a ``resource``. For an ``instance``, the semantics
|
||||
is the same as when opening the definitions together with
|
||||
the type signatures - one can think of an ``interface``
|
||||
and an ``instance`` of it together forming an ordinary
|
||||
``resource``. Opening an ``interface``, however,
|
||||
is different: functions that are only declared without
|
||||
having a definition cannot be compiled (inlined); neither
|
||||
can functions whose definitions depend on undefined functions.
|
||||
|
||||
|
||||
|
||||
A module that ``open``s an ``interface`` is therefore
|
||||
**incomplete**, and has to be **completed** with an
|
||||
``instance`` of the interface to become complete. To make
|
||||
this situation clear, GF requires any module that opens an
|
||||
``interface`` to be marked as ``incomplete``. Thus
|
||||
the module
|
||||
```
|
||||
incomplete concrete DocMarkup of Doc = open Markup in {
|
||||
...
|
||||
}
|
||||
```
|
||||
uses the interface ``Markup`` to place markup in
|
||||
chosen places in its linearization rules, but the
|
||||
implementation of markup - whether in HTML or in LaTeX - is
|
||||
left unspecified. This is a powerful way of sharing
|
||||
the code of a whole module with just differences in
|
||||
the definitions of some constants.
|
||||
|
||||
|
||||
|
||||
Another terminology for ``incomplete`` modules is
|
||||
**parametrized modules** or **functors**.
|
||||
The ``interface`` gives the list of parameters
|
||||
that the functor depends on.
|
||||
|
||||
|
||||
===Instantiating an interface===
|
||||
|
||||
To complete an ``incomplete`` module, each ``inteface``
|
||||
that it opens has to be provided an ``instance``. The following
|
||||
syntax is used for this:
|
||||
```
|
||||
concrete DocHTML of Doc = DocMarkup with (Markup = MarkupHTML) ;
|
||||
```
|
||||
Instantiation of ``Markup`` with ``MarkupLatex`` is
|
||||
another one-liner.
|
||||
|
||||
If more interfaces than one are instantiated, a comma-separated
|
||||
list of equations in parentheses is used, e.g.
|
||||
```
|
||||
concrete MusicIta = MusicI with
|
||||
(Syntax = SyntaxIta), (LexMusic = LexMusicIta) ;
|
||||
```
|
||||
This example shows a common design pattern for building applications:
|
||||
the concrete syntax is a functor on the generic resource grammar library
|
||||
interface ``Syntax`` and a domain-specific lexicon interface, here
|
||||
``LexMusic``.
|
||||
|
||||
All interfaces that are ``open``ed in the completed model
|
||||
must be completed.
|
||||
|
||||
Notice that the completion of an ``incomplete`` module
|
||||
may at the same time extend modules of the same type (which need
|
||||
not be completions). It can also add new judgements in a module body,
|
||||
and restrict inheritance from the functor.
|
||||
```
|
||||
concrete MusicIta = MusicI - [f] with
|
||||
(Syntax = SyntaxIta), (LexMusic = LexMusicIta) ** {
|
||||
|
||||
lin f = ...
|
||||
|
||||
} ;
|
||||
```
|
||||
|
||||
|
||||
===Compiling interfaces, instances, and parametrized modules===
|
||||
|
||||
Interfaces, instances, and parametric modules are purely a
|
||||
front-end feature of GF: these module types do not exist in
|
||||
the ``gfc`` and ``gfr`` formats. The compiler has
|
||||
nevertheless to keep track of their dependencies and modification
|
||||
times. Here is a summary of how they are compiled:
|
||||
- an ``interface`` is compiled into a ``resource`` with an empty body
|
||||
- an ``instance`` is compiled into a ``resource`` in union with its
|
||||
``interface``
|
||||
- an ``incomplete`` module (``concrete`` or ``resource``) is compiled
|
||||
into a module of the same type with an empty body
|
||||
- a completion module (``concrete`` or ``resource``) is compiled
|
||||
into a module of the same type by compiling its functor so that, instead of
|
||||
each ``interface``, its given ``instance`` is used
|
||||
|
||||
|
||||
This means that some generated code is duplicated, because those operations that
|
||||
do have complete definitions in an ``interface`` are copied to each of
|
||||
the ``instances``.
|
||||
|
||||
|
||||
=Summary of module syntax and semantics=
|
||||
|
||||
|
||||
==Abstract syntax modules==
|
||||
|
||||
Syntax:
|
||||
|
||||
``abstract`` A ``=`` (A#SUB1,...,A#SUBn ``**``)?
|
||||
``{``J#SUB1 ``;`` ... ``;`` J#SUBm ``; }``
|
||||
|
||||
|
||||
|
||||
where
|
||||
- i >= 0
|
||||
- each //A#SUBi// is itself an abstract module,
|
||||
possibly with restrictions on inheritance, i.e. //A#SUBi//``-[``//f,..,g//``]``
|
||||
or //A#SUBi//``[``//f,..,g//``]``
|
||||
- each //J#SUBi// is a judgement of one of the forms
|
||||
``cat, fun, def, data``
|
||||
|
||||
|
||||
Semantic conditions:
|
||||
- all inherited names declared in each //A#SUBi// and //A// must be distinct
|
||||
- names in restriction lists must be defined in the restricted module
|
||||
- inherited constants may not depend on names excluded by restriction
|
||||
|
||||
|
||||
|
||||
==Concrete syntax modules==
|
||||
|
||||
Syntax:
|
||||
|
||||
``incomplete``? ``concrete`` C ``of`` A ``=``
|
||||
(C#SUB1,...,C#SUBn ``**``)?
|
||||
(``open`` O#SUB1,...,O#SUBk ``in``)?
|
||||
``{``J#SUB1 ``;`` ... ``;`` J#SUBm ``; }``
|
||||
|
||||
|
||||
|
||||
where
|
||||
- i >= 0
|
||||
- //A// is an abstract module
|
||||
- each //C#SUBi// is a concrete module,
|
||||
possibly with restrictions on inheritance, i.e. //C#SUBi//``-[``//f,..,g//``]``
|
||||
- each //O#SUBi// is an open specification, of one of the forms
|
||||
- //R//
|
||||
- ``(``//Q//``=``//R//``)``
|
||||
|
||||
|
||||
where //R// is a resource, instance, or concrete, and //Q// is any identifier
|
||||
- each //J#SUBi// is a judgement of one of the forms
|
||||
``lincat, lin, lindef, printname``; also the forms ``oper, param`` are
|
||||
allowed, but they cannot be inherited.
|
||||
|
||||
|
||||
|
||||
If the modifier ``incomplete`` appears, then any //R// in
|
||||
an open specification may also be an interface or an abstract.
|
||||
|
||||
|
||||
Semantic conditions:
|
||||
- each ``cat`` judgement in //A//
|
||||
must have a corresponding, unique
|
||||
``lincat`` judgement in //C//
|
||||
- each ``fun`` judgement in //A//
|
||||
must have a corresponding, unique
|
||||
``lin`` judgement in //C//
|
||||
- names in restriction lists must be defined in the restricted module
|
||||
- inherited constants may not depend on names excluded by restriction
|
||||
|
||||
|
||||
|
||||
==Resource modules==
|
||||
|
||||
Syntax:
|
||||
|
||||
``resource`` R ``=``
|
||||
(R#SUB1,...,R#SUBn ``**``)?
|
||||
(``open`` O#SUB1,...,O#SUBk ``in``)?
|
||||
``{``J#SUB1 ``;`` ... ``;`` J#SUBm ``; }``
|
||||
|
||||
|
||||
where
|
||||
- i >= 0
|
||||
- each //R#SUBi// is a resource, instance, or concrete module,
|
||||
possibly with restrictions on inheritance, i.e. //R#SUBi//``-[``//f,..,g//``]``
|
||||
- each //O#SUBi// is an open specification, of one of the forms
|
||||
- //P//
|
||||
- ``(``//Q//``=``//R//``)``
|
||||
|
||||
|
||||
where //P// is a resource, instance, or concrete, and //Q// is any identifier
|
||||
- each //J#SUBi// is a judgement of one of the forms ``oper, param``
|
||||
|
||||
|
||||
|
||||
|
||||
Semantic conditions:
|
||||
- all names defined in each //R#SUBi// and //R// must be distinct
|
||||
- all constants declared must have a definition
|
||||
- names in restriction lists must be defined in the restricted module
|
||||
- inherited constants may not depend on names excluded by restriction
|
||||
|
||||
|
||||
|
||||
==Interface modules==
|
||||
|
||||
Syntax:
|
||||
|
||||
``interface`` R ``=``
|
||||
(R#SUB1,...,R#SUBn ``**``)?
|
||||
(``open`` O#SUB1,...,O#SUBk ``in``)?
|
||||
``{``J#SUB1 ``;`` ... ``;`` J#SUBm ``; }``
|
||||
|
||||
|
||||
where
|
||||
- i >= 0
|
||||
- each //R#SUBi// is an interface or abstract module,
|
||||
possibly with restrictions on inheritance, i.e. //R#SUBi//``-[``//f,..,g//``]``
|
||||
- each //O#SUBi// is an open specification, of one of the forms
|
||||
- //P//
|
||||
- ``(``//Q//``=``//R//``)``
|
||||
|
||||
|
||||
where //P// is a resource, instance, or concrete, and //Q// is any identifier
|
||||
- each //J#SUBi// is a judgement of one of the forms ``oper, param``
|
||||
|
||||
|
||||
|
||||
|
||||
Semantic conditions:
|
||||
- all names declared in each //R#SUBi// and //R// must be distinct
|
||||
- names in restriction lists must be defined in the restricted module
|
||||
- inherited constants may not depend on names excluded by restriction
|
||||
|
||||
|
||||
|
||||
|
||||
==Instance modules==
|
||||
|
||||
Syntax:
|
||||
|
||||
``instance`` R ``of`` I ``=``
|
||||
(R#SUB1,...,R#SUBn ``**``)?
|
||||
(``open`` O#SUB1,...,O#SUBk ``in``)?
|
||||
``{``J#SUB1 ``;`` ... ``;`` J#SUBm ``; }``
|
||||
|
||||
|
||||
where
|
||||
- i >= 0
|
||||
- //I// is an interface module
|
||||
- each //R#SUBi// is an instance, resource, or concrete module,
|
||||
possibly with restrictions on inheritance, i.e. //R#SUBi//``-[``//f,..,g//``]``
|
||||
|
||||
- each //O#SUBi// is an open specification, of one of the forms
|
||||
- //P//
|
||||
- ``(``//Q//``=``//R//``)``
|
||||
|
||||
|
||||
where //P// is a resource, instance, or concrete, and //Q// is any identifier
|
||||
- each //J#SUBi// is a judgement of one of the forms
|
||||
``oper, param``
|
||||
|
||||
|
||||
|
||||
|
||||
Semantic conditions:
|
||||
- all names declared in each //R#SUBi//, //I//, and //R// must be distinct
|
||||
- all constants declared in //I// must have a definition either in
|
||||
//I// or //R//
|
||||
- names in restriction lists must be defined in the restricted module
|
||||
- inherited constants may not depend on names excluded by restriction
|
||||
|
||||
|
||||
|
||||
==Instantiated concrete syntax modules==
|
||||
|
||||
Syntax:
|
||||
|
||||
``concrete`` C ``of`` A ``=``
|
||||
(C#SUB1,...,C#SUBn ``**``)?
|
||||
B
|
||||
``with``
|
||||
``(``I#SUB1 ``=``J#SUB1``),`` ...
|
||||
``, (``I#SUBp ``=``J#SUBp``)``
|
||||
(``-``? ``[``c#SUB1,...,c#SUBq ``]``)?
|
||||
(``**``?
|
||||
(``open`` O#SUB1,...,O#SUBk ``in``)?
|
||||
``{``J#SUB1 ``;`` ... ``;`` J#SUBm ``; }``)? ``;``
|
||||
|
||||
where
|
||||
- i >= 0
|
||||
- //A// is an abstract module
|
||||
- each //C#SUBi// is a concrete module,
|
||||
possibly with restrictions on inheritance, i.e. //R#SUBi//``-[``//f,..,g//``]``
|
||||
- //B// is an incomplete concrete syntax of //A//
|
||||
- each //I#SUBi// is an interface or an abstract
|
||||
- each //J#SUBi// is an instance or a concrete of //I#SUBi//
|
||||
- each //O#SUBi// is an open specification, of one of the forms
|
||||
- //R//
|
||||
- ``(``//Q//``=``//R//``)``
|
||||
|
||||
|
||||
where //R// is a resource, instance, or concrete, and //Q// is any identifier
|
||||
- each //J#SUBi// is a judgement of one of the forms
|
||||
``lincat, lin, lindef, printname``; also the forms ``oper, param`` are
|
||||
allowed, but they cannot be inherited.
|
||||
|
||||
|
||||
|
||||
|
||||
Reference in New Issue
Block a user