tutorial goes on

This commit is contained in:
aarne
2005-12-16 21:19:32 +00:00
parent 7110ad70cc
commit 5161a93ae8
2 changed files with 265 additions and 240 deletions


@@ -7,7 +7,7 @@
<P ALIGN="center"><CENTER><H1>Grammatical Framework Tutorial</H1> <P ALIGN="center"><CENTER><H1>Grammatical Framework Tutorial</H1>
<FONT SIZE="4"> <FONT SIZE="4">
<I>Author: Aarne Ranta &lt;aarne (at) cs.chalmers.se&gt;</I><BR> <I>Author: Aarne Ranta &lt;aarne (at) cs.chalmers.se&gt;</I><BR>
Last update: Fri Dec 16 21:04:37 2005 Last update: Fri Dec 16 22:10:53 2005
</FONT></CENTER> </FONT></CENTER>
<P></P> <P></P>
@@ -18,7 +18,7 @@ Last update: Fri Dec 16 21:04:37 2005
<UL> <UL>
<LI><A HREF="#toc2">Getting the GF program</A> <LI><A HREF="#toc2">Getting the GF program</A>
</UL> </UL>
<LI><A HREF="#toc3">My first grammar</A> <LI><A HREF="#toc3">The ``.cf`` grammar format</A>
<UL> <UL>
<LI><A HREF="#toc4">Importing grammars and parsing strings</A> <LI><A HREF="#toc4">Importing grammars and parsing strings</A>
<LI><A HREF="#toc5">Generating trees and strings</A> <LI><A HREF="#toc5">Generating trees and strings</A>
@@ -28,25 +28,60 @@ Last update: Fri Dec 16 21:04:37 2005
<LI><A HREF="#toc9">More on pipes; tracing</A> <LI><A HREF="#toc9">More on pipes; tracing</A>
<LI><A HREF="#toc10">Writing and reading files</A> <LI><A HREF="#toc10">Writing and reading files</A>
<LI><A HREF="#toc11">Labelled context-free grammars</A> <LI><A HREF="#toc11">Labelled context-free grammars</A>
<LI><A HREF="#toc12">The labelled context-free format</A>
</UL> </UL>
<LI><A HREF="#toc12">The GF grammar format</A> <LI><A HREF="#toc13">The ``.gf`` grammar format</A>
<UL> <UL>
<LI><A HREF="#toc13">Abstract and concrete syntax</A> <LI><A HREF="#toc14">Abstract and concrete syntax</A>
<LI><A HREF="#toc14">Resource modules</A> <LI><A HREF="#toc15">Judgement forms</A>
<LI><A HREF="#toc15">Opening a ``resource``</A> <LI><A HREF="#toc16">Module types</A>
<LI><A HREF="#toc17">Record types, records, and ``Str``s</A>
<LI><A HREF="#toc18">An abstract syntax example</A>
<LI><A HREF="#toc19">A concrete syntax example</A>
<LI><A HREF="#toc20">Modules and files</A>
</UL> </UL>
<LI><A HREF="#toc16">Topics still to be written</A> <LI><A HREF="#toc21">Multilingual grammars and translation</A>
<UL> <UL>
<LI><A HREF="#toc17">Free variation</A> <LI><A HREF="#toc22">An Italian concrete syntax</A>
<LI><A HREF="#toc18">Record extension, tuples</A> <LI><A HREF="#toc23">Using a multilingual grammar</A>
<LI><A HREF="#toc19">Predefined types and operations</A> <LI><A HREF="#toc24">Translation quiz</A>
<LI><A HREF="#toc20">Lexers and unlexers</A> <LI><A HREF="#toc25">The multilingual shell state</A>
<LI><A HREF="#toc21">Grammars of formal languages</A> </UL>
<LI><A HREF="#toc22">Resource grammars and their reuse</A> <LI><A HREF="#toc26">Grammar architecture</A>
<LI><A HREF="#toc23">Embedded grammars in Haskell, Java, and Prolog</A> <UL>
<LI><A HREF="#toc24">Dependent types, variable bindings, semantic definitions</A> <LI><A HREF="#toc27">Extending a grammar</A>
<LI><A HREF="#toc25">Transfer modules</A> <LI><A HREF="#toc28">Multiple inheritance</A>
<LI><A HREF="#toc26">Alternative input and output grammar formats</A> <LI><A HREF="#toc29">Visualizing module structure</A>
<LI><A HREF="#toc30">The module structure of ``GathererEng``</A>
</UL>
<LI><A HREF="#toc31">Resource modules</A>
<UL>
<LI><A HREF="#toc32">Parameters and tables</A>
<LI><A HREF="#toc33">Inflection tables, paradigms, and ``oper`` definitions</A>
<LI><A HREF="#toc34">The ``resource`` module type</A>
<LI><A HREF="#toc35">Opening a ``resource``</A>
<LI><A HREF="#toc36">Worst-case macros and data abstraction</A>
<LI><A HREF="#toc37">A system of paradigms using ``Prelude`` operations</A>
<LI><A HREF="#toc38">An intelligent noun paradigm using ``case`` expressions</A>
<LI><A HREF="#toc39">Pattern matching</A>
<LI><A HREF="#toc40">Morphological analysis and morphology quiz</A>
<LI><A HREF="#toc41">Parametric vs. inherent features, agreement</A>
<LI><A HREF="#toc42">English concrete syntax with parameters</A>
<LI><A HREF="#toc43">Hierarchic parameter types</A>
<LI><A HREF="#toc44">Discontinuous constituents</A>
</UL>
<LI><A HREF="#toc45">Topics still to be written</A>
<UL>
<LI><A HREF="#toc46">Free variation</A>
<LI><A HREF="#toc47">Record extension, tuples</A>
<LI><A HREF="#toc48">Predefined types and operations</A>
<LI><A HREF="#toc49">Lexers and unlexers</A>
<LI><A HREF="#toc50">Grammars of formal languages</A>
<LI><A HREF="#toc51">Resource grammars and their reuse</A>
<LI><A HREF="#toc52">Embedded grammars in Haskell, Java, and Prolog</A>
<LI><A HREF="#toc53">Dependent types, variable bindings, semantic definitions</A>
<LI><A HREF="#toc54">Transfer modules</A>
<LI><A HREF="#toc55">Alternative input and output grammar formats</A>
</UL> </UL>
</UL> </UL>
@@ -109,7 +144,7 @@ To start the GF program, assuming you have installed it, just type
in the shell. You will see GF's welcome message and the prompt <CODE>&gt;</CODE>. in the shell. You will see GF's welcome message and the prompt <CODE>&gt;</CODE>.
</P> </P>
<A NAME="toc3"></A> <A NAME="toc3"></A>
<H2>My first grammar</H2> <H2>The ``.cf`` grammar format</H2>
<P> <P>
Now you are ready to try out your first grammar. Now you are ready to try out your first grammar.
We start with one that is not written in GF language, but We start with one that is not written in GF language, but
@@ -260,7 +295,7 @@ generate ten strings with one and the same command:
<A NAME="toc8"></A> <A NAME="toc8"></A>
<H3>Systematic generation</H3> <H3>Systematic generation</H3>
<P> <P>
To generate &lt;i&gt;all&lt;i&gt; sentences that a grammar To generate <I>all</I> sentences that a grammar
can generate, use the command <CODE>generate_trees = gt</CODE>. can generate, use the command <CODE>generate_trees = gt</CODE>.
</P> </P>
<PRE> <PRE>
@@ -301,9 +336,10 @@ want to see:
</P> </P>
<PRE> <PRE>
&gt; gr -tr | l -tr | p &gt; gr -tr | l -tr | p
Mks_0 (Mks_7 Mks_10) (Mks_1 Mks_18)
a louse sleeps S_NP_VP (NP_the_CN CN_snake) (VP_V V_sleeps)
Mks_0 (Mks_7 Mks_10) (Mks_1 Mks_18) the snake sleeps
S_NP_VP (NP_the_CN CN_snake) (VP_V V_sleeps)
</PRE> </PRE>
<P> <P>
This facility is good for test purposes: for instance, you This facility is good for test purposes: for instance, you
@@ -324,7 +360,7 @@ You can read the file back to GF with the
<CODE>read_file = rf</CODE> command, <CODE>read_file = rf</CODE> command,
</P> </P>
<PRE> <PRE>
&gt; read_file exx.tmp | l -tr | p -lines &gt; read_file exx.tmp | p -lines
</PRE> </PRE>
<P> <P>
Notice the flag <CODE>-lines</CODE> given to the parsing Notice the flag <CODE>-lines</CODE> given to the parsing
@@ -338,45 +374,51 @@ a sentence but a sequence of ten sentences.
<P> <P>
The syntax trees returned by GF's parser in the previous examples The syntax trees returned by GF's parser in the previous examples
are not so nice to look at. The identifiers of form <CODE>Mks</CODE> are not so nice to look at. The identifiers of form <CODE>Mks</CODE>
are <B>labels</B> of the EBNF rules. To see which label corresponds to are <B>labels</B> of the BNF rules. To see which label corresponds to
which rule, you can use the <CODE>print_grammar = pg</CODE> command which rule, you can use the <CODE>print_grammar = pg</CODE> command
with the <CODE>printer</CODE> flag set to <CODE>cf</CODE> (which means context-free): with the <CODE>printer</CODE> flag set to <CODE>cf</CODE> (which means context-free):
</P> </P>
<PRE> <PRE>
&gt; print_grammar -printer=cf &gt; print_grammar -printer=cf
Mks_10. CN ::= "louse" ;
Mks_11. CN ::= "snake" ; V_laughs. V ::= "laughs" ;
Mks_12. CN ::= "worm" ; V_sleeps. V ::= "sleeps" ;
Mks_8. CN ::= A CN ; V_swims. V ::= "swims" ;
Mks_9. CN ::= "boy" ; VP_TV_NP. VP ::= TV NP ;
Mks_4. NP ::= "this" CN ; VP_V. VP ::= V ;
Mks_15. A ::= "thick" ; VP_is_A. VP ::= "is" A ;
TV_eats. TV ::= "eats" ;
TV_kills. TV ::= "kills" ;
TV_washes. TV ::= "washes" ;
S_NP_VP. S ::= NP VP ;
NP_a_CN. NP ::= "a" ;
... ...
</PRE> </PRE>
<P> <P>
A syntax tree such as A syntax tree such as
</P> </P>
<PRE> <PRE>
Mks_4 (Mks_8 Mks_15 Mks_12) NP_this_CN (CN_A_CN A_thick CN_worm)
this thick worm this thick worm
</PRE> </PRE>
<P> <P>
encodes the sequence of grammar rules used for building the encodes the sequence of grammar rules used for building the
expression. If you look at this tree, you will notice that <CODE>Mks_4</CODE> expression. If you look at this tree, you will notice that <CODE>NP_this_CN</CODE>
is the label of the rule prefixing <CODE>this</CODE> to a common noun, is the label of the rule prefixing <CODE>this</CODE> to a common noun (<CODE>CN</CODE>),
<CODE>Mks_15</CODE> is the label of the adjective <CODE>thick</CODE>, thereby forming a noun phrase (<CODE>NP</CODE>).
and so on. <CODE>A_thick</CODE> is the label of the adjective <CODE>thick</CODE>,
</P> and so on. These labels are formed automatically when the grammar
<P> is compiled by GF.
&lt;h4&gt;The labelled context-free format&lt;h4&gt;
</P> </P>
<A NAME="toc12"></A>
<H3>The labelled context-free format</H3>
<P> <P>
The <B>labelled context-free grammar</B> format permits user-defined The <B>labelled context-free grammar</B> format permits user-defined
labels to each rule. labels to each rule.
GF recognizes files of this format by the suffix In files with the suffix <CODE>.cf</CODE>, you can prefix rules with
<CODE>.cf</CODE>. It is intermediate between EBNF and full GF format. labels that you provide yourself - these may be more useful
Let us include the following rules in the file than the automatically generated ones. The following is a possible
<CODE>paleolithic.cf</CODE>. labelling of <CODE>paleolithic.cf</CODE> with nicer-looking labels.
</P> </P>
<PRE> <PRE>
PredVP. S ::= NP VP ; PredVP. S ::= NP VP ;
@@ -403,25 +445,10 @@ Let us include the following rules in the file
Kill. TV ::= "kills" ; Kill. TV ::= "kills" ;
Wash. TV ::= "washes" ; Wash. TV ::= "washes" ;
</PRE> </PRE>
<P></P>
<P> <P>
&lt;h4&gt;Using the labelled context-free format&lt;h4&gt; With this grammar, the trees look as follows:
</P>
<P>
The GF commands for the <CODE>.cf</CODE> format are
exactly the same as for the <CODE>.ebnf</CODE> format.
Just the syntax trees become nicer to read and
to remember. Notice that before reading in
a new grammar in GF you often (but not always,
as we will see later) have first to give the
command (<CODE>empty = e</CODE>), which removes the
old grammar from the GF shell state.
</P> </P>
<PRE> <PRE>
&gt; empty
&gt; i paleolithic.cf
&gt; p "the boy eats a snake" &gt; p "the boy eats a snake"
PredVP (Def Boy) (ComplTV Eat (Indef Snake)) PredVP (Def Boy) (ComplTV Eat (Indef Snake))
@@ -430,10 +457,10 @@ old grammar from the GF shell state.
a louse is thick a louse is thick
</PRE> </PRE>
<P></P> <P></P>
<A NAME="toc12"></A> <A NAME="toc13"></A>
<H2>The GF grammar format</H2> <H2>The ``.gf`` grammar format</H2>
<P> <P>
To see what there really is in GF's shell state when a grammar To see what there is in GF's shell state when a grammar
has been imported, you can give the plain command has been imported, you can give the plain command
<CODE>print_grammar = pg</CODE>. <CODE>print_grammar = pg</CODE>.
</P> </P>
@@ -446,15 +473,16 @@ you did not need to write the grammar in that notation, but that the
GF grammar compiler produced it. GF grammar compiler produced it.
</P> </P>
<P> <P>
However, we will now start to show how GF's own notation gives you However, we will now start the demonstration
much more expressive power than the <CODE>.cf</CODE> and <CODE>.ebnf</CODE> how GF's own notation gives you
formats. We will introduce the <CODE>.gf</CODE> format by presenting much more expressive power than the <CODE>.cf</CODE>
format. We will introduce the <CODE>.gf</CODE> format by presenting
one more way of defining the same grammar as in one more way of defining the same grammar as in
<CODE>paleolithic.cf</CODE> and <CODE>paleolithic.ebnf</CODE>. <CODE>paleolithic.cf</CODE>.
Then we will show how the full GF grammar format enables you Then we will show how the full GF grammar format enables you
to do things that are not possible in the weaker formats. to do things that are not possible in the weaker formats.
</P> </P>
<A NAME="toc13"></A> <A NAME="toc14"></A>
<H3>Abstract and concrete syntax</H3> <H3>Abstract and concrete syntax</H3>
<P> <P>
A GF grammar consists of two main parts: A GF grammar consists of two main parts:
@@ -482,16 +510,15 @@ is interpreted as the following pair of rules:
The former rule, with the keyword <CODE>fun</CODE>, belongs to the abstract syntax. The former rule, with the keyword <CODE>fun</CODE>, belongs to the abstract syntax.
It defines the <B>function</B> It defines the <B>function</B>
<CODE>PredVP</CODE> which constructs syntax trees of form <CODE>PredVP</CODE> which constructs syntax trees of form
(<CODE>PredVP</CODE> &lt;i&gt;x&lt;i&gt; &lt;i&gt;y&lt;i&gt;). (<CODE>PredVP</CODE> <I>x</I> <I>y</I>).
</P> </P>
<P> <P>
The latter rule, with the keyword <CODE>lin</CODE>, belongs to the concrete syntax. The latter rule, with the keyword <CODE>lin</CODE>, belongs to the concrete syntax.
It defines the <B>linearization function</B> for It defines the <B>linearization function</B> for
syntax trees of form (<CODE>PredVP</CODE> &lt;i&gt;x&lt;i&gt; &lt;i&gt;y&lt;i&gt;). syntax trees of form (<CODE>PredVP</CODE> <I>x</I> <I>y</I>).
</P>
<P>
&lt;h4&gt;Judgement forms&lt;h4&gt;
</P> </P>
<A NAME="toc15"></A>
<H3>Judgement forms</H3>
<P> <P>
Rules in a GF grammar are called <B>judgements</B>, and the keywords Rules in a GF grammar are called <B>judgements</B>, and the keywords
<CODE>fun</CODE> and <CODE>lin</CODE> are used for distinguishing between two <CODE>fun</CODE> and <CODE>lin</CODE> are used for distinguishing between two
@@ -543,27 +570,25 @@ judgement forms:
<P> <P>
We return to the precise meanings of these judgement forms later. We return to the precise meanings of these judgement forms later.
First we will look at how judgements are grouped into modules, and First we will look at how judgements are grouped into modules, and
show how the grammar <CODE>paleolithic.cf</CODE> is show how the paleolithic grammar is
expressed by using modules and judgements. expressed by using modules and judgements.
</P> </P>
<P> <A NAME="toc16"></A>
&lt;h4&gt;Module types&lt;h4&gt; <H3>Module types</H3>
</P>
<P> <P>
A GF grammar consists of <B>modules</B>, A GF grammar consists of <B>modules</B>,
into which judgements are grouped. The most important into which judgements are grouped. The most important
module forms are module forms are
</P> </P>
<UL> <UL>
<LI><CODE>abstract</CODE> A = M``, abstract syntax A with judgements in <LI><CODE>abstract</CODE> A <CODE>=</CODE> M, abstract syntax A with judgements in
the module body M. the module body M.
<LI><CODE>concrete</CODE> C <CODE>of</CODE> A = M``, concrete syntax C of the <LI><CODE>concrete</CODE> C <CODE>of</CODE> A <CODE>=</CODE> M, concrete syntax C of the
abstract syntax A, with judgements in the module body M. abstract syntax A, with judgements in the module body M.
</UL> </UL>
<P> <A NAME="toc17"></A>
&lt;h4&gt;Record types, records, and <CODE>Str</CODE>s&lt;h4&gt; <H3>Record types, records, and ``Str``s</H3>
</P>
<P> <P>
The linearization type of a category is a <B>record type</B>, with The linearization type of a category is a <B>record type</B>, with
zero or more <B>fields</B> of different types. The simplest record zero or more <B>fields</B> of different types. The simplest record
@@ -579,8 +604,8 @@ which has one field, with <B>label</B> <CODE>s</CODE> and type <CODE>Str</CODE>.
Examples of records of this type are Examples of records of this type are
</P> </P>
<PRE> <PRE>
[s = "foo"} {s = "foo"}
[s = "hello" ++ "world"} {s = "hello" ++ "world"}
</PRE> </PRE>
<P> <P>
The type <CODE>Str</CODE> is really the type of <B>token lists</B>, but The type <CODE>Str</CODE> is really the type of <B>token lists</B>, but
@@ -589,18 +614,26 @@ denoted by string literals in double quotes.
</P> </P>
<P> <P>
Whenever a record <CODE>r</CODE> of type <CODE>{s : Str}</CODE> is given, Whenever a record <CODE>r</CODE> of type <CODE>{s : Str}</CODE> is given,
<CODE>r.s</CODE> is an object of type <CODE>Str</CODE>. This is of course <CODE>r.s</CODE> is an object of type <CODE>Str</CODE>. This is
a special case of the <B>projection</B> rule, allowing the extraction a special case of the <B>projection</B> rule, allowing the extraction
of fields from a record. of fields from a record:
</P> </P>
<UL>
<LI>if <I>r</I> : <CODE>{</CODE> ... <I>p</I> : <I>T</I> ... <CODE>}</CODE> then <I>r.p</I> : <I>T</I>
</UL>
<A NAME="toc18"></A>
<H3>An abstract syntax example</H3>
<P> <P>
&lt;h4&gt;An abstract syntax example&lt;h4&gt; To express the abstract syntax of <CODE>paleolithic.cf</CODE> in
</P> a file <CODE>Paleolithic.gf</CODE>, we write two kinds of judgements:
<P>
Each nonterminal occurring in the grammar <CODE>paleolithic.cf</CODE> is
introduced by a <CODE>cat</CODE> judgement. Each
rule label is introduced by a <CODE>fun</CODE> judgement.
</P> </P>
<UL>
<LI>Each category is introduced by a <CODE>cat</CODE> judgement.
<LI>Each rule label is introduced by a <CODE>fun</CODE> judgement,
with the type formed from the nonterminals of the rule.
</UL>
<PRE> <PRE>
abstract Paleolithic = { abstract Paleolithic = {
cat cat
@@ -623,9 +656,8 @@ Notice the use of shorthands permitting the sharing of
the keyword in subsequent judgements, and of the type the keyword in subsequent judgements, and of the type
in subsequent <CODE>fun</CODE> judgements. in subsequent <CODE>fun</CODE> judgements.
</P> </P>
<P> <A NAME="toc19"></A>
&lt;h4&gt;A concrete syntax example&lt;h4&gt; <H3>A concrete syntax example</H3>
</P>
<P> <P>
Each category introduced in <CODE>Paleolithic.gf</CODE> is Each category introduced in <CODE>Paleolithic.gf</CODE> is
given a <CODE>lincat</CODE> rule, and each given a <CODE>lincat</CODE> rule, and each
@@ -663,9 +695,8 @@ apply as in <CODE>abstract</CODE> modules.
} }
</PRE> </PRE>
<P></P> <P></P>
<P> <A NAME="toc20"></A>
&lt;h4&gt;Modules and files&lt;h4&gt; <H3>Modules and files</H3>
</P>
<P> <P>
Module name + <CODE>.gf</CODE> = file name Module name + <CODE>.gf</CODE> = file name
</P> </P>
@@ -691,9 +722,8 @@ GF source files. When reading a module, GF knows whether
to use an existing <CODE>.gfc</CODE> file or to generate to use an existing <CODE>.gfc</CODE> file or to generate
a new one, by looking at modification times. a new one, by looking at modification times.
</P> </P>
<P> <A NAME="toc21"></A>
&lt;h4&gt;Multilingual grammar&lt;h4&gt; <H2>Multilingual grammars and translation</H2>
</P>
<P> <P>
The main advantage of separating abstract from concrete syntax is that The main advantage of separating abstract from concrete syntax is that
one abstract syntax can be equipped with many concrete syntaxes. one abstract syntax can be equipped with many concrete syntaxes.
@@ -705,9 +735,8 @@ translation. Let us buid an Italian concrete syntax for
<CODE>Paleolithic</CODE> and then test the resulting <CODE>Paleolithic</CODE> and then test the resulting
multilingual grammar. multilingual grammar.
</P> </P>
<P> <A NAME="toc22"></A>
&lt;h4&gt;An Italian concrete syntax&lt;h4&gt; <H3>An Italian concrete syntax</H3>
</P>
<PRE> <PRE>
concrete PaleolithicIta of Paleolithic = { concrete PaleolithicIta of Paleolithic = {
lincat lincat
@@ -739,9 +768,8 @@ multilingual grammar.
} }
</PRE> </PRE>
<P></P> <P></P>
<P> <A NAME="toc23"></A>
&lt;h4&gt;Using a multilingual grammar&lt;h4&gt; <H3>Using a multilingual grammar</H3>
</P>
<P> <P>
Import without first emptying Import without first emptying
</P> </P>
@@ -767,9 +795,8 @@ Translate by using a pipe:
il ragazzo mangia il serpente il ragazzo mangia il serpente
</PRE> </PRE>
<P></P> <P></P>
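<P>
The translation pipe above can be pictured as one abstract syntax tree that is linearized by two different concrete syntaxes. The following is a minimal sketch in Python, not GF code: the dictionaries <CODE>ENG</CODE> and <CODE>ITA</CODE> and the function <CODE>linearize</CODE> are hypothetical names introduced for illustration, and only the words that appear in the examples of this tutorial are covered.
</P>

```python
# Illustration only: a toy model of one abstract syntax with two
# concrete syntaxes. ENG, ITA and linearize are assumed names, not GF API.

# Each concrete syntax maps an abstract function name to a
# linearization function over the strings of its arguments.
ENG = {
    "PredVP": lambda np, vp: f"{np} {vp}",
    "ComplTV": lambda tv, np: f"{tv} {np}",
    "Def": lambda cn: f"the {cn}",
    "Boy": lambda: "boy",
    "Snake": lambda: "snake",
    "Eat": lambda: "eats",
}

ITA = {
    "PredVP": lambda np, vp: f"{np} {vp}",
    "ComplTV": lambda tv, np: f"{tv} {np}",
    "Def": lambda cn: f"il {cn}",
    "Boy": lambda: "ragazzo",
    "Snake": lambda: "serpente",
    "Eat": lambda: "mangia",
}


def linearize(tree, concrete):
    """Linearize a tree given as a nested tuple (function, arg1, arg2, ...)."""
    fun, *args = tree
    return concrete[fun](*(linearize(arg, concrete) for arg in args))


# PredVP (Def Boy) (ComplTV Eat (Def Snake))
tree = ("PredVP", ("Def", ("Boy",)), ("ComplTV", ("Eat",), ("Def", ("Snake",))))

english = linearize(tree, ENG)
italian = linearize(tree, ITA)
```

<P>
Here <CODE>english</CODE> is "the boy eats the snake" and <CODE>italian</CODE> is "il ragazzo mangia il serpente". Parsing, which GF performs in the first half of the pipe, is not modelled in this sketch.
</P>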
<P> <A NAME="toc24"></A>
&lt;h4&gt;Translation quiz&lt;h4&gt; <H3>Translation quiz</H3>
</P>
<P> <P>
This is a simple language exercise that can be automatically This is a simple language exercise that can be automatically
generated from a multilingual grammar. The system generates a set of generated from a multilingual grammar. The system generates a set of
@@ -802,9 +829,8 @@ file for later use, by the command <CODE>translation_list = tl</CODE>
<P> <P>
The number flag gives the number of sentences generated. The number flag gives the number of sentences generated.
</P> </P>
<P> <A NAME="toc25"></A>
&lt;h4&gt;The multilingual shell state&lt;h4&gt; <H3>The multilingual shell state</H3>
</P>
<P> <P>
A GF shell is at any time in a state, which A GF shell is at any time in a state, which
contains a multilingual grammar. One of the concrete contains a multilingual grammar. One of the concrete
@@ -825,9 +851,10 @@ things), you can use the command
all concretes : PaleolithicIta PaleolithicEng all concretes : PaleolithicIta PaleolithicEng
</PRE> </PRE>
<P></P> <P></P>
<P> <A NAME="toc26"></A>
&lt;h4&gt;Extending a grammar&lt;h4&gt; <H2>Grammar architecture</H2>
</P> <A NAME="toc27"></A>
<H3>Extending a grammar</H3>
<P> <P>
The module system of GF makes it possible to <B>extend</B> a The module system of GF makes it possible to <B>extend</B> a
grammar in different ways. The syntax of extension is grammar in different ways. The syntax of extension is
@@ -856,9 +883,8 @@ be built for concrete syntaxes:
The effect of extension is that all of the contents of the extended The effect of extension is that all of the contents of the extended
and extending module are put together. and extending module are put together.
</P> </P>
<P> <A NAME="toc28"></A>
&lt;h4&gt;Multiple inheritance&lt;h4&gt; <H3>Multiple inheritance</H3>
</P>
<P> <P>
Specialized vocabularies can be represented as small grammars that Specialized vocabularies can be represented as small grammars that
only do "one thing" each, e.g. only do "one thing" each, e.g.
@@ -887,9 +913,8 @@ same time:
} }
</PRE> </PRE>
<P></P> <P></P>
<P> <A NAME="toc29"></A>
&lt;h4&gt;Visualizing module structure&lt;h4&gt; <H3>Visualizing module structure</H3>
</P>
<P> <P>
When you have created all the abstract syntaxes and When you have created all the abstract syntaxes and
one set of concrete syntaxes needed for <CODE>Gatherer</CODE>, one set of concrete syntaxes needed for <CODE>Gatherer</CODE>,
@@ -918,9 +943,8 @@ The command <CODE>print_multi = pm</CODE> is used for printing the current multi
grammar in various formats, of which the format <CODE>-printer=graph</CODE> just grammar in various formats, of which the format <CODE>-printer=graph</CODE> just
shows the module dependencies. shows the module dependencies.
</P> </P>
<P> <A NAME="toc30"></A>
&lt;h4&gt;The module structure of <CODE>GathererEng</CODE>&lt;h4&gt; <H3>The module structure of ``GathererEng``</H3>
</P>
<P> <P>
The graph uses The graph uses
</P> </P>
@@ -934,8 +958,8 @@ The graph uses
<P> <P>
<IMG SRC="Gatherer.gif"> <IMG SRC="Gatherer.gif">
</P> </P>
<A NAME="toc14"></A> <A NAME="toc31"></A>
<H3>Resource modules</H3> <H2>Resource modules</H2>
<P> <P>
Suppose we want to say, with the vocabulary included in Suppose we want to say, with the vocabulary included in
<CODE>Paleolithic.gf</CODE>, things like <CODE>Paleolithic.gf</CODE>, things like
@@ -946,7 +970,7 @@ Suppose we want to say, with the vocabulary included in
</PRE> </PRE>
<P> <P>
The new grammatical facility we need are the plural forms The new grammatical facility we need are the plural forms
of nouns and verbs (&lt;i&gt;boys, sleep&lt;i&gt;), as opposed to their of nouns and verbs (<I>boys, sleep</I>), as opposed to their
singular forms. singular forms.
</P> </P>
<P> <P>
@@ -969,9 +993,8 @@ To be able to do all this, we need two new judgement forms,
a new module form, and a generalization of linearization types a new module form, and a generalization of linearization types
from strings to more complex types. from strings to more complex types.
</P> </P>
<P> <A NAME="toc32"></A>
&lt;h4&gt;Parameters and tables&lt;h4&gt; <H3>Parameters and tables</H3>
</P>
<P> <P>
We define the <B>parameter type</B> of number in English by We define the <B>parameter type</B> of number in English by
using a new form of judgement: using a new form of judgement:
@@ -1011,13 +1034,12 @@ operator <CODE>!</CODE>. For instance,
<P> <P>
is a selection, whose value is <CODE>"boys"</CODE>. is a selection, whose value is <CODE>"boys"</CODE>.
</P> </P>
<P> <A NAME="toc33"></A>
&lt;h4&gt;Inflection tables, paradigms, and <CODE>oper</CODE> definitions&lt;h4&gt; <H3>Inflection tables, paradigms, and ``oper`` definitions</H3>
</P>
<P> <P>
All English common nouns are inflected in number, most of them in the All English common nouns are inflected in number, most of them in the
same way: the plural form is formed from the singular form by adding the same way: the plural form is formed from the singular form by adding the
ending &lt;i&gt;s&lt;i&gt;. This rule is an example of ending <I>s</I>. This rule is an example of
a <B>paradigm</B> - a formula telling how the inflection a <B>paradigm</B> - a formula telling how the inflection
forms of a word are formed. forms of a word are formed.
</P> </P>
@@ -1046,9 +1068,8 @@ the function, and the <B>glueing</B> operator <CODE>+</CODE> telling that
the string held in the variable <CODE>x</CODE> and the ending <CODE>"s"</CODE> the string held in the variable <CODE>x</CODE> and the ending <CODE>"s"</CODE>
are written together to form one <B>token</B>. are written together to form one <B>token</B>.
</P> </P>
<P> <A NAME="toc34"></A>
&lt;h4&gt;The <CODE>resource</CODE> module type&lt;h4&gt; <H3>The ``resource`` module type</H3>
</P>
<P> <P>
Parameter and operator definitions do not belong to the abstract syntax. Parameter and operator definitions do not belong to the abstract syntax.
They can be used when defining concrete syntax - but they are not They can be used when defining concrete syntax - but they are not
@@ -1080,7 +1101,7 @@ Resource modules can extend other resource modules, in the
same way as modules of other types can extend modules of the same way as modules of other types can extend modules of the
same type. same type.
</P> </P>
<A NAME="toc15"></A> <A NAME="toc35"></A>
<H3>Opening a ``resource``</H3> <H3>Opening a ``resource``</H3>
<P> <P>
Any number of <CODE>resource</CODE> modules can be Any number of <CODE>resource</CODE> modules can be
@@ -1114,9 +1135,8 @@ available through resource grammars, whose users only need
to pick the right operations and not to know their implementation to pick the right operations and not to know their implementation
details. details.
</P> </P>
<P> <A NAME="toc36"></A>
&lt;h4&gt;Worst-case macros and data abstraction&lt;h4&gt; <H3>Worst-case macros and data abstraction</H3>
</P>
<P> <P>
Some English nouns, such as <CODE>louse</CODE>, are so irregular that Some English nouns, such as <CODE>louse</CODE>, are so irregular that
it makes little sense to see them as instances of a paradigm. Even it makes little sense to see them as instances of a paradigm. Even
@@ -1149,9 +1169,8 @@ interface (i.e. the system of type signatures) that makes it
correct to use these functions in concrete modules. In programming correct to use these functions in concrete modules. In programming
terms, <CODE>Noun</CODE> is then treated as an <B>abstract datatype</B>. terms, <CODE>Noun</CODE> is then treated as an <B>abstract datatype</B>.
</P> </P>
<P> <A NAME="toc37"></A>
&lt;h4&gt;A system of paradigms using <CODE>Prelude</CODE> operations&lt;h4&gt; <H3>A system of paradigms using ``Prelude`` operations</H3>
</P>
<P> <P>
The regular noun paradigm <CODE>regNoun</CODE> can - and should - of course be defined The regular noun paradigm <CODE>regNoun</CODE> can - and should - of course be defined
by the worst-case macro <CODE>mkNoun</CODE>. In addition, some more noun paradigms by the worst-case macro <CODE>mkNoun</CODE>. In addition, some more noun paradigms
@@ -1162,8 +1181,8 @@ could be defined, for instance,
sNoun : Str -&gt; Noun = \kiss -&gt; mkNoun kiss (kiss + "es") ; sNoun : Str -&gt; Noun = \kiss -&gt; mkNoun kiss (kiss + "es") ;
</PRE> </PRE>
<P> <P>
What about nouns like &lt;i&gt;fly&lt;i&gt;, with the plural &lt;i&gt;flies&lt;i&gt;? The already What about nouns like <I>fly</I>, with the plural <I>flies</I>? The already
available solution is to use the so-called "technical stem" &lt;i&gt;fl&lt;i&gt; as available solution is to use the so-called "technical stem" <I>fl</I> as
argument, and define argument, and define
</P> </P>
<PRE> <PRE>
@@ -1183,9 +1202,8 @@ The operator <CODE>init</CODE> belongs to a set of operations in the
resource module <CODE>Prelude</CODE>, which therefore has to be resource module <CODE>Prelude</CODE>, which therefore has to be
<CODE>open</CODE>ed so that <CODE>init</CODE> can be used. <CODE>open</CODE>ed so that <CODE>init</CODE> can be used.
</P> </P>
<P> <A NAME="toc38"></A>
&lt;h4&gt;An intelligent noun paradigm using <CODE>case</CODE> expressions&lt;h4&gt; <H3>An intelligent noun paradigm using ``case`` expressions</H3>
</P>
<P> <P>
It may be hard for the user of a resource morphology to pick the right It may be hard for the user of a resource morphology to pick the right
inflection paradigm. A way to help this is to define a more intelligent inflection paradigm. A way to help this is to define a more intelligent
@@ -1207,16 +1225,15 @@ these forms are explained in the following section.
</P> </P>
<P> <P>
The paradigm <CODE>regNoun</CODE> does not give the correct forms for The paradigm <CODE>regNoun</CODE> does not give the correct forms for
all nouns. For instance, &lt;i&gt;louse - lice&lt;i&gt; and all nouns. For instance, <I>louse - lice</I> and
&lt;i&gt;fish - fish&lt;i&gt; must be given by using <CODE>mkNoun</CODE>. <I>fish - fish</I> must be given by using <CODE>mkNoun</CODE>.
Also the word &lt;i&gt;boy&lt;i&gt; would be inflected incorrectly; to prevent Also the word <I>boy</I> would be inflected incorrectly; to prevent
this, either use <CODE>mkNoun</CODE> or modify this, either use <CODE>mkNoun</CODE> or modify
<CODE>regNoun</CODE> so that the <CODE>"y"</CODE> case does not <CODE>regNoun</CODE> so that the <CODE>"y"</CODE> case does not
apply if the second-last character is a vowel. apply if the second-last character is a vowel.
</P> </P>
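<P>
The suggested modification, in which the <CODE>"y"</CODE> case is skipped when the second-last character is a vowel, can be sketched as follows. This is an illustration in Python rather than GF: the names <CODE>mk_noun</CODE> and <CODE>reg_noun</CODE> mirror <CODE>mkNoun</CODE> and <CODE>regNoun</CODE>, and the exact set of cases is an assumption, since only the <CODE>"y"</CODE> case and the <I>kiss - kisses</I> pattern are discussed in the text.
</P>

```python
# Illustration only: a Python model of an "intelligent" regular noun
# paradigm. The case list is an assumption, not the tutorial's GF code.

VOWELS = "aeiou"


def mk_noun(sg, pl):
    """Worst-case macro: build a noun from explicitly given forms."""
    return {"Sg": sg, "Pl": pl}


def reg_noun(word):
    """Guess the plural from the ending of the singular form."""
    # fly -> flies, but only when a consonant precedes the final "y",
    # so that boy -> boys is still handled by the default case.
    if word.endswith("y") and len(word) > 1 and word[-2] not in VOWELS:
        return mk_noun(word, word[:-1] + "ies")
    # kiss -> kisses
    if word.endswith("s"):
        return mk_noun(word, word + "es")
    # default: snake -> snakes
    return mk_noun(word, word + "s")


# Irregular nouns still have to be given with the worst-case macro.
louse = mk_noun("louse", "lice")
```

<P>
With this definition, <CODE>reg_noun("boy")</CODE> yields the plural "boys" and <CODE>reg_noun("fly")</CODE> yields "flies", while <I>louse</I> and <I>fish</I> still need <CODE>mk_noun</CODE>.
</P>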
<P> <A NAME="toc39"></A>
&lt;h4&gt;Pattern matching&lt;h4&gt; <H3>Pattern matching</H3>
</P>
<P> <P>
Expressions of the <CODE>table</CODE> form are built from lists of Expressions of the <CODE>table</CODE> form are built from lists of
argument-value pairs. These pairs are called the <B>branches</B> argument-value pairs. These pairs are called the <B>branches</B>
@@ -1251,9 +1268,8 @@ programming languages are syntactic sugar for table selections:
case e of {...} === table {...} ! e case e of {...} === table {...} ! e
</PRE> </PRE>
<P></P> <P></P>
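<P>
The equivalence between <CODE>case</CODE> expressions and table selection can be mimicked in Python, as a rough sketch outside GF. A table is modelled as a dictionary from parameter values to strings, and the helper names <CODE>select</CODE> and <CODE>case_of</CODE> are hypothetical. Wildcard and variable patterns are not covered by this model.
</P>

```python
# Illustration only: tables as dictionaries, selection as lookup.
# select and case_of are assumed helper names, not GF operations.


def select(table, param):
    """Model of the GF selection operator: table ! param."""
    return table[param]


def case_of(expr, branches):
    """Model of 'case e of {...}', defined as 'table {...} ! e'."""
    return select(branches, expr)


# An inflection table for the noun "boy".
boy = {"Sg": "boy", "Pl": "boys"}

plural = select(boy, "Pl")
same_plural = case_of("Pl", {"Sg": "boy", "Pl": "boys"})
```

<P>
Both <CODE>plural</CODE> and <CODE>same_plural</CODE> evaluate to "boys", matching the selection example given earlier in the tutorial.
</P>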
<P> <A NAME="toc40"></A>
&lt;h4&gt;Morphological analysis and morphology quiz&lt;h4&gt; <H3>Morphological analysis and morphology quiz</H3>
</P>
<P> <P>
Even though in GF morphology Even though in GF morphology
is mostly seen as an auxiliary of syntax, a morphology once defined is mostly seen as an auxiliary of syntax, a morphology once defined
@@ -1292,14 +1308,13 @@ file for later use, by the command <CODE>morpho_list = ml</CODE>
<P> <P>
The number flag gives the number of exercises generated. The number flag gives the number of exercises generated.
</P> </P>
<P> <A NAME="toc41"></A>
&lt;h4&gt;Parametric vs. inherent features, agreement&lt;h4&gt; <H3>Parametric vs. inherent features, agreement</H3>
</P>
<P> <P>
The rule of subject-verb agreement in English says that the verb The rule of subject-verb agreement in English says that the verb
phrase must be inflected in the number of the subject. This phrase must be inflected in the number of the subject. This
means that a noun phrase (functioning as a subject), in some sense means that a noun phrase (functioning as a subject), in some sense
&lt;i&gt;has&lt;i&gt; a number, which it "sends" to the verb. The verb does not <I>has</I> a number, which it "sends" to the verb. The verb does not
have a number, but must be able to receive whatever number the have a number, but must be able to receive whatever number the
subject has. This distinction is nicely represented by the subject has. This distinction is nicely represented by the
different linearization types of noun phrases and verb phrases: different linearization types of noun phrases and verb phrases:
@@ -1329,9 +1344,8 @@ regular only in the present tense).
The reader is invited to inspect the way in which agreement works in The reader is invited to inspect the way in which agreement works in
the formation of noun phrases and verb phrases. the formation of noun phrases and verb phrases.
</P> </P>
<P> <A NAME="toc42"></A>
&lt;h4&gt;English concrete syntax with parameters&lt;h4&gt; <H3>English concrete syntax with parameters</H3>
</P>
<PRE> <PRE>
concrete PaleolithicEng of Paleolithic = open MorphoEng in { concrete PaleolithicEng of Paleolithic = open MorphoEng in {
lincat lincat
@@ -1358,9 +1372,8 @@ the formation of noun phrases and verb phrases.
} }
</PRE> </PRE>
<P></P> <P></P>
<P> <A NAME="toc43"></A>
&lt;h4&gt;Hierarchic parameter types&lt;h4&gt; <H3>Hierarchic parameter types</H3>
</P>
<P> <P>
The reader familiar with a functional programming language such as The reader familiar with a functional programming language such as
<A HREF="http://www.haskell.org">Haskell</A> must have noticed the similarity <A HREF="http://www.haskell.org">Haskell</A> must have noticed the similarity
@@ -1401,15 +1414,14 @@ the adjectival paradigm in which the two singular forms are the same, can be def
} }
</PRE> </PRE>
<P></P> <P></P>
<P> <A NAME="toc44"></A>
&lt;h4&gt;Discontinuous constituents&lt;h4&gt; <H3>Discontinuous constituents</H3>
</P>
<P> <P>
A linearization type may contain more than one string. A linearization type may contain more than one string.
An example of where this is useful is English particle An example of where this is useful is English particle
verbs, such as &lt;i&gt;switch off&lt;i&gt;. The linearization of verbs, such as <I>switch off</I>. The linearization of
a sentence may place the object between the verb and the particle: a sentence may place the object between the verb and the particle:
&lt;i&gt;he switched it off&lt;i&gt;. <I>he switched it off</I>.
</P> </P>
<P> <P>
The first of the following judgements defines transitive verbs as a The first of the following judgements defines transitive verbs as a
@@ -1427,27 +1439,27 @@ GF currently requires that all fields in linearization records that
have a table with value type <CODE>Str</CODE> have as labels have a table with value type <CODE>Str</CODE> have as labels
either <CODE>s</CODE> or <CODE>s</CODE> with an integer index. either <CODE>s</CODE> or <CODE>s</CODE> with an integer index.
</P> </P>
<A NAME="toc16"></A> <A NAME="toc45"></A>
<H2>Topics still to be written</H2> <H2>Topics still to be written</H2>
<A NAME="toc17"></A> <A NAME="toc46"></A>
<H3>Free variation</H3> <H3>Free variation</H3>
<A NAME="toc18"></A> <A NAME="toc47"></A>
<H3>Record extension, tuples</H3> <H3>Record extension, tuples</H3>
<A NAME="toc19"></A> <A NAME="toc48"></A>
<H3>Predefined types and operations</H3> <H3>Predefined types and operations</H3>
<A NAME="toc20"></A> <A NAME="toc49"></A>
<H3>Lexers and unlexers</H3> <H3>Lexers and unlexers</H3>
<A NAME="toc21"></A> <A NAME="toc50"></A>
<H3>Grammars of formal languages</H3> <H3>Grammars of formal languages</H3>
<A NAME="toc22"></A> <A NAME="toc51"></A>
<H3>Resource grammars and their reuse</H3> <H3>Resource grammars and their reuse</H3>
<A NAME="toc23"></A> <A NAME="toc52"></A>
<H3>Embedded grammars in Haskell, Java, and Prolog</H3> <H3>Embedded grammars in Haskell, Java, and Prolog</H3>
<A NAME="toc24"></A> <A NAME="toc53"></A>
<H3>Dependent types, variable bindings, semantic definitions</H3> <H3>Dependent types, variable bindings, semantic definitions</H3>
<A NAME="toc25"></A> <A NAME="toc54"></A>
<H3>Transfer modules</H3> <H3>Transfer modules</H3>
<A NAME="toc26"></A> <A NAME="toc55"></A>
<H3>Alternative input and output grammar formats</H3> <H3>Alternative input and output grammar formats</H3>
<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) --> <!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->


@@ -66,7 +66,7 @@ in the shell. You will see GF's welcome message and the prompt ``>``.
%--! %--!
==My first grammar== ==The ``.cf`` grammar format==
Now you are ready to try out your first grammar. Now you are ready to try out your first grammar.
We start with one that is not written in the GF language, but We start with one that is not written in the GF language, but
@@ -200,7 +200,7 @@ generate ten strings with one and the same command:
%--! %--!
===Systematic generation=== ===Systematic generation===
To generate <i>all<i> sentences that a grammar To generate //all// sentences that a grammar
can generate, use the command ``generate_trees = gt``. can generate, use the command ``generate_trees = gt``.
``` ```
> generate_trees | l > generate_trees | l
@@ -243,7 +243,7 @@ want to see:
S_NP_VP (NP_the_CN CN_snake) (VP_V V_sleeps) S_NP_VP (NP_the_CN CN_snake) (VP_V V_sleeps)
the snake sleeps the snake sleeps
S_NP_VP (NP_the_CN CN_snake) (VP_V V_sleeps) S_NP_VP (NP_the_CN CN_snake) (VP_V V_sleeps)
```
This facility is good for test purposes: for instance, you This facility is good for test purposes: for instance, you
may want to see if a grammar is **ambiguous**, i.e. may want to see if a grammar is **ambiguous**, i.e.
contains strings that can be parsed in more than one way. contains strings that can be parsed in more than one way.
@@ -310,7 +310,7 @@ is compiled by GF.
%--! %--!
<h4>The labelled context-free format<h4> ===The labelled context-free format===
The **labelled context-free grammar** format permits a user-defined The **labelled context-free grammar** format permits a user-defined
label for each rule. label for each rule.
@@ -355,9 +355,9 @@ With this grammar, the trees look as follows:
%--! %--!
==The GF grammar format== ==The ``.gf`` grammar format==
To see what there really is in GF's shell state when a grammar To see what there is in GF's shell state when a grammar
has been imported, you can give the plain command has been imported, you can give the plain command
``print_grammar = pg``. ``print_grammar = pg``.
``` ```
@@ -402,17 +402,17 @@ is interpreted as the following pair of rules:
The former rule, with the keyword ``fun``, belongs to the abstract syntax. The former rule, with the keyword ``fun``, belongs to the abstract syntax.
It defines the **function** It defines the **function**
``PredVP`` which constructs syntax trees of form ``PredVP`` which constructs syntax trees of form
(``PredVP`` <i>x<i> <i>y<i>). (``PredVP`` //x// //y//).
The latter rule, with the keyword ``lin``, belongs to the concrete syntax. The latter rule, with the keyword ``lin``, belongs to the concrete syntax.
It defines the **linearization function** for It defines the **linearization function** for
syntax trees of form (``PredVP`` <i>x<i> <i>y<i>). syntax trees of form (``PredVP`` //x// //y//).
%--! %--!
<h4>Judgement forms<h4> ===Judgement forms===
Rules in a GF grammar are called **judgements**, and the keywords Rules in a GF grammar are called **judgements**, and the keywords
``fun`` and ``lin`` are used for distinguishing between two ``fun`` and ``lin`` are used for distinguishing between two
@@ -435,26 +435,26 @@ judgement forms:
We return to the precise meanings of these judgement forms later. We return to the precise meanings of these judgement forms later.
First we will look at how judgements are grouped into modules, and First we will look at how judgements are grouped into modules, and
show how the grammar ``paleolithic.cf`` is show how the paleolithic grammar is
expressed by using modules and judgements. expressed by using modules and judgements.
%--! %--!
<h4>Module types<h4> ===Module types===
A GF grammar consists of **modules**, A GF grammar consists of **modules**,
into which judgements are grouped. The most important into which judgements are grouped. The most important
module forms are module forms are
- ``abstract`` A = M``, abstract syntax A with judgements in - ``abstract`` A ``=`` M, abstract syntax A with judgements in
the module body M. the module body M.
- ``concrete`` C ``of`` A = M``, concrete syntax C of the - ``concrete`` C ``of`` A ``=`` M, concrete syntax C of the
abstract syntax A, with judgements in the module body M. abstract syntax A, with judgements in the module body M.
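As a minimal sketch of the two module forms, consider a one-word grammar. The names ``Hello``, ``HelloEng``, ``Greeting``, and ``Hi`` are made up for this illustration and do not occur elsewhere in the tutorial; each module is assumed to be stored in a file of its own.

```
-- file Hello.gf (hypothetical example)
abstract Hello = {
  cat Greeting ;        -- one category
  fun Hi : Greeting ;   -- one function constructing trees of that category
}

-- file HelloEng.gf (hypothetical example)
concrete HelloEng of Hello = {
  lincat Greeting = {s : Str} ;  -- linearization type of the category
  lin Hi = {s = "hello"} ;       -- linearization of the function
}
```

In both modules, the module body M is the part enclosed in curly brackets after the ``=`` sign.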
%--! %--!
<h4>Record types, records, and ``Str``s<h4> ===Record types, records, and ``Str``s===
The linearization type of a category is a **record type**, with The linearization type of a category is a **record type**, with
zero or more **fields** of different types. The simplest record zero or more **fields** of different types. The simplest record
@@ -468,8 +468,8 @@ which has one field, with **label** ``s`` and type ``Str``.
Examples of records of this type are Examples of records of this type are
``` ```
[s = "foo"} {s = "foo"}
[s = "hello" ++ "world"} {s = "hello" ++ "world"}
``` ```
The type ``Str`` is really the type of **token lists**, but The type ``Str`` is really the type of **token lists**, but
most of the time one can conveniently think of it as the type of strings, most of the time one can conveniently think of it as the type of strings,
@@ -478,17 +478,24 @@ denoted by string literals in double quotes.
Whenever a record ``r`` of type ``{s : Str}`` is given, Whenever a record ``r`` of type ``{s : Str}`` is given,
``r.s`` is an object of type ``Str``. This is of course ``r.s`` is an object of type ``Str``. This is
a special case of the **projection** rule, allowing the extraction a special case of the **projection** rule, allowing the extraction
of fields from a record. of fields from a record:
- if //r// : ``{`` ... //p// : //T// ... ``}`` then //r.p// : //T//
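The projection rule can be illustrated by a small sketch. The module name ``ProjectionDemo`` and the constants ``greeting`` and ``greetingStr`` are hypothetical; the ``resource`` module type and ``oper`` definitions used to hold them are explained later in this tutorial.

```
resource ProjectionDemo = {
  oper
    -- a record of type {s : Str}
    greeting : {s : Str} = {s = "hello" ++ "world"} ;

    -- projection: since greeting : {s : Str}, we have greeting.s : Str
    greetingStr : Str = greeting.s ;
}
```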
%--! %--!
<h4>An abstract syntax example<h4> ===An abstract syntax example===
To express the abstract syntax of ``paleolithic.cf`` in
a file ``Paleolithic.gf``, we write two kinds of judgements:
- Each category is introduced by a ``cat`` judgement.
- Each rule label is introduced by a ``fun`` judgement,
with the type formed from the nonterminals of the rule.
Each nonterminal occurring in the grammar ``paleolithic.cf`` is
introduced by a ``cat`` judgement. Each
rule label is introduced by a ``fun`` judgement.
``` ```
abstract Paleolithic = { abstract Paleolithic = {
cat cat
@@ -512,7 +519,7 @@ in subsequent ``fun`` judgements.
%--! %--!
<h4>A concrete syntax example<h4> ===A concrete syntax example===
Each category introduced in ``Paleolithic.gf`` is Each category introduced in ``Paleolithic.gf`` is
given a ``lincat`` rule, and each given a ``lincat`` rule, and each
@@ -551,7 +558,7 @@ lin
%--! %--!
<h4>Modules and files<h4> ===Modules and files===
Module name + ``.gf`` = file name Module name + ``.gf`` = file name
@@ -581,7 +588,7 @@ a new one, by looking at modification times.
%--! %--!
<h4>Multilingual grammar<h4> ==Multilingual grammars and translation==
The main advantage of separating abstract from concrete syntax is that The main advantage of separating abstract from concrete syntax is that
one abstract syntax can be equipped with many concrete syntaxes. one abstract syntax can be equipped with many concrete syntaxes.
@@ -598,7 +605,7 @@ multilingual grammar.
%--! %--!
<h4>An Italian concrete syntax<h4> ===An Italian concrete syntax===
``` ```
concrete PaleolithicIta of Paleolithic = { concrete PaleolithicIta of Paleolithic = {
@@ -632,7 +639,7 @@ lin
``` ```
%--! %--!
<h4>Using a multilingual grammar<h4> ===Using a multilingual grammar===
Import without first emptying Import without first emptying
``` ```
@@ -656,7 +663,7 @@ Translate by using a pipe:
%--! %--!
<h4>Translation quiz<h4> ===Translation quiz===
This is a simple language exercise that can be automatically This is a simple language exercise that can be automatically
generated from a multilingual grammar. The system generates a set of generated from a multilingual grammar. The system generates a set of
@@ -687,7 +694,7 @@ The number flag gives the number of sentences generated.
%--! %--!
<h4>The multilingual shell state<h4> ===The multilingual shell state===
A GF shell is at any time in a state, which A GF shell is at any time in a state, which
contains a multilingual grammar. One of the concrete contains a multilingual grammar. One of the concrete
@@ -710,7 +717,9 @@ things), you can use the command
%--! %--!
<h4>Extending a grammar<h4> ==Grammar architecture==
===Extending a grammar===
The module system of GF makes it possible to **extend** a The module system of GF makes it possible to **extend** a
grammar in different ways. The syntax of extension is grammar in different ways. The syntax of extension is
@@ -738,7 +747,7 @@ and extending module are put together.
%--! %--!
<h4>Multiple inheritance<h4> ===Multiple inheritance===
Specialized vocabularies can be represented as small grammars that Specialized vocabularies can be represented as small grammars that
only do "one thing" each, e.g. only do "one thing" each, e.g.
@@ -767,7 +776,7 @@ same time:
%--! %--!
<h4>Visualizing module structure<h4> ===Visualizing module structure===
When you have created all the abstract syntaxes and When you have created all the abstract syntaxes and
one set of concrete syntaxes needed for ``Gatherer``, one set of concrete syntaxes needed for ``Gatherer``,
@@ -795,7 +804,7 @@ shows the module dependencies.
%--! %--!
<h4>The module structure of ``GathererEng``<h4> ===The module structure of ``GathererEng``===
The graph uses The graph uses
@@ -811,7 +820,7 @@ The graph uses
%--! %--!
===Resource modules=== ==Resource modules==
Suppose we want to say, with the vocabulary included in Suppose we want to say, with the vocabulary included in
``Paleolithic.gf``, things like ``Paleolithic.gf``, things like
@@ -820,7 +829,7 @@ Suppose we want to say, with the vocabulary included in
all boys sleep all boys sleep
``` ```
The new grammatical facilities we need are the plural forms The new grammatical facilities we need are the plural forms
of nouns and verbs (<i>boys, sleep<i>), as opposed to their of nouns and verbs (//boys, sleep//), as opposed to their
singular forms. singular forms.
@@ -846,7 +855,7 @@ from strings to more complex types.
%--! %--!
<h4>Parameters and tables<h4> ===Parameters and tables===
We define the **parameter type** of number in English by We define the **parameter type** of number in English by
using a new form of judgement: using a new form of judgement:
@@ -880,11 +889,11 @@ is a selection, whose value is ``"boys"``.
%--! %--!
<h4>Inflection tables, paradigms, and ``oper`` definitions<h4> ===Inflection tables, paradigms, and ``oper`` definitions===
All English common nouns are inflected in number, most of them in the All English common nouns are inflected in number, most of them in the
same way: the plural form is formed from the singular form by adding the same way: the plural form is formed from the singular form by adding the
ending <i>s<i>. This rule is an example of ending //s//. This rule is an example of
a **paradigm** - a formula telling how the inflection a **paradigm** - a formula telling how the inflection
forms of a word are formed. forms of a word are formed.
@@ -914,7 +923,7 @@ are written together to form one **token**.
%--! %--!
<h4>The ``resource`` module type<h4> ===The ``resource`` module type===
Parameter and operator definitions do not belong to the abstract syntax. Parameter and operator definitions do not belong to the abstract syntax.
They can be used when defining concrete syntax - but they are not They can be used when defining concrete syntax - but they are not
@@ -983,7 +992,7 @@ details.
%--! %--!
<h4>Worst-case macros and data abstraction<h4> ===Worst-case macros and data abstraction===
Some English nouns, such as ``louse``, are so irregular that Some English nouns, such as ``louse``, are so irregular that
it makes little sense to see them as instances of a paradigm. Even it makes little sense to see them as instances of a paradigm. Even
@@ -1016,7 +1025,7 @@ terms, ``Noun`` is then treated as an **abstract datatype**.
%--! %--!
<h4>A system of paradigms using ``Prelude`` operations<h4> ===A system of paradigms using ``Prelude`` operations===
The regular noun paradigm ``regNoun`` can - and should - of course be defined The regular noun paradigm ``regNoun`` can - and should - of course be defined
by the worst-case macro ``mkNoun``. In addition, some more noun paradigms by the worst-case macro ``mkNoun``. In addition, some more noun paradigms
@@ -1025,8 +1034,8 @@ could be defined, for instance,
regNoun : Str -> Noun = \snake -> mkNoun snake (snake + "s") ; regNoun : Str -> Noun = \snake -> mkNoun snake (snake + "s") ;
sNoun : Str -> Noun = \kiss -> mkNoun kiss (kiss + "es") ; sNoun : Str -> Noun = \kiss -> mkNoun kiss (kiss + "es") ;
``` ```
What about nouns like <i>fly<i>, with the plural <i>flies<i>? The already What about nouns like //fly//, with the plural //flies//? The already
available solution is to use the so-called "technical stem" <i>fl<i> as available solution is to use the so-called "technical stem" //fl// as
argument, and define argument, and define
``` ```
yNoun : Str -> Noun = \fl -> mkNoun (fl + "y") (fl + "ies") ; yNoun : Str -> Noun = \fl -> mkNoun (fl + "y") (fl + "ies") ;
@@ -1045,7 +1054,7 @@ resource module ``Prelude``, which therefore has to be
%--! %--!
<h4>An intelligent noun paradigm using ``case`` expressions<h4> ===An intelligent noun paradigm using ``case`` expressions===
It may be hard for the user of a resource morphology to pick the right It may be hard for the user of a resource morphology to pick the right
inflection paradigm. A way to help this is to define a more intelligent inflection paradigm. A way to help this is to define a more intelligent
@@ -1066,9 +1075,9 @@ these forms are explained in the following section.
The paradigm ``regNoun`` does not give the correct forms for The paradigm ``regNoun`` does not give the correct forms for
all nouns. For instance, <i>louse - lice<i> and all nouns. For instance, //louse - lice// and
<i>fish - fish<i> must be given by using ``mkNoun``. //fish - fish// must be given by using ``mkNoun``.
Also the word <i>boy<i> would be inflected incorrectly; to prevent Also the word //boy// would be inflected incorrectly; to prevent
this, either use ``mkNoun`` or modify this, either use ``mkNoun`` or modify
``regNoun`` so that the ``"y"`` case does not ``regNoun`` so that the ``"y"`` case does not
apply if the second-last character is a vowel. apply if the second-last character is a vowel.
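The second option could be sketched as follows. This is an illustration under assumptions, not the definition used in the tutorial's own resource: ``Number``, ``Noun``, and ``mkNoun`` are repeated here only to make the module self-contained, the module name ``RegNounDemo`` is hypothetical, and ``last`` and ``init`` are assumed to be the ``Prelude`` operations returning the last character of a string and the string without its last character.

```
resource RegNounDemo = open Prelude in {
  param Number = Sg | Pl ;
  oper
    Noun : Type = {s : Number => Str} ;

    -- worst-case macro: singular and plural given explicitly
    mkNoun : Str -> Str -> Noun = \x,y -> {
      s = table {Sg => x ; Pl => y}
    } ;

    -- regular paradigm; the "y" case is split on the second-last character
    regNoun : Str -> Noun = \s -> case last s of {
      "s" | "z" => mkNoun s (s + "es") ;
      "y" => case last (init s) of {
        "a" | "e" | "i" | "o" | "u" => mkNoun s (s + "s") ;  -- vowel + y
        _ => mkNoun s (init s + "ies")                       -- consonant + y
      } ;
      _ => mkNoun s (s + "s")
    } ;
}
```

With this definition, ``regNoun "boy"`` should give //boys// in the plural, while ``regNoun "fly"`` still gives //flies//.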
@@ -1076,7 +1085,7 @@ apply if the second-last character is a vowel.
%--! %--!
<h4>Pattern matching<h4> ===Pattern matching===
Expressions of the ``table`` form are built from lists of Expressions of the ``table`` form are built from lists of
argument-value pairs. These pairs are called the **branches** argument-value pairs. These pairs are called the **branches**
@@ -1111,7 +1120,7 @@ programming languages are syntactic sugar for table selections:
%--! %--!
<h4>Morphological analysis and morphology quiz<h4> ===Morphological analysis and morphology quiz===
Even though in GF morphology Even though in GF morphology
is mostly seen as an auxiliary of syntax, a morphology once defined is mostly seen as an auxiliary of syntax, a morphology once defined
@@ -1147,12 +1156,12 @@ The number flag gives the number of exercises generated.
%--! %--!
<h4>Parametric vs. inherent features, agreement<h4> ===Parametric vs. inherent features, agreement===
The rule of subject-verb agreement in English says that the verb The rule of subject-verb agreement in English says that the verb
phrase must be inflected in the number of the subject. This phrase must be inflected in the number of the subject. This
means that a noun phrase (functioning as a subject), in some sense means that a noun phrase (functioning as a subject), in some sense
<i>has<i> a number, which it "sends" to the verb. The verb does not //has// a number, which it "sends" to the verb. The verb does not
have a number, but must be able to receive whatever number the have a number, but must be able to receive whatever number the
subject has. This distinction is nicely represented by the subject has. This distinction is nicely represented by the
different linearization types of noun phrases and verb phrases: different linearization types of noun phrases and verb phrases:
@@ -1182,7 +1191,7 @@ the formation of noun phrases and verb phrases.
%--! %--!
<h4>English concrete syntax with parameters<h4> ===English concrete syntax with parameters===
``` ```
concrete PaleolithicEng of Paleolithic = open MorphoEng in { concrete PaleolithicEng of Paleolithic = open MorphoEng in {
@@ -1213,7 +1222,7 @@ lin
%--! %--!
<h4>Hierarchic parameter types<h4> ===Hierarchic parameter types===
The reader familiar with a functional programming language such as The reader familiar with a functional programming language such as
[Haskell http://www.haskell.org] must have noticed the similarity [Haskell http://www.haskell.org] must have noticed the similarity
@@ -1255,13 +1264,13 @@ the adjectival paradigm in which the two singular forms are the same, can be def
%--! %--!
<h4>Discontinuous constituents<h4> ===Discontinuous constituents===
A linearization type may contain more than one string. A linearization type may contain more than one string.
An example of where this is useful is English particle An example of where this is useful is English particle
verbs, such as <i>switch off<i>. The linearization of verbs, such as //switch off//. The linearization of
a sentence may place the object between the verb and the particle: a sentence may place the object between the verb and the particle:
<i>he switched it off<i>. //he switched it off//.
@@ -1311,6 +1320,10 @@ either ``s`` or ``s`` with an integer index.
===Speech input and output===
===Embedded grammars in Haskell, Java, and Prolog=== ===Embedded grammars in Haskell, Java, and Prolog===