diagrams
BIN
doc/tutorial/Foodmarket.png
Normal file
Binary file not shown. Size after change: 2.9 KiB.
BIN
doc/tutorial/Tree2.png
Normal file
Binary file not shown. Size after change: 1.7 KiB.
@@ -7,7 +7,7 @@
|
||||
<P ALIGN="center"><CENTER><H1>Grammatical Framework Tutorial</H1>
|
||||
<FONT SIZE="4">
|
||||
<I>Author: Aarne Ranta &lt;aarne (at) cs.chalmers.se&gt;</I><BR>
|
||||
Last update: Sun Dec 18 22:27:21 2005
|
||||
Last update: Sun Dec 18 22:29:50 2005
|
||||
</FONT></CENTER>
|
||||
|
||||
<P></P>
|
||||
@@ -77,6 +77,45 @@ Last update: Sun Dec 18 22:27:21 2005
|
||||
<UL>
|
||||
<LI><A HREF="#toc47">Parametric vs. inherent features, agreement</A>
|
||||
<LI><A HREF="#toc48">English concrete syntax with parameters</A>
|
||||
<LI><A HREF="#toc49">Hierarchic parameter types</A>
|
||||
<LI><A HREF="#toc50">Morphological analysis and morphology quiz</A>
|
||||
<LI><A HREF="#toc51">Discontinuous constituents</A>
|
||||
</UL>
|
||||
<LI><A HREF="#toc52">More constructs for concrete syntax</A>
|
||||
<UL>
|
||||
<LI><A HREF="#toc53">Free variation</A>
|
||||
<LI><A HREF="#toc54">Record extension and subtyping</A>
|
||||
<LI><A HREF="#toc55">Tuples and product types</A>
|
||||
<LI><A HREF="#toc56">Prefix-dependent choices</A>
|
||||
<LI><A HREF="#toc57">Predefined types and operations</A>
|
||||
</UL>
|
||||
<LI><A HREF="#toc58">More features of the module system</A>
|
||||
<UL>
|
||||
<LI><A HREF="#toc59">Resource grammars and their reuse</A>
|
||||
<LI><A HREF="#toc60">Interfaces, instances, and functors</A>
|
||||
<LI><A HREF="#toc61">Restricted inheritance and qualified opening</A>
|
||||
</UL>
|
||||
<LI><A HREF="#toc62">More concepts of abstract syntax</A>
|
||||
<UL>
|
||||
<LI><A HREF="#toc63">Dependent types</A>
|
||||
<LI><A HREF="#toc64">Higher-order abstract syntax</A>
|
||||
<LI><A HREF="#toc65">Semantic definitions</A>
|
||||
</UL>
|
||||
<LI><A HREF="#toc66">Transfer modules</A>
|
||||
<LI><A HREF="#toc67">Practical issues</A>
|
||||
<UL>
|
||||
<LI><A HREF="#toc68">Lexers and unlexers</A>
|
||||
<LI><A HREF="#toc69">Efficiency of grammars</A>
|
||||
<LI><A HREF="#toc70">Speech input and output</A>
|
||||
<LI><A HREF="#toc71">Multilingual syntax editor</A>
|
||||
<LI><A HREF="#toc72">Interactive Development Environment (IDE)</A>
|
||||
<LI><A HREF="#toc73">Communicating with GF</A>
|
||||
<LI><A HREF="#toc74">Embedded grammars in Haskell, Java, and Prolog</A>
|
||||
<LI><A HREF="#toc75">Alternative input and output grammar formats</A>
|
||||
</UL>
|
||||
<LI><A HREF="#toc76">Case studies</A>
|
||||
<UL>
|
||||
<LI><A HREF="#toc77">Interfacing formal and natural languages</A>
|
||||
</UL>
|
||||
</UL>
|
||||
|
||||
@@ -1568,432 +1607,426 @@ the formation of sentences.
|
||||
} ;
|
||||
|
||||
}
|
||||
```
|
||||
|
||||
|
||||
|
||||
%--!
|
||||
===Hierarchic parameter types===
|
||||
|
||||
The reader familiar with a functional programming language such as
|
||||
[Haskell http://www.haskell.org] must have noticed the similarity
|
||||
between parameter types in GF and **algebraic datatypes** (``data`` definitions
|
||||
in Haskell). The GF parameter types are actually a special case of algebraic
|
||||
datatypes: the main restriction is that in GF, these types must be finite.
|
||||
(It is this restriction that makes it possible to invert linearization rules into
|
||||
parsing methods.)
|
||||
|
||||
However, finite is not the same thing as enumerated. Even in GF, parameter
|
||||
constructors can take arguments, provided these arguments are from other
|
||||
parameter types - only recursion is forbidden. Such parameter types impose a
|
||||
hierarchic order among parameters. They are often needed to define
|
||||
the linguistically most accurate parameter systems.
|
||||
|
||||
To give an example, Swedish adjectives
|
||||
are inflected in number (singular or plural) and
|
||||
gender (uter or neuter). These parameters would suggest 2*2=4 different
|
||||
forms. However, the gender distinction is made only in the singular. Therefore,
|
||||
it would be inaccurate to define adjective paradigms using the type
|
||||
``Gender => Number => Str``. The following hierarchic definition
|
||||
yields an accurate system of three adjectival forms.
|
||||
</PRE>
|
||||
<P></P>
|
||||
<A NAME="toc49"></A>
|
||||
<H3>Hierarchic parameter types</H3>
|
||||
<P>
|
||||
param AdjForm = ASg Gender | APl ;
|
||||
param Gender = Uter | Neuter ;
|
||||
The reader familiar with a functional programming language such as
|
||||
<A HREF="http://www.haskell.org">Haskell</A> must have noticed the similarity
|
||||
between parameter types in GF and <B>algebraic datatypes</B> (<CODE>data</CODE> definitions
|
||||
in Haskell). The GF parameter types are actually a special case of algebraic
|
||||
datatypes: the main restriction is that in GF, these types must be finite.
|
||||
(It is this restriction that makes it possible to invert linearization rules into
|
||||
parsing methods.)
|
||||
</P>
|
||||
<P>
|
||||
However, finite is not the same thing as enumerated. Even in GF, parameter
|
||||
constructors can take arguments, provided these arguments are from other
|
||||
parameter types - only recursion is forbidden. Such parameter types impose a
|
||||
hierarchic order among parameters. They are often needed to define
|
||||
the linguistically most accurate parameter systems.
|
||||
</P>
|
||||
<P>
|
||||
To give an example, Swedish adjectives
|
||||
are inflected in number (singular or plural) and
|
||||
gender (uter or neuter). These parameters would suggest 2*2=4 different
|
||||
forms. However, the gender distinction is made only in the singular. Therefore,
|
||||
it would be inaccurate to define adjective paradigms using the type
|
||||
<CODE>Gender => Number => Str</CODE>. The following hierarchic definition
|
||||
yields an accurate system of three adjectival forms.
|
||||
</P>
|
||||
<PRE>
|
||||
In pattern matching, a constructor can have patterns as arguments. For instance,
|
||||
the adjectival paradigm in which the two singular forms are the same, can be defined
|
||||
param AdjForm = ASg Gender | APl ;
|
||||
param Gender = Uter | Neuter ;
|
||||
</PRE>
|
||||
<P>
|
||||
oper plattAdj : Str -> AdjForm => Str = \x -> table {
|
||||
ASg _ => x ;
|
||||
APl => x + "a" ;
|
||||
}
|
||||
In pattern matching, a constructor can have patterns as arguments. For instance,
|
||||
the adjectival paradigm in which the two singular forms are the same, can be defined
|
||||
</P>
|
||||
<PRE>
|
||||
|
||||
|
||||
%--!
|
||||
===Morphological analysis and morphology quiz===
|
||||
|
||||
Even though in GF morphology
|
||||
is mostly seen as an auxiliary of syntax, a morphology once defined
|
||||
can be used in its own right. The command ``morpho_analyse = ma``
|
||||
can be used to read a text and return for each word the analyses that
|
||||
it has in the current concrete syntax.
|
||||
oper plattAdj : Str -> AdjForm => Str = \x -> table {
|
||||
ASg _ => x ;
|
||||
APl => x + "a" ;
|
||||
}
|
||||
</PRE>
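As a sketch of how the hierarchic type covers a paradigm with three distinct forms, the following resource module defines a regular adjective pattern. The module name `AdjSwe`, the operation `regAdj` and the example word below are assumptions made for illustration; they are not taken from the tutorial's grammars.

```gf
resource AdjSwe = {
  param Gender  = Uter | Neuter ;
  param AdjForm = ASg Gender | APl ;  -- three values: ASg Uter, ASg Neuter, APl

  -- hypothetical regular paradigm: bare stem, stem + "t", stem + "a"
  oper regAdj : Str -> AdjForm => Str = \x -> table {
    ASg Uter   => x ;
    ASg Neuter => x + "t" ;
    APl        => x + "a"
  } ;
}
```

Applied to the stem "fin", the table would hold the forms fin, fint and fina. Because `APl` takes no `Gender` argument, a fourth, redundant plural form cannot even be written down.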
|
||||
<P></P>
|
||||
<A NAME="toc50"></A>
|
||||
<H3>Morphological analysis and morphology quiz</H3>
|
||||
<P>
|
||||
> rf bible.txt | morpho_analyse
|
||||
Even though in GF morphology
|
||||
is mostly seen as an auxiliary of syntax, a morphology once defined
|
||||
can be used in its own right. The command <CODE>morpho_analyse = ma</CODE>
|
||||
can be used to read a text and return for each word the analyses that
|
||||
it has in the current concrete syntax.
|
||||
</P>
|
||||
<PRE>
|
||||
In the same way as translation exercises, morphological exercises can
|
||||
be generated, by the command ``morpho_quiz = mq``. Usually,
|
||||
the category is set to something other than ``S``. For instance,
|
||||
> rf bible.txt | morpho_analyse
|
||||
</PRE>
|
||||
<P>
|
||||
> i lib/resource/french/VerbsFre.gf
|
||||
> morpho_quiz -cat=V
|
||||
</P>
|
||||
<P>
|
||||
Welcome to GF Morphology Quiz.
|
||||
...
|
||||
</P>
|
||||
<P>
|
||||
réapparaître : VFin VCondit Pl P2
|
||||
réapparaitriez
|
||||
> No, not réapparaitriez, but
|
||||
réapparaîtriez
|
||||
Score 0/1
|
||||
In the same way as translation exercises, morphological exercises can
|
||||
be generated, by the command <CODE>morpho_quiz = mq</CODE>. Usually,
|
||||
the category is set to something other than <CODE>S</CODE>. For instance,
|
||||
</P>
|
||||
<PRE>
|
||||
Finally, a list of morphological exercises can be generated and saved in a
|
||||
file for later use, by the command ``morpho_list = ml``
|
||||
> i lib/resource/french/VerbsFre.gf
|
||||
> morpho_quiz -cat=V
|
||||
|
||||
Welcome to GF Morphology Quiz.
|
||||
...
|
||||
|
||||
réapparaître : VFin VCondit Pl P2
|
||||
réapparaitriez
|
||||
> No, not réapparaitriez, but
|
||||
réapparaîtriez
|
||||
Score 0/1
|
||||
</PRE>
|
||||
<P>
|
||||
> morpho_list -number=25 -cat=V
|
||||
Finally, a list of morphological exercises can be generated and saved in a
|
||||
file for later use, by the command <CODE>morpho_list = ml</CODE>
|
||||
</P>
|
||||
<PRE>
|
||||
The ``number`` flag gives the number of exercises generated.
|
||||
|
||||
|
||||
|
||||
%--!
|
||||
===Discontinuous constituents===
|
||||
|
||||
A linearization type may contain more strings than one.
|
||||
This is useful, for example, for English particle
|
||||
verbs, such as //switch off//. The linearization of
|
||||
a sentence may place the object between the verb and the particle:
|
||||
//he switched it off//.
|
||||
|
||||
The first of the following judgements defines transitive verbs as
|
||||
**discontinuous constituents**, i.e. as having a linearization
|
||||
type with two strings and not just one. The second judgement
|
||||
shows how the constituents are separated by the object in complementation.
|
||||
> morpho_list -number=25 -cat=V
|
||||
</PRE>
|
||||
<P>
|
||||
lincat TV = {s : Number => Str ; s2 : Str} ;
|
||||
lin ComplTV tv obj = {s = \\n => tv.s ! n ++ obj.s ++ tv.s2} ;
|
||||
The <CODE>number</CODE> flag gives the number of exercises generated.
|
||||
</P>
|
||||
<A NAME="toc51"></A>
|
||||
<H3>Discontinuous constituents</H3>
|
||||
<P>
|
||||
A linearization type may contain more strings than one.
|
||||
This is useful, for example, for English particle
|
||||
verbs, such as <I>switch off</I>. The linearization of
|
||||
a sentence may place the object between the verb and the particle:
|
||||
<I>he switched it off</I>.
|
||||
</P>
|
||||
<P>
|
||||
The first of the following judgements defines transitive verbs as
|
||||
<B>discontinuous constituents</B>, i.e. as having a linearization
|
||||
type with two strings and not just one. The second judgement
|
||||
shows how the constituents are separated by the object in complementation.
|
||||
</P>
|
||||
<PRE>
|
||||
There is no restriction on the number of discontinuous constituents
|
||||
(or other fields) a ``lincat`` may contain. The only condition is that
|
||||
the fields must be of finite types, i.e. built from records, tables,
|
||||
parameters, and ``Str``, and not functions. A mathematical result
|
||||
about parsing in GF says that the worst-case complexity of parsing
|
||||
increases with the number of discontinuous constituents. Moreover,
|
||||
the parsing and linearization commands only give reliable results
|
||||
for categories whose linearization type has a unique ``Str`` valued
|
||||
field labelled ``s``.
|
||||
|
||||
|
||||
%--!
|
||||
==More constructs for concrete syntax==
|
||||
|
||||
|
||||
%--!
|
||||
===Free variation===
|
||||
|
||||
Sometimes there are many alternative ways to define a concrete syntax.
|
||||
For instance, the verb negation in English can be expressed both by
|
||||
//does not// and //doesn't//. In linguistic terms, these expressions
|
||||
are in **free variation**. The ``variants`` construct of GF can
|
||||
be used to give a list of strings in free variation. For example,
|
||||
lincat TV = {s : Number => Str ; s2 : Str} ;
|
||||
lin ComplTV tv obj = {s = \\n => tv.s ! n ++ obj.s ++ tv.s2} ;
|
||||
</PRE>
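To see the two judgements at work, here is a minimal self-contained sketch of a grammar around them. The module names, the categories `VP` and `NP`, the parameter type `Number` with values `Sg` and `Pl`, and the lexical entries `SwitchOff` and `It` are all assumptions introduced for illustration.

```gf
-- each module would normally go in a file named after it
abstract Particle = {
  cat VP ; TV ; NP ;
  fun ComplTV   : TV -> NP -> VP ;
  fun SwitchOff : TV ;
  fun It        : NP ;
}

concrete ParticleEng of Particle = {
  param Number = Sg | Pl ;
  lincat VP = {s : Number => Str} ;
  lincat TV = {s : Number => Str ; s2 : Str} ;  -- verb forms plus particle
  lincat NP = {s : Str} ;
  lin ComplTV tv obj = {s = \\n => tv.s ! n ++ obj.s ++ tv.s2} ;
  lin SwitchOff = {s = table {Sg => "switches" ; Pl => "switch"} ; s2 = "off"} ;
  lin It = {s = "it"} ;
}
```

Under these assumptions, the singular form of `ComplTV SwitchOff It` is computed as `"switches" ++ "it" ++ "off"`: the object lands between the two strings stored in the verb's record.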
|
||||
<P>
|
||||
NegVerb verb = {s = variants {["does not"] ; "doesn't"} ++ verb.s} ;
|
||||
There is no restriction on the number of discontinuous constituents
|
||||
(or other fields) a <CODE>lincat</CODE> may contain. The only condition is that
|
||||
the fields must be of finite types, i.e. built from records, tables,
|
||||
parameters, and <CODE>Str</CODE>, and not functions. A mathematical result
|
||||
about parsing in GF says that the worst-case complexity of parsing
|
||||
increases with the number of discontinuous constituents. Moreover,
|
||||
the parsing and linearization commands only give reliable results
|
||||
for categories whose linearization type has a unique <CODE>Str</CODE> valued
|
||||
field labelled <CODE>s</CODE>.
|
||||
</P>
|
||||
<A NAME="toc52"></A>
|
||||
<H2>More constructs for concrete syntax</H2>
|
||||
<A NAME="toc53"></A>
|
||||
<H3>Free variation</H3>
|
||||
<P>
|
||||
Sometimes there are many alternative ways to define a concrete syntax.
|
||||
For instance, the verb negation in English can be expressed both by
|
||||
<I>does not</I> and <I>doesn't</I>. In linguistic terms, these expressions
|
||||
are in <B>free variation</B>. The <CODE>variants</CODE> construct of GF can
|
||||
be used to give a list of strings in free variation. For example,
|
||||
</P>
|
||||
<PRE>
|
||||
An empty variant list
|
||||
NegVerb verb = {s = variants {["does not"] ; "doesn't"} ++ verb.s} ;
|
||||
</PRE>
|
||||
<P>
|
||||
variants {}
|
||||
An empty variant list
|
||||
</P>
|
||||
<PRE>
|
||||
can be used e.g. if a word lacks a certain form.
|
||||
|
||||
In general, ``variants`` should be used cautiously. It is not
|
||||
recommended for modules intended as libraries, because the
|
||||
user of the library has no way to choose among the variants.
|
||||
Moreover, even though ``variants`` admits lists of any type,
|
||||
its semantics for complex types can cause surprises.
|
||||
|
||||
|
||||
|
||||
|
||||
===Record extension and subtyping===
|
||||
|
||||
Record types and records can be **extended** with new fields. For instance,
|
||||
in German it is natural to see transitive verbs as verbs with a case.
|
||||
The symbol ``**`` is used for both constructs.
|
||||
variants {}
|
||||
</PRE>
|
||||
<P>
|
||||
lincat TV = Verb ** {c : Case} ;
|
||||
can be used e.g. if a word lacks a certain form.
|
||||
</P>
|
||||
<P>
|
||||
lin Follow = regVerb "folgen" ** {c = Dative} ;
|
||||
In general, <CODE>variants</CODE> should be used cautiously. It is not
|
||||
recommended for modules intended as libraries, because the
|
||||
user of the library has no way to choose among the variants.
|
||||
Moreover, even though <CODE>variants</CODE> admits lists of any type,
|
||||
its semantics for complex types can cause surprises.
|
||||
</P>
|
||||
<A NAME="toc54"></A>
|
||||
<H3>Record extension and subtyping</H3>
|
||||
<P>
|
||||
Record types and records can be <B>extended</B> with new fields. For instance,
|
||||
in German it is natural to see transitive verbs as verbs with a case.
|
||||
The symbol <CODE>**</CODE> is used for both constructs.
|
||||
</P>
|
||||
<PRE>
|
||||
To extend a record type or a record with a field whose label it
|
||||
already has is a type error.
|
||||
lincat TV = Verb ** {c : Case} ;
|
||||
|
||||
A record type //T// is a **subtype** of another one //R//, if //T// has
|
||||
all the fields of //R// and possibly other fields. For instance,
|
||||
an extension of a record type is always a subtype of it.
|
||||
|
||||
If //T// is a subtype of //R//, an object of //T// can be used whenever
|
||||
an object of //R// is required. For instance, a transitive verb can
|
||||
be used whenever a verb is required.
|
||||
|
||||
**Contravariance** means that a function taking an //R// as argument
|
||||
can also be applied to any object of a subtype //T//.
|
||||
|
||||
|
||||
|
||||
===Tuples and product types===
|
||||
|
||||
Product types and tuples are syntactic sugar for record types and records:
|
||||
lin Follow = regVerb "folgen" ** {c = Dative} ;
|
||||
</PRE>
|
||||
<P>
|
||||
T1 * ... * Tn === {p1 : T1 ; ... ; pn : Tn}
|
||||
<t1, ..., tn> === {p1 = t1 ; ... ; pn = tn}
|
||||
To extend a record type or a record with a field whose label it
|
||||
already has is a type error.
|
||||
</P>
|
||||
<P>
|
||||
A record type <I>T</I> is a <B>subtype</B> of another one <I>R</I>, if <I>T</I> has
|
||||
all the fields of <I>R</I> and possibly other fields. For instance,
|
||||
an extension of a record type is always a subtype of it.
|
||||
</P>
|
||||
<P>
|
||||
If <I>T</I> is a subtype of <I>R</I>, an object of <I>T</I> can be used whenever
|
||||
an object of <I>R</I> is required. For instance, a transitive verb can
|
||||
be used whenever a verb is required.
|
||||
</P>
|
||||
<P>
|
||||
<B>Contravariance</B> means that a function taking an <I>R</I> as argument
|
||||
can also be applied to any object of a subtype <I>T</I>.
|
||||
</P>
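The subtyping rule can be illustrated with a small resource module. This is only a sketch: the module name, the simplified `Verb` type with a single string field, the three-valued `Case` type and the operation `infinitive` are assumptions for the sake of the example.

```gf
resource SubtypeDemo = {
  param Case = Nominative | Accusative | Dative ;

  oper Verb      : Type = {s : Str} ;          -- simplified verb type
  oper TransVerb : Type = Verb ** {c : Case} ; -- extension, hence a subtype of Verb

  oper folgen : TransVerb = {s = "folgen"} ** {c = Dative} ;

  -- a function that expects a Verb ...
  oper infinitive : Verb -> Str = \v -> v.s ;

  -- ... can be applied to a TransVerb; the extra field c is simply ignored
  oper folgenInf : Str = infinitive folgen ;
}
```

Here `infinitive` never looks at the field `c`, which is why passing the richer record is safe.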
|
||||
<A NAME="toc55"></A>
|
||||
<H3>Tuples and product types</H3>
|
||||
<P>
|
||||
Product types and tuples are syntactic sugar for record types and records:
|
||||
</P>
|
||||
<PRE>
|
||||
Thus the labels ``p1, p2,...`` are hard-coded.
|
||||
|
||||
|
||||
%--!
|
||||
===Prefix-dependent choices===
|
||||
|
||||
The construct exemplified in
|
||||
T1 * ... * Tn === {p1 : T1 ; ... ; pn : Tn}
|
||||
<t1, ..., tn> === {p1 = t1 ; ... ; pn = tn}
|
||||
</PRE>
|
||||
<P>
|
||||
oper artIndef : Str =
|
||||
pre {"a" ; "an" / strs {"a" ; "e" ; "i" ; "o"}} ;
|
||||
Thus the labels <CODE>p1, p2,...</CODE> are hard-coded.
|
||||
</P>
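A short sketch of what the equivalence means in practice; the module name and the operation names are hypothetical.

```gf
resource TupleDemo = {
  -- tuple notation for the value, product notation for the type
  oper particleVerb : Str * Str = <"switch", "off"> ;

  -- the same value typed with the record type that the product type abbreviates
  oper sameVerb : {p1 : Str ; p2 : Str} = particleVerb ;

  -- components are projected with the hard-coded labels
  oper particle : Str = particleVerb.p2 ;
}
```

The price of the shorter notation is that the labels carry no meaning of their own, so records with descriptive labels are often easier to read in larger grammars.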
|
||||
<A NAME="toc56"></A>
|
||||
<H3>Prefix-dependent choices</H3>
|
||||
<P>
|
||||
The construct exemplified in
|
||||
</P>
|
||||
<PRE>
|
||||
Thus
|
||||
oper artIndef : Str =
|
||||
pre {"a" ; "an" / strs {"a" ; "e" ; "i" ; "o"}} ;
|
||||
</PRE>
|
||||
<P>
|
||||
artIndef ++ "cheese" ---> "a" ++ "cheese"
|
||||
artIndef ++ "apple" ---> "an" ++ "apple"
|
||||
Thus
|
||||
</P>
|
||||
<PRE>
|
||||
This very example does not work in all situations: the prefix
|
||||
//u// has no general rules, and some problematic words are
|
||||
//euphemism, one-eyed, n-gram//. It is possible to write
|
||||
artIndef ++ "cheese" ---> "a" ++ "cheese"
|
||||
artIndef ++ "apple" ---> "an" ++ "apple"
|
||||
</PRE>
|
||||
<P>
|
||||
oper artIndef : Str =
|
||||
pre {"a" ;
|
||||
"a" / strs {"eu" ; "one"} ;
|
||||
"an" / strs {"a" ; "e" ; "i" ; "o" ; "n-"}
|
||||
} ;
|
||||
This very example does not work in all situations: the prefix
|
||||
<I>u</I> has no general rules, and some problematic words are
|
||||
<I>euphemism, one-eyed, n-gram</I>. It is possible to write
|
||||
</P>
|
||||
<PRE>
|
||||
|
||||
|
||||
|
||||
===Predefined types and operations===
|
||||
|
||||
GF has the following predefined categories in abstract syntax:
|
||||
oper artIndef : Str =
|
||||
pre {"a" ;
|
||||
"a" / strs {"eu" ; "one"} ;
|
||||
"an" / strs {"a" ; "e" ; "i" ; "o" ; "n-"}
|
||||
} ;
|
||||
</PRE>
|
||||
<P></P>
|
||||
<A NAME="toc57"></A>
|
||||
<H3>Predefined types and operations</H3>
|
||||
<P>
|
||||
cat Int ; -- integers, e.g. 0, 5, 743145151019
|
||||
cat Float ; -- floats, e.g. 0.0, 3.1415926
|
||||
cat String ; -- strings, e.g. "", "foo", "123"
|
||||
GF has the following predefined categories in abstract syntax:
|
||||
</P>
|
||||
<PRE>
|
||||
The objects of each of these categories are **literals**
|
||||
as indicated in the comments above. No ``fun`` definition
|
||||
can have a predefined category as its value type, but
|
||||
they can be used as arguments. For example:
|
||||
cat Int ; -- integers, e.g. 0, 5, 743145151019
|
||||
cat Float ; -- floats, e.g. 0.0, 3.1415926
|
||||
cat String ; -- strings, e.g. "", "foo", "123"
|
||||
</PRE>
|
||||
<P>
|
||||
fun StreetAddress : Int -> String -> Address ;
|
||||
lin StreetAddress number street = {s = number.s ++ street.s} ;
|
||||
</P>
|
||||
<P>
|
||||
-- e.g. (StreetAddress 10 "Downing Street") : Address
|
||||
The objects of each of these categories are <B>literals</B>
|
||||
as indicated in the comments above. No <CODE>fun</CODE> definition
|
||||
can have a predefined category as its value type, but
|
||||
they can be used as arguments. For example:
|
||||
</P>
|
||||
<PRE>
|
||||
fun StreetAddress : Int -> String -> Address ;
|
||||
lin StreetAddress number street = {s = number.s ++ street.s} ;
|
||||
|
||||
|
||||
%--!
|
||||
==More features of the module system==
|
||||
|
||||
|
||||
===Resource grammars and their reuse===
|
||||
|
||||
See
|
||||
[resource library documentation ../../lib/resource/doc/gf-resource.html]
|
||||
|
||||
|
||||
===Interfaces, instances, and functors===
|
||||
|
||||
See an
|
||||
[example built this way ../../examples/mp3/mp3-resource.html]
|
||||
|
||||
|
||||
===Restricted inheritance and qualified opening===
|
||||
|
||||
|
||||
|
||||
==More concepts of abstract syntax==
|
||||
|
||||
|
||||
===Dependent types===
|
||||
|
||||
===Higher-order abstract syntax===
|
||||
|
||||
===Semantic definitions===
|
||||
|
||||
|
||||
|
||||
==Transfer modules==
|
||||
|
||||
Transfer means noncompositional tree-transforming operations.
|
||||
The command ``apply_transfer = at`` is typically used in a pipe:
|
||||
-- e.g. (StreetAddress 10 "Downing Street") : Address
|
||||
</PRE>
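For completeness, the two rules can be placed in a minimal grammar as sketched below. The module names `Addresses` and `AddressesEng` and the linearization type chosen for `Address` are assumptions; only the `fun` and `lin` rules come from the text above.

```gf
abstract Addresses = {
  cat Address ;
  -- predefined categories appear only as argument types
  fun StreetAddress : Int -> String -> Address ;
}

concrete AddressesEng of Addresses = {
  lincat Address = {s : Str} ;
  -- literal arguments are records with a string field s
  lin StreetAddress number street = {s = number.s ++ street.s} ;
}
```

Note that no `cat` or `lincat` judgement is written for `Int` or `String`: they are predefined, which is exactly why a grammar cannot add its own `fun` rules producing them.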
|
||||
<P></P>
|
||||
<A NAME="toc58"></A>
|
||||
<H2>More features of the module system</H2>
|
||||
<A NAME="toc59"></A>
|
||||
<H3>Resource grammars and their reuse</H3>
|
||||
<P>
|
||||
> p "John walks and John runs" | apply_transfer aggregate | l
|
||||
John walks and runs
|
||||
See
|
||||
<A HREF="../../lib/resource/doc/gf-resource.html">resource library documentation</A>
|
||||
</P>
|
||||
<A NAME="toc60"></A>
|
||||
<H3>Interfaces, instances, and functors</H3>
|
||||
<P>
|
||||
See an
|
||||
<A HREF="../../examples/mp3/mp3-resource.html">example built this way</A>
|
||||
</P>
|
||||
<A NAME="toc61"></A>
|
||||
<H3>Restricted inheritance and qualified opening</H3>
|
||||
<A NAME="toc62"></A>
|
||||
<H2>More concepts of abstract syntax</H2>
|
||||
<A NAME="toc63"></A>
|
||||
<H3>Dependent types</H3>
|
||||
<A NAME="toc64"></A>
|
||||
<H3>Higher-order abstract syntax</H3>
|
||||
<A NAME="toc65"></A>
|
||||
<H3>Semantic definitions</H3>
|
||||
<A NAME="toc66"></A>
|
||||
<H2>Transfer modules</H2>
|
||||
<P>
|
||||
Transfer means noncompositional tree-transforming operations.
|
||||
The command <CODE>apply_transfer = at</CODE> is typically used in a pipe:
|
||||
</P>
|
||||
<PRE>
|
||||
See the
|
||||
[sources ../../transfer/examples/aggregation] of this example.
|
||||
|
||||
See the
|
||||
[transfer language documentation ../transfer.html]
|
||||
for more information.
|
||||
|
||||
|
||||
==Practical issues==
|
||||
|
||||
|
||||
===Lexers and unlexers===
|
||||
|
||||
Lexers and unlexers can be chosen from
|
||||
a list of predefined ones, using the flags ``-lexer`` and ``-unlexer`` either
|
||||
in the grammar file or on the GF command line.
|
||||
|
||||
The following lists are given by ``help -lexer`` and ``help -unlexer``:
|
||||
> p "John walks and John runs" | apply_transfer aggregate | l
|
||||
John walks and runs
|
||||
</PRE>
|
||||
<P>
|
||||
The default is words.
|
||||
-lexer=words tokens are separated by spaces or newlines
|
||||
-lexer=literals like words, but GF integer and string literals recognized
|
||||
-lexer=vars like words, but "x","x_...","$...$" as vars, "?..." as meta
|
||||
-lexer=chars each character is a token
|
||||
-lexer=code use Haskell's lex
|
||||
-lexer=codevars like code, but treat unknown words as variables, ?? as meta
|
||||
-lexer=text with conventions on punctuation and capital letters
|
||||
-lexer=codelit like code, but treat unknown words as string literals
|
||||
-lexer=textlit like text, but treat unknown words as string literals
|
||||
-lexer=codeC use a C-like lexer
|
||||
-lexer=ignore like literals, but ignore unknown words
|
||||
-lexer=subseqs like ignore, but then try all subsequences from longest
|
||||
See the
|
||||
<A HREF="../../transfer/examples/aggregation">sources</A> of this example.
|
||||
</P>
|
||||
<P>
|
||||
The default is unwords.
|
||||
-unlexer=unwords space-separated token list (like unwords)
|
||||
-unlexer=text format as text: punctuation, capitals, paragraph <p>
|
||||
-unlexer=code format as code (spacing, indentation)
|
||||
-unlexer=textlit like text, but remove string literal quotes
|
||||
-unlexer=codelit like code, but remove string literal quotes
|
||||
-unlexer=concat remove all spaces
|
||||
-unlexer=bind like identity, but bind at "&+"
|
||||
See the
|
||||
<A HREF="../transfer.html">transfer language documentation</A>
|
||||
for more information.
|
||||
</P>
|
||||
<A NAME="toc67"></A>
|
||||
<H2>Practical issues</H2>
|
||||
<A NAME="toc68"></A>
|
||||
<H3>Lexers and unlexers</H3>
|
||||
<P>
|
||||
Lexers and unlexers can be chosen from
|
||||
a list of predefined ones, using the flags <CODE>-lexer</CODE> and <CODE>-unlexer</CODE> either
|
||||
in the grammar file or on the GF command line.
|
||||
</P>
|
||||
<P>
|
||||
The following lists are given by <CODE>help -lexer</CODE> and <CODE>help -unlexer</CODE>:
|
||||
</P>
|
||||
<PRE>
|
||||
The default is words.
|
||||
-lexer=words tokens are separated by spaces or newlines
|
||||
-lexer=literals like words, but GF integer and string literals recognized
|
||||
-lexer=vars like words, but "x","x_...","$...$" as vars, "?..." as meta
|
||||
-lexer=chars each character is a token
|
||||
-lexer=code use Haskell's lex
|
||||
-lexer=codevars like code, but treat unknown words as variables, ?? as meta
|
||||
-lexer=text with conventions on punctuation and capital letters
|
||||
-lexer=codelit like code, but treat unknown words as string literals
|
||||
-lexer=textlit like text, but treat unknown words as string literals
|
||||
-lexer=codeC use a C-like lexer
|
||||
-lexer=ignore like literals, but ignore unknown words
|
||||
-lexer=subseqs like ignore, but then try all subsequences from longest
|
||||
|
||||
|
||||
===Efficiency of grammars===
|
||||
|
||||
Issues:
|
||||
|
||||
- the choice of data structures in ``lincat``s
|
||||
- the value of the ``optimize`` flag
|
||||
- parsing efficiency: ``-mcfg`` vs. others
|
||||
|
||||
|
||||
===Speech input and output===
|
||||
|
||||
The ``speak_aloud = sa`` command sends a string to the speech
|
||||
synthesizer
|
||||
[Flite http://www.speech.cs.cmu.edu/flite/doc/].
|
||||
It is typically used via a pipe:
|
||||
``` generate_random | linearize | speak_aloud
|
||||
The result is only satisfactory for English.
|
||||
|
||||
The ``speech_input = si`` command receives a string from a
|
||||
speech recognizer that requires the installation of
|
||||
[ATK http://mi.eng.cam.ac.uk/~sjy/software.htm].
|
||||
It is typically used to pipe input to a parser:
|
||||
``` speech_input -tr | parse
|
||||
The method works only for grammars of English.
|
||||
|
||||
Both Flite and ATK are freely available through the links
|
||||
above, but they are not distributed together with GF.
|
||||
|
||||
|
||||
|
||||
|
||||
===Multilingual syntax editor===
|
||||
|
||||
The
|
||||
[Editor User Manual http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm]
|
||||
describes the use of the editor, which works for any multilingual GF grammar.
|
||||
|
||||
Here is a snapshot of the editor:
|
||||
|
||||
[../quick-editor.gif]
|
||||
|
||||
The grammars of the snapshot are from the
|
||||
[Letter grammar package http://www.cs.chalmers.se/~aarne/GF/examples/letter].
|
||||
|
||||
|
||||
|
||||
===Interactive Development Environment (IDE)===
|
||||
|
||||
Forthcoming.
|
||||
|
||||
|
||||
===Communicating with GF===
|
||||
|
||||
Other processes can communicate with the GF command interpreter,
|
||||
and also with the GF syntax editor.
|
||||
|
||||
|
||||
===Embedded grammars in Haskell, Java, and Prolog===
|
||||
|
||||
GF grammars can be used as parts of programs written in the
|
||||
following languages. The links give more documentation.
|
||||
|
||||
- [Java http://www.cs.chalmers.se/~bringert/gf/gf-java.html]
|
||||
- [Haskell http://www.cs.chalmers.se/~aarne/GF/src/GF/Embed/EmbedAPI.hs]
|
||||
- [Prolog http://www.cs.chalmers.se/~peb/software.html]
|
||||
|
||||
|
||||
===Alternative input and output grammar formats===
|
||||
|
||||
A summary is given in the following chart of GF grammar compiler phases:
|
||||
[../gf-compiler.png]
|
||||
|
||||
|
||||
==Case studies==
|
||||
|
||||
===Interfacing formal and natural languages===
|
||||
|
||||
[Formal and Informal Software Specifications http://www.cs.chalmers.se/~krijo/thesis/thesisA4.pdf],
|
||||
PhD Thesis by
|
||||
[Kristofer Johannisson http://www.cs.chalmers.se/~krijo], is an extensive example of this.
|
||||
The system is based on a multilingual grammar relating the formal language OCL with
|
||||
English and German.
|
||||
|
||||
A simpler example will be explained here.
|
||||
The default is unwords.
|
||||
-unlexer=unwords space-separated token list (like unwords)
|
||||
-unlexer=text format as text: punctuation, capitals, paragraph <p>
|
||||
-unlexer=code format as code (spacing, indentation)
|
||||
-unlexer=textlit like text, but remove string literal quotes
|
||||
-unlexer=codelit like code, but remove string literal quotes
|
||||
-unlexer=concat remove all spaces
|
||||
-unlexer=bind like identity, but bind at "&+"
|
||||
|
||||
</PRE>
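As a sketch of the first alternative, setting the flags in the grammar file, a concrete syntax could carry a `flags` judgement as below. The grammar itself (`Rooms`, `RoomsEng`, `RoomNumber`) is hypothetical, and the exact spelling of the `flags` line is an assumption that should be checked against the GF version in use.

```gf
abstract Rooms = {
  cat Room ;
  fun RoomNumber : Int -> Room ;
}

concrete RoomsEng of Rooms = {
  -- assumed syntax: recognize integer literals in input, format output as text
  flags lexer=literals ; unlexer=text ;

  lincat Room = {s : Str} ;
  lin RoomNumber n = {s = "room" ++ n.s} ;
}
```

A grammar with `Int` or `String` arguments is a natural candidate for `literals`, since the default `words` lexer is not described as recognizing literals. On the command line, the same choice would be expressed with the `-lexer=...` and `-unlexer=...` options listed above.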
|
||||
<P></P>
|
||||
<A NAME="toc69"></A>
|
||||
<H3>Efficiency of grammars</H3>
|
||||
<P>
|
||||
Issues:
|
||||
</P>
|
||||
<UL>
|
||||
<LI>the choice of data structures in <CODE>lincat</CODE>s
|
||||
<LI>the value of the <CODE>optimize</CODE> flag
|
||||
<LI>parsing efficiency: <CODE>-mcfg</CODE> vs. others
|
||||
</UL>
|
||||
|
||||
<A NAME="toc70"></A>
|
||||
<H3>Speech input and output</H3>
|
||||
<P>
|
||||
The <CODE>speak_aloud = sa</CODE> command sends a string to the speech
|
||||
synthesizer
|
||||
<A HREF="http://www.speech.cs.cmu.edu/flite/doc/">Flite</A>.
|
||||
It is typically used via a pipe:
|
||||
</P>
|
||||
<PRE>
|
||||
generate_random | linearize | speak_aloud
|
||||
</PRE>
|
||||
<P>
|
||||
The result is only satisfactory for English.
|
||||
</P>
|
||||
<P>
|
||||
The <CODE>speech_input = si</CODE> command receives a string from a
|
||||
speech recognizer that requires the installation of
|
||||
<A HREF="http://mi.eng.cam.ac.uk/~sjy/software.htm">ATK</A>.
|
||||
It is typically used to pipe input to a parser:
|
||||
</P>
|
||||
<PRE>
|
||||
speech_input -tr | parse
|
||||
</PRE>
|
||||
<P>
|
||||
The method works only for grammars of English.
|
||||
</P>
|
||||
<P>
|
||||
Both Flite and ATK are freely available through the links
|
||||
above, but they are not distributed together with GF.
|
||||
</P>
|
||||
<A NAME="toc71"></A>
|
||||
<H3>Multilingual syntax editor</H3>
|
||||
<P>
|
||||
The
|
||||
<A HREF="http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm">Editor User Manual</A>
|
||||
describes the use of the editor, which works for any multilingual GF grammar.
|
||||
</P>
|
||||
<P>
|
||||
Here is a snapshot of the editor:
|
||||
</P>
|
||||
<P>
|
||||
<IMG ALIGN="middle" SRC="../quick-editor.gif" BORDER="0" ALT="">
|
||||
</P>
|
||||
<P>
|
||||
The grammars of the snapshot are from the
|
||||
<A HREF="http://www.cs.chalmers.se/~aarne/GF/examples/letter">Letter grammar package</A>.
|
||||
</P>
|
||||
<A NAME="toc72"></A>
|
||||
<H3>Interactive Development Environment (IDE)</H3>
|
||||
<P>
|
||||
Forthcoming.
|
||||
</P>
|
||||
<A NAME="toc73"></A>
|
||||
<H3>Communicating with GF</H3>
|
||||
<P>
|
||||
Other processes can communicate with the GF command interpreter,
|
||||
and also with the GF syntax editor.
|
||||
</P>
|
||||
<A NAME="toc74"></A>
|
||||
<H3>Embedded grammars in Haskell, Java, and Prolog</H3>
|
||||
<P>
|
||||
GF grammars can be used as parts of programs written in the
|
||||
following languages. The links give more documentation.
|
||||
</P>
|
||||
<UL>
|
||||
<LI><A HREF="http://www.cs.chalmers.se/~bringert/gf/gf-java.html">Java</A>
|
||||
<LI><A HREF="http://www.cs.chalmers.se/~aarne/GF/src/GF/Embed/EmbedAPI.hs">Haskell</A>
|
||||
<LI><A HREF="http://www.cs.chalmers.se/~peb/software.html">Prolog</A>
|
||||
</UL>
|
||||
|
||||
<A NAME="toc75"></A>
|
||||
<H3>Alternative input and output grammar formats</H3>
|
||||
<P>
|
||||
A summary is given in the following chart of GF grammar compiler phases:
|
||||
<IMG ALIGN="middle" SRC="../gf-compiler.png" BORDER="0" ALT="">
|
||||
</P>
|
||||
<A NAME="toc76"></A>
|
||||
<H2>Case studies</H2>
|
||||
<A NAME="toc77"></A>
|
||||
<H3>Interfacing formal and natural languages</H3>
|
||||
<P>
|
||||
<A HREF="http://www.cs.chalmers.se/~krijo/thesis/thesisA4.pdf">Formal and Informal Software Specifications</A>,
|
||||
PhD Thesis by
|
||||
<A HREF="http://www.cs.chalmers.se/~krijo">Kristofer Johannisson</A>, is an extensive example of this.
|
||||
The system is based on a multilingual grammar relating the formal language OCL with
|
||||
English and German.
|
||||
</P>
|
||||
<P>
|
||||
A simpler example will be explained here.
|
||||
</P>
|
||||
|
||||
<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
|
||||
<!-- cmdline: txt2tags -\-toc gf-tutorial2.txt -->
|
||||
|
||||
@@ -1345,7 +1345,7 @@ concrete FoodsEng of Foods = open Prelude, MorphoEng in {
|
||||
} ;
|
||||
|
||||
}
|
||||
```
|
||||
```
|
||||
|
||||
|
||||
|
||||
|
||||