=== {p1 = T1 ; ... ; pn = Tn}
```
Thus the labels ``p1, p2,...`` are hard-coded.
#NEW
====Prefix-dependent choices====
English indefinite article:
```
oper artIndef : Str =
pre {"a" ; "an" / strs {"a" ; "e" ; "i" ; "o"}} ;
```
Thus
```
artIndef ++ "cheese" ---> "a" ++ "cheese"
artIndef ++ "apple" ---> "an" ++ "apple"
```
//Prefix-dependent choice may be deprecated in GF version 3.//
#NEW
=Lesson 4: Using the resource grammar library=
#Lchapfive
Goals:
- navigate in the GF resource grammar library and use it in applications
- get acquainted with basic linguistic categories
- write functors to achieve maximal sharing of code in multilingual grammars
#NEW
==The coverage of the library==
The current 12 resource languages are
- ``Bul``garian
- ``Cat``alan
- ``Dan``ish
- ``Eng``lish
- ``Fin``nish
- ``Fre``nch
- ``Ger``man
- ``Ita``lian
- ``Nor``wegian
- ``Rus``sian
- ``Spa``nish
- ``Swe``dish
The first three letters (``Eng`` etc) are used in grammar module names
(ISO 639 standard).
#NEW
==The structure of the library==
#Lseclexical
Semantic grammars (up to now in this tutorial):
a grammar defines a system of meanings (abstract syntax) and
tells how they are expressed (concrete syntax).
Resource grammars (as usual in linguistic tradition):
a grammar specifies the **grammatically correct combinations of words**,
whatever their meanings are.
With resource grammars, we can achieve a
wider coverage than with semantic grammars.
#NEW
===Lexical vs. phrasal rules===
A resource grammar has two kinds of categories and two kinds of rules:
- lexical:
- lexical categories, to classify words
- lexical rules, to define words and their properties
- phrasal (combinatorial, syntactic):
- phrasal categories, to classify phrases of arbitrary size
- phrasal rules, to combine phrases into larger phrases
GF makes no formal distinction between these two kinds.
But it is a good discipline to follow.
#NEW
===Lexical categories===
Two kinds of lexical categories:
- **closed**:
- a finite number of words
- seldom extended in the history of language
- structural words / function words, e.g.
```
Conj ; -- conjunction e.g. "and"
QuantSg ; -- singular quantifier e.g. "this"
QuantPl ; -- plural quantifier e.g. "this"
```
- **open**:
- new words are added all the time
- content words, e.g.
```
N ; -- noun e.g. "pizza"
A ; -- adjective e.g. "good"
V ; -- verb e.g. "sleep"
```
#NEW
===Lexical rules===
Closed classes: module ``Syntax``. In the ``Foods`` grammar, we need
```
this_QuantSg, that_QuantSg : QuantSg ;
these_QuantPl, those_QuantPl : QuantPl ;
very_AdA : AdA ;
```
Naming convention: word followed by the category (so we can
distinguish the quantifier //that// from the conjunction //that//).
Open classes have no objects in ``Syntax``. Words are
built as they are needed in applications: if we have
```
fun Wine : Kind ;
```
we will define
```
lin Wine = mkN "wine" ;
```
where we use ``mkN`` from ``ParadigmsEng``.
#NEW
===Resource lexicon===
Alternative concrete syntax for
```
fun Wine : Kind ;
```
is to provide a **resource lexicon**, which contains definitions such as
```
oper wine_N : N = mkN "wine" ;
```
so that we can write
```
lin Wine = wine_N ;
```
Advantages:
- we accumulate a reusable lexicon
- we can use a #Rsecfunctor to speed up multilingual grammar implementation
#NEW
===Phrasal categories===
In ``Foods``, we need just four phrasal categories:
```
Cl ; -- clause e.g. "this pizza is good"
NP ; -- noun phrase e.g. "this pizza"
CN ; -- common noun e.g. "warm pizza"
AP ; -- adjectival phrase e.g. "very warm"
```
Clauses are similar to sentences (``S``), but without a
fixed tense and mood; see #Rsecextended for how they relate.
Common nouns are made into noun phrases by adding determiners.
#NEW
===Syntactic combinations===
We need the following combinations:
```
mkCl : NP -> AP -> Cl ; -- e.g. "this pizza is very warm"
mkNP : QuantSg -> CN -> NP ; -- e.g. "this pizza"
mkNP : QuantPl -> CN -> NP ; -- e.g. "these pizzas"
mkCN : AP -> CN -> CN ; -- e.g. "warm pizza"
mkAP : AdA -> AP -> AP ; -- e.g. "very warm"
```
We also need **lexical insertion**, to form phrases from single words:
```
mkCN : N -> CN ;
mkAP : A -> AP ;
```
Naming convention: to construct a //C//, use a function ``mk``//C//.
Heavy overloading: the current library
(version 1.2) has 23 operations named ``mkNP``!
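Overloading is resolved from the types of the arguments. A sketch, using names introduced above:
```
-- the compiler selects the right mkNP instance from the argument types
mkNP this_QuantSg (mkCN (mkN "pizza"))    -- "this pizza"
mkNP these_QuantPl (mkCN (mkN "pizza"))   -- "these pizzas"
```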
#NEW
===Example syntactic combination===
The sentence
#BEQU
//these very warm pizzas are Italian//
#ENQU
can be built as follows:
```
mkCl
(mkNP these_QuantPl
(mkCN (mkAP very_AdA (mkAP warm_A)) (mkCN pizza_CN)))
(mkAP italian_AP)
```
The task now: to define the concrete syntax of ``Foods`` so that
this syntactic tree gives the value of linearizing the semantic tree
```
Is (These (QKind (Very Warm) Pizza)) Italian
```
#NEW
==The resource API==
The library has language-specific and language-independent parts. Roughly,
- the syntax API ``Syntax``//L// has the same types and
functions for all languages //L//
- the morphology API ``Paradigms``//L// has partly
different types and functions
for different languages //L//
Full API documentation on-line: the **resource synopsis**,
[``digitalgrammars.com/gf/lib/resource/doc/synopsis.html`` http://digitalgrammars.com/gf/lib/resource/doc/synopsis.html]
#NEW
===A miniature resource API: categories===
|| Category | Explanation | Example ||
| ``Cl`` | clause (sentence), with all tenses | //she looks at this// |
| ``AP`` | adjectival phrase | //very warm// |
| ``CN`` | common noun (without determiner) | //red house// |
| ``NP`` | noun phrase (subject or object) | //the red house// |
| ``AdA`` | adjective-modifying adverb | //very// |
| ``QuantSg`` | singular quantifier | //this// |
| ``QuantPl`` | plural quantifier | //these// |
| ``A`` | one-place adjective | //warm// |
| ``N`` | common noun | //house// |
#NEW
===A miniature resource API: rules===
|| Function | Type | Example ||
| ``mkCl`` | ``NP -> AP -> Cl`` | //John is very old// |
| ``mkNP`` | ``QuantSg -> CN -> NP`` | //this old man// |
| ``mkNP`` | ``QuantPl -> CN -> NP`` | //these old men// |
| ``mkCN`` | ``N -> CN`` | //house// |
| ``mkCN`` | ``AP -> CN -> CN`` | //very big blue house// |
| ``mkAP`` | ``A -> AP`` | //old// |
| ``mkAP`` | ``AdA -> AP -> AP`` | //very very old// |
#NEW
===A miniature resource API: structural words===
|| Function | Type | In English ||
| ``this_QuantSg`` | ``QuantSg`` | //this// |
| ``that_QuantSg`` | ``QuantSg`` | //that// |
| ``these_QuantPl`` | ``QuantPl`` | //these// |
| ``those_QuantPl`` | ``QuantPl`` | //those// |
| ``very_AdA`` | ``AdA`` | //very// |
#NEW
===A miniature resource API: paradigms===
From ``ParadigmsEng``:
|| Function | Type ||
| ``mkN`` | ``(dog : Str) -> N`` |
| ``mkN`` | ``(man,men : Str) -> N`` |
| ``mkA`` | ``(cold : Str) -> A`` |
From ``ParadigmsIta``:
|| Function | Type ||
| ``mkN`` | ``(vino : Str) -> N`` |
| ``mkA`` | ``(caro : Str) -> A`` |
#NEW
===A miniature resource API: more paradigms===
From ``ParadigmsGer``:
|| Function | Type ||
| ``Gender`` | ``Type`` |
| ``masculine`` | ``Gender`` |
| ``feminine`` | ``Gender`` |
| ``neuter`` | ``Gender`` |
| ``mkN`` | ``(Stufe : Str) -> N`` |
| ``mkN`` | ``(Bild,Bilder : Str) -> Gender -> N`` |
| ``mkA`` | ``(klein : Str) -> A`` |
| ``mkA`` | ``(gut,besser,beste : Str) -> A`` |
From ``ParadigmsFin``:
|| Function | Type ||
| ``mkN`` | ``(talo : Str) -> N`` |
| ``mkA`` | ``(hieno : Str) -> A`` |
#NEW
===Exercises===
1. Try out the morphological paradigms in different languages. Do
as follows:
```
> i -path=alltenses -retain alltenses/ParadigmsGer.gfo
> cc -table mkN "Farbe"
> cc -table mkA "gut" "besser" "beste"
```
#NEW
==Example: English==
#Lsecenglish
We assume the abstract syntax ``Foods`` from #Rchapfour.
We don't need to think about inflection and agreement, but just pick
functions from the resource grammar library.
We need a path with
- the current directory ``.``
- the directory ``../foods``, in which ``Foods.gf`` resides.
- the library directory ``present``, which is relative to the
environment variable ``GF_LIB_PATH``
Thus the beginning of the module is
```
--# -path=.:../foods:present
concrete FoodsEng of Foods = open SyntaxEng,ParadigmsEng in {
```
#NEW
===English example: linearization types and combination rules===
As linearization types, we use clauses for ``Phrase``, noun phrases
for ``Item``, common nouns for ``Kind``, and adjectival phrases for ``Quality``.
```
lincat
Phrase = Cl ;
Item = NP ;
Kind = CN ;
Quality = AP ;
```
Now the combination rules we need almost write themselves automatically:
```
lin
Is item quality = mkCl item quality ;
This kind = mkNP this_QuantSg kind ;
That kind = mkNP that_QuantSg kind ;
These kind = mkNP these_QuantPl kind ;
Those kind = mkNP those_QuantPl kind ;
QKind quality kind = mkCN quality kind ;
Very quality = mkAP very_AdA quality ;
```
#NEW
===English example: lexical rules===
We use resource paradigms and lexical insertion rules.
The two-place noun paradigm is needed only once, for
//fish// - everything else is regular.
```
Wine = mkCN (mkN "wine") ;
Pizza = mkCN (mkN "pizza") ;
Cheese = mkCN (mkN "cheese") ;
Fish = mkCN (mkN "fish" "fish") ;
Fresh = mkAP (mkA "fresh") ;
Warm = mkAP (mkA "warm") ;
Italian = mkAP (mkA "Italian") ;
Expensive = mkAP (mkA "expensive") ;
Delicious = mkAP (mkA "delicious") ;
Boring = mkAP (mkA "boring") ;
}
```
#NEW
===English example: exercises===
1. Compile the grammar ``FoodsEng`` and generate
and parse some sentences.
2. Write a concrete syntax of ``Foods`` for Italian
or some other language included in the resource library. You can
compare the results with the hand-written
grammars presented earlier in this tutorial.
#NEW
==Functor implementation of multilingual grammars==
#Lsecfunctor
===New language by copy and paste===
If you write a concrete syntax of ``Foods`` for some other
language, much of the code will look exactly the same
as for English. This is because
- the ``Syntax`` API is the same for all languages (because
all languages in the resource package do implement the same
syntactic structures)
- languages tend to use the syntactic structures in similar ways
But lexical rules are more language-dependent.
Thus, to port a grammar to a new language, you
+ copy the concrete syntax of a given language
+ change the words (strings and inflection paradigms)
Can we avoid this programming by copy-and-paste?
#NEW
===Functors: functions on the module level===
**Functors** are familiar from the functional programming languages ML and OCaml;
they are also known as **parametrized modules**.
In GF, a functor is a module that ``open``s one or more **interfaces**.
An ``interface`` is a module similar to a ``resource``, but it only
contains the //types// of ``oper``s, not (necessarily) their definitions.
Syntax for functors: add the keyword ``incomplete``. We will use the header
```
incomplete concrete FoodsI of Foods = open Syntax, LexFoods in
```
where
```
interface Syntax -- the resource grammar interface
interface LexFoods -- the domain lexicon interface
```
When we moreover have
```
instance SyntaxEng of Syntax -- the English resource grammar
instance LexFoodsEng of LexFoods -- the English domain lexicon
```
we can write a **functor instantiation**,
```
concrete FoodsGer of Foods = FoodsI with
(Syntax = SyntaxGer),
(LexFoods = LexFoodsGer) ;
```
#NEW
===Code for the Foods functor===
```
--# -path=.:../foods
incomplete concrete FoodsI of Foods = open Syntax, LexFoods in {
lincat
Phrase = Cl ;
Item = NP ;
Kind = CN ;
Quality = AP ;
lin
Is item quality = mkCl item quality ;
This kind = mkNP this_QuantSg kind ;
That kind = mkNP that_QuantSg kind ;
These kind = mkNP these_QuantPl kind ;
Those kind = mkNP those_QuantPl kind ;
QKind quality kind = mkCN quality kind ;
Very quality = mkAP very_AdA quality ;
Wine = mkCN wine_N ;
Pizza = mkCN pizza_N ;
Cheese = mkCN cheese_N ;
Fish = mkCN fish_N ;
Fresh = mkAP fresh_A ;
Warm = mkAP warm_A ;
Italian = mkAP italian_A ;
Expensive = mkAP expensive_A ;
Delicious = mkAP delicious_A ;
Boring = mkAP boring_A ;
}
```
#NEW
===Code for the LexFoods interface===
#Lsecinterface
```
interface LexFoods = open Syntax in {
oper
wine_N : N ;
pizza_N : N ;
cheese_N : N ;
fish_N : N ;
fresh_A : A ;
warm_A : A ;
italian_A : A ;
expensive_A : A ;
delicious_A : A ;
boring_A : A ;
}
```
#NEW
===Code for a German instance of the lexicon===
```
instance LexFoodsGer of LexFoods = open SyntaxGer, ParadigmsGer in {
oper
wine_N = mkN "Wein" ;
pizza_N = mkN "Pizza" "Pizzen" feminine ;
cheese_N = mkN "Käse" "Käsen" masculine ;
fish_N = mkN "Fisch" ;
fresh_A = mkA "frisch" ;
warm_A = mkA "warm" "wärmer" "wärmste" ;
italian_A = mkA "italienisch" ;
expensive_A = mkA "teuer" ;
delicious_A = mkA "köstlich" ;
boring_A = mkA "langweilig" ;
}
```
#NEW
===Code for a German functor instantiation===
```
--# -path=.:../foods:present
concrete FoodsGer of Foods = FoodsI with
(Syntax = SyntaxGer),
(LexFoods = LexFoodsGer) ;
```
#NEW
===Adding languages to a functor implementation===
Just two modules are needed:
- a domain lexicon instance
- a functor instantiation
The functor instantiation is completely mechanical to write.
The domain lexicon instance requires some knowledge of the words of the
language:
- what words are used for which concepts
- how the words are inflected
- features such as gender
#NEW
===Example: adding Finnish===
Lexicon instance
```
instance LexFoodsFin of LexFoods = open SyntaxFin, ParadigmsFin in {
oper
wine_N = mkN "viini" ;
pizza_N = mkN "pizza" ;
cheese_N = mkN "juusto" ;
fish_N = mkN "kala" ;
fresh_A = mkA "tuore" ;
warm_A = mkA "lämmin" ;
italian_A = mkA "italialainen" ;
expensive_A = mkA "kallis" ;
delicious_A = mkA "herkullinen" ;
boring_A = mkA "tylsä" ;
}
```
Functor instantiation
```
--# -path=.:../foods:present
concrete FoodsFin of Foods = FoodsI with
(Syntax = SyntaxFin),
(LexFoods = LexFoodsFin) ;
```
#NEW
===A design pattern===
This can be seen as a //design pattern// for multilingual grammars:
```
concrete DomainL*
instance LexDomainL instance SyntaxL*
incomplete concrete DomainI
/ | \
interface LexDomain abstract Domain interface Syntax*
```
Modules marked with ``*`` are either given in the library, or trivial.
Of the hand-written modules, only ``LexDomainL`` is language-dependent.
#NEW
===Functors: exercises===
1. Compile and test ``FoodsGer``.
2. Refactor ``FoodsEng`` into a functor instantiation.
3. Instantiate the functor ``FoodsI`` to some language of
your choice.
4. Design a small grammar that can be used for controlling
an MP3 player. The grammar should be able to recognize commands such
as //play this song//, with the following variations:
- verbs: //play//, //remove//
- objects: //song//, //artist//
- determiners: //this//, //the previous//
- verbs without arguments: //stop//, //pause//
The implementation goes in the following phases:
+ abstract syntax
+ (optional:) prototype string-based concrete syntax
+ functor over resource syntax and lexicon interface
+ lexicon instance for the first language
+ functor instantiation for the first language
+ lexicon instance for the second language
+ functor instantiation for the second language
+ ...
#NEW
==Restricted inheritance==
===A problem with functors===
Problem: a functor only works when all languages use the resource ``Syntax``
in the same way.
Example (contrived): assume that English has
no word for ``Pizza``, but has to use the paraphrase //Italian pie//.
This is no longer a noun ``N``, but a complex phrase
in the category ``CN``.
Possible solution: change the interface ``LexFoods`` with
```
oper pizza_CN : CN ;
```
Problem with this solution:
- we may end up changing the interface and the functor with each new language
- we must every time also change the instances for the old languages to maintain
type correctness
#NEW
===Restricted inheritance: include or exclude===
A module may inherit just a selection of names.
Example: the ``Foodmarket`` grammar from #Rsecarchitecture:
```
abstract Foodmarket = Food, Fruit [Peach], Mushroom - [Agaric]
```
Here, from ``Fruit`` we include ``Peach`` only, and from ``Mushroom``
we exclude ``Agaric``.
A concrete syntax of ``Foodmarket`` must make the analogous restrictions.
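As a sketch (assuming concrete modules ``FoodEng``, ``FruitEng``, and ``MushroomEng`` for the constituent grammars), the concrete syntax repeats the same inclusions and exclusions:
```
concrete FoodmarketEng of Foodmarket =
  FoodEng, FruitEng [Peach], MushroomEng - [Agaric] ;
```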
#NEW
===The functor problem solved===
The English instantiation inherits the functor
implementation except for the constant ``Pizza``. This constant
is defined in the body instead:
```
--# -path=.:../foods:present
concrete FoodsEng of Foods = FoodsI - [Pizza] with
(Syntax = SyntaxEng),
(LexFoods = LexFoodsEng) **
open SyntaxEng, ParadigmsEng in {
lin Pizza = mkCN (mkA "Italian") (mkN "pie") ;
}
```
#NEW
==Grammar reuse==
Abstract syntax modules can be used as interfaces,
and concrete syntaxes as their instances.
The following correspondences then apply:
```
cat C <---> oper C : Type
fun f : A <---> oper f : A
lincat C = T <---> oper C : Type = T
lin f = t <---> oper f : A = t
```
#NEW
===Library exercises===
1. Find resource grammar terms for the following
English phrases (in the category ``Phr``). You can first try to
build the terms manually.
//every man loves a woman//
//this grammar speaks more than ten languages//
//which languages aren't in the grammar//
//which languages did you want to speak//
Then translate the phrases to other languages.
#NEW
==Tenses==
#Lsectense
In ``Foods`` grammars, we have used the path
```
--# -path=.:../foods
```
The library subdirectory ``present`` is a restricted version
of the resource, with only the present tense of verbs and sentences.
By just changing the path, we get all tenses:
```
--# -path=.:../foods:alltenses
```
Now we can see all the tenses of phrases, by using the ``-all`` flag
in linearization:
```
> gr | l -all
This wine is delicious
Is this wine delicious
This wine isn't delicious
Isn't this wine delicious
This wine is not delicious
Is this wine not delicious
This wine has been delicious
Has this wine been delicious
This wine hasn't been delicious
Hasn't this wine been delicious
This wine has not been delicious
Has this wine not been delicious
This wine was delicious
Was this wine delicious
This wine wasn't delicious
Wasn't this wine delicious
This wine was not delicious
Was this wine not delicious
This wine had been delicious
Had this wine been delicious
This wine hadn't been delicious
Hadn't this wine been delicious
This wine had not been delicious
Had this wine not been delicious
This wine will be delicious
Will this wine be delicious
This wine won't be delicious
Won't this wine be delicious
This wine will not be delicious
Will this wine not be delicious
This wine will have been delicious
Will this wine have been delicious
This wine won't have been delicious
Won't this wine have been delicious
This wine will not have been delicious
Will this wine not have been delicious
This wine would be delicious
Would this wine be delicious
This wine wouldn't be delicious
Wouldn't this wine be delicious
This wine would not be delicious
Would this wine not be delicious
This wine would have been delicious
Would this wine have been delicious
This wine wouldn't have been delicious
Wouldn't this wine have been delicious
This wine would not have been delicious
Would this wine not have been delicious
```
We also see
- polarity (positive vs. negative)
- word order (direct vs. inverted)
- variation between contracted and full negation
The list is even longer in languages that have more
tenses and moods, e.g. the Romance languages.
#NEW
=Lesson 5: Refining semantics in abstract syntax=
**NOTICE**: The methods described in this lesson are not yet fully supported
in GF 3.0 beta. Use GF 2.9 to get all functionalities.
#Lchapsix
Goals:
- include semantic conditions in grammars, by using
- **dependent types**
- **higher order abstract syntax**
- proof objects
- semantic definitions
These concepts are inherited from **type theory** (more precisely:
constructive type theory, or Martin-Löf type theory).
Type theory is the basis of **logical frameworks**.
GF = logical framework + concrete syntax.
#NEW
==Dependent types==
#Lsecsmarthouse
Problem: to express **conditions of semantic well-formedness**.
Example: a voice command system for a "smart house" wants to
eliminate meaningless commands.
Thus we want to restrict particular actions to
particular devices - we can //dim a light//, but we cannot
//dim a fan//.
The following example is borrowed from the
Regulus Book (Rayner & al. 2006): a "smart house" system, which
defines voice commands for household appliances.
#NEW
===A dependent type system===
Ontology:
- there are commands and device kinds
- for each kind of device, there are devices and actions
- a command concerns an action of some kind on a device of the same kind
Abstract syntax formalizing this:
```
cat
Command ;
Kind ;
Device Kind ; -- argument type Kind
Action Kind ;
fun
CAction : (k : Kind) -> Action k -> Device k -> Command ;
```
``Device`` and ``Action`` are both dependent types.
#NEW
===Examples of devices and actions===
Assume the kinds ``light`` and ``fan``,
```
light, fan : Kind ;
dim : Action light ;
```
Given a kind, //k//, you can form the device //the k//.
```
DKindOne : (k : Kind) -> Device k ; -- the light
```
Now we can form the syntax tree
```
CAction light dim (DKindOne light)
```
but we cannot form the trees
```
CAction light dim (DKindOne fan)
CAction fan dim (DKindOne light)
CAction fan dim (DKindOne fan)
```
#NEW
===Linearization and parsing with dependent types===
Concrete syntax does not know if a category is a dependent type.
```
lincat Action = {s : Str} ;
lin CAction _ act dev = {s = act.s ++ dev.s} ;
```
Notice that the ``Kind`` argument is suppressed in linearization.
Parsing with dependent types is performed in two phases:
+ context-free parsing
+ filtering through type checker
By just doing the first phase, the ``Kind`` argument is not found:
```
> parse "dim the light"
CAction ? dim (DKindOne light)
```
Moreover, type-incorrect commands are not rejected:
```
> parse "dim the fan"
CAction ? dim (DKindOne fan)
```
The term ``?`` is a **metavariable**, returned by the parser
for any subtree that is suppressed by a linearization rule.
These are the same kind of metavariables as were used in #Rsecediting
to mark incomplete parts of trees in the syntax editor.
#NEW
===Solving metavariables===
Use the command ``put_tree = pt`` with the flag ``-transform=solve``:
```
> parse "dim the light" | put_tree -transform=solve
CAction light dim (DKindOne light)
```
The ``solve`` process may fail, in which case no tree is returned:
```
> parse "dim the fan" | put_tree -transform=solve
no tree found
```
#NEW
==Polymorphism==
#Lsecpolymorphic
Sometimes an action can be performed on all kinds of devices.
This is represented as a function that takes a ``Kind`` as an argument
and produces an ``Action`` for that ``Kind``:
```
fun switchOn, switchOff : (k : Kind) -> Action k ;
```
Functions of this kind are called **polymorphic**.
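For instance, the polymorphic action is applied to the same ``Kind`` as the device when building a command tree (a sketch, reusing ``CAction`` and ``DKindOne`` from #Rsecsmarthouse):
```
CAction fan (switchOn fan) (DKindOne fan)         -- e.g. "switch on the fan"
CAction light (switchOff light) (DKindOne light)  -- e.g. "switch off the light"
```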
We can use this kind of polymorphism in concrete syntax as well,
to express Haskell-type library functions:
```
oper const : (a,b : Type) -> a -> b -> a =
  \_,_,c,_ -> c ;
oper flip : (a,b,c : Type) -> (a -> b -> c) -> b -> a -> c =
  \_,_,_,f,x,y -> f y x ;
```
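Such polymorphic opers compose like ordinary lambda terms. As a sketch (``second`` is not a library function), a second-projection combinator can be defined from ``const`` and ``flip``:
```
oper second : (a,b : Type) -> a -> b -> b =
  \a,b -> flip b a b (const b a) ;
-- second a b x y computes to (const b a) y x, i.e. y
```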
#NEW
===Dependent types: exercises===
1. Write an abstract syntax module with above contents
and an appropriate English concrete syntax. Try to parse the commands
//dim the light// and //dim the fan//, with and without ``solve`` filtering.
2. Perform random and exhaustive generation, with and without
``solve`` filtering.
3. Add some device kinds and actions to the grammar.
#NEW
==Proof objects==
**Curry-Howard isomorphism** = **propositions as types principle**:
a proposition is a type of proofs (= proof objects).
Example: define the //less than// proposition for natural numbers,
```
cat Nat ;
fun Zero : Nat ;
fun Succ : Nat -> Nat ;
```
Define inductively what it means for a number //x// to be //less than//
a number //y//:
- ``Zero`` is less than ``Succ`` //y// for any //y//.
- If //x// is less than //y//, then ``Succ`` //x// is less than ``Succ`` //y//.
Expressing these axioms in type theory
with a dependent type ``Less`` //x y// and two functions constructing
its objects:
```
cat Less Nat Nat ;
fun lessZ : (y : Nat) -> Less Zero (Succ y) ;
fun lessS : (x,y : Nat) -> Less x y -> Less (Succ x) (Succ y) ;
```
Example: the fact that 2 is less than 4 has the proof object
```
lessS (Succ Zero) (Succ (Succ (Succ Zero)))
(lessS Zero (Succ (Succ Zero)) (lessZ (Succ Zero)))
: Less (Succ (Succ Zero)) (Succ (Succ (Succ (Succ Zero))))
```
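A smaller instance: that 1 is less than 2 is proved by one application of each rule:
```
lessS Zero (Succ Zero) (lessZ Zero)
  : Less (Succ Zero) (Succ (Succ Zero))
```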
#NEW
===Proof-carrying documents===
Idea: to be semantically well-formed, the abstract syntax of a document
must contain a proof of some property,
although the proof is not shown in the concrete document.
Example: documents describing flight connections:
//To fly from Gothenburg to Prague, first take LH3043 to Frankfurt, then OK0537 to Prague.//
The well-formedness of this text is partly expressible by dependent typing:
```
cat
City ;
Flight City City ;
fun
Gothenburg, Frankfurt, Prague : City ;
LH3043 : Flight Gothenburg Frankfurt ;
OK0537 : Flight Frankfurt Prague ;
```
To extend the conditions to flight connections, we introduce a category
of proofs that a change is possible:
```
cat IsPossible (x,y,z : City)(Flight x y)(Flight y z) ;
```
A legal connection is formed by the function
```
fun Connect : (x,y,z : City) ->
(u : Flight x y) -> (v : Flight y z) ->
IsPossible x y z u v -> Flight x z ;
```
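To build a complete connection tree, we also need a proof object of the ``IsPossible`` type. A hypothetical axiom for this particular change suffices for the sketch:
```
fun frankfurtOK : IsPossible Gothenburg Frankfurt Prague LH3043 OK0537 ;
```
The legal connection then has the tree
```
Connect Gothenburg Frankfurt Prague LH3043 OK0537 frankfurtOK
  : Flight Gothenburg Prague
```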
#NEW
==Restricted polymorphism==
Above, all Actions were either
- **monomorphic**: defined for one Kind
- **polymorphic**: defined for all Kinds
To make this scale up for new Kinds, we can refine this to
**restricted polymorphism**: defined for Kinds of a certain **class**
The notion of class uses the Curry-Howard isomorphism as follows:
- a class is a **predicate** of Kinds --- i.e. a type depending on Kinds
- a Kind is in a class if there is a proof object of this type
#NEW
===Example: classes for switching and dimming===
We modify the smart house grammar:
```
cat
Switchable Kind ;
Dimmable Kind ;
fun
switchable_light : Switchable light ;
switchable_fan : Switchable fan ;
dimmable_light : Dimmable light ;
switchOn : (k : Kind) -> Switchable k -> Action k ;
dim : (k : Kind) -> Dimmable k -> Action k ;
```
Classes for new actions can be added incrementally.
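The well-typed command trees now carry proof objects (a sketch, with ``CAction`` and ``DKindOne`` as in #Rsecsmarthouse):
```
CAction light (dim light dimmable_light) (DKindOne light)
-- "dim the fan" has no tree: there is no object of type Dimmable fan
```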
#NEW
==Variable bindings==
#Lsecbinding
Mathematical notation and programming languages have
expressions that **bind** variables.
Example: universal quantifier formula
```
(All x)B(x)
```
The variable ``x`` has a **binding** ``(All x)``, and
occurs **bound** in the **body** ``B(x)``.
Examples from informal mathematical language:
```
for all x, x is equal to x
the function that for any numbers x and y returns the maximum of x+y
and x*y
Let x be a natural number. Assume that x is even. Then x + 3 is odd.
```
#NEW
===Higher-order abstract syntax===
Abstract syntax can use functions as arguments:
```
cat Ind ; Prop ;
fun All : (Ind -> Prop) -> Prop
```
where ``Ind`` is the type of individuals and ``Prop``,
the type of propositions.
Let us add an equality predicate
```
fun Eq : Ind -> Ind -> Prop
```
Now we can form the tree
```
All (\x -> Eq x x)
```
which we want to relate to the ordinary notation
```
(All x)(x = x)
```
In **higher-order abstract syntax** (HOAS), all variable bindings are
expressed using higher-order syntactic constructors.
#NEW
===Higher-order abstract syntax: linearization===
HOAS has proved to be useful in the semantics and computer implementation of
variable-binding expressions.
How do we relate HOAS to the concrete syntax?
In GF, we write
```
fun All : (Ind -> Prop) -> Prop
lin All B = {s = "(" ++ "All" ++ B.$0 ++ ")" ++ B.s}
```
General rule: if an argument type of a ``fun`` function is
a function type ``A -> C``, the linearization type of
this argument is the linearization type of ``C``
together with a new field ``$0 : Str``.
The argument ``B`` thus has the linearization type
```
{s : Str ; $0 : Str}
```
If there are more bindings, we add ``$1``, ``$2``, etc.
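For example, a hypothetical two-variable quantifier (not part of the grammar above) would bind via ``$0`` and ``$1``:
```
fun All2 : (Ind -> Ind -> Prop) -> Prop ;
lin All2 B = {s = "(" ++ "All" ++ B.$0 ++ B.$1 ++ ")" ++ B.s} ;
```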
#NEW
===Eta expansion===
To make sense of linearization, syntax trees must be
**eta-expanded**: for any function of type
```
A -> B
```
an eta-expanded syntax tree has the form
```
\x -> b
```
where ``b : B`` under the assumption ``x : A``.
Given the linearization rule
```
lin Eq a b = {s = "(" ++ a.s ++ "=" ++ b.s ++ ")"}
```
the linearization of the tree
```
\x -> Eq x x
```
is the record
```
{$0 = "x" ; s = ["( x = x )"]}
```
Then we can compute the linearization of the formula,
```
All (\x -> Eq x x) --> {s = ["( All x ) ( x = x )"]}
```
The linearization of the variable ``x`` is,
"automagically", the string ``"x"``.
#NEW
===Parsing variable bindings===
GF needs to know what strings are parsed as variable symbols.
This is defined in a special lexer,
```
> p -cat=Prop -lexer=codevars "(All x)(x = x)"
All (\x -> Eq x x)
```
More details on lexers #Rseclexing.
#NEW
===Exercises on variable bindings===
1. Write an abstract syntax of the whole
**predicate calculus**, with the
**connectives** "and", "or", "implies", and "not", and the
**quantifiers** "exists" and "for all". Use higher-order functions
to guarantee that unbound variables do not occur.
2. Write a concrete syntax for your favourite
notation of predicate calculus. Use Latex as target language
if you want nice output. You can also try producing boolean
expressions of some programming language. Use as many parentheses as you need to
guarantee non-ambiguity.
#NEW
==Semantic definitions==
#Lsecdefdef
The ``fun`` judgements of GF are declarations of functions, giving their types.
Can we **compute** ``fun`` functions?
Mostly we are not interested, since functions are seen as constructors,
i.e. data forms - as usual with
```
fun Zero : Nat ;
fun Succ : Nat -> Nat ;
```
But it is also possible to give **semantic definitions** to functions.
The key word is ``def``:
```
fun one : Nat ;
def one = Succ Zero ;
fun twice : Nat -> Nat ;
def twice x = plus x x ;
fun plus : Nat -> Nat -> Nat ;
def
plus x Zero = x ;
plus x (Succ y) = Succ (plus x y) ;
```
#NEW
===Computing a tree===
Computation: follow a chain of definitions until no definition
can be applied:
```
plus one one -->
plus (Succ Zero) (Succ Zero) -->
Succ (plus (Succ Zero) Zero) -->
Succ (Succ Zero)
```
Computation in GF is performed with the ``put_term`` command and the
``compute`` transformation, e.g.
```
> parse -tr "1 + 1" | put_term -transform=compute -tr | l
plus one one
Succ (Succ Zero)
s(s(0))
```
#NEW
===Definitional equality===
Two trees are definitionally equal if they compute into the same tree.
Definitional equality does not guarantee sameness of linearization:
```
plus one one ===> 1 + 1
Succ (Succ Zero) ===> s(s(0))
```
The main use of this concept is in type checking: sameness of types.
Thus e.g. the following types are equal
```
Less Zero one
Less Zero (Succ Zero)
```
so that an object of one also is an object of the other.
#NEW
===Judgement forms for constructors===
The judgement form ``data`` tells that a category has
certain functions as constructors:
```
data Nat = Succ | Zero ;
```
The type signatures of constructors are given separately,
```
fun Zero : Nat ;
fun Succ : Nat -> Nat ;
```
There is also a shorthand:
```
data Succ : Nat -> Nat ; === fun Succ : Nat -> Nat ;
data Nat = Succ ;
```
Notice: in ``def`` definitions, identifier patterns not
marked as ``data`` will be treated as variables.
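A sketch of why the ``data`` marking matters in the ``plus`` definition above:
```
def plus x Zero = x ;
-- with Zero marked as data, this pattern matches only the number zero;
-- without the marking, "Zero" would be a variable pattern matching
-- any second argument, so plus x y would compute to x for all y
```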
#NEW
===Exercises on semantic definitions===
1. Implement an interpreter of a small functional programming
language with natural numbers, lists, pairs, lambdas, etc. Use higher-order
abstract syntax with semantic definitions. As concrete syntax, use
your favourite programming language.
2. There is no termination checking for ``def`` definitions.
Construct an example that makes type checking loop.
Type checking can be invoked with ``put_term -transform=solve``.
#NEW
==Lesson 6: Grammars of formal languages==
**NOTICE**: The methods described in this lesson are not yet fully supported
in GF 3.0 beta. Use GF 2.9 to get all functionalities.
#Lchapseven
Goals:
- write grammars for formal languages (mathematical notation, programming languages)
- interface between formal and natural languages
- implement a compiler by using GF
#NEW
===Arithmetic expressions===
We construct a calculator with addition, subtraction, multiplication, and
division of integers.
```
abstract Calculator = {
cat Exp ;
fun
EPlus, EMinus, ETimes, EDiv : Exp -> Exp -> Exp ;
EInt : Int -> Exp ;
}
```
The category ``Int`` is a built-in category of
integers. Its syntax trees are **integer literals**, i.e.
sequences of digits:
```
5457455814608954681 : Int
```
These are the only objects of type ``Int``:
grammars are not allowed to declare functions with ``Int`` as value type.
#NEW
===Concrete syntax: a simple approach===
We begin with a
concrete syntax that always uses parentheses around binary
operator applications:
```
concrete CalculatorP of Calculator = {
lincat
Exp = SS ;
lin
EPlus = infix "+" ;
EMinus = infix "-" ;
ETimes = infix "*" ;
EDiv = infix "/" ;
EInt i = i ;
oper
infix : Str -> SS -> SS -> SS = \f,x,y ->
ss ("(" ++ x.s ++ f ++ y.s ++ ")") ;
}
```
Now we have
```
> linearize EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
( 2 + ( 3 * 4 ) )
```
First problems:
- to get rid of superfluous spaces and
- to recognize integer literals in the parser
#NEW
==Lexing and unlexing==
#Lseclexing
The input of parsing in GF is not just a string, but a list of
**tokens**, returned by a **lexer**.
The default lexer in GF returns chunks separated by spaces:
```
"(12 + (3 * 4))" ===> "(12", "+", "(3", "*", "4))"
```
The proper way would be
```
"(", "12", "+", "(", "3", "*", "4", ")", ")"
```
Moreover, the tokens ``"12"``, ``"3"``, and ``"4"`` should be recognized as
integer literals - they cannot be found in the grammar.
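What such a lexer has to do can be sketched as a small Haskell function (an illustrative sketch, not GF's actual implementation):

```haskell
import Data.Char (isDigit)

-- Illustrative sketch of a code lexer: separate parentheses and
-- operators from integer literals, discarding spaces.
tokenize :: String -> [String]
tokenize []     = []
tokenize (c:cs)
  | c == ' '          = tokenize cs
  | c `elem` "()+-*/" = [c] : tokenize cs
  | isDigit c         = let (ds, rest) = span isDigit (c:cs)
                        in ds : tokenize rest
  | otherwise         = tokenize cs   -- skip anything else in this sketch
```

Thus ``tokenize "(12 + (3 * 4))"`` yields the proper token list shown above.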
We choose a proper lexer with a flag:
```
> parse -cat=Exp -lexer=codelit "(2 + (3 * 4))"
EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
```
We could also put the flag into the grammar (concrete syntax):
```
flags lexer = codelit ;
```
In linearization, we use a corresponding **unlexer**:
```
> l -unlexer=code EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
(2 + (3 * 4))
```
#NEW
===Most common lexers and unlexers===
|| lexer | description ||
| ``words`` | (default) tokens are separated by spaces or newlines
| ``literals`` | like words, but integer and string literals recognized
| ``chars`` | each character is a token
| ``code`` | program code conventions (uses Haskell's lex)
| ``text`` | with conventions on punctuation and capital letters
| ``codelit`` | like code, but recognize literals (unknown words as strings)
| ``textlit`` | like text, but recognize literals (unknown words as strings)
|| unlexer | description ||
| ``unwords`` | (default) space-separated token list
| ``text`` | format as text: punctuation, capitals, paragraph
| ``code`` | format as code (spacing, indentation)
| ``textlit`` | like text, but remove string literal quotes
| ``codelit`` | like code, but remove string literal quotes
| ``concat`` | remove all spaces
#NEW
==Precedence and fixity==
Arithmetic expressions should be unambiguous. If we write
```
2 + 3 * 4
```
it should be parsed as one, but not both, of
```
EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
ETimes (EPlus (EInt 2) (EInt 3)) (EInt 4)
```
We choose the former tree, because
multiplication has **higher precedence** than addition.
To express the latter tree, we have to use parentheses:
```
(2 + 3) * 4
```
The usual precedence rules:
- Integer constants and expressions in parentheses have the highest precedence.
- Multiplication and division have equal precedence, lower than the highest
but higher than addition and subtraction, which are again equal.
- All the four binary operations are **left-associative**:
``1 + 2 + 3`` means the same as ``(1 + 2) + 3``.
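Before encoding these rules in GF, they can be prototyped as an ordinary recursive printer. The following Haskell sketch (with illustrative names) prints an expression with the minimal number of parentheses, passing down the precedence expected by the context:

```haskell
-- Precedence-aware printing: + and - at level 0, * and / at
-- level 1, literals at the top. An illustrative sketch.
data Exp = EInt Int | EPlus Exp Exp | EMinus Exp Exp
         | ETimes Exp Exp | EDiv Exp Exp

-- print an expression in a context expecting at least precedence p;
-- the right operand is printed at the next level, giving left associativity
pr :: Int -> Exp -> String
pr _ (EInt n)     = show n
pr p (EPlus  x y) = paren (p > 0) (pr 0 x ++ " + " ++ pr 1 y)
pr p (EMinus x y) = paren (p > 0) (pr 0 x ++ " - " ++ pr 1 y)
pr p (ETimes x y) = paren (p > 1) (pr 1 x ++ " * " ++ pr 2 y)
pr p (EDiv   x y) = paren (p > 1) (pr 1 x ++ " / " ++ pr 2 y)

paren :: Bool -> String -> String
paren True  s = "(" ++ s ++ ")"
paren False s = s
```

For instance, ``pr 0`` prints the first tree above as ``2 + 3 * 4`` and the second as ``(2 + 3) * 4``.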
#NEW
===Precedence as a parameter===
Precedence can be made into an inherent feature of expressions:
```
oper
Prec : PType = Ints 2 ;
TermPrec : Type = {s : Str ; p : Prec} ;
mkPrec : Prec -> Str -> TermPrec = \p,s -> {s = s ; p = p} ;
lincat
Exp = TermPrec ;
```
Notice ``Ints 2``: a parameter type, whose values are the integers
``0,1,2``.
Using precedence levels: compare the inherent precedence of an
expression with the expected precedence.
- if the inherent precedence is lower than the expected precedence,
use parentheses
- otherwise, no parentheses are needed
This idea is encoded in the operation
```
oper usePrec : TermPrec -> Prec -> Str = \x,p ->
case lessPrec x.p p of {
True => "(" ++ x.s ++ ")" ;
False => x.s
} ;
```
(We use ``lessPrec`` from ``lib/prelude/Formal``.)
#NEW
===Fixities===
We can define left-associative infix expressions:
```
infixl : Prec -> Str -> (_,_ : TermPrec) -> TermPrec = \p,f,x,y ->
mkPrec p (usePrec x p ++ f ++ usePrec y (nextPrec p)) ;
```
Constant-like expressions (the highest level):
```
constant : Str -> TermPrec = mkPrec 2 ;
```
All these operations can be found in ``lib/prelude/Formal``,
which has 5 levels.
Now we can write the whole concrete syntax of ``Calculator`` compactly:
```
concrete CalculatorC of Calculator = open Formal, Prelude in {
flags lexer = codelit ; unlexer = code ; startcat = Exp ;
lincat Exp = TermPrec ;
lin
EPlus = infixl 0 "+" ;
EMinus = infixl 0 "-" ;
ETimes = infixl 1 "*" ;
EDiv = infixl 1 "/" ;
EInt i = constant i.s ;
}
```
#NEW
===Exercises on precedence===
1. Define non-associative and right-associative infix operations
analogous to ``infixl``.
2. Add a constructor that puts parentheses around expressions
to raise their precedence, but that is eliminated by a ``def`` definition.
Test parsing with and without a pipe to ``pt -transform=compute``.
#NEW
==Code generation as linearization==
Translate arithmetic (infix) to JVM (postfix):
```
2 + 3 * 4
===>
iconst 2 ; iconst 3 ; iconst 4 ; imul ; iadd
```
Just give linearization rules for JVM:
```
lin
EPlus = postfix "iadd" ;
EMinus = postfix "isub" ;
ETimes = postfix "imul" ;
EDiv = postfix "idiv" ;
EInt i = ss ("iconst" ++ i.s) ;
oper
postfix : Str -> SS -> SS -> SS = \op,x,y ->
ss (x.s ++ ";" ++ y.s ++ ";" ++ op) ;
```
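The same translation can be prototyped as a recursive function over the abstract syntax. This Haskell sketch (illustrative, not generated by GF) also shows why postfix code needs no parentheses: the operand code always comes first, followed by the instruction.

```haskell
-- Postfix (JVM-style) code generation as a recursive function
-- over the abstract syntax: an illustrative sketch.
data Exp = EInt Int | EPlus Exp Exp | EMinus Exp Exp
         | ETimes Exp Exp | EDiv Exp Exp

compile :: Exp -> [String]
compile (EInt n)     = ["iconst " ++ show n]
compile (EPlus  x y) = compile x ++ compile y ++ ["iadd"]
compile (EMinus x y) = compile x ++ compile y ++ ["isub"]
compile (ETimes x y) = compile x ++ compile y ++ ["imul"]
compile (EDiv   x y) = compile x ++ compile y ++ ["idiv"]
```

Applied to the tree for ``2 + 3 * 4``, it produces the instruction sequence shown above.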
#NEW
===Programs with variables===
A **straight code** programming language, with
**initializations** and **assignments**:
```
int x = 2 + 3 ;
int y = x + 1 ;
x = x + 9 * y ;
```
We define programs by the following constructors:
```
fun
PEmpty : Prog ;
PInit : Exp -> (Var -> Prog) -> Prog ;
PAss : Var -> Exp -> Prog -> Prog ;
```
``PInit`` uses higher-order abstract syntax for making the
initialized variable available in the **continuation** of the program.
The abstract syntax tree for the above code is
```
PInit (EPlus (EInt 2) (EInt 3)) (\x ->
PInit (EPlus (EVar x) (EInt 1)) (\y ->
PAss x (EPlus (EVar x) (ETimes (EInt 9) (EVar y)))
PEmpty))
```
No uninitialized variables are allowed - there are no constructors for ``Var``!
But we do have the rule
```
fun EVar : Var -> Exp ;
```
The rest of the grammar is just the same as for arithmetic expressions
#Rsecprecedence. The best way to implement it is perhaps by writing a
module that extends the expression module. The most natural start category
of the extension is ``Prog``.
#NEW
===Exercises on code generation===
1. Define a C-like concrete syntax of the straight-code language.
2. Extend the straight-code language to expressions of type ``float``.
To guarantee type safety, you can define a category ``Typ`` of types, and
make ``Exp`` and ``Var`` dependent on ``Typ``. Basic floating point expressions
can be formed from literals of the built-in GF type ``Float``. The arithmetic
operations should be made polymorphic (as #Rsecpolymorphic).
3. Extend JVM generation to the straight-code language, using
two more instructions
- ``iload`` //x//, which loads the value of the variable //x//
- ``istore`` //x//, which stores a value to the variable //x//
Thus the code for the example in the previous section is
```
iconst 2 ; iconst 3 ; iadd ; istore x ;
iload x ; iconst 1 ; iadd ; istore y ;
iload x ; iconst 9 ; iload y ; imul ; iadd ; istore x ;
```
4. If you made the exercise of adding floating point numbers to
the language, you can now cash out the main advantage of type checking
for code generation: selecting type-correct JVM instructions. The floating
point instructions are precisely the same as the integer ones, except that
the prefix is ``f`` instead of ``i``, and that ``fconst`` takes floating
point literals as arguments.
#NEW
=Lesson 7: Embedded grammars=
#Lchapeight
Goals:
- use GF grammars as parts of programs written in other programming languages, such as Haskell and Java
- implement stand-alone question-answering systems and translators based on
GF grammars
- generate language models for speech recognition from grammars
#NEW
==Functionalities of an embedded grammar format==
GF grammars can be used as parts of programs written in other programming
languages, such as Haskell and Java.
This facility is based on several components:
- a portable format for multilingual GF grammars
- an interpreter for this format written in the host language
- an API that enables reading grammar files and calling the interpreter
- a way to manipulate abstract syntax trees in the host language
#NEW
==The portable grammar format==
The portable format is called PGF, "Portable Grammar Format".
A PGF file can be produced in GF by the command
```
> print_grammar | write_file FILE.pgf
```
There is also a batch compiler, executable from the operating system shell:
```
% gfc --make SOURCE.gf
```
//This applies to GF version 3 and upwards. Older GF used a format suffixed//
``.gfcm``.
//At the moment of writing, also the Java interpreter still uses the GFCM format.//
PGF is the recommended format in
which final grammar products are distributed, because PGF files
are stripped of superfluous information and can be loaded and applied
faster than sets of separate modules.
Application programmers never need to read or modify PGF files.
PGF thus plays the same role as machine code in
general-purpose programming (or bytecode in Java).
#NEW
===Haskell: the PGF module===
The Haskell API contains (among other things) the following types and functions:
```
readPGF :: FilePath -> IO PGF
linearize :: PGF -> Language -> Tree -> String
parse :: PGF -> Language -> Category -> String -> [Tree]
linearizeAll :: PGF -> Tree -> [String]
linearizeAllLang :: PGF -> Tree -> [(Language,String)]
parseAll :: PGF -> Category -> String -> [[Tree]]
parseAllLang :: PGF -> Category -> String -> [(Language,[Tree])]
languages :: PGF -> [Language]
categories :: PGF -> [Category]
startCat :: PGF -> Category
```
This is the only module that needs to be imported in the Haskell application.
It is available as a part of the GF distribution, in the file
``src/PGF.hs``.
#NEW
===First application: a translator===
Let us first build a stand-alone translator that works with any
multilingual grammar and translates between any of its languages.
```
module Main where
import PGF
import System (getArgs)
main :: IO ()
main = do
file:_ <- getArgs
gr <- readPGF file
interact (translate gr)
translate :: PGF -> String -> String
translate gr s = case parseAllLang gr (startCat gr) s of
(lg,t:_):_ -> unlines [linearize gr l t | l <- languages gr, l /= lg]
_ -> "NO PARSE"
```
To run the translator, first compile it by
```
% ghc --make -o trans Translator.hs
```
For this, you need the Haskell compiler [GHC http://www.haskell.org/ghc].
#NEW
===Producing PGF for the translator===
Then produce a PGF file. For instance, the ``Food`` grammar set can be
compiled as follows:
```
% gfc --make FoodEng.gf FoodIta.gf
```
This produces the file ``Food.pgf`` (its name comes from the abstract syntax).
The Haskell library function ``interact`` makes the ``trans`` program work
like a Unix filter, which reads from standard input and writes to standard
output. Therefore it can be a part of a pipe and read and write files.
The simplest way to translate is to ``echo`` input to the program:
```
% echo "this wine is delicious" | ./trans Food.pgf
questo vino è delizioso
```
The result is given in all languages except the input language.
#NEW
===A translator loop===
To avoid starting the translator over and over again:
change ``interact`` in the main function to ``loop``, defined as
follows:
```
loop :: (String -> String) -> IO ()
loop trans = do
s <- getLine
if s == "quit" then putStrLn "bye" else do
putStrLn $ trans s
loop trans
```
The loop keeps on translating line by line until the input line
is ``quit``.
#NEW
===A question-answer system===
#Lsecmathprogram
The next application is also a translator, but it adds a
**transfer** component - a function that transforms syntax trees.
The transfer function we use is one that computes a question into an answer.
The program accepts simple questions about arithmetic and answers
"yes" or "no" in the language in which the question was made:
```
Is 123 prime?
No.
77 est impair ?
Oui.
```
We change the pure translator by giving
the ``translate`` function the transfer as an extra argument:
```
translate :: (Tree -> Tree) -> PGF -> String -> String
```
Ordinary translation is then the special case where the
transfer is the identity function (``id`` in Haskell).
To reply in the //same// language as the question:
```
translate tr gr s = case parseAllLang gr (startCat gr) s of
(lg,t:_):_ -> linearize gr lg (tr t)
_ -> "NO PARSE"
```
#NEW
===Exporting GF datatypes to Haskell===
To make it easy to define a transfer function, we export the
abstract syntax to a system of Haskell datatypes:
```
% gfc --output-format=haskell Food.pgf
```
It is also possible to produce the Haskell file together with PGF, by
```
% gfc --make --output-format=haskell FoodEng.gf FoodIta.gf
```
The result is a file named ``Food.hs``, containing a
module named ``Food``.
#NEW
===Example of exporting GF datatypes===
Input: abstract syntax judgements
```
cat
Answer ; Question ; Object ;
fun
Even : Object -> Question ;
Odd : Object -> Question ;
Prime : Object -> Question ;
Number : Int -> Object ;
Yes : Answer ;
No : Answer ;
```
Output: Haskell definitions
```
newtype GInt = GInt Integer
data GAnswer =
GYes
| GNo
data GObject = GNumber GInt
data GQuestion =
GPrime GObject
| GOdd GObject
| GEven GObject
```
All type and constructor names are prefixed with a ``G`` to prevent clashes.
The Haskell module name is the same as the abstract syntax name.
#NEW
===The question-answer function===
Haskell's type checker guarantees that the functions are well-typed also with
respect to GF.
```
answer :: GQuestion -> GAnswer
answer p = case p of
GOdd x -> test odd x
GEven x -> test even x
GPrime x -> test prime x
value :: GObject -> Int
value e = case e of
GNumber (GInt i) -> fromInteger i
test :: (Int -> Bool) -> GObject -> GAnswer
test f x = if f (value x) then GYes else GNo
```
#NEW
===Converting between Haskell and GF trees===
The generated Haskell module also contains
```
class Gf a where
gf :: a -> Tree
fg :: Tree -> a
instance Gf GQuestion where
gf (GEven x1) = DTr [] (AC (CId "Even")) [gf x1]
gf (GOdd x1) = DTr [] (AC (CId "Odd")) [gf x1]
gf (GPrime x1) = DTr [] (AC (CId "Prime")) [gf x1]
fg t =
case t of
DTr [] (AC (CId "Even")) [x1] -> GEven (fg x1)
DTr [] (AC (CId "Odd")) [x1] -> GOdd (fg x1)
DTr [] (AC (CId "Prime")) [x1] -> GPrime (fg x1)
_ -> error ("no Question " ++ show t)
```
For the programmer, it is enough to know:
- all GF names are in Haskell prefixed with ``G``
- ``gf`` translates from Haskell to GF
- ``fg`` translates from GF to Haskell
#NEW
===Putting it all together: the transfer definition===
```
module TransferDef where
import PGF (Tree)
import Math -- generated from GF
transfer :: Tree -> Tree
transfer = gf . answer . fg
answer :: GQuestion -> GAnswer
answer p = case p of
GOdd x -> test odd x
GEven x -> test even x
GPrime x -> test prime x
value :: GObject -> Int
value e = case e of
GNumber (GInt i) -> fromInteger i
test :: (Int -> Bool) -> GObject -> GAnswer
test f x = if f (value x) then GYes else GNo
prime :: Int -> Bool
prime x = elem x primes where
primes = sieve [2 .. x]
sieve (p:xs) = p : sieve [ n | n <- xs, n `mod` p > 0 ]
sieve [] = []
```
#NEW
===Putting it all together: the Main module===
Here is the complete code in the Haskell file ``TransferLoop.hs``.
```
module Main where
import PGF
import TransferDef (transfer)
main :: IO ()
main = do
gr <- readPGF "Math.pgf"
loop (translate transfer gr)
loop :: (String -> String) -> IO ()
loop trans = do
s <- getLine
if s == "quit" then putStrLn "bye" else do
putStrLn $ trans s
loop trans
translate :: (Tree -> Tree) -> PGF -> String -> String
translate tr gr s = case parseAllLang gr (startCat gr) s of
(lg,t:_):_ -> linearize gr lg (tr t)
_ -> "NO PARSE"
```
#NEW
===Putting it all together: the Makefile===
To automate the production of the system, we write a ``Makefile`` as follows:
```
all:
gfc --make --output-format=haskell MathEng.gf MathFre.gf
ghc --make -o ./math TransferLoop.hs
strip math
```
(The whitespace at the beginning of the command lines in a Makefile must consist of tabs.)
Now we can compile the whole system by just typing
```
make
```
Then you can run it by typing
```
./math
```
Just to summarize, the source of the application consists of the following files:
```
Makefile -- a makefile
Math.gf -- abstract syntax
Math???.gf -- concrete syntaxes
TransferDef.hs -- definition of question-to-answer function
TransferLoop.hs -- Haskell Main module
```
#NEW
===Translets: embedded translators in Java===
**NOTICE**. Only for GF 2.9 and older at the moment.
A Java system needs many more files than a Haskell system.
To get started, fetch the package ``gfc2java`` from
[``www.cs.chalmers.se/~bringert/darcs/gfc2java/`` http://www.cs.chalmers.se/~bringert/darcs/gfc2java/]
by using the Darcs version control system as described in this page.
The ``gfc2java`` package contains a script ``build_translet``, which
can be applied
to any ``.gfcm`` file to create a **translet**, a small translation GUI.
For the ``Food``
grammars of #Rchapthree, we first create a file ``food.gfcm`` by
```
% echo "pm | wf food.gfcm" | gf FoodEng.gf FoodIta.gf
```
and then run
```
% build_translet food.gfcm
```
The resulting file ``translate-food.jar`` can be run with
```
% java -jar translate-food.jar
```
The translet looks like this:
[food-translet.png]
#NEW
===Dialogue systems in Java===
**NOTICE**. Only for GF 2.9 and older at the moment.
A question-answer system is a special case of a **dialogue system**,
where the user and
the computer communicate by writing or, even more properly, by speech.
The ``gf-java``
homepage provides an example of the simplest dialogue system imaginable,
where the conversation has just two rules:
- if the user says //here you go//, the system says //thanks//
- if the user says //thanks//, the system says //you are welcome//
The conversation can be made in both English and Swedish; the user's initiative
decides which language the system replies in. Thus the structure is very similar
to the ``math`` program #Rsecmathprogram.
The GF and Java sources of the program can be
found in
[``www.cs.chalmers.se/~bringert/darcs/simpledemo`` http://www.cs.chalmers.se/~bringert/darcs/simpledemo],
again accessible with the Darcs version control system.
#NEW
==Language models for speech recognition==
The standard way of using GF in speech recognition is by building
**grammar-based language models**.
GF supports several formats, including
GSL, the format used in the [Nuance speech recognizer http://www.nuance.com].
GSL is produced from GF by running ``gfc`` with the flag
``--output-format=gsl``.
Example: GSL generated from ``FoodsEng.gf``.
```
% gfc --make --output-format=gsl FoodsEng.gf
% more FoodsEng.gsl
;GSL2.0
; Nuance speech recognition grammar for FoodsEng
; Generated by GF
.MAIN Phrase_cat
Item_1 [("that" Kind_1) ("this" Kind_1)]
Item_2 [("these" Kind_2) ("those" Kind_2)]
Item_cat [Item_1 Item_2]
Kind_1 ["cheese" "fish" "pizza" (Quality_1 Kind_1)
"wine"]
Kind_2 ["cheeses" "fish" "pizzas"
(Quality_1 Kind_2) "wines"]
Kind_cat [Kind_1 Kind_2]
Phrase_1 [(Item_1 "is" Quality_1)
(Item_2 "are" Quality_1)]
Phrase_cat Phrase_1
Quality_1 ["boring" "delicious" "expensive"
"fresh" "italian" ("very" Quality_1) "warm"]
Quality_cat Quality_1
```
#NEW
===More speech recognition grammar formats===
Other formats available via the ``--output-format`` flag include:
|| Format | Description ||
| ``gsl`` | Nuance GSL speech recognition grammar
| ``jsgf`` | Java Speech Grammar Format (JSGF)
| ``jsgf_sisr_old`` | JSGF with semantic tags in SISR WD 20030401 format
| ``srgs_abnf`` | SRGS ABNF format
| ``srgs_xml`` | SRGS XML format
| ``srgs_xml_prob`` | SRGS XML format, with weights
| ``slf`` | finite automaton in the HTK SLF format
| ``slf_sub`` | finite automaton with sub-automata in HTK SLF
All currently available formats can be seen with ``gfc --help``.