-- same as t, to help type inference
-```
-Accessing bound variables in ``lin``: use fields ``$1, $2, $3,...``.
-Example:
-```
-fun F : (A : Set) -> (El A -> Prop) -> Prop ;
-lin F A B = {s = ["for all"] ++ A.s ++ B.$1 ++ B.s}
-```
-
-
-==Pattern matching==
-
-These patterns can be used in branches of ``table`` and
-``case`` expressions. Patterns are matched in the order in
-which they appear in the grammar.
-```
-C -- atomic param constructor
-C p q -- param constr. applied to patterns
-x -- variable, matches anything
-_ -- wildcard, matches anything
-"foo" -- string
-56 -- integer
-{s = p ; y = q} -- record, matches extensions too
-<p,q> -- tuple, same as {p1=p ; p2=q}
-p | q -- disjunction, binds to first match
-x@p -- binds x to what p matches
-- p -- negation
-p + "s" -- sequence of two string patterns
-p* -- repetition of a string pattern
-```
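-As a sketch of how such patterns combine, a hypothetical resource module
-(``PluralSketch`` and ``plural`` are illustrative names, not library
-functions) might form English plurals as follows. Since branches are tried
-in order, the catch-all ``_`` comes last:
-```
-resource PluralSketch = {
-  oper
-    plural : Str -> Str = \w -> case w of {
-      _ + ("s" | "x" | "sh" | "ch") => w + "es" ;     -- bus, box, dish, church
-      stem + "y"                    => stem + "ies" ; -- fly; wrong for "boy"
-      _                             => w + "s"        -- default: dog
-    } ;
-}
-```
-The second branch is deliberately naive: it also rewrites vowel + "y"
-endings, which a fuller definition would have to exclude.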
-
-==Sample library functions==
-
-```
--- lib/prelude/Predef.gf
-drop : Int -> Tok -> Tok -- drop prefix of length
-take : Int -> Tok -> Tok -- take prefix of length
-tk : Int -> Tok -> Tok -- drop suffix of length
-dp : Int -> Tok -> Tok -- take suffix of length
-occur : Tok -> Tok -> PBool -- test if substring
-occurs : Tok -> Tok -> PBool -- test if any char occurs
-show : (P:Type) -> P ->Tok -- param to string
-read : (P:Type) -> Tok-> P -- string to param
-toStr : (L:Type) -> L ->Str -- find "first" string
-
--- lib/prelude/Prelude.gf
-param Bool = True | False
-oper
- SS : Type -- the type {s : Str}
- ss : Str -> SS -- construct SS
- cc2 : (_,_ : SS) -> SS -- concat SS's
- optStr : Str -> Str -- string or empty
- strOpt : Str -> Str -- empty or string
- bothWays : Str -> Str -> Str -- X++Y or Y++X
- init : Tok -> Tok -- all but last char
- last : Tok -> Tok -- last char
- prefixSS : Str -> SS -> SS
- postfixSS : Str -> SS -> SS
- infixSS : Str -> SS -> SS -> SS
- if_then_else : (A : Type) -> Bool -> A -> A -> A
- if_then_Str : Bool -> Str -> Str -> Str
-```
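-A minimal sketch of how some of the ``Prelude`` operations might be used,
-assuming the library directory ``prelude`` is reachable through the search
-path; the module name ``PreludeSketch`` and the opers ``greet``, ``pair``
-and ``polite`` are made up for illustration:
-```
---# -path=.:prelude
-resource PreludeSketch = open Prelude in {
-  oper
-    greet  : SS -> SS       = \x -> prefixSS "hello" x ;    -- "hello" before x
-    pair   : SS -> SS -> SS = \x,y -> infixSS "and" x y ;   -- "and" between x and y
-    polite : SS -> SS       = \x -> cc2 (ss "dear") x ;     -- concatenate two SS's
-}
-```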
-
-
-==Flags==
-
-Flags can appear, with growing priority,
-- in files, in ``flags`` judgements, without a dash (``-``)
-- as flags to ``gf`` when invoked, with dash
-- as flags to various GF commands, with dash
-
-
-Some common flags used in grammars:
-```
-startcat=cat use this category as default
-
-lexer=literals int and string literals recognized
-lexer=code like program code
-lexer=text like text: spacing, capitals
-lexer=textlit text, unknowns as string lits
-
-unlexer=code like program code
-unlexer=codelit code, remove string lit quotes
-unlexer=text like text: punctuation, capitals
-unlexer=textlit text, remove string lit quotes
-unlexer=concat remove all spaces
-unlexer=bind remove spaces around "&+"
-
-optimize=all_subs best for almost any concrete
-optimize=values good for lexicon concrete
-optimize=all usually good for resource
-optimize=noexpand for resource, if =all too big
-```
-For the full set of values for ``FLAG``,
-use on-line ``h -FLAG``.
-
-
-
-==File paths==
-
-Colon-separated lists of directories searched in the
-given order:
-```
---# -path=.:../abstract:../common:prelude
-```
-The path can be given (in order of growing priority) as the
-first line of the top file, as a flag to ``gf``
-when invoked, or as a flag to the ``i`` command.
-The prefix ``--#`` is used only in files.
-
-If the environment variable ``GF_LIB_PATH`` is defined, its
-value is automatically prefixed to each directory to
-extend the original search path.
-
-
-==Alternative grammar formats==
-
-**Old GF** (before GF 2.0):
-all judgements in any kinds of modules,
-division into files uses ``include``s.
-A file ``Foo.gf`` is recognized as the old format
-if it lacks a module header.
-
-**Context-free** (file ``foo.cf``). The form of rules is e.g.
-```
-Fun. S ::= NP "is" AP ;
-```
-If ``Fun`` is omitted, it is generated automatically.
-Rules must be one per line. The RHS can be empty.
-
-**Extended BNF** (file ``foo.ebnf``). The form of rules is e.g.
-```
-S ::= (NP+ ("is" | "was") AP | V NP*) ;
-```
-where the RHS is a regular expression of categories
-and quoted tokens: ``"foo", CAT, T U, T|U, T*, T+, T?``, or empty.
-Rule labels are generated automatically.
-
-
-**Probabilistic grammars** (not a separate format).
-You can set the probability of a function ``f`` (in its value category) by
-```
---# prob f 0.009
-```
-These are put into a file given to GF using the ``probs=File`` flag
-on command line. This file can be the grammar file itself.
-
-**Example-based grammars** (file ``foo.gfe``). Expressions of the form
-```
-in Cat "example string"
-```
-are preprocessed by using a parser given by the flag
-```
---# -resource=File
-```
-and the result is written to ``foo.gf``.
-
-
diff --git a/doc/tutorial/gf-tutorial2_9.txt b/doc/tutorial/gf-tutorial2_9.txt
deleted file mode 100644
index 9363e16f3..000000000
--- a/doc/tutorial/gf-tutorial2_9.txt
+++ /dev/null
@@ -1,4316 +0,0 @@
-Grammatical Framework: Tutorial, Advanced Applications, and Reference Manual
-Author: Aarne Ranta aarne (at) cs.chalmers.se
-Last update: %%date(%c)
-
-% NOTE: this is a txt2tags file.
-% Create an html file from this file using:
-% txt2tags --toc gf-tutorial2.txt
-
-%!target:html
-%!encoding: iso-8859-1
-
-%%!postproc(tex): "section\*" "section"
-
-%!postproc(tex): "subsection\*" "section"
-%!postproc(tex): "section\*" "chapter"
-
-%!postproc(html): #BCEN
-%!postproc(html): #ECEN
-
-%!postproc(tex): #BCEN "begin{center}"
-%!postproc(tex): #ECEN "end{center}"
-
-%!preproc(html): #EDITORPNG [../quick-editor.png]
-%!preproc(tex): #EDITORPNG [../../lib/resource-1.0/doc/10lang-small.png]
-
-%!preproc(html): #LOGOPNG [../gf-logo.png]
-%!preproc(tex): #LOGOPNG ""
-
-
-%!postproc(tex): #PARTone "part{Tutorial}"
-%!postproc(tex): #PARTtwo "part{Advanced Applications}"
-%!postproc(tex): #PARTthree "part{Reference Manual}"
-
-
-#LOGOPNG
-
-
-
-%--!
-=Introduction=
-
-==Natural language application programming==
-
-Making computers understand human language is one of the oldest dreams of
-programmers. Projects with machine translations started almost as soon as
-the first computers appeared in the 1940's. This was partly encouraged by the
-success of decryption during the Second World War. Thus some American scientists
-had the vision that Russian can be seen as encrypted English, which can be
-deciphered by similar algorithms as those used for cracking the Germans' Enigma.
-
-Despite substantial efforts on machine translation, the early visions were not
-realized, and the general conclusion reached by the mid-1960's was that
-high-quality broad-coverage machine translation is impossible. Machine
-translation was translated to the less ambitious and more specialized tasks of
-computational linguistics. Parallel to this, fantasies of "speaking robots" and
-other language-understanding machines prevailed, exemplified by such science
-fiction figures as the HAL computer in the film "2001: A Space Odyssey" from
-1968.
-
-What we see in today's market of language understanding machines is a variety of
-products, which focus on different aspects of the task and none of which comes
-even close to HAL or a machine translator with human-like capacities. Here is a
-list of some such applications:
-- browse-quality machine translation: Systran
-- machine translation specialized on weather reports: Meteo
-- electronic dictionaries
-- spelling and grammar checkers
-- dialogue systems for enabling simple speech interaction with a computer
-
-
-A common feature of these applications is that their construction requires
-**linguistic knowledge**: theoretical understanding of languages. As opposed to
-practical understanding, which means the ability to speak, listen, write, and
-read, theoretical understanding means knowledge of the **rules** of language.
-It is by expressing these rules in a programming language that we can hope to
-make a computer understand at least something of a natural language.
-
-This is where GF comes into the picture. GF, Grammatical Framework, is a programming
-language designed for expressing linguistic rules. A set of such rules is called
-a **grammar**. GF is designed in such a way that it is much easier to write
-grammar rules in it than in a general-purpose programming language, such as
-Java or C or Haskell. At the same time, GF is equipped with tools for
-**embedded grammars**. This means that a GF grammar can be used as a component
-of a program written in another language, such as Java or C or Haskell. To build
-a language application usually involves much more than just a grammar, and it is
-important that the grammar can be integrated seamlessly with the rest of the
-application.
-
-Since natural language application programming requires linguistic knowledge, it
-is usually considered to need linguistic training. The mission of GF is to relieve
-some of this need. This is achieved in two ways:
-- GF works in a way familiar to ordinary programmers, namely as a **compiler**
- that analyses a language and generates a result.
-- GF has a set of **resource grammar libraries**, which encapsulate much of
- the linguistic knowledge needed when writing grammars.
-
-
-This said, GF makes no claim to "fire linguists" from natural language programming
-projects. The claim is rather one of the **division of labour**: GF enables the
-division of grammar writing into different **modules**, where some modules
-require linguistic knowledge and others don't. Linguists working on the linguistic
-modules will appreciate the way GF supports abstractions and generalizations, and
-also the grammar development tools that enable testing of linguistic rules.
-Non-linguists working on the application-oriented modules will appreciate the
-possibility to take grammar rules for granted and focus on other aspects of
-the program.
-
-
-
-==The history of GF and its applications==
-
-GF belongs to the tradition of **functional programming languages**, exemplified
-by Lisp and, as later and closer relatives, ML and Haskell. An important branch
-of functional programming is **type theory**, which in turn has its roots in
-logic and the foundations of mathematics. GF was, in the first place, created to
-implement the idea that type theory can provide **semantics**, i.e. formalize
-the meaning of natural languages. Several aspects of type-theoretical semantics
-were covered in the monograph //Type-Theoretical Grammar// (A. Ranta, OUP 1994).
-But a stronger aspect grew out of subsequent experiments dealing with different
-languages: it is possible to have a common semantics for many languages, and
-thereby build systems that translate between languages via the semantics. During
-this period, discussions with Per Martin-Löf (Ranta's PhD supervisor at the
-University of Stockholm) had a major impact on the work, and cooperation
-with Petri Mäenpää at the University of Helsinki led to the first computer
-implementations.
-
-As a stand-alone programming language, GF was first implemented in 1998. This
-took place at Xerox Research Centre Europe in Grenoble, within a project entitled
-//Multilingual Document Authoring//. The leading idea in the project was to
-enable writing documents in multiple languages simultaneously, so that the user
-need only know one of the languages; the rest will be produced automatically
-via translations from the type-theoretical semantics. The Xerox staff involved
-in the project included Marc Dymetman, Lauri Karttunen, Veronika Lux,
-Sylvain Pogodalla, and Annie Zaenen.
-
-The Xerox project produced some prototype applications, e.g. a restaurant phrase
-book and an editor of medical drug descriptions. The grammars that were built
-remained the property of Xerox, but the GF formalism and its implementation
-were released as open-source software under GNU General Public License. The
-principal author of GF got an academic position in 1999, at the Department of
-Computing Science of Chalmers University of Technology and Gothenburg University.
-At Chalmers, both functional programming and type theory flourish, and in this
-environment, GF developed into a more stable and more full-fledged programming
-language. In this process, collaboration with Koen Claessen, Thierry Coquand,
-Thomas Hallgren, Patrik Jansson, and Bengt Nordström made important contributions.
-
-The idea of making GF into "the working programmer's grammar formalism", as
-opposed to a tool requiring linguistic expertise, was confirmed at Chalmers
-in courses given to computer science students and later in joint research
-projects. A nice experience of the courses was that computer scientists are
-often very interested in languages and have firm intuitions on grammar; given
-a suitable programming tool, they can achieve impressive results. GF seemed to
-be close to such a tool, and, in subsequent collaborations at the Department,
-it evolved even more to a programming language with the virtues of familiarity
-and "the least surprise". Issues of stability are also important, including
-backward compatibility, and documentation is something there can hardly be
-too much of. As a mark of stability, version 1.0 of GF was released in
-2002. In 2004, a theoretical reference paper appeared in the Journal
-of Functional Programming, as well as a long tutorial text in the ESSLLI
-lecture notes post-publication.
-
-The first full-scale applications of GF emerged as natural-language interfaces.
-The first one was for the proof editor Alfa, written with Thomas Hallgren.
-The second one was a syntax editor and a natural-language interface to the
-software specification language OCL (Object Constraint Language) built
-within the KeY project. This work was done first with Reiner Hähnle, then
-with the students Kristoffer Johannisson (PhD 2005), Hans-Joachim Daniels,
-and David Burke. On the GF implementation side, Janna Khegai (PhD 2006) built
-a Java-based syntax editor. Peter Ljunglöf (PhD 2004) succeeded in identifying
-the complexity of parsing in GF and found an algorithm that greatly improved
-the use of GF in parsing. He implemented the algorithm with Håkan Burden, and
-it was later still improved by Krasimir Angelov.
-
-At the same time, collaboration with the Linguistics Department of
-Gothenburg University served as a "linguistic sanity check" of GF.
-Robin Cooper, an eminent linguist working at the Department, initiated
-two efforts that have formed the development of GF:
-- resource grammar libraries
-- dialogue system applications
-
-
-It was the resource grammar libraries that made GF really usable for non-linguist
-programmers in more serious projects. They were heavily missed in the Alfa
-project, and heavily used and improved in the KeY project. The development of
-the library started in 2002; a version stable enough to be released with number
-1.0 was complete in 2006, comprising ten languages.
-
-Dialogue systems, on the other hand, turned
-out to be a major source of interesting problems and also of successful solutions.
-Much of this work was carried out in the European project TALK (Tools for Ambient
-Linguistic Knowledge, 2004-2006), by Björn Bringert, Rebecca Jonson, and
-Peter Ljunglöf in Gothenburg, and Oliver Lemon (Edinburgh), Nadine Perera (BMW),
-and Karl Weilhammer (Cambridge) at the other sites. In addition to
-complete systems, this project produced supporting tools for embedded grammars
-and speech recognition, and additions to the resource grammar library.
-
-Besides dialogue systems, multilingual authoring and translation continue
-to be the main applications of GF. The European WebALT project (Web Advanced
-Learning Technologies, 2005-2006) used GF to build a tool for translating
-mathematical exercises from formal specifications (written in MathML) to
-six languages. A tool integrating GF with a computer algebra system was also
-developed. The project gave rise to a company, WebALT Inc. Many members
-of the WebALT staff also contributed to GF and the resource grammar library:
-Lauri Carlson, Glòria Casanellas, Anni Laine, Wanjiku N'gan'ga, and
-Jordi Saludes.
-
-As of the time of writing (August 2007), the release of GF has version
-number 2.8. It is a stable system that has been built with contributions
-of dozens of persons and been used by at least hundreds; download figures
-are in thousands. New ideas of how to apply GF are posted by users almost
-every week. These users are often programmers with good knowledge of
-functional languages, highly developed instinct for programming language
-design, and firm intuitions on natural language. Another group of users
-are those that have been trained in GF on courses.
-
-
-
-==The purpose and scope of this book==
-
-The purpose of this book is to serve the growing user base of GF with
-a manual that gathers all relevant information in one place. However, it
-is also intended to serve those who want to get started with GF, and
-who don't necessarily have the technical background of the typical
-users. We believe that learning to program in GF is not more difficult
-than learning some other programming language; as for the linguistic
-aspects, we believe that writing grammars is an excellent introduction
-to the problems of linguistics, where theory can be learnt at the
-same time as it is motivated by concrete problems.
-
-The book thus starts with a tutorial, which gradually explains all
-the constructs of the GF programming language. Also the design and style
-aspects of grammar engineering are covered, to help the user to scale
-up from small to large and possibly collaborative applications.
-After the tutorial, the book continues with a "cook book" containing
-hints and case studies for advanced users. Moreover, the resource
-grammar library is covered in some detail, which will help the
-programmers who want to port the library to new languages, but also
-motivate linguistically the choices made in the libraries.
-A complete reference manual concludes the book, with a quick reference
-card as an appendix.
-
-What is not covered by the book is theoretical discussions of
-GF, especially in comparison to other grammar formalisms. Even though important
-in the development of GF as a scientifically justified framework, such
-discussions are not relevant for programmers who want to use GF - any more
-than, say, a book on Haskell has to include comparisons with Java. In fact,
-introducing Haskell by references to Java may have some point, since many
-of the readers can already be assumed to know Java. But, even though some
-readers will know DCG or HPSG or LFG, we will not assume this; we will just
-note in passing the relation between GF and context-free grammars, also
-known as BNF grammars in computer science.
-
-
-
-#PARTone
-
-=Getting started=
-
-In this chapter, we will introduce the GF program and write a first GF grammar.
-We show how the grammar is used for the tasks of translation and multilingual
-generation.
-
-
-==What GF is==
-
-We use the term GF for three different things:
-- a **system** (computer program) used for working with grammars
-- a **programming language** in which grammars can be written
-- a **theory** about grammars and languages
-
-
-The relation between these things is obvious: the GF system is an implementation
-of the GF programming language, which in turn is built on the ideas of the
-GF theory. The main focus of this book is on the GF programming language.
-We learn how grammars are written in the language. At the same time, we learn
-the way of thinking in the GF theory. To make this all useful and fun, we
-make the grammars run on a computer by using the GF system.
-
-
-
-%--!
-==What GF grammars are used for==
-
-A grammar is a definition of a language.
-From this definition, different language processing components
-can be derived:
-- **parsing**: to analyse the language
-- **linearization**: to generate the language
-- **translation**: to analyse one language and generate another
-
-
-A GF grammar can be seen as a declarative program from which these
-processing tasks can be automatically derived. In addition, many
-other tasks are readily available for GF grammars:
-- **morphological analysis**: find out the possible inflection forms of words
-- **morphological synthesis**: generate all inflection forms of words
-- **random generation**: generate random expressions
-- **corpus generation**: generate all expressions
-- **treebank generation**: generate a list of trees with multiple linearizations
-- **teaching quizzes**: train morphology and translation
-- **multilingual authoring**: create a document in many languages simultaneously
-- **speech input**: optimize a speech recognition system for your grammar
-
-
-A typical GF application is based on a **multilingual grammar** involving
-translation on a special domain. Existing applications of this idea include
-- [Alfa http://www.cs.chalmers.se/~hallgren/Alfa/Tutorial/GFplugin.html]:
- a natural-language interface to a proof editor
- (languages: English, French, Swedish)
-- [KeY http://www.key-project.org/]:
- a multilingual authoring system for creating software specifications
- (languages: OCL, English, German)
-- [TALK http://www.talk-project.org]:
- multilingual and multimodal dialogue systems
- (languages: English, Finnish, French, German, Italian, Spanish, Swedish)
-- [WebALT http://webalt.math.helsinki.fi/content/index_eng.html]:
- a multilingual translator of mathematical exercises
- (languages: Catalan, English, Finnish, French, Spanish, Swedish)
-- [Numeral translator http://www.cs.chalmers.se/~bringert/gf/translate/]:
- number words from 1 to 999,999
- (88 languages)
-
-
-The specialization of a grammar to a domain makes it possible to
-obtain much better translations than in an unlimited machine translation
-system. This is due to the well-defined semantics of such domains.
-Grammars having this character are called **application grammars**.
-They are different from most grammars written by linguists just
-because they are multilingual and domain-specific.
-
-However, there is another kind of grammars, which we call **resource grammars**.
-These are large, comprehensive grammars that can be used on any domain.
-The GF Resource Grammar Library has resource grammars for 10 languages.
-These grammars can be used as **libraries** to define application grammars.
-In this way, it is possible to write a high-quality grammar without
-knowing about linguistics: in general, to write an application grammar
-by using the resource library just requires practical knowledge of
-the target language, and all theoretical knowledge about its grammar
-is given by the libraries.
-
-
-
-
-%--!
-==Who is the tutorial for==
-
-The tutorial part of this book is mainly for programmers
-who want to learn to write application grammars.
-It will go through GF's programming concepts, and does not
-presuppose knowledge of any of the main ingredients of GF:
-linguistics, functional programming, and type theory.
-Thus it should be accessible to anyone who has some
-previous programming experience from any language; the basics
-of using computers are also presupposed, e.g. the use of
-text editors and the management of files.
-
-Those who already know GF well can skip the tutorial part,
-or skim through it, and go directly to the part on advanced applications.
-These will involve large scale GF programming, such as needed in resource
-grammars, and also the embedding of GF in systems such as
-natural-language user interfaces and dialogue systems.
-
-
-
-%--!
-==The coverage of the tutorial==
-
-The tutorial gives a hands-on introduction to grammar writing.
-We start by building a "Hello World" grammar, which covers greetings
-in three languages (//hello world//, //terve maailma//, //ciao mondo//).
-This **multilingual grammar** is based on the distinction, central in
-GF, between the **abstract syntax**
-(the logical structure) and the **concrete syntax** (the
-sequence of words) of expressions.
-
-From the "Hello World" example, we proceed
-to a larger grammar for the domain of food:
-in this grammar, you can say things like
-```
- this Italian cheese is delicious
-```
-in English and Italian. This grammar illustrates how translation is
-more than just replacement of words. For instance, the order of
-words may have to be changed:
-```
- Italian cheese ===> formaggio italiano
-```
-Moreover, words can have different forms, and which forms
-they have varies from language to language. For instance,
-Italian adjectives usually have four forms where English
-has just one:
-```
- delicious (wine, wines, pizza, pizzas)
- vino delizioso, vini deliziosi, pizza deliziosa, pizze deliziose
-```
-The **morphology** of a language describes the
-forms of its words.
-
-The complete description of morphology
-belongs to resource grammars, whose use will be covered
-by the tutorial. We will nevertheless explain all the
-programming concepts involved in resource grammars.
-The tutorial will in fact build a miniature resource grammar in order
-to give an introduction to linguistically oriented grammar writing.
-
-Of course, we will not presuppose that the reader knows Italian.
-We have chosen Italian as the example language because it has a rich
-morphological structure that illustrates very well the capacities of
-GF. Moreover, even those who don't know Italian will find many of
-its words familiar. The exercises will encourage the reader to
-port the examples to other languages; in fact, many GF
-applications work for 5-10 languages.
-
-Thus it is by elaborating the Food grammar example that
-the tutorial makes a guided tour through most of GF.
-While the constructs of the GF language are the main focus,
-also the commands of the GF system are introduced as they
-are needed.
-
-In addition to multilinguality, **semantics** is an important aspect of GF
-grammars. The concepts needed for "purely linguistic" grammars belong to
-the concrete syntax part of GF, whereas semantics is expressed in the abstract
-syntax. After the presentation of concrete syntax constructs, we proceed
-to the enrichment of abstract syntax with **dependent types**,
-**variable bindings**, and **semantic definitions**.
-
-To learn how to write GF grammars is not the only goal of
-this tutorial. We will also explain the most important
-commands of the GF system. With these commands,
-simple applications of grammars, such as translation and
-quiz systems, can be built simply by writing scripts for the
-system.
-
-More complicated applications, such as natural-language
-interfaces and dialogue systems, moreover require programming in
-some general-purpose language. The part on advanced topics will
-explain how GF grammars are used as components of Haskell and Java programs.
-
-
-%--!
-==Getting the GF program==
-
-The GF program is open-source free software, which you can download via the
-GF Homepage:
-
-[``http://www.cs.chalmers.se/~aarne/GF`` http://www.cs.chalmers.se/~aarne/GF]
-
-There you can download
-- binaries for Linux, Mac OS X, and Windows
-- source code and documentation
-- grammar libraries and examples
-
-
-If you want to compile GF from source, you need a Haskell compiler.
-To compile the interactive editor, you also need a Java compiler.
-But normally you don't have to compile, and you definitely
-don't need to know Haskell or Java to use GF.
-
-We are assuming the availability of a Unix shell. Linux and Mac OS X users
-have it automatically, the latter under the name "terminal".
-Windows users are recommended to install Cygwin, the free Unix shell for Windows.
-
-
-%--!
-==Running the GF program==
-
-To start the GF program, assuming you have installed it, just type
-``gf`` in the Unix (or Cygwin) shell:
-```
- % gf
-```
-You will see GF's welcome message and the prompt ``>``.
-The command
-```
- > help
-```
-will give you a list of available commands.
-
-As a common convention in this Tutorial, we will use
-- ``%`` as a prompt that marks system commands
-- ``>`` as a prompt that marks GF commands
-
-
-Thus you should not type these prompts, but only the characters that
-follow them.
-
-
-==A "Hello World" grammar==
-
-The tradition in programming language tutorials is to start with a
-program that prints "Hello World" on the terminal. GF should be no
-exception. But our program has features that distinguish it from
-most "Hello World" programs:
-- **Multilinguality**: the message is printed in many languages.
-- **Reversibility**: in addition to printing, you can **parse** the
- message and translate it to other languages.
-
-
-===The program: abstract syntax and concrete syntaxes===
-
-A GF program, in general, is a **multilingual grammar**. Its main parts
-are
-- an **abstract syntax**
-- one or more **concrete syntaxes**
-
-
-The abstract syntax defines, in a language-independent way, what **meanings**
-can be expressed in the grammar. In the "Hello World" grammar we want
-to express //Greetings//, where we greet a //Recipient//, which can be
-//World// or //Mum// or //Friends//. Here is the entire
-GF code for the abstract syntax:
-```
- -- a "Hello World" grammar
- abstract Hello = {
-
- flags startcat = Greeting ;
-
- cat Greeting ; Recipient ;
-
- fun
- Hello : Recipient -> Greeting ;
- World, Mum, Friends : Recipient ;
- }
-```
-The code has the following parts:
-- a **comment** (optional), saying what the module is doing
-- a **module header** indicating that it is an abstract syntax
- module named ``Hello``
-- a **module body** in braces, consisting of
- - a **startcat flag declaration** stating that ``Greeting`` is the
- main category, i.e. the one we are most interested in
- - **category declarations** stating that ``Greeting`` and ``Recipient``
- are categories, i.e. types of meanings
- - **function declarations** stating what meaning-building functions there
- are; these are the three possible recipients, as well as the function
- ``Hello`` constructing a greeting from a recipient
-
-
-A concrete syntax defines a mapping from the abstract meanings to their
-expressions in a language. We first give an English concrete syntax:
-```
- concrete HelloEng of Hello = {
-
- lincat Greeting, Recipient = {s : Str} ;
-
- lin
- Hello rec = {s = "hello" ++ rec.s} ;
- World = {s = "world"} ;
- Mum = {s = "mum"} ;
- Friends = {s = "friends"} ;
- }
-```
-The major parts of this code are:
-- a module header indicating that it is a concrete syntax of the abstract syntax
- ``Hello``, itself named ``HelloEng``
-- a module body in braces, consisting of
- - **linearization type definitions** stating that
- ``Greeting`` and ``Recipient`` are **records** with a **string** ``s``
- - **linearization definitions** telling what records are assigned to
- each of the meanings defined in the abstract syntax; the recipients are
- linearized to records containing single words, whereas the ``Hello`` greeting
- has a function telling that the word ``hello`` is prefixed to the argument
-
-
-
-
-To make the grammar truly multilingual, we add a Finnish and an Italian concrete
-syntax:
-```
- concrete HelloFin of Hello = {
- lincat Greeting, Recipient = {s : Str} ;
- lin
- Hello rec = {s = "terve" ++ rec.s} ;
- World = {s = "maailma"} ;
- Mum = {s = "äiti"} ;
- Friends = {s = "ystävät"} ;
- }
-
- concrete HelloIta of Hello = {
- lincat Greeting, Recipient = {s : Str} ;
- lin
- Hello rec = {s = "ciao" ++ rec.s} ;
- World = {s = "mondo"} ;
- Mum = {s = "mamma"} ;
- Friends = {s = "amici"} ;
- }
-```
-Now we have a trilingual grammar usable for translation and
-many other tasks, which we will now look into.
-
-
-
-===Using the grammar in the GF program===
-
-In order to compile the grammar in GF, each of the four modules
-has to be put in a file named //modulename//``.gf``:
-```
- Hello.gf HelloEng.gf HelloFin.gf HelloIta.gf
-```
-The first GF command needed when using a grammar is to **import** it.
-The command has a long name, ``import``, and a short name, ``i``.
-You can type either
-```
- > import HelloEng.gf
-```
-or
-```
- > i HelloEng.gf
-```
-to get the same effect. In general, all GF commands have a long and a short name;
-short names are convenient when typing commands by hand, whereas long commands
-are more readable in scripts, i.e. files with lists of commands.
-
-The effect of ``import`` is that the GF program **compiles** your grammar
-into an internal representation, and shows a new prompt when it is ready.
-It will also show how much CPU time was consumed:
-```
- > i HelloEng.gf
- - compiling Hello.gf... wrote file Hello.gfc 8 msec
- - compiling HelloEng.gf... wrote file HelloEng.gfc 12 msec
-
- 12 msec
-```
-You can now use GF for **parsing**:
-```
- > parse "hello world"
- Hello World
-```
-The ``parse`` (= ``p``) command takes a **string**
-(in double quotes) and returns an **abstract syntax tree** - the meaning
-of the string defined in the abstract syntax.
-A tree is, in general, something easier than a string
-for a machine to understand and to process further, although this
-is not so obvious in this simple grammar.
-
-Strings that return a tree when parsed do so in virtue of the grammar
-you imported. Try parsing something that is not in the grammar, and you will fail:
-```
- > parse "hello dad"
- Unknown words: dad
-
- > parse "world hello"
- no tree found
-```
-In the first example, the failure is caused by an unknown word.
-In the second example, the combination of words is ungrammatical.
-
-In addition to parsing, you can also use GF for **linearizing**
-(``linearize = l``). This is the inverse of
-parsing, taking trees into strings:
-```
- > linearize Hello World
- hello world
-```
-What is the use of this? Typically not that you type in a tree at
-the GF prompt. The utility of linearization comes from the fact that
-you can obtain a tree from somewhere else - for instance, from
-a parser. A prime example of this is **translation**: you parse
-with one concrete syntax and linearize with another. Let us
-now do this by first importing the Italian grammar:
-```
- > import HelloIta.gf
-```
-We can now parse with ``HelloEng`` and **pipe** the result
-into linearizing with ``HelloIta``:
-```
- > parse -lang=HelloEng "hello mum" | linearize -lang=HelloIta
- ciao mamma
-```
-Notice that the commands must use a **language flag** to indicate
-which concrete syntax is used in each of the operations.
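-
-The same pipe works in the opposite direction. As a small sketch, assuming
-both ``HelloEng`` and ``HelloIta`` have been imported as above, an Italian
-greeting can be translated into English by swapping the two language flags:
-```
- > parse -lang=HelloIta "ciao mondo" | linearize -lang=HelloEng
- hello world
-```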
-
-To conclude the translation exercise, we import the Finnish grammar
-and pipe English parsing into **multilingual generation**:
-```
- > parse -lang=HelloEng "hello friends" | linearize -multi
- terve ystävät
- ciao amici
- hello friends
-```
-
-**Exercise**. Test the parsing and translation examples shown above, as well as
-five other examples.
-
-**Exercise**. Extend the grammar ``Hello.gf`` and some of the
-concrete syntaxes by five new recipients and one new greeting
-form.
-
-**Exercise**. Add a concrete syntax for some other
-languages you might know.
-
-
-
-==What else can be done with the grammar==
-
-Now we have built our first multilingual grammar and seen the basic
-functionalities of GF: parsing and linearization. We have tested
-these functionalities inside the GF program. In the forthcoming
-chapters, we will build larger grammars and have more fun with
-these functionalities. But we will also introduce many more:
-- random generation
-- exhaustive generation
-- treebank generation
-- syntax editing
-- morphological analysis
-- translation and morphological quizzes
-- semantic filtering
-
-
-The usefulness of GF would be quite limited if grammars were
-usable only inside the GF program. In the forthcoming chapters,
-we will see many other ways of using grammars:
-- compile them to new formats, such as speech recognition grammars
-- embed them in Java and Haskell programs
-- build applications using compilation and embedding:
- - voice commands
- - spoken language translators
- - dialogue systems
- - user interfaces
- - localization: parametrize the messages printed by a program
- to support different languages
-
-
-All GF functionalities, both those inside the GF program and those
-ported to other environments,
-are of course applicable to the simplest of grammars,
-such as the ``Hello`` grammars presented above. But the main focus
-of this tutorial will be on grammar writing. Thus we will show
-how larger and more expressive grammars can be built by using
-the constructs of the GF programming language, before entering the
-applications in the next part of the book.
-
-
-
-==Summary of GF language features==
-
-A GF grammar consists of **modules**,
-into which judgements are grouped. The most important
-module forms are
-- ``abstract`` A ``=`` M, abstract syntax A with judgements in
- the module body M.
-- ``concrete`` C ``of`` A ``=`` M, concrete syntax C of the
- abstract syntax A, with judgements in the module body M.
-
-
-Each module is written in a file named //Modulename//``.gf``.
-
-Rules in a GF grammar are called **judgements**, and the keywords
-``fun`` and ``lin`` are used for distinguishing between two
-**judgement forms**. Here is a summary of the most important
-judgement forms:
-
- - abstract syntax
-
- | form | reading |
- | ``cat`` C | C is a category
- | ``fun`` f ``:`` A | f is a function of type A
-
- - concrete syntax
-
- | form | reading |
- | ``lincat`` C ``=`` T | category C has linearization type T
- | ``lin`` f ``=`` t | function f has linearization t
-
-
-Both abstract and concrete modules may moreover contain definitions of
-**flags**, of the form
-- ``flags`` //flag//``=``//value//
-
-
-and **comments** of the forms
-- ``--`` //anything till a newline//
-- ``{-`` //anything except hyphen followed by closing brace// ``-}``
-
-
-Shorthands permit the sharing of
-the keyword in subsequent judgements,
-```
- cat Phrase ; Item ; === cat Phrase ; cat Item ;
-```
-and of the right-hand-side in subsequent judgements of the same form
-```
- fun World, Mum, Friends : Recipient ; ===
- fun World : Recipient ; Mum : Recipient ; Friends : Recipient ;
-```
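-The same shorthands apply to concrete syntax judgements. As an illustration,
-the ``lincat`` judgement of ``HelloEng`` can be read as short for two
-judgements that share the right-hand side:
-```
- lincat Greeting, Recipient = {s : Str} ; ===
- lincat Greeting = {s : Str} ; Recipient = {s : Str} ;
-```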
-The order of judgements in a module is free. In particular, an identifier
-need not be declared before it is used.
-
-An **identifier** is a letter followed by a sequence of letters, digits, and
-characters ``'`` or ``_``. Each identifier can only be
-introduced once in the same module.
-
-**Types** in an abstract syntax are either **basic types**,
-i.e. ones introduced in ``cat`` judgements, or
-**function types** of the form
-```
- A1 -> ... -> An -> A
-```
-where each of ``A1, ..., An, A`` is a basic type (this restriction
-will be relaxed later). The last type in the arrow-separated sequence
-is the **value type** of the function type; the earlier types are
-its **argument types**.
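-
-As an illustration, the function ``Hello`` of the ``Hello`` grammar takes a
-recipient and builds a greeting. The declaration below is a sketch inferred
-from the linearization rules of ``HelloEng``, not a quotation of the abstract
-syntax file:
-```
- fun Hello : Recipient -> Greeting ;
- -- argument type: Recipient
- -- value type: Greeting
-```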
-
-In a concrete syntax, the available types include
-- the type of strings, ``Str``
-- record types of form ``{`` r1 : T1 ; ... ; rn : Tn ``}``
-
-
-**Terms** used in linearizations have the forms
-- quoted string: ``"foo"``, of type ``Str``
-- record: ``{`` r1 = t1 ; ... ; rn = tn ``}``,
- of type ``{`` r1 : R1 ; ... ; rn : Rn ``}``
-- projection ``t.r`` with a record label, of the corresponding record
- field type
-- argument variable ``x`` bound by the left-hand-side of a ``lin`` rule,
- of the corresponding linearization type
-
-
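-As a sketch of how these term forms combine, the ``Hello`` rule of
-``HelloEng`` can be read part by part. The comments are annotations added
-here; note that the rule also uses the concatenation operator ``++``, which
-puts two strings one after the other:
-```
- lin Hello rec = -- rec: argument variable of type {s : Str}
-   {s =          -- record with the single field s
-     "hello"     -- quoted string, of type Str
-     ++ rec.s    -- projection of the field s from rec
-   } ;
-```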
-
-
-
-
-=Designing a grammar for complex phrases=
-
-We will now start with a grammar that has much more structure than
-the ``Hello`` grammar. We will look at how the abstract syntax
-is divided into suitable categories, and how infinitely many
-phrases can be built by using recursive rules. We will also
-introduce **modularity** by showing how a large grammar can be
-divided into modules, and how functions defined in **resource modules**
-can be used for avoiding repeated code.
-
-
-==The abstract syntax Food==
-
-This grammar defines a set of phrases usable for speaking about food:
-- the main category is ``Phrase``
-- a ``Phrase`` can be built by assigning a ``Quality`` to an ``Item``
-- an ``Item`` is built from a ``Kind`` by prefixing "this" or "that"
-- a ``Kind`` is either **atomic**, such as "cheese" and "wine", or formed
- by modifying a given ``Kind`` with a ``Quality``
-- a ``Quality`` is either atomic, such as "Italian" and "boring",
- or built by modifying a given ``Quality`` with the word "very"
-
-
-These verbal descriptions can be expressed as the following abstract syntax:
-```
- abstract Food = {
-
- flags startcat = Phrase ;
-
- cat
- Phrase ; Item ; Kind ; Quality ;
-
- fun
- Is : Item -> Quality -> Phrase ;
- This, That : Kind -> Item ;
- QKind : Quality -> Kind -> Kind ;
- Wine, Cheese, Fish : Kind ;
- Very : Quality -> Quality ;
- Fresh, Warm, Italian, Expensive, Delicious, Boring : Quality ;
- }
-```
-In the concrete syntax, we will be able to build phrases such as
-```
- this delicious Italian wine is very very expensive
-```
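-To see how recursion produces such a phrase, here is the abstract syntax
-tree that corresponds to it, worked out by hand from the ``fun`` rules
-above. ``QKind`` and ``Very`` are each applied twice, the second time to
-their own result:
-```
- Is (This (QKind Delicious (QKind Italian Wine)))
-    (Very (Very Expensive))
-```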
-
-
-==The concrete syntax FoodEng==
-
-The English concrete syntax gives no surprises:
-```
- concrete FoodEng of Food = {
-
- lincat
- Phrase, Item, Kind, Quality = {s : Str} ;
-
- lin
- Is item quality = {s = item.s ++ "is" ++ quality.s} ;
- This kind = {s = "this" ++ kind.s} ;
- That kind = {s = "that" ++ kind.s} ;
- QKind quality kind = {s = quality.s ++ kind.s} ;
- Wine = {s = "wine"} ;
- Cheese = {s = "cheese"} ;
- Fish = {s = "fish"} ;
- Very quality = {s = "very" ++ quality.s} ;
- Fresh = {s = "fresh"} ;
- Warm = {s = "warm"} ;
- Italian = {s = "Italian"} ;
- Expensive = {s = "expensive"} ;
- Delicious = {s = "delicious"} ;
- Boring = {s = "boring"} ;
- }
-```
-Let us test how the grammar works in parsing:
-```
- > import FoodEng.gf
- > parse "this delicious wine is very very Italian"
- Is (This (QKind Delicious Wine)) (Very (Very Italian))
-```
-You can also try parsing in other categories than the ``startcat``,
-by setting the command-line ``cat`` flag:
-```
- > p -cat=Kind "very Italian wine"
- QKind (Very Italian) Wine
-```
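-The same flag works for the other categories of the grammar. For example,
-assuming ``FoodEng.gf`` is still imported, a bare quality can be parsed
-on its own:
-```
- > p -cat=Quality "very very fresh"
- Very (Very Fresh)
-```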
-
-**Exercise**. Extend the ``Food`` grammar by ten new food kinds and
-qualities, and run the parser with new kinds of examples.
-
-
-**Exercise**. Add a rule that enables question phrases of the form
-//is this cheese Italian//.
-
-
-**Exercise**. Enable the optional prefixing of
-phrases with the words "excuse me but". Do this in such a way that
-the prefix can occur at most once.
-
-
-
-==Commands for testing grammars==
-
-===Generating trees and strings===
-
-When we have a grammar above the trivial size, especially a recursive
-one, we need more efficient ways of testing it than just by parsing
-sentences that happen to come to our minds. One way to do this is
-based on **automatic generation**, which can be either
-**random** or **exhaustive**.
-
-Random generation (``generate_random = gr``) is an operation that
-builds a random tree in accordance with an abstract syntax:
-```
- > generate_random
- Is (This (QKind Italian Fish)) Fresh
-```
-By using a pipe, random generation can be fed into linearization:
-```
- > gr | l
- this Italian fish is fresh
-```
-Random generation is a good way to test a grammar; it can also
-be fun. By using the ``number`` flag, several strings can be generated
-in one command:
-```
- > gr -number=10 | l
- that wine is boring
- that fresh cheese is fresh
- that cheese is very boring
- this cheese is Italian
- that expensive cheese is expensive
- that fish is fresh
- that wine is very Italian
- this wine is Italian
- this cheese is boring
- this fish is boring
-```
-To generate //all// phrases that a grammar can produce,
-GF provides the command ``generate_trees = gt``.
-```
- > generate_trees | l
- that cheese is very Italian
- that cheese is very boring
- that cheese is very delicious
- that cheese is very expensive
- that cheese is very fresh
- ...
- this wine is expensive
- this wine is fresh
- this wine is warm
-
-```
-You get quite a few trees but not all of them: only up to a given
-**depth** of trees. The default depth is 3; the depth can be
-set by using the ``depth`` flag:
-```
- > generate_trees -depth=5 | l
-```
-Other options to the generation commands (like all commands) can be seen
-by GF's ``help = h`` command:
-```
- > help gr
- > help gt
-```
-
-**Exercise**. If the command ``gt`` generated all
-trees in your grammar, it would never terminate. Why?
-
-**Exercise**. Measure how many trees the grammar gives with depths 4 and 5,
-respectively. You can use the Unix **word count** command ``wc`` to count lines.
-**Hint**. You can pipe the output of a GF command into a Unix command by
-using the escape ``?``, as follows:
-```
- > generate_trees -depth=4 | ? wc
-```
-
-
-
-
-
-===More on pipes; tracing===
-
-A pipe of GF commands can have any length, but the "output type"
-(either string or tree) of one command must always match the "input type"
-of the next command, in order for the result to make sense.
-
-The intermediate results in a pipe can be observed by adding the
-**tracing** flag ``-tr`` to each command whose output you
-want to see:
-```
- > gr -tr | l -tr | p
-
- Is (This Cheese) Boring
- this cheese is boring
- Is (This Cheese) Boring
-```
-This facility is good for test purposes: for instance, you
-may want to see if a grammar is **ambiguous**, i.e.
-contains strings that can be parsed in more than one way.
-
-**Exercise**. Extend the ``Food`` grammar so that it produces ambiguous
-strings, and try out the ambiguity test.
-
-
-
-===Writing and reading files===
-
-To save the output of GF commands into a file, you can
-pipe it to the ``write_file = wf`` command,
-```
- > gr -number=10 | l | write_file exx.tmp
-```
-You can read the file back to GF with the
-``read_file = rf`` command,
-```
- > read_file exx.tmp | p -lines
-```
-Notice the flag ``-lines`` given to the parsing
-command. This flag tells GF to parse each line of
-the file separately. Without the flag, the grammar could
-not recognize the string in the file, because it is not
-a sentence but a sequence of ten sentences.
-
-Files with examples can be used for **regression testing**
-of grammars.
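-
-A minimal sketch of such a test, assuming that ``generate_trees`` lists
-trees in the same order on every run and that the file names ``gold.tmp``
-and ``new.tmp`` are free to use: first save the linearizations produced by
-the current grammar.
-```
- > generate_trees | l | write_file gold.tmp
-```
-After editing and re-importing the grammar, regenerate the linearizations
-and compare the two files with the Unix ``diff`` program, called through
-the shell escape ``!``:
-```
- > generate_trees | l | write_file new.tmp
- > ! diff gold.tmp new.tmp
-```
-An empty output from ``diff`` means that no linearization has changed.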
-
-
-
-
-%--!
-==Modules and files==
-
-GF uses suffixes to recognize different file formats. The most
-important ones are:
-- Source files: //Modulename//``.gf``
-- Target files: //Modulename//``.gfc``
-
-
-When you import ``FoodEng.gf``, you see the target files being
-generated:
-```
- > i FoodEng.gf
- - compiling Food.gf... wrote file Food.gfc 16 msec
- - compiling FoodEng.gf... wrote file FoodEng.gfc 20 msec
-```
-You also see that the GF program does not only read the file
-``FoodEng.gf``, but also all other files that it
-depends on - in this case, ``Food.gf``.
-
-For each file that is compiled, a ``.gfc`` file
-is generated. The GFC format (="GF Canonical") is the
-"machine code" of GF, which is faster to process than
-GF source files. When reading a module, GF decides whether
-to use an existing ``.gfc`` file or to generate
-a new one, by looking at modification times.
-
-**Exercise**. What happens when you import ``FoodEng.gf`` for
-a second time? Try this in different situations:
-- Right after importing it the first time (the modules are kept in
- the memory of GF and need no reloading).
-- After issuing the command ``empty`` (``e``), which clears the memory
- of GF.
-- After making a small change in ``FoodEng.gf``, be it only an added space.
-- After making a change in ``Food.gf``.
-
-
-
-==An Italian concrete syntax==
-
-We write the Italian grammar in a straightforward way, by replacing
-English words with their usual dictionary equivalents:
-```
- concrete FoodIta of Food = {
-
- lincat
- Phrase, Item, Kind, Quality = {s : Str} ;
-
- lin
- Is item quality = {s = item.s ++ "è" ++ quality.s} ;
- This kind = {s = "questo" ++ kind.s} ;
- That kind = {s = "quello" ++ kind.s} ;
- QKind quality kind = {s = kind.s ++ quality.s} ;
- Wine = {s = "vino"} ;
- Cheese = {s = "formaggio"} ;
- Fish = {s = "pesce"} ;
- Very quality = {s = "molto" ++ quality.s} ;
- Fresh = {s = "fresco"} ;
- Warm = {s = "caldo"} ;
- Italian = {s = "italiano"} ;
- Expensive = {s = "caro"} ;
- Delicious = {s = "delizioso"} ;
- Boring = {s = "noioso"} ;
- }
-```
-An alert reader, or one who already knows Italian, may notice one point in
-which a change more radical than replacement of words is made: the order of
-a quality and the kind it modifies in
-```
- QKind quality kind = {s = kind.s ++ quality.s} ;
-```
-Thus Italian says ``vino italiano`` for ``Italian wine``.
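-
-The difference can be observed by translating a noun phrase. This is a
-sketch that assumes both ``FoodEng.gf`` and ``FoodIta.gf`` have been
-imported:
-```
- > p -lang=FoodEng -cat=Item "this Italian wine" | l -lang=FoodIta
- questo vino italiano
-```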
-
-**Exercise**. Write a concrete syntax of ``Food`` for some other language.
-You will probably end up with grammatically incorrect output - but don't
-worry about this yet.
-
-**Exercise**. If you have written ``Food`` for German, Swedish, or some
-other language, test with random or exhaustive generation what constructs
-come out incorrect, and prepare a list of those ones that cannot be helped
-with the currently available fragment of GF.
-
-
-
-==More applications of multilingual grammars==
-
-===Multilingual treebanks===
-
-A **multilingual treebank** is a set of trees with their
-translations in different languages:
-```
- > gr -number=2 | tree_bank
-
- Is (That Cheese) (Very Boring)
- quello formaggio è molto noioso
- that cheese is very boring
-
- Is (That Cheese) Fresh
- quello formaggio è fresco
- that cheese is fresh
-```
-
-
-===Translation session===
-
-If translation is what you want to do with a set of grammars, a convenient
-way to do it is to open a ``translation_session = ts``. In this session,
-you can translate between all the languages that are in scope.
-A dot ``.`` terminates the translation session.
-```
- > ts
-
- trans> that very warm cheese is boring
- quello formaggio molto caldo è noioso
- that very warm cheese is boring
-
- trans> questo vino molto italiano è molto delizioso
- questo vino molto italiano è molto delizioso
- this very Italian wine is very delicious
-
- trans> .
- >
-```
-
-
-===Translation quiz===
-
-This is a simple language exercise that can be automatically
-generated from a multilingual grammar. The system generates a set of
-random sentences, displays them in one language, and checks the user's
-answer given in another language. The command ``translation_quiz = tq``
-does this in a subshell of GF.
-```
- > translation_quiz FoodEng FoodIta
-
- Welcome to GF Translation Quiz.
- The quiz is over when you have done at least 10 examples
- with at least 75 % success.
- You can interrupt the quiz by entering a line consisting of a dot ('.').
-
- this fish is warm
- questo pesce è caldo
- > Yes.
- Score 1/1
-
- this cheese is Italian
- questo formaggio è noioso
- > No, not questo formaggio è noioso, but
- questo formaggio è italiano
-
- Score 1/2
- this fish is expensive
-```
-You can also generate a list of translation exercises and save it in a
-file for later use, by the command ``translation_list = tl``
-```
- > translation_list -number=25 FoodEng FoodIta | write_file transl.txt
-```
-The ``number`` flag gives the number of sentences generated.
-
-
-
-===Multilingual syntax editing===
-
-Any multilingual grammar can be used in the graphical syntax editor, which is
-opened by the shell
-command ``gfeditor`` followed by the names of the grammar files.
-Thus
-```
- % gfeditor FoodEng.gf FoodIta.gf
-```
-opens the editor for the two ``Food`` grammars.
-
-The editor supports commands for manipulating an abstract syntax tree.
-The process is started by choosing a category from the "New" menu.
-Choosing ``Phrase`` creates a new tree of type ``Phrase``. A new tree
-is in general completely unknown: it consists of a **metavariable**
-``?1``. However, since the category ``Phrase`` in ``Food`` has
-only one possible constructor, ``Is``, the tree is readily
-given the form ``Is ?1 ?2``. Here is what the editor looks like at
-this stage:
-
- [food1.png]
-
-Editing goes on by **refinements**, i.e. choices of constructors from
-the menu, until no metavariables remain. Here is a tree resulting from the
-current editing session:
-
- [food2.png]
-
-Editing can be continued even when the tree is finished. The user can shift
-the **focus** to one of the subtrees by clicking on it or on the corresponding
-part of a linearization. In the picture, the focus is on "fish".
-The menu shows no refinements, since there are no metavariables, but other
-possible actions:
-- to **change** "fish" to "cheese" or "wine"
-- to **delete** "fish", i.e. change it to a metavariable
-- to **wrap** "fish" in a qualification, i.e. change it to
- ``QKind ? Fish``, where the quality can be given in a later refinement
-
-
-In addition to menu-based editing, the tool supports refinement by parsing,
-which becomes accessible by middle-clicking on the linearization field.
-
-**Exercise**. Construct the sentence
-//this very expensive cheese is very very delicious//
-and its Italian translation by using ``gfeditor``.
-
-
-==The context-free grammar format==
-
-Readers not familiar with context-free grammars, also known as BNF grammars, can
-skip this section. Those that are familiar with them will find here the exact
-relation between GF and context-free grammars. We will moreover show how
-the BNF format can be used as input to the GF program; it is often more
-concise than GF proper, but also more restricted in expressive power.
-
-
-
-==Using resource modules==
-
-===The golden rule of functional programming===
-
-When writing a grammar, you have to type lots of
-characters. You have probably
-done this by the copy-paste-modify method, which is a common way to
-avoid repeating work.
-
-However, there is a more elegant way to avoid repeating work than
-the copy-and-paste
-method. The **golden rule of functional programming** says that
-- whenever you find yourself programming by copy-and-paste,
- write a function instead.
-
-
-A function separates the shared parts of different computations from the
-changing parts, its **arguments**, or **parameters**.
-In functional programming languages, such as
-[Haskell http://www.haskell.org], it is possible to share much more
-code with functions than in languages such as C and Java, because
-of higher-order functions (functions that take functions as arguments).
-
-
-===Operation definitions===
-
-GF is a functional programming language, not only in the sense that
-the abstract syntax is a system of functions (``fun``), but also because
-functional programming can be used when defining concrete syntax. This is
-done by using a new form of judgement, with the keyword ``oper`` (for
-**operation**), distinct from ``fun`` for the sake of clarity.
-Here is a simple example of an operation:
-```
- oper ss : Str -> {s : Str} = \x -> {s = x} ;
-```
-The operation can be **applied** to an argument, and GF will
-**compute** the application into a value. For instance,
-```
- ss "boy" ===> {s = "boy"}
-```
-We use the symbol ``===>`` to indicate how an expression is
-computed into a value; this symbol is not a part of GF.
-
-Thus an ``oper`` judgement includes the name of the defined operation,
-its type, and an expression defining it. As for the syntax of the defining
-expression, notice the **lambda abstraction** form ``\``//x// ``->`` //t// of
-the function. It reads: function with variable //x// and **function body**
-//t//.
-
-For lambda abstraction with multiple arguments, we have the shorthand
-```
- \x,y,z -> t === \x -> \y -> \z -> t
-```
-The notation we have used for linearization rules,
-```
- lin f x y = t
-```
-is shorthand for
-```
- lin f = \x,y -> t
-```
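-For instance, the ``Is`` rule of ``FoodEng`` could equally well be written
-with an explicit lambda abstraction. The two judgements below are meant as
-alternatives, not to be used together in the same module:
-```
- lin Is item quality = {s = item.s ++ "is" ++ quality.s} ;
-
- lin Is = \item,quality -> {s = item.s ++ "is" ++ quality.s} ;
-```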
-
-
-
-
-
-%--!
-===The ``resource`` module type===
-
-Operation definitions can be included in a concrete syntax.
-But they are not really tied to a particular set of linearization rules.
-They should rather be seen as **resources**
-usable in many concrete syntaxes.
-
-The ``resource`` module type is used to package
-``oper`` definitions into reusable resources. Here is
-an example, with a handful of operations to manipulate
-strings and records.
-```
- resource StringOper = {
- oper
- SS : Type = {s : Str} ;
- ss : Str -> SS = \x -> {s = x} ;
- cc : SS -> SS -> SS = \x,y -> ss (x.s ++ y.s) ;
- prefix : Str -> SS -> SS = \p,x -> ss (p ++ x.s) ;
- }
-```
-Resource modules can extend other resource modules, in the
-same way as modules of other types can extend modules of the
-same type. Thus it is possible to build resource hierarchies.
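-
-As a sketch of such a hierarchy, the following module extends
-``StringOper`` with one more operation. The module name ``MoreStringOper``
-and the operation ``postfix`` are hypothetical, and the ``**`` extension
-syntax is the one presented for grammars in the section on extending a grammar:
-```
- resource MoreStringOper = StringOper ** {
-   oper
-     -- append a string after the string of an SS record
-     postfix : SS -> Str -> SS = \x,p -> ss (x.s ++ p) ;
- }
-```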
-
-
-
-%--!
-===Opening a resource===
-
-Any number of ``resource`` modules can be
-**opened** in a ``concrete`` syntax, which
-makes definitions contained
-in the resource usable in the concrete syntax. Here is
-an example, where the resource ``StringOper`` is
-opened in a new version of ``FoodEng``.
-```
- concrete FoodEng of Food = open StringOper in {
-
- lincat
- Phrase, Item, Kind, Quality = SS ;
-
- lin
- Is item quality = cc item (prefix "is" quality) ;
- This k = prefix "this" k ;
- That k = prefix "that" k ;
- QKind q k = cc q k ;
- Wine = ss "wine" ;
- Cheese = ss "cheese" ;
- Fish = ss "fish" ;
- Very = prefix "very" ;
- Fresh = ss "fresh" ;
- Warm = ss "warm" ;
- Italian = ss "Italian" ;
- Expensive = ss "expensive" ;
- Delicious = ss "delicious" ;
- Boring = ss "boring" ;
- }
-```
-
-**Exercise**. Use the same string operations to write ``FoodIta``
-more concisely.
-
-
-
-%--!
-===Partial application===
-
-GF, like Haskell, permits **partial application** of
-functions. An example of this is the rule
-```
- lin This k = prefix "this" k ;
-```
-which can be written more concisely
-```
- lin This = prefix "this" ;
-```
-The first form is perhaps more intuitive to write
-but, once you get used to partial application, you will appreciate its
-conciseness and elegance. The logic of partial application
-is known as **currying**, with a reference to Haskell B. Curry.
-The idea is that any //n//-place function can be defined as a 1-place
-function whose value is an (//n//-1)-place function. Thus
-```
- oper prefix : Str -> SS -> SS ;
-```
-can be used as a 1-place function that takes a ``Str`` into a
-function ``SS -> SS``. The expected linearization of ``This`` is exactly
-a function of such a type, operating on an argument of type ``Kind``
-whose linearization is of type ``SS``. Thus we can define the
-linearization directly as ``prefix "this"``.
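-
-The step can be traced with the ``===>`` computation notation. This is a
-hand computation from the definition of ``prefix`` in ``StringOper``, not
-output of the GF program:
-```
- prefix "this"
-   ===> \x -> ss ("this" ++ x.s)
-
- prefix "this" (ss "wine")
-   ===> ss ("this" ++ "wine")
-   ===> {s = "this" ++ "wine"}
-```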
-
-**Exercise**. Define an operation ``infix`` analogous to ``prefix``,
-such that it allows you to write
-```
- lin Is = infix "is" ;
-```
-
-
-
-===Testing resource modules===
-
-To test a ``resource`` module independently, you must import it
-with the flag ``-retain``, which tells GF to retain ``oper`` definitions
-in the memory; the usual behaviour is that ``oper`` definitions
-are just applied to compile linearization rules
-(this is called **inlining**) and then thrown away.
-```
- > i -retain StringOper.gf
-```
-The command ``compute_concrete = cc`` computes any expression
-formed by operations and other GF constructs. For example,
-```
- > compute_concrete prefix "in" (ss "addition")
- {
- s : Str = "in" ++ "addition"
- }
-```
-
-
-
-
-==Grammar architecture==
-
-===Extending a grammar===
-
-The module system of GF makes it possible to **extend** a
-grammar in different ways. The syntax of extension is
-shown by the following example. We extend ``Food`` by
-adding a category of questions and two new functions.
-```
- abstract Morefood = Food ** {
- cat
- Question ;
- fun
- QIs : Item -> Quality -> Question ;
- Pizza : Kind ;
-
- }
-```
-Parallel to the abstract syntax, extensions can
-be built for concrete syntaxes:
-```
- concrete MorefoodEng of Morefood = FoodEng ** {
- lincat
- Question = {s : Str} ;
- lin
- QIs item quality = {s = "is" ++ item.s ++ quality.s} ;
- Pizza = {s = "pizza"} ;
- }
-```
-The effect of extension is that all of the contents of the extended
-and extending module are put together. We also say that the new
-module **inherits** the contents of the old module.
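-
-Inheritance can be checked in the GF program. In this sketch, which assumes
-the two modules above are stored in ``Morefood.gf`` and ``MorefoodEng.gf``,
-the new function ``QIs`` is combined with ``This`` and ``Warm``, which come
-from ``Food``. The ``cat`` flag selects the new category ``Question``:
-```
- > import MorefoodEng.gf
- > parse -cat=Question "is this pizza warm"
- QIs (This Pizza) Warm
-```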
-
-At the same time as extending a module of the same type, a concrete
-syntax module may open resources. The syntax is shown by the
-following Italian grammar module:
-```
- concrete MorefoodIta of Morefood = FoodIta ** open StringOper in {
- lincat
- Question = SS ;
- lin
- QIs item quality = ss (item.s ++ "è" ++ quality.s) ;
- Pizza = ss "pizza" ;
- }
-```
-
-
-
-===Multiple inheritance===
-
-Specialized vocabularies can be represented as small grammars that
-only do "one thing" each. For instance, the following are grammars
-for fruit and mushrooms
-```
- abstract Fruit = {
- cat Fruit ;
- fun Apple, Peach : Fruit ;
- }
-
- abstract Mushroom = {
- cat Mushroom ;
- fun Cep, Agaric : Mushroom ;
- }
-```
-They can afterwards be combined into bigger grammars by using
-**multiple inheritance**, i.e. extension of several grammars at the
-same time:
-```
- abstract Foodmarket = Food, Fruit, Mushroom ** {
- fun
- FruitKind : Fruit -> Kind ;
- MushroomKind : Mushroom -> Kind ;
- }
-```
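-A concrete syntax can follow the same pattern. The sketch below assumes
-hypothetical modules ``FruitEng`` and ``MushroomEng`` in which ``Fruit`` and
-``Mushroom`` have the linearization type ``{s : Str}``, the same as ``Kind``
-in ``FoodEng``, so that the new functions can pass their argument through
-unchanged:
-```
- concrete FoodmarketEng of Foodmarket = FoodEng, FruitEng, MushroomEng ** {
-   lin
-     FruitKind x = x ;
-     MushroomKind x = x ;
- }
-```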
-
-**Exercise**. Refactor ``Food`` by taking apart ``Wine`` into a special
-``Drink`` module.
-
-
-
-===System commands===
-
-To document your grammar, you may want to print its module dependency
-graph into a file, e.g. a ``.png`` file that
-can be included in an HTML document. You can do this
-by first printing the graph into a ``.dot`` file and then
-processing this file with the ``dot`` program (from the Graphviz package).
-```
- > pm -printer=graph | wf Foodmarket.dot
- > ! dot -Tpng Foodmarket.dot > Foodmarket.png
-```
-The latter command is a Unix command, issued from GF by using the
-shell escape symbol ``!``. The resulting graph was shown in the previous section.
-
-The command ``print_multi = pm`` is used for printing the current multilingual
-grammar in various formats, of which the format ``-printer=graph`` just
-shows the module dependencies. Use ``help`` to see what other formats
-are available:
-```
- > help pm
- > help -printer
- > help help
-```
-System commands can also be used inside GF pipes. The escape symbol
-is then ``?``.
-```
- > generate_trees | ? wc
-```
-
-
-===Division of labour===
-
-Using operations defined in resource modules is a
-way to avoid repetitive code.
-In addition, it enables a new kind of modularity
-and division of labour in grammar writing: grammarians familiar with
-the linguistic details of a language can make their knowledge
-available through resource grammar modules, whose users only need
-to pick the right operations and not to know their implementation
-details.
-
-In the following sections, we will go through some
-such linguistic details. The programming constructs needed when
-doing this are useful for all GF programmers, even for those who don't
-hand-code the linguistics of their applications but get them
-from libraries. And it is quite interesting to know something about the
-linguistic concepts of inflection, agreement, and parts of speech.
-
-
-==Summary of GF language features==
-
-Module extensions, multiple inheritance.
-
-Resource modules.
-
-Oper judgements.
-
-Lambda abstraction.
-
-The ``.cf`` grammar format.
-
-
-
-
-=Grammars with parameters=
-
-==The problem: words have to be inflected==
-
-Suppose we want to say, with the vocabulary included in
-``Food.gf``, things like
-```
- all Italian wines are delicious
-```
-The new grammatical facilities we need are the plural forms
-of nouns and verbs (//wines, are//), as opposed to their
-singular forms.
-
-The introduction of plural forms requires two things:
-- the **inflection** of nouns and verbs in singular and plural
-- the **agreement** of the verb with its subject:
- the verb must have the same number as the subject
-
-
-Different languages have different rules of inflection and agreement.
-For instance, Italian also has agreement in gender (masculine vs. feminine).
-We want to express such special features of languages in the
-concrete syntax while ignoring them in the abstract syntax.
-
-To be able to do all this, we need one new judgement form
-and many new expression forms.
-We also need to generalize linearization types
-from strings to more complex types.
-
-**Exercise**. Make a list of the possible forms that nouns,
-adjectives, and verbs can have in some languages that you know.
-
-
-%--!
-==Parameters and tables==
-
-We define the **parameter type** of number in English by
-using a new form of judgement:
-```
- param Number = Sg | Pl ;
-```
-To express that ``Kind`` expressions in English have a linearization
-depending on number, we replace the linearization type ``{s : Str}``
-with a type where the ``s`` field is a **table** depending on number:
-```
- lincat Kind = {s : Number => Str} ;
-```
-The **table type** ``Number => Str`` is in many respects similar to
-a function type (``Number -> Str``). The main difference is that the
-argument type of a table type must always be a parameter type. This means
-that the argument-value pairs can be listed in a finite table. The following
-example shows such a table:
-```
- lin Cheese = {s = table {
- Sg => "cheese" ;
- Pl => "cheeses"
- }
- } ;
-```
-The table consists of **branches**, where a **pattern** on the
-left of the arrow ``=>`` is assigned a **value** on the right.
-
-The application of a table to a parameter is done by the **selection**
-operator ``!``. For instance,
-```
- table {Sg => "cheese" ; Pl => "cheeses"} ! Pl
-```
-is a selection that computes into the value ``"cheeses"``.
-This computation is performed by **pattern matching**: return
-the value from the first branch whose pattern matches the
-selection argument. Thus
-```
- table {Sg => "cheese" ; Pl => "cheeses"} ! Pl
- ===> "cheeses"
-```
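-
-To try selection out in the GF shell, the parameter type and a table can be
-placed in a small ``resource`` module. The following is a minimal sketch; the
-module name ``NumberDemo`` and the operation name ``cheese`` are made up for
-this illustration.
-```
-  resource NumberDemo = {
-
-    -- the parameter type of number
-    param Number = Sg | Pl ;
-
-    -- an inflection table for one noun
-    oper cheese : Number => Str = table {
-      Sg => "cheese" ;
-      Pl => "cheeses"
-    } ;
-  }
-```
-After importing the module with ``i -retain NumberDemo.gf``, the command
-``cc cheese ! Pl`` should compute the value ``"cheeses"``.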
-
-**Exercise**. In a previous exercise, we made a list of the possible
-forms that nouns, adjectives, and verbs can have in some languages that
-you know. Now take some of the results and implement them by
-using parameter type definitions and tables. Write them into a ``resource``
-module, which you can test by using the command ``compute_concrete``.
-
-
-
-%--!
-==Inflection tables and paradigms==
-
-All English common nouns are inflected in number, most of them in the
-same way: the plural form is obtained from the singular by adding the
-ending //s//. This rule is an example of
-a **paradigm** - a formula telling how the inflection
-forms of a word are formed.
-
-From the GF point of view, a paradigm is a function that takes a **lemma** -
-also known as a **dictionary form** - and returns an inflection
-table of desired type. Paradigms are not functions in the sense of the
-``fun`` judgements of abstract syntax (which operate on trees and not
-on strings), but operations defined in ``oper`` judgements.
-The following operation defines the regular noun paradigm of English:
-```
- oper regNoun : Str -> {s : Number => Str} = \x -> {
- s = table {
- Sg => x ;
- Pl => x + "s"
- }
- } ;
-```
-The **gluing** operator ``+`` tells that
-the string held in the variable ``x`` and the ending ``"s"``
-are written together to form one **token**. Thus, for instance,
-```
- (regNoun "cheese").s ! Pl ===> "cheese" + "s" ===> "cheeses"
-```
-
-**Exercise**. Identify cases in which the ``regNoun`` paradigm does not
-apply in English, and implement some alternative paradigms.
-
-**Exercise**. Implement a paradigm for regular verbs in English.
-
-**Exercise**. Implement some regular paradigms for other languages you have
-considered in earlier exercises.
-
-
-
-==Using parameters in concrete syntax==
-
-We can now enrich the concrete syntax definitions to
-comprise morphology. This will permit a more radical
-variation between languages (e.g. English and Italian)
-than just the use of different words. In general,
-parameters and linearization types are different in
-different languages - but this does not prevent the
-use of a common abstract syntax.
-
-
-%--!
-===Parametric vs. inherent features, agreement===
-
-The rule of subject-verb agreement in English says that the verb
-phrase must be inflected in the number of the subject. This
-means that a noun phrase (functioning as a subject), inherently
-has a number, which it passes to the verb. The verb does not
-//have// a number, but must be able to //receive// whatever number the
-subject has. This distinction is nicely represented by the
-different linearization types of **noun phrases** and **verb phrases**:
-```
- lincat NP = {s : Str ; n : Number} ;
- lincat VP = {s : Number => Str} ;
-```
-We say that the number of ``NP`` is an **inherent feature**,
-whereas the number of ``VP`` is a **variable feature** (or a
-**parametric feature**).
-
-The agreement rule itself is expressed in the linearization rule of
-the predication function:
-```
- lin PredVP np vp = {s = np.s ++ vp.s ! np.n} ;
-```
-The following section will present
-``FoodsEng``, assuming the abstract syntax ``Foods``
-that is similar to ``Food`` but also has the
-plural determiners ``These`` and ``Those``.
-The reader is invited to inspect the way in which agreement works in
-the formation of sentences.
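-
-As a sketch of how the agreement rule computes, assume a hypothetical plural
-noun phrase and a verb phrase with one form per number. Neither record is
-taken from ``FoodsEng``, and the linearizations of the arguments are written
-in place of the trees:
-```
-  PredVP {s = "these" ++ "wines" ; n = Pl}
-         {s = table {Sg => "is" ++ "delicious" ; Pl => "are" ++ "delicious"}}
-    ===> {s = "these" ++ "wines" ++
-              table {Sg => "is" ++ "delicious" ; Pl => "are" ++ "delicious"} ! Pl}
-    ===> {s = "these" ++ "wines" ++ "are" ++ "delicious"}
-```
-The inherent number ``Pl`` of the noun phrase is what selects the form of the
-verb phrase.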
-
-
-%--!
-===English concrete syntax with parameters===
-
-The grammar uses both
-[``Prelude`` ../../lib/prelude/Prelude.gf] and
-[``MorphoEng`` resource/MorphoEng].
-We will later see how to make the grammar even
-more high-level by using a resource grammar library
-and parametrized modules.
-```
---# -path=.:resource:prelude
-
-concrete FoodsEng of Foods = open Prelude, MorphoEng in {
-
- lincat
- S, Quality = SS ;
- Kind = {s : Number => Str} ;
- Item = {s : Str ; n : Number} ;
-
- lin
- Is item quality =
- ss (item.s ++ (mkVerb "are" "is").s ! item.n ++ quality.s) ;
- This = det Sg "this" ;
- That = det Sg "that" ;
- These = det Pl "these" ;
- Those = det Pl "those" ;
- QKind quality kind = {s = \\n => quality.s ++ kind.s ! n} ;
- Wine = regNoun "wine" ;
- Cheese = regNoun "cheese" ;
- Fish = mkNoun "fish" "fish" ;
- Very = prefixSS "very" ;
- Fresh = ss "fresh" ;
- Warm = ss "warm" ;
- Italian = ss "Italian" ;
- Expensive = ss "expensive" ;
- Delicious = ss "delicious" ;
- Boring = ss "boring" ;
-
- oper
- det : Number -> Str -> Noun -> {s : Str ; n : Number} =
- \n,d,cn -> {
- s = d ++ cn.s ! n ;
- n = n
- } ;
-}
-```
-
-
-==Pattern matching==
-
-We have so far built all expressions of the ``table`` form
-from branches whose patterns are constants introduced in
-``param`` definitions, as well as constant strings.
-But there are more expressive patterns. Here is a summary of the possible forms:
-- a constructor pattern (identifier introduced in a ``param`` definition) matches
- the identical constructor
-- a variable pattern (identifier other than constant parameter) matches anything
-- the wild card ``_`` matches anything
-- a string literal pattern, e.g. ``"s"``, matches the same string
-- a disjunctive pattern ``P | ... | Q`` matches anything that
- one of the disjuncts matches
-
-
-Pattern matching is performed in the order in which the branches
-appear in the table: the branch of the first matching pattern is followed.
-As a first example, let us take an English noun that has the same form in
-singular and plural:
-```
- lin Fish = {s = table {_ => "fish"}} ;
-```
-As syntactic sugar, one-branch tables can be written concisely,
-```
- \\P,...,Q => t === table {P => ... table {Q => t} ...}
-```
-Thus we could rewrite the above rule
-```
- lin Fish = {s = \\_ => "fish"} ;
-```
-Finally, the ``case`` expressions common in functional
-programming languages are syntactic sugar for table selections:
-```
- case e of {...} === table {...} ! e
-```
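-
-For instance, the following two operations are equivalent. This is a small
-sketch: the operation names are invented for the illustration, and ``Number``
-is the parameter type defined earlier.
-```
-  -- with a case expression
-  oper numberName : Number -> Str = \n ->
-    case n of {
-      Sg => "singular" ;
-      Pl => "plural"
-    } ;
-
-  -- the same operation written as a table selection
-  oper numberName2 : Number -> Str = \n ->
-    table {
-      Sg => "singular" ;
-      Pl => "plural"
-    } ! n ;
-```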
-
-
-
-%--!
-==Hierarchic parameter types==
-
-The reader familiar with a functional programming language such as
-[Haskell http://www.haskell.org] must have noticed the similarity
-between parameter types in GF and **algebraic datatypes** (``data`` definitions
-in Haskell). The GF parameter types are actually a special case of algebraic
-datatypes: the main restriction is that in GF, these types must be finite.
-(It is this restriction that makes it possible to invert linearization rules into
-parsing methods.)
-
-However, finite is not the same thing as enumerated. Even in GF, parameter
-constructors can take arguments, provided these arguments are from other
-parameter types - only recursion is forbidden. Such parameter types impose a
-hierarchic order among parameters. They are often needed to define
-the linguistically most accurate parameter systems.
-
-To give an example, Swedish adjectives
-are inflected in number (singular or plural) and
-gender (uter or neuter). These parameters would suggest 2*2=4 different
-forms. However, the gender distinction is done only in the singular. Therefore,
-it would be inaccurate to define adjective paradigms using the type
-``Gender => Number => Str``. The following hierarchic definition
-yields an accurate system of three adjectival forms.
-```
- param AdjForm = ASg Gender | APl ;
- param Gender = Utr | Neutr ;
-```
-Here is an example of pattern matching, the paradigm of regular adjectives.
-```
- oper regAdj : Str -> AdjForm => Str = \fin -> table {
- ASg Utr => fin ;
- ASg Neutr => fin + "t" ;
- APl => fin + "a"
- } ;
-```
-A constructor can be used as a pattern that has patterns as arguments. For instance,
-the adjectival paradigm in which the two singular forms are the same,
-can be defined
-```
- oper plattAdj : Str -> AdjForm => Str = \platt -> table {
- ASg _ => platt ;
- APl => platt + "a"
- } ;
-```
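-
-As an illustration, selecting from these paradigms with the Swedish adjectives
-//fin// and //platt// should compute as follows (a sketch in the notation used
-earlier for table selection):
-```
-  (regAdj "fin") ! (ASg Utr)         ===> "fin"
-  (regAdj "fin") ! (ASg Neutr)       ===> "fin" + "t"  ===> "fint"
-  (regAdj "fin") ! APl               ===> "fin" + "a"  ===> "fina"
-  (plattAdj "platt") ! (ASg Neutr)   ===> "platt"
-```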
-
-
-
-
-%--!
-==Discontinuous constituents==
-
-A linearization type may contain more strings than one.
-This is useful, for example, for English particle
-verbs, such as //switch off//. The linearization of
-a sentence may place the object between the verb and the particle:
-//he switched it off//.
-
-The following judgement defines transitive verbs as
-**discontinuous constituents**, i.e. as having a linearization
-type with two strings and not just one.
-```
- lincat TV = {s : Number => Str ; part : Str} ;
-```
-This linearization rule
-shows how the constituents are separated by the object in complementization.
-```
- lin PredTV tv obj = {s = \\n => tv.s ! n ++ obj.s ++ tv.part} ;
-```
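-As a sketch of this rule at work, assume that the verb //switch off// and the
-object //it// have the following linearizations (both records are invented for
-this illustration, and the object is assumed to have the type ``{s : Str}``):
-```
-  PredTV {s = table {Sg => "switches" ; Pl => "switch"} ; part = "off"}
-         {s = "it"}
-    ===> {s = table {
-           Sg => "switches" ++ "it" ++ "off" ;
-           Pl => "switch" ++ "it" ++ "off"
-           }
-         }
-```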
-There is no restriction on the number of discontinuous constituents
-(or other fields) a ``lincat`` may contain. The only condition is that
-the fields must be of finite types, i.e. built from records, tables,
-parameters, and ``Str``, and not functions.
-
-A mathematical result
-about parsing in GF says that the worst-case complexity of parsing
-increases with the number of discontinuous constituents. This is
-potentially a reason to avoid discontinuous constituents.
-Moreover, the parsing and linearization commands only give accurate
-results for categories whose linearization type has a unique ``Str``
-valued field labelled ``s``. Therefore, discontinuous constituents
-are not a good idea in top-level categories accessed by the users
-of a grammar application.
-
-
-**Exercise**. Define the language ``a^n b^n c^n`` in GF.
-
-
-==More constructs for concrete syntax==
-
-In this section, we go through constructs that are not necessary
-in simple grammars or when the concrete syntax relies on libraries.
-But they are useful when writing advanced concrete syntax implementations,
-such as resource grammar libraries. Moreover, they conclude
-the presentation of concrete syntax constructs.
-
-
-%--!
-===Local definitions===
-
-Local definitions ("``let`` expressions") are used in functional
-programming for two reasons: to structure the code into smaller
-expressions, and to avoid repeated computation of one and
-the same expression. Here is an example, from
-[``MorphoIta`` resource/MorphoIta.gf]:
-```
- oper regNoun : Str -> Noun = \vino ->
- let
- vin = init vino ;
- o = last vino
- in
- case o of {
- "a" => mkNoun Fem vino (vin + "e") ;
- "o" | "e" => mkNoun Masc vino (vin + "i") ;
- _ => mkNoun Masc vino vino
- } ;
-```
-
-
-
-===Record extension and subtyping===
-
-Record types and records can be **extended** with new fields. For instance,
-in German it is natural to see transitive verbs as verbs with a case.
-The symbol ``**`` is used for both constructs.
-```
- lincat TV = Verb ** {c : Case} ;
-
- lin Follow = regVerb "folgen" ** {c = Dative} ;
-```
-To extend a record type or a record with a field whose label it
-already has is a type error. It is also an error to extend a type or
-object that is not a record.
-
-A record type //T// is a **subtype** of another one //R//, if //T// has
-all the fields of //R// and possibly other fields. For instance,
-an extension of a record type is always a subtype of it.
-
-If //T// is a subtype of //R//, an object of //T// can be used whenever
-an object of //R// is required. For instance, a transitive verb can
-be used whenever a verb is required.
-
-**Contravariance** means that a function taking an //R// as argument
-can also be applied to any object of a subtype //T//.
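-
-The following sketch shows subtyping at work in a small ``resource`` module.
-All names in it (``SubtypeDemo``, ``infinitive``, ``folgen``, ``folgenInf``,
-and the simplified types ``Verb`` and ``Case``) are assumptions made for the
-illustration, not definitions from a library.
-```
-  resource SubtypeDemo = {
-
-    param Case = Accusative | Dative ;
-
-    oper
-      Verb : Type = {s : Str} ;            -- simplified: one form only
-      TV   : Type = Verb ** {c : Case} ;   -- an extension, hence a subtype of Verb
-
-      -- an operation that expects a Verb
-      infinitive : Verb -> Str = \v -> v.s ;
-
-      -- a transitive verb built by record extension
-      folgen : TV = {s = "folgen"} ** {c = Dative} ;
-
-      -- should be accepted, because TV is a subtype of Verb
-      folgenInf : Str = infinitive folgen ;
-  }
-```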
-
-
-
-===Tuples and product types===
-
-Product types and tuples are syntactic sugar for record types and records:
-```
- T1 * ... * Tn === {p1 : T1 ; ... ; pn : Tn}
- <t1, ..., tn> === {p1 = t1 ; ... ; pn = tn}
-```
-Thus the labels ``p1, p2,...`` are hard-coded.
-
-
-===Record and tuple patterns===
-
-Record types of parameter types also count as parameter types.
-A typical example is a record of agreement features, e.g. French
-```
- oper Agr : PType = {g : Gender ; n : Number ; p : Person} ;
-```
-Notice the term ``PType`` rather than just ``Type`` referring to
-parameter types. Every ``PType`` is also a ``Type``, but not vice-versa.
-
-Pattern matching is done in the expected way, but it can moreover
-utilize partial records: the branch
-```
- {g = Fem} => t
-```
-in a table of type ``Agr => T`` means the same as
-```
- {g = Fem ; n = _ ; p = _} => t
-```
-Tuple patterns are translated to record patterns in the
-same way as tuples to records; partial patterns make it
-possible to write, slightly surprisingly,
-```
- case <g,n,p> of {
- <Fem> => t
- ...
- }
-```
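-
-As a sketch of partial record patterns in use, the following module chooses
-a French definite article from an agreement record. The module name
-``AgrDemo``, the operation ``defArt`` and the constructor names of the three
-parameter types are assumptions made for this illustration, and elision
-(//l'//) is ignored.
-```
-  resource AgrDemo = {
-
-    param
-      Gender = Masc | Fem ;
-      Number = Sg | Pl ;
-      Person = P1 | P2 | P3 ;
-
-    oper
-      Agr : PType = {g : Gender ; n : Number ; p : Person} ;
-
-      -- branches are tried in order, so the plural is checked first
-      defArt : Agr => Str = table {
-        {n = Pl}  => "les" ;
-        {g = Fem} => "la" ;
-        _         => "le"
-      } ;
-  }
-```
-With this module imported, ``cc defArt ! {g = Fem ; n = Sg ; p = P3}`` should
-compute the value ``"la"``.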
-
-===Regular expression patterns===
-
-To define string operations computed at compile time, such
-as in morphology, it is handy to use regular expression patterns:
- - //p// ``+`` //q// : token consisting of //p// followed by //q//
- - //p// ``*`` : token //p// repeated 0 or more times
- (at most as many times as the length of the string to be matched allows)
- - ``-`` //p// : matches anything that //p// does not match
- - //x// ``@`` //p// : bind to //x// what //p// matches
- - //p// ``|`` //q// : matches what either //p// or //q// matches
-
-
-The last three apply to all types of patterns, the first two only to token strings.
-As an example, we give a rule for the formation of English word forms
-ending with an //s// and used in the formation of both plural nouns and
-third-person present-tense verbs.
-```
- add_s : Str -> Str = \w -> case w of {
- _ + "oo" => w + "s" ; -- bamboo
- _ + ("s" | "z" | "x" | "sh" | "o") => w + "es" ; -- bus, hero
- _ + ("a" | "o" | "u" | "e") + "y" => w + "s" ; -- boy
- x + "y" => x + "ies" ; -- fly
- _ => w + "s" -- car
- } ;
-```
-Here is another example, the plural formation in Swedish 2nd declension.
-The second branch uses a variable binding with ``@`` to cover the cases where an
-unstressed pre-final vowel //e// disappears in the plural
-(//nyckel-nycklar, seger-segrar, bil-bilar//):
-```
- plural2 : Str -> Str = \w -> case w of {
- pojk + "e" => pojk + "ar" ;
- nyck + "e" + l@("l" | "r" | "n") => nyck + l + "ar" ;
- bil => bil + "ar"
- } ;
-```
-Variables in regular expression patterns
-are always bound to the **first match**, which is the first
-in the sequence of binding lists. For example:
-- ``x + "e" + y`` matches ``"peter"`` with ``x = "p", y = "ter"``
- ``x + "er"*`` matches ``"burgerer"`` with ``x = "burg"``
-
-
-
-**Exercise**. Implement the German **Umlaut** operation on word stems.
-The operation changes the vowel of the stressed stem syllable as follows:
-//a// to //ä//, //au// to //äu//, //o// to //ö//, and //u// to //ü//. You
-can assume that the operation only takes syllables as arguments. Test the
-operation to see whether it correctly changes //Arzt// to //Ärzt//,
-//Baum// to //Bäum//, //Topf// to //Töpf//, and //Kuh// to //Küh//.
-
-**Exercise**. Define an operation that deletes all vowels from the
-end of a string, so that e.g. "aigeia" becomes "aig".
-
-
-===Free variation===
-
-Sometimes there are many alternative ways to define a concrete syntax.
-For instance, the verb negation in English can be expressed both by
-//does not// and //doesn't//. In linguistic terms, these expressions
-are in **free variation**. The ``variants`` construct of GF can
-be used to give a list of strings in free variation. For example,
-```
- NegVerb verb = {s = variants {["does not"] ; "doesn't"} ++ verb.s ! Pl} ;
-```
-An empty variant list
-```
- variants {}
-```
-can be used e.g. if a word lacks a certain form.
-
-In general, ``variants`` should be used cautiously. It is not
-recommended for modules aimed to be libraries, because the
-user of the library has no way to choose among the variants.
-
-
-%--!
-===Prefix-dependent choices===
-
-Sometimes a token has different forms depending on the token
-that follows. An example is the English indefinite article,
-which is //an// if a vowel follows, //a// otherwise.
-Which form is chosen can only be decided at run time, i.e.
-when a string is actually built. GF has a special construct for
-such tokens, the ``pre`` construct exemplified in
-```
- oper artIndef : Str =
- pre {"a" ; "an" / strs {"a" ; "e" ; "i" ; "o"}} ;
-```
-Thus
-```
- artIndef ++ "cheese" ---> "a" ++ "cheese"
- artIndef ++ "apple" ---> "an" ++ "apple"
-```
-This very example does not work in all situations: the prefix
-//u// has no general rules, and some problematic words are
-//euphemism, one-eyed, n-gram//. It is possible to write
-```
- oper artIndef : Str =
- pre {"a" ;
- "a" / strs {"eu" ; "one"} ;
- "an" / strs {"a" ; "e" ; "i" ; "o" ; "n-"}
- } ;
-```
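-
-With this definition, and assuming that the alternatives are tried in the
-order in which they are written, the article should be chosen as follows
-(the example words are picked for the illustration):
-```
-  artIndef ++ "euphemism"  --->  "a" ++ "euphemism"
-  artIndef ++ "one-eyed"   --->  "a" ++ "one-eyed"
-  artIndef ++ "n-gram"     --->  "an" ++ "n-gram"
-  artIndef ++ "umbrella"   --->  "a" ++ "umbrella"    -- still wrong: no rule for "u"
-```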
-
-
-===Predefined types===
-
-GF has the following predefined categories in abstract syntax:
-```
- cat Int ; -- integers, e.g. 0, 5, 743145151019
- cat Float ; -- floats, e.g. 0.0, 3.1415926
- cat String ; -- strings, e.g. "", "foo", "123"
-```
-The objects of each of these categories are **literals**
-as indicated in the comments above. No ``fun`` definition
-can have a predefined category as its value type, but
-they can be used as arguments. For example:
-```
- fun StreetAddress : Int -> String -> Address ;
- lin StreetAddress number street = {s = number.s ++ street.s} ;
-
- -- e.g. (StreetAddress 10 "Downing Street") : Address
-```
-FIXME: The linearization type is ``{s : Str}`` for all these categories.
-
-
-===Overloading of operations===
-
-Large libraries, such as the GF Resource Grammar Library, may define
-hundreds of names. This can be impractical
-for both the library author and the user: the author has to invent longer
-and longer names which are not always intuitive,
-and the user has to learn or at least be able to find all these names.
-A solution to this problem, adopted by languages such as C++,
-is **overloading**: one and the same name can be used for several functions.
-When such a name is used, the
-compiler performs **overload resolution** to find out which of
-the possible functions is meant. Overload resolution is based on
-the types of the functions: all functions that
-have the same name must have different types.
-
-In C++, functions with the same name can be scattered everywhere in the program.
-In GF, they must be grouped together in ``overload`` groups. Here is an example
-of an overload group, giving four different ways to define verbs in English:
-```
- oper mkV = overload {
- mkV : (walk : Str) -> V = -- regular verbs
- mkV : (omit,omitted : Str) -> V = -- regular verbs with duplication
- mkV : (sing,sang,sung : Str) -> V = -- irregular verbs
- mkV : (run,ran,run,running : Str) -> V = -- irregular verbs with duplication
- }
-```
-Intuitively, the forms correspond to the way regular and irregular words
-are given in a dictionary: by listing relevant forms, instead of
-referring to a paradigm.
-
-
-
-
-=Implementing morphology and syntax=
-
-In this chapter, we will dig deeper into linguistic concepts than
-so far. We will build an implementation of a linguistically motivated
-fragment of English and Italian, covering basic morphology and syntax.
-The result is a miniature of the GF resource library, which will
-be covered in the next chapter. There are two main purposes
-for this chapter:
-- first, to understand the linguistic concepts underlying the resource
- grammar library
-- second, to get practice in the more advanced constructs of concrete syntax
-
-
-However, the reader who is not willing to work on an advanced level
-of concrete syntax may just skim through the introductory parts of
-each section, thus using the chapter in its first purpose only.
-
-
-
-==Worst-case functions and data abstraction==
-
-Some English nouns, such as ``mouse``, are so irregular that
-it makes no sense to see them as instances of a paradigm. Even
-then, it is useful to perform **data abstraction** from the
-definition of the type ``Noun``, and introduce a constructor
-operation, a **worst-case function** for nouns:
-```
- oper mkNoun : Str -> Str -> Noun = \x,y -> {
- s = table {
- Sg => x ;
- Pl => y
- }
- } ;
-```
-Thus we can define
-```
- lin Mouse = mkNoun "mouse" "mice" ;
-```
-and
-```
- oper regNoun : Str -> Noun = \x ->
- mkNoun x (x + "s") ;
-```
-instead of writing the inflection tables explicitly.
-
-The grammar engineering advantage of worst-case functions is that
-the author of the resource module may change the definitions of
-``Noun`` and ``mkNoun``, and still retain the
-interface (i.e. the system of type signatures) that makes it
-correct to use these functions in concrete modules. In programming
-terms, ``Noun`` is then treated as an **abstract datatype**.
-
-
-
-%--!
-==A system of paradigms using predefined string operations==
-
-In addition to the completely regular noun paradigm ``regNoun``,
-some other frequent noun paradigms deserve to be
-defined, for instance,
-```
- sNoun : Str -> Noun = \kiss -> mkNoun kiss (kiss + "es") ;
-```
-What about nouns like //fly//, with the plural //flies//? The already
-available solution is to use the longest common prefix
-//fl// (also known as the **technical stem**) as argument, and define
-```
- yNoun : Str -> Noun = \fl -> mkNoun (fl + "y") (fl + "ies") ;
-```
-But this paradigm would be very unintuitive to use, because the technical stem
-is not an existing form of the word. A better solution is to use
-the lemma and a string operator ``init``, which returns the initial segment (i.e.
-all characters but the last) of a string:
-```
- yNoun : Str -> Noun = \fly -> mkNoun fly (init fly + "ies") ;
-```
-The operation ``init`` belongs to a set of operations in the
-resource module ``Prelude``, which therefore has to be
-``open``ed so that ``init`` can be used.
-```
- > cc init "curry"
- "curr"
-```
-Its dual is ``last``:
-```
- > cc last "curry"
- "y"
-```
-As generalizations of the library functions ``init`` and ``last``, GF has
-two predefined functions:
-``Predef.tk``, which "takes" a prefix by omitting a given number of
-characters from the end, and ``Predef.dp``, which "drops" such a prefix
-and returns the suffix of the given length. For instance,
-```
- > cc Predef.tk 3 "worried"
- "worr"
- > cc Predef.dp 3 "worried"
- "ied"
-```
-The prefix ``Predef`` is given to a handful of functions that could
-not be defined internally in GF. They are available in all modules
-without explicit ``open`` of the module ``Predef``.
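-
-As a sketch of how these functions generalize ``init``, the ``yNoun`` paradigm
-shown earlier could also be written with ``Predef.tk``. The name ``yNoun2`` is
-used here only to keep the two versions apart; ``Noun`` and ``mkNoun`` are the
-definitions given earlier in this chapter.
-```
-  -- Predef.tk 1 omits the last character, just like init
-  yNoun2 : Str -> Noun = \fly -> mkNoun fly (Predef.tk 1 fly + "ies") ;
-```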
-
-
-
-
-
-
-%--!
-==An intelligent noun paradigm using pattern matching==
-
-It may be hard for the user of a resource morphology to pick the right
-inflection paradigm. A way to help this is to define a more intelligent
-paradigm, which chooses the ending by first analysing the lemma.
-The following variant for English regular nouns puts together all the
-previously shown paradigms, and chooses one of them on the basis of
-the final letter of the lemma (found by the prelude operation ``last``).
-```
- regNoun : Str -> Noun = \s -> case last s of {
- "s" | "z" => mkNoun s (s + "es") ;
- "y" => mkNoun s (init s + "ies") ;
- _ => mkNoun s (s + "s")
- } ;
-```
-The paradigm ``regNoun`` does not give the correct forms for
-all nouns. For instance, //mouse - mice// and
-//fish - fish// must be given by using ``mkNoun``.
-Also the word //boy// would be inflected incorrectly; to prevent
-this, either use ``mkNoun`` or modify
-``regNoun`` so that the ``"y"`` case does not
-apply if the second-last character is a vowel.
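-
-One possible way to make this modification is to match on the whole lemma
-with regular expression patterns instead of on its last character only. The
-following is a sketch under that assumption; the module name ``RegNounDemo``
-is invented, and the branches cover only the cases discussed in this section.
-```
-  resource RegNounDemo = {
-
-    param Number = Sg | Pl ;
-
-    oper
-      Noun : Type = {s : Number => Str} ;
-
-      mkNoun : Str -> Str -> Noun = \x,y -> {
-        s = table {Sg => x ; Pl => y}
-      } ;
-
-      -- the vowel + "y" branch must come before the general "y" branch
-      regNoun : Str -> Noun = \s -> case s of {
-        _ + ("s" | "z")                   => mkNoun s (s + "es") ;   -- kiss
-        _ + ("a" | "e" | "o" | "u") + "y" => mkNoun s (s + "s") ;    -- boy
-        x + "y"                           => mkNoun s (x + "ies") ;  -- fly
-        _                                 => mkNoun s (s + "s")      -- wine
-      } ;
-  }
-```
-With this version, ``(regNoun "boy").s ! Pl`` should compute to ``"boys"``,
-while ``(regNoun "fly").s ! Pl`` still gives ``"flies"``.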
-
-**Exercise**. Extend the ``regNoun`` paradigm so that it takes care
-of all variations there are in English. Test it with the nouns
-//ax//, //bamboo//, //boy//, //bush//, //hero//, //match//.
-**Hint**. The library functions ``Predef.dp`` and ``Predef.tk``
-are useful in this task.
-
-**Exercise**. The same rules that form plural nouns in English also
-apply in the formation of third-person singular verbs.
-Write a regular verb paradigm that uses this idea, but first
-rewrite ``regNoun`` so that the analysis needed to build //s//-forms
-is factored out as a separate ``oper``, which is shared with
-``regVerb``.
-
-
-
-
-
-%--!
-==Morphological resource modules==
-
-A common idiom is to
-gather the ``oper`` and ``param`` definitions
-needed for inflecting words in
-a language into a morphology module. Here is a simple
-example, [``MorphoEng`` resource/MorphoEng.gf].
-```
- --# -path=.:prelude
-
- resource MorphoEng = open Prelude in {
-
- param
- Number = Sg | Pl ;
-
- oper
- Noun, Verb : Type = {s : Number => Str} ;
-
- mkNoun : Str -> Str -> Noun = \x,y -> {
- s = table {
- Sg => x ;
- Pl => y
- }
- } ;
-
- regNoun : Str -> Noun = \s -> case last s of {
- "s" | "z" => mkNoun s (s + "es") ;
- "y" => mkNoun s (init s + "ies") ;
- _ => mkNoun s (s + "s")
- } ;
-
- mkVerb : Str -> Str -> Verb = \x,y -> mkNoun y x ;
-
- regVerb : Str -> Verb = \s -> case last s of {
- "s" | "z" => mkVerb s (s + "es") ;
- "y" => mkVerb s (init s + "ies") ;
- "o" => mkVerb s (s + "es") ;
- _ => mkVerb s (s + "s")
- } ;
- }
-```
-The first line gives as a hint to the compiler the
-**search path** needed to find all the other modules that the
-module depends on. The directory ``prelude`` is a subdirectory of
-``GF/lib``; to be able to refer to it in this simple way, you can
-set the environment variable ``GF_LIB_PATH`` to point to this
-directory.
-
-
-
-%--!
-==Morphological analysis and morphology quiz==
-
-Even though morphology is mostly used in GF
-as an auxiliary for syntax, it
-can also be useful in its own right. The command ``morpho_analyse = ma``
-can be used to read a text and return for each word the analyses that
-it has in the current concrete syntax.
-```
- > rf bible.txt | morpho_analyse
-```
-In the same way as translation exercises, morphological exercises can
-be generated by the command ``morpho_quiz = mq``. Usually,
-the category is set to something other than ``S``. For instance,
-```
- > cd GF/lib/resource-1.0/
- > i french/IrregFre.gf
- > morpho_quiz -cat=V
-
- Welcome to GF Morphology Quiz.
- ...
-
- réapparaître : VFin VCondit Pl P2
- réapparaitriez
- > No, not réapparaitriez, but
- réapparaîtriez
- Score 0/1
-```
-Finally, a list of morphological exercises can be generated
-off-line and saved in a
-file for later use, by the command ``morpho_list = ml``
-```
- > morpho_list -number=25 -cat=V | wf exx.txt
-```
-The ``number`` flag gives the number of exercises generated.
-
-
-
-
-
-=Using the resource grammar library=
-
-In this chapter, we will take a look at the GF resource grammar library.
-We will use the library to implement a slightly extended ``Food`` grammar
-and port it to some new languages.
-
-**Exercise**. Define the mini resource of the previous chapter by
-using a functor over the full resource.
-
-
-==The coverage of the library==
-
-The GF Resource Grammar Library contains grammar rules for
-10 languages (in addition, 2 languages are available as incomplete
-implementations, and a few more are under construction). Its purpose
-is to make these rules available for application programmers,
-who can thereby concentrate on the semantic and stylistic
-aspects of their grammars, without having to think about
-grammaticality. The targeted level of application grammarians
-is that of a skilled programmer with
-a practical knowledge of the target languages, but without
-theoretical knowledge about their grammars.
-Such a combination of
-skills is typical of programmers who, for instance, want to localize
-software to new languages.
-
-The current resource languages are
-- ``Ara``bic (incomplete)
-- ``Cat``alan (incomplete)
-- ``Dan``ish
-- ``Eng``lish
-- ``Fin``nish
-- ``Fre``nch
-- ``Ger``man
-- ``Ita``lian
-- ``Nor``wegian
-- ``Rus``sian
-- ``Spa``nish
-- ``Swe``dish
-
-
-The first three letters (``Eng`` etc) are used in grammar module names.
-The incomplete Arabic and Catalan implementations are
-enough to be used in many applications; they both contain, among other
-things, complete inflectional morphology.
-
-
-==The resource API==
-
-The resource library API is divided into language-specific
-and language-independent parts. To put it roughly,
-- the syntax API is language-independent, i.e. has the same types and functions for all
- languages.
- Its name is ``Syntax``//L// for each language //L//
-- the morphology API is language-specific, i.e. has partly different types and functions
- for different languages.
- Its name is ``Paradigms``//L// for each language //L//
-
-
-A full documentation of the API is available on-line in the
-[resource synopsis ../../lib/resource-1.0/synopsis.html]. For our
-examples, we will only need a fragment of the full API.
-
-In the first examples,
-we will make use of the following categories, from the module ``Syntax``.
-
-|| Category | Explanation | Example ||
-| ``Utt`` | sentence, question, word... | "be quiet" |
-| ``Adv`` | verb-phrase-modifying adverb | "in the house" |
-| ``AdA`` | adjective-modifying adverb | "very" |
-| ``S`` | declarative sentence | "she lived here" |
-| ``Cl`` | declarative clause, with all tenses | "she looks at this" |
-| ``AP`` | adjectival phrase | "very warm" |
-| ``CN`` | common noun (without determiner) | "red house" |
-| ``NP`` | noun phrase (subject or object) | "the red house" |
-| ``Det`` | determiner phrase | "those seven" |
-| ``Predet`` | predeterminer | "only" |
-| ``Quant`` | quantifier with both sg and pl | "this/these" |
-| ``Prep`` | preposition, or just case | "in" |
-| ``A`` | one-place adjective | "warm" |
-| ``N`` | common noun | "house" |
-
-
-We will need the following syntax rules from ``Syntax``.
-
-|| Function | Type | Example ||
-| ``mkUtt`` | ``S -> Utt`` | //John walked// |
-| ``mkUtt`` | ``Cl -> Utt`` | //John walks// |
-| ``mkCl`` | ``NP -> AP -> Cl`` | //John is very old// |
-| ``mkNP`` | ``Det -> CN -> NP`` | //the first old man// |
-| ``mkNP`` | ``Predet -> NP -> NP`` | //only John// |
-| ``mkDet`` | ``Quant -> Det`` | //this// |
-| ``mkCN`` | ``N -> CN`` | //house// |
-| ``mkCN`` | ``AP -> CN -> CN`` | //very big blue house// |
-| ``mkAP`` | ``A -> AP`` | //old// |
-| ``mkAP`` | ``AdA -> AP -> AP`` | //very very old// |
-
-We will also need the following structural words from ``Syntax``.
-
-|| Function | Type | Example ||
-| ``all_Predet`` | ``Predet`` | //all// |
-| ``defPlDet`` | ``Det`` | //the (houses)// |
-| ``this_Quant`` | ``Quant`` | //this// |
-| ``very_AdA`` | ``AdA`` | //very// |
-
-
-For French, we will use the following part of ``ParadigmsFre``.
-
-|| Function | Type ||
-| ``Gender`` | ``Type`` |
-| ``masculine`` | ``Gender`` |
-| ``feminine`` | ``Gender`` |
-| ``mkN`` | ``(cheval : Str) -> N`` |
-| ``mkN`` | ``(foie : Str) -> Gender -> N`` |
-| ``mkA`` | ``(cher : Str) -> A`` |
-| ``mkA`` | ``(sec,seche : Str) -> A`` |
-
-
-For German, we will use the following part of ``ParadigmsGer``.
-
-|| Function | Type ||
-| ``Gender`` | ``Type`` |
-| ``masculine`` | ``Gender`` |
-| ``feminine`` | ``Gender`` |
-| ``neuter`` | ``Gender`` |
-| ``mkN`` | ``(Stufe : Str) -> N`` |
-| ``mkN`` | ``(Bild,Bilder : Str) -> Gender -> N`` |
-| ``mkA`` | ``(klein : Str) -> A`` |
-| ``mkA`` | ``(gut,besser,beste : Str) -> A`` |
-
-
-**Exercise**. Try out the morphological paradigms in different languages.
-Proceed as follows:
-```
- > i -path=alltenses:prelude -retain alltenses/ParadigmsGer.gfr
- > cc mkN "Farbe"
- > cc mkA "gut" "besser" "beste"
-```
-
-
-==Example: French==
-
-We start with an abstract syntax that is like ``Food`` before, but
-has a plural determiner (//all wines//) and some new nouns that will
-need different genders in most languages.
-```
- abstract Food = {
- cat
- S ; Item ; Kind ; Quality ;
- fun
- Is : Item -> Quality -> S ;
- This, All : Kind -> Item ;
- QKind : Quality -> Kind -> Kind ;
- Wine, Cheese, Fish, Beer, Pizza : Kind ;
- Very : Quality -> Quality ;
- Fresh, Warm, Italian, Expensive, Delicious, Boring : Quality ;
- }
-```
-The French implementation opens ``SyntaxFre`` and ``ParadigmsFre``
-to get access to the resource libraries needed. In order to find
-the libraries, a ``path`` directive is prepended; it is interpreted
-relative to the environment variable ``GF_LIB_PATH``.
-```
- --# -path=.:present:prelude
-
- concrete FoodFre of Food = open SyntaxFre,ParadigmsFre in {
- lincat
- S = Utt ;
- Item = NP ;
- Kind = CN ;
- Quality = AP ;
- lin
- Is item quality = mkUtt (mkCl item quality) ;
- This kind = mkNP (mkDet this_Quant) kind ;
- All kind = mkNP all_Predet (mkNP defPlDet kind) ;
- QKind quality kind = mkCN quality kind ;
- Wine = mkCN (mkN "vin") ;
- Beer = mkCN (mkN "bière") ;
- Pizza = mkCN (mkN "pizza" feminine) ;
- Cheese = mkCN (mkN "fromage" masculine) ;
- Fish = mkCN (mkN "poisson") ;
- Very quality = mkAP very_AdA quality ;
- Fresh = mkAP (mkA "frais" "fraîche") ;
- Warm = mkAP (mkA "chaud") ;
- Italian = mkAP (mkA "italien") ;
- Expensive = mkAP (mkA "cher") ;
- Delicious = mkAP (mkA "délicieux") ;
- Boring = mkAP (mkA "ennuyeux") ;
- }
-```
-The ``lincat`` definitions in ``FoodFre`` assign **resource categories**
-to **application categories**. In a sense, the application categories
-are **semantic**, as they correspond to concepts in the grammar application,
-whereas the resource categories are **syntactic**: they give the linguistic
-means to express concepts in any application.
-
-The ``lin`` definitions likewise assign resource functions to application
-functions. Under the hood, there is a lot of matching with parameters to
-take care of word order, inflection, and agreement. But the user of the
-library sees nothing of this: the only parameters you need to give are
-the genders of some nouns, which cannot be correctly inferred from the word.
-
-In French, for example, the one-argument ``mkN`` assigns the noun the feminine
-gender if and only if it ends with an //e//. Therefore the words //fromage// and
-//pizza// are given genders manually.
-One can of course always give genders manually, to be on the safe side.
-
-As for inflection, the one-argument adjective pattern ``mkA`` takes care of
-completely regular adjectives such as //chaud-chaude//, but also of special
-cases such as //italien-italienne//, //cher-chère//, and //délicieux-délicieuse//.
-But it cannot form //frais-fraîche// properly. Once again, you can give more
-forms to be on the safe side. You can also test the paradigms in the GF
-system.
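-
-As a minimal sketch of "giving more forms to be on the safe side", the
-following alternative ``lin`` rules for ``FoodFre`` use only the two-argument
-``mkN`` and ``mkA`` listed in the ``ParadigmsFre`` table above. The genders and
-feminine forms are spelled out by hand here; these rules are not part of
-the original grammar.
-```
- Wine = mkCN (mkN "vin" masculine) ;       -- gender given explicitly
- Beer = mkCN (mkN "bière" feminine) ;
- Warm = mkAP (mkA "chaud" "chaude") ;      -- masculine and feminine forms
- Expensive = mkAP (mkA "cher" "chère") ;
-```
-These rules should yield the same forms as the one-argument versions,
-but they no longer depend on what the paradigms infer from the word.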
-
-**Exercise**. Compile the grammar ``FoodFre`` and generate and parse some sentences.
-
-**Exercise**. Write a concrete syntax of ``Food`` for English or some other language
-included in the resource library. You can also compare the output with the hand-written
-grammars presented earlier in this tutorial.
-
-**Exercise**. In particular, try to write a concrete syntax for Italian, even if
-you don't know Italian. What you need to know is that "beer" is //birra// and
-"pizza" is //pizza//, and that all the nouns and adjectives in the grammar
-are regular.
-
-
-
-==Functor implementation of multilingual grammars==
-
-If you did the exercise of writing a concrete syntax of ``Food`` for some other
-language, you probably noticed that much of the code looks exactly the same
-as for French. The immediate reason for this is that the ``Syntax`` API is the
-same for all languages; the deeper reason is that all languages (at least those
-in the resource package) implement the same syntactic structures and tend to use them
-in similar ways. Thus it is only the lexical parts of a concrete syntax that
-you need to write anew for a new language. In brief,
-- first copy the concrete syntax for one language
-- then change the words (the strings and perhaps some paradigms)
-
-
-But programming by copy-and-paste is not worthy of a functional programmer.
-Can we write a function that takes care of the shared parts of grammar modules?
-Yes, we can. It is not a function in the ``fun`` or ``oper`` sense, but
-a function operating on modules, called a **functor**. This construct
-is familiar from the functional languages ML and OCaml, but it does not
-exist in Haskell. It also bears some resemblance to templates in C++.
-Functors are also known as **parametrized modules**.
-
-In GF, a functor is a module that ``open``s one or more **interfaces**.
-An ``interface`` is a module similar to a ``resource``, but it only
-contains the types of ``oper``s, not their definitions. You can think
-of an interface as a kind of a record type. Thus a functor is a kind
-of a function taking records as arguments and producing a module
-as value.
-
-Let us look at a functor implementation of the ``Food`` grammar.
-Consider its module header first:
-```
- incomplete concrete FoodI of Food = open Syntax, LexFood in
-```
-In the functor-function analogy, ``FoodI`` would be presented as a function
-with the following type signature:
-```
- FoodI : instance of Syntax -> instance of LexFood -> concrete of Food
-```
-It takes as arguments two interfaces:
-- ``Syntax``, the resource grammar interface
-- ``LexFood``, the domain-specific lexicon interface
-
-
-Functors opening ``Syntax`` and a domain lexicon interface are in fact
-so typical in GF applications, that this structure could be called
-a **design pattern**
-for GF grammars. The idea in this pattern is, again, that
-the languages use the same syntactic structures but different words.
-
-Before going to the details of the module bodies, let us look at how functors
-are concretely used. An interface has a header such as
-```
- interface LexFood = open Syntax in
-```
-To give an ``instance`` of it means that all ``oper``s are given definitions (of
-appropriate types). For example,
-```
- instance LexFoodGer of LexFood = open SyntaxGer, ParadigmsGer in
-```
-Notice that when an interface opens an interface, such as ``Syntax``,
-then its instance
-opens an instance of it. But the instance may also open some other
-resources - typically,
-a domain lexicon instance opens a ``Paradigms`` module.
-
-In the functor-function analogy, we now have
-```
- SyntaxGer : instance of Syntax
- LexFoodGer : instance of LexFood
-```
-Thus we can complete the German implementation by "applying" the functor:
-```
- FoodI SyntaxGer LexFoodGer : concrete of Food
-```
-The GF syntax for doing so is
-```
- concrete FoodGer of Food = FoodI with
- (Syntax = SyntaxGer),
- (LexFood = LexFoodGer) ;
-```
-Notice that this is the //complete// module, not just a header of it.
-The module body is received from ``FoodI``, by instantiating the
-interface constants with their definitions given in the German
-instances.
-
-A module of this form, characterized by the keyword ``with``, is
-called a **functor instantiation**.
-
-Here is the complete code for the functor ``FoodI``:
-```
- incomplete concrete FoodI of Food = open Syntax, LexFood in {
- lincat
- S = Utt ;
- Item = NP ;
- Kind = CN ;
- Quality = AP ;
- lin
- Is item quality = mkUtt (mkCl item quality) ;
- This kind = mkNP (mkDet this_Quant) kind ;
- All kind = mkNP all_Predet (mkNP defPlDet kind) ;
- QKind quality kind = mkCN quality kind ;
- Wine = mkCN wine_N ;
- Beer = mkCN beer_N ;
- Pizza = mkCN pizza_N ;
- Cheese = mkCN cheese_N ;
- Fish = mkCN fish_N ;
- Very quality = mkAP very_AdA quality ;
- Fresh = mkAP fresh_A ;
- Warm = mkAP warm_A ;
- Italian = mkAP italian_A ;
- Expensive = mkAP expensive_A ;
- Delicious = mkAP delicious_A ;
- Boring = mkAP boring_A ;
-}
-```
-
-
-==Interfaces and instances==
-
-Let us now define the ``LexFood`` interface:
-```
- interface LexFood = open Syntax in {
- oper
- wine_N : N ;
- beer_N : N ;
- pizza_N : N ;
- cheese_N : N ;
- fish_N : N ;
- fresh_A : A ;
- warm_A : A ;
- italian_A : A ;
- expensive_A : A ;
- delicious_A : A ;
- boring_A : A ;
-}
-```
-In this interface, only lexical items are declared. In general, an
-interface can declare any functions and also types. The ``Syntax``
-interface does so.
-
-Here is the German instance of the interface:
-```
- instance LexFoodGer of LexFood = open SyntaxGer, ParadigmsGer in {
- oper
- wine_N = mkN "Wein" ;
- beer_N = mkN "Bier" "Biere" neuter ;
- pizza_N = mkN "Pizza" "Pizzen" feminine ;
- cheese_N = mkN "Käse" "Käsen" masculine ;
- fish_N = mkN "Fisch" ;
- fresh_A = mkA "frisch" ;
- warm_A = mkA "warm" "wärmer" "wärmste" ;
- italian_A = mkA "italienisch" ;
- expensive_A = mkA "teuer" ;
- delicious_A = mkA "köstlich" ;
- boring_A = mkA "langweilig" ;
- }
-```
-Just to complete the picture, we repeat the German functor instantiation
-for ``FoodI``, this time with a path directive that makes it compilable.
-```
- --# -path=.:present:prelude
-
- concrete FoodGer of Food = FoodI with
- (Syntax = SyntaxGer),
- (LexFood = LexFoodGer) ;
-```
-
-
-**Exercise**. Compile and test ``FoodGer``.
-
-**Exercise**. Refactor ``FoodFre`` into a functor instantiation.
-
-
-
-==Adding languages to a functor implementation==
-
-Once we have an application grammar defined by using a functor,
-adding a new language is simple. Just two modules need to be written:
-- a domain lexicon instance
-- a functor instantiation
-
-
-The functor instantiation is completely mechanical to write.
-Here is one for Finnish:
-```
---# -path=.:present:prelude
-
-concrete FoodFin of Food = FoodI with
- (Syntax = SyntaxFin),
- (LexFood = LexFoodFin) ;
-```
-The domain lexicon instance requires some knowledge of the words of the
-language: what words are used for which concepts, how the words are
-inflected, plus features such as genders. Here is a lexicon instance for
-Finnish:
-```
- instance LexFoodFin of LexFood = open SyntaxFin, ParadigmsFin in {
- oper
- wine_N = mkN "viini" ;
- beer_N = mkN "olut" ;
- pizza_N = mkN "pizza" ;
- cheese_N = mkN "juusto" ;
- fish_N = mkN "kala" ;
- fresh_A = mkA "tuore" ;
- warm_A = mkA "lämmin" ;
- italian_A = mkA "italialainen" ;
- expensive_A = mkA "kallis" ;
- delicious_A = mkA "herkullinen" ;
- boring_A = mkA "tylsä" ;
- }
-```
-
-**Exercise**. Instantiate the functor ``FoodI`` to some language of
-your choice.
-
-
-==Division of labour revisited==
-
-One purpose with the resource grammars was stated to be a division
-of labour between linguists and application grammarians. We can now
-reflect on what this means more precisely, by asking ourselves what
-skills are required of grammarians working on different components.
-
-Building a GF application starts from the abstract syntax. Writing
-an abstract syntax requires
-- understanding the semantic structure of the application domain
-- knowledge of the GF fragment with categories and functions
-
-
-If the concrete syntax is written by means of a functor, the programmer
-has to decide what parts of the implementation are put to the interface
-and what parts are shared in the functor. This requires
-- knowing how the domain concepts are expressed in natural language
-- knowledge of the resource grammar library - the categories and combinators
-- understanding what parts are likely to be expressed in language-dependent
- ways, so that they must belong to the interface and not the functor
-- knowledge of the GF fragment with function applications and strings
-
-
-Instantiating a ready-made functor to a new language is less demanding.
-It requires essentially
-- knowing how the domain words are expressed in the language
-- knowing, roughly, how these words are inflected
-- knowledge of the paradigms available in the library
-- knowledge of the GF fragment with function applications and strings
-
-
-Notice that none of these tasks requires the use of GF records, tables,
-or parameters. Thus only a small fragment of GF is needed; the rest of
-GF is only relevant for those who write the libraries.
-
-Of course, grammar writing is not always straightforward usage of libraries.
-For example, GF can be used for other languages than just those in the
-libraries - for both natural and formal languages. A knowledge of records
-and tables can, unfortunately, also be needed for understanding GF's error
-messages.
-
-**Exercise**. Design a small grammar that can be used for controlling
-an MP3 player. The grammar should be able to recognize commands such
-as //play this song//, with the following variations:
-- verbs: //play//, //remove//
-- objects: //song//, //artist//
-- determiners: //this//, //the previous//
-- verbs without arguments: //stop//, //pause//
-
-
-The implementation proceeds in the following phases:
-+ abstract syntax
-+ functor and lexicon interface
-+ lexicon instance for the first language
-+ functor instantiation for the first language
-+ lexicon instance for the second language
-+ functor instantiation for the second language
-+ ...
-
-
-
-==Restricted inheritance==
-
-A functor implementation using the resource ``Syntax`` interface
-works as long as all concepts are expressed by using the same structures
-in all languages. If this is not the case, the deviant linearization can
-be made into a parameter and moved to the domain lexicon interface.
-
-Let us take a slightly contrived example: assume that English has
-no word for ``Pizza``, but has to use the paraphrase //Italian pie//.
-This paraphrase is no longer a noun ``N``, but a complex phrase
-in the category ``CN``. An obvious way to solve this problem is
-to change the interface ``LexFood`` so that the constant declared for
-``Pizza`` gets a new type:
-```
- oper pizza_CN : CN ;
-```
-But this solution is unstable: we may end up changing the interface
-and the functor with each new language, and we must every time also
-change the interface instances for the old languages to maintain
-type correctness.
-
-A better solution is to use **restricted inheritance**: the English
-instantiation inherits the functor implementation except for the
-constant ``Pizza``. It is written as follows:
-```
- --# -path=.:present:prelude
-
- concrete FoodEng of Food = FoodI - [Pizza] with
- (Syntax = SyntaxEng),
- (LexFood = LexFoodEng) **
- open SyntaxEng, ParadigmsEng in {
-
- lin Pizza = mkCN (mkA "Italian") (mkN "pie") ;
- }
-```
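-The instance ``LexFoodEng`` used above is not shown in this section. A
-minimal sketch of it could look as follows, assuming that ``ParadigmsEng``
-provides a one-argument ``mkN`` and ``mkA`` and a two-argument ``mkN`` for
-irregular plurals, in analogy with the French and German paradigms. Since
-the interface ``LexFood`` declares ``pizza_N``, the instance presumably still
-has to define it, even though ``FoodEng`` no longer uses it.
-```
- instance LexFoodEng of LexFood = open SyntaxEng, ParadigmsEng in {
-   oper
-     wine_N = mkN "wine" ;
-     beer_N = mkN "beer" ;
-     pizza_N = mkN "pizza" ;          -- required by the interface, unused in FoodEng
-     cheese_N = mkN "cheese" ;
-     fish_N = mkN "fish" "fish" ;     -- singular and plural given explicitly
-     fresh_A = mkA "fresh" ;
-     warm_A = mkA "warm" ;
-     italian_A = mkA "Italian" ;
-     expensive_A = mkA "expensive" ;
-     delicious_A = mkA "delicious" ;
-     boring_A = mkA "boring" ;
- }
-```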
-Restricted inheritance is available for all inherited modules. One can for
-instance exclude some mushrooms and pick up just some fruit in
-the ``Foodmarket`` example:
-```
- abstract Foodmarket = Food, Fruit [Peach], Mushroom - [Agaric]
-```
-A concrete syntax of ``Foodmarket`` must then indicate the same inheritance
-restrictions.
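-
-For example, the header of an English concrete syntax could be written as
-follows. This is only a sketch: the module names ``FruitEng`` and
-``MushroomEng`` are assumed to be the concrete syntaxes of ``Fruit`` and
-``Mushroom``.
-```
- concrete FoodmarketEng of Foodmarket =
-   FoodEng, FruitEng [Peach], MushroomEng - [Agaric]
-```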
-
-
-**Exercise**. Change ``FoodGer`` in such a way that it says, instead of
-//X is Y//, the equivalent of //X must be Y// (//X muss Y sein//).
-You will have to browse the full resource API to find all
-the functions needed.
-
-
-==Browsing the resource with GF commands==
-
-In addition to reading the
-[resource synopsis ../../lib/resource-1.0/synopsis.html], you
-can find resource function combinations by using the parser. This
-is so because the resource library is in the end implemented as
-a top-level ``abstract-concrete`` grammar, on which parsing
-and linearization work.
-
-Unfortunately, only English and the Scandinavian languages can be
-parsed within acceptable computer resource limits when the full
-resource is used.
-
-To look for a syntax tree in the overload API by parsing, proceed as follows:
-```
- > $GF_LIB_PATH
- > i -path=alltenses:prelude alltenses/OverLangEng.gfc
- > p -cat=S -overload "this grammar is too big"
- mkS (mkCl (mkNP (mkDet this_Quant) grammar_N) (mkAP too_AdA big_A))
-```
-To view linearizations in all languages by parsing from English:
-```
- > i alltenses/langs.gfcm
- > p -cat=S -lang=LangEng "this grammar is too big" | tb
- UseCl TPres ASimul PPos (PredVP (DetCN (DetSg (SgQuant this_Quant)
- NoOrd) (UseN grammar_N)) (UseComp (CompAP (AdAP too_AdA (PositA big_A)))))
- Den här grammatiken är för stor
- Esta gramática es demasiado grande
- (Cyrillic: eta grammatika govorit des'at' jazykov)
- Denne grammatikken er for stor
- Questa grammatica è troppo grande
- Diese Grammatik ist zu groß
- Cette grammaire est trop grande
- Tämä kielioppi on liian suuri
- This grammar is too big
- Denne grammatik er for stor
-```
-Unfortunately, the Russian grammar uses at the moment a different
-character encoding than the rest and is therefore not displayed correctly
-in a terminal window. However, the GF syntax editor does display all
-examples correctly:
-```
- % gfeditor alltenses/langs.gfcm
-```
-When you have constructed the tree, you will see the following screen:
-
-#BCEN
-
- [../../lib/resource-1.0/doc/10lang-small.png]
-
-#ECEN
-
-
-**Exercise**. Find the resource grammar translations for the following
-English phrases (parse in the category ``Phr``). You can first try to
-build the terms manually.
-
-//every man loves a woman//
-
-//this grammar speaks more than ten languages//
-
-//which languages aren't in the grammar//
-
-//which languages did you want to speak//
-
-
-
-=Refining semantics in abstract syntax=
-
-==GF as a logical framework==
-
-In this section, we will show how
-to encode advanced semantic concepts in an abstract syntax.
-We use concepts inherited from **type theory**. Type theory
-is the basis of many systems known as **logical frameworks**, which are
-used for representing mathematical theorems and their proofs on a computer.
-In fact, GF has a logical framework as its proper part:
-this part is the abstract syntax.
-
-In a logical framework, the formalization of a mathematical theory
-is a set of type and function declarations. The following is an example
-of such a theory, represented as an ``abstract`` module in GF.
-```
-abstract Arithm = {
- cat
- Prop ; -- proposition
- Nat ; -- natural number
- fun
- Zero : Nat ; -- 0
- Succ : Nat -> Nat ; -- successor of x
- Even : Nat -> Prop ; -- x is even
- And : Prop -> Prop -> Prop ; -- A and B
- }
-```
-
-**Exercise**. Give a concrete syntax of ``Arithm``, either from scratch or
-by using the resource library.
-
-
-
-
-==Dependent types==
-
-**Dependent types** are a characteristic feature of GF,
-inherited from the **constructive type theory** of Martin-Löf and
-distinguishing GF from most other grammar formalisms and
-functional programming languages.
-
-Dependent types can be used for stating stronger
-**conditions of well-formedness** than ordinary types.
-A simple example is a "smart house" system, which
-defines voice commands for household appliances. This example
-is borrowed from the
-[Regulus Book http://cslipublications.stanford.edu/site/1575865262.html]
-(Rayner & al. 2006).
-
-One who enters a smart house can use speech to dim lights, switch
-on the fan, etc. For each ``Kind`` of a device, there is a set of
-``Actions`` that can be performed on it; thus one can dim the lights but
-not the fan, for example. These dependencies can be expressed
-by making the type ``Action`` dependent on ``Kind``. We express this
-as follows in ``cat`` declarations:
-```
- cat
- Command ;
- Kind ;
- Action Kind ;
- Device Kind ;
-```
-The crucial use of the dependencies is made in the rule for forming commands:
-```
- fun CAction : (k : Kind) -> Action k -> Device k -> Command ;
-```
-In other words: an action and a device can be combined into a command only
-if they are of the same ``Kind`` ``k``. If we have the functions
-```
- DKindOne : (k : Kind) -> Device k ; -- the light
-
- light, fan : Kind ;
- dim : Action light ;
-```
-we can form the syntax tree
-```
- CAction light dim (DKindOne light)
-```
-but we cannot form the trees
-```
- CAction light dim (DKindOne fan)
- CAction fan dim (DKindOne light)
- CAction fan dim (DKindOne fan)
-```
-Linearization rules are written as usual: the concrete syntax does not
-know if a category is a dependent type. In English, you can write as follows:
-```
- lincat Action = {s : Str} ;
- lin CAction kind act dev = {s = act.s ++ dev.s} ;
-```
-Notice that the argument ``kind`` does not appear in the linearization.
-The type checker will be able to reconstruct it from the ``dev`` argument.
-
-Parsing with dependent types is performed in two phases:
-+ context-free parsing
-+ filtering through type checker
-
-
-If you just parse in the usual way, you don't enter the second phase, and
-the ``kind`` argument is not found:
-```
- > parse "dim the light"
- CAction ? dim (DKindOne light)
-```
-Moreover, type-incorrect commands are not rejected:
-```
- > parse "dim the fan"
- CAction ? dim (DKindOne fan)
-```
-The question mark ``?`` is a **metavariable**, and is returned by the parser
-for any subtree that is suppressed by a linearization rule.
-
-To get rid of metavariables, you must feed the parse result into the
-second phase of **solving** them. The ``solve`` process uses the dependent
-type checker to restore the values of the metavariables. It is invoked by
-the command ``put_tree = pt`` with the flag ``-transform=solve``:
-```
- > parse "dim the light" | put_tree -transform=solve
- CAction light dim (DKindOne light)
-```
-The ``solve`` process may fail, in which case no tree is returned:
-```
- > parse "dim the fan" | put_tree -transform=solve
- no tree found
-```
-
-
-**Exercise**. Write an abstract syntax module with the above contents
-and an appropriate English concrete syntax. Try to parse the commands
-//dim the light// and //dim the fan//, with and without ``solve`` filtering.
-
-
-**Exercise**. Perform random and exhaustive generation, with and without
-``solve`` filtering.
-
-**Exercise**. Add some device kinds and actions to the grammar.
-
-
-==Polymorphism==
-
-Sometimes an action can be performed on all kinds of devices. It would be
-possible to introduce separate ``fun`` constants for each kind-action pair,
-but this would be tedious. Instead, one can use **polymorphic** actions,
-i.e. actions that take a ``Kind`` as an argument and produce an ``Action``
-for that ``Kind``:
-```
- fun switchOn, switchOff : (k : Kind) -> Action k ;
-```
-Functions that are not polymorphic are **monomorphic**. However, the
-dichotomy into monomorphism and full polymorphism is not always sufficient
-for good semantic modelling: very typically, some actions are defined
-for a proper subset of devices, but not just one. For instance, both doors and
-windows can be opened, whereas lights cannot.
-We will return to this problem by introducing the
-concept of **restricted polymorphism** later,
-after the section on proof objects.
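-
-As a small illustration, a polymorphic action is applied to the kind of
-the device it acts on. The trees below use only the functions declared so
-far; the linearization rules are a sketch that assumes the ``lincat`` of
-``Action`` given earlier, ``{s : Str}``.
-```
- CAction light (switchOn light) (DKindOne light)
- CAction fan (switchOff fan) (DKindOne fan)
-
- lin switchOn kind = {s = "switch" ++ "on"} ;
- lin switchOff kind = {s = "switch" ++ "off"} ;
-```
-As with ``CAction``, the ``kind`` argument does not appear in the
-linearization, so the parser returns a metavariable in its place.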
-
-
-
-==Dependent types and spoken language models==
-
-We have used dependent types to control semantic well-formedness
-in grammars. This is important in traditional type theory
-applications such as proof assistants, where only mathematically
-meaningful formulas should be constructed. But semantic filtering has
-also proved important in speech recognition, because it reduces the
-ambiguity of the results.
-
-
-===Grammar-based language models===
-
-The standard way of using GF in speech recognition is by building
-**grammar-based language models**. To this end, GF comes with compilers
-into several formats that are used in speech recognition systems.
-One such format is GSL, used in the [Nuance speech recognizer www.nuance.com].
-It is produced from GF simply by printing a grammar with the flag
-``-printer=gsl``.
-```
- > import -conversion=finite SmartEng.gf
- > print_grammar -printer=gsl
-
- ;GSL2.0
- ; Nuance speech recognition grammar for SmartEng
- ; Generated by GF
-
- .MAIN SmartEng_2
-
- SmartEng_0 [("switch" "off") ("switch" "on")]
- SmartEng_1 ["dim" ("switch" "off")
- ("switch" "on")]
- SmartEng_2 [(SmartEng_0 SmartEng_3)
- (SmartEng_1 SmartEng_4)]
- SmartEng_3 ("the" SmartEng_5)
- SmartEng_4 ("the" SmartEng_6)
- SmartEng_5 "fan"
- SmartEng_6 "light"
-```
-Now, GSL is a context-free format, so how does it cope with dependent types?
-In general, dependent types can give rise to infinitely many basic types
-(exercise!), whereas a context-free grammar can by definition only have
-finitely many nonterminals.
-
-This is where the flag ``-conversion=finite`` is needed in the ``import``
-command. Its effect is to convert a GF grammar with dependent types to
-one without, so that each instance of a dependent type is replaced by
-an atomic type. This can then be used as a nonterminal in a context-free
-grammar. The ``finite`` conversion presupposes that every
-dependent type has only finitely many instances, which is in fact
-the case in the ``Smart`` grammar.
-
-
-**Exercise**. If you have access to the Nuance speech recognizer,
-test it with GF-generated language models for ``SmartEng``. Do this
-both with and without ``-conversion=finite``.
-
-**Exercise**. Construct an abstract syntax with infinitely many instances
-of dependent types.
-
-
-===Statistical language models===
-
-An alternative to grammar-based language models is provided by
-**statistical language models** (**SLM**s). An SLM is
-built from a **corpus**, i.e. a set of utterances. It specifies the
-probability of each **n-gram**, i.e. sequence of //n// words. The
-typical value of //n// is 2 (bigrams) or 3 (trigrams).
-
-One advantage of SLMs over grammar-based models is that they are
-**robust**, i.e. they can be used to recognize sequences that would
-be out of the grammar or the corpus. Another advantage is that
-an SLM can be built "for free" if a corpus is available.
-
-However, collecting a corpus can require a lot of work, and writing
-a grammar can be less demanding, especially with tools such as GF or
-Regulus. This advantage of grammars can be combined with robustness
-by creating a back-up SLM from a **synthesized corpus**. This means
-simply that the grammar is used for generating such a corpus.
-In GF, this can be done with the ``generate_trees`` command.
-As with grammar-based models, the quality of the SLM is better
-if meaningless utterances are excluded from the corpus. Thus
-a good way to generate an SLM from a GF grammar is by using
-dependent types and filtering the results through the type checker:
-```
- > generate_trees | put_tree -transform=solve | linearize
-```
-
-
-**Exercise**. Measure the size of the corpus generated from
-``SmartEng``, with and without type checker filtering.
-
-
-
-==Digression: dependent types in concrete syntax==
-
-===Variables in function types===
-
-A dependent function type needs to introduce a variable for
-its argument type, as in
-```
- switchOff : (k : Kind) -> Action k
-```
-Function types //without//
-variables are actually a shorthand notation: writing
-```
- fun PredVP : NP -> VP -> S
-```
-is shorthand for
-```
- fun PredVP : (x : NP) -> (y : VP) -> S
-```
-or any other naming of the variables. Actually the use of variables
-sometimes shortens the code, since they can share a type:
-```
- octuple : (x,y,z,u,v,w,s,t : Str) -> Str
-```
-If a bound variable is not used, it can here, as elsewhere in GF, be replaced by
-a wildcard:
-```
- octuple : (_,_,_,_,_,_,_,_ : Str) -> Str
-```
-A good practice for functions with many arguments of the same type
-is to indicate the number of arguments:
-```
- octuple : (x1,_,_,_,_,_,_,x8 : Str) -> Str
-```
-One can also use the variables to document what each argument is expected
-to provide, as is done in inflection paradigms in the resource grammar.
-```
- mkV : (drink,drank,drunk : Str) -> V
-```
-
-
-===Polymorphism in concrete syntax===
-
-The **functional fragment** of GF
-terms and types comprises function types, applications, lambda
-abstracts, constants, and variables. This fragment is similar in
-abstract and concrete syntax. In particular,
-dependent types are also available in concrete syntax.
-We have not made use of them yet,
-but we will now look at one example of how they
-can be used.
-
-Those readers who are familiar with functional programming languages
-like ML and Haskell may already have missed **polymorphic**
-functions. For instance, Haskell programmers have access to
-the functions
-```
- const :: a -> b -> a
- const c _ = c
-
- flip :: (a -> b -> c) -> b -> a -> c
- flip f y x = f x y
-```
-which can be used for any given types ``a``,``b``, and ``c``.
-
-The GF counterparts of polymorphic functions are **monomorphic**
-functions with explicit **type variables**. Thus the above
-definitions can be written
-```
- oper const :(a,b : Type) -> a -> b -> a =
- \_,_,c,_ -> c ;
-
- oper flip : (a,b,c : Type) -> (a -> b ->c) -> b -> a -> c =
- \_,_,_,f,x,y -> f y x ;
-```
-When the operations are used, the type checker requires
-them to be equipped with all their arguments; this may be a nuisance
-for a Haskell or ML programmer.
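-
-As a sketch of what this means in practice, here are two applications of
-the operations defined above, with every type argument written out. The
-comments state the values that follow from the definitions.
-```
- const Str Str "a" "b"                        -- "a"
- flip Str Str Str (\x,y -> x ++ y) "a" "b"    -- "b" ++ "a"
-```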
-
-
-
-==Proof objects==
-
-Perhaps the most well-known idea in constructive type theory is
-the **Curry-Howard isomorphism**, also known as the
-**propositions as types principle**. Its earliest formulations
-were attempts to give semantics to the logical systems of
-propositional and predicate calculus. In this section, we will consider
-a more elementary example, showing how the notion of proof is useful
-outside mathematics, as well.
-
-We first define the category of unary (also known as Peano-style)
-natural numbers:
-```
- cat Nat ;
- fun Zero : Nat ;
- fun Succ : Nat -> Nat ;
-```
-The **successor function** ``Succ`` generates an infinite
-sequence of natural numbers, beginning from ``Zero``.
-
-We then define what it means for a number //x// to be //less than//
-a number //y//. Our definition is based on two axioms:
-- ``Zero`` is less than ``Succ`` //y// for any //y//.
-- If //x// is less than //y//, then ``Succ`` //x// is less than ``Succ`` //y//.
-
-
-The most straightforward way of expressing these axioms in type theory
-is as typing judgements that introduce objects of a type ``Less`` //x y//:
-```
- cat Less Nat Nat ;
- fun lessZ : (y : Nat) -> Less Zero (Succ y) ;
- fun lessS : (x,y : Nat) -> Less x y -> Less (Succ x) (Succ y) ;
-```
-Objects formed by ``lessZ`` and ``lessS`` are
-called **proof objects**: they establish the truth of certain
-mathematical propositions.
-For instance, the fact that 2 is less than
-4 has the proof object
-```
- lessS (Succ Zero) (Succ (Succ (Succ Zero)))
- (lessS Zero (Succ (Succ Zero)) (lessZ (Succ Zero)))
-```
-whose type is
-```
- Less (Succ (Succ Zero)) (Succ (Succ (Succ (Succ Zero))))
-```
-which is the formalization of the proposition that 2 is less than 4.
-
-GF grammars can be used to provide a **semantic control** of
-well-formedness of expressions. We have already seen examples of this:
-the grammar of well-formed actions on household devices. By introducing proof objects
-we have now added a very powerful technique of expressing semantic conditions.
-
-A simple example of the use of proof objects is the definition of
-well-formed //time spans//: a time span is expected to be from an earlier to
-a later time:
-```
- from 3 to 8
-```
-is thus well-formed, whereas
-```
- from 8 to 3
-```
-is not. The following rules for spans impose this condition
-by using the ``Less`` predicate:
-```
- cat Span ;
- fun span : (m,n : Nat) -> Less m n -> Span ;
-```
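-For instance, the span from 1 to 2 is built as follows; this is a sketch
-using only the constructors declared above. The third argument of ``span``
-is the proof object showing that 1 is less than 2.
-```
- span (Succ Zero) (Succ (Succ Zero))
-   (lessS Zero (Succ Zero) (lessZ Zero))
-```
-No corresponding tree exists for the span from 2 to 1: it would need an
-object of type ``Less (Succ (Succ Zero)) (Succ Zero)``, and neither ``lessZ``
-nor ``lessS`` can produce one.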
-
-**Exercise**. Write an abstract and concrete syntax with the
-concepts of this section, and experiment with it in GF.
-
-
-**Exercise**. Define the notions of "even" and "odd" in terms
-of proof objects. **Hint**. You need one function for proving
-that 0 is even, and two other functions for propagating the
-properties.
-
-
-
-
-===Proof-carrying documents===
-
-Another possible application of proof objects is **proof-carrying documents**:
-to be semantically well-formed, the abstract syntax of a document must contain a proof
-of some property, although the proof is not shown in the concrete document.
-Think, for instance, of small documents describing flight connections:
-
-//To fly from Gothenburg to Prague, first take LH3043 to Frankfurt, then OK0537 to Prague.//
-
-The well-formedness of this text is partly expressible by dependent typing:
-```
- cat
- City ;
- Flight City City ;
- fun
- Gothenburg, Frankfurt, Prague : City ;
- LH3043 : Flight Gothenburg Frankfurt ;
- OK0537 : Flight Frankfurt Prague ;
-```
-This rules out texts saying //take OK0537 from Gothenburg to Prague//.
-However, there is a
-further condition saying that it must be possible to
-change from LH3043 to OK0537 in Frankfurt.
-This can be modelled as a proof object of a suitable type,
-which is required by the constructor
-that connects flights.
-```
- cat
- IsPossible (x,y,z : City)(Flight x y)(Flight y z) ;
- fun
- Connect : (x,y,z : City) ->
- (u : Flight x y) -> (v : Flight y z) ->
- IsPossible x y z u v -> Flight x z ;
-```
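-To see how this works for the example text, suppose that the grammar
-also has a constant witnessing that the change in Frankfurt is possible.
-The name ``LH3043_OK0537`` is made up for this sketch; how such proofs
-are obtained in a real application is left open here.
-```
- fun
-   LH3043_OK0537 : IsPossible Gothenburg Frankfurt Prague LH3043 OK0537 ;
-
- -- the tree of the example document, of type Flight Gothenburg Prague
- Connect Gothenburg Frankfurt Prague LH3043 OK0537 LH3043_OK0537
-```
-Without the last argument the tree is incomplete, so a document that
-combines two flights with no possible change between them is not
-well-formed.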
-
-
-==Restricted polymorphism==
-
-In the first version of the smart house grammar ``Smart``,
-all Actions were either
-- **monomorphic**: defined for one Kind
-- **polymorphic**: defined for all Kinds
-
-
-To make this scale up for new Kinds, we can refine this to
-**restricted polymorphism**: defined for Kinds of a certain **class**.
-
-
-The notion of class can be expressed in abstract syntax
-by using the Curry-Howard isomorphism as follows:
-- a class is a **predicate** of Kinds - i.e. a type depending on Kinds
-- a Kind is in a class if there is a proof object of this type
-
-
-Here is an example with switching and dimming. The classes are called
-``Switchable`` and ``Dimmable``.
-```
-cat
- Switchable Kind ;
- Dimmable Kind ;
-fun
- switchable_light : Switchable light ;
- switchable_fan : Switchable fan ;
- dimmable_light : Dimmable light ;
-
- switchOn : (k : Kind) -> Switchable k -> Action k ;
- dim : (k : Kind) -> Dimmable k -> Action k ;
-```
-One advantage of this formalization is that classes for new
-actions can be added incrementally.
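-
-For instance, with the rules above, and assuming that ``light`` and
-``fan`` are Kinds as in ``Smart``, the following trees are well-typed:
-```
- switchOn light switchable_light
- switchOn fan switchable_fan
- dim light dimmable_light
-```
-A tree of the form ``dim fan p`` cannot be completed, because no rule
-gives a proof object ``p`` of type ``Dimmable fan``.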
-
-**Exercise**. Write a new version of the ``Smart`` grammar with
-classes, and test it in GF.
-
-**Exercise**. Add some actions, kinds, and classes to the grammar.
-Try to port the grammar to a new language. You will probably find
-out that restricted polymorphism works differently in different languages.
-For instance, in Finnish not only doors but also TVs and radios
-can be "opened", which means switching them on.
-
-
-==Variable bindings==
-
-Mathematical notation and programming languages have
-expressions that **bind** variables. For instance,
-a universally quantified proposition
-```
- (All x)B(x)
-```
-consists of the **binding** ``(All x)`` of the variable ``x``,
-and the **body** ``B(x)``, where the variable ``x`` can have
-**bound occurrences**.
-
-Variable bindings appear in informal mathematical language as well, for
-instance,
-```
- for all x, x is equal to x
-
- the function that for any numbers x and y returns the maximum of x+y
- and x*y
-
- Let x be a natural number. Assume that x is even. Then x + 3 is odd.
-```
-In type theory, variable-binding expression forms can be formalized
-as functions that take functions as arguments. The universal
-quantifier is defined
-```
- fun All : (Ind -> Prop) -> Prop
-```
-where ``Ind`` is the type of individuals and ``Prop``,
-the type of propositions. If we have, for instance, the equality predicate
-```
- fun Eq : Ind -> Ind -> Prop
-```
-we may form the tree
-```
- All (\x -> Eq x x)
-```
-which corresponds to the ordinary notation
-```
- (All x)(x = x).
-```
-An abstract syntax where trees have functions as arguments, as in
-the two examples above, has turned out to be precisely the right
-thing for the semantics and computer implementation of
-variable-binding expressions. The advantage lies in the fact that
-only one variable-binding expression form is needed, the lambda abstract
-``\x -> b``, and all other bindings can be reduced to it.
-This makes it easier to implement mathematical theories and reason
-about them, since variable binding is tricky to implement and
-to reason about. The idea of using functions as arguments of
-syntactic constructors is known as **higher-order abstract syntax**.
-
-The question now arises: how to define linearization rules
-for variable-binding expressions?
-Let us first consider universal quantification,
-```
- fun All : (Ind -> Prop) -> Prop
-```
-We write
-```
- lin All B = {s = "(" ++ "All" ++ B.$0 ++ ")" ++ B.s}
-```
-to obtain the form shown above.
-This linearization rule brings in a new GF concept - the ``$0``
-field of ``B`` containing a bound variable symbol.
-The general rule is that, if an argument type of a function is
-itself a function type ``A -> C``, the linearization type of
-this argument is the linearization type of ``C``
-together with a new field ``$0 : Str``. In the linearization rule
-for ``All``, the argument ``B`` thus has the linearization
-type
-```
- {$0 : Str ; s : Str},
-```
-since the linearization type of ``Prop`` is
-```
- {s : Str}
-```
-In other words, the linearization of a function
-consists of a linearization of the body together with a
-field for a linearization of the bound variable.
-Those familiar with type theory or lambda calculus
-should notice that GF requires trees to be in
-**eta-expanded** form in order to be linearizable:
-any function of type
-```
- A -> B
-```
-always has a syntax tree of the form
-```
- \x -> b
-```
-where ``b : B`` under the assumption ``x : A``.
-It is in this form that an expression can be analysed
-as having a bound variable and a body.
-
-
-Given the linearization rule
-```
- lin Eq a b = {s = "(" ++ a.s ++ "=" ++ b.s ++ ")"}
-```
-the linearization of
-```
- \x -> Eq x x
-```
-is the record
-```
- {$0 = "x" ; s = ["( x = x )"]}
-```
-Thus we can compute the linearization of the formula,
-```
- All (\x -> Eq x x) --> {s = ["( All x ) ( x = x )"]}.
-```
-How did we get the //linearization// of the variable ``x``
-into the string ``"x"``? GF grammars have no rules for
-this: it is just hard-wired in GF that variable symbols are
-linearized into the same strings that represent them in
-the print-out of the abstract syntax.
-
-To be able to //parse// variable symbols, however, GF needs to know what
-to look for (instead of e.g. trying to parse //any//
-string as a variable). What strings are parsed as variable symbols
-is defined in the lexical analysis part of GF parsing
-```
- > p -cat=Prop -lexer=codevars "(All x)(x = x)"
- All (\x -> Eq x x)
-```
-(see more details on lexers below). If several variables are bound in the
-same argument, the labels are ``$0, $1, $2``, etc.
-
-
-**Exercise**. Write an abstract syntax of the whole
-**predicate calculus**, with the
-**connectives** "and", "or", "implies", and "not", and the
-**quantifiers** "exists" and "for all". Use higher-order functions
-to guarantee that unbound variables do not occur.
-
-**Exercise**. Write a concrete syntax for your favourite
-notation of predicate calculus. Use Latex as target language
-if you want nice output. You can also try producing Haskell boolean
-expressions. Use as many parentheses as you need to
-guarantee non-ambiguity.
-
-
-
-==Semantic definitions==
-
-We have seen that,
-just like functional programming languages, GF has declarations
-of functions, telling what the type of a function is.
-But we have not yet shown how to **compute**
-these functions: all we can do is provide them with arguments
-and linearize the resulting terms.
-Since our main interest is the well-formedness of expressions,
-this has not yet bothered
-us very much. As we will see, however, computation does play a role
-even in the well-formedness of expressions when dependent types are
-present.
-
-GF has a form of judgement for **semantic definitions**,
-recognized by the key word ``def``. At its simplest, it is just
-the definition of one constant, e.g.
-```
- def one = Succ Zero ;
-```
-We can also define a function with arguments,
-```
- def Neg A = Impl A Abs ;
-```
-which is still a special case of the most general notion of
-definition, that of a group of **pattern equations**:
-```
- def
- sum x Zero = x ;
- sum x (Succ y) = Succ (sum x y) ;
-```
-To compute a term is, as in functional programming languages,
-simply to follow a chain of reductions until no definition
-can be applied. For instance, we compute
-```
- sum one one -->
- sum (Succ Zero) (Succ Zero) -->
- Succ (sum (Succ Zero) Zero) -->
- Succ (Succ Zero)
-```
-Computation in GF is performed with the ``pt`` command and the
-``compute`` transformation, e.g.
-```
- > p -tr "1 + 1" | pt -transform=compute -tr | l
- sum one one
- Succ (Succ Zero)
- s(s(0))
-```
-
-The ``def`` definitions of a grammar induce a notion of
-**definitional equality** among trees: two trees are
-definitionally equal if they compute into the same tree.
-Thus, trivially, all trees in a chain of computation
-(such as the one above)
-are definitionally equal to each other. So are the trees
-```
- sum Zero (Succ one)
- Succ one
- sum (sum Zero Zero) (sum (Succ Zero) one)
-```
-and infinitely many other trees.
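-
-As a check, the first of these trees can be computed step by step with
-the definitions of ``one`` and ``sum`` given above (the order in which
-``one`` is unfolded is chosen here for readability):
-```
- sum Zero (Succ one) -->
- Succ (sum Zero one) -->
- Succ (sum Zero (Succ Zero)) -->
- Succ (Succ (sum Zero Zero)) -->
- Succ (Succ Zero)
-```
-The result is the same tree as the one obtained from ``sum one one``.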
-
-A fact that has to be emphasized about ``def`` definitions is that
-they are //not// performed as a first step of linearization.
-We say that **linearization is intensional**, which means that
-the definitional equality of two trees does not imply that
-they have the same linearizations. For instance, each of the seven terms
-shown above has a different linearization in arithmetic notation:
-```
- 1 + 1
- s(0) + s(0)
- s(s(0) + 0)
- s(s(0))
- 0 + s(1)
- s(1)
- 0 + 0 + s(0) + 1
-```
-This notion of intensionality is
-no more exotic than the intensionality of any **pretty-printing**
-function of a programming language (function that shows
-the expressions of the language as strings). It is vital for
-pretty-printing to be intensional in this sense - if we want,
-for instance, to trace a chain of computation by pretty-printing each
-intermediate step, what we want to see is a sequence of different
-expressions, which are definitionally equal.
-
-What is more exotic is that GF has two ways of referring to the
-abstract syntax objects. In the concrete syntax, the reference is intensional.
-In the abstract syntax, the reference is extensional, since
-**type checking is extensional**. The reason is that,
-in a type theory with dependent types, types may depend on terms.
-Two types depending on terms that are definitionally equal are
-equal types. For instance,
-```
- Proof (Odd one)
- Proof (Odd (Succ Zero))
-```
-are equal types. Hence, any tree that type checks as a proof that
-1 is odd also type checks as a proof that the successor of 0 is odd.
-(Recall, in this connection, that the
-arguments a category depends on never play any role
-in the linearization of trees of that category,
-nor in the definition of the linearization type.)
-
-In addition to computation, definitions impose a
-**paraphrase** relation on expressions:
-two strings are paraphrases if they
-are linearizations of trees that are
-definitionally equal.
-Paraphrases are sometimes interesting for
-translation: the **direct translation**
-of a string, which is the linearization of the same tree
-in the target language, may be inadequate because it is e.g.
-unidiomatic or ambiguous. In such a case,
-the translation algorithm may be made to consider
-translation by a paraphrase.
-
-To express the distinction between
-**constructors** (=**canonical** functions)
-and other functions, GF has a judgement form
-``data`` to tell that certain functions are canonical, e.g.
-```
- data Nat = Succ | Zero ;
-```
-Unlike in Haskell, but similarly to ALF (where constructor functions
-are marked with a flag ``C``),
-new constructors can be added to
-a type with new ``data`` judgements. The type signatures of constructors
-are given separately, in ordinary ``fun`` judgements.
-One can also write directly
-```
- data Succ : Nat -> Nat ;
-```
-which is equivalent to the two judgements
-```
- fun Succ : Nat -> Nat ;
- data Nat = Succ ;
-```
-
-**Exercise**. Implement an interpreter of a small functional programming
-language with natural numbers, lists, pairs, lambdas, etc. Use higher-order
-abstract syntax with semantic definitions. As target language, use
-your favourite programming language.
-
-**Exercise**. To make your interpreted language look nice, use
-**precedences** instead of putting parentheses everywhere.
-You can use the [precedence library ../../lib/prelude/Precedence.gf]
-of GF to facilitate this.
-
-
-
-#PARTtwo
-
-=Embedded grammars in Haskell=
-
-GF grammars can be used as parts of programs written in other
-programming languages. We will go through a skeleton application in
-Haskell, while the next chapter will show how to build an
-application in Java.
-
-We will show how to build a minimal resource grammar
-application whose architecture scales up to much
-larger applications. The application is run from the
-shell by the command
-```
- math
-```
-whereafter it reads user input in English and French.
-To each input line, it answers by the truth value of
-the sentence.
-```
- ./math
- zéro est pair
- True
- zero is odd
- False
- zero is even and zero is odd
- False
-```
-The source of the application consists of the following
-files:
-```
- LexEng.gf -- English instance of Lex
- LexFre.gf -- French instance of Lex
- Lex.gf -- lexicon interface
- Makefile -- a makefile
- MathEng.gf -- English instantiation of MathI
- MathFre.gf -- French instantiation of MathI
- Math.gf -- abstract syntax
- MathI.gf -- concrete syntax functor for Math
- Run.hs -- Haskell Main module
-```
-The system was built in 22 steps explained below.
-
-
-==Writing GF grammars==
-
-===Creating the first grammar===
-
-1. Write ``Math.gf``, which defines what you want to say.
-```
- abstract Math = {
- cat Prop ; Elem ;
- fun
- And : Prop -> Prop -> Prop ;
- Even : Elem -> Prop ;
- Zero : Elem ;
- }
-```
-2. Write ``Lex.gf``, which defines which language-dependent
-parts are needed in the concrete syntax. These are mostly
-words (lexicon), but can in fact be any operations. The definitions
-only use resource abstract syntax, which is opened.
-```
- interface Lex = open Syntax in {
- oper
- even_A : A ;
- zero_PN : PN ;
- }
-```
-3. Write ``LexEng.gf``, the English implementation of ``Lex.gf``.
-This module uses English resource libraries.
-```
- instance LexEng of Lex = open SyntaxEng, ParadigmsEng in {
- oper
- even_A = mkA "even" ;
- zero_PN = mkPN "zero" ;
- }
-```
-4. Write ``MathI.gf``, a language-independent concrete syntax of
-``Math.gf``. It opens interfaces,
-which makes it an incomplete module, a.k.a. parametrized module, a.k.a.
-functor.
-```
- incomplete concrete MathI of Math =
-
- open Syntax, Lex in {
-
- flags startcat = Prop ;
-
- lincat
- Prop = S ;
- Elem = NP ;
- lin
- And x y = mkS and_Conj x y ;
- Even x = mkS (mkCl x even_A) ;
- Zero = mkNP zero_PN ;
- }
-```
-5. Write ``MathEng.gf``, which is just an instantiation of ``MathI.gf``,
-replacing the interfaces by their English instances. This is the module
-that will be used as a top module in GF, so it contains a path to
-the libraries.
-```
- --# -path=.:present:prelude
-
- concrete MathEng of Math = MathI with
- (Syntax = SyntaxEng),
- (Lex = LexEng) ;
-```
-
-
-===Testing===
-
-6. Test the grammar in GF by random generation and parsing.
-```
- $ gf
- > i MathEng.gf
- > gr -tr | l -tr | p
- And (Even Zero) (Even Zero)
- zero is even and zero is even
- And (Even Zero) (Even Zero)
-```
-Importing the grammar will fail if you haven't
-- correctly defined your ``GF_LIB_PATH`` as ``GF/lib``
-- installed the resource package or
- compiled the resource from source by ``make`` in ``GF/lib/resource-1.0``
-
-
-
-===Adding a new language===
-
-7. Now it is time to add a new language. Write a French lexicon ``LexFre.gf``:
-```
- instance LexFre of Lex = open SyntaxFre, ParadigmsFre in {
- oper
- even_A = mkA "pair" ;
- zero_PN = mkPN "zéro" ;
- }
-```
-8. You also need a French concrete syntax, ``MathFre.gf``:
-```
- --# -path=.:present:prelude
-
- concrete MathFre of Math = MathI with
- (Syntax = SyntaxFre),
- (Lex = LexFre) ;
-```
-9. This time, you can test multilingual generation:
-```
- > i MathFre.gf
- > gr | tb
- Even Zero
- zéro est pair
- zero is even
-```
-
-
-===Extending the language===
-
-10. You want to add a predicate saying that a number is odd.
-It is first added to ``Math.gf``:
-```
- fun Odd : Elem -> Prop ;
-```
-11. You need a new word in ``Lex.gf``.
-```
- oper odd_A : A ;
-```
-12. Then you can give a language-independent concrete syntax in
-``MathI.gf``:
-```
- lin Odd x = mkS (mkCl x odd_A) ;
-```
-13. The new word is implemented in ``LexEng.gf``.
-```
- oper odd_A = mkA "odd" ;
-```
-14. The new word is implemented in ``LexFre.gf``.
-```
- oper odd_A = mkA "impair" ;
-```
-15. Now you can test with the extended lexicon. First empty
-the environment to get rid of the old abstract syntax, then
-import the new versions of the grammars.
-```
- > e
- > i MathEng.gf
- > i MathFre.gf
- > gr | tb
- And (Odd Zero) (Even Zero)
- zéro est impair et zéro est pair
- zero is odd and zero is even
-```
-
-
-==Building a user program==
-
-===Producing a compiled grammar package===
-
-16. Your grammar is going to be used by persons who do not need
-to compile it again. They may not have access to the resource library,
-either. Therefore it is advisable to produce a multilingual grammar
-package in a single file. We call this package ``math.gfcm`` and
-produce it, when we have ``MathEng.gf`` and
-``MathFre.gf`` in the GF state, by the command
-```
- > pm | wf math.gfcm
-```
-
-
-===Writing the Haskell application===
-
-17. Write the Haskell main file ``Run.hs``. It uses the ``EmbedAPI``
-module defining some basic functionalities such as parsing.
-The answer is produced by an interpreter of trees returned by the parser.
-```
-module Main where
-
-import GSyntax
-import GF.Embed.EmbedAPI
-
-main :: IO ()
-main = do
-  gr <- file2grammar "math.gfcm"
-  loop gr
-
-loop :: MultiGrammar -> IO ()
-loop gr = do
-  s <- getLine
-  interpret gr s
-  loop gr
-
-interpret :: MultiGrammar -> String -> IO ()
-interpret gr s = do
-  let tss = parseAll gr "Prop" s
-  case (concat tss) of
-    []  -> putStrLn "no parse"
-    t:_ -> print $ answer $ fg t
-
-answer :: GProp -> Bool
-answer p = case p of
-  (GOdd x1)    -> odd (value x1)
-  (GEven x1)   -> even (value x1)
-  (GAnd x1 x2) -> answer x1 && answer x2
-
-value :: GElem -> Int
-value e = case e of
-  GZero -> 0
-
-18. The syntax trees manipulated by the interpreter are not raw
-GF trees, but objects of the Haskell datatype ``GProp``.
-From any GF grammar, a file ``GSyntax.hs`` with
-datatypes corresponding to its abstract
-syntax can be produced by the command
-```
- > pg -printer=haskell | wf GSyntax.hs
-```
-The module also defines the overloaded functions
-``gf`` and ``fg`` for translating from these types to
-raw trees and back.
-
-
-===Compiling the Haskell grammar===
-
-19. Before compiling ``Run.hs``, you must check that the
-embedded GF modules are found. The easiest way to do this
-is by two symbolic links to your GF source directories:
-```
- $ ln -s /home/aarne/GF/src/GF
- $ ln -s /home/aarne/GF/src/Transfer/
-```
-
-20. Now you can run the GHC Haskell compiler to produce the program.
-```
- $ ghc --make -o math Run.hs
-```
-The program can be tested with the command ``./math``.
-
-
-===Building a distribution===
-
-21. For a stand-alone binary-only distribution, only
-the two files ``math`` and ``math.gfcm`` are needed.
-For a source distribution, the files mentioned in
-the beginning of this document are needed.
-
-
-===Using a Makefile===
-
-22. As a part of the source distribution, a ``Makefile`` is
-essential. The ``Makefile`` is also useful when developing the
-application. It should always be possible to build an executable
-from source by typing ``make``. Here is a minimal such ``Makefile``:
-```
- all:
- echo "pm | wf math.gfcm" | gf MathEng.gf MathFre.gf
- echo "pg -printer=haskell | wf GSyntax.hs" | gf math.gfcm
- ghc --make -o math Run.hs
-```
-
-
-==The Embedded GF Haskell API==
-
-
-
-=Embedded grammars in Java=
-
-Forthcoming; at the moment, the document
-
- [``http://www.cs.chalmers.se/~bringert/gf/gf-java.html`` http://www.cs.chalmers.se/~bringert/gf/gf-java.html]
-
-by Björn Bringert gives more information on Java.
-
-
-=Spoken language translators=
-
-
-=Multimodal dialogue systems=
-
-
-=Grammar of formal languages=
-
-==Precedence and fixity==
-
-==Higher-order abstract syntax==
-
-==Extensible natural-language interfaces==
-
-
-
-=Inside the resource grammar library=
-
-==Writing your own resource implementation==
-
-==Parametrized modules for language families==
-
-
-
-=Using Transfer for semantic actions=
-
-
-
-#PARTthree
-
-
-=Syntax and semantics of the GF grammar formalism=
-
-=The resource grammar API=
-
-=The GFC format=
-
-=The command language of the GF shell=
-
-==Lexers and unlexers==
-
-Lexers and unlexers can be chosen from
-a list of predefined ones, using the flags ``-lexer`` and ``-unlexer``, either
-in the grammar file or on the GF command line. Here are some often-used lexers
-and unlexers:
-```
- The default is words.
- -lexer=words tokens are separated by spaces or newlines
- -lexer=literals like words, but GF integer and string literals recognized
- -lexer=vars like words, but "x","x_...","$...$" as vars, "?..." as meta
- -lexer=chars each character is a token
- -lexer=code use Haskell's lex
- -lexer=codevars like code, but treat unknown words as variables, ?? as meta
- -lexer=text with conventions on punctuation and capital letters
- -lexer=codelit like code, but treat unknown words as string literals
- -lexer=textlit like text, but treat unknown words as string literals
-
- The default is unwords.
- -unlexer=unwords space-separated token list (like unwords)
- -unlexer=text format as text: punctuation, capitals, paragraph
- -unlexer=code format as code (spacing, indentation)
- -unlexer=textlit like text, but remove string literal quotes
- -unlexer=codelit like code, but remove string literal quotes
- -unlexer=concat remove all spaces
-```
-More options can be found by ``help -lexer`` and ``help -unlexer``.
-
-
-
-
-==Speech input and output==
-
-The ``speak_aloud = sa`` command sends a string to the speech
-synthesizer
-[Flite http://www.speech.cs.cmu.edu/flite/doc/].
-It is typically used via a pipe:
-``` generate_random | linearize | speak_aloud
-The result is only satisfactory for English.
-
-The ``speech_input = si`` command receives a string from a
-speech recognizer that requires the installation of
-[ATK http://mi.eng.cam.ac.uk/~sjy/software.htm].
-It is typically used to pipe input to a parser:
-``` speech_input -tr | parse
-The method works only for grammars of English.
-
-Both Flite and ATK are freely available through the links
-above, but they are not distributed together with GF.
-
-
-
-==Multilingual syntax editor==
-
-The
-[Editor User Manual http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm]
-describes the use of the editor, which works for any multilingual GF grammar.
-
-Here is a snapshot of the editor:
-
-%#BCEN
-
-%#EDITORPNG
-
-%#ECEN
-
-
-The grammars of the snapshot are from the
-[Letter grammar package http://www.cs.chalmers.se/~aarne/GF/examples/letter].
-
-
-==Communicating with GF==
-
-Other processes can communicate with the GF command interpreter,
-and also with the GF syntax editor. Useful flags when invoking GF are
-- ``-batch`` suppresses the prompts and structures the communication with XML tags.
-- ``-s`` suppresses non-output non-error messages and XML tags.
-- ``-nocpu`` suppresses CPU time indication.
-
-
-Thus the most silent way to invoke GF is
-```
- gf -batch -s -nocpu
-```
-
-
-
-
-
-
-=Further reading=
-
-Syntax Editor User Manual:
-
-[``http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm`` http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm]
-
-Resource Grammar Synopsis (on using resource grammars):
-
-[``http://www.cs.chalmers.se/~aarne/GF/lib/resource-1.0/synopsis.html`` ../../lib/resource-1.0/synopsis.html]
-
-Resource Grammar HOWTO (on writing resource grammars):
-
-[``http://www.cs.chalmers.se/~aarne/GF/lib/resource-1.0/doc/Resource-HOWTO.html`` ../../lib/resource-1.0/doc/Resource-HOWTO.html]
-
-GF Homepage:
-
-[``http://www.cs.chalmers.se/~aarne/GF/doc`` ../..]
-
diff --git a/doc/tutorial/prelude b/doc/tutorial/prelude
index e1790817b..3f7b84056 100644
--- a/doc/tutorial/prelude
+++ b/doc/tutorial/prelude
@@ -1,6 +1,12 @@
-\documentclass[11pt]{book}
+\documentclass[nwbk_0pt]{book}
\usepackage[latin1]{inputenc}
+%\setlength{\oddsidemargin}{0mm}
+%\setlength{\evensidemargin}{-2mm}
+%\setlength{\topmargin}{-12mm}
+%\setlength{\textheight}{220mm}
+%\setlength{\textwidth}{158mm}
+
\newcommand{\bequ}{\begin{quote}}
\newcommand{\enqu}{\end{quote}}
%%%
\ No newline at end of file