diff --git a/doc/tutorial/Foodmarket.png b/doc/tutorial/Foodmarket.png new file mode 100644 index 000000000..753cc850d Binary files /dev/null and b/doc/tutorial/Foodmarket.png differ diff --git a/doc/tutorial/Tree2.png b/doc/tutorial/Tree2.png new file mode 100644 index 000000000..f58e56b95 Binary files /dev/null and b/doc/tutorial/Tree2.png differ diff --git a/doc/tutorial/gf-tutorial2.html b/doc/tutorial/gf-tutorial2.html index e2a88d9d3..1fc1dd015 100644 --- a/doc/tutorial/gf-tutorial2.html +++ b/doc/tutorial/gf-tutorial2.html @@ -7,7 +7,7 @@
- param AdjForm = ASg Gender | APl ;
- param Gender = Uter | Neuter ;
+The reader familiar with a functional programming language such as
+Haskell must have noticed the similarity
+between parameter types in GF and algebraic datatypes (data definitions
+in Haskell). The GF parameter types are actually a special case of algebraic
+datatypes: the main restriction is that in GF, these types must be finite.
+(It is this restriction that makes it possible to invert linearization rules into
+parsing methods.)
+
+However, finite is not the same thing as enumerated. Even in GF, parameter +constructors can take arguments, provided these arguments are from other +parameter types - only recursion is forbidden. Such parameter types impose a +hierarchic order among parameters. They are often needed to define +the linguistically most accurate parameter systems. +
+
+To give an example, Swedish adjectives
+are inflected in number (singular or plural) and
+gender (uter or neuter). These parameters would suggest 2*2=4 different
+forms. However, the gender distinction is made only in the singular. Therefore,
+it would be inaccurate to define adjective paradigms using the type
+Gender => Number => Str. The following hierarchic definition
+yields an accurate system of three adjectival forms.
- In pattern matching, a constructor can have patterns as arguments. For instance, - the adjectival paradigm in which the two singular forms are the same, can be defined + param AdjForm = ASg Gender | APl ; + param Gender = Uter | Neuter ;
- oper plattAdj : Str -> AdjForm => Str = \x -> table { - ASg _ => x ; - APl => x + "a" ; - } +In pattern matching, a constructor can have patterns as arguments. For instance, +the adjectival paradigm in which the two singular forms are the same, can be defined
-
-
- %--!
- ===Morphological analysis and morphology quiz===
-
- Even though in GF morphology
- is mostly seen as an auxiliary of syntax, a morphology once defined
- can be used on its own right. The command ``morpho_analyse = ma``
- can be used to read a text and return for each word the analyses that
- it has in the current concrete syntax.
+ oper plattAdj : Str -> AdjForm => Str = \x -> table {
+ ASg _ => x ;
+ APl => x + "a" ;
+ }
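For contrast with plattAdj, a paradigm that keeps all three forms apart might look as follows. This is only a sketch: the module name AdjSwe and the operation regAdj are hypothetical, and the endings are assumed to cover just regular adjectives of the type stor.

```gf
resource AdjSwe = {
  param
    Gender  = Uter | Neuter ;
    AdjForm = ASg Gender | APl ;

  oper
    -- regAdj is a hypothetical name; it handles only adjectives like "stor"
    regAdj : Str -> AdjForm => Str = \x -> table {
      ASg Uter   => x ;          -- stor
      ASg Neuter => x + "t" ;    -- stort
      APl        => x + "a"      -- stora
      } ;
}
```

With these definitions, the selection regAdj "stor" ! ASg Neuter picks the second branch, whereas plattAdj would return the bare stem for both singular forms.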
+
+
+
- > rf bible.txt | morpho_analyse
+Even though in GF morphology
+is mostly seen as an auxiliary of syntax, a morphology once defined
+can be used in its own right. The command morpho_analyse = ma
+can be used to read a text and return for each word the analyses that
+it has in the current concrete syntax.
- In the same way as translation exercises, morphological exercises can - be generated, by the command ``morpho_quiz = mq``. Usually, - the category is set to be something else than ``S``. For instance, + > rf bible.txt | morpho_analyse
- > i lib/resource/french/VerbsFre.gf - > morpho_quiz -cat=V -
-- Welcome to GF Morphology Quiz. - ... -
-
- réapparaître : VFin VCondit Pl P2
- réapparaitriez
- > No, not réapparaitriez, but
- réapparaîtriez
- Score 0/1
+In the same way as translation exercises, morphological exercises can
+be generated, by the command morpho_quiz = mq. Usually,
+the category is set to be something other than S. For instance,
- Finally, a list of morphological exercises and save it in a - file for later use, by the command ``morpho_list = ml`` + > i lib/resource/french/VerbsFre.gf + > morpho_quiz -cat=V + + Welcome to GF Morphology Quiz. + ... + + réapparaître : VFin VCondit Pl P2 + réapparaitriez + > No, not réapparaitriez, but + réapparaîtriez + Score 0/1
- > morpho_list -number=25 -cat=V
+Finally, a list of morphological exercises can be generated and saved in a
+file for later use, by the command morpho_list = ml
- The ``number`` flag gives the number of exercises generated. - - - - %--! - ===Discontinuous constituents=== - - A linearization type may contain more strings than one. - An example of where this is useful are English particle - verbs, such as //switch off//. The linearization of - a sentence may place the object between the verb and the particle: - //he switched it off//. - - The first of the following judgements defines transitive verbs as - **discontinuous constituents**, i.e. as having a linearization - type with two strings and not just one. The second judgement - shows how the constituents are separated by the object in complementization. + > morpho_list -number=25 -cat=V
- lincat TV = {s : Number => Str ; s2 : Str} ;
- lin ComplTV tv obj = {s = \\n => tv.s ! n ++ obj.s ++ tv.s2} ;
+The number flag gives the number of exercises generated.
+
+A linearization type may contain more strings than one.
+An example where this is useful is English particle
+verbs, such as switch off. The linearization of
+a sentence may place the object between the verb and the particle:
+he switched it off.
+
++The first of the following judgements defines transitive verbs as +discontinuous constituents, i.e. as having a linearization +type with two strings and not just one. The second judgement +shows how the constituents are separated by the object in complementization.
- There is no restriction in the number of discontinuous constituents
- (or other fields) a ``lincat`` may contain. The only condition is that
- the fields must be of finite types, i.e. built from records, tables,
- parameters, and ``Str``, and not functions. A mathematical result
- about parsing in GF says that the worst-case complexity of parsing
- increases with the number of discontinuous constituents. Moreover,
- the parsing and linearization commands only give reliable results
- for categories whose linearization type has a unique ``Str`` valued
- field labelled ``s``.
-
-
- %--!
- ==More constructs for concrete syntax==
-
-
- %--!
- ===Free variation===
-
- Sometimes there are many alternative ways to define a concrete syntax.
- For instance, the verb negation in English can be expressed both by
- //does not// and //doesn't//. In linguistic terms, these expressions
- are in **free variation**. The ``variants`` construct of GF can
- be used to give a list of strings in free variation. For example,
+ lincat TV = {s : Number => Str ; s2 : Str} ;
+ lin ComplTV tv obj = {s = \\n => tv.s ! n ++ obj.s ++ tv.s2} ;
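To see the two judgements at work, here is a small self-contained sketch. The grammar name Particle, the functions SwitchOff and It, and the categories NP and VP are hypothetical and not part of the tutorial grammar; the two modules would go in separate files.

```gf
-- file Particle.gf (hypothetical)
abstract Particle = {
  cat TV ; NP ; VP ;
  fun
    SwitchOff : TV ;
    It        : NP ;
    ComplTV   : TV -> NP -> VP ;
}

-- file ParticleEng.gf (hypothetical)
concrete ParticleEng of Particle = {
  param Number = Sg | Pl ;
  lincat
    TV = {s : Number => Str ; s2 : Str} ;  -- verb form and particle kept apart
    NP = {s : Str} ;
    VP = {s : Number => Str} ;
  lin
    SwitchOff = {s = table {Sg => "switches" ; Pl => "switch"} ; s2 = "off"} ;
    It = {s = "it"} ;
    -- the object is placed between the verb and its particle
    ComplTV tv obj = {s = \\n => tv.s ! n ++ obj.s ++ tv.s2} ;
}
```

For the tree ComplTV SwitchOff It, the Sg branch of the resulting table concatenates "switches", "it" and "off" in that order.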
- NegVerb verb = {s = variants {["does not"] ; "doesn't} ++ verb.s} ;
+There is no restriction on the number of discontinuous constituents
+(or other fields) a lincat may contain. The only condition is that
+the fields must be of finite types, i.e. built from records, tables,
+parameters, and Str, and not functions. A mathematical result
+about parsing in GF says that the worst-case complexity of parsing
+increases with the number of discontinuous constituents. Moreover,
+the parsing and linearization commands only give reliable results
+for categories whose linearization type has a unique Str valued
+field labelled s.
+
+Sometimes there are many alternative ways to define a concrete syntax.
+For instance, the verb negation in English can be expressed both by
+does not and doesn't. In linguistic terms, these expressions
+are in free variation. The variants construct of GF can
+be used to give a list of strings in free variation. For example,
- An empty variant list
+ NegVerb verb = {s = variants {["does not"] ; "doesn't"} ++ verb.s} ;
- variants {} +An empty variant list
- can be used e.g. if a word lacks a certain form.
-
- In general, ``variants`` should be used cautiously. It is not
- recommended for modules aimed to be libraries, because the
- user of the library has no way to choose among the variants.
- Moreover, even though ``variants`` admits lists of any type,
- its semantics for complex types can cause surprises.
-
-
-
-
- ===Record extension and subtyping===
-
- Record types and records can be **extended** with new fields. For instance,
- in German it is natural to see transitive verbs as verbs with a case.
- The symbol ``**`` is used for both constructs.
+ variants {}
- lincat TV = Verb ** {c : Case} ; +can be used e.g. if a word lacks a certain form.
- lin Follow = regVerb "folgen" ** {c = Dative} ;
+In general, variants should be used cautiously. It is not
+recommended for modules aimed to be libraries, because the
+user of the library has no way to choose among the variants.
+Moreover, even though variants admits lists of any type,
+its semantics for complex types can cause surprises.
+
+Record types and records can be extended with new fields. For instance,
+in German it is natural to see transitive verbs as verbs with a case.
+The symbol ** is used for both constructs.
- To extend a record type or a record with a field whose label it
- already has is a type error.
+ lincat TV = Verb ** {c : Case} ;
- A record type //T// is a **subtype** of another one //R//, if //T// has
- all the fields of //R// and possibly other fields. For instance,
- an extension of a record type is always a subtype of it.
-
- If //T// is a subtype of //R//, an object of //T// can be used whenever
- an object of //R// is required. For instance, a transitive verb can
- be used whenever a verb is required.
-
- **Contravariance** means that a function taking an //R// as argument
- can also be applied to any object of a subtype //T//.
-
-
-
- ===Tuples and product types===
-
- Product types and tuples are syntactic sugar for record types and records:
+ lin Follow = regVerb "folgen" ** {c = Dative} ;
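Record extension and the resulting subtyping can be tried out in a resource module. The following is a minimal sketch in which the module name VerbGer, the simplified Verb type, and the operations infinitive, folgen and folgenInf are all assumptions made for illustration.

```gf
resource VerbGer = {
  param Case = Nominative | Accusative | Dative ;

  oper
    -- a deliberately simplified verb type (assumption)
    Verb : Type = {s : Str} ;

    -- record type extension: a transitive verb is a verb with a case
    TV : Type = Verb ** {c : Case} ;

    -- record extension on the object level
    folgen : TV = {s = "folgen"} ** {c = Dative} ;

    -- a function expecting a Verb ...
    infinitive : Verb -> Str = \v -> v.s ;

    -- ... can be applied to a TV, since TV is a subtype of Verb
    folgenInf : Str = infinitive folgen ;
}
```

The last definition type checks only because of subtyping: folgen has the extra field c, which infinitive simply ignores.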
- T1 * ... * Tn === {p1 : T1 ; ... ; pn : Tn} - <t1, ..., tn> === {p1 = T1 ; ... ; pn = Tn} +To extend a record type or a record with a field whose label it +already has is a type error. +
++A record type T is a subtype of another one R, if T has +all the fields of R and possibly other fields. For instance, +an extension of a record type is always a subtype of it. +
++If T is a subtype of R, an object of T can be used whenever +an object of R is required. For instance, a transitive verb can +be used whenever a verb is required. +
++Contravariance means that a function taking an R as argument +can also be applied to any object of a subtype T. +
+ ++Product types and tuples are syntactic sugar for record types and records:
- Thus the labels ``p1, p2,...``` are hard-coded.
-
-
- %--!
- ===Prefix-dependent choices===
-
- The construct exemplified in
+ T1 * ... * Tn === {p1 : T1 ; ... ; pn : Tn}
+ <t1, ..., tn> === {p1 = t1 ; ... ; pn = tn}
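As a small illustration of this equivalence, the sketch below builds a pair and reads its components back through the labels p1 and p2. The module name Tuples and all operation names are hypothetical.

```gf
resource Tuples = {
  oper
    -- a pair written with tuple syntax
    particleVerb : Str * Str = <"switch", "off"> ;

    -- projections use the record labels p1 and p2
    verbPart : Str = particleVerb.p1 ;   -- "switch"
    particle : Str = particleVerb.p2 ;   -- "off"

    -- the same value written as an explicit record
    particleVerbRec : {p1 : Str ; p2 : Str} = {p1 = "switch" ; p2 = "off"} ;
}
```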
- oper artIndef : Str =
- pre {"a" ; "an" / strs {"a" ; "e" ; "i" ; "o"}} ;
+Thus the labels p1, p2,... are hard-coded.
+
+The construct exemplified in
- Thus
+ oper artIndef : Str =
+ pre {"a" ; "an" / strs {"a" ; "e" ; "i" ; "o"}} ;
- artIndef ++ "cheese" ---> "a" ++ "cheese" - artIndef ++ "apple" ---> "an" ++ "cheese" +Thus
- This very example does not work in all situations: the prefix
- //u// has no general rules, and some problematic words are
- //euphemism, one-eyed, n-gram//. It is possible to write
+ artIndef ++ "cheese" ---> "a" ++ "cheese"
+ artIndef ++ "apple" ---> "an" ++ "apple"
- oper artIndef : Str = - pre {"a" ; - "a" / strs {"eu" ; "one"} ; - "an" / strs {"a" ; "e" ; "i" ; "o" ; "n-"} - } ; +This very example does not work in all situations: the prefix +u has no general rules, and some problematic words are +euphemism, one-eyed, n-gram. It is possible to write
-
-
-
- ===Predefined types and operations===
-
- GF has the following predefined categories in abstract syntax:
+ oper artIndef : Str =
+ pre {"a" ;
+ "a" / strs {"eu" ; "one"} ;
+ "an" / strs {"a" ; "e" ; "i" ; "o" ; "n-"}
+ } ;
+
+
+- cat Int ; -- integers, e.g. 0, 5, 743145151019 - cat Float ; -- floats, e.g. 0.0, 3.1415926 - cat String ; -- strings, e.g. "", "foo", "123" +GF has the following predefined categories in abstract syntax:
- The objects of each of these categories are **literals** - as indicated in the comments above. No ``fun`` definition - can have a predefined category as its value type, but - they can be used as arguments. For example: + cat Int ; -- integers, e.g. 0, 5, 743145151019 + cat Float ; -- floats, e.g. 0.0, 3.1415926 + cat String ; -- strings, e.g. "", "foo", "123"
- fun StreetAddress : Int -> String -> Address ; - lin StreetAddress number street = {s = number.s ++ street.s} ; -
-
- -- e.g. (StreetAddress 10 "Downing Street") : Address
+The objects of each of these categories are literals
+as indicated in the comments above. No fun definition
+can have a predefined category as its value type, but
+they can be used as arguments. For example:
+ fun StreetAddress : Int -> String -> Address ;
+ lin StreetAddress number street = {s = number.s ++ street.s} ;
-
- %--!
- ==More features of the module system==
-
-
- ===Resource grammars and their reuse===
-
- See
- [resource library documentation ../../lib/resource/doc/gf-resource.html]
-
-
- ===Interfaces, instances, and functors===
-
- See an
- [example built this way ../../examples/mp3/mp3-resource.html]
-
-
- ===Restricted inheritance and qualified opening===
-
-
-
- ==More concepts of abstract syntax==
-
-
- ===Dependent types===
-
- ===Higher-order abstract syntax===
-
- ===Semantic definitions===
-
-
-
- ==Transfer modules==
-
- Transfer means noncompositional tree-transforming operations.
- The command ``apply_transfer = at`` is typically used in a pipe:
+ -- e.g. (StreetAddress 10 "Downing Street") : Address
+
+
+- > p "John walks and John runs" | apply_transfer aggregate | l - John walks and runs +See +resource library documentation +
+ ++See an +example built this way +
+ +
+Transfer means noncompositional tree-transforming operations.
+The command apply_transfer = at is typically used in a pipe:
- See the - [sources ../../transfer/examples/aggregation] of this example. - - See the - [transfer language documentation ../transfer.html] - for more information. - - - ==Practical issues== - - - ===Lexers and unlexers=== - - Lexers and unlexers can be chosen from - a list of predefined ones, using the flags``-lexer`` and `` -unlexer`` either - in the grammar file or on the GF command line. - - Given by ``help -lexer``, ``help -unlexer``: + > p "John walks and John runs" | apply_transfer aggregate | l + John walks and runs
- The default is words. - -lexer=words tokens are separated by spaces or newlines - -lexer=literals like words, but GF integer and string literals recognized - -lexer=vars like words, but "x","x_...","$...$" as vars, "?..." as meta - -lexer=chars each character is a token - -lexer=code use Haskell's lex - -lexer=codevars like code, but treat unknown words as variables, ?? as meta - -lexer=text with conventions on punctuation and capital letters - -lexer=codelit like code, but treat unknown words as string literals - -lexer=textlit like text, but treat unknown words as string literals - -lexer=codeC use a C-like lexer - -lexer=ignore like literals, but ignore unknown words - -lexer=subseqs like ignore, but then try all subsequences from longest +See the +sources of this example.
- The default is unwords. - -unlexer=unwords space-separated token list (like unwords) - -unlexer=text format as text: punctuation, capitals, paragraph <p> - -unlexer=code format as code (spacing, indentation) - -unlexer=textlit like text, but remove string literal quotes - -unlexer=codelit like code, but remove string literal quotes - -unlexer=concat remove all spaces - -unlexer=bind like identity, but bind at "&+" +See the +transfer language documentation +for more information. +
+ +
+Lexers and unlexers can be chosen from
+a list of predefined ones, using the flags -lexer and -unlexer, either
+in the grammar file or on the GF command line.
+
+Given by help -lexer, help -unlexer:
+ The default is words. + -lexer=words tokens are separated by spaces or newlines + -lexer=literals like words, but GF integer and string literals recognized + -lexer=vars like words, but "x","x_...","$...$" as vars, "?..." as meta + -lexer=chars each character is a token + -lexer=code use Haskell's lex + -lexer=codevars like code, but treat unknown words as variables, ?? as meta + -lexer=text with conventions on punctuation and capital letters + -lexer=codelit like code, but treat unknown words as string literals + -lexer=textlit like text, but treat unknown words as string literals + -lexer=codeC use a C-like lexer + -lexer=ignore like literals, but ignore unknown words + -lexer=subseqs like ignore, but then try all subsequences from longest - - ===Efficiency of grammars=== - - Issues: - - - the choice of datastructures in ``lincat``s - - the value of the ``optimize`` flag - - parsing efficiency: ``-mcfg`` vs. others - - - ===Speech input and output=== - - The``speak_aloud = sa`` command sends a string to the speech - synthesizer - [Flite http://www.speech.cs.cmu.edu/flite/doc/]. - It is typically used via a pipe: - ``` generate_random | linearize | speak_aloud - The result is only satisfactory for English. - - The ``speech_input = si`` command receives a string from a - speech recognizer that requires the installation of - [ATK http://mi.eng.cam.ac.uk/~sjy/software.htm]. - It is typically used to pipe input to a parser: - ``` speech_input -tr | parse - The method words only for grammars of English. - - Both Flite and ATK are freely available through the links - above, but they are not distributed together with GF. - - - - - ===Multilingual syntax editor=== - - The - [Editor User Manual http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm] - describes the use of the editor, which works for any multilingual GF grammar. 
- - Here is a snapshot of the editor: - - [../quick-editor.gif] - - The grammars of the snapshot are from the - [Letter grammar package http://www.cs.chalmers.se/~aarne/GF/examples/letter]. - - - - ===Interactive Development Environment (IDE)=== - - Forthcoming. - - - ===Communicating with GF=== - - Other processes can communicate with the GF command interpreter, - and also with the GF syntax editor. - - - ===Embedded grammars in Haskell, Java, and Prolog=== - - GF grammars can be used as parts of programs written in the - following languages. The links give more documentation. - - - [Java http://www.cs.chalmers.se/~bringert/gf/gf-java.html] - - [Haskell http://www.cs.chalmers.se/~aarne/GF/src/GF/Embed/EmbedAPI.hs] - - [Prolog http://www.cs.chalmers.se/~peb/software.html] - - - ===Alternative input and output grammar formats=== - - A summary is given in the following chart of GF grammar compiler phases: - [../gf-compiler.png] - - - ==Case studies== - - ===Interfacing formal and natural languages=== - - [Formal and Informal Software Specifications http://www.cs.chalmers.se/~krijo/thesis/thesisA4.pdf], - PhD Thesis by - [Kristofer Johannisson http://www.cs.chalmers.se/~krijo], is an extensive example of this. - The system is based on a multilingual grammar relating the formal language OCL with - English and German. - - A simpler example will be explained here. + The default is unwords. + -unlexer=unwords space-separated token list (like unwords) + -unlexer=text format as text: punctuation, capitals, paragraph <p> + -unlexer=code format as code (spacing, indentation) + -unlexer=textlit like text, but remove string literal quotes + -unlexer=codelit like code, but remove string literal quotes + -unlexer=concat remove all spaces + -unlexer=bind like identity, but bind at "&+"+ + +
+Issues: +
+lincats
+optimize flag
+-mcfg vs. others
+
+The speak_aloud = sa command sends a string to the speech
+synthesizer
+Flite.
+It is typically used via a pipe:
+
+ generate_random | linearize | speak_aloud ++
+The result is only satisfactory for English. +
+
+The speech_input = si command receives a string from a
+speech recognizer that requires the installation of
+ATK.
+It is typically used to pipe input to a parser:
+
+ speech_input -tr | parse ++
+The method works only for grammars of English. +
++Both Flite and ATK are freely available through the links +above, but they are not distributed together with GF. +
+ ++The +Editor User Manual +describes the use of the editor, which works for any multilingual GF grammar. +
++Here is a snapshot of the editor: +
+
+
+
+The grammars of the snapshot are from the +Letter grammar package. +
+ ++Forthcoming. +
+ ++Other processes can communicate with the GF command interpreter, +and also with the GF syntax editor. +
+ ++GF grammars can be used as parts of programs written in the +following languages. The links give more documentation. +
+ + + +
+A summary is given in the following chart of GF grammar compiler phases:
+
+
+Formal and Informal Software Specifications, +PhD Thesis by +Kristofer Johannisson, is an extensive example of this. +The system is based on a multilingual grammar relating the formal language OCL with +English and German. +
++A simpler example will be explained here. +
diff --git a/doc/tutorial/gf-tutorial2.txt b/doc/tutorial/gf-tutorial2.txt index 696f5cbf8..cc5e323c0 100644 --- a/doc/tutorial/gf-tutorial2.txt +++ b/doc/tutorial/gf-tutorial2.txt @@ -1345,7 +1345,7 @@ concrete FoodsEng of Foods = open Prelude, MorphoEng in { } ; } - ``` +```