From e24a8298f0bd527c155984306ad8a09273e05943 Mon Sep 17 00:00:00 2001
From: aarne
Date: Sat, 16 Apr 2005 11:32:08 +0000
Subject: [PATCH] new tutorial

---
 doc/tutorial/gf-tutorial2.html | 288 +++++++++++++++++++++++++++++++++
 doc/tutorial/neolithic.cf | 26 +++
 2 files changed, 314 insertions(+)
 create mode 100644 doc/tutorial/gf-tutorial2.html
 create mode 100644 doc/tutorial/neolithic.cf

diff --git a/doc/tutorial/gf-tutorial2.html b/doc/tutorial/gf-tutorial2.html
new file mode 100644
index 000000000..22490f8dd
--- /dev/null
+++ b/doc/tutorial/gf-tutorial2.html
@@ -0,0 +1,288 @@
+ + + +
+ + + +

Grammatical Framework Tutorial

+ +

+ +3rd Edition, for GF version 2.2 or later + +

+ +Aarne Ranta + +

+

+ +aarne@cs.chalmers.se +

+ + + +

GF = Grammatical Framework

+ +The term GF is used for different things: + + +

+ +This tutorial is about the GF program and the GF programming language. +It will guide you +

+ + + +

The GF program

+ +The program is open-source free software, which you can download from the +GF Homepage:
+ +http://www.cs.chalmers.se/~aarne/GF + +

+ +There you can download +

+If you want to compile GF from source, you need Haskell and Java +compilers. But normally you don't have to compile, and you don't +need to know Haskell or Java to use GF. + +

+ +To start the GF program, assuming you have installed it, just type +

+  gf
+
+in the shell. You will see GF's welcome message and the prompt >. + + + +

My first grammar

+ +Now you are ready to try out your first grammar. +We start with one that is not written in the GF language, but +in the EBNF notation (Extended Backus Naur Form), which GF can also +understand. Type (or copy) the following lines into a file named +stoneage.ebnf: +
+  S   ::= NP VP ;
+  VP  ::= V | TV NP | "is" A ;
+  NP  ::= ("this" | "that" | "the" | "a") CN ;
+  CN  ::= A CN ;
+  CN  ::= "bird" | "boy" | "man" | "louse" | "snake" | "worm" ;
+  A   ::= "big" | "green" | "rotten" | "thick" | "warm" ;
+  V   ::= "laughs" | "sleeps" | "swims" ;
+  TV  ::= "eats" | "kills" | "washes" ;
+
+ + + +

Importing grammars and parsing strings

+ +The first GF command when using a grammar is to import it. +The command has a long name, import, and a short name, i. +
+  import stoneage.ebnf
+
+The GF program now compiles your grammar into an internal +representation, and shows a new prompt when it is ready. + +

+ +You can use GF for parsing: +

+  > parse "the boy eats a snake"
+  Mks_0 (Mks_6 Mks_10) (Mks_2 Mks_23 (Mks_7 Mks_13))
+
+  > parse "the snake eats a boy"
+  Mks_0 (Mks_6 Mks_13) (Mks_2 Mks_23 (Mks_7 Mks_10))
+
+The parse (= p) command takes a string +(in double quotes) and returns an abstract syntax tree - the thing +with Mks labels and parentheses. We will see soon how to make sense +of the abstract syntax trees - now you should just notice that the tree +is different for the two strings. + +

+ +Strings that return a tree when parsed do so in virtue of the grammar +you imported. Try parsing a string that the grammar does not cover, and the parser fails: +

+  > p "hello world"
+  No success in cf parsing
+  no tree found
+
+
+
+
+

Generating trees and strings

+ +You can also use GF for linearizing +(linearize = l). This is the inverse of +parsing, taking trees into strings: +
+  > linearize Mks_0 (Mks_6 Mks_13) (Mks_2 Mks_23 (Mks_7 Mks_10))
+  the snake eats a boy
+
+What is the use of this? Typically not that you type in a tree at +the GF prompt. The utility of linearization comes from the fact that +you can obtain a tree from somewhere else. One way to do so is +random generation (generate_random = gr): +
+  > generate_random
+  Mks_0 (Mks_4 Mks_11) (Mks_3 Mks_15)
+
+Now you can copy the tree and paste it to the linearize command. +Or, more efficiently, feed random generation into linearization by using +a pipe. +
+  > gr | l
+  this man is big
+
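The effect of the gr | l pipe can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not how GF works internally: the rule labels are borrowed from the neolithic.cf file added by this same patch (only a subset of the rules is shown), and the tree representation, the helper names and the budget parameter are hypothetical.

```python
import random

# Labelled rules: label -> (category, right-hand side). A right-hand-side
# item that names a category is a nonterminal; anything else is a word.
# Labels follow neolithic.cf; this is only a subset of that grammar.
RULES = {
    "PredVP":  ("S",  ["NP", "VP"]),
    "UseV":    ("VP", ["V"]),
    "ComplTV": ("VP", ["TV", "NP"]),
    "UseA":    ("VP", ["is", "A"]),
    "Def":     ("NP", ["the", "CN"]),
    "Indef":   ("NP", ["a", "CN"]),
    "ModA":    ("CN", ["A", "CN"]),
    "Boy":     ("CN", ["boy"]),
    "Snake":   ("CN", ["snake"]),
    "Big":     ("A",  ["big"]),
    "Laugh":   ("V",  ["laughs"]),
    "Eat":     ("TV", ["eats"]),
}
CATEGORIES = {cat for cat, _ in RULES.values()}


def linearize(tree):
    """Map a tree (label, [subtrees]) to its string."""
    label, subtrees = tree
    _, rhs = RULES[label]
    remaining = iter(subtrees)
    words = []
    for item in rhs:
        if item in CATEGORIES:
            # Nonterminal: consume the next subtree, left to right.
            words.append(linearize(next(remaining)))
        else:
            words.append(item)
    return " ".join(words)


def generate_random(cat, rng, budget=2):
    """Pick a random tree of category cat.

    budget caps how often a directly recursive rule (here only ModA) may be
    used along one branch, which guarantees termination for this grammar.
    """
    labels = [label for label, (c, rhs) in RULES.items()
              if c == cat and (budget > 0 or cat not in rhs)]
    label = rng.choice(labels)
    _, rhs = RULES[label]
    if cat in rhs:
        budget -= 1
    return (label, [generate_random(item, rng, budget)
                    for item in rhs if item in CATEGORIES])


tree = ("PredVP", [("Def", [("Snake", [])]),
                   ("ComplTV", [("Eat", []), ("Indef", [("Boy", [])])])])
print(linearize(tree))                                   # the snake eats a boy
print(linearize(generate_random("S", random.Random())))  # some random sentence
```

Feeding the result of generate_random straight into linearize mirrors the pipe: the first step produces a tree, the second turns it into a string.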
+ + + +

Some random-generated sentences

+ +Random generation can be quite amusing. So you may want to +generate ten strings with one and the same command: +
+  > gr -number=10 | l
+  a snake laughs
+  that man laughs
+  the man swims
+  this man is warm
+  a louse is rotten
+  that worm washes a man
+  a boy swims
+  a snake laughs
+  a man washes this man
+  this louse kills the boy
+
+ + + +

Systematic generation

+ +To generate all sentences that a grammar +can generate, use the command generate_trees = gt. +
+  > generate_trees | l
+  this boy laughs
+  this boy sleeps
+  this boy swims
+  this boy is big
+  ...
+  a bird is rotten
+  a bird is thick
+  a bird is warm
+
+You get quite a few trees but not all of them: only up to a given +depth of trees. To see how you can get more, use the +help = h command, +
+  h gt
+
+Quiz. If the command gt generated all +trees in your grammar, it would never terminate. Why? + + + +

More on pipes; tracing

+ +A pipe of GF commands can have any length, but the "output type" +(either string or tree) of one command must always match the "input type" +of the next command. + +

+ +The intermediate results in a pipe can be observed by adding the +tracing flag -tr to each command whose output you +want to see: +

+  > gr -tr | l -tr | p
+  Mks_0 (Mks_6 Mks_13) (Mks_1 Mks_20)
+  the snake laughs
+  Mks_0 (Mks_6 Mks_13) (Mks_1 Mks_20)
+
+This facility is good for test purposes: for instance, you +may want to see if a grammar is ambiguous, i.e. +contains strings that can be parsed in more than one way. + + + + +
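The notion of ambiguity can be made concrete with a brute-force Python sketch: enumerate all trees up to a given depth, linearize each, and report the strings that arise from more than one tree. This is an illustration only, not GF's method. The labels are a subset of those in neolithic.cf, the helper names are hypothetical, and the rule Large is a hypothetical duplicate added purely to force an ambiguity.

```python
from collections import defaultdict
from itertools import product

# Labelled rules: label -> (category, right-hand side); subset of neolithic.cf.
RULES = {
    "PredVP": ("S",  ["NP", "VP"]),
    "UseV":   ("VP", ["V"]),
    "UseA":   ("VP", ["is", "A"]),
    "Def":    ("NP", ["the", "CN"]),
    "ModA":   ("CN", ["A", "CN"]),
    "Boy":    ("CN", ["boy"]),
    "Big":    ("A",  ["big"]),
    "Laugh":  ("V",  ["laughs"]),
}


def categories(rules):
    return {cat for cat, _ in rules.values()}


def trees(rules, cat, depth):
    """Yield every tree of category cat that is at most depth rules deep."""
    if depth == 0:
        return
    cats = categories(rules)
    for label, (c, rhs) in rules.items():
        if c != cat:
            continue
        # All subtree choices for each nonterminal on the right-hand side.
        kids = [list(trees(rules, item, depth - 1))
                for item in rhs if item in cats]
        for combo in product(*kids):
            yield (label, list(combo))


def linearize(rules, tree):
    """Map a tree (label, [subtrees]) to its string."""
    label, subtrees = tree
    cats = categories(rules)
    remaining = iter(subtrees)
    words = []
    for item in rules[label][1]:
        words.append(linearize(rules, next(remaining)) if item in cats else item)
    return " ".join(words)


def ambiguous(rules, cat, depth):
    """Return {string: [trees]} for strings that have more than one tree."""
    by_string = defaultdict(list)
    for tree in trees(rules, cat, depth):
        by_string[linearize(rules, tree)].append(tree)
    return {s: ts for s, ts in by_string.items() if len(ts) > 1}


print(ambiguous(RULES, "S", 5))          # no string has two trees

# Hypothetical extra rule duplicating Big, added only to force ambiguity.
clash = dict(RULES, Large=("A", ["big"]))
print(sorted(ambiguous(clash, "S", 4)))  # strings with several trees
```

The depth bound keeps the enumeration finite; within that bound, an empty result means no ambiguity was found, not that the grammar is proven unambiguous.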

Writing and reading files

+ +To save the output of GF commands in a file, +pipe it to the write_file = wf command, +
+  > gr -number=10 | l | write_file exx.tmp
+
+You can read the file back to GF with the +read_file = rf command, +
+  > read_file exx.tmp | p -lines
+
+Notice the flag -lines given to the parsing +command. This flag tells GF to parse each line of +the file separately. Without the flag, the grammar could +not recognize the string in the file, because it is not +a sentence but a sequence of ten sentences. + + + + +

Labelled context-free grammars

+ +

Rules and labels

+ +The syntax trees returned by GF's parser in the previous examples +are not so nice to look at. The identifiers of form Mks +are labels of the EBNF rules. To see which label corresponds to +which rule, you can use the print_grammar = pg command +with the printer flag set to cf (which means context-free): +
+  > print_grammar -printer=cf
+  Mks_10. CN ::= "boy" ;
+  Mks_11. CN ::= "man" ;
+  Mks_12. CN ::= "louse" ;
+  Mks_13. CN ::= "snake" ;
+  Mks_14. CN ::= "worm" ;
+  Mks_8.  CN ::= A CN ;
+  Mks_9.  CN ::= "bird" ;
+  Mks_4.  NP ::= "this" CN ;
+  Mks_18. A  ::= "thick" ;
+
+A syntax tree such as +
+  Mks_4 (Mks_8 Mks_18 Mks_14)
+  this thick worm
+
+encodes the sequence of grammar rules used for building the +expression. If you look at this tree, you will notice that Mks_4 +is the label of the rule prefixing this to a common noun, +Mks_18 is the label of the adjective thick, +and so on. + + + + + + +
\ No newline at end of file
diff --git a/doc/tutorial/neolithic.cf b/doc/tutorial/neolithic.cf
new file mode 100644
index 000000000..d9869a257
--- /dev/null
+++ b/doc/tutorial/neolithic.cf
@@ -0,0 +1,26 @@
+PredVP. S ::= NP VP ;
+UseV. VP ::= V ;
+ComplTV. VP ::= TV NP ;
+UseA. VP ::= "is" A ;
+This. NP ::= "this" CN ;
+That. NP ::= "that" CN ;
+Def. NP ::= "the" CN ;
+Indef. NP ::= "a" CN ;
+ModA. CN ::= A CN ;
+Bird. CN ::= "bird" ;
+Boy. CN ::= "boy" ;
+Man. CN ::= "man" ;
+Louse. CN ::= "louse" ;
+Snake. CN ::= "snake" ;
+Worm. CN ::= "worm" ;
+Big. A ::= "big" ;
+Green. A ::= "green" ;
+Rotten. A ::= "rotten" ;
+Thick. A ::= "thick" ;
+Warm. A ::= "warm" ;
+Laugh. V ::= "laughs" ;
+Sleep. V ::= "sleeps" ;
+Swim. V ::= "swims" ;
+Eat. TV ::= "eats" ;
+Kill. TV ::= "kills" ;
+Wash. TV ::= "washes" ;