3rd Edition, for GF version 2.2 or later
aarne@cs.chalmers.se
This tutorial is about the GF program and the GF programming language. It will guide you
There you can download
To start the GF program, assuming you have installed it, just type
gf
in the shell. You will see GF's welcome message and the prompt >.
S ::= NP VP ;
VP ::= V | TV NP | "is" A ;
NP ::= ("this" | "that" | "the" | "a") CN ;
CN ::= A CN ;
CN ::= "bird" | "boy" | "man" | "louse" | "snake" | "worm" ;
A ::= "big" | "green" | "rotten" | "thick" | "warm" ;
V ::= "laughs" | "sleeps" | "swims" ;
TV ::= "eats" | "kills" | "washes" ;
import stoneage.gf
The GF program now compiles your grammar into an internal representation, and shows a new prompt when it is ready.
You can use GF for parsing:
> parse "the boy eats a snake"
Mks_0 (Mks_6 Mks_10) (Mks_2 Mks_23 (Mks_7 Mks_13))
> parse "the snake eats a boy"
Mks_0 (Mks_6 Mks_13) (Mks_2 Mks_23 (Mks_7 Mks_10))

The parse (= p) command takes a string (in double quotes) and returns an abstract syntax tree: the expression built from Mks labels and parentheses. We will soon see how to make sense of abstract syntax trees; for now, just notice that the tree is different for the two strings.
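To make the idea of "string in, tree out" concrete, here is a minimal Python sketch of a backtracking parser for the grammar above. It is not how GF itself parses. The names RULES, parse_cat, parse_seq, show and parse are made up for this illustration, and the numbering of the Mks labels is an assumption inferred from the example trees shown in this tutorial (rules numbered in order of appearance, with each alternative counted separately).

```python
# Labelled rules as (category, right-hand side); the label Mks_N is the
# position N in this list. ASSUMPTION: this numbering is inferred from the
# trees printed in the tutorial, not taken from GF's source code.
# Upper-case symbols are categories, lower-case symbols are words.
RULES = [
    ("S", ["NP", "VP"]),
    ("VP", ["V"]), ("VP", ["TV", "NP"]), ("VP", ["is", "A"]),
    ("NP", ["this", "CN"]), ("NP", ["that", "CN"]),
    ("NP", ["the", "CN"]), ("NP", ["a", "CN"]),
    ("CN", ["A", "CN"]),
    ("CN", ["bird"]), ("CN", ["boy"]), ("CN", ["man"]),
    ("CN", ["louse"]), ("CN", ["snake"]), ("CN", ["worm"]),
    ("A", ["big"]), ("A", ["green"]), ("A", ["rotten"]),
    ("A", ["thick"]), ("A", ["warm"]),
    ("V", ["laughs"]), ("V", ["sleeps"]), ("V", ["swims"]),
    ("TV", ["eats"]), ("TV", ["kills"]), ("TV", ["washes"]),
]


def parse_cat(cat, tokens, pos):
    """Yield (tree, next_pos) for every way to read `cat` from `pos`."""
    for number, (lhs, rhs) in enumerate(RULES):
        if lhs == cat:
            yield from parse_seq("Mks_%d" % number, rhs, tokens, pos, [])


def parse_seq(label, rhs, tokens, pos, children):
    """Match the remaining symbols of one rule, collecting subtrees."""
    if not rhs:
        yield (label, *children), pos
        return
    head, rest = rhs[0], rhs[1:]
    if head.isupper():  # a category: parse a subtree
        for sub, nxt in parse_cat(head, tokens, pos):
            yield from parse_seq(label, rest, tokens, nxt, children + [sub])
    elif pos < len(tokens) and tokens[pos] == head:  # a word: must match
        yield from parse_seq(label, rest, tokens, pos + 1, children)


def show(tree, top=True):
    """Print a tree in the style used by GF: label followed by subtrees."""
    label, *kids = tree
    if not kids:
        return label
    text = " ".join([label] + [show(kid, False) for kid in kids])
    return text if top else "(" + text + ")"


def parse(sentence):
    """Return all trees (as strings) that cover the whole sentence."""
    tokens = sentence.split()
    return [show(tree) for tree, end in parse_cat("S", tokens, 0)
            if end == len(tokens)]


for tree in parse("the boy eats a snake"):
    print(tree)  # Mks_0 (Mks_6 Mks_10) (Mks_2 Mks_23 (Mks_7 Mks_13))
```

Swapping "boy" and "snake" in the input swaps Mks_10 and Mks_13 in the result, which is the difference between the two trees printed by GF above.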
Strings that return a tree when parsed do so in virtue of the grammar you imported. Try parsing something else, and parsing fails:
> p "hello world"
No success in cf parsing
no tree found

Generating trees and strings
You can also use GF for linearizing (linearize = l). This is the inverse of parsing, taking trees into strings:

> linearize Mks_0 (Mks_6 Mks_13) (Mks_2 Mks_23 (Mks_7 Mks_10))
the snake eats a boy

What is the use of this? Typically not that you type in a tree at the GF prompt. The utility of linearization comes from the fact that you can obtain a tree from somewhere else. One way to do so is random generation (generate_random = gr):

> generate_random
Mks_0 (Mks_4 Mks_11) (Mks_3 Mks_15)

Now you can copy the tree and paste it to the linearize command. Or, more efficiently, feed random generation into linearization by using a pipe:

> gr | l
this man is big

Some random-generated sentences
Random generation can be quite amusing. So you may want to generate ten strings with one and the same command:

> gr -number=10 | l
a snake laughs
that man laughs
the man swims
this man is warm
a louse is rotten
that worm washes a man
a boy swims
a snake laughs
a man washes this man
this louse kills the boy

Systematic generation
To generate all sentences that a grammar can generate, use the command generate_trees = gt. Linearized, its output looks like this:

this boy laughs
this boy sleeps
this boy swims
this boy is big
...
a bird is rotten
a bird is thick
a bird is warm

You get quite a few trees but not all of them: only those up to a given depth. To see how you can get more, use the help = h command,

h gr

Quiz. If the command gt generated all trees in your grammar, it would never terminate. Why?

More on pipes; tracing
A pipe of GF commands can have any length, but the "output type" (either string or tree) of one command must always match the "input type" of the next command.

The intermediate results in a pipe can be observed by adding the tracing flag -tr to each command whose output you want to see:
> gr -tr | l -tr | p
Mks_0 (Mks_6 Mks_13) (Mks_1 Mks_20)
the snake laughs
Mks_0 (Mks_6 Mks_13) (Mks_1 Mks_20)

This facility is good for test purposes: for instance, you may want to see if a grammar is ambiguous, i.e. contains strings that can be parsed in more than one way.

Writing and reading files
To save the output of GF commands to a file, pipe it to the write_file = wf command:

> gr -number=10 | l | write_file exx.tmp

You can read the file back into GF with the read_file = rf command:

> read_file exx.tmp | l -tr | p -lines

Notice the flag -lines given to the parsing command. This flag tells GF to parse each line of the file separately. Without the flag, the grammar could not recognize the string in the file, because it is not a sentence but a sequence of ten sentences.

Labelled context-free grammars
Rules and labels
The syntax trees returned by GF's parser in the previous examples are not so nice to look at. The identifiers of the form Mks are labels of the EBNF rules. To see which label corresponds to which rule, you can use the print_grammar = pg command with the printer flag set to cf (which means context-free):

> print_grammar -printer=cf
Mks_10. CN ::= "boy" ;
Mks_11. CN ::= "man" ;
Mks_12. CN ::= "louse" ;
Mks_13. CN ::= "snake" ;
Mks_14. CN ::= "worm" ;
Mks_8. CN ::= A CN ;
Mks_9. CN ::= "bird" ;
Mks_4. NP ::= "this" CN ;
Mks_18. A ::= "thick" ;

A syntax tree such as

Mks_4 (Mks_8 Mks_18 Mks_14)
this thick worm

encodes the sequence of grammar rules used for building the expression. If you look at this tree, you will notice that Mks_4 is the label of the rule prefixing this to a common noun, Mks_18 is the label of the adjective thick, and so on.
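The way a labelled tree encodes its string can be sketched in a few lines of Python. This is only an illustration, not GF's implementation: the names RULES, linearize and labels are made up here, and trees are written as nested Python tuples rather than in GF's own notation. The rule table contains just the rules listed in the print_grammar output above.

```python
# Label -> right-hand side, copied from the print_grammar output above.
# Upper-case symbols are categories, lower-case symbols are words.
RULES = {
    "Mks_4": ["this", "CN"],
    "Mks_8": ["A", "CN"],
    "Mks_9": ["bird"],
    "Mks_10": ["boy"],
    "Mks_11": ["man"],
    "Mks_12": ["louse"],
    "Mks_13": ["snake"],
    "Mks_14": ["worm"],
    "Mks_18": ["thick"],
}


def linearize(tree):
    """Turn a tree (label, subtree, ...) into its list of words."""
    label, *subtrees = tree
    remaining = iter(subtrees)
    words = []
    for symbol in RULES[label]:
        if symbol.isupper():
            # A category: its words come from the next subtree.
            words.extend(linearize(next(remaining)))
        else:
            # A word: it is contributed by the rule itself.
            words.append(symbol)
    return words


def labels(tree):
    """List the rule labels of a tree, parent before children."""
    label, *subtrees = tree
    return [label] + [inner for sub in subtrees for inner in labels(sub)]


# The tree Mks_4 (Mks_8 Mks_18 Mks_14) written as nested tuples.
tree = ("Mks_4", ("Mks_8", ("Mks_18",), ("Mks_14",)))
print(" ".join(linearize(tree)))  # this thick worm
print(labels(tree))  # ['Mks_4', 'Mks_8', 'Mks_18', 'Mks_14']
```

Each word of the output comes from exactly one rule: this from Mks_4, thick from Mks_18, and worm from Mks_14, while Mks_8 only combines the adjective with the noun.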