mirror of
https://github.com/GrammaticalFramework/gf-core.git
synced 2026-04-09 04:59:31 -06:00
new tutorial
288
doc/tutorial/gf-tutorial2.html
Normal file
@@ -0,0 +1,288 @@
|
||||
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
|
||||
<html><head><title>Grammatical Framework Tutorial</title></head>
|
||||
<body bgcolor="#ffffff" text="#000000">
|
||||
<center>
|
||||
|
||||
<img src="../gf-logo.gif">
|
||||
|
||||
<h1>Grammatical Framework Tutorial</h1>
|
||||
|
||||
<p>
|
||||
|
||||
<b>3rd Edition, for GF version 2.2 or later</b>
|
||||
|
||||
</p><p>
|
||||
|
||||
<a href="http://www.cs.chalmers.se/~aarne">Aarne Ranta</a>
|
||||
|
||||
</p>
|
||||
<p>
|
||||
|
||||
<tt>aarne@cs.chalmers.se</tt>
|
||||
</p></center>
<!-- NEW -->
|
||||
<h2>GF = Grammatical Framework</h2>
|
||||
|
||||
The term GF is used for different things:
|
||||
<ul>
|
||||
<li> a <b>program</b> used for working with grammars
|
||||
<li> a <b>programming language</b> in which grammars can be written
|
||||
<li> a <b>theory</b> about the concepts of grammars and languages
|
||||
</ul>
|
||||
|
||||
<p>
|
||||
|
||||
This tutorial is about the GF program and the GF programming language.
|
||||
It will guide you
|
||||
<ul>
|
||||
<li> to use the GF program
|
||||
<li> to write GF grammars
|
||||
<li> to write programs in which GF grammars are used as components
|
||||
</ul>
<!-- NEW -->
|
||||
<h2>The GF program</h2>
|
||||
|
||||
The program is open-source free software, which you can download from the
|
||||
GF Homepage:<br>
|
||||
<a href="http://www.cs.chalmers.se/%7Eaarne/GF">
|
||||
<tt>http://www.cs.chalmers.se/~aarne/GF</tt></a>
|
||||
|
||||
<p>
|
||||
|
||||
There you can download
|
||||
<ul>
|
||||
<li> ready-made binaries for Linux, Solaris, Macintosh, and Windows
|
||||
<li> source code and documentation
|
||||
<li> grammar libraries and examples
|
||||
</ul>
|
||||
If you want to compile GF from source, you need Haskell and Java
|
||||
compilers. But normally you don't have to compile, and you don't
|
||||
need to know Haskell or Java to use GF.
|
||||
|
||||
<p>
|
||||
|
||||
To start the GF program, assuming you have installed it, just type
|
||||
<pre>
|
||||
gf
|
||||
</pre>
|
||||
in the shell. You will see GF's welcome message and the prompt <tt>></tt>.
<!-- NEW -->
|
||||
<h2>My first grammar</h2>
|
||||
|
||||
Now you are ready to try out your first grammar.
|
||||
We start with one that is not written in the GF language, but
|
||||
in the EBNF notation (Extended Backus Naur Form), which GF can also
|
||||
understand. Type (or copy) the following lines in a file named
|
||||
<tt>stoneage.ebnf</tt>:
|
||||
<pre>
|
||||
S ::= NP VP ;
|
||||
VP ::= V | TV NP | "is" A ;
|
||||
NP ::= ("this" | "that" | "the" | "a") CN ;
|
||||
CN ::= A CN ;
|
||||
CN ::= "bird" | "boy" | "man" | "louse" | "snake" | "worm" ;
|
||||
A ::= "big" | "green" | "rotten" | "thick" | "warm" ;
|
||||
V ::= "laughs" | "sleeps" | "swims" ;
|
||||
TV ::= "eats" | "kills" | "washes" ;
|
||||
</pre>
<!-- NEW -->
|
||||
<h2>Importing grammars and parsing strings</h2>
|
||||
|
||||
The first GF command when using a grammar is to <b>import</b> it.
|
||||
The command has a long name, <tt>import</tt>, and a short name, <tt>i</tt>.
|
||||
<pre>
|
||||
> import stoneage.ebnf
|
||||
</pre>
|
||||
The GF program now <b>compiles</b> your grammar into an internal
|
||||
representation, and shows a new prompt when it is ready.
|
||||
|
||||
<p>
|
||||
|
||||
You can use GF for <b>parsing</b>:
|
||||
<pre>
|
||||
> parse "the boy eats a snake"
|
||||
Mks_0 (Mks_6 Mks_10) (Mks_2 Mks_23 (Mks_7 Mks_13))
|
||||
|
||||
> parse "the snake eats a boy"
|
||||
Mks_0 (Mks_6 Mks_13) (Mks_2 Mks_23 (Mks_7 Mks_10))
|
||||
</pre>
|
||||
The <tt>parse</tt> (= <tt>p</tt>) command takes a <b>string</b>
|
||||
(in double quotes) and returns an <b>abstract syntax tree</b> - the thing
|
||||
with <tt>Mks</tt>s and parentheses. We will see soon how to make sense
|
||||
of the abstract syntax trees - now you should just notice that the tree
|
||||
is different for the two strings.
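<p>

To get a feeling for what the parser does, here is a rough Python sketch, not GF's own algorithm. It covers only a subset of the stoneage grammar, and it uses the readable labels from <tt>doc/tutorial/neolithic.cf</tt> in this commit instead of the generated <tt>Mks</tt> labels. A tree is represented as a pair of a label and a list of subtrees; that representation, and the function names, are assumptions of the sketch.

```python
# Labelled context-free rules: label -> (category, right-hand side).
# Uppercase names are categories; lowercase strings are words of the language.
RULES = {
    "PredVP": ("S", ["NP", "VP"]),
    "UseV": ("VP", ["V"]),
    "ComplTV": ("VP", ["TV", "NP"]),
    "UseA": ("VP", ["is", "A"]),
    "This": ("NP", ["this", "CN"]),
    "Def": ("NP", ["the", "CN"]),
    "Indef": ("NP", ["a", "CN"]),
    "ModA": ("CN", ["A", "CN"]),
    "Boy": ("CN", ["boy"]),
    "Snake": ("CN", ["snake"]),
    "Big": ("A", ["big"]),
    "Laugh": ("V", ["laughs"]),
    "Eat": ("TV", ["eats"]),
}
CATS = {cat for cat, _ in RULES.values()}


def parse_cat(cat, words, pos):
    """Yield (tree, next_pos) for every way to read `cat` starting at words[pos]."""
    for label, (lhs, rhs) in RULES.items():
        if lhs == cat:
            yield from parse_rhs(label, rhs, words, pos, [])


def parse_rhs(label, rhs, words, pos, children):
    """Match the remaining items of one rule, collecting subtrees on the way."""
    if not rhs:
        yield (label, children), pos
        return
    item, rest = rhs[0], rhs[1:]
    if item in CATS:
        for child, nxt in parse_cat(item, words, pos):
            yield from parse_rhs(label, rest, words, nxt, children + [child])
    elif pos < len(words) and words[pos] == item:
        yield from parse_rhs(label, rest, words, pos + 1, children)


def parse(sentence, start="S"):
    """Return all trees of category `start` that cover the whole sentence."""
    words = sentence.split()
    return [tree for tree, end in parse_cat(start, words, 0) if end == len(words)]


def show(tree):
    """Print a tree in the prefix notation used by the GF shell."""
    label, children = tree
    parts = [show(c) if not c[1] else "(" + show(c) + ")" for c in children]
    return " ".join([label] + parts)


for tree in parse("the boy eats a snake"):
    print(show(tree))
```

The sketch simply tries every rule of the wanted category from left to right. This works here because no rule of the grammar starts with its own category; a grammar with such rules would send this naive method into an endless loop.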
|
||||
|
||||
<p>
|
||||
|
||||
Strings that return a tree when parsed do so in virtue of the grammar
|
||||
you imported. Try parsing something else, and the parser fails:
|
||||
<pre>
|
||||
> p "hello world"
|
||||
No success in cf parsing
|
||||
no tree found
|
||||
</pre>
<!-- NEW -->
|
||||
<h2>Generating trees and strings</h2>
|
||||
|
||||
You can also use GF for <b>linearizing</b>
|
||||
(<tt>linearize = l</tt>). This is the inverse of
|
||||
parsing, taking trees into strings:
|
||||
<pre>
|
||||
> linearize Mks_0 (Mks_6 Mks_13) (Mks_2 Mks_23 (Mks_7 Mks_10))
|
||||
the snake eats a boy
|
||||
</pre>
|
||||
What is the use of this? Typically not that you type in a tree at
|
||||
the GF prompt. The utility of linearization comes from the fact that
|
||||
you can obtain a tree from somewhere else. One way to do so is
|
||||
<b>random generation</b> (<tt>generate_random = gr</tt>):
|
||||
<pre>
|
||||
> generate_random
|
||||
Mks_0 (Mks_4 Mks_11) (Mks_3 Mks_15)
|
||||
</pre>
|
||||
Now you can copy the tree and paste it to the <tt>linearize</tt> command.
Or, more efficiently, feed random generation into linearization by using
a <b>pipe</b>:
|
||||
<pre>
|
||||
> gr | l
|
||||
this man is big
|
||||
</pre>
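<p>

Linearization can be pictured as walking through the tree and writing out the words of each rule. The following Python sketch shows the idea under some assumptions of its own: trees are pairs of a label and a list of subtrees, the labels are those of <tt>doc/tutorial/neolithic.cf</tt> in this commit, and only a few rules are included. It is not how GF itself is implemented.

```python
# label -> (category, right-hand side); uppercase items are categories,
# lowercase items are the words that the rule itself contributes.
RULES = {
    "PredVP": ("S", ["NP", "VP"]),
    "ComplTV": ("VP", ["TV", "NP"]),
    "UseA": ("VP", ["is", "A"]),
    "Def": ("NP", ["the", "CN"]),
    "Indef": ("NP", ["a", "CN"]),
    "Boy": ("CN", ["boy"]),
    "Snake": ("CN", ["snake"]),
    "Big": ("A", ["big"]),
    "Eat": ("TV", ["eats"]),
}
CATS = {cat for cat, _ in RULES.values()}


def linearize(tree):
    """Return the list of words that the tree stands for."""
    label, children = tree
    subtrees = iter(children)
    words = []
    for item in RULES[label][1]:
        if item in CATS:
            # A category item is filled by the next subtree, in order.
            words.extend(linearize(next(subtrees)))
        else:
            words.append(item)
    return words


tree = ("PredVP", [("Def", [("Snake", [])]),
                   ("ComplTV", [("Eat", []), ("Indef", [("Boy", [])])])])
print(" ".join(linearize(tree)))
```

The tree in the usage example has the same shape as the one given to <tt>linearize</tt> above, with readable labels in place of the <tt>Mks</tt> ones.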
<!-- NEW -->
|
||||
<h2>Some random-generated sentences</h2>
|
||||
|
||||
Random generation can be quite amusing. So you may want to
|
||||
generate ten strings with one and the same command:
|
||||
<pre>
|
||||
> gr -number=10 | l
|
||||
a snake laughs
|
||||
that man laughs
|
||||
the man swims
|
||||
this man is warm
|
||||
a louse is rotten
|
||||
that worm washes a man
|
||||
a boy swims
|
||||
a snake laughs
|
||||
a man washes this man
|
||||
this louse kills the boy
|
||||
</pre>
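<p>

The pipe <tt>gr | l</tt> can be imitated in a few lines of Python. This is only a sketch under stated assumptions: it picks uniformly among the rules of a category (GF's own strategy may differ), it uses a subset of the grammar with labels from <tt>doc/tutorial/neolithic.cf</tt>, and trees are pairs of a label and a list of subtrees.

```python
import random

# label -> (category, right-hand side); uppercase items are categories.
RULES = {
    "PredVP": ("S", ["NP", "VP"]),
    "UseV": ("VP", ["V"]),
    "ComplTV": ("VP", ["TV", "NP"]),
    "UseA": ("VP", ["is", "A"]),
    "This": ("NP", ["this", "CN"]),
    "That": ("NP", ["that", "CN"]),
    "Def": ("NP", ["the", "CN"]),
    "Indef": ("NP", ["a", "CN"]),
    "ModA": ("CN", ["A", "CN"]),
    "Boy": ("CN", ["boy"]),
    "Man": ("CN", ["man"]),
    "Louse": ("CN", ["louse"]),
    "Warm": ("A", ["warm"]),
    "Rotten": ("A", ["rotten"]),
    "Swim": ("V", ["swims"]),
    "Wash": ("TV", ["washes"]),
}
CATS = {cat for cat, _ in RULES.values()}


def generate(cat, rng):
    """Build a random tree: choose a rule for `cat`, then fill in its categories.

    The recursive rule ModA is chosen only some of the time, so the
    recursion ends with probability 1.
    """
    label = rng.choice([l for l, (lhs, _) in RULES.items() if lhs == cat])
    children = [generate(item, rng) for item in RULES[label][1] if item in CATS]
    return (label, children)


def linearize(tree):
    """Return the list of words that the tree stands for."""
    label, children = tree
    subtrees = iter(children)
    return [word
            for item in RULES[label][1]
            for word in (linearize(next(subtrees)) if item in CATS else [item])]


rng = random.Random(0)  # fixed seed, so that runs can be repeated
for _ in range(10):
    print(" ".join(linearize(generate("S", rng))))
```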
<!-- NEW -->
|
||||
<h2>Systematic generation</h2>
|
||||
|
||||
To generate <i>all</i> sentences that a grammar
|
||||
can generate, use the command <tt>generate_trees = gt</tt>.
|
||||
<pre>
> generate_trees | l
|
||||
this boy laughs
|
||||
this boy sleeps
|
||||
this boy swims
|
||||
this boy is big
|
||||
...
|
||||
a bird is rotten
|
||||
a bird is thick
|
||||
a bird is warm
|
||||
</pre>
|
||||
You get quite a few trees but not all of them: only up to a given
|
||||
<b>depth</b> of trees. To see how you can get more, use the
|
||||
<tt>help = h</tt> command,
|
||||
<pre>
|
||||
> h gt
|
||||
</pre>
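<p>

What a depth limit does can be tried out with the following Python sketch of systematic generation. It is an illustration under assumptions, not GF's implementation: a tree without subtrees is taken to have depth 1 (GF may count differently), one-word rules get a label made from the capitalized word, and trees are pairs of a label and a list of subtrees.

```python
from itertools import product

# label -> (category, right-hand side); uppercase items are categories.
RULES = {
    "PredVP": ("S", ["NP", "VP"]),
    "UseV": ("VP", ["V"]),
    "ComplTV": ("VP", ["TV", "NP"]),
    "UseA": ("VP", ["is", "A"]),
    "This": ("NP", ["this", "CN"]),
    "That": ("NP", ["that", "CN"]),
    "Def": ("NP", ["the", "CN"]),
    "Indef": ("NP", ["a", "CN"]),
    "ModA": ("CN", ["A", "CN"]),
}
LEXICON = {
    "CN": ["bird", "boy", "man", "louse", "snake", "worm"],
    "A": ["big", "green", "rotten", "thick", "warm"],
    "V": ["laughs", "sleeps", "swims"],
    "TV": ["eats", "kills", "washes"],
}
for cat, words in LEXICON.items():
    for word in words:
        # Hypothetical labels for this sketch, e.g. Boy for CN ::= "boy".
        RULES[word.capitalize()] = (cat, [word])
CATS = {cat for cat, _ in RULES.values()}


def trees(cat, depth):
    """Yield every tree of category `cat` whose depth is at most `depth`."""
    if depth == 0:
        return
    for label, (lhs, rhs) in RULES.items():
        if lhs != cat:
            continue
        slots = [item for item in rhs if item in CATS]
        # Combine every choice of subtree for every category item of the rule.
        for children in product(*(list(trees(s, depth - 1)) for s in slots)):
            yield (label, list(children))


for depth in range(1, 5):
    print(depth, len(list(trees("S", depth))))
```

With this way of counting, depth 3 gives 192 trees and depth 4 already gives 11520.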
|
||||
<b>Quiz</b>. If the command <tt>gt</tt> generated all
|
||||
trees in your grammar, it would never terminate. Why?
<!-- NEW -->
|
||||
<h2>More on pipes; tracing</h2>
|
||||
|
||||
A pipe of GF commands can have any length, but the "output type"
|
||||
(either string or tree) of one command must always match the "input type"
|
||||
of the next command.
|
||||
|
||||
<p>
|
||||
|
||||
The intermediate results in a pipe can be observed by putting the
|
||||
<b>tracing</b> flag <tt>-tr</tt> to each command whose output you
|
||||
want to see:
|
||||
<pre>
|
||||
> gr -tr | l -tr | p
|
||||
Mks_0 (Mks_6 Mks_13) (Mks_1 Mks_20)
|
||||
the snake laughs
|
||||
Mks_0 (Mks_6 Mks_13) (Mks_1 Mks_20)
|
||||
</pre>
|
||||
This facility is good for test purposes: for instance, you
|
||||
may want to see if a grammar is <b>ambiguous</b>, i.e.
|
||||
contains strings that can be parsed in more than one way.
<!-- NEW -->
|
||||
<h2>Writing and reading files</h2>
|
||||
|
||||
To save the output of GF commands in a file, you can
pipe it to the <tt>write_file = wf</tt> command,
|
||||
<pre>
|
||||
> gr -number=10 | l | write_file exx.tmp
|
||||
</pre>
|
||||
You can read the file back to GF with the
|
||||
<tt>read_file = rf</tt> command,
|
||||
<pre>
|
||||
> read_file exx.tmp | p -lines
|
||||
</pre>
|
||||
Notice the flag <tt>-lines</tt> given to the parsing
|
||||
command. This flag tells GF to parse each line of
|
||||
the file separately. Without the flag, the grammar could
|
||||
not recognize the string in the file, because it is not
|
||||
a sentence but a sequence of ten sentences.
<!-- NEW -->
|
||||
<h2>Labelled context-free grammars</h2>
|
||||
|
||||
<h3>Rules and labels</h3>
|
||||
|
||||
The syntax trees returned by GF's parser in the previous examples
|
||||
are not so nice to look at. The identifiers of the form <tt>Mks</tt>
|
||||
are <b>labels</b> of the EBNF rules. To see which label corresponds to
|
||||
which rule, you can use the <tt>print_grammar = pg</tt> command
|
||||
with the <tt>printer</tt> flag set to <tt>cf</tt> (which means context-free):
|
||||
<pre>
|
||||
> print_grammar -printer=cf
|
||||
Mks_10. CN ::= "boy" ;
|
||||
Mks_11. CN ::= "man" ;
|
||||
Mks_12. CN ::= "louse" ;
|
||||
Mks_13. CN ::= "snake" ;
|
||||
Mks_14. CN ::= "worm" ;
|
||||
Mks_8. CN ::= A CN ;
|
||||
Mks_9. CN ::= "bird" ;
|
||||
Mks_4. NP ::= "this" CN ;
|
||||
Mks_18. A ::= "thick" ;
|
||||
</pre>
|
||||
A syntax tree such as
|
||||
<pre>
|
||||
Mks_4 (Mks_8 Mks_18 Mks_14)
|
||||
this thick worm
|
||||
</pre>
|
||||
encodes the sequence of grammar rules used for building the
|
||||
expression. If you look at this tree, you will notice that <tt>Mks_4</tt>
|
||||
is the label of the rule prefixing <tt>this</tt> to a common noun,
|
||||
<tt>Mks_18</tt> is the label of the adjective <tt>thick</tt>,
|
||||
and so on.
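<p>

The connection between labels and rules can be made concrete with a small Python sketch. It reads a few lines in the format printed by <tt>print_grammar -printer=cf</tt> and then uses the labels to turn the tree above back into a string. The function names and the tree representation (a pair of a label and a list of subtrees) are assumptions of the sketch, and it relies on every quoted word being free of spaces.

```python
import re

# A few of the rules shown by print_grammar above.
PG_OUTPUT = '''
Mks_14. CN ::= "worm" ;
Mks_8. CN ::= A CN ;
Mks_4. NP ::= "this" CN ;
Mks_18. A ::= "thick" ;
'''

# One rule per line: Label. Category ::= items ;
RULE = re.compile(r'^\s*(\w+)\.\s*(\w+)\s*::=\s*(.*?)\s*;\s*$')


def read_rules(text):
    """Map each label to (category, list of right-hand side items)."""
    rules = {}
    for line in text.splitlines():
        match = RULE.match(line)
        if match:
            label, cat, rhs = match.groups()
            rules[label] = (cat, rhs.split())
    return rules


def linearize(rules, tree):
    """Follow the labels in the tree and collect the quoted words in order."""
    label, children = tree
    subtrees = iter(children)
    words = []
    for item in rules[label][1]:
        if item.startswith('"'):
            words.append(item.strip('"'))
        else:
            # A category item is filled by the next subtree, in order.
            words.extend(linearize(rules, next(subtrees)))
    return words


rules = read_rules(PG_OUTPUT)
tree = ("Mks_4", [("Mks_8", [("Mks_18", []), ("Mks_14", [])])])
print(" ".join(linearize(rules, tree)))
```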
</body>
|
||||
</html>
|
||||
26
doc/tutorial/neolithic.cf
Normal file
@@ -0,0 +1,26 @@
|
||||
PredVP. S ::= NP VP ;
|
||||
UseV. VP ::= V ;
|
||||
ComplTV. VP ::= TV NP ;
|
||||
UseA. VP ::= "is" A ;
|
||||
This. NP ::= "this" CN ;
|
||||
That. NP ::= "that" CN ;
|
||||
Def. NP ::= "the" CN ;
|
||||
Indef. NP ::= "a" CN ;
|
||||
ModA. CN ::= A CN ;
|
||||
Bird. CN ::= "bird" ;
|
||||
Boy. CN ::= "boy" ;
|
||||
Man. CN ::= "man" ;
|
||||
Louse. CN ::= "louse" ;
|
||||
Snake. CN ::= "snake" ;
|
||||
Worm. CN ::= "worm" ;
|
||||
Big. A ::= "big" ;
|
||||
Green. A ::= "green" ;
|
||||
Rotten. A ::= "rotten" ;
|
||||
Thick. A ::= "thick" ;
|
||||
Warm. A ::= "warm" ;
|
||||
Laugh. V ::= "laughs" ;
|
||||
Sleep. V ::= "sleeps" ;
|
||||
Swim. V ::= "swims" ;
|
||||
Eat. TV ::= "eats" ;
|
||||
Kill. TV ::= "kills" ;
|
||||
Wash. TV ::= "washes" ;
|
||||