diff --git a/doc/gf-tutorial.html b/doc/gf-tutorial.html index 3e4197b4c..cc0f03a96 100644 --- a/doc/gf-tutorial.html +++ b/doc/gf-tutorial.html @@ -8,7 +8,7 @@
-Hands-on introduction to grammar writing in GF. +This is a hands-on introduction to grammar writing in GF.
Main ingredients of GF: @@ -395,7 +386,7 @@ using the GF system.
-A GF program is called a grammar.
@@ -403,7 +394,7 @@ A GF program is called a grammar. A grammar defines of a language.-From this definition, processing components can be derived: +From this definition, language processing components can be derived:
-To compile the interactive editor, you also need a Java compiler. -
-We assume a Unix shell: Bash in Linux, "terminal" in Mac OS X, or -Cygwin in Windows. +Cygwin in Windows. But you can do most things even without Cygwin in Windows.
@@ -634,8 +622,8 @@ Finnish and an Italian concrete syntaxes:
-In order to compile the grammar in GF, each of the f
-We create four files, named Modulename.gf:
+In order to compile the grammar in GF,
+we create four files, one for each module, named Modulename.gf:
Hello.gf HelloEng.gf HelloFin.gf HelloIta.gf
@@ -732,7 +720,9 @@ Default of the language flag (-lang): the last-imported concrete sy
ciao amici
hello friends
-
+
+As -multi is the default, it can be omitted.
+
@@ -777,7 +767,7 @@ You can use the
gf program in a Unix pipe.
- % echo "l -multi Hello Wordl" | gf HelloEng.gf HelloFin.gf HelloIta.gf + % echo "l Hello World" | gf HelloEng.gf HelloFin.gf HelloIta.gf
You can also write a script, a file containing the lines @@ -786,7 +776,7 @@ You can also write a script, a file containing the lines import HelloEng.gf import HelloFin.gf import HelloIta.gf - linearize -multi Hello World + linearize Hello World
@@ -798,15 +788,14 @@ You can also write a script, a file containing the lines
If we name this script hello.gfs, we can do
- $ gf -batch -s <hello.gfs s
+ $ gf --run <hello.gfs s
ciao mondo
terve maailma
hello world
-The options -batch and -s ("silent") remove prompts, CPU time,
-and other messages.
+The option --run removes prompts, CPU time, and other messages.
See Lesson 7 for stand-alone programs that don't need the GF system to run.
@@ -1041,7 +1030,7 @@ The default depth is 3; the depth can be
set by using the depth flag:
- > generate_trees -depth=5 | l + > generate_trees -depth=2 | l
What options a command has can be seen by the help = h command:
@@ -1099,17 +1088,16 @@ strings, and try out the ambiguity test.
To save the outputs into a file, pipe it to the write_file = wf command,
- > gr -number=10 | linearize | write_file exx.tmp + > gr -number=10 | linearize | write_file -file=exx.tmp
To read a file to GF, use the read_file = rf command,
- > read_file exx.tmp | parse -lines + > read_file -file=exx.tmp -lines | parse
-The flag -lines tells GF to parse each line of
-the file separately.
+The flag -lines tells GF to read each line of the file separately.
Files with examples can be used for regression testing
@@ -1131,16 +1119,24 @@ Human eye may prefer to see a visualization: visualize_tree = vt:
> parse "this delicious cheese is very Italian" | visualize_tree
+
+The tree is generated as a PostScript (.ps) file. The -view option specifies
+the command used to view the file. Its default is "gv", which works
+on most Linux installations. On a Mac, one would probably write
+
+ > parse "this delicious cheese is very Italian" | visualize_tree -view="open" +
-This command uses the programs Graphviz and Ghostview, which you +This command uses the program Graphviz, which you might not have, but which are freely available on the web.
-You can save the temporary file grphtmp.dot,
+You can save the temporary file _grph.dot,
which the command vt produces.
@@ -1148,7 +1144,7 @@ Then you can process this file with the dot
program (from the Graphviz package).
- % dot -Tpng grphtmp.dot > mytree.png + % dot -Tpng _grph.dot > mytree.png
@@ -1165,18 +1161,20 @@ You can give a system command without leaving GF: > ! open mytree.png
-System commands are those that receive arguments from
-GF pipes: ?.
+A system command may also receive its argument from
+a GF pipe. It then has the name sp = system_pipe:
- > generate_trees | ? wc + > generate_trees -depth=4 | sp -command="wc -l"- +
+This example command returns the number of generated trees. +
Exercise.
Measure how many trees the grammar FoodEng gives with depths 4 and 5,
respectively. Use the Unix word count command wc to count lines, and
-a pipe from a GF command into a Unix command.
+a system pipe from a GF command into a Unix command.
@@ -1198,7 +1196,7 @@ Just (?) replace English words with their dictionary equivalents: lin Is item quality = {s = item.s ++ "è" ++ quality.s} ; This kind = {s = "questo" ++ kind.s} ; - That kind = {s = "quello" ++ kind.s} ; + That kind = {s = "quel" ++ kind.s} ; QKind quality kind = {s = kind.s ++ quality.s} ; Wine = {s = "vino"} ; Cheese = {s = "formaggio"} ; @@ -1241,7 +1239,7 @@ which are introduced in Lesson 3.)
Food for some other language.
You will probably end up with grammatically incorrect
-linearizations --- but don't
+linearizations - but don't
worry about this yet.
Food for German, Swedish, or some
@@ -1307,11 +1305,11 @@ linearizations in different languages:
> gr -number=2 | tree_bank
Is (That Cheese) (Very Boring)
- quello formaggio è molto noioso
+ quel formaggio è molto noioso
that cheese is very boring
Is (That Cheese) Fresh
- quello formaggio è fresco
+ quel formaggio è fresco
that cheese is fresh
@@ -1322,33 +1320,6 @@ suitable for regression testing; see help tb for more details.
-translation_session = ts:
-you can translate between all the languages that are in scope.
-
-A dot . terminates the translation session.
-
- > ts - - trans> that very warm cheese is boring - quello formaggio molto caldo è noioso - that very warm cheese is boring - - trans> questo vino molto italiano è molto delizioso - questo vino molto italiano è molto delizioso - this very Italian wine is very delicious - - trans> . - > -- -
- -
-
translation_quiz = tq:
@@ -1356,7 +1327,7 @@ generate random sentences, display them in one language, and check the user's
answer given in another language.
- > translation_quiz FoodEng FoodIta
+ > translation_quiz -from=FoodEng -to=FoodIta
Welcome to GF Translation Quiz.
The quiz is over when you have done at least 10 examples
@@ -1376,73 +1347,13 @@ answer given in another language.
Score 1/2
this fish is expensive
-
-Off-line list of translation exercises: translation_list = tl
-
- > translation_list -number=25 FoodEng FoodIta | write_file transl.txt -
- -
-Any multilingual grammar can be used in the graphical syntax editor, opened -from Unix shell: -
-- % gfeditor FoodEng.gf FoodIta.gf --
-opens the editor for the two Food grammars.
-
-First choose a category from the "New" menu, e.g. Phrase:
-
-
-
-Then make refinements: choose of constructors from -the menu, until no metavariables (question marks) remain: -
-
-
-
- -
--Editing can be continued even when the tree is finished. The user can -
-QKind ? Fish, where the quality can be given in a later refinement
--Also: refinement by parsing: middle-click -in the tree or in the linearization field. -
-
-Exercise. Construct the sentence
-this very expensive cheese is very very delicious
-and its Italian translation by using gfeditor.
-
- -
- +
The grammar FoodEng could be written in a BNF format as follows:
@@ -1464,8 +1375,8 @@ The grammar FoodEng could be written in a BNF format as follows:
Warm. Quality ::= "warm" ;
-The GF system can convert BNF grammars into GF. BNF files are recognized
-by the file name suffix .cf:
+GF version 2.9 can be used to convert BNF grammars into GF.
+BNF files are recognized by the file name suffix .cf:
> import food.cf
@@ -1476,7 +1387,7 @@ It creates separate abstract and concrete modules.
-
+
Restrictions of context-free grammars
Separating concrete and abstract syntax allows
@@ -1495,7 +1406,7 @@ copy language {x x | x <- (a|b)*} in GF.
-
+
Modules and files
GF uses suffixes to recognize different file formats:
@@ -1510,22 +1421,19 @@ Importing generates target from source:
> i FoodEng.gf
- - compiling Food.gf... wrote file Food.gfc 16 msec
- - compiling FoodEng.gf... wrote file FoodEng.gfc 20 msec
+ - compiling Food.gf... wrote file Food.gfo 16 msec
+ - compiling FoodEng.gf... wrote file FoodEng.gfo 20 msec
-The GFC format (="GF Canonical") is the "machine code" of GF.
+The .gfo format (="GF Object") is precompiled GF, which is
+faster to load than source GF (.gf).
When reading a module, GF decides whether
-to use an existing .gfc file or to generate
+to use an existing .gfo file or to generate
a new one, by looking at modification times.
-In GF version 3, the gfc format is replaced by the format suffixed
-gfo, "GF object".
-
-
@@ -1544,9 +1452,9 @@ a second time? Try this in different situations:
-
+
Using operations and resource modules
-
+
Operation definitions
The golden rule of functional programming:
@@ -1608,7 +1516,7 @@ sugar for abstraction:
-
+
The ``resource`` module type
The resource module type is used to package
@@ -1627,7 +1535,7 @@ The resource module type is used to package
-
+
Opening a resource
Any number of resource modules can be
@@ -1660,7 +1568,7 @@ Any number of resource modules can be
-
+
Partial application
@@ -1698,7 +1606,7 @@ such that it allows you to write
-
+
Testing resource modules
Import with the flag -retain,
@@ -1711,20 +1619,18 @@ Compute the value with compute_concrete = cc,
> compute_concrete prefix "in" (ss "addition")
- {
- s : Str = "in" ++ "addition"
- }
+ {s : Str = "in" ++ "addition"}
-
+
Grammar architecture
-
+
Extending a grammar
A new module can extend an old one:
@@ -1781,7 +1687,7 @@ possible to build resource hierarchies.
-
+
Multiple inheritance
Extend several grammars at the same time:
@@ -1815,43 +1721,7 @@ where
-
-Visualizing module structure
-
-visualize_graph = vg,
-
-
- > visualize_graph
-
-
-and the graph will pop up in a separate window:
-
-
-
-
-
-The graph uses
-
-
-You can also print
-the graph into a .dot file by using the command print_multi = pm:
-
- > print_multi -printer=graph | write_file Foodmarket.dot - > ! dot -Tpng Foodmarket.dot > Foodmarket.png -- -
- -
- +@@ -1880,7 +1750,7 @@ could be left to library implementors.
- +
Plural forms are needed in things like @@ -1913,7 +1783,7 @@ adjectives, and verbs can have in some languages that you know.
- +
We define the parameter type of number in English by
@@ -2021,7 +1891,7 @@ module, which you can test by using the command compute_concrete.
- +
A morphological paradigm is a formula telling how a class of
@@ -2073,7 +1943,7 @@ uses a wild card pattern _.
- +
regNoun paradigm does not
@@ -2086,7 +1956,7 @@ considered in earlier exercises.
- +
Purpose: a more radical @@ -2111,7 +1981,7 @@ This will force us to deal with gender-
- +
In English, the phrase-forming rule @@ -2153,7 +2023,7 @@ Now we can write
- +
How does an Item subject receive its number? The rules
@@ -2223,7 +2093,7 @@ In a more lexicalized grammar, determiners would be a category:
- +
Kinds have number as a parametric feature: both singular and plural
@@ -2291,7 +2161,7 @@ Notice
- +
We use some string operations from the library Prelude.
@@ -2356,7 +2226,7 @@ We use some string operations from the library Prelude.
- +
@@ -2370,7 +2240,7 @@ add words to a lexicon.
- +
We perform data abstraction from the type @@ -2460,8 +2330,8 @@ parameters.
- -
The regular dog-dogs paradigm has
predictable variations:
@@ -2527,7 +2397,7 @@ the suffix "oo" prevents bamboo from matching the suffix
- +
- +
In Lesson 5, dependent function types need a notation @@ -2608,7 +2478,7 @@ looking like the expected forms:
- +
In libraries, it is useful to group type signatures separately from
@@ -2628,7 +2498,7 @@ With the interface and instance module types
- +
Overloading: different functions can be given the same name, as e.g. in C++. @@ -2650,7 +2520,7 @@ Example: different ways to define nouns in English: }
-Cf. dictionaries: ff the +Cf. dictionaries: if the word is regular, just one form is needed. If it is irregular, more forms are given.
@@ -2670,7 +2540,7 @@ an overload group.- +
The command morpho_analyse = ma
@@ -2707,7 +2577,7 @@ To create a list for later use, use the command morpho_list = ml
- +
@@ -2821,9 +2691,9 @@ The complete set of linearization rules: Is item quality = ss (item.s ++ copula item.n ++ quality.s ! item.g ! item.n) ; This = det Sg "questo" "questa" ; - That = det Sg "quello" "quella" ; + That = det Sg "quel" "quella" ; These = det Pl "questi" "queste" ; - Those = det Pl "quelli" "quelle" ; + Those = det Pl "quei" "quelle" ; QKind quality kind = { s = \\n => kind.s ! n ++ quality.s ! kind.g ! n ; g = kind.g @@ -2845,7 +2715,7 @@ The complete set of linearization rules:
- +
FoodsIta. You can do this by printing the grammar in the context-free format
-(print_grammar -printer=cfg) and counting the lines.
+(print_grammar -printer=bnf) and counting the lines.
- +
A linearization record may contain more strings than one, and those @@ -2903,7 +2773,7 @@ but can be defined in GF by using discontinuous constituents.
- +
Tokens are created in the following ways: @@ -2956,19 +2826,13 @@ after linearization.
Correspondingly, a lexer that e.g. analyses "warm?" into
-to tokens is needed before parsing. Both can be given in a grammar
-by using flags:
-
- flags lexer=text ; unlexer=text ; --
-More on lexers and unlexers will be told here. +two tokens is needed before parsing. +This topic will be covered here.
- +
@@ -3033,7 +2897,7 @@ Thus
- +
@@ -3050,14 +2914,14 @@ Goals:
- +
The current 12 resource languages are
Arabic (incomplete)
-Catalan (incomplete)
+Bulgarian
+Catalan
Danish
English
Finnish
@@ -3077,7 +2941,7 @@ The first three letters (Eng etc) are used in grammar module names
- +
@@ -3099,7 +2963,7 @@ wider coverage than with semantic grammars.
- +
A resource grammar has two kinds of categories and two kinds of rules: @@ -3127,7 +2991,7 @@ But it is a good discipline to follow.
- +
Two kinds of lexical categories: @@ -3160,7 +3024,7 @@ Two kinds of lexical categories:
- +
Closed classes: module Syntax. In the Foods grammar, we need
@@ -3193,7 +3057,7 @@ where we use mkN from ParadigmsEng:
- +
Alternative concrete syntax for @@ -3224,7 +3088,7 @@ Advantages:
- +
In Foods, we need just four phrasal categories:
@@ -3245,7 +3109,7 @@ Common nouns are made into noun phrases by adding determiners.
- +
We need the following combinations: @@ -3274,7 +3138,7 @@ Heavy overloading: the current library
- +
The sentence @@ -3300,7 +3164,7 @@ this syntactic tree gives the value of linearizing the semantic tree
- +
Language-specific and language-independent parts - roughly, @@ -3322,7 +3186,7 @@ Full API documentation on-line: the resource synopsis,
- +
-All currently available formats can be seen in gf with help -printer.
+All currently available formats can be seen with gfc --help.