From 307042a6a1863854920da7eaae6fbc588457221c Mon Sep 17 00:00:00 2001
From: aarne
Date: Wed, 1 Oct 2008 13:13:10 +0000
Subject: [PATCH] refreshed the tutorial

---
 doc/gf-tutorial.html                   | 1010 ++++++++++--------------
 examples/tutorial/semantics/SemBase.hs |    3 +-
 examples/tutorial/semantics/Top.hs     |    8 +-
 3 files changed, 424 insertions(+), 597 deletions(-)

diff --git a/doc/gf-tutorial.html b/doc/gf-tutorial.html
index 3e4197b4c..cc0f03a96 100644
--- a/doc/gf-tutorial.html
+++ b/doc/gf-tutorial.html
@@ -8,7 +8,7 @@

Grammatical Framework Tutorial

Aarne Ranta
-Version 3, February 2008 +Version 3.1, October 2008

@@ -23,7 +23,7 @@ Version 3, February 2008
  • Lesson 1: Getting Started with GF -
  • Lesson 3: Grammars with parameters +
  • Lesson 3: Grammars with parameters -
  • Lesson 4: Using the resource grammar library +
  • Lesson 4: Using the resource grammar library -
  • Lesson 5: Refining semantics in abstract syntax +
  • Lesson 5: Refining semantics in abstract syntax -
  • Lesson 7: Embedded grammars +
  • Lesson 7: Embedded grammars @@ -273,7 +264,7 @@ Version 3, February 2008

    Overview

    -Hands-on introduction to grammar writing in GF. +This is a hands-on introduction to grammar writing in GF.

    Main ingredients of GF: @@ -395,7 +386,7 @@ using the GF system.

    -

    GF grammars and processing tasks

    +

    GF grammars and language processing tasks

    A GF program is called a grammar.

    @@ -403,7 +394,7 @@ A GF program is called a grammar. A grammar defines of a language.

    -From this definition, processing components can be derived: +From this definition, language processing components can be derived:

    -    % echo "l -multi Hello Wordl" | gf HelloEng.gf HelloFin.gf HelloIta.gf
    +    % echo "l Hello World" | gf HelloEng.gf HelloFin.gf HelloIta.gf
     

    You can also write a script, a file containing the lines @@ -786,7 +776,7 @@ You can also write a script, a file containing the lines import HelloEng.gf import HelloFin.gf import HelloIta.gf - linearize -multi Hello World + linearize Hello World

    @@ -798,15 +788,14 @@ You can also write a script, a file containing the lines If we name this script hello.gfs, we can do

    -    $ gf -batch -s <hello.gfs s
    +    $ gf --run <hello.gfs s
       
         ciao mondo
         terve maailma
         hello world
     

    -The options -batch and -s ("silent") remove prompts, CPU time, -and other messages. +The option --run removes prompts, CPU time, and other messages.

    See Lesson 7, for stand-alone programs that don't need the GF system to run. @@ -1041,7 +1030,7 @@ The default depth is 3; the depth can be set by using the depth flag:

    -    > generate_trees -depth=5 | l
    +    > generate_trees -depth=2 | l
     

    What options a command has can be seen by the help = h command: @@ -1099,17 +1088,16 @@ strings, and try out the ambiguity test. To save the outputs into a file, pipe it to the write_file = wf command,

    -    > gr -number=10 | linearize | write_file exx.tmp
    +    > gr -number=10 | linearize | write_file -file=exx.tmp
     

    To read a file to GF, use the read_file = rf command,

    -    > read_file exx.tmp | parse -lines
    +    > read_file -file=exx.tmp -lines | parse
     

    -The flag -lines tells GF to parse each line of -the file separately. +The flag -lines tells GF to read each line of the file separately.

    Files with examples can be used for regression testing @@ -1131,16 +1119,24 @@ Human eye may prefer to see a visualization: visualize_tree = vt:

         > parse "this delicious cheese is very Italian" | visualize_tree
     
    +

    +The tree is generated as a PostScript (.ps) file. The -view option tells GF +which command to use to view the file. Its default is "gv", which works +on most Linux installations. On a Mac, one would probably write +

    +
    +    > parse "this delicious cheese is very Italian" | visualize_tree -view="open"
    +

    -This command uses the programs Graphviz and Ghostview, which you +This command uses the program Graphviz, which you might not have, but which are freely available on the web.

    -You can save the temporary file grphtmp.dot, +You can save the temporary file _grph.dot, which the command vt produces.

    @@ -1148,7 +1144,7 @@ Then you can process this file with the dot program (from the Graphviz package).

    -    % dot -Tpng grphtmp.dot > mytree.png
    +    % dot -Tpng _grph.dot > mytree.png
     

    @@ -1165,18 +1161,20 @@ You can give a system command without leaving GF: > ! open mytree.png

    -System commands are those that receive arguments from -GF pipes: ?. +A system command may also receive its argument from +a GF pipe. It then has the name sp = system_pipe:

    -    > generate_trees | ? wc
    +    > generate_trees -depth=4 | sp -command="wc -l"
     
    -

    +

    +This example command returns the number of generated trees. +

    Exercise. Measure how many trees the grammar FoodEng gives with depths 4 and 5, respectively. Use the Unix word count command wc to count lines, and -a pipe from a GF command into a Unix command. +a system pipe from a GF command into a Unix command.
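The counting step on the Unix side can be tried without GF. The following is a minimal sketch, assuming the piped text has one linearization per line; the three sentences are made-up stand-ins for GF output, not results from any grammar.

```shell
# Stand-in for text arriving through a pipe: three lines, one sentence each.
# (Hypothetical data; inside GF the text would come from the command before sp.)
count=$(printf '%s\n' \
  "this cheese is warm" \
  "that wine is fresh" \
  "this fish is boring" | wc -l)

# Some wc implementations pad the number with spaces;
# arithmetic expansion normalizes it to a plain integer.
count=$((count))
echo "$count"
```

Because wc -l counts newline characters, the result equals the number of sentences only when every sentence ends with a newline.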

    @@ -1198,7 +1196,7 @@ Just (?) replace English words with their dictionary equivalents: lin Is item quality = {s = item.s ++ "è" ++ quality.s} ; This kind = {s = "questo" ++ kind.s} ; - That kind = {s = "quello" ++ kind.s} ; + That kind = {s = "quel" ++ kind.s} ; QKind quality kind = {s = kind.s ++ quality.s} ; Wine = {s = "vino"} ; Cheese = {s = "formaggio"} ; @@ -1241,7 +1239,7 @@ which are introduced in Lesson 3.)

    1. Write a concrete syntax of Food for some other language. You will probably end up with grammatically incorrect -linearizations --- but don't +linearizations - but don't worry about this yet.

    2. If you have written Food for German, Swedish, or some @@ -1307,11 +1305,11 @@ linearizations in different languages: > gr -number=2 | tree_bank Is (That Cheese) (Very Boring) - quello formaggio è molto noioso + quel formaggio è molto noioso that cheese is very boring Is (That Cheese) Fresh - quello formaggio è fresco + quel formaggio è fresco that cheese is fresh

      @@ -1322,33 +1320,6 @@ suitable for regression testing; see help tb for more details.

      -

      Translation session

      -

      -translation_session = ts: -you can translate between all the languages that are in scope. -

      -

      -A dot . terminates the translation session. -

      -
      -    > ts
      -  
      -    trans> that very warm cheese is boring
      -    quello formaggio molto caldo è noioso
      -    that very warm cheese is boring
      -  
      -    trans> questo vino molto italiano è molto delizioso
      -    questo vino molto italiano è molto delizioso
      -    this very Italian wine is very delicious
      -  
      -    trans> .
      -    >
      -
      -

      -

      - -

      -

      Translation quiz

      translation_quiz = tq: @@ -1356,7 +1327,7 @@ generate random sentences, display them in one language, and check the user's answer given in another language.

      -    > translation_quiz FoodEng FoodIta
      +    > translation_quiz -from=FoodEng -to=FoodIta
         
           Welcome to GF Translation Quiz.
           The quiz is over when you have done at least 10 examples
      @@ -1376,73 +1347,13 @@ answer given in another language.
           Score 1/2
           this fish is expensive
       
      -

      -Off-line list of translation exercises: translation_list = tl -

      -
      -    > translation_list -number=25 FoodEng FoodIta | write_file transl.txt
      -

      - -

      Multilingual syntax editing

      -

      - -

      -

      -Any multilingual grammar can be used in the graphical syntax editor, opened -from Unix shell: -

      -
      -    % gfeditor FoodEng.gf FoodIta.gf 
      -
      -

      -opens the editor for the two Food grammars. -

      -

      -First choose a category from the "New" menu, e.g. Phrase: -

      -

      - -

      -

      -Then make refinements: choose of constructors from -the menu, until no metavariables (question marks) remain: -

      -

      - -

      -

      - -

      -

      -Editing can be continued even when the tree is finished. The user can -

      -
        -
      • shift focus to any subtree by clicking at it -
      • to change "fish" to "cheese" or "wine" -
      • to delete "fish", i.e. change it to a metavariable -
      • to wrap "fish" in a qualification, i.e. change it to - QKind ? Fish, where the quality can be given in a later refinement -
      - -

      -Also: refinement by parsing: middle-click -in the tree or in the linearization field. -

      -

      -Exercise. Construct the sentence -this very expensive cheese is very very delicious -and its Italian translation by using gfeditor. -

      -

      - -

      - +

      Context-free grammars and GF

      - +

      The "cf" grammar format

      The grammar FoodEng could be written in a BNF format as follows: @@ -1464,8 +1375,8 @@ The grammar FoodEng could be written in a BNF format as follows: Warm. Quality ::= "warm" ;

      -The GF system can convert BNF grammars into GF. BNF files are recognized -by the file name suffix .cf: +The GF system v 2.9 can be used for converting BNF grammars into GF. +BNF files are recognized by the file name suffix .cf:

           > import food.cf
      @@ -1476,7 +1387,7 @@ It creates separate abstract and concrete modules.
       

      - +

      Restrictions of context-free grammars

      Separating concrete and abstract syntax allows @@ -1495,7 +1406,7 @@ copy language {x x | x <- (a|b)*} in GF.

      - +

      Modules and files

      GF uses suffixes to recognize different file formats: @@ -1510,22 +1421,19 @@ Importing generates target from source:

           > i FoodEng.gf
      -    - compiling Food.gf...   wrote file Food.gfc 16 msec
      -    - compiling FoodEng.gf...   wrote file FoodEng.gfc 20 msec
      +    - compiling Food.gf...   wrote file Food.gfo 16 msec
      +    - compiling FoodEng.gf...   wrote file FoodEng.gfo 20 msec
       

      -The GFC format (="GF Canonical") is the "machine code" of GF. +The .gfo format (="GF Object") is precompiled GF, which is +faster to load than source GF (.gf).

      When reading a module, GF decides whether -to use an existing .gfc file or to generate +to use an existing .gfo file or to generate a new one, by looking at modification times.

      -In GF version 3, the gfc format is replaced by the format suffixed -gfo, "GF object". -

      -
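The decision by modification times resembles the check that make performs. The sketch below illustrates that idea in plain shell; it is not GF's actual implementation, and the helper name needs_recompile and the temporary files are hypothetical.

```shell
# Hypothetical helper, not part of GF: succeeds when the source must be compiled.
needs_recompile() {
  src=$1
  obj=$2
  # Compile if there is no object file yet, or if the source is newer than it.
  [ ! -e "$obj" ] || [ "$src" -nt "$obj" ]
}

tmp=$(mktemp -d)
src="$tmp/Food.gf"
obj="$tmp/Food.gfo"

# Case 1: source exists, object file does not.
touch -t 200001010000 "$src"
needs_recompile "$src" "$obj" && first=compile || first=reuse

# Case 2: object file is newer than the source.
touch -t 200101010000 "$obj"
needs_recompile "$src" "$obj" && second=compile || second=reuse

# Case 3: the source is edited after the object file was written.
touch "$src"
needs_recompile "$src" "$obj" && third=compile || third=reuse

echo "$first $second $third"
rm -rf "$tmp"
```

The -nt comparison is widely supported by shell test implementations (bash, dash, ksh), though it is not required by POSIX.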

      @@ -1544,9 +1452,9 @@ a second time? Try this in different situations:

      - +

      Using operations and resource modules

      - +

      Operation definitions

      The golden rule of functional programming: @@ -1608,7 +1516,7 @@ sugar for abstraction:

      - +

      The ``resource`` module type

      The resource module type is used to package @@ -1627,7 +1535,7 @@ The resource module type is used to package

      - +

      Opening a resource

      Any number of resource modules can be @@ -1660,7 +1568,7 @@ Any number of resource modules can be

      - +

      Partial application

      @@ -1698,7 +1606,7 @@ such that it allows you to write

      - +

      Testing resource modules

      Import with the flag -retain, @@ -1711,20 +1619,18 @@ Compute the value with compute_concrete = cc,

           > compute_concrete prefix "in" (ss "addition")
      -    {
      -      s : Str = "in" ++ "addition"
      -    }
      +    {s : Str = "in" ++ "addition"}
       

      - +

      Grammar architecture

      - +

      Extending a grammar

      A new module can extend an old one: @@ -1781,7 +1687,7 @@ possible to build resource hierarchies.

      - +

      Multiple inheritance

      Extend several grammars at the same time: @@ -1815,43 +1721,7 @@ where

      - -

      Visualizing module structure

      -

      -visualize_graph = vg, -

      -
      -    > visualize_graph
      -
      -

      -and the graph will pop up in a separate window: -

      -

      - -

      -

      -The graph uses -

      -
        -
      • oval boxes for abstract modules -
      • square boxes for concrete modules -
      • black-headed arrows for inheritance -
      • white-headed arrows for the concrete-of-abstract relation -
      - -

      -You can also print -the graph into a .dot file by using the command print_multi = pm: -

      -
      -    > print_multi -printer=graph | write_file Foodmarket.dot
      -    > ! dot -Tpng Foodmarket.dot > Foodmarket.png
      -
      -

      -

      - -

      - +

      Lesson 3: Grammars with parameters

      @@ -1880,7 +1750,7 @@ could be left to library implementors.

      - +

      The problem: words have to be inflected

      Plural forms are needed in things like @@ -1913,7 +1783,7 @@ adjectives, and verbs can have in some languages that you know.

      - +

      Parameters and tables

      We define the parameter type of number in English by @@ -2021,7 +1891,7 @@ module, which you can test by using the command compute_concrete.

      - +

      Inflection tables and paradigms

      A morphological paradigm is a formula telling how a class of @@ -2073,7 +1943,7 @@ uses a wild card pattern _.

      - +

      Exercises on morphology

      1. Identify cases in which the regNoun paradigm does not @@ -2086,7 +1956,7 @@ considered in earlier exercises.

        - +

        Using parameters in concrete syntax

        Purpose: a more radical @@ -2111,7 +1981,7 @@ This will force us to deal with gender-

        - +

        Agreement

        In English, the phrase-forming rule @@ -2153,7 +2023,7 @@ Now we can write

        - +

        Determiners

        How does an Item subject receive its number? The rules @@ -2223,7 +2093,7 @@ In a more lexicalized grammar, determiners would be a category:

        - +

        Parametric vs. inherent features

        Kinds have number as a parametric feature: both singular and plural @@ -2291,7 +2161,7 @@ Notice

        - +

        An English concrete syntax for Foods with parameters

        We use some string operations from the library Prelude. @@ -2356,7 +2226,7 @@ We use some string operations from the library Prelude are used.

        - +

        More on inflection paradigms

        @@ -2370,7 +2240,7 @@ add words to a lexicon.

        - +

        Worst-case functions

        We perform data abstraction from the type @@ -2460,8 +2330,8 @@ parameters.

        - -

        Intelligent paradigms

        + +

        Smart paradigms

        The regular dog-dogs paradigm has predictable variations: @@ -2527,7 +2397,7 @@ the suffix "oo" prevents bamboo from matching the suffix

        - +

        Exercises on regular patterns

        1. The same rules that form plural nouns in English also @@ -2552,7 +2422,7 @@ operation to see whether it correctly changes Arzt to

          - +

          Function types with variables

          In Lesson 5, dependent function types need a notation @@ -2608,7 +2478,7 @@ looking like the expected forms:

          - +

          Separating operation types and definitions

          In libraries, it is useful to group type signatures separately from @@ -2628,7 +2498,7 @@ With the interface and instance module types

          - +

          Overloading of operations

          Overloading: different functions can be given the same name, as e.g. in C++. @@ -2650,7 +2520,7 @@ Example: different ways to define nouns in English: }

      -Cf. dictionaries: ff the +Cf. dictionaries: if the word is regular, just one form is needed. If it is irregular, more forms are given.

      @@ -2670,7 +2540,7 @@ an overload group.

      - +

      Morphological analysis and morphology quiz

      The command morpho_analyse = ma @@ -2707,7 +2577,7 @@ To create a list for later use, use the command morpho_list = ml

      - +

      The Italian Foods grammar

      @@ -2821,9 +2691,9 @@ The complete set of linearization rules: Is item quality = ss (item.s ++ copula item.n ++ quality.s ! item.g ! item.n) ; This = det Sg "questo" "questa" ; - That = det Sg "quello" "quella" ; + That = det Sg "quel" "quella" ; These = det Pl "questi" "queste" ; - Those = det Pl "quelli" "quelle" ; + Those = det Pl "quei" "quelle" ; QKind quality kind = { s = \\n => kind.s ! n ++ quality.s ! kind.g ! n ; g = kind.g @@ -2845,7 +2715,7 @@ The complete set of linearization rules:

      - +

      Exercises on using parameters

      1. Experiment with multilingual generation and translation in the @@ -2859,13 +2729,13 @@ now aiming for complete grammatical correctness by the use of parameters.

      2. Measure the size of the context-free grammar corresponding to FoodsIta. You can do this by printing the grammar in the context-free format -(print_grammar -printer=cfg) and counting the lines. +(print_grammar -printer=bnf) and counting the lines.

      - +

      Discontinuous constituents

      A linearization record may contain more strings than one, and those @@ -2903,7 +2773,7 @@ but can be defined in GF by using discontinuous constituents.

      - +

      Strings at compile time vs. run time

      Tokens are created in the following ways: @@ -2956,19 +2826,13 @@ after linearization.

      Correspondingly, a lexer that e.g. analyses "warm?" into -to tokens is needed before parsing. Both can be given in a grammar -by using flags: -

      -
      -    flags lexer=text ; unlexer=text ;
      -
      -

      -More on lexers and unlexers will be told here. +two tokens is needed before parsing. +This topic will be covered here.

      - +

      Supplementary constructs for concrete syntax

      Record extension and subtyping

      @@ -3033,7 +2897,7 @@ Thus

      - +

      Lesson 4: Using the resource grammar library

      @@ -3050,14 +2914,14 @@ Goals:

      - +

      The coverage of the library

      The current 12 resource languages are

        -
      • Arabic (incomplete) -
      • Catalan (incomplete) +
      • Bulgarian +
      • Catalan
      • Danish
      • English
      • Finnish @@ -3077,7 +2941,7 @@ The first three letters (Eng etc) are used in grammar module names

        - +

        The structure of the library

        @@ -3099,7 +2963,7 @@ wider coverage than with semantic grammars.

        - +

        Lexical vs. phrasal rules

        A resource grammar has two kinds of categories and two kinds of rules: @@ -3127,7 +2991,7 @@ But it is a good discipline to follow.

        - +

        Lexical categories

        Two kinds of lexical categories: @@ -3160,7 +3024,7 @@ Two kinds of lexical categories:

        - +

        Lexical rules

        Closed classes: module Syntax. In the Foods grammar, we need @@ -3193,7 +3057,7 @@ where we use mkN from ParadigmsEng:

        - +

        Resource lexicon

        Alternative concrete syntax for @@ -3224,7 +3088,7 @@ Advantages:

        - +

        Phrasal categories

        In Foods, we need just four phrasal categories: @@ -3245,7 +3109,7 @@ Common nouns are made into noun phrases by adding determiners.

        - +

        Syntactic combinations

        We need the following combinations: @@ -3274,7 +3138,7 @@ Heavy overloading: the current library

        - +

        Example syntactic combination

        The sentence @@ -3300,7 +3164,7 @@ this syntactic tree gives the value of linearizing the semantic tree

        - +

        The resource API

        Language-specific and language-independent parts - roughly, @@ -3322,7 +3186,7 @@ Full API documentation on-line: the resource synopsis,

        - +

        A miniature resource API: categories

        @@ -3380,7 +3244,7 @@ Full API documentation on-line: the resource synopsis,

        - +

        A miniature resource API: rules

        @@ -3428,7 +3292,7 @@ Full API documentation on-line: the resource synopsis,

        - +

        A miniature resource API: structural words

        @@ -3466,7 +3330,7 @@ Full API documentation on-line: the resource synopsis,

        - +

        A miniature resource API: paradigms

        From ParadigmsEng: @@ -3511,7 +3375,7 @@ From ParadigmsIta:

        - +

        A miniature resource API: more paradigms

        From ParadigmsGer: @@ -3576,22 +3440,22 @@ From ParadigmsFin:

        - +

        Exercises

        1. Try out the morphological paradigms in different languages. Do as follows:

        -    > i -path=alltenses:prelude -retain alltenses/ParadigmsGer.gfr
        -    > cc mkN "Farbe"
        -    > cc mkA "gut" "besser" "beste"
        +    > i -path=alltenses -retain alltenses/ParadigmsGer.gfo
        +    > cc -table mkN "Farbe"
        +    > cc -table mkA "gut" "besser" "beste"
         

        - +

        Example: English

        @@ -3617,7 +3481,7 @@ We need a path with Thus the beginning of the module is

        -    --# -path=.:../foods:present:prelude
        +    --# -path=.:../foods:present
           
             concrete FoodsEng of Foods = open SyntaxEng,ParadigmsEng in {
         
        @@ -3625,7 +3489,7 @@ Thus the beginning of the module is

        - +

        English example: linearization types and combination rules

        As linearization types, we use clauses for Phrase, noun phrases @@ -3655,7 +3519,7 @@ Now the combination rules we need almost write themselves automatically:

        - +

        English example: lexical rules

        We use resource paradigms and lexical insertion rules. @@ -3681,7 +3545,7 @@ The two-place noun paradigm is needed only once, for

        - +

        English example: exercises

        1. Compile the grammar FoodsEng and generate @@ -3696,12 +3560,12 @@ grammars presented earlier in this tutorial.

        - +

        Functor implementation of multilingual grammars

        - +

        New language by copy and paste

        If you write a concrete syntax of Foods for some other @@ -3732,7 +3596,7 @@ Can we avoid this programming by copy-and-paste?

        - +

        Functors: functions on the module level

        Functors familiar from the functional programming languages ML and OCaml, @@ -3777,10 +3641,10 @@ we can write a functor instantiation,

        - +

        Code for the Foods functor

        -    --# -path=.:../foods:present
        +    --# -path=.:../foods
           
             incomplete concrete FoodsI of Foods = open Syntax, LexFoods in {
             lincat
        @@ -3813,7 +3677,7 @@ we can write a functor instantiation,
         

        - +

        Code for the LexFoods interface

        @@ -3837,7 +3701,7 @@ we can write a functor instantiation,

        - +

        Code for a German instance of the lexicon

             instance LexFoodsGer of LexFoods = open SyntaxGer, ParadigmsGer in {
        @@ -3858,10 +3722,10 @@ we can write a functor instantiation,
         

        - +

        Code for a German functor instantiation

        -    --# -path=.:../foods:present:prelude
        +    --# -path=.:../foods:present
           
             concrete FoodsGer of Foods = FoodsI with 
               (Syntax = SyntaxGer),
        @@ -3871,7 +3735,7 @@ we can write a functor instantiation,
         

        - +

        Adding languages to a functor implementation

        Just two modules are needed: @@ -3897,7 +3761,7 @@ language:

        - +

        Example: adding Finnish

        Lexicon instance @@ -3921,7 +3785,7 @@ Lexicon instance Functor instantiation

        -    --# -path=.:../foods:present:prelude
        +    --# -path=.:../foods:present
           
             concrete FoodsFin of Foods = FoodsI with 
               (Syntax = SyntaxFin),
        @@ -3931,7 +3795,7 @@ Functor instantiation
         

        - +

        A design pattern

        This can be seen as a design pattern for multilingual grammars: @@ -3954,7 +3818,7 @@ Of the hand-written modules, only LexDomainL is language-dependent.

        - +

        Functors: exercises

        1. Compile and test FoodsGer. @@ -3995,9 +3859,9 @@ The implementation goes in the following phases:

        - +

        Restricted inheritance

        - +

        A problem with functors

        Problem: a functor only works when all languages use the resource Syntax @@ -4027,7 +3891,7 @@ Problem with this solution:

        - +

        Restricted inheritance: include or exclude

        A module may inherit just a selection of names. @@ -4048,15 +3912,15 @@ A concrete syntax of Foodmarket must make the analogous restriction

        - -

        The functor proble solved

        + +

        The functor problem solved

        The English instantiation inherits the functor implementation except for the constant Pizza. This constant is defined in the body instead:

        -    --# -path=.:../foods:present:prelude
        +    --# -path=.:../foods:present
           
             concrete FoodsEng of Foods = FoodsI - [Pizza] with 
               (Syntax = SyntaxEng),
        @@ -4070,7 +3934,7 @@ is defined in the body instead:
         

        - +

        Grammar reuse

        Abstract syntax modules can be used as interfaces, @@ -4092,58 +3956,10 @@ The following correspondencies are then applied:

        - -

        Browsing the resource with GF commands

        - -

        Find a term by parsing

        + +

        Library exercises

        - -

        -

        -To look for a syntax tree in the overload API by parsing: -

        -
        -    % gf $GF_LIB_PATH/alltenses/OverLangEng.gfc
        -  
        -    > p -cat=S -overload "this grammar is too big"
        -    mkS (mkCl (mkNP this_QuantSg grammar_N) (mkAP too_AdA big_A))
        -
        -

        -The -overload option finds the -shallowest overloaded term that matches the parse tree. -

        -

        - -

        - -

        Browsing the resource with GF commands

        - -

        Find a term using syntax editor

        -

        -Open the editor with a precompiled resource package: -

        -
        -    % gfeditor $GF_LIB_PATH/alltenses/langs.gfcm
        -
        -

        -Constructed a tree resulting in the following screen: -

        -

        -

        -

        -

        - -

        -

        -

        -

        -

        - -

        - -

        Browsing exercises

        -

        -1. Find the resource grammar terms for the following +1. Find resource grammar terms for the following English phrases (in the category Phr). You can first try to build the terms manually.

        @@ -4160,9 +3976,12 @@ build the terms manually. which languages did you want to speak

        +Then translate the phrases to other languages. +

        +

        - +

        Tenses

        @@ -4171,7 +3990,7 @@ build the terms manually. In Foods grammars, we have used the path

        -    --# -path=.:../foods:present
        +    --# -path=.:../foods
         

        The library subdirectory present is a restricted version @@ -4254,9 +4073,13 @@ tenses and moods, e.g. the Romance languages.

        - +

        Lesson 5: Refining semantics in abstract syntax

        +NOTICE: The methods described in this lesson are not yet fully supported +in GF 3.0 beta. Use GF 2.9 to get all functionalities. +

        +

        @@ -4282,7 +4105,7 @@ GF = logical framework + concrete syntax.

        - +

        Dependent types

        @@ -4310,7 +4133,7 @@ defines voice commands for household appliances.

        - +

        A dependent type system

        Ontology: @@ -4339,7 +4162,7 @@ Abstract syntax formalizing this:

        - +

        Examples of devices and actions

        Assume the kinds light and fan, @@ -4372,7 +4195,7 @@ but we cannot form the trees

        - +

        Linearization and parsing with dependent types

        Concrete syntax does not know if a category is a dependent type. @@ -4415,7 +4238,7 @@ to mark incomplete parts of trees in the syntax editor.

        - +

        Solving metavariables

        Use the command put_tree = pt with the flag -transform=solve: @@ -4435,7 +4258,7 @@ The solve process may fail, in which case no tree is returned:

        - +

        Polymorphism

        @@ -4468,7 +4291,7 @@ to express Haskell-type library functions:

        - +

        Dependent types: exercises

        1. Write an abstract syntax module with above contents @@ -4485,7 +4308,7 @@ and an appropriate English concrete syntax. Try to parse the commands

        - +

        Proof objects

        Curry-Howard isomorphism = propositions as types principle: @@ -4530,7 +4353,7 @@ Example: the fact that 2 is less that 4 has the proof object

        - +

        Proof-carrying documents

        Idea: to be semantically well-formed, the abstract syntax of a document @@ -4574,7 +4397,7 @@ A legal connection is formed by the function

        - +

        Restricted polymorphism

        Above, all Actions were either of @@ -4599,7 +4422,7 @@ The notion of class uses the Curry-Howard isomorphism as follows:

        - +

        Example: classes for switching and dimming

        We modify the smart house grammar: @@ -4622,7 +4445,7 @@ Classes for new actions can be added incrementally.

        - +

        Variable bindings

        @@ -4656,7 +4479,7 @@ Examples from informal mathematical language:

        - +

        Higher-order abstract syntax

        Abstract syntax can use functions as arguments: @@ -4694,7 +4517,7 @@ expressed using higher-order syntactic constructors.

        - +

        Higher-order abstract syntax: linearization

        HOAS has proved to be useful in the semantics and computer implementation of @@ -4728,7 +4551,7 @@ If there are more bindings, we add $1, $2, etc.

        - +

        Eta expansion

        To make sense of linearization, syntax trees must be @@ -4777,7 +4600,7 @@ The linearization of the variable x is,

        - +

        Parsing variable bindings

        GF needs to know what strings are parsed as variable symbols. @@ -4795,7 +4618,7 @@ More details on lexers here.

        - +

        Exercises on variable bindings

        1. Write an abstract syntax of the whole @@ -4814,7 +4637,7 @@ guarantee non-ambiguity.

        - +

        Semantic definitions

        @@ -4853,7 +4676,7 @@ The key word is def:

        - +

        Computing a tree

        Computation: follow a chain of definition until no definition @@ -4879,7 +4702,7 @@ Computation in GF is performed with the put_term command and the

        - +

        Definitional equality

        Two trees are definitionally equal if they compute into the same tree. @@ -4907,7 +4730,7 @@ so that an object of one also is an object of the other.

        - +

        Judgement forms for constructors

        The judgement form data tells that a category has @@ -4937,7 +4760,7 @@ marked as data will be treated as variables.

        - +

        Exercises on semantic definitions

        1. Implement an interpreter of a small functional programming @@ -4953,9 +4776,13 @@ Type checking can be invoked with put_term -transform=solve.

        - +

        Lesson 6: Grammars of formal languages

        +NOTICE: The methods described in this lesson are not yet fully supported +in GF 3.0 beta. Use GF 2.9 to get all functionalities. +

        +

        @@ -4970,7 +4797,7 @@ Goals:

        - +

        Arithmetic expressions

        We construct a calculator with addition, subtraction, multiplication, and @@ -5001,7 +4828,7 @@ grammars are not allowed to declare functions with Int as value typ

        - +

        Concrete syntax: a simple approach

        We begin with a @@ -5043,7 +4870,7 @@ First problems:

        - +

        Lexing and unlexing

        @@ -5092,7 +4919,7 @@ In linearization, we use a corresponding unlexer:

        - +

        Most common lexers and unlexers

        @@ -5163,7 +4990,7 @@ In linearization, we use a corresponding unlexer:

        - +

        Precedence and fixity

        Arithmetic expressions should be unambiguous. If we write @@ -5202,7 +5029,7 @@ The usual precedence rules:

        - +

        Precedence as a parameter

        Precedence can be made into an inherent feature of expressions: @@ -5247,7 +5074,7 @@ This idea is encoded in the operation

        - +

        Fixities

        We can define left-associative infix expressions: @@ -5288,7 +5115,7 @@ Now we can write the whole concrete syntax of Calculator compactly:
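
        As a side illustration (this is not the GF concrete syntax itself), the precedence and fixity ideas can be sketched in Haskell. The names Term, usePrec, infixLeft and constant, and the precedence levels 0, 1 and 2, are assumptions chosen for this sketch. A linearization carries its precedence as an inherent feature, and an operand is parenthesized when its precedence is lower than the one expected.

```haskell
-- Arithmetic expression trees; constructor names are illustrative.
data Exp
  = Num Int
  | Add Exp Exp
  | Sub Exp Exp
  | Mul Exp Exp
  | Div Exp Exp

-- A linearization: a string together with its inherent precedence.
data Term = Term { str :: String, prec :: Int }

-- Parenthesize the string if its precedence is lower than the expected one.
usePrec :: Int -> Term -> String
usePrec p t
  | prec t < p = "(" ++ str t ++ ")"
  | otherwise  = str t

-- Left-associative infix: the left operand may have the same precedence,
-- the right operand must bind strictly tighter.
infixLeft :: Int -> String -> Term -> Term -> Term
infixLeft p op x y =
  Term (usePrec p x ++ " " ++ op ++ " " ++ usePrec (p + 1) y) p

-- Atomic expressions get the highest precedence used in this sketch.
constant :: String -> Term
constant s = Term s 2

linearizeExp :: Exp -> Term
linearizeExp e = case e of
  Num n   -> constant (show n)
  Add x y -> infixLeft 0 "+" (linearizeExp x) (linearizeExp y)
  Sub x y -> infixLeft 0 "-" (linearizeExp x) (linearizeExp y)
  Mul x y -> infixLeft 1 "*" (linearizeExp x) (linearizeExp y)
  Div x y -> infixLeft 1 "/" (linearizeExp x) (linearizeExp y)

render :: Exp -> String
render = str . linearizeExp
```

        With these definitions, render (Mul (Add (Num 1) (Num 2)) (Num 3)) gives "(1 + 2) * 3", and render (Sub (Num 1) (Sub (Num 2) (Num 3))) gives "1 - (2 - 3)", so parentheses appear only where the tree structure requires them.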

        - +

        Exercises on precedence

        1. Define non-associative and right-associative infix operations @@ -5302,7 +5129,7 @@ Test parsing with and without a pipe to pt -transform=compute.

        - +

        Code generation as linearization

        Translate arithmetic (infix) to JVM (postfix): @@ -5332,7 +5159,7 @@ Just give linearization rules for JVM:
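
        The same infix-to-postfix idea can be sketched in Haskell for comparison. This is only an analogue of a linearization, not the tutorial's GF rules. The Exp type and the use of ldc for integer constants are assumptions of this sketch; iadd, isub, imul and idiv are the JVM integer arithmetic instructions.

```haskell
-- Arithmetic expression trees; constructor names are illustrative.
data Exp
  = Num Int
  | Add Exp Exp
  | Sub Exp Exp
  | Mul Exp Exp
  | Div Exp Exp

-- Postfix code generation: code for both operands first, then the operator.
compile :: Exp -> [String]
compile e = case e of
  Num n   -> ["ldc " ++ show n]   -- assumed mnemonic for pushing a constant
  Add x y -> postfix "iadd" x y
  Sub x y -> postfix "isub" x y
  Mul x y -> postfix "imul" x y
  Div x y -> postfix "idiv" x y
  where
    postfix instr x y = compile x ++ compile y ++ [instr]
```

        For example, compile (Add (Num 1) (Mul (Num 2) (Num 3))) yields the instruction list ["ldc 1","ldc 2","ldc 3","imul","iadd"]. No parentheses or precedence levels are needed, because postfix notation is unambiguous.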

        - +

        Programs with variables

        A straight code programming language, with @@ -5381,7 +5208,7 @@ of the extension is Prog.

        - +

        Exercises on code generation

        1. Define a C-like concrete syntax of the straight-code language. @@ -5422,7 +5249,7 @@ point literals as arguments.

        - +

        Lesson 7: Embedded grammars

        @@ -5440,7 +5267,7 @@ Goals:

        - +

        Functionalities of an embedded grammar format

        GF grammars can be used as parts of programs written in other programming @@ -5457,76 +5284,74 @@ This facility is based on several components:

        - +

        The portable grammar format

        -The portable format is called GFCC, "GF Canonical Compiled". +The portable format is called PGF, "Portable Grammar Format".

        -A GFCC file can be produced in GF by the command +A file can be produced in GF by the command

        -    > print_multi -printer=gfcc | write_file FILE.gfcc
        +    > print_grammar | write_file FILE.pgf
        +
        +

        +There is also a batch compiler, executable from the operating system shell: +

        +
        +    % gfc --make SOURCE.gf
         
        -

        This applies to GF version 3 and upwards. Older GF used a format suffixed .gfcm. At the time of writing, the Java interpreter also still uses the GFCM format.

        -GFCC is the recommended format in +PGF is the recommended format in which final grammar products are distributed, because they are stripped of superfluous information and can be started and applied faster than sets of separate modules.

        -Application programmers have never any need to read or modify GFCC files. +Application programmers never have any need to read or modify PGF files.

        -GFCC thus plays the same role as machine code in +PGF thus plays the same role as machine code in general-purpose programming (or bytecode in Java).

        - +

        Haskell: the EmbedAPI module

        The Haskell API contains (among other things) the following types and functions:

        -  module EmbedAPI where
        +    readPGF   :: FilePath -> IO PGF
           
        -  type MultiGrammar 
        -  type Language     
        -  type Category     
        -  type Tree         
        +    linearize :: PGF -> Language -> Tree -> String
        +    parse     :: PGF -> Language -> Category -> String -> [Tree]
           
        -  file2grammar :: FilePath -> IO MultiGrammar
        +    linearizeAll     :: PGF -> Tree -> [String]
        +    linearizeAllLang :: PGF -> Tree -> [(Language,String)]
           
        -  linearize :: MultiGrammar -> Language -> Tree -> String
        -  parse     :: MultiGrammar -> Language -> Category -> String -> [Tree]
        +    parseAll     :: PGF -> Category -> String -> [[Tree]]
        +    parseAllLang :: PGF -> Category -> String -> [(Language,[Tree])]
           
        -  linearizeAll     :: MultiGrammar -> Tree -> [String]
        -  linearizeAllLang :: MultiGrammar -> Tree -> [(Language,String)]
        -  
        -  parseAll     :: MultiGrammar -> Category -> String -> [[Tree]]
        -  parseAllLang :: MultiGrammar -> Category -> String -> [(Language,[Tree])]
        -  
        -  languages  :: MultiGrammar -> [Language]
        -  categories :: MultiGrammar -> [Category]
        -  startCat   :: MultiGrammar -> Category
        +    languages    :: PGF -> [Language]
        +    categories   :: PGF -> [Category]
        +    startCat     :: PGF -> Category
         

        This is the only module that needs to be imported in the Haskell application. It is available as a part of the GF distribution, in the file -src/GF/GFCC/API.hs. +src/PGF.hs.

        - +

        First application: a translator

        Let us first build a stand-alone translator, which can translate @@ -5535,17 +5360,17 @@ in any multilingual grammar between any languages in the grammar.

           module Main where
           
        -  import GF.GFCC.API
        +  import PGF
           import System.Environment (getArgs)
           
           main :: IO () 
           main = do
             file:_ <- getArgs
        -    gr <- file2grammar file
        +    gr     <- readPGF file
             interact (translate gr)
           
        -  translate :: MultiGrammar -> String -> String
        -  translate gr = case parseAllLang gr (startCat gr) s of
        +  translate :: PGF -> String -> String
        +  translate gr s = case parseAllLang gr (startCat gr) s of
             (lg,t:_):_ -> unlines [linearize gr l t | l <- languages gr, l /= lg]
             _ -> "NO PARSE"
         
        @@ -5555,11 +5380,13 @@ To run the translator, first compile it by
             % ghc --make -o trans Translator.hs 
         
        -

        +

        +For this, you need the Haskell compiler GHC. +

        - +

        Producing PGF for the translator

        Then produce a PGF file. For instance, the Food grammar set can be @@ -5569,23 +5396,16 @@ compiled as follows: % gfc --make FoodEng.gf FoodIta.gf

        -This produces the file Food.gfcc (its name comes from the abstract syntax). +This produces the file Food.pgf (its name comes from the abstract syntax).

        -The gfc batch compiler program is available in GF 3 and upwards. -In earlier versions, the appropriate command can be piped to gf: -

        -
        -    % echo "pm -printer=gfcc | wf Food.gfcc" | gf FoodEng.gf FoodIta.gf
        -
        -

        The Haskell library function interact makes the trans program work like a Unix filter, which reads from standard input and writes to standard output. Therefore it can be a part of a pipe and read and write files. The simplest way to translate is to echo input to the program:

        -    % echo "this wine is delicious" | ./trans Food.gfcc
        +    % echo "this wine is delicious" | ./trans Food.pgf
             questo vino è delizioso
         

        @@ -5594,7 +5414,7 @@ The result is given in all languages except the input language.

        - +

        A translator loop

        To avoid starting the translator over and over again: @@ -5616,7 +5436,7 @@ is quit.

        - +

        A question-answer system

        @@ -5643,7 +5463,7 @@ We change the pure translator by giving the translate function the transfer as an extra argument:

        -    translate :: (Tree -> Tree) -> MultiGrammar -> String -> String
        +    translate :: (Tree -> Tree) -> PGF -> String -> String
         

        Ordinary translation as a special case where @@ -5661,36 +5481,29 @@ To reply in the same language as the question:

        - +

        Exporting GF datatypes to Haskell

        To make it easy to define a transfer function, we export the abstract syntax to a system of Haskell datatypes:

        -    % gfc -haskell Food.gfcc
        +    % gfc --output-format=haskell Food.pgf
         

        It is also possible to produce the Haskell file together with GFCC, by

        -    % gfc --make -haskell FoodEng.gf FoodIta.gf
        +    % gfc --make --output-format=haskell FoodEng.gf FoodIta.gf
         

        -The result is a file named GSyntax.hs, containing a -module named GSyntax. +The result is a file named Food.hs, containing a +module named Food.

        -In GF before version 3, the same result is obtained from within GF, by the command -

        -
        -    > print_grammar -printer=gfcc_haskell | write_file GSyntax.hs
        -
        -

        -

        - +

        Example of exporting GF datatypes

        Input: abstract syntax judgements @@ -5729,9 +5542,12 @@ Output: Haskell definitions All type and constructor names are prefixed with a G to prevent clashes.

        +The Haskell module name is the same as the abstract syntax name. +

        +

        - +

        The question-answer function

        Haskell's type checker guarantees that the functions are well-typed also with @@ -5755,10 +5571,10 @@ respect to GF.

        - +

        Converting between Haskell and GF trees

        -The GSyntax module also contains +The generated Haskell module also contains

           class Gf a where 
        @@ -5788,13 +5604,13 @@ For the programmer, it is enough to know:
         

        - +

        Putting it all together: the transfer definition

           module TransferDef where
           
        -  import GF.GFCC.API (Tree)
        -  import GSyntax
        +  import PGF (Tree)
        +  import Math   -- generated from GF
           
           transfer :: Tree -> Tree
           transfer = gf . answer . fg
        @@ -5822,7 +5638,7 @@ For the programmer, it is enough to know:
         

        - +

        Putting it all together: the Main module

        Here is the complete code in the Haskell file TransferLoop.hs. @@ -5830,12 +5646,12 @@ Here is the complete code in the Haskell file TransferLoop.hs.

           module Main where
           
        -  import GF.GFCC.API
        +  import PGF
           import TransferDef (transfer)
           
           main :: IO () 
           main = do
        -    gr <- file2grammar "Math.gfcc"
        +    gr <- readPGF "Math.pgf"
             loop (translate transfer gr)
           
           loop :: (String -> String) -> IO ()
        @@ -5845,7 +5661,7 @@ Here is the complete code in the Haskell file TransferLoop.hs.
               putStrLn $ trans s
               loop trans
           
        -  translate :: (Tree -> Tree) -> MultiGrammar -> String -> String
        +  translate :: (Tree -> Tree) -> PGF -> String -> String
           translate tr gr s = case parseAllLang gr (startCat gr) s of
             (lg,t:_):_ -> linearize gr lg (tr t)
             _ -> "NO PARSE"
        @@ -5854,7 +5670,7 @@ Here is the complete code in the Haskell file TransferLoop.hs.
         

        - +

        Putting it all together: the Makefile

        To automate the production of the system, we write a Makefile as follows: @@ -5892,9 +5708,12 @@ Just to summarize, the source of the application consists of the following files

        - +

        Translets: embedded translators in Java

        +NOTICE. Only for GF 2.9 and older at the moment. +

        +

        A Java system needs many more files than a Haskell system. To get started, fetch the package gfc2java from

        @@ -5937,9 +5756,12 @@ The translet looks like this:

        - +

        Dialogue systems in Java

        +NOTICE. Only for GF 2.9 and older at the moment. +

        +

        A question-answer system is a special case of a dialogue system, where the user and the computer communicate by writing or, even more properly, by speech. @@ -5971,7 +5793,7 @@ again accessible with the Darcs version control system.

        - +

        Language models for speech recognition

        The standard way of using GF in speech recognition is by building @@ -5982,40 +5804,46 @@ GF supports several formats, including GSL, the format used in the Nuance speech recognizer.

        -GSL is produced from GF by printing a grammar with the flag --printer=gsl. +GSL is produced from GF by running gfc with the flag +--output-format=gsl.

        -Example: GSL generated from the smart house grammar here. +Example: GSL generated from FoodsEng.gf.

        -    > import -conversion=finite SmartEng.gf
        -    > print_grammar -printer=gsl
        +    % gfc --make --output-format=gsl FoodsEng.gf
        +    % more FoodsEng.gsl
           
             ;GSL2.0
        -    ; Nuance speech recognition grammar for SmartEng
        +    ; Nuance speech recognition grammar for FoodsEng
             ; Generated by GF
           
        -    .MAIN SmartEng_2
        +    .MAIN Phrase_cat
           
        -    SmartEng_0 [("switch" "off") ("switch" "on")]
        -    SmartEng_1 ["dim" ("switch" "off")
        -                ("switch" "on")]
        -    SmartEng_2 [(SmartEng_0 SmartEng_3)
        -                (SmartEng_1 SmartEng_4)]
        -    SmartEng_3 ("the" SmartEng_5)
        -    SmartEng_4 ("the" SmartEng_6)
        -    SmartEng_5 "fan"
        -    SmartEng_6 "light"
        +    Item_1 [("that" Kind_1) ("this" Kind_1)]
        +    Item_2 [("these" Kind_2) ("those" Kind_2)]
        +    Item_cat [Item_1 Item_2]
        +    Kind_1 ["cheese" "fish" "pizza" (Quality_1 Kind_1)
        +            "wine"]
        +    Kind_2 ["cheeses" "fish" "pizzas"
        +            (Quality_1 Kind_2) "wines"]
        +    Kind_cat [Kind_1 Kind_2]
        +    Phrase_1 [(Item_1 "is" Quality_1)
        +              (Item_2 "are" Quality_1)]
        +    Phrase_cat Phrase_1
        +    
        +    Quality_1 ["boring" "delicious" "expensive"
        +               "fresh" "italian" ("very" Quality_1) "warm"]
        +    Quality_cat Quality_1
         

        - +

        More speech recognition grammar formats

        -Other formats available via the -printer flag include: +Other formats available via the --output-format flag include:

        @@ -6057,9 +5885,9 @@ Other formats available via the -printer flag include:

        -All currently available formats can be seen in gf with help -printer. +All currently available formats can be seen with gfc --help.

        - + diff --git a/examples/tutorial/semantics/SemBase.hs b/examples/tutorial/semantics/SemBase.hs index 24073894b..b682010e1 100644 --- a/examples/tutorial/semantics/SemBase.hs +++ b/examples/tutorial/semantics/SemBase.hs @@ -1,6 +1,6 @@ module SemBase where -import GSyntax +import Base import Logic -- translation of Base syntax to Logic @@ -8,7 +8,6 @@ import Logic iS :: GS -> Prop iS s = case s of GPredAP np ap -> iNP np (iAP ap) - GConjS c s t -> iConj c (iS s) (iS t) iNP :: GNP -> (Exp -> Prop) -> Prop iNP np p = case np of diff --git a/examples/tutorial/semantics/Top.hs b/examples/tutorial/semantics/Top.hs index 6027b238c..51d5fbb99 100644 --- a/examples/tutorial/semantics/Top.hs +++ b/examples/tutorial/semantics/Top.hs @@ -1,16 +1,16 @@ module Main where -import GSyntax +import Base import SemBase import Logic -import GF.GFCC.API +import PGF main :: IO () main = do - gr <- file2grammar "base.gfcc" + gr <- file2grammar "Base.pgf" loop gr -loop :: MultiGrammar -> IO () +loop :: PGF -> IO () loop gr = do s <- getLine let t:_ = parse gr "BaseEng" "S" s