GF Resource Grammar Library v. 1.0
Author: Aarne Ranta
Last update: %%date(%c)

% NOTE: this is a txt2tags file.
% Create an html file from this file using:
% txt2tags --toc -thtml index.txt

%!target:html

The GF Resource Grammar Library defines the basic grammar of
ten languages:
Danish, English, Finnish, French, German,
Italian, Norwegian, Russian, Spanish, Swedish.

**Notice**. This document concerns the API v. 1.0, which has not
yet been "officially" released. The release will be made together
with a new version of GF itself, since the grammars use new
features not available in GF 2.4.
V. 1.0 is not yet available for Russian and Danish:
for them, we refer to [v. 0.9 ../../resource/].


==Authors==

Janna Khegai (Russian modules, forthcoming),
Bjorn Bringert (many Swadesh lexica),
Carlos Gonzalia (Spanish cardinals),
Patrik Jansson (Swedish cardinals),
Aarne Ranta.

We are grateful to several other people who have used this and
the previous versions of the resource library for their
contributions and comments, including
Ana Bove,
David Burke,
Lauri Carlson,
Gloria Casanellas,
Karin Cavallin,
Hans-Joachim Daniels,
Kristofer Johannisson,
Anni Laine,
Wanjiku Ng'ang'a,
Jordi Saludes.


==License==

The GF Resource Grammar Library is open-source software licensed under
the GNU General Public License. See the file [LICENSE ../LICENSE] for more
details.


==Scope==

Coverage, for each language:
- complete morphology
- lexicon of the ca. 100 most important structural words
- test lexicon of ca. 300 content words
- representative fragment of syntax (cf. CLE (Core Language Engine))
- rather flat semantics (cf. Quasi-Logical Form of CLE)


Organization:
- top-level (API) modules
- Ground API + special-purpose APIs
- "school grammar" concepts rather than advanced linguistic theory


Presentation:
- tool ``gfdoc`` for generating HTML from grammars
- example collections


==Quick start==

Go to the main directory, compile the grammars, and run a test.
```
  cd GF/lib/resource-1.0
  make
  make test
```
This will take quite some time. An alternative is to use the
[precompiled grammar package ../../compiled.tgz]. Just do
```
  cd GF/lib/resource-1.0
  make pretest
```
For more examples, see the [Overview slides clt2006.html].


===The language-independent ground API===

This API is accessible by both ``present`` and ``alltenses``.

The API is divided into a number of ``abstract`` modules.
The following figure gives the dependencies of these modules.

[Lang.png]

The documentation of the individual modules:
- [Common gfdoc/Common.html]: abstract notions with language-independent implementations
- [Cat gfdoc/Cat.html]: the category system
- [Noun gfdoc/Noun.html]: construction of nouns and noun phrases
- [Adjective gfdoc/Adjective.html]: construction of adjectival phrases
- [Verb gfdoc/Verb.html]: construction of verb phrases
- [Adverb gfdoc/Adverb.html]: construction of adverbial phrases
- [Numeral gfdoc/Numeral.html]: construction of cardinal and ordinal numerals
- [Sentence gfdoc/Sentence.html]: construction of sentences and imperatives
- [Question gfdoc/Question.html]: construction of questions
- [Relative gfdoc/Relative.html]: construction of relative clauses
- [Conjunction gfdoc/Conjunction.html]: coordination of phrases
- [Phrase gfdoc/Phrase.html]: construction of the major units of text and speech
- [Text gfdoc/Text.html]: construction of texts from phrases, using punctuation
- [Idiom gfdoc/Idiom.html]: idiomatic phrases, such as existentials
- [Structural gfdoc/Structural.html]: a lexicon of structural words
- [Lexicon gfdoc/Lexicon.html]: a lexicon of other common words, for test purposes
- [Lang gfdoc/Lang.html]: the main module comprising all the others


===The language-dependent APIs===

- [ParadigmsEng gfdoc/ParadigmsEng.html]: English lexical paradigms
- [ParadigmsFin gfdoc/ParadigmsFin.html]: Finnish lexical paradigms
- [ParadigmsFre gfdoc/ParadigmsFre.html]: French lexical paradigms
- [ParadigmsIta gfdoc/ParadigmsIta.html]: Italian lexical paradigms
- [ParadigmsGer gfdoc/ParadigmsGer.html]: German lexical paradigms
- [ParadigmsNor gfdoc/ParadigmsNor.html]: Norwegian lexical paradigms
- [ParadigmsSpa gfdoc/ParadigmsSpa.html]: Spanish lexical paradigms
- [ParadigmsSwe gfdoc/ParadigmsSwe.html]: Swedish lexical paradigms
- [IrregEng gfdoc/IrregEng.gf]: English irregular verbs
- [IrregFre gfdoc/IrregFre.gf]: French irregular verbs
% - [IrregGer gfdoc/IrregGer.gf]: German irregular verbs
- [IrregNor gfdoc/IrregNor.gf]: Norwegian irregular verbs
- [IrregSwe gfdoc/IrregSwe.gf]: Swedish irregular verbs


===Special-purpose APIs===

====Present====

The API is the same as for the full ground API, but
the compiler ignores all verb and sentence tenses
except the present. Lines ignored in the source files are marked
by ``--# notpresent``. The result is a smaller and more efficient
grammar, which is still sufficient for many applications.


====Multimodal====

- [Multimodal gfdoc/Multimodal.html]: main module for multimodal dialogue systems
- [Demonstrative gfdoc/Demonstrative.html]: demonstrative noun phrases and adverbs


====Mathematical====

- [Mathematical gfdoc/Mathematical.html]: main module for mathematical language
- [Predication gfdoc/Predication.html]: predication with verbs, adjectives, etc.
- [Symbol gfdoc/Symbol.html]: symbols and numbers in text


==Using the library==

===The compiled version===

The simplest way to get the library is to install the precompiled version
[``lib/compiled.tgz`` ../../compiled.tgz]. Just do
```
  cd GF/lib
  tar xvfz compiled.tgz
```
There is no need to link application grammars to the source directories of the
library.
Use one (or several) of the following packages instead:
- ``lib/alltenses`` the complete ground-API library with all forms
- ``lib/present`` a pruned ground-API library with present tense only
- ``lib/mathematical`` special-purpose API for mathematical applications
- ``lib/multimodal`` special-purpose API for multimodal dialogue applications


===Linking applications to libraries===

Notice, however, that both special-purpose APIs share modules with ``present``.
It is therefore not a good idea to use them in combination with
``alltenses``.

It is advisable to use the bare package names in paths pointing to the
libraries. Here is an example, from ``examples/tram``:
```
  --# -path=.:present:multimodal:mathematical:prelude
```
To reach these directories from anywhere, set the environment variable
``GF_LIB_PATH`` to point to the directory ``GF/lib/``. For instance,
I have the following line in my ``.bashrc`` file:
```
  export GF_LIB_PATH=/home/aarne/GF/lib
```


===Using the libraries as top-level grammars===

If you have done ``make`` in ``lib/resource-1.0``, you will have a file
``langs.gfcm``. This file starts up quickly and can be used for tasks
such as treebank generation:
```
  > i -nocf langs.gfcm
  > gr -cat=S -cf -number=10 | tb
```
The ``-nocf`` flag saves startup time and memory by preventing the
creation of context-free parse grammars.

The resource grammar libraries do //not// support parsing very well.
While it is theoretically possible to parse with any GF grammar,
the resource grammars are so abstract and complex that building the
actual parser in memory may simply need too many resources to succeed.

An exception is ``LangEng``. It is actually feasible to parse with both
``alltenses/LangEng`` and ``present/LangEng``; the latter is much faster
than the former.
The ``-mcfg`` flag (multiple context-free grammar) must be used:
```
  p -lang=LangEng -mcfg "this man is old"
```
Parsing with the ``-mcfg`` flag takes a few extra seconds the first time
in each session, but is faster in later runs.


==Example applications==

These applications are meant to serve as starting points for new applications,
showing how the libraries can be used in typical situations.


===Bronzeage===

The [examples/bronzeage ../../../examples/bronzeage] grammar set implements
a language fragment based on the Swadesh list of 200 words. It is
useful for purposes such as language training.


===Tram===

The [examples/tram ../../../examples/tram] grammar set implements
the user grammar of a multimodal dialogue system concerning public transport.
Its purpose is to serve as a prototype for applications in the TALK project.


===Animals===

The [examples/animal ../../../examples/animal] grammar set implements
some queries about animals. Its purpose is to serve as a prototype for
example-based grammar writing.


==More reading==

[Grammars as Software Libraries gslt-sem-2006.html]. Slides
with background and motivation for the resource grammar library.

[GF Resource Grammar Library Version 1.0 clt2006.html]. Slides
giving an overview of the library and practical hints on its use.

[How to write resource grammars Resource-HOWTO.html]. Helps you
start if you want to add another language to the library.

[Parametrized modules for Romance languages http://www.cs.chalmers.se/~aarne/geocal2006.pdf].
Slides explaining some ideas in the implementation of
French, Italian, and Spanish.

[Grammar writing by examples http://www.cs.chalmers.se/~aarne/slides/webalt-2005.pdf].
Slides showing how the method is used.

[Multimodal Resource Grammars http://www.cs.chalmers.se/~aarne/slides/talk-edin2005.pdf].
Slides showing how to use the multimodal resource library.
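

==Appendix: checking the library path==

The section on linking applications to libraries relies on ``GF_LIB_PATH``
pointing to a directory that contains the bare package names
(``alltenses``, ``present``, ``mathematical``, ``multimodal``).
The following is only a minimal shell sketch of how one might verify that
layout before compiling an application grammar. The function name
``check_gf_lib`` is hypothetical and is not part of the GF distribution;
the only assumption about the library itself is that each package is a
directory directly under the given path, as described above.
```
#!/bin/sh
# Hypothetical helper, not shipped with GF: report which of the library
# packages named in this document exist under a given lib directory.
# Returns 0 if all four are present, 1 otherwise.
check_gf_lib() {
    lib_dir="$1"
    missing=0
    for pkg in alltenses present mathematical multimodal; do
        if [ -d "$lib_dir/$pkg" ]; then
            echo "found:   $pkg"
        else
            echo "missing: $pkg"
            missing=1
        fi
    done
    return "$missing"
}

# Run the check only when GF_LIB_PATH is set; never abort the calling shell.
if [ -n "${GF_LIB_PATH:-}" ]; then
    check_gf_lib "$GF_LIB_PATH" || echo "some packages are missing under $GF_LIB_PATH"
fi
```
With ``GF_LIB_PATH`` exported as in the ``.bashrc`` example, sourcing this
file prints one line per package. A "missing" line for a package you do
not use in your ``-path`` flag is harmless.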