mirror of
https://github.com/GrammaticalFramework/gf-core.git
synced 2026-04-09 21:19:31 -06:00
268 lines
7.1 KiB
Plaintext
268 lines
7.1 KiB
Plaintext
GF Resource Grammar Library v. 1.2
|
|
Author: Aarne Ranta <aarne (at) cs.chalmers.se>
|
|
Last update: %%date(%c)
|
|
|
|
% NOTE: this is a txt2tags file.
|
|
% Create an html file from this file using:
|
|
% txt2tags --toc -thtml index.txt
|
|
|
|
%!target:html
|
|
|
|
%!postproc(html): #BCEN <center>
|
|
%!postproc(html): #ECEN </center>
|
|
|
|
|
|
#BCEN
|
|
|
|
[10lang-large.png]
|
|
|
|
#ECEN
|
|
|
|
|
|
The GF Resource Grammar Library defines the basic grammar of
|
|
ten languages:
|
|
Danish, English, Finnish, French, German,
|
|
Italian, Norwegian, Russian, Spanish, Swedish.
|
|
Still incomplete implementations for Arabic and Catalan are also
|
|
included.
|
|
|
|
**New** in December 2007: Browsing the library by syntax editor
|
|
[directly on the web ../../../demos/resource-api/editor.html].
|
|
|
|
|
|
|
|
|
|
==Authors==
|
|
|
|
Inger Andersson and Therese Soderberg (Spanish morphology),
|
|
Nicolas Barth and Sylvain Pogodalla (French verb list),
|
|
Ali El Dada (Arabic modules),
|
|
Magda Gerritsen and Ulrich Real (Russian paradigms and lexicon),
|
|
Janna Khegai (Russian modules),
|
|
Bjorn Bringert (many Swadesh lexica),
|
|
Carlos Gonzalía (Spanish cardinals),
|
|
Harald Hammarström (German morphology),
|
|
Patrik Jansson (Swedish cardinals),
|
|
Andreas Priesnitz (German lexicon),
|
|
Aarne Ranta,
|
|
Jordi Saludes (Catalan modules),
|
|
Henning Thielemann (German lexicon).
|
|
|
|
|
|
We are grateful for contributions and
|
|
comments to several other people who have used this and
|
|
the previous versions of the resource library, including
|
|
Ludmilla Bogavac,
|
|
Ana Bove,
|
|
David Burke,
|
|
Lauri Carlson,
|
|
Gloria Casanellas,
|
|
Karin Cavallin,
|
|
Robin Cooper,
|
|
Hans-Joachim Daniels,
|
|
Elisabet Engdahl,
|
|
Markus Forsberg,
|
|
Kristofer Johannisson,
|
|
Anni Laine,
|
|
Hans Leiß,
|
|
Peter Ljunglöf,
|
|
Saara Myllyntausta,
|
|
Wanjiku Ng'ang'a,
|
|
Nadine Perera,
|
|
Jordi Saludes.
|
|
|
|
|
|
==License==
|
|
|
|
The GF Resource Grammar Library is open-source software licensed under
|
|
GNU Lesser General Public License (LGPL). See the file [LICENSE ../LICENSE] for more
|
|
details.
|
|
|
|
|
|
==Scope==
|
|
|
|
Coverage, for each language:
|
|
- complete morphology
|
|
- lexicon of the ca. 100 most important structural words
|
|
- test lexicon of ca. 300 content words (rough equivalents in each language)
|
|
- list of irregular verbs (separately for each language)
|
|
- representative fragment of syntax (cf. CLE (Core Language Engine))
|
|
- rather flat semantics (cf. Quasi-Logical Form of CLE)
|
|
|
|
|
|
Organization:
|
|
- top-level (API) modules
|
|
- Ground API + special-purpose APIs
|
|
- "school grammar" concepts rather than advanced linguistic theory
|
|
|
|
|
|
Presentation:
|
|
- tool ``gfdoc`` for generating HTML from grammars
|
|
- example collections
|
|
|
|
|
|
==Location==
|
|
|
|
Assuming you have installed the libraries, you will find the precompiled
|
|
``gfc`` and ``gfr`` files directly under ``$GF_LIB_PATH``, whose default
|
|
value is ``/usr/local/share/GF/``. The precompiled subdirectories are
|
|
```
|
|
alltenses
|
|
mathematical
|
|
multimodal
|
|
present
|
|
```
|
|
Do for instance
|
|
```
|
|
cd $GF_LIB_PATH
|
|
gf alltenses/langs.gfcm
|
|
|
|
> p -cat=S -lang=LangEng "this grammar is too big" | tb
|
|
```
|
|
For more details, see the [Synopsis synopsis.html].
|
|
|
|
|
|
==Compilation==
|
|
|
|
If you want to compile the library from scratch, use ``make`` in the root of
|
|
the source directory:
|
|
```
|
|
cd GF/lib/resource-1.0
|
|
make
|
|
```
|
|
The ``make`` procedure does not by default make Arabic and Catalan, but you
|
|
can uncomment the relevant lines in ``Makefile`` to compile them.
|
|
|
|
|
|
==Encoding==
|
|
|
|
Finnish, German, Romance, and Scandinavian languages are in isolatin-1.
|
|
|
|
Arabic and Russian are in UTF-8.
|
|
|
|
English is in pure ASCII.
|
|
|
|
The different encodings imply, unfortunately, that it is hard to get
|
|
a nice view of all languages simultaneously. The easiest way to achieve this is
|
|
to use ``gfeditor``, which automatically converts grammars to UTF-8.
|
|
|
|
|
|
==Using the resource as library==
|
|
|
|
This API is accessible by both ``present`` and ``alltenses``. The modules you most often need are
|
|
- ``Syntax``, the interface to syntactic structures
|
|
- ``Syntax``//L//, the implementations of ``Syntax`` for each language //L//
|
|
- ``Paradigms``//L//, the morphological paradigms for each language //L//
|
|
|
|
|
|
The [Synopsis synopsis.html] gives examples on the typical usage of these
|
|
modules.
|
|
|
|
|
|
==Using the resource as top level grammar==
|
|
|
|
The following modules can be used for parsing and linearization. They are accessible from both
|
|
``present`` and ``alltenses``.
|
|
- ``Lang``//L// for each language //L//, implementing a common abstract syntax ``Lang``
|
|
- ``Danish``, ``English``, etc, implementing ``Lang`` with language-specific extensions
|
|
|
|
|
|
In addition, there is in both ``present`` and ``alltenses`` the file
|
|
- ``langs.gfcm``, a package with precompiled ``Lang``//L// grammars
|
|
|
|
|
|
A way to test and view the resource grammar is to load ``langs.gfcm`` either into ``gfeditor``
|
|
or into the ``gf`` shell and perform actions such as syntax editing and treebank generation.
|
|
For instance, the command
|
|
```
|
|
> p -lang=LangEng -cat=S "this grammar is too big" | tb
|
|
```
|
|
creates a treebank entry with translations of this sentence.
|
|
|
|
For parsing, currently only English and the Scandinavian languages are within the limits ofr
|
|
reasonable resources. For other languages //L//, parsing with ``Lang``//L// will probably eat
|
|
up the computer resources before finishing the parser generation.
|
|
|
|
|
|
|
|
==Accessing the lower level ground API==
|
|
|
|
The ``Syntax`` API is implemented in terms a bunch of ``abstract`` modules, which
|
|
as of version 1.2 are mainly interesting for implementors of the resource.
|
|
See the [documentation for version 1.1 index-1.1.html] for more details.
|
|
|
|
|
|
==Known bugs and missing components==
|
|
|
|
Danish
|
|
- the lexicon and chosen inflections are only partially verified
|
|
|
|
|
|
English
|
|
|
|
|
|
Finnish
|
|
- wrong cases in some passive constructions
|
|
|
|
|
|
French
|
|
- multiple clitics (with V3) not always right
|
|
- third person pronominal questions with inverted word order
|
|
have wrong forms if "t" is required e.g.
|
|
(e.g. "comment fera-t-il" becomes "comment fera il")
|
|
|
|
|
|
German
|
|
|
|
|
|
Italian
|
|
- multiple clitics (with V3) not always right
|
|
|
|
|
|
Norwegian
|
|
- the lexicon and chosen inflections are only partially verified
|
|
|
|
|
|
Russian
|
|
- some functions missing
|
|
- some regular paradigms are missing
|
|
|
|
|
|
Spanish
|
|
- multiple clitics (with V3) not always right
|
|
- missing contractions with imperatives and clitics
|
|
|
|
|
|
Swedish
|
|
|
|
|
|
|
|
|
|
==More reading==
|
|
|
|
[Synopsis synopsis.html]. The concise guide to API v. 1.2.
|
|
|
|
[Grammars as Software Libraries gslt-sem-2006.html]. Slides
|
|
with background and motivation for the resource grammar library.
|
|
|
|
[GF Resource Grammar Library Version 1.0 clt2006.html]. Slides
|
|
giving an overview of the library and practical hints on its use.
|
|
|
|
[How to write resource grammars Resource-HOWTO.html]. Helps you
|
|
start if you want to add another language to the library.
|
|
|
|
[Parametrized modules for Romance languages http://www.cs.chalmers.se/~aarne/geocal2006.pdf].
|
|
Slides explaining some ideas in the implementation of
|
|
French, Italian, and Spanish.
|
|
|
|
[Grammar writing by examples http://www.cs.chalmers.se/~aarne/slides/webalt-2005.pdf].
|
|
Slides showing how linearization rules are written as strings parsable by the resource grammar.
|
|
|
|
[Multimodal Resource Grammars http://www.cs.chalmers.se/~aarne/slides/talk-edin2005.pdf].
|
|
Slides showing how to use the multimodal resource library. N.B. the library
|
|
examples are from ``multimodal/old``, which is a reduced-size API.
|
|
|
|
[GF Resource Grammar Library ../../../doc/resource.pdf] (pdf).
|
|
Printable user manual with API documentation, for version 1.0.
|
|
|