forked from GitHub/gf-core

updated resource doc

aarne
2006-06-08 21:37:01 +00:00
parent c89a97d4c6
commit 165b7269a9
2 changed files with 145 additions and 34 deletions

@@ -7,9 +7,41 @@
<P ALIGN="center"><CENTER><H1>GF Resource Grammar Library v. 1.0</H1>
<FONT SIZE="4">
<I>Author: Aarne Ranta &lt;aarne (at) cs.chalmers.se&gt;</I><BR>
Last update: Sun Jun 4 00:01:57 2006
Last update: Thu Jun 8 23:35:47 2006
</FONT></CENTER>
<P></P>
<HR NOSHADE SIZE=1>
<P></P>
<UL>
<LI><A HREF="#toc1">Authors</A>
<LI><A HREF="#toc2">License</A>
<LI><A HREF="#toc3">Scope</A>
<LI><A HREF="#toc4">Quick start</A>
<UL>
<LI><A HREF="#toc5">The language independent ground API</A>
<LI><A HREF="#toc6">The language-dependent APIs</A>
<LI><A HREF="#toc7">Special-purpose APIs</A>
</UL>
<LI><A HREF="#toc8">Using the library</A>
<UL>
<LI><A HREF="#toc9">The compiled version</A>
<LI><A HREF="#toc10">Linking applications to libraries</A>
<LI><A HREF="#toc11">Using the libraries as top-level grammars</A>
</UL>
<LI><A HREF="#toc12">Example applications</A>
<UL>
<LI><A HREF="#toc13">Bronzeage</A>
<LI><A HREF="#toc14">Dialogue</A>
<LI><A HREF="#toc15">Animals</A>
</UL>
<LI><A HREF="#toc16">Known bugs and missing components</A>
<LI><A HREF="#toc17">More reading</A>
</UL>
<P></P>
<HR NOSHADE SIZE=1>
<P></P>
<P>
The GF Resource Grammar Library defines the basic grammar of
ten languages:
@@ -21,6 +53,7 @@ Italian, Norwegian, Russian, Spanish, Swedish.
yet been "officially" released. The release is planned for the end
of June 2006.
</P>
<A NAME="toc1"></A>
<H2>Authors</H2>
<P>
Inger Andersson and Therese Soderberg (Spanish morphology),
@@ -55,12 +88,14 @@ Saara Myllyntausta,
Wanjiku Ng'ang'a,
Jordi Saludes.
</P>
<A NAME="toc2"></A>
<H2>License</H2>
<P>
The GF Resource Grammar Library is open-source software licensed under
the GNU General Public License. See the file <A HREF="../LICENSE">LICENSE</A> for more
details.
</P>
<A NAME="toc3"></A>
<H2>Scope</H2>
<P>
Coverage, for each language:
@@ -68,7 +103,8 @@ Coverage, for each language:
<UL>
<LI>complete morphology
<LI>lexicon of the ca. 100 most important structural words
<LI>test lexicon of ca. 300 content words
<LI>test lexicon of ca. 300 content words (rough equivalents in each language)
<LI>list of irregular verbs (language-dependent)
<LI>representative fragment of syntax (cf. CLE (Core Language Engine))
<LI>rather flat semantics (cf. Quasi-Logical Form of CLE)
</UL>
@@ -90,6 +126,7 @@ Presentation:
<LI>example collections
</UL>
<A NAME="toc4"></A>
<H2>Quick start</H2>
<P>
Go to the main directory, compile the grammars, and run a test.
@@ -122,6 +159,7 @@ Do for instance
<P>
For more examples, see the <A HREF="clt2006.html">Overview slides</A>.
</P>
<A NAME="toc5"></A>
<H3>The language independent ground API</H3>
<P>
This API is accessible by both <CODE>present</CODE> and <CODE>alltenses</CODE>.
@@ -129,7 +167,7 @@ The API is divided into a bunch of <CODE>abstract</CODE> modules.
The following figure gives the dependencies of these modules.
</P>
<P>
<IMG ALIGN="left" SRC="Lang.png" BORDER="0" ALT="">
<IMG ALIGN="left" SRC="Grammar.png" BORDER="0" ALT="">
</P>
<P>
The documentation of the individual modules:
@@ -151,9 +189,11 @@ The documentation of the individual modules:
<LI><A HREF="gfdoc/Idiom.html">Idiom</A>: idiomatic phrases, such as existentials
<LI><A HREF="gfdoc/Structural.html">Structural</A>: a lexicon of structural words
<LI><A HREF="gfdoc/Lexicon.html">Lexicon</A>: a lexicon of other common words, for test purposes
<LI><A HREF="gfdoc/Lang.html">Lang</A>: the main module comprising all the others
<LI><A HREF="gfdoc/Grammar.html">Grammar</A>: the main module comprising all but <CODE>Lexicon</CODE>
<LI><A HREF="gfdoc/Lang.html">Lang</A>: the main module comprising both <CODE>Grammar</CODE> and <CODE>Lexicon</CODE>
</UL>
<A NAME="toc6"></A>
<H3>The language-dependent APIs</H3>
<UL>
<LI><A HREF="gfdoc/ParadigmsDan.html">ParadigmsDan</A>: Danish lexical paradigms
@@ -172,24 +212,36 @@ The documentation of the individual modules:
<LI><A HREF="gfdoc/IrregDan.gf">IrregDan</A>: Danish irregular verbs (very incomplete)
<LI><A HREF="gfdoc/IrregEng.gf">IrregEng</A>: English irregular verbs
<LI><A HREF="gfdoc/IrregFre.gf">IrregFre</A>: French irregular verbs
<LI><A HREF="gfdoc/IrregGer.gf">IrregGer</A>: German irregular verbs
<LI><A HREF="gfdoc/IrregNor.gf">IrregNor</A>: Norwegian irregular verbs (very incomplete)
<LI><A HREF="gfdoc/IrregSwe.gf">IrregSwe</A>: Swedish irregular verbs
</UL>
<P>
This is the structure of each language-dependent top module.
</P>
<P>
<IMG ALIGN="middle" SRC="English.png" BORDER="0" ALT="">
</P>
<UL>
<LI><A HREF="../abstract/Extra.gf">Extra</A>: extra constructs implemented in some languages
<LI><A HREF="../scandinavian/ExtraScandAbs.gf">ExtraScand</A>: extra constructs in Scandinavian only
<LI><A HREF="../norwegian/ExtraNorAbs.gf">ExtraNor</A>: extra constructs in Norwegian only
<LI><A HREF="../finnish/ExtraFinAbs.gf">ExtraFin</A>: extra constructs in Finnish only
<LI><A HREF="../french/ExtraFreAbs.gf">ExtraFre</A>: extra constructs in French only
<LI><A HREF="../english/ExtraEngAbs.gf">ExtraEng</A>: extra constructs in English only
</UL>
<UL>
<LI><A HREF="../english/EnglishAbs.gf">English</A>: English with all extras
<LI><A HREF="../finnish/FinnishAbs.gf">Finnish</A>: Finnish with all extras
<LI><A HREF="../french/FrenchAbs.gf">French</A>: French with all extras
<LI><A HREF="../german/GermanAbs.gf">German</A>: German with all extras
<LI><A HREF="../norwegian/NorwegianAbs.gf">Norwegian</A>: Norwegian with all extras
<LI><A HREF="../swedish/SwedishAbs.gf">Swedish</A>: Swedish with all extras
</UL>
<A NAME="toc7"></A>
<H3>Special-purpose APIs</H3>
<H4>Present</H4>
<P>
@@ -200,6 +252,12 @@ The result is a smaller and more efficient grammar, which is still
sufficient for many applications.
</P>
<H4>Multimodal</H4>
<P>
The API is the same as for the full ground API, but with modified
linearization types of <CODE>NP</CODE> and <CODE>Adv</CODE>, and all other categories
depending on them: an extra field is added to a demonstrative pointing
gesture. Some functions for constructing demonstratives are provided.
</P>
<UL>
<LI><A HREF="gfdoc/Multi.html">Multi</A>: main module for multimodal dialogue systems
</UL>
@@ -211,7 +269,9 @@ sufficient for many applications.
<LI><A HREF="gfdoc/Symbol.html">Symbol</A>: symbols and numbers in text
</UL>
<A NAME="toc8"></A>
<H2>Using the library</H2>
<A NAME="toc9"></A>
<H3>The compiled version</H3>
<P>
The simplest way to get the library is to install the precompiled version
@@ -233,12 +293,24 @@ library. Use one (or several) of the following packages instead:
multimodal dialogue applications
</UL>
<A NAME="toc10"></A>
<H3>Linking applications to libraries</H3>
<P>
Notice, however, that both special-purpose APIs share modules with
<CODE>present</CODE>. It is therefore not a good idea to use them in combination with
<CODE>alltenses</CODE>.
Typically, open one of
</P>
<UL>
<LI><CODE>GrammarX</CODE> for just syntax
<LI><CODE>LangX</CODE> for both syntax and a small lexicon
<LI><CODE>X</CODE> (e.g. <CODE>English</CODE>) for syntax, lexicon, and language-dependent extensions
</UL>
<P>
Usually you also need your own lexicon, and hence have to open
</P>
<UL>
<LI><CODE>ParadigmsX</CODE> for lexicon-building functions
</UL>
<P>
It is advisable to use the bare package names in paths pointing to the
libraries. Here is an example, from <CODE>examples/dialogue/LightsEng.gf</CODE>:
@@ -255,6 +327,12 @@ I have the following line in my <CODE>.bashrc</CODE> file:
export GF_LIB_PATH=/home/aarne/GF/lib
</PRE>
<P></P>
<P>
The <CODE>mathematical</CODE> API shares modules with
<CODE>present</CODE>. It is therefore not a good idea to use it in combination with
<CODE>alltenses</CODE>.
</P>
<A NAME="toc11"></A>
<H3>Using the libraries as top-level grammars</H3>
<P>
If you have done <CODE>make</CODE> in <CODE>lib/resource-1.0</CODE>, you will have
@@ -277,22 +355,27 @@ to succeed.
<P>
An exception is <CODE>LangEng</CODE>. It is actually feasible to parse with
both <CODE>alltenses/LangEng</CODE> and <CODE>present/LangEng</CODE> - the latter being
much faster than the former. The <CODE>-mcfg</CODE> flag (multiple context-free grammar)
much faster than the former. The <CODE>-fcfg</CODE> flag (fast multiple context-free grammar)
must be used:
</P>
<PRE>
p -lang=LangEng -mcfg -parser=topdown "this man is old"
p -lang=LangEng -fcfg "this man is old"
</PRE>
<P>
Parsing with the <CODE>-mcfg</CODE> flag takes a few extra seconds the first time during
Parsing with the <CODE>-fcfg</CODE> flag takes a few extra seconds the first time during
each session, but gets faster at later runs.
</P>
<P>
It is also feasible to parse in Scandinavian languages (Danish, Norwegian, Swedish).
</P>
<A NAME="toc12"></A>
<H2>Example applications</H2>
<P>
These applications are meant to serve as starting points for
new applications, showing how the libraries can be used in
typical situations.
</P>
<A NAME="toc13"></A>
<H3>Bronzeage</H3>
<P>
The <A HREF="../../../examples/bronzeage">examples/bronzeage</A>
@@ -300,6 +383,7 @@ grammar set implements a language fragment
based on the Swadesh list of 200 words. It is useful for
things like language training.
</P>
<A NAME="toc14"></A>
<H3>Dialogue</H3>
<P>
The <A HREF="../../../examples/dialogue">examples/dialogue</A>
@@ -308,6 +392,7 @@ multimodal dialogue system.
Its purpose is to serve as a prototype for applications in the
TALK project.
</P>
<A NAME="toc15"></A>
<H3>Animals</H3>
<P>
The <A HREF="../../../examples/animal">examples/animal</A>
@@ -315,6 +400,7 @@ grammar set implements some queries about animals.
Its purpose is to serve as a prototype for example-based
grammar writing.
</P>
<A NAME="toc16"></A>
<H2>Known bugs and missing components</H2>
<P>
These bugs should be fixed before the final release of v. 1.0.
@@ -344,14 +430,14 @@ Finnish
French
</P>
<UL>
<LI>only direct word order in questions
<LI>no inverted word order in questions
</UL>
<P>
German
</P>
<UL>
<LI>no list of irregular verbs
<LI>-
</UL>
<P>
@@ -381,13 +467,12 @@ Russian
Spanish
</P>
<UL>
<LI>no ordinal numbers
<LI>-
Swedish
<LI>-
</UL>
<P>
Swedish
-
</P>
<A NAME="toc17"></A>
<H2>More reading</H2>
<P>
<A HREF="gslt-sem-2006.html">Grammars as Software Libraries</A>. Slides
@@ -417,5 +502,5 @@ examples are from <CODE>multimodal/old</CODE>, which is a reduced-size API.
</P>
<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
<!-- cmdline: txt2tags index.txt -->
<!-- cmdline: txt2tags -\-toc -thtml index.txt -->
</BODY></HTML>

@@ -65,7 +65,8 @@ details.
Coverage, for each language:
- complete morphology
- lexicon of the ca. 100 most important structural words
- test lexicon of ca. 300 content words
- test lexicon of ca. 300 content words (rough equivalents in each language)
- list of irregular verbs (language-dependent)
- representative fragment of syntax (cf. CLE (Core Language Engine))
- rather flat semantics (cf. Quasi-Logical Form of CLE)
@@ -115,7 +116,7 @@ This API is accessible by both ``present`` and ``alltenses``.
The API is divided into a bunch of ``abstract`` modules.
The following figure gives the dependencies of these modules.
[Lang.png]
[Grammar.png]
The documentation of the individual modules:
@@ -135,7 +136,8 @@ The documentation of the individual modules:
- [Idiom gfdoc/Idiom.html]: idiomatic phrases, such as existentials
- [Structural gfdoc/Structural.html]: a lexicon of structural words
- [Lexicon gfdoc/Lexicon.html]: a lexicon of other common words, for test purposes
- [Lang gfdoc/Lang.html]: the main module comprising all the others
- [Grammar gfdoc/Grammar.html]: the main module comprising all but ``Lexicon``
- [Lang gfdoc/Lang.html]: the main module comprising both ``Grammar`` and ``Lexicon``
===The language-dependent APIs===
@@ -156,19 +158,27 @@ The documentation of the individual modules:
- [IrregDan gfdoc/IrregDan.gf]: Danish irregular verbs (very incomplete)
- [IrregEng gfdoc/IrregEng.gf]: English irregular verbs
- [IrregFre gfdoc/IrregFre.gf]: French irregular verbs
% - [IrregGer gfdoc/IrregGer.gf]: German irregular verbs
- [IrregGer gfdoc/IrregGer.gf]: German irregular verbs
- [IrregNor gfdoc/IrregNor.gf]: Norwegian irregular verbs (very incomplete)
- [IrregSwe gfdoc/IrregSwe.gf]: Swedish irregular verbs
This is the structure of each language-dependent top module.
[English.png]
- [Extra ../abstract/Extra.gf]: extra constructs implemented in some languages
- [ExtraScand ../scandinavian/ExtraScandAbs.gf]: extra constructs in Scandinavian only
- [ExtraNor ../norwegian/ExtraNorAbs.gf]: extra constructs in Norwegian only
- [ExtraFin ../finnish/ExtraFinAbs.gf]: extra constructs in Finnish only
- [ExtraFre ../french/ExtraFreAbs.gf]: extra constructs in French only
- [ExtraEng ../english/ExtraEngAbs.gf]: extra constructs in English only
- [English ../english/EnglishAbs.gf]: English with all extras
- [Finnish ../finnish/FinnishAbs.gf]: Finnish with all extras
- [French ../french/FrenchAbs.gf]: French with all extras
- [German ../german/GermanAbs.gf]: German with all extras
- [Norwegian ../norwegian/NorwegianAbs.gf]: Norwegian with all extras
- [Swedish ../swedish/SwedishAbs.gf]: Swedish with all extras
@@ -187,6 +197,11 @@ sufficient for many applications.
====Multimodal====
The API is the same as for the full ground API, but with modified
linearization types of ``NP`` and ``Adv``, and all other categories
depending on them: an extra field is added to a demonstrative pointing
gesture. Some functions for constructing demonstratives are provided.
- [Multi gfdoc/Multi.html]: main module for multimodal dialogue systems
@@ -220,9 +235,14 @@ library. Use one (or several) of the following packages instead:
===Linking applications to libraries===
Notice, however, that both special-purpose APIs share modules with
``present``. It is therefore not a good idea to use them in combination with
``alltenses``.
Typically, open one of
- ``GrammarX`` for just syntax
- ``LangX`` for both syntax and a small lexicon
- ``X`` (e.g. ``English``) for syntax, lexicon, and language-dependent extensions
Usually you also need your own lexicon, and hence have to open
- ``ParadigmsX`` for lexicon-building functions
It is advisable to use the bare package names in paths pointing to the
@@ -237,6 +257,11 @@ I have the following line in my ``.bashrc`` file:
export GF_LIB_PATH=/home/aarne/GF/lib
```
The ``mathematical`` API shares modules with
``present``. It is therefore not a good idea to use it in combination with
``alltenses``.
===Using the libraries as top-level grammars===
@@ -257,14 +282,17 @@ to succeed.
An exception is ``LangEng``. It is actually feasible to parse with
both ``alltenses/LangEng`` and ``present/LangEng`` - the latter being
much faster than the former. The ``-mcfg`` flag (multiple context-free grammar)
much faster than the former. The ``-fcfg`` flag (fast multiple context-free grammar)
must be used:
```
p -lang=LangEng -mcfg -parser=topdown "this man is old"
p -lang=LangEng -fcfg "this man is old"
```
Parsing with the ``-mcfg`` flag takes a few extra seconds the first time during
Parsing with the ``-fcfg`` flag takes a few extra seconds the first time during
each session, but gets faster at later runs.
It is also feasible to parse in Scandinavian languages (Danish, Norwegian, Swedish).
==Example applications==
@@ -314,11 +342,11 @@ Finnish
French
- only direct word order in questions
- no inverted word order in questions
German
- no list of irregular verbs
- -
Italian
@@ -336,11 +364,9 @@ Russian
Spanish
- no ordinal numbers
- -
Swedish
-
- -