updated resource doc

2006-06-08 21:37:01 +00:00
parent c89a97d4c6
commit 165b7269a9
2 changed files with 145 additions and 34 deletions
@@ -7,9 +7,41 @@
 <P ALIGN="center"><CENTER><H1>GF Resource Grammar Library v. 1.0</H1>
 <FONT SIZE="4">
 <I>Author: Aarne Ranta &lt;aarne (at) cs.chalmers.se&gt;</I><BR>
-Last update: Sun Jun  4 00:01:57 2006
+Last update: Thu Jun  8 23:35:47 2006
 </FONT></CENTER>
 <P></P>
 <HR NOSHADE SIZE=1>
 <P></P>
    <UL>
    <LI><A HREF="#toc1">Authors</A>
    <LI><A HREF="#toc2">License</A>
    <LI><A HREF="#toc3">Scope</A>
    <LI><A HREF="#toc4">Quick start</A>
      <UL>
      <LI><A HREF="#toc5">The language independent ground API</A>
      <LI><A HREF="#toc6">The language-dependent APIs</A>
      <LI><A HREF="#toc7">Special-purpose APIs</A>
      </UL>
    <LI><A HREF="#toc8">Using the library</A>
      <UL>
      <LI><A HREF="#toc9">The compiled version</A>
      <LI><A HREF="#toc10">Linking applications to libraries</A>
      <LI><A HREF="#toc11">Using the libraries as top-level grammars</A>
      </UL>
    <LI><A HREF="#toc12">Example applications</A>
      <UL>
      <LI><A HREF="#toc13">Bronzeage</A>
      <LI><A HREF="#toc14">Dialogue</A>
      <LI><A HREF="#toc15">Animals</A>
      </UL>
    <LI><A HREF="#toc16">Known bugs and missing components</A>
    <LI><A HREF="#toc17">More reading</A>
    </UL>
 <P></P>
 <HR NOSHADE SIZE=1>
 <P></P>
 <P>
 The GF Resource Grammar Library defines the basic grammar of
 ten languages: 
@@ -21,6 +53,7 @@ Italian, Norwegian, Russian, Spanish, Swedish.
 yet been "officially" released. The release is planned for the end
 of June 2006.
 </P>
 <A NAME="toc1"></A>
 <H2>Authors</H2>
 <P>
 Inger Andersson and Therese Soderberg (Spanish morphology),
@@ -55,12 +88,14 @@ Saara Myllyntausta,
 Wanjiku Ng'ang'a,
 Jordi Saludes.
 </P>
 <A NAME="toc2"></A>
 <H2>License</H2>
 <P>
 The GF Resource Grammar Library is open-source software licensed under
 the GNU General Public License. See the file <A HREF="../LICENSE">LICENSE</A> for more
 details.
 </P>
 <A NAME="toc3"></A>
 <H2>Scope</H2>
 <P>
 Coverage, for each language:
@@ -68,7 +103,8 @@ Coverage, for each language:
 <UL>
 <LI>complete morphology
 <LI>lexicon of the ca. 100 most important structural words
-<LI>test lexicon of ca. 300 content words
+<LI>test lexicon of ca. 300 content words (rough equivalents in each language)
 <LI>list of irregular verbs (language-dependent)
 <LI>representative fragment of syntax (cf. CLE (Core Language Engine))
 <LI>rather flat semantics (cf. Quasi-Logical Form of CLE)
 </UL>
@@ -90,6 +126,7 @@ Presentation:
 <LI>example collections
 </UL>
 <A NAME="toc4"></A>
 <H2>Quick start</H2>
 <P>
 Go to the main directory, compile the grammars, and run a test.
@@ -122,6 +159,7 @@ Do for instance
 <P>
 For more examples, see the <A HREF="clt2006.html">Overview slides</A>. 
 </P>
 <A NAME="toc5"></A>
 <H3>The language independent ground API</H3>
 <P>
 This API is accessible by both <CODE>present</CODE> and <CODE>alltenses</CODE>.
@@ -129,7 +167,7 @@ The API is divided into a bunch of <CODE>abstract</CODE> modules.
 The following figure gives the dependencies of these modules.
 </P>
 <P>
-<IMG ALIGN="left" SRC="Lang.png" BORDER="0" ALT=""> 
+<IMG ALIGN="left" SRC="Grammar.png" BORDER="0" ALT=""> 
 </P>
 <P>
 The documentation of the individual modules:
@@ -151,9 +189,11 @@ The documentation of the individual modules:
 <LI><A HREF="gfdoc/Idiom.html">Idiom</A>: idiomatic phrases, such as existentials
 <LI><A HREF="gfdoc/Structural.html">Structural</A>: a lexicon of structural words
 <LI><A HREF="gfdoc/Lexicon.html">Lexicon</A>: a lexicon of other common words, for test purposes
-<LI><A HREF="gfdoc/Lang.html">Lang</A>: the main module comprising all the others
+<LI><A HREF="gfdoc/Grammar.html">Grammar</A>: the main module comprising all but <CODE>Lexicon</CODE>
 <LI><A HREF="gfdoc/Lang.html">Lang</A>: the main module comprising both <CODE>Grammar</CODE> and <CODE>Lexicon</CODE>
 </UL>
 <A NAME="toc6"></A>
 <H3>The language-dependent APIs</H3>
 <UL>
 <LI><A HREF="gfdoc/ParadigmsDan.html">ParadigmsDan</A>: Danish lexical paradigms
@@ -172,24 +212,36 @@ The documentation of the individual modules:
 <LI><A HREF="gfdoc/IrregDan.gf">IrregDan</A>: Danish irregular verbs (very incomplete)
 <LI><A HREF="gfdoc/IrregEng.gf">IrregEng</A>: English irregular verbs
 <LI><A HREF="gfdoc/IrregFre.gf">IrregFre</A>: French irregular verbs
 <LI><A HREF="gfdoc/IrregGer.gf">IrregGer</A>: German irregular verbs
 <LI><A HREF="gfdoc/IrregNor.gf">IrregNor</A>: Norwegian irregular verbs (very incomplete)
 <LI><A HREF="gfdoc/IrregSwe.gf">IrregSwe</A>: Swedish irregular verbs
 </UL>
 <P>
 This is the structure of each language-dependent top module.
 </P>
 <P>
 <IMG ALIGN="middle" SRC="English.png" BORDER="0" ALT="">
 </P>
 <UL>
 <LI><A HREF="../abstract/Extra.gf">Extra</A>: extra constructs implemented in some languages
 <LI><A HREF="../scandinavian/ExtraScandAbs.gf">ExtraScand</A>: extra constructs in Scandinavian only
 <LI><A HREF="../norwegian/ExtraNorAbs.gf">ExtraNor</A>: extra constructs in Norwegian only
 <LI><A HREF="../finnish/ExtraFinAbs.gf">ExtraFin</A>: extra constructs in Finnish only
 <LI><A HREF="../french/ExtraFreAbs.gf">ExtraFre</A>: extra constructs in French only
 <LI><A HREF="../english/ExtraEngAbs.gf">ExtraEng</A>: extra constructs in English only
 </UL>
 <UL>
 <LI><A HREF="../english/EnglishAbs.gf">English</A>: English with all extras
 <LI><A HREF="../finnish/FinnishAbs.gf">Finnish</A>: Finnish with all extras
 <LI><A HREF="../french/FrenchAbs.gf">French</A>: French with all extras
 <LI><A HREF="../german/GermanAbs.gf">German</A>: German with all extras
 <LI><A HREF="../norwegian/NorwegianAbs.gf">Norwegian</A>: Norwegian with all extras
 <LI><A HREF="../swedish/SwedishAbs.gf">Swedish</A>: Swedish with all extras
 </UL>
 <A NAME="toc7"></A>
 <H3>Special-purpose APIs</H3>
 <H4>Present</H4>
 <P>
@@ -200,6 +252,12 @@ The result is a smaller and more efficient grammar, which is still
 sufficient for many applications.
 </P>
 <H4>Multimodal</H4>
 <P>
 The API is the same as for the full ground API, but with modified
 linearization types of <CODE>NP</CODE> and <CODE>Adv</CODE>, and all other categories
 depending on them: an extra field is added for a demonstrative pointing
 gesture. Some functions for constructing demonstratives are provided.
 </P>
 <UL>
 <LI><A HREF="gfdoc/Multi.html">Multi</A>: main module for multimodal dialogue systems
 </UL>
@@ -211,7 +269,9 @@ sufficient for many applications.
 <LI><A HREF="gfdoc/Symbol.html">Symbol</A>: symbols and numbers in text
 </UL>
 <A NAME="toc8"></A>
 <H2>Using the library</H2>
 <A NAME="toc9"></A>
 <H3>The compiled version</H3>
 <P>
 The simplest way to get the library is to install the precompiled version
@@ -233,12 +293,24 @@ library. Use one (or several) of the following packages instead:
  multimodal dialogue applications
 </UL>
 <A NAME="toc10"></A>
 <H3>Linking applications to libraries</H3>
 <P>
-Notice, however, that both special-purpose APIs share modules with
+Typically, open one of
-<CODE>present</CODE>. It is therefore not a good idea to use them in combination with
-<CODE>alltenses</CODE>.
 </P>
 <UL>
 <LI><CODE>GrammarX</CODE> for just syntax
 <LI><CODE>LangX</CODE> for both syntax and a small lexicon
 <LI><CODE>X</CODE> (e.g. <CODE>English</CODE>) for syntax, lexicon, and language-dependent extensions
 </UL>
 <P>
 Usually you also need your own lexicon, and hence have to open
 </P>
 <UL>
 <LI><CODE>ParadigmsX</CODE> for lexicon-building functions
 </UL>
 <P>
 It is advisable to use the bare package names in paths pointing to the
 libraries. Here is an example, from <CODE>examples/dialogue/LightsEng.gf</CODE>:
@@ -255,6 +327,12 @@ I have the following line in my <CODE>.bashrc</CODE> file:
    export GF_LIB_PATH=/home/aarne/GF/lib
 </PRE>
 <P></P>
 <P>
 The <CODE>mathematical</CODE> API shares modules with
 <CODE>present</CODE>. It is therefore not a good idea to use it in combination with
 <CODE>alltenses</CODE>.
 </P>
 <A NAME="toc11"></A>
 <H3>Using the libraries as top-level grammars</H3>
 <P>
 If you have done <CODE>make</CODE> in <CODE>lib/resource-1.0</CODE>, you will have 
@@ -277,22 +355,27 @@ to succeed.
 <P>
 An exception is <CODE>LangEng</CODE>. It is actually feasible to parse with
 both <CODE>alltenses/LangEng</CODE> and <CODE>present/LangEng</CODE> - the latter being
-much faster than the former. The <CODE>-mcfg</CODE> flag (multiple context-free grammar)
+much faster than the former. The <CODE>-fcfg</CODE> flag (fast multiple context-free grammar)
 must be used:
 </P>
 <PRE>
-    p -lang=LangEng -mcfg -parser=topdown "this man is old"
+    p -lang=LangEng -fcfg "this man is old"
 </PRE>
 <P>
-Parsing with the <CODE>-mcfg</CODE> flag takes a few extra seconds the first time during
+Parsing with the <CODE>-fcfg</CODE> flag takes a few extra seconds the first time during
 each session, but gets faster at later runs.
 </P>
 <P>
 It is also feasible to parse in Scandinavian languages (Danish, Norwegian, Swedish).
 </P>
 <A NAME="toc12"></A>
 <H2>Example applications</H2>
 <P>
 These applications are meant to serve as starting points for
 new applications, showing how the libraries can be used in
 typical situations.
 </P>
 <A NAME="toc13"></A>
 <H3>Bronzeage</H3>
 <P>
 The <A HREF="../../../examples/bronzeage">examples/bronzeage</A> 
@@ -300,6 +383,7 @@ grammar set implements a language fragment
 based on the Swadesh list of 200 words. It is useful for
 things like language training.
 </P>
 <A NAME="toc14"></A>
 <H3>Dialogue</H3>
 <P>
 The <A HREF="../../../examples/dialogue">examples/dialogue</A> 
@@ -308,6 +392,7 @@ multimodal dialogue system.
 Its purpose is to serve as a prototype for applications in the
 TALK project.
 </P>
 <A NAME="toc15"></A>
 <H3>Animals</H3>
 <P>
 The <A HREF="../../../examples/animal">examples/animal</A> 
@@ -315,6 +400,7 @@ grammar set implements some queries about animals.
 Its purpose is to serve as a prototype for example-based 
 grammar writing.
 </P>
 <A NAME="toc16"></A>
 <H2>Known bugs and missing components</H2>
 <P>
 These bugs should be fixed before the final release of v. 1.0.
@@ -344,14 +430,14 @@ Finnish
 French
 </P>
 <UL>
-<LI>only direct word order in questions
+<LI>no inverted word order in questions
 </UL>
 <P>
 German
 </P>
 <UL>
-<LI>no list of irregular verbs
+<LI>-
 </UL>
 <P>
@@ -381,13 +467,12 @@ Russian
 Spanish
 </P>
 <UL>
-<LI>no ordinal numbers
+<LI>-
 Swedish
 <LI>-
 </UL>
-<P>
+<A NAME="toc17"></A>
 Swedish
 - 
 </P>
 <H2>More reading</H2>
 <P>
 <A HREF="gslt-sem-2006.html">Grammars as Software Libraries</A>. Slides
@@ -417,5 +502,5 @@ examples are from <CODE>multimodal/old</CODE>, which is a reduced-size API.
 </P>
 <!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
-<!-- cmdline: txt2tags index.txt -->
+<!-- cmdline: txt2tags -\-toc -thtml index.txt -->
 </BODY></HTML>
@@ -65,7 +65,8 @@ details.
 Coverage, for each language:
 - complete morphology
 - lexicon of the ca. 100 most important structural words
-- test lexicon of ca. 300 content words
+- test lexicon of ca. 300 content words (rough equivalents in each language)
 - list of irregular verbs (language-dependent)
 - representative fragment of syntax (cf. CLE (Core Language Engine))
 - rather flat semantics (cf. Quasi-Logical Form of CLE)
@@ -115,7 +116,7 @@ This API is accessible by both ``present`` and ``alltenses``.
 The API is divided into a bunch of ``abstract`` modules.
 The following figure gives the dependencies of these modules.
-[Lang.png] 
+[Grammar.png] 
 The documentation of the individual modules:
@@ -135,7 +136,8 @@ The documentation of the individual modules:
 - [Idiom gfdoc/Idiom.html]: idiomatic phrases, such as existentials
 - [Structural gfdoc/Structural.html]: a lexicon of structural words
 - [Lexicon gfdoc/Lexicon.html]: a lexicon of other common words, for test purposes
-- [Lang gfdoc/Lang.html]: the main module comprising all the others
+- [Grammar gfdoc/Grammar.html]: the main module comprising all but ``Lexicon``
 - [Lang gfdoc/Lang.html]: the main module comprising both ``Grammar`` and ``Lexicon``
 ===The language-dependent APIs===
@@ -156,19 +158,27 @@ The documentation of the individual modules:
 - [IrregDan gfdoc/IrregDan.gf]: Danish irregular verbs (very incomplete)
 - [IrregEng gfdoc/IrregEng.gf]: English irregular verbs
 - [IrregFre gfdoc/IrregFre.gf]: French irregular verbs
-% - [IrregGer gfdoc/IrregGer.gf]: German irregular verbs
+- [IrregGer gfdoc/IrregGer.gf]: German irregular verbs
 - [IrregNor gfdoc/IrregNor.gf]: Norwegian irregular verbs (very incomplete)
 - [IrregSwe gfdoc/IrregSwe.gf]: Swedish irregular verbs
 This is the structure of each language-dependent top module.
 [English.png]
 - [Extra ../abstract/Extra.gf]: extra constructs implemented in some languages
 - [ExtraScand ../scandinavian/ExtraScandAbs.gf]: extra constructs in Scandinavian only
 - [ExtraNor ../norwegian/ExtraNorAbs.gf]: extra constructs in Norwegian only
 - [ExtraFin ../finnish/ExtraFinAbs.gf]: extra constructs in Finnish only
 - [ExtraFre ../french/ExtraFreAbs.gf]: extra constructs in French only
 - [ExtraEng ../english/ExtraEngAbs.gf]: extra constructs in English only
 - [English ../english/EnglishAbs.gf]: English with all extras
 - [Finnish ../finnish/FinnishAbs.gf]: Finnish with all extras
 - [French ../french/FrenchAbs.gf]: French with all extras
 - [German ../german/GermanAbs.gf]: German with all extras
 - [Norwegian ../norwegian/NorwegianAbs.gf]: Norwegian with all extras
 - [Swedish ../swedish/SwedishAbs.gf]: Swedish with all extras
@@ -187,6 +197,11 @@ sufficient for many applications.
 ====Multimodal====
 The API is the same as for the full ground API, but with modified
 linearization types of ``NP`` and ``Adv``, and all other categories
 depending on them: an extra field is added for a demonstrative pointing
 gesture. Some functions for constructing demonstratives are provided.
 - [Multi gfdoc/Multi.html]: main module for multimodal dialogue systems
@@ -220,9 +235,14 @@ library. Use one (or several) of the following packages instead:
 ===Linking applications to libraries===
-Notice, however, that both special-purpose APIs share modules with
+Typically, open one of
-``present``. It is therefore not a good idea to use them in combination with
+- ``GrammarX`` for just syntax
-``alltenses``.
+- ``LangX`` for both syntax and a small lexicon
 - ``X`` (e.g. ``English``) for syntax, lexicon, and language-dependent extensions
 Usually you also need your own lexicon, and hence have to open
 - ``ParadigmsX`` for lexicon-building functions
 It is advisable to use the bare package names in paths pointing to the
@@ -237,6 +257,11 @@ I have the following line in my ``.bashrc`` file:
  export GF_LIB_PATH=/home/aarne/GF/lib
 ```
 The ``mathematical`` API shares modules with
 ``present``. It is therefore not a good idea to use it in combination with
 ``alltenses``.
 ===Using the libraries as top-level grammars===
@@ -257,14 +282,17 @@ to succeed.
 An exception is ``LangEng``. It is actually feasible to parse with
 both ``alltenses/LangEng`` and ``present/LangEng`` - the latter being
-much faster than the former. The ``-mcfg`` flag (multiple context-free grammar)
+much faster than the former. The ``-fcfg`` flag (fast multiple context-free grammar)
 must be used:
 ```
-  p -lang=LangEng -mcfg -parser=topdown "this man is old"
+  p -lang=LangEng -fcfg "this man is old"
 ```
-Parsing with the ``-mcfg`` flag takes a few extra seconds the first time during
+Parsing with the ``-fcfg`` flag takes a few extra seconds the first time during
 each session, but gets faster at later runs.
 It is also feasible to parse in Scandinavian languages (Danish, Norwegian, Swedish).
 ==Example applications==
@@ -314,11 +342,11 @@ Finnish
 French
-- only direct word order in questions
+- no inverted word order in questions
 German
-- no list of irregular verbs
+- -
 Italian
@@ -336,11 +364,9 @@ Russian
 Spanish
-- no ordinal numbers
+- -
 Swedish
- 
+- -