<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META NAME="generator" CONTENT="http://txt2tags.sf.net">
<TITLE>GF Resource Grammar Library v. 1.2</TITLE>
</HEAD><BODY BGCOLOR="white" TEXT="black">
<P ALIGN="center"><CENTER><H1>GF Resource Grammar Library v. 1.2</H1>
<FONT SIZE="4">
<I>Author: Aarne Ranta &lt;aarne (at) cs.chalmers.se&gt;</I><BR>
Last update: Fri Dec 21 18:15:24 2007
</FONT></CENTER>

<P ALIGN="center">
<IMG ALIGN="middle" SRC="../../../doc/lang10.png" BORDER="0" ALT="">
</P>
<P>
The GF Resource Grammar Library defines the basic grammar of
ten languages:
Danish, English, Finnish, French, German,
Italian, Norwegian, Russian, Spanish, Swedish.
Incomplete implementations of Arabic and Catalan are also
included.
</P>
<P>
<B>New</B> in December 2007: browse the library with the syntax editor
<A HREF="../../../demos/resource-api/editor.html">directly on the web</A>.
</P>
<H2>Authors</H2>
<P>
Inger Andersson and Therese Soderberg (Spanish morphology),
Nicolas Barth and Sylvain Pogodalla (French verb list),
Ali El Dada (Arabic modules),
Magda Gerritsen and Ulrich Real (Russian paradigms and lexicon),
Janna Khegai (Russian modules),
Bjorn Bringert (many Swadesh lexica),
Carlos Gonzalía (Spanish cardinals),
Harald Hammarström (German morphology),
Patrik Jansson (Swedish cardinals),
Andreas Priesnitz (German lexicon),
Aarne Ranta,
Jordi Saludes (Catalan modules),
Henning Thielemann (German lexicon).
</P>
<P>
We are grateful for the contributions and comments of
several other people who have used this and
previous versions of the resource library, including
Ludmilla Bogavac,
Ana Bove,
David Burke,
Lauri Carlson,
Gloria Casanellas,
Karin Cavallin,
Robin Cooper,
Hans-Joachim Daniels,
Elisabet Engdahl,
Markus Forsberg,
Kristofer Johannisson,
Anni Laine,
Hans Leiß,
Peter Ljunglöf,
Saara Myllyntausta,
Wanjiku Ng'ang'a,
Nadine Perera,
Jordi Saludes.
</P>
<H2>License</H2>
<P>
The GF Resource Grammar Library is open-source software licensed under
the GNU Lesser General Public License (LGPL). See the file <A HREF="../LICENSE">LICENSE</A> for more
details.
</P>
<H2>Scope</H2>
<P>
Coverage, for each language:
</P>
<UL>
<LI>complete morphology
<LI>lexicon of the ca. 100 most important structural words
<LI>test lexicon of ca. 300 content words (rough equivalents in each language)
<LI>list of irregular verbs (separately for each language)
<LI>representative fragment of syntax (cf. CLE (Core Language Engine))
<LI>rather flat semantics (cf. Quasi-Logical Form of CLE)
</UL>

<P>
Organization:
</P>
<UL>
<LI>top-level (API) modules
<LI>Ground API + special-purpose APIs
<LI>"school grammar" concepts rather than advanced linguistic theory
</UL>

<P>
Presentation:
</P>
<UL>
<LI>tool <CODE>gfdoc</CODE> for generating HTML from grammars
<LI>example collections
</UL>

<H2>Location</H2>
<P>
Assuming you have installed the libraries, you will find the precompiled
<CODE>gfc</CODE> and <CODE>gfr</CODE> files directly under <CODE>$GF_LIB_PATH</CODE>, whose default
value is <CODE>/usr/local/share/GF/</CODE>. The precompiled subdirectories are
</P>
<PRE>
alltenses
mathematical
multimodal
present
</PRE>
<P>
For instance, run
</P>
<PRE>
cd $GF_LIB_PATH
gf alltenses/langs.gfcm

> p -cat=S -lang=LangEng "this grammar is too big" | tb
</PRE>
<P>
For more details, see the <A HREF="synopsis.html">Synopsis</A>.
</P>
<H2>Compilation</H2>
<P>
If you want to compile the library from scratch, use <CODE>make</CODE> in the root of
the source directory:
</P>
<PRE>
cd GF/lib/resource-1.0
make
</PRE>
<P>
By default, the <CODE>make</CODE> procedure does not build Arabic and Catalan, but you
can uncomment the relevant lines in <CODE>Makefile</CODE> to compile them.
</P>
<H2>Encoding</H2>
<P>
Finnish, German, Romance, and Scandinavian languages are in ISO Latin-1 (ISO-8859-1).
</P>
<P>
Arabic and Russian are in UTF-8.
</P>
<P>
English is in pure ASCII.
</P>
<P>
Unfortunately, the different encodings make it hard to get
a nice view of all languages simultaneously. The easiest way to achieve this is
to use <CODE>gfeditor</CODE>, which automatically converts grammars to UTF-8.
</P>
<H2>Using the resource as a library</H2>
<P>
The library API is accessible from both <CODE>present</CODE> and <CODE>alltenses</CODE>. The modules you most often need are
</P>
<UL>
<LI><CODE>Syntax</CODE>, the interface to syntactic structures
<LI><CODE>Syntax</CODE><I>L</I>, the implementations of <CODE>Syntax</CODE> for each language <I>L</I>
<LI><CODE>Paradigms</CODE><I>L</I>, the morphological paradigms for each language <I>L</I>
</UL>

<P>
The <A HREF="synopsis.html">Synopsis</A> gives examples of the typical usage of these
modules.
</P>
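<P>
As a minimal sketch of how these modules fit together, the hypothetical
application grammar below opens <CODE>SyntaxEng</CODE> and <CODE>ParadigmsEng</CODE>.
The module names <CODE>Food</CODE> and <CODE>FoodEng</CODE> and the vocabulary are
illustrative assumptions, not part of the library. The overloaded functions
<CODE>mkCN</CODE>, <CODE>mkAP</CODE>, <CODE>mkN</CODE>, and <CODE>mkA</CODE> are used as we
understand the API, so check their exact types in the
<A HREF="synopsis.html">Synopsis</A> before relying on them.
</P>
<PRE>
-- Food.gf: a hypothetical application abstract syntax
abstract Food = {
  cat
    Kind ; Quality ;
  fun
    QKind : Quality -> Kind -> Kind ;  -- a kind modified by a quality
    Wine, Pizza : Kind ;
    Fresh, Warm : Quality ;
}

-- FoodEng.gf: the English concrete syntax, written with the library
concrete FoodEng of Food = open SyntaxEng, ParadigmsEng in {
  lincat
    Kind = CN ;      -- common noun, from the Syntax API
    Quality = AP ;   -- adjectival phrase, from the Syntax API
  lin
    QKind quality kind = mkCN quality kind ;  -- adjectival modification
    Wine = mkCN (mkN "wine") ;     -- regular noun via ParadigmsEng
    Pizza = mkCN (mkN "pizza") ;
    Fresh = mkAP (mkA "fresh") ;   -- regular adjective via ParadigmsEng
    Warm = mkAP (mkA "warm") ;
}
</PRE>
<P>
To compile such a grammar, the <CODE>present</CODE> or <CODE>alltenses</CODE> directory
must be on the GF search path; how you set that path depends on your installation.
The grammar itself never mentions inflection tables or word order, which is the
point of using the resource as a library: a concrete syntax for another language <I>L</I>
would open <CODE>Syntax</CODE><I>L</I> and <CODE>Paradigms</CODE><I>L</I> and change only the lexical entries.
</P>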
<H2>Using the resource as a top-level grammar</H2>
<P>
The following modules can be used for parsing and linearization. They are accessible from both
<CODE>present</CODE> and <CODE>alltenses</CODE>.
</P>
<UL>
<LI><CODE>Lang</CODE><I>L</I> for each language <I>L</I>, implementing a common abstract syntax <CODE>Lang</CODE>
<LI><CODE>Danish</CODE>, <CODE>English</CODE>, etc., implementing <CODE>Lang</CODE> with language-specific extensions
</UL>

<P>
In addition, both <CODE>present</CODE> and <CODE>alltenses</CODE> contain the file
</P>
<UL>
<LI><CODE>langs.gfcm</CODE>, a package with precompiled <CODE>Lang</CODE><I>L</I> grammars
</UL>

<P>
One way to test and view the resource grammar is to load <CODE>langs.gfcm</CODE> either into <CODE>gfeditor</CODE>
or into the <CODE>gf</CODE> shell and perform actions such as syntax editing and treebank generation.
For instance, the command
</P>
<PRE>
> p -lang=LangEng -cat=S "this grammar is too big" | tb
</PRE>
<P>
creates a treebank entry with translations of this sentence.
</P>
<P>
For parsing, currently only English and the Scandinavian languages are within the limits of
reasonable resources. For other languages <I>L</I>, parsing with <CODE>Lang</CODE><I>L</I> will probably
exhaust the computer's resources before parser generation finishes.
</P>
<H2>Accessing the lower-level ground API</H2>
<P>
The <CODE>Syntax</CODE> API is implemented in terms of a number of <CODE>abstract</CODE> modules, which
as of version 1.2 are mainly of interest to implementors of the resource.
See the <A HREF="index-1.1.html">documentation for version 1.1</A> for more details.
</P>
<H2>Known bugs and missing components</H2>
<P>
Danish
</P>
<UL>
<LI>the lexicon and chosen inflections are only partially verified
</UL>

<P>
English
</P>
<P>
Finnish
</P>
<UL>
<LI>wrong cases in some passive constructions
</UL>

<P>
French
</P>
<UL>
<LI>multiple clitics (with V3) not always right
<LI>third person pronominal questions with inverted word order
have wrong forms if "t" is required
(e.g. "comment fera-t-il" becomes "comment fera il")
</UL>

<P>
German
</P>
<P>
Italian
</P>
<UL>
<LI>multiple clitics (with V3) not always right
</UL>

<P>
Norwegian
</P>
<UL>
<LI>the lexicon and chosen inflections are only partially verified
</UL>

<P>
Russian
</P>
<UL>
<LI>some functions missing
<LI>some regular paradigms are missing
</UL>

<P>
Spanish
</P>
<UL>
<LI>multiple clitics (with V3) not always right
<LI>missing contractions with imperatives and clitics
</UL>

<P>
Swedish
</P>
<H2>More reading</H2>
<P>
<A HREF="synopsis.html">Synopsis</A>. The concise guide to API v. 1.2.
</P>
<P>
<A HREF="gslt-sem-2006.html">Grammars as Software Libraries</A>. Slides
with background and motivation for the resource grammar library.
</P>
<P>
<A HREF="clt2006.html">GF Resource Grammar Library Version 1.0</A>. Slides
giving an overview of the library and practical hints on its use.
</P>
<P>
<A HREF="Resource-HOWTO.html">How to write resource grammars</A>. Helps you
start if you want to add another language to the library.
</P>
<P>
<A HREF="http://www.cs.chalmers.se/~aarne/geocal2006.pdf">Parametrized modules for Romance languages</A>.
Slides explaining some ideas in the implementation of
French, Italian, and Spanish.
</P>
<P>
<A HREF="http://www.cs.chalmers.se/~aarne/slides/webalt-2005.pdf">Grammar writing by examples</A>.
Slides showing how linearization rules are written as strings parsable by the resource grammar.
</P>
<P>
<A HREF="http://www.cs.chalmers.se/~aarne/slides/talk-edin2005.pdf">Multimodal Resource Grammars</A>.
Slides showing how to use the multimodal resource library. N.B. the library
examples are from <CODE>multimodal/old</CODE>, which is a reduced-size API.
</P>
<P>
<A HREF="../../../doc/resource.pdf">GF Resource Grammar Library</A> (pdf).
Printable user manual with API documentation, for version 1.0.
</P>

<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
<!-- cmdline: txt2tags -thtml index.txt -->
</BODY></HTML>