resource doc; make lib

aarne
2004-08-10 10:17:19 +00:00
parent 548962f7d7
commit f301993bff
3 changed files with 519 additions and 43 deletions


@@ -20,8 +20,8 @@ August 10, 2004.
<b>August 10, 2004. GF 2.0 now released</b>.
Here are the <a
href="http://www.cs.chalmers.se/%7Eaarne/GF/doc/gf2-highlights.html">highlights</a>.
Software available on the <a href="http://www.cs.chalmers.se/%7Eaarne/GF/download">Download Page</a>.
href="doc/gf2-highlights.html">highlights</a>.
Software available on the <a href="download">Download Page</a>.
<p>
@@ -79,6 +79,7 @@ for Zaurus. To learn more about Zaurus, read this
review</a>.
</font>
</p><h2>What is GF?</h2>
The Grammatical Framework (=GF) is a grammar formalism based on type
@@ -113,14 +114,19 @@ GF Version 2.0 adds the aspect of
<li> modularity and grammar engineering.
</ul>
GF is open-source software licensed under
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/doc/gpl.html">GNU General Public License (GPL)</a>.
<a href="LICENSE">GNU General Public License (GPL)</a>.
<h2>Examples and demos</h2>
<a href="2341.html">Numeral translator</a>: recognizes and generates
<a href="http://www.cs.chalmers.se/~bringert/gf/translate/">Numeral
translator</a>: recognizes and generates
numbers from 1 to 999,999 in 80 languages.
(The link goes to a live applet, which requires
<a href="http://java.sun.com/j2se/1.5.0/download.jsp">Java 1.5 plugin</a>.
Here is an <a href="doc/2341.html">example</a>, which does
not require the plugin.)
<p>
@@ -130,8 +136,7 @@ French, Swedish, and Russian with a few mouse clicks.
<p>
<a href="http://129.16.225.78/aarne/GF/resource/">Resource grammar
library</a>:
<a href="lib/resource/">Resource grammar library</a>:
basic structures of seven languages
(English, Finnish, French, German, Italian, Russian, Swedish).
Resource grammars can be used as libraries for writing GF
@@ -143,7 +148,7 @@ but they can also be useful for language training.
GF is available precompiled for
several platforms: Linux, Mac OS X, Microsoft Windows, and Sun OS.
For more information, see the <a href="http://www.cs.chalmers.se/%7Eaarne/GF/download">Download Page</a>.
For more information, see the <a href="download">Download Page</a>.
<h2>Source code</h2>
@@ -160,9 +165,9 @@ The platform-independent graphical user interface is written in
</p><p>
Here is a
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/gf-src.tgz">GF source package</a>, which includes a Makefile
<a href="gf-src.tgz">GF source package</a>, which includes a Makefile
for different platforms and Haskell compilers.
The <a href="http://www.cs.chalmers.se/%7Eaarne/GF/download">Download Page</a> gives more information on
The <a href="download">Download Page</a> gives more information on
compiler requirements.
</p><p>
@@ -175,39 +180,39 @@ Here are some older source packages still available:
<ul>
<li>
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/doc/javaGUImanual/javaGUImanual.htm">User's tutorial</a>
<a href="doc/javaGUImanual/javaGUImanual.htm">User's tutorial</a>
on editing in the Java interface.
</li><li>
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/Tutorial/">Grammarian's tutorial</a>
<a href="Tutorial/">Grammarian's tutorial</a>
on writing GF grammars, with exercises.
</li><li>
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/doc/short/01-gf-short.html">
<a href="doc/short/01-gf-short.html">
GF in 25 Minutes</a> for programmers.
</li><li>
<a href="http://www.cs.chalmers.se/%7Eaarne/articles/gf-jfp.ps.gz">Grammatical Framework: A Type-Theoretical
<a href="http://www.cs.chalmers.se/~aarne/articles/gf-jfp.ps.gz">Grammatical Framework: A Type-Theoretical
Grammar Formalism</a> (ps.gz). Theoretical paper on GF by A. Ranta, appeared
in <i>The Journal of Functional Programming</i>, vol. 14:2. 2004, pp. 145-189.
</li><li>
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/doc/gf-manual.html">
<a href="doc/gf-manual.html">
User Manual</a> explaining the GF user interfaces and command language.
</li><li>
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/doc/gf-specification.html">
<a href="doc/DocGF.pdf">
Language specification</a> of the GF grammar formalism.
</li><li>
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/doc/gf2-highlights.html">
<a href="doc/gf2-highlights.html">
Highlights</a> of Version 2.0 (in comparison with version 1.1).
</li><li>
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/doc/gf-bibliography.html">
<a href="doc/gf-bibliography.html">
Bibliography</a>:
publications on GF, as well as background literature.
</li></ul>
@@ -222,7 +227,10 @@ Knowledge</a>. GF is used in implementing multimodal and multilingual dialogue s
<a href="http://www.key-project.org/">KeY</a> project on Integrated Deductive
Software Design. GF is used for
authoring informal and formal specifications.
authoring informal and formal specifications. More details on the GF
application
<a href="http://www.cs.chalmers.se/%7Ekrijo/GF/specifications.html">
here</a>.
<p>
@@ -237,14 +245,9 @@ in Vienna 2003.
<ul>
<li>
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/gf-grammars.tgz">
<a href="gf-grammars.tgz">
Package of example GF grammars</a>
<!--
<li>
<a href="grammars/">
Unpacked directories with example GF grammars</a>
-->
</li><li>
<a href="http://www.cs.chalmers.se/%7Ekrijo/gramlets.html">Gramlets</a>:
@@ -256,16 +259,6 @@ GF grammars compiled to Java applets.
The GF Xerox Home Page</a>
with the oldest releases of and documents on GF, Version 0.54, 1999.
</li><li>
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/doc/gf-local.html">
Local guide</a>
on running GF on Chalmers CS computers.
</li><li>
Application project:
<a href="http://www.cs.chalmers.se/%7Ekrijo/GF/specifications.html">
Grammars for Object-Oriented Software Specifications</a>
by <a href="http://www.cs.chalmers.se/%7Ekrijo/">Kristofer Johannisson</a>.
</li><li>
Earlier application:
@@ -295,15 +288,6 @@ More details on the
<a href="http://www.cs.chalmers.se/%7Eaarne/GF/doc/gf-people.html">
Authors and Acknowledgements</a> page.
<!--
<h2>Demo</h2>
There was an
<a href="demo/FormTranslate.html">on-line translator demo</a>
currently running on an old server, and not always functional.
-->
<h2>Implementation project</h2>


@@ -4,3 +4,5 @@ distr:
test:
sh tst.sh
lib:
sh mkLib.sh

490
lib/resource-0.6/index.html Normal file

@@ -0,0 +1,490 @@
<html>
<body bgcolor="#FFFFFF" text="#000000" >
<center>
<img SRC="../../doc/gf-logo.gif">
<h1>The GF Resource Grammar Library</h1>
<a href="http://www.cs.chalmers.se/~aarne">Aarne Ranta</a>
2002-2004
<p>
Version 0.6: <a href="gf-resource.tgz">source package</a>.
<p>
Current languages: English, Finnish, French, German, Italian, Russian, Swedish.
</center>
<font size=2>
<b>News</b>. <br>
10/8/2004 This document updated as a revision of the
<a href="http://tournesol.cs.chalmers.se/aarne/GF/resource/">old resource page</a>.
<br>
13/4/2004 Version 0.6 written using the module system of GF 2. Also an
extended coverage. The files are placed in separate subdirectories (one
per language) and have different names than before, so that file names
(without the extension <tt>.gf</tt>) are also legal module names.
</font>
<p>
<i>
<b>Notice</b>. You need GF Version 2.0beta or later
to work with these resource grammars.
It is available from the
<a href="http://www.cs.chalmers.se/~aarne/GF/">GF home page</a>.
</i>
<p>
<h2>Introduction</h2>
As programs in general can be divided into
<ul>
<li> application programs
<li> library programs
</ul>
GF grammars can be divided into
<ul>
<li> <b>application grammars</b>
<li> <b>resource grammars</b>
</ul>
An application grammar is typically built around
a semantic model, which is formalized as the abstract
syntax of the language. Concrete syntax defines
a mapping from the abstract syntax into English or
Swedish or some other language.
<p>
A resource grammar is not based on semantics, but its
purpose is to define the linguistic "surface" structures
of some language. The availability of these structures makes it easier to
write application grammars.
<p>
With resource grammars, we aim to achieve <b>division of labour</b> in
grammar writing:
<ul>
<li> application grammars are written by domain experts
<li> resource grammars are written by linguists
</ul>
By using resource grammars, application domain experts can take
linguistic details for granted. For instance, to
express the linearization of the arithmetical predicate <i>even</i>
in French, a domain expert does not have to write
<pre>
lin Even x = {s =
table {
m => x.s ++
table {Ind => "est" ; Subj => "soit"} ! m ++
table {Masc => "pair" ; Fem => "paire"} ! x.g
}
} ;
</pre>
but simply
<pre>
lin Even = predA1 (adjReg "pair") ;
</pre>
The author of the French resource grammar will have defined the
functions <tt>predA1</tt> and <tt>adjReg</tt> in such a way that
they can be used in all applications.
<p>
What is more, the resource grammar has a <b>language-independent
API</b>, which makes it possible to write the corresponding rule
for other languages in a very similar way. For instance, the
German rule is
<pre>
lin Even = predA1 (adjReg "gerade") ;
</pre>
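<p>
The idea of a shared API can be sketched outside GF as well. The following
Python model is only an illustration, not GF code: the function names
<tt>adj_reg_fra</tt>, <tt>pred_a1_fra</tt>, <tt>adj_reg_deu</tt> and
<tt>pred_a1_deu</tt> are hypothetical, and the inflection rules are
deliberately simplified assumptions. It shows how the application-level rule
keeps the same shape while each language hides its own tables.

```python
# Toy model of a language-independent API. Each language supplies its own
# adj_reg and pred_a1; the application rule has the same shape everywhere.
# All names and the simplified inflection rules are illustrative assumptions.

def adj_reg_fra(stem):
    # Simplified regular French adjective: the feminine adds "e".
    return {"Masc": stem, "Fem": stem + "e"}

def pred_a1_fra(adj):
    copula = {"Ind": "est", "Subj": "soit"}
    # The result maps a subject record and a mood to a string.
    def lin(subject, mood):
        return " ".join([subject["s"], copula[mood], adj[subject["g"]]])
    return lin

def adj_reg_deu(stem):
    # Predicative German adjectives are uninflected, so one form is reused.
    return {"Masc": stem, "Fem": stem, "Neutr": stem}

def pred_a1_deu(adj):
    copula = {"Ind": "ist", "Subj": "sei"}
    def lin(subject, mood):
        return " ".join([subject["s"], copula[mood], adj[subject["g"]]])
    return lin

# The application-level rules differ only in the lexical string.
even_fra = pred_a1_fra(adj_reg_fra("pair"))
even_deu = pred_a1_deu(adj_reg_deu("gerade"))

print(even_fra({"s": "la somme", "g": "Fem"}, "Ind"))
print(even_deu({"s": "die Summe", "g": "Fem"}, "Subj"))
```

In this model the agreement in gender and the choice of mood are handled
entirely inside the per-language functions, which is the point of the
division of labour described above.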
<h2>Coverage</h2>
The ultimate goal of the resource grammar library is a full coverage of the linguistic
structures of each language. As of Version 0.6, we still have some way
to go to reach that goal. But we do have
<ul>
<li> fairly complete sets of inflection paradigms for each language
<li> a representative fragment of syntax covering present-tense
indicative, interrogative, and imperative sentences.
<li> lexica of structural words such as pronouns, articles, conjunctions.
</ul>
<h2>Demo</h2>
To get an idea of the coverage of the resource library, and also
to help you find the right functions for your applications, you
can do
<pre>
make test
jgf TestAll.gfcm
</pre>
This opens the syntax editor with all the seven resource grammars
extended with a small lexicon.
<h2>Programmer's view on resource grammars</h2>
The resource grammar library has a hierarchical structure. Its main layers are
<ul>
<li> The language-dependent <b>core resources</b>, to be described below.
<li> The language-independent <b>core resource API</b>,
<a href="doc/Combinations.html"><tt>Combinations.gf</tt></a> and
<a href="doc/Structural.html"><tt>Structural.gf</tt></a>.
<li> The <b>derived resource libraries</b>, some of which are
language-dependent, some of which aren't. The most important
ones are the language-dependent lexical paradigm modules
<tt>ParadigmsX.gf</tt>.
</ul>
The core resources should not be needed by application grammarians: it should
be enough to use the core resource API and the derived libraries. If
this is not the case, the best solution is to extend the derived resource
libraries or create new ones.
<h3>Grammaticality guarantee via data abstraction</h3>
An important principle is that
<ul>
<li> the core resource API and the derived resource libraries guarantee
that all type-correct uses of them preserve grammaticality.
</ul>
This principle is simultaneously a guidance for resource grammarians
and an argument for the application grammarian to use these libraries.
What we mean by "only using the libraries" is that
<ul>
<li> all <tt>lin</tt> and
<tt>lincat</tt> rules are built solely from library functions and
argument variables.
</ul>
Thus for instance no records, tables, selections or projections should appear
in the rules. What we have achieved then is <b>total data abstraction</b>,
and the grammaticality guarantee can be given.
<p>
Since the resource grammars are work in progress, their coverage is not
yet sufficient for complete data abstraction. In addition, there may of course
be bugs in the resource grammars that destroy grammaticality. The GF group is
grateful for bug reports, requests, and contributions!
<p>
The most important exception to total data abstraction in practice is the
incompleteness of resource lexica. Since it is impossible to have
full coverage of all the words in a language, users often have to introduce
their own lexical entries, and thereby use literal strings in their GF code.
The safest and most convenient way of doing this is via the functions
defined in <tt>ParadigmsX.gf</tt> files. Using these functions guarantees
that the lexical entries created are type-correct. But nothing guards
against misspelling a word, picking a wrong inflectional pattern, or
a wrong inherent feature (such as gender).
<h3>The resource grammar documentation in <tt>gfdoc</tt></h3>
All documented GF grammars linked from this page
have been written in GF and then translated to HTML
using a light-weight documentation tool,
<tt>gfdoc</tt>. The tool is available as a part of the GF
source code package, in the Haskell file
<tt>util/GFDoc.hs</tt> that can be run in the Hugs interpreter
by the script <tt>util/gfdoc</tt>. The program also has the
flag <tt>+latex</tt>, which produces output in LaTeX instead of
HTML.
<h3>The core resource API</h3>
The API is divided into two modules, <tt>Combinations</tt> and
its extension <tt>Structural</tt>.
<p>
The file <a href="doc/Combinations.html"><tt>Combinations.gf</tt></a>
gives the core resource type signatures of phrasal categories and
syntactic combination rules, together with some explanations
and examples. The examples are so far only in English, but their
equivalents are available in all of the languages for which the
API has been implemented.
<p>
The file <a href="doc/Structural.html"><tt>Structural.gf</tt></a>
gives a list of structural words such as determiners, pronouns,
prepositions, and conjunctions.
<p>
The file <tt>Structural.gf</tt> cannot be imported directly, but
via the generated files <tt>ResourceX.gf</tt> for each language <tt>X</tt>.
In these files, the <tt>fun/lin</tt> and <tt>cat/lincat</tt> judgements have been
translated into <tt>oper</tt> judgements.
<h3>The lexical paradigm modules</h3>
The lexical paradigm modules define, for
each lexical category, a <b>worst-case macro</b> for adding words
of that category by giving a sufficient number of characteristic
forms. In addition, the most common <b>regular paradigms</b> are
included, where it is enough just to give one form to generate
all the others.
<p>
For example, the English paradigm module has the worst-case macro for nouns,
<pre>
mkN : (man,men,man's,men's : Str) -> Gender -> N ;
</pre>
taking four forms and a gender (<tt>human</tt> or <tt>nonhuman</tt>,
as is also explained in the module). Its application
<pre>
mkN "mouse" "mice" "mouse's" "mice's" nonhuman
</pre>
defines all information that is needed for the noun <i>mouse</i>.
There are also some regular patterns, for instance,
<pre>
nReg : Str -> Gender -> N ; -- dog, dogs
nKiss : Str -> Gender -> N ; -- kiss, kisses
</pre>
examples of which are
<pre>
nReg "car" nonhuman
nKiss "waitress" human
</pre>
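<p>
The relation between the worst-case macro and the regular paradigms can be
modelled in a few lines of Python. This is a sketch under stated assumptions,
not the contents of <tt>ParadigmsEng.gf</tt>: the record layout and the
number and case labels are hypothetical, chosen only to make the four forms
explicit.

```python
# Toy model of a worst-case macro and two regular paradigms for English nouns.
# The record layout and the labels ("Sg", "Pl", "Nom", "Gen") are assumptions
# made for illustration; they are not taken from the GF library.

def mk_n(sg, pl, sg_gen, pl_gen, gender):
    """Worst-case macro: all four forms and the gender are given explicitly."""
    return {
        "s": {
            ("Sg", "Nom"): sg,
            ("Pl", "Nom"): pl,
            ("Sg", "Gen"): sg_gen,
            ("Pl", "Gen"): pl_gen,
        },
        "g": gender,
    }

def n_reg(word, gender):
    """Regular paradigm (dog, dogs): one form determines the rest."""
    return mk_n(word, word + "s", word + "'s", word + "s'", gender)

def n_kiss(word, gender):
    """Regular paradigm for sibilant stems (kiss, kisses)."""
    return mk_n(word, word + "es", word + "'s", word + "es'", gender)

mouse = mk_n("mouse", "mice", "mouse's", "mice's", "nonhuman")
car = n_reg("car", "nonhuman")
waitress = n_kiss("waitress", "human")

print(mouse["s"][("Pl", "Nom")])
print(car["s"][("Pl", "Nom")])
print(waitress["s"][("Pl", "Nom")])
```

Each regular paradigm is just a call to the worst-case macro with the
remaining forms computed from the single form given, so both routes produce
records of the same shape.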
<p>
Here are the documented versions of the paradigm modules:
<ul>
<li> English: <a href="doc/ParadigmsEng.html"><tt>ParadigmsEng.gf</tt></a>
<li> Finnish: <a href="doc/ParadigmsFin.html"><tt>ParadigmsFin.gf</tt></a>
<li> French: <a href="doc/ParadigmsFra.html"><tt>ParadigmsFra.gf</tt></a>
<li> German: <a href="doc/ParadigmsDeu.html"><tt>ParadigmsDeu.gf</tt></a>
<li> Italian: <a href="doc/ParadigmsIta.html"><tt>ParadigmsIta.gf</tt></a>
<li> Russian: <a href="doc/ParadigmsRus.html"><tt>ParadigmsRus.gf</tt></a>
<li> Swedish: <a href="doc/ParadigmsSwe.html"><tt>ParadigmsSwe.gf</tt></a>
</ul>
<h3>The derived resource libraries</h3>
The core resource grammar is minimal in the sense that it defines the
smallest syntactic combinations and has no redundancy. For applications, it
is usually more convenient to use combinations of the minimal rules.
Some such combinations are given in the <b>predication library</b>,
which defines the simultaneous applications of one- and two-place
verbs and adjectives to all their argument noun phrases. It also
defines some other constructions useful for logical and mathematical
applications.
<p>
The API of the predication library is in the file
<a href="doc/Predication.html"><tt>Predication.gf</tt></a>.
What is imported is one of the language-dependent files,
<tt>X/PredicationX.gf</tt> for each language <tt>X</tt>.
<h2>Linguist's view on resource grammars</h2>
<h3>GF and other grammar formalisms</h3>
Linguists in particular might be interested in resource
grammars for their own sake, not as a basis for applications.
Since few linguists are so far familiar with GF, we refer to the
<a href="http://www.cs.chalmers.se/~aarne/GF/">GF Homepage</a>
and especially to the
<a href="http://www.cs.chalmers.se/~aarne/GF/Tutorial/">GF Tutorial</a>.
What comes here is a brief summary of the relation of GF to
other record-based formalisms.
<p>
The records of GF are much like feature structures in PATR or HPSG.
The main differences are that
<ul>
<li> GF has a type system inherited from
functional programming languages;
<li> GF records are primarily obtained as linearizations of trees, not
as parses of strings.
</ul>
The latter difference explains why a GF record typically carries more
information than a feature structure. For instance, the record describing
the French noun <i>cheval</i> is
<pre>
{s = table {Sg => "cheval" ; Pl => "chevaux"} ; g = Masc} ;
</pre>
showing the full inflection table of the (abstract) noun <i>cheval</i>.
A PATR record
for the French word <i>cheval</i> would be
<pre>
{s = "cheval" ; n = Sg ; g = Masc} ;
</pre>
showing just the information that can be gathered from the (concrete)
string <i>cheval</i>.
There is a rather straightforward sense in which the PATR record is an
<b>instance</b> of the GF record.
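<p>
One way to make the instance relation concrete is the following Python
sketch. It is only a model with plain dictionaries standing in for records,
and the helper names <tt>instances</tt> and <tt>is_instance</tt> are
hypothetical. Looking up one form in the table is a single dictionary access,
which also illustrates why generation from a full inflection table is cheap.

```python
# Toy model: a GF-style record carries a whole inflection table, while a
# PATR-style record describes one concrete string. Dictionaries stand in for
# records; the helper names are illustrative assumptions.

cheval = {"s": {"Sg": "cheval", "Pl": "chevaux"}, "g": "Masc"}

def instances(gf_record):
    """List one PATR-style record per entry of the inflection table."""
    return [
        {"s": form, "n": number, "g": gf_record["g"]}
        for number, form in gf_record["s"].items()
    ]

def is_instance(patr_record, gf_record):
    """Check that the PATR-style record picks out one row of the GF record."""
    return (
        gf_record["s"].get(patr_record["n"]) == patr_record["s"]
        and gf_record["g"] == patr_record["g"]
    )

for record in instances(cheval):
    print(record)

print(is_instance({"s": "cheval", "n": "Sg", "g": "Masc"}, cheval))
```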
<p>
When generating language from syntax trees (or from logical formulas via
syntax trees), the record containing full inflection tables is an efficient
(linear-time) method of producing the correct forms.
This is important when text is generated in real time in
an interactive system.
<h2>The structure of core resource grammars</h2>
As explained above, the application grammarian's view on resource grammars
is through API modules. They are collections of type signatures of functions.
It is the task of linguists to define these functions.
The definitions are in the end given
in the <b>core resource grammars</b>.
<p>
We have divided the core resource grammar for each language <tt>X</tt>
into the following parts:
<ul>
<li> Type system: <tt>TypesX.gf</tt>
<li> Morphology: <tt>MorphoX.gf</tt>
<li> Syntax: <tt>SyntaxX.gf</tt>
</ul>
To get the most powerful resource grammar for each language, one can use
these files directly.
<p>
However, the languages we have studied have so much in common
that we have gathered a considerable set of categories and rules
in a <b>multilingual resource grammar</b>. Its parts are
<ul>
<li> Abstract syntax: <tt>Resource.gf</tt>
<li> Language-dependent concrete syntax: <tt>ResourceX.gf</tt> for
each language.
</ul>
The advantage of using this API in application grammars is that
<b>their concrete syntax looks the same for all languages</b>
up to non-structural words. Thus it is possible to produce concrete syntaxes
for new languages while knowing almost nothing about them.
The abstract syntax serves as a common API to the core resource grammar.
<h3>The code for the core resource grammars</h3>
Each language has its resource code in a separate directory.
You can view the code as it is, or download it and run <tt>gfdoc</tt>
on each file.
<ul>
<li> English:
<a href="english"><tt>english</tt></a>
<li> Finnish:
<a href="finnish"><tt>finnish</tt></a>
<li> Shared Romance:
<a href="romance"><tt>romance</tt></a>
<li> French (building on Romance):
<a href="french"><tt>french</tt></a>
<li> Italian (building on Romance):
<a href="italian"><tt>italian</tt></a>
<li> Russian:
<a href="russian"><tt>russian</tt></a>
<li> German:
<a href="german"><tt>german</tt></a>
<li> Swedish:
<a href="swedish"><tt>swedish</tt></a>
</ul>
<h2>Compiling and using the resource</h2>
To compile the resource into reusable operations, for all languages, type
<pre>
make
</pre>
in the <tt>resource/</tt> directory.
This requires that you have a recent version of GF (>= 2.0).
What you get is a set of files with names <tt>ResourceX.gfr</tt>,
<tt>ResourceX.gfc</tt>, <tt>ParadigmsX.gfr</tt>, and <tt>ParadigmsX.gfc</tt>.
You need never consult any of these files,
but only look into the <a href="doc">documentation</a>.
<h2>Examples of using the resource grammars</h2>
<h3>A test suite</h3>
The grammars <tt>TestResourceX.gf</tt> define a few expressions of each
lexical category and make it possible to test linearization, parsing,
random generation, and editing.
<h3>A database query language</h3>
The grammars
<a href="../database/">
<tt>database/(Database | Restaurant)X.gf</tt></a>
make use of the resource. The <tt>RestaurantX.gf</tt>
grammars are just one possible application building on the generic
<tt>DatabaseX.gf</tt> grammars.
Notice that the
<tt>DatabaseX</tt> grammars are defined as instantiations of
the parametrized module <tt>DatabaseI</tt>.
<h2>Functional morphology</h2>
Even though GF is a useful language for describing syntax and semantics, it
is not the optimal choice for morphology.
One reason is the absence of low-level
programming, such as string matching. Another reason is efficiency.
In connection with the resource grammar project, we have started another
project, <a href="http://www.cs.chalmers.se/%7Emarkus/FM">
functional morphology</a>,
which uses Haskell to implement
morphology. Haskell morphologies can then be used for generating
GF morphologies.
<h2>Further reading</h2>
If you want to read an informal introduction to
resource grammars, see these
<a href="karlsruhe.ps.gz">slides</a>, written for a German computer science
audience. Or these
<a href="swedish.ps.gz">other slides</a>, written for a Swedish
linguistic audience.
</body>
</html>