diff --git a/resource-1.0/doc/Resource-HOWTO.html b/resource-1.0/doc/Resource-HOWTO.html
index 4435d3c8e..58e05bd46 100644
--- a/resource-1.0/doc/Resource-HOWTO.html
+++ b/resource-1.0/doc/Resource-HOWTO.html
@@ -7,9 +7,56 @@
The purpose of this document is to tell how to implement the GF resource grammar API for a new language. We will not cover how
@@ -17,23 +64,43 @@
to use the resource grammar, nor how to change the API. But we will give some hints how to extend the API.
-Notice. This document concerns the API v. 1.0 which has not
-yet been released. You can find the current code
-in GF/lib/resource-1.0/. See the
-resource-1.0/README for
+A manual for using the resource grammar is found in
+
+http://www.cs.chalmers.se/~aarne/GF/doc/resource.pdf.
+
+A tutorial on GF, also introducing the idea of resource grammars, is found in
+
+
+http://www.cs.chalmers.se/~aarne/GF/doc/tutorial/gf-tutorial2.html.
+
+This document concerns the API v. 1.0. You can find the current code in
+
+
+http://www.cs.chalmers.se/~aarne/GF/lib/resource-1.0/
+
+See the README for
details on how this differs from previous versions.
The API is divided into a bunch of abstract modules.
The following figure gives the dependencies of these modules.
-
+
-The module structure is rather flat: almost every module is a direct
-parent of the top module Lang. The idea
+Thus the API consists of a grammar and a lexicon, which is
+provided for test purposes.
+
+The module structure is rather flat: most modules are direct
+parents of Grammar. The idea
is that you can concentrate on one linguistic aspect at a time, or
also distribute the work among several authors. The module Cat
defines the "glue" that ties the aspects together - a type system
@@ -41,6 +108,7 @@
to which all the other modules conform, so that e.g. NP means
the same thing in those modules that use NPs and those that
construct them.
The direct parents of the top will be called phrase category modules,
@@ -65,6 +133,7 @@
one of a small number of different types). Thus we have
Idiom: idiomatic phrases such as existentials
+
Expressions of each phrase category are constructed in the corresponding
@@ -93,6 +162,7 @@
can skip the lincat definition of a category and use the default
{s : Str} until you need to change it to something else. In
English, for instance, many categories do have this linearization type.
What is lexical and what is syntactic is not as clearcut in GF as in
@@ -129,6 +199,45 @@
different languages on the level of a resource grammar. In other words, application grammars are likely to use the resource in different ways for different languages.
+
+
+In addition to the common API, there is room for language-dependent extensions
+of the resource. The top level of each language looks as follows (with English as an example):
+
++ abstract English = Grammar, ExtraEngAbs, DictEngAbs ++
+where ExtraEngAbs is a collection of syntactic structures specific to English,
+and DictEngAbs is an English dictionary
+(at the moment, it consists of IrregEngAbs,
+the irregular verbs of English). Each of these language-specific grammars has
+the potential to grow into a full-scale grammar of the language. These grammars
+can also be used as libraries, but the possibility of using functors is lost.
+
+To give a better overview of language-specific structures,
+modules like ExtraEngAbs
+are built from a language-independent module ExtraAbs
+by restricted inheritance:
+
+ abstract ExtraEngAbs = Extra [f,g,...] ++
+Thus any category and function in Extra may be shared by a subset of all
+languages. One can see this set-up as a matrix, which tells
+what Extra structures
+are implemented in what languages. For the common API in Grammar, the matrix
+is filled with 1's (everything is implemented in every language).
+
+In a minimal resource grammar implementation, the language-dependent
+extensions are just empty modules, but it is good to provide them for
+the sake of uniformity.
+
+Among all categories and functions, a handful are
@@ -153,6 +262,7 @@
rules relate the categories to each other. It is intended to be a first approximation when designing the parameter system of a new language.
+If you want to experiment with a small subset of the resource API first,
@@ -161,6 +271,7 @@
try out the module explained in the GF Tutorial.
+Some lines in the resource library are suffixed with the comment
@@ -176,7 +287,9 @@
implementation. To compile a grammar with present-tense-only, use
  i -preproc=GF/lib/resource-1.0/mkPresent LangGer.gf
+
Unless you are writing an instance of a parametrized implementation
@@ -262,6 +375,7 @@
as e.g. VerbGer.
The real work starts now. There are many ways to proceed, the main ones being
@@ -360,6 +474,7 @@
and dependences there are in your language, and you can now produce very much in the order you please.
+
The following develop-test cycle will
@@ -416,6 +531,7 @@
follow soon. (You will find out that these explanations involve a rational reconstruction of the live process! Among other things, the API was changed during the actual process to make it more intuitive.)
+
These modules will be written by you.
@@ -472,6 +588,7 @@
almost everything. This led in practice to the duplication of almost
all code on the lin and oper levels, and made the code
hard to understand and maintain.
The paradigms needed to implement
@@ -542,6 +659,7 @@
These constants are defined in terms of parameter types and constructors
in ResGer and MorphoGer, which modules are not
visible to the application grammarian.
An important difference between MorphoGer and
@@ -588,6 +706,7 @@
in her hidden definitions of constants in Paradigms. For instance,
-- mkAdv s = {s = s ; lock_Adv = <>} ;
The lexicon belonging to LangGer consists of two modules:
@@ -607,17 +726,20 @@
the coverage of the paradigms gets thereby tested and that the
use of the paradigms in LexiconGer gives a good set of examples for
those who want to build new lexica.
Detailed implementation tricks are found in the comments of each module.
+
It may be handy to provide a separate module of irregular
@@ -658,6 +784,7 @@
few hundred perhaps. Building such a lexicon separately also
makes it less important to cover everything by the
worst-case paradigms (mkV etc).
You can often find resources such as lists of
@@ -692,6 +819,7 @@
When using ready-made word lists, you should think about copyright issues. Ideally, all resource grammar material should be provided under the GNU General Public License.
+This is a cheap technique to build a lexicon of thousands
@@ -699,6 +827,7 @@
of words, if text data is available in digital format. See the Functional Morphology homepage for details.
+Sooner or later it will happen that the resource grammar API
@@ -707,6 +836,7 @@
that it does not include idiomatic expressions in a given language. The solution then is in the first place to build language-specific extension modules. This chapter will deal with this issue (to be completed).
+Above we have looked at how a resource implementation is built by
@@ -726,6 +856,7 @@
the Romance family (to be completed). Here is a set of slides on the topic.
+This is the most demanding form of resource grammar writing.
@@ -742,5 +873,5 @@
is constructed from the Finnish grammar through parametrization.
- +