From 8d571ffce44dc3d972a86121a18d4129cc4cb9d8 Mon Sep 17 00:00:00 2001
From: aarne
The purpose of this document is to tell how to implement the GF
resource grammar API for a new language. We will not cover how
@@ -69,7 +23,6 @@ in
The API is divided into a bunch of
The direct parents of the top will be called phrase category modules,
@@ -113,7 +65,6 @@ one of a small number of different types). Thus we have
Expressions of each phrase category are constructed in the corresponding
@@ -142,7 +93,6 @@ can skip the
What is lexical and what is syntactic is not as clearcut in GF as in
@@ -179,7 +129,6 @@ different languages on the level of a resource grammar. In other words,
application grammars are likely to use the resource in different ways for
different languages.
Among all categories and functions, a handful are
@@ -204,7 +153,6 @@ rules relate the categories to each other. It is intended to be a
first approximation when designing the parameter system of a new
language.
If you want to experiment with a small subset of the resource API first,
@@ -213,7 +161,6 @@ try out the module
explained in the
GF Tutorial.
Some lines in the resource library are suffixed with the comment
@@ -229,9 +176,7 @@ implementation. To compile a grammar with present-tense-only, use
i -preproc=GF/lib/resource-1.0/mkPresent LangGer.gf
Unless you are writing an instance of a parametrized implementation
@@ -317,7 +262,6 @@ as e.g.
The real work starts now. There are many ways to proceed, the main ones being
@@ -416,7 +360,6 @@ and dependences there are in your language, and you can now produce very
much in the order you please.
-
The following develop-test cycle will
@@ -473,7 +416,6 @@ follow soon. (You will find out that these explanations involve
a rational reconstruction of the live process! Among other things, the
API was changed during the actual process to make it more intuitive.)
These modules will be written by you.
@@ -492,8 +434,9 @@ package.
Resource grammar writing HOWTO
Author: Aarne Ranta <aarne (at) cs.chalmers.se>
-Last update: Wed Mar 1 16:52:09 2006
+Last update: Fri May 26 17:36:48 2006
-
-
-
-
-
-
-
GF/lib/resource-1.0/. See the
resource-1.0/README for
details on how this differs from previous versions.
The resource grammar API
abstract modules.
@@ -88,7 +41,6 @@ to which all the other modules conform, so that e.g. NP means
the same thing in those modules that use NPs and those that
construct them.
Phrase category modules
Idiom: idiomatic phrases such as existentials
-
Infrastructure modules
lincat definition of a category and use the default
{s : Str} until you need to change it to something else. In
English, for instance, many categories do have this linearization type.
Lexical modules
The core of the syntax
Another reduced API
The present-tense fragment
Phases of the work
-
Putting up a directory
VerbGer.
Direction of work
The develop-test cycle
Resource modules used
ParamX: parameter types used in many languages
-CommonX: implementation of the categories $Text$ and $Phr$, as well as of
- the logical tense, anteriority, and polarity parameters
+CommonX: implementation of language-uniform categories
+ such as $Text$ and $Phr$, as well as of
+ the logical tense, anteriority, and polarity parameters
Coordination: operations to deal with lists and coordination
Prelude: general-purpose operations on strings, records,
truth values, etc.
@@ -529,7 +472,6 @@ almost everything. This led in practice to the duplication of almost
all code on the lin and oper levels, and made the code
hard to understand and maintain.
The paradigms needed to implement
@@ -600,7 +542,6 @@ These constants are defined in terms of parameter types and constructors
in ResGer and MorphoGer, which modules are not
visible to the application grammarian.
An important difference between MorphoGer and
@@ -611,8 +552,8 @@ record types in a resource module, such as ParadigmsGer,
a lock field is added to the record, so that categories
with the same implementation are not confused with each other.
(This is inspired by the newtype discipline in Haskell.)
-For instance, the lincats of adverbs and conjunctions may be the same
-in CatGer:
+For instance, the lincats of adverbs and conjunctions are the same
+in CommonX (and therefore in CatGer, which inherits it):
lincat Adv = {s : Str} ;
@@ -647,7 +588,6 @@ in her hidden definitions of constants in Paradigms. For instance,
-- mkAdv s = {s = s ; lock_Adv = <>} ;
-
The lexicon belonging to LangGer consists of two modules:
@@ -667,20 +607,17 @@ the coverage of the paradigms gets thereby tested and that the
use of the paradigms in LexiconGer gives a good set of examples for
those who want to build new lexica.
Detailed implementation tricks are found in the comments of each module.
-
It may be handy to provide a separate module of irregular
@@ -725,7 +658,6 @@ few hundred perhaps. Building such a lexicon separately also
makes it less important to cover everything by the
worst-case paradigms (mkV etc).
You can often find resources such as lists of
@@ -760,7 +692,6 @@ When using ready-made word lists, you should think about copyright issues.
Ideally, all resource grammar material should be provided under
GNU General Public License.
-
This is a cheap technique to build a lexicon of thousands
@@ -768,7 +699,6 @@ of words, if text data is available in digital format. See the
Functional Morphology homepage for details.
-
Sooner or later it will happen that the resource grammar API
@@ -777,7 +707,6 @@ that it does not include idiomatic expressions in a given language.
The solution then is in the first place to build language-specific
extension modules. This chapter will deal with this issue (to be completed).
-
Above we have looked at how a resource implementation is built by
@@ -797,7 +726,6 @@ the Romance family (to be completed). Here is a set of
slides on the topic.
-
This is the most demanding form of resource grammar writing.
@@ -813,6 +741,6 @@ This chapter will work out an example of how an Estonian grammar
is constructed from the Finnish grammar through parametrization.
-
-
+
+