diff --git a/resource-0.6/index.html b/resource-0.6/index.html new file mode 100644 index 000000000..db2fdec4d --- /dev/null +++ b/resource-0.6/index.html @@ -0,0 +1,490 @@ + + + + +
+ + +

The GF Resource Grammar Library

+ + +Aarne Ranta +2002-2004 + +

+ +Version 0.6: source package. + +

+ +Current languages: English, Finnish, French, German, Italian, Russian, Swedish. + +

+ + +News.
+ +10/8/2004 This document updated as a revision of the +old resource page. + +
+ +13/4/2004 Version 0.6 written using the module system of GF 2. Coverage has also +been extended. The files are placed in separate subdirectories (one +per language) and have different names than before, so that file names +(without the extension .gf) are also legal module names. +
+ +

+ + +Notice. You need GF Version 2.0beta or later +to work with these resource grammars. +It is available from the +GF home page. + + + +

+ + +

Introduction

+ +As programs in general can be divided into + +GF grammars can be divided into + +An application grammar is typically built around +a semantic model, which is formalized as the abstract +syntax of the language. Concrete syntax defines +a mapping from the abstract syntax into English or +Swedish or some other language. + +

+ +A resource grammar is not based on semantics, but its +purpose is to define the linguistic "surface" structures +of some language. The availability of these structures makes it easier to +write application grammars. + +

+ +With resource grammars, we aim to achieve division of labour in +grammar writing: +

+By using resource grammars, experts of application domains can take +linguistic details for granted. For instance, to +express the linearization of the arithmetical predicate even +in French, such an expert does not have to write +
+  lin Even x = {s =
+      table {
+        m => x.s ++ 
+             table {Ind  => "est" ;  Subj => "soit"} ! m ++
+             table {Masc => "pair" ; Fem  => "paire"} ! x.g
+      }
+    } ;
+
+but simply +
+  lin Even = predA1 (adjReg "pair") ;
+
+The author of the French resource grammar will have defined the +functions predA1 and adjReg in such a way that +they can be used in all applications. + +

+ +What is more, the resource grammar has a language-independent +API, which makes it possible to write the corresponding rule +for other languages in a very similar way. For instance, the +German rule is +

+  lin Even = predA1 (adjReg "gerade") ;
+
+ + + +

Coverage

+ +The ultimate goal of the resource grammar library is a full coverage of the linguistic +structures of each language. As of Version 0.6, we still have some way +to go to reach that goal. But we do have + + + +

Demo

+ +To get an idea of the coverage of the resource library, and also +to help find the right functions for your applications, you +can run +
+  make test
+  jgf TestAll.gfcm
+
+This opens the syntax editor with all seven resource grammars +extended with a small lexicon. + + + + +

Programmer's view on resource grammars

+ +The resource grammar library has a hierarchical structure. Its main layers are + +The core resources should not be needed by application grammarians: it should +be enough to use the core resource API and the derived libraries. If +this is not the case, the best solution is to extend the derived resource +libraries or create new ones. + + + +

Grammaticality guarantee via data abstraction

+ +An important principle is that + +This principle is simultaneously a guideline for resource grammarians +and an argument for the application grammarian to use these libraries. +What we mean by "only using the libraries" is that + +Thus for instance no records, tables, selections or projections should appear +in the rules. What we have achieved then is total data abstraction, +and the grammaticality guarantee can be given. + +

+ +Since the resource grammars are work in progress, their coverage is not +yet sufficient for complete data abstraction. In addition, there may of course +be bugs in the resource grammars that destroy grammaticality. The GF group is +grateful for bug reports, requests, and contributions! + +

+ +The most important exception to total data abstraction in practice is the +incompleteness of resource lexica. Since it is impossible to have +full coverage of all the words in a language, users often have to introduce +their own lexical entries, and thereby use literal strings in their GF code. +The safest and most convenient way of doing this is via functions +defined in ParadigmsX.gf files. Using these functions guarantees +that the lexical entries created are type-correct. But nothing guards +against misspelling a word, picking a wrong inflectional pattern, or +a wrong inherent feature (such as gender). + + + +

The resource grammar documentation in gfdoc

+ +All documented GF grammars linked from this page +have been written in GF and then translated to HTML +using a light-weight documentation tool, +gfdoc. The tool is available as a part of the GF +source code package, in the Haskell file +util/GFDoc.hs that can be run in the Hugs interpreter +by the script util/gfdoc. The program also has the +flag +latex, which produces output in LaTeX instead of +HTML. + + + +

The core resource API

+ +The API is divided into two modules, Combinations and +its extension Structural. + +

+ +The file Combinations.gf +gives the core resource type signatures of phrasal categories and +syntactic combination rules, together with some explanations +and examples. The examples are so far only in English, but their +equivalents are available in all of the languages for which the +API has been implemented. + +

+ +The file Structural.gf +gives a list of structural words such as determiners, pronouns, +prepositions, and conjunctions. + +

+ +The file Structural.gf cannot be imported directly, but only +via the generated files ResourceX.gf for each language X. +In these files, the fun/lin and cat/lincat judgements have been +translated into oper judgements. + + + +
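+ +As a rough illustration of what this translation amounts to, the following sketch shows a small resource module whose oper judgements correspond to cat/lincat and fun/lin pairs. The module name MiniReuse, the categories CN and NP, and the function DefPl are hypothetical and not taken from the library; the comments show the judgements that each oper is assumed to come from.

```gf
-- Hypothetical sketch; none of these names come from the resource library.
resource MiniReuse = {

  param Number = Sg | Pl ;

  -- from:  cat CN ;   lincat CN = {s : Number => Str} ;
  oper CN : Type = {s : Number => Str} ;

  -- from:  cat NP ;   lincat NP = {s : Str} ;
  oper NP : Type = {s : Str} ;

  -- from:  fun DefPl : CN -> NP ;
  --        lin DefPl cn = {s = "the" ++ cn.s ! Pl} ;
  oper DefPl : CN -> NP = \cn -> {s = "the" ++ cn.s ! Pl} ;

}
```

+In this form a category becomes a type synonym and a syntactic rule becomes an ordinary function, so an application grammar can open the module and call DefPl without referring to the record and table structure behind it.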

The lexical paradigm modules

+ +The lexical paradigm modules define, for +each lexical category, a worst-case macro for adding words +of that category by giving a sufficient number of characteristic +forms. In addition, the most common regular paradigms are +included, where it is enough just to give one form to generate +all the others. + +

+ +For example, the English paradigm module has the worst-case macro for nouns, +

+  mkN : (man,men,man's,men's : Str) -> Gender -> N ;
+
+taking four forms and a gender (human or nonhuman, +as is also explained in the module). Its application +
+  mkN "mouse" "mice" "mouse's" "mice's" nonhuman
+
+defines all information that is needed for the noun mouse. +There are also some regular patterns, for instance, +
+  nReg  : Str -> Gender -> N ;   -- dog, dogs
+  nKiss : Str -> Gender -> N ;   -- kiss, kisses
+
+examples of which are +
+  nReg "car" nonhuman
+  nKiss "waitress" human
+
+ +
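+ +To make the relation between the worst-case macro and the regular patterns concrete, here is a minimal sketch of how such paradigms could be defined. It is not the code of the English paradigm module: the module name ParadigmSketch, the parameter types, the constructors Hum and NoHum, and the internal record layout of N are all assumptions made for illustration. Only the type signatures of mkN, nReg and nKiss follow the text above.

```gf
-- Hypothetical sketch; the real paradigm module may be defined differently.
resource ParadigmSketch = {

  param Number = Sg | Pl ;
  param Case   = Nom | Gen ;
  param Gender = Hum | NoHum ;      -- assumed constructors

  oper human    : Gender = Hum ;
  oper nonhuman : Gender = NoHum ;

  -- assumed record type for nouns
  oper N : Type = {s : Number => Case => Str ; g : Gender} ;

  -- worst-case macro: all four forms are given explicitly
  oper mkN : (man,men,man's,men's : Str) -> Gender -> N =
    \man,men,mans,mens,g -> {
      s = table {
        Sg => table {Nom => man ; Gen => mans} ;
        Pl => table {Nom => men ; Gen => mens}
      } ;
      g = g
    } ;

  -- regular pattern: dog, dogs
  oper nReg : Str -> Gender -> N = \dog,g ->
    mkN dog (dog + "s") (dog + "'s") (dog + "s'") g ;

  -- regular pattern: kiss, kisses
  oper nKiss : Str -> Gender -> N = \kiss,g ->
    mkN kiss (kiss + "es") (kiss + "'s") (kiss + "es'") g ;

}
```

+Under these assumptions, nReg "car" nonhuman computes to the same record as mkN "car" "cars" "car's" "cars'" nonhuman, so the regular patterns add no new information, only convenience.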

+ +Here are the documented versions of the paradigm modules: +

+ + +

The derived resource libraries

+ +The core resource grammar is minimal in the sense that it defines the +smallest syntactic combinations and has no redundancy. For applications, it +is usually more convenient to use combinations of the minimal rules. +Some such combinations are given in the predication library, +which defines the simultaneous applications of one- and two-place +verbs and adjectives to all their argument noun phrases. It also +defines some other constructions useful for logical and mathematical +applications. + +

+ +The API of the predication library is in the file +Predication.gf. +What is imported is one of the language-dependent files, +X/PredicationX.gf for each language X. + + + + +

Linguist's view on resource grammars

+ +

GF and other grammar formalisms

+ +Linguists in particular might be interested in resource +grammars for their own sake, not as a basis for applications. +Since few linguists are so far familiar with GF, we refer to the +GF Homepage +and especially to the +GF Tutorial. +What comes here is a brief summary of the relation of GF to +other record-based formalisms. + +

+ +The records of GF are much like feature structures in PATR or HPSG. +The main differences are that +

+The latter difference explains why a GF record typically carries more +information than a feature structure. For instance, the record describing +the French noun cheval is +
+  {s = table {Sg => "cheval" ; Pl => "chevaux"} ; g = Masc} ;
+
+showing the full inflection table of the (abstract) noun cheval. +A PATR record +for the French word cheval would be +
+  {s = "cheval" ; n = Sg ; g = Masc} ;
+
+showing just the information that can be gathered from the (concrete) +string cheval. +There is a rather straightforward sense in which the PATR record is an +instance of the GF record. + +
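+ +The sense in which the PATR record is an instance of the GF record can be sketched in GF itself: selecting one number from the inflection table gives the data of the PATR-style record for that form. The module name ChevalSketch, the type name Noun and the function formOf are hypothetical names introduced only for this illustration.

```gf
-- Hypothetical sketch relating the two records shown above.
resource ChevalSketch = {

  param Number = Sg | Pl ;
  param Gender = Masc | Fem ;

  -- GF-style record: full inflection table plus inherent gender
  oper Noun : Type = {s : Number => Str ; g : Gender} ;

  oper cheval : Noun =
    {s = table {Sg => "cheval" ; Pl => "chevaux"} ; g = Masc} ;

  -- PATR-style record: one string together with its features
  oper formOf : Noun -> Number -> {s : Str ; n : Number ; g : Gender} =
    \noun,num -> {s = noun.s ! num ; n = num ; g = noun.g} ;

}
```

+With these definitions, formOf cheval Sg computes to {s = "cheval" ; n = Sg ; g = Masc}, which is the PATR record given above, and formOf cheval Pl gives the corresponding record for chevaux.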

+ +When generating language from syntax trees (or from logical formulas via +syntax trees), the record containing full inflection tables is an efficient +(linear-time) method of producing the correct forms. +This is important when text is generated in real time in +an interactive system. + + + +

The structure of core resource grammars

+ +As explained above, the application grammarian's view on resource grammars +is through API modules. They are collections of type signatures of functions. +It is the task of linguists to define these functions. +The definitions are in the end given +in the core resource grammars. + +

+ +We have divided the core resource grammar for each language X +into the following parts: +

+To get the most powerful resource grammar for each language, one can use +these files directly. + +

+ +However, the languages we have studied have so much in common +that we have gathered a considerable set of categories and rules +in a multilingual resource grammar. Its parts are +

+The advantage of using this API in application grammars is that +their concrete syntax looks the same for all languages +up to non-structural words. Thus it is possible to produce concrete syntaxes +for new languages while knowing almost nothing about them. +The abstract syntax serves as a common API to the core resource grammar. + + +

The code for the core resource grammars

+ +Each language has its resource code in a separate directory. +You can view the code as it is, or download it and run gfdoc +on each file. + + + +

Compiling and using the resource

+ +To compile the resource into reusable operations, for all languages, type +
+  make
+
+in the resource/ directory. +This requires that you have a recent version of GF (>= 2.0). +What you get is a set of files with names ResourceX.gfr, +ResourceX.gfc, ParadigmsX.gfr, and ParadigmsX.gfc. +You need never consult any of these files, +but only look into the documentation. + + + +

Examples of using the resource grammars

+ +

A test suite

+ +The grammars TestResourceX.gf define a few expressions of each +lexical category and make it possible to test linearization, parsing, +random generation, and editing. + + +

A database query language

+ +The grammars + +database/(Database | Restaurant)X.gf +make use of the resource. The RestaurantX.gf +grammars are just one possible application building on the generic +DatabaseX.gf grammars. +Notice that the +DatabaseX grammars are defined as instantiations of +the parametrized module DatabaseI. + + +
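+ +For readers unfamiliar with parametrized modules, the following toy example sketches the general pattern of instantiation. It does not reproduce the Database grammars: the modules Greet, Lex, LexEng, GreetI and GreetEng are hypothetical, and each module is assumed to live in its own file named after the module.

```gf
-- Hypothetical sketch of a parametrized module and its instantiation.

-- Greet.gf: abstract syntax shared by all languages
abstract Greet = {
  cat Greeting ;
  fun Hello : Greeting ;
}

-- Lex.gf: interface declaring what each language must provide
interface Lex = {
  oper hello_Str : Str ;
}

-- LexEng.gf: English instance of the interface
instance LexEng of Lex = {
  oper hello_Str = "hello" ;
}

-- GreetI.gf: parametrized (incomplete) concrete syntax, written once
incomplete concrete GreetI of Greet = open Lex in {
  lincat Greeting = {s : Str} ;
  lin Hello = {s = hello_Str} ;
}

-- GreetEng.gf: instantiation for English
concrete GreetEng of Greet = GreetI with (Lex = LexEng) ;
```

+The language-specific work is confined to the instance module, while the linearization rules in the parametrized module are shared by every instantiation.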

Functional morphology

+ +Even though GF is a useful language for describing syntax and semantics, it +is not the optimal choice for morphology. +One reason is the absence of low-level +programming, such as string matching. Another reason is efficiency. +In connection with the resource grammar project, we have started another +project, +functional morphology, +which uses Haskell to implement +morphology. Haskell morphologies can then be used for generating +GF morphologies. + + +

Further reading

+ +If you want to read an informal introduction to +resource grammars, see these +slides, written for a German computer science +audience. Or these +other slides, written for a Swedish +linguistic audience. + + + + +