Purpose
Background
Coverage
Structure
How to use
How to implement a new language
How to extend the API
High-level access to grammatical rules
E.g. "You have k new messages", rendered in each of the ten languages X:
render X (Have (You (Number (k (New Message)))))
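The idea behind `render X` can be sketched in a few lines of Python. This is only a toy model, not GF code: the class `YouHaveNewMessages`, the functions `lin_eng` and `lin_ita`, and the `CONCRETE` table are hypothetical names introduced for illustration, and only two of the ten languages are shown. The point is that one language-independent tree is shared, and each language contributes its own linearization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YouHaveNewMessages:
    """Toy abstract syntax tree: only the count k varies."""
    k: int


def lin_eng(tree: YouHaveNewMessages) -> str:
    # English: the noun agrees in number with k.
    noun = "message" if tree.k == 1 else "messages"
    return f"you have {tree.k} new {noun}"


def lin_ita(tree: YouHaveNewMessages) -> str:
    # Italian: adjective and noun both agree in number,
    # and the subject pronoun is dropped.
    phrase = "nuovo messaggio" if tree.k == 1 else "nuovi messaggi"
    return f"hai {tree.k} {phrase}"


# Hypothetical registry: language code -> linearization function.
CONCRETE = {"Eng": lin_eng, "Ita": lin_ita}


def render(lang: str, tree: YouHaveNewMessages) -> str:
    """Linearize one abstract tree in the chosen language."""
    return CONCRETE[lang](tree)


if __name__ == "__main__":
    for lang in CONCRETE:
        print(lang, render(lang, YouHaveNewMessages(3)))
```

Adding a language in this sketch means adding one entry to `CONCRETE`; the tree and the calling code stay unchanged.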
Usability for different purposes
Often in NLP, a grammar is just high-level code for a parser.
But writing a grammar can be inadequate for parsing:
Moreover, a grammar fine-tuned for parsing may not be reusable
Linguistic ontology: abstract syntax
E.g. adjectival modification
AdjCN : AP -> CN -> CN ;
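To make the split between abstract and concrete syntax more tangible, here is a toy Python model of how one rule of type `AP -> CN -> CN` might be realized in two languages. It is a sketch under simplifying assumptions, not the library's implementation: the names `adj_cn_eng`, `adj_cn_ita`, and `CNIta` are hypothetical, English nouns are modelled as tables over number only, and Italian adjectives are assumed to follow the noun and agree in gender and number.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# English: a CN is a table number -> string; an AP is a plain string.
CNEng = Dict[str, str]


def adj_cn_eng(ap: str, cn: CNEng) -> CNEng:
    # The adjective is invariant and precedes the noun.
    return {num: f"{ap} {form}" for num, form in cn.items()}


# Italian: a CN has an inherent gender and a table number -> string;
# an AP is a table (gender, number) -> string.
@dataclass(frozen=True)
class CNIta:
    gender: str
    forms: Dict[str, str]


APIta = Dict[Tuple[str, str], str]


def adj_cn_ita(ap: APIta, cn: CNIta) -> CNIta:
    # The adjective follows the noun and agrees with it.
    forms = {num: f"{form} {ap[(cn.gender, num)]}"
             for num, form in cn.forms.items()}
    return CNIta(cn.gender, forms)


house = {"sg": "house", "pl": "houses"}
casa = CNIta("fem", {"sg": "casa", "pl": "case"})
rosso = {("masc", "sg"): "rosso", ("masc", "pl"): "rossi",
         ("fem", "sg"): "rossa", ("fem", "pl"): "rosse"}

if __name__ == "__main__":
    print(adj_cn_eng("red", house)["pl"])
    print(adj_cn_ita(rosso, casa).forms["pl"])
```

In both languages the result is again a CN, so the rule can be applied repeatedly, as the type `AP -> CN -> CN` suggests.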
Rendering in different languages: concrete syntax
Resource grammars take a generation perspective rather than a parsing one
Division of labour: resource grammars hide linguistic details
Presentation: "school grammar" concepts, dictionary-like conventions
API = Application Programmer's Interface
Documentation: gfdoc
IDE = Interactive Development Environment (forthcoming)
Example-based grammar writing
render Ita (parse Eng "you have k messages")
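One reading of the line above is that the grammar writer supplies an example in a language they know, the example is parsed into a language-independent tree, and that tree is then rendered in the target language. The Python sketch below imitates this pipeline for a single sentence pattern. It is a toy under stated assumptions: `parse_eng` and `render_ita` are hypothetical helpers, the "parser" is one regular expression, and the tree is a plain tuple.

```python
import re
from typing import Tuple

# Toy abstract tree: (function name, count k).
Tree = Tuple[str, int]


def parse_eng(sentence: str) -> Tree:
    """Parse 'you have k message(s)' into an abstract tree."""
    match = re.fullmatch(r"you have (\d+) messages?", sentence)
    if match is None:
        raise ValueError(f"not covered by the toy grammar: {sentence!r}")
    return ("HaveMessages", int(match.group(1)))


def render_ita(tree: Tree) -> str:
    """Linearize the abstract tree in Italian."""
    fun, k = tree
    if fun != "HaveMessages":
        raise ValueError(f"unknown function: {fun}")
    noun = "messaggio" if k == 1 else "messaggi"
    return f"hai {k} {noun}"


if __name__ == "__main__":
    print(render_ita(parse_eng("you have 3 messages")))
```

The writer only needs to know English here; the Italian word order and agreement come from the Italian linearization.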
Linguistics
Computer science
2002: v. 0.2
2003: v. 0.6
2005: v. 0.9
2006: v. 1.0
Janna Khegai (Russian modules, forthcoming), Björn Bringert (many Swadesh lexica), Carlos Gonzalia (Spanish cardinals), Patrik Jansson (Swedish cardinals), Aarne Ranta.
We are grateful to several other people who have used this and previous versions of the resource library for their contributions and comments, including Ana Bove, David Burke, Lauri Carlson, Gloria Casanellas, Karin Cavallin, Hans-Joachim Daniels, Kristofer Johannisson, Anni Laine, Wanjiku Ng'ang'a, Jordi Saludes.
CLE (Core Language Engine, Book 1992)
Rosetta Machine Translation (Book 1994)
===Languages===
The current GF Resource Project covers ten languages:
Danish
English
Finnish
French
German
Italian
Norwegian (bokmål)
Russian
Spanish
Swedish
In addition, parts (morphology) of Arabic, Estonian, Latin, and Urdu are covered
API 1.0 not yet implemented for Danish and Russian
===Morphology===
Complete inflection engine
High-level access via ParadigmsX; e.g. Swedish:
mkV : (supa,super,sup,söp,supit,supen : Str) -> V ;
regV : (talar : Str) -> V ;
irregV : (dricka, drack, druckit : Str) -> V ;
IrregX:
draga_V : V =
  mkV (variants { "dra" ; "draga" }) (variants { "drar" ; "drager" })
      (variants { "dra" ; "drag" }) "drog" "dragit" "dragen" ;
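The layering of paradigms above (a worst-case `mkV` taking six forms, with `regV` and `irregV` deriving those forms from fewer) can be imitated in Python. This is a simplified sketch, not the ParadigmsSwe implementation: `mk_v`, `reg_v`, and `irreg_v` are hypothetical counterparts, `reg_v` only handles first-conjugation verbs whose present tense ends in "-ar", and `irreg_v` assumes an infinitive in "-a" and a supine in "-it".

```python
from typing import NamedTuple


class V(NamedTuple):
    """Six verb forms, in the argument order of mkV above."""
    inf: str      # supa
    pres: str     # super
    imper: str    # sup
    past: str     # söp
    supine: str   # supit
    part: str     # supen


def mk_v(inf: str, pres: str, imper: str,
         past: str, supine: str, part: str) -> V:
    # Worst case: every form is given explicitly.
    return V(inf, pres, imper, past, supine, part)


def reg_v(talar: str) -> V:
    # Assumption: first conjugation only (present tense in "-ar").
    if not talar.endswith("ar"):
        raise ValueError("toy reg_v only handles verbs in -ar")
    tala = talar[:-1]
    return mk_v(tala, talar, tala, tala + "de", tala + "t", tala + "d")


def irreg_v(dricka: str, drack: str, druckit: str) -> V:
    # Assumption: infinitive in "-a", supine in "-it" (strong verbs).
    if not (dricka.endswith("a") and druckit.endswith("it")):
        raise ValueError("toy irreg_v expects -a infinitive and -it supine")
    drick = dricka[:-1]
    drucken = druckit[:-2] + "en"
    return mk_v(dricka, drick + "er", drick, drack, druckit, drucken)


if __name__ == "__main__":
    print(reg_v("talar"))
    print(irreg_v("dricka", "drack", "druckit"))
```

The design choice mirrored here is that a lexicon writer supplies as few forms as the verb allows and falls back on the six-argument function only for genuinely irregular cases.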
67 categories
150 abstract syntax combination rules
100 structural words
350 content words in a test lexicon
Lines of source code (4/3/2006):
abstract      1131
english       2344
german        2386
finnish       3396
norwegian     1257
swedish       1465
scandinavian  1023
french        3246  -- Besch + Irreg + Morpho 2111
italian       7797  -- Besch 6512
spanish       7120  -- Besch 5877
romance       1066