From 1ce8ef0ba99dc4ebb21096d7e2594a902a042d41 Mon Sep 17 00:00:00 2001
From: aarne
@@ -63,7 +63,7 @@ Usability for different purposes
Often in NLP, a grammar is just high-level code for a parser.
The GF Resource Grammar Library Version 1.0
Author: Aarne Ranta <aarne (at) cs.chalmers.se>
-Last update: Wed Mar 8 09:47:07 2006
+Last update: Wed Mar 8 12:04:15 2006
Grammar as parser
+Not primarily code for a parser
-E.g. adjectival modification
+E.g. adjectival modification rule
AdjCN : AP -> CN -> CN ;
-
Rendering in different languages: concrete syntax
+
+  AdjCN (PositA even_A) (UseN number_N)
+
+  even number, even sums
+
+  jämnt tal, jämna summor
+
+  nombre pair, sommes paires
+
+
+Abstract away from inflection, agreement, word order.
+
Resource grammars have generation perspective, rather than parsing
@@ -314,7 +325,7 @@ The current GF Resource Project covers ten languages:
-In addition, parts (morphology) of Arabic, Estonian, Latin, and Urdu
+In addition, parts of Arabic, Estonian, Latin, and Urdu
API 1.0 not yet implemented for Danish and Russian
@@ -449,7 +460,7 @@ proper names, pronouns, determiners, possessives, cardinals and ordinals
-
-Initializing a lexicon with regXs is
+Initializing a lexicon with regX for every entry is
usually a good starting point in grammar development.
@@ -494,9 +505,9 @@ In Swedish, giving the gender of N improves a lot
There are also special constructs taking other forms:
- mk2N : (nyckel,nycklar : Str) -> N
+ mk2N : (nyckel,nycklar : Str) -> N
- mk1N : (bilarna : Str) -> N
+ mk1N : (bilarna : Str) -> N
irregV : (dricka, drack, druckit : Str) -> V
@@ -533,8 +544,13 @@ Irregular words in IrregX, e.g. Swedish:
draga_V : V =
- mkV (variants { "dra"; "draga"}) (variants { "drar" ; "drager"})
- (variants { "dra" ; "drag" }) "drog" "dragit" "dragen" ;
+ mkV
+ (variants { "dra" ; "draga"})
+ (variants { "drar" ; "drager"})
+ (variants { "dra" ; "drag" })
+ "drog"
+ "dragit"
+ "dragen" ;
Goal: eliminate the user's need of worst-case functions.
@@ -547,14 +563,18 @@ Goal: eliminate the user's need of worst-case functions.
Syntactic structures that are not shared by all languages.
+Alternative (and often more idiomatic) ways to say what is already covered by the API.
+
+Not implemented yet.
Candidates:
Nor post-possessives: bilen min
-Fre question forms: est-ce que tu dors ?
+bilen min
+est-ce que tu dors ?
+
@@ -600,7 +620,7 @@ files again. Just do some of
  gf -nocf -path=alltenses:prelude alltenses/LangSwe.gfc   -- Swedish only
-  gf -nocf -path=alltenses:prelude present/LangSwe.gfc   -- Swedish only, present tense only
+  gf -nocf -path=present:prelude present/LangSwe.gfc   -- Swedish in present tense only
@@ -608,10 +628,14 @@ files again. Just do some of
-The default parser does not work!
+The default parser does not work! (It is obsolete anyway.)
-The MCFG parser works in some languages, after waiting appr. 20 seconds
+The MCFG parser (the new standard) works in theory, but can
+in practice be too slow to build.
+
+
+But it does work in some languages, after waiting appr. 20 seconds
p -mcfg -lang=LangEng -cat=S "I would see her"
@@ -621,6 +645,14 @@ The MCFG parser works in some languages, after waiting appr. 20 seconds
Parsing in present/ versions is quicker.
+
+Remedies:
+
+
@@ -818,7 +850,7 @@ Problems:
-
Everything else is variations of this
@@ -842,15 +874,105 @@ Everything else is variations of this
-
This toy Latin grammar shows in a nutshell how the core can be implemented.
+
+ param
+ Number = Sg | Pl ;
+ Person = P1 | P2 | P3 ;
+ Tense = Pres | Past ;
+ Polarity = Pos | Neg ;
+ Case = Nom | Acc | Dat ;
+ Gender = Masc | Fem | Neutr ;
+ oper
+ Agr = {g : Gender ; n : Number ; p : Person} ; -- agreement features
+
+
-Use this API as a first approximation when designing the parameter system of a new
-language.
+
+
+ lincat
+ Cl = {
+ s : Tense => Polarity => Str
+ } ;
+ VP = {
+ verb : Tense => Polarity => Agr => Str ; -- finite verb
+ neg : Polarity => Str ; -- negation
+ compl : Agr => Str -- complement
+ } ;
+ V2 = {
+ s : Tense => Number => Person => Str ;
+ c : Case -- complement case
+ } ;
+ NP = {
+ s : Case => Str ;
+ a : Agr -- agreement features
+ } ;
+ CN = {
+ s : Number => Case => Str ;
+ g : Gender
+ } ;
+ Det = {
+ s : Gender => Case => Str ;
+ n : Number
+ } ;
+ AP = {
+ s : Gender => Number => Case => Str
+ } ;
+
+
++ +
+
+ lin
+ PredVP np vp = {
+ s = \\t,p =>
+ let
+ agr = np.a ;
+ subject = np.s ! Nom ;
+ object = vp.compl ! agr ;
+ verb = vp.neg ! p ++ vp.verb ! t ! p ! agr
+ in
+ subject ++ object ++ verb
+ } ;
+
+ ComplV2 v np = {
+ verb = \\t,p,a => v.s ! t ! a.n ! a.p ;
+ compl = \\_ => np.s ! v.c ;
+ neg = table {Pos => [] ; Neg => "non"}
+ } ;
+
+
++ +
+
+ DetCN det cn =
+ let
+ g = cn.g ;
+ n = det.n
+ in {
+ s = \\c => det.s ! g ! c ++ cn.s ! n ! c ;
+ a = {g = g ; n = n ; p = P3}
+ } ;
+
+ ModCN ap cn =
+ let
+ g = cn.g
+ in {
+ s = \\n,c => cn.s ! n ! c ++ ap.s ! g ! n ! c ;
+ g = g
+ } ;
+
+
@@ -886,6 +1008,6 @@ Exception: if you are working with a language-specific API extension,
you can work directly in that module.
-
+
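
The toy Latin grammar added in this patch gives only the parameter types, the linearization types and four combination rules. The sketch below shows one way those fragments could be assembled into a complete pair of GF modules. The module names (`ToyLatAbs`, `ToyLat`), the lexical entries (`love_V2`, `girl_CN`, `boy_CN`, `this_Det`, `good_AP`), the helper `mkForms`, and the choice of the Latin perfect for `Past` are all assumptions made for illustration and are not part of the patch. The `param`, `lincat` and rule bodies follow the patch.

```gf
-- File ToyLatAbs.gf (module and lexical names are assumptions, not from the patch)
abstract ToyLatAbs = {
  flags startcat = Cl ;
  cat Cl ; VP ; V2 ; NP ; CN ; Det ; AP ;
  fun
    PredVP  : NP -> VP -> Cl ;
    ComplV2 : V2 -> NP -> VP ;
    DetCN   : Det -> CN -> NP ;
    ModCN   : AP -> CN -> CN ;
    love_V2 : V2 ;            -- hypothetical lexicon from here on
    girl_CN, boy_CN : CN ;
    this_Det : Det ;
    good_AP : AP ;
}

-- File ToyLat.gf
concrete ToyLat of ToyLatAbs = {
  param
    Number = Sg | Pl ;
    Person = P1 | P2 | P3 ;
    Tense = Pres | Past ;
    Polarity = Pos | Neg ;
    Case = Nom | Acc | Dat ;
    Gender = Masc | Fem | Neutr ;
  oper
    Agr : Type = {g : Gender ; n : Number ; p : Person} ;  -- agreement features
    Forms : Type = Number => Case => Str ;
    -- helper (an assumption): a Number => Case table from six strings
    mkForms : (sn,sa,sd,pn,pa,pd : Str) -> Forms = \sn,sa,sd,pn,pa,pd ->
      table {
        Sg => table {Nom => sn ; Acc => sa ; Dat => sd} ;
        Pl => table {Nom => pn ; Acc => pa ; Dat => pd}
      } ;
  lincat
    Cl  = {s : Tense => Polarity => Str} ;
    VP  = {verb : Tense => Polarity => Agr => Str ;
           neg : Polarity => Str ; compl : Agr => Str} ;
    V2  = {s : Tense => Number => Person => Str ; c : Case} ;
    NP  = {s : Case => Str ; a : Agr} ;
    CN  = {s : Number => Case => Str ; g : Gender} ;
    Det = {s : Gender => Case => Str ; n : Number} ;
    AP  = {s : Gender => Number => Case => Str} ;
  lin
    -- the four combination rules, as given in the patch
    PredVP np vp = {
      s = \\t,p =>
        let
          agr = np.a ;
          subject = np.s ! Nom ;
          object = vp.compl ! agr ;
          verb = vp.neg ! p ++ vp.verb ! t ! p ! agr
        in
        subject ++ object ++ verb
    } ;
    ComplV2 v np = {
      verb = \\t,p,a => v.s ! t ! a.n ! a.p ;
      compl = \\_ => np.s ! v.c ;
      neg = table {Pos => [] ; Neg => "non"}
    } ;
    DetCN det cn =
      let g = cn.g ; n = det.n
      in {s = \\c => det.s ! g ! c ++ cn.s ! n ! c ;
          a = {g = g ; n = n ; p = P3}} ;
    ModCN ap cn =
      let g = cn.g
      in {s = \\n,c => cn.s ! n ! c ++ ap.s ! g ! n ! c ; g = g} ;

    -- hypothetical lexicon, just enough to exercise the rules
    girl_CN = {s = mkForms "puella" "puellam" "puellae" "puellae" "puellas" "puellis" ; g = Fem} ;
    boy_CN  = {s = mkForms "puer" "puerum" "puero" "pueri" "pueros" "pueris" ; g = Masc} ;
    this_Det = {
      s = table {
        Masc  => table {Nom => "hic"  ; Acc => "hunc" ; Dat => "huic"} ;
        Fem   => table {Nom => "haec" ; Acc => "hanc" ; Dat => "huic"} ;
        Neutr => table {Nom => "hoc"  ; Acc => "hoc"  ; Dat => "huic"}
      } ;
      n = Sg
    } ;
    good_AP = {
      s = table {
        Masc  => mkForms "bonus" "bonum" "bono"  "boni"  "bonos" "bonis" ;
        Fem   => mkForms "bona"  "bonam" "bonae" "bonae" "bonas" "bonis" ;
        Neutr => mkForms "bonum" "bonum" "bono"  "bona"  "bona"  "bonis"
      }
    } ;
    love_V2 = {
      s = table {
        Pres => table {
          Sg => table {P1 => "amo" ; P2 => "amas" ; P3 => "amat"} ;
          Pl => table {P1 => "amamus" ; P2 => "amatis" ; P3 => "amant"}
        } ;
        Past => table {       -- assumption: Latin perfect stands in for Past
          Sg => table {P1 => "amavi" ; P2 => "amavisti" ; P3 => "amavit"} ;
          Pl => table {P1 => "amavimus" ; P2 => "amavistis" ; P3 => "amaverunt"}
        }
      } ;
      c = Acc
    } ;
}
```

Under these assumptions, the `Pres`, `Pos` form of the tree `PredVP (DetCN this_Det girl_CN) (ComplV2 love_V2 (DetCN this_Det (ModCN good_AP boy_CN)))` works out to "haec puella hunc puerum bonum amat". The object takes the case stored in `love_V2.c`, the adjective agrees with `puer` through `cn.g`, and the verb agrees with the subject through `np.a`.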