working on resource doc and exx, fixing bugs

aarne
2005-02-18 13:53:29 +00:00
parent b7ced424be
commit e4f6d7e913
20 changed files with 621 additions and 170 deletions

@@ -6,11 +6,11 @@
<h1>Grammatical Framework Version 2</h1>
Highlights, versions 2.0 and 2.1
Highlights, versions 2.0, 2.1, and 2.2
<p>
13/10/2003 - 25/11 - 2/4/2004 - 18/6 - 13/10
13/10/2003 - 25/11 - 2/4/2004 - 18/6 - 13/10 - 16/2/2005
<p>
@@ -24,7 +24,7 @@ Highlights, versions 2.0 and 2.1
An accurate <a href="DocGF.pdf">language specification</a> is now available.
<h2>Summary of novelties</h2>
<h2>Summary of novelties in Versions 2.0 to 2.2</h2>
<h4>Module system</h4>
@@ -35,18 +35,20 @@ An accurate <a href="DocGF.pdf">language specification</a> is now available.
<li> Hierarchic structure (single inheritance <tt>**</tt>) +
cross-cutting reuse (<tt>open</tt>)
<li> Separate compilation, one module per file
<li> Reuse of <tt>abstract</tt>+<tt>concrete</tt> as <tt>resource</tt>
<li> Reuse of <tt>abstract</tt>+<tt>concrete</tt> as <tt>resource</tt><br>
<b>Version 2.2</b>: separate <tt>reuse</tt> modules no longer needed
<li> Parametrized modules:
<tt>interface</tt>, <tt>instance</tt>, <tt>incomplete</tt>.
<li> New experimental module types: <tt>transfer</tt>,
<tt>union</tt>.
<li> <b>Version 2.1</b>: multiple inheritance in module extension.
<li> Version 2.1: multiple inheritance in module extension.
<h4>Canonical format GFC</h4>
<li> The target of GF compiler; to reuse, just read in.
<li> Readable by Haskell/Java/C++/C applications.
<li> <b>Version 2.1</b>: Java interpreter available for GFC (by Björn Bringert).
<li> Version 2.1: Java interpreter available for GFC (by Björn Bringert).
<li> <b>Version 2.2</b>: new optimizations to reduce the size of GFC files
<h4>New features in expression language</h4>
@@ -59,7 +61,7 @@ An accurate <a href="DocGF.pdf">language specification</a> is now available.
braces and <tt>where</tt>.
<li> Pattern variables can be used on lhs's of <tt>oper</tt> definitions.
<li> New Unicode transliterations (by Harald Hammarström).
<li> <b>Version 2.1</b>: Initial segments of integers
<li> Version 2.1: Initial segments of integers
(<tt>Ints</tt><i>n</i>) available as parameter types.
@@ -78,6 +80,8 @@ An accurate <a href="DocGF.pdf">language specification</a> is now available.
<li> <tt>pm</tt> = <tt>print_multi</tt> prints the multilingual
grammar resident in the current state to a ready-compiled
<tt>.gfcm</tt> file.
<li> <b>Version 2.2</b>: several new command options
<li> <b>Version 2.2</b>: <tt>vg</tt> visualizes the module dependency graph
<li> All commands have both long and short names (see help). Short
names are easier to type, whereas long names
make scripts more readable.
@@ -89,6 +93,7 @@ An accurate <a href="DocGF.pdf">language specification</a> is now available.
<li> Active text field: click the middle button in the focus to send
in refinement through the parser.
<li> Clipboard: copy complex terms into the refine menu.
<li> <b>Version 2.2</b>: text corresponding to subtrees with constraints is marked in red
<h4>Improved implementation</h4>
@@ -99,6 +104,10 @@ An accurate <a href="DocGF.pdf">language specification</a> is now available.
<li> Lexical rules sorted out by option <tt>-cflexer</tt> for efficient
parsing with large lexica.
<li> GHC optimizations and strictness flags are used for improving performance.
<li> <b>Version 2.2</b>: started <a
href="http://www.haskell.org/haddock">haddock</a> documentation
by using uniform module headers
<h4>New parser (work in progress)</h4>
@@ -106,131 +115,12 @@ An accurate <a href="DocGF.pdf">language specification</a> is now available.
<li> By Peter Ljunglöf, based on MCFG.
<li> Much more efficient for morphology and discontinuous constituents.
<li> Treatment of cyclic rules.
<li> <b>Version 2.1</b>: improved generation of speech recognition
<li> Version 2.1: improved generation of speech recognition
grammars (by Björn Bringert).
<li> <b>Version 2.1</b>: output of Labelled BNF files readable by the
<li> Version 2.1: output of Labelled BNF files readable by the
BNF Converter.
<!-- NEW -->
<h2>Missing features of GF 1.2 (13/10/2004)</h2>
Generally, GF1 grammars can be automatically translated to GF2, although the
result is not as good
as a manual translation, since indentation and comments are destroyed.
The results can be
saved in GF2 files, but this is not necessary.
Some rarely used GF1 features are no longer supported (see next section).
It is also possible to write a GF2 grammar back to GF1, with the
command <tt>pg -printer=old</tt>.
<p>
Resource libraries
and some example grammars have been
converted. Most old example grammars work without any changes.
However, there is a new resource API with
many new constructions, and which is recommended.
<p>
Soundness checking of module dependencies and completeness is not
complete. This means that some errors may show up too late.
<p>
Latex and XML printing of grammars do not work yet.
<!-- NEW -->
<h2>How to use GF 1.* files</h2>
Backward compatibility with respect to old GF grammars has been
a central goal. All GF grammars, from version 0.9, should work in
the old way in GF2. The main exceptions are some features that
are rarely used.
<ul>
<li> The <tt>package</tt> system introduced in GF 1.2, cannot be
interpreted in the module system of GF 2.0, since packages are in
mutual scope with the top level.
<li> <tt>tokenizer</tt> pragmas cannot be parsed any more. In GF
1.2, they are already replaced by <tt>lexer</tt> flags.
<li> <tt>var</tt> pragmas cannot be parsed any more.
</ul>
<p>
Very old GF grammars (from versions before 0.9), with the completely
different notation, do not work. They should be first converted to
GF1 by using GF version 1.2.
<p>
The import command <tt>i</tt> can be given the option <tt>-old</tt>. E.g.
<pre>
i -old tut1.Eng.gf
</pre>
But this is no longer necessary: GF2 detects automatically if a grammar
is in the GF1 format.
<p>
Importing a set of GF2 files generates, internally, three modules:
<pre>
abstract tut1 = ...
resource ResEng = ...
concrete Eng of tut1 = open ResEng in ...
</pre>
(The names are different if the file name has fewer parts.)
<p>
The option <tt>-o</tt> causes GF2 to write these modules into files.
<p>
The flags <tt>-abs</tt>, <tt>-cnc</tt>, and <tt>-res</tt> can be used
to give custom names to the modules. In particular, it is good to use
the <tt>-abs</tt> flag to guarantee that the abstract syntax module
has the same name for all grammars in a multilingual environment:
<pre>
i -old -abs=Numerals hungarian.gf
i -old -abs=Numerals tamil.gf
i -old -abs=Numerals sanskrit.gf
</pre>
<p>
The same flags as in the import command can be used when invoking
GF2 from the system shell. Many grammars can be imported on the same command
line, e.g.
<pre>
% gf2 -old -abs=Tutorial tut1.Eng.gf tut1.Fin.gf tut1.Fra.gf
</pre>
<p>
To write a GF2 grammar back to GF1 (as one big file), use the command
<pre>
> pg -old
</pre>
<p>
GF2 has more reserved words than GF 1.2. When old files are read, a preprocessor
replaces every identifier that has the shape of a new reserved word
with a variant where the last letter is replaced by <tt>Z</tt>, e.g.
<tt>instance</tt> is replaced by <tt>instancZ</tt>. This method is of course
unsafe and should be replaced by something better.
<!-- NEW -->
@@ -404,6 +294,54 @@ To force compilation:
when testing operations with the <tt>cc</tt> command.
</ul>
<!-- NEW -->
<h3>Compiler optimizations</h3>
<b>Version 2.2</b>
<p>
The sometimes exploding size of generated <tt>gfc</tt> and
<tt>gfr</tt> files has made it urgent to find optimizations
that reduce the size of the code. There are five
combinations of optimizations to choose from, given as the value of the
<tt>optimize</tt> flag:
<ul>
<li> <tt>share</tt>: group tables so that common branch values are shared
by the use of disjunctive patterns.
<li> <tt>parametrize</tt>: if table branches differ at most at the
occurrence of the pattern, replace the expanded table by a one-branch
table with a variable. If this fails, perform <tt>share</tt>.
<li> <tt>values</tt>: only show the values of table branches, not the
patterns.
<li> <tt>all</tt>: try <tt>parametrize</tt>; if this fails, do <tt>values</tt>.
<li> <tt>none</tt>: don't do any optimizations
</ul>
The <tt>share</tt> and <tt>parametrize</tt> optimizations are always
beneficial, whereas the <tt>values</tt> optimization may slow down the
use of the table. However, it is very good for grammars mostly consisting
of the inflection tables of lexical items: it can reduce the file size
by a factor of 4.
<p>
An optimization can be selected individually for each
<tt>resource</tt> and <tt>concrete</tt> module by including
the judgement
<pre>
flags optimize=(share|parametrize|values|all|none) ;
</pre>
in the module body. These flags can be overridden by a flag given
in the <tt>i</tt> command, e.g.
<pre>
i -src -optimize=none Foo.gf
</pre>
Notice that the option <tt>-src</tt> is needed if there already are
generated files created with other optimization flags.
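<p>
As a rough illustration of the idea behind <tt>share</tt>, the following Haskell sketch groups the branches of a table by their value, so that all patterns with a common value end up in one disjunctive branch. The names <tt>Pattern</tt>, <tt>Value</tt> and <tt>shareBranches</tt> are hypothetical and chosen for this example only; they are not taken from the GF compiler, whose actual implementation may differ.

```haskell
import Data.List (nub)

-- A toy table: each branch maps one parameter pattern to one value.
type Pattern = String
type Value   = String

-- Group the patterns that share a value, in order of first occurrence.
-- Each result pair stands for one branch with a disjunctive pattern
-- p1 | p2 | ... => value.
shareBranches :: [(Pattern, Value)] -> [([Pattern], Value)]
shareBranches bs =
  [ ([p | (p, v') <- bs, v' == v], v) | v <- nub (map snd bs) ]

-- A small verb table in which several forms coincide.
walkTable :: [(Pattern, Value)]
walkTable =
  [ ("Inf",      "walk")
  , ("PresSg3",  "walks")
  , ("PresPl",   "walk")
  , ("Past",     "walked")
  , ("PastPart", "walked")
  ]
```

Applied to <tt>walkTable</tt>, the five branches collapse into three, which is where the size reduction comes from when many forms coincide.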
<!-- NEW -->
<h2>Module search paths</h2>
@@ -429,7 +367,124 @@ places:
</ul>
A flag set on a command line overrides ones set in files.
<!-- NEW -->
<h2>How to use GF 1.* files</h2>
Backward compatibility with respect to old GF grammars has been
a central goal. All GF grammars, from version 0.9, should work in
the old way in GF2. The main exceptions are some features that
are rarely used.
<ul>
<li> The <tt>package</tt> system introduced in GF 1.2, cannot be
interpreted in the module system of GF 2.0, since packages are in
mutual scope with the top level.
<li> <tt>tokenizer</tt> pragmas cannot be parsed any more. In GF
1.2, they are already replaced by <tt>lexer</tt> flags.
<li> <tt>var</tt> pragmas cannot be parsed any more.
</ul>
<p>
Very old GF grammars (from versions before 0.9), with the completely
different notation, do not work. They should be first converted to
GF1 by using GF version 1.2.
<p>
The import command <tt>i</tt> can be given the option <tt>-old</tt>. E.g.
<pre>
i -old tut1.Eng.gf
</pre>
But this is no longer necessary: GF2 detects automatically if a grammar
is in the GF1 format.
<p>
Importing a set of GF2 files generates, internally, three modules:
<pre>
abstract tut1 = ...
resource ResEng = ...
concrete Eng of tut1 = open ResEng in ...
</pre>
(The names are different if the file name has fewer parts.)
<p>
The option <tt>-o</tt> causes GF2 to write these modules into files.
<p>
The flags <tt>-abs</tt>, <tt>-cnc</tt>, and <tt>-res</tt> can be used
to give custom names to the modules. In particular, it is good to use
the <tt>-abs</tt> flag to guarantee that the abstract syntax module
has the same name for all grammars in a multilingual environment:
<pre>
i -old -abs=Numerals hungarian.gf
i -old -abs=Numerals tamil.gf
i -old -abs=Numerals sanskrit.gf
</pre>
<p>
The same flags as in the import command can be used when invoking
GF2 from the system shell. Many grammars can be imported on the same command
line, e.g.
<pre>
% gf2 -old -abs=Tutorial tut1.Eng.gf tut1.Fin.gf tut1.Fra.gf
</pre>
<p>
To write a GF2 grammar back to GF1 (as one big file), use the command
<pre>
> pg -old
</pre>
<p>
GF2 has more reserved words than GF 1.2. When old files are read, a preprocessor
replaces every identifier that has the shape of a new reserved word
with a variant where the last letter is replaced by <tt>Z</tt>, e.g.
<tt>instance</tt> is replaced by <tt>instancZ</tt>. This method is of course
unsafe and should be replaced by something better.
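<p>
The renaming described above can be sketched in Haskell as follows. This is a minimal illustration, not the preprocessor used in GF2: the function name <tt>renameReserved</tt> is hypothetical, and the list of reserved words is passed in as a parameter because the text names only <tt>instance</tt> as an example.

```haskell
-- Replace the last letter of an identifier by 'Z' if the identifier
-- coincides with a reserved word; leave all other identifiers unchanged.
renameReserved :: [String] -> String -> String
renameReserved reserved ident
  | not (null ident) && ident `elem` reserved = init ident ++ "Z"
  | otherwise                                 = ident
```

The sketch also shows why the method is unsafe: if an old grammar already uses the identifier <tt>instancZ</tt>, it becomes indistinguishable from a renamed <tt>instance</tt>.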
<!-- NEW -->
<h2>Missing features of GF 1.2 (13/10/2004)</h2>
Generally, GF1 grammars can be automatically translated to GF2, although the
result is not as good
as a manual translation, since indentation and comments are destroyed.
The results can be
saved in GF2 files, but this is not necessary.
Some rarely used GF1 features are no longer supported (see next section).
It is also possible to write a GF2 grammar back to GF1, with the
command <tt>pg -printer=old</tt>.
<p>
Resource libraries
and some example grammars have been
converted. Most old example grammars work without any changes.
However, there is a new resource API with
many new constructions, and which is recommended.
<p>
Soundness checking of module dependencies and completeness is not
complete. This means that some errors may show up too late.
<p>
Latex and XML printing of grammars do not work yet.
</body>
</html>

@@ -1 +1,3 @@
--# -path=.:../../prelude
abstract Resource = Rules, Clause, Structural ** {} ;

@@ -6,17 +6,27 @@ htmls:
gfdoc:
gfdoc ../abstract/Categories.gf ; mv ../abstract/Categories.html .
gfdoc ../abstract/Rules.gf ; mv ../abstract/Rules.html .
gfdoc ../abstract/Verbphrase.gf ; mv ../abstract/Verbphrase.html .
gfdoc ../abstract/Clause.gf ; mv ../abstract/Clause.html .
gfdoc ../abstract/Structural.gf ; mv ../abstract/Structural.html .
gfdoc ../abstract/Basic.gf ; mv ../abstract/Basic.html .
gfdoc ../abstract/Time.gf ; mv ../abstract/Time.html .
gfdoc ../abstract/Lang.gf ; mv ../abstract/Lang.html .
gfdoc ../swedish/ParadigmsSwe.gf ; mv ../swedish/ParadigmsSwe.html .
gfdoc ../swedish/VerbsSwe.gf ; mv ../swedish/VerbsSwe.html .
gfdoc ../swedish/BasicSwe.gf ; mv ../swedish/BasicSwe.html .
gfdoc ../english/ParadigmsEng.gf ; mv ../english/ParadigmsEng.html .
gfdoc ../english/VerbsEng.gf ; mv ../english/VerbsEng.html .
gfdoc ../english/BasicEng.gf ; mv ../english/BasicEng.html .
gfdoc ../french/ParadigmsFre.gf ; mv ../french/ParadigmsFre.html .
gfdoc ../french/VerbsFre.gf ; mv ../french/VerbsFre.html .
gfdoc ../french/BasicFre.gf ; mv ../french/BasicFre.html .
gifs: lang scand low
gifs: api lang scand low
api:
# echo "pm -printer=graph | wf Resource.dot" | gf ../abstract/Resource.gf
dot -Tgif ResourceVP.dot>Resource.gif
lang:
echo "pm -printer=graph | wf Lang.dot" | gf ../abstract/Lang.gf

@@ -0,0 +1,28 @@
digraph {
Verbphrase [style = "solid", shape = "ellipse", URL = "Verbphrase.gf"];
Verbphrase -> Categories [style = "solid"];
Resource [style = "solid", shape = "ellipse", URL = "Resource.gf"];
Resource -> Rules [style = "solid"];
Resource -> Clause [style = "solid"];
Resource -> Structural [style = "solid"];
Rules [style = "solid", shape = "ellipse", URL = "Rules.gf"];
Rules -> Categories [style = "solid"];
Clause [style = "solid", shape = "ellipse", URL = "Clause.gf"];
Clause -> Categories [style = "solid"];
Structural [style = "solid", shape = "ellipse", URL = "Structural.gf"];
Structural -> Categories [style = "solid"];
Structural -> Numerals [style = "solid"];
Categories [style = "solid", shape = "ellipse", URL = "Categories.gf"];
Categories -> PredefAbs [style = "solid"];
PredefAbs [style = "solid", shape = "ellipse", URL = "PredefAbs.gf"];
Numerals [style = "solid", shape = "ellipse", URL = "Numerals.gf"];
}

@@ -0,0 +1,12 @@
abstract Animals = {
cat
Phrase ; Animal ; Action ;
fun
Who : Action -> Animal -> Phrase ;
Whom : Animal -> Action -> Phrase ;
Answer : Animal -> Action -> Animal -> Phrase ;
Dog, Cat, Mouse, Lion, Zebra : Animal ;
Chase, Eat, Like : Action ;
}

@@ -0,0 +1,37 @@
--# -path=.:resource/english:resource/abstract:resource/../prelude
concrete AnimalsEng of Animals = open ResourceEng, ParadigmsEng, VerbsEng in {
lincat
Phrase = Phr ;
Animal = N ;
Action = V2 ;
lin
Who act obj = QuestPhrase (UseQCl (PosTP TPresent ASimul)
(QPredV2 who8one_IP act (IndefNumNP NoNum (UseN obj)))) ;
Whom subj act = QuestPhrase (UseQCl (PosTP TPresent ASimul)
(IntSlash who8one_IP (SlashV2 (DefOneNP (UseN subj)) act))) ;
Answer subj act obj = IndicPhrase (UseCl (PosTP TPresent ASimul)
(SPredV2 (DefOneNP (UseN subj)) act (IndefNumNP NoNum (UseN obj)))) ;
Dog = regN "dog" ;
Cat = regN "cat" ;
Mouse = mk2N "mouse" "mice" ;
Lion = regN "lion" ;
Zebra = regN "zebra" ;
Chase = dirV2 (regV "chase") ;
Eat = dirV2 (eat_V ** {lock_V = <>}) ;
Like = dirV2 (regV "like") ;
}
{-
> p -cat=Phr "who likes cars ?"
QuestPhrase (UseQCl (PosTP TPresent ASimul) (QPredV2 who8one_IP like_V2 (IndefNumNP NoNum (UseN car_N))))
QuestPhrase (UseQCl (PosTP TPresent ASimul) (IntSlash who8one_IP (SlashV2 (DefOneNP (UseN car_N)) like_V2)))
> p -cat=Phr "the house likes cars ."
IndicPhrase (UseCl (PosTP TPresent ASimul) (SPredV2 (DefOneNP (UseN house_N)) like_V2 (IndefNumNP NoNum (UseN car_N))))
-}

@@ -0,0 +1,23 @@
--# -path=.:resource/french:resource/romance:resource/abstract:resource/../prelude
concrete AnimalsFre of Animals = open ResourceFre, ParadigmsFre, VerbsFre in {
lincat
Phrase = Phr ;
Animal = N ;
Action = V2 ;
lin
Who act obj = QuestPhrase (UseQCl (PosTP TPresent ASimul)
(QPredV2 who8one_IP act (IndefNumNP NoNum (UseN obj)))) ;
Whom subj act = QuestPhrase (UseQCl (PosTP TPresent ASimul)
(IntSlash who8one_IP (SlashV2 (DefOneNP (UseN subj)) act))) ;
Answer subj act obj = IndicPhrase (UseCl (PosTP TPresent ASimul)
(SPredV2 (DefOneNP (UseN subj)) act (IndefNumNP NoNum (UseN obj)))) ;
Dog = regN "chien" masculine ;
Cat = regN "chat" masculine ;
Mouse = regN "souris" feminine ;
Lion = regN "lion" masculine ;
Zebra = regN "zèbre" masculine ;
Chase = dirV2 (regV "chasser") ;
Eat = dirV2 (regV "manger") ;
Like = dirV2 (regV "aimer") ;
}

@@ -0,0 +1,23 @@
--# -path=.:resource/swedish:resource/scandinavian:resource/abstract:resource/../prelude
concrete AnimalsSwe of Animals = open ResourceSwe, ParadigmsSwe, VerbsSwe in {
lincat
Phrase = Phr ;
Animal = N ;
Action = V2 ;
lin
Who act obj = QuestPhrase (UseQCl (PosTP TPresent ASimul)
(QPredV2 who8one_IP act (IndefNumNP NoNum (UseN obj)))) ;
Whom subj act = QuestPhrase (UseQCl (PosTP TPresent ASimul)
(IntSlash who8one_IP (SlashV2 (DefOneNP (UseN subj)) act))) ;
Answer subj act obj = IndicPhrase (UseCl (PosTP TPresent ASimul)
(SPredV2 (DefOneNP (UseN subj)) act (IndefNumNP NoNum (UseN obj)))) ;
Dog = regN "hund" utrum ;
Cat = mk2N "katt" "katter" ;
Mouse = mkN "mus" "musen" "möss" "mössen" ;
Lion = mk2N "lejon" "lejon" ;
Zebra = regN "zebra" utrum ;
Chase = dirV2 (regV "jaga") ;
Eat = dirV2 äta_V ;
Like = mkV2 (mk2V "tycka" "tycker") "om" ;
}

@@ -9,7 +9,9 @@
<p>
<b>First Draft, Gothenburg, 7 February 2005</b>
Second Version, Gothenburg, 18 February 2005
<br>
First Draft, Gothenburg, 7 February 2005
</p><p>
@@ -23,18 +25,124 @@ Aarne Ranta
<!-- NEW -->
<h2>The purpose of the resource grammar library</h2>
<h2>GF = Grammatical Framework</h2>
Basic syntactic structures
A grammar formalism based on functional programming and type theory.
<p>
Designed to be nice for <i>ordinary programmers</i> to use.
<p>
Mission: to make natural-language applications available for
ordinary programmers, in tasks like
<ul>
<li> software documentation
<li> domain-specific translation
<li> human-computer interaction
<li> dialogue systems
</ul>
Thus <i>not</i> primarily another theoretical framework for
linguists.
<!-- NEW -->
<h2>Language + Libraries</h2>
Writing natural language grammars still requires
theoretical knowledge about the language.
<p>
Which kind of programmer is easier to find?
<ul>
<li> one who can write a sorting algorithm
<li> one who can write a grammar for Swedish determiners
</ul>
<p>
In main-stream programming, sorting algorithms are not
written by hand but taken from <b>libraries</b>.
<p>
In the same way, we want to create grammar libraries that encapsulate
basic linguistic facts.
<p>
Cf. the Java success story: the language is only half of the
success; the libraries are the other half.
<!-- NEW -->
<h2>Example of library-based grammar writing</h2>
To define Swedish definite phrases from scratch:
<pre>
</pre>
To use a library function for Swedish definite phrases:
<pre>
</pre>
<!-- NEW -->
<h2>Questions in grammar library design</h2>
What should there be in the library?
<br>
<li> morphology, lexicon, syntax, semantics,...
<p>
How do we organize and present the library?
<br>
<li> division into modules, level of granularity
<br>
<li> "school grammar" vs. sophisticated linguistic concepts
<p>
Where do we get the data from?
<br>
<li> automatic extraction or hand-writing?
<br>
<li> reuse of existing resources?
<p>
Extra constraint: we want open-source free software.
<!-- NEW -->
<h2>The scope of the resource grammar library</h2>
All morphological paradigms
<p>
Basic lexicon of structural, common, and irregular words
<p>
Basic syntactic structures
<p>
Currently <i>no</i> semantics,
<i>no</i> language-specific structures if not
necessary for expressivity.
<!-- NEW -->
<h2>Success criteria</h2>
@@ -43,17 +151,23 @@ Grammatical correctness
<p>
Semantic coverage
Semantic coverage: you can express whatever you want.
<p>
Usability as library for non-linguists
Usability as library for non-linguists.
<p>
(Bonus for linguists:) nice generalizations w.r.t. language
families, using the module system of GF.
<!-- NEW -->
<h2>These are not success criteria</h2>
<h2>These are not our success criteria</h2>
Language coverage
Language coverage: you can parse all expressions.
<p>
@@ -64,11 +178,61 @@ Semantic correctness
the time is seventy past forty-two
</pre>
<p>
(Warning for linguists:) theoretical innovation in
syntax
<!-- NEW -->
<h2>So where is semantics?</h2>
GF incorporates a <b>Logical Framework</b> and is therefore
capable of expressing logical semantics <i>à la</i> Montague
or any other flavour, including anaphora and discourse.
<p>
But we do <i>not</i> believe semantics can be given once and
for all for a natural language.
<p>
Instead, we expect semantics to be given in
<b>application grammars</b> built on semantic models
of different domains.
<p>
Example application: number theory
<pre>
fun Even : Nat -> Prop ; -- a mathematical predicate
lin Even = predA (regA "even") ; -- English translation
lin Even = predA (regA "pair") ; -- French translation
lin Even = predA (regA "jämn") ; -- Swedish translation
</pre>
How could the resource predict that just <i>these</i>
translations are correct in this domain?
<p>
Application grammars are built by experts of these domains
who - thanks to resource grammars - no longer need to be
experts in linguistics.
<!-- NEW -->
<h2>Languages</h2>
<p>
The current GF Resource Project covers ten languages:
<ul>
<li><tt>Dan</tt>ish
<li><tt>Eng</tt>lish
<li><tt>Fin</tt>nish
@@ -79,31 +243,87 @@ Semantic correctness
<li><tt>Rus</tt>sian
<li><tt>Spa</tt>nish
<li><tt>Swe</tt>dish
<p>
</ul>
The first three letters (<tt>Dan</tt> etc) are used in grammar module names
<!-- NEW -->
<h2>Library structure 1: language-independent API</h2>
<li> syntactic <tt>Categories</tt> (parts of speech, word classes), e.g.
<pre>
V ; NP ; CN ; Det ; -- verb, noun phrase, common noun, determiner
</pre>
<li> <tt>Rules</tt> for combining words and phrases, e.g.
<pre>
DetNP : Det -> CN -> NP ; -- combine Det and CN into NP
</pre>
<li> the most common <tt>Structural</tt> words (determiners,
conjunctions, pronouns), e.g.
<pre>
and_Conj : Conj ;
</pre>
<!-- NEW -->
<h2>Library structure 2: language-dependent modules</h2>
<li> morphological <tt>Paradigms</tt>, e.g.
<pre>
mkN : Str -> Str -> Str -> Str -> Gender -> N ; -- worst-case nouns
mkN : Str -> N ; -- regular nouns
</pre>
<li> irregular <tt>Verbs</tt>, e.g.
<pre>
angripa_V = irregV "angripa" "angrep" "angripit" ;
</pre>
<li> <tt>Lexicon</tt> of frequent words
<pre>
man_N = mkN "man" "mannen" "män" "männen" masculine ;
</pre>
<li> <tt>Ext</tt>ended syntax with language-specific rules
<pre>
PassBli : V2 -> NP -> VP ; -- bli överkörd av ngn
</pre>
<!-- NEW -->
<h2>Library structure: overview</h2>
<h2>How much can be language-independent?</h2>
Language-independent API
For the ten languages we have considered, it <i>is</i> possible
to implement the current API.
<p>
Language-dependent resources
Reservations:
<ul>
<li> morphological <tt>Paradigms</tt>
<li> irregular <tt>Verbs</tt>
</ul>
<li> does not necessarily extend to all other languages
<li> does not necessarily cover the most idiomatic expressions
of each language
<li> may not be the easiest API to implement (e.g. negation and
inversion with <i>do</i> in English suggest that some other
structure would be more natural)
<li> does not guarantee that the same structure has the same semantics
in different languages
<p>
<!-- NEW -->
<h2>Library structure: language-independent API</h2>
<center>
<img src="Resource.gif">
</center>
<!-- NEW -->
<h2>Library structure: test bed for the language-independent API</h2>
<center>
<img src="Lang.gif">
</center>
<!-- NEW -->
<h2>API documentation</h2>
@@ -113,7 +333,9 @@ Language-dependent resources
<a href="Rules.html">Rules</a>
<p>
<a href="Clause.html">Clause</a>
Alternative views on sentence formation:
<a href="Clause.html">Clause</a>,
<a href="Verbphrase.html">Verbphrase</a>
<p>
<a href="Structural.html">Structural</a>
@@ -135,19 +357,27 @@ Language-dependent resources
<!-- NEW -->
<h2>Paradigms documentation</h2>
<a href="ParadigmsEng.html">English</a>
<p>
<a href="ParadigmsEng.html">English paradigms</a>
<br>
<a href="BasicEng.html">example use of English paradigms</a>
<br>
<a href="VerbsEng.html">English verbs</a>
<p>
<a href="ParadigmsSwe.html">Swedish</a>
<a href="ParadigmsFre.html">French paradigms</a>
<br>
<a href="BasicFre.html">example use of French paradigms</a>
<br>
<a href="VerbsFre.html">French verbs</a>
<p>
<a href="BasicSwe.html">example of Swedish</a>
<a href="ParadigmsSwe.html">Swedish paradigms</a>
<br>
<a href="BasicSwe.html">example use of Swedish paradigms</a>
<br>
<a href="VerbsSwe.html">Swedish verbs</a>

@@ -1,6 +1,6 @@
--# -path=.:../abstract:../../prelude
--1 English Lexical Paradigms UNDER RECONSTRUCTION!
--1 English Lexical Paradigms
--
-- Aarne Ranta 2003
--

@@ -1076,7 +1076,7 @@ oper
intPronWho : Number -> IntPron = \num -> {
s = table {
NomP => "who" ;
AccP => variants {"who" ; "whom"} ;
AccP => variants {"whom" ; "who"} ;
GenP => "whose" ;
GenSP => "whom"
} ;

@@ -1,6 +1,6 @@
--# -path=.:../scandinavian:../abstract:../../prelude
--1 Swedish Lexical Paradigms UNDER RECONSTRUCTION!
--1 Swedish Lexical Paradigms
--
-- Aarne Ranta 2003
--

@@ -53,11 +53,11 @@ shellStateFromFiles :: Options -> ShellState -> FilePath -> IOE ShellState
shellStateFromFiles opts st file = case fileSuffix file of
"gfcm" -> do
cenv <- compileOne opts (compileEnvShSt st []) file
ioeErr $ updateShellState opts st cenv
ioeErr $ updateShellState opts Nothing st cenv
s | elem s ["cf","ebnf"] -> do
let osb = addOptions (options [beVerbose]) opts
grts <- compileModule osb st file
ioeErr $ updateShellState opts st grts
ioeErr $ updateShellState opts Nothing st grts
_ -> do
b <- ioeIO $ isOldFile file
let opts' = if b then (addOption showOld opts) else opts
@@ -66,7 +66,8 @@ shellStateFromFiles opts st file = case fileSuffix file of
then addOptions (options [beVerbose]) opts' -- for old no emit
else addOptions (options [beVerbose, emitCode]) opts'
grts <- compileModule osb st file
ioeErr $ updateShellState opts' st grts
let top = identC $ justModuleName file
ioeErr $ updateShellState opts' (Just top) st grts
--- liftM (changeModTimes rts) $ grammar2shellState opts gr
getShellStateFromFiles :: Options -> FilePath -> IO ShellState

@@ -118,12 +118,14 @@ openInterfaces ds m = do
-- | this function finds out what modules are really needed in the canonical gr.
-- its argument is typically a concrete module name
requiredCanModules :: (Eq i, Show i) => MGrammar i f a -> i -> [i]
requiredCanModules gr = nub . iterFix (concatMap more) . singleton where
requiredCanModules :: (Ord i, Show i) => MGrammar i f a -> i -> [i]
requiredCanModules gr = nub . iterFix (concatMap more) . allExtends gr where
more i = errVal [] $ do
m <- lookupModMod gr i
return $ extends m ++ map openedModule (opens m)
return $ extends m ++ [o | o <- map openedModule (opens m), notReuse o]
notReuse i = errVal True $ do
m <- lookupModMod gr i
return $ isModRes m -- to exclude reused Cnc and Abs from required
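The fixed-point computation in `requiredCanModules` amounts to collecting every module reachable from a start module through its dependencies. The following is a simplified, self-contained sketch of that idea, assuming a plain association list in place of `MGrammar`; the names `Deps` and `requiredModules` are hypothetical, and the sketch does not model the `notReuse` filter on opened modules.

```haskell
-- Hypothetical dependency table: module name -> modules it extends or opens.
type Deps = [(String, [String])]

-- Collect all modules reachable from the start module (start included),
-- in breadth-first order and without duplicates.
requiredModules :: Deps -> String -> [String]
requiredModules deps start = go [start] [start]
  where
    go seen []         = seen
    go seen (m : todo) =
      let direct = maybe [] id (lookup m deps)
          new    = dedup [d | d <- direct, d `notElem` seen]
      in  go (seen ++ new) (todo ++ new)
    -- Remove repeated names, keeping the first occurrence.
    dedup []       = []
    dedup (x : xs) = x : dedup (filter (/= x) xs)

-- The dependencies of the Resource module as drawn in the graph file
-- added by this commit.
resourceDeps :: Deps
resourceDeps =
  [ ("Resource",   ["Rules", "Clause", "Structural"])
  , ("Rules",      ["Categories"])
  , ("Clause",     ["Categories"])
  , ("Structural", ["Categories", "Numerals"])
  , ("Categories", ["PredefAbs"])
  ]
```

Because already seen modules are never added again, the traversal terminates even if the dependency table contains cycles.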
{-

@@ -112,8 +112,8 @@ evalCncInfo gr cnc abs (c,info) = case info of
return (c, CncCat ptyp pde' ppr')
CncFun (mt@(Just (_,ty))) pde ppr -> eIn ("linearization in type" +++
show ty +++ "of") $ do
CncFun (mt@(Just (_,ty@(cont,val)))) pde ppr ->
eIn ("linearization in type" +++ prt (mkProd (cont,val,[])) ++++ "of function") $ do
pde' <- case pde of
Yes de -> do
liftM yes $ pEval ty de

@@ -123,16 +123,18 @@ cncModuleIdST = stateGrammarST
-- | form a shell state from a canonical grammar
grammar2shellState :: Options -> (CanonGrammar, G.SourceGrammar) -> Err ShellState
grammar2shellState opts (gr,sgr) =
updateShellState opts emptyShellState ((0,sgr,gr),[]) --- is 0 safe?
updateShellState opts Nothing emptyShellState ((0,sgr,gr),[]) --- is 0 safe?
-- | update a shell state from a canonical grammar
updateShellState :: Options -> ShellState ->
updateShellState :: Options -> Maybe Ident -> ShellState ->
((Int,G.SourceGrammar,CanonGrammar),[(FilePath,ModTime)]) ->
---- (CanonGrammar,(G.SourceGrammar,[(FilePath,ModTime)])) ->
Err ShellState
updateShellState opts sh ((_,sgr,gr),rts) = do
updateShellState opts mcnc sh ((_,sgr,gr),rts) = do
let cgr0 = M.updateMGrammar (canModules sh) gr
a' = M.greatestAbstract cgr0
a' <- return $ case mcnc of
Just cnc -> err (const Nothing) Just $ M.abstractOfConcrete cgr0 cnc
_ -> M.greatestAbstract cgr0
abstr0 <- case abstract sh of
Just a -> do
-- test that abstract is compatible --- unsafe exception for old?

@@ -24,6 +24,7 @@ import Macros
import Lookup
import Refresh
import PatternMatch
import Lockfield (isLockLabel) ----
import AppPredefined
@@ -82,6 +83,12 @@ computeTerm gr = comp where
(S (T i cs) e,_) -> prawitz g i (flip App a') cs e
_ -> returnC $ appPredefined $ App f' a'
P t l | isLockLabel l -> return $ R []
---- a workaround 18/2/2005: take this away and find the reason
---- why earlier compilation destroys the lock field
P t l -> do
t' <- comp g t
case t' of

@@ -12,7 +12,7 @@
-- Creating and using lock fields in reused resource grammars.
-----------------------------------------------------------------------------
module Lockfield (lockRecType, unlockRecord, lockLabel) where
module Lockfield (lockRecType, unlockRecord, lockLabel, isLockLabel) where
import Grammar
import Ident
@@ -40,3 +40,7 @@ unlockRecord c ft = do
lockLabel :: Ident -> Label
lockLabel c = LIdent $ "lock_" ++ prt c ----
isLockLabel :: Label -> Bool
isLockLabel l = case l of
LIdent c -> take 5 c == "lock_"
_ -> False
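A standalone sketch of how `lockLabel` and `isLockLabel` fit together is given below. It assumes a simplified stand-in for the `Label` type and takes the category name as a plain `String` rather than an `Ident`, so it is an illustration only; it uses `isPrefixOf` where the code above uses `take 5`, which gives the same result for the prefix `"lock_"`.

```haskell
import Data.List (isPrefixOf)

-- Stand-in for GF's Label type (an assumption; the real definition is in
-- the GF sources and may differ).
data Label = LIdent String | LVar Int
  deriving (Eq, Show)

-- Build the lock field label for a category name.
lockLabel :: String -> Label
lockLabel c = LIdent ("lock_" ++ c)

-- Recognize lock field labels by their prefix.
isLockLabel :: Label -> Bool
isLockLabel (LIdent c) = "lock_" `isPrefixOf` c
isLockLabel _          = False
```

Every label produced by `lockLabel` is recognized by `isLockLabel`, while ordinary record labels such as `LIdent "s"` are not.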

@@ -60,8 +60,17 @@ lookupResType gr m c = do
-- used in reused concrete
CncCat _ _ _ -> return typeType
CncFun (Just (_,(cont,val))) _ _ -> return $ mkProd (cont, val, [])
CncFun (Just (cat,(cont,val))) _ _ -> do
val' <- lockRecType cat val
return $ mkProd (cont, val', [])
CncFun _ _ _ -> do
a <- abstractOfConcrete gr m
mu <- lookupModMod gr a
info <- lookupInfo mu c
case info of
AbsFun (Yes ty) _ -> return $ redirectTerm m ty
AbsCat _ _ -> return typeType
_ -> prtBad "cannot find type of reused function" c
AnyInd _ n -> lookupResType gr n c
ResParam _ -> return $ typePType
ResValue (Yes t) -> return $ qualifAnnotPar m t

@@ -486,6 +486,12 @@ patt2term pt = case pt of
PInt i -> EInt i
PString s -> K s
redirectTerm :: Ident -> Term -> Term
redirectTerm n t = case t of
QC _ f -> QC n f
Q _ f -> Q n f
_ -> composSafeOp (redirectTerm n) t
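To see what `redirectTerm` does, here is a self-contained sketch on a toy term type. The `Term` type below is a small assumption made for this example, with module and constant names as plain strings, and the recursion is written out by hand for the single `App` case, where the real function delegates to `composSafeOp`.

```haskell
-- Toy term type (an assumption for illustration; GF's Term has many more
-- constructors and uses Ident instead of String).
data Term
  = Q  String String   -- qualified constant: module name, constant name
  | QC String String   -- qualified constructor: module name, constructor name
  | App Term Term      -- application
  | Vr String          -- variable
  deriving (Eq, Show)

-- Point every qualified name in a term to the module n.
redirectTerm :: String -> Term -> Term
redirectTerm n t = case t of
  QC _ f  -> QC n f
  Q  _ f  -> Q n f
  App f a -> App (redirectTerm n f) (redirectTerm n a)
  _       -> t
```

In `lookupResType` this is what lets a type taken from the abstract syntax refer to the reused concrete module: a type mentioning `Q "Animals" "Animal"` would, after redirection to `"AnimalsEng"`, mention `Q "AnimalsEng" "Animal"`.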
-- to gather s-fields; assumes term in normal form, preserves label
allLinFields :: Term -> Err [[(Label,Term)]]
allLinFields trm = case unComputed trm of