seminar slides

2006-02-10 08:02:34 +00:00
parent dd3a575080
commit c12ee01480
2 changed files with 8 additions and 102 deletions
--- a/lib/resource-1.0/doc/gslt-sem-2006.html
+++ b/lib/resource-1.0/doc/gslt-sem-2006.html
@@ -7,67 +7,12 @@
 <P ALIGN="center"><CENTER><H1>Grammars as Software Libraries</H1>
 <FONT SIZE="4">
 <I>Author: Aarne Ranta &lt;aarne (at) cs.chalmers.se&gt;</I><BR>
-Last update: Thu Feb  9 11:57:20 2006
+Last update: Thu Feb  9 13:03:45 2006
 </FONT></CENTER>

-<P></P>
-<HR NOSHADE SIZE=1>
-<P></P>
-    <UL>
-    <LI><A HREF="#toc1">Setting</A>
-    <LI><A HREF="#toc2">People</A>
-    <LI><A HREF="#toc3">Software Libraries</A>
-    <LI><A HREF="#toc4">Abstraction</A>
-    <LI><A HREF="#toc5">Grammars as libraries?</A>
-    <LI><A HREF="#toc6">A slightly more advanced example</A>
-    <LI><A HREF="#toc7">Problems with the more advanced example</A>
-    <LI><A HREF="#toc8">More problems with the advanced example</A>
-    <LI><A HREF="#toc9">A library-based solution</A>
-    <LI><A HREF="#toc10">An improved library-based solution</A>
-    <LI><A HREF="#toc11">The ultimate solution?</A>
-    <LI><A HREF="#toc12">The components of a grammar library</A>
-    <LI><A HREF="#toc13">Implementing a grammar library in GF</A>
-    <LI><A HREF="#toc14">Linearization and parsing</A>
-    <LI><A HREF="#toc15">Applying GF</A>
-    <LI><A HREF="#toc16">Domain, ontology, idiom</A>
-    <LI><A HREF="#toc17">Example domain</A>
-    <LI><A HREF="#toc18">Translation system</A>
-    <LI><A HREF="#toc19">Difficulties with concrete syntax</A>
-    <LI><A HREF="#toc20">Solving the difficulties</A>
-    <LI><A HREF="#toc21">Application grammars vs. resource grammars</A>
-    <LI><A HREF="#toc22">GF as programming language</A>
-    <LI><A HREF="#toc23">Concrete syntax using library</A>
-    <LI><A HREF="#toc24">Design questions for the grammar library</A>
-    <LI><A HREF="#toc25">Design decisions</A>
-    <LI><A HREF="#toc26">Design decisions, cont'd</A>
-    <LI><A HREF="#toc27">Success criteria and evaluation</A>
-    <LI><A HREF="#toc28">These are not our success criteria</A>
-    <LI><A HREF="#toc29">Where is semantics?</A>
-    <LI><A HREF="#toc30">Representations in different APIs</A>
-    <LI><A HREF="#toc31">Languages</A>
-    <LI><A HREF="#toc32">Library structure 1: language-independent API</A>
-    <LI><A HREF="#toc33">Library structure 2: language-dependent APIs</A>
-    <LI><A HREF="#toc34">Difficulties encountered</A>
-    <LI><A HREF="#toc35">How much can be language-independent?</A>
-    <LI><A HREF="#toc36">Using the library</A>
-    <LI><A HREF="#toc37">Parametrized modules</A>
-    <LI><A HREF="#toc38">Lexicon extension</A>
-    <LI><A HREF="#toc39">Example low-level morphological definition</A>
-    <LI><A HREF="#toc40">Some formats that can be generated from GF grammars</A>
-    <LI><A HREF="#toc41">Use as program components</A>
-    <LI><A HREF="#toc42">Grammar library as linguistic resource</A>
-    <LI><A HREF="#toc43">Corpus generation</A>
-    <LI><A HREF="#toc44">Related work</A>
-    <LI><A HREF="#toc45">Demo</A>
-    </UL>
-
-<P></P>
-<HR NOSHADE SIZE=1>
-<P></P>
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc1"></A>
 <H2>Setting</H2>
 <P>
 Current funding
@@ -101,7 +46,6 @@ Main applications
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc2"></A>
 <H2>People</H2>
 <P>
 Staff contributions to grammar libraries:
@@ -154,7 +98,6 @@ Resource library patches and suggestions from the WebALT staff:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc3"></A>
 <H2>Software Libraries</H2>
 <P>
 The main device of <B>division of labour</B> in programming.
@@ -180,7 +123,6 @@ Practical advantages:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc4"></A>
 <H2>Abstraction</H2>
 <P>
 Libraries promote <B>abstraction</B>: you abstract away from details.
@@ -199,7 +141,6 @@ if it just has a support for functions or macros.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc5"></A>
 <H2>Grammars as libraries?</H2>
 <P>
 Example: we want to create a GUI (Graphical User Interface) button
@@ -249,7 +190,6 @@ The library has an API (Application Programmer's Interface) with:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc6"></A>
 <H2>A slightly more advanced example</H2>
 <P>
 This is what you often see as a feedback from a program:
@@ -277,7 +217,6 @@ The code that should be written is of course
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc7"></A>
 <H2>Problems with the more advanced example</H2>
 <P>
 The same as with "Yes": you have to know the words "you",
@@ -304,7 +243,6 @@ of "message":
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc8"></A>
 <H2>More problems with the advanced example</H2>
 <P>
 You also have to know the case required by the verb "have" 
@@ -328,7 +266,6 @@ address the user:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc9"></A>
 <H2>A library-based solution</H2>
 <P>
 In analogy with the "Yes" case, you write
@@ -350,7 +287,6 @@ It is time to move from <B>canned text</B> to a <B>grammar</B>.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc10"></A>
 <H2>An improved library-based solution</H2>
 <P>
 You may want to write
@@ -378,7 +314,6 @@ For this purpose, you need a library with the API
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc11"></A>
 <H2>The ultimate solution?</H2>
 <P>
 The library API for language will certainly grow big and become
@@ -423,7 +358,6 @@ Thus some amount of interaction is needed.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc12"></A>
 <H2>The components of a grammar library</H2>
 <P>
 The library has <B>construction functions</B> like
@@ -451,7 +385,6 @@ knowledge by application programmers!
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc13"></A>
 <H2>Implementing a grammar library in GF</H2>
 <P>
 GF = Grammatical Framework
@@ -495,7 +428,6 @@ Simplest possible example:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc14"></A>
 <H2>Linearization and parsing</H2>
 <P>
 The realization function is, for each language, implemented by
@@ -517,7 +449,6 @@ The GF formalism moreover has the property of <B>reversibility</B>:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc15"></A>
 <H2>Applying GF</H2>
 <P>
 <B>multilingual grammar</B> = abstract syntax + concrete syntaxes
@@ -534,7 +465,6 @@ Examples of the idea:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc16"></A>
 <H2>Domain, ontology, idiom</H2>
 <P>
 An abstract syntax has other names:
@@ -566,7 +496,6 @@ Problem: the expertise of both a linguist and a domain expert are required.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc17"></A>
 <H2>Example domain</H2>
 <P>
 Arithmetic of natural numbers: abstract syntax
@@ -589,7 +518,6 @@ Arithmetic of natural numbers: abstract syntax
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc18"></A>
 <H2>Translation system</H2>
 <P>
 We can translate using the abstract syntax as interlingua:
@@ -611,7 +539,6 @@ But is it really so simple?
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc19"></A>
 <H2>Difficulties with concrete syntax</H2>
 <P>
 The previous multilingual grammar breaks these rules in many situations:
@@ -628,7 +555,6 @@ All these sentences are grammatically incorrect.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc20"></A>
 <H2>Solving the difficulties</H2>
 <P>
 GF <I>can</I> express the linguistic rules that are needed to
@@ -659,7 +585,6 @@ Linguistic knowledge dominates in the size of this grammar.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc21"></A>
 <H2>Application grammars vs. resource grammars</H2>
 <P>
 Application grammar ("semantic grammar")
@@ -682,7 +607,6 @@ Resource grammar ("syntactic grammar")
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc22"></A>
 <H2>GF as programming language</H2>
 <P>
 The expressive power is between TAG and HPSG.
@@ -702,7 +626,6 @@ We have built a <B>module system</B> that can hide details.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc23"></A>
 <H2>Concrete syntax using library</H2>
 <P>
 Assume the following API
@@ -733,7 +656,6 @@ Notice: the choice of adjective is domain expert knowledge.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc24"></A>
 <H2>Design questions for the grammar library</H2>
 <P>
 What should there be in the library?
@@ -765,7 +687,6 @@ hence cannot use existing proprietary resources.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc25"></A>
 <H2>Design decisions</H2>
 <P>
 Coverage, for each language:
@@ -798,7 +719,6 @@ Presentation:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc26"></A>
 <H2>Design decisions, cont'd</H2>
 <P>
 Where do we get the data from?
@@ -818,7 +738,6 @@ The resource grammar library is entirely open-source free software
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc27"></A>
 <H2>Success criteria and evaluation</H2>
 <P>
 Grammatical correctness of everything generated.
@@ -838,7 +757,6 @@ Tools for regression testing (treebank generation and comparison)
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc28"></A>
 <H2>These are not our success criteria</H2>
 <P>
 Language coverage: 
@@ -873,7 +791,6 @@ Linguistic innovation in syntax:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc29"></A>
 <H2>Where is semantics?</H2>
 <P>
 Application grammars use domain-specific
@@ -897,7 +814,6 @@ for all for the whole language.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc30"></A>
 <H2>Representations in different APIs</H2>
 <P>
 <B>Grammar composition</B>: any grammar can serve as resource to another one.
@@ -935,7 +851,6 @@ In <CODE>Lang</CODE> (ground level resource API)
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc31"></A>
 <H2>Languages</H2>
 <P>
 The current GF Resource Project covers ten languages:
@@ -962,7 +877,6 @@ In addition, we have parts (morphology) of Arabic, Estonian, Latin, and Urdu
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc32"></A>
 <H2>Library structure 1: language-independent API</H2>
 <P>
 <IMG ALIGN="middle" SRC="Lang.png" BORDER="0" ALT="">
@@ -979,7 +893,6 @@ Cf. "matrix" in BLARK, LinGo
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc33"></A>
 <H2>Library structure 2: language-dependent APIs</H2>
 <UL>
 <LI>morphological paradigms, e.g. <CODE>ParadigmsSwe</CODE>
@@ -1000,7 +913,6 @@ Cf. "matrix" in BLARK, LinGo
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc34"></A>
 <H2>Difficulties encountered</H2>
 <P>
 English: negation and auxiliary vs. non-auxiliary verbs
@@ -1023,7 +935,6 @@ Scandinavian: determiners
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc35"></A>
 <H2>How much can be language-independent?</H2>
 <P>
 For the ten languages we have considered, it <I>is</I> possible
@@ -1049,7 +960,6 @@ Reservations:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc36"></A>
 <H2>Using the library</H2>
 <P>
 Simplest case: use the API in the same way for all languages.
@@ -1078,7 +988,6 @@ than writing a resource grammar!
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc37"></A>
 <H2>Parametrized modules</H2>
 <P>
 We can go even farther than share an abstract API: we can share implementations
@@ -1098,7 +1007,6 @@ Exploited in two families:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc38"></A>
 <H2>Lexicon extension</H2>
 <P>
 We cannot anticipate all vocabulary needed in application grammars.
@@ -1122,7 +1030,6 @@ Example heuristic, from <A HREF="gfdoc/ParadigmsSwe.html">ParadigmsSwe</A>:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc39"></A>
 <H2>Example low-level morphological definition</H2>
 <PRE>
    decl2Noun : Str -&gt; N = \bil -&gt;
@@ -1139,7 +1046,6 @@ Example heuristic, from <A HREF="gfdoc/ParadigmsSwe.html">ParadigmsSwe</A>:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc40"></A>
 <H2>Some formats that can be generated from GF grammars</H2>
 <PRE>
  -printer=lbnf           BNF Converter, thereby C/Bison, Java/JavaCup
@@ -1157,7 +1063,6 @@ Example heuristic, from <A HREF="gfdoc/ParadigmsSwe.html">ParadigmsSwe</A>:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc41"></A>
 <H2>Use as program components</H2>
 <P>
 Haskell, Java, Prolog
@@ -1171,7 +1076,6 @@ Push-button creation of spoken language translators (using Nuance)
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc42"></A>
 <H2>Grammar library as linguistic resource</H2>
 <P>
 Can we use the libraries outside domain-specific fragments?
@@ -1194,7 +1098,6 @@ Two ideas:
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc43"></A>
 <H2>Corpus generation</H2>
 <P>
 The most general format is <B>multilingual treebank</B> generation:
@@ -1225,7 +1128,6 @@ Can this be useful? Cf. Rebecca Jonson this afternoon.
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc44"></A>
 <H2>Related work</H2>
 <P>
 CLE = Core Language Engine
@@ -1255,7 +1157,6 @@ Parsing detached from grammar (Nivre) - grammar detached from parsing
 <P>
 <!-- NEW -->
 </P>
-<A NAME="toc45"></A>
 <H2>Demo</H2>
 <P>
 Stoneage grammar, based on the Swadesh word list.
@@ -1267,6 +1168,6 @@ Implemented as application on top of the resource grammar.
 Illustrate generation and spoken-language parsing.
 </P>

-<!-- html code generated by txt2tags 2.0 (http://txt2tags.sf.net) -->
-<!-- cmdline: txt2tags -\-toc -thtml gslt-sem-2006.txt -->
+<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
+<!-- cmdline: txt2tags gslt-sem-2006.txt -->
 </BODY></HTML>
--- a/lib/resource-1.0/lang.gfprob
+++ b/lib/resource-1.0/lang.gfprob
@@ -1,6 +1,11 @@
 --# prob ConjNP   0.05
+--# prob ConsNP   0.1
+--# prob ConsAP   0.05
 --# prob ConjAP   0.1
 --# prob ConjAdv  0.1
+--# prob AdvVP    0.1
 --# prob ConjS    0.1
 --# prob PredSCVP 0.05
 --# prob PredetNP 0.05
+--# prob SentCN   0.05
+--# prob QuestCN  0.05