mirror of
https://github.com/GrammaticalFramework/gf-core.git
clt with toc
@@ -7,12 +7,82 @@
<P ALIGN="center"><CENTER><H1>The GF Resource Grammar Library Version 1.0</H1>
<FONT SIZE="4">
<I>Author: Aarne Ranta &lt;aarne (at) cs.chalmers.se&gt;</I><BR>
Last update: Wed Mar 8 22:35:06 2006
Last update: Wed Mar 8 22:40:25 2006
</FONT></CENTER>

<P></P>
<HR NOSHADE SIZE=1>
<P></P>
<UL>
<LI><A HREF="#toc1">Plan</A>
<LI><A HREF="#toc2">Purpose</A>
<UL>
<LI><A HREF="#toc3">Library for applications</A>
<LI><A HREF="#toc4">Not primarily code for a parser</A>
<LI><A HREF="#toc5">Grammar as language definition</A>
<LI><A HREF="#toc6">Usability by non-linguists</A>
<LI><A HREF="#toc7">Scientific interest</A>
</UL>
<LI><A HREF="#toc8">Background</A>
<UL>
<LI><A HREF="#toc9">History</A>
<LI><A HREF="#toc10">Authors</A>
<LI><A HREF="#toc11">Related work</A>
<LI><A HREF="#toc12">Slightly less related work</A>
</UL>
<LI><A HREF="#toc13">Coverage</A>
<UL>
<LI><A HREF="#toc14">Languages</A>
<LI><A HREF="#toc15">Morphology and lexicon</A>
<LI><A HREF="#toc16">Syntactic structures</A>
<LI><A HREF="#toc17">Quantitative measures</A>
</UL>
<LI><A HREF="#toc18">Structure of the API</A>
<UL>
<LI><A HREF="#toc19">Language-independent ground API</A>
<LI><A HREF="#toc20">The structure of a text sentence</A>
<LI><A HREF="#toc21">The structure in the syntax editor</A>
<LI><A HREF="#toc22">Language-dependent paradigm modules</A>
<LI><A HREF="#toc23">Language-dependent syntax extensions</A>
<LI><A HREF="#toc24">Special-purpose APIs</A>
<LI><A HREF="#toc25">How to use the resource as top-level grammar</A>
<LI><A HREF="#toc26">Compiling</A>
<LI><A HREF="#toc27">Parsing</A>
<LI><A HREF="#toc28">Treebank generation</A>
<LI><A HREF="#toc29">The multilingual treebank format</A>
<LI><A HREF="#toc30">Treebank-based parsing</A>
<LI><A HREF="#toc31">Morphology</A>
<LI><A HREF="#toc32">Syntax editing</A>
<LI><A HREF="#toc33">Efficient parsing via application grammar</A>
</UL>
<LI><A HREF="#toc34">How to use as library</A>
<UL>
<LI><A HREF="#toc35">Specialization through parametrized modules</A>
<LI><A HREF="#toc36">Compile-time transfer</A>
<LI><A HREF="#toc37">A natural division into modules</A>
<LI><A HREF="#toc38">Example-based grammar writing</A>
</UL>
<LI><A HREF="#toc39">How to implement a new language</A>
<LI><A HREF="#toc40">Ordinary modules</A>
<LI><A HREF="#toc41">Parametrized modules</A>
<UL>
<LI><A HREF="#toc42">The core API</A>
<LI><A HREF="#toc43">The core API in Latin: parameters</A>
<LI><A HREF="#toc44">The core API in Latin: linearization types</A>
<LI><A HREF="#toc45">The core API in Latin: predication and complementization</A>
<LI><A HREF="#toc46">The core API in Latin: determination and modification</A>
<LI><A HREF="#toc47">How to proceed</A>
</UL>
<LI><A HREF="#toc48">How to extend the API</A>
</UL>

<P></P>
<HR NOSHADE SIZE=1>
<P></P>
<P>
<!-- NEW -->
</P>
<A NAME="toc1"></A>
<H2>Plan</H2>
<P>
Purpose
@@ -38,7 +108,9 @@ How to extend the API
<P>
<!-- NEW -->
</P>
<A NAME="toc2"></A>
<H2>Purpose</H2>
<A NAME="toc3"></A>
<H3>Library for applications</H3>
<P>
High-level access to grammatical rules
@@ -63,6 +135,7 @@ Usability for different purposes
<P>
<!-- NEW -->
</P>
<A NAME="toc4"></A>
<H3>Not primarily code for a parser</H3>
<P>
Often in NLP, a grammar is just high-level code for a parser.
@@ -89,6 +162,7 @@ Moreover, a grammar fine-tuned for parsing may not be reusable
<P>
<!-- NEW -->
</P>
<A NAME="toc5"></A>
<H3>Grammar as language definition</H3>
<P>
Linguistic ontology: <B>abstract syntax</B>
@@ -124,6 +198,7 @@ Resource grammars have generation perspective, rather than parsing
<P>
<!-- NEW -->
</P>
<A NAME="toc6"></A>
<H3>Usability by non-linguists</H3>
<P>
Division of labour: resource grammars hide linguistic details
@@ -165,6 +240,7 @@ Example-based grammar writing
<P>
<!-- NEW -->
</P>
<A NAME="toc7"></A>
<H3>Scientific interest</H3>
<P>
Linguistics
@@ -191,7 +267,9 @@ Computer science
<P>
<!-- NEW -->
</P>
<A NAME="toc8"></A>
<H2>Background</H2>
<A NAME="toc9"></A>
<H3>History</H3>
<P>
2002: v. 0.2
@@ -230,6 +308,7 @@ Computer science
<P>
<!-- NEW -->
</P>
<A NAME="toc10"></A>
<H3>Authors</H3>
<P>
Janna Khegai (Russian modules, forthcoming),
@@ -259,6 +338,7 @@ Jordi Saludes.
<P>
<!-- NEW -->
</P>
<A NAME="toc11"></A>
<H3>Related work</H3>
<P>
CLE (Core Language Engine,
@@ -275,6 +355,7 @@ CLE (Core Language Engine,
<P>
<!-- NEW -->
</P>
<A NAME="toc12"></A>
<H3>Slightly less related work</H3>
<P>
<A HREF="http://www.delph-in.net/matrix/">LinGO Grammar Matrix</A>
@@ -307,7 +388,9 @@ Rosetta Machine Translation (<A HREF="http://citeseer.ist.psu.edu/181924.html">B
<P>
<!-- NEW -->
</P>
<A NAME="toc13"></A>
<H2>Coverage</H2>
<A NAME="toc14"></A>
<H3>Languages</H3>
<P>
The current GF Resource Project covers ten languages:
@@ -334,6 +417,7 @@ API 1.0 not yet implemented for Danish and Russian
<P>
<!-- NEW -->
</P>
<A NAME="toc15"></A>
<H3>Morphology and lexicon</H3>
<P>
Complete inflection engine
@@ -364,6 +448,7 @@ provide a huge lexicon.
<P>
<!-- NEW -->
</P>
<A NAME="toc16"></A>
<H3>Syntactic structures</H3>
<P>
Texts:
@@ -396,6 +481,7 @@ proper names, pronouns, determiners, possessives, cardinals and ordinals
<P>
<!-- NEW -->
</P>
<A NAME="toc17"></A>
<H3>Quantitative measures</H3>
<P>
67 categories
@@ -429,7 +515,9 @@ proper names, pronouns, determiners, possessives, cardinals and ordinals
<P>
<!-- NEW -->
</P>
<A NAME="toc18"></A>
<H2>Structure of the API</H2>
<A NAME="toc19"></A>
<H3>Language-independent ground API</H3>
<P>
<IMG ALIGN="middle" SRC="Lang.png" BORDER="0" ALT="">
@@ -437,6 +525,7 @@ proper names, pronouns, determiners, possessives, cardinals and ordinals
<P>
<!-- NEW -->
</P>
<A NAME="toc20"></A>
<H3>The structure of a text sentence</H3>
<PRE>
John walks.
@@ -461,6 +550,7 @@ proper names, pronouns, determiners, possessives, cardinals and ordinals
<P>
<!-- NEW -->
</P>
<A NAME="toc21"></A>
<H3>The structure in the syntax editor</H3>
<P>
<IMG ALIGN="middle" SRC="editor.png" BORDER="0" ALT="">
@@ -468,6 +558,7 @@ proper names, pronouns, determiners, possessives, cardinals and ordinals
<P>
<!-- NEW -->
</P>
<A NAME="toc22"></A>
<H3>Language-dependent paradigm modules</H3>
<H4>Regular paradigms</H4>
<P>
@@ -559,6 +650,7 @@ Goal: eliminate the user's need of worst-case functions.
<P>
<!-- NEW -->
</P>
<A NAME="toc23"></A>
<H3>Language-dependent syntax extensions</H3>
<P>
Syntactic structures that are not shared by all languages.
@@ -581,6 +673,7 @@ Candidates:
<P>
<!-- NEW -->
</P>
<A NAME="toc24"></A>
<H3>Special-purpose APIs</H3>
<P>
Mathematical
@@ -600,7 +693,9 @@ Shallow
<P>
<!-- NEW -->
</P>
<A NAME="toc25"></A>
<H3>How to use the resource as top-level grammar</H3>
<A NAME="toc26"></A>
<H3>Compiling</H3>
<P>
It is a good idea to compile the library, so that it can be opened faster
@@ -627,6 +722,7 @@ files again. Just do some of
<P>
<!-- NEW -->
</P>
<A NAME="toc27"></A>
<H3>Parsing</H3>
<P>
The default parser does not work! (It is obsolete anyway.)
@@ -657,6 +753,7 @@ Remedies:
<P>
<!-- NEW -->
</P>
<A NAME="toc28"></A>
<H3>Treebank generation</H3>
<P>
Multilingual treebank entry = tree + linearizations
@@ -685,6 +782,7 @@ Updating a treebank
<P>
<!-- NEW -->
</P>
<A NAME="toc29"></A>
<H3>The multilingual treebank format</H3>
<P>
Tree + linearizations
@@ -707,6 +805,7 @@ These can also be wrapped in XML tags (<CODE>tb -xml</CODE>)
<P>
<!-- NEW -->
</P>
<A NAME="toc30"></A>
<H3>Treebank-based parsing</H3>
<P>
Brute-force method that helps if real parsing is more expensive.
@@ -728,6 +827,7 @@ Brute-force method that helps if real parsing is more expensive.
<P>
<!-- NEW -->
</P>
<A NAME="toc31"></A>
<H3>Morphology</H3>
<P>
Use morphological analyser
@@ -755,6 +855,7 @@ Try out inflection patterns
<P>
<!-- NEW -->
</P>
<A NAME="toc32"></A>
<H3>Syntax editing</H3>
<P>
The simplest way to start editing with all grammars is
@@ -770,6 +871,7 @@ parts of an application grammar remain to be implemented.
<P>
<!-- NEW -->
</P>
<A NAME="toc33"></A>
<H3>Efficient parsing via application grammar</H3>
<P>
Get rid of discontinuous constituents (in particular, <CODE>VP</CODE>)
@@ -786,7 +888,9 @@ instead of <CODE>PredVP np (ComplV2 v2 np')</CODE>
<P>
<!-- NEW -->
</P>
<A NAME="toc34"></A>
<H2>How to use as library</H2>
<A NAME="toc35"></A>
<H3>Specialization through parametrized modules</H3>
<P>
The application grammar is implemented with reference to
@@ -801,6 +905,7 @@ Example: <A HREF="../../../examples/tram/TramI.gf">tram</A>
<P>
<!-- NEW -->
</P>
<A NAME="toc36"></A>
<H3>Compile-time transfer</H3>
<P>
Instead of parametrized modules:
@@ -814,6 +919,7 @@ Example: imperative vs. infinitive in mathematical exercises
<P>
<!-- NEW -->
</P>
<A NAME="toc37"></A>
<H3>A natural division into modules</H3>
<P>
Lexicon in language-dependent modules
@@ -824,6 +930,7 @@ Combination rules in a parametrized module
<P>
<!-- NEW -->
</P>
<A NAME="toc38"></A>
<H3>Example-based grammar writing</H3>
<P>
Example: <A HREF="../../../examples/animal/QuestionsI.gfe">animal</A>
@@ -849,10 +956,12 @@ Example: <A HREF="../../../examples/animal/QuestionsI.gfe">animal</A>
<P>
<!-- NEW -->
</P>
<A NAME="toc39"></A>
<H2>How to implement a new language</H2>
<P>
See <A HREF="Resource-HOWTO.html">Resource-HOWTO</A>
</P>
<A NAME="toc40"></A>
<H2>Ordinary modules</H2>
<P>
Write a concrete syntax module for each abstract module in the API
@@ -866,6 +975,7 @@ Examples: English, Finnish, German, Russian
<P>
<!-- NEW -->
</P>
<A NAME="toc41"></A>
<H2>Parametrized modules</H2>
<P>
Examples: Romance (French, Italian, Spanish), Scandinavian (Danish, Norwegian, Swedish)
@@ -898,6 +1008,7 @@ Problems:
<P>
<!-- NEW -->
</P>
<A NAME="toc42"></A>
<H3>The core API</H3>
<P>
Everything else is variations of this
@@ -922,6 +1033,7 @@ Everything else is variations of this
<P>
<!-- NEW -->
</P>
<A NAME="toc43"></A>
<H3>The core API in Latin: parameters</H3>
<P>
This <A HREF="latin.gf">toy Latin grammar</A> shows in a nutshell how the core
@@ -942,6 +1054,7 @@ can be implemented.
<P>
<!-- NEW -->
</P>
<A NAME="toc44"></A>
<H3>The core API in Latin: linearization types</H3>
<PRE>
lincat
@@ -977,6 +1090,7 @@ can be implemented.
<P>
<!-- NEW -->
</P>
<A NAME="toc45"></A>
<H3>The core API in Latin: predication and complementization</H3>
<PRE>
lin
@@ -1001,6 +1115,7 @@ can be implemented.
<P>
<!-- NEW -->
</P>
<A NAME="toc46"></A>
<H3>The core API in Latin: determination and modification</H3>
<PRE>
DetCN det cn =
@@ -1024,6 +1139,7 @@ can be implemented.
<P>
<!-- NEW -->
</P>
<A NAME="toc47"></A>
<H3>How to proceed</H3>
<OL>
<LI>put up a directory with dummy modules by copying from e.g. English and
@@ -1043,6 +1159,7 @@ commenting out the contents
<P>
<!-- NEW -->
</P>
<A NAME="toc48"></A>
<H2>How to extend the API</H2>
<P>
Extend old modules or add a new one?
@@ -1057,5 +1174,5 @@ you can work directly in that module.
</P>

<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
<!-- cmdline: txt2tags clt2006.txt -->
<!-- cmdline: txt2tags -\-toc clt2006.txt -->
</BODY></HTML>