mirror of
https://github.com/GrammaticalFramework/gf-core.git
synced 2026-05-20 00:22:51 -06:00
end of tutorial sketched
@@ -7,7 +7,7 @@
|
|||||||
<P ALIGN="center"><CENTER><H1>Grammatical Framework Tutorial</H1>
|
<P ALIGN="center"><CENTER><H1>Grammatical Framework Tutorial</H1>
|
||||||
<FONT SIZE="4">
|
<FONT SIZE="4">
|
||||||
<I>Author: Aarne Ranta <aarne (at) cs.chalmers.se></I><BR>
|
<I>Author: Aarne Ranta <aarne (at) cs.chalmers.se></I><BR>
|
||||||
Last update: Sat Dec 17 21:42:39 2005
|
Last update: Sat Dec 17 23:19:34 2005
|
||||||
</FONT></CENTER>
|
</FONT></CENTER>
|
||||||
|
|
||||||
<P></P>
|
<P></P>
|
||||||
@@ -98,17 +98,22 @@ Last update: Sat Dec 17 21:42:39 2005
|
|||||||
<LI><A HREF="#toc61">Dependent types</A>
|
<LI><A HREF="#toc61">Dependent types</A>
|
||||||
<LI><A HREF="#toc62">Higher-order abstract syntax</A>
|
<LI><A HREF="#toc62">Higher-order abstract syntax</A>
|
||||||
<LI><A HREF="#toc63">Semantic definitions</A>
|
<LI><A HREF="#toc63">Semantic definitions</A>
|
||||||
<LI><A HREF="#toc64">Case study: grammars of formal languages</A>
|
|
||||||
</UL>
|
</UL>
|
||||||
<LI><A HREF="#toc65">Transfer modules</A>
|
<LI><A HREF="#toc64">Transfer modules</A>
|
||||||
<LI><A HREF="#toc66">Practical issues</A>
|
<LI><A HREF="#toc65">Practical issues</A>
|
||||||
<UL>
|
<UL>
|
||||||
<LI><A HREF="#toc67">Lexers and unlexers</A>
|
<LI><A HREF="#toc66">Lexers and unlexers</A>
|
||||||
<LI><A HREF="#toc68">Efficiency of grammars</A>
|
<LI><A HREF="#toc67">Efficiency of grammars</A>
|
||||||
<LI><A HREF="#toc69">Speech input and output</A>
|
<LI><A HREF="#toc68">Speech input and output</A>
|
||||||
<LI><A HREF="#toc70">Communicating with GF</A>
|
<LI><A HREF="#toc69">Multilingual syntax editor</A>
|
||||||
<LI><A HREF="#toc71">Embedded grammars in Haskell, Java, and Prolog</A>
|
<LI><A HREF="#toc70">Interactive Development Environment (IDE)</A>
|
||||||
<LI><A HREF="#toc72">Alternative input and output grammar formats</A>
|
<LI><A HREF="#toc71">Communicating with GF</A>
|
||||||
|
<LI><A HREF="#toc72">Embedded grammars in Haskell, Java, and Prolog</A>
|
||||||
|
<LI><A HREF="#toc73">Alternative input and output grammar formats</A>
|
||||||
|
</UL>
|
||||||
|
<LI><A HREF="#toc74">Case studies</A>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="#toc75">Interfacing formal and natural languages</A>
|
||||||
</UL>
|
</UL>
|
||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
@@ -1779,8 +1784,16 @@ they can be used as arguments. For example:
|
|||||||
<H2>More features of the module system</H2>
|
<H2>More features of the module system</H2>
|
||||||
<A NAME="toc57"></A>
|
<A NAME="toc57"></A>
|
||||||
<H3>Resource grammars and their reuse</H3>
|
<H3>Resource grammars and their reuse</H3>
|
||||||
|
<P>
|
||||||
|
See
|
||||||
|
<A HREF="../../lib/resource/doc/gf-resource.html">resource library documentation</A>
|
||||||
|
</P>
|
||||||
<A NAME="toc58"></A>
|
<A NAME="toc58"></A>
|
||||||
<H3>Interfaces, instances, and functors</H3>
|
<H3>Interfaces, instances, and functors</H3>
|
||||||
|
<P>
|
||||||
|
See an
|
||||||
|
<A HREF="../../examples/mp3/mp3-resource.html">example built this way</A>
|
||||||
|
</P>
|
||||||
<A NAME="toc59"></A>
|
<A NAME="toc59"></A>
|
||||||
<H3>Restricted inheritance and qualified opening</H3>
|
<H3>Restricted inheritance and qualified opening</H3>
|
||||||
<A NAME="toc60"></A>
|
<A NAME="toc60"></A>
|
||||||
@@ -1792,23 +1805,163 @@ they can be used as arguments. For example:
|
|||||||
<A NAME="toc63"></A>
|
<A NAME="toc63"></A>
|
||||||
<H3>Semantic definitions</H3>
|
<H3>Semantic definitions</H3>
|
||||||
<A NAME="toc64"></A>
|
<A NAME="toc64"></A>
|
||||||
<H3>Case study: grammars of formal languages</H3>
|
|
||||||
<A NAME="toc65"></A>
|
|
||||||
<H2>Transfer modules</H2>
|
<H2>Transfer modules</H2>
|
||||||
<A NAME="toc66"></A>
|
<P>
|
||||||
|
Transfer means noncompositional tree-transforming operations.
|
||||||
|
The command <CODE>apply_transfer = at</CODE> is typically used in a pipe:
|
||||||
|
</P>
|
||||||
|
<PRE>
|
||||||
|
> p "John walks and John runs" | apply_transfer aggregate | l
|
||||||
|
John walks and runs
|
||||||
|
</PRE>
|
||||||
|
<P>
|
||||||
|
See the
|
||||||
|
<A HREF="../../transfer/examples/aggregation">sources</A> of this example.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
See the
|
||||||
|
<A HREF="../transfer.html">transfer language documentation</A>
|
||||||
|
for more information.
|
||||||
|
</P>
|
||||||
|
<A NAME="toc65"></A>
|
||||||
<H2>Practical issues</H2>
|
<H2>Practical issues</H2>
|
||||||
<A NAME="toc67"></A>
|
<A NAME="toc66"></A>
|
||||||
<H3>Lexers and unlexers</H3>
|
<H3>Lexers and unlexers</H3>
|
||||||
<A NAME="toc68"></A>
|
<P>
|
||||||
|
Lexers and unlexers can be chosen from
|
||||||
|
a list of predefined ones, using the flags <CODE>-lexer</CODE> and <CODE>-unlexer</CODE>, either
|
||||||
|
in the grammar file or on the GF command line.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
The available lexers and unlexers are listed by <CODE>help -lexer</CODE> and <CODE>help -unlexer</CODE>:
|
||||||
|
</P>
|
||||||
|
<PRE>
|
||||||
|
The default is words.
|
||||||
|
-lexer=words tokens are separated by spaces or newlines
|
||||||
|
-lexer=literals like words, but GF integer and string literals recognized
|
||||||
|
-lexer=vars like words, but "x","x_...","$...$" as vars, "?..." as meta
|
||||||
|
-lexer=chars each character is a token
|
||||||
|
-lexer=code use Haskell's lex
|
||||||
|
-lexer=codevars like code, but treat unknown words as variables, ?? as meta
|
||||||
|
-lexer=text with conventions on punctuation and capital letters
|
||||||
|
-lexer=codelit like code, but treat unknown words as string literals
|
||||||
|
-lexer=textlit like text, but treat unknown words as string literals
|
||||||
|
-lexer=codeC use a C-like lexer
|
||||||
|
-lexer=ignore like literals, but ignore unknown words
|
||||||
|
-lexer=subseqs like ignore, but then try all subsequences from longest
|
||||||
|
|
||||||
|
The default is unwords.
|
||||||
|
-unlexer=unwords space-separated token list (like unwords)
|
||||||
|
-unlexer=text format as text: punctuation, capitals, paragraph <p>
|
||||||
|
-unlexer=code format as code (spacing, indentation)
|
||||||
|
-unlexer=textlit like text, but remove string literal quotes
|
||||||
|
-unlexer=codelit like code, but remove string literal quotes
|
||||||
|
-unlexer=concat remove all spaces
|
||||||
|
-unlexer=bind like identity, but bind at "&+"
|
||||||
|
|
||||||
|
</PRE>
|
||||||
|
<P></P>
|
||||||
|
<A NAME="toc67"></A>
|
||||||
<H3>Efficiency of grammars</H3>
|
<H3>Efficiency of grammars</H3>
|
||||||
<A NAME="toc69"></A>
|
<P>
|
||||||
|
Issues:
|
||||||
|
</P>
|
||||||
|
<UL>
|
||||||
|
<LI>the choice of data structures in <CODE>lincat</CODE>s
|
||||||
|
<LI>the value of the <CODE>optimize</CODE> flag
|
||||||
|
<LI>parsing efficiency: <CODE>-mcfg</CODE> vs. others
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<A NAME="toc68"></A>
|
||||||
<H3>Speech input and output</H3>
|
<H3>Speech input and output</H3>
|
||||||
|
<P>
|
||||||
|
The <CODE>speak_aloud = sa</CODE> command sends a string to the speech
|
||||||
|
synthesizer
|
||||||
|
<A HREF="http://www.speech.cs.cmu.edu/flite/doc/">Flite</A>.
|
||||||
|
It is typically used via a pipe:
|
||||||
|
</P>
|
||||||
|
<PRE>
|
||||||
|
generate_random | linearize | speak_aloud
|
||||||
|
</PRE>
|
||||||
|
<P>
|
||||||
|
The result is only satisfactory for English.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
The <CODE>speech_input = si</CODE> command receives a string from a
|
||||||
|
speech recognizer that requires the installation of
|
||||||
|
<A HREF="http://mi.eng.cam.ac.uk/~sjy/software.htm">ATK</A>.
|
||||||
|
It is typically used to pipe input to a parser:
|
||||||
|
</P>
|
||||||
|
<PRE>
|
||||||
|
speech_input -tr | parse
|
||||||
|
</PRE>
|
||||||
|
<P>
|
||||||
|
The method works only for grammars of English.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
Both Flite and ATK are freely available through the links
|
||||||
|
above, but they are not distributed together with GF.
|
||||||
|
</P>
|
||||||
|
<A NAME="toc69"></A>
|
||||||
|
<H3>Multilingual syntax editor</H3>
|
||||||
|
<P>
|
||||||
|
The
|
||||||
|
<A HREF="http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm">Editor User Manual</A>
|
||||||
|
describes the use of the editor, which works for any multilingual GF grammar.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
Here is a snapshot of the editor:
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
<IMG ALIGN="middle" SRC="../quick-editor.gif" BORDER="0" ALT="">
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
The grammars of the snapshot are from the
|
||||||
|
<A HREF="http://www.cs.chalmers.se/~aarne/GF/examples/letter">Letter grammar package</A>.
|
||||||
|
</P>
|
||||||
<A NAME="toc70"></A>
|
<A NAME="toc70"></A>
|
||||||
<H3>Communicating with GF</H3>
|
<H3>Interactive Development Environment (IDE)</H3>
|
||||||
|
<P>
|
||||||
|
Forthcoming.
|
||||||
|
</P>
|
||||||
<A NAME="toc71"></A>
|
<A NAME="toc71"></A>
|
||||||
<H3>Embedded grammars in Haskell, Java, and Prolog</H3>
|
<H3>Communicating with GF</H3>
|
||||||
|
<P>
|
||||||
|
Other processes can communicate with the GF command interpreter,
|
||||||
|
and also with the GF syntax editor.
|
||||||
|
</P>
|
||||||
<A NAME="toc72"></A>
|
<A NAME="toc72"></A>
|
||||||
|
<H3>Embedded grammars in Haskell, Java, and Prolog</H3>
|
||||||
|
<P>
|
||||||
|
GF grammars can be used as parts of programs written in the
|
||||||
|
following languages. The links give more documentation.
|
||||||
|
</P>
|
||||||
|
<UL>
|
||||||
|
<LI><A HREF="http://www.cs.chalmers.se/~bringert/gf/gf-java.html">Java</A>
|
||||||
|
<LI><A HREF="http://www.cs.chalmers.se/~aarne/GF/src/GF/Embed/EmbedAPI.hs">Haskell</A>
|
||||||
|
<LI><A HREF="http://www.cs.chalmers.se/~peb/software.html">Prolog</A>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<A NAME="toc73"></A>
|
||||||
<H3>Alternative input and output grammar formats</H3>
|
<H3>Alternative input and output grammar formats</H3>
|
||||||
|
<P>
|
||||||
|
A summary is given in the following chart of GF grammar compiler phases:
|
||||||
|
<IMG ALIGN="middle" SRC="../gf-compiler.png" BORDER="0" ALT="">
|
||||||
|
</P>
|
||||||
|
<A NAME="toc74"></A>
|
||||||
|
<H2>Case studies</H2>
|
||||||
|
<A NAME="toc75"></A>
|
||||||
|
<H3>Interfacing formal and natural languages</H3>
|
||||||
|
<P>
|
||||||
|
<A HREF="http://www.cs.chalmers.se/~krijo/thesis/thesisA4.pdf">Formal and Informal Software Specifications</A>,
|
||||||
|
PhD Thesis by
|
||||||
|
<A HREF="http://www.cs.chalmers.se/~krijo">Kristofer Johannisson</A>, is an extensive example of this.
|
||||||
|
The system is based on a multilingual grammar relating the formal language OCL with
|
||||||
|
English and German.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
A simpler example will be explained here.
|
||||||
|
</P>
|
||||||
|
|
||||||
<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
|
<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
|
||||||
<!-- cmdline: txt2tags -\-toc gf-tutorial2.txt -->
|
<!-- cmdline: txt2tags -\-toc gf-tutorial2.txt -->
|
||||||
|
|||||||
@@ -1530,13 +1530,20 @@ they can be used as arguments. For example:
|
|||||||
|
|
||||||
===Resource grammars and their reuse===
|
===Resource grammars and their reuse===
|
||||||
|
|
||||||
|
See
|
||||||
|
[resource library documentation ../../lib/resource/doc/gf-resource.html]
|
||||||
|
|
||||||
|
|
||||||
===Interfaces, instances, and functors===
|
===Interfaces, instances, and functors===
|
||||||
|
|
||||||
|
See an
|
||||||
|
[example built this way ../../examples/mp3/mp3-resource.html]
|
||||||
|
|
||||||
|
|
||||||
===Restricted inheritance and qualified opening===
|
===Restricted inheritance and qualified opening===
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
==More concepts of abstract syntax==
|
==More concepts of abstract syntax==
|
||||||
|
|
||||||
|
|
||||||
@@ -1546,14 +1553,22 @@ they can be used as arguments. For example:
|
|||||||
|
|
||||||
===Semantic definitions===
|
===Semantic definitions===
|
||||||
|
|
||||||
===Case study: grammars of formal languages===
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
==Transfer modules==
|
==Transfer modules==
|
||||||
|
|
||||||
|
Transfer means noncompositional tree-transforming operations.
|
||||||
|
The command ``apply_transfer = at`` is typically used in a pipe:
|
||||||
|
```
|
||||||
|
> p "John walks and John runs" | apply_transfer aggregate | l
|
||||||
|
John walks and runs
|
||||||
|
```
|
||||||
|
See the
|
||||||
|
[sources ../../transfer/examples/aggregation] of this example.
|
||||||
|
|
||||||
|
See the
|
||||||
|
[transfer language documentation ../transfer.html]
|
||||||
|
for more information.
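
As a rough illustration of what a noncompositional tree transformation such as aggregation does, here is a minimal sketch in Python. It is not the transfer language and is not taken from the aggregation sources: the tree constructors ``Pred``, ``ConjS`` and ``PredConjVP`` and the functions ``aggregate`` and ``linearize`` are hypothetical names chosen for this example. The transformation is noncompositional in the sense that the result for a conjunction depends on comparing its two subtrees, not only on combining their separately computed results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pred:
    """Hypothetical tree node: a sentence made of a subject and a verb."""
    subj: str
    verb: str


@dataclass(frozen=True)
class ConjS:
    """Hypothetical tree node: conjunction of two sentences."""
    left: object
    right: object


@dataclass(frozen=True)
class PredConjVP:
    """Hypothetical tree node: one subject with two conjoined verbs."""
    subj: str
    verb1: str
    verb2: str


def aggregate(tree):
    """Merge 'S and S' into 'subject V and V' when both subjects are equal."""
    if isinstance(tree, ConjS):
        left, right = aggregate(tree.left), aggregate(tree.right)
        if (isinstance(left, Pred) and isinstance(right, Pred)
                and left.subj == right.subj):
            return PredConjVP(left.subj, left.verb, right.verb)
        return ConjS(left, right)
    return tree  # other nodes are left unchanged


def linearize(tree):
    """Turn a tree of the three node types above into a string."""
    if isinstance(tree, Pred):
        return f"{tree.subj} {tree.verb}"
    if isinstance(tree, PredConjVP):
        return f"{tree.subj} {tree.verb1} and {tree.verb2}"
    if isinstance(tree, ConjS):
        return f"{linearize(tree.left)} and {linearize(tree.right)}"
    raise TypeError(f"unknown tree node: {tree!r}")


tree = ConjS(Pred("John", "walks"), Pred("John", "runs"))
print(linearize(tree))
print(linearize(aggregate(tree)))
```

With the input tree for "John walks and John runs", the two subjects match, so the conjunction is rewritten into a single predication; a conjunction with different subjects is returned unchanged.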
|
||||||
|
|
||||||
|
|
||||||
==Practical issues==
|
==Practical issues==
|
||||||
@@ -1561,18 +1576,120 @@ they can be used as arguments. For example:
|
|||||||
|
|
||||||
===Lexers and unlexers===
|
===Lexers and unlexers===
|
||||||
|
|
||||||
|
Lexers and unlexers can be chosen from
|
||||||
|
a list of predefined ones, using the flags ``-lexer`` and ``-unlexer``, either
|
||||||
|
in the grammar file or on the GF command line.
|
||||||
|
|
||||||
|
The available lexers and unlexers are listed by ``help -lexer`` and ``help -unlexer``:
|
||||||
|
```
|
||||||
|
The default is words.
|
||||||
|
-lexer=words tokens are separated by spaces or newlines
|
||||||
|
-lexer=literals like words, but GF integer and string literals recognized
|
||||||
|
-lexer=vars like words, but "x","x_...","$...$" as vars, "?..." as meta
|
||||||
|
-lexer=chars each character is a token
|
||||||
|
-lexer=code use Haskell's lex
|
||||||
|
-lexer=codevars like code, but treat unknown words as variables, ?? as meta
|
||||||
|
-lexer=text with conventions on punctuation and capital letters
|
||||||
|
-lexer=codelit like code, but treat unknown words as string literals
|
||||||
|
-lexer=textlit like text, but treat unknown words as string literals
|
||||||
|
-lexer=codeC use a C-like lexer
|
||||||
|
-lexer=ignore like literals, but ignore unknown words
|
||||||
|
-lexer=subseqs like ignore, but then try all subsequences from longest
|
||||||
|
|
||||||
|
The default is unwords.
|
||||||
|
-unlexer=unwords space-separated token list (like unwords)
|
||||||
|
-unlexer=text format as text: punctuation, capitals, paragraph <p>
|
||||||
|
-unlexer=code format as code (spacing, indentation)
|
||||||
|
-unlexer=textlit like text, but remove string literal quotes
|
||||||
|
-unlexer=codelit like code, but remove string literal quotes
|
||||||
|
-unlexer=concat remove all spaces
|
||||||
|
-unlexer=bind like identity, but bind at "&+"
|
||||||
|
|
||||||
|
```
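
To make the one-line descriptions above more concrete, the following Python sketch mimics a few of the simpler lexers and unlexers. It is only an approximation written from the help text, not GF's implementation: the function names are made up for this example, and details such as whether the ``chars`` lexer keeps whitespace characters, or how ``bind`` treats a leading ``&+``, are assumptions.

```python
def lex_words(s):
    """-lexer=words: tokens are separated by spaces or newlines."""
    return s.split()


def lex_chars(s):
    """-lexer=chars: each character is a token (whitespace kept: assumption)."""
    return list(s)


def unlex_unwords(tokens):
    """-unlexer=unwords: space-separated token list."""
    return " ".join(tokens)


def unlex_concat(tokens):
    """-unlexer=concat: remove all spaces."""
    return "".join(tokens)


def unlex_bind(tokens):
    """-unlexer=bind: glue the tokens on both sides of "&+" together."""
    out = []
    glue = False
    for tok in tokens:
        if tok == "&+":
            glue = True            # the next token attaches to the previous one
        elif glue and out:
            out[-1] += tok
            glue = False
        else:
            out.append(tok)
            glue = False
    return " ".join(out)


print(lex_words("John walks\nand runs"))
print(unlex_unwords(["John", "walks"]))
print(unlex_concat(["John", "walks"]))
print(unlex_bind(["walk", "&+", "s", "fast"]))
```

The sketch covers only lexers and unlexers whose behaviour follows directly from their description; the ``text`` and ``code`` variants depend on formatting conventions that the help text does not spell out.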
|
||||||
|
|
||||||
|
|
||||||
===Efficiency of grammars===
|
===Efficiency of grammars===
|
||||||
|
|
||||||
|
Issues:
|
||||||
|
|
||||||
|
- the choice of data structures in ``lincat``s
|
||||||
|
- the value of the ``optimize`` flag
|
||||||
|
- parsing efficiency: ``-mcfg`` vs. others
|
||||||
|
|
||||||
|
|
||||||
===Speech input and output===
|
===Speech input and output===
|
||||||
|
|
||||||
|
The ``speak_aloud = sa`` command sends a string to the speech
|
||||||
|
synthesizer
|
||||||
|
[Flite http://www.speech.cs.cmu.edu/flite/doc/].
|
||||||
|
It is typically used via a pipe:
|
||||||
|
``` generate_random | linearize | speak_aloud
|
||||||
|
The result is only satisfactory for English.
|
||||||
|
|
||||||
|
The ``speech_input = si`` command receives a string from a
|
||||||
|
speech recognizer that requires the installation of
|
||||||
|
[ATK http://mi.eng.cam.ac.uk/~sjy/software.htm].
|
||||||
|
It is typically used to pipe input to a parser:
|
||||||
|
``` speech_input -tr | parse
|
||||||
|
The method works only for grammars of English.
|
||||||
|
|
||||||
|
Both Flite and ATK are freely available through the links
|
||||||
|
above, but they are not distributed together with GF.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
===Multilingual syntax editor===
|
||||||
|
|
||||||
|
The
|
||||||
|
[Editor User Manual http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm]
|
||||||
|
describes the use of the editor, which works for any multilingual GF grammar.
|
||||||
|
|
||||||
|
Here is a snapshot of the editor:
|
||||||
|
|
||||||
|
[../quick-editor.gif]
|
||||||
|
|
||||||
|
The grammars of the snapshot are from the
|
||||||
|
[Letter grammar package http://www.cs.chalmers.se/~aarne/GF/examples/letter].
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
===Interactive Development Environment (IDE)===
|
||||||
|
|
||||||
|
Forthcoming.
|
||||||
|
|
||||||
|
|
||||||
===Communicating with GF===
|
===Communicating with GF===
|
||||||
|
|
||||||
|
Other processes can communicate with the GF command interpreter,
|
||||||
|
and also with the GF syntax editor.
|
||||||
|
|
||||||
|
|
||||||
===Embedded grammars in Haskell, Java, and Prolog===
|
===Embedded grammars in Haskell, Java, and Prolog===
|
||||||
|
|
||||||
|
GF grammars can be used as parts of programs written in the
|
||||||
|
following languages. The links give more documentation.
|
||||||
|
|
||||||
|
- [Java http://www.cs.chalmers.se/~bringert/gf/gf-java.html]
|
||||||
|
- [Haskell http://www.cs.chalmers.se/~aarne/GF/src/GF/Embed/EmbedAPI.hs]
|
||||||
|
- [Prolog http://www.cs.chalmers.se/~peb/software.html]
|
||||||
|
|
||||||
|
|
||||||
===Alternative input and output grammar formats===
|
===Alternative input and output grammar formats===
|
||||||
|
|
||||||
|
A summary is given in the following chart of GF grammar compiler phases:
|
||||||
|
[../gf-compiler.png]
|
||||||
|
|
||||||
|
|
||||||
|
==Case studies==
|
||||||
|
|
||||||
|
===Interfacing formal and natural languages===
|
||||||
|
|
||||||
|
[Formal and Informal Software Specifications http://www.cs.chalmers.se/~krijo/thesis/thesisA4.pdf],
|
||||||
|
PhD Thesis by
|
||||||
|
[Kristofer Johannisson http://www.cs.chalmers.se/~krijo], is an extensive example of this.
|
||||||
|
The system is based on a multilingual grammar relating the formal language OCL with
|
||||||
|
English and German.
|
||||||
|
|
||||||
|
A simpler example will be explained here.
|
||||||
|
|
||||||
|
|||||||
@@ -71,7 +71,7 @@ mkMorpho gr a = tcompile $ concatMap mkOne $ allItems where
|
|||||||
|
|
||||||
-- gather forms of lexical items
|
-- gather forms of lexical items
|
||||||
allLins fun@(m,f) = errVal [] $ do
|
allLins fun@(m,f) = errVal [] $ do
|
||||||
ts <- allLinsOfFun gr (CIQ a f)
|
ts <- lookupLin gr (CIQ a f) >>= comp >>= allAllLinValues
|
||||||
ss <- mapM (mapPairsM (mapPairsM (liftM wordsInTerm . comp))) ts
|
ss <- mapM (mapPairsM (mapPairsM (liftM wordsInTerm . comp))) ts
|
||||||
return [(p,s) | (p,fs) <- concat $ map snd $ concat ss, s <- fs]
|
return [(p,s) | (p,fs) <- concat $ map snd $ concat ss, s <- fs]
|
||||||
prOne (_,f) c (ps,s) = (s, [prt f +++ tagPrt c +++ unwords (map prt_ ps)])
|
prOne (_,f) c (ps,s) = (s, [prt f +++ tagPrt c +++ unwords (map prt_ ps)])
|
||||||
|
|||||||