end of tutorial sketched

2005-12-17 22:21:23 +00:00
parent 6510f3c115
commit e4314a739d
3 changed files with 293 additions and 23 deletions
@@ -7,7 +7,7 @@
 <P ALIGN="center"><CENTER><H1>Grammatical Framework Tutorial</H1>
 <FONT SIZE="4">
 <I>Author: Aarne Ranta &lt;aarne (at) cs.chalmers.se&gt;</I><BR>
-Last update: Sat Dec 17 21:42:39 2005
+Last update: Sat Dec 17 23:19:34 2005
 </FONT></CENTER>

 <P></P>
@@ -98,17 +98,22 @@ Last update: Sat Dec 17 21:42:39 2005
      <LI><A HREF="#toc61">Dependent types</A>
      <LI><A HREF="#toc62">Higher-order abstract syntax</A>
      <LI><A HREF="#toc63">Semantic definitions</A>
-      <LI><A HREF="#toc64">Case study: grammars of formal languages</A>
      </UL>
-    <LI><A HREF="#toc65">Transfer modules</A>
-    <LI><A HREF="#toc66">Practical issues</A>
+    <LI><A HREF="#toc64">Transfer modules</A>
+    <LI><A HREF="#toc65">Practical issues</A>
      <UL>
-      <LI><A HREF="#toc67">Lexers and unlexers</A>
-      <LI><A HREF="#toc68">Efficiency of grammars</A>
-      <LI><A HREF="#toc69">Speech input and output</A>
-      <LI><A HREF="#toc70">Communicating with GF</A>
-      <LI><A HREF="#toc71">Embedded grammars in Haskell, Java, and Prolog</A>
-      <LI><A HREF="#toc72">Alternative input and output grammar formats</A>
+      <LI><A HREF="#toc66">Lexers and unlexers</A>
+      <LI><A HREF="#toc67">Efficiency of grammars</A>
+      <LI><A HREF="#toc68">Speech input and output</A>
+      <LI><A HREF="#toc69">Multilingual syntax editor</A>
+      <LI><A HREF="#toc70">Interactive Development Environment (IDE)</A>
+      <LI><A HREF="#toc71">Communicating with GF</A>
+      <LI><A HREF="#toc72">Embedded grammars in Haskell, Java, and Prolog</A>
+      <LI><A HREF="#toc73">Alternative input and output grammar formats</A>
+      </UL>
+    <LI><A HREF="#toc74">Case studies</A>
+      <UL>
+      <LI><A HREF="#toc75">Interfacing formal and natural languages</A>
      </UL>
    </UL>

@@ -1779,8 +1784,16 @@ they can be used as arguments. For example:
 <H2>More features of the module system</H2>
 <A NAME="toc57"></A>
 <H3>Resource grammars and their reuse</H3>
+<P>
+See the
+<A HREF="../../lib/resource/doc/gf-resource.html">resource library documentation</A>
+</P>
 <A NAME="toc58"></A>
 <H3>Interfaces, instances, and functors</H3>
+<P>
+See an
+<A HREF="../../examples/mp3/mp3-resource.html">example built this way</A>
+</P>
 <A NAME="toc59"></A>
 <H3>Restricted inheritance and qualified opening</H3>
 <A NAME="toc60"></A>
@@ -1792,23 +1805,163 @@ they can be used as arguments. For example:
 <A NAME="toc63"></A>
 <H3>Semantic definitions</H3>
 <A NAME="toc64"></A>
-<H3>Case study: grammars of formal languages</H3>
-<A NAME="toc65"></A>
 <H2>Transfer modules</H2>
-<A NAME="toc66"></A>
+<P>
+Transfer means noncompositional tree-transforming operations.
+The command <CODE>apply_transfer = at</CODE> is typically used in a pipe:
+</P>
+<PRE>
+    &gt; p "John walks and John runs" | apply_transfer aggregate | l
+    John walks and runs
+</PRE>
+<P>
+See the
+<A HREF="../../transfer/examples/aggregation">sources</A> of this example.
+</P>
+<P>
+See the
+<A HREF="../transfer.html">transfer language documentation</A>
+for more information.
+</P>
+<A NAME="toc65"></A>
 <H2>Practical issues</H2>
-<A NAME="toc67"></A>
+<A NAME="toc66"></A>
 <H3>Lexers and unlexers</H3>
-<A NAME="toc68"></A>
+<P>
+Lexers and unlexers can be chosen from
+a list of predefined ones, using the flags <CODE>-lexer</CODE> and <CODE>-unlexer</CODE>, either
+in the grammar file or on the GF command line.
+</P>
+<P>
+The available values are listed by <CODE>help -lexer</CODE> and <CODE>help -unlexer</CODE>:
+</P>
+<PRE>
+      The default is words.
+      -lexer=words         tokens are separated by spaces or newlines
+      -lexer=literals      like words, but GF integer and string literals recognized
+      -lexer=vars          like words, but "x","x_...","$...$" as vars, "?..." as meta
+      -lexer=chars         each character is a token
+      -lexer=code          use Haskell's lex
+      -lexer=codevars      like code, but treat unknown words as variables, ?? as meta
+      -lexer=text          with conventions on punctuation and capital letters
+      -lexer=codelit       like code, but treat unknown words as string literals
+      -lexer=textlit       like text, but treat unknown words as string literals
+      -lexer=codeC         use a C-like lexer
+      -lexer=ignore        like literals, but ignore unknown words
+      -lexer=subseqs       like ignore, but then try all subsequences from longest
+  
+      The default is unwords.
+      -unlexer=unwords     space-separated token list (like unwords)
+      -unlexer=text        format as text: punctuation, capitals, paragraph &lt;p&gt;
+      -unlexer=code        format as code (spacing, indentation)
+      -unlexer=textlit     like text, but remove string literal quotes
+      -unlexer=codelit     like code, but remove string literal quotes
+      -unlexer=concat      remove all spaces
+      -unlexer=bind        like identity, but bind at "&amp;+"
+  
+</PRE>
+<P></P>
+<A NAME="toc67"></A>
 <H3>Efficiency of grammars</H3>
-<A NAME="toc69"></A>
+<P>
+Issues:
+</P>
+<UL>
+<LI>the choice of data structures in <CODE>lincat</CODE>s
+<LI>the value of the <CODE>optimize</CODE> flag 
+<LI>parsing efficiency: <CODE>-mcfg</CODE> vs. others
+</UL>
+
+<A NAME="toc68"></A>
 <H3>Speech input and output</H3>
+<P>
+The <CODE>speak_aloud = sa</CODE> command sends a string to the speech
+synthesizer 
+<A HREF="http://www.speech.cs.cmu.edu/flite/doc/">Flite</A>.
+It is typically used via a pipe:
+</P>
+<PRE>
+   generate_random | linearize | speak_aloud
+</PRE>
+<P>
+The result is only satisfactory for English.
+</P>
+<P>
+The <CODE>speech_input = si</CODE> command receives a string from a
+speech recognizer that requires the installation of
+<A HREF="http://mi.eng.cam.ac.uk/~sjy/software.htm">ATK</A>.
+It is typically used to pipe input to a parser:
+</P>
+<PRE>
+   speech_input -tr | parse
+</PRE>
+<P>
+The method works only for grammars of English.
+</P>
+<P>
+Both Flite and ATK are freely available through the links
+above, but they are not distributed together with GF.
+</P>
+<A NAME="toc69"></A>
+<H3>Multilingual syntax editor</H3>
+<P>
+The 
+<A HREF="http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm">Editor User Manual</A>
+describes the use of the editor, which works for any multilingual GF grammar.
+</P>
+<P>
+Here is a snapshot of the editor:
+</P>
+<P>
+<IMG ALIGN="middle" SRC="../quick-editor.gif" BORDER="0" ALT="">
+</P>
+<P>
+The grammars of the snapshot are from the
+<A HREF="http://www.cs.chalmers.se/~aarne/GF/examples/letter">Letter grammar package</A>.
+</P>
 <A NAME="toc70"></A>
-<H3>Communicating with GF</H3>
+<H3>Interactive Development Environment (IDE)</H3>
+<P>
+Forthcoming.
+</P>
 <A NAME="toc71"></A>
-<H3>Embedded grammars in Haskell, Java, and Prolog</H3>
+<H3>Communicating with GF</H3>
+<P>
+Other processes can communicate with the GF command interpreter,
+and also with the GF syntax editor.
+</P>
 <A NAME="toc72"></A>
+<H3>Embedded grammars in Haskell, Java, and Prolog</H3>
+<P>
+GF grammars can be used as parts of programs written in the
+following languages. The links give more documentation.
+</P>
+<UL>
+<LI><A HREF="http://www.cs.chalmers.se/~bringert/gf/gf-java.html">Java</A>
+<LI><A HREF="http://www.cs.chalmers.se/~aarne/GF/src/GF/Embed/EmbedAPI.hs">Haskell</A>
+<LI><A HREF="http://www.cs.chalmers.se/~peb/software.html">Prolog</A>
+</UL>
+
+<A NAME="toc73"></A>
 <H3>Alternative input and output grammar formats</H3>
+<P>
+A summary is given in the following chart of GF grammar compiler phases:
+<IMG ALIGN="middle" SRC="../gf-compiler.png" BORDER="0" ALT="">
+</P>
+<A NAME="toc74"></A>
+<H2>Case studies</H2>
+<A NAME="toc75"></A>
+<H3>Interfacing formal and natural languages</H3>
+<P>
+<A HREF="http://www.cs.chalmers.se/~krijo/thesis/thesisA4.pdf">Formal and Informal Software Specifications</A>,
+PhD Thesis by
+<A HREF="http://www.cs.chalmers.se/~krijo">Kristofer Johannisson</A>, is an extensive example of this.
+The system is based on a multilingual grammar relating the formal language OCL with
+English and German.
+</P>
+<P>
+A simpler example will be explained here.
+</P>

 <!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
 <!-- cmdline: txt2tags -\-toc gf-tutorial2.txt -->
@@ -1530,13 +1530,20 @@ they can be used as arguments. For example:

 ===Resource grammars and their reuse===

+See the
+[resource library documentation  ../../lib/resource/doc/gf-resource.html]
+

 ===Interfaces, instances, and functors===

+See an
+[example built this way ../../examples/mp3/mp3-resource.html]
+

 ===Restricted inheritance and qualified opening===


+
 ==More concepts of abstract syntax==


@@ -1546,14 +1553,22 @@ they can be used as arguments. For example:

 ===Semantic definitions===

-===Case study: grammars of formal languages===
-
-
-


 ==Transfer modules==

+Transfer means noncompositional tree-transforming operations.
+The command ``apply_transfer = at`` is typically used in a pipe:
+```
+  > p "John walks and John runs" | apply_transfer aggregate | l
+  John walks and runs
+```
+See the
+[sources ../../transfer/examples/aggregation] of this example.
+
+See the
+[transfer language documentation  ../transfer.html]
+for more information.


 ==Practical issues==
@@ -1561,18 +1576,120 @@ they can be used as arguments. For example:

 ===Lexers and unlexers===

+Lexers and unlexers can be chosen from
+a list of predefined ones, using the flags ``-lexer`` and ``-unlexer``, either
+in the grammar file or on the GF command line.
+
+The available values are listed by ``help -lexer`` and ``help -unlexer``:
+```
+    The default is words.
+    -lexer=words         tokens are separated by spaces or newlines
+    -lexer=literals      like words, but GF integer and string literals recognized
+    -lexer=vars          like words, but "x","x_...","$...$" as vars, "?..." as meta
+    -lexer=chars         each character is a token
+    -lexer=code          use Haskell's lex
+    -lexer=codevars      like code, but treat unknown words as variables, ?? as meta
+    -lexer=text          with conventions on punctuation and capital letters
+    -lexer=codelit       like code, but treat unknown words as string literals
+    -lexer=textlit       like text, but treat unknown words as string literals
+    -lexer=codeC         use a C-like lexer
+    -lexer=ignore        like literals, but ignore unknown words
+    -lexer=subseqs       like ignore, but then try all subsequences from longest
+
+    The default is unwords.
+    -unlexer=unwords     space-separated token list (like unwords)
+    -unlexer=text        format as text: punctuation, capitals, paragraph <p>
+    -unlexer=code        format as code (spacing, indentation)
+    -unlexer=textlit     like text, but remove string literal quotes
+    -unlexer=codelit     like code, but remove string literal quotes
+    -unlexer=concat      remove all spaces
+    -unlexer=bind        like identity, but bind at "&+"
+
+```
+

 ===Efficiency of grammars===

+Issues:
+
+- the choice of data structures in ``lincat``s
+- the value of the ``optimize`` flag 
+- parsing efficiency: ``-mcfg`` vs. others
+

 ===Speech input and output===

+The ``speak_aloud = sa`` command sends a string to the speech
+synthesizer 
+[Flite http://www.speech.cs.cmu.edu/flite/doc/].
+It is typically used via a pipe:
+```  generate_random | linearize | speak_aloud
+The result is only satisfactory for English.
+
+The ``speech_input = si`` command receives a string from a
+speech recognizer that requires the installation of
+[ATK http://mi.eng.cam.ac.uk/~sjy/software.htm].
+It is typically used to pipe input to a parser:
+```  speech_input -tr | parse
+The method works only for grammars of English.
+
+Both Flite and ATK are freely available through the links
+above, but they are not distributed together with GF.
+
+
+
+
+===Multilingual syntax editor===
+
+The 
+[Editor User Manual http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm]
+describes the use of the editor, which works for any multilingual GF grammar.
+
+Here is a snapshot of the editor:
+
+[../quick-editor.gif]
+ 
+The grammars of the snapshot are from the
+[Letter grammar package http://www.cs.chalmers.se/~aarne/GF/examples/letter].
+
+
+
+===Interactive Development Environment (IDE)===
+
+Forthcoming.
+

 ===Communicating with GF===

+Other processes can communicate with the GF command interpreter,
+and also with the GF syntax editor.
+

 ===Embedded grammars in Haskell, Java, and Prolog===

+GF grammars can be used as parts of programs written in the
+following languages. The links give more documentation.
+
+- [Java http://www.cs.chalmers.se/~bringert/gf/gf-java.html]
+- [Haskell http://www.cs.chalmers.se/~aarne/GF/src/GF/Embed/EmbedAPI.hs]
+- [Prolog http://www.cs.chalmers.se/~peb/software.html]
+

 ===Alternative input and output grammar formats===

+A summary is given in the following chart of GF grammar compiler phases:
+[../gf-compiler.png]
+
+
+==Case studies==
+
+===Interfacing formal and natural languages===
+
+[Formal and Informal Software Specifications http://www.cs.chalmers.se/~krijo/thesis/thesisA4.pdf],
+PhD Thesis by
+[Kristofer Johannisson http://www.cs.chalmers.se/~krijo], is an extensive example of this.
+The system is based on a multilingual grammar relating the formal language OCL with
+English and German.
+
+A simpler example will be explained here.
+
@@ -71,7 +71,7 @@ mkMorpho gr a = tcompile $ concatMap mkOne $ allItems where
  
  -- gather forms of lexical items
  allLins fun@(m,f) = errVal [] $ do
-    ts <- allLinsOfFun gr (CIQ a f)  
+    ts <- lookupLin gr (CIQ a f) >>= comp >>= allAllLinValues  
    ss <- mapM (mapPairsM (mapPairsM (liftM wordsInTerm . comp))) ts
    return [(p,s) | (p,fs) <- concat $ map snd $ concat ss, s <- fs]
  prOne (_,f) c (ps,s) = (s, [prt f +++ tagPrt c +++ unwords (map prt_ ps)])
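
Note on the last hunk: the single call to `allLinsOfFun` is replaced by three steps chained with `>>=` (look up the linearization, compute it, enumerate all its values), so a failure in any step is caught by the surrounding `errVal []`. The sketch below only illustrates that chaining pattern. It is not GF code: `Err` is modelled as `Either String`, and `Term`, `demoGrammar`, and every function with a `Sketch` suffix are hypothetical stand-ins whose types are assumptions, not the real signatures of `lookupLin`, `comp`, or `allAllLinValues`.

```haskell
module Main where

import qualified Data.Map as Map

-- Assumption: the error monad is modelled as Either String.
type Err = Either String

-- Hypothetical toy term language: a string or a set of alternatives.
data Term = Str String | Variants [Term]
  deriving (Show, Eq)

-- Hypothetical toy grammar: function name -> linearization term.
type Grammar = Map.Map String Term

demoGrammar :: Grammar
demoGrammar = Map.fromList
  [ ("walk_V", Variants [Str "walk", Variants [Str "walks", Str "walked"]]) ]

-- Step 1 (stand-in for a lookup): fails if the function is unknown.
lookupLinSketch :: Grammar -> String -> Err Term
lookupLinSketch gr f =
  maybe (Left ("unknown function " ++ f)) Right (Map.lookup f gr)

-- Step 2 (stand-in for computation): flattens nested alternatives,
-- and fails on an empty set of alternatives.
compSketch :: Term -> Err Term
compSketch (Str s)       = Right (Str s)
compSketch (Variants []) = Left "empty variants"
compSketch (Variants ts) = Variants . concatMap flatten <$> mapM compSketch ts
  where
    flatten (Variants us) = us
    flatten t             = [t]

-- Step 3 (stand-in for value enumeration): collects every string.
allValuesSketch :: Term -> Err [String]
allValuesSketch (Str s)       = Right [s]
allValuesSketch (Variants ts) = concat <$> mapM allValuesSketch ts

-- The three steps chained with >>=; the first Left short-circuits the rest.
allLinsSketch :: Grammar -> String -> Err [String]
allLinsSketch gr f = lookupLinSketch gr f >>= compSketch >>= allValuesSketch

-- Mimics how `errVal []` is used in the hunk: return a default on failure.
errValSketch :: a -> Err a -> a
errValSketch d = either (const d) id

main :: IO ()
main = do
  print (errValSketch [] (allLinsSketch demoGrammar "walk_V"))
  print (errValSketch [] (allLinsSketch demoGrammar "run_V"))
```

With the toy grammar above, the first call yields the three forms of `walk_V`, and the second falls back to the empty list because the lookup fails before the later steps run.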