From cbf3bd088ba8bcda7a56a3a64b05975cce68a8f2 Mon Sep 17 00:00:00 2001
From: aarne
Date: Sat, 7 Jan 2006 20:53:47 +0000
Subject: [PATCH] regex in the tutorial

---
 doc/tutorial/gf-tutorial2.html | 158 +++++++++++++++++++++++----------
 doc/tutorial/gf-tutorial2.txt  |  58 ++++++++++++
 2 files changed, 170 insertions(+), 46 deletions(-)

diff --git a/doc/tutorial/gf-tutorial2.html b/doc/tutorial/gf-tutorial2.html
index 6f8ff78f1..223d6db50 100644
--- a/doc/tutorial/gf-tutorial2.html
+++ b/doc/tutorial/gf-tutorial2.html
@@ -7,7 +7,7 @@

Grammatical Framework Tutorial

Author: Aarne Ranta <aarne (at) cs.chalmers.se>
-Last update: Wed Dec 21 10:29:13 2005
+Last update: Sat Jan 7 21:51:56 2006

@@ -92,37 +92,38 @@ Last update: Wed Dec 21 10:29:13 2005
  • Record extension and subtyping
  • Tuples and product types
  • Record and tuple patterns -
  • Prefix-dependent choices -
  • Predefined types and operations +
  • Regular expression patterns +
  • Prefix-dependent choices +
  • Predefined types and operations -
  • More features of the module system +
  • More features of the module system -
  • More concepts of abstract syntax +
  • More concepts of abstract syntax -
  • Transfer modules -
  • Practical issues +
  • Transfer modules +
  • Practical issues -
  • Case studies +
  • Case studies @@ -2036,6 +2037,71 @@ possible to write, slightly surprisingly,

    +

    Regular expression patterns

    +

    +(New since 7 January 2006.) +

    +

    +To define string operations computed at compile time, such
    +as in morphology, it is handy to use regular expression patterns:

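  • p + q : token consisting of p followed by q
  • p * : token p repeated 0 or more times (max the length of the string to be matched)
  • - p : matches anything that p does not match
  • x @ p : bind to x what p matches
  • p | q : matches what either p or q matches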

    +The last three apply to all types of patterns, the first two only to token strings.
    +Example: plural formation in Swedish 2nd declension
    +(pojke-pojkar, nyckel-nycklar, seger-segrar, bil-bilar):

    +
    +    plural2 : Str -> Str = \w -> case w of {
    +      pojk + "e"                       => pojk + "ar" ;
    +      nyck + "e" + l@("l" | "r" | "n") => nyck + l + "ar" ;
    +      bil                              => bil + "ar"
    +      } ;
    +
    +

    +Another example: English noun plural formation. +

    +
    +    plural : Str -> Str = \w -> case w of {
    +      _ + ("s" | "z" | "x" | "sh")      => w + "es" ;
    +      _ + ("a" | "o" | "u" | "e") + "y" => w + "s" ;
    +      x + "y"                           => x + "ies" ;
    +      _                                 => w + "s"
    +      } ;
    +  
    +
    +
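As a rough cross-check of the branch order, the English rules above can be transcribed into Python using the standard `re` module. This is only an illustrative sketch, not GF code: the name `plural_en` is made up here, and `re.fullmatch` stands in for GF's own pattern matching. Each `if` corresponds to one `case` branch, tried from top to bottom.

```python
import re


def plural_en(w: str) -> str:
    """Python transcription of the four GF case branches, tried in order."""
    # _ + ("s" | "z" | "x" | "sh") => w + "es"
    if re.fullmatch(r".*(s|z|x|sh)", w):
        return w + "es"
    # _ + ("a" | "o" | "u" | "e") + "y" => w + "s"
    if re.fullmatch(r".*[aoue]y", w):
        return w + "s"
    # x + "y" => x + "ies"  (x is everything before the final "y")
    m = re.fullmatch(r"(.*)y", w)
    if m:
        return m.group(1) + "ies"
    # _ => w + "s"
    return w + "s"
```

The order of the branches matters: "boy" is caught by the vowel + "y" branch and becomes "boys", so it never reaches the "ies" branch that turns "fly" into "flies".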

    +Semantics: variables are always bound to the first match, which is the first
    +in the sequence of binding lists Match p v defined as follows. In the definition,
    +p is a pattern and v is a value.

    +
    +    Match (p1|p2) v = Match p1 v ++ Match p2 v
    +    Match (p1+p2) s = [Match p1 s1 ++ Match p2 s2 | i <- [0..length s], (s1,s2) = splitAt i s]
    +    Match p*      s = Match "" s ++ Match p s ++ Match (p + p) s ++ ...
    +    Match c       v = [[]] if c == v  -- for constant and literal patterns c
    +    Match x       v = [[(x,v)]]       -- for variable patterns x
    +    Match x@p     v = [[(x,v)]] + M   if M = Match p v /= []
    +    Match p       v = [] otherwise    -- failure
    +
    +

    +Examples: +

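  • x + "e" + y matches "peter" with x = "p", y = "ter"
  • x@("foo"*) matches any token with x = ""
  • x + y@("er"*) matches "burgerer" with x = "burg", y = "erer"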
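The Match equations above can be read as a small backtracking matcher. The Python sketch below is one possible reading, not the GF compiler's implementation. It makes several assumptions: the tuple encoding of patterns and the names `match` and `first_match` are invented for illustration; the sequence rule is read as concatenating one binding list from each half of every split; and `x@p` is read as adding the binding for `x` to each binding list of `p`. Wildcards and negation are left out.

```python
def match(p, s):
    """Return every binding list with which pattern p matches string s.

    Patterns are tuples (a hypothetical encoding, not GF syntax):
      ("lit", "e")       constant string
      ("var", "x")       variable
      ("alt", p1, p2)    p1 | p2
      ("seq", p1, p2)    p1 + p2
      ("star", p)        p*
      ("as", "x", p)     x@p
    """
    kind = p[0]
    if kind == "lit":
        return [[]] if p[1] == s else []
    if kind == "var":
        return [[(p[1], s)]]
    if kind == "alt":
        return match(p[1], s) + match(p[2], s)
    if kind == "seq":
        # Try every split point from left to right, as in [0..length s].
        return [b1 + b2
                for i in range(len(s) + 1)
                for b1 in match(p[1], s[:i])
                for b2 in match(p[2], s[i:])]
    if kind == "star":
        # "", p, p+p, ... with at most len(s) repetitions.
        results = match(("lit", ""), s)
        rep = p[1]
        for _ in range(len(s)):
            results = results + match(rep, s)
            rep = ("seq", rep, p[1])
        return results
    if kind == "as":
        return [[(p[1], s)] + b for b in match(p[2], s)]
    raise ValueError(f"unknown pattern: {p!r}")


def first_match(p, s):
    """Bindings of the first match as a dict, or None if p does not match s."""
    found = match(p, s)
    return dict(found[0]) if found else None
```

Under this reading, `x + "e" + y` against "peter" yields `x = "p", y = "ter"`, and `x + y@("er"*)` against "burgerer" yields `x = "burg", y = "erer"`, because the leftmost split that succeeds is reported first.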

    Prefix-dependent choices

    The construct exemplified in @@ -2064,7 +2130,7 @@ This very example does not work in all situations: the prefix } ;

    - +

    Predefined types and operations

    GF has the following predefined categories in abstract syntax: @@ -2087,11 +2153,11 @@ they can be used as arguments. For example: -- e.g. (StreetAddress 10 "Downing Street") : Address

    - -

    More features of the module system

    -

    Interfaces, instances, and functors

    +

    More features of the module system

    +

    Interfaces, instances, and functors

    +

    Resource grammars and their reuse

    A resource grammar is a grammar built on linguistic grounds, @@ -2144,19 +2210,19 @@ The rest of the modules (black) come from the resource.

    - -

    Restricted inheritance and qualified opening

    -

    More concepts of abstract syntax

    +

    Restricted inheritance and qualified opening

    -

    Dependent types

    +

    More concepts of abstract syntax

    -

    Higher-order abstract syntax

    +

    Dependent types

    -

    Semantic definitions

    +

    Higher-order abstract syntax

    -

    List categories

    +

    Semantic definitions

    +

    List categories

    +

    Transfer modules

    Transfer means noncompositional tree-transforming operations. @@ -2175,9 +2241,9 @@ See the transfer language documentation for more information.

    - -

    Practical issues

    +

    Practical issues

    +

    Lexers and unlexers

    Lexers and unlexers can be chosen from @@ -2213,7 +2279,7 @@ Given by help -lexer, help -unlexer:

    - +

    Efficiency of grammars

    Issues: @@ -2224,7 +2290,7 @@ Issues:

  • parsing efficiency: -mcfg vs. others - +

    Speech input and output

    The speak_aloud = sa command sends a string to the speech @@ -2254,7 +2320,7 @@ The method works only for grammars of English. Both Flite and ATK are freely available through the links above, but they are not distributed together with GF.

    - +

    Multilingual syntax editor

    The @@ -2271,12 +2337,12 @@ Here is a snapshot of the editor: The grammars of the snapshot are from the Letter grammar package.

    - +

    Interactive Development Environment (IDE)

    Forthcoming.

    - +

    Communicating with GF

    Other processes can communicate with the GF command interpreter, @@ -2293,7 +2359,7 @@ Thus the most silent way to invoke GF is - +

    Embedded grammars in Haskell, Java, and Prolog

    GF grammars can be used as parts of programs written in the @@ -2305,15 +2371,15 @@ following languages. The links give more documentation.

  • Prolog - +

    Alternative input and output grammar formats

    A summary is given in the following chart of GF grammar compiler phases:

    - -

    Case studies

    +

    Case studies

    +

    Interfacing formal and natural languages

    Formal and Informal Software Specifications,
diff --git a/doc/tutorial/gf-tutorial2.txt b/doc/tutorial/gf-tutorial2.txt
index 077cb4da1..a5b262053 100644
--- a/doc/tutorial/gf-tutorial2.txt
+++ b/doc/tutorial/gf-tutorial2.txt
@@ -1733,6 +1733,64 @@ possible to write, slightly surprisingly,
   }
 ```
 
+%--!
+===Regular expression patterns===
+
+(New since 7 January 2006.)
+
+To define string operations computed at compile time, such
+as in morphology, it is handy to use regular expression patterns:
+
+
+  - //p// ``+`` //q// : token consisting of //p// followed by //q//
+  - //p// ``*`` : token //p// repeated 0 or more times
+    (max the length of the string to be matched)
+  - ``-`` //p// : matches anything that //p// does not match
+  - //x// ``@`` //p// : bind to //x// what //p// matches
+  - //p// ``|`` //q// : matches what either //p// or //q// matches
+
+
+The last three apply to all types of patterns, the first two only to token strings.
+Example: plural formation in Swedish 2nd declension
+(//pojke-pojkar, nyckel-nycklar, seger-segrar, bil-bilar//):
+```
+    plural2 : Str -> Str = \w -> case w of {
+      pojk + "e"                       => pojk + "ar" ;
+      nyck + "e" + l@("l" | "r" | "n") => nyck + l + "ar" ;
+      bil                              => bil + "ar"
+      } ;
+```
+Another example: English noun plural formation.
+```
+    plural : Str -> Str = \w -> case w of {
+      _ + ("s" | "z" | "x" | "sh")      => w + "es" ;
+      _ + ("a" | "o" | "u" | "e") + "y" => w + "s" ;
+      x + "y"                           => x + "ies" ;
+      _                                 => w + "s"
+      } ;
+
+```
+Semantics: variables are always bound to the **first match**, which is the first
+in the sequence of binding lists ``Match p v`` defined as follows. In the definition,
+``p`` is a pattern and ``v`` is a value.
+```
+    Match (p1|p2) v = Match p1 v ++ Match p2 v
+    Match (p1+p2) s = [Match p1 s1 ++ Match p2 s2 | i <- [0..length s], (s1,s2) = splitAt i s]
+    Match p*      s = Match "" s ++ Match p s ++ Match (p + p) s ++ ...
+    Match c       v = [[]] if c == v  -- for constant and literal patterns c
+    Match x       v = [[(x,v)]]       -- for variable patterns x
+    Match x@p     v = [[(x,v)]] + M   if M = Match p v /= []
+    Match p       v = [] otherwise    -- failure
+```
+Examples:
+
+- ``x + "e" + y`` matches ``"peter"`` with ``x = "p", y = "ter"``
+- ``x@("foo"*)`` matches any token with ``x = ""``
+- ``x + y@("er"*)`` matches ``"burgerer"`` with ``x = "burg", y = "erer"``
+
+
+
+
 
 %--!
 ===Prefix-dependent choices===