diff --git a/doc/gf-summerschool.html b/doc/gf-summerschool.html
index 1c6fa2d78..d5ce321ad 100644
--- a/doc/gf-summerschool.html
+++ b/doc/gf-summerschool.html
@@ -19,24 +19,25 @@ Aarne Ranta (aarne at chalmers.se)
  • Introduction
  • The GF resource grammar library
-  • The summer school
+  • The summer school
-  • Why I should participate
-  • More information
+  • Why I should participate
+  • More information
@@ -49,7 +50,7 @@ Aarne Ranta (aarne at chalmers.se)

    -red=wanted, green=exists, yellow=in-progress, solid=official-eu, dotted=non-eu
    +red=wanted, green=exists, orange=in-progress, solid=official-eu, dotted=non-eu

    Executive summary

    @@ -58,7 +59,13 @@ GF Resource Grammar Library is an open-source computational grammar resource that currently covers 12 languages. The Summer School is a part of a collaborative effort to extend the library to all of the 23 official EU languages. Also other languages
    -chosen by the participants can be covered.
    +chosen by the participants are welcome.
    +

    +

    +The missing EU languages are:
    +Czech, Dutch, Estonian, Greek, Hungarian, Irish, Latvian, Lithuanian,
    +Maltese, Portuguese, Slovak, and Slovenian. There is also more work to
    +be done on Polish and Romanian.

    The linguistic coverage of the library includes the inflectional morphology
    @@ -90,7 +97,7 @@ Chalmers University of Technology in Gothenburg, Sweden.

    -word alignment produced by GF from the resource grammar in English, Italian, Swedish, Finnish, French, and German
    +Word alignment produced by GF from the resource grammar in English, Italian, Swedish, Finnish, French, and German.

    Introduction

    @@ -174,6 +181,60 @@ programmer builds a graphical user interface (GUI) from high-level elements such buttons and menus, without having to care about pixels or geometrical forms.

    +

    Missing EU languages, by the family

    +

    +Writing a grammar for a language is usually easier if other languages
    +from the same family already have grammars. The colours have the same
    +meaning as in the diagram above.
    +

    +

    +Baltic:
    + Latvian
    + Lithuanian
    +

    +

    +Celtic:
    + Irish
    +

    +

    +Fenno-Ugric:
    + Estonian
    + Finnish
    + Hungarian
    +

    +

    +Germanic:
    + Danish
    + Dutch
    + English
    + German
    + Swedish
    +

    +

    +Hellenic:
    + Greek
    +

    +

    +Romance:
    + French
    + Italian
    + Portuguese
    + Romanian
    + Spanish
    +

    +

    +Semitic:
    + Maltese
    +

    +

    +Slavonic:
    + Bulgarian
    + Czech
    + Polish
    + Slovak
    + Slovenian
    +

    +

    Applications of the library

    In addition to translation, the library is also useful in localization,
    @@ -194,7 +255,8 @@ interlingua-based translation or localization of systems to new languages:

    -The library is also a generic linguistic resource, which can be used for tasks
    +The library is also a generic linguistic resource,
    +which can be used for tasks
    such as language teaching and information retrieval. The liberal license (LGPL) makes it usable for anyone and for any task. GF also has tools supporting the use of grammars in programs written in other
    @@ -204,7 +266,7 @@ support has also been developed for translating GF grammars to language models used in speech recognition (GSL/Nuance, HTK/ATK, SRGS, JSGF).

    - +

    The structure of the library

    The library has the following main parts:
    @@ -225,7 +287,7 @@ The library has the following main parts: The goal of the summer school is to implement, for each language, at least the first three components. The latter three are more open-ended in character.

    - +

    The summer school

    The goal of the summer school is to extend the GF resource grammar library
    @@ -239,7 +301,8 @@ The Russian implementation was made by Janna Khegai as a part of her PhD thesis; the thesis contains other material, too. The Arabic implementation was started by Ali El Dada in his Master's thesis, but the thesis does not cover the whole API. The realistic amount of work is
    -somewhere between 3 and 8 person months, but this is very much language-dependent.
    +somewhere between 3 and 8 person months,
    +but this is very much language-dependent.
    Dutch, for instance, can profit from previous implementations of German and Scandinavian languages, and will probably require less work. Latvian and Lithuanian are the first languages of the Baltic family and
    @@ -247,11 +310,11 @@ will probably require more work.

    In any case, the proposed allocation of work power is 2 participants per
    -language. They will do 2 months' worth of home work, followed
    +language. They will do 1 months' worth of home work, followed
    by 2 weeks of summer school, followed by 4 months work at home. Who are these participants?

    - +

    Selecting participants

    Persons interested to participate in the Summer School should sign up in
    @@ -261,7 +324,13 @@ the Google Group of the course, groups.google.com/group/gf-resource-school-2009/

    -The registration deadline is 15 June 2009.
    +The registration deadline is 15 June 2009.
    +

    +

    +Notice: you can sign up in the Google
    +group even if you are not planning to attend the summer school, but are
    +just interested in the topic. There will be a separate registration to the
    +school itself later.

    The participants are recommended to learn GF in advance, by self-study from the
    @@ -279,6 +348,11 @@ to the Summer School Committee by 12 May. The Committee then decides who is given a travel grant of up to 1000 EUR.

    +Notice: you can participate in the summer school without following the on-line
    +course or participating in the contest. These things are required only if you
    +want a travel grant.
    +

    +

    The summer school itself is devoted for working on resource grammars. In addition to grammar writing itself, testing and evaluation is performed. One way to do this is via adding new languages
    @@ -292,7 +366,7 @@ be published at GF website and licensed under LGPL.

    The participants are encouraged to contact each other and even work in groups.

    - +

    Who is qualified

    Writing a resource grammar implementation requires good general programming
    @@ -313,7 +387,7 @@ But it is the quality of the assignment that is assessed, not any formal requirements. The "typical participant" was described to give an idea of who is likely to succeed in this.

    - +

    Costs

    The summer school is free of charge.
    @@ -332,7 +406,7 @@ Special terms will apply to students in GSLT and NGSLT.

    - +

    Teachers

    A list of teachers will be published here later. Some of the local teachers
    @@ -356,13 +430,13 @@ we can discuss your involvement and travel arrangements. In addition to teachers, we will look for consultants who can help to assess the results for each language. Please contact us!

    - +

    The Summer School Committee

    This committee consists of a number of teachers and informants, who will select the participants. It will be selected by April 2009.

    - +

    Time and Place

    The summer school will
    @@ -383,7 +457,7 @@ Time schedule:

  • December: release of the extended Resource Grammar Library - +

    Dissemination and intellectual property

    The new resource grammars will be released under the LGPL just like
    @@ -393,7 +467,7 @@ with the copyright held by respective authors.

    The grammars will be distributed via the GF web site.

    - +

    Why I should participate

    Seven reasons:
    @@ -413,7 +487,7 @@ Seven reasons:

  • free trip and stay in Gothenburg (for travel grant students) - +

    More information

    Course Google Group
    @@ -430,7 +504,7 @@ Seven reasons:

    Resource-HOWTO document

    - +

    Contact

    Håkan Burden: burden at chalmers se
    @@ -438,7 +512,7 @@ H

    Aarne Ranta: aarne at chalmers se

    - +

    Selected publications from earlier resource grammar projects

    K. Angelov.
    @@ -449,6 +523,12 @@ LNCS/LNAI 5221, Springer, 2008.

    +B. Bringert.
    +Programming Language Techniques for Natural Language Applications.
    +Phd thesis, Computer Science, University of Gothenburg,
    +2008.
    +

    +

    A. El Dada and A. Ranta. Implementing an Open Source Arabic Resource Grammar in GF. In M. Mughazy (ed),
    @@ -479,7 +559,13 @@ July 21-22, 2007, LSA 2007 Linguistic Institute, Stanford University. 2007.

    -J Khegai.
    +K. Johannisson.
    +Formal and Informal Software Specifications.
    +Phd thesis, Computer Science, University of Gothenburg,
    +2005.
    +

    +

    +J. Khegai.
    GF parallel resource grammars and Russian. In proceedings of ACL2006 (The joint conference of the International Committee on Computational
    @@ -488,7 +574,7 @@ In proceedings of ACL2006

    J. Khegai.
    -Language engineering in Grammatical Framework (GF).
    +Language engineering in Grammatical Framework (GF).
    Phd thesis, Computer Science, Chalmers University of Technology, 2006.

    @@ -520,6 +606,18 @@ In J. Nivre, M. Dahll University of Uppsala, 2008.

    +

    +A. Ranta. Grammars as Software Libraries.
    +To appear in
    +Y. Bertot, G. Huet, J-J. Lévy, and G. Plotkin (eds.),
    +From Semantics to Computer Science,
    +Cambridge University Press, Cambridge, 2009.
    +

    +

    +A. Ranta and K. Angelov.
    +Implementing Controlled Languages in GF.
    +To appear in the proceedings of CNL 2009.
    +

diff --git a/doc/gf-summerschool.txt b/doc/gf-summerschool.txt
index c89da32c0..8a686c24d 100644
--- a/doc/gf-summerschool.txt
+++ b/doc/gf-summerschool.txt
@@ -7,13 +7,18 @@ Aarne Ranta (aarne at chalmers.se)
 %!target:html
 %!postproc(html): #BECE
 %!postproc(html): #ENCE
+%!postproc(html): #GRAY
+%!postproc(html): #EGRAY
+%!postproc(html): #RED
+%!postproc(html): #YELLOW
+%!postproc(html): #ERED
 #BECE [school-langs.png] #ENCE
-//red=wanted, green=exists, yellow=in-progress, solid=official-eu, dotted=non-eu//
+//red=wanted, green=exists, orange=in-progress, solid=official-eu, dotted=non-eu//
 ==Executive summary==
@@ -22,7 +27,12 @@ GF Resource Grammar Library is an open-source computational grammar resource that currently covers 12 languages. The Summer School is a part of a collaborative effort to extend the library to all of the 23 official EU languages. Also other languages
-chosen by the participants can be covered.
+chosen by the participants are welcome.
+
+The missing EU languages are:
+Czech, Dutch, Estonian, Greek, Hungarian, Irish, Latvian, Lithuanian,
+Maltese, Portuguese, Slovak, and Slovenian. There is also more work to
+be done on Polish and Romanian.
 The linguistic coverage of the library includes the inflectional morphology and basic syntax of each language. It can be used in GF applications
@@ -48,7 +58,7 @@ Chalmers University of Technology in Gothenburg, Sweden. [align6.png]
-//word alignment produced by GF from the resource grammar in English, Italian, Swedish, Finnish, French, and German//
+//Word alignment produced by GF from the resource grammar in English, Italian, Swedish, Finnish, French, and German.//
 ==Introduction==
@@ -125,6 +135,55 @@ programmer builds a graphical user interface (GUI) from high-level elements such buttons and menus, without having to care about pixels or geometrical forms.
+===Missing EU languages, by the family===
+
+Writing a grammar for a language is usually easier if other languages
+from the same family already have grammars. The colours have the same
+meaning as in the diagram above.
+
+Baltic:
+#RED Latvian #ERED
+#RED Lithuanian #ERED
+
+Celtic:
+#RED Irish #ERED
+
+Fenno-Ugric:
+#RED Estonian #ERED
+#GRAY Finnish #EGRAY
+#RED Hungarian #ERED
+
+Germanic:
+#GRAY Danish #EGRAY
+#RED Dutch #ERED
+#GRAY English #EGRAY
+#GRAY German #EGRAY
+#GRAY Swedish #EGRAY
+
+Hellenic:
+#RED Greek #ERED
+
+Romance:
+#GRAY French #EGRAY
+#GRAY Italian #EGRAY
+#RED Portuguese #ERED
+#YELLOW Romanian #ERED
+#GRAY Spanish #EGRAY
+
+Semitic:
+#RED Maltese #ERED
+
+Slavonic:
+#GRAY Bulgarian #EGRAY
+#RED Czech #ERED
+#YELLOW Polish #ERED
+#RED Slovak #ERED
+#RED Slovenian #ERED
+
+
+
+
+
 ===Applications of the library===
@@ -143,7 +202,8 @@ interlingua-based translation or localization of systems to new languages: to six languages
-The library is also a generic linguistic resource, which can be used for tasks
+The library is also a generic **linguistic resource**,
+which can be used for tasks
 such as language teaching and information retrieval. The liberal license (LGPL) makes it usable for anyone and for any task. GF also has tools supporting the use of grammars in programs written in other
@@ -185,14 +245,15 @@ The Russian implementation was made by Janna Khegai as a part of her PhD thesis; the thesis contains other material, too. The Arabic implementation was started by Ali El Dada in his Master's thesis, but the thesis does not cover the whole API. The realistic amount of work is
-somewhere between 3 and 8 person months, but this is very much language-dependent.
+somewhere between 3 and 8 person months,
+but this is very much language-dependent.
 Dutch, for instance, can profit from previous implementations of German and Scandinavian languages, and will probably require less work. Latvian and Lithuanian are the first languages of the Baltic family and will probably require more work. In any case, the proposed allocation of work power is 2 participants per
-language. They will do 2 months' worth of home work, followed
+language. They will do 1 months' worth of home work, followed
 by 2 weeks of summer school, followed by 4 months work at home. Who are these participants?
@@ -204,7 +265,12 @@ the **Google Group** of the course, [``groups.google.com/group/gf-resource-school-2009/`` http://groups.google.com/group/gf-resource-school-2009/]
-The registration deadline is 15 June 2009.
+The registration deadline is 15 June 2009.
+
+Notice: you can sign up in the Google
+group even if you are not planning to attend the summer school, but are
+just interested in the topic. There will be a separate registration to the
+school itself later.
 The participants are recommended to learn GF in advance, by self-study from the [tutorial http://digitalgrammars.com/gf/doc/gf-tutorial.html].
@@ -219,6 +285,10 @@ their sample resource grammar fragment to the Summer School Committee by 12 May. The Committee then decides who is given a travel grant of up to 1000 EUR.
+Notice: you can participate in the summer school without following the on-line
+course or participating in the contest. These things are required only if you
+want a travel grant.
+
 The summer school itself is devoted for working on resource grammars. In addition to grammar writing itself, testing and evaluation is performed. One way to do this is via adding new languages
@@ -370,6 +440,11 @@ In B. Nordstr LNCS/LNAI 5221, Springer, 2008.
+B. Bringert.
+//Programming Language Techniques for Natural Language Applications//.
+Phd thesis, Computer Science, University of Gothenburg,
+2008.
+
 A. El Dada and A. Ranta. Implementing an Open Source Arabic Resource Grammar in GF. In M. Mughazy (ed),
@@ -396,7 +471,12 @@ Urdu Morphology, Orthography and Lexicon Extraction. July 21-22, 2007, LSA 2007 Linguistic Institute, Stanford University. 2007.
-J Khegai.
+K. Johannisson.
+//Formal and Informal Software Specifications.//
+Phd thesis, Computer Science, University of Gothenburg,
+2005.
+
+J. Khegai.
 GF parallel resource grammars and Russian. In proceedings of ACL2006 (The joint conference of the International Committee on Computational
@@ -404,7 +484,7 @@ In proceedings of ACL2006 Sydney, Australia, July 2006. J. Khegai.
-Language engineering in Grammatical Framework (GF).
+//Language engineering in Grammatical Framework (GF)//.
 Phd thesis, Computer Science, Chalmers University of Technology, 2006.
@@ -432,3 +512,13 @@ In J. Nivre, M. Dahll University of Uppsala, 2008.
+A. Ranta. Grammars as Software Libraries.
+To appear in
+Y. Bertot, G. Huet, J-J. Lévy, and G. Plotkin (eds.),
+//From Semantics to Computer Science//,
+Cambridge University Press, Cambridge, 2009.
+
+A. Ranta and K. Angelov.
+Implementing Controlled Languages in GF.
+To appear in the proceedings of //CNL 2009//.
+
diff --git a/doc/school-langs.dot b/doc/school-langs.dot
index f35284951..208ffb157 100644
--- a/doc/school-langs.dot
+++ b/doc/school-langs.dot
@@ -54,7 +54,7 @@ overlap = scale ;
 "16" [label = "Dutch", style = "solid", shape = "ellipse", color = "red"] ;
 "Abs" -- "16" [style = "solid"];
-"17" [label = "Polish", style = "solid", shape = "ellipse", color = "yellow"] ;
+"17" [label = "Polish", style = "solid", shape = "ellipse", color = "orange"] ;
 "Abs" -- "17" [style = "solid"];
 "18" [label = "Portuguese", style = "solid", shape = "ellipse", color = "red"] ;
@@ -66,7 +66,7 @@ overlap = scale ;
 "20" [label = "Slovene", style = "solid", shape = "ellipse", color = "red"] ;
 "Abs" -- "20" [style = "solid"];
-"21" [label = "Romanian", style = "solid", shape = "ellipse", color = "yellow"] ;
+"21" [label = "Romanian", style = "solid", shape = "ellipse", color = "orange"] ;
 "Abs" -- "21" [style = "solid"];
 "22" [label = "Finnish", style = "solid", shape = "ellipse", color = "green"] ;
@@ -87,14 +87,18 @@ overlap = scale ;
 "27" [label = "Interlingua", style = "dotted", shape = "ellipse", color = "green"] ;
 "Abs" -- "27" [style = "solid"];
-"28" [label = "Latin", style = "dotted", shape = "ellipse", color = "yellow"] ;
+"28" [label = "Latin", style = "dotted", shape = "ellipse", color = "orange"] ;
 "Abs" -- "28" [style = "solid"];
-"29" [label = "Turkish", style = "dotted", shape = "ellipse", color = "yellow"] ;
+"29" [label = "Turkish", style = "dotted", shape = "ellipse", color = "orange"] ;
 "Abs" -- "29" [style = "solid"];
-"30" [label = "Hindi", style = "dotted", shape = "ellipse", color = "yellow"] ;
+"30" [label = "Hindi", style = "dotted", shape = "ellipse", color = "orange"] ;
 "Abs" -- "30" [style = "solid"];
-"31" [label = "Thai", style = "dotted", shape = "ellipse", color = "yellow"] ;
+"31" [label = "Thai", style = "dotted", shape = "ellipse", color = "orange"] ;
 "Abs" -- "31" [style = "solid"];
+"32" [label = "Urdu", style = "dotted", shape = "ellipse", color = "orange"] ;
+"Abs" -- "32" [style = "solid"];
+"33" [label = "Telugu", style = "dotted", shape = "ellipse", color = "red"] ;
+"Abs" -- "33" [style = "solid"];
 }
diff --git a/doc/school-langs.png b/doc/school-langs.png
index 03373d7b5..96e2160c8 100644
Binary files a/doc/school-langs.png and b/doc/school-langs.png differ