mirror of
https://github.com/GrammaticalFramework/gf-core.git
synced 2026-04-13 14:59:32 -06:00
new school web page
@@ -3,21 +3,56 @@
|
||||
<HEAD>
|
||||
<META NAME="generator" CONTENT="http://txt2tags.sf.net">
|
||||
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
|
||||
<TITLE>European Resource Grammar Summer School</TITLE>
|
||||
<TITLE>GF Resource Grammar Summer School</TITLE>
|
||||
</HEAD><BODY BGCOLOR="white" TEXT="black">
|
||||
<P ALIGN="center"><CENTER><H1>European Resource Grammar Summer School</H1>
|
||||
<P ALIGN="center"><CENTER><H1>GF Resource Grammar Summer School</H1>
|
||||
<FONT SIZE="4">
|
||||
<I>Gothenburg, 17-28 August 2009</I><BR>
|
||||
Aarne Ranta (aarne at chalmers.se)
|
||||
</FONT></CENTER>
|
||||
|
||||
<P></P>
|
||||
<HR NOSHADE SIZE=1>
|
||||
<P></P>
|
||||
<UL>
|
||||
<LI><A HREF="#toc1">Executive summary</A>
|
||||
<LI><A HREF="#toc2">Introduction</A>
|
||||
<LI><A HREF="#toc3">The GF resource grammar library</A>
|
||||
<UL>
|
||||
<LI><A HREF="#toc4">Applications of the library</A>
|
||||
<LI><A HREF="#toc5">The structure of the library</A>
|
||||
</UL>
|
||||
<LI><A HREF="#toc6">The summer school</A>
|
||||
<UL>
|
||||
<LI><A HREF="#toc7">Selecting participants</A>
|
||||
<LI><A HREF="#toc8">Who is qualified</A>
|
||||
<LI><A HREF="#toc9">Costs</A>
|
||||
<LI><A HREF="#toc10">Teachers</A>
|
||||
<LI><A HREF="#toc11">The Summer School Committee</A>
|
||||
<LI><A HREF="#toc12">Time and Place</A>
|
||||
<LI><A HREF="#toc13">Dissemination and intellectual property</A>
|
||||
</UL>
|
||||
<LI><A HREF="#toc14">Why I should participate</A>
|
||||
<LI><A HREF="#toc15">More information</A>
|
||||
<UL>
|
||||
<LI><A HREF="#toc16">Contact</A>
|
||||
<LI><A HREF="#toc17">Selected publications from earlier resource grammar projects</A>
|
||||
</UL>
|
||||
</UL>
|
||||
|
||||
<P></P>
|
||||
<HR NOSHADE SIZE=1>
|
||||
<P></P>
|
||||
<P>
|
||||
<I>preliminary version, 17 November 2008</I>
|
||||
<center>
|
||||
<IMG ALIGN="middle" SRC="school-langs.png" BORDER="0" ALT="">
|
||||
</center>
|
||||
</P>
|
||||
<P>
|
||||
<IMG ALIGN="middle" SRC="eu-langs.png" BORDER="0" ALT="">
|
||||
<I>red=wanted, green=exists, yellow=in-progress, solid=official-eu, dotted=non-eu</I>
|
||||
</P>
|
||||
<H3>Executive summary</H3>
|
||||
<A NAME="toc1"></A>
|
||||
<H2>Executive summary</H2>
|
||||
<P>
|
||||
We plan to organize a summer school with the goal of implementing the GF
|
||||
resource grammar library for 15 new languages, so that the library will
|
||||
@@ -32,91 +67,76 @@ and also ported to other formats. The library is licensed under LGPL.
|
||||
</P>
|
||||
<P>
|
||||
Each language is implemented by one or two students working together.
|
||||
Travel grants will be available for students selected on the basis of
|
||||
Travel grants will be available for some students selected on the basis of
|
||||
pre-conference assignments.
|
||||
</P>
|
||||
<P>
|
||||
The official announcement will be in January 2009, and the summer school
|
||||
itself on 17-28 August 2009, at the campus of Chalmers University of
|
||||
Technology in Gothenburg, Sweden.
|
||||
The summer school will be held on 17-28 August 2009, at the campus of
|
||||
Chalmers University of Technology in Gothenburg, Sweden.
|
||||
</P>
|
||||
<A NAME="toc2"></A>
|
||||
<H2>Introduction</H2>
|
||||
<P>
|
||||
Since 2007, the EU-27 has had 23 official languages, listed in the diagram at the top of this
|
||||
document.
|
||||
There is a growing need for translation between
|
||||
these languages. The traditional language-to-language method requires 23*22 = 506
|
||||
translators (humans or computer programs) to cover all possible translation needs.
|
||||
</P>
|
||||
<P>
|
||||
An alternative to language-to-language translation is the use of an <B>interlingua</B>:
|
||||
a language-independent representation such that all translation problems can
|
||||
be reduced to translating to and from the interlingua. With 23 languages,
|
||||
only 2*23 = 46 translators are needed.
|
||||
</P>
|
||||
<P>
|
||||
Interlingua sounds too good to be true. In a sense, it is. All attempts to
|
||||
create an interlingua that would solve all translation problems have failed.
|
||||
However, interlinguas for restricted applications have shown more
|
||||
success. For instance, mathematical texts and weather reports can be translated
|
||||
by using interlinguas tailor-made for the domains of mathematics and weather reports,
|
||||
respectively.
|
||||
</P>
|
||||
<P>
|
||||
What is required of an interlingua is
|
||||
</P>
|
||||
<UL>
|
||||
<LI>semantic accuracy: correspondence to what you want to say in the application
|
||||
<LI>language-independence: abstraction from individual languages
|
||||
</UL>
|
||||
|
||||
<P>
|
||||
Thus, for instance, an interlingua for mathematical texts may be based on
|
||||
mathematical logic, which at the same time gives semantic accuracy and
|
||||
language independence. In other domains, something other than mathematical
|
||||
logic may be needed; the <B>ontologies</B> defined within the semantic
|
||||
web technology are often good starting points for interlinguas.
|
||||
</P>
|
||||
<H2>GF: a framework for multilingual grammars</H2>
|
||||
<P>
|
||||
The interlingua is just one part of a translation system. We also need
|
||||
the mappings between the interlingua and the involved languages. As the
|
||||
number of languages increases, this part grows while the interlingua remains
|
||||
constant.
|
||||
document. There is a growing need for linguistic resources for these
|
||||
languages, to help in tasks such as translation and information retrieval.
|
||||
These resources should be <B>portable</B> and <B>freely accessible</B>.
|
||||
Languages marked in red in the diagram are of particular interest for
|
||||
the summer school, since they are those on which the effort will be concentrated.
|
||||
</P>
|
||||
<P>
|
||||
GF (Grammatical Framework,
|
||||
<A HREF="http://digitalgrammars.com/gf"><CODE>digitalgrammars.com/gf</CODE></A>)
|
||||
is a programming language designed to support interlingua-based translation.
|
||||
A "program" in GF is a <B>multilingual grammar</B>, which consists of an
|
||||
<B>abstract syntax</B> and a set of <B>concrete syntaxes</B>. A concrete
|
||||
syntax is a mapping from the abstract syntax to a particular language.
|
||||
These mappings are <B>reversible</B>, which means that they can be used for
|
||||
translating in both directions. This means that creating an interlingua-based
|
||||
translator for 23 languages just requires 1 + 23 = 24 grammar modules (the abstract
|
||||
syntax and the concrete syntaxes).
|
||||
is a <B>functional programming language</B> designed for writing natural
|
||||
language grammars. It provides an efficient platform for this task, due to
|
||||
its modern characteristics:
|
||||
</P>
|
||||
<UL>
|
||||
<LI>It is a functional programming language, similar to Haskell and ML.
|
||||
<LI>It has a static type system and type checker.
|
||||
<LI>It has a powerful module system supporting separate compilation
|
||||
and data abstraction.
|
||||
<LI>It has an optimizing compiler to <B>Portable Grammar Format</B> (PGF).
|
||||
<LI>PGF can be further compiled to other formats, such as JavaScript and
|
||||
speech recognition language models.
|
||||
<LI>GF has a <B>resource grammar library</B> giving access to the morphology and
|
||||
basic syntax of 12 languages.
|
||||
</UL>
|
||||
|
||||
<P>
|
||||
In addition to "ordinary" grammars for single languages, GF
|
||||
supports <B>multilingual grammars</B>. A multilingual GF grammar consists of an
|
||||
<B>abstract syntax</B> and a set of <B>concrete syntaxes</B>.
|
||||
An abstract syntax is a system of <B>trees</B>, serving as a semantic
|
||||
model or an ontology. A concrete syntax is a mapping from abstract syntax
|
||||
trees to strings of a particular language.
|
||||
</P>
|
||||
<P>
|
||||
The diagram at the top of this document shows an interlingua
|
||||
system covering the 23 EU languages.
|
||||
Languages marked in
|
||||
red are of particular interest for the summer school, since they are those
|
||||
on which the effort will be concentrated.
|
||||
These mappings defined in concrete syntax are <B>reversible</B>: they
|
||||
can be used both for <B>generating</B> strings from trees, and for
|
||||
<B>parsing</B> strings into trees. Combinations of generation and
|
||||
parsing can be used for <B>translation</B>, where the abstract
|
||||
syntax works as an <B>interlingua</B>. Thus GF has been used as a
|
||||
framework for building translation systems in several areas
|
||||
of application and large sets of languages.
|
||||
</P>
|
||||
<A NAME="toc3"></A>
|
||||
<H2>The GF resource grammar library</H2>
|
||||
<P>
|
||||
The GF resource grammar library is a set of grammars used as libraries when
|
||||
building interlingua-based translation systems. The library currently covers
|
||||
The GF resource grammar library is a set of grammars usable as libraries when
|
||||
building translation systems and other applications.
|
||||
The library currently covers
|
||||
the 9 languages coloured in green in the diagram above; in addition,
|
||||
Catalan, Norwegian, and Russian are covered, and there is ongoing work on
|
||||
Arabic, Hindi/Urdu, and Thai.
|
||||
Arabic, Hindi/Urdu, Polish, Romanian, and Thai.
|
||||
</P>
|
||||
<P>
|
||||
The purpose of the resource grammar library is to define the "low-level" structure
|
||||
of a language: inflection, word order, agreement. This structure belongs to what
|
||||
linguists call morphology and syntax. It can be very complex and requires
|
||||
a lot of knowledge. Yet, when translating from one language to another, knowing
|
||||
morphology and syntax is but a part of what is needed. The translator (whether human
|
||||
a lot of knowledge. Yet, when translating from one language to
|
||||
another, knowing morphology and syntax is but a part of what is needed.
|
||||
The translator (whether human
|
||||
or machine) must understand the meaning of what is translated, and must also know
|
||||
the idiomatic way to express the meaning in the target language. This knowledge
|
||||
can be very domain-dependent and requires in general an expert in the field to
|
||||
@@ -127,13 +147,15 @@ in the field of weather reports, etc.
|
||||
The problem is to find a person who is an expert in both the domain of translation
|
||||
and in the low-level linguistic details. It is the rareness of this combination
|
||||
that has made it difficult to build interlingua-based translation systems.
|
||||
The GF resource grammar library has the mission of helping in this task. It encapsulates
|
||||
the low-level linguistics in program modules accessed through easy-to-use interfaces.
|
||||
The GF resource grammar library has the mission of helping in this task.
|
||||
It encapsulates the low-level linguistics in program modules
|
||||
accessed through easy-to-use interfaces.
|
||||
Experts on different domains can build translation systems by using the library,
|
||||
without knowing low-level linguistics. The idea is much the same as when a
|
||||
programmer builds a graphical user interface (GUI) from high-level elements such as
|
||||
buttons and menus, without having to care about pixels or geometrical forms.
|
||||
</P>
|
||||
<A NAME="toc4"></A>
|
||||
<H3>Applications of the library</H3>
|
||||
<P>
|
||||
In addition to translation, the library is also useful in <B>localization</B>,
|
||||
@@ -149,25 +171,29 @@ interlingua-based translation or localization of systems to new languages:
|
||||
<A HREF="http://webalt.math.helsinki.fi/content/index_eng.html"><CODE>http://webalt.math.helsinki.fi/content/index_eng.html</CODE></A>,
|
||||
for translating mathematical exercises to 7 languages
|
||||
<LI>in TALK <A HREF="http://www.talk-project.org"><CODE>http://www.talk-project.org</CODE></A>,
|
||||
where the library was used for localizing spoken dialogue systems to six languages
|
||||
where the library was used for localizing spoken dialogue systems
|
||||
to six languages
|
||||
</UL>
|
||||
|
||||
<P>
|
||||
The library is also a generic linguistic resource, which can be used for tasks
|
||||
such as language teaching and information retrieval. The liberal license (LGPL)
|
||||
makes it usable for anyone and for any task. GF also has tools supporting the
|
||||
use of grammars in programs written in other programming languages: C, C++, Haskell,
|
||||
Java, JavaScript, and Prolog. In connection with the TALK project, support has also been
|
||||
use of grammars in programs written in other
|
||||
programming languages: C, C++, Haskell,
|
||||
Java, JavaScript, and Prolog. In connection with the TALK project,
|
||||
support has also been
|
||||
developed for translating GF grammars to language models used in speech
|
||||
recognition (GSL/Nuance, HTK/ATK, SRGS, JSGF).
|
||||
</P>
|
||||
<A NAME="toc5"></A>
|
||||
<H3>The structure of the library</H3>
|
||||
<P>
|
||||
The library has the following main parts:
|
||||
</P>
|
||||
<UL>
|
||||
<LI><B>Inflection paradigms</B>, covering the inflection of each language.
|
||||
<LI><B>Common Syntax API</B>, covering a large set of syntax rules that
|
||||
<LI><B>Core Syntax</B>, covering a large set of syntax rules that
|
||||
can be implemented for all languages involved.
|
||||
<LI><B>Common Test Lexicon</B>, giving ca. 500 common words that can be used for
|
||||
testing the library.
|
||||
@@ -181,11 +207,13 @@ The library has the following main parts:
|
||||
The goal of the summer school is to implement, for each language, at least
|
||||
the first three components. The latter three are more open-ended in character.
|
||||
</P>
|
||||
<A NAME="toc6"></A>
|
||||
<H2>The summer school</H2>
|
||||
<P>
|
||||
The goal of the summer school is to extend the GF resource grammar library
|
||||
to cover all 23 EU languages, which means we need 15 new languages.
|
||||
We also welcome other languages, if there are interested participants.
|
||||
We also welcome languages other than these 23,
|
||||
if there are interested participants.
|
||||
</P>
|
||||
<P>
|
||||
The amount of work and skill is between a Master's thesis and a PhD thesis.
|
||||
@@ -201,50 +229,52 @@ will probably require more work.
|
||||
</P>
|
||||
<P>
|
||||
In any case, the proposed allocation of work power is 2 participants per
|
||||
language. They will have 6 months to work at home, followed
|
||||
by 2 weeks of summer school. Who are these participants?
|
||||
language. They will do 2 months' worth of homework, followed
|
||||
by 2 weeks of summer school, followed by 4 months' work at home.
|
||||
Who are these participants?
|
||||
</P>
|
||||
<A NAME="toc7"></A>
|
||||
<H3>Selecting participants</H3>
|
||||
<P>
|
||||
After the call has been published, persons interested in participating in
|
||||
the project are expected to learn GF by self-study from the
|
||||
<A HREF="http://digitalgrammars.com/gf/doc/gf-tutorial.html">tutorial</A>.
|
||||
This should take a couple of weeks. Also an on-line course will be
|
||||
arranged to help in getting started with GF.
|
||||
Persons interested in participating in the Summer School should sign up in
|
||||
the <B>Google Group</B> of the course,
|
||||
</P>
|
||||
<P>
|
||||
Participants should continue to
|
||||
implement selected parts of the resource grammar, following the advice from
|
||||
the
|
||||
<A HREF="http://digitalgrammars.com/gf/doc/Resource-HOWTO.html">Resource-HOWTO document</A>.
|
||||
What parts exactly are selected will be announced later.
|
||||
This work will take another couple of weeks.
|
||||
<A HREF="http://groups.google.com/group/gf-resource-school-2009/"><CODE>groups.google.com/group/gf-resource-school-2009/</CODE></A>
|
||||
</P>
|
||||
<P>
|
||||
The participants are expected to learn GF by self-study from the
|
||||
<A HREF="http://digitalgrammars.com/gf/doc/gf-tutorial.html">tutorial</A>.
|
||||
This should take a couple of weeks. An <B>on-line course</B> will be
|
||||
arranged in April to help in getting started with GF.
|
||||
</P>
|
||||
<P>
|
||||
After the on-line course, a <B>programming assignment</B> will be published.
|
||||
This assignment will test skills required in resource grammar programming.
|
||||
Work on the assignment will take a couple of weeks.
|
||||
</P>
|
||||
<P>
|
||||
Those who are interested in getting a travel grant will submit
|
||||
their sample resource grammar fragment
|
||||
to the Summer School Committee in the beginning of May.
|
||||
to the Summer School Committee by 12 May.
|
||||
The Committee then decides who is invited to represent which language
|
||||
in the summer school.
|
||||
</P>
|
||||
<P>
|
||||
After the Committee decision, the participants have around three months
|
||||
to work on their languages. The work is completed in the summer school
|
||||
itself. It is also thoroughly tested by using it to add new languages
|
||||
to applications - in particular, to the WebALT mathematical
|
||||
The summer school itself is devoted to working on resource grammars.
|
||||
In addition to grammar writing itself, testing and evaluation are
|
||||
performed. One way to do this is via adding new languages
|
||||
to resource grammar applications - in particular, to the WebALT mathematical
|
||||
exercise translator.
|
||||
</P>
|
||||
<P>
|
||||
Depending on the quality of submitted work, and on the demands of different
|
||||
languages, the Committee may decide to select another number than 2 participants
|
||||
for a language. We will also consider accepting participants who want to
|
||||
pay their own expenses.
|
||||
The resource grammars are expected to be completed by December 2009. They will
|
||||
be published on the GF website and licensed under the LGPL.
|
||||
</P>
|
||||
<P>
|
||||
To keep track of who is working on which language, we will establish a Wiki page
|
||||
soon after the call is published. The participants are encouraged
|
||||
to contact each other and even work in groups.
|
||||
The participants are encouraged to contact each other and even work in groups.
|
||||
</P>
|
||||
<A NAME="toc8"></A>
|
||||
<H3>Who is qualified</H3>
|
||||
<P>
|
||||
Writing a resource grammar implementation requires good general programming
|
||||
@@ -265,6 +295,7 @@ But it is the quality of the assignment that is assessed, not any formal
|
||||
requirements. The "typical participant" was described to give an idea of
|
||||
who is likely to succeed in this.
|
||||
</P>
|
||||
<A NAME="toc9"></A>
|
||||
<H3>Costs</H3>
|
||||
<P>
|
||||
Our aim is to make the summer school free of charge for the participants
|
||||
@@ -273,8 +304,15 @@ we plan to cover their travel and accommodation costs, up to 1000 EUR
|
||||
per person.
|
||||
</P>
|
||||
<P>
|
||||
We try to get the funding question settled by mid-February 2009.
|
||||
The number of grants will be decided during Spring 2009, so that grant
|
||||
holders can be notified before the beginning of June.
|
||||
</P>
|
||||
<P>
|
||||
Special terms will apply to students in
|
||||
<A HREF="http://www.gslt.hum.gu.se/">GSLT</A> and
|
||||
<A HREF="http://ngslt.org/">NGSLT</A>.
|
||||
</P>
|
||||
<A NAME="toc10"></A>
|
||||
<H3>Teachers</H3>
|
||||
<P>
|
||||
A list of teachers will be published here later. Some of the local teachers
|
||||
@@ -298,11 +336,13 @@ we can discuss your involvement and travel arrangements.
|
||||
In addition to teachers, we will look for consultants who can help to assess
|
||||
the results for each language. Please contact us!
|
||||
</P>
|
||||
<A NAME="toc11"></A>
|
||||
<H3>The Summer School Committee</H3>
|
||||
<P>
|
||||
This committee consists of a number of teachers and consultants,
|
||||
who will select the participants. It will be selected by February 2009.
|
||||
This committee consists of a number of teachers and informants,
|
||||
who will select the participants. It will be selected by April 2009.
|
||||
</P>
|
||||
<A NAME="toc12"></A>
|
||||
<H3>Time and Place</H3>
|
||||
<P>
|
||||
The summer school will
|
||||
@@ -313,15 +353,16 @@ Sweden, on 17-28 August 2009.
|
||||
Time schedule:
|
||||
</P>
|
||||
<UL>
|
||||
<LI>February: announcement of summer school and the grammar
|
||||
writing contest to get participants
|
||||
<LI>March-April: on-line course, work on the contest assignment (ca 1 month)
|
||||
<LI>May: submission deadline and notification of acceptance
|
||||
<LI>June-July: more work on the grammars
|
||||
<LI>August: summer school
|
||||
<LI>September-December: more homework if necessary
|
||||
<LI>February: announcement of summer school
|
||||
<LI>April: on-line course, work on the contest assignment
|
||||
<LI>12 May: submission deadline for assignment work
|
||||
<LI>31 May: review of assignments, notifications of acceptance
|
||||
<LI>17-28 August: Summer School
|
||||
<LI>September-December: homework on resource grammars
|
||||
<LI>December: release of the extended Resource Grammar Library
|
||||
</UL>
|
||||
|
||||
<A NAME="toc13"></A>
|
||||
<H3>Dissemination and intellectual property</H3>
|
||||
<P>
|
||||
The new resource grammars will be released under the LGPL just like
|
||||
@@ -331,28 +372,137 @@ with the copyright held by respective authors.
|
||||
<P>
|
||||
The grammars will be distributed via the GF web site.
|
||||
</P>
|
||||
<P>
|
||||
The WebALT-specific grammars will have special licenses agreed between the
|
||||
authors and WebALT Inc.
|
||||
</P>
|
||||
<A NAME="toc14"></A>
|
||||
<H2>Why I should participate</H2>
|
||||
<P>
|
||||
Seven reasons:
|
||||
</P>
|
||||
<OL>
|
||||
<LI>participation in pioneering language technology work in an enthusiastic atmosphere
|
||||
<LI>participation in pioneering language technology work in an
|
||||
enthusiastic atmosphere
|
||||
<LI>work and fun with people from all over Europe and the world
|
||||
<LI>job opportunities and business ideas
|
||||
<LI>credits: the school project will be established as a course at Chalmers worth
|
||||
15 ECTS points per person, but extensions to a Master's thesis will
|
||||
also be considered
|
||||
<LI>merits: the resulting grammar can easily lead to a published paper
|
||||
7.5 or 15 ECTS points per person, depending on the work accomplished; also
|
||||
extensions to Master's thesis will be considered (special credit arrangements
|
||||
for <A HREF="http://www.gslt.hum.gu.se/">GSLT</A> and <A HREF="http://ngslt.org/">NGSLT</A>)
|
||||
<LI>merits: the resulting grammar can easily lead to a published paper (see below)
|
||||
<LI>contribution to the multilingual and multicultural development of Europe and the
|
||||
world
|
||||
<LI>free trip and stay in Gothenburg (for travel grant students)
|
||||
</OL>
|
||||
|
||||
<A NAME="toc15"></A>
|
||||
<H2>More information</H2>
|
||||
<P>
|
||||
<A HREF="http://groups.google.com/group/gf-resource-school-2009/">Course Google Group</A>
|
||||
</P>
|
||||
<P>
|
||||
<A HREF="http://digitalgrammars.com/gf/">GF web page</A>
|
||||
</P>
|
||||
<P>
|
||||
<A HREF="http://digitalgrammars.com/gf/doc/gf-tutorial.html">GF tutorial</A>
|
||||
</P>
|
||||
<P>
|
||||
<A HREF="http://digitalgrammars.com/gf/doc/Resource-HOWTO.html">Resource-HOWTO document</A>
|
||||
</P>
|
||||
<P>
|
||||
Forthcoming: survey article "The GF Resource Grammar Library"
|
||||
</P>
|
||||
<P>
|
||||
Forthcoming: book about GF
|
||||
</P>
|
||||
<A NAME="toc16"></A>
|
||||
<H3>Contact</H3>
|
||||
<P>
|
||||
Håkan Burden: burden at chalmers se
|
||||
</P>
|
||||
<P>
|
||||
Aarne Ranta: aarne at chalmers se
|
||||
</P>
|
||||
<A NAME="toc17"></A>
|
||||
<H3>Selected publications from earlier resource grammar projects</H3>
|
||||
<P>
|
||||
K. Angelov.
|
||||
Type-Theoretical Bulgarian Grammar.
|
||||
In B. Nordström and A. Ranta (eds),
|
||||
<I>Advances in Natural Language Processing (GoTAL 2008)</I>,
|
||||
LNCS/LNAI 5221, Springer,
|
||||
2008.
|
||||
</P>
|
||||
<P>
|
||||
A. El Dada and A. Ranta.
|
||||
Implementing an Open Source Arabic Resource Grammar in GF.
|
||||
In M. Mughazy (ed),
|
||||
<I>Perspectives on Arabic Linguistics XX. Papers from the Twentieth Annual Symposium on Arabic Linguistics, Kalamazoo, March 26</I>
|
||||
John Benjamins Publishing Company.
|
||||
2007.
|
||||
</P>
|
||||
<P>
|
||||
A. El Dada.
|
||||
Implementation of the Arabic Numerals and their Syntax in GF.
|
||||
Computational Approaches to Semitic Languages: Common Issues and Resources,
|
||||
ACL-2007 Workshop,
|
||||
June 28, 2007, Prague.
|
||||
2007.
|
||||
</P>
|
||||
<P>
|
||||
H. Hammarström and A. Ranta.
|
||||
Cardinal Numerals Revisited in GF.
|
||||
<I>Workshop on Numerals in the World's Languages</I>.
|
||||
Dept. of Linguistics Max Planck Institute for Evolutionary Anthropology, Leipzig,
|
||||
2004.
|
||||
</P>
|
||||
<P>
|
||||
M. Humayoun, H. Hammarström, and A. Ranta.
|
||||
Urdu Morphology, Orthography and Lexicon Extraction.
|
||||
<I>CAASL-2: The Second Workshop on Computational Approaches to Arabic Script-based Languages</I>,
|
||||
July 21-22, 2007, LSA 2007 Linguistic Institute, Stanford University.
|
||||
2007.
|
||||
</P>
|
||||
<P>
|
||||
J. Khegai.
|
||||
GF parallel resource grammars and Russian.
|
||||
In proceedings of ACL2006
|
||||
(The joint conference of the International Committee on Computational
|
||||
Linguistics and the Association for Computational Linguistics) (pp. 475-482),
|
||||
Sydney, Australia, July 2006.
|
||||
</P>
|
||||
<P>
|
||||
J. Khegai.
|
||||
Language engineering in Grammatical Framework (GF).
|
||||
PhD thesis, Computer Science, Chalmers University of Technology,
|
||||
2006.
|
||||
</P>
|
||||
<P>
|
||||
W. Ng'ang'a.
|
||||
Multilingual content development for eLearning in Africa.
|
||||
eLearning Africa: 1st Pan-African Conference on ICT for Development,
|
||||
Education and Training. 24-26 May 2006, Addis Ababa, Ethiopia.
|
||||
2006.
|
||||
</P>
|
||||
<P>
|
||||
N. Perera and A. Ranta.
|
||||
Dialogue System Localization with the GF Resource Grammar Library.
|
||||
<I>SPEECHGRAM 2007: ACL Workshop on Grammar-Based Approaches to Spoken Language Processing</I>,
|
||||
June 29, 2007, Prague.
|
||||
2007.
|
||||
</P>
|
||||
<P>
|
||||
A. Ranta.
|
||||
Modular Grammar Engineering in GF.
|
||||
<I>Research on Language and Computation</I>,
|
||||
5:133-158, 2007.
|
||||
</P>
|
||||
<P>
|
||||
A. Ranta.
|
||||
How predictable is Finnish morphology? An experiment on lexicon construction.
|
||||
In J. Nivre, M. Dahllöf and B. Megyesi (eds),
|
||||
<I>Resourceful Language Technology: Festschrift in Honor of Anna Sågvall Hein</I>,
|
||||
University of Uppsala,
|
||||
2008.
|
||||
</P>
|
||||
|
||||
<!-- html code generated by txt2tags 2.4 (http://txt2tags.sf.net) -->
|
||||
<!-- cmdline: txt2tags gf-summerschool.txt -->
|
||||
<!-- cmdline: txt2tags -\-toc gf-summerschool.txt -->
|
||||
</BODY></HTML>
|
||||
|
||||
@@ -1,17 +1,22 @@
|
||||
European Resource Grammar Summer School
|
||||
GF Resource Grammar Summer School
|
||||
Gothenburg, 17-28 August 2009
|
||||
Aarne Ranta (aarne at chalmers.se)
|
||||
|
||||
%!Encoding : iso-8859-1
|
||||
|
||||
%!target:html
|
||||
%!postproc(html): #BECE <center>
|
||||
%!postproc(html): #ENCE </center>
|
||||
|
||||
//preliminary version, 17 November 2008//
|
||||
|
||||
[eu-langs.png]
|
||||
#BECE
|
||||
[school-langs.png]
|
||||
#ENCE
|
||||
|
||||
|
||||
===Executive summary===
|
||||
//red=wanted, green=exists, yellow=in-progress, solid=official-eu, dotted=non-eu//
|
||||
|
||||
|
||||
==Executive summary==
|
||||
|
||||
We plan to organize a summer school with the goal of implementing the GF
|
||||
resource grammar library for 15 new languages, so that the library will
|
||||
@@ -24,89 +29,71 @@ and basic syntax of each language. It can be used in GF applications
|
||||
and also ported to other formats. The library is licensed under LGPL.
|
||||
|
||||
Each language is implemented by one or two students working together.
|
||||
Travel grants will be available for students selected on the basis of
|
||||
Travel grants will be available for some students selected on the basis of
|
||||
pre-conference assignments.
|
||||
|
||||
The official announcement will be in January 2009, and the summer school
|
||||
itself on 17-28 August 2009, at the campus of Chalmers University of
|
||||
Technology in Gothenburg, Sweden.
|
||||
The summer school will be held on 17-28 August 2009, at the campus of
|
||||
Chalmers University of Technology in Gothenburg, Sweden.
|
||||
|
||||
|
||||
|
||||
==Introduction==
|
||||
|
||||
Since 2007, the EU-27 has had 23 official languages, listed in the diagram at the top of this
|
||||
document.
|
||||
%[``http://ec.europa.eu/education/policies/lang/languages/index_en.html``
|
||||
%http://ec.europa.eu/education/policies/lang/languages/index_en.html].
|
||||
There is a growing need for translation between
|
||||
these languages. The traditional language-to-language method requires 23*22 = 506
|
||||
translators (humans or computer programs) to cover all possible translation needs.
|
||||
|
||||
An alternative to language-to-language translation is the use of an **interlingua**:
|
||||
a language-independent representation such that all translation problems can
|
||||
be reduced to translating to and from the interlingua. With 23 languages,
|
||||
only 2*23 = 46 translators are needed.
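
The two counts above can be checked with a short calculation. The sketch
below is only an illustration in Python; the function names are assumptions
made for this example and are not part of GF.

```python
def pairwise_translators(n: int) -> int:
    """One directed translator for every ordered pair of distinct languages."""
    return n * (n - 1)


def interlingua_translators(n: int) -> int:
    """One translator into the interlingua and one out of it, per language."""
    return 2 * n


if __name__ == "__main__":
    # Compare the two approaches for a few language counts, including the 23 EU languages.
    for n in (2, 3, 4, 23):
        print(n, pairwise_translators(n), interlingua_translators(n))
```

The interlingua approach needs fewer translators as soon as there are more
than three languages, and the gap grows with every language added.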
|
||||
|
||||
Interlingua sounds too good to be true. In a sense, it is. All attempts to
|
||||
create an interlingua that would solve all translation problems have failed.
|
||||
However, interlinguas for restricted applications have shown more
|
||||
success. For instance, mathematical texts and weather reports can be translated
|
||||
by using interlinguas tailor-made for the domains of mathematics and weather reports,
|
||||
respectively.
|
||||
|
||||
What is required of an interlingua is
|
||||
- semantic accuracy: correspondence to what you want to say in the application
|
||||
- language-independence: abstraction from individual languages
|
||||
|
||||
|
||||
Thus, for instance, an interlingua for mathematical texts may be based on
|
||||
mathematical logic, which at the same time gives semantic accuracy and
|
||||
language independence. In other domains, something other than mathematical
|
||||
logic may be needed; the **ontologies** defined within the semantic
|
||||
web technology are often good starting points for interlinguas.
|
||||
|
||||
|
||||
==GF: a framework for multilingual grammars==
|
||||
|
||||
The interlingua is just one part of a translation system. We also need
|
||||
the mappings between the interlingua and the involved languages. As the
|
||||
number of languages increases, this part grows while the interlingua remains
|
||||
constant.
|
||||
document. There is a growing need for linguistic resources for these
|
||||
languages, to help in tasks such as translation and information retrieval.
|
||||
These resources should be **portable** and **freely accessible**.
|
||||
Languages marked in red in the diagram are of particular interest for
|
||||
the summer school, since they are those on which the effort will be concentrated.
|
||||
|
||||
GF (Grammatical Framework,
|
||||
[``digitalgrammars.com/gf`` http://digitalgrammars.com/gf])
|
||||
is a programming language designed to support interlingua-based translation.
|
||||
A "program" in GF is a **multilingual grammar**, which consists of an
|
||||
**abstract syntax** and a set of **concrete syntaxes**. A concrete
|
||||
syntax is a mapping from the abstract syntax to a particular language.
|
||||
These mappings are **reversible**, which means that they can be used for
|
||||
translating in both directions. This means that creating an interlingua-based
|
||||
translator for 23 languages just requires 1 + 23 = 24 grammar modules (the abstract
|
||||
syntax and the concrete syntaxes).
|
||||
is a **functional programming language** designed for writing natural
|
||||
language grammars. It provides an efficient platform for this task, due to
|
||||
its modern characteristics:
|
||||
- It is a functional programming language, similar to Haskell and ML.
|
||||
- It has a static type system and type checker.
|
||||
- It has a powerful module system supporting separate compilation
|
||||
and data abstraction.
|
||||
- It has an optimizing compiler to **Portable Grammar Format** (PGF).
|
||||
- PGF can be further compiled to other formats, such as JavaScript and
|
||||
speech recognition language models.
|
||||
- GF has a **resource grammar library** giving access to the morphology and
|
||||
basic syntax of 12 languages.
|
||||
|
||||
The diagram at the beginning of this document shows an interlingua
|
||||
system covering the 23 EU languages.
|
||||
Languages marked in
|
||||
red are of particular interest for the summer school, since they are those
|
||||
on which the effort will be concentrated.
|
||||
|
||||
In addition to "ordinary" grammars for single languages, GF
|
||||
supports **multilingual grammars**. A multilingual GF grammar consists of an
|
||||
**abstract syntax** and a set of **concrete syntaxes**.
|
||||
An abstract syntax is a system of **trees**, serving as a semantic
|
||||
model or an ontology. A concrete syntax is a mapping from abstract syntax
|
||||
trees to strings of a particular language.
|
||||
|
||||
These mappings defined in concrete syntax are **reversible**: they
|
||||
can be used both for **generating** strings from trees, and for
|
||||
**parsing** strings into trees. Combinations of generation and
|
||||
parsing can be used for **translation**, where the abstract
|
||||
syntax works as an **interlingua**. Thus GF has been used as a
|
||||
framework for building translation systems in several areas
|
||||
of application and large sets of languages.
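
The idea of translating through abstract syntax trees can be sketched outside GF. The following Python toy is only an illustration under stated assumptions: the tree constructors (``Is``, ``ThisWine``, ...), the two word tables, and the generate-and-match parser are all hypothetical, and GF itself compiles grammars and uses proper parsing algorithms rather than this brute-force search.

```python
# Abstract syntax: a tree is a tuple ("Is", item, quality).
# All constructor names below are hypothetical, chosen for this sketch.
ITEMS = ["ThisWine", "ThisCheese"]
QUALITIES = ["Good", "Expensive"]

# Concrete syntaxes: one word table per language.
CONCRETE = {
    "Eng": {"ThisWine": "this wine", "ThisCheese": "this cheese",
            "Good": "good", "Expensive": "expensive", "Is": "is"},
    "Ita": {"ThisWine": "questo vino", "ThisCheese": "questo formaggio",
            "Good": "buono", "Expensive": "caro", "Is": "è"},
}


def linearize(tree, lang):
    """Generation: map an abstract tree to a string of one language."""
    fun, item, quality = tree
    words = CONCRETE[lang]
    return f"{words[item]} {words[fun]} {words[quality]}"


def all_trees():
    """Enumerate every tree of this tiny abstract syntax."""
    return [("Is", item, quality) for item in ITEMS for quality in QUALITIES]


def parse(string, lang):
    """Parsing by brute force: keep the trees whose linearization matches."""
    return [tree for tree in all_trees() if linearize(tree, lang) == string]


def translate(string, source, target):
    """Translation = parsing in the source language + generation in the target."""
    return [linearize(tree, target) for tree in parse(string, source)]


if __name__ == "__main__":
    print(translate("this wine is good", "Eng", "Ita"))
```

Adding a language to this sketch means adding one more word table, while the abstract trees stay unchanged. Real languages also need agreement and word order rules, which is what the resource grammars provide.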
|
||||
|
||||
|
||||
|
||||
==The GF resource grammar library==
|
||||
|
||||
The GF resource grammar library is a set of grammars used as libraries when
|
||||
building interlingua-based translation systems. The library currently covers
|
||||
The GF resource grammar library is a set of grammars usable as libraries when
|
||||
building translation systems and other applications.
|
||||
The library currently covers
|
||||
the 9 languages coloured in green in the diagram above; in addition,
|
||||
Catalan, Norwegian, and Russian are covered, and there is ongoing work on
|
||||
Arabic, Hindi/Urdu, and Thai.
|
||||
Arabic, Hindi/Urdu, Polish, Romanian, and Thai.
|
||||
|
||||
The purpose of the resource grammar library is to define the "low-level" structure
|
||||
of a language: inflection, word order, agreement. This structure belongs to what
|
||||
linguists call morphology and syntax. It can be very complex and requires
|
||||
a lot of knowledge. Yet, when translating from one language to another, knowing
|
||||
morphology and syntax is but a part of what is needed. The translator (whether human
|
||||
a lot of knowledge. Yet, when translating from one language to
|
||||
another, knowing morphology and syntax is but a part of what is needed.
|
||||
The translator (whether human
|
||||
or machine) must understand the meaning of what is translated, and must also know
|
||||
the idiomatic way to express the meaning in the target language. This knowledge
|
||||
can be very domain-dependent and requires in general an expert in the field to
|
||||
@@ -116,8 +103,9 @@ in the field of weather reports, etc.
|
||||
The problem is to find a person who is an expert in both the domain of translation
|
||||
and in the low-level linguistic details. It is the rareness of this combination
|
||||
that has made it difficult to build interlingua-based translation systems.
|
||||
The GF resource grammar library has the mission of helping in this task. It encapsulates
|
||||
the low-level linguistics in program modules accessed through easy-to-use interfaces.
|
||||
The GF resource grammar library has the mission of helping in this task.
|
||||
It encapsulates the low-level linguistics in program modules
|
||||
accessed through easy-to-use interfaces.
|
||||
Experts on different domains can build translation systems by using the library,
|
||||
without knowing low-level linguistics. The idea is much the same as when a
|
||||
programmer builds a graphical user interface (GUI) from high-level elements such as
|
||||
@@ -138,14 +126,17 @@ interlingua-based translation or localization of systems to new languages:
|
||||
[``http://webalt.math.helsinki.fi/content/index_eng.html`` http://webalt.math.helsinki.fi/content/index_eng.html],
|
||||
for translating mathematical exercises to 7 languages
|
||||
- in TALK [``http://www.talk-project.org`` http://www.talk-project.org],
|
||||
where the library was used for localizing spoken dialogue systems to six languages
|
||||
where the library was used for localizing spoken dialogue systems
|
||||
to six languages
|
||||
|
||||
|
||||
The library is also a generic linguistic resource, which can be used for tasks
|
||||
such as language teaching and information retrieval. The liberal license (LGPL)
|
||||
makes it usable for anyone and for any task. GF also has tools supporting the
|
||||
use of grammars in programs written in other programming languages: C, C++, Haskell,
|
||||
Java, JavaScript, and Prolog. In connection with the TALK project, support has also been
|
||||
use of grammars in programs written in other
|
||||
programming languages: C, C++, Haskell,
|
||||
Java, JavaScript, and Prolog. In connection with the TALK project,
|
||||
support has also been
|
||||
developed for translating GF grammars to language models used in speech
|
||||
recognition (GSL/Nuance, HTK/ATK, SRGS, JSGF).
|
||||
|
||||
@@ -155,7 +146,7 @@ recognition (GSL/Nuance, HTK/ATK, SRGS, JSGF).
|
||||
|
||||
The library has the following main parts:
|
||||
- **Inflection paradigms**, covering the inflection of each language.
|
||||
- **Common Syntax API**, covering a large set of syntax rules that
|
||||
- **Core Syntax**, covering a large set of syntax rules that
|
||||
can be implemented for all languages involved.
|
||||
- **Common Test Lexicon**, giving ca. 500 common words that can be used for
|
||||
testing the library.
|
||||
@@ -173,7 +164,8 @@ the first three components. The latter three are more open-ended in character.
|
||||
|
||||
The goal of the summer school is to extend the GF resource grammar library
|
||||
to cover all 23 EU languages, which means we need 15 new languages.
|
||||
We also welcome other languages, if there are interested participants.
|
||||
We also welcome languages other than these 23,
|
||||
if there are interested participants.
|
||||
|
||||
The amount of work and skill is between a Master's thesis and a PhD thesis.
|
||||
The Russian implementation was made by Janna Khegai as a part of her
|
||||
@@ -187,45 +179,43 @@ Latvian and Lithuanian are the first languages of the Baltic family and
|
||||
will probably require more work.
|
||||
|
||||
In any case, the proposed allocation of work power is 2 participants per
|
||||
language. They will have 6 months to work at home, followed
|
||||
by 2 weeks of summer school. Who are these participants?
|
||||
language. They will do 2 months' worth of home work, followed
|
||||
by 2 weeks of summer school, followed by 4 months of work at home.
|
||||
Who are these participants?
|
||||
|
||||
|
||||
===Selecting participants===
|
||||
|
||||
After the call has been published, persons interested in participating in
|
||||
the project are expected to learn GF by self-study from the
|
||||
[tutorial http://digitalgrammars.com/gf/doc/gf-tutorial.html].
|
||||
This should take a couple of weeks. Also an on-line course will be
|
||||
arranged to help in getting started with GF.
|
||||
Persons interested in participating in the Summer School should sign up in
|
||||
the **Google Group** of the course,
|
||||
|
||||
Participants should continue to
|
||||
implement selected parts of the resource grammar, following the advice from
|
||||
the
|
||||
[Resource-HOWTO document http://digitalgrammars.com/gf/doc/Resource-HOWTO.html].
|
||||
What parts exactly are selected will be announced later.
|
||||
This work will take another couple of weeks.
|
||||
[``groups.google.com/group/gf-resource-school-2009/`` http://groups.google.com/group/gf-resource-school-2009/]
|
||||
|
||||
The participants are expected to learn GF by self-study from the
|
||||
[tutorial http://digitalgrammars.com/gf/doc/gf-tutorial.html].
|
||||
This should take a couple of weeks. An **on-line course** will be
|
||||
arranged in April to help in getting started with GF.
|
||||
|
||||
After the on-line course, a **programming assignment** will be published.
|
||||
This assignment will test skills required in resource grammar programming.
|
||||
Work on the assignment will take a couple of weeks.
|
||||
|
||||
Those who are interested in getting a travel grant will submit
|
||||
their sample resource grammar fragment
|
||||
to the Summer School Committee in the beginning of May.
|
||||
to the Summer School Committee by 12 May.
|
||||
The Committee then decides who is invited to represent which language
|
||||
in the summer school.
|
||||
|
||||
After the Committee decision, the participants have around three months
|
||||
to work on their languages. The work is completed in the summer school
|
||||
itself. It is also thoroughly tested by using it to add new languages
|
||||
to applications - in particular, to the WebALT mathematical
|
||||
The summer school itself is devoted to working on resource grammars.
|
||||
In addition to grammar writing itself, testing and evaluation are
|
||||
performed. One way to do this is via adding new languages
|
||||
to resource grammar applications - in particular, to the WebALT mathematical
|
||||
exercise translator.
|
||||
|
||||
Depending on the quality of submitted work, and on the demands of different
|
||||
languages, the Committee may decide to select another number than 2 participants
|
||||
for a language. We will also consider accepting participants who want to
|
||||
pay their own expenses.
|
||||
The resource grammars are expected to be completed by December 2009. They will
|
||||
be published on the GF website and licensed under the LGPL.
|
||||
|
||||
To keep track of who is working on which language, we will establish a Wiki page
|
||||
soon after the call is published. The participants are encouraged
|
||||
to contact each other and even work in groups.
|
||||
The participants are encouraged to contact each other and even work in groups.
|
||||
|
||||
|
||||
|
||||
@@ -254,7 +244,14 @@ who are selected on the basis of their assignments. And not only that:
|
||||
we plan to cover their travel and accommodation costs, up to 1000 EUR
|
||||
per person.
|
||||
|
||||
We try to get the funding question settled by mid-February 2009.
|
||||
The number of grants will be decided during Spring 2009, so that grant
|
||||
holders can be notified before the beginning of June.
|
||||
|
||||
Special terms will apply to students in
|
||||
[GSLT http://www.gslt.hum.gu.se/] and
|
||||
[NGSLT http://ngslt.org/].
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -281,8 +278,8 @@ the results for each language. Please contact us!
|
||||
|
||||
===The Summer School Committee===
|
||||
|
||||
This committee consists of a number of teachers and consultants,
|
||||
who will select the participants. It will be selected by February 2009.
|
||||
This committee consists of a number of teachers and informants,
|
||||
who will select the participants. It will be selected by April 2009.
|
||||
|
||||
|
||||
===Time and Place===
|
||||
@@ -292,13 +289,13 @@ be organized at the campus of Chalmers University of Technology in Gothenburg,
|
||||
Sweden, on 17-28 August 2009.
|
||||
|
||||
Time schedule:
|
||||
- February: announcement of summer school and the grammar
|
||||
writing contest to get participants
|
||||
- March-April: on-line course, work on the contest assignment (ca 1 month)
|
||||
- May: submission deadline and notification of acceptance
|
||||
- June-July: more work on the grammars
|
||||
- August: summer school
|
||||
- September-December: more homework if necessary
|
||||
- February: announcement of summer school
|
||||
- April: on-line course, work on the contest assignment
|
||||
- 12 May: submission deadline for assignment work
|
||||
- 31 May: review of assignments, notifications of acceptance
|
||||
- 17-28 August: Summer School
|
||||
- September-December: homework on resource grammars
|
||||
- December: release of the extended Resource Grammar Library
|
||||
|
||||
|
||||
===Dissemination and intellectual property===
|
||||
@@ -309,22 +306,115 @@ with the copyright held by respective authors.
|
||||
|
||||
The grammars will be distributed via the GF web site.
|
||||
|
||||
The WebALT-specific grammars will have special licenses agreed between the
|
||||
authors and WebALT Inc.
|
||||
|
||||
|
||||
==Why I should participate==
|
||||
|
||||
Seven reasons:
|
||||
+ participation in pioneering language technology work in an enthusiastic atmosphere
|
||||
+ participation in pioneering language technology work in an
|
||||
enthusiastic atmosphere
|
||||
+ work and fun with people from all over Europe and the world
|
||||
+ job opportunities and business ideas
|
||||
+ credits: the school project will be established as a course at Chalmers worth
|
||||
15 ECTS points per person, but extensions to Master's thesis will
|
||||
also be considered
|
||||
+ merits: the resulting grammar can easily lead to a published paper
|
||||
7.5 or 15 ECTS points per person, depending on the work accomplished; also
|
||||
extensions to Master's thesis will be considered (special credit arrangements
|
||||
for [GSLT http://www.gslt.hum.gu.se/] and [NGSLT http://ngslt.org/])
|
||||
+ merits: the resulting grammar can easily lead to a published paper (see below)
|
||||
+ contribution to the multilingual and multicultural development of Europe and the
|
||||
world
|
||||
+ free trip and stay in Gothenburg (for travel grant students)
|
||||
|
||||
|
||||
==More information==
|
||||
|
||||
[Course Google Group http://groups.google.com/group/gf-resource-school-2009/]
|
||||
|
||||
[GF web page http://digitalgrammars.com/gf/]
|
||||
|
||||
[GF tutorial http://digitalgrammars.com/gf/doc/gf-tutorial.html]
|
||||
|
||||
[Resource-HOWTO document http://digitalgrammars.com/gf/doc/Resource-HOWTO.html]
|
||||
|
||||
Forthcoming: survey article "The GF Resource Grammar Library"
|
||||
|
||||
Forthcoming: book about GF
|
||||
|
||||
===Contact===
|
||||
|
||||
Håkan Burden: burden at chalmers se
|
||||
|
||||
Aarne Ranta: aarne at chalmers se
|
||||
|
||||
|
||||
|
||||
===Selected publications from earlier resource grammar projects===
|
||||
|
||||
K. Angelov.
|
||||
Type-Theoretical Bulgarian Grammar.
|
||||
In B. Nordström and A. Ranta (eds),
|
||||
//Advances in Natural Language Processing (GoTAL 2008)//,
|
||||
LNCS/LNAI 5221, Springer,
|
||||
2008.
|
||||
|
||||
A. El Dada and A. Ranta.
|
||||
Implementing an Open Source Arabic Resource Grammar in GF.
|
||||
In M. Mughazy (ed),
|
||||
//Perspectives on Arabic Linguistics XX. Papers from the Twentieth Annual Symposium on Arabic Linguistics, Kalamazoo, March 26//
|
||||
John Benjamins Publishing Company.
|
||||
2007.
|
||||
|
||||
A. El Dada.
|
||||
Implementation of the Arabic Numerals and their Syntax in GF.
|
||||
Computational Approaches to Semitic Languages: Common Issues and Resources,
|
||||
ACL-2007 Workshop,
|
||||
June 28, 2007, Prague.
|
||||
2007.
|
||||
|
||||
H. Hammarström and A. Ranta.
|
||||
Cardinal Numerals Revisited in GF.
|
||||
//Workshop on Numerals in the World's Languages//.
|
||||
Dept. of Linguistics, Max Planck Institute for Evolutionary Anthropology, Leipzig,
|
||||
2004.
|
||||
|
||||
M. Humayoun, H. Hammarström, and A. Ranta.
|
||||
Urdu Morphology, Orthography and Lexicon Extraction.
|
||||
//CAASL-2: The Second Workshop on Computational Approaches to Arabic Script-based Languages//,
|
||||
July 21-22, 2007, LSA 2007 Linguistic Institute, Stanford University.
|
||||
2007.
|
||||
|
||||
J. Khegai.
|
||||
GF parallel resource grammars and Russian.
|
||||
In proceedings of ACL2006
|
||||
(The joint conference of the International Committee on Computational
|
||||
Linguistics and the Association for Computational Linguistics) (pp. 475-482),
|
||||
Sydney, Australia, July 2006.
|
||||
|
||||
J. Khegai.
|
||||
Language engineering in Grammatical Framework (GF).
|
||||
PhD thesis, Computer Science, Chalmers University of Technology,
|
||||
2006.
|
||||
|
||||
W. Ng'ang'a.
|
||||
Multilingual content development for eLearning in Africa.
|
||||
eLearning Africa: 1st Pan-African Conference on ICT for Development,
|
||||
Education and Training. 24-26 May 2006, Addis Ababa, Ethiopia.
|
||||
2006.
|
||||
|
||||
N. Perera and A. Ranta.
|
||||
Dialogue System Localization with the GF Resource Grammar Library.
|
||||
//SPEECHGRAM 2007: ACL Workshop on Grammar-Based Approaches to Spoken Language Processing//,
|
||||
June 29, 2007, Prague.
|
||||
2007.
|
||||
|
||||
A. Ranta.
|
||||
Modular Grammar Engineering in GF.
|
||||
//Research on Language and Computation//,
|
||||
5:133-158, 2007.
|
||||
|
||||
A. Ranta.
|
||||
How predictable is Finnish morphology? An experiment on lexicon construction.
|
||||
In J. Nivre, M. Dahllöf and B. Megyesi (eds),
|
||||
//Resourceful Language Technology: Festschrift in Honor of Anna Sågvall Hein//,
|
||||
University of Uppsala,
|
||||
2008.
|
||||
|
||||
|
||||
100
doc/school-langs.dot
Normal file
@@ -0,0 +1,100 @@
|
||||
graph{
|
||||
|
||||
size = "8,8" ;
|
||||
|
||||
overlap = scale ;
|
||||
|
||||
"Abs" [label = "Abstract Syntax", style = "solid", shape = "rectangle"] ;
|
||||
|
||||
"1" [label = "Bulgarian", style = "solid", shape = "ellipse", color = "green"] ;
|
||||
"1" -- "Abs" [style = "solid"];
|
||||
|
||||
"2" [label = "Czech", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"2" -- "Abs" [style = "solid"];
|
||||
|
||||
"3" [label = "Danish", style = "solid", shape = "ellipse", color = "green"] ;
|
||||
"3" -- "Abs" [style = "solid"];
|
||||
|
||||
"4" [label = "German", style = "solid", shape = "ellipse", color = "green"] ;
|
||||
"4" -- "Abs" [style = "solid"];
|
||||
|
||||
"5" [label = "Estonian", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"5" -- "Abs" [style = "solid"];
|
||||
|
||||
"6" [label = "Greek", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"6" -- "Abs" [style = "solid"];
|
||||
|
||||
"7" [label = "English", style = "solid", shape = "ellipse", color = "green"] ;
|
||||
"7" -- "Abs" [style = "solid"];
|
||||
|
||||
"8" [label = "Spanish", style = "solid", shape = "ellipse", color = "green"] ;
|
||||
"8" -- "Abs" [style = "solid"];
|
||||
|
||||
"9" [label = "French", style = "solid", shape = "ellipse", color = "green"] ;
|
||||
"9" -- "Abs" [style = "solid"];
|
||||
|
||||
"10" [label = "Italian", style = "solid", shape = "ellipse", color = "green"] ;
|
||||
"10" -- "Abs" [style = "solid"];
|
||||
|
||||
"11" [label = "Latvian", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"11" -- "Abs" [style = "solid"];
|
||||
|
||||
"12" [label = "Lithuanian", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"Abs" -- "12" [style = "solid"];
|
||||
|
||||
"13" [label = "Irish", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"Abs" -- "13" [style = "solid"];
|
||||
|
||||
"14" [label = "Hungarian", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"Abs" -- "14" [style = "solid"];
|
||||
|
||||
"15" [label = "Maltese", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"Abs" -- "15" [style = "solid"];
|
||||
|
||||
"16" [label = "Dutch", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"Abs" -- "16" [style = "solid"];
|
||||
|
||||
"17" [label = "Polish", style = "solid", shape = "ellipse", color = "yellow"] ;
|
||||
"Abs" -- "17" [style = "solid"];
|
||||
|
||||
"18" [label = "Portuguese", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"Abs" -- "18" [style = "solid"];
|
||||
|
||||
"19" [label = "Slovak", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"Abs" -- "19" [style = "solid"];
|
||||
|
||||
"20" [label = "Slovene", style = "solid", shape = "ellipse", color = "red"] ;
|
||||
"Abs" -- "20" [style = "solid"];
|
||||
|
||||
"21" [label = "Romanian", style = "solid", shape = "ellipse", color = "yellow"] ;
|
||||
"Abs" -- "21" [style = "solid"];
|
||||
|
||||
"22" [label = "Finnish", style = "solid", shape = "ellipse", color = "green"] ;
|
||||
"Abs" -- "22" [style = "solid"];
|
||||
|
||||
"23" [label = "Swedish", style = "solid", shape = "ellipse", color = "green"] ;
|
||||
"Abs" -- "23" [style = "solid"];
|
||||
|
||||
"24" [label = "Catalan", style = "dotted", shape = "ellipse", color = "green"] ;
|
||||
"Abs" -- "24" [style = "solid"];
|
||||
|
||||
"25" [label = "Norwegian", style = "dotted", shape = "ellipse", color = "green"] ;
|
||||
"Abs" -- "25" [style = "solid"];
|
||||
|
||||
"26" [label = "Russian", style = "dotted", shape = "ellipse", color = "green"] ;
|
||||
"Abs" -- "26" [style = "solid"];
|
||||
|
||||
"27" [label = "Interlingua", style = "dotted", shape = "ellipse", color = "green"] ;
|
||||
"Abs" -- "27" [style = "solid"];
|
||||
|
||||
"28" [label = "Latin", style = "dotted", shape = "ellipse", color = "yellow"] ;
|
||||
"Abs" -- "28" [style = "solid"];
|
||||
"29" [label = "Turkish", style = "dotted", shape = "ellipse", color = "yellow"] ;
|
||||
"Abs" -- "29" [style = "solid"];
|
||||
"30" [label = "Hindi", style = "dotted", shape = "ellipse", color = "yellow"] ;
|
||||
"Abs" -- "30" [style = "solid"];
|
||||
"31" [label = "Thai", style = "dotted", shape = "ellipse", color = "yellow"] ;
|
||||
"Abs" -- "31" [style = "solid"];
|
||||
|
||||
|
||||
}
|
||||
BIN
doc/school-langs.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 134 KiB |
@@ -187,17 +187,29 @@ resource ParadigmsAra = open
|
||||
-- The definitions should not bother the user of the API. So they are
|
||||
-- hidden from the document.
|
||||
|
||||
{-
|
||||
-- AED's original definition of regV
|
||||
|
||||
regV = \word ->
|
||||
case word of {
|
||||
----AR AED's original definition of regV
|
||||
regV_orig : Str -> V = \wo ->
|
||||
case wo of {
|
||||
"يَ" + f@_ + c@_ + "ُ" + l@_ => v1 (f+c+l) a u ;
|
||||
"يَ" + f@_ + c@_ + "ِ" + l@_ => v1 (f+c+l) a i ;
|
||||
"يَ" + f@_ + c@_ + "َ" + l@_ => v1 (f+c+l) a a ;
|
||||
f@_ + "َ" + c@_ + "ِ" + l@_ => v1 (f+c+l) i a
|
||||
f@_ + "َ" + c@_ + "ِ" + l@_ => v1 (f+c+l) i a ;
|
||||
_ => Predef.error "regV not applicable"
|
||||
};
|
||||
-}
|
||||
|
||||
|
||||
regV_o : Str -> Str = \word ->
|
||||
case word of {
|
||||
"يَ" + f@_ + c@_ + "ُ" + l@_ => "a" ;
|
||||
"يَ" + f@_ + c@_ + "ِ" + l@_ => "b" ;
|
||||
"يَ" + f@_ + c@_ + "َ" + l@_ => "c" ;
|
||||
f@_ + "َ" + c@_ + "ِ" + l@_ => "d" ;
|
||||
_ => "q"
|
||||
};
|
||||
aa = a ; uu = u ; ii = i ;
|
||||
----AR for debug end
|
||||
|
||||
|
||||
---- begin workaround for a problem with pattern matching, AR 27/6/2008
|
||||
|
||||
|
||||