<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META NAME="generator" CONTENT="http://txt2tags.sf.net">
<TITLE>MOLTO Multilingual Phrasebook</TITLE>
</HEAD><BODY BGCOLOR="white" TEXT="black">
<P ALIGN="center"><CENTER><H1>MOLTO Multilingual Phrasebook</H1>
<FONT SIZE="4">
<I>Krasimir Angelov, Olga Caprotti, Ramona Enache, Thomas Hallgren, Aarne Ranta </I><BR>
</FONT></CENTER>
<P>
<HR>
<font size=-1>
</P>
<P>
History
</P>
<UL>
<LI>9 May. Version 0.7:
Danish and Norwegian added (preliminary versions induced from statistical models
and resource grammars).
<LI>3 May. Version 0.6:
Extended API (now final for release), Dutch added; new user interface with text
input enabled.
<LI>10 April. Some additions in API, comments in implementation; regenerated clones.
<LI>8 April. Added German.
<LI>7 April. Added the Clone script, applied to initiate the rest of MOLTO languages.
<LI>6 April. Version 0.4: weekdays, nationalities
<LI>30 March. Version 0.3: disambiguation grammar for English
<LI>28 March. Version 0.2: Swe, Ita; cat Action; small phrases.
<LI>26 March 2010. Version 0.1: Eng, Fin, Fre, Ron; dedicated minibar UI.
</UL>
<P>
<A HREF="missing.txt">Missing constructs</A>
</P>
<P>
<A HREF="http://tournesol.cs.chalmers.se/~aarne/phrasebook/phrasebook.html">Back to phrasebook</A>
</P>
<P>
</font>
<HR>
</P>
<H1>Purpose</H1>
<P>
This phrasebook is a program for translating touristic phrases
between the 15 European languages included in the
<A HREF="http://www.molto-project.eu">MOLTO</A> project
(Multilingual On-Line Translation):
</P>
<UL>
<LI>Bulgarian, Catalan, Danish, Dutch, English,
Finnish, French, German, Italian, Norwegian,
Polish, Romanian, Russian, Spanish, Swedish
</UL>
<P>
It is implemented in the GF programming language
(<A HREF="http://grammaticalframework.org">Grammatical Framework</A>).
It is the first demo for the MOLTO project, released in the third month (by June 2010)
but to be updated in the course of the project.
</P>
<P>
The phrasebook has the following requirements:
</P>
<UL>
<LI>high quality: reliable translations to express yourself in any language
<LI>translation between all pairs of languages
<LI>runnable in web browsers
<LI>runnable on mobile phones (also off-line: forthcoming for Android phones)
<LI>easily extensible by new words (forthcoming: semi-automatic extensions by users)
</UL>
<P>
The phrasebook is available as open-source software, licensed under GNU LGPL.
The source code resides in
<A HREF="http://code.haskell.org/gf/examples/phrasebook/"><CODE>code.haskell.org/gf/examples/phrasebook/</CODE></A>
</P>
<P>
Current status (9 May 2010):
</P>
<UL>
<LI>small but useful coverage in abstract syntax
<LI>reasonable implementations for
Bulgarian, Danish, Dutch, English, Finnish, French, German,
Italian, Norwegian, Romanian, Swedish
<LI>mostly just cloned for the rest of MOLTO languages
<LI>temporary user interface
<LI>works on web browsers calling a server
<LI>web service not yet released, but preliminarily available at
<A HREF="http://tournesol.cs.chalmers.se/~aarne/phrasebook/phrasebook.html"><CODE>http://tournesol.cs.chalmers.se/~aarne/phrasebook/phrasebook.html</CODE></A>
</UL>
<H1>Points illustrated</H1>
<P>
Interlingua-based translation.
</P>
<P>
Incremental parsing.
</P>
<P>
The use of resource grammars and functors.
</P>
<P>
Example-based grammar writing and grammar induction from statistical models (Google).
</P>
<P>
Compile-time transfer, especially in the category <CODE>Action</CODE> in <CODE>Words</CODE>.
</P>
<P>
Quasi-incremental translation: many basic types are also used as phrases.
</P>
<P>
Disambiguation, esp. of politeness distinctions.
</P>
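<P>
As a rough illustration of interlingua-based translation, the following GF shell
session parses an English phrase into an abstract syntax tree and linearizes that
tree in every language of the grammar. It is a sketch under assumptions: the
concrete syntax name <CODE>PhrasebookEng</CODE> is inferred from the module naming
used in this document, and the example phrase is invented and may fall outside
the current coverage.
</P>
<PRE>
&gt; i Phrasebook.pgf
&gt; p -lang=PhrasebookEng "where is the airport" | l -treebank
&gt; gr -number=3 | l -treebank
</PRE>
<P>
The last command does not depend on any particular input: it generates three random
trees and shows each of them together with its linearizations.
</P>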
<H1>Ontology</H1>
<P>
The abstract syntax defines the <B>ontology</B> behind the phrasebook.
Some explanations can be found in the
<A HREF="Ontology.html">ontology document</A>, which is produced from the
abstract syntax files
<A HREF="http://code.haskell.org/gf/examples/phrasebook/Sentences.gf"><CODE>Sentences.gf</CODE></A>
and
<A HREF="http://code.haskell.org/gf/examples/phrasebook/Words.gf"><CODE>Words.gf</CODE></A>
by <CODE>make doc</CODE>.
</P>
<H1>Files</H1>
<P>
<CODE>Sentences</CODE>: general syntactic structures implementable in a uniform way.
Concrete syntax via the functor <CODE>SentencesI</CODE>.
</P>
<P>
<CODE>Words</CODE>: words and predicates, typically language-dependent.
Separate concrete syntaxes.
</P>
<P>
<CODE>Greetings</CODE>: idiomatic phrases, string-based.
Separate concrete syntaxes.
</P>
<P>
<CODE>Phrasebook</CODE>: the top module putting everything together.
Separate concrete syntaxes.
</P>
<P>
<CODE>DisambPhrasebook</CODE>: disambiguation grammars generating feedback phrases if
the input language is ambiguous.
</P>
<P>
Here is the module structure as produced in GF by
</P>
<PRE>
&gt; i -retain DisambPhrasebookEng.gf
&gt; dg -only=Phrasebook*,Sentences*,Words*,Greetings*,DisambPhrasebookEng
&gt; ! dot -Tpng _gfdepgraph.dot &gt;pgraph.png
</PRE>
<P></P>
<P>
<IMG ALIGN="middle" SRC="pgraph.png" BORDER="0" ALT="">
</P>
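<P>
The functor arrangement used for <CODE>Sentences</CODE> can be sketched in miniature as follows.
This is a hypothetical toy grammar, not the actual phrasebook source: the module names
<CODE>MiniSentences</CODE>, <CODE>MiniSentencesI</CODE> and <CODE>MiniSentencesEng</CODE>
and the function <CODE>Is</CODE> are invented for illustration. Only the resource grammar
API (<CODE>Syntax</CODE>, <CODE>mkS</CODE>, <CODE>mkCl</CODE>) is taken as given.
In GF each module goes in its own file, named after the module.
</P>
<PRE>
-- MiniSentences.gf (hypothetical): the abstract syntax shared by all languages
abstract MiniSentences = {
  cat Sentence ; Item ; Quality ;
  fun Is : Item -&gt; Quality -&gt; Sentence ;
}

-- MiniSentencesI.gf (hypothetical): the functor, written once against
-- the language-independent resource grammar interface Syntax
incomplete concrete MiniSentencesI of MiniSentences = open Syntax in {
  lincat
    Sentence = S ;
    Item = NP ;
    Quality = AP ;
  lin
    Is item quality = mkS (mkCl item quality) ;
}

-- MiniSentencesEng.gf (hypothetical): one short instantiation per language
concrete MiniSentencesEng of MiniSentences = MiniSentencesI with
  (Syntax = SyntaxEng) ;
</PRE>
<P>
With this layout, adding a language to the uniform part of the grammar amounts to one
more instantiation of the functor, while the language-dependent material stays in
separate concrete syntaxes.
</P>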
<H1>To Do</H1>
<P>
Improved translation interface
</P>
<UL>
<LI>a nicer way to show disambiguation (maybe hidden by default)
</UL>
<P>
Complete the missing words and phrases
</P>
<P>
Disambiguation grammars for languages other than English
</P>
<P>
Extend the abstract lexicon in <CODE>Words</CODE> by hand or (semi)automatically for
</P>
<UL>
<LI>food stuff
<LI>languages
<LI>places
</UL>
<P>
Link to Google Translate, for fall-back and for comparison
</P>
<P>
Feedback facility in the UI
</P>
<P>
Customizable distribution: make your own selection of the 2^15 language subsets
when downloading the phrasebook to a phone
</P>
<H1>How to contribute</H1>
<P>
The basic things "everyone" can do are
</P>
<UL>
<LI>complete <A HREF="missing.txt">missing words</A> in concrete syntaxes
<LI>add new abstract words in <CODE>Words</CODE> and greetings in <CODE>Greetings</CODE>
</UL>
<P>
The missing concrete syntax entries are added to the <CODE>Words</CODE><I>L</I><CODE>.gf</CODE>
files for each language <I>L</I>. The
<A HREF="http://code.haskell.org/gf/lib/doc/synopsis.html#toc78">morphological paradigms</A>
of the GF resource library should be used. Actions (prefixed with <CODE>A</CODE>, such as <CODE>AWant</CODE>) are
a little more demanding, since they also require syntax constructors. Greetings (prefixed
with <CODE>G</CODE>) are pure strings.
</P>
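<P>
A minimal sketch of what such entries can look like, using the English paradigms of the
resource library. The modules <CODE>MiniWords</CODE> and <CODE>MiniWordsEng</CODE>, the
linearization types and the chosen words are hypothetical and serve only as an
illustration. The actual categories and function names should be checked in
<CODE>Words.gf</CODE> and <CODE>Greetings.gf</CODE>.
</P>
<PRE>
-- MiniWords.gf (hypothetical miniature, not the phrasebook's own module)
abstract MiniWords = {
  cat Kind ; Greeting ;
  fun
    Cheese, Fish : Kind ;
    GHello : Greeting ;
}

-- MiniWordsEng.gf (hypothetical): entries built with morphological paradigms
concrete MiniWordsEng of MiniWords = open SyntaxEng, ParadigmsEng in {
  lincat
    Kind = CN ;
    Greeting = {s : Str} ;
  lin
    Cheese = mkCN (mkN "cheese") ;     -- regular noun: one form is enough
    Fish = mkCN (mkN "fish" "fish") ;  -- plural given explicitly
    GHello = {s = "hello"} ;           -- a greeting is a plain string
}
</PRE>
<P>
Giving only the forms a paradigm needs, rather than spelling out inflection tables by hand,
keeps the entries short and lets the resource library take care of the morphology.
</P>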
<P>
Some explanations can be found in the
<A HREF="Implementation.html">implementation document</A>, which is produced from the
concrete syntax files
<A HREF="http://code.haskell.org/gf/examples/phrasebook/SentencesI.gf"><CODE>SentencesI.gf</CODE></A>
and
<A HREF="http://code.haskell.org/gf/examples/phrasebook/WordsEng.gf"><CODE>WordsEng.gf</CODE></A>
by <CODE>make doc</CODE>.
</P>
<P>
Here are the steps to follow for contributors:
</P>
<OL>
<LI>Make sure you have the latest sources
from <A HREF="http://www.grammaticalframework.org/doc/gf-developers.html">GF Darcs</A>,
using <CODE>darcs pull</CODE>.
<LI>Also make sure that you have compiled the library by <CODE>make present</CODE> in <CODE>gf/lib/src/</CODE>.
<LI>Work in the directory
<A HREF="http://code.haskell.org/gf/examples/phrasebook/"><CODE>gf/examples/phrasebook/</CODE></A>.
<LI>After you've finished your contribution, recompile the phrasebook by <CODE>make pgf</CODE>.
<LI>Record your changes with <CODE>darcs record .</CODE> (in the <CODE>phrasebook</CODE> subdirectory).
<LI>Make a patch file with <CODE>darcs send -o my_phrasebook_patch</CODE>, which you can
send to GF maintainers.
<LI>(Recommended:) Test the phrasebook on your local server:
<OL>
<LI>Go to <CODE>gf/src/server/</CODE> and follow the instructions in the
<A HREF="http://code.google.com/p/grammatical-framework/wiki/LaunchWebDemos">project Wiki</A>.
<LI>Make sure that <CODE>Phrasebook.pgf</CODE> is available to your GF server (see project wiki).
<LI>Launch <CODE>lighttpd</CODE> (see project wiki).
<LI>Now you can open <CODE>gf/examples/phrasebook/www/phrasebook.html</CODE> and use your phrasebook!
</OL>
</OL>
<UL>
<LI>Don't delete anything! But you are free to correct incorrect forms.
<LI>Don't change the module structure!
<LI>Don't compromise quality to gain coverage: <I>non multa sed multum!</I>
<P></P>
</UL>
<!-- html code generated by txt2tags 2.5 (http://txt2tags.sf.net) -->
<!-- cmdline: txt2tags -thtml phrasebook.txt -->
</BODY></HTML>