forked from GitHub/gf-core
finishing phrasebook documentation ; changed doc name
@@ -29,7 +29,8 @@ doc:
|
|||||||
rm -f Ontology.gf
|
rm -f Ontology.gf
|
||||||
cat SentencesI.gf WordsEng.gf >Implementation.gf
|
cat SentencesI.gf WordsEng.gf >Implementation.gf
|
||||||
gfdoc Implementation.gf
|
gfdoc Implementation.gf
|
||||||
txt2tags -thtml --toc phrasebook.txt
|
txt2tags -thtml --toc doc-phrasebook.txt
|
||||||
|
txt2tags -thtml help-phrasebook.txt
|
||||||
rm -f Ontology.gf Implementation.gf
|
rm -f Ontology.gf Implementation.gf
|
||||||
|
|
||||||
upload:: Phrasebook.pgf
|
upload:: Phrasebook.pgf
|
||||||
|
|||||||
@@ -17,14 +17,18 @@ Showcase for project FP7-ICT-247914, Deliverable D10.2.
|
|||||||
<UL>
|
<UL>
|
||||||
<LI><A HREF="#toc1">Purpose</A>
|
<LI><A HREF="#toc1">Purpose</A>
|
||||||
<LI><A HREF="#toc2">Points illustrated</A>
|
<LI><A HREF="#toc2">Points illustrated</A>
|
||||||
<LI><A HREF="#toc3">Ontology</A>
|
<UL>
|
||||||
<LI><A HREF="#toc4">Files</A>
|
<LI><A HREF="#toc3">From the user perspective</A>
|
||||||
<LI><A HREF="#toc5">To Do</A>
|
<LI><A HREF="#toc4">From the programmer's perspective</A>
|
||||||
<LI><A HREF="#toc6">How to contribute</A>
|
</UL>
|
||||||
|
<LI><A HREF="#toc5">Ontology</A>
|
||||||
|
<LI><A HREF="#toc6">Files</A>
|
||||||
<LI><A HREF="#toc7">Effort and cost</A>
|
<LI><A HREF="#toc7">Effort and cost</A>
|
||||||
<LI><A HREF="#toc8">Example-based grammar writing prototype</A>
|
<LI><A HREF="#toc8">Example-based grammar writing prototype</A>
|
||||||
<LI><A HREF="#toc9">Conclusions (tentative)</A>
|
<LI><A HREF="#toc9">To Do</A>
|
||||||
<LI><A HREF="#toc10">Acknowledgements</A>
|
<LI><A HREF="#toc10">How to contribute</A>
|
||||||
|
<LI><A HREF="#toc11">Conclusions (tentative)</A>
|
||||||
|
<LI><A HREF="#toc12">Acknowledgements</A>
|
||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
<P></P>
|
<P></P>
|
||||||
@@ -65,7 +69,7 @@ History
|
|||||||
<A HREF="missing.txt">Missing constructs</A>
|
<A HREF="missing.txt">Missing constructs</A>
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
<A HREF="http://www.grammaticalframework.org/demos/phrasebook/">Back to phrasebook</A>
|
<A HREF="http://www.grammaticalframework.org/demos/phrasebook/">Back to the phrasebook</A>
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
</font>
|
</font>
|
||||||
@@ -86,7 +90,10 @@ between 14 European languages included in the
|
|||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
It is implemented by using the GF programming language
|
A Russian version is not yet finished but is projected later. Also other languages may be added.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
The phrasebook is implemented by using the GF programming language
|
||||||
(<A HREF="http://grammaticalframework.org">Grammatical Framework</A>).
|
(<A HREF="http://grammaticalframework.org">Grammatical Framework</A>).
|
||||||
It is the first demo for the MOLTO project, released in the third month (by June 2010).
|
It is the first demo for the MOLTO project, released in the third month (by June 2010).
|
||||||
The first version is a very small system, but it will be extended in the course of the project.
|
The first version is a very small system, but it will be extended in the course of the project.
|
||||||
@@ -95,10 +102,10 @@ The first version is a very small system, but it will extended in the course of
|
|||||||
The phrasebook has the following requirement specification:
|
The phrasebook has the following requirement specification:
|
||||||
</P>
|
</P>
|
||||||
<UL>
|
<UL>
|
||||||
<LI>high quality: reliable translations to express yourself in any language
|
<LI>high quality: reliable translations to express yourself in any of the languages
|
||||||
<LI>translation between all pairs of languages
|
<LI>translation between all pairs of languages
|
||||||
<LI>runnable in web browsers
|
<LI>runnable in web browsers
|
||||||
<LI>runnable on mobile phones (forthcoming: Android phones)
|
<LI>runnable on mobile phones (via web browser; Android stand-alone forthcoming)
|
||||||
<LI>easily extensible by new words (forthcoming: semi-automatic extensions by users)
|
<LI>easily extensible by new words (forthcoming: semi-automatic extensions by users)
|
||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
@@ -109,6 +116,8 @@ The source code resides in
|
|||||||
</P>
|
</P>
|
||||||
<A NAME="toc2"></A>
|
<A NAME="toc2"></A>
|
||||||
<H1>Points illustrated</H1>
|
<H1>Points illustrated</H1>
|
||||||
|
<A NAME="toc3"></A>
|
||||||
|
<H2>From the user perspective</H2>
|
||||||
<P>
|
<P>
|
||||||
Interlingua-based translation
|
Interlingua-based translation
|
||||||
</P>
|
</P>
|
||||||
@@ -124,28 +133,10 @@ Incremental parsing
|
|||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
The use of resource grammars and functors
|
Mixed modalities
|
||||||
</P>
|
</P>
|
||||||
<UL>
|
<UL>
|
||||||
<LI>the translator was implemented on top of an earlier linguistic knowledge base,
|
<LI>selection of words ("fridge magnets") combined with text input
|
||||||
the <A HREF="http://grammaticalframework.com/lib">GF Resource Grammar Library</A>
|
|
||||||
</UL>
|
|
||||||
|
|
||||||
<P>
|
|
||||||
Example-based grammar writing and grammar induction from statistical models
|
|
||||||
(<A HREF="http://translate.google.com">Google translate</A>)
|
|
||||||
</P>
|
|
||||||
<UL>
|
|
||||||
<LI>many of the grammars were created semi-automatically by generalization from
|
|
||||||
examples
|
|
||||||
</UL>
|
|
||||||
|
|
||||||
<P>
|
|
||||||
Compile-time transfer: especially, in Action in Words
|
|
||||||
</P>
|
|
||||||
<UL>
|
|
||||||
<LI>the structural differences between languages are treated at compile time,
|
|
||||||
for maximal run-time efficiency
|
|
||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
@@ -174,7 +165,34 @@ Fall-back to statistical translation
|
|||||||
Feed-back from users
|
Feed-back from users
|
||||||
</P>
|
</P>
|
||||||
<UL>
|
<UL>
|
||||||
<LI>you are welcome to send comments, bug reports, and better translation suggestions!
|
<LI>users are welcome to send comments, bug reports, and better translation suggestions
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<A NAME="toc4"></A>
|
||||||
|
<H2>From the programmer's perspective</H2>
|
||||||
|
<P>
|
||||||
|
The use of resource grammars and functors
|
||||||
|
</P>
|
||||||
|
<UL>
|
||||||
|
<LI>the translator was implemented on top of an earlier linguistic knowledge base,
|
||||||
|
the <A HREF="http://grammaticalframework.com/lib">GF Resource Grammar Library</A>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>
|
||||||
|
Example-based grammar writing and grammar induction from statistical models
|
||||||
|
(<A HREF="http://translate.google.com">Google translate</A>)
|
||||||
|
</P>
|
||||||
|
<UL>
|
||||||
|
<LI>many of the grammars were created semi-automatically by generalization from
|
||||||
|
examples
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>
|
||||||
|
Compile-time transfer: especially, in Action in Words
|
||||||
|
</P>
|
||||||
|
<UL>
|
||||||
|
<LI>the structural differences between languages are treated at compile time,
|
||||||
|
for maximal run-time efficiency
|
||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
@@ -191,7 +209,7 @@ Grammar testing
|
|||||||
<LI>use of treebanks with guided random generation for initial evaluation and regression testing
|
<LI>use of treebanks with guided random generation for initial evaluation and regression testing
|
||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
<A NAME="toc3"></A>
|
<A NAME="toc5"></A>
|
||||||
<H1>Ontology</H1>
|
<H1>Ontology</H1>
|
||||||
<P>
|
<P>
|
||||||
The abstract syntax defines the <B>ontology</B> behind the phrasebook.
|
The abstract syntax defines the <B>ontology</B> behind the phrasebook.
|
||||||
@@ -203,7 +221,7 @@ and
|
|||||||
<A HREF="http://code.haskell.org/gf/examples/phrasebook/Words.gf"><CODE>Words.gf</CODE></A>
|
<A HREF="http://code.haskell.org/gf/examples/phrasebook/Words.gf"><CODE>Words.gf</CODE></A>
|
||||||
by <CODE>make doc</CODE>.
|
by <CODE>make doc</CODE>.
|
||||||
</P>
|
</P>
|
||||||
<A NAME="toc4"></A>
|
<A NAME="toc6"></A>
|
||||||
<H1>Files</H1>
|
<H1>Files</H1>
|
||||||
<P>
|
<P>
|
||||||
<CODE>Sentences</CODE>: general syntactic structures implementable in a uniform way.
|
<CODE>Sentences</CODE>: general syntactic structures implementable in a uniform way.
|
||||||
@@ -233,91 +251,16 @@ Here is the module structure as produced in GF by
|
|||||||
</P>
|
</P>
|
||||||
<PRE>
|
<PRE>
|
||||||
> i -retain DisambPhrasebookEng.gf
|
> i -retain DisambPhrasebookEng.gf
|
||||||
> dg -only=Phrasebook*,Sentences*,Words*,Greetings*,DisambPhrasebookEng
|
> dg -only=Phrasebook*,Sentences*,Words*,Greetings*,Numeral,NumeralEng,DisambPhrasebookEng
|
||||||
> ! dot -Tpng _gfdepgraph.dot >pgraph.png
|
> ! dot -Tpng _gfdepgraph.dot >pgraph.png
|
||||||
</PRE>
|
</PRE>
|
||||||
<P></P>
|
<P></P>
|
||||||
<P>
|
<P>
|
||||||
<IMG ALIGN="middle" SRC="pgraph.png" BORDER="0" ALT="">
|
<IMG ALIGN="middle" SRC="npgraph.png" BORDER="0" ALT="">
|
||||||
</P>
|
</P>
|
||||||
<A NAME="toc5"></A>
|
|
||||||
<H1>To Do</H1>
|
|
||||||
<P>
|
|
||||||
Disambiguation grammars for other languages than English
|
|
||||||
</P>
|
|
||||||
<P>
|
|
||||||
Extend the abstract lexicon in <CODE>Words</CODE> by hand or (semi)automatically for
|
|
||||||
</P>
|
|
||||||
<UL>
|
|
||||||
<LI>food stuff
|
|
||||||
<LI>places
|
|
||||||
<LI>actions
|
|
||||||
</UL>
|
|
||||||
|
|
||||||
<P>
|
|
||||||
Customizable phone distribution: make your own selection of the 2^15 language subsets
|
|
||||||
when downloading the phrasebook to a phone
|
|
||||||
</P>
|
|
||||||
<A NAME="toc6"></A>
|
|
||||||
<H1>How to contribute</H1>
|
|
||||||
<P>
|
|
||||||
The basic things "everyone" can do are
|
|
||||||
</P>
|
|
||||||
<UL>
|
|
||||||
<LI>complete <A HREF="missing.txt">missing words</A> in concrete syntaxes
|
|
||||||
<LI>add new abstract words in <CODE>Words</CODE> and greetings in <CODE>Greetings</CODE>
|
|
||||||
</UL>
|
|
||||||
|
|
||||||
<P>
|
|
||||||
The missing concrete syntax entries are added to the <CODE>Words</CODE><I>L</I><CODE>.gf</CODE>
|
|
||||||
files for each language <I>L</I>. The
|
|
||||||
<A HREF="http://code.haskell.org/gf/lib/doc/synopsis.html#toc78">morphological paradigms</A>
|
|
||||||
of the GF resource library should be used. Actions (prefixed with <CODE>A</CODE>, as <CODE>AWant</CODE>) are
|
|
||||||
a little more demanding, since they also require syntax constructors. Greetings (prefixed
|
|
||||||
with <CODE>G</CODE>) are pure strings.
|
|
||||||
</P>
|
|
||||||
<P>
|
|
||||||
Some explanations can be found in the
|
|
||||||
<A HREF="Implementation.html">implementation document</A>, which is produced from the
|
|
||||||
concrete syntax files
|
|
||||||
<A HREF="http://code.haskell.org/gf/examples/phrasebook/SentencesI.gf"><CODE>SentencesI.gf</CODE></A>
|
|
||||||
and
|
|
||||||
<A HREF="http://code.haskell.org/gf/examples/phrasebook/WordsEng.gf"><CODE>WordsEng.gf</CODE></A>
|
|
||||||
by <CODE>make doc</CODE>.
|
|
||||||
</P>
|
|
||||||
<P>
|
|
||||||
Here are the steps to follow for contributors:
|
|
||||||
</P>
|
|
||||||
<OL>
|
|
||||||
<LI>Make sure you have the latest sources
|
|
||||||
from <A HREF="http://www.grammaticalframework.org/doc/gf-developers.html">GF Darcs</A>,
|
|
||||||
using <CODE>darcs pull</CODE>.
|
|
||||||
<LI>Also make sure that you have compiled the library by <CODE>make present</CODE> in <CODE>gf/lib/src/</CODE>.
|
|
||||||
<LI>Work in the directory
|
|
||||||
<A HREF="http://code.haskell.org/gf/examples/phrasebook/"><CODE>gf/examples/phrasebook/</CODE></A>.
|
|
||||||
<LI>After you've finished your contribution, recompile the phrasebook by <CODE>make pgf</CODE>.
|
|
||||||
<LI>Save your changes in <CODE>darcs record .</CODE> (in the <CODE>phrasebook</CODE> subdirectory).
|
|
||||||
<LI>Make a patch file with <CODE>darcs send -o my_phrasebook_patch</CODE>, which you can
|
|
||||||
send to GF maintainers.
|
|
||||||
<LI>(Recommended:) Test the phrasebook on your local server:
|
|
||||||
<OL>
|
|
||||||
<LI>Go to <CODE>gf/src/server/</CODE> and follow the instructions in the
|
|
||||||
<A HREF="http://code.google.com/p/grammatical-framework/wiki/LaunchWebDemos">project Wiki</A>.
|
|
||||||
<LI>Make sure that <CODE>Phrasebook.pgf</CODE> is available to your GF server (see project wiki).
|
|
||||||
<LI>Launch <CODE>lighttpd</CODE> (see project wiki).
|
|
||||||
<LI>Now you can open <CODE>gf/examples/phrasebook/www/phrasebook.html</CODE> and use your phrasebook!
|
|
||||||
</OL>
|
|
||||||
</OL>
|
|
||||||
|
|
||||||
<UL>
|
|
||||||
<LI>Don't delete anything! But you are free to correct incorrect forms.
|
|
||||||
<LI>Don't change the module structure!
|
|
||||||
<LI>Don't compromise quality to gain coverage: <I>non multa sed multum!</I>
|
|
||||||
</UL>
|
|
||||||
|
|
||||||
<A NAME="toc7"></A>
|
<A NAME="toc7"></A>
|
||||||
<H1>Effort and cost</H1>
|
<H1>Effort and cost</H1>
|
||||||
<TABLE BORDER="1" CELLPADDING="4">
|
<TABLE CELLPADDING="4" BORDER="1">
|
||||||
<TR>
|
<TR>
|
||||||
<TH>Language</TH>
|
<TH>Language</TH>
|
||||||
<TH>Grammarian's language skills</TH>
|
<TH>Grammarian's language skills</TH>
|
||||||
@@ -359,7 +302,7 @@ Here are the steps to follow for contributors:
|
|||||||
<TD ALIGN="center">+</TD>
|
<TD ALIGN="center">+</TD>
|
||||||
<TD ALIGN="center">+</TD>
|
<TD ALIGN="center">+</TD>
|
||||||
<TD ALIGN="center">##</TD>
|
<TD ALIGN="center">##</TD>
|
||||||
<TD ALIGN="center">##</TD>
|
<TD ALIGN="center">#</TD>
|
||||||
<TD ALIGN="center">##</TD>
|
<TD ALIGN="center">##</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
@@ -598,6 +541,81 @@ round and 2 rounds were needed in average for the languages for which we perform
|
|||||||
the experiment. It is possible that more effort is needed for more complex languages.
|
the experiment. It is possible that more effort is needed for more complex languages.
|
||||||
</P>
|
</P>
|
||||||
<A NAME="toc9"></A>
|
<A NAME="toc9"></A>
|
||||||
|
<H1>To Do</H1>
|
||||||
|
<P>
|
||||||
|
Disambiguation grammars for other languages than English
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
Extend the abstract lexicon in <CODE>Words</CODE> by hand or (semi)automatically for
|
||||||
|
</P>
|
||||||
|
<UL>
|
||||||
|
<LI>food stuff
|
||||||
|
<LI>places
|
||||||
|
<LI>actions
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>
|
||||||
|
Customizable phone distribution: make your own selection of the 2^15 language subsets
|
||||||
|
when downloading the phrasebook to a phone
|
||||||
|
</P>
|
||||||
|
<A NAME="toc10"></A>
|
||||||
|
<H1>How to contribute</H1>
|
||||||
|
<P>
|
||||||
|
The basic things "everyone" can do are
|
||||||
|
</P>
|
||||||
|
<UL>
|
||||||
|
<LI>complete <A HREF="missing.txt">missing words</A> in concrete syntaxes
|
||||||
|
<LI>add new abstract words in <CODE>Words</CODE> and greetings in <CODE>Greetings</CODE>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>
|
||||||
|
The missing concrete syntax entries are added to the <CODE>Words</CODE><I>L</I><CODE>.gf</CODE>
|
||||||
|
files for each language <I>L</I>. The
|
||||||
|
<A HREF="http://code.haskell.org/gf/lib/doc/synopsis.html#toc78">morphological paradigms</A>
|
||||||
|
of the GF resource library should be used. Actions (prefixed with <CODE>A</CODE>, as <CODE>AWant</CODE>) are
|
||||||
|
a little more demanding, since they also require syntax constructors. Greetings (prefixed
|
||||||
|
with <CODE>G</CODE>) are pure strings.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
Some explanations can be found in the
|
||||||
|
<A HREF="Implementation.html">implementation document</A>, which is produced from the
|
||||||
|
concrete syntax files
|
||||||
|
<A HREF="http://code.haskell.org/gf/examples/phrasebook/SentencesI.gf"><CODE>SentencesI.gf</CODE></A>
|
||||||
|
and
|
||||||
|
<A HREF="http://code.haskell.org/gf/examples/phrasebook/WordsEng.gf"><CODE>WordsEng.gf</CODE></A>
|
||||||
|
by <CODE>make doc</CODE>.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
Here are the steps to follow for contributors:
|
||||||
|
</P>
|
||||||
|
<OL>
|
||||||
|
<LI>Make sure you have the latest sources
|
||||||
|
from <A HREF="http://www.grammaticalframework.org/doc/gf-developers.html">GF Darcs</A>,
|
||||||
|
using <CODE>darcs pull</CODE>.
|
||||||
|
<LI>Also make sure that you have compiled the library by <CODE>make present</CODE> in <CODE>gf/lib/src/</CODE>.
|
||||||
|
<LI>Work in the directory
|
||||||
|
<A HREF="http://code.haskell.org/gf/examples/phrasebook/"><CODE>gf/examples/phrasebook/</CODE></A>.
|
||||||
|
<LI>After you've finished your contribution, recompile the phrasebook by <CODE>make pgf</CODE>.
|
||||||
|
<LI>Save your changes in <CODE>darcs record .</CODE> (in the <CODE>phrasebook</CODE> subdirectory).
|
||||||
|
<LI>Make a patch file with <CODE>darcs send -o my_phrasebook_patch</CODE>, which you can
|
||||||
|
send to GF maintainers.
|
||||||
|
<LI>(Recommended:) Test the phrasebook on your local server:
|
||||||
|
<OL>
|
||||||
|
<LI>Go to <CODE>gf/src/server/</CODE> and follow the instructions in the
|
||||||
|
<A HREF="http://code.google.com/p/grammatical-framework/wiki/LaunchWebDemos">project Wiki</A>.
|
||||||
|
<LI>Make sure that <CODE>Phrasebook.pgf</CODE> is available to your GF server (see project wiki).
|
||||||
|
<LI>Launch <CODE>lighttpd</CODE> (see project wiki).
|
||||||
|
<LI>Now you can open <CODE>gf/examples/phrasebook/www/phrasebook.html</CODE> and use your phrasebook!
|
||||||
|
</OL>
|
||||||
|
</OL>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI>Don't delete anything! But you are free to correct incorrect forms.
|
||||||
|
<LI>Don't change the module structure!
|
||||||
|
<LI>Don't compromise quality to gain coverage: <I>non multa sed multum!</I>
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<A NAME="toc11"></A>
|
||||||
<H1>Conclusions (tentative)</H1>
|
<H1>Conclusions (tentative)</H1>
|
||||||
<P>
|
<P>
|
||||||
The grammarian need not be a native speaker of the language.
|
The grammarian need not be a native speaker of the language.
|
||||||
@@ -630,7 +648,7 @@ Resource grammars should give some more support
|
|||||||
<LI>large-scale morphological lexica
|
<LI>large-scale morphological lexica
|
||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
<A NAME="toc10"></A>
|
<A NAME="toc12"></A>
|
||||||
<H1>Acknowledgements</H1>
|
<H1>Acknowledgements</H1>
|
||||||
<P>
|
<P>
|
||||||
The Phrasebook has been built in the MOLTO project funded by the European Commission.
|
The Phrasebook has been built in the MOLTO project funded by the European Commission.
|
||||||
@@ -646,6 +664,6 @@ Willard Rafnsson,
|
|||||||
Nick Smallbone.
|
Nick Smallbone.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<!-- html code generated by txt2tags 2.5 (http://txt2tags.sf.net) -->
|
<!-- html code generated by txt2tags 2.4 (http://txt2tags.sf.net) -->
|
||||||
<!-- cmdline: txt2tags -thtml -\-toc phrasebook.txt -->
|
<!-- cmdline: txt2tags -thtml -\-toc doc-phrasebook.txt -->
|
||||||
</BODY></HTML>
|
</BODY></HTML>
|
||||||
@@ -41,7 +41,7 @@ History
|
|||||||
|
|
||||||
[Missing constructs missing.txt]
|
[Missing constructs missing.txt]
|
||||||
|
|
||||||
[Back to phrasebook http://www.grammaticalframework.org/demos/phrasebook/]
|
[Back to the phrasebook http://www.grammaticalframework.org/demos/phrasebook/]
|
||||||
|
|
||||||
#ESMALL
|
#ESMALL
|
||||||
#HR
|
#HR
|
||||||
@@ -58,16 +58,18 @@ between 14 European languages included in the
|
|||||||
Polish, Romanian, Spanish, Swedish
|
Polish, Romanian, Spanish, Swedish
|
||||||
|
|
||||||
|
|
||||||
It is implemented by using the GF programming language
|
A Russian version is not yet finished but is projected later. Also other languages may be added.
|
||||||
|
|
||||||
|
The phrasebook is implemented by using the GF programming language
|
||||||
([Grammatical Framework http://grammaticalframework.org]).
|
([Grammatical Framework http://grammaticalframework.org]).
|
||||||
It is the first demo for the MOLTO project, released in the third month (by June 2010).
|
It is the first demo for the MOLTO project, released in the third month (by June 2010).
|
||||||
The first version is a very small system, but it will be extended in the course of the project.
|
The first version is a very small system, but it will be extended in the course of the project.
|
||||||
|
|
||||||
The phrasebook has the following requirement specification:
|
The phrasebook has the following requirement specification:
|
||||||
- high quality: reliable translations to express yourself in any language
|
- high quality: reliable translations to express yourself in any of the languages
|
||||||
- translation between all pairs of languages
|
- translation between all pairs of languages
|
||||||
- runnable in web browsers
|
- runnable in web browsers
|
||||||
- runnable on mobile phones (forthcoming: Android phones)
|
- runnable on mobile phones (via web browser; Android stand-alone forthcoming)
|
||||||
- easily extensible by new words (forthcoming: semi-automatic extensions by users)
|
- easily extensible by new words (forthcoming: semi-automatic extensions by users)
|
||||||
|
|
||||||
|
|
||||||
@@ -79,6 +81,9 @@ The source code resides in
|
|||||||
|
|
||||||
=Points illustrated=
|
=Points illustrated=
|
||||||
|
|
||||||
|
|
||||||
|
==From the user perspective==
|
||||||
|
|
||||||
Interlingua-based translation
|
Interlingua-based translation
|
||||||
- we translate meanings, rather than words
|
- we translate meanings, rather than words
|
||||||
|
|
||||||
@@ -87,20 +92,8 @@ Incremental parsing
|
|||||||
- the user is at every point guided by the list of possible next words
|
- the user is at every point guided by the list of possible next words
|
||||||
|
|
||||||
|
|
||||||
The use of resource grammars and functors
|
Mixed modalities
|
||||||
- the translator was implemented on top of an earlier linguistic knowledge base,
|
- selection of words ("fridge magnets") combined with text input
|
||||||
the [GF Resource Grammar Library http://grammaticalframework.com/lib]
|
|
||||||
|
|
||||||
|
|
||||||
Example-based grammar writing and grammar induction from statistical models
|
|
||||||
([Google translate http://translate.google.com])
|
|
||||||
- many of the grammars were created semi-automatically by generalization from
|
|
||||||
examples
|
|
||||||
|
|
||||||
|
|
||||||
Compile-time transfer: especially, in Action in Words
|
|
||||||
- the structural differences between languages are treated at compile time,
|
|
||||||
for maximal run-time efficiency
|
|
||||||
|
|
||||||
|
|
||||||
Quasi-incremental translation: many basic types are also used as phrases
|
Quasi-incremental translation: many basic types are also used as phrases
|
||||||
@@ -117,7 +110,26 @@ Fall-back to statistical translation
|
|||||||
|
|
||||||
|
|
||||||
Feed-back from users
|
Feed-back from users
|
||||||
- you are welcome to send comments, bug reports, and better translation suggestions!
|
- users are welcome to send comments, bug reports, and better translation suggestions
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
==From the programmer's perspective==
|
||||||
|
|
||||||
|
The use of resource grammars and functors
|
||||||
|
- the translator was implemented on top of an earlier linguistic knowledge base,
|
||||||
|
the [GF Resource Grammar Library http://grammaticalframework.com/lib]
|
||||||
|
|
||||||
|
|
||||||
|
Example-based grammar writing and grammar induction from statistical models
|
||||||
|
([Google translate http://translate.google.com])
|
||||||
|
- many of the grammars were created semi-automatically by generalization from
|
||||||
|
examples
|
||||||
|
|
||||||
|
|
||||||
|
Compile-time transfer: especially, in Action in Words
|
||||||
|
- the structural differences between languages are treated at compile time,
|
||||||
|
for maximal run-time efficiency
|
||||||
|
|
||||||
|
|
||||||
The level of skills involved in grammar development
|
The level of skills involved in grammar development
|
||||||
@@ -167,72 +179,11 @@ the input language is ambiguous.
|
|||||||
Here is the module structure as produced in GF by
|
Here is the module structure as produced in GF by
|
||||||
```
|
```
|
||||||
> i -retain DisambPhrasebookEng.gf
|
> i -retain DisambPhrasebookEng.gf
|
||||||
> dg -only=Phrasebook*,Sentences*,Words*,Greetings*,DisambPhrasebookEng
|
> dg -only=Phrasebook*,Sentences*,Words*,Greetings*,Numeral,NumeralEng,DisambPhrasebookEng
|
||||||
> ! dot -Tpng _gfdepgraph.dot >pgraph.png
|
> ! dot -Tpng _gfdepgraph.dot >pgraph.png
|
||||||
```
|
```
|
||||||
|
|
||||||
[pgraph.png]
|
[npgraph.png]
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
=To Do=
|
|
||||||
|
|
||||||
Disambiguation grammars for other languages than English
|
|
||||||
|
|
||||||
Extend the abstract lexicon in ``Words`` by hand or (semi)automatically for
|
|
||||||
- food stuff
|
|
||||||
- places
|
|
||||||
- actions
|
|
||||||
|
|
||||||
|
|
||||||
Customizable phone distribution: make your own selection of the 2^15 language subsets
|
|
||||||
when downloading the phrasebook to a phone
|
|
||||||
|
|
||||||
|
|
||||||
=How to contribute=
|
|
||||||
|
|
||||||
The basic things "everyone" can do are
|
|
||||||
- complete [missing words missing.txt] in concrete syntaxes
|
|
||||||
- add new abstract words in ``Words`` and greetings in ``Greetings``
|
|
||||||
|
|
||||||
|
|
||||||
The missing concrete syntax entries are added to the ``Words``//L//``.gf``
|
|
||||||
files for each language //L//. The
|
|
||||||
[morphological paradigms http://code.haskell.org/gf/lib/doc/synopsis.html#toc78]
|
|
||||||
of the GF resource library should be used. Actions (prefixed with ``A``, as ``AWant``) are
|
|
||||||
a little more demanding, since they also require syntax constructors. Greetings (prefixed
|
|
||||||
with ``G``) are pure strings.
|
|
||||||
|
|
||||||
Some explanations can be found in the
|
|
||||||
[implementation document Implementation.html], which is produced from the
|
|
||||||
concrete syntax files
|
|
||||||
[``SentencesI.gf`` http://code.haskell.org/gf/examples/phrasebook/SentencesI.gf]
|
|
||||||
and
|
|
||||||
[``WordsEng.gf`` http://code.haskell.org/gf/examples/phrasebook/WordsEng.gf]
|
|
||||||
by ``make doc``.
|
|
||||||
|
|
||||||
Here are the steps to follow for contributors:
|
|
||||||
+ Make sure you have the latest sources
|
|
||||||
from [GF Darcs http://www.grammaticalframework.org/doc/gf-developers.html],
|
|
||||||
using ``darcs pull``.
|
|
||||||
+ Also make sure that you have compiled the library by ``make present`` in ``gf/lib/src/``.
|
|
||||||
+ Work in the directory
|
|
||||||
[``gf/examples/phrasebook/`` http://code.haskell.org/gf/examples/phrasebook/].
|
|
||||||
+ After you've finished your contribution, recompile the phrasebook by ``make pgf``.
|
|
||||||
+ Save your changes in ``darcs record .`` (in the ``phrasebook`` subdirectory).
|
|
||||||
+ Make a patch file with ``darcs send -o my_phrasebook_patch``, which you can
|
|
||||||
send to GF maintainers.
|
|
||||||
+ (Recommended:) Test the phrasebook on your local server:
|
|
||||||
+ Go to ``gf/src/server/`` and follow the instructions in the
|
|
||||||
[project Wiki http://code.google.com/p/grammatical-framework/wiki/LaunchWebDemos].
|
|
||||||
+ Make sure that ``Phrasebook.pgf`` is available to your GF server (see project wiki).
|
|
||||||
+ Launch ``lighttpd`` (see project wiki).
|
|
||||||
+ Now you can open ``gf/examples/phrasebook/www/phrasebook.html`` and use your phrasebook!
|
|
||||||
|
|
||||||
|
|
||||||
- Don't delete anything! But you are free to correct incorrect forms.
|
|
||||||
- Don't change the module structure!
|
|
||||||
- Don't compromise quality to gain coverage: //non multa sed multum!//
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@@ -241,7 +192,7 @@ Here are the steps to follow for contributors:
|
|||||||
|| Language | Grammarian's language skills | Grammarian's GF skills | Informant used for development | Informant used for testing | Use of external tools | Impact of external tools | Changes on the resource grammar | Development time ||
|
|| Language | Grammarian's language skills | Grammarian's GF skills | Informant used for development | Informant used for testing | Use of external tools | Impact of external tools | Changes on the resource grammar | Development time ||
|
||||||
| Bulgarian | ### | ### | - | - | - | ? | # | ## |
|
| Bulgarian | ### | ### | - | - | - | ? | # | ## |
|
||||||
| Catalan | ### | ### | - | - | - | ? | # | # |
|
| Catalan | ### | ### | - | - | - | ? | # | # |
|
||||||
| Danish | - | ### | + | + | + | ## | ## | ## |
|
| Danish | - | ### | + | + | + | ## | # | ## |
|
||||||
| Dutch | - | ### | + | + | + | ## | # | ## |
|
| Dutch | - | ### | + | + | + | ## | # | ## |
|
||||||
| English | ## | ### | - | + | - | - | _ | # |
|
| English | ## | ### | - | + | - | - | _ | # |
|
||||||
| Finnish | ### | ### | - | - | - | ? | # | ## |
|
| Finnish | ### | ### | - | - | - | ? | # | ## |
|
||||||
@@ -344,6 +295,68 @@ round and 2 rounds were needed in average for the languages for which we perform
|
|||||||
the experiment. It is possible that more effort is needed for more complex languages.
|
the experiment. It is possible that more effort is needed for more complex languages.
|
||||||
|
|
||||||
|
|
||||||
|
=To Do=
|
||||||
|
|
||||||
|
Disambiguation grammars for other languages than English
|
||||||
|
|
||||||
|
Extend the abstract lexicon in ``Words`` by hand or (semi)automatically for
|
||||||
|
- food stuff
|
||||||
|
- places
|
||||||
|
- actions
|
||||||
|
|
||||||
|
|
||||||
|
Customizable phone distribution: make your own selection of the 2^15 language subsets
|
||||||
|
when downloading the phrasebook to a phone
|
||||||
|
|
||||||
|
|
||||||
|
=How to contribute=
|
||||||
|
|
||||||
|
The basic things "everyone" can do are
|
||||||
|
- complete [missing words missing.txt] in concrete syntaxes
|
||||||
|
- add new abstract words in ``Words`` and greetings in ``Greetings``
|
||||||
|
|
||||||
|
|
||||||
|
The missing concrete syntax entries are added to the ``Words``//L//``.gf``
|
||||||
|
files for each language //L//. The
|
||||||
|
[morphological paradigms http://code.haskell.org/gf/lib/doc/synopsis.html#toc78]
|
||||||
|
of the GF resource library should be used. Actions (prefixed with ``A``, as ``AWant``) are
|
||||||
|
a little more demanding, since they also require syntax constructors. Greetings (prefixed
|
||||||
|
with ``G``) are pure strings.
|
||||||
|
|
||||||
|
Some explanations can be found in the
|
||||||
|
[implementation document Implementation.html], which is produced from the
|
||||||
|
concrete syntax files
|
||||||
|
[``SentencesI.gf`` http://code.haskell.org/gf/examples/phrasebook/SentencesI.gf]
|
||||||
|
and
|
||||||
|
[``WordsEng.gf`` http://code.haskell.org/gf/examples/phrasebook/WordsEng.gf]
|
||||||
|
by ``make doc``.
|
||||||
|
|
||||||
|
Here are the steps to follow for contributors:
|
||||||
|
+ Make sure you have the latest sources
|
||||||
|
from [GF Darcs http://www.grammaticalframework.org/doc/gf-developers.html],
|
||||||
|
using ``darcs pull``.
|
||||||
|
+ Also make sure that you have compiled the library by ``make present`` in ``gf/lib/src/``.
|
||||||
|
+ Work in the directory
|
||||||
|
[``gf/examples/phrasebook/`` http://code.haskell.org/gf/examples/phrasebook/].
|
||||||
|
+ After you've finished your contribution, recompile the phrasebook by ``make pgf``.
|
||||||
|
+ Save your changes in ``darcs record .`` (in the ``phrasebook`` subdirectory).
|
||||||
|
+ Make a patch file with ``darcs send -o my_phrasebook_patch``, which you can
|
||||||
|
send to GF maintainers.
|
||||||
|
+ (Recommended:) Test the phrasebook on your local server:
|
||||||
|
+ Go to ``gf/src/server/`` and follow the instructions in the
|
||||||
|
[project Wiki http://code.google.com/p/grammatical-framework/wiki/LaunchWebDemos].
|
||||||
|
+ Make sure that ``Phrasebook.pgf`` is available to your GF server (see project wiki).
|
||||||
|
+ Launch ``lighttpd`` (see project wiki).
|
||||||
|
+ Now you can open ``gf/examples/phrasebook/www/phrasebook.html`` and use your phrasebook!
|
||||||
|
|
||||||
|
|
||||||
|
- Don't delete anything! But you are free to correct incorrect forms.
|
||||||
|
- Don't change the module structure!
|
||||||
|
- Don't compromise quality to gain coverage: //non multa sed multum!//
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
=Conclusions (tentative)=
|
=Conclusions (tentative)=
|
||||||
|
|
||||||
The grammarian need not be a native speaker of the language.
|
The grammarian need not be a native speaker of the language.
|
||||||
46
examples/phrasebook/help-phrasebook.html
Normal file
@@ -0,0 +1,46 @@
|
|||||||
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
|
||||||
|
<HTML>
|
||||||
|
<HEAD>
|
||||||
|
<META NAME="generator" CONTENT="http://txt2tags.sf.net">
|
||||||
|
<TITLE>MOLTO Phrasebook Help</TITLE>
|
||||||
|
</HEAD><BODY BGCOLOR="white" TEXT="black">
|
||||||
|
<P ALIGN="center"><CENTER><H1>MOLTO Phrasebook Help</H1>
|
||||||
|
<FONT SIZE="4">
|
||||||
|
</FONT></CENTER>
|
||||||
|
|
||||||
|
<P>
|
||||||
|
To start: click on a word or start typing.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
<B>From</B>: source language
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
<B>To</B>: target language (either a single one or "All" simultaneously)
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
<B>Del</B>: delete last word
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
<B>Clear</B>: start over
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
<B>Random</B>: generate a random phrase
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
Google translate: the current input and language choice; opens in a new window or tab.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
The symbol <CODE>&+</CODE> means binding of two words. It will disappear in the complete translation.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
The translator is slightly <I>overgenerating</I>, which means you can build some semantically strange phrases.
|
||||||
|
Before reporting them as bugs, ask yourself: could this be correct in some situation? is the translation
|
||||||
|
valid in that situation?
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
<A HREF="http://www.grammaticalframework.org/demos/phrasebook/">Back to the phrasebook</A>
|
||||||
|
</P>
|
||||||
|
|
||||||
|
<!-- html code generated by txt2tags 2.4 (http://txt2tags.sf.net) -->
|
||||||
|
<!-- cmdline: txt2tags -thtml help-phrasebook.txt -->
|
||||||
|
</BODY></HTML>
|
||||||
26
examples/phrasebook/help-phrasebook.txt
Normal file
@@ -0,0 +1,26 @@
|
|||||||
|
MOLTO Phrasebook Help
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
To start: click on a word or start typing.
|
||||||
|
|
||||||
|
**From**: source language
|
||||||
|
|
||||||
|
**To**: target language (either a single one or "All" simultaneously)
|
||||||
|
|
||||||
|
**Del**: delete last word
|
||||||
|
|
||||||
|
**Clear**: start over
|
||||||
|
|
||||||
|
**Random**: generate a random phrase
|
||||||
|
|
||||||
|
Google translate: the current input and language choice; opens in a new window or tab.
|
||||||
|
|
||||||
|
The symbol ``&+`` means binding of two words. It will disappear in the complete translation.
|
||||||
|
|
||||||
|
The translator is slightly //overgenerating//, which means you can build some semantically strange phrases.
|
||||||
|
Before reporting them as bugs, ask yourself: could this be correct in some situation? is the translation
|
||||||
|
valid in that situation?
|
||||||
|
|
||||||
|
[Back to the phrasebook http://www.grammaticalframework.org/demos/phrasebook/]
|
||||||
|
|
||||||
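The contributor steps listed under "How to contribute" in doc-phrasebook.txt could be sketched as a dry-run shell script along the following lines. The commands themselves (`darcs pull`, `make present`, `make pgf`, `darcs record .`, `darcs send -o my_phrasebook_patch`) and the directories are taken from that list. The `GF_ROOT` and `DRY_RUN` variables and the `run` and `contribute` helpers are hypothetical names introduced only for illustration, and running `darcs pull` from the top of the checkout is an assumption.

```shell
#!/bin/sh
# Dry-run sketch of the contributor workflow described in doc-phrasebook.txt.
# GF_ROOT, DRY_RUN, run and contribute are illustrative names, not part of GF.
GF_ROOT="${GF_ROOT:-gf}"    # assumed location of the GF darcs checkout
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the commands, 0 = execute them

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

contribute() {
    run cd "$GF_ROOT" &&                        # assumed: pull from the top of the checkout
    run darcs pull &&                           # step 1: get the latest sources
    run cd lib/src &&
    run make present &&                         # step 2: compile the resource library
    run cd ../../examples/phrasebook &&         # step 3: work in the phrasebook directory
    run make pgf &&                             # step 4: recompile the phrasebook
    run darcs record . &&                       # step 5: record the changes
    run darcs send -o my_phrasebook_patch       # step 6: write a patch file for the maintainers
}

contribute
```

With the default `DRY_RUN=1` the script only prints the commands in order; setting `DRY_RUN=0` would execute them. The optional local server test is left out here because its setup is described only in the project wiki.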