<html>

<body bgcolor="#FFFFFF" text="#000000">

<center>

<h1>Grammatical Framework Version 2.2</h1>

Highlights of GF version 2.2.

<p>

9/5/2005

<p>

<a href="http://www.cs.chalmers.se/~aarne">Aarne Ranta</a>

</center>

<h2>Summary of novelties in Version 2.2 in comparison to 2.1</h2>

<ul>
<li> New optimizations to reduce the size of GFC files
<li> Improved parsing algorithms
<li> Lots of bug fixes
<li> Separate <tt>reuse</tt> modules no longer needed
<li> Several new command options
<li> New documentation:
<ul>
<li> <a href="gf-modules.html">module system document</a>
<li> <a href="tutorial/gf-tutorial2.html">new tutorial</a>, based on the module system (unfinished)
</ul>
<li> New resource libraries
<li> New example grammars
<li> Visualization of the module dependency graph
<li> In the editor GUI, text corresponding to subtrees with constraints is marked in red
<li> Hierarchical modules used in the source code
<li> <a href="http://www.haskell.org/haddock">haddock</a> documentation available for the source code
<li> Optimizations to reduce GF's memory footprint when using large grammars
<li> The <tt>pm</tt> command can now convert identifiers in the grammar to UTF-8
</ul>

<h3>Compiler optimizations</h3>

The sometimes exploding size of generated <tt>gfc</tt> and
<tt>gfr</tt> files has made it urgent to find optimizations
that reduce the size of the code. There are five
optimization settings that can be chosen as the value of the
<tt>optimize</tt> flag:
<ul>
<li> <tt>share</tt>: group table branches so that common branch values are shared
by the use of disjunctive patterns.
<li> <tt>parametrize</tt>: if the table branches differ at most in
occurrences of the pattern itself, replace the expanded table by a one-branch
table with a variable. If this fails, perform <tt>share</tt>.
<li> <tt>values</tt>: only show the values of table branches, not the
patterns.
<li> <tt>all</tt>: try <tt>parametrize</tt>; if this fails, do <tt>values</tt>.
<li> <tt>none</tt>: do not perform any optimizations.
</ul>
The <tt>share</tt> and <tt>parametrize</tt> optimizations are always
beneficial, whereas the <tt>values</tt> optimization may slow down the
use of the table. However, <tt>values</tt> is very effective for grammars consisting mostly
of the inflection tables of lexical items: it can reduce the file size
by a factor of 4.
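
<p>

As a rough illustration of what the settings do, the following Python sketch
models a table as a list of (pattern, value) branches and applies each setting to it.
The representation, the function names and the example tables are assumptions made
for this illustration; they are not GF's internal data structures or the GFC format.

<pre>
```python
# Toy model of the optimize settings. A table is a list of (pattern, value)
# branches; a value is a tuple of tokens. All names here are hypothetical.

def share(branches):
    """Merge branches with equal values under one disjunctive pattern."""
    grouped = {}
    for pattern, value in branches:
        grouped.setdefault(value, []).append(pattern)
    return [(" | ".join(patterns), value) for value, patterns in grouped.items()]


def try_parametrize(branches):
    """Return a one-branch table over the variable x, or None on failure."""
    templates = {
        tuple("x" if token == pattern else token for token in value)
        for pattern, value in branches
    }
    if len(templates) == 1:
        return [("x", templates.pop())]
    return None


def values(branches):
    """Keep only the branch values, in order, and drop the patterns."""
    return [value for _pattern, value in branches]


def optimize(branches, setting):
    """Apply one of the five settings described above."""
    if setting == "none":
        return list(branches)
    if setting == "share":
        return share(branches)
    if setting == "values":
        return values(branches)
    if setting not in ("parametrize", "all"):
        raise ValueError("unknown optimize setting: " + setting)
    one_branch = try_parametrize(branches)
    if one_branch is not None:
        return one_branch
    # Fallbacks: parametrize falls back to share, all falls back to values.
    return share(branches) if setting == "parametrize" else values(branches)


# Branch values repeat, so sharing merges them.
noun = [("SgNom", ("dog",)), ("SgAcc", ("dog",)),
        ("PlNom", ("dogs",)), ("PlAcc", ("dogs",))]
# Branches differ only where the pattern itself occurs.
agree = [("Sg", ("verb", "Sg")), ("Pl", ("verb", "Pl"))]

print(optimize(noun, "share"))
print(optimize(agree, "parametrize"))
print(optimize(noun, "all"))
```
</pre>

On the <tt>noun</tt> table, <tt>share</tt> leaves two branches instead of four,
while <tt>all</tt> falls back to a bare list of four values because the branches
cannot be parametrized.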

<p>

An optimization can be selected individually for each
<tt>resource</tt> and <tt>concrete</tt> module by including
the judgement
<pre>
flags optimize=(share|parametrize|values|all|none) ;
</pre>
in the module body. These flags can be overridden by a flag given
in the <tt>i</tt> command, e.g.
<pre>
i -src -optimize=none Foo.gf
</pre>
Notice that the option <tt>-src</tt> is needed if there are already
generated files created with other optimization flags.

<p>

<b>Important notice</b>: If you use the
<a href="http://www.cs.chalmers.se/~bringert/gf/gf-java.html">
Embedded GF Interpreter</a>,
or the improved parsing algorithms described below,
only the values <tt>none</tt>,
<tt>share</tt> and <tt>values</tt> can be used; the stronger optimizations are not
supported yet.
GF currently aborts and reports an error if the stronger optimizations are used
when creating the grammar for the Embedded GF Interpreter, or when trying to parse.

<h3>Improved parsing algorithms</h3>

We have implemented some of the parsing algorithms proposed in
Peter Ljunglöf's <a href="http://www.cs.chalmers.se/~peb/pubs.html">PhD thesis</a>.
The following options are now available for parsing:
<ul>
<li>The default parser. It uses a context-free grammar that may overgenerate heavily, and filters the resulting parse trees by type-checking.
<li>The <tt>-cfg</tt> flag. It uses a much less overgenerating context-free grammar, and filters as above.
<li>The <tt>-mcfg</tt> flag. It uses an even less overgenerating <em>multiple context-free grammar</em>.
If the abstract syntax is context-free, meaning that there are no dependent types and only first-order functions,
the trees do not have to be filtered at all.
</ul>
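
<p>

The overgenerate-then-filter scheme used by the default and <tt>-cfg</tt> parsers
can be sketched in Python with a toy grammar. Here a context-free rule Det N V ignores
number agreement, and an agreement check stands in for GF's type-checking; the lexicon,
the rule and all names are hypothetical and only meant to show the idea.

<pre>
```python
from itertools import product

# Hypothetical toy lexicon: each word has one or more (category, number) readings.
LEXICON = {
    "the": [("Det", "Sg"), ("Det", "Pl")],
    "sheep": [("N", "Sg"), ("N", "Pl")],
    "runs": [("V", "Sg")],
    "run": [("V", "Pl")],
}


def raw_trees(words):
    """Overgenerating step: accept Det N V and ignore number agreement."""
    readings = [LEXICON.get(word, []) for word in words]
    return [
        combo for combo in product(*readings)
        if [cat for cat, _num in combo] == ["Det", "N", "V"]
    ]


def well_typed(tree):
    """Filtering step: all words in the tree must carry the same number."""
    return len({num for _cat, num in tree}) == 1


def parse(words):
    """Return (number of raw trees, list of trees that pass the filter)."""
    candidates = raw_trees(words)
    return len(candidates), [tree for tree in candidates if well_typed(tree)]


count, trees = parse("the sheep runs".split())
print(count, trees)
```
</pre>

For this input, the context-free step yields four raw trees, of which only one
survives the filter. A grammar whose categories already encode the number would
produce fewer raw trees to begin with.

<p>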
The option <tt>-parser=X</tt> selects the parsing strategy. The default parser has the strategies
<tt>chart</tt>, <tt>bottomup</tt>, <tt>topdown</tt> and <tt>old</tt>, the first being the default.
The <tt>-cfg</tt> and <tt>-mcfg</tt> parsers only recognize the <tt>bottomup</tt> and <tt>topdown</tt> strategies.
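
<p>

The combinations listed above can be summarized as a small lookup table. The Python
sketch below is only a restatement of this paragraph for quick reference; the function
names are hypothetical, and the default strategy of the <tt>-cfg</tt> and <tt>-mcfg</tt>
parsers is left open because it is not stated here.

<pre>
```python
# Strategies accepted by each parser, as listed in the paragraph above.
STRATEGIES = {
    "default": ["chart", "bottomup", "topdown", "old"],
    "-cfg": ["bottomup", "topdown"],
    "-mcfg": ["bottomup", "topdown"],
}


def accepts(parser, strategy):
    """True if the given parser recognizes the given -parser=X value."""
    if parser not in STRATEGIES:
        raise ValueError("unknown parser: " + parser)
    return strategy in STRATEGIES[parser]


def default_strategy(parser):
    """The text names a default only for the default parser (chart)."""
    return "chart" if parser == "default" else None


print(accepts("default", "chart"))
print(accepts("-mcfg", "chart"))
```
</pre>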

<p>

<b>Note</b> that the <tt>-cfg</tt> and <tt>-mcfg</tt> parsers can take a very long time on their first call, since
they have to convert the GF grammar. This happens only once per GF run, provided the GF files are not changed.

<p>

<b>Tips</b> for choosing the best parser for your grammar: try the default parser first; if it is too slow, try the other two.
Remember that they will be very slow the first time you parse, since they have to build parsing information.
The <tt>-cfg</tt> parser is best on grammars with many parameters and inflection tables, and
the <tt>-mcfg</tt> parser is even better when the grammar also has discontinuous constituents.

<p>

Here is a small example from the resource library:
<pre>
> i -src -optimize=share lib/resource/english/LangEng.gf
> p -cat=S ""
> p -cat=S -cfg ""
> p -cat=S -mcfg ""
{Comment: Just some dummy parsing calls to calculate the parsing information}

> p -cat=S -rawtrees=200000 "you will be running"
{Comment: Nr of unfiltered trees: 169296 -- 99.996% of the trees are ill-typed}

UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV AAnter run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV ASimul run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV AAnter run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV ASimul run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV AAnter run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV ASimul run_V))

17730 msec

> p -cat=S -cfg "you will be running"
{Comment: Nr of unfiltered trees: 246 -- 97.5% of the trees are ill-typed}

UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV AAnter run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV ASimul run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV AAnter run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV ASimul run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV AAnter run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV ASimul run_V))

1580 msec

> p -cat=S -mcfg "you will be running"
{Comment: Nr of unfiltered trees: 6 -- all trees are type-correct}

UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV AAnter run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV ASimul run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV AAnter run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV ASimul run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV AAnter run_V))
UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV ASimul run_V))

470 msec
</pre>
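
<p>

The percentages in the comments and the relative timings follow directly from the
figures in the transcript. The short Python calculation below reproduces them; it
assumes, as the output shows, that six trees are well-typed in each run, and it
treats each timing as a single measurement rather than a benchmark.

<pre>
```python
# Raw tree counts and timings copied from the transcript above.
RUNS = {
    "default": {"raw_trees": 169296, "msec": 17730},
    "-cfg": {"raw_trees": 246, "msec": 1580},
    "-mcfg": {"raw_trees": 6, "msec": 470},
}
WELL_TYPED = 6  # each run returns the same six trees


def ill_typed_percent(raw_trees, well_typed=WELL_TYPED):
    """Share of raw trees that the type-checking filter has to discard."""
    return 100.0 * (raw_trees - well_typed) / raw_trees


def speedup(parser, baseline="default"):
    """How many times faster a parser ran than the baseline parser."""
    return RUNS[baseline]["msec"] / RUNS[parser]["msec"]


for name, run in RUNS.items():
    print(name,
          round(ill_typed_percent(run["raw_trees"]), 3),
          round(speedup(name), 1))
```
</pre>

With these figures, <tt>-cfg</tt> comes out roughly 11 times faster than the default
parser and <tt>-mcfg</tt> roughly 38 times faster, for this one sentence.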

</body>
</html>