Commit Graph

324 Commits

Author SHA1 Message Date
aarne
ca73f4554c option pt -funs to show all fun's in a tree 2013-03-29 11:45:42 +00:00
hallgren
c823b7fd91 Fix a problem with pattern macros in pre { } expressions
The old partial evaluator has special rules to convert pattern macros in
pre { } expressions. These rules were missing in the new partial evaluator.
2013-03-16 13:36:23 +00:00
aarne
0dc182c216 pt -nub to remove duplicate trees from a list returned e.g. by a parser 2013-03-13 13:43:30 +00:00
hallgren
7710dc42db partial evaluator: push predefined functions inside variants
This should prevent errors like

Internal error in Compute.ConcreteNew:
    Applying Predef.drop: Expected a value of type String, got VFV [VString "gewandt",VString "gewendet"]
2013-03-12 16:36:58 +00:00
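The idea of pushing a predefined function inside variants can be sketched roughly as follows. This is a minimal illustration, not the code in Compute.ConcreteNew: the constructor names VString and VFV are taken from the error message above, while VInt, pushStr and dropV are hypothetical names introduced here.

```haskell
-- Toy value type. VString and VFV mirror the names in the error message;
-- VInt is only here so that a non-string case exists (assumption).
data Value = VString String | VInt Int | VFV [Value]
  deriving (Eq, Show)

-- Apply a string function to a value, distributing it over free variation
-- instead of failing when a VFV is encountered. (Hypothetical helper.)
pushStr :: (String -> String) -> Value -> Either String Value
pushStr f (VString s) = Right (VString (f s))
pushStr f (VFV vs)    = VFV <$> mapM (pushStr f) vs
pushStr _ v           = Left ("Expected a value of type String, got " ++ show v)

-- A stand-in for Predef.drop on values. (Hypothetical name.)
dropV :: Int -> Value -> Either String Value
dropV n = pushStr (drop n)

main :: IO ()
main = print (dropV 2 (VFV [VString "gewandt", VString "gewendet"]))
```

With this shape, the function is applied to each alternative separately, so a variant list of strings yields a variant list of results rather than a type error.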
aarne
9d1be48e0f command pt -subtrees that analyses a tree into the set of subtrees. Using pt -subtrees <bigtree> | l -treebank for debugging the lin of a big tree 2013-03-12 14:58:06 +00:00
hallgren
cd8cbda3d4 Additional changes for GHC 7.4 & 7.6 compatibility 2013-03-11 12:57:09 +00:00
Sergei Trofimovich
05e5895134 ghc-7.6: allow directory-1.2
Get rid of old-time depend (and ClockTime in favour of UTCTime).
time-compat helps to retain backward compatibility with directory-1.1
and lower.
2013-03-09 21:38:43 +00:00
hallgren
e61f2f8d03 Fix a bug that could cause "Prelude.head: empty list"
In Data.Operations, the function topoTest2 assumed too much about the form of
the input, compared to the older function topoTest.
2013-02-28 17:46:13 +00:00
hallgren
5e3e5821fb pattern match length estimation code simplification 2013-02-28 15:13:20 +00:00
hallgren
bbc13e9f0c Faster regular expression pattern matching in the grammar compiler.
The sequence operator (x+y) was implemented by splitting the string to be
matched at all positions and trying to match the parts against the two
subpatterns. To reduce the number of splits, we now estimate the minimum and
maximum length of the string that the subpatterns could match. For common
cases, where one of the subpatterns is a string of known length, like
in (x+"y") or (x + ("a"|"o"|"u"|"e")+"y"), only one split will be tried.
2013-02-27 20:59:43 +00:00
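The length estimation described in this commit could look roughly like the sketch below. It uses a toy pattern type rather than GF's own Patt, and the names Pat, lenRange, splits and matches are assumptions made for illustration only.

```haskell
-- Toy pattern language (NOT the GF Patt type).
data Pat
  = Lit String     -- matches exactly this string
  | AnyStr         -- matches any string, like a pattern variable x
  | Alt Pat Pat    -- p | q
  | Seq Pat Pat    -- p + q

-- Minimum and maximum length of the strings a pattern can match.
-- Nothing means there is no upper bound.
lenRange :: Pat -> (Int, Maybe Int)
lenRange (Lit s)   = (n, Just n) where n = length s
lenRange AnyStr    = (0, Nothing)
lenRange (Alt p q) = (min lo1 lo2, max <$> hi1 <*> hi2)
  where (lo1, hi1) = lenRange p
        (lo2, hi2) = lenRange q
lenRange (Seq p q) = (lo1 + lo2, (+) <$> hi1 <*> hi2)
  where (lo1, hi1) = lenRange p
        (lo2, hi2) = lenRange q

-- Prefix lengths worth trying when matching (p + q) against a string of
-- length n: the prefix must fit p and the remainder must fit q.
splits :: Pat -> Pat -> Int -> [Int]
splits p q n = [lo .. hi]
  where
    (pLo, pHi) = lenRange p
    (qLo, qHi) = lenRange q
    lo = maximum [0, pLo, maybe 0 (n -) qHi]
    hi = minimum [n, maybe n id pHi, n - qLo]

matches :: Pat -> String -> Bool
matches (Lit s)   t = s == t
matches AnyStr    _ = True
matches (Alt p q) t = matches p t || matches q t
matches (Seq p q) t =
  or [ matches p a && matches q b
     | i <- splits p q (length t), let (a, b) = splitAt i t ]

main :: IO ()
main = do
  let vowel = Alt (Lit "a") (Alt (Lit "o") (Alt (Lit "u") (Lit "e")))
  print (splits AnyStr (Lit "y") (length "happy"))
  print (matches (Seq AnyStr (Seq vowel (Lit "y"))) "play")
```

For a pattern like (x+"y"), the literal has a known length, so the candidate list collapses to a single split position instead of one per character.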
hallgren
8174364250 GF grammar pretty printer improvements
Allow line breaks in more places to make large terms more readable.
2013-02-27 14:22:47 +00:00
kr.angelov
55203110bb now the beam size for the statistical parser can be configured by using the flag beam_size in the top-level concrete module 2013-02-12 10:53:13 +00:00
aarne
28c59faf29 pg -lexc now writes a list of multichar symbols and a title ("Root") for the lexicon, as required by Xerox lexc 2013-02-03 10:03:15 +00:00
hallgren
09fb4cdef0 Better error message for unsupported token gluing
Instead of "Internal error in ...", you now get a proper error message with
a source location and a function name.
2013-01-29 16:25:03 +00:00
hallgren
79795cb0e7 Fix a bug with record extension
Add a conversion rule for ({ l1 = e } ** x).l2 in PMCFG generation. (A rule
for the symmetric case (x ** { l1 = e }).l2 was added some time ago.)
2013-01-29 14:59:16 +00:00
hallgren
c14e75706e Quick fix to render some parser error messages from UTF-8-encoded source files correctly.
The parser works on raw byte sequences read from source files. If parsing
succeeds the raw byte sequences are converted to proper Unicode characters 
in a later phase. But the parser calls the function buildAnyTree, which can 
fail and generate error messages containing source code fragments, which might
then contain raw byte sequences. To render these error messages correctly,
they need to be converted in accordance with the coding flag in the source 
file. This is now done for UTF-8-encoded source files, but should ideally also
be done for other character encodings. (Latin-1-encoded files never suffered 
from this problem, since raw bytes are proper Unicode characters in this case.)
2013-01-28 17:23:02 +00:00
hallgren
764b649959 Better error message for Predef.error
+ Instead of "Internal error in ...", you now get a proper error message with
  a source location and a function name.
+ Also added some missing error value propagation in the partial evaluator.
+ Also some other minor cleanup and error handling fixes.
2013-01-28 16:12:56 +00:00
aarne
6af9575a68 improved error message for overloading in case the given signature looks the same as one of the expected ones: it shows full records rather than just lock fields. 2013-01-28 14:00:23 +00:00
hallgren
3712b6988e partial evaluator: fix token glueing bug
"a"+("b"++"c") was simplified to "bb"++"c" instead of "ab"++"c".
2013-01-11 15:14:42 +00:00
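As a rough illustration of the intended simplification, where gluing "a" onto "b"++"c" should give "ab"++"c", here is a sketch over a toy term type. The type Tm and the functions glue and simp are hypothetical and are not taken from the GF sources.

```haskell
-- Toy terms: string literals, gluing (+) and concatenation with a
-- token boundary (++). (Hypothetical type, not GF's Term.)
data Tm = S String | Tm :+ Tm | Tm :++ Tm
  deriving (Eq, Show)

-- Glue two simplified terms: the last token of the left operand is
-- joined with the first token of the right operand.
glue :: Tm -> Tm -> Tm
glue (S a)     (S b)     = S (a ++ b)
glue a         (b :++ c) = glue a b :++ c
glue (a :++ b) c         = a :++ glue b c
glue a         b         = a :+ b   -- fallback, keeps the function total

-- Simplify bottom-up, eliminating glue nodes where possible.
simp :: Tm -> Tm
simp (a :+ b)  = glue (simp a) (simp b)
simp (a :++ b) = simp a :++ simp b
simp t         = t

main :: IO ()
main = print (simp (S "a" :+ (S "b" :++ S "c")))
```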
hallgren
368cd7ffbe bug fix in the new partial evaluator
It can leave wildcard tables in their original form, but it is easy to handle
them in the unfactor function in GeneratePMCFG.
2012-12-20 16:41:43 +00:00
aarne
793ba98249 added alltenses to the default search path (just like prelude) 2012-12-20 16:05:34 +00:00
hallgren
f73825ddf1 partial evaluator bug fix
It failed to delay table selection when the selector contains a run-time
variable, causing "gf: Prelude.(!!): index too large" instead.

Also:
  + Show better source locations on unexpected errors, to aid bug hunting.
  + Removed unused SourceGrammar argument to value2term.
2012-12-19 23:12:37 +00:00
hallgren
4aa3638549 GF.Grammar.Lookup: new function lookupResDefLoc
It's like lookupResDef but it includes a source location in the output.
2012-12-19 23:08:56 +00:00
hallgren
3755ea673a partial evaluator bug fix
Int was missing from the list of predefined canonical constants.
2012-12-18 13:03:20 +00:00
kr.angelov
8aefd1e072 The first prototype for exhaustive generation in the C runtime. The trees are always listed in decreasing probability order. There is also an API for generation from Python 2012-12-14 15:32:49 +00:00
hallgren
79711380a2 Add language extension for ghc<7.4
FlexibleInstances does not imply TypeSynonymInstances, apparently.
2012-12-14 14:21:46 +00:00
hallgren
950832dbba More work on the new partial evaluator
The work done by the partial evaluator is now divided in two stages:
 - A static "term traversal" stage that happens only once per term and uses
   only statically known information. In particular, the values of lambda bound
   variables are unknown during this stage. Some tables are transformed to
   reduce the cost of pattern matching.
 - A dynamic "function application" stage, where function bodies can be
   evaluated repeatedly with different arguments, without the term traversal
   overhead and without recomputing statically known information.

Also the treatment of predefined functions has been reworked to take advantage
of the staging and better handle partial applications.
2012-12-14 14:00:21 +00:00
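The two-stage split can be illustrated on a tiny lambda calculus. This is only a sketch of the general staging technique under the assumption that it resembles what the commit describes; Term, Value, compile and the other names are hypothetical and far simpler than the real evaluator.

```haskell
import Data.List (elemIndex)

-- Toy source terms (hypothetical, not GF's Term type).
data Term = Var String | Lit Int | Add Term Term
          | Lam String Term | App Term Term

data Value = VInt Int | VFun (Value -> Value)

-- Static stage: traverse the term once, knowing only the NAMES of the
-- bound variables. The result is a function for the dynamic stage, which
-- receives the run-time values of those variables.
compile :: [String] -> Term -> ([Value] -> Value)
compile _   (Lit n)   = const (VInt n)
compile env (Var x)   = case elemIndex x env of
  Just i  -> \vs -> vs !! i            -- variable position resolved statically
  Nothing -> error ("unbound variable " ++ x)
compile env (Add a b) = \vs -> add (a' vs) (b' vs)
  where a' = compile env a             -- subterms are traversed only once
        b' = compile env b
compile env (Lam x b) = \vs -> VFun (\v -> b' (v : vs))
  where b' = compile (x : env) b       -- body compiled once, applied many times
compile env (App f a) = \vs -> apply (f' vs) (a' vs)
  where f' = compile env f
        a' = compile env a

add :: Value -> Value -> Value
add (VInt a) (VInt b) = VInt (a + b)
add _        _        = error "add: expected integers"

apply :: Value -> Value -> Value
apply (VFun f) v = f v
apply _        _ = error "apply: expected a function"

asInt :: Value -> Maybe Int
asInt (VInt n) = Just n
asInt _        = Nothing

main :: IO ()
main = do
  let inc = compile [] (Lam "x" (Add (Var "x") (Lit 1))) []
  -- The same compiled function is applied to different arguments
  -- without repeating the term traversal.
  print (map (asInt . apply inc . VInt) [1, 2, 41])
```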
hallgren
f39466f787 partial evaluator work
* Evaluate operators once, not every time they are looked up
* Remember the list of parameter values instead of recomputing it from the
  pattern type every time a table selection is made.
* Quick fix for partial application of some predefined functions.
2012-12-11 15:37:41 +00:00
hallgren
ab97deae57 Compute.ConcreteNew: add missing case for variant functions
Also adding a test case in the test suite for this.
2012-12-10 13:25:32 +00:00
hallgren
7d0f649f29 Compute.ConcreteNew: bug fix for indirectly defined pattern macros
More changes are probably needed to make pattern macros first class values.
Also includes minor changes related to variants and error messages.
2012-12-06 16:44:03 +00:00
aarne
d34adff894 produce error message instead of failure of irrefutable pattern Ok ty_C in GrammarToPGF, to help find compilation errors; the ones I've found are because an inherited abstract excludes something that the inherited concrete does not exclude. 2012-12-02 19:40:45 +00:00
hallgren
b410cc75cd Fix a precedence bug in GF grammar pretty printer
The pretty printer produced

	mkDet pre {"a"; "an" / vowel} Sg

which is not accepted by the parser. The parser assigns pre { ... } to
precedence level 4, and this is now reflected in the pretty printer, so
it prints

	mkDet (pre {"a"; "an" / vowel}) Sg

(This caused a problem in GFSE since it parses pretty printed grammars...)
2012-11-23 18:44:08 +00:00
peter.ljunglof
595c475c70 better visualization of parse trees 2012-11-22 08:50:37 +00:00
hallgren
cf00c8bd0b new-comp: rewrite f (x|y) into (f x|f y)
With this change, all languages in molto/mgl/mixture except German and Polish
can be compiled.
2012-11-16 13:47:10 +00:00
hallgren
586d7488f2 Add flag --document-root for use with gf --server
This can make it easier to test cloud service updates before installing them.
2012-11-14 13:52:45 +00:00
hallgren
0ef7b8a3b5 GF usage message fixes
Change the command name from gfc to gf in the usage message header.
Correct spelling of "overide" to "override" in -gf-lib-path description.
2012-11-14 13:49:10 +00:00
hallgren
b6f392b4e1 Adding a new experimental partial evaluator
GF.Compile.Compute.ConcreteNew + two new modules contain a new
partial evaluator intended to solve some performance problems with the old
partial evaluator in GF.Compile.Compute.ConcreteLazy. It has been around for
a while, but is now complete enough to compile the RGL and the Phrasebook.

The old partial evaluator is still used by default. The new one can be activated
in two ways:

  - by using the command line option -new-comp when invoking GF.
  - by using cabal configure -fnew-comp to make -new-comp the default. In this
    case you can also use the command line option -old-comp to revert to the old
    partial evaluator.

In the GF shell, the cc command uses the old evaluator regardless of -new-comp
for now, but you can use "cc -new ..." to invoke the new evaluator.

With -new-comp, computations happen in GF.Compile.GeneratePMCFG instead of
GF.Compile.Optimize. This is implemented by testing the flag optNewComp in
both modules, to omit calls to the old partial evaluator from GF.Compile.Optimize
and add calls to the new partial evaluator in GF.Compile.GeneratePMCFG.
This also means that -new-comp effectively implies -noexpand.

In GF.Compile.CheckGrammar, there is a check that restricted inheritance is used
correctly. However, when -noexpand is used, this check causes unexpected errors,
so it has been converted to generate warnings, for now.

-new-comp no longer enables the new type checker in
GF.Compile.TypeCheck.ConcreteNew.

The GF version number has been bumped to 3.3.10-darcs
2012-11-13 14:09:15 +00:00
hallgren
c2b7288411 Eliminate warnings about deprecated use of catch and try
This is also needed for compatibility with GHC 7.6.
2012-11-08 15:53:46 +00:00
hallgren
ad74dfe527 GF.Grammar.PatternMatch: relax overly restrictive type signatures 2012-11-07 17:23:08 +00:00
hallgren
a912ad813d Some changed/new utility functions
GF.Data.Utilities:  Rename mapFst to apFst, mapSnd to apSnd.
		    Add apBoth, mapFst, mapSnd, mapBoth.
GF.Data.Operations: Remove onSnd (same as apSnd)
2012-11-07 15:31:45 +00:00
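The commit only lists the function names, so the definitions below are an assumption about what they plausibly look like: the ap* functions act on a single pair and the map* functions lift them to lists of pairs.

```haskell
-- Assumed definitions; only the names come from the commit message.

-- Apply a function to one or both components of a pair.
apFst :: (a -> c) -> (a, b) -> (c, b)
apFst f (a, b) = (f a, b)

apSnd :: (b -> c) -> (a, b) -> (a, c)
apSnd f (a, b) = (a, f b)

apBoth :: (a -> b) -> (a, a) -> (b, b)
apBoth f (x, y) = (f x, f y)

-- The same operations mapped over a list of pairs.
mapFst :: (a -> c) -> [(a, b)] -> [(c, b)]
mapFst = map . apFst

mapSnd :: (b -> c) -> [(a, b)] -> [(a, c)]
mapSnd = map . apSnd

mapBoth :: (a -> b) -> [(a, a)] -> [(b, b)]
mapBoth = map . apBoth

main :: IO ()
main = do
  print (apFst (+ 1) (1 :: Int, "x"))
  print (mapSnd length [(1 :: Int, "ab"), (2, "c")])
```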
virk.shafqat
f7344b8f38 unicode4k-changed 2012-11-05 16:44:31 +00:00
hallgren
4f161ee5b3 GF.Grammar.Macros: add function collectPattOp
collectPattOp :: (Patt -> [a]) -> Patt -> [a]
2012-10-25 16:12:21 +00:00
hallgren
71dd493987 GF.Grammar.Macros: add function composPattOp
For Patt, analogous to composOp for Term.
2012-10-24 22:40:18 +00:00
hallgren
b841664a63 Compute.ConcreteNew: support variants
Also add a missing check for Predef values in apply.
2012-10-24 17:49:20 +00:00
hallgren
eed724271f GeneratePMCFG: prefix messages about "impossible" errors with 'Internal error:'
Just to make them easier to spot when wading through thousands of lines of
warnings...
2012-10-24 17:08:52 +00:00
hallgren
7565ba8b87 cleanup
Simplify the implementation of writeUTF8File and use it in one more place.
Remove unused imports left over after a previous change.
2012-10-23 11:48:23 +00:00
hallgren
d0e1187b10 Refactor compileSourceModule
There were 55 lines of rather repetitive code with calls to 6 compiler passes.
They have been replaced with 19 lines that call the 6 compiler passes
plus 26 lines of helper functions.
2012-10-19 20:14:11 +00:00
hallgren
885aaca6de Consistently use SourceGrammar instead of [SourceModule] when calling compiler passes 2012-10-19 19:56:00 +00:00
hallgren
1d6cbf8189 Use NOINLINE for build info and darcs version info
... to avoid unnecessary recompilation of other modules.
2012-10-18 20:01:22 +00:00
hallgren
eff4d46fba GF.Command.Command: turn CommandOutput into a newtype
The output from commands is represented as ([Expr],String), where the [Expr] is
used when data is piped between commands and the String is used for the final
output. The String can represent the same list of trees as the [Expr] and/or
contain diagnostic information.

Sometimes the data that is piped between commands is not a list of trees, but
e.g. a string or a list of strings. In those cases, functions like fromStrings
and toStrings are used to encode the data as a [Expr].

This patch introduces a newtype for CommandOutput and collects the functions
dealing with command output in one place to make it clearer what is going on.
It also makes it easier to change to a more direct representation of piped
data, and make pipes more "type safe", if desired.
2012-10-16 13:01:03 +00:00
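A newtype along these lines might look like the sketch below. The Expr stand-in, the constructor name Piped, and the helpers fromExprs and finalText are hypothetical; fromStrings and toStrings are named in the commit message, but their types and bodies here are assumptions.

```haskell
-- Stand-in for the real tree type (assumption): string literals and
-- function applications only.
data Expr = EStr String | EApp String [Expr]
  deriving (Eq, Show)

-- Wraps the ([Expr], String) pair: the list is what gets piped to the
-- next command, the String is what is finally shown to the user.
newtype CommandOutput = Piped ([Expr], String)

-- Output consisting of trees.
fromExprs :: [Expr] -> CommandOutput
fromExprs es = Piped (es, unlines (map show es))

-- Non-tree data (strings) encoded as expressions for piping.
fromStrings :: [String] -> CommandOutput
fromStrings ss = Piped (map EStr ss, unlines ss)

-- Decode piped strings again, ignoring anything that is not a string.
toStrings :: CommandOutput -> [String]
toStrings (Piped (es, _)) = [s | EStr s <- es]

-- The text to display when the command is last in a pipe.
finalText :: CommandOutput -> String
finalText (Piped (_, s)) = s

main :: IO ()
main = do
  let out = fromStrings ["hello", "world"]
  print (toStrings out)
  putStr (finalText out)
```

Keeping all constructors and accessors for the newtype in one place means a later change of representation only touches these functions, which is the benefit the commit message points to.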