Commit Graph

174 Commits

Author SHA1 Message Date
kr.angelov
4f0246cc12 bugfix in the grammar splitter 2013-12-10 12:31:40 +00:00
kr.angelov
87fffffbdf option --split-pgf replaces option --mk-index. This splits the PGF into one file for the abstract and one more for each concrete syntax. This is a preparation for being able to load only specific languages from the whole grammar. 2013-12-10 10:43:13 +00:00
kr.angelov
1067d59609 -optimize-pgf should also apply to the linrefs 2013-11-29 14:25:23 +00:00
hallgren
3f57151cc3 Represent identifiers as UTF-8-encoded ByteStrings
This was a fairly simple change thanks to previous work on making the Ident
type abstract and the fact that PGF.CId already uses UTF-8-encoded
ByteStrings.

One potential pitfall is that Data.ByteString.UTF8 uses the same type for
ByteStrings as Data.ByteString. I renamed ident2bs to ident2utf8 and
bsCId to utf8CId, to make it clearer that they work with UTF-8-encoded
ByteStrings.

Since both the compiler input and identifiers are now UTF-8-encoded
ByteStrings, the lexer now creates identifiers without copying any characters.

This patch contains the following changes:

M ./src/compiler/GF/Compile/CheckGrammar.hs -3 +3
M ./src/compiler/GF/Compile/GrammarToPGF.hs -2 +2
M ./src/compiler/GF/Grammar/Binary.hs -5 +1
M ./src/compiler/GF/Grammar/Lexer.x -11 +13
M ./src/compiler/GF/Infra/Ident.hs -19 +36
M ./src/runtime/haskell/PGF.hs -1 +1
M ./src/runtime/haskell/PGF/CId.hs -2 +3
2013-11-26 16:12:03 +00:00
kr.angelov
8bcc70eac8 the GF syntax for identifiers is extended with quoted forms, i.e. you can write for instance 'ab.c' and then everything between the quotes is the identifier. This includes Unicode characters and non-ASCII symbols. This is useful for automatically generated GF grammars. 2013-11-22 13:30:18 +00:00
kr.angelov
0095119ec0 added Predef.SOFT_BIND. This special token allows zero or more spaces between ordinary tokens. It is also used in the English RGL to attach the commas to the previous word. 2013-11-12 09:54:57 +00:00
hallgren
7a41b45f13 Remove PGF.Signature
This module should not be part of the public PGF library API, and it was only
used in GF.CompileToAPI, so the code was moved there. The module defined
constFuncs and syntaxFuncs, but only syntaxFuncs was used.
2013-11-06 13:27:29 +00:00
kr.angelov
2483dc7728 the content of ParseEngAbs3.probs is now merged with ParseEngAbs.probs. The latter is now retrained. Once the grammar is compiled with the .probs file, it needs nothing more to do robust parsing. The robustness itself is controlled by the flags 'heuristic_search_factor', 'meta_prob' and 'meta_token_prob' in ParseEngAbs.gf 2013-11-06 10:21:46 +00:00
aarne
aba666c5bc linearization by chunks in the GF shell: a new command 'lc' needed because 'l' requires type checking and trees with metavariable function heads don't type check. This will hopefully be a temporary command. 2013-11-05 17:28:47 +00:00
hallgren
3814841d7d Eliminate mutual dependencies between the GF compiler and the PGF library
+ References to modules under src/compiler have been eliminated from the PGF
  library (under src/runtime/haskell). Only two functions had to be moved (from
  GF.Data.Utilities to PGF.Utilities) to make this possible, other apparent
  dependencies turned out to be vacuous.

+ In gf.cabal, the GF executable no longer directly depends on the PGF library
  source directory, but only on the exposed library modules. This means that
  there is less duplication in gf.cabal and that the 30 modules in the
  PGF library will no longer be compiled twice while building GF.

  To make this possible, additional PGF library modules have been exposed, even
  though they should probably be considered for internal use only. They could
  be collected in a PGF.Internal module, or marked as "unstable", to make
  this explicit.

+ Also, by using the -fwarn-unused-imports flag, ~220 redundant imports were
  found and removed, reducing the total number of imports by ~15%.
2013-11-05 13:11:10 +00:00
hallgren
83a10ce25a Add a cabal flag to use the standard binary package
The standard binary package has improved efficiency and error handling [1], so
in the long run we should consider switching to it. At the moment, using it is
possible but not recommended, since it results in incompatible PGF files.

The modified modules from the binary package have been moved from
src/runtime/haskell to src/binary.

[1] http://lennartkolmodin.blogspot.se/2013/03/binary-07.html
2013-10-31 15:43:12 +00:00
kr.angelov
a4194501fe linref is now used by the linearizer. The visible change is that the 'l' command in the shell now can linearize discontinuous phrases 2013-10-30 14:42:29 +00:00
kr.angelov
042243f08a added the linref construction in GF. The PGF version number is now bumped 2013-10-30 12:53:36 +00:00
hallgren
9410c6b141 Functions to merge trees into tries in the GF Shell and the PGF web service
* In the shell, the new command tt (to_trie) merges a list of trees into a
  trie and prints it in a readable way, where unique subtrees are marked with
  a "*" and alternative subtrees are marked with numbers.
* In the PGF web service, adding the parameter trie=yes to the parse and
  translate commands augments the JSON output with a trie.

Example to try in the shell:

	Phrasebook> p -lang=Eng "your son waits for you" | tt
2013-10-24 17:29:02 +00:00
kr.angelov
1d2bf1cea8 fix the grammar serialization for nonExist and BIND 2013-10-21 10:03:43 +00:00
kr.angelov
71868fa053 the symbol for nonExist in the GF runtime should be the last. this simplifies the binary search in the C runtime 2013-10-03 08:21:31 +00:00
kr.angelov
426bc49a52 a major refactoring in the C and the Haskell runtimes. Note incompatible change in the PGF format!!!
The following are the outcomes:

   - Predef.nonExist is fully supported by both the Haskell and the C runtimes

   - Predef.BIND is now an internal compiler defined token. For now
     it behaves just as usual for the Haskell runtime, i.e. it generates &+.
     However, the special treatment will let us handle it properly in
     the C runtime.

   - This required a major change in the PGF format since both 
     nonExist and BIND may appear inside 'pre' and this was not supported
     before.
2013-09-27 15:09:48 +00:00
kr.angelov
027fd911b6 fix for linearization with 'pre' 2013-09-03 08:58:04 +00:00
kr.angelov
df26b134fc fix in the GF compiler and runtime which lets us define a pre construct that detects whether this is the last token. 2013-09-03 07:51:25 +00:00
kr.angelov
a20cd77d25 nonExist now does the expected thing 2013-08-23 13:17:45 +00:00
kr.angelov
383d829d5a the first approximation for a statistical model consistent with dependent types in the abstract syntax 2013-07-30 07:29:11 +00:00
gregoire.detrez
08a67b9f34 [haskell runtime] Remove trailing whitespaces in VisualizeTree.hs 2013-05-03 09:42:29 +00:00
kr.angelov
b1b68bf6b4 reverse the direction of the arcs in the dependency trees 2013-04-21 19:20:08 +00:00
kr.angelov
4e2044ab99 remove the dead code left behind by Peter Ljunglöf in VisualizeTree 2013-04-19 11:13:07 +00:00
kr.angelov
b49b9d459a added a malt_tab format to the vd command in the GF shell 2013-04-16 18:22:37 +00:00
kr.angelov
f6d675c34b the generation of dependency trees in the Haskell runtime is now finally working with bracketed strings. This also fixes some errors in the old implementation 2013-04-16 13:10:48 +00:00
kr.angelov
2f35964871 the compiler now sorts the list of functions per category in probability order. this ensures probability order search in the C runtime 2013-04-15 19:58:57 +00:00
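The ordering described in commit 2f35964871 can be illustrated with a small sketch. It assumes that "probability order" means most probable first, and it represents each function as a hypothetical (probability, name) pair; the compiler's actual data structures are not shown in the log.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing, Down(..))

-- Hypothetical representation: a function name paired with its probability.
type ProbFun = (Double, String)

-- Sort the functions of one category so that the most probable comes first
-- (descending order is an assumption). sortBy is stable, so functions with
-- equal probability keep their original relative order.
sortByProb :: [ProbFun] -> [ProbFun]
sortByProb = sortBy (comparing (Down . fst))
```

With a list in this order, a search that walks the list front to back visits candidates in decreasing probability without any further sorting at run time.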
hallgren
b8ce5ef5b3 PGF.hs: export function missingLins
Also in Commands.hs: be explicit about things imported from the PGF library
that are not in the public API.
Also a couple of haddock documentation fixes.
2013-04-08 15:38:11 +00:00
john.j.camilleri
458ffc42d1 Replace "CId" with "Language" in type signature for PGF.tabularLinearizes 2013-04-02 09:19:08 +00:00
hallgren
9faa3407ab haddock bug workaround 2013-03-26 13:14:37 +00:00
Sergei Trofimovich
5b688b6359 ghc-7.6: add missing Num instance for Bits
Fixes the following build failure:
    src/runtime/haskell/Data/Binary/IEEE754.lhs:256:17:
        Could not deduce (Num a) arising from a use of `mask'
        from the context (Bits a)
          bound by the type signature for
                     clamp :: Bits a => BitCount -> a -> a
2013-03-09 21:19:53 +00:00
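The build failure in commit 5b688b6359 arises because, from GHC 7.6 (base 4.6) on, Num is no longer a superclass of Bits, so a Bits constraint alone no longer permits numeric literals or arithmetic. A minimal sketch of this kind of fix follows. The function bodies here are illustrative assumptions, not the code from IEEE754.lhs; only the names clamp, mask and BitCount come from the error message.

```haskell
import Data.Bits (Bits, shiftL, (.&.))

type BitCount = Int

-- A value with the lowest n bits set. The literal 1 and the subtraction
-- need Num, which Bits no longer implies.
mask :: (Bits a, Num a) => BitCount -> a
mask n = (1 `shiftL` n) - 1

-- Keep only the lowest n bits of x. Because clamp uses mask, its own
-- signature must state the Num constraint explicitly as well.
clamp :: (Bits a, Num a) => BitCount -> a -> a
clamp n x = x .&. mask n
```

For example, `clamp 4 (0xAB :: Int)` keeps the low four bits and evaluates to 11.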
hallgren
4f243fbf12 Fix for a PGF portability problem
GF produced slightly different PGF files on 64-bit systems and 32-bit systems.
This could cause problems when a PGF was produced on a 32-bit system and used
on a 64-bit system.

To fix this, the GF compiler and the Haskell PGF run-time library now read
and write PGF files like the 32-bit version even when compiled on a 64-bit
system.

Note: the Haskell type Int is still used internally in GF, which could be
32 bits or 64 bits...
2013-02-13 14:28:06 +00:00
hallgren
211cd9bb25 Avoid crash in random generation with probabilities 2013-01-29 13:59:20 +00:00
hallgren
db544b1cc9 PGFService.hs: fix type error caused by change to PGF.graphvizParseTree
Note that some of the graphviz functions have backwards incompatible changes
that might also affect other clients of the PGF run-time library.

Also added graphvizDefaults and export it together with GraphvizOptions from 
the PGF run-time library.
2012-11-22 15:27:16 +00:00
peter.ljunglof
486a510611 better visualization of parse trees 2012-11-22 08:50:37 +00:00
kr.angelov
fe3b5c1360 the Haskell runtime now exports 'functionsByCat' which returns the list of all functions for a given category 2012-09-18 09:48:21 +00:00
kr.angelov
545e48e881 another fix for teyjus 2012-08-30 08:09:30 +00:00
kr.angelov
3f0b8c55ec the loading of PGF files was broken by the Teyjus patch. Now this is fixed 2012-08-30 07:41:49 +00:00
peter.ljunglof
b416f5bbf7 Use nub' instead of nub in some places, remove some unused nub imports 2012-08-29 21:48:34 +00:00
peter.ljunglof
a7de16c34b Added an O(n log n) version of nub
The new nub is called nub', and it replaces the old sortNub which was 
not lazy and did not retain the order between the elements.
2012-08-29 21:45:10 +00:00
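The properties claimed in commit a7de16c34b (O(n log n), lazy, order-preserving) can be obtained by tracking the elements seen so far in a set. The following is a sketch under that assumption; the log does not show the actual implementation, and the Ord constraint is assumed.

```haskell
import qualified Data.Set as Set

-- Order-preserving duplicate removal in O(n log n).
-- Each element is emitted the first time it is seen, so the output is
-- produced lazily and the function also works on infinite lists.
nub' :: Ord a => [a] -> [a]
nub' = go Set.empty
  where
    go _    []     = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) xs
```

For example, `nub' [3,1,3,2,1]` yields `[3,1,2]`, and `take 3 (nub' (cycle [1,2,3]))` terminates with `[1,2,3]`, which a sort-based approach could not do.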
kr.angelov
f8fe23fda7 A basic infrastructure for generating Teyjus bytecode from the GF abstract syntax 2012-08-29 11:43:02 +00:00
aarne
191ecc71b8 command option ma -known to drop unknown words 2012-06-10 10:43:57 +00:00
Sergei Trofimovich
24740d250b Fix List.foldl / Map.foldl ambiguity
Fixes the following error:
src/runtime/haskell/PGF/Expr.hs:111:14:
    Ambiguous occurrence `foldl'
    It could refer to either `List.foldl',
                             imported from `Data.List' at src/runtime/haskell/PGF/Expr.hs:27:1-24
                             (and originally defined in `GHC.List')
                          or `Map.foldl',
                             imported from `Data.Map' at src/runtime/haskell/PGF/Expr.hs:28:1-40
2012-03-26 20:18:23 +00:00
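The log does not say how commit 24740d250b resolved the clash. One common way to avoid this kind of ambiguity is to import both modules qualified and name the intended function explicitly, as in this sketch (the helper functions are hypothetical):

```haskell
import qualified Data.List as List
import qualified Data.Map as Map

-- With qualified imports, each use of foldl says which one is meant.
sumList :: [Int] -> Int
sumList = List.foldl (+) 0

-- Map.foldl folds over the values of the map, ignoring the keys.
sumMap :: Map.Map String Int -> Int
sumMap = Map.foldl (+) 0
```

An alternative is to hide one of the two names in an import list; the qualified form has the advantage that later additions to either module cannot introduce a new clash.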
hallgren
07af8988d3 PGF run-time library: function names in BracketedString (experimental)
+ Make room for function names in the BracketedString data structure.
+ Fill in function names when linearizing an abstract syntax tree to a
  BracketedString.
+ Fill in wildCId when it is not obvious what the function is.
+ Function bracketedLinearize: for compatibility with the other linearization
  functions, return Leaf "" instead of error "cannot linearize".
+ Export flattenBracketedString from module PGF.
+ PGFService: make function names available in the JSON representation of
  BracketedString.
2012-03-18 20:12:26 +00:00
kr.angelov
bb6905e36f the parser now uses nub instead of nubsort which means that the abstract syntax trees will be returned lazily 2011-12-19 13:10:33 +00:00
kr.angelov
7c9bbd844b Now graphvizAbstractTree suppresses the visualization of implicit arguments. 2011-12-08 09:18:38 +00:00
kr.angelov
a2626e24dd now we store a version number in every .gfo file. If the file was compiled with a different compiler then we simply recompile it. 2011-11-15 19:12:22 +00:00
kr.angelov
416d231c5e Now PMCFG is compiled per module and at the end we only link it. The new compilation scheme is a few times faster. 2011-11-10 14:09:41 +00:00
hallgren
a8185fd997 Preparations for release of GF 3.3
+ Changing version numbers and dates here and there.
+ Simplify build-binary-dist.sh since pgf-http need not be built anymore.
+ Use --gf-lib-path to make the sample grammars for minibar compile even if GF
  is not installed.
2011-10-25 18:25:49 +00:00
hallgren
6c5ee3d666 PGF.hs: Add LANGUAGE BangPatterns to make GHC 7.2 happy
Also remove oddly named function forExample (topological sorting) from export
list.
2011-10-20 13:21:28 +00:00