PGF exports the public, stable API.
PGF.Internal exports additional things needed in the GF compiler & shell,
including the nonstardard version of Data.Binary.
The old type was [String] -> String. This function was only used
in GF.Text.Lexing.stringOp, which now uses (unwords . bindTok) instead,
with no change in behaviour.
The capitalization of the first word was done in GF.Text.Lexing.stringOp,
but is now done in the functions unlexText and unlexMixed in PGF.Lexing.
These functions are only used in stringOp and in PGFService (where the change
is needed), so the subtle change in behaviour should not cause any bugs.
Only change the first word to lowercase if the original input is not found in
the grammar's morphology. This allows parsing of sentenses starting with "I" in
English, nouns in German and proper names in other languages, but it can make
the wrong choice for multi-words.
Some backwards incompatible changes were made to the PGF file format after
the release of GF 3.5. This patch adds a module for reading PGF files in the
old format.
This means that old PGF files on the grammaticalframework.org server will
continue to work after we install the latest version of GF.
This was a fairly simple change thanks to previous work on making the Ident
type abstract and the fact that PGF.CId already uses UTF-8-encoded
ByteStrings.
One potential pitfall is that Data.ByteString.UTF8 uses the same type for
ByteStrings as Data.ByteString. I renamed ident2bs to ident2utf8 and
bsCId to utf8CId, to make it clearer that they work with UTF-8-encoded
ByteStrings.
Since both the compiler input and identifiers are now UTF-8-encoded
ByteStrings, the lexer now creates identifiers without copying any characters.
**END OF DESCRIPTION***
Place the long patch description above the ***END OF DESCRIPTION*** marker.
The first line of this file will be the patch name.
This patch contains the following changes:
M ./src/compiler/GF/Compile/CheckGrammar.hs -3 +3
M ./src/compiler/GF/Compile/GrammarToPGF.hs -2 +2
M ./src/compiler/GF/Grammar/Binary.hs -5 +1
M ./src/compiler/GF/Grammar/Lexer.x -11 +13
M ./src/compiler/GF/Infra/Ident.hs -19 +36
M ./src/runtime/haskell/PGF.hs -1 +1
M ./src/runtime/haskell/PGF/CId.hs -2 +3
This module should not be part of the public PGF library API, and it was only
used in GF.CompileToAPI, so the code was moved there. The module defined
constFuncs and syntaxFuncs, but only syntaxFuncs was used.
+ References to modules under src/compiler have been eliminated from the PGF
library (under src/runtime/haskell). Only two functions had to be moved (from
GF.Data.Utilities to PGF.Utilities) to make this possible, other apparent
dependencies turned out to be vacuous.
+ In gf.cabal, the GF executable no longer directly depends on the PGF library
source directory, but only on the exposed library modules. This means that
there is less duplication in gf.cabal and that the 30 modules in the
PGF library will no longer be compiled twice while building GF.
To make this possible, additional PGF library modules have been exposed, even
though they should probably be considered for internal use only. They could
be collected in a PGF.Internal module, or marked as "unstable", to make
this explicit.
+ Also, by using the -fwarn-unused-imports flag, ~220 redundant imports were
found and removed, reducing the total number of imports by ~15%.
The standard binary package has improved efficiency and error handling [1], so
in the long run we should consider switching to it. At the moment, using it is
possible but not recommended, since it results in incomatible PGF files.
The modified modules from the binary package have been moved from
src/runtime/haskell to src/binary.
[1] http://lennartkolmodin.blogspot.se/2013/03/binary-07.html
* In the shell, the new command tt (to_trie) merges a list of trees into a
trie and prints it in a readable way, where unique subtrees are marked with
a "*" and alternative subtrees are marked with numbers.
* In the PGF web service, adding the parameter trie=yes to the parse and
translate commands augments the JSON output with a trie.
Example to try in the shell:
Phrasebook> p -lang=Eng "your son waits for you" | tt
The following are the outcomes:
- Predef.nonExist is fully supported by both the Haskell and the C runtimes
- Predef.BIND is now an internal compiler defined token. For now
it behaves just as usual for the Haskell runtime, i.e. it generates &+.
However, the special treatment will let us to handle it properly in
the C runtime.
- This required a major change in the PGF format since both
nonExist and BIND may appear inside 'pre' and this was not supported
before.
Also in Commands.hs: be explicit about things imported from the PGF library
that are not in the public API.
Also a couple of haddock documentation fixes.
Fixes the following build failure:
src/runtime/haskell/Data/Binary/IEEE754.lhs:256:17:
Could not deduce (Num a) arising from a use of `mask'
from the context (Bits a)
bound by the type signature for
clamp :: Bits a => BitCount -> a -> a
GF produced slightly different PGF files on 64-bit systems and 32-bit systems.
This could cause problems when a PGF was produced on a 32-bit system and used
on a 64-bit system.
To fix this, the GF compiler and the Haskell PGF run-time library now reads
and writes PGF files like the 32-bit version even when compiled on a 64-bit
system.
Note: the Haskell type Int is still used internally in GF, which could be
32 bits or 64 bits...
Note that some of the graphviz functions have backwards incompatible changes
that might also affect other clients of the PGF run-time library.
Also added graphvizDefaults and export it together with GraphvizOptions from
the PGF run-time library.