Commit Graph

19 Commits

Author SHA1 Message Date
John J. Camilleri
f2e52d6f2c Replace tabs for whitespace in source code 2021-07-07 09:40:41 +02:00
Andreas Källberg
f2e4b89a22 Fix syntax error problem for older versions of GHC 2020-10-08 17:41:44 +02:00
Andreas Källberg
7268253f5a MonadFail: Make backwards-compatible 2020-09-05 20:23:07 +02:00
Andreas Källberg
b8812b54b2 fix newer ghc: Don't try to be backwards compatible 2020-08-05 18:48:24 +02:00
Andreas Källberg
251845f83e First attempt at fixing incompabilities with newer cabal 2020-08-05 18:48:24 +02:00
hallgren
65e675d8e2 Lexer.x & Parser.y: add a partial parser for terms
Lexer.x: Change the parser monad type P to allow the remaining input to
	 be returned after a partial parse. Add function

           runPartial :: P t -> String -> Either (Posn, String) (String, t)

Parser.y: Add a partial parser pTerm for nonterminal Exp1.
          Re-export runPartial.
2016-04-07 13:32:14 +00:00
hallgren
c8b7ebc163 Lexer.x: fix cyclic Functor instance
It looks like I introduced this cyclic definition in August 2014, but since
it isn't used, it hasn't been a problem...
2016-04-06 13:46:48 +00:00
krasimir
c8ebe09315 initial support for BNFC syntax in context-free grammars for GF. Not all features are supported yet. Based on contribution from Gleb Lobanov 2016-03-21 13:27:44 +00:00
hallgren
1ccdd0d9fd GF source lexer: allow numeric character escapes in string literals
This makes the output from PGF.showExpr (and other Haskell code that uses
the Prelude.show function to show strings) parsable as GF source code in
more cases.

This is a workaround for the problem that GHC's implementation of the show
function uses numeric escapes for printable non-ASCII characters, e.g.
show "dålig" = "d\229lig"...
2015-09-29 12:18:35 +00:00
hallgren
cd5193b7e1 Fix warnings in 16 modules, mostly forward compatibility warnings from GHC 7.8 2014-08-13 22:16:18 +00:00
kr.angelov
86201df35c now GF keywords can be used as identifiers if they are quoted 2014-06-12 09:36:32 +00:00
kr.angelov
51a9ef72c7 refactor the compilation of CFG and EBNF grammars. Now they are parsed by using GF.Grammar.Parser just like the ordinary GF grammars. Furthermore now GF.Speech.CFG is moved to GF.Grammar.CFG. The new module is used by both the speech conversion utils and by the compiler for CFG grammars. The parser for CFG now consumes a lot less memory and can be used with grammars with more than 4 000 000 productions. 2014-03-21 21:25:05 +00:00
hallgren
3f57151cc3 Represent identifiers as UTF-8-encoded ByteStrings
This was a fairly simple change thanks to previous work on making the Ident
type abstract and the fact that PGF.CId already uses UTF-8-encoded
ByteStrings.

One potential pitfall is that Data.ByteString.UTF8 uses the same type for
ByteStrings as Data.ByteString. I renamed ident2bs to ident2utf8 and
bsCId to utf8CId, to make it clearer that they work with UTF-8-encoded
ByteStrings.

Since both the compiler input and identifiers are now UTF-8-encoded
ByteStrings, the lexer now creates identifiers without copying any characters.
**END OF DESCRIPTION***

Place the long patch description above the ***END OF DESCRIPTION*** marker.
The first line of this file will be the patch name.


This patch contains the following changes:

M ./src/compiler/GF/Compile/CheckGrammar.hs -3 +3
M ./src/compiler/GF/Compile/GrammarToPGF.hs -2 +2
M ./src/compiler/GF/Grammar/Binary.hs -5 +1
M ./src/compiler/GF/Grammar/Lexer.x -11 +13
M ./src/compiler/GF/Infra/Ident.hs -19 +36
M ./src/runtime/haskell/PGF.hs -1 +1
M ./src/runtime/haskell/PGF/CId.hs -2 +3
2013-11-26 16:12:03 +00:00
hallgren
9d7fdf7c9a Change how GF deals with character encodings in grammar files
1. The default encoding is changed from Latin-1 to UTF-8.

2. Alternate encodings should be specified as "--# -coding=enc", the old
   "flags coding=enc" declarations have no effect but are still checked for
   consistency.

3. A transitional warning is generated for files that contain non-ASCII
   characters without specifying a character encoding:

	"Warning: default encoding has changed from Latin-1 to UTF-8"

4. Conversion to Unicode is now done *before* lexing. This makes it possible
   to allow arbitrary Unicode characters in identifiers. But identifiers are
   still stored as ByteStrings, so they are limited to Latin-1 characters
   for now.

5. Lexer.hs is no longer part of the repository. We now generate the lexer
   from Lexer.x with alex>=3. Some workarounds for bugs in alex-3.0 were
   needed. These bugs might already be fixed in newer versions of alex, but
   we should be compatible with what is shipped in the Haskell Platform.
2013-11-25 21:12:11 +00:00
hallgren
841e54e3dc alex 3 incompatibility workaround
As a temporary workaround, alex is no longer invoked automatically when
building with cabal. Developers who want to modify the lexer need to run
alex on Lexer.x manually and record the modified Lexer.hs.

    src/compiler/GF/Grammar/lexer/Lexer.x    -- hidden from cabal
    src/compiler/GF/Grammar/Lexer.hs         -- update it manually
2012-05-04 12:39:07 +00:00
krasimir
992a7ffb38 Yay!! Direct generation of PMCFG from GF grammar 2010-06-18 12:55:58 +00:00
krasimir
f870c4d80f syntax for inaccessible patterns in GF 2010-03-18 19:34:30 +00:00
krasimir
64da1c2021 allow negative integers in the grammar syntax 2010-02-08 12:59:22 +00:00
krasimir
f85232947e reorganize the directories under src, and rescue the JavaScript interpreter from deprecated 2009-12-13 18:50:29 +00:00