37 Commits

Author SHA1 Message Date
krangelov
0229329d7c implemented pattern macros 2021-09-29 17:38:53 +02:00
krangelov
6efb878c43 pattern matching for "x"* 2021-09-29 14:57:18 +02:00
krangelov
edd7081dea implement measured patterns 2021-09-29 13:26:06 +02:00
krangelov
a27bcb8092 Merge branch 'master' into c-runtime 2019-09-20 10:42:50 +02:00
krangelov
4d79aa8b19 remove obsolete code 2019-09-20 10:37:50 +02:00
krangelov
acb70ccc1b cleanup 2019-09-19 22:30:08 +02:00
hallgren
65e675d8e2 Lexer.x & Parser.y: add a partial parser for terms
Lexer.x: Change the parser monad type P to allow the remaining input to
	 be returned after a partial parse. Add function

           runPartial :: P t -> String -> Either (Posn, String) (String, t)

Parser.y: Add a partial parser pTerm for nonterminal Exp1.
          Re-export runPartial.
2016-04-07 13:32:14 +00:00
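
The partial-parsing idea in the commit above can be sketched as follows. This is a minimal illustration, not the code in Lexer.x: only the signature of runPartial comes from the commit message. The definitions of Posn, PState, P and the example parser pNumber are simplified stand-ins assumed for this sketch.

```haskell
import Data.Char (isDigit)

-- Simplified source position (line, column); a stand-in, not GF's Posn.
data Posn = Posn Int Int deriving (Eq, Show)

-- Parser state: current position plus the input still to be consumed.
type PState = (Posn, String)

-- A parser either fails with a position and a message, or returns the
-- updated state together with a result.
newtype P a = P { unP :: PState -> Either (Posn, String) (PState, a) }

instance Functor P where
  fmap g (P f) = P $ \st -> case f st of
    Left err       -> Left err
    Right (st', x) -> Right (st', g x)

instance Applicative P where
  pure x = P $ \st -> Right (st, x)
  P pf <*> P px = P $ \st -> case pf st of
    Left err       -> Left err
    Right (st', g) -> case px st' of
      Left err        -> Left err
      Right (st'', x) -> Right (st'', g x)

instance Monad P where
  P p >>= k = P $ \st -> case p st of
    Left err       -> Left err
    Right (st', x) -> unP (k x) st'

-- Run a parser and hand back whatever input it did not consume.
runPartial :: P t -> String -> Either (Posn, String) (String, t)
runPartial (P f) input = case f (Posn 1 1, input) of
  Left err             -> Left err
  Right ((_, rest), x) -> Right (rest, x)

-- Hypothetical example parser: a non-empty run of decimal digits.
-- It only advances the column, which is enough for single-line input.
pNumber :: P Int
pNumber = P $ \(pos@(Posn l c), s) -> case span isDigit s of
  ("", _)    -> Left (pos, "expected a digit")
  (ds, rest) -> Right ((Posn l (c + length ds), rest), read ds)
```

With these definitions, runPartial pNumber "42 + x" evaluates to Right (" + x", 42): the caller receives both the parsed value and the remaining input, and can continue processing from there.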
krasimir
c8ebe09315 initial support for BNFC syntax in context-free grammars for GF. Not all features are supported yet. Based on contribution from Gleb Lobanov 2016-03-21 13:27:44 +00:00
hallgren
391b301881 ModuleName and Ident are now distinct types
This makes the documentation clearer, and can potentially catch more
programming mistakes.
2014-10-21 19:20:31 +00:00
kr.angelov
51a9ef72c7 refactor the compilation of CFG and EBNF grammars. Now they are parsed by using GF.Grammar.Parser just like the ordinary GF grammars. Furthermore now GF.Speech.CFG is moved to GF.Grammar.CFG. The new module is used by both the speech conversion utils and by the compiler for CFG grammars. The parser for CFG now consumes a lot less memory and can be used with grammars with more than 4 000 000 productions. 2014-03-21 21:25:05 +00:00
hallgren
a98f4aa4be Show relative file paths in error messages
This is to avoid one trivial reason for failures in the test suite.
2013-12-06 15:43:34 +00:00
hallgren
9d7fdf7c9a Change how GF deals with character encodings in grammar files
1. The default encoding is changed from Latin-1 to UTF-8.

2. Alternate encodings should be specified as "--# -coding=enc"; the old
   "flags coding=enc" declarations have no effect but are still checked for
   consistency.

3. A transitional warning is generated for files that contain non-ASCII
   characters without specifying a character encoding:

	"Warning: default encoding has changed from Latin-1 to UTF-8"

4. Conversion to Unicode is now done *before* lexing. This makes it possible
   to allow arbitrary Unicode characters in identifiers. But identifiers are
   still stored as ByteStrings, so they are limited to Latin-1 characters
   for now.

5. Lexer.hs is no longer part of the repository. We now generate the lexer
   from Lexer.x with alex>=3. Some workarounds for bugs in alex-3.0 were
   needed. These bugs might already be fixed in newer versions of alex, but
   we should be compatible with what is shipped in the Haskell Platform.
2013-11-25 21:12:11 +00:00
kr.angelov
8bcc70eac8 the GF syntax for identifiers is extended with quoted forms, i.e. you could write for instance 'ab.c' and then everything between the quotes is the identifier. This includes Unicode characters and non-ASCII symbols. This is useful for automatically generated GF grammars. 2013-11-22 13:30:18 +00:00
kr.angelov
042243f08a added the linref construction in GF. The PGF version number is now bumped 2013-10-30 12:53:36 +00:00
hallgren
021b5f06d3 Introduce type RawIdent; only 9 imports of Data.ByteString.Char8 remain
The fact that identifiers are represented as ByteStrings is now an internal
implementation detail in module GF.Infra.Ident. Conversion between ByteString
and identifiers is only needed in the lexer and the Binary instances.
2013-09-19 20:48:10 +00:00
hallgren
3d5b9bd1fd Make Ident abstract; imports of Data.ByteString.Char8 down from 29 to 16 modules
Most of the explicit uses of ByteStrings were eliminated by using identS,

	identS = identC . BS.pack 

which was found in GF.Grammar.CF and moved to GF.Infra.Ident. The function

	prefixIdent :: String -> Ident -> Ident

allowed one additional import of ByteString to be eliminated. The functions

	isArgIdent :: Ident -> Bool
	getArgIndex :: Ident -> Maybe Int

were needed to eliminate explicit pattern matching on Ident from two modules.
2013-09-19 18:23:47 +00:00
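
A rough sketch of how an abstract identifier type lets client modules avoid importing ByteString, as the commit above describes. The function names identC, identS, prefixIdent, isArgIdent and getArgIndex and their type signatures come from the commit message. The constructors IC and IA, the helpers argIdent and showIdent, and all function bodies are assumptions made for illustration and may differ from GF.Infra.Ident.

```haskell
import qualified Data.ByteString.Char8 as BS

-- In a real module the export list would name Ident without its
-- constructors, so that clients cannot pattern match on it.
-- Both constructors below are hypothetical.
data Ident
  = IC BS.ByteString      -- ordinary identifier
  | IA BS.ByteString Int  -- argument identifier carrying an index
  deriving (Eq, Ord)

-- Build an identifier from a ByteString (needed only by the lexer
-- and the Binary instances, according to the commit message).
identC :: BS.ByteString -> Ident
identC = IC

-- Build an identifier from a String, hiding the ByteString conversion.
identS :: String -> Ident
identS = identC . BS.pack

-- Hypothetical constructor function for argument identifiers.
argIdent :: String -> Int -> Ident
argIdent s i = IA (BS.pack s) i

-- Hypothetical rendering; the "_" separator is an assumption.
showIdent :: Ident -> String
showIdent (IC bs)   = BS.unpack bs
showIdent (IA bs i) = BS.unpack bs ++ "_" ++ show i

-- Prepend a string to an identifier without exposing ByteString.
prefixIdent :: String -> Ident -> Ident
prefixIdent pref = identS . (pref ++) . showIdent

-- Queries that replace explicit pattern matching in client modules.
isArgIdent :: Ident -> Bool
isArgIdent (IA _ _) = True
isArgIdent _        = False

getArgIndex :: Ident -> Maybe Int
getArgIndex (IA _ i) = Just i
getArgIndex _        = Nothing
```

Client code written against these functions depends only on String and Ident, so the underlying representation could later change without touching the importing modules.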
hallgren
fad63a14be Better error messages for attempts to redefine predefined constants
Instead of just "syntax error", you now get e.g.

   PType is a predefined constant, it can not be redefined

This is a simple change in the parser.
2013-08-07 19:36:09 +00:00
kr.angelov
4922ab6cc4 now the beam size for the statistical parser can be configured by using the flag beam_size in the top-level concrete module 2013-02-12 10:53:13 +00:00
hallgren
a559e51608 Quick fix to render some parser error messages from UTF-8-encoded source files correctly.
The parser works on raw byte sequences read from source files. If parsing
succeeds the raw byte sequences are converted to proper Unicode characters 
in a later phase. But the parser calls the function buildAnyTree, which can 
fail and generate error messages containing source code fragments, which might
then contain raw byte sequences. To render these error messages correctly, 
they need to be converted in accordance with the coding flag in the source 
file. This is now done for UTF-8-encoded source files, but should ideally also
be done for other character encodings. (Latin-1-encoded files never suffered 
from this problem, since raw bytes are proper Unicode characters in this case.)
2013-01-28 17:23:02 +00:00
kr.angelov
61c16f2eb2 more structured format for errors and warnings from the compiler 2011-11-15 13:33:44 +00:00
kr.angelov
416d231c5e Now PMCFG is compiled per module and at the end we only link it. The new compilation scheme is a few times faster. 2011-11-10 14:09:41 +00:00
kr.angelov
734c66710e merge GF.Infra.Modules and GF.Grammar.Grammar. This is a preparation for the separate PGF building 2011-11-02 13:57:11 +00:00
kr.angelov
5fe49ed9f7 Now the compiler maintains more precise information for the source locations of the different definitions. There is a --tags option which generates a list of all identifiers with their source locations. 2011-11-02 11:44:59 +00:00
kr.angelov
bb599029c9 change the precedence for the left argument of -> 2011-09-22 16:24:02 +00:00
aarne
848373e29e GenIP, GenRP in Extra and any_Quant in ExtraEng 2011-07-21 08:25:04 +00:00
aarne
7361ddea45 make it possible to override opers defined in an interface by syntax 'instance Foo of Bar - [f,g,h]' 2011-03-12 11:24:14 +00:00
krasimir
f6a7292ad2 bugfix for the abstract operations 2010-11-15 09:38:31 +00:00
krasimir
115b4213d5 operations in the abstract syntax 2010-11-12 19:37:19 +00:00
krasimir
c3f4c3eba7 refactoring in GF.Grammar.Grammar 2010-05-28 14:15:15 +00:00
krasimir
6313244eac use the native unicode support from GHC 6.12 2010-04-19 09:38:36 +00:00
krasimir
e7f01aa5f0 fix checkInfoType in Parser.y 2010-03-22 23:49:15 +00:00
krasimir
bf74f50733 store and propagate the exact source location for all judgements in the grammar. It may not be used accurately in the error messages yet 2010-03-22 21:15:29 +00:00
krasimir
985bb550c0 fix the precedence for patterns ~, - and @ 2010-03-18 19:52:45 +00:00
krasimir
f870c4d80f syntax for inaccessible patterns in GF 2010-03-18 19:34:30 +00:00
krasimir
19b17dceb6 no need to keep the list of constructors per category in .gfo 2010-02-16 09:34:02 +00:00
krasimir
be6465a2eb refactor GF.Infra.Modules for better error messages 2010-01-31 15:54:25 +00:00
krasimir
f85232947e reorganize the directories under src, and rescue the JavaScript interpreter from deprecated 2009-12-13 18:50:29 +00:00