gf-core

Author	SHA1	Message	Date
Meng Weng Wong	3a1213ab37	prepare for GHC 9, base 4.15, by using Buffer constructor interface	2022-03-05 12:59:25 +08:00
John J. Camilleri	f2e52d6f2c	Replace tabs for whitespace in source code	2021-07-07 09:40:41 +02:00
Thomas Hallgren	28f53e801a	PGFService: revert unlexing change in PGFService to restore &+ behaviour	2019-11-18 13:20:41 +01:00
Thomas Hallgren	fc1b51aa95	Adding -output-format canonical_gf. This output format converts a GF grammar to a "canonical" GF grammar. A canonical GF grammar consists of: one self-contained module for the abstract syntax; one self-contained module per concrete syntax. The concrete syntax modules contain param, lincat and lin definitions, everything else has been eliminated by the partial evaluator, including references to resource library modules and functors. Record types and tables are retained. The -output-format canonical_gf option writes canonical GF grammars to a subdirectory "canonical/". The canonical GF grammars are written as normal GF ".gf" source files, which can be compiled with GF in the normal way. The translation to canonical form goes via an AST for canonical GF grammars, defined in GF.Grammar.Canonical. This is a simple, self-contained format that doesn't cover everything in GF (e.g. omitting dependent types and HOAS), but it is complete enough to translate the Foods and Phrasebook grammars found in gf-contrib. The AST is based on the GF grammar "GFCanonical" presented here: https://github.com/GrammaticalFramework/gf-core/issues/30#issuecomment-453556553 The translation of concrete syntax to canonical form is based on the previously existing translation of concrete syntax to Haskell, implemented in module GF.Compile.ConcreteToHaskell. This module could now be reimplemented and simplified significantly by going via the canonical format. Perhaps exports to other output formats could benefit by going via the canonical format too. There is also the possibility of completing the GFCanonical grammar mentioned above and using GF itself to convert canonical GF grammars to other formats...	2019-01-17 21:04:08 +01:00
Aarne Ranta	013f3573e6	added transliteration arabic_unvocalized, which omits the vowels	2018-06-12 20:39:39 +02:00
Krasimir Angelov	1f908fa7bf	eliminate modules PGF.Lexing, PGF.LexingAGreek. Make PGF.Utilities an internal module in the runtime. These are not really part of the core runtime.	2017-09-04 11:43:37 +02:00
aarne	d51cbb0f1a	added Arabic question mark to arabic and persian transliterations, as well as the zero-width non-joiner U+200C to persian	2017-06-14 12:32:17 +00:00
leiss	df2901c9c0	add lexer and unlexer for Ancient Greek accent normalization	2016-02-23 16:30:39 +00:00
hallgren	5bfaf10de5	Comment out some dead code found with -fwarn-unused-binds Also fixed some warnings and tightened some imports	2015-08-28 13:59:43 +00:00
kr.angelov	d3b9652b81	revert an accidental change that I pushed together with the last patch	2014-08-11 11:44:49 +00:00
kr.angelov	584d589041	a partial support for def rules in the C runtime The def rules are now compiled to byte code by the compiler and then to native code by the JIT compiler in the runtime. Not all constructions are implemented yet. The partial implementation is now in the repository but it is not activated by default since this requires changes in the PGF format. I will enable it only after it is complete.	2014-08-11 10:59:10 +00:00
hallgren	59172ce9c5	Adding GF.Infra.Location and GF.Text.Pretty (forgot to 'darcs add' them before)	2014-07-27 22:13:13 +00:00
hallgren	0715cfe2ae	minibar: include the grammar's last modification in the grammar info shown by the "i" button. Also bumped version number in gf.cabal to 3.6-darcs. Also removed some unnecessary use of CPP.	2014-06-24 13:59:09 +00:00
hallgren	50ea3d265c	Change the type of PGF.Lexing.bindTok to [String] -> [String] The old type was [String] -> String. This function was only used in GF.Text.Lexing.stringOp, which now uses (unwords . bindTok) instead, with no change in behaviour.	2014-04-09 17:39:21 +00:00
hallgren	677d849840	Unlexers: move capitalization of first word from GF.Text.Lexing to PGF.Lexing The capitalization of the first word was done in GF.Text.Lexing.stringOp, but is now done in the functions unlexText and unlexMixed in PGF.Lexing. These functions are only used in stringOp and in PGFService (where the change is needed), so the subtle change in behaviour should not cause any bugs.	2014-04-09 17:26:23 +00:00
hallgren	9cac98a356	Move basic lexing functions from GF.Text.Lexing to the new module PGF.Lexing They are thus part of the PGF Run-Time Library, making it possible to add lexing functionality in PGF service in a natural way.	2014-04-08 14:07:49 +00:00
aarne	2ef28487ef	removed the unlines-lines wrapper from Lexing.unlexer to prevent empty lines when an unlexer (such as -bind or -unchars) is used as an option in linearization. Don't know really why the input had been broken into lines in the first place. You can see the effect by importing LangEng and running "gr -cat=Cl \| l -table -bind" before and after recompiling GF.	2013-12-03 13:27:22 +00:00
hallgren	9d7fdf7c9a	Change how GF deals with character encodings in grammar files 1. The default encoding is changed from Latin-1 to UTF-8. 2. Alternate encodings should be specified as "--# -coding=enc", the old "flags coding=enc" declarations have no effect but are still checked for consistency. 3. A transitional warning is generated for files that contain non-ASCII characters without specifying a character encoding: "Warning: default encoding has changed from Latin-1 to UTF-8" 4. Conversion to Unicode is now done before lexing. This makes it possible to allow arbitrary Unicode characters in identifiers. But identifiers are still stored as ByteStrings, so they are limited to Latin-1 characters for now. 5. Lexer.hs is no longer part of the repository. We now generate the lexer from Lexer.x with alex>=3. Some workarounds for bugs in alex-3.0 were needed. These bugs might already be fixed in newer versions of alex, but we should be compatible with what is shipped in the Haskell Platform.	2013-11-25 21:12:11 +00:00
virk.shafqat	d1d5543c26	Improvements In Sindhi RG	2013-06-15 20:02:00 +00:00
hallgren	5b36461c1d	GF.Text.Transliterations: avoid error prone function Data.Map.fromAscList	2013-06-02 10:10:46 +00:00
aarne	f33059ae39	Prasad's sanskrit transliteration ; MiniresourceSan now compiles but is mostly incorrect due to missing paradigms	2013-05-31 16:25:42 +00:00
virk.shafqat	cfcf7cbc7f	unicode4k-changed	2012-11-05 16:44:31 +00:00
Sergei Trofimovich	c015ac77bd	compiler/GF/Text/Coding.hs: fix build failure against ghc-7.2	2012-03-26 20:48:57 +00:00
virk.shafqat	4ba9944663	hindi-resource-grammar	2012-02-23 13:36:50 +00:00
virk.shafqat	5403e31264	sindhipatch	2012-02-21 09:02:42 +00:00
aarne	10d79ed050	made ps -from_TRANSLIT symmetric to -to_TRANSLIT in the sense that unknown characters are returned as themselves and not as question marks	2011-09-15 10:49:40 +00:00
virk.shafqat	fabd2fe192	refinementNepali-11-06-20	2011-06-20 11:24:22 +00:00
aarne	098af279e7	allow empty lines in transliteration files	2011-06-14 11:49:10 +00:00
virk.shafqat	86d2e676bc	refinementsTextUrd-11-05-19	2011-05-19 15:57:12 +00:00
aarne	f158fad6f8	fixed problems in persian transliteration pointed out by Elnaz	2011-05-06 12:11:45 +00:00
aarne	4ec34bdbb6	transliteration via configuration file: ps -to=file or ps -from=file	2011-05-02 14:53:46 +00:00
aarne	7445e56387	a simple clitic analysis command 'ca'	2011-02-06 16:19:24 +00:00
aarne	f875fe563a	corrections to ancientgreek encoding by Hans Leiss	2011-01-31 08:06:42 +00:00
aarne	243a0b3659	DiffUrd and Hin; updated Transliteration.hs	2010-11-25 12:22:58 +00:00
aarne	c5b3de8825	Amharic transliteration by Markos	2010-05-07 12:23:57 +00:00
krasimir	6313244eac	use the native unicode support from GHC 6.12	2010-04-19 09:38:36 +00:00
aarne	cdd9efa559	Urdu transliteration fixed (by Shafqat)	2010-04-01 12:24:04 +00:00
krasimir	1e51690b71	added codepage for Turkish	2010-03-23 13:44:17 +00:00
krasimir	850b897f08	added comment to every GF.Text.CPxxxx module about the purpose of the codepage	2010-03-23 12:19:34 +00:00
krasimir	2ac96a7643	transliteration for Urdu	2010-03-22 09:29:43 +00:00
aarne	a4eb1800a4	correct capitalization in unlexmixed; unlextext and unlexmixed now remove string literal quotes	2009-12-17 21:17:46 +00:00
krasimir	f85232947e	reorganize the directories under src, and rescue the JavaScript interpreter from deprecated	2009-12-13 18:50:29 +00:00

42 Commits