26 Commits

Author SHA1 Message Date
aarne 2ef28487ef removed the unlines-lines wrapper from Lexing.unlexer to prevent empty lines when an unlexer (such as -bind or -unchars) is used as an option in linearization. I don't really know why the input was broken into lines in the first place. You can see the effect by importing LangEng and running "gr -cat=Cl | l -table -bind" before and after recompiling GF. 2013-12-03 13:27:22 +00:00
hallgren 9d7fdf7c9a Change how GF deals with character encodings in grammar files
1. The default encoding is changed from Latin-1 to UTF-8.

2. Alternate encodings should be specified as "--# -coding=enc"; the old
   "flags coding=enc" declarations have no effect but are still checked for
   consistency.

3. A transitional warning is generated for files that contain non-ASCII
   characters without specifying a character encoding:

	"Warning: default encoding has changed from Latin-1 to UTF-8"

4. Conversion to Unicode is now done *before* lexing. This makes it possible
   to allow arbitrary Unicode characters in identifiers. But identifiers are
   still stored as ByteStrings, so they are limited to Latin-1 characters
   for now.

5. Lexer.hs is no longer part of the repository. We now generate the lexer
   from Lexer.x with alex>=3. Some workarounds for bugs in alex-3.0 were
   needed. These bugs might already be fixed in newer versions of alex, but
   we should be compatible with what is shipped in the Haskell Platform.
2013-11-25 21:12:11 +00:00
virk.shafqat d1d5543c26 Improvements In Sindhi RG 2013-06-15 20:02:00 +00:00
hallgren 5b36461c1d GF.Text.Transliterations: avoid error-prone function Data.Map.fromAscList 2013-06-02 10:10:46 +00:00
aarne f33059ae39 Prasad's Sanskrit transliteration; MiniresourceSan now compiles but is mostly incorrect due to missing paradigms 2013-05-31 16:25:42 +00:00
virk.shafqat cfcf7cbc7f unicode4k-changed 2012-11-05 16:44:31 +00:00
Sergei Trofimovich c015ac77bd compiler/GF/Text/Coding.hs: fix build failure against ghc-7.2 2012-03-26 20:48:57 +00:00
virk.shafqat 4ba9944663 hindi-resource-grammar 2012-02-23 13:36:50 +00:00
virk.shafqat 5403e31264 sindhipatch 2012-02-21 09:02:42 +00:00
aarne 10d79ed050 made ps -from_TRANSLIT symmetric to -to_TRANSLIT in the sense that unknown characters are returned as themselves and not as question marks 2011-09-15 10:49:40 +00:00
virk.shafqat fabd2fe192 refinementNepali-11-06-20 2011-06-20 11:24:22 +00:00
aarne 098af279e7 allow empty lines in transliteration files 2011-06-14 11:49:10 +00:00
virk.shafqat 86d2e676bc refinementsTextUrd-11-05-19 2011-05-19 15:57:12 +00:00
aarne f158fad6f8 fixed problems in persian transliteration pointed out by Elnaz 2011-05-06 12:11:45 +00:00
aarne 4ec34bdbb6 transliteration via configuration file: ps -to=file or ps -from=file 2011-05-02 14:53:46 +00:00
aarne 7445e56387 a simple clitic analysis command 'ca' 2011-02-06 16:19:24 +00:00
aarne f875fe563a corrections to ancientgreek encoding by Hans Leiss 2011-01-31 08:06:42 +00:00
aarne 243a0b3659 DiffUrd and Hin; updated Transliteration.hs 2010-11-25 12:22:58 +00:00
aarne c5b3de8825 Amharic transliteration by Markos 2010-05-07 12:23:57 +00:00
krasimir 6313244eac use the native unicode support from GHC 6.12 2010-04-19 09:38:36 +00:00
aarne cdd9efa559 Urdu transliteration fixed (by Shafqat) 2010-04-01 12:24:04 +00:00
krasimir 1e51690b71 added codepage for Turkish 2010-03-23 13:44:17 +00:00
krasimir 850b897f08 added comment to every GF.Text.CPxxxx module about the purpose of the codepage 2010-03-23 12:19:34 +00:00
krasimir 2ac96a7643 transliteration for Urdu 2010-03-22 09:29:43 +00:00
aarne a4eb1800a4 correct capitalization in unlexmixed; unlextext and unlexmixed now remove string literal quotes 2009-12-17 21:17:46 +00:00
krasimir f85232947e reorganize the directories under src, and rescue the JavaScript interpreter from deprecated 2009-12-13 18:50:29 +00:00