forked from GitHub/gf-core
Commit Graph

26 Commits

Author SHA1 Message Date
aarne
1778cd7c19 removed the unlines-lines wrapper from Lexing.unlexer to prevent empty lines when an unlexer (such as -bind or -unchars) is used as an option in linearization. Don't know really why the input had been broken into lines in the first place. You can see the effect by importing LangEng and running "gr -cat=Cl | l -table -bind" before and after recompiling GF. 2013-12-03 13:27:22 +00:00
hallgren
30fc46e934 Change how GF deals with character encodings in grammar files
1. The default encoding is changed from Latin-1 to UTF-8.

2. Alternate encodings should be specified as "--# -coding=enc", the old
   "flags coding=enc" declarations have no effect but are still checked for
   consistency.

3. A transitional warning is generated for files that contain non-ASCII
   characters without specifying a character encoding:

	"Warning: default encoding has changed from Latin-1 to UTF-8"

4. Conversion to Unicode is now done *before* lexing. This makes it possible
   to allow arbitrary Unicode characters in identifiers. But identifiers are
   still stored as ByteStrings, so they are limited to Latin-1 characters
   for now.

5. Lexer.hs is no longer part of the repository. We now generate the lexer
   from Lexer.x with alex>=3. Some workarounds for bugs in alex-3.0 were
   needed. These bugs might already be fixed in newer versions of alex, but
   we should be compatible with what is shipped in the Haskell Platform.
2013-11-25 21:12:11 +00:00
virk.shafqat
9caa9cd44e Improvements In Sindhi RG 2013-06-15 20:02:00 +00:00
hallgren
4e2337f009 GF.Text.Transliterations: avoid error prone function Data.Map.fromAscList 2013-06-02 10:10:46 +00:00
aarne
000aa35de9 Prasad's sanskrit transliteration ; MiniresourceSan now compiles but is mostly incorrect due to missing paradigms 2013-05-31 16:25:42 +00:00
virk.shafqat
f7344b8f38 unicode4k-changed 2012-11-05 16:44:31 +00:00
Sergei Trofimovich
73040e9c50 compiler/GF/Text/Coding.hs: fix build failure against ghc-7.2 2012-03-26 20:48:57 +00:00
virk.shafqat
9aede98c7f hindi-resource-grammar 2012-02-23 13:36:50 +00:00
virk.shafqat
14e0237950 sindhipatch 2012-02-21 09:02:42 +00:00
aarne
42b7b0f8c2 made ps -from_TRANSLIT symmetric to -to_TRANSLIT in the sense that unknown characters are returned as themselves and not as question marks 2011-09-15 10:49:40 +00:00
virk.shafqat
a9d8634147 refinementNepali-11-06-20 2011-06-20 11:24:22 +00:00
aarne
a772c0af48 allow empty lines in transliteration files 2011-06-14 11:49:10 +00:00
virk.shafqat
9e38856a1e refinementsTextUrd-11-05-19 2011-05-19 15:57:12 +00:00
aarne
20517f98dc fixed problems in persian transliteration pointed out by Elnaz 2011-05-06 12:11:45 +00:00
aarne
72b63e34bd transliteration via configuration file: ps -to=file or ps -from=file 2011-05-02 14:53:46 +00:00
aarne
e4eccba450 a simple clitic analysis command 'ca' 2011-02-06 16:19:24 +00:00
aarne
d661d45b1e corrections to ancientgreek encoding by Hans Leiss 2011-01-31 08:06:42 +00:00
aarne
0460ed2d8b DiffUrd and Hin; updated Transliteration.hs 2010-11-25 12:22:58 +00:00
aarne
788fe8a37e Amharic transliteration by Markos 2010-05-07 12:23:57 +00:00
krasimir
0b6b30d4a8 use the native unicode support from GHC 6.12 2010-04-19 09:38:36 +00:00
aarne
9d6e3dae86 Urdu transliteration fixed (by Shafqat) 2010-04-01 12:24:04 +00:00
krasimir
1591253a55 added codepage for Turkish 2010-03-23 13:44:17 +00:00
krasimir
ef957a0307 added comment to every GF.Text.CPxxxx module about the purpose of the codepage 2010-03-23 12:19:34 +00:00
krasimir
63adb15eb9 transliteration for Urdu 2010-03-22 09:29:43 +00:00
aarne
4e28be6958 correct capitalization in unlexmixed; unlextext and unlexmixed now remove string literal quotes 2009-12-17 21:17:46 +00:00
krasimir
c92f9d1c0c reorganize the directories under src, and rescue the JavaScript interpreter from deprecated 2009-12-13 18:50:29 +00:00