forked from GitHub/gf-core

42 Commits

Author SHA1 Message Date
Meng Weng Wong
3a1213ab37 prepare for GHC 9, base 4.15, by using Buffer constructor interface 2022-03-05 12:59:25 +08:00
John J. Camilleri
f2e52d6f2c Replace tabs for whitespace in source code 2021-07-07 09:40:41 +02:00
Thomas Hallgren
28f53e801a PGFService: revert unlexing change in PGFService to restore &+ behaviour 2019-11-18 13:20:41 +01:00
Thomas Hallgren
fc1b51aa95 Adding -output-format canonical_gf
This output format converts a GF grammar to a "canonical" GF grammar. A
canonical GF grammar consists of

 - one self-contained module for the abstract syntax
 - one self-contained module per concrete syntax

The concrete syntax modules contain param, lincat and lin definitions;
everything else has been eliminated by the partial evaluator, including
references to resource library modules and functors. Record types
and tables are retained.

The -output-format canonical_gf option writes canonical GF grammars to a
subdirectory "canonical/". The canonical GF grammars are written as
normal GF ".gf" source files, which can be compiled with GF in the normal way.

The translation to canonical form goes via an AST for canonical GF grammars,
defined in GF.Grammar.Canonical. This is a simple, self-contained format that
doesn't cover everything in GF (e.g. omitting dependent types and HOAS), but it
is complete enough to translate the Foods and Phrasebook grammars found in
gf-contrib. The AST is based on the GF grammar "GFCanonical" presented here:

  https://github.com/GrammaticalFramework/gf-core/issues/30#issuecomment-453556553

The translation of concrete syntax to canonical form is based on the
previously existing translation of concrete syntax to Haskell, implemented
in module GF.Compile.ConcreteToHaskell. This module could now be reimplemented
and simplified significantly by going via the canonical format. Perhaps exports
to other output formats could benefit by going via the canonical format too.

There is also the possibility of completing the GFCanonical grammar
mentioned above and using GF itself to convert canonical GF grammars to
other formats...
2019-01-17 21:04:08 +01:00
Aarne Ranta
013f3573e6 added transliteration arabic_unvocalized, which omits the vowels 2018-06-12 20:39:39 +02:00
Krasimir Angelov
1f908fa7bf eliminate modules PGF.Lexing, PGF.LexingAGreek. Make PGF.Utilities an internal module in the runtime. These are not really part of the core runtime. 2017-09-04 11:43:37 +02:00
aarne
d51cbb0f1a added Arabic question mark to arabic and persian transliterations, as well as the zero-width non-joiner U+200C to persian 2017-06-14 12:32:17 +00:00
leiss
df2901c9c0 add lexer and unlexer for Ancient Greek accent normalization 2016-02-23 16:30:39 +00:00
hallgren
5bfaf10de5 Comment out some dead code found with -fwarn-unused-binds
Also fixed some warnings and tightened some imports
2015-08-28 13:59:43 +00:00
kr.angelov
d3b9652b81 revert an accidental change that I pushed together with the last patch 2014-08-11 11:44:49 +00:00
kr.angelov
584d589041 a partial support for def rules in the C runtime
The def rules are now compiled to byte code by the compiler and then to
native code by the JIT compiler in the runtime. Not all constructions
are implemented yet. The partial implementation is now in the repository
but it is not activated by default since this requires changes in the
PGF format. I will enable it only after it is complete.
2014-08-11 10:59:10 +00:00
hallgren
59172ce9c5 Adding GF.Infra.Location and GF.Text.Pretty (forgot to 'darcs add' them before) 2014-07-27 22:13:13 +00:00
hallgren
0715cfe2ae minibar: include the grammar's last modification in the grammar info shown by the "i" button
Also bumped version number in gf.cabal to 3.6-darcs.
Also removed some unnecessary use of CPP.
2014-06-24 13:59:09 +00:00
hallgren
50ea3d265c Change the type of PGF.Lexing.bindTok to [String] -> [String]
The old type was [String] -> String. This function was only used
in GF.Text.Lexing.stringOp, which now uses (unwords . bindTok) instead,
with no change in behaviour.
2014-04-09 17:39:21 +00:00
hallgren
677d849840 Unlexers: move capitalization of first word from GF.Text.Lexing to PGF.Lexing
The capitalization of the first word was done in GF.Text.Lexing.stringOp,
but is now done in the functions unlexText and unlexMixed in PGF.Lexing.
These functions are only used in stringOp and in PGFService (where the change
is needed), so the subtle change in behaviour should not cause any bugs.
2014-04-09 17:26:23 +00:00
hallgren
9cac98a356 Move basic lexing functions from GF.Text.Lexing to the new module PGF.Lexing
They are thus part of the PGF Run-Time Library, making it possible to add
lexing functionality in PGF service in a natural way.
2014-04-08 14:07:49 +00:00
aarne
2ef28487ef removed the unlines-lines wrapper from Lexing.unlexer to prevent empty lines when an unlexer (such as -bind or -unchars) is used as an option in linearization. Don't know really why the input had been broken into lines in the first place. You can see the effect by importing LangEng and running "gr -cat=Cl | l -table -bind" before and after recompiling GF. 2013-12-03 13:27:22 +00:00
hallgren
9d7fdf7c9a Change how GF deals with character encodings in grammar files
1. The default encoding is changed from Latin-1 to UTF-8.

2. Alternate encodings should be specified as "--# -coding=enc"; the old
   "flags coding=enc" declarations have no effect but are still checked for
   consistency.

3. A transitional warning is generated for files that contain non-ASCII
   characters without specifying a character encoding:

	"Warning: default encoding has changed from Latin-1 to UTF-8"

4. Conversion to Unicode is now done *before* lexing. This makes it possible
   to allow arbitrary Unicode characters in identifiers. But identifiers are
   still stored as ByteStrings, so they are limited to Latin-1 characters
   for now.

5. Lexer.hs is no longer part of the repository. We now generate the lexer
   from Lexer.x with alex>=3. Some workarounds for bugs in alex-3.0 were
   needed. These bugs might already be fixed in newer versions of alex, but
   we should be compatible with what is shipped in the Haskell Platform.
2013-11-25 21:12:11 +00:00
virk.shafqat
d1d5543c26 Improvements In Sindhi RG 2013-06-15 20:02:00 +00:00
hallgren
5b36461c1d GF.Text.Transliterations: avoid error prone function Data.Map.fromAscList 2013-06-02 10:10:46 +00:00
aarne
f33059ae39 Prasad's sanskrit transliteration ; MiniresourceSan now compiles but is mostly incorrect due to missing paradigms 2013-05-31 16:25:42 +00:00
virk.shafqat
cfcf7cbc7f unicode4k-changed 2012-11-05 16:44:31 +00:00
Sergei Trofimovich
c015ac77bd compiler/GF/Text/Coding.hs: fix build failure against ghc-7.2 2012-03-26 20:48:57 +00:00
virk.shafqat
4ba9944663 hindi-resource-grammar 2012-02-23 13:36:50 +00:00
virk.shafqat
5403e31264 sindhipatch 2012-02-21 09:02:42 +00:00
aarne
10d79ed050 made ps -from_TRANSLIT symmetric to -to_TRANSLIT in the sense that unknown characters are returned as themselves and not as question marks 2011-09-15 10:49:40 +00:00
virk.shafqat
fabd2fe192 refinementNepali-11-06-20 2011-06-20 11:24:22 +00:00
aarne
098af279e7 allow empty lines in transliteration files 2011-06-14 11:49:10 +00:00
virk.shafqat
86d2e676bc refinementsTextUrd-11-05-19 2011-05-19 15:57:12 +00:00
aarne
f158fad6f8 fixed problems in persian transliteration pointed out by Elnaz 2011-05-06 12:11:45 +00:00
aarne
4ec34bdbb6 transliteration via configuration file: ps -to=file or ps -from=file 2011-05-02 14:53:46 +00:00
aarne
7445e56387 a simple clitic analysis command 'ca' 2011-02-06 16:19:24 +00:00
aarne
f875fe563a corrections to ancientgreek encoding by Hans Leiss 2011-01-31 08:06:42 +00:00
aarne
243a0b3659 DiffUrd and Hin; updated Transliteration.hs 2010-11-25 12:22:58 +00:00
aarne
c5b3de8825 Amharic transliteration by Markos 2010-05-07 12:23:57 +00:00
krasimir
6313244eac use the native unicode support from GHC 6.12 2010-04-19 09:38:36 +00:00
aarne
cdd9efa559 Urdu transliteration fixed (by Shafqat) 2010-04-01 12:24:04 +00:00
krasimir
1e51690b71 added codepage for Turkish 2010-03-23 13:44:17 +00:00
krasimir
850b897f08 added comment to every GF.Text.CPxxxx module about the purpose of the codepage 2010-03-23 12:19:34 +00:00
krasimir
2ac96a7643 transliteration for Urdu 2010-03-22 09:29:43 +00:00
aarne
a4eb1800a4 correct capitalization in unlexmixed; unlextext and unlexmixed now remove string literal quotes 2009-12-17 21:17:46 +00:00
krasimir
f85232947e reorganize the directories under src, and rescue the JavaScript interpreter from deprecated 2009-12-13 18:50:29 +00:00