Mirror of https://github.com/GrammaticalFramework/gf-core.git
a39f8cc5da71b2130f4d0370485ea8a14e495c1d
The problem is that lowercase a with a grave accent is coded in UTF-8 as \195\160. Unicode character \160 is the non-breaking space, so Haskell's words function will break a UTF-8 encoded string at this character. String literals in the .gfo file are UTF-8 encoded in generateModuleCode, just before the call to prGrammar (which uses compactPrint, which used words).

The real solution would be to pretty-print the grammar to Unicode, and then encode it as UTF-8. The problem with that is Latin-1 identifiers. They are now kept in Latin-1 in the .gfo file, since Alex can't handle Unicode. The real solution to that would be to fix Alex to handle Unicode, but that is non-trivial. GHC internally uses a very hacky .x file to be able to lex UTF-8 source files. An alternative solution, which doesn't address the weirdness of using two different encodings in the same .gfo file as we do now, is to incorporate compactPrint into the grammar printer, to avoid having to do any postprocessing.
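The effect described above can be reproduced in a few lines of Haskell. This is only a sketch: encodeUTF8 and compact are hypothetical stand-ins written for this illustration (compact is assumed to be a plain unwords . words pass), not the actual GF functions, and each byte of the encoded string is held in one Char, as the message describes.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Hypothetical minimal UTF-8 encoder: each output Char holds one byte.
encodeUTF8 :: String -> String
encodeUTF8 = concatMap enc
  where
    enc c
      | n < 0x80    = [c]
      | n < 0x800   = map chr [ 0xC0 .|. (n `shiftR` 6)
                              , 0x80 .|. (n .&. 0x3F) ]
      | n < 0x10000 = map chr [ 0xE0 .|. (n `shiftR` 12)
                              , 0x80 .|. ((n `shiftR` 6) .&. 0x3F)
                              , 0x80 .|. (n .&. 0x3F) ]
      | otherwise   = map chr [ 0xF0 .|. (n `shiftR` 18)
                              , 0x80 .|. ((n `shiftR` 12) .&. 0x3F)
                              , 0x80 .|. ((n `shiftR` 6) .&. 0x3F)
                              , 0x80 .|. (n .&. 0x3F) ]
      where n = ord c

-- Hypothetical stand-in for a words-based compacting pass.
compact :: String -> String
compact = unwords . words

main :: IO ()
main = do
  let lit = "voil\224!"              -- "voilà!" as Unicode code points
  -- Encode first, then compact: the byte \160 is treated as white space
  -- and replaced by an ordinary space, corrupting the UTF-8 sequence.
  print (compact (encodeUTF8 lit))   -- "voil\195 !"
  -- Compact first, then encode: the literal survives intact.
  print (encodeUTF8 (compact lit))   -- "voil\195\160!"
```

The second ordering corresponds to pretty-printing to Unicode and encoding afterwards; it only works if everything in the printed text can be treated as Unicode, which is where the Latin-1 identifiers get in the way.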
DESCRIPTION
The Grammatical Framework (GF) is a grammar formalism based on type theory.
It consists of
* a special-purpose programming language
* a compiler of the language
* a generic grammar processor
The compiler reads GF grammars from user-provided files, and the
generic grammar processor performs various tasks with the grammars:
* generation
* parsing
* translation
* type checking
* computation
* paraphrasing
* random generation
* syntax editing
GF particularly addresses four aspects of grammars:
* multilinguality (parallel grammars for different languages)
* semantics (semantic conditions of well-formedness, semantic
  properties of expressions)
* grammar engineering (modularity, abstractions, libraries)
* embeddability in programs written in other languages (C, C++,
  Haskell, Java, JavaScript)
INSTALLATION of binary distribution: see INSTALL
INSTALLATION of source distribution:
See src/INSTALL for installation instructions.
Languages: Haskell 45%, C 32.9%, JavaScript 10.1%, HTML 3.3%, Grammatical Framework 2.8%, Other 5.8%