GF allows more characters in its types, as long as they are inside
single quotes. E.g. 'VP/Object' is a valid name for a GF category,
but not for a Haskell data type.
e.g. if C is a fun and not a cat in the abstract syntax.
Discarding bad lincats prevents GF from generating malformed PGFs that
are rejected by the C run-time system.
I also added code to reject bad lincats with an error, but I left it
commented out since it seems a bit pedantic compared to GF's otherwise
rather sloppy grammar checking.
+ Abstract syntax now is converted directly from the Grammar and not via PGF,
so you can use `gf -batch -no-pmcfg -f canonical_gf ...`, to export to
canonical_gf while skipping PMCFG and PGF file generation completely.
+ Flags that are normally copied to PGF files are now included in the
caninical_gf output as well (in particular the startcat flag).
This output format converts a GF grammar to a "canonical" GF grammar. A
canonical GF grammar consists of
- one self-contained module for the abstract syntax
- one self-contained module per concrete syntax
The concrete syntax modules contain param, lincat and lin definitions,
everything else has been eliminated by the partial evaluator, including
references to resource library modules and functors. Record types
and tables are retained.
The -output-format canonical_gf option writes canonical GF grammars to a
subdirectory "canonical/". The canonical GF grammars are written as
normal GF ".gf" source files, which can be compiled with GF in the normal way.
The translation to canonical form goes via an AST for canonical GF grammars,
defined in GF.Grammar.Canonical. This is a simple, self-contained format that
doesn't cover everyting in GF (e.g. omitting dependent types and HOAS), but it
is complete enough to translate the Foods and Phrasebook grammars found in
gf-contrib. The AST is based on the GF grammar "GFCanonical" presented here:
https://github.com/GrammaticalFramework/gf-core/issues/30#issuecomment-453556553
The translation of concrete syntax to canonical form is based on the
previously existing translation of concrete syntax to Haskell, implemented
in module GF.Compile.ConcreteToHaskell. This module could now be reimplemented
and simplified significantly by going via the canonical format. Perhaps exports
to other output formats could benefit by going via the canonical format too.
There is also the possibility of completing the GFCanonical grammar
mentioned above and using GF itself to convert canonical GF grammars to
other formats...
* In GHC 8.4.1, the operator <> has become a method of the Semigroup class
and is exported from the Prelude. This is unfortunate, since <> is also
exported from the standard library module Text.PrettyPrint, so in any
module that defines a pretty printer, there is likely to be an ambiguity.
This affects ~18 modules in GF. Solution:
import Prelude hiding (<>)
This works also in older versions of GHC, since GHC does't complain if
you hide something that doesn't exists.
* In GHC 8.4.1, Semigroup has become a superclass of Monoid. This means
that anywhere you define an instance of the Monoid class you also have to
define an instance in the Semigroup class.
This affects Data.Binary.Builder in GF. Solution: conditionally define
a Semigroup instance if compiling with base>=4.11 (ghc>=8.4.1)