reorganize the directories under src, and rescue the JavaScript interpreter from deprecated

This commit is contained in:
krasimir
2009-12-13 18:50:29 +00:00
parent 15305efa5a
commit c92f9d1c0c
189 changed files with 2 additions and 2 deletions

deprecated/FILES Normal file

@@ -0,0 +1,260 @@
Code map for GF source files.
$Author: peb $
$Date: 2005/02/07 10:58:08 $
Directories:
[top level] GF main function and runtime-related modules
api high-level access to GF functionalities
canonical GFC (= GF Canonical) basic functionalities
cf context-free skeleton used in parsing
cfgm multilingual context-free skeleton exported to Java
compile compilation phases from GF to GFC
conversions [OBSOLETE] formats used in parser generation
for-ghc GHC-specific files (Glasgow Haskell Compiler)
for-hugs Hugs-specific files (a Haskell interpreter)
for-windows Windows-specific files (an operating system from Microsoft)
grammar basic functionalities of GF grammars used in compilation
infra GF-independent infrastructure and auxiliaries
newparsing parsing with GF grammars: current version (cf. parsing)
notrace debugging utilities for parser development (cf. trace)
parsers parsers of GF and GFC files
parsing [OBSOLETE] parsing with GF grammars: old version (cf. newparsing)
shell interaction shells
source utilities for reading in GF source files
speech generation of speech recognition grammars
trace debugging utilities for parser development (cf. notrace)
useGrammar grammar functionalities for applications
util utilities for using GF
Individual files:
GF.hs the Main module
GFModes.hs
HelpFile.hs [AUTO] help file generated by util/MkHelpFile
Today.hs [AUTO] file generated by "make today"
api/API.hs high-level access to GF functionalities
api/BatchTranslate.hs
api/GetMyTree.hs
api/GrammarToHaskell.hs
api/IOGrammar.hs
api/MyParser.hs slot for defining your own parser
canonical/AbsGFC.hs [AUTO] abstract syntax of GFC
canonical/CanonToGrammar.hs
canonical/CMacros.hs
canonical/ErrM.hs
canonical/GetGFC.hs
canonical/GFC.cf [LBNF] source of GFC parser
canonical/GFC.hs
canonical/LexGFC.hs
canonical/Look.hs
canonical/MkGFC.hs
canonical/PrExp.hs
canonical/PrintGFC.hs pretty-printer of GFC
canonical/Share.hs
canonical/SkelGFC.hs [AUTO]
canonical/TestGFC.hs [AUTO]
canonical/Unlex.hs
cf/CanonToCF.hs
cf/CF.hs abstract syntax of context-free grammars
cf/CFIdent.hs
cf/CFtoGrammar.hs
cf/CFtoSRG.hs
cf/ChartParser.hs the current default parsing method
cf/EBNF.hs
cf/PPrCF.hs
cf/PrLBNF.hs
cf/Profile.hs
cfgm/CFG.cf [LBNF] source
cfgm/AbsCFG.hs [AUTO]
cfgm/LexCFG.hs [AUTO]
cfgm/ParCFG.hs [AUTO]
cfgm/PrintCFG.hs [AUTO]
cfgm/PrintCFGrammar.hs
compile/CheckGrammar.hs
compile/Compile.hs the complete compiler pipeline
compile/Extend.hs
compile/GetGrammar.hs
compile/GrammarToCanon.hs
compile/MkResource.hs
compile/MkUnion.hs
compile/ModDeps.hs
compile/Optimize.hs
compile/PGrammar.hs
compile/PrOld.hs
compile/Rebuild.hs
compile/RemoveLiT.hs
compile/Rename.hs
compile/ShellState.hs the run-time multilingual grammar datastructure
compile/Update.hs
for-ghc/ArchEdit.hs
for-ghc/Arch.hs
for-ghc-nofud/ArchEdit.hs@
for-ghc-nofud/Arch.hs@
for-hugs/Arch.hs
for-hugs/ArchEdit.hs
for-hugs/JGF.hs
for-hugs/LexCFG.hs dummy CFG lexer
for-hugs/LexGF.hs dummy GF lexer
for-hugs/LexGFC.hs dummy GFC lexer
for-hugs/MoreCustom.hs
for-hugs/ParCFG.hs dummy CFG parser
for-hugs/ParGFC.hs dummy GFC parser
for-hugs/ParGF.hs dummy GF parser
for-hugs/Tracing.hs
for-hugs/Unicode.hs
for-windows/ArchEdit.hs
for-windows/Arch.hs
grammar/AbsCompute.hs
grammar/Abstract.hs GF and GFC abstract syntax datatypes
grammar/AppPredefined.hs
grammar/Compute.hs
grammar/Grammar.hs GF source grammar datatypes
grammar/LookAbs.hs
grammar/Lookup.hs
grammar/Macros.hs macros for creating GF terms and types
grammar/MMacros.hs more macros, mainly for abstract syntax
grammar/PatternMatch.hs
grammar/PrGrammar.hs the top-level grammar printer
grammar/Refresh.hs
grammar/ReservedWords.hs
grammar/TC.hs Coquand's type checking engine
grammar/TypeCheck.hs
grammar/Unify.hs
grammar/Values.hs
infra/Arabic.hs ASCII coding of Arabic Unicode
infra/Assoc.hs finite maps/association lists as binary search trees
infra/CheckM.hs
infra/Comments.hs
infra/Devanagari.hs ASCII coding of Devanagari Unicode
infra/ErrM.hs
infra/Ethiopic.hs
infra/EventF.hs
infra/ExtendedArabic.hs
infra/ExtraDiacritics.hs
infra/FudgetOps.hs
infra/Glue.hs
infra/Greek.hs
infra/Hebrew.hs
infra/Hiragana.hs
infra/Ident.hs
infra/LatinASupplement.hs
infra/Map.hs finite maps as red black trees
infra/Modules.hs
infra/OCSCyrillic.hs
infra/Operations.hs library of strings, search trees, error monads
infra/Option.hs
infra/OrdMap2.hs abstract class of finite maps + implementation as association lists
infra/OrdSet.hs abstract class of sets + implementation as sorted lists
infra/Parsers.hs
infra/ReadFiles.hs
infra/RedBlack.hs red black trees
infra/RedBlackSet.hs sets and maps as red black trees
infra/Russian.hs
infra/SortedList.hs sets as sorted lists
infra/Str.hs
infra/Tamil.hs
infra/Text.hs
infra/Trie2.hs
infra/Trie.hs
infra/UnicodeF.hs
infra/Unicode.hs
infra/UseIO.hs
infra/UTF8.hs UTF-8 en/decoding
infra/Zipper.hs
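Several infra modules implement the same abstraction in different ways: sets as sorted lists (SortedList.hs, OrdSet.hs) and as red-black trees (RedBlackSet.hs). As a minimal sketch of the simplest of these — sets as sorted, duplicate-free lists — the following is illustrative only; the function names are hypothetical and not the actual module API:

```haskell
-- Sketch of sets as sorted, duplicate-free lists,
-- in the spirit of infra/SortedList.hs (names are illustrative).
insertS :: Ord a => a -> [a] -> [a]
insertS x [] = [x]
insertS x ys@(y:ys')
  | x < y     = x : ys
  | x == y    = ys                -- already present: stay duplicate-free
  | otherwise = y : insertS x ys'

memberS :: Ord a => a -> [a] -> Bool
memberS x = go
  where
    go [] = False
    go (y:ys)
      | y < x     = go ys
      | otherwise = y == x        -- sortedness lets us stop early

unionS :: Ord a => [a] -> [a] -> [a]
unionS xs [] = xs
unionS [] ys = ys
unionS xs@(x:xs') ys@(y:ys')
  | x < y     = x : unionS xs' ys
  | x > y     = y : unionS xs ys'
  | otherwise = x : unionS xs' ys'
```

The tree-based modules (RedBlack.hs, Map.hs) trade this simplicity for logarithmic insertion and lookup.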
newparsing/CFGrammar.hs type definitions for context-free grammars
newparsing/CFParserGeneral.hs several variants of general CFG chart parsing
newparsing/CFParserIncremental.hs several variants of incremental (Earley-style) CFG chart parsing
newparsing/ConvertGFCtoMCFG.hs converting GFC to MCFG
newparsing/ConvertGrammar.hs conversions between different grammar formats
newparsing/ConvertMCFGtoCFG.hs converting MCFG to CFG
newparsing/GeneralChart.hs Haskell framework for "parsing as deduction"
newparsing/GrammarTypes.hs instantiations of grammar types
newparsing/IncrementalChart.hs Haskell framework for incremental chart parsing
newparsing/MCFGrammar.hs type definitions for multiple CFG
newparsing/MCFParserBasic.hs MCFG chart parser
newparsing/MCFRange.hs ranges for MCFG parsing
newparsing/ParseCFG.hs parsing of CFG
newparsing/ParseCF.hs parsing of the CF format
newparsing/ParseGFC.hs parsing of GFC
newparsing/ParseMCFG.hs parsing of MCFG
newparsing/Parser.hs general definitions for parsers
newparsing/PrintParser.hs pretty-printing class for parsers
newparsing/PrintSimplifiedTerm.hs simplified pretty-printing for GFC terms
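The framework modules above (GeneralChart.hs, CFParserGeneral.hs) organize parsing as deduction over chart items. As a very small illustration of the core idea — not the actual types or algorithms of these modules — here is a naive CKY-style recognizer for a binary CFG, where cell (i,j) of the chart collects the categories spanning words i..j-1:

```haskell
import Data.List (nub)

-- A CFG in Chomsky-like form: binary rules A -> B C and lexical rules A -> w.
-- (Toy model; the newparsing modules use their own grammar types.)
data Rule = Bin String String String | Lex String String

-- Naive CKY-style recognizer; Haskell's laziness memoises the chart.
recognize :: [Rule] -> String -> [String] -> Bool
recognize rules start ws = start `elem` cell 0 n
  where
    n = length ws
    table = [ [ build i j | j <- [0 .. n] ] | i <- [0 .. n] ]
    cell i j = table !! i !! j
    build i j
      | j == i + 1 = [ a | Lex a w <- rules, w == ws !! i ]
      | j > i + 1  = nub [ a | k <- [i + 1 .. j - 1]
                             , Bin a b c <- rules
                             , b `elem` cell i k
                             , c `elem` cell k j ]
      | otherwise  = []
```

The real framework generalizes this: items and inference rules are parameters, which is what lets the same chart engine drive CFG, MCFG, and incremental (Earley-style) parsing.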
notrace/Tracing.hs tracing predicates when we DON'T want tracing capabilities (normal case)
parsers/ParGFC.hs [AUTO]
parsers/ParGF.hs [AUTO]
shell/CommandF.hs
shell/CommandL.hs line-based syntax of editor commands
shell/Commands.hs commands of GF editor shell
shell/IDE.hs
shell/JGF.hs
shell/PShell.hs
shell/ShellCommands.hs commands of GF main shell
shell/Shell.hs
shell/SubShell.hs
shell/TeachYourself.hs
source/AbsGF.hs [AUTO]
source/ErrM.hs
source/GF.cf [LBNF] source of GF parser
source/GrammarToSource.hs
source/LexGF.hs [AUTO]
source/PrintGF.hs [AUTO]
source/SourceToGrammar.hs
speech/PrGSL.hs
speech/PrJSGF.hs
speech/SRG.hs
speech/TransformCFG.hs
trace/Tracing.hs tracing predicates when we want tracing capabilities
translate/GFT.hs Main module of html-producing batch translator
useGrammar/Custom.hs database for customizable commands
useGrammar/Editing.hs
useGrammar/Generate.hs
useGrammar/GetTree.hs
useGrammar/Information.hs
useGrammar/Linear.hs the linearization algorithm
useGrammar/MoreCustom.hs
useGrammar/Morphology.hs
useGrammar/Paraphrases.hs
useGrammar/Parsing.hs the top-level parsing algorithm
useGrammar/Randomized.hs
useGrammar/RealMoreCustom.hs
useGrammar/Session.hs
useGrammar/TeachYourself.hs
useGrammar/Tokenize.hs lexer definitions (listed in Custom)
useGrammar/Transfer.hs
util/GFDoc.hs utility for producing LaTeX and HTML from GF
util/HelpFile source of ../HelpFile.hs
util/Htmls.hs utility for chopping a HTML document to slides
util/MkHelpFile.hs
util/WriteF.hs

deprecated/HelpFile Normal file

@@ -0,0 +1,693 @@
-- GF help file updated for GF 2.6, 17/6/2006.
-- *: Commands and options marked with * are currently not implemented.
--
-- Each command has a long and a short name, options, and zero or more
-- arguments. Commands are sorted by functionality. The short name is
-- given first.
-- Type "h -all" for full help file, "h <CommandName>" for full help on a command.
-- commands that change the state
i, import: i File
Reads a grammar from File and compiles it into a GF runtime grammar.
Files "include"d in File are read recursively, nubbing repetitions.
If a grammar with the same language name is already in the state,
it is overwritten - but only if compilation succeeds.
The grammar parser depends on the file name suffix:
.gf normal GF source
.gfc canonical GF
.gfr precompiled GF resource
.gfcm multilingual canonical GF
.gfe example-based grammar files (only with the -ex option)
.gfwl multilingual word list (preprocessed to abs + cncs)
.ebnf Extended BNF format
.cf Context-free (BNF) format
.trc TransferCore format
options:
-old old: parse in GF<2.0 format (not necessary)
-v verbose: give lots of messages
-s silent: don't give error messages
-src from source: ignore precompiled gfc and gfr files
-gfc from gfc: use compiled modules whenever they exist
-retain retain operations: read resource modules (needed in comm cc)
-nocf don't build old-style context-free grammar (default without HOAS)
-docf do build old-style context-free grammar (default with HOAS)
-nocheckcirc don't eliminate circular rules from CF
-cflexer build an optimized parser with separate lexer trie
-noemit do not emit code (default with old grammar format)
-o do emit code (default with new grammar format)
-ex preprocess .gfe files if needed
-prob read probabilities from top grammar file (format --# prob Fun Double)
-treebank read a treebank file to memory (xml format)
flags:
-abs set the name used for abstract syntax (with -old option)
-cnc set the name used for concrete syntax (with -old option)
-res set the name used for resource (with -old option)
-path use the (colon-separated) search path to find modules
-optimize select an optimization to override file-defined flags
-conversion select parsing method (values strict|nondet)
-probs read probabilities from file (format (--# prob) Fun Double)
-preproc use a preprocessor on each source file
-noparse read nonparsable functions from file (format --# noparse Funs)
examples:
i English.gf -- ordinary import of Concrete
i -retain german/ParadigmsGer.gf -- import of Resource to test
r, reload: r
Executes the previous import (i) command.
rl, remove_language: rl Language
Takes away the language from the state.
e, empty: e
Takes away all languages and resets all global flags.
sf, set_flags: sf Flag*
The values of the Flags are set for Language. If no language
is specified, the flags are set globally.
examples:
sf -nocpu -- stop showing CPU time
sf -lang=Swe -- make Swe the default concrete
s, strip: s
Prune the state by removing source and resource modules.
dc, define_command: dc Name Anything
Add a new defined command. The Name must start with '%'. Later,
when 'Name X' is used, it expands to Anything with #1 replaced
by X.
Restrictions: Currently at most one argument is possible, and a defined
command cannot appear in a pipe.
To see what definitions are in scope, use help -defs.
examples:
dc %tnp p -cat=NP -lang=Eng #1 | l -lang=Swe -- translate NPs
%tnp "this man" -- translate and parse
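The expansion behind dc is plain textual substitution of #1 by the argument. A toy sketch of that idea (illustrative only, not GF's implementation; the function name is made up):

```haskell
-- Toy expansion of a defined command: every occurrence of "#1"
-- in the stored command line is replaced by the actual argument.
expand1 :: String -> String -> String
expand1 ('#':'1':rest) arg = arg ++ expand1 rest arg
expand1 (c:rest)       arg = c : expand1 rest arg
expand1 []             _   = []
```

So the %tnp example above amounts to expand1 "p -cat=NP -lang=Eng #1 | l -lang=Swe" "\"this man\"" before the resulting pipe is executed.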
dt, define_term: dt Name Tree
Add a constant for a tree. The constant can later be called by
prefixing it with '$'.
Restriction: These terms are not yet usable as subterms.
To see what definitions are in scope, use help -defs.
examples:
p -cat=NP "this man" | dt tm -- define tm as parse result
l -all $tm -- linearize tm in all forms
-- commands that give information about the state
pg, print_grammar: pg
Prints the actual grammar (overridden by the -lang=X flag).
The -printer=X flag sets the format in which the grammar is
written.
N.B. since grammars are compiled when imported, this command
generally does not show the grammar in the same format as the
source. In particular, the -printer=latex is not supported.
Use the command tg -printer=latex File to print the source
grammar in LaTeX.
options:
-utf8 apply UTF8-encoding to the grammar
flags:
-printer
-lang
-startcat -- The start category of the generated grammar.
Only supported by some grammar printers.
examples:
pg -printer=cf -- show the context-free skeleton
pm, print_multigrammar: pm
Prints the current multilingual grammar in .gfcm form.
(Automatically executes the strip command (s) before doing this.)
options:
-utf8 apply UTF8 encoding to the tokens in the grammar
-utf8id apply UTF8 encoding to the identifiers in the grammar
examples:
pm | wf Letter.gfcm -- print the grammar into the file Letter.gfcm
pm -printer=graph | wf D.dot -- then do 'dot -Tps D.dot > D.ps'
vg, visualize_graph: vg
Show the dependency graph of multilingual grammar via dot and gv.
po, print_options: po
Print what modules there are in the state. Also
prints those flag values in the current state that differ from defaults.
pl, print_languages: pl
Prints the names of currently available languages.
pi, print_info: pi Ident
Prints information on the identifier.
-- commands that execute and show the session history
eh, execute_history: eh File
Executes commands in the file.
ph, print_history: ph
Prints the commands issued during the GF session.
The result is readable by the eh command.
examples:
ph | wf foo.hist -- save the history into a file
-- linearization, parsing, translation, and computation
l, linearize: l PattList? Tree
Shows all linearization forms of Tree by the actual grammar
(which is overridden by the -lang flag).
The pattern list has the form [P, ... ,Q] where P,...,Q follow GF
syntax for patterns. All forms that match the pattern list are
generated. Lists that are too short are padded with variables at the end.
Only the -table flag is available if a pattern list is specified.
HINT: see the GF language specification for the syntax of Pattern and Term.
You can also copy and paste parsing results.
options:
-struct bracketed form
-table show parameters (not compatible with -record, -all)
-record record, i.e. explicit GF concrete syntax term (not compatible with -table, -all)
-all show all forms and variants (not compatible with -record, -table)
-multi linearize to all languages (can be combined with the other options)
flags:
-lang linearize in this grammar
-number give this number of forms at most
-unlexer filter output through unlexer
examples:
l -lang=Swe -table -- show full inflection table in Swe
p, parse: p String
Shows all Trees returned for String by the actual
grammar (overridden by the -lang flag), in the category S (overridden
by the -cat flag).
options for batch input:
-lines parse each line of input separately, ignoring empty lines
-all as -lines, but also parse empty lines
-prob rank results by probability
-cut stop after first lexing result leading to parser success
-fail show strings whose parse fails prefixed by #FAIL
-ambiguous show strings that have more than one parse prefixed by #AMBIGUOUS
options for selecting parsing method:
-fcfg parse using a fast variant of MCFG (default is no HOAS in grammar)
-old parse using an overgenerating CFG (default if HOAS in grammar)
-cfg parse using a much less overgenerating CFG
-mcfg parse using an even less overgenerating MCFG
Note: the first parse with -cfg, -mcfg, or -fcfg may take a long time
options that only work for the -old default parsing method:
-n non-strict: tolerates morphological errors
-ign ignore unknown words when parsing
-raw return context-free terms in raw form
-v verbose: give more information if parsing fails
flags:
-cat parse in this category
-lang parse in this grammar
-lexer filter input through this lexer
-parser use this parsing strategy
-number return this many results at most
examples:
p -cat=S -mcfg "jag är gammal" -- parse an S with the MCFG
rf examples.txt | p -lines -- parse each non-empty line of the file
at, apply_transfer: at (Module.Fun | Fun)
Transfer a term using Fun from Module, or the topmost transfer
module. Transfer modules are given in the .trc format. They are
shown by the 'po' command.
flags:
-lang typecheck the result in this lang instead of default lang
examples:
p -lang=Cncdecimal "123" | at num2bin | l -- convert dec to bin
tb, tree_bank: tb
Generate a multilingual treebank from a list of trees (default) or compare
to an existing treebank.
options:
-c compare to existing xml-formatted treebank
-trees return the trees of the treebank
-all show all linearization alternatives (branches and variants)
-table show tables of linearizations with parameters
-record show linearization records
-xml wrap the treebank (or comparison results) with XML tags
-mem write the treebank in memory instead of a file TODO
examples:
gr -cat=S -number=100 | tb -xml | wf tb.xml -- random treebank into file
rf tb.xml | tb -c -- compare-test treebank from file
rf old.xml | tb -trees | tb -xml -- create new treebank from old
ut, use_treebank: ut String
Lookup a string in a treebank and return the resulting trees.
Use 'tb' to create a treebank and 'i -treebank' to read one from
a file.
options:
-assocs show all string-trees associations in the treebank
-strings show all strings in the treebank
-trees show all trees in the treebank
-raw return the lookup result as string, without typechecking it
flags:
-treebank use this treebank (instead of the latest introduced one)
examples:
ut "He adds this to that" | l -multi -- use treebank lookup as parser in translation
ut -assocs | grep "ComplV2" -- show all associations with ComplV2
tt, test_tokenizer: tt String
Show the token list sent to the parser when String is parsed.
HINT: can be useful when debugging the parser.
flags:
-lexer use this lexer
examples:
tt -lexer=codelit "2*(x + 3)" -- a favourite lexer for program code
g, grep: g String1 String2
Grep the String1 in the String2. String2 is read line by line,
and only those lines that contain String1 are returned.
flags:
-v return those lines that do not contain String1.
examples:
pg -printer=cf | grep "mother" -- show cf rules with word mother
cc, compute_concrete: cc Term
Compute a term by concrete syntax definitions. Uses the topmost
resource module (the last in listing by command po) to resolve
constant names.
N.B. You need the flag -retain when importing the grammar, if you want
the oper definitions to be retained after compilation; otherwise this
command does not expand oper constants.
N.B.' The resulting Term is not a term in the sense of abstract syntax,
and hence not a valid input to a Tree-demanding command.
flags:
-table show output in a similar readable format as 'l -table'
-res use another module than the topmost one
examples:
cc -res=ParadigmsFin (nLukko "hyppy") -- inflect "hyppy" with nLukko
so, show_operations: so Type
Show oper operations with the given value type. Uses the topmost
resource module to resolve constant names.
N.B. You need the flag -retain when importing the grammar, if you want
the oper definitions to be retained after compilation; otherwise this
command does not find any oper constants.
N.B.' The value type may not be defined in a supermodule of the
topmost resource. In that case, use the appropriate qualified name.
flags:
-res use another module than the topmost one
examples:
so -res=ParadigmsFin ResourceFin.N -- show N-paradigms in ParadigmsFin
t, translate: t Lang Lang String
Parses String in Lang1 and linearizes the resulting Trees in Lang2.
flags:
-cat
-lexer
-parser
examples:
t Eng Swe -cat=S "every number is even or odd"
gr, generate_random: gr Tree?
Generates a random Tree of a given category. If a Tree
argument is given, the command completes the Tree with values to
the metavariables in the tree.
options:
-prob use probabilities (works for nondep types only)
-cf use a very fast method (works for nondep types only)
flags:
-cat generate in this category
-lang use the abstract syntax of this grammar
-number generate this number of trees (not impl. with Tree argument)
-depth use this number of search steps at most
examples:
gr -cat=Query -- generate in category Query
gr (PredVP ? (NegVG ?)) -- generate a random tree of this form
gr -cat=S -tr | l -- generate and linearize
gt, generate_trees: gt Tree?
Generates all trees up to a given depth. If the depth is large,
a small -alts is recommended. If a Tree argument is given, the
command completes the Tree with values to the metavariables in
the tree.
options:
-metas also return trees that include metavariables
-all generate all (can be infinitely many, lazily)
-lin linearize result of -all (otherwise, use pipe to linearize)
flags:
-depth generate to this depth (default 3)
-atoms take this number of atomic rules of each category (default unlimited)
-alts take this number of alternatives at each branch (default unlimited)
-cat generate in this category
-nonub don't remove duplicates (faster, not effective with -mem)
-mem use a memoising algorithm (often faster, usually more memory-consuming)
-lang use the abstract syntax of this grammar
-number generate (at most) this number of trees (also works with -all)
-noexpand don't expand these categories (comma-separated, e.g. -noexpand=V,CN)
-doexpand only expand these categories (comma-separated, e.g. -doexpand=V,CN)
examples:
gt -depth=10 -cat=NP -- generate all NP's to depth 10
gt (PredVP ? (NegVG ?)) -- generate all trees of this form
gt -cat=S -tr | l -- generate and linearize
gt -noexpand=NP | l -mark=metacat -- the only NP is meta, linearized "?0 +NP"
gt | l | p -lines -ambiguous | grep "#AMBIGUOUS" -- show ambiguous strings
ma, morphologically_analyse: ma String
Runs morphological analysis on each word in String and displays
the results line by line.
options:
-short show analyses in bracketed words, instead of separate lines
-status show just the word at success, prefixed with "*" at failure
flags:
-lang
examples:
wf Bible.txt | ma -short | wf Bible.tagged -- analyse the Bible
-- elementary generation of Strings and Trees
ps, put_string: ps String
Returns its argument String, like Unix echo.
HINT. The strength of ps comes from the possibility of receiving the
argument from a pipeline and altering it with the -filter flag.
flags:
-filter filter the result through this string processor
-length cut the string after this number of characters
examples:
gr -cat=Letter | l | ps -filter=text -- random letter as text
pt, put_tree: pt Tree
Returns its argument Tree, like a specialized Unix echo.
HINT. The strength of pt comes from the possibility of receiving
the argument from a pipeline and altering it with the -transform flag.
flags:
-transform transform the result by this term processor
-number generate this number of terms at most
examples:
p "zero is even" | pt -transform=solve -- solve ?'s in parse result
* st, show_tree: st Tree
Prints the tree as a string. Unlike pt, this command cannot be
used in a pipe to produce a tree, since its output is a string.
flags:
-printer show the tree in a special format (-printer=xml supported)
wt, wrap_tree: wt Fun
Wraps the tree as the sole argument of Fun.
flags:
-c compute the resulting new tree to normal form
vt, visualize_tree: vt Tree
Shows the abstract syntax tree via dot and gv (via temporary files
grphtmp.dot, grphtmp.ps).
flags:
-c show categories only (no functions)
-f show functions only (no categories)
-g show as graph (sharing uses of the same function)
-o just generate the .dot file
examples:
p "hello world" | vt -o | wf my.dot ;; ! open -a GraphViz my.dot
-- This writes the parse tree into my.dot and opens the .dot file
-- with another application without generating .ps.
-- subshells
es, editing_session: es
Opens an interactive editing session.
N.B. Exit from a Fudget session is to the Unix shell, not to GF.
options:
-f Fudget GUI (necessary for Unicode; only available in X Window System)
ts, translation_session: ts
Translates input lines from any of the actual languages to all other ones.
To exit, type a full stop (.) alone on a line.
N.B. Exit from a Fudget session is to the Unix shell, not to GF.
HINT: Set -parser and -lexer locally in each grammar.
options:
-f Fudget GUI (necessary for Unicode; only available in X Windows)
-lang prepend translation results with language names
flags:
-cat the parser category
examples:
ts -cat=Numeral -lang -- translate numerals, show language names
tq, translation_quiz: tq Lang Lang
Random-generates translation exercises from Lang1 to Lang2,
keeping score of success.
To interrupt, type a full stop (.) alone on a line.
HINT: Set -parser and -lexer locally in each grammar.
flags:
-cat
examples:
tq -cat=NP TestResourceEng TestResourceSwe -- quiz for NPs
tl, translation_list: tl Lang Lang
Random-generates a list of ten translation exercises from Lang1
to Lang2. The number can be changed by a flag.
HINT: use wf to save the exercises in a file.
flags:
-cat
-number
examples:
tl -cat=NP TestResourceEng TestResourceSwe -- quiz list for NPs
mq, morphology_quiz: mq
Random-generates morphological exercises,
keeping score of success.
To interrupt, type a full stop (.) alone on a line.
HINT: use printname judgements in your grammar to
produce nice expressions for desired forms.
flags:
-cat
-lang
examples:
mq -cat=N -lang=TestResourceSwe -- quiz for Swedish nouns
ml, morphology_list: ml
Random-generates a list of ten morphological exercises,
keeping score of success. The number can be changed with a flag.
HINT: use wf to save the exercises in a file.
flags:
-cat
-lang
-number
examples:
ml -cat=N -lang=TestResourceSwe -- quiz list for Swedish nouns
-- IO related commands
rf, read_file: rf File
Returns the contents of File as a String; error if File does not exist.
wf, write_file: wf File String
Writes String into File; File is created if it does not exist.
N.B. the command overwrites File without a warning.
af, append_file: af File String
Appends String to the end of File; File is created if it does not exist.
* tg, transform_grammar: tg File
Reads File, parses as a grammar,
but instead of compiling further, prints it.
The environment is not changed. When parsing the grammar, the same file
name suffixes are supported as in the i command.
HINT: use this command to print the grammar in
another format (the -printer flag); pipe it to wf to save this format.
flags:
-printer (only -printer=latex supported currently)
* cl, convert_latex: cl File
Reads File, which is expected to be in LaTeX form.
Three environments are treated in special ways:
\begGF - \end{verbatim}, which contains GF judgements,
\begTGF - \end{verbatim}, which contains a GF expression (displayed)
\begInTGF - \end{verbatim}, which contains a GF expression (inlined).
Moreover, certain macros should be included in the file; you can
get those macros by applying 'tg -printer=latex foo.gf' to any grammar
foo.gf. Notice that the same File can be imported as a GF grammar,
consisting of all the judgements in \begGF environments.
HINT: pipe with 'wf Foo.tex' to generate a new Latex file.
sa, speak_aloud: sa String
Uses the Flite speech generator to produce speech for String.
Works for American English spelling.
examples:
h | sa -- listen to the list of commands
gr -cat=S | l | sa -- generate a random sentence and speak it aloud
si, speech_input: si
Uses an ATK speech recognizer to get speech input.
flags:
-lang: The grammar to use with the speech recognizer.
-cat: The grammar category to get input in.
-language: Use acoustic model and dictionary for this language.
-number: The number of utterances to recognize.
h, help: h Command?
Displays the paragraph concerning the command from this help file.
Without the argument, shows the first lines of all paragraphs.
options:
-all show the whole help file
-defs show user-defined commands and terms
-FLAG show the values of FLAG (works for grammar-independent flags)
examples:
h print_grammar -- show all information on the pg command
q, quit: q
Exits GF.
HINT: you can use 'ph | wf history' to save your session.
!, system_command: ! String
Issues a system command. No value is returned to GF.
example:
! ls
?, system_command: ? String
Issues a system command that receives its arguments from GF pipe
and returns a value to GF.
example:
h | ? 'wc -l' | p -cat=Num
-- Flags. The availability of flags is defined separately for each command.
-cat, category in which parsing is performed.
The default is S.
-depth, the search depth in e.g. random generation.
The default depends on application.
-filter, operation performed on a string. The default is identity.
-filter=identity no change
-filter=erase erase the text
-filter=take100 show the first 100 characters
-filter=length show the length of the string
-filter=text format as text (punctuation, capitalization)
-filter=code format as code (spacing, indentation)
-lang, grammar used when executing a grammar-dependent command.
The default is the last-imported grammar.
-language, voice used by Festival as its --language flag in the sa command.
The default is system-dependent.
-length, the maximum number of characters shown of a string.
The default is unlimited.
-lexer, tokenization transforming a string into lexical units for a parser.
The default is words.
-lexer=words tokens are separated by spaces or newlines
-lexer=literals like words, but GF integer and string literals recognized
-lexer=vars like words, but "x","x_...","$...$" as vars, "?..." as meta
-lexer=chars each character is a token
-lexer=code use Haskell's lex
-lexer=codevars like code, but treat unknown words as variables, ?? as meta
-lexer=textvars like text, but treat unknown words as variables, ?? as meta
-lexer=text with conventions on punctuation and capital letters
-lexer=codelit like code, but treat unknown words as string literals
-lexer=textlit like text, but treat unknown words as string literals
-lexer=codeC use a C-like lexer
-lexer=ignore like literals, but ignore unknown words
-lexer=subseqs like ignore, but then try all subsequences from longest
-number, the maximum number of generated items in a list.
The default is unlimited.
-optimize, optimization on generated code.
The default is share for concrete, none for resource modules.
Each of the flags can have the suffix _subs, which performs
common subexpression elimination after the main optimization.
Thus, -optimize=all_subs is the most aggressive one. The _subs
strategy only works in GFC, and applies therefore in concrete but
not in resource modules.
-optimize=share share common branches in tables
-optimize=parametrize first try parametrize then do share with the rest
-optimize=values represent tables as courses-of-values
-optimize=all first try parametrize then do values with the rest
-optimize=none no optimization
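The -optimize=share strategy merges branches of an inflection table that carry the same value, so each distinct linearization is stored once. A toy model of that idea, assuming a table flattened to (parameter, value) pairs (the function is hypothetical, not the compiler's actual representation):

```haskell
import Data.List (groupBy, sortOn)

-- Group parameter values that map to the same linearization,
-- as -optimize=share does for branches of a table (toy model).
shareTable :: Ord v => [(p, v)] -> [([p], v)]
shareTable tbl =
  [ (map fst grp, snd (head grp))
  | grp <- groupBy (\a b -> snd a == snd b) (sortOn snd tbl) ]
```

An English noun table [(Sg,"car"),(Pl,"cars")] stays as two branches, while a table whose forms coincide collapses to one shared branch; _subs then goes further by sharing subexpressions across tables.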
-parser, parsing strategy. The default is chart. If -cfg or -mcfg are
selected, only bottomup and topdown are recognized.
-parser=chart bottom-up chart parsing
-parser=bottomup a more up to date bottom-up strategy
-parser=topdown top-down strategy
-parser=old an old bottom-up chart parser
-printer, format in which the grammar is printed. The default is
gfc. Those marked with M are (only) available for pm, the rest
for pg.
-printer=gfc GFC grammar
-printer=gf GF grammar
-printer=old old GF grammar (file readable by GF 1.2)
-printer=cf context-free grammar, with profiles
-printer=bnf context-free grammar, without profiles
-printer=lbnf labelled context-free grammar for BNF Converter
-printer=plbnf grammar for BNF Converter, with precedence levels
*-printer=happy source file for Happy parser generator (use lbnf!)
-printer=haskell abstract syntax in Haskell, with transl to/from GF
-printer=haskell_gadt abstract syntax GADT in Haskell, with transl to/from GF
-printer=morpho full-form lexicon, long format
*-printer=latex LaTeX file (for the tg command)
-printer=fullform full-form lexicon, short format
*-printer=xml XML: DTD for the pg command, object for st
-printer=stat show some statistics of generated GFC
-printer=probs show probabilities of all functions
-printer=gsl Nuance GSL speech recognition grammar
-printer=jsgf Java Speech Grammar Format
-printer=jsgf_sisr_old Java Speech Grammar Format with semantic tags in
SISR WD 20030401 format
-printer=srgs_abnf SRGS ABNF format
-printer=srgs_abnf_non_rec SRGS ABNF format, without any recursion.
-printer=srgs_abnf_sisr_old SRGS ABNF format, with semantic tags in
SISR WD 20030401 format
-printer=srgs_xml SRGS XML format
-printer=srgs_xml_non_rec SRGS XML format, without any recursion.
-printer=srgs_xml_prob SRGS XML format, with weights
-printer=srgs_xml_sisr_old SRGS XML format, with semantic tags in
SISR WD 20030401 format
-printer=vxml Generate a dialogue system in VoiceXML.
-printer=slf a finite automaton in the HTK SLF format
-printer=slf_graphviz the same automaton as slf, but in Graphviz format
-printer=slf_sub a finite automaton with sub-automata in the
HTK SLF format
-printer=slf_sub_graphviz the same automaton as slf_sub, but in
Graphviz format
-printer=fa_graphviz a finite automaton with labelled edges
-printer=regular a regular grammar in a simple BNF
-printer=unpar a gfc grammar with parameters eliminated
-printer=functiongraph abstract syntax functions in 'dot' format
-printer=typegraph abstract syntax categories in 'dot' format
-printer=transfer Transfer language datatype (.tr file format)
-printer=cfg-prolog M cfg in prolog format (also pg)
-printer=gfc-prolog M gfc in prolog format (also pg)
-printer=gfcm M gfcm file (default for pm)
-printer=graph M module dependency graph in 'dot' (graphviz) format
-printer=header M gfcm file with header (for GF embedded in Java)
-printer=js M JavaScript type annotator and linearizer
-printer=mcfg-prolog M mcfg in prolog format (also pg)
-printer=missing M the missing linearizations of each concrete
-startcat, like -cat, but used in grammars (to avoid clash with keyword cat)
-transform, transformation performed on a syntax tree. The default is identity.
-transform=identity no change
-transform=compute compute by using definitions in the grammar
-transform=nodup return the term only if it has no constants duplicated
-transform=nodupatom return the term only if it has no atomic constants duplicated
-transform=typecheck return the term only if it is type-correct
-transform=solve solve metavariables as derived refinements
-transform=context solve metavariables by unique refinements as variables
-transform=delete replace the term by a metavariable
-unlexer, untokenization transforming linearization output into a string.
The default is unwords.
-unlexer=unwords space-separated token list (like unwords)
-unlexer=text format as text: punctuation, capitals, paragraph <p>
-unlexer=code format as code (spacing, indentation)
-unlexer=textlit like text, but remove string literal quotes
-unlexer=codelit like code, but remove string literal quotes
-unlexer=concat remove all spaces
-unlexer=bind like identity, but bind at "&+"
-mark, marking of parts of tree in linearization. The default is none.
-mark=metacat append "+CAT" to every metavariable, showing its category
-mark=struct show tree structure with brackets
-mark=java show tree structure with XML tags (used in gfeditor)
-coding, character encoding. Some grammars are in UTF-8, some in isolatin-1.
If the letters ä (a-umlaut) and ö (o-umlaut) look strange, either
change your terminal to isolatin-1, or rewrite the grammar with
'pg -utf8'.
-- *: Commands and options marked with * are not currently implemented.

deprecated/Makefile Normal file
@@ -0,0 +1,250 @@
include config.mk
GHMAKE=$(GHC) --make
GHCXMAKE=ghcxmake
GHCFLAGS+= -fglasgow-exts
GHCOPTFLAGS=-O2
GHCFUDFLAG=
DIST_DIR=GF-$(PACKAGE_VERSION)
NOT_IN_DIST= \
grammars \
download \
doc/release2.html \
src/tools/AlphaConvGF.hs
BIN_DIST_DIR=$(DIST_DIR)-$(host)
GRAMMAR_PACKAGE_VERSION=$(shell date +%Y%m%d)
GRAMMAR_DIST_DIR=gf-grammars-$(GRAMMAR_PACKAGE_VERSION)
MSI_FILE=gf-$(subst .,_,$(PACKAGE_VERSION)).msi
GF_DATA_DIR=$(datadir)/GF-$(PACKAGE_VERSION)
GF_LIB_DIR=$(GF_DATA_DIR)/lib
EMBED = GF/Embed/TemplateApp
# use the temporary binary file name 'gf-bin' to not clash with directory 'GF'
# on case insensitive file systems (such as FAT)
GF_EXE=gf$(EXEEXT)
GF_EXE_TMP=gf-bin$(EXEEXT)
GF_DOC_EXE=gfdoc$(EXEEXT)
ifeq ("$(READLINE)","readline")
GHCFLAGS += -package readline -DUSE_READLINE
endif
ifneq ("$(CPPFLAGS)","")
GHCFLAGS += $(addprefix -optP, $(CPPFLAGS))
endif
ifneq ("$(LDFLAGS)","")
GHCFLAGS += $(addprefix -optl, $(LDFLAGS))
endif
ifeq ("$(INTERRUPT)","yes")
GHCFLAGS += -DUSE_INTERRUPT
endif
ifeq ("$(ATK)","yes")
GHCFLAGS += -DUSE_ATK
endif
ifeq ("$(ENABLE_JAVA)", "yes")
BUILD_JAR=jar
else
BUILD_JAR=
endif
.PHONY: all unix jar tags gfdoc windows install install-gf \
lib temp install-gfdoc \
today help clean windows-msi dist gfc
all: unix gfc lib
static: GHCFLAGS += -optl-static
static: unix
gf: unix
unix: today opt
windows: unix
temp: today noopt
build:
$(GHMAKE) $(GHCFLAGS) GF.hs -o $(GF_EXE_TMP)
strip $(GF_EXE_TMP)
mv $(GF_EXE_TMP) ../bin/$(GF_EXE)
opt: GHCFLAGS += $(GHCOPTFLAGS)
opt: build
embed: GHCFLAGS += $(GHCOPTFLAGS)
embed:
$(GHMAKE) $(GHCFLAGS) $(EMBED) -o $(EMBED)
strip $(EMBED)
noopt: build
clean:
find . '(' -name '*~' -o -name '*.hi' -o -name '*.ghi' -o -name '*.o' ')' -exec rm -f '{}' ';'
-rm -f gf.wixobj
-rm -f ../bin/$(GF_EXE)
$(MAKE) -C tools/c clean
$(MAKE) -C ../lib/c clean
-rm -f ../bin/gfcc2c
distclean: clean
-rm -f tools/$(GF_DOC_EXE)
-rm -f config.status config.mk config.log
-rm -f *.tgz *.zip
-rm -rf $(DIST_DIR) $(BIN_DIST_DIR)
-rm -rf gf.wxs *.msi
today:
echo 'module Paths_gf (version, getDataDir) where' > Paths_gf.hs
echo 'import Data.Version' >> Paths_gf.hs
echo '{-# NOINLINE version #-}' >> Paths_gf.hs
echo 'version :: Version' >> Paths_gf.hs
echo 'version = Version {versionBranch = [3,0], versionTags = ["beta3"]}' >> Paths_gf.hs
echo 'getDataDir = return "$(GF_DATA_DIR)" :: IO FilePath' >> Paths_gf.hs
showflags:
@echo $(GHCFLAGS)
# added by peb:
tracing: GHCFLAGS += -DTRACING
tracing: temp
ghci-trace: GHCFLAGS += -DTRACING
ghci-trace: ghci
#touch-files:
# rm -f GF/System/Tracing.{hi,o}
# touch GF/System/Tracing.hs
# profiling
prof: GHCOPTFLAGS += -prof -auto-all
prof: unix
tags:
find GF Transfer -name '*.hs' | xargs hasktags
#
# Help file
#
tools/MkHelpFile: tools/MkHelpFile.hs
$(GHMAKE) -o $@ $^
help: GF/Shell/HelpFile.hs
GF/Shell/HelpFile.hs: tools/MkHelpFile HelpFile
tools/MkHelpFile
#
# Tools
#
gfdoc: tools/$(GF_DOC_EXE)
tools/$(GF_DOC_EXE): tools/GFDoc.hs
$(GHMAKE) $(GHCOPTFLAGS) -o $@ $^
gfc: gf
echo GFC!
cp -f gfc ../bin/
chmod a+x ../bin/gfc
gfcc2c:
$(MAKE) -C tools/c
$(MAKE) -C ../lib/c
mv tools/c/gfcc2c ../bin
#
# Resource grammars
#
lib:
$(MAKE) -C ../lib/resource clean all
#
# Distribution
#
dist:
-rm -rf $(DIST_DIR)
darcs dist --dist-name=$(DIST_DIR)
tar -zxf ../$(DIST_DIR).tar.gz
rm ../$(DIST_DIR).tar.gz
cd $(DIST_DIR)/src && perl -pi -e "s/^AC_INIT\(\[GF\],\[[^\]]*\]/AC_INIT([GF],[$(PACKAGE_VERSION)]/" configure.ac
cd $(DIST_DIR)/src && autoconf && rm -rf autom4te.cache
# cd $(DIST_DIR)/grammars && sh mkLib.sh
cd $(DIST_DIR) && rm -rf $(NOT_IN_DIST)
$(TAR) -zcf $(DIST_DIR).tgz $(DIST_DIR)
rm -rf $(DIST_DIR)
snapshot: PACKAGE_VERSION=$(shell date +%Y%m%d)
snapshot: DIST_DIR=GF-$(PACKAGE_VERSION)
snapshot: dist
rpm: dist
rpmbuild -ta $(DIST_DIR).tgz
binary-dist:
rm -rf $(BIN_DIST_DIR)
mkdir $(BIN_DIST_DIR)
mkdir $(BIN_DIST_DIR)/lib
./configure --host="$(host)" --build="$(build)"
$(MAKE) gfc gfdoc
$(INSTALL) ../bin/$(GF_EXE) tools/$(GF_DOC_EXE) $(BIN_DIST_DIR)
$(INSTALL) configure config.guess config.sub install-sh config.mk.in $(BIN_DIST_DIR)
$(INSTALL) gfc.in $(BIN_DIST_DIR)
$(INSTALL) -m 0644 ../README ../LICENSE $(BIN_DIST_DIR)
$(INSTALL) -m 0644 INSTALL.binary $(BIN_DIST_DIR)/INSTALL
$(INSTALL) -m 0644 Makefile.binary $(BIN_DIST_DIR)/Makefile
# $(TAR) -C $(BIN_DIST_DIR)/lib -zxf ../lib/compiled.tgz
$(TAR) -zcf GF-$(PACKAGE_VERSION)-$(host).tgz $(BIN_DIST_DIR)
rm -rf $(BIN_DIST_DIR)
grammar-dist:
-rm -rf $(GRAMMAR_DIST_DIR)
mkdir $(GRAMMAR_DIST_DIR)
cp -r ../_darcs/current/{lib,examples} $(GRAMMAR_DIST_DIR)
$(MAKE) GF_LIB_PATH=.. -C $(GRAMMAR_DIST_DIR)/lib/resource-1.0 show-path prelude present alltenses mathematical api multimodal langs
$(TAR) -zcf $(GRAMMAR_DIST_DIR).tgz $(GRAMMAR_DIST_DIR)
rm -rf $(GRAMMAR_DIST_DIR)
gf.wxs: config.status gf.wxs.in
./config.status --file=$@
windows-msi: gf.wxs
candle -nologo gf.wxs
light -nologo -o $(MSI_FILE) gf.wixobj
#
# Installation
#
install: install-gf install-gfdoc install-lib
install-gf:
$(INSTALL) -d $(bindir)
$(INSTALL) ../bin/$(GF_EXE) $(bindir)
install-gfdoc:
$(INSTALL) -d $(bindir)
$(INSTALL) tools/$(GF_DOC_EXE) $(bindir)
install-lib:
$(INSTALL) -d $(GF_LIB_DIR)
$(TAR) -C $(GF_LIB_DIR) -zxf ../lib/compiled.tgz

@@ -0,0 +1,20 @@
include config.mk
GF_DATA_DIR=$(datadir)/GF-$(PACKAGE_VERSION)
GF_LIB_DIR=$(GF_DATA_DIR)/lib
.PHONY: install uninstall
install:
$(INSTALL) -d $(bindir)
$(INSTALL) gf$(EXEEXT) gfdoc$(EXEEXT) $(bindir)
$(INSTALL) gfc$(EXEEXT) $(bindir)
$(INSTALL) -d $(GF_DATA_DIR)
cp -r lib $(GF_DATA_DIR)
uninstall:
-rm -f $(bindir)/gf$(EXEEXT) $(bindir)/gfdoc$(EXEEXT)
-rm -f $(GF_LIB_DIR)/*/*.gf{o}
-rmdir $(GF_LIB_DIR)/*
-rmdir $(GF_LIB_DIR)
-rmdir $(GF_DATA_DIR)

deprecated/PGF/doc/Eng.gf Normal file
@@ -0,0 +1,13 @@
concrete Eng of Ex = {
lincat
S = {s : Str} ;
NP = {s : Str ; n : Num} ;
VP = {s : Num => Str} ;
param
Num = Sg | Pl ;
lin
Pred np vp = {s = np.s ++ vp.s ! np.n} ;
She = {s = "she" ; n = Sg} ;
They = {s = "they" ; n = Pl} ;
Sleep = {s = table {Sg => "sleeps" ; Pl => "sleep"}} ;
}

deprecated/PGF/doc/Ex.gf Normal file
@@ -0,0 +1,8 @@
abstract Ex = {
cat
S ; NP ; VP ;
fun
Pred : NP -> VP -> S ;
She, They : NP ;
Sleep : VP ;
}

deprecated/PGF/doc/Swe.gf Normal file
@@ -0,0 +1,13 @@
concrete Swe of Ex = {
lincat
S = {s : Str} ;
NP = {s : Str} ;
VP = {s : Str} ;
param
Num = Sg | Pl ;
lin
Pred np vp = {s = np.s ++ vp.s} ;
She = {s = "hon"} ;
They = {s = "de"} ;
Sleep = {s = "sover"} ;
}

@@ -0,0 +1,64 @@
-- to test GFCC compilation
flags coding=utf8 ;
cat S ; NP ; N ; VP ;
fun Pred : NP -> VP -> S ;
fun Pred2 : NP -> VP -> NP -> S ;
fun Det, Dets : N -> NP ;
fun Mina, Sina, Me, Te : NP ;
fun Raha, Paska, Pallo : N ;
fun Puhua, Munia, Sanoa : VP ;
param Person = P1 | P2 | P3 ;
param Number = Sg | Pl ;
param Case = Nom | Part ;
param NForm = NF Number Case ;
param VForm = VF Number Person ;
lincat N = Noun ;
lincat VP = Verb ;
oper Noun = {s : NForm => Str} ;
oper Verb = {s : VForm => Str} ;
lincat NP = {s : Case => Str ; a : {n : Number ; p : Person}} ;
lin Pred np vp = {s = np.s ! Nom ++ vp.s ! VF np.a.n np.a.p} ;
lin Pred2 np vp ob = {s = np.s ! Nom ++ vp.s ! VF np.a.n np.a.p ++ ob.s ! Part} ;
lin Det no = {s = \\c => no.s ! NF Sg c ; a = {n = Sg ; p = P3}} ;
lin Dets no = {s = \\c => no.s ! NF Pl c ; a = {n = Pl ; p = P3}} ;
lin Mina = {s = table Case ["minä" ; "minua"] ; a = {n = Sg ; p = P1}} ;
lin Te = {s = table Case ["te" ; "teitä"] ; a = {n = Pl ; p = P2}} ;
lin Sina = {s = table Case ["sinä" ; "sinua"] ; a = {n = Sg ; p = P2}} ;
lin Me = {s = table Case ["me" ; "meitä"] ; a = {n = Pl ; p = P1}} ;
lin Raha = mkN "raha" ;
lin Paska = mkN "paska" ;
lin Pallo = mkN "pallo" ;
lin Puhua = mkV "puhu" ;
lin Munia = mkV "muni" ;
lin Sanoa = mkV "sano" ;
oper mkN : Str -> Noun = \raha -> {
s = table {
NF Sg Nom => raha ;
NF Sg Part => raha + "a" ;
NF Pl Nom => raha + "t" ;
NF Pl Part => Predef.tk 1 raha + "oja"
}
} ;
oper mkV : Str -> Verb = \puhu -> {
s = table {
VF Sg P1 => puhu + "n" ;
VF Sg P2 => puhu + "t" ;
VF Sg P3 => puhu + Predef.dp 1 puhu ;
VF Pl P1 => puhu + "mme" ;
VF Pl P2 => puhu + "tte" ;
VF Pl P3 => puhu + "vat"
}
} ;

@@ -0,0 +1,809 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META NAME="generator" CONTENT="http://txt2tags.sf.net">
<TITLE>The GFCC Grammar Format</TITLE>
</HEAD><BODY BGCOLOR="white" TEXT="black">
<P ALIGN="center"><CENTER><H1>The GFCC Grammar Format</H1>
<FONT SIZE="4">
<I>Aarne Ranta</I><BR>
October 5, 2007
</FONT></CENTER>
<P>
Author's address:
<A HREF="http://www.cs.chalmers.se/~aarne"><CODE>http://www.cs.chalmers.se/~aarne</CODE></A>
</P>
<P>
History:
</P>
<UL>
<LI>5 Oct 2007: new, better structured GFCC with full expressive power
<LI>19 Oct: translation of lincats, new figures on C++
<LI>3 Oct 2006: first version
</UL>
<H2>What is GFCC</H2>
<P>
GFCC is a low-level format for GF grammars. Its aim is to contain the minimum
that is needed to process GF grammars at runtime. This minimality has three
advantages:
</P>
<UL>
<LI>compact grammar files and run-time objects
<LI>time and space efficient processing
<LI>simple definition of interpreters
</UL>
<P>
Thus we also want to call GFCC the <B>portable grammar format</B>.
</P>
<P>
The idea is that all embedded GF applications use GFCC.
The GF system would be primarily used as a compiler and as a grammar
development tool.
</P>
<P>
Since GFCC is implemented in BNFC, a parser of the format is readily
available for C, C++, C#, Haskell, Java, and OCaml. Also an XML
representation can be generated in BNFC. A
<A HREF="../">reference implementation</A>
of linearization and some other functions has been written in Haskell.
</P>
<H2>GFCC vs. GFC</H2>
<P>
GFCC is aimed to replace GFC as the run-time grammar format. GFC was designed
to be a run-time format, but also to
support separate compilation of grammars, i.e.
to store the results of compiling
individual GF modules. But this means that GFC has to contain extra information,
such as type annotations, which is only needed in compilation and not at
run-time. In particular, the pattern matching syntax and semantics of GFC is
complex and therefore difficult to implement in new platforms.
</P>
<P>
In fact, GFC is also planned to be dropped as the target format of
separate compilation, where plain GF (type-annotated and partially evaluated)
will be used instead. GFC provides only marginal advantages over GF
as a target format, so carrying the format around is just
extra weight.
<P>
The main differences of GFCC compared with GFC (and GF) can be summarized as follows:
</P>
<UL>
<LI>there are no modules, and therefore no qualified names
<LI>a GFCC grammar is multilingual, and consists of a common abstract syntax
together with one concrete syntax per language
<LI>records and tables are replaced by arrays
<LI>record labels and parameter values are replaced by integers
<LI>record projection and table selection are replaced by array indexing
<LI>even though the format does support dependent types and higher-order abstract
syntax, there is no interpreter yet that supports them
</UL>
<P>
Here is an example of a GF grammar, consisting of three modules,
as translated to GFCC. The representations are aligned; thus they do not completely
reflect the order of judgements in GFCC files, which have different orders of
blocks of judgements, and alphabetical sorting.
</P>
<PRE>
grammar Ex(Eng,Swe);
abstract Ex = { abstract {
cat cat
S ; NP ; VP ; NP[]; S[]; VP[];
fun fun
Pred : NP -&gt; VP -&gt; S ; Pred=[(($ 0! 1),(($ 1! 0)!($ 0! 0)))];
She, They : NP ; She=[0,"she"];
Sleep : VP ; They=[1,"they"];
Sleep=[["sleeps","sleep"]];
} } ;
concrete Eng of Ex = { concrete Eng {
lincat lincat
S = {s : Str} ; S=[()];
NP = {s : Str ; n : Num} ; NP=[1,()];
VP = {s : Num =&gt; Str} ; VP=[[(),()]];
param
Num = Sg | Pl ;
lin lin
Pred np vp = { Pred=[(($ 0! 1),(($ 1! 0)!($ 0! 0)))];
s = np.s ++ vp.s ! np.n} ;
She = {s = "she" ; n = Sg} ; She=[0,"she"];
They = {s = "they" ; n = Pl} ; They = [1, "they"];
Sleep = {s = table { Sleep=[["sleeps","sleep"]];
Sg =&gt; "sleeps" ;
Pl =&gt; "sleep"
}
} ;
} } ;
concrete Swe of Ex = { concrete Swe {
lincat lincat
S = {s : Str} ; S=[()];
NP = {s : Str} ; NP=[()];
VP = {s : Str} ; VP=[()];
param
Num = Sg | Pl ;
lin lin
Pred np vp = { Pred = [(($0!0),($1!0))];
s = np.s ++ vp.s} ;
She = {s = "hon"} ; She = ["hon"];
They = {s = "de"} ; They = ["de"];
Sleep = {s = "sover"} ; Sleep = ["sover"];
} } ;
</PRE>
<P></P>
<H2>The syntax of GFCC files</H2>
<P>
The complete BNFC grammar, from which
the rules in this section are taken, is in the file
<A HREF="../DataGFCC.cf"><CODE>GF/GFCC/GFCC.cf</CODE></A>.
</P>
<H3>Top level</H3>
<P>
A grammar has a header telling the name of the abstract syntax
(often specifying an application domain), and the names of
the concrete languages. The abstract syntax and the concrete
syntaxes themselves follow.
</P>
<PRE>
Grm. Grammar ::=
"grammar" CId "(" [CId] ")" ";"
Abstract ";"
[Concrete] ;
Abs. Abstract ::=
"abstract" "{"
"flags" [Flag]
"fun" [FunDef]
"cat" [CatDef]
"}" ;
Cnc. Concrete ::=
"concrete" CId "{"
"flags" [Flag]
"lin" [LinDef]
"oper" [LinDef]
"lincat" [LinDef]
"lindef" [LinDef]
"printname" [LinDef]
"}" ;
</PRE>
<P>
This syntax organizes each module to a sequence of <B>fields</B>, such
as flags, linearizations, operations, linearization types, etc.
It is envisaged that particular applications can ignore some
of the fields, typically so that earlier fields are more
important than later ones.
</P>
<P>
The judgement forms have the following syntax.
</P>
<PRE>
Flg. Flag ::= CId "=" String ;
Cat. CatDef ::= CId "[" [Hypo] "]" ;
Fun. FunDef ::= CId ":" Type "=" Exp ;
Lin. LinDef ::= CId "=" Term ;
</PRE>
<P>
For the run-time system, the reference implementation in Haskell
uses a structure that gives efficient look-up:
</P>
<PRE>
data GFCC = GFCC {
absname :: CId ,
cncnames :: [CId] ,
abstract :: Abstr ,
concretes :: Map CId Concr
}
data Abstr = Abstr {
aflags :: Map CId String, -- value of a flag
funs :: Map CId (Type,Exp), -- type and def of a fun
cats :: Map CId [Hypo], -- context of a cat
catfuns :: Map CId [CId] -- funs yielding a cat (redundant, for fast lookup)
}
data Concr = Concr {
flags :: Map CId String, -- value of a flag
lins :: Map CId Term, -- lin of a fun
opers :: Map CId Term, -- oper generated by subex elim
lincats :: Map CId Term, -- lin type of a cat
lindefs :: Map CId Term, -- lin default of a cat
printnames :: Map CId Term -- printname of a cat or a fun
}
</PRE>
<P>
These definitions are from <A HREF="../DataGFCC.hs"><CODE>GF/GFCC/DataGFCC.hs</CODE></A>.
</P>
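<P>
A minimal sketch of this look-up structure (in Python for concreteness;
plain dicts and strings stand in for the real <CODE>Map</CODE>,
<CODE>CId</CODE> and <CODE>Term</CODE> types, which is an assumption of
this example):
</P>

```python
# Sketch only: dicts stand in for Haskell's Map CId Term, and
# strings stand in for the real CId and Term types.
concr_eng = {
    "lins": {"She": '[0,"she"]', "They": '[1,"they"]'},
    "lincats": {"NP": "[1,()]"},
}

def look_lin(concr, fun):
    """Look up the linearization term of a fun, as lookLin does."""
    return concr["lins"].get(fun)

print(look_lin(concr_eng, "She"))
```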
<P>
Identifiers (<CODE>CId</CODE>) are like <CODE>Ident</CODE> in GF, except that
the compiler produces constants prefixed with <CODE>_</CODE> in
the common subterm elimination optimization.
</P>
<PRE>
token CId (('_' | letter) (letter | digit | '\'' | '_')*) ;
</PRE>
<P></P>
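<P>
The token rule can also be read as a predicate (sketched here in Python,
restricting <CODE>letter</CODE> to ASCII for simplicity):
</P>

```python
import string

# A sketch of the CId token rule: ('_' | letter)(letter | digit | ''' | '_')*
LETTER = set(string.ascii_letters)
REST = LETTER | set(string.digits) | {"'", "_"}

def is_cid(s):
    """True if s is a well-formed CId under the rule above."""
    return (len(s) > 0
            and (s[0] == "_" or s[0] in LETTER)
            and all(c in REST for c in s[1:]))

print([is_cid(x) for x in ["Pred", "_she'", "2bad"]])
```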
<H3>Abstract syntax</H3>
<P>
Types are first-order function types built from argument type
contexts and value types, i.e. category symbols possibly applied
to expressions. Syntax trees (<CODE>Exp</CODE>) are
rose trees with nodes consisting of a head (<CODE>Atom</CODE>) and
bound variables (<CODE>CId</CODE>).
</P>
<PRE>
DTyp. Type ::= "[" [Hypo] "]" CId [Exp] ;
DTr. Exp ::= "[" "(" [CId] ")" Atom [Exp] "]" ;
Hyp. Hypo ::= CId ":" Type ;
</PRE>
<P>
The head Atom is either a function
constant, a bound variable, or a metavariable, or a string, integer, or float
literal.
</P>
<PRE>
AC. Atom ::= CId ;
AS. Atom ::= String ;
AI. Atom ::= Integer ;
AF. Atom ::= Double ;
AM. Atom ::= "?" Integer ;
</PRE>
<P>
The context-free types and trees of the "old GFCC" are special
cases, which can be defined as follows:
</P>
<PRE>
Typ. Type ::= [CId] "-&gt;" CId
Typ args val = DTyp [Hyp (CId "_") arg | arg &lt;- args] val
Tr. Exp ::= "(" CId [Exp] ")"
Tr fun exps = DTr [] fun exps
</PRE>
<P>
To store semantic (<CODE>def</CODE>) definitions by cases, the following expression
form is provided, but it is only meaningful in the last field of a function
declaration in an abstract syntax:
</P>
<PRE>
EEq. Exp ::= "{" [Equation] "}" ;
Equ. Equation ::= [Exp] "-&gt;" Exp ;
</PRE>
<P>
Notice that expressions are used to encode patterns. Primitive notions
(the default semantics in GF) are encoded as empty sets of equations
(<CODE>[]</CODE>). For a constructor (canonical form) of a category <CODE>C</CODE>, we
aim to use the encoding as the application <CODE>(_constr C)</CODE>.
</P>
<H3>Concrete syntax</H3>
<P>
Linearization terms (<CODE>Term</CODE>) are built as follows.
Constructor names are shown to make the later code
examples readable.
</P>
<PRE>
R. Term ::= "[" [Term] "]" ; -- array (record/table)
P. Term ::= "(" Term "!" Term ")" ; -- access to field (projection/selection)
S. Term ::= "(" [Term] ")" ; -- concatenated sequence
K. Term ::= Tokn ; -- token
V. Term ::= "$" Integer ; -- argument (subtree)
C. Term ::= Integer ; -- array index (label/parameter value)
FV. Term ::= "[|" [Term] "|]" ; -- free variation
TM. Term ::= "?" ; -- linearization of metavariable
</PRE>
<P>
Tokens are strings or (maybe obsolescent) prefix-dependent
variant lists.
</P>
<PRE>
KS. Tokn ::= String ;
KP. Tokn ::= "[" "pre" [String] "[" [Variant] "]" "]" ;
Var. Variant ::= [String] "/" [String] ;
</PRE>
<P>
Two special forms of terms are introduced by the compiler
as optimizations. They can in principle be eliminated, but
their presence makes grammars much more compact. Their semantics
will be explained in a later section.
</P>
<PRE>
F. Term ::= CId ; -- global constant
W. Term ::= "(" String "+" Term ")" ; -- prefix + suffix table
</PRE>
<P>
There is also a deprecated form of "record parameter alias",
</P>
<PRE>
RP. Term ::= "(" Term "@" Term ")"; -- DEPRECATED
</PRE>
<P>
which will be removed when the migration to new GFCC is complete.
</P>
<H2>The semantics of concrete syntax terms</H2>
<P>
The code in this section is from <A HREF="../Linearize.hs"><CODE>GF/GFCC/Linearize.hs</CODE></A>.
</P>
<H3>Linearization and realization</H3>
<P>
The linearization algorithm is essentially the same as in
GFC: a tree is linearized by evaluating its linearization term
in the environment of the linearizations of the subtrees.
Literal atoms are linearized in the obvious way.
The function also needs to know the language (i.e. concrete syntax)
in which linearization is performed.
</P>
<PRE>
linExp :: GFCC -&gt; CId -&gt; Exp -&gt; Term
linExp gfcc lang tree@(DTr _ at trees) = case at of
AC fun -&gt; comp (Prelude.map lin trees) $ look fun
AS s -&gt; R [kks (show s)] -- quoted
AI i -&gt; R [kks (show i)]
AF d -&gt; R [kks (show d)]
  AM _ -&gt; TM
where
lin = linExp gfcc lang
comp = compute gfcc lang
look = lookLin gfcc lang
</PRE>
<P>
TODO: bindings must be supported.
</P>
<P>
The result of linearization is usually a record, which is realized as
a string using the following algorithm.
</P>
<PRE>
realize :: Term -&gt; String
realize trm = case trm of
R (t:_) -&gt; realize t
S ss -&gt; unwords $ Prelude.map realize ss
K (KS s) -&gt; s
K (KP s _) -&gt; unwords s ---- prefix choice TODO
W s t -&gt; s ++ realize t
FV (t:_) -&gt; realize t
TM -&gt; "?"
</PRE>
<P>
Notice that realization always picks the first field of a record.
If a linearization type has more than one field, the first field
does not necessarily contain the desired string.
Also notice that the order of record fields in GFCC is not necessarily
the same as in GF source.
</P>
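<P>
The algorithm above can be sketched as a small interpreter (in Python;
the tagged-tuple representation of <CODE>Term</CODE> is an assumption of
this sketch, and <CODE>KP</CODE> and <CODE>W</CODE> are omitted for
brevity):
</P>

```python
# Terms as tagged tuples: ("R", [...]), ("S", [...]), ("K", tok),
# ("FV", [...]), ("TM",). A sketch of realize, not the real code.
def realize(term):
    tag = term[0]
    if tag == "R":               # record: pick the first field
        return realize(term[1][0])
    if tag == "S":               # sequence: join the parts
        return " ".join(s for s in map(realize, term[1]) if s)
    if tag == "K":               # token
        return term[1]
    if tag == "FV":              # free variation: first variant
        return realize(term[1][0])
    if tag == "TM":              # metavariable
        return "?"
    raise ValueError(term)

# Sleep = [["sleeps","sleep"]] realizes to its first (Sg) form:
sleep = ("R", [("R", [("K", "sleeps"), ("K", "sleep")])])
print(realize(sleep))
```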
<H3>Term evaluation</H3>
<P>
Evaluation follows call-by-value order, with two environments
needed:
</P>
<UL>
<LI>the grammar (a concrete syntax) to give the global constants
<LI>an array of terms to give the subtree linearizations
</UL>
<P>
The code is presented in one-level pattern matching, to
enable reimplementations in languages that do not permit
deep patterns (such as Java and C++).
</P>
<PRE>
compute :: GFCC -&gt; CId -&gt; [Term] -&gt; Term -&gt; Term
compute gfcc lang args = comp where
comp trm = case trm of
P r p -&gt; proj (comp r) (comp p)
W s t -&gt; W s (comp t)
R ts -&gt; R $ Prelude.map comp ts
V i -&gt; idx args (fromInteger i) -- already computed
F c -&gt; comp $ look c -- not computed (if contains V)
FV ts -&gt; FV $ Prelude.map comp ts
S ts -&gt; S $ Prelude.filter (/= S []) $ Prelude.map comp ts
_ -&gt; trm
look = lookOper gfcc lang
idx xs i = xs !! i
proj r p = case (r,p) of
(_, FV ts) -&gt; FV $ Prelude.map (proj r) ts
(W s t, _) -&gt; kks (s ++ getString (proj t p))
_ -&gt; comp $ getField r (getIndex p)
getString t = case t of
K (KS s) -&gt; s
_ -&gt; trace ("ERROR in grammar compiler: string from "++ show t) "ERR"
getIndex t = case t of
C i -&gt; fromInteger i
RP p _ -&gt; getIndex p
TM -&gt; 0 -- default value for parameter
_ -&gt; trace ("ERROR in grammar compiler: index from " ++ show t) 0
getField t i = case t of
R rs -&gt; idx rs i
RP _ r -&gt; getField r i
TM -&gt; TM
_ -&gt; trace ("ERROR in grammar compiler: field from " ++ show t) t
</PRE>
<P></P>
<H3>The special term constructors</H3>
<P>
The three forms introduced by the compiler may need a special
explanation.
</P>
<P>
Global constants
</P>
<PRE>
Term ::= CId ;
</PRE>
<P>
are shorthands for complex terms. They are produced by the
compiler by (iterated) <B>common subexpression elimination</B>.
They are often more powerful than hand-devised code sharing in the source
code. They could be computed off-line by replacing each identifier by
its definition.
</P>
<P>
<B>Prefix-suffix tables</B>
</P>
<PRE>
Term ::= "(" String "+" Term ")" ;
</PRE>
<P>
represent tables of word forms divided to the longest common prefix
and its array of suffixes. In the example grammar above, we have
</P>
<PRE>
Sleep = [("sleep" + ["s",""])]
</PRE>
<P>
which in fact is equal to the array of full forms
</P>
<PRE>
["sleeps", "sleep"]
</PRE>
<P>
The power of this construction comes from the fact that suffix sets
tend to be repeated in a language, and can therefore be collected
by common subexpression elimination. It is this technique that
motivates the syntax used, rather than the more accurate
</P>
<PRE>
"(" String "+" [String] ")"
</PRE>
<P>
since we want the suffix part to be a <CODE>Term</CODE> for the optimization to
take effect.
</P>
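<P>
Expanding a prefix-suffix table back into full forms is then just
prefixing (a hypothetical helper, shown in Python, not part of the GFCC
sources):
</P>

```python
# Hypothetical helper: expand a prefix + suffix table into the
# array of full forms it denotes.
def expand(prefix, suffixes):
    return [prefix + s for s in suffixes]

print(expand("sleep", ["s", ""]))
```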
<H2>Compiling to GFCC</H2>
<P>
Compilation to GFCC is performed by the GF grammar compiler, and
GFCC interpreters need not know what it does. For grammar writers,
however, it might be interesting to know what happens to the grammars
in the process.
</P>
<P>
The compilation phases are the following
</P>
<OL>
<LI>type check and partially evaluate GF source
<LI>create a symbol table mapping the GF parameter and record types to
fixed-size arrays, and parameter values and record labels to integers
<LI>traverse the linearization rules replacing parameters and labels by integers
<LI>reorganize the created GF grammar so that it has just one abstract syntax
and one concrete syntax per language
<LI>TODO: apply UTF8 encoding to the grammar, if not yet applied (this is told by the
<CODE>coding</CODE> flag)
<LI>translate the GF grammar object to a GFCC grammar object, using a simple
compositional mapping
<LI>perform the word-suffix optimization on GFCC linearization terms
<LI>perform subexpression elimination on each concrete syntax module
<LI>print out the GFCC code
</OL>
<H3>Problems in GFCC compilation</H3>
<P>
Two major problems had to be solved in compiling GF to GFCC:
</P>
<UL>
<LI>consistent order of tables and records, to permit the array translation
<LI>run-time variables in complex parameter values.
</UL>
<P>
The current implementation is still experimental and may fail
to generate correct code. Any errors remaining are likely to be
related to the two problems just mentioned.
</P>
<P>
The order problem is solved in slightly different ways for tables and records.
In both cases, <B>eta expansion</B> is used to establish a
canonical order. Tables are ordered by applying the preorder induced
by <CODE>param</CODE> definitions. Records are ordered by sorting them by labels.
This means that
e.g. the <CODE>s</CODE> field will in general no longer appear as the first
field, even if it does so in the GF source code. But relying on the
order of fields in a labelled record would be misplaced anyway.
</P>
<P>
The canonical form of records is further complicated by lock fields,
i.e. dummy fields of form <CODE>lock_C = &lt;&gt;</CODE>, which are added to grammar
libraries to force intensionality of linearization types. The problem
is that the absence of a lock field only generates a warning, not
an error. Therefore a GF grammar can contain objects of the same
type with and without a lock field. This problem was solved in GFCC
generation by just removing all lock fields (defined as fields whose
type is the empty record type). This has the further advantage of
(slightly) reducing the grammar size. More importantly, it is safe
to remove lock fields, because they are never used in computation,
and because intensional types are only needed in grammars reused
as libraries, not in grammars used at runtime.
</P>
<P>
While the order problem is rather bureaucratic in nature, run-time
variables are an interesting problem. They arise in the presence
of complex parameter values, created by argument-taking constructors
and parameter records. To give an example, consider the GF parameter
type system
</P>
<PRE>
Number = Sg | Pl ;
Person = P1 | P2 | P3 ;
Agr = Ag Number Person ;
</PRE>
<P>
The values can be translated to integers in the expected way,
</P>
<PRE>
Sg = 0, Pl = 1
P1 = 0, P2 = 1, P3 = 2
Ag Sg P1 = 0, Ag Sg P2 = 1, Ag Sg P3 = 2,
Ag Pl P1 = 3, Ag Pl P2 = 4, Ag Pl P3 = 5
</PRE>
<P>
However, an argument of <CODE>Agr</CODE> can be a run-time variable, as in
</P>
<PRE>
Ag np.n P3
</PRE>
<P>
This expression must first be translated to a case expression,
</P>
<PRE>
case np.n of {
0 =&gt; 2 ;
1 =&gt; 5
}
</PRE>
<P>
which can then be translated to the GFCC term
</P>
<PRE>
  ([2,5] ! ($0 ! 1))
</PRE>
<P>
assuming that the variable <CODE>np</CODE> is the first argument and that its
<CODE>Number</CODE> field is the second in the record.
</P>
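<P>
The integer encoding behind this term can be sketched as follows
(in Python; the helper names are assumptions of this sketch): since
<CODE>Person</CODE> has 3 values, <CODE>Ag n p</CODE> gets index
<CODE>n*3 + p</CODE>, and the table <CODE>[2,5]</CODE> tabulates
<CODE>Ag n P3</CODE> over the two values of <CODE>n</CODE>:
</P>

```python
# Hypothetical helpers illustrating the integer encoding, not GF code.
PERSONS = 3                       # P1, P2, P3

def ag_index(n, p):
    """Integer code of Ag n p, with Sg=0, Pl=1 and P1=0, P2=1, P3=2."""
    return n * PERSONS + p

# The table [2,5] from the text: Ag n P3 for n in {Sg, Pl};
# indexing it with np.n yields Ag np.n P3.
ag_n_p3 = [ag_index(n, 2) for n in range(2)]
print(ag_n_p3)
```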
<P>
This transformation of course has to be performed recursively, since
there can be several run-time variables in a parameter value:
</P>
<PRE>
Ag np.n np.p
</PRE>
<P>
A similar transformation would be possible to deal with the double
role of parameter records discussed above. Thus the type
</P>
<PRE>
RNP = {n : Number ; p : Person}
</PRE>
<P>
could be uniformly translated into the set <CODE>{0,1,2,3,4,5}</CODE>
as <CODE>Agr</CODE> above. Selections would be simple instances of indexing.
But any projection from the record should be translated into
a case expression,
</P>
<PRE>
rnp.n ===&gt;
case rnp of {
0 =&gt; 0 ;
1 =&gt; 0 ;
2 =&gt; 0 ;
3 =&gt; 1 ;
4 =&gt; 1 ;
5 =&gt; 1
}
</PRE>
<P>
To avoid the code bloat resulting from this, we have chosen to
deal with records by a <B>currying</B> transformation:
</P>
<PRE>
table {n : Number ; p : Person} {... ...}
===&gt;
  table Number {Sg =&gt; table Person {...} ; Pl =&gt; table Person {...}}
</PRE>
<P>
This is performed when GFCC is generated. Selections with
records have to be treated likewise,
</P>
<PRE>
t ! r ===&gt; t ! r.n ! r.p
</PRE>
<P></P>
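<P>
A quick check (in Python, with made-up string values) that the curried
table agrees with the flat one under the indexing
<CODE>n*3 + p</CODE>:
</P>

```python
# Flat table indexed by Ag n p = n*3 + p, and its curried form
# Number => Person => Str; both are made-up illustration data.
flat = ["a", "b", "c", "d", "e", "f"]
curried = [["a", "b", "c"], ["d", "e", "f"]]

# t ! r is replaced by t ! r.n ! r.p:
ok = all(curried[n][p] == flat[n * 3 + p]
         for n in range(2) for p in range(3))
print(ok)
```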
<H3>The representation of linearization types</H3>
<P>
Linearization types (<CODE>lincat</CODE>) are not needed when generating with
GFCC, but they have been added to enable parser generation directly from
GFCC. The linearization type definitions are shown as a part of the
concrete syntax, by using terms to represent types. Here is the table
showing how different linearization types are encoded.
</P>
<PRE>
P* = max(P) -- parameter type
{r1 : T1 ; ... ; rn : Tn}* = [T1*,...,Tn*] -- record
(P =&gt; T)* = [T* ,...,T*] -- table, size(P) cases
Str* = ()
</PRE>
<P>
For example, the linearization type <CODE>present/CatEng.NP</CODE> is
translated as follows:
</P>
<PRE>
NP = {
a : { -- 6 = 2*3 values
n : {ParamX.Number} ; -- 2 values
p : {ParamX.Person} -- 3 values
} ;
s : {ResEng.Case} =&gt; Str -- 3 values
}
__NP = [[1,2],[(),(),()]]
</PRE>
<P></P>
<H3>Running the compiler and the GFCC interpreter</H3>
<P>
GFCC generation is a part of the
<A HREF="http://www.cs.chalmers.se/Cs/Research/Language-technology/darcs/GF/doc/darcs.html">developers' version</A>
of GF since September 2006. To invoke the compiler, the flag
<CODE>-printer=gfcc</CODE> to the command
<CODE>pm = print_multi</CODE> is used. It is wise to recompile the grammar from
source, since previously compiled libraries may not obey the canonical
order of records.
Here is an example, performed in
<A HREF="../../../../../examples/bronzeage">example/bronzeage</A>.
</P>
<PRE>
i -src -path=.:prelude:resource-1.0/* -optimize=all_subs BronzeageEng.gf
i -src -path=.:prelude:resource-1.0/* -optimize=all_subs BronzeageGer.gf
strip
pm -printer=gfcc | wf bronze.gfcc
</PRE>
<P>
There is also an experimental batch compiler, which does not use the GFC
format or the record aliases. It can be produced by
</P>
<PRE>
make gfc
</PRE>
<P>
in <CODE>GF/src</CODE>, and invoked by
</P>
<PRE>
gfc --make FILES
</PRE>
<P></P>
<H2>The reference interpreter</H2>
<P>
The reference interpreter written in Haskell consists of the following files:
</P>
<PRE>
-- source file for BNFC
GFCC.cf -- labelled BNF grammar of gfcc
-- files generated by BNFC
AbsGFCC.hs -- abstract syntax datatypes
ErrM.hs -- error monad used internally
LexGFCC.hs -- lexer of gfcc files
ParGFCC.hs -- parser of gfcc files and syntax trees
PrintGFCC.hs -- printer of gfcc files and syntax trees
-- hand-written files
DataGFCC.hs -- grammar datatype, post-parser grammar creation
Linearize.hs -- linearization and evaluation
Macros.hs -- utilities abstracting away from GFCC datatypes
Generate.hs -- random and exhaustive generation, generate-and-test parsing
API.hs -- functionalities accessible in embedded GF applications
Shell.hs -- main function - a simple command interpreter
</PRE>
<P>
It is included in the
<A HREF="http://www.cs.chalmers.se/Cs/Research/Language-technology/darcs/GF/doc/darcs.html">developers' version</A>
of GF, in the subdirectories <A HREF="../"><CODE>GF/src/GF/GFCC</CODE></A> and
<A HREF="../../Devel"><CODE>GF/src/GF/Devel</CODE></A>.
</P>
<P>
As of September 2007, default parsing in main GF uses GFCC (implemented by Krasimir
Angelov). The interpreter uses the relevant modules
</P>
<PRE>
GF/Conversions/SimpleToFCFG.hs -- generate parser from GFCC
GF/Parsing/FCFG.hs -- run the parser
</PRE>
<P></P>
<P>
To compile the interpreter, type
</P>
<PRE>
make gfcc
</PRE>
<P>
in <CODE>GF/src</CODE>. To run it, type
</P>
<PRE>
./gfcc &lt;GFCC-file&gt;
</PRE>
<P>
The available commands are
</P>
<UL>
<LI><CODE>gr &lt;Cat&gt; &lt;Int&gt;</CODE>: generate a number of random trees in the given category,
and show their linearizations in all languages
<LI><CODE>grt &lt;Cat&gt; &lt;Int&gt;</CODE>: generate a number of random trees in the given category,
and show the trees and their linearizations in all languages
<LI><CODE>gt &lt;Cat&gt; &lt;Int&gt;</CODE>: generate a number of trees in the given category, starting from the smallest,
and show their linearizations in all languages
<LI><CODE>gtt &lt;Cat&gt; &lt;Int&gt;</CODE>: generate a number of trees in the given category, starting from the smallest,
and show the trees and their linearizations in all languages
<LI><CODE>p &lt;Lang&gt; &lt;Cat&gt; &lt;String&gt;</CODE>: parse a string into a set of trees
<LI><CODE>lin &lt;Tree&gt;</CODE>: linearize tree in all languages, also showing full records
<LI><CODE>q</CODE>: terminate the system cleanly
</UL>
<H2>Embedded formats</H2>
<UL>
<LI>JavaScript: compiler of linearization and abstract syntax
<P></P>
<LI>Haskell: compiler of abstract syntax and interpreter with parsing,
linearization, and generation
<P></P>
<LI>C: compiler of linearization (old GFCC)
<P></P>
<LI>C++: embedded interpreter supporting linearization (old GFCC)
</UL>
<H2>Some things to do</H2>
<P>
Support for dependent types, higher-order abstract syntax, and
semantic definition in GFCC generation and interpreters.
</P>
<P>
Replacing the entire GF shell by one based on GFCC.
</P>
<P>
Interpreter in Java.
</P>
<P>
Hand-written parsers for GFCC grammars to reduce code size
(and efficiency?) of interpreters.
</P>
<P>
Binary format and/or file compression of GFCC output.
</P>
<P>
Syntax editor based on GFCC.
</P>
<P>
Rewriting of resource libraries in order to exploit the
word-suffix sharing better (depth-one tables, as in FM).
</P>
<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
<!-- cmdline: txt2tags -thtml gfcc.txt -->
</BODY></HTML>

deprecated/PGF/doc/gfcc.txt Normal file

@@ -0,0 +1,712 @@
The GFCC Grammar Format
Aarne Ranta
December 14, 2007
Author's address:
[``http://www.cs.chalmers.se/~aarne`` http://www.cs.chalmers.se/~aarne]
% to compile: txt2tags -thtml --toc gfcc.txt
History:
- 14 Dec 2007: simpler, Lisp-like concrete syntax of GFCC
- 5 Oct 2007: new, better structured GFCC with full expressive power
- 19 Oct: translation of lincats, new figures on C++
- 3 Oct 2006: first version
==What is GFCC==
GFCC is a low-level format for GF grammars. Its aim is to contain the minimum
that is needed to process GF grammars at runtime. This minimality has three
advantages:
- compact grammar files and run-time objects
- time and space efficient processing
- simple definition of interpreters
Thus we also want to call GFCC the **portable grammar format**.
The idea is that all embedded GF applications use GFCC.
The GF system would be primarily used as a compiler and as a grammar
development tool.
Since GFCC is implemented in BNFC, a parser of the format is readily
available for C, C++, C#, Haskell, Java, and OCaml. Also an XML
representation can be generated in BNFC. A
[reference implementation ../]
of linearization and some other functions has been written in Haskell.
==GFCC vs. GFC==
GFCC is aimed to replace GFC as the run-time grammar format. GFC was designed
to be a run-time format, but also to
support separate compilation of grammars, i.e.
to store the results of compiling
individual GF modules. But this means that GFC has to contain extra information,
such as type annotations, which is only needed in compilation and not at
run-time. In particular, the pattern matching syntax and semantics of GFC is
complex and therefore difficult to implement in new platforms.
GFC is, in fact, also planned to be dropped as the target format of
separate compilation, where plain GF (type annotated and partially evaluated)
will be used instead. GFC provides only marginal advantages as a target format
compared with GF, so carrying this format around is just extra weight.
The main differences of GFCC compared with GFC (and GF) can be
summarized as follows:
- there are no modules, and therefore no qualified names
- a GFCC grammar is multilingual, and consists of a common abstract syntax
together with one concrete syntax per language
- records and tables are replaced by arrays
- record labels and parameter values are replaced by integers
- record projection and table selection are replaced by array indexing
- even though the format does support dependent types and higher-order abstract
syntax, there is no interpreter yet that supports them
Here is an example of a GF grammar, consisting of three modules,
as translated to GFCC. The representations are aligned; thus they do not
exactly reflect the order of judgements in GFCC files, which arrange
blocks of judgements in a different order and sort them alphabetically.
```
grammar Ex(Eng,Swe);
abstract Ex = { abstract {
cat cat
S ; NP ; VP ; NP[]; S[]; VP[];
fun fun
Pred : NP -> VP -> S ; Pred=[(($ 0! 1),(($ 1! 0)!($ 0! 0)))];
She, They : NP ; She=[0,"she"];
Sleep : VP ; They=[1,"they"];
Sleep=[["sleeps","sleep"]];
} } ;
concrete Eng of Ex = { concrete Eng {
lincat lincat
S = {s : Str} ; S=[()];
NP = {s : Str ; n : Num} ; NP=[1,()];
VP = {s : Num => Str} ; VP=[[(),()]];
param
Num = Sg | Pl ;
lin lin
Pred np vp = { Pred=[(($ 0! 1),(($ 1! 0)!($ 0! 0)))];
s = np.s ++ vp.s ! np.n} ;
She = {s = "she" ; n = Sg} ; She=[0,"she"];
They = {s = "they" ; n = Pl} ; They = [1, "they"];
Sleep = {s = table { Sleep=[["sleeps","sleep"]];
Sg => "sleeps" ;
Pl => "sleep"
}
} ;
} } ;
concrete Swe of Ex = { concrete Swe {
lincat lincat
S = {s : Str} ; S=[()];
NP = {s : Str} ; NP=[()];
VP = {s : Str} ; VP=[()];
param
Num = Sg | Pl ;
lin lin
Pred np vp = { Pred = [(($0!0),($1!0))];
s = np.s ++ vp.s} ;
She = {s = "hon"} ; She = ["hon"];
They = {s = "de"} ; They = ["de"];
Sleep = {s = "sover"} ; Sleep = ["sover"];
} } ;
```
==The syntax of GFCC files==
The complete BNFC grammar, from which
the rules in this section are taken, is in the file
[``GF/GFCC/GFCC.cf`` ../DataGFCC.cf].
===Top level===
A grammar has a header telling the name of the abstract syntax
(often specifying an application domain), and the names of
the concrete languages. The abstract syntax and the concrete
syntaxes themselves follow.
```
Grm. Grammar ::=
"grammar" CId "(" [CId] ")" ";"
Abstract ";"
[Concrete] ;
Abs. Abstract ::=
"abstract" "{"
"flags" [Flag]
"fun" [FunDef]
"cat" [CatDef]
"}" ;
Cnc. Concrete ::=
"concrete" CId "{"
"flags" [Flag]
"lin" [LinDef]
"oper" [LinDef]
"lincat" [LinDef]
"lindef" [LinDef]
"printname" [LinDef]
"}" ;
```
This syntax organizes each module to a sequence of **fields**, such
as flags, linearizations, operations, linearization types, etc.
It is envisaged that particular applications can ignore some
of the fields, typically so that earlier fields are more
important than later ones.
The judgement forms have the following syntax.
```
Flg. Flag ::= CId "=" String ;
Cat. CatDef ::= CId "[" [Hypo] "]" ;
Fun. FunDef ::= CId ":" Type "=" Exp ;
Lin. LinDef ::= CId "=" Term ;
```
For the run-time system, the reference implementation in Haskell
uses a structure that gives efficient look-up:
```
data GFCC = GFCC {
absname :: CId ,
cncnames :: [CId] ,
abstract :: Abstr ,
concretes :: Map CId Concr
}
data Abstr = Abstr {
aflags :: Map CId String, -- value of a flag
funs :: Map CId (Type,Exp), -- type and def of a fun
cats :: Map CId [Hypo], -- context of a cat
catfuns :: Map CId [CId] -- funs yielding a cat (redundant, for fast lookup)
}
data Concr = Concr {
flags :: Map CId String, -- value of a flag
lins :: Map CId Term, -- lin of a fun
opers :: Map CId Term, -- oper generated by subex elim
lincats :: Map CId Term, -- lin type of a cat
lindefs :: Map CId Term, -- lin default of a cat
printnames :: Map CId Term -- printname of a cat or a fun
}
```
These definitions are from [``GF/GFCC/DataGFCC.hs`` ../DataGFCC.hs].
Identifiers (``CId``) are like ``Ident`` in GF, except that
the compiler produces constants prefixed with ``_`` in
the common subterm elimination optimization.
```
token CId (('_' | letter) (letter | digit | '\'' | '_')*) ;
```
===Abstract syntax===
Types are first-order function types built from argument type
contexts and value category symbols. Syntax trees (``Exp``) are
rose trees with nodes consisting of a head (``Atom``) and
bound variables (``CId``).
```
DTyp. Type ::= "[" [Hypo] "]" CId [Exp] ;
DTr. Exp ::= "[" "(" [CId] ")" Atom [Exp] "]" ;
Hyp. Hypo ::= CId ":" Type ;
```
The head Atom is either a function
constant, a bound variable, or a metavariable, or a string, integer, or float
literal.
```
AC. Atom ::= CId ;
AS. Atom ::= String ;
AI. Atom ::= Integer ;
AF. Atom ::= Double ;
AM. Atom ::= "?" Integer ;
```
The context-free types and trees of the "old GFCC" are special
cases, which can be defined as follows:
```
Typ. Type ::= [CId] "->" CId
Typ args val = DTyp [Hyp (CId "_") arg | arg <- args] val
Tr. Exp ::= "(" CId [Exp] ")"
Tr fun exps = DTr [] fun exps
```
To store semantic (``def``) definitions by cases, the following expression
form is provided, but it is only meaningful in the last field of a function
declaration in an abstract syntax:
```
EEq. Exp ::= "{" [Equation] "}" ;
Equ. Equation ::= [Exp] "->" Exp ;
```
Notice that expressions are used to encode patterns. Primitive notions
(the default semantics in GF) are encoded as empty sets of equations
(``[]``). For a constructor (canonical form) of a category ``C``, we
aim to use the encoding as the application ``(_constr C)``.
===Concrete syntax===
Linearization terms (``Term``) are built as follows.
Constructor names are shown to make the later code
examples readable.
```
R. Term ::= "[" [Term] "]" ; -- array (record/table)
P. Term ::= "(" Term "!" Term ")" ; -- access to field (projection/selection)
S. Term ::= "(" [Term] ")" ; -- concatenated sequence
K. Term ::= Tokn ; -- token
V. Term ::= "$" Integer ; -- argument (subtree)
C. Term ::= Integer ; -- array index (label/parameter value)
FV. Term ::= "[|" [Term] "|]" ; -- free variation
TM. Term ::= "?" ; -- linearization of metavariable
```
Tokens are strings or (maybe obsolescent) prefix-dependent
variant lists.
```
KS. Tokn ::= String ;
KP. Tokn ::= "[" "pre" [String] "[" [Variant] "]" "]" ;
Var. Variant ::= [String] "/" [String] ;
```
Two special forms of terms are introduced by the compiler
as optimizations. They can in principle be eliminated, but
their presence makes grammars much more compact. Their semantics
will be explained in a later section.
```
F. Term ::= CId ; -- global constant
W. Term ::= "(" String "+" Term ")" ; -- prefix + suffix table
```
There is also a deprecated form of "record parameter alias",
```
RP. Term ::= "(" Term "@" Term ")"; -- DEPRECATED
```
which will be removed when the migration to new GFCC is complete.
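Put together, the rules above correspond to a Haskell datatype along the
following lines. This is only a sketch for orientation: the actual
definition lives in ``DataGFCC.hs`` and may differ in details (here
``CId`` is simplified to a plain string).

```haskell
-- Sketch of the linearization-term datatype, one constructor per rule
-- above. The real definition is in GF/GFCC/DataGFCC.hs.
type CId = String  -- simplified; the real CId is a distinct type

data Term
  = R  [Term]       -- array (record/table)
  | P  Term Term    -- access to field (projection/selection)
  | S  [Term]       -- concatenated sequence
  | K  Tokn         -- token
  | V  Integer      -- argument (subtree)
  | C  Integer      -- array index (label/parameter value)
  | FV [Term]       -- free variation
  | TM              -- linearization of a metavariable
  | F  CId          -- global constant
  | W  String Term  -- prefix + suffix table
  | RP Term Term    -- record parameter alias (deprecated)
  deriving (Eq, Show)

data Tokn
  = KS String               -- plain token
  | KP [String] [Variant]   -- prefix-dependent variant list
  deriving (Eq, Show)

data Variant = Var [String] [String]
  deriving (Eq, Show)
```

For example, the term ``[("sleep" + ["s",""])]`` from the example grammar
is represented as ``R [W "sleep" (R [K (KS "s"), K (KS "")])]``.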
==The semantics of concrete syntax terms==
The code in this section is from [``GF/GFCC/Linearize.hs`` ../Linearize.hs].
===Linearization and realization===
The linearization algorithm is essentially the same as in
GFC: a tree is linearized by evaluating its linearization term
in the environment of the linearizations of the subtrees.
Literal atoms are linearized in the obvious way.
The function also needs to know the language (i.e. concrete syntax)
in which linearization is performed.
```
linExp :: GFCC -> CId -> Exp -> Term
linExp gfcc lang tree@(DTr _ at trees) = case at of
AC fun -> comp (Prelude.map lin trees) $ look fun
AS s -> R [kks (show s)] -- quoted
AI i -> R [kks (show i)]
AF d -> R [kks (show d)]
AM _ -> TM
where
lin = linExp gfcc lang
comp = compute gfcc lang
look = lookLin gfcc lang
```
TODO: bindings must be supported.
The result of linearization is usually a record, which is realized as
a string using the following algorithm.
```
realize :: Term -> String
realize trm = case trm of
R (t:_) -> realize t
S ss -> unwords $ Prelude.map realize ss
K (KS s) -> s
K (KP s _) -> unwords s ---- prefix choice TODO
W s t -> s ++ realize t
FV (t:_) -> realize t
TM -> "?"
```
Notice that realization always picks the first field of a record.
If a linearization type has more than one field, the first field
does not necessarily contain the desired string.
Also notice that the order of record fields in GFCC is not necessarily
the same as in GF source.
===Term evaluation===
Evaluation follows call-by-value order, with two environments
needed:
- the grammar (a concrete syntax) to give the global constants
- an array of terms to give the subtree linearizations
The code is presented in one-level pattern matching, to
enable reimplementations in languages that do not permit
deep patterns (such as Java and C++).
```
compute :: GFCC -> CId -> [Term] -> Term -> Term
compute gfcc lang args = comp where
comp trm = case trm of
P r p -> proj (comp r) (comp p)
W s t -> W s (comp t)
R ts -> R $ Prelude.map comp ts
V i -> idx args (fromInteger i) -- already computed
F c -> comp $ look c -- not computed (if contains V)
FV ts -> FV $ Prelude.map comp ts
S ts -> S $ Prelude.filter (/= S []) $ Prelude.map comp ts
_ -> trm
look = lookOper gfcc lang
idx xs i = xs !! i
proj r p = case (r,p) of
(_, FV ts) -> FV $ Prelude.map (proj r) ts
(FV ts, _ ) -> FV $ Prelude.map (\t -> proj t p) ts
(W s t, _) -> kks (s ++ getString (proj t p))
_ -> comp $ getField r (getIndex p)
getString t = case t of
K (KS s) -> s
_ -> trace ("ERROR in grammar compiler: string from "++ show t) "ERR"
getIndex t = case t of
C i -> fromInteger i
RP p _ -> getIndex p
TM -> 0 -- default value for parameter
_ -> trace ("ERROR in grammar compiler: index from " ++ show t) 0
getField t i = case t of
R rs -> idx rs i
RP _ r -> getField r i
TM -> TM
_ -> trace ("ERROR in grammar compiler: field from " ++ show t) t
```
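To make the evaluation concrete, here is a self-contained miniature
evaluator covering only the ``R``, ``P``, ``S``, ``K``, ``V`` and ``C``
forms, run on the ``Pred`` rule of the English example grammar from the
beginning of this document. It is a sketch for exposition (tokens are
simplified to plain strings), not the actual ``Linearize.hs`` code.

```haskell
-- A stripped-down evaluator for the R, P, S, K, V and C forms only.
-- For exposition; the full evaluator is in GF/GFCC/Linearize.hs and
-- also handles F, W, FV, RP and TM.
data Term
  = R [Term]      -- array (record/table)
  | P Term Term   -- indexing (projection/selection)
  | S [Term]      -- concatenated sequence
  | K String      -- token (simplified to a plain string)
  | V Int         -- argument (subtree linearization)
  | C Int         -- array index
  deriving (Eq, Show)

-- Evaluate a term in an environment of already computed argument
-- linearizations, as in the compute function above.
compute :: [Term] -> Term -> Term
compute args = comp
  where
    comp trm = case trm of
      P r p -> proj (comp r) (comp p)
      R ts  -> R (map comp ts)
      V i   -> args !! i
      S ts  -> S (filter (/= S []) (map comp ts))
      _     -> trm
    proj (R rs) (C i) = comp (rs !! i)
    proj r p          = error ("bad projection: " ++ show (r, p))

-- Realization: pick the first field, concatenate sequences.
realize :: Term -> String
realize trm = case trm of
  R (t : _) -> realize t
  S ts      -> unwords (map realize ts)
  K s       -> s
  _         -> "?"

-- The English linearizations of the example grammar (record fields
-- are label-sorted, so the n field of NP comes before s):
-- Pred=[(($0!1),(($1!0)!($0!0)))], She=[0,"she"], Sleep=[["sleeps","sleep"]]
predEng, sheEng, sleepEng :: Term
predEng  = R [S [P (V 0) (C 1), P (P (V 1) (C 0)) (P (V 0) (C 0))]]
sheEng   = R [C 0, K "she"]
sleepEng = R [R [K "sleeps", K "sleep"]]
```

Linearizing the tree ``(Pred She Sleep)`` then amounts to
``realize (compute [sheEng, sleepEng] predEng)``, which yields
``"she sleeps"``.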
===The special term constructors===
The two forms introduced by the compiler may need some special
explanation.
Global constants
```
Term ::= CId ;
```
are shorthands for complex terms. They are produced by the
compiler by (iterated) **common subexpression elimination**.
They are often more powerful than hand-devised code sharing in the source
code. They could be computed off-line by replacing each identifier by
its definition.
**Prefix-suffix tables**
```
Term ::= "(" String "+" Term ")" ;
```
represent tables of word forms divided to the longest common prefix
and its array of suffixes. In the example grammar above, we have
```
Sleep = [("sleep" + ["s",""])]
```
which in fact is equal to the array of full forms
```
["sleeps", "sleep"]
```
The power of this construction comes from the fact that suffix sets
tend to be repeated in a language, and can therefore be collected
by common subexpression elimination. It is this technique that
motivates the syntax used here, rather than the more accurate
```
"(" String "+" [String] ")"
```
since we want the suffix part to be a ``Term`` for the optimization to
take effect.
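The equivalence between a prefix-suffix table and its array of full
forms amounts to mapping the prefix over the suffixes; a one-line
sketch (the function name is ours):

```haskell
-- Expand a prefix-suffix table such as ("sleep" + ["s",""]) into its
-- array of full forms. The name expandW is an illustration only; the
-- interpreter itself keeps the W form and prepends the prefix during
-- realization.
expandW :: String -> [String] -> [String]
expandW prefix suffixes = map (prefix ++) suffixes
```

Here ``expandW "sleep" ["s", ""]`` gives ``["sleeps", "sleep"]``, the
array of full forms shown above.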
==Compiling to GFCC==
Compilation to GFCC is performed by the GF grammar compiler, and
GFCC interpreters need not know what it does. For grammar writers,
however, it might be interesting to know what happens to the grammars
in the process.
The compilation phases are the following:
+ type check and partially evaluate GF source
+ create a symbol table mapping the GF parameter and record types to
fixed-size arrays, and parameter values and record labels to integers
+ traverse the linearization rules replacing parameters and labels by integers
+ reorganize the created GF grammar so that it has just one abstract syntax
and one concrete syntax per language
+ TODO: apply UTF8 encoding to the grammar, if not yet applied (this is told by the
``coding`` flag)
+ translate the GF grammar object to a GFCC grammar object, using a simple
compositional mapping
+ perform the word-suffix optimization on GFCC linearization terms
+ perform subexpression elimination on each concrete syntax module
+ print out the GFCC code
===Problems in GFCC compilation===
Two major problems had to be solved in compiling GF to GFCC:
- consistent order of tables and records, to permit the array translation
- run-time variables in complex parameter values.
The current implementation is still experimental and may fail
to generate correct code. Any errors remaining are likely to be
related to the two problems just mentioned.
The order problem is solved in slightly different ways for tables and records.
In both cases, **eta expansion** is used to establish a
canonical order. Tables are ordered by applying the preorder induced
by ``param`` definitions. Records are ordered by sorting them by labels.
This means that
e.g. the ``s`` field will in general no longer appear as the first
field, even if it does so in the GF source code. But relying on the
order of fields in a labelled record would be misplaced anyway.
The canonical form of records is further complicated by lock fields,
i.e. dummy fields of form ``lock_C = <>``, which are added to grammar
libraries to force intensionality of linearization types. The problem
is that the absence of a lock field only generates a warning, not
an error. Therefore a GF grammar can contain objects of the same
type with and without a lock field. This problem was solved in GFCC
generation by just removing all lock fields (defined as fields whose
type is the empty record type). This has the further advantage of
(slightly) reducing the grammar size. More importantly, it is safe
to remove lock fields, because they are never used in computation,
and because intensional types are only needed in grammars reused
as libraries, not in grammars used at runtime.
While the order problem is rather bureaucratic in nature, run-time
variables are an interesting problem. They arise in the presence
of complex parameter values, created by argument-taking constructors
and parameter records. To give an example, consider the GF parameter
type system
```
Number = Sg | Pl ;
Person = P1 | P2 | P3 ;
Agr = Ag Number Person ;
```
The values can be translated to integers in the expected way,
```
Sg = 0, Pl = 1
P1 = 0, P2 = 1, P3 = 2
Ag Sg P1 = 0, Ag Sg P2 = 1, Ag Sg P3 = 2,
Ag Pl P1 = 3, Ag Pl P2 = 4, Ag Pl P3 = 5
```
However, an argument of ``Agr`` can be a run-time variable, as in
```
Ag np.n P3
```
This expression must first be translated to a case expression,
```
case np.n of {
0 => 2 ;
1 => 5
}
```
which can then be translated to the GFCC term
```
([2,5] ! ($0 ! 1))
```
assuming that the variable ``np`` is the first argument and that its
``Number`` field is the second in the record.
This transformation of course has to be performed recursively, since
there can be several run-time variables in a parameter value:
```
Ag np.n np.p
```
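The numbering of ``Agr`` values and the resulting lookup table can be
sketched as follows (the datatypes and function names are ours, for
illustration only):

```haskell
-- Numbering of the complex parameter type Agr = Ag Number Person:
-- values are ordered lexicographically, so Ag n p gets the index
-- n * size(Person) + p, with Sg=0, Pl=1 and P1=0, P2=1, P3=2.
data Number = Sg | Pl       deriving (Eq, Enum, Bounded, Show)
data Person = P1 | P2 | P3  deriving (Eq, Enum, Bounded, Show)

agrIndex :: Number -> Person -> Int
agrIndex n p = fromEnum n * 3 + fromEnum p

-- "Ag x P3" with a run-time variable x becomes the lookup table
-- [Ag Sg P3, Ag Pl P3] = [2,5], indexed by the value of x.
agP3Table :: [Int]
agP3Table = [agrIndex n P3 | n <- [Sg, Pl]]
```

This reproduces the table ``[2,5]`` used in the GFCC term above, and
the full value table ``Ag Pl P2 = 4`` etc. from the numbering scheme.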
A similar transformation would be possible to deal with the double
role of parameter records discussed above. Thus the type
```
RNP = {n : Number ; p : Person}
```
could be uniformly translated into the set ``{0,1,2,3,4,5}``
as ``Agr`` above. Selections would be simple instances of indexing.
But any projection from the record should be translated into
a case expression,
```
rnp.n ===>
case rnp of {
0 => 0 ;
1 => 0 ;
2 => 0 ;
3 => 1 ;
4 => 1 ;
5 => 1
}
```
To avoid the code bloat resulting from this, we have chosen to
deal with records by a **currying** transformation:
```
table {n : Number ; p : Person} {... ...}
===>
table Number {Sg => table Person {...} ; Pl => table Person {...}}
```
This is performed when GFCC is generated. Selections with
records have to be treated likewise,
```
t ! r ===> t ! r.n ! r.p
```
===The representation of linearization types===
Linearization types (``lincat``) are not needed when generating with
GFCC, but they have been added to enable parser generation directly from
GFCC. The linearization type definitions are shown as a part of the
concrete syntax, by using terms to represent types. Here is the table
showing how different linearization types are encoded.
```
P* = max(P) -- parameter type
{r1 : T1 ; ... ; rn : Tn}* = [T1*,...,Tn*] -- record
(P => T)* = [T* ,...,T*] -- table, size(P) cases
Str* = ()
```
For example, the linearization type ``present/CatEng.NP`` is
translated as follows:
```
NP = {
a : { -- 6 = 2*3 values
n : {ParamX.Number} ; -- 2 values
p : {ParamX.Person} -- 3 values
} ;
s : {ResEng.Case} => Str -- 3 values
}
__NP = [[1,2],[(),(),()]]
```
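The encoding table can also be phrased as a recursive function over a
small representation of linearization types (the ``LType`` datatype and
all names here are ours, for illustration only):

```haskell
-- Encoding of linearization types as terms, following the table above.
-- The LType representation is an illustration, not part of GFCC.
data LType
  = Param Int        -- parameter type with the given number of values
  | Record [LType]   -- record (fields in label-sorted order)
  | Table Int LType  -- table over a parameter type of the given size
  | Str
  deriving (Eq, Show)

data Enc = EInt Int | EArr [Enc] | EUnit
  deriving (Eq, Show)

encode :: LType -> Enc
encode t = case t of
  Param n   -> EInt (n - 1)                   -- max(P): largest 0-based value
  Record ts -> EArr (map encode ts)           -- [T1*,...,Tn*]
  Table n v -> EArr (replicate n (encode v))  -- size(P) copies of T*
  Str       -> EUnit                          -- ()
```

Encoding the ``NP`` type above as
``Record [Record [Param 2, Param 3], Table 3 Str]`` yields
``EArr [EArr [EInt 1, EInt 2], EArr [EUnit, EUnit, EUnit]]``, matching
``__NP = [[1,2],[(),(),()]]``.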
===Running the compiler and the GFCC interpreter===
GFCC generation is a part of the
[developers' version http://www.cs.chalmers.se/Cs/Research/Language-technology/darcs/GF/doc/darcs.html]
of GF since September 2006. To invoke the compiler, pass the flag
``-printer=gfcc`` to the command
``pm`` (= ``print_multi``). It is wise to recompile the grammar from
source, since previously compiled libraries may not obey the canonical
order of records.
Here is an example, performed in
[example/bronzeage ../../../../../examples/bronzeage].
```
i -src -path=.:prelude:resource-1.0/* -optimize=all_subs BronzeageEng.gf
i -src -path=.:prelude:resource-1.0/* -optimize=all_subs BronzeageGer.gf
strip
pm -printer=gfcc | wf bronze.gfcc
```
There is also an experimental batch compiler, which does not use the GFC
format or the record aliases. It can be produced by
```
make gfc
```
in ``GF/src``, and invoked by
```
gfc --make FILES
```
==The reference interpreter==
The reference interpreter written in Haskell consists of the following files:
```
-- source file for BNFC
GFCC.cf -- labelled BNF grammar of gfcc
-- files generated by BNFC
AbsGFCC.hs -- abstract syntax datatypes
ErrM.hs -- error monad used internally
LexGFCC.hs -- lexer of gfcc files
ParGFCC.hs -- parser of gfcc files and syntax trees
PrintGFCC.hs -- printer of gfcc files and syntax trees
-- hand-written files
DataGFCC.hs -- grammar datatype, post-parser grammar creation
Linearize.hs -- linearization and evaluation
Macros.hs -- utilities abstracting away from GFCC datatypes
Generate.hs -- random and exhaustive generation, generate-and-test parsing
API.hs -- functionalities accessible in embedded GF applications
Shell.hs -- main function - a simple command interpreter
```
It is included in the
[developers' version http://www.cs.chalmers.se/Cs/Research/Language-technology/darcs/GF/doc/darcs.html]
of GF, in the subdirectories [``GF/src/GF/GFCC`` ../] and
[``GF/src/GF/Devel`` ../../Devel].
As of September 2007, default parsing in main GF uses GFCC (implemented by Krasimir
Angelov). The interpreter uses the relevant modules
```
GF/Conversions/SimpleToFCFG.hs -- generate parser from GFCC
GF/Parsing/FCFG.hs -- run the parser
```
To compile the interpreter, type
```
make gfcc
```
in ``GF/src``. To run it, type
```
./gfcc <GFCC-file>
```
The available commands are
- ``gr <Cat> <Int>``: generate a number of random trees in the given category,
and show their linearizations in all languages
- ``grt <Cat> <Int>``: generate a number of random trees in the given category,
and show the trees and their linearizations in all languages
- ``gt <Cat> <Int>``: generate a number of trees in the given category, starting from the smallest,
and show their linearizations in all languages
- ``gtt <Cat> <Int>``: generate a number of trees in the given category, starting from the smallest,
and show the trees and their linearizations in all languages
- ``p <Lang> <Cat> <String>``: parse a string into a set of trees
- ``lin <Tree>``: linearize tree in all languages, also showing full records
- ``q``: terminate the system cleanly
==Embedded formats==
- JavaScript: compiler of linearization and abstract syntax
- Haskell: compiler of abstract syntax and interpreter with parsing,
linearization, and generation
- C: compiler of linearization (old GFCC)
- C++: embedded interpreter supporting linearization (old GFCC)
==Some things to do==
Support for dependent types, higher-order abstract syntax, and
semantic definition in GFCC generation and interpreters.
Replacing the entire GF shell by one based on GFCC.
Interpreter in Java.
Hand-written parsers for GFCC grammars to reduce code size
(and efficiency?) of interpreters.
Binary format and/or file compression of GFCC output.
Syntax editor based on GFCC.
Rewriting of resource libraries in order to exploit the
word-suffix sharing better (depth-one tables, as in FM).


@@ -0,0 +1,50 @@
Grm. Grammar ::= Header ";" Abstract ";" [Concrete] ;
Hdr. Header ::= "grammar" CId "(" [CId] ")" ;
Abs. Abstract ::= "abstract" "{" [AbsDef] "}" ;
Cnc. Concrete ::= "concrete" CId "{" [CncDef] "}" ;
Fun. AbsDef ::= CId ":" Type "=" Exp ;
--AFl. AbsDef ::= "%" CId "=" String ; -- flag
Lin. CncDef ::= CId "=" Term ;
--CFl. CncDef ::= "%" CId "=" String ; -- flag
Typ. Type ::= [CId] "->" CId ;
Tr. Exp ::= "(" Atom [Exp] ")" ;
AC. Atom ::= CId ;
AS. Atom ::= String ;
AI. Atom ::= Integer ;
AF. Atom ::= Double ;
AM. Atom ::= "?" ;
trA. Exp ::= Atom ;
define trA a = Tr a [] ;
R. Term ::= "[" [Term] "]" ; -- record/table
P. Term ::= "(" Term "!" Term ")" ; -- projection/selection
S. Term ::= "(" [Term] ")" ; -- sequence with ++
K. Term ::= Tokn ; -- token
V. Term ::= "$" Integer ; -- argument
C. Term ::= Integer ; -- parameter value/label
F. Term ::= CId ; -- global constant
FV. Term ::= "[|" [Term] "|]" ; -- free variation
W. Term ::= "(" String "+" Term ")" ; -- prefix + suffix table
RP. Term ::= "(" Term "@" Term ")"; -- record parameter alias
TM. Term ::= "?" ; -- lin of metavariable
L. Term ::= "(" CId "->" Term ")" ; -- lambda abstracted table
BV. Term ::= "#" CId ; -- lambda-bound variable
KS. Tokn ::= String ;
KP. Tokn ::= "[" "pre" [String] "[" [Variant] "]" "]" ;
Var. Variant ::= [String] "/" [String] ;
terminator Concrete ";" ;
terminator AbsDef ";" ;
terminator CncDef ";" ;
separator CId "," ;
separator Term "," ;
terminator Exp "" ;
terminator String "" ;
separator Variant "," ;
token CId (('_' | letter) (letter | digit | '\'' | '_')*) ;


@@ -0,0 +1,656 @@
The GFCC Grammar Format
Aarne Ranta
October 19, 2006
Author's address:
[``http://www.cs.chalmers.se/~aarne`` http://www.cs.chalmers.se/~aarne]
% to compile: txt2tags -thtml --toc gfcc.txt
History:
- 19 Oct: translation of lincats, new figures on C++
- 3 Oct 2006: first version
==What is GFCC==
GFCC is a low-level format for GF grammars. Its aim is to contain the minimum
that is needed to process GF grammars at runtime. This minimality has three
advantages:
- compact grammar files and run-time objects
- time and space efficient processing
- simple definition of interpreters
The idea is that all embedded GF applications are compiled to GFCC.
The GF system would be primarily used as a compiler and as a grammar
development tool.
Since GFCC is implemented in BNFC, a parser of the format is readily
available for C, C++, Haskell, Java, and OCaml. Also an XML
representation is generated in BNFC. A
[reference implementation ../]
of linearization and some other functions has been written in Haskell.
==GFCC vs. GFC==
GFCC is aimed to replace GFC as the run-time grammar format. GFC was designed
to be a run-time format, but also to
support separate compilation of grammars, i.e.
to store the results of compiling
individual GF modules. But this means that GFC has to contain extra information,
such as type annotations, which is only needed in compilation and not at
run-time. In particular, the pattern matching syntax and semantics of GFC is
complex and therefore difficult to implement in new platforms.
The main differences of GFCC compared with GFC can be summarized as follows:
- there are no modules, and therefore no qualified names
- a GFCC grammar is multilingual, and consists of a common abstract syntax
together with one concrete syntax per language
- records and tables are replaced by arrays
- record labels and parameter values are replaced by integers
- record projection and table selection are replaced by array indexing
- there is (so far) no support for dependent types or higher-order abstract
syntax (which would be easy to add, but make interpreters much more difficult
to write)
Here is an example of a GF grammar, consisting of three modules,
as translated to GFCC. The representations are aligned, with the exceptions
due to the alphabetical sorting of GFCC grammars.
```
grammar Ex(Eng,Swe);
abstract Ex = { abstract {
cat
S ; NP ; VP ;
fun
Pred : NP -> VP -> S ; Pred : NP,VP -> S = (Pred);
She, They : NP ; She : -> NP = (She);
Sleep : VP ; Sleep : -> VP = (Sleep);
They : -> NP = (They);
} } ;
concrete Eng of Ex = { concrete Eng {
lincat
S = {s : Str} ;
NP = {s : Str ; n : Num} ;
VP = {s : Num => Str} ;
param
Num = Sg | Pl ;
lin
Pred np vp = { Pred = [(($0!1),(($1!0)!($0!0)))];
s = np.s ++ vp.s ! np.n} ;
She = {s = "she" ; n = Sg} ; She = [0, "she"];
They = {s = "they" ; n = Pl} ;
Sleep = {s = table { Sleep = [("sleep" + ["s",""])];
Sg => "sleeps" ;
Pl => "sleep" They = [1, "they"];
} } ;
} ;
}
concrete Swe of Ex = { concrete Swe {
lincat
S = {s : Str} ;
NP = {s : Str} ;
VP = {s : Str} ;
param
Num = Sg | Pl ;
lin
Pred np vp = { Pred = [(($0!0),($1!0))];
s = np.s ++ vp.s} ;
She = {s = "hon"} ; She = ["hon"];
They = {s = "de"} ; They = ["de"];
Sleep = {s = "sover"} ; Sleep = ["sover"];
} } ;
```
==The syntax of GFCC files==
===Top level===
A grammar has a header telling the name of the abstract syntax
(often specifying an application domain), and the names of
the concrete languages. The abstract syntax and the concrete
syntaxes themselves follow.
```
Grammar ::= Header ";" Abstract ";" [Concrete] ;
Header ::= "grammar" CId "(" [CId] ")" ;
Abstract ::= "abstract" "{" [AbsDef] "}" ;
Concrete ::= "concrete" CId "{" [CncDef] "}" ;
```
Abstract syntax judgements give typings and semantic definitions.
Concrete syntax judgements give linearizations.
```
AbsDef ::= CId ":" Type "=" Exp ;
CncDef ::= CId "=" Term ;
```
Also flags are possible, local to each "module" (i.e. abstract and concretes).
```
AbsDef ::= "%" CId "=" String ;
CncDef ::= "%" CId "=" String ;
```
For the run-time system, the reference implementation in Haskell
uses a structure that gives efficient look-up:
```
data GFCC = GFCC {
absname :: CId ,
cncnames :: [CId] ,
abstract :: Abstr ,
concretes :: Map CId Concr
}
data Abstr = Abstr {
funs :: Map CId Type, -- find the type of a fun
cats :: Map CId [CId] -- find the funs giving a cat
}
type Concr = Map CId Term
```
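To illustrate, here is a minimal sketch of how such a look-up structure could be populated and queried for the example grammar. For brevity, ``CId`` is simplified to a plain string and a type to a pair of argument categories and a value category; these simplifications are illustrative and not the representation used in the reference implementation.

```haskell
import qualified Data.Map as Map

-- CId simplified to a plain string; a Type simplified to
-- (argument categories, value category). Both are illustrative
-- simplifications, not the reference representation.
type CId = String

-- the funs table of the example abstract syntax
funs :: Map.Map CId ([CId], CId)
funs = Map.fromList
  [ ("Pred",  (["NP","VP"], "S"))
  , ("She",   ([], "NP"))
  , ("They",  ([], "NP"))
  , ("Sleep", ([], "VP"))
  ]

-- find the type of a fun, as the Abstr structure is meant to support
lookType :: CId -> Maybe ([CId], CId)
lookType f = Map.lookup f funs

main :: IO ()
main = print (lookType "Pred")
```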
===Abstract syntax===
Types are first-order function types built from
category symbols. Syntax trees (``Exp``) are
rose trees with the head (``Atom``) either a function
constant, a metavariable, or a string, integer, or float
literal.
```
Type ::= [CId] "->" CId ;
Exp ::= "(" Atom [Exp] ")" ;
Atom ::= CId ; -- function constant
Atom ::= "?" ; -- metavariable
Atom ::= String ; -- string literal
Atom ::= Integer ; -- integer literal
Atom ::= Double ; -- float literal
```
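For example, the tree for "they sleep" in the example grammar above is written with every atom parenthesized together with its (possibly empty) argument list:
```
(Pred (They) (Sleep))
```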
===Concrete syntax===
Linearization terms (``Term``) are built as follows.
Constructor names are shown to make the later code
examples readable.
```
R. Term ::= "[" [Term] "]" ; -- array
P. Term ::= "(" Term "!" Term ")" ; -- access to indexed field
S. Term ::= "(" [Term] ")" ; -- sequence with ++
K. Term ::= Tokn ; -- token
V. Term ::= "$" Integer ; -- argument
C. Term ::= Integer ; -- array index
FV. Term ::= "[|" [Term] "|]" ; -- free variation
TM. Term ::= "?" ; -- linearization of metavariable
```
Tokens are strings or (maybe obsolescent) prefix-dependent
variant lists.
```
KS. Tokn ::= String ;
KP. Tokn ::= "[" "pre" [String] "[" [Variant] "]" "]" ;
Var. Variant ::= [String] "/" [String] ;
```
Three special forms of terms are introduced by the compiler
as optimizations. They can in principle be eliminated, but
their presence makes grammars much more compact. Their semantics
will be explained in a later section.
```
F. Term ::= CId ; -- global constant
W. Term ::= "(" String "+" Term ")" ; -- prefix + suffix table
RP. Term ::= "(" Term "@" Term ")"; -- record parameter alias
```
Identifiers are like ``Ident`` in GF and GFC, except that
the compiler produces constants prefixed with ``_`` in
the common subterm elimination optimization.
```
token CId (('_' | letter) (letter | digit | '\'' | '_')*) ;
```
==The semantics of concrete syntax terms==
===Linearization and realization===
The linearization algorithm is essentially the same as in
GFC: a tree is linearized by evaluating its linearization term
in the environment of the linearizations of the subtrees.
Literal atoms are linearized in the obvious way.
The function also needs to know the language (i.e. concrete syntax)
in which linearization is performed.
```
linExp :: GFCC -> CId -> Exp -> Term
linExp mcfg lang tree@(Tr at trees) = case at of
AC fun -> comp (Prelude.map lin trees) $ look fun
AS s -> R [kks (show s)] -- quoted
AI i -> R [kks (show i)]
AF d -> R [kks (show d)]
AM -> TM
where
lin = linExp mcfg lang
comp = compute mcfg lang
look = lookLin mcfg lang
```
The result of linearization is usually a record, which is realized as
a string using the following algorithm.
```
realize :: Term -> String
realize trm = case trm of
R (t:_) -> realize t
S ss -> unwords $ Prelude.map realize ss
K (KS s) -> s
K (KP s _) -> unwords s ---- prefix choice TODO
W s t -> s ++ realize t
FV (t:_) -> realize t
TM -> "?"
```
Since the order of record fields is not necessarily
the same as in GF source,
this realization does not work reliably for
categories whose lincats have more than one field.
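As a concrete illustration, here is a self-contained sketch of ``realize`` over a minimal ``Term`` type, restricted to the constructors needed for the example grammar; the full reference interpreter has more constructors and details.

```haskell
-- A minimal sketch of Term and realize: only the constructors
-- used in this example, not the full reference interpreter.
data Tokn = KS String | KP [String] [String]

data Term
  = R [Term]        -- array (record or table)
  | S [Term]        -- sequence joined with ++
  | K Tokn          -- token
  | W String Term   -- prefix + suffix table
  | FV [Term]       -- free variation
  | TM              -- metavariable

realize :: Term -> String
realize trm = case trm of
  R (t:_)    -> realize t            -- pick the first field of a record
  S ss       -> unwords (map realize ss)
  K (KS s)   -> s
  K (KP s _) -> unwords s            -- prefix variants not resolved here
  W s t      -> s ++ realize t       -- glue prefix to the chosen suffix
  FV (t:_)   -> realize t            -- pick the first variant
  TM         -> "?"

-- the English linearization of (Pred (She) (Sleep)) after term evaluation
predShe :: Term
predShe = R [S [K (KS "she"), K (KS "sleeps")]]

main :: IO ()
main = putStrLn (realize predShe)
```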
===Term evaluation===
Evaluation follows call-by-value order, with two environments
needed:
- the grammar (a concrete syntax) to give the global constants
- an array of terms to give the subtree linearizations
The code is presented in one-level pattern matching, to
enable reimplementations in languages that do not permit
deep patterns (such as Java and C++).
```
compute :: GFCC -> CId -> [Term] -> Term -> Term
compute mcfg lang args = comp where
comp trm = case trm of
P r p -> proj (comp r) (comp p)
RP i t -> RP (comp i) (comp t)
W s t -> W s (comp t)
R ts -> R $ Prelude.map comp ts
V i -> idx args (fromInteger i) -- already computed
F c -> comp $ look c -- not computed (if contains V)
FV ts -> FV $ Prelude.map comp ts
S ts -> S $ Prelude.filter (/= S []) $ Prelude.map comp ts
_ -> trm
look = lookLin mcfg lang
idx xs i = xs !! i
proj r p = case (r,p) of
(_, FV ts) -> FV $ Prelude.map (proj r) ts
(W s t, _) -> kks (s ++ getString (proj t p))
_ -> comp $ getField r (getIndex p)
getString t = case t of
K (KS s) -> s
_ -> trace ("ERROR in grammar compiler: string from "++ show t) "ERR"
getIndex t = case t of
C i -> fromInteger i
RP p _ -> getIndex p
TM -> 0 -- default value for parameter
_ -> trace ("ERROR in grammar compiler: index from " ++ show t) 0
getField t i = case t of
R rs -> idx rs i
RP _ r -> getField r i
TM -> TM
_ -> trace ("ERROR in grammar compiler: field from " ++ show t) t
```
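For example, the English ``Pred`` rule of the example grammar can be evaluated by hand as follows (``$0`` is the linearization of ``np``, ``$1`` that of ``vp``; field indices follow the alphabetical sorting of labels):
```
Pred = [(($0!1),(($1!0)!($0!0)))]
$0   = She   = [0, "she"]                -- field 0: n = Sg, field 1: s
$1   = Sleep = [("sleep" + ["s",""])]    -- field 0: the s table

($0!1)          = "she"                      -- the s field of She
($0!0)          = 0                          -- the n field of She, i.e. Sg
($1!0)          = ("sleep" + ["s",""])       -- the s table of Sleep
(($1!0)!($0!0)) = "sleep" + "s" = "sleeps"   -- select suffix number 0

result: [("she" "sleeps")]                   -- realized as "she sleeps"
```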
===The special term constructors===
The three forms introduced by the compiler may need
some special explanation.
Global constants
```
Term ::= CId ;
```
are shorthands for complex terms. They are produced by the
compiler by (iterated) common subexpression elimination.
They are often more powerful than hand-devised code sharing in the source
code. They could be computed off-line by replacing each identifier by
its definition.
Prefix-suffix tables
```
Term ::= "(" String "+" Term ")" ;
```
represent tables of word forms divided into the longest common prefix
and its array of suffixes. In the example grammar above, we have
```
Sleep = [("sleep" + ["s",""])]
```
which in fact is equal to the array of full forms
```
["sleeps", "sleep"]
```
The power of this construction comes from the fact that suffix sets
tend to be repeated in a language, and can therefore be collected
by common subexpression elimination. It is this technique that
explains why this syntax is used rather than the more accurate
```
"(" String "+" [String] ")"
```
since we want the suffix part to be a ``Term`` for the optimization to
take effect.
The most curious construct of GFCC is the parameter array alias,
```
Term ::= "(" Term "@" Term ")";
```
This form is used as the value of parameter records, such as the type
```
{n : Number ; p : Person}
```
The problem with parameter records is their double role.
They can be used like parameter values, as indices in selection,
```
VP.s ! {n = Sg ; p = P3}
```
but also as records, from which parameters can be projected:
```
{n = Sg ; p = P3}.n
```
Whichever use is selected as primary, a prohibitively complex
case expression must be generated at compilation to GFCC to get the
other use. The adopted
solution is to generate a pair containing both a parameter value index
and an array of indices of record fields. For instance, if we have
```
param Number = Sg | Pl ; Person = P1 | P2 | P3 ;
```
we get the encoding
```
{n = Sg ; p = P3} ---> (2 @ [0,2])
```
The GFCC computation rules are essentially
```
(t ! (i @ _)) = (t ! i)
((_ @ r) ! j) = (r ! j)
```
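Applied to the encoding above, these rules give:
```
(t ! (2 @ [0,2]))  =  (t ! 2)           -- use as a selection index
((2 @ [0,2]) ! 0)  =  ([0,2] ! 0) = 0   -- project the n field: Sg
```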
==Compiling to GFCC==
Compilation to GFCC is performed by the GF grammar compiler, and
GFCC interpreters need not know what it does. For grammar writers,
however, it might be interesting to know what happens to the grammars
in the process.
The compilation phases are the following
+ translate GF source to GFC, as always in GF
+ undo GFC back-end optimizations
+ perform the ``values`` optimization to normalize tables
+ create a symbol table mapping the GFC parameter and record types to
fixed-size arrays, and parameter values and record labels to integers
+ traverse the linearization rules replacing parameters and labels by integers
+ reorganize the created GFC grammar so that it has just one abstract syntax
and one concrete syntax per language
+ apply UTF8 encoding to the grammar, if not yet applied (this is told by the
``coding`` flag)
+ translate the GFC syntax tree to a GFCC syntax tree, using a simple
compositional mapping
+ perform the word-suffix optimization on GFCC linearization terms
+ perform subexpression elimination on each concrete syntax module
+ print out the GFCC code
Notice that a major part of the compilation is done within GFC, so that
GFC-related tasks (such as parser generation) could be performed by
using the old algorithms.
===Problems in GFCC compilation===
Two major problems had to be solved in compiling GFC to GFCC:
- consistent order of tables and records, to permit the array translation
- run-time variables in complex parameter values.
The current implementation is still experimental and may fail
to generate correct code. Any errors remaining are likely to be
related to the two problems just mentioned.
The order problem is solved in different ways for tables and records.
For tables, the ``values`` optimization of GFC already manages to
maintain a canonical order. But this order can be destroyed by the
``share`` optimization. To make sure that GFCC compilation works properly,
it is safest to recompile the GF grammar by using the ``values``
optimization flag.
Records can be canonically ordered by sorting them by labels.
In fact, this was done in connection with the GFCC work, as a part
of GFC generation, to guarantee consistency. This means that
e.g. the ``s`` field will in general no longer appear as the first
field, even if it does so in the GF source code. But relying on the
order of fields in a labelled record would be misplaced anyway.
The canonical form of records is further complicated by lock fields,
i.e. dummy fields of form ``lock_C = <>``, which are added to grammar
libraries to force intensionality of linearization types. The problem
is that the absence of a lock field only generates a warning, not
an error. Therefore a GFC grammar can contain objects of the same
type with and without a lock field. This problem was solved in GFCC
generation by just removing all lock fields (defined as fields whose
type is the empty record type). This has the further advantage of
(slightly) reducing the grammar size. More importantly, it is safe
to remove lock fields, because they are never used in computation,
and because intensional types are only needed in grammars reused
as libraries, not in grammars used at runtime.
While the order problem is rather bureaucratic in nature, run-time
variables are an interesting problem. They arise in the presence
of complex parameter values, created by argument-taking constructors
and parameter records. To give an example, consider the GF parameter
type system
```
Number = Sg | Pl ;
Person = P1 | P2 | P3 ;
Agr = Ag Number Person ;
```
The values can be translated to integers in the expected way,
```
Sg = 0, Pl = 1
P1 = 0, P2 = 1, P3 = 2
Ag Sg P1 = 0, Ag Sg P2 = 1, Ag Sg P3 = 2,
Ag Pl P1 = 3, Ag Pl P2 = 4, Ag Pl P3 = 5
```
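In general, this translation follows a positional (mixed-radix) scheme; a sketch of the rule, assuming the value ordering above:
```
index (Ag n p) = index(n) * size(Person) + index(p)
-- e.g. index (Ag Pl P3) = 1 * 3 + 2 = 5
```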
However, an argument of ``Agr`` can be a run-time variable, as in
```
Ag np.n P3
```
This expression must first be translated to a case expression,
```
case np.n of {
0 => 2 ;
1 => 5
}
```
which can then be translated to the GFCC term
```
([2,5] ! ($0 ! 1))
```
assuming that the variable ``np`` is the first argument and that its
``Number`` field is the second in the record.
This transformation of course has to be performed recursively, since
there can be several run-time variables in a parameter value:
```
Ag np.n np.p
```
A similar transformation would be possible to deal with the double
role of parameter records discussed above. Thus the type
```
RNP = {n : Number ; p : Person}
```
could be uniformly translated into the set ``{0,1,2,3,4,5}``
as ``Agr`` above. Selections would be simple instances of indexing.
But any projection from the record should be translated into
a case expression,
```
rnp.n ===>
case rnp of {
0 => 0 ;
1 => 0 ;
2 => 0 ;
3 => 1 ;
4 => 1 ;
5 => 1
}
```
To avoid the code bloat resulting from this, we chose the alias representation,
which is easy enough to deal with in interpreters.
===The representation of linearization types===
Linearization types (``lincat``) are not needed when generating with
GFCC, but they have been added to enable parser generation directly from
GFCC. The linearization type definitions are shown as a part of the
concrete syntax, by using terms to represent types. Here is the table
showing how different linearization types are encoded.
```
P* = size(P) -- parameter type
{_ : I ; __ : R}* = (I* @ R*) -- record of parameters
{r1 : T1 ; ... ; rn : Tn}* = [T1*,...,Tn*] -- other record
(P => T)* = [T* ,...,T*] -- size(P) times
Str* = ()
```
The category symbols are prefixed with two underscores (``__``).
For example, the linearization type ``present/CatEng.NP`` is
translated as follows:
```
NP = {
a : { -- 6 = 2*3 values
n : {ParamX.Number} ; -- 2 values
p : {ParamX.Person} -- 3 values
} ;
s : {ResEng.Case} => Str -- 3 values
}
__NP = [(6@[2,3]),[(),(),()]]
```
===Running the compiler and the GFCC interpreter===
GFCC generation is a part of the
[developers' version http://www.cs.chalmers.se/Cs/Research/Language-technology/darcs/GF/doc/darcs.html]
of GF since September 2006. To invoke the compiler, use the flag
``-printer=gfcc`` with the command ``pm`` (= ``print_multi``). It is wise
to recompile the grammar from source, since previously compiled libraries
may not obey the canonical order of records. Running ``strip`` on the
grammar before GFCC translation removes unnecessary interface references.
Here is an example, performed in
[example/bronzeage ../../../../../examples/bronzeage].
```
i -src -path=.:prelude:resource-1.0/* -optimize=all_subs BronzeageEng.gf
i -src -path=.:prelude:resource-1.0/* -optimize=all_subs BronzeageGer.gf
strip
pm -printer=gfcc | wf bronze.gfcc
```
==The reference interpreter==
The reference interpreter written in Haskell consists of the following files:
```
-- source file for BNFC
GFCC.cf -- labelled BNF grammar of gfcc
-- files generated by BNFC
AbsGFCC.hs -- abstract syntax of gfcc
ErrM.hs -- error monad used internally
LexGFCC.hs -- lexer of gfcc files
ParGFCC.hs -- parser of gfcc files and syntax trees
PrintGFCC.hs -- printer of gfcc files and syntax trees
-- hand-written files
DataGFCC.hs -- post-parser grammar creation, linearization and evaluation
GenGFCC.hs -- random and exhaustive generation, generate-and-test parsing
RunGFCC.hs -- main function - a simple command interpreter
```
It is included in the
[developers' version http://www.cs.chalmers.se/Cs/Research/Language-technology/darcs/GF/doc/darcs.html]
of GF, in the subdirectory [``GF/src/GF/Canon/GFCC`` ../].
To compile the interpreter, type
```
make gfcc
```
in ``GF/src``. To run it, type
```
./gfcc <GFCC-file>
```
The available commands are
- ``gr <Cat> <Int>``: generate a number of random trees in a category,
  and show their linearizations in all languages
- ``grt <Cat> <Int>``: generate a number of random trees in a category,
  and show the trees and their linearizations in all languages
- ``gt <Cat> <Int>``: generate a number of trees in a category, from smallest,
  and show their linearizations in all languages
- ``gtt <Cat> <Int>``: generate a number of trees in a category, from smallest,
  and show the trees and their linearizations in all languages
- ``p <Int> <Cat> <String>``: "parse", i.e. generate trees until a match is
  found or the given number of trees has been generated
- ``<Tree>``: linearize tree in all languages, also showing full records
- ``quit``: terminate the system cleanly
==Interpreter in C++==
A base-line interpreter in C++ has been started.
Its main functionality is random generation of trees and their linearization.
Here are some results from running the different interpreters, compared
to running the same grammar in GF, saved in ``.gfcm`` format.
The grammar contains the English, German, and Norwegian
versions of Bronzeage. The experiment was carried out on an
Ubuntu Linux laptop with a 1.5 GHz Intel Centrino processor.
|| | GF | gfcc(hs) | gfcc++ |
| program size | 7249k | 803k | 113k |
| grammar size | 336k | 119k | 119k |
| read grammar | 1150ms | 510ms | 100ms |
| generate 222 | 9500ms | 450ms | 800ms |
| memory | 21M | 10M | 20M |
To summarize:
- going from GF to gfcc is a major win in both code size and efficiency
- going from Haskell to C++ interpreter is not a win yet, because of a space
leak in the C++ version
==Some things to do==
Interpreter in Java.
Parsing via MCFG:
- the FCFG format can possibly be simplified
- parser grammars should be saved in files to make interpreters easier
Hand-written parsers for GFCC grammars, to reduce the code size
(and perhaps improve the efficiency) of interpreters.
Binary format and/or file compression of GFCC output.
Syntax editor based on GFCC.
Rewriting of resource libraries in order to exploit the
word-suffix sharing better (depth-one tables, as in FM).
GFCC Syntax
==Syntax of GFCC files==
The parser syntax is very simple, as defined in BNF:
```
Grm. Grammar ::= [RExp] ;
App. RExp ::= "(" CId [RExp] ")" ;
AId. RExp ::= CId ;
AInt. RExp ::= Integer ;
AStr. RExp ::= String ;
AFlt. RExp ::= Double ;
AMet. RExp ::= "?" ;
terminator RExp "" ;
token CId (('_' | letter) (letter | digit | '\'' | '_')*) ;
```
While a parser and a printer can be generated for many languages
from this grammar by using the BNF Converter, a parser is also
easy to write by hand using recursive descent.
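As an illustration of the hand-written approach, here is a minimal recursive-descent parser in Haskell for a fragment of this syntax: identifiers and applications only (integer, string, float, and metavariable atoms are omitted, and all names here are ad hoc, not taken from the GF sources).

```haskell
-- A hand-written recursive-descent parser for a fragment of the RExp
-- syntax. The lexer just separates parentheses from other tokens.
lexGFCC :: String -> [String]
lexGFCC = words . concatMap sep
  where sep c = if c `elem` "()" then [' ', c, ' '] else [c]

data RExp = App String [RExp] | AId String
  deriving (Eq, Show)

-- parse one RExp, returning it together with the remaining tokens
pRExp :: [String] -> Maybe (RExp, [String])
pRExp ("(":f:ts) = do
  (args, ts') <- pRExps ts
  return (App f args, ts')
pRExp (t:ts) | t `notElem` ["(", ")"] = Just (AId t, ts)
pRExp _ = Nothing

-- parse a list of RExps terminated by a closing parenthesis
pRExps :: [String] -> Maybe ([RExp], [String])
pRExps (")":ts) = Just ([], ts)
pRExps ts = do
  (e, ts')   <- pRExp ts
  (es, ts'') <- pRExps ts'
  return (e:es, ts'')

main :: IO ()
main = print (pRExp (lexGFCC "(Pred (She) (Sleep))"))
```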
==Syntax of well-formed GFCC code==
Here is a summary of well-formed syntax,
with a comment on the semantics of each construction.
```
Grammar ::=
("grammar" CId CId*) -- abstract syntax name and concrete syntax names
"(" "flags" Flag* ")" -- global and abstract flags
"(" "abstract" Abstract ")" -- abstract syntax
"(" "concrete" Concrete* ")" -- concrete syntaxes
Abstract ::=
"(" "fun" FunDef* ")" -- function definitions
"(" "cat" CatDef* ")" -- category definitions
Concrete ::=
"(" CId -- language name
"flags" Flag* -- concrete flags
"lin" LinDef* -- linearization rules
"oper" LinDef* -- operations (macros)
"lincat" LinDef* -- linearization type definitions
"lindef" LinDef* -- linearization default definitions
"printname" LinDef* -- printname definitions
"param" LinDef* -- lincats with labels and parameter value names
")"
Flag ::= "(" CId String ")" -- flag and value
FunDef ::= "(" CId Type Exp ")" -- function, type, and definition
CatDef ::= "(" CId Hypo* ")" -- category and context
LinDef ::= "(" CId Term ")" -- function and definition
Type ::=
"(" CId -- value category
"(" "H" Hypo* ")" -- argument context
"(" "X" Exp* ")" ")" -- arguments (of dependent value type)
Exp ::=
"(" CId -- function
"(" "B" CId* ")" -- bindings
"(" "X" Exp* ")" ")" -- arguments
| CId -- variable
| "?" -- metavariable
| "(" "Eq" Equation* ")" -- group of pattern equations
| Integer -- integer literal (non-negative)
| Float -- floating-point literal (non-negative)
| String -- string literal (in double quotes)
Hypo ::= "(" CId Type ")" -- variable and type
Equation ::= "(" "E" Exp Exp* ")" -- value and pattern list
Term ::=
"(" "R" Term* ")" -- array (record or table)
| "(" "S" Term* ")" -- concatenated sequence
| "(" "FV" Term* ")" -- free variant list
| "(" "P" Term Term ")" -- access to index (projection or selection)
| "(" "W" String Term ")" -- token prefix with suffix list
| "(" "A" Integer ")" -- pointer to subtree
| String -- token (in double quotes)
| Integer -- index in array
| CId -- macro constant
| "?" -- metavariable
```
==GFCC interpreter==
The first phase in interpreting GFCC is to parse a GFCC file and
build an internal abstract syntax representation, as specified
in the previous section.
With this representation, linearization can be performed by
a straightforward function from expressions (``Exp``) to terms
(``Term``). All expressions except groups of pattern equations
can be linearized.
Here is a reference Haskell implementation of linearization:
```
linExp :: GFCC -> CId -> Exp -> Term
linExp gfcc lang tree@(DTr _ at trees) = case at of
AC fun -> comp (map lin trees) $ look fun
AS s -> R [K (show s)] -- quoted
AI i -> R [K (show i)]
AF d -> R [K (show d)]
AM -> TM
where
lin = linExp gfcc lang
comp = compute gfcc lang
look = lookLin gfcc lang
```
TODO: bindings must be supported.
Terms resulting from linearization are evaluated in
call-by-value order, with two environments needed:
- the grammar (a concrete syntax) to give the global constants
- an array of terms to give the subtree linearizations
The Haskell implementation works as follows:
```
compute :: GFCC -> CId -> [Term] -> Term -> Term
compute gfcc lang args = comp where
comp trm = case trm of
P r p -> proj (comp r) (comp p)
W s t -> W s (comp t)
R ts -> R $ map comp ts
V i -> idx args (fromInteger i) -- already computed
F c -> comp $ look c -- not computed (if contains V)
FV ts -> FV $ Prelude.map comp ts
S ts -> S $ Prelude.filter (/= S []) $ Prelude.map comp ts
_ -> trm
look = lookOper gfcc lang
idx xs i = xs !! i
proj r p = case (r,p) of
(_, FV ts) -> FV $ Prelude.map (proj r) ts
(FV ts, _ ) -> FV $ Prelude.map (\t -> proj t p) ts
(W s t, _) -> kks (s ++ getString (proj t p))
_ -> comp $ getField r (getIndex p)
getString t = case t of
K (KS s) -> s
_ -> trace ("ERROR in grammar compiler: string from "++ show t) "ERR"
getIndex t = case t of
C i -> fromInteger i
RP p _ -> getIndex p
TM -> 0 -- default value for parameter
_ -> trace ("ERROR in grammar compiler: index from " ++ show t) 0
getField t i = case t of
R rs -> idx rs i
RP _ r -> getField r i
TM -> TM
_ -> trace ("ERROR in grammar compiler: field from " ++ show t) t
```
The result of linearization is usually a record, which is realized as
a string using the following algorithm.
```
realize :: Term -> String
realize trm = case trm of
R (t:_) -> realize t
S ss -> unwords $ map realize ss
K s -> s
W s t -> s ++ realize t
FV (t:_) -> realize t -- TODO: all variants
TM -> "?"
```
Notice that realization always picks the first field of a record.
If a linearization type has more than one field, the first field
does not necessarily contain the desired string.
Also notice that the order of record fields in GFCC is not necessarily
the same as in GF source.
deprecated/ReleaseProcedure
Procedure for making a GF release:
1. Make sure everything that should be in the release has been
checked in.
2. Go to the src/ dir.
$ cd src
3. Edit configure.ac to set the right version number
(the second argument to the AC_INIT macro).
4. Edit gf.spec to set the version and release numbers
(change %define version and %define release).
5. Commit configure.ac and gf.spec:
$ darcs record -m 'Updated version numbers.' configure.ac gf.spec
6. Run autoconf to generate configure with the right version number:
$ autoconf
7. Go back to the root of the tree.
$ cd ..
8. Tag the release. (X_X should be replaced by the version number, with
_ instead of ., e.g. 2_0)
$ darcs tag -m RELEASE-X_X
9. Push the changes that you made for the release to the main repo:
$ darcs push
10. Build a source package:
$ cd src
$ ./configure
$ make dist
11. (Only if releasing a new grammars distribution)
Build a grammar tarball:
$ cd src
$ ./configure && make grammar-dist
12. Build an x86/linux RPM (should be done on a Mandrake Linux box):
Setup for building RPMs (first time only):
- Make sure that you have the directories necessary to build
RPMs:
$ mkdir -p ~/rpm/{BUILD,RPMS/i586,RPMS/noarch,SOURCES,SRPMS,SPECS,tmp}
- Create ~/.rpmrc with the following contents:
buildarchtranslate: i386: i586
buildarchtranslate: i486: i586
buildarchtranslate: i586: i586
buildarchtranslate: i686: i586
- Create ~/.rpmmacros with the following contents:
%_topdir %(echo ${HOME}/rpm)
%_tmppath %{_topdir}/tmp
%packager Your Name <yourusername@cs.chalmers.se>
Build the RPM:
$ cd src
$ ./configure && make rpm
13. Build a generic binary x86/linux package (should be done on a Linux box,
e.g. banded.medic.chalmers.se):
$ cd src
$ ./configure --host=i386-pc-linux-gnu && make binary-dist
14. Build a generic binary sparc/solaris package (should be done
on a Solaris box, e.g. remote1.cs.chalmers.se):
$ cd src
$ ./configure --host=sparc-sun-solaris2 && gmake binary-dist
15. Build a Mac OS X package (should be done on a Mac OS X box,
e.g. csmisc99.cs.chalmers.se):
$ cd src
$ ./configure && make binary-dist
Note that to run GHC-compiled binaries on OS X, you need
a "Haskell Support Framework". This should be available
separately from the GF download page.
TODO: Use OS X PackageMaker to build a .pkg-file which can
be installed using the standard OS X Installer program.
16. Build a binary Cygwin package (should be done on a Windows
machine with Cygwin):
$ cd src
$ ./configure && make binary-dist
17. Build a Windows MSI package (FIXME: This doesn't work right,
pathnames with backslashes and spaces are not handled
correctly in Windows. We only release a binary tarball
for Cygwin right now.):
$ cd src
$ ./configure && make all windows-msi
18. Add new GF package release to SourceForge:
- https://sourceforge.net/projects/gf-tools
- Project page -> Admin -> File releases -> Add release (for the
GF package)
- New release name: X.X (just the version number, e.g. 2.2)
- Paste in release notes
- Upload files using anonymous FTP to upload.sourceforge.net
in the incoming directory.
- Add the files to the release and set the processor
and file type for each file (remember to press
Update/Refresh for each file):
* x86 rpm -> i386/.rpm
* source rpm -> Any/Source .rpm
* x86 binary tarball -> i386/.gz
* sparc binary tarball -> Sparc/.gz
* source package -> Any/Source .gz
19. Add new GF-editor release. Repeat the steps above, but
with GF-editor:
- Add files and set properties:
* editor rpm -> i386/.rpm (not really true, but I haven't
figured out how to make noarch rpms from the same spec as
arch-specific ones)
20. Mail to gf-tools-users@lists.sourceforge.net
21. Update website.
22. Party!
deprecated/config.guess (file diff suppressed because it is too large)
deprecated/config.mk.in
# GF configuration file. configure will produce config.mk from this file
# @configure_input@
PACKAGE_VERSION = @PACKAGE_VERSION@
prefix = @prefix@
exec_prefix = @exec_prefix@
bindir = @bindir@
libdir = @libdir@
datadir = @datadir@
host = @host@
build = @build@
GHCFLAGS = @GHCFLAGS@
CPPFLAGS = @CPPFLAGS@
LDFLAGS = @LDFLAGS@
EXEEXT = @EXEEXT@
INSTALL = @INSTALL@
TAR = @TAR@
GHC = "@GHC@"
GHCI = "@GHCI@"
READLINE = @READLINE@
INTERRUPT = @INTERRUPT@
ATK = @ATK@
ENABLE_JAVA = @ENABLE_JAVA@
JAVAC = "@JAVAC@"
JAR = "@JAR@"
deprecated/config.sub (file diff suppressed because it is too large)
deprecated/configure.ac
dnl Run autoconf to generate configure from this file
AC_INIT([GF],[3.0-beta3],[aarne@cs.chalmers.se],[GF])
AC_PREREQ(2.53)
AC_REVISION($Revision: 1.26 $)
AC_CONFIG_FILES([config.mk gfc])
AC_CANONICAL_HOST
dnl ***********************************************
dnl Executable suffix
dnl ***********************************************
AC_MSG_CHECKING([executable suffix])
case $host_os in
cygwin)
EXEEXT='.exe';;
*)
EXEEXT='';;
esac
AC_MSG_RESULT(['$EXEEXT'])
AC_SUBST(EXEEXT)
dnl ***********************************************
dnl GHC
dnl ***********************************************
AC_ARG_WITH(ghc,
AC_HELP_STRING([--with-ghc=<ghc command>],
[Use a different command instead of
'ghc' for the Haskell compiler.]),
[AC_CHECK_FILE("$withval",GHC="$withval",[AC_PATH_PROG(GHC,"$withval")])],
[AC_PATH_PROG(GHC,ghc)])
GHCI=$(dirname $GHC)/ghci
GHC_VERSION=`$GHC --version | sed -e 's/.*version //'`
AC_MSG_CHECKING([GHC version])
AC_MSG_RESULT($GHC_VERSION)
AC_SUBST(GHC)
AC_SUBST(GHCI)
dnl ***********************************************
dnl readline
dnl ***********************************************
AC_ARG_WITH(readline,
AC_HELP_STRING([--with-readline=<readline alternative>],
[Select which readline implementation to use.
Available alternatives are: 'readline' (GNU readline),
'no' (don't use readline)
(default = readline)]),
[if test "$withval" = "yes"; then
READLINE="readline"
else
READLINE="$withval"
fi],
[if test "$host_os" = "cygwin"; then
AC_MSG_WARN([There are problems with readline for Windows,
for example, pipe characters do not work.
Disabling readline support.
Use --with-readline to override.])
READLINE="no"
else
READLINE="readline"
fi])
case $READLINE in
readline)
;;
no)
;;
*)
AC_MSG_ERROR([Bad value for --with-readline: $READLINE])
;;
esac
AC_SUBST(READLINE)
dnl ***********************************************
dnl command interruption
dnl ***********************************************
AC_ARG_WITH(interrupt,
AC_HELP_STRING([--with-interrupt=<allow command interruption>],
[Choose whether to enable interruption of commands
with SIGINT (Ctrl-C)
Available alternatives are: 'yes', 'no'
(default = yes)]),
[INTERRUPT="$withval"],
[if test "$host_os" = "cygwin"; then
AC_MSG_WARN([Command interruption does not work under
Cygwin, because of missing signal handler support.
Disabling command interruption support.
Use --with-interrupt to override.])
INTERRUPT="no"
else
INTERRUPT="yes"
fi])
case $INTERRUPT in
yes)
;;
no)
;;
*)
AC_MSG_ERROR([Bad value for --with-interrupt: $INTERRUPT])
;;
esac
AC_SUBST(INTERRUPT)
dnl ***********************************************
dnl ATK speech recognition
dnl ***********************************************
AC_ARG_WITH(atk,
AC_HELP_STRING([--with-atk=<use ATK speech recognition>],
[Choose whether to compile in support for speech
recognition using ATK. Requires ATK and libatkrec.
Available alternatives are: 'yes', 'no'
(default = no)]),
[ATK="$withval"],
[ATK="no"])
case $ATK in
yes)
AC_MSG_CHECKING([for atkrec package])
ATKREC_VERSION=`ghc-pkg field atkrec version`
if test "$ATKREC_VERSION" = ""; then
AC_MSG_RESULT(['not found'])
AC_MSG_WARN([Disabling ATK support.])
ATK="no"
else
AC_MSG_RESULT([$ATKREC_VERSION])
fi
;;
no)
;;
*)
AC_MSG_ERROR([Bad value for --with-atk: $ATK])
;;
esac
AC_SUBST(ATK)
dnl ***********************************************
dnl java stuff
dnl ***********************************************
AC_ARG_ENABLE(java,
AC_HELP_STRING([--enable-java],
[Build Java components. (default = yes)]),
[ENABLE_JAVA="$enableval"],
[ENABLE_JAVA=yes]
)
if test "$ENABLE_JAVA" = "yes"; then
AC_ARG_WITH(javac,
AC_HELP_STRING([--with-javac=<javac command>],
[Use a different command instead of
'javac' for the Java compiler.]),
[AC_CHECK_FILE("$withval",JAVAC="$withval",[AC_PATH_PROG(JAVAC,"$withval")])],
[AC_PATH_PROG(JAVAC,javac)])
AC_SUBST(JAVAC)
AC_ARG_WITH(java,
AC_HELP_STRING([--with-java=<java command>],
[Use a different command instead of
'java' for the Java Virtual Machine.]),
[AC_CHECK_FILE("$withval",JAVA="$withval",[AC_PATH_PROG(JAVA,"$withval")])],
[AC_PATH_PROG(JAVA,java)])
AC_SUBST(JAVA)
AC_ARG_WITH(jar,
AC_HELP_STRING([--with-jar=<jar command>],
[Use a different command instead of
'jar' for the Java archive tool.]),
[AC_CHECK_FILE("$withval",JAR="$withval",[AC_PATH_PROG(JAR,"$withval")])],
[AC_PATH_PROG(JAR,jar)])
AC_SUBST(JAR)
if test "$JAVAC" = "" || test ! -x "$JAVAC" \
|| test "$JAVA" = "" || test ! -x "$JAVA" \
|| test "$JAR" = "" || test ! -x "$JAR"; then
AC_MSG_WARN([Not building Java components.])
ENABLE_JAVA=no
fi
fi
AC_SUBST(ENABLE_JAVA)
dnl ***********************************************
dnl TAR
dnl ***********************************************
AC_CHECK_PROGS(TAR, gtar tar)
dnl ***********************************************
dnl Other programs
dnl ***********************************************
AC_PROG_INSTALL
dnl ***********************************************
dnl Program flags
dnl ***********************************************
AC_SUBST(GHCFLAGS)
AC_SUBST(CPPFLAGS)
AC_SUBST(LDFLAGS)
dnl ***********************************************
dnl Output
dnl ***********************************************
AC_OUTPUT
module Main where
import PGF.Editor
import PGF
import Data.Char
import System (getArgs)
-- a rough editor shell using the PGF.Editor API
-- compile:
-- cd .. ; ghc --make exper/EditShell.hs
-- use:
-- EditShell file.pgf
main = do
putStrLn "Hi, I'm the Editor! Type h for help on commands."
file:_ <- getArgs
pgf <- readPGF file
let dict = pgf2dict pgf
let st0 = new (startCat pgf)
let lang = head (languages pgf) ---- for printnames; enable choosing lang
editLoop pgf dict lang st0 -- alt 1: all editing commands
-- dialogueLoop pgf dict lang st0 -- alt 2: just refinement by parsing (see bottom)
editLoop :: PGF -> Dict -> Language -> State -> IO State
editLoop pgf dict lang st = do
putStrLn $
if null (allMetas st)
then unlines
(["The tree is complete:",prState st] ++ linearizeAll pgf (stateTree st))
else if isMetaFocus st
then "I want something of type " ++ showType (focusType st) ++
" (0 - " ++ show (length (refineMenu dict st)-1) ++ ")"
else "Do you want to change this node?"
c <- getLine
st' <- interpret pgf dict st c
editLoop pgf dict lang st'
interpret :: PGF -> Dict -> State -> String -> IO State
interpret pgf dict st c = case words c of
"r":f:_ -> do
let st' = goNextMeta (refine dict (mkCId f) st)
prLState pgf st'
return st'
"p":ws -> do
let tts = parseAll pgf (focusType st) (dropWhile (not . isSpace) c)
st' <- selectReplace dict (concat tts) st
prLState pgf st'
return st'
"a":_ -> do
t:_ <- generateRandom pgf (focusType st)
let st' = goNextMeta (replace dict t st)
prLState pgf st'
return st'
"d":_ -> do
let st' = delete st
prLState pgf st'
return st'
"m":_ -> do
putStrLn (unwords (map prCId (refineMenu dict st)))
return st
d : _ | all isDigit d -> do
let f = refineMenu dict st !! read d
let st' = goNextMeta (refine dict f st)
prLState pgf st'
return st'
p@('[':_):_ -> do
let st' = goPosition (mkPosition (read p)) st
prLState pgf st'
return st'
">":_ -> do
let st' = goNext st
prLState pgf st'
return st'
"x":_ -> do
mapM_ putStrLn [show (showPosition p) ++ showType t | (p,t) <- allMetas st]
return st
"h":_ -> putStrLn commandHelp >> return st
_ -> do
putStrLn "command not understood"
return st
prLState pgf st = do
let t = stateTree st
putStrLn (unlines ([
"Now I have:","",
prState st] ++
linearizeAll pgf t))
-- prompt the user to select from a list of trees, e.g. after an ambiguous parse
selectReplace :: Dict -> [Tree] -> State -> IO State
selectReplace dict ts st = case ts of
[] -> putStrLn "no results" >> return st
[t] -> return $ goNextMeta $ replace dict t st
_ -> do
mapM_ putStrLn $ "choose tree by entering its number:" :
[show i ++ " : " ++ showTree t | (i,t) <- zip [0..] ts]
d <- getLine
let t = ts !! read d
return $ goNextMeta $ replace dict t st
commandHelp = unlines [
"a -- refine with a random subtree",
"d -- delete current subtree",
"h -- display this help message",
"m -- show refinement menu",
"p Anything -- parse Anything and refine with it",
"r Function -- refine with Function",
"x -- show all unknown positions and their types",
"4 -- refine with 4th item from menu (see m)",
"[1,2,3] -- go to position 1,2,3",
"> -- go to next node"
]
----------------
-- for a dialogue system working by parsing alone; the questions shown are category printnames
----------------
dialogueLoop :: PGF -> Dict -> Language -> State -> IO State
dialogueLoop pgf dict lang st = do
putStrLn $
if null (allMetas st)
then "Ready!\n " ++ unlines (linearizeAll pgf (stateTree st))
else if isMetaFocus st
then showPrintName pgf lang (focusType st)
else "Do you want to change this node?"
c <- getLine
st' <- interpretD pgf dict st c
dialogueLoop pgf dict lang st'
interpretD :: PGF -> Dict -> State -> String -> IO State
interpretD pgf dict st c = do
let tts = parseAll pgf (focusType st) c
st' <- selectReplace dict (concat tts) st
-- prLState pgf st'
return st'


@@ -0,0 +1,461 @@
----------------------------------------------------------------------
-- |
-- Module : Evaluate
-- Maintainer : AR
-- Stability : (stable)
-- Portability : (portable)
--
-- > CVS $Date: 2005/11/01 15:39:12 $
-- > CVS $Author: aarne $
-- > CVS $Revision: 1.19 $
--
-- Computation of source terms. Used in compilation and in @cc@ command.
-----------------------------------------------------------------------------
module GF.Compile.Evaluate (appEvalConcrete) where
import GF.Data.Operations
import GF.Grammar.Grammar
import GF.Infra.Ident
import GF.Data.Str
import GF.Grammar.PrGrammar
import GF.Infra.Modules
import GF.Infra.Option
import GF.Grammar.Macros
import GF.Grammar.Lookup
import GF.Grammar.Refresh
import GF.Grammar.PatternMatch
import GF.Grammar.Lockfield (isLockLabel) ----
import GF.Grammar.AppPredefined
import qualified Data.Map as Map
import Data.List (nub,intersperse)
import Control.Monad (liftM2, liftM)
import Debug.Trace
data EEnv = EEnv {
computd :: Map.Map (Ident,Ident) FTerm,
temp :: Int
}
emptyEEnv = EEnv Map.empty 0
lookupComputed :: (Ident,Ident) -> STM EEnv (Maybe FTerm)
lookupComputed mc = do
env <- readSTM
return $ Map.lookup mc $ computd env
updateComputed :: (Ident,Ident) -> FTerm -> STM EEnv ()
updateComputed mc t = updateSTM (\e -> e{computd = Map.insert mc t (computd e)})
getTemp :: STM EEnv Ident
getTemp = do
env <- readSTM
updateSTM (\e -> e{temp = temp e + 1})
return $ identC ("#" ++ show (temp env))
data FTerm =
FTC Term
| FTF (Term -> FTerm)
prFTerm :: Integer -> FTerm -> String
prFTerm i t = case t of
FTC t -> prt t
FTF f -> show i +++ "->" +++ prFTerm (i + 1) (f (EInt i))
term2fterm t = case t of
Abs x b -> FTF (\t -> term2fterm (subst [(x,t)] b))
_ -> FTC t
traceFTerm c ft = ft ----trace ("\n" ++ prt c +++ "=" +++ take 60 (prFTerm 0 ft)) ft
fterm2term :: FTerm -> STM EEnv Term
fterm2term t = case t of
FTC t -> return t
FTF f -> do
x <- getTemp
b <- fterm2term $ f (Vr x)
return $ Abs x b
subst g t = case t of
Vr x -> maybe t id $ lookup x g
_ -> composSafeOp (subst g) t
appFTerm :: FTerm -> [Term] -> FTerm
appFTerm ft ts = case (ft,ts) of
(FTF f, x:xs) -> appFTerm (f x) xs
_ -> ft
{-
(FTC _, []) -> ft
(FTC f, [a]) -> case appPredefined (App f a) of
Ok (t,_) -> FTC t
_ -> error $ "error: appFTerm" +++ prFTerm 0 ft +++ unwords (map prt ts)
_ -> error $ "error: appFTerm" +++ prFTerm 0 ft +++ unwords (map prt ts)
-}
apps :: Term -> (Term,[Term])
apps t = case t of
App f a -> (f',xs ++ [a]) where (f',xs) = apps f
_ -> (t,[])
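The `FTC`/`FTF` split above represents term-level functions as native Haskell functions, so `appFTerm` can beta-reduce an application by just calling the function instead of substituting under an `Abs` node. A minimal self-contained sketch of the idea (the toy types and names here are invented for illustration, not GF's actual definitions):

```haskell
-- Toy model of the FTC/FTF idea: constant terms vs. Haskell-level functions.
data T = Con String | Ap T T deriving (Eq, Show)

data FT = FC T          -- an ordinary (constant) term
        | FF (T -> FT)  -- a function, represented as a Haskell function

-- Apply a functional term to arguments: each FF absorbs one argument,
-- with no explicit substitution machinery.
appFT :: FT -> [T] -> FT
appFT (FF f) (x:xs) = appFT (f x) xs
appFT ft     _      = ft

unFT :: FT -> T
unFT (FC t) = t
unFT (FF _) = error "unFT: unapplied function"

-- Example: a two-place constructor wrapped as a functional term.
pairF :: FT
pairF = FF (\a -> FF (\b -> FC (Ap (Ap (Con "pair") a) b)))
```

Applying `pairF` to two arguments yields the fully applied constant term directly, which is the saving `term2fterm`/`appFTerm` exploit when the same resource operation is applied many times.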
appEvalConcrete gr bt = liftM fst $ appSTM (evalConcrete gr bt) emptyEEnv
evalConcrete :: SourceGrammar -> BinTree Ident Info -> STM EEnv (BinTree Ident Info)
evalConcrete gr mo = mapMTree evaldef mo where
evaldef (f,info) = case info of
CncFun (mt@(Just (_,ty@(cont,val)))) pde ppr ->
evalIn ("\nerror in linearization of function" +++ prt f +++ ":") $
do
pde' <- case pde of
Yes de -> do
liftM yes $ pEval ty de
_ -> return pde
--- ppr' <- liftM yes $ evalPrintname gr c ppr pde'
return $ (f, CncFun mt pde' ppr) -- only cat in type actually needed
_ -> return (f,info)
pEval (context,val) trm = do ---- errIn ("parteval" +++ prt_ trm) $ do
let
vars = map fst context
args = map Vr vars
subst = [(v, Vr v) | v <- vars]
trm1 = mkApp trm args
trm3 <- recordExpand val trm1 >>= comp subst
return $ mkAbs vars trm3
recordExpand typ trm = case unComputed typ of
RecType tys -> case trm of
FV rs -> return $ FV [R [assign lab (P r lab) | (lab,_) <- tys] | r <- rs]
_ -> return $ R [assign lab (P trm lab) | (lab,_) <- tys]
_ -> return trm
comp g t = case t of
Q (IC "Predef") _ -> trace ("\nPredef:\n" ++ prt t) $ return t
Q p c -> do
md <- lookupComputed (p,c)
case md of
Nothing -> do
d <- lookRes (p,c)
updateComputed (p,c) $ traceFTerm c $ term2fterm d
return d
Just d -> fterm2term d >>= comp g
App f a -> case apps t of
(h@(Q p c),xs) | p == IC "Predef" -> do
xs' <- mapM (comp g) xs
(t',b) <- stmErr $ appPredefined (foldl App h xs')
if b then return t' else comp g t'
(h@(Q p c),xs) -> do
xs' <- mapM (comp g) xs
md <- lookupComputed (p,c)
case md of
Just ft -> do
t <- fterm2term $ appFTerm ft xs'
comp g t
Nothing -> do
d <- lookRes (p,c)
let ft = traceFTerm c $ term2fterm d
updateComputed (p,c) ft
t' <- fterm2term $ appFTerm ft xs'
comp g t'
_ -> do
f' <- comp g f
a' <- comp g a
case (f',a') of
(Abs x b,_) -> comp (ext x a' g) b
(QC _ _,_) -> returnC $ App f' a'
(FV fs, _) -> mapM (\c -> comp g (App c a')) fs >>= return . variants
(_, FV as) -> mapM (\c -> comp g (App f' c)) as >>= return . variants
(Alias _ _ d, _) -> comp g (App d a')
(S (T i cs) e,_) -> prawitz g i (flip App a') cs e
_ -> do
(t',b) <- stmErr $ appPredefined (App f' a')
if b then return t' else comp g t'
Vr x -> do
t' <- maybe (prtRaise (
"context" +++ show g +++ ": no value given to variable") x) return $ lookup x g
case t' of
_ | t == t' -> return t
_ -> comp g t'
Abs x b -> do
b' <- comp (ext x (Vr x) g) b
return $ Abs x b'
Let (x,(_,a)) b -> do
a' <- comp g a
comp (ext x a' g) b
Prod x a b -> do
a' <- comp g a
b' <- comp (ext x (Vr x) g) b
return $ Prod x a' b'
P t l | isLockLabel l -> return $ R []
---- a workaround 18/2/2005: take this away and find the reason
---- why earlier compilation destroys the lock field
P t l -> do
t' <- comp g t
case t' of
FV rs -> mapM (\c -> comp g (P c l)) rs >>= returnC . variants
R r -> maybe
(prtRaise (prt t' ++ ": no value for label") l) (comp g . snd) $
lookup l r
ExtR a (R b) -> case lookup l b of ----comp g (P (R b) l) of
Just (_,v) -> comp g v
_ -> comp g (P a l)
S (T i cs) e -> prawitz g i (flip P l) cs e
_ -> returnC $ P t' l
S t@(T _ cc) v -> do
v' <- comp g v
case v' of
FV vs -> do
ts' <- mapM (comp g . S t) vs
return $ variants ts'
_ -> case matchPattern cc v' of
Ok (c,g') -> comp (g' ++ g) c
_ | isCan v' -> prtRaise ("missing case" +++ prt v' +++ "in") t
_ -> do
t' <- comp g t
return $ S t' v' -- if v' is not canonical
S t v -> do
t' <- comp g t
v' <- comp g v
case t' of
T _ [(PV IW,c)] -> comp g c --- an optimization
T _ [(PT _ (PV IW),c)] -> comp g c
T _ [(PV z,c)] -> comp (ext z v' g) c --- another optimization
T _ [(PT _ (PV z),c)] -> comp (ext z v' g) c
FV ccs -> mapM (\c -> comp g (S c v')) ccs >>= returnC . variants
V ptyp ts -> do
vs <- stmErr $ allParamValues gr ptyp
ps <- stmErr $ mapM term2patt vs
let cc = zip ps ts
case v' of
FV vs -> mapM (\c -> comp g (S t' c)) vs >>= returnC . variants
_ -> case matchPattern cc v' of
Ok (c,g') -> comp (g' ++ g) c
_ | isCan v' -> prtRaise ("missing case" +++ prt v' +++ "in") t
_ -> return $ S t' v' -- if v' is not canonical
T _ cc -> case v' of
FV vs -> mapM (\c -> comp g (S t' c)) vs >>= returnC . variants
_ -> case matchPattern cc v' of
Ok (c,g') -> comp (g' ++ g) c
_ | isCan v' -> prtRaise ("missing case" +++ prt v' +++ "in") t
_ -> return $ S t' v' -- if v' is not canonical
Alias _ _ d -> comp g (S d v')
S (T i cs) e -> prawitz g i (flip S v') cs e
_ -> returnC $ S t' v'
-- normalize away empty tokens
K "" -> return Empty
-- glue if you can
Glue x0 y0 -> do
x <- comp g x0
y <- comp g y0
case (x,y) of
(Alias _ _ d, y) -> comp g $ Glue d y
(x, Alias _ _ d) -> comp g $ Glue x d
(S (T i cs) e, s) -> prawitz g i (flip Glue s) cs e
(s, S (T i cs) e) -> prawitz g i (Glue s) cs e
(_,Empty) -> return x
(Empty,_) -> return y
(K a, K b) -> return $ K (a ++ b)
(_, Alts (d,vs)) -> do
---- (K a, Alts (d,vs)) -> do
let glx = Glue x
comp g $ Alts (glx d, [(glx v,c) | (v,c) <- vs])
(Alts _, ka) -> checks [do
y' <- stmErr $ strsFromTerm ka
---- (Alts _, K a) -> checks [do
x' <- stmErr $ strsFromTerm x -- this may fail when compiling opers
return $ variants [
foldr1 C (map K (str2strings (glueStr v u))) | v <- x', u <- y']
---- foldr1 C (map K (str2strings (glueStr v (str a)))) | v <- x']
,return $ Glue x y
]
(FV ks,_) -> do
kys <- mapM (comp g . flip Glue y) ks
return $ variants kys
(_,FV ks) -> do
xks <- mapM (comp g . Glue x) ks
return $ variants xks
_ -> do
mapM_ checkNoArgVars [x,y]
r <- composOp (comp g) t
returnC r
Alts _ -> do
r <- composOp (comp g) t
returnC r
-- remove empty
C a b -> do
a' <- comp g a
b' <- comp g b
case (a',b') of
(Alts _, K a) -> checks [do
as <- stmErr $ strsFromTerm a' -- this may fail when compiling opers
return $ variants [
foldr1 C (map K (str2strings (plusStr v (str a)))) | v <- as]
,
return $ C a' b'
]
(Empty,_) -> returnC b'
(_,Empty) -> returnC a'
_ -> returnC $ C a' b'
-- reduce free variation as much as you can
FV ts -> mapM (comp g) ts >>= returnC . variants
-- merge record extensions if you can
ExtR r s -> do
r' <- comp g r
s' <- comp g s
case (r',s') of
(Alias _ _ d, _) -> comp g $ ExtR d s'
        (_, Alias _ _ d) -> comp g $ ExtR r' d
(R rs, R ss) -> stmErr $ plusRecord r' s'
(RecType rs, RecType ss) -> stmErr $ plusRecType r' s'
_ -> return $ ExtR r' s'
-- case-expand tables
-- if already expanded, don't expand again
T i@(TComp _) cs -> do
-- if there are no variables, don't even go inside
cs' <- if (null g) then return cs else mapPairsM (comp g) cs
return $ T i cs'
--- this means some extra work; should implement TSh directly
TSh i cs -> comp g $ T i [(p,v) | (ps,v) <- cs, p <- ps]
T i cs -> do
pty0 <- stmErr $ getTableType i
ptyp <- comp g pty0
case allParamValues gr ptyp of
Ok vs -> do
cs' <- mapM (compBranchOpt g) cs
sts <- stmErr $ mapM (matchPattern cs') vs
ts <- mapM (\ (c,g') -> comp (g' ++ g) c) sts
ps <- stmErr $ mapM term2patt vs
let ps' = ps --- PT ptyp (head ps) : tail ps
return $ --- V ptyp ts -- to save space, just course of values
T (TComp ptyp) (zip ps' ts)
_ -> do
cs' <- mapM (compBranch g) cs
return $ T i cs' -- happens with variable types
-- otherwise go ahead
_ -> composOp (comp g) t >>= returnC
lookRes (p,c) = case lookupResDefKind gr p c of
Ok (t,_) | noExpand p -> return t
Ok (t,0) -> comp [] t
Ok (t,_) -> return t
Bad s -> raise s
noExpand p = errVal False $ do
mo <- lookupModule gr p
return $ case getOptVal (iOpts (flags mo)) useOptimizer of
Just "noexpand" -> True
_ -> False
prtRaise s t = raise (s +++ prt t)
ext x a g = (x,a):g
returnC = return --- . computed
variants ts = case nub ts of
[t] -> t
ts -> FV ts
isCan v = case v of
Con _ -> True
QC _ _ -> True
App f a -> isCan f && isCan a
R rs -> all (isCan . snd . snd) rs
_ -> False
compBranch g (p,v) = do
let g' = contP p ++ g
v' <- comp g' v
return (p,v')
compBranchOpt g c@(p,v) = case contP p of
[] -> return c
_ -> compBranch g c
---- _ -> err (const (return c)) return $ compBranch g c
contP p = case p of
PV x -> [(x,Vr x)]
PC _ ps -> concatMap contP ps
PP _ _ ps -> concatMap contP ps
PT _ p -> contP p
PR rs -> concatMap (contP . snd) rs
PAs x p -> (x,Vr x) : contP p
PSeq p q -> concatMap contP [p,q]
PAlt p q -> concatMap contP [p,q]
PRep p -> contP p
PNeg p -> contP p
_ -> []
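`contP` above walks a pattern and collects the variables it binds, so that branch bodies can be computed in a context where each pattern variable maps to itself. A minimal self-contained sketch of the same traversal (toy pattern type, invented for illustration; not GF's `Patt`):

```haskell
-- Toy sketch of contP's job: collect the variables bound by a pattern.
data Pat = PVar String | PCon String [Pat] | PAsP String Pat

patVars :: Pat -> [String]
patVars p = case p of
  PVar x    -> [x]
  PCon _ ps -> concatMap patVars ps
  PAsP x q  -> x : patVars q   -- as-pattern binds x and whatever q binds
```

For example, `patVars (PCon "C" [PVar "a", PAsP "b" (PVar "c")])` collects `a`, `b`, and `c`, mirroring how `contP` handles `PC`, `PAs`, and `PV`.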
prawitz g i f cs e = do
cs' <- mapM (compBranch g) [(p, f v) | (p,v) <- cs]
return $ S (T i cs') e
-- | argument variables cannot be glued
checkNoArgVars :: Term -> STM EEnv Term
checkNoArgVars t = case t of
Vr (IA _) -> raise $ glueErrorMsg $ prt t
Vr (IAV _) -> raise $ glueErrorMsg $ prt t
_ -> composOp checkNoArgVars t
glueErrorMsg s =
"Cannot glue (+) term with run-time variable" +++ s ++ "." ++++
"Use Prelude.bind instead."
stmErr :: Err a -> STM s a
stmErr e = stm (\s -> do
v <- e
return (v,s)
)
evalIn :: String -> STM s a -> STM s a
evalIn msg st = stm $ \s -> case appSTM st s of
Bad e -> Bad $ msg ++++ e
Ok vs -> Ok vs



@@ -0,0 +1,273 @@
----------------------------------------------------------------------
-- |
-- Module : Optimize
-- Maintainer : AR
-- Stability : (stable)
-- Portability : (portable)
--
-- > CVS $Date: 2005/09/16 13:56:13 $
-- > CVS $Author: aarne $
-- > CVS $Revision: 1.18 $
--
-- Top-level partial evaluation for GF source modules.
-----------------------------------------------------------------------------
module GF.Compile.Optimize (optimizeModule) where
import GF.Grammar.Grammar
import GF.Infra.Ident
import GF.Infra.Modules
import GF.Grammar.PrGrammar
import GF.Grammar.Macros
import GF.Grammar.Lookup
import GF.Grammar.Refresh
import GF.Grammar.Compute
import GF.Compile.BackOpt
import GF.Compile.CheckGrammar
import GF.Compile.Update
import GF.Compile.Evaluate
import GF.Data.Operations
import GF.Infra.CheckM
import GF.Infra.Option
import Control.Monad
import Data.List
-- | partial evaluation of concrete syntax. AR 6\/2001 -- 16\/5\/2003 -- 5\/2\/2005.
-- only do this for resource: concrete is optimized in gfc form
optimizeModule :: Options -> [(Ident,SourceModule)] -> (Ident,SourceModule) ->
Err (Ident,SourceModule)
optimizeModule opts ms mo@(_,mi) = case mi of
m0@(Module mt st fs me ops js) | st == MSComplete && isModRes m0 -> do
mo1 <- evalModule oopts ms mo
return $ case optim of
"parametrize" -> shareModule paramOpt mo1 -- parametrization and sharing
"values" -> shareModule valOpt mo1 -- tables as courses-of-values
"share" -> shareModule shareOpt mo1 -- sharing of branches
"all" -> shareModule allOpt mo1 -- first parametrize then values
"none" -> mo1 -- no optimization
_ -> mo1 -- none; default for src
_ -> evalModule oopts ms mo
where
oopts = addOptions opts (iOpts (flagsModule mo))
optim = maybe "all" id $ getOptVal oopts useOptimizer
evalModule :: Options -> [(Ident,SourceModule)] -> (Ident,SourceModule) -> Err (Ident,SourceModule)
evalModule oopts ms mo@(name,mod) = case mod of
m0@(Module mt st fs me ops js) | st == MSComplete -> case mt of
{-
-- now: don't optimize resource
_ | isModRes m0 -> do
let deps = allOperDependencies name js
ids <- topoSortOpers deps
MGrammar (mod' : _) <- foldM evalOp gr ids
return $ mod'
-}
MTConcrete a -> do
-----
js0 <- appEvalConcrete gr js
js' <- mapMTree (evalCncInfo oopts gr name a) js0 ---- <- gr0 6/12/2005
return $ (name, Module mt st fs me ops js')
_ -> return $ (name,mod)
_ -> return $ (name,mod)
where
gr0 = MGrammar $ ms
gr = MGrammar $ (name,mod) : ms
evalOp g@(MGrammar ((_, m) : _)) i = do
info <- lookupTree prt i $ jments m
info' <- evalResInfo oopts gr (i,info)
return $ updateRes g name i info'
-- | only operations need be compiled in a resource, and this is local to each
-- definition since the module is traversed in topological order
evalResInfo :: Options -> SourceGrammar -> (Ident,Info) -> Err Info
evalResInfo oopts gr (c,info) = case info of
ResOper pty pde -> eIn "operation" $ do
pde' <- case pde of
Yes de | optres -> liftM yes $ comp de
_ -> return pde
return $ ResOper pty pde'
_ -> return info
where
comp = if optres then computeConcrete gr else computeConcreteRec gr
eIn cat = errIn ("Error optimizing" +++ cat +++ prt c +++ ":")
optim = maybe "all" id $ getOptVal oopts useOptimizer
optres = case optim of
"noexpand" -> False
_ -> True
evalCncInfo ::
Options -> SourceGrammar -> Ident -> Ident -> (Ident,Info) -> Err (Ident,Info)
evalCncInfo opts gr cnc abs (c,info) = errIn ("optimizing" +++ prt c) $ case info of
CncCat ptyp pde ppr -> do
pde' <- case (ptyp,pde) of
(Yes typ, Yes de) ->
liftM yes $ pEval ([(varStr, typeStr)], typ) de
(Yes typ, Nope) ->
liftM yes $ mkLinDefault gr typ >>= partEval noOptions gr ([(varStr, typeStr)],typ)
(May b, Nope) ->
return $ May b
_ -> return pde -- indirection
ppr' <- liftM yes $ evalPrintname gr c ppr (yes $ K $ prt c)
return (c, CncCat ptyp pde' ppr')
CncFun (mt@(Just (_,ty@(cont,val)))) pde ppr ->
eIn ("linearization in type" +++ prt (mkProd (cont,val,[])) ++++ "of function") $ do
pde' <- case pde of
----- Yes de -> do
----- liftM yes $ pEval ty de
_ -> return pde
ppr' <- liftM yes $ evalPrintname gr c ppr pde'
return $ (c, CncFun mt pde' ppr') -- only cat in type actually needed
_ -> return (c,info)
where
pEval = partEval opts gr
eIn cat = errIn ("Error optimizing" +++ cat +++ prt c +++ ":")
-- | the main function for compiling linearizations
partEval :: Options -> SourceGrammar -> (Context,Type) -> Term -> Err Term
partEval opts gr (context, val) trm = errIn ("parteval" +++ prt_ trm) $ do
let vars = map fst context
args = map Vr vars
subst = [(v, Vr v) | v <- vars]
trm1 = mkApp trm args
trm3 <- if globalTable
then etaExpand trm1 >>= comp subst >>= outCase subst
else etaExpand trm1 >>= comp subst
return $ mkAbs vars trm3
where
globalTable = oElem showAll opts --- i -all
comp g t = {- refreshTerm t >>= -} computeTerm gr g t
etaExpand t = recordExpand val t --- >>= caseEx -- done by comp
outCase subst t = do
pts <- getParams context
let (args,ptyps) = unzip $ filter (flip occur t . fst) pts
if null args
then return t
else do
let argtyp = RecType $ tuple2recordType ptyps
let pvars = map (Vr . zIdent . prt) args -- gets eliminated
patt <- term2patt $ R $ tuple2record $ pvars
let t' = replace (zip args pvars) t
t1 <- comp subst $ T (TTyped argtyp) [(patt, t')]
return $ S t1 $ R $ tuple2record args
--- notice: this assumes that all lin types follow the "old JFP style"
getParams = liftM concat . mapM getParam
getParam (argv,RecType rs) = return
[(P (Vr argv) lab, ptyp) | (lab,ptyp) <- rs, not (isLinLabel lab)]
---getParam (_,ty) | ty==typeStr = return [] --- in lindef
getParam (av,ty) =
Bad ("record type expected not" +++ prt ty +++ "for" +++ prt av)
--- all lin types are rec types
replace :: [(Term,Term)] -> Term -> Term
replace reps trm = case trm of
-- this is the important case
P _ _ -> maybe trm id $ lookup trm reps
_ -> composSafeOp (replace reps) trm
occur t trm = case trm of
-- this is the important case
P _ _ -> t == trm
S x y -> occur t y || occur t x
App f x -> occur t x || occur t f
Abs _ f -> occur t f
R rs -> any (occur t) (map (snd . snd) rs)
T _ cs -> any (occur t) (map snd cs)
C x y -> occur t x || occur t y
Glue x y -> occur t x || occur t y
ExtR x y -> occur t x || occur t y
FV ts -> any (occur t) ts
V _ ts -> any (occur t) ts
Let (_,(_,x)) y -> occur t x || occur t y
_ -> False
-- here we must be careful not to reduce
-- variants {{s = "Auto" ; g = N} ; {s = "Wagen" ; g = M}}
-- {s = variants {"Auto" ; "Wagen"} ; g = variants {N ; M}} ;
recordExpand :: Type -> Term -> Err Term
recordExpand typ trm = case unComputed typ of
RecType tys -> case trm of
FV rs -> return $ FV [R [assign lab (P r lab) | (lab,_) <- tys] | r <- rs]
_ -> return $ R [assign lab (P trm lab) | (lab,_) <- tys]
_ -> return trm
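`recordExpand` eta-expands a term of record type into an explicit record of projections, which is what keeps the `variants` example in the comment above from being reduced incorrectly. A minimal self-contained sketch of the expansion step (toy expression type, invented for illustration; not GF's `Term`):

```haskell
-- Toy sketch of recordExpand: a term t of record type {l1 : T1; l2 : T2}
-- becomes the explicit record {l1 = t.l1; l2 = t.l2}.
data E = EVar String | EProj E String | ERec [(String, E)]
  deriving (Eq, Show)

expandRec :: [String] -> E -> E
expandRec labels t = ERec [(l, EProj t l) | l <- labels]
```

So `expandRec ["s","g"] (EVar "x")` produces `{s = x.s; g = x.g}`, just as the real function builds `R [assign lab (P trm lab) | ...]` from the labels of the record type.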
-- | auxiliaries for compiling the resource
mkLinDefault :: SourceGrammar -> Type -> Err Term
mkLinDefault gr typ = do
case unComputed typ of
RecType lts -> mapPairsM mkDefField lts >>= (return . Abs varStr . R . mkAssign)
_ -> prtBad "linearization type must be a record type, not" typ
where
mkDefField typ = case unComputed typ of
Table p t -> do
t' <- mkDefField t
let T _ cs = mkWildCases t'
return $ T (TWild p) cs
Sort "Str" -> return $ Vr varStr
QC q p -> lookupFirstTag gr q p
RecType r -> do
let (ls,ts) = unzip r
ts' <- mapM mkDefField ts
return $ R $ [assign l t | (l,t) <- zip ls ts']
_ | isTypeInts typ -> return $ EInt 0 -- exists in all as first val
_ -> prtBad "linearization type field cannot be" typ
-- | Form the printname: if given, compute. If not, use the computed
-- lin for functions, cat name for cats (dispatch made in evalCncDef above).
--- We cannot use linearization at this stage, since we do not know the
--- defaults we would need for question marks - and we're not yet in canon.
evalPrintname :: SourceGrammar -> Ident -> MPr -> Perh Term -> Err Term
evalPrintname gr c ppr lin =
case ppr of
Yes pr -> comp pr
_ -> case lin of
Yes t -> return $ K $ clean $ prt $ oneBranch t ---- stringFromTerm
_ -> return $ K $ prt c ----
where
comp = computeConcrete gr
oneBranch t = case t of
Abs _ b -> oneBranch b
R (r:_) -> oneBranch $ snd $ snd r
T _ (c:_) -> oneBranch $ snd c
V _ (c:_) -> oneBranch c
FV (t:_) -> oneBranch t
C x y -> C (oneBranch x) (oneBranch y)
S x _ -> oneBranch x
P x _ -> oneBranch x
Alts (d,_) -> oneBranch d
_ -> t
--- very unclean cleaner
clean s = case s of
'+':'+':' ':cs -> clean cs
'"':cs -> clean cs
c:cs -> c: clean cs
_ -> s

deprecated/gf.spec Normal file

@@ -0,0 +1,119 @@
%define name GF
%define version 3.0
%define release 1
Name: %{name}
Summary: Grammatical Framework
Version: %{version}
Release: %{release}
License: GPL
Group: Sciences/Other
Vendor: The Language Technology Group
URL: http://www.cs.chalmers.se/~aarne/GF/
Source: GF-%{version}.tgz
BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-buildroot
BuildRequires: ghc
%description
The Grammatical Framework (=GF) is a grammar formalism based on type theory.
It consists of
* a special-purpose programming language
* a compiler of the language
* a generic grammar processor
The compiler reads GF grammars from user-provided files, and the
generic grammar processor performs various tasks with the grammars:
* generation
* parsing
* translation
* type checking
* computation
* paraphrasing
* random generation
* syntax editing
GF particularly addresses the following aspects of grammars:
* multilinguality (parallel grammars for different languages)
* semantics (semantic conditions of well-formedness, semantic
properties of expressions)
* grammar engineering (modularity, information hiding, reusable
libraries)
%package editor
Summary: Java syntax editor for Grammatical Framework (GF).
Group: Sciences/Other
Requires: %{name}
%description editor
This package contains the syntax editor GUI for GF.
%package editor2
Summary: Java syntax editor for Grammatical Framework (GF).
Group: Sciences/Other
Requires: %{name}
%description editor2
This package contains the syntax editor GUI for GF with printname enhancements and HTML support.
%prep
rm -rf $RPM_BUILD_ROOT
%setup -q
%build
cd src
%configure
make all
%install
cd src
%makeinstall
%clean
rm -rf $RPM_BUILD_ROOT
%files
%defattr(-,root,root,0755)
%{_bindir}/gf
%{_bindir}/gfdoc
%doc LICENSE README doc/{DocGF.pdf,gf2-highlights.html,index.html}
%files editor
%defattr(-,root,root,0755)
%{_bindir}/jgf
%{_datadir}/%{name}-%{version}/gf-java.jar
%files editor2
%defattr(-,root,root,0755)
%{_bindir}/gfeditor
%{_datadir}/%{name}-%{version}/gfeditor.jar
%changelog
* Tue Jun 21 2005 Hans-Joachim Daniels <daniels@ira.uka.de> 2.3pre
- added the printnames and HTML enhanced editor as editor2
* Thu May 12 2005 Bjorn Bringert <bringert@cs.chalmers.se> 2.2pre2-1
- Split package into gf and gf-editor packages.
* Wed May 11 2005 Bjorn Bringert <bringert@cs.chalmers.se> 2.2pre1-1
- Release of GF 2.2
* Mon Nov 8 2004 Aarne Ranta <aarne@cs.chalmers.se> 2.1-1
- Release of GF 2.1
* Thu Jun 24 2004 Bjorn Bringert <bringert@cs.chalmers.se> 2.0-2
- Set ownership correctly.
- Move jar-file to share (thanks to Anders Carlsson for pointing this out.)
- Added vendor tag.
* Tue Jun 22 2004 Bjorn Bringert <bringert@cs.chalmers.se> 2.0-1
- Include gfdoc binary
* Mon Jun 21 2004 Bjorn Bringert <bringert@cs.chalmers.se> 2.0-1
- Initial packaging

deprecated/gf.wxs.in Normal file

@@ -0,0 +1,63 @@
<?xml version="1.0"?>
<Wix xmlns="http://schemas.microsoft.com/wix/2003/01/wi">
<Product Id="4717AF5D-52AC-4D13-85E6-D87278CE9BBC"
UpgradeCode="0BB7BB08-1A79-4981-A03F-32B401B01010"
Name="Grammatical Framework, version @PACKAGE_VERSION@"
Language="1033" Version="2.2" Manufacturer="The GF Developers">
<Package Id="????????-????-????-????-????????????"
Description="Grammatical Framework, version @PACKAGE_VERSION@"
Comments="This package contains the Grammatical Framework system, version @PACKAGE_VERSION@."
InstallerVersion="200" Compressed="yes" />
<Media Id="1" Cabinet="gf.cab" EmbedCab="yes" />
<Directory Id="TARGETDIR" Name="SourceDir">
<Directory Id="ProgramFilesFolder">
<Directory Id="INSTALLDIR" Name="GF-@PACKAGE_VERSION@">
<Component Id="GFBinary" Guid="E2A44A6C-0252-4346-85AE-BC6A16BFB0FC" DiskId="1">
<File Id="GFEXE" Name="gf.exe" src="../bin/gf.exe" />
<Shortcut Id="GFStartMenu" Directory="GFProgramMenuDir"
Name="GF" Target="[!GFEXE]" />
</Component>
<Component Id="GFDocBinary" Guid="BDCA6F34-EE0A-4E72-8D00-CB7CAF3CEAEA" DiskId="1">
<File Id="GFDocEXE" Name="gfdoc.exe" src="tools/gfdoc.exe" />
</Component>
<Component Id="GFEditor" Guid="39F885F7-BC49-4CBC-9DCD-569C95AA3364" DiskId="1">
<Environment Id="GFHomeEnv" Name="GF_HOME" Action="create" Part="all"
Permanent="no" Value="[INSTALLDIR]" />
<File Id="GFEditorBat" Name="jgf.bat" src="jgf.bat" />
<File Id="GFEditorJar" Name="gf-java.jar" src="JavaGUI/gf-java.jar" />
<Shortcut Id="GFEditorStartMenu" Directory="GFProgramMenuDir"
Name="GFEditor" LongName="GF Editor" Target="[!GFEditorBat]"
WorkingDirectory="INSTALLDIR" />
</Component>
<Directory Id="GFDocDir" Name="doc">
<Component Id="GFDoc" Guid="23BEEBBF-F9AB-459F-B8D2-8414BB47834A" DiskId="1">
<File Id="GFReadme" Name="README.txt" src="../README" />
<File Id="GFLicense" Name="LICENSE.txt" src="../LICENSE" />
</Component>
</Directory>
</Directory>
</Directory>
<Directory Id="ProgramMenuFolder" Name="PMenu" LongName="Programs">
<Directory Id="GFProgramMenuDir" Name='GF-@PACKAGE_VERSION@' />
</Directory>
</Directory>
<Feature Id="ProductFeature" Title="Feature Title" Level="1">
<ComponentRef Id="GFBinary" />
<ComponentRef Id="GFDocBinary" />
<ComponentRef Id="GFEditor" />
<ComponentRef Id="GFDoc" />
</Feature>
</Product>
</Wix>

deprecated/gf_atk.cfg Normal file

@@ -0,0 +1,98 @@
# GF ATK configuration file
# ------------------------
# -- Basic audio signal processing --
SOURCEFORMAT = HAUDIO
SOURCERATE = 625
# Set in GF/System/ATKSpeechInput.hs
# TARGETKIND = MFCC_0_D_A
TARGETRATE = 100000.0
WINDOWSIZE = 250000.0
ENORMALISE = F
ZMEANSOURCE = F
USEHAMMING = T
PREEMCOEF = 0.97
USEPOWER = T
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
SILFLOOR = 50.0
USESILDET = T
MEASURESIL = F
OUTSILWARN = T
# -- Silence detection ---
HPARM: CALWINDOW = 40
HPARM: SPEECHTHRESH = 9.0
HPARM: SILDISCARD = 10.0
HPARM: SILENERGY = 0.0
HPARM: SPCSEQCOUNT = 10
HPARM: SPCGLCHCOUNT = 0
HPARM: SILGLCHCOUNT = 2
HPARM: SILSEQCOUNT = 50
# -- Cepstral mean ---
HPARM: CMNTCONST = 0.995
HPARM: CMNRESETONSTOP = F
HPARM: CMNMINFRAMES = 12
# -- Recogniser --
AREC: TRBAKFREQ = 1
# hands free, don't return results until end
AREC: RUNMODE = 01441
AREC: GENBEAM = 200.0
AREC: WORDBEAM = 175.0
AREC: WORDPEN = -10.0
HNET: FORCECXTEXP = T
HNET: ALLOWXWRDEXP = F
HNET: MARKSUBLAT = F
ARMAN: AUTOSIL = F
HREC: CONFSCALE = 0.15
HREC: CONFOFFSET = 0.0
#HREC: CONFBGHMM = bghmm
# -- Set visibility and positions of ATK controls --
AIN: DISPSHOW = T
AIN: DISPXORIGIN = 440
AIN: DISPYORIGIN = 220
AIN: DISPHEIGHT = 40
AIN: DISPWIDTH = 160
ACODE: DISPSHOW = F
ACODE: DISPXORIGIN = 40
ACODE: DISPYORIGIN = 220
ACODE: DISPHEIGHT = 220
ACODE: DISPWIDTH = 380
ACODE: MAXFGFEATS = 13
ACODE: NUMSTREAMS = 1
AREC: DISPSHOW = T
AREC: DISPXORIGIN = 40
AREC: DISPYORIGIN = 20
AREC: DISPHEIGHT = 160
AREC: DISPWIDTH = 560
# -- Debugging --
HMMSET: TRACE = 0
ADICT: TRACE = 0
AGRAM: TRACE = 0
GGRAM: TRACE = 0
AREC: TRACE = 0
ARMAN: TRACE = 0
HPARM: TRACE = 0
HNET: TRACE = 0
HREC: TRACE = 0

deprecated/gfc.in Normal file

@@ -0,0 +1,30 @@
#!/bin/sh
prefix="@prefix@"
case "@host@" in
*-cygwin)
prefix=`cygpath -w "$prefix"`;;
esac
exec_prefix="@exec_prefix@"
GF_BIN_DIR="@bindir@"
GF_DATA_DIR="@datadir@/GF-@PACKAGE_VERSION@"
GFBIN="$GF_BIN_DIR/gf"
if [ ! -x "${GFBIN}" ]; then
GF_BIN_DIR=`dirname $0`
GFBIN="$GF_BIN_DIR/gf"
fi
if [ ! -x "${GFBIN}" ]; then
GFBIN=`which gf`
fi
if [ ! -x "${GFBIN}" ]; then
echo "gf not found."
exit 1
fi
exec $GFBIN --batch "$@"


@@ -0,0 +1,169 @@
# checking that a file is haddocky:
# - checking if it has an export list
# - if there is no export list, it tries to find all defined functions
# - checking that all exported functions have type signatures
# - checking that the module header is OK
# changes on files:
# - transforming hard space to ordinary space
# limitations:
# - there might be some problems with nested comments
# - cannot handle type signatures for several functions
# (i.e. "a, b, c :: t")
# but on the other hand -- haddock has some problems with these too...
$operChar = qr/[\!\#\$\%\&\*\+\.\/\<\=\>\?\@\\\^\|\-\~]/;
$operCharColon = qr/[\!\#\$\%\&\*\+\.\/\<\=\>\?\@\\\^\|\-\~\:]/;
$nonOperChar = qr/[^\!\#\$\%\&\*\+\.\/\<\=\>\?\@\\\^\|\-\~]/;
$nonOperCharColon = qr/[^\!\#\$\%\&\*\+\.\/\<\=\>\?\@\\\^\|\-\~\:]/;
$operSym = qr/$operChar $operCharColon*/x;
$funSym = qr/[a-z] \w* \'*/x;
$funOrOper = qr/(?: $funSym | \($operSym\) )/x;
$keyword = qr/(?: type | data | module | newtype | infix[lr]? | import | instance | class )/x;
$keyOper = qr/^(?: \.\. | \:\:? | \= | \\ | \| | \<\- | \-\> | \@ | \~ | \=\> | \. )$/x;
sub check_headerline {
my ($title, $regexp) = @_;
if (s/^-- \s $title \s* : \s+ (.+?) \s*\n//sx) {
$name = $1;
push @ERR, "Incorrect ".lcfirst $title.": $name"
unless $name =~ $regexp;
return $&;
} else {
push @ERR, "Header missing: ".lcfirst $title."";
}
}
if ($#ARGV >= 0) {
@FILES = @ARGV;
} else {
# @dirs = qw{. api canonical cf cfgm compile for-ghc-nofud
# grammar infra notrace parsers shell
# source speech translate useGrammar util visualization
# GF GF/* GF/*/* GF/*/*/*};
@dirs = qw{GF GF/* GF/*/* GF/*/*/*};
@FILES = grep(!/\/(Par|Lex)(GF|GFC|CFG)\.hs$/,
glob "{".join(",",@dirs)."}/*.hs");
}
for $file (@FILES) {
$file =~ s/\.hs//;
open F, "<$file.hs";
$_ = join "", <F>;
close F;
@ERR = ();
# replacing hard spaces with ordinary spaces
$nchars = tr/\240/ /;
if ($nchars > 0) {
push @ERR, "!! > Substituted $nchars hard spaces";
open F, ">$file.hs";
print F $_;
close F;
}
# the module header
$hdr_module = $module = "";
s/^ \{-\# \s+ OPTIONS \s+ -cpp \s+ \#-\} //sx; # removing ghc options (cpp)
s/^ \s+ //sx; # removing initial whitespace
s/^ (--+ \s* \n) +//sx; # removing initial comment lines
unless (s/^ -- \s \| \s* \n//sx) {
push @ERR, "Incorrect module header";
} else {
$hdr_module = s/^-- \s Module \s* : \s+ (.+?) \s*\n//sx ? $1 : "";
&check_headerline("Maintainer", qr/^ [\wåäöÅÄÖüÜ\s\@\.]+ $/x);
&check_headerline("Stability", qr/.*/);
&check_headerline("Portability", qr/.*/);
s/^ (--+ \s* \n) +//sx;
push @ERR, "Missing CVS information"
unless s/^(-- \s+ \> \s+ CVS \s+ \$ .*? \$ \s* \n)+//sx;
s/^ (--+ \s* \n) +//sx;
push @ERR, "Missing module description"
unless /^ -- \s+ [^\(]/x;
}
# removing comments
s/\{- .*? -\}//gsx;
s/-- ($nonOperCharColon .*? \n | \n)/\n/gx;
# removing \n in front of whitespace (for simplification)
s/\n+[ \t]/ /gs;
# the export list
$exportlist = "";
if (/\n module \s+ ((?: \w | \.)+) \s+ \( (.*?) \) \s+ where/sx) {
($module, $exportlist) = ($1, $2);
$exportlist =~ s/\b module \s+ [A-Z] \w*//gsx;
$exportlist =~ s/\(\.\.\)//g;
} elsif (/\n module \s+ ((?: \w | \.)+) \s+ where/sx) {
$module = $1;
# modules without export lists
# push @ERR, "No export list";
# function definitions
while (/^ (.*? $nonOperCharColon) = (?! $operCharColon)/gmx) {
$defn = $1;
next if $defn =~ /^ $keyword \b/x;
if ($defn =~ /\` ($funSym) \`/x) {
$fn = $1;
} elsif ($defn =~ /(?<! $operCharColon) ($operSym)/x
&& $1 !~ $keyOper) {
$fn = "($1)";
} elsif ($defn =~ /^($funSym)/x) {
$fn = $1;
} else {
push @ERR, "!! > Error in function definition: $defn";
next;
}
$exportlist .= " $fn ";
}
} else {
push @ERR, "No module header found";
}
push @ERR, "Module names not matching: $module != $hdr_module"
if $hdr_module && $module !~ /\Q$hdr_module\E$/;
# fixing exportlist (double spaces as separator)
$exportlist = " $exportlist ";
$exportlist =~ s/(\s | \,)+/ /gx;
# removing functions with type signatures from export list
while (/^ ($funOrOper (\s* , \s* $funOrOper)*) \s* ::/gmx) {
$functionlist = $1;
while ($functionlist =~ s/^ ($funOrOper) (\s* , \s*)?//x) {
$function = $1;
$exportlist =~ s/\s \Q$function\E \s/ /gx;
}
}
# reporting exported functions without type signatures
$reported = 0;
$untyped = "";
while ($exportlist =~ /\s ($funOrOper) \s/x) {
$function = $1;
$exportlist =~ s/\s \Q$function\E \s/ /gx;
$reported++;
$untyped .= " $function";
}
push @ERR, "No type signature for $reported function(s):\n " . $untyped
if $reported;
print "-- $file\n > " . join("\n > ", @ERR) . "\n"
if @ERR;
}

View File

@@ -0,0 +1,73 @@
#!/bin/tcsh
######################################################################
# Author: Peter Ljunglöf
# Time-stamp: "2005-05-12, 23:17"
# CVS $Date: 2005/05/13 12:40:20 $
# CVS $Author: peb $
#
# a script for producing documentation through Haddock
######################################################################
set basedir = `pwd`
set docdir = haddock/html
set tempdir = haddock/.temp-files
set resourcedir = haddock/resources
set files = (`find GF -name '*.hs'` GF.hs)
######################################################################
echo 1. Creating and cleaning Haddock directory
echo -- $docdir
mkdir -p $docdir
rm -rf $docdir/*
######################################################################
echo
echo 2. Copying Haskell files to temporary directory: $tempdir
rm -rf $tempdir
foreach f ($files)
# echo -- $f
mkdir -p `dirname $tempdir/$f`
perl -pe 's/^#/-- CPP #/' $f > $tempdir/$f
end
######################################################################
echo
echo 3. Invoking Haddock
cd $tempdir
haddock -o $basedir/$docdir -h -t 'Grammatical Framework' $files
cd $basedir
######################################################################
echo
echo 4. Restructuring to HTML framesets
echo -- Substituting for frame targets inside html files
mv $docdir/index.html $docdir/index-frame.html
foreach f ($docdir/*.html)
# echo -- $f
perl -pe 's/<HEAD/<HEAD><BASE TARGET="contents"/; s/"index.html"/"index-frame.html"/; s/(<A HREF = "\S*index\S*.html")/$1 TARGET="index"/' $f > .tempfile
mv .tempfile $f
end
echo -- Copying resource files:
echo -- `ls $resourcedir/*.*`
cp $resourcedir/*.* $docdir
######################################################################
echo
echo 5. Finished
echo -- The documentation is located at:
echo -- $docdir/index.html

View File

@@ -0,0 +1,10 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<!-- Time-stamp: "2005-02-03, 15:59" -->
<HTML>
<HEAD>
<LINK HREF="haddock.css" REL=stylesheet>
</HEAD>
<BODY>
</BODY>
</HTML>

View File

@@ -0,0 +1,14 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">
<!-- Time-stamp: "2005-02-03, 15:53" -->
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />
<title>Grammatical Framework programmer's documentation</title>
</head>
<frameset cols="1*,2*">
<frame name="index" src="index-frame.html">
<frame name="contents" src="blank.html">
</frameset>
</html>

251
deprecated/install-sh Normal file
View File

@@ -0,0 +1,251 @@
#!/bin/sh
#
# install - install a program, script, or datafile
# This comes from X11R5 (mit/util/scripts/install.sh).
#
# Copyright 1991 by the Massachusetts Institute of Technology
#
# Permission to use, copy, modify, distribute, and sell this software and its
# documentation for any purpose is hereby granted without fee, provided that
# the above copyright notice appear in all copies and that both that
# copyright notice and this permission notice appear in supporting
# documentation, and that the name of M.I.T. not be used in advertising or
# publicity pertaining to distribution of the software without specific,
# written prior permission. M.I.T. makes no representations about the
# suitability of this software for any purpose. It is provided "as is"
# without express or implied warranty.
#
# Calling this script install-sh is preferred over install.sh, to prevent
# `make' implicit rules from creating a file called install from it
# when there is no Makefile.
#
# This script is compatible with the BSD install script, but was written
# from scratch. It can only install one file at a time, a restriction
# shared with many OS's install programs.
# set DOITPROG to echo to test this script
# Don't use :- since 4.3BSD and earlier shells don't like it.
doit="${DOITPROG-}"
# put in absolute paths if you don't have them in your path; or use env. vars.
mvprog="${MVPROG-mv}"
cpprog="${CPPROG-cp}"
chmodprog="${CHMODPROG-chmod}"
chownprog="${CHOWNPROG-chown}"
chgrpprog="${CHGRPPROG-chgrp}"
stripprog="${STRIPPROG-strip}"
rmprog="${RMPROG-rm}"
mkdirprog="${MKDIRPROG-mkdir}"
transformbasename=""
transform_arg=""
instcmd="$mvprog"
chmodcmd="$chmodprog 0755"
chowncmd=""
chgrpcmd=""
stripcmd=""
rmcmd="$rmprog -f"
mvcmd="$mvprog"
src=""
dst=""
dir_arg=""
while [ x"$1" != x ]; do
case $1 in
-c) instcmd="$cpprog"
shift
continue;;
-d) dir_arg=true
shift
continue;;
-m) chmodcmd="$chmodprog $2"
shift
shift
continue;;
-o) chowncmd="$chownprog $2"
shift
shift
continue;;
-g) chgrpcmd="$chgrpprog $2"
shift
shift
continue;;
-s) stripcmd="$stripprog"
shift
continue;;
-t=*) transformarg=`echo $1 | sed 's/-t=//'`
shift
continue;;
-b=*) transformbasename=`echo $1 | sed 's/-b=//'`
shift
continue;;
*) if [ x"$src" = x ]
then
src=$1
else
# this colon is to work around a 386BSD /bin/sh bug
:
dst=$1
fi
shift
continue;;
esac
done
if [ x"$src" = x ]
then
echo "install: no input file specified"
exit 1
else
true
fi
if [ x"$dir_arg" != x ]; then
dst=$src
src=""
if [ -d $dst ]; then
instcmd=:
chmodcmd=""
else
instcmd=mkdir
fi
else
# Waiting for this to be detected by the "$instcmd $src $dsttmp" command
# might cause directories to be created, which would be especially bad
# if $src (and thus $dsttmp) contains '*'.
if [ -f $src -o -d $src ]
then
true
else
echo "install: $src does not exist"
exit 1
fi
if [ x"$dst" = x ]
then
echo "install: no destination specified"
exit 1
else
true
fi
# If destination is a directory, append the input filename; if your system
# does not like double slashes in filenames, you may need to add some logic
if [ -d $dst ]
then
dst="$dst"/`basename $src`
else
true
fi
fi
## this sed command emulates the dirname command
dstdir=`echo $dst | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'`
# Make sure that the destination directory exists.
# this part is taken from Noah Friedman's mkinstalldirs script
# Skip lots of stat calls in the usual case.
if [ ! -d "$dstdir" ]; then
defaultIFS='
'
IFS="${IFS-${defaultIFS}}"
oIFS="${IFS}"
# Some sh's can't handle IFS=/ for some reason.
IFS='%'
set - `echo ${dstdir} | sed -e 's@/@%@g' -e 's@^%@/@'`
IFS="${oIFS}"
pathcomp=''
while [ $# -ne 0 ] ; do
pathcomp="${pathcomp}${1}"
shift
if [ ! -d "${pathcomp}" ] ;
then
$mkdirprog "${pathcomp}"
else
true
fi
pathcomp="${pathcomp}/"
done
fi
if [ x"$dir_arg" != x ]
then
$doit $instcmd $dst &&
if [ x"$chowncmd" != x ]; then $doit $chowncmd $dst; else true ; fi &&
if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dst; else true ; fi &&
if [ x"$stripcmd" != x ]; then $doit $stripcmd $dst; else true ; fi &&
if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dst; else true ; fi
else
# If we're going to rename the final executable, determine the name now.
if [ x"$transformarg" = x ]
then
dstfile=`basename $dst`
else
dstfile=`basename $dst $transformbasename |
sed $transformarg`$transformbasename
fi
# don't allow the sed command to completely eliminate the filename
if [ x"$dstfile" = x ]
then
dstfile=`basename $dst`
else
true
fi
# Make a temp file name in the proper directory.
dsttmp=$dstdir/#inst.$$#
# Move or copy the file name to the temp name
$doit $instcmd $src $dsttmp &&
trap "rm -f ${dsttmp}" 0 &&
# and set any options; do chmod last to preserve setuid bits
# If any of these fail, we abort the whole thing. If we want to
# ignore errors from any of these, just make sure not to ignore
# errors from the above "$doit $instcmd $src $dsttmp" command.
if [ x"$chowncmd" != x ]; then $doit $chowncmd $dsttmp; else true;fi &&
if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dsttmp; else true;fi &&
if [ x"$stripcmd" != x ]; then $doit $stripcmd $dsttmp; else true;fi &&
if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dsttmp; else true;fi &&
# Now rename the file to the real destination.
$doit $rmcmd -f $dstdir/$dstfile &&
$doit $mvcmd $dsttmp $dstdir/$dstfile
fi &&
exit 0

View File

@@ -1,19 +0,0 @@
CC = gcc
CFLAGS += -O2 -W -Wall
.PHONY: all clean
all: libgfcc.a
libgfcc.a: gfcc-tree.o gfcc-term.o
ar rcs $@ $^
gfcc-tree.o: gfcc-tree.c gfcc-tree.h
$(CC) $(CFLAGS) -c -o $@ $<
gfcc-term.o: gfcc-term.c gfcc-term.h
$(CC) $(CFLAGS) -c -o $@ $<
clean:
-rm -f libgfcc.a
-rm -f *.o

View File

@@ -1,203 +0,0 @@
#include "gfcc-term.h"
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
static void *buffer = NULL;
static size_t current;
extern void term_alloc_pool(size_t size) {
if (buffer == NULL)
buffer = malloc(size);
current = 0;
}
extern void term_free_pool() {
if (buffer != NULL)
free(buffer);
buffer = NULL;
}
extern void *term_alloc(size_t size) {
void *off = (char *)buffer + current; /* char * arithmetic: void * arithmetic is a GCC extension */
current += size;
return off;
}
static inline Term *create_term(TermType type, int n) {
Term *t = (Term*)term_alloc(sizeof(Term) + n * sizeof(Term *));
t->type = type;
t->value.size = n; /* FIXME: hack! */
return t;
}
extern Term *term_array(int n, ...) {
Term *t = create_term(TERM_ARRAY, n);
va_list ap;
int i;
va_start(ap, n);
for (i = 0; i < n; i++) {
term_set_child(t, i, va_arg(ap, Term *));
}
va_end(ap);
return t;
}
extern Term *term_seq(int n, ...) {
Term *t = create_term(TERM_SEQUENCE, n);
va_list ap;
int i;
va_start(ap, n);
for (i = 0; i < n; i++) {
term_set_child(t, i, va_arg(ap, Term *));
}
va_end(ap);
return t;
}
extern Term *term_variants(int n, ...) {
Term *t = create_term(TERM_VARIANTS, n);
va_list ap;
int i;
va_start(ap, n);
for (i = 0; i < n; i++) {
term_set_child(t, i, va_arg(ap, Term *));
}
va_end(ap);
return t;
}
extern Term *term_glue(int n, ...) {
Term *t = create_term(TERM_GLUE, n);
va_list ap;
int i;
va_start(ap, n);
for (i = 0; i < n; i++) {
term_set_child(t, i, va_arg(ap, Term *));
}
va_end(ap);
return t;
}
extern Term *term_rp(Term *t1, Term *t2) {
Term *t = create_term(TERM_RECORD_PARAM, 2);
term_set_child(t, 0, t1);
term_set_child(t, 1, t2);
return t;
}
extern Term *term_suffix(const char *pref, Term *suf) {
Term *t = create_term(TERM_SUFFIX_TABLE, 2);
term_set_child(t,0,term_str(pref));
term_set_child(t,1,suf);
return t;
}
extern Term *term_str(const char *s) {
Term *t = create_term(TERM_STRING, 0);
t->value.string_value = s;
return t;
}
extern Term *term_int(int i) {
Term *t = create_term(TERM_INTEGER,0);
t->value.integer_value = i;
return t;
}
extern Term *term_meta() {
return create_term(TERM_META, 0);
}
extern Term *term_sel_int(Term *t, int i) {
switch (t->type) {
case TERM_ARRAY:
return term_get_child(t,i);
case TERM_SUFFIX_TABLE:
return term_glue(2,
term_get_child(t,0),
term_sel_int(term_get_child(t,1),i));
case TERM_META:
return t;
default:
fprintf(stderr,"Error: term_sel_int %d %d\n", t->type, i);
exit(1);
return NULL;
}
}
extern Term *term_sel(Term *t1, Term *t2) {
switch (t2->type) {
case TERM_INTEGER:
return term_sel_int(t1, t2->value.integer_value);
case TERM_RECORD_PARAM:
return term_sel(t1,term_get_child(t2,0));
case TERM_META:
return term_sel_int(t1,0);
default:
fprintf(stderr,"Error: term_sel %d %d\n", t1->type, t2->type);
exit(1);
return 0;
}
}
static void term_print_sep(FILE *stream, Term *t, const char *sep) {
int n = t->value.size;
int i;
for (i = 0; i < n; i++) {
term_print(stream, term_get_child(t,i));
if (i < n-1) {
fputs(sep, stream);
}
}
}
extern void term_print(FILE *stream, Term *t) {
switch (t->type) {
case TERM_ARRAY:
term_print(stream, term_get_child(t,0));
break;
case TERM_SEQUENCE:
term_print_sep(stream, t, " ");
break;
case TERM_VARIANTS:
term_print_sep(stream, t, "/");
break;
case TERM_GLUE:
term_print_sep(stream, t, "");
break;
case TERM_RECORD_PARAM:
term_print(stream, term_get_child(t,0));
break;
case TERM_SUFFIX_TABLE:
term_print(stream, term_get_child(t,0));
term_print(stream, term_get_child(t,1));
break;
case TERM_META:
fputs("?", stream);
break;
case TERM_STRING:
fputs(t->value.string_value, stream);
break;
case TERM_INTEGER:
fprintf(stream, "%d", t->value.integer_value);
break;
default:
fprintf(stderr,"Error: term_print %d\n", t->type);
exit(1);
}
}

View File

@@ -1,65 +0,0 @@
#ifndef GFCC_TERM_H
#define GFCC_TERM_H
#include <stdio.h>
typedef enum {
/* size = variable */
TERM_ARRAY,
TERM_SEQUENCE,
TERM_VARIANTS,
TERM_GLUE,
/* size = 2 */
TERM_RECORD_PARAM,
TERM_SUFFIX_TABLE,
/* size = 0 */
TERM_META,
TERM_STRING,
TERM_INTEGER
} TermType;
struct Term_ {
TermType type;
union {
const char *string_value;
int integer_value;
int size;
} value;
struct Term_ *args[0];
};
typedef struct Term_ Term;
static inline Term *term_get_child(Term *t, int n) {
return t->args[n];
}
static inline void term_set_child(Term *t, int n, Term *c) {
t->args[n] = c;
}
extern void term_alloc_pool(size_t size);
extern void term_free_pool();
extern void *term_alloc(size_t size);
extern Term *term_array(int n, ...);
extern Term *term_seq(int n, ...);
extern Term *term_variants(int n, ...);
extern Term *term_glue(int n, ...);
extern Term *term_rp(Term *t1, Term *t2);
extern Term *term_suffix(const char *pref, Term *suf);
extern Term *term_str(const char *s);
extern Term *term_int(int i);
extern Term *term_meta();
extern Term *term_sel_int(Term *t, int i);
extern Term *term_sel(Term *t1, Term *t2);
extern void term_print(FILE *stream, Term *t);
#endif
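The bump-pointer pool (`term_alloc_pool`/`term_alloc`) and the flexible-array-member node layout used by gfcc-term can be illustrated with a minimal self-contained sketch; the identifiers below are hypothetical, not part of the GF sources:

```c
#include <assert.h>
#include <stdlib.h>

/* One big malloc up front; allocations just advance an offset,
   and the whole pool is released at once (no per-node free). */
static char *pool = NULL;
static size_t pool_used;

void pool_init(size_t size) {
    if (pool == NULL)
        pool = malloc(size);
    pool_used = 0;
}

void *pool_alloc(size_t size) {
    void *p = pool + pool_used;
    pool_used += size;
    return p;
}

void pool_free(void) {
    free(pool);
    pool = NULL;
}

/* A node with a flexible array member for its children,
   in the style of struct Term_ above. */
typedef struct Node {
    int tag;
    int size;
    struct Node *args[];
} Node;

Node *mk_node(int tag, int n) {
    /* children are allocated inline, right after the header */
    Node *t = pool_alloc(sizeof(Node) + n * sizeof(Node *));
    t->tag = tag;
    t->size = n;
    return t;
}
```

The trade-off this layout makes is that individual terms can never be freed: a parse allocates everything into the pool and the caller discards it wholesale with `pool_free`, which keeps allocation to a single pointer bump per node.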

View File

@@ -1,61 +0,0 @@
#include "gfcc-tree.h"
#include <stdlib.h>
extern int arity(Tree *t) {
switch (t->type) {
case ATOM_STRING:
case ATOM_INTEGER:
case ATOM_DOUBLE:
case ATOM_META:
return 0;
default:
return t->value.size;
}
}
static Tree *create_tree(atom_type c, int n) {
Tree *t = (Tree *)malloc(sizeof(Tree) + n * sizeof(Tree *));
t->type = c;
return t;
}
extern Tree *tree_string(const char *s) {
Tree *t = create_tree(ATOM_STRING, 0);
t->value.string_value = s;
return t;
}
extern Tree *tree_integer(int i) {
Tree *t = create_tree(ATOM_INTEGER, 0);
t->value.integer_value = i;
return t;
}
extern Tree *tree_double(double d) {
Tree *t = create_tree(ATOM_DOUBLE, 0);
t->value.double_value = d;
return t;
}
extern Tree *tree_meta() {
return create_tree(ATOM_META, 0);
}
extern Tree *tree_fun(atom_type f, int n) {
Tree *t = create_tree(f, n);
t->value.size = n;
return t;
}
extern void tree_free(Tree *t) {
int n = arity(t);
int i;
for (i = 0; i < n; i++) {
tree_free(tree_get_child(t,i));
}
free(t);
}

View File

@@ -1,49 +0,0 @@
#ifndef GFCC_TREE_H
#define GFCC_TREE_H
typedef enum {
ATOM_STRING,
ATOM_INTEGER,
ATOM_DOUBLE,
ATOM_META,
ATOM_FIRST_FUN
} atom_type;
struct Tree_{
atom_type type;
union {
const char *string_value;
int integer_value;
double double_value;
int size;
} value;
struct Tree_ *args[0];
};
typedef struct Tree_ Tree;
static inline Tree *tree_get_child(Tree *t, int n) {
return t->args[n];
}
static inline void tree_set_child(Tree *t, int n, Tree *a) {
t->args[n] = a;
}
extern int arity(Tree *t);
extern Tree *tree_string(const char *s);
extern Tree *tree_integer(int i);
extern Tree *tree_double(double d);
extern Tree *tree_meta();
extern Tree *tree_fun(atom_type f, int n);
extern void tree_free(Tree *t);
#endif
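Unlike the pooled terms, gfcc-tree nodes are malloc'd individually, so `tree_free` must walk the children (via `arity`) before freeing the node itself. A minimal self-contained sketch of that post-order pattern, with illustrative names only:

```c
#include <assert.h>
#include <stdlib.h>

enum { T_LEAF, T_FUN };

/* Heap-allocated tagged node with inline child array,
   mirroring struct Tree_ above. */
typedef struct T {
    int tag;
    int n;            /* number of children; 0 for leaves */
    struct T *kid[];
} T;

static T *t_new(int tag, int n) {
    T *t = malloc(sizeof(T) + n * sizeof(T *));
    t->tag = tag;
    t->n = n;
    return t;
}

static int t_count(T *t) {        /* total nodes in the tree */
    int c = 1;
    for (int i = 0; i < t->n; i++)
        c += t_count(t->kid[i]);
    return c;
}

static void t_free(T *t) {        /* post-order free, like tree_free */
    for (int i = 0; i < t->n; i++)
        t_free(t->kid[i]);
    free(t);
}
```

Freeing children before the parent matters here because the child pointers live inside the parent's allocation: freeing the parent first would leave the traversal reading freed memory.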

View File

@@ -1,17 +0,0 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="stylesheet" type="text/css" href="style.css" />
<script type="text/javascript" src="gflib.js"></script>
<script type="text/javascript" src="editorGrammar.js"></script>
<script type="text/javascript" src="grammar.js"></script>
<script type="text/javascript" src="gfjseditor.js"></script>
<title>Web-based Syntax Editor</title>
</head>
<body onload="mkEditor('editor', Food)" onkeydown="hotKeys(event)">
<div id="editor">
</div>
</body>
</html>

File diff suppressed because one or more lines are too long

Binary file not shown.

Before

Width:  |  Height:  |  Size: 161 B

File diff suppressed because it is too large Load Diff

View File

@@ -1,54 +0,0 @@
/* Output */
function sayText(text) {
document.voice_output_text = text;
activateForm("voice_output");
}
/* XHTML+Voice Utilities */
function activateForm(formid) {
var form = document.getElementById(formid);
var e = document.createEvent("UIEvents");
e.initEvent("DOMActivate", true, true);
form.dispatchEvent(e);
}
/* DOM utilities */
/* Gets the head element of the document. */
function getHeadElement() {
var hs = document.getElementsByTagName("head");
if (hs.length == 0) {
var head = document.createElement("head");
document.documentElement.insertBefore(head, document.documentElement.firstChild);
return head;
} else {
return hs[0];
}
}
/* Gets the body element of the document. */
function getBodyElement() {
var bs = document.getElementsByTagName("body");
if (bs.length == 0) {
var body = document.createElement("body");
document.documentElement.appendChild(body);
return body;
} else {
return bs[0];
}
}
/* Removes all the children of a node */
function removeChildren(node) {
while (node.hasChildNodes()) {
node.removeChild(node.firstChild);
}
}
function setText(node, text) {
removeChildren(node);
node.appendChild(document.createTextNode(text));
}

File diff suppressed because it is too large Load Diff

File diff suppressed because one or more lines are too long

Binary file not shown.

Before

Width:  |  Height:  |  Size: 201 B

Binary file not shown.

Before

Width:  |  Height:  |  Size: 229 B

View File

@@ -1,241 +0,0 @@
body {
font-family:arial,helvetica,sans-serif;
font-size:12px;
background-color: white;
}
#wrapper {
width:740px;
height:520px;
margin:auto 50px;
border:1px solid gray;
padding:10px;
}
#absFrame {
width:250px;
height:250px;
padding:10px;
border:1px solid gray;
float:left;
white-space: nowrap;
}
#conFrame {
width:436px;
height:250px;
margin-left:10px;
padding:10px;
border:1px solid gray;
float:left;
white-space: normal;
overflow:auto;
}
#actFrame {
width:250px;
height:170px;
margin-top:10px;
padding:10px;
border:1px solid gray;
float:left;
overflow:auto;
}
#refFrame {
width:436px;
height:170px;
margin-left:10px;
margin-top:10px;
padding:10px;
border:1px solid gray;
float:left;
overflow:auto;
}
#messageFrame {
width:506px;
height:15px;
margin-top:10px;
margin-right:10px;
padding:10px;
border:1px solid gray;
float:left;
overflow:hidden;
}
#clipboardFrame {
width:180px;
height:15px;
margin-top:10px;
padding:10px;
border:1px solid gray;
float:left;
overflow:auto;
}
#tree {
left: -10px;
top: -10px;
width: 250px;
height: 250px;
margin: 0px;
padding: 10px;
overflow: auto;
}
ul {
position: relative;
list-style: none;
margin-left: 20px;
padding: 0px;
}
li {
position: relative;
}
img.tree-menu {
margin-right: 5px;
}
a.tree:link, a.tree:visited, a.tree:active {
color: black;
background-color: white;
text-decoration: none;
margin-right:10px;
}
a.tree:hover {
color: blue;
background-color: white;
text-decoration: underline;
margin-right:10px;
}
a.treeSelected:link, a.treeSelected:visited, a.treeSelected:active {
color: white;
background-color: #3366CC;
text-decoration: none;
margin-right:10px;
}
a.treeSelected:hover {
color: white;
background-color: #3366CC;
text-decoration: underline;
margin-right:10px;
}
a.treeGray:link, a.treeGray:visited, a.treeGray:active {
color: silver;
background-color: white;
text-decoration: none;
margin-right:10px;
}
a.treeGray:hover {
color: silver;
background-color: white;
text-decoration: none;
margin-right:10px;
}
table.action, table.refinement, table.wrapper, table.tree, table.language {
margin: 0px;
padding: 0px;
border-style: none;
border-collapse: collapse;
border-spacing: 0px;
}
tr.selected {
color: white;
background-color: #3366CC;
}
tr.unavailable, tr.closed {
color: silver;
background-color: white;
}
tr.unavailable:hover {
color: silver;
background-color: #3366CC;
}
tr.action, tr.refinement, tr.wrapper, tr.tree {
color: black;
background-color: white;
}
tr.action:hover, tr.refinement:hover, tr.wrapper:hover, tr.tree:hover {
color: white;
background-color: #3366CC;
}
td.action {
width: 220px;
margin: 0px;
padding: 0px;
}
td.refinement, td.wrapper, td.tree {
width: 515px;
margin: 0px;
padding: 0px;
}
td.hotKey {
width: 30px;
margin: 0px;
padding: 0px;
text-align: right;
}
td.language {
color: black;
background-color: white;
margin: 1px;
padding: 1px;
}
td.language:hover {
color: blue;
background-color: white;
text-decoration: underline;
margin: 1px;
padding: 1px;
}
td.selected {
color: white;
background-color: #3366CC;
margin: 1px;
padding: 1px;
}
td.selected:hover {
color: white;
background-color: #3366CC;
text-decoration: underline;
margin: 1px;
padding: 1px;
}
p {
margin-bottom: 40px;
}
span.normal {
color: black;
background-color: white;
text-decoration: none;
}
span.selected {
color: white;
background-color: #3366CC;
text-decoration: none;
}

View File

@@ -1,54 +0,0 @@
body {
color: black;
background-color: white;
}
dl {
}
dt {
margin: 0;
padding: 0;
}
dl dd {
margin: 0;
padding: 0;
}
dl.fromLang dt {
display: none;
}
dl.toLang {
border-width: 1px 0 0 0;
border-style: solid;
border-color: #c0c0c0;
}
dl.toLang dt {
color: #c0c0c0;
display: block;
float: left;
width: 5em;
}
dl.toLang dd {
border-width: 0 0 1px 0;
border-style: solid;
border-color: #c0c0c0;
}
ul {
margin: 0;
padding: 0;
}
li {
list-style-type: none;
margin: 0;
padding: 0;
}

View File

@@ -1,48 +0,0 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="stylesheet" type="text/css" href="translator.css" />
<script type="text/javascript" src="gflib.js"></script>
<script type="text/javascript" src="grammar.js"></script>
<script type="text/javascript" src="translator.js"></script>
<script type="text/javascript">
/* CHANGE ME */
var grammar = Food;
function updateTranslation () {
var input = document.getElementById('inputText').value;
var fromLang = document.getElementById('fromLang').value;
var toLang = document.getElementById('toLang').value;
var output = document.getElementById('output');
var translation = grammar.translate(input, fromLang, toLang);
removeChildren(output);
output.appendChild(formatTranslation(translation));
}
function populateLangs () {
var f = document.getElementById('fromLang');
var t = document.getElementById('toLang');
for (var c in grammar.concretes) {
addOption(f, c, c);
addOption(t, c, c);
}
}
</script>
<title>Web-based GF Translator</title>
</head>
<body onload="populateLangs()">
<form id="translate">
<p>
<input type="text" name="inputText" id="inputText" value="this cheese is warm" size="50" />
</p>
<p>
From: <select name="fromLang" id="fromLang" onchange=""><option value="">Any language</option></select>
To: <select name="toLang" id="toLang"><option value="">All languages</option></select>
<input type="button" value="Translate" onclick="updateTranslation()" />
</p>
</form>
<div id="output"></div>
</body>
</html>

View File

@@ -1,51 +0,0 @@
function formatTranslation (outputs) {
var dl1 = document.createElement("dl");
dl1.className = "fromLang";
for (var fromLang in outputs) {
var ul = document.createElement("ul");
addDefinition(dl1, document.createTextNode(fromLang), ul);
for (var i in outputs[fromLang]) {
var dl2 = document.createElement("dl");
dl2.className = "toLang";
for (var toLang in outputs[fromLang][i]) {
addDefinition(dl2, document.createTextNode(toLang), document.createTextNode(outputs[fromLang][i][toLang]));
}
addItem(ul, dl2);
}
}
return dl1;
}
/* DOM utilities for specific tags */
function addDefinition (dl, t, d) {
var dt = document.createElement("dt");
dt.appendChild(t);
dl.appendChild(dt);
var dd = document.createElement("dd");
dd.appendChild(d);
dl.appendChild(dd);
}
function addItem (ul, i) {
var li = document.createElement("li");
li.appendChild(i);
ul.appendChild(li);
}
function addOption (select, value, content) {
var option = document.createElement("option");
option.value = value;
option.appendChild(document.createTextNode(content));
select.appendChild(option);
}
/* General DOM utilities */
/* Removes all the children of a node */
function removeChildren(node) {
while (node.hasChildNodes()) {
node.removeChild(node.firstChild);
}
}

View File

@@ -0,0 +1,59 @@
GFCFLAGS = +RTS -K100M -RTS --cpu
.PHONY: pgf.fcgi run gwt gf-gwt.jar
pgf.fcgi:
cabal install
cp dist/build/pgf.fcgi/pgf.fcgi .
gwt-translate:
chmod a+x gwt/Translate-compile
gwt/Translate-compile
gwt-fridge:
chmod a+x gwt/Fridge-compile
gwt/Fridge-compile
gwt-morpho:
chmod a+x gwt/Morpho-compile
gwt/Morpho-compile
gf-gwt.jar:
mkdir -p gwt/bin/se/chalmers/cs/gf/gwt/client
javac -classpath "$(GWT_CLASSPATH):gwt/lib/gwt-dnd-2.5.6.jar" -sourcepath gwt/src gwt/src/se/chalmers/cs/gf/gwt/client/*.java
jar -cf $@ -C gwt/src se
cp $@ ../../lib/java
food.pgf:
gfc --make --name=food ../../examples/tutorial/resource-foods/Foods{Eng,Fin,Fre,Ger,Ita,Swe}.gf
Demo%-parse.pgf: ../../next-lib/src/demo/Demo%.gf
gfc $(GFCFLAGS) --make --erasing=on --name=Demo$*-parse $^
Demo%-noparse.pgf: ../../next-lib/src/demo/Demo%.gf
gfc $(GFCFLAGS) --make --parser=off --name=Demo$*-noparse $^
Lang%-parse.pgf: ../../next-lib/alltenses/Lang%.gfo
gfc $(GFCFLAGS) --make --erasing=on --name=Lang$*-parse $^
Lang%-noparse.pgf: ../../next-lib/alltenses/Lang%.gfo
gfc $(GFCFLAGS) --make --parser=off --name=Lang$*-noparse $^
demo.pgf: DemoBul-noparse.pgf DemoCat-noparse.pgf DemoDan-noparse.pgf DemoEng-parse.pgf DemoFin-noparse.pgf DemoFre-noparse.pgf DemoGer-noparse.pgf DemoIta-noparse.pgf DemoNor-noparse.pgf DemoRus-noparse.pgf DemoSpa-noparse.pgf DemoSwe-parse.pgf
gfc $(GFCFLAGS) --name=demo $^
lang.pgf: LangBul-noparse.pgf LangCat-noparse.pgf LangDan-parse.pgf LangEng-parse.pgf LangFin-noparse.pgf LangFre-noparse.pgf LangGer-noparse.pgf LangIta-noparse.pgf LangNor-parse.pgf LangRus-noparse.pgf LangSpa-noparse.pgf LangSwe-parse.pgf
gfc $(GFCFLAGS) --name=lang $^
test.pgf: LangEng-parse.pgf LangGer-parse.pgf
gfc $(GFCFLAGS) --name=test $^
run: pgf.fcgi
@echo '*********************************************'
@echo 'See http://localhost:41296/'
@echo '*********************************************'
lighttpd -f lighttpd.conf -D
clean:
cabal clean
-rm -f pgf.fcgi

132
deprecated/server/README Normal file
View File

@@ -0,0 +1,132 @@
== Requirements ==
- cabal-install
* See quick installation instructions at the bottom of
http://hackage.haskell.org/trac/hackage/wiki/CabalInstall
- GF installed as a Cabal package
$ (cd ../.. && cabal install)
- FastCGI development kit
(MacPorts) $ sudo port install fcgi
(Ubuntu) $ sudo apt-get install libfcgi-dev
- Google Web Toolkit
- Download from http://code.google.com/webtoolkit/
- Unpack somewhere.
- Set $GWT_CLASSPATH to point to the GWT JAR files. For example:
$ export GWT_DIR="/Users/bringert/src/gwt-mac-1.5.3"
$ export GWT_CLASSPATH="$GWT_DIR/gwt-user.jar:$GWT_DIR/gwt-dev-mac.jar"
== Building ==
- Build pgf.fcgi. This will use cabal to install the dependencies (cgi, fastcgi, json, utf8-string).
$ make
- Build small example grammar:
$ make food.pgf
$ cp food.pgf grammar.pgf
== Running (lighttpd) ==
- Install lighttpd
(MacPorts) $ sudo port install lighttpd
(Ubuntu) $ sudo apt-get install lighttpd
- Run pgf.fcgi with lighttpd:
$ make run
== Testing ==
- First test from the command-line, since debugging is harder from the AJAX UI:
$ curl 'http://localhost:41296/pgf/grammar.pgf/translate?input=this+fish&cat=Item&from=FoodEng'
- Non-GWT AJAX UI:
See http://localhost:41296/simple-client.html
- GWT translator:
$ make gwt-translate
Then see http://localhost:41296/translate/
- GWT fridge poetry:
$ make gwt-fridge
Then see http://localhost:41296/fridge/
- GWT morphology:
$ make gwt-morpho
Then see http://localhost:41296/morpho/
The MorphoService.hs module has built-in paths to the grammars that will be loaded.
These have to be fixed by hand.
== Running (Apache) ==
Note: This is more complicated, and the instructions may not be up to date.
- Make sure that your web server supports FastCGI. For Apache on OS X,
do this:
$ curl -O http://www.fastcgi.com/dist/mod_fastcgi-2.4.6.tar.gz
$ tar -zxf mod_fastcgi-2.4.6.tar.gz
$ cd mod_fastcgi-2.4.6/
$ apxs -o mod_fastcgi.so -c *.c
$ sudo apxs -i -a -n fastcgi mod_fastcgi.so
- Make sure that your web server knows that gf.fcgi is a FastCGI
program.
- Make sure that you are allowed to run FastCGI programs in the
directory that you use.
- With large grammars, gf.fcgi may take long enough to start that the web server
thinks that the program has died. With Apache, you can fix this by adding
"FastCgiConfig -startDelay 30" to your httpd.conf.
These sections from my Apache config fix the above two issues
(some of this may be fixed by the second apxs command above):
(On OS X, this is in /etc/httpd/httpd.conf)
LoadModule fastcgi_module libexec/httpd/mod_fastcgi.so
AddModule mod_fastcgi.c
<IfModule mod_fastcgi.c>
FastCgiIpcDir /tmp/fcgi_ipc/
AddHandler fastcgi-script .fcgi
FastCgiConfig -startDelay 30
</IfModule>
(On OS X, this is in /etc/httpd/users/bringert.conf)
<Directory "/Users/bringert/Sites/">
Options Indexes MultiViews FollowSymlinks ExecCGI
AddHandler cgi-script .cgi
AllowOverride None
Order allow,deny
Allow from all
</Directory>
- If you have changed the web server config, you need to restart the web server
(this is also useful to get a clean slate if you end up with dead or resource-hogging
FastCGI processes):
$ sudo apachectl restart
- If Apache complains about a syntax error on the FastCgiIpcDir line, try deleting
any existing /tmp/fcgi_ipc/ directory:
$ sudo rm -rf /tmp/fcgi_ipc/
- Copy or symlink this directory to your web directory.
- First test from the command-line, since debugging is harder from the AJAX UI:
$ curl 'http://localhost/~bringert/gf-server/gf.fcgi/translate?input=this+fish&cat=Item&from=FoodEng'
- Check server logs (e.g. /var/log/httpd/error_log) if it doesn't work.
- Go to SERVER_URL/simple-client.html in your web browser.