diff --git a/doc/gf-course.html b/doc/gf-course.html deleted file mode 100644 index 039bbe72c..000000000 --- a/doc/gf-course.html +++ /dev/null @@ -1,221 +0,0 @@ - - -
- --GSLT, -NGSLT, -and -Department of Computer Science and Engineering, -Chalmers University of Technology and Gothenburg University. -
--Autumn Term 2007. -
--24/10 Tomorrow's session starts at 8.15. A detailed plan has been added to -the table below. Material (new chapters) will appear later today. -It will explain some of the files in -
-syntax/:
- linguistic grammar programming
-semantics/:
- a question-answer system based on logical semantics
--12/9 The course starts tomorrow at 8.00. A detailed plan for the day is -right below. Don't forget to -
-gf-subscribe at gslt hum gu se)
--31/8 Revised the description of the one- and five-point variants. -
-
-21/8 Course mailing list started.
-To subscribe, send a mail to gf-subscribe at gslt hum gu se
-(replacing spaces by dots except around the word at, where the spaces
-are just removed, and the word itself is replaced by the at symbol).
-
-20/8/2007 Schedule. -The course will start on Thursday 13 September in Room C430 at the Humanities -Building of Gothenburg University ("Humanisten"). -
--First week (13-14/9) -
-| Time | -Subject | -Assignment | -|
|---|---|---|---|
| Thu 8.00-9.30 | -Chapters 1-3 | -Hello and Food in a new language | -|
| Thu 10.00-11.30 | -Chapters 3-4 | -Foods in a new language | -|
| Thu 13.15-14.45 | -Chapter 5 | -ExtFoods in a new language | -|
| Thu 15.15-16.45 | -Chapters 6-7 | -straight code compiler | -|
| Fri 8.00-9.30 | -Chapter 8 | -application in Haskell or Java | -|
-Second week (25/10) -
-| Time | -Subject | -Assignment | -|
|---|---|---|---|
| Thu 8.15-9.45 | -Chapters 13-15 | -mini resource in a new language | -|
| Thu 10.15-11.45 | -Chapters 12,16 | -query system for a new domain | -|
| Thu 13.15-14.45 | -presentations | -explain your own project | -|
-The structure of each lecture will be the following: -
-
-In order for this to work out, it is important that enough participants
-have a working GF installation, including the directory
-examples/tutorial. This directory is
-included in the Darcs version, as well as in the updated binary
-packages from 12 September.
-
-GF -(Grammatical Framework) is a grammar formalism, i.e. a special-purpose -programming language for writing grammars. It is suitable for many -natural language processing tasks, in particular, -
--The goal of the course is to develop an understanding of GF and -practical skills in using it. -
--The course consists of two modules. The first module is a one-week -intensive course (during the first intensive week of GSLT), which -is as such usable as a one-week intensive course for doctoral studies, -if completed with a small course project. -
--The second module is a larger programming project, written -by each student (possibly working in groups) during the Autumn term. -The projects are discussed during the second intensive week of GSLT -(see schedule), -and presented at a date that will be set later. -
--The first module goes through the basics of GF, including -
--The lectures follow a draft of the GF book. It contains a heavily updated -version of the -GF Tutorial; -thus the on-line tutorial is not adequate for this course. To get the course -book, join the course mailing list. -
--Those who just want to do the first module will write a simple application -as their course work during and after the first intensive week. -
--Those who continue with the second module will choose a more substantial -project. Possible topics are -
--Experience in programming. No earlier natural language processing -or functional programming experience is necessary. -
--The course is thus suitable both for GSLT and NGSLT students, -and for graduate students in computer science. -
--We will in particular welcome students from the Baltic countries -who wish to build resources for their own language in GF. -
- - - - diff --git a/doc/gf-course.txt b/doc/gf-course.txt deleted file mode 100644 index 846186049..000000000 --- a/doc/gf-course.txt +++ /dev/null @@ -1,149 +0,0 @@ -Graduate Course: GF (Grammatical Framework) -Aarne Ranta -%%date(%c) - -% NOTE: this is a txt2tags file. -% Create an html file from this file using: -% txt2tags -thtml --toc gf-reference.html - -%!target:html - -[GSLT http://www.gslt.hum.gu.se], -[NGSLT http://ngslt.org/], -and -[Department of Computer Science and Engineering http://www.chalmers.se/cse/EN/], -Chalmers University of Technology and Gothenburg University. - -Autumn Term 2007. - - -=News= - -24/10 Tomorrow's session starts at 8.15. A detailed plan has been added to -the table below. Material (new chapters) will appear later today. -It will explain some of the files in -- [``syntax/`` http://digitalgrammars.com/gf/examples/tutorial/syntax/]: - linguistic grammar programming -- [``semantics/`` http://digitalgrammars.com/gf/examples/tutorial/semantics/]: - a question-answer system based on logical semantics - - - -12/9 The course starts tomorrow at 8.00. A detailed plan for the day is -right below. Don't forget to -- join the mailing list (send a mail to ``gf-subscribe at gslt hum gu se``) -- install GF on your laptops from [here ../download.html] -- take with you a copy of the book (as sent to the mailing list yesterday) - - -31/8 Revised the description of the one- and five-point variants. - -21/8 Course mailing list started. -To subscribe, send a mail to ``gf-subscribe at gslt hum gu se`` -(replacing spaces by dots except around the word at, where the spaces -are just removed, and the word itself is replaced by the at symbol). - -20/8/2007 [Schedule http://www.gslt.hum.gu.se/courses/schedule.html]. -The course will start on Thursday 13 September in Room C430 at the Humanities -Building of Gothenburg University ("Humanisten"). 
- - -=Plan= - -First week (13-14/9) - -|| Time | Subject | Assignment || -| Thu 8.00-9.30 | Chapters 1-3 | Hello and Food in a new language | -| Thu 10.00-11.30 | Chapters 3-4 | Foods in a new language | -| Thu 13.15-14.45 | Chapter 5 | ExtFoods in a new language | -| Thu 15.15-16.45 | Chapters 6-7 | straight code compiler | -| Fri 8.00-9.30 | Chapter 8 | application in Haskell or Java | - -Second week (25/10) - -|| Time | Subject | Assignment || -| Thu 8.15-9.45 | Chapters 13-15 | mini resource in a new language | -| Thu 10.15-11.45 | Chapters 12,16 | query system for a new domain | -| Thu 13.15-14.45 | presentations | explain your own project | - - - -The structure of each lecture will be the following: -- ca. 75min lecture, going through the book -- ca. 15min work on computer, individually or in pairs - - -In order for this to work out, it is important that enough participants -have a working GF installation, including the directory -[``examples/tutorial`` ../examples/tutorial]. This directory is -included in the Darcs version, as well as in the updated binary -packages from 12 September. - - - -=Purpose= - -[GF http://www.cs.chalmers.se/~aarne/GF/] -(Grammatical Framework) is a grammar formalism, i.e. a special-purpose -programming language for writing grammars. It is suitable for many -natural language processing tasks, in particular, -- multilingual applications -- systems where grammar-based components are needed for e.g. - parsing, translation, or speech recognition - - -The goal of the course is to develop an understanding of GF and -practical skills in using it. - - -=Contents= - -The course consists of two modules. The first module is a one-week -intensive course (during the first intensive week of GSLT), which -is as such usable as a one-week intensive course for doctoral studies, -if completed with a small course project. - -The second module is a larger programming project, written -by each student (possibly working in groups) during the Autumn term.
-The projects are discussed during the second intensive week of GSLT -(see [schedule http://www.gslt.hum.gu.se/courses/schedule.html]), -and presented at a date that will be set later. - -The first module goes through the basics of GF, including -- using the GF programming language -- writing multilingual grammars -- using the - [GF resource grammar library http://www.cs.chalmers.se/~aarne/GF/lib/resource-1.0/doc/] -- generating speech recognition systems from GF grammars -- using embedded grammars as components of software systems - - -The lectures follow a draft of the GF book. It contains a heavily updated -version of the -[GF Tutorial http://www.cs.chalmers.se/~aarne/GF/doc/tutorial/gf-tutorial2.html]; -thus the on-line tutorial is not adequate for this course. To get the course -book, join the course mailing list. - -Those who just want to do the first module will write a simple application -as their course work during and after the first intensive week. - -Those who continue with the second module will choose a more substantial -project. Possible topics are -- building a dialogue system by using GF -- implementing a multilingual document generator -- experimenting with synthesized multilingual tree banks -- extending the GF resource grammar library - - - -=Prerequisites= - -Experience in programming. No earlier natural language processing -or functional programming experience is necessary. - -The course is thus suitable both for GSLT and NGSLT students, -and for graduate students in computer science. - -We will in particular welcome students from the Baltic countries -who wish to build resources for their own language in GF. - diff --git a/doc/gf-help.txt b/doc/gf-help.txt deleted file mode 100644 index d77e9aff7..000000000 --- a/doc/gf-help.txt +++ /dev/null @@ -1,699 +0,0 @@ -=GF Command Help= - -Each command has a long and a short name, options, and zero or more -arguments. Commands are sorted by functionality. The short name is -given first.
- -Commands and options marked with * are currently not implemented. - -==Commands that change the state== - -``` -i, import: i File - Reads a grammar from File and compiles it into a GF runtime grammar. - Files "include"d in File are read recursively, nubbing repetitions. - If a grammar with the same language name is already in the state, - it is overwritten - but only if compilation succeeds. - The grammar parser depends on the file name suffix: - .gf normal GF source - .gfc canonical GF - .gfr precompiled GF resource - .gfcm multilingual canonical GF - .gfe example-based grammar files (only with the -ex option) - .gfwl multilingual word list (preprocessed to abs + cncs) - .ebnf Extended BNF format - .cf Context-free (BNF) format - .trc TransferCore format - options: - -old old: parse in GF<2.0 format (not necessary) - -v verbose: give lots of messages - -s silent: don't give error messages - -src from source: ignore precompiled gfc and gfr files - -gfc from gfc: use compiled modules whenever they exist - -retain retain operations: read resource modules (needed in comm cc) - -nocf don't build old-style context-free grammar (default without HOAS) - -docf do build old-style context-free grammar (default with HOAS) - -nocheckcirc don't eliminate circular rules from CF - -cflexer build an optimized parser with separate lexer trie - -noemit do not emit code (default with old grammar format) - -o do emit code (default with new grammar format) - -ex preprocess .gfe files if needed - -prob read probabilities from top grammar file (format --# prob Fun Double) - -treebank read a treebank file to memory (xml format) - flags: - -abs set the name used for abstract syntax (with -old option) - -cnc set the name used for concrete syntax (with -old option) - -res set the name used for resource (with -old option) - -path use the (colon-separated) search path to find modules - -optimize select an optimization to override file-defined flags - -conversion select parsing method 
(values strict|nondet) - -probs read probabilities from file (format (--# prob) Fun Double) - -preproc use a preprocessor on each source file - -noparse read nonparsable functions from file (format --# noparse Funs) - examples: - i English.gf -- ordinary import of Concrete - i -retain german/ParadigmsGer.gf -- import of Resource to test - -r, reload: r - Executes the previous import (i) command. - -rl, remove_language: rl Language - Takes away the language from the state. - -e, empty: e - Takes away all languages and resets all global flags. - -sf, set_flags: sf Flag* - The values of the Flags are set for Language. If no language - is specified, the flags are set globally. - examples: - sf -nocpu -- stop showing CPU time - sf -lang=Swe -- make Swe the default concrete - -s, strip: s - Prune the state by removing source and resource modules. - -dc, define_command Name Anything - Add a new defined command. The Name must start with '%'. Later, - if 'Name X' is used, it is replaced by Anything where #1 is replaced - by X. - Restrictions: Currently at most one argument is possible, and a defined - command cannot appear in a pipe. - To see what definitions are in scope, use help -defs. - examples: - dc %tnp p -cat=NP -lang=Eng #1 | l -lang=Swe -- translate NPs - %tnp "this man" -- translate and parse - -dt, define_term Name Tree - Add a constant for a tree. The constant can later be called by - prefixing it with '$'. - Restriction: These terms are not yet usable as a subterm. - To see what definitions are in scope, use help -defs. - examples: - p -cat=NP "this man" | dt tm -- define tm as parse result - l -all $tm -- linearize tm in all forms -``` - -==Commands that give information about the state== - -``` -pg, print_grammar: pg - Prints the actual grammar (overridden by the -lang=X flag). - The -printer=X flag sets the format in which the grammar is - written. - N.B.
since grammars are compiled when imported, this command - generally does not show the grammar in the same format as the - source. In particular, the -printer=latex is not supported. - Use the command tg -printer=latex File to print the source - grammar in LaTeX. - options: - -utf8 apply UTF8-encoding to the grammar - flags: - -printer - -lang - -startcat -- The start category of the generated grammar. - Only supported by some grammar printers. - examples: - pg -printer=cf -- show the context-free skeleton - -pm, print_multigrammar: pm - Prints the current multilingual grammar in .gfcm form. - (Automatically executes the strip command (s) before doing this.) - options: - -utf8 apply UTF8 encoding to the tokens in the grammar - -utf8id apply UTF8 encoding to the identifiers in the grammar - examples: - pm | wf Letter.gfcm -- print the grammar into the file Letter.gfcm - pm -printer=graph | wf D.dot -- then do 'dot -Tps D.dot > D.ps' - -vg, visualize_graph: vg - Show the dependency graph of multilingual grammar via dot and gv. - -po, print_options: po - Print what modules there are in the state. Also - prints those flag values in the current state that differ from defaults. - -pl, print_languages: pl - Prints the names of currently available languages. - -pi, print_info: pi Ident - Prints information on the identifier. -``` - -==Commands that execute and show the session history== - -``` -eh, execute_history: eh File - Executes commands in the file. - -ph, print_history: ph - Prints the commands issued during the GF session. - The result is readable by the eh command. - examples: - ph | wf foo.hist -- save the history into a file -``` - - -==Linearization, parsing, translation, and computation== - -``` -l, linearize: l PattList? Tree - Shows all linearization forms of Tree by the actual grammar - (which is overridden by the -lang flag). - The pattern list has the form [P, ... ,Q] where P,...,Q follow GF - syntax for patterns.
All those forms are generated that match with the - pattern list. Too short lists are filled with variables in the end. - Only the -table flag is available if a pattern list is specified. - HINT: see GF language specification for the syntax of Pattern and Term. - You can also copy and paste parsing results. - options: - -struct bracketed form - -table show parameters (not compatible with -record, -all) - -record record, i.e. explicit GF concrete syntax term (not compatible with -table, -all) - -all show all forms and variants (not compatible with -record, -table) - -multi linearize to all languages (can be combined with the other options) - flags: - -lang linearize in this grammar - -number give this number of forms at most - -unlexer filter output through unlexer - examples: - l -lang=Swe -table -- show full inflection table in Swe - -p, parse: p String - Shows all Trees returned for String by the actual - grammar (overridden by the -lang flag), in the category S (overridden - by the -cat flag).
- options for batch input: - -lines parse each line of input separately, ignoring empty lines - -all as -lines, but also parse empty lines - -prob rank results by probability - -cut stop after first lexing result leading to parser success - -fail show strings whose parse fails prefixed by #FAIL - -ambiguous show strings that have more than one parse prefixed by #AMBIGUOUS - options for selecting parsing method: - -fcfg parse using a fast variant of MCFG (default is no HOAS in grammar) - -old parse using an overgenerating CFG (default if HOAS in grammar) - -cfg parse using a much less overgenerating CFG - -mcfg parse using an even less overgenerating MCFG - Note: the first time parsing with -cfg, -mcfg, and -fcfg may take a long time - options that only work for the -old default parsing method: - -n non-strict: tolerates morphological errors - -ign ignore unknown words when parsing - -raw return context-free terms in raw form - -v verbose: give more information if parsing fails - flags: - -cat parse in this category - -lang parse in this grammar - -lexer filter input through this lexer - -parser use this parsing strategy - -number return this many results at most - examples: - p -cat=S -mcfg "jag är gammal" -- parse an S with the MCFG - rf examples.txt | p -lines -- parse each non-empty line of the file - -at, apply_transfer: at (Module.Fun | Fun) - Transfer a term using Fun from Module, or the topmost transfer - module. Transfer modules are given in the .trc format. They are - shown by the 'po' command. - flags: - -lang typecheck the result in this lang instead of default lang - examples: - p -lang=Cncdecimal "123" | at num2bin | l -- convert dec to bin - -tb, tree_bank: tb - Generate a multilingual treebank from a list of trees (default) or compare - to an existing treebank. 
- options: - -c compare to existing xml-formatted treebank - -trees return the trees of the treebank - -all show all linearization alternatives (branches and variants) - -table show tables of linearizations with parameters - -record show linearization records - -xml wrap the treebank (or comparison results) with XML tags - -mem write the treebank in memory instead of a file TODO - examples: - gr -cat=S -number=100 | tb -xml | wf tb.xml -- random treebank into file - rf tb.xml | tb -c -- compare-test treebank from file - rf old.xml | tb -trees | tb -xml -- create new treebank from old - -ut, use_treebank: ut String - Lookup a string in a treebank and return the resulting trees. - Use 'tb' to create a treebank and 'i -treebank' to read one from - a file. - options: - -assocs show all string-trees associations in the treebank - -strings show all strings in the treebank - -trees show all trees in the treebank - -raw return the lookup result as string, without typechecking it - flags: - -treebank use this treebank (instead of the latest introduced one) - examples: - ut "He adds this to that" | l -multi -- use treebank lookup as parser in translation - ut -assocs | grep "ComplV2" -- show all associations with ComplV2 - -tt, test_tokenizer: tt String - Show the token list sent to the parser when String is parsed. - HINT: can be useful when debugging the parser. - flags: - -lexer use this lexer - examples: - tt -lexer=codelit "2*(x + 3)" -- a favourite lexer for program code - -g, grep: g String1 String2 - Grep the String1 in the String2. String2 is read line by line, - and only those lines that contain String1 are returned. - flags: - -v return those lines that do not contain String1. - examples: - pg -printer=cf | grep "mother" -- show cf rules with word mother - -cc, compute_concrete: cc Term - Compute a term by concrete syntax definitions. Uses the topmost - resource module (the last in listing by command po) to resolve - constant names. - N.B. 
You need the flag -retain when importing the grammar, if you want - the oper definitions to be retained after compilation; otherwise this - command does not expand oper constants. - N.B.' The resulting Term is not a term in the sense of abstract syntax, - and hence not a valid input to a Tree-demanding command. - flags: - -table show output in a similar readable format as 'l -table' - -res use another module than the topmost one - examples: - cc -res=ParadigmsFin (nLukko "hyppy") -- inflect "hyppy" with nLukko - -so, show_operations: so Type - Show oper operations with the given value type. Uses the topmost - resource module to resolve constant names. - N.B. You need the flag -retain when importing the grammar, if you want - the oper definitions to be retained after compilation; otherwise this - command does not find any oper constants. - N.B.' The value type may not be defined in a supermodule of the - topmost resource. In that case, use appropriate qualified name. - flags: - -res use another module than the topmost one - examples: - so -res=ParadigmsFin ResourceFin.N -- show N-paradigms in ParadigmsFin - -t, translate: t Lang Lang String - Parses String in Lang1 and linearizes the resulting Trees in Lang2. - flags: - -cat - -lexer - -parser - examples: - t Eng Swe -cat=S "every number is even or odd" - -gr, generate_random: gr Tree? - Generates a random Tree of a given category. If a Tree - argument is given, the command completes the Tree with values to - the metavariables in the tree. - options: - -prob use probabilities (works for nondep types only) - -cf use a very fast method (works for nondep types only) - flags: - -cat generate in this category - -lang use the abstract syntax of this grammar - -number generate this number of trees (not impl. with Tree argument) - -depth use this number of search steps at most - examples: - gr -cat=Query -- generate in category Query - gr (PredVP ? 
(NegVG ?)) -- generate a random tree of this form - gr -cat=S -tr | l -- generate and linearize - -gt, generate_trees: gt Tree? - Generates all trees up to a given depth. If the depth is large, - a small -alts is recommended. If a Tree argument is given, the - command completes the Tree with values to the metavariables in - the tree. - options: - -metas also return trees that include metavariables - -all generate all (can be infinitely many, lazily) - -lin linearize result of -all (otherwise, use pipe to linearize) - flags: - -depth generate to this depth (default 3) - -atoms take this number of atomic rules of each category (default unlimited) - -alts take this number of alternatives at each branch (default unlimited) - -cat generate in this category - -nonub don't remove duplicates (faster, not effective with -mem) - -mem use a memorizing algorithm (often faster, usually more memory-consuming) - -lang use the abstract syntax of this grammar - -number generate (at most) this number of trees (also works with -all) - -noexpand don't expand these categories (comma-separated, e.g. -noexpand=V,CN) - -doexpand only expand these categories (comma-separated, e.g. -doexpand=V,CN) - examples: - gt -depth=10 -cat=NP -- generate all NP's to depth 10 - gt (PredVP ? (NegVG ?)) -- generate all trees of this form - gt -cat=S -tr | l -- generate and linearize - gt -noexpand=NP | l -mark=metacat -- the only NP is meta, linearized "?0 +NP" - gt | l | p -lines -ambiguous | grep "#AMBIGUOUS" -- show ambiguous strings - -ma, morphologically_analyse: ma String - Runs morphological analysis on each word in String and displays - the results line by line.
- options: - -short show analyses in bracketed words, instead of separate lines - -status show just the word at success, prefixed with "*" at failure - flags: - -lang - examples: - rf Bible.txt | ma -short | wf Bible.tagged -- analyse the Bible -``` - - -==Elementary generation of Strings and Trees== - -``` -ps, put_string: ps String - Returns its argument String, like Unix echo. - HINT. The strength of ps comes from the possibility to receive the - argument from a pipeline, and altering it by the -filter flag. - flags: - -filter filter the result through this string processor - -length cut the string after this number of characters - examples: - gr -cat=Letter | l | ps -filter=text -- random letter as text - -pt, put_tree: pt Tree - Returns its argument Tree, like a specialized Unix echo. - HINT. The strength of pt comes from the possibility to receive - the argument from a pipeline, and altering it by the -transform flag. - flags: - -transform transform the result by this term processor - -number generate this number of terms at most - examples: - p "zero is even" | pt -transform=solve -- solve ?'s in parse result - -* st, show_tree: st Tree - Prints the tree as a string. Unlike pt, this command cannot be - used in a pipe to produce a tree, since its output is a string. - flags: - -printer show the tree in a special format (-printer=xml supported) - -wt, wrap_tree: wt Fun - Wraps the tree as the sole argument of Fun. - flags: - -c compute the resulting new tree to normal form - -vt, visualize_tree: vt Tree - Shows the abstract syntax tree via dot and gv (via temporary files - grphtmp.dot, grphtmp.ps). - flags: - -c show categories only (no functions) - -f show functions only (no categories) - -g show as graph (sharing uses of the same function) - -o just generate the .dot file - examples: - p "hello world" | vt -o | wf my.dot ;; !
open -a GraphViz my.dot - -- This writes the parse tree into my.dot and opens the .dot file - -- with another application without generating .ps. -``` - -==Subshells== - -``` -es, editing_session: es - Opens an interactive editing session. - N.B. Exit from a Fudget session is to the Unix shell, not to GF. - options: - -f Fudget GUI (necessary for Unicode; only available in X Window System) - -ts, translation_session: ts - Translates input lines from any of the actual languages to all other ones. - To exit, type a full stop (.) alone on a line. - N.B. Exit from a Fudget session is to the Unix shell, not to GF. - HINT: Set -parser and -lexer locally in each grammar. - options: - -f Fudget GUI (necessary for Unicode; only available in X Windows) - -lang prepend translation results with language names - flags: - -cat the parser category - examples: - ts -cat=Numeral -lang -- translate numerals, show language names - -tq, translation_quiz: tq Lang Lang - Random-generates translation exercises from Lang1 to Lang2, - keeping score of success. - To interrupt, type a full stop (.) alone on a line. - HINT: Set -parser and -lexer locally in each grammar. - flags: - -cat - examples: - tq -cat=NP TestResourceEng TestResourceSwe -- quiz for NPs - -tl, translation_list: tl Lang Lang - Random-generates a list of ten translation exercises from Lang1 - to Lang2. The number can be changed by a flag. - HINT: use wf to save the exercises in a file. - flags: - -cat - -number - examples: - tl -cat=NP TestResourceEng TestResourceSwe -- quiz list for NPs - -mq, morphology_quiz: mq - Random-generates morphological exercises, - keeping score of success. - To interrupt, type a full stop (.) alone on a line. - HINT: use printname judgements in your grammar to - produce nice expressions for desired forms. 
- flags: - -cat - -lang - examples: - mq -cat=N -lang=TestResourceSwe -- quiz for Swedish nouns - -ml, morphology_list: ml - Random-generates a list of ten morphological exercises, - keeping score of success. The number can be changed with a flag. - HINT: use wf to save the exercises in a file. - flags: - -cat - -lang - -number - examples: - ml -cat=N -lang=TestResourceSwe -- quiz list for Swedish nouns -``` - - -==IO-related commands== - -``` -rf, read_file: rf File - Returns the contents of File as a String; error if File does not exist. - -wf, write_file: wf File String - Writes String into File; File is created if it does not exist. - N.B. the command overwrites File without a warning. - -af, append_file: af File - Writes String into the end of File; File is created if it does not exist. - -* tg, transform_grammar: tg File - Reads File, parses as a grammar, - but instead of compiling further, prints it. - The environment is not changed. When parsing the grammar, the same file - name suffixes are supported as in the i command. - HINT: use this command to print the grammar in - another format (the -printer flag); pipe it to wf to save this format. - flags: - -printer (only -printer=latex supported currently) - -* cl, convert_latex: cl File - Reads File, which is expected to be in LaTeX form. - -sa, speak_aloud: sa String - Uses the Flite speech generator to produce speech for String. - Works for American English spelling. - examples: - h | sa -- listen to the list of commands - gr -cat=S | l | sa -- generate a random sentence and speak it aloud - -si, speech_input: si - Uses an ATK speech recognizer to get speech input. - flags: - -lang: The grammar to use with the speech recognizer. - -cat: The grammar category to get input in. - -language: Use acoustic model and dictionary for this language. - -number: The number of utterances to recognize. - -h, help: h Command? - Displays the paragraph concerning the command from this help file. 
- Without the argument, shows the first lines of all paragraphs. - options - -all show the whole help file - -defs show user-defined commands and terms - -FLAG show the values of FLAG (works for grammar-independent flags) - examples: - h print_grammar -- show all information on the pg command - -q, quit: q - Exits GF. - HINT: you can use 'ph | wf history' to save your session. - -!, system_command: ! String - Issues a system command. No value is returned to GF. - example: - ! ls - -?, system_command: ? String - Issues a system command that receives its arguments from GF pipe - and returns a value to GF. - example: - h | ? 'wc -l' | p -cat=Num -``` - - -==Flags== - -The availability of flags is defined separately for each command. -``` --cat, category in which parsing is performed. - The default is S. - --depth, the search depth in e.g. random generation. - The default depends on application. - --filter, operation performed on a string. The default is identity. - -filter=identity no change - -filter=erase erase the text - -filter=take100 show the first 100 characters - -filter=length show the length of the string - -filter=text format as text (punctuation, capitalization) - -filter=code format as code (spacing, indentation) - --lang, grammar used when executing a grammar-dependent command. - The default is the last-imported grammar. - --language, voice used by Festival as its --language flag in the sa command. - The default is system-dependent. - --length, the maximum number of characters shown of a string. - The default is unlimited. - --lexer, tokenization transforming a string into lexical units for a parser. - The default is words. - -lexer=words tokens are separated by spaces or newlines - -lexer=literals like words, but GF integer and string literals recognized - -lexer=vars like words, but "x","x_...","$...$" as vars, "?..." 
as meta - -lexer=chars each character is a token - -lexer=code use Haskell's lex - -lexer=codevars like code, but treat unknown words as variables, ?? as meta - -lexer=textvars like text, but treat unknown words as variables, ?? as meta - -lexer=text with conventions on punctuation and capital letters - -lexer=codelit like code, but treat unknown words as string literals - -lexer=textlit like text, but treat unknown words as string literals - -lexer=codeC use a C-like lexer - -lexer=ignore like literals, but ignore unknown words - -lexer=subseqs like ignore, but then try all subsequences from longest - --number, the maximum number of generated items in a list. - The default is unlimited. - --optimize, optimization on generated code. - The default is share for concrete, none for resource modules. - Each of the flags can have the suffix _subs, which performs - common subexpression elimination after the main optimization. - Thus, -optimize=all_subs is the most aggressive one. The _subs - strategy only works in GFC, and applies therefore in concrete but - not in resource modules. - -optimize=share share common branches in tables - -optimize=parametrize first try parametrize then do share with the rest - -optimize=values represent tables as courses-of-values - -optimize=all first try parametrize then do values with the rest - -optimize=none no optimization - --parser, parsing strategy. The default is chart. If -cfg or -mcfg are - selected, only bottomup and topdown are recognized. - -parser=chart bottom-up chart parsing - -parser=bottomup a more up to date bottom-up strategy - -parser=topdown top-down strategy - -parser=old an old bottom-up chart parser - --printer, format in which the grammar is printed. The default is - gfc. Those marked with M are (only) available for pm, the rest - for pg. 
- -printer=gfc GFC grammar - -printer=gf GF grammar - -printer=old old GF grammar - -printer=cf context-free grammar, with profiles - -printer=bnf context-free grammar, without profiles - -printer=lbnf labelled context-free grammar for BNF Converter - -printer=plbnf grammar for BNF Converter, with precedence levels - *-printer=happy source file for Happy parser generator (use lbnf!) - -printer=haskell abstract syntax in Haskell, with transl to/from GF - -printer=haskell_gadt abstract syntax GADT in Haskell, with transl to/from GF - -printer=morpho full-form lexicon, long format - *-printer=latex LaTeX file (for the tg command) - -printer=fullform full-form lexicon, short format - *-printer=xml XML: DTD for the pg command, object for st - -printer=old old GF: file readable by GF 1.2 - -printer=stat show some statistics of generated GFC - -printer=probs show probabilities of all functions - -printer=gsl Nuance GSL speech recognition grammar - -printer=jsgf Java Speech Grammar Format - -printer=jsgf_sisr_old Java Speech Grammar Format with semantic tags in - SISR WD 20030401 format - -printer=srgs_abnf SRGS ABNF format - -printer=srgs_abnf_non_rec SRGS ABNF format, without any recursion. - -printer=srgs_abnf_sisr_old SRGS ABNF format, with semantic tags in - SISR WD 20030401 format - -printer=srgs_xml SRGS XML format - -printer=srgs_xml_non_rec SRGS XML format, without any recursion. - -printer=srgs_xml_prob SRGS XML format, with weights - -printer=srgs_xml_sisr_old SRGS XML format, with semantic tags in - SISR WD 20030401 format - -printer=vxml Generate a dialogue system in VoiceXML. 
- -printer=slf a finite automaton in the HTK SLF format - -printer=slf_graphviz the same automaton as slf, but in Graphviz format - -printer=slf_sub a finite automaton with sub-automata in the - HTK SLF format - -printer=slf_sub_graphviz the same automaton as slf_sub, but in - Graphviz format - -printer=fa_graphviz a finite automaton with labelled edges - -printer=regular a regular grammar in a simple BNF - -printer=unpar a gfc grammar with parameters eliminated - -printer=functiongraph abstract syntax functions in 'dot' format - -printer=typegraph abstract syntax categories in 'dot' format - -printer=transfer Transfer language datatype (.tr file format) - -printer=cfg-prolog M cfg in prolog format (also pg) - -printer=gfc-prolog M gfc in prolog format (also pg) - -printer=gfcm M gfcm file (default for pm) - -printer=graph M module dependency graph in 'dot' (graphviz) format - -printer=header M gfcm file with header (for GF embedded in Java) - -printer=js M JavaScript type annotator and linearizer - -printer=mcfg-prolog M mcfg in prolog format (also pg) - -printer=missing M the missing linearizations of each concrete - --startcat, like -cat, but used in grammars (to avoid clash with keyword cat) - --transform, transformation performed on a syntax tree. The default is identity. - -transform=identity no change - -transform=compute compute by using definitions in the grammar - -transform=nodup return the term only if it has no constants duplicated - -transform=nodupatom return the term only if it has no atomic constants duplicated - -transform=typecheck return the term only if it is type-correct - -transform=solve solve metavariables as derived refinements - -transform=context solve metavariables by unique refinements as variables - -transform=delete replace the term by metavariable - --unlexer, untokenization transforming linearization output into a string. - The default is unwords. 
- -unlexer=unwords space-separated token list (like unwords) - -unlexer=text format as text: punctuation, capitals, paragraph- -unlexer=code format as code (spacing, indentation) - -unlexer=textlit like text, but remove string literal quotes - -unlexer=codelit like code, but remove string literal quotes - -unlexer=concat remove all spaces - -unlexer=bind like identity, but bind at "&+" - --mark, marking of parts of tree in linearization. The default is none. - -mark=metacat append "+CAT" to every metavariable, showing its category - -mark=struct show tree structure with brackets - -mark=java show tree structure with XML tags (used in gfeditor) - --coding, Some grammars are in UTF-8, some in isolatin-1. - If the letters ä (a-umlaut) and ö (o-umlaut) look strange, either - change your terminal to isolatin-1, or rewrite the grammar with - 'pg -utf8'. -``` diff --git a/doc/gf-history.html b/doc/gf-history.html deleted file mode 100644 index 3fe8153e2..000000000 --- a/doc/gf-history.html +++ /dev/null @@ -1,865 +0,0 @@ - -
-
-
-
-- -25/6 (BB) -Added new speech recognition grammar printers for non-recursive SRGS grammars, -as used by Nuance Recognizer 9.0. Try pg -printer=srgs_xml_non_rec -or pg -printer=srgs_abnf_non_rec. - -
- -19/6 (AR) -Extended the functor syntax (with modules) so that the functor can have -restricted import and a module body (whose function is normally to complete restricted -import). Thus the following format is now possible: -
- concrete C of A = E ** CI - [f,g] with (...) ** open R in {...}
-
-At the same time, the possibility of an empty module body was added to other modules
-for symmetry. This can be useful for "proxy modules" that just collect other modules
-without adding anything, e.g.
-- abstract Math = Arithmetic, Geometry ; -- - -
- - -18/6 (AR) -Added a warning for clashing constants. A constant coming from multiple opened modules -was interpreted as "the first" found by the compiler, which was a source of difficult -errors. Clashing is officially forbidden, but we chose to give a warning instead of -raising an error to begin with (in version 2.8). - -
- -30/1/2007 (AR) -Semantics of variants fixed for complex types. Officially, it was only -defined for basic types (Str and parameters). When used for records, results were -multiplicative, which was not usable. But now variants should work for any type. - -
- -
- -22/12 (AR) Release of GF version 2.7. - -
- -21/12 (AR) -Overloading rules for GF version 2.7: -
- -21/12 (BB) Java Speech Grammar Format with SISR tags can now be generated. -Use pg -printer=jsgf_sisr_old. The SISR tags are in Working Draft -20030401 format, which is supported by the OptimTALK VoiceXML interpreter -and the IBM XHTML+Voice implementation used by the Opera web browser. - -
-
-21/12 (BB)
-VoiceXML 2.0 dialog systems can now be generated from GF grammars.
-Use pg -printer=vxml.
-
-
-
-21/12 (BB)
-JavaScript code for linearization and type annotation can now be
-generated from a multilingual GF grammar. Use pm -printer=js.
-
-
-
-
-5/12 (BB)
-A new tool for generating C linearization libraries
-from a GFCC file. make gfcc2c in src
-compiles the tool. The generated
-code includes header files in lib/c and should be linked
-against libgfcc.a in lib/c. For an example of
-using the generated code, see src/tools/c/examples/bronzeage.
-make in that directory generates a GFCC file, then generates
-C code from that, and then compiles a program bronzeage-test.
-The main function for that program is defined in
-bronzeage-test.c.
-
-
-
-
-20/11 (AR) Type error messages in concrete syntax are printed with a
-heuristic where a type of the form {... ; lock_C : {} ; ...}
-is printed as C. This gives more readable error messages, but
-can produce wrong results if lock fields are hand-written or if subtypes
-of lock-fielded categories are used.
-
-
-
-17/11 (AR)
-Operation overloading: an oper can have many types,
-from which one is picked at compile time. The types must have different
-argument lists. Exact match with the arguments given to the oper
-is required. An example is given in
-Constructors.gf.
-The purpose of overloading is to make libraries easier to use, since
-only one name for each grammatical operation is needed: predication, modification,
-coordination, etc. The concrete syntax is, at this experimental level, not
-extended but relies on using a record with the function name repeated
-as label name (see the example). The treatment of overloading is inspired
-by C++, and was first suggested by Björn Bringert.
-
-
-
-
-3/10 (AR) A new low-level format gfcc ("Canonical Canonical GF").
-It is going to replace the gfc format later, but is already now
-an efficient format for multilingual generation.
-See GFCC document
-for more information.
-
-
-
-1/9 (AR) New way for managing errors in grammar compilation:
-
-
-16/8 (AR) New generation algorithm: slower but works with less
-memory. Default of gt; use gt -mem for the old
-algorithm. The new option gt -all lazily generates all
-trees until interrupted. It cannot be piped to other GF commands,
-hence use gt -all -lin to print out linearized strings
-rather than trees.
-
-
-
-20/6 (AR) The FCFG parser is now the default, as it even handles literals.
-The old default can be selected by p -old. Since
-FCFG does not support variable bindings, -old is automatically
-selected if the grammar has bindings - and unless the -fcfg flag
-is used.
-
-
-
-17/6 (AR) The FCFG parser is now the recommended method for parsing
-heavy grammars such as the resource grammars. It does not yet support
-literals and variable bindings.
-
-
-
-1/6 (AR) Added the FCFG parser written by Krasimir Angelov. Invoked by
-p -fcfg. This parser is as general as MCFG but faster.
-It needs more testing and debugging.
-
-
-
-1/6 (AR) The command r = reload repeats the latest
-i = import command.
-
-
-
-30/5 (AR) It is now possible to use the flags -all, -table, -record
-in combination with l -multi, and also with tb.
-
-
-
-18/5 (AR) Introduced a wordlist format gfwl for
-quick creation of language exercises and (in future) multilingual lexica.
-The format is now very simple:
-
-
-3/4 (AR) The predefined abstract syntax type Int now has two
-inherent parameters indicating its last digit and its size. The (hard-coded)
-linearization type is
-
-
-31/3 (AR) Added flags and options to some commands, to help generation:
-
-
-
-
-16/3 (AR) Added two flag values to pt -transform=X:
-nodup which excludes terms where a constant is duplicated,
-and
-nodupatom which excludes terms where an atomic constant is duplicated.
-The latter, in particular, is useful as a filter in generation:
-
-
-6/3 (AR) Generalized the gfe file format in two ways:
-
-
-4/3 (AR) Added command use_treebank = ut for lookup in a treebank.
-This command can be used as a fast substitute for parsing, but also as a
-way to browse treebanks.
-
-
-3/3 (AR) Added option -treebank to the i command. This adds treebanks to
-the shell state. The possible file formats are
-
-
-1/3 (AR) Added option -trees to the command tree_bank = tb.
-By this option, the command just returns the trees in the treebank. It can be
-used for producing new treebanks with the same trees:
-
-
-1/3 (AR) A .gfe file can have a --# -path=PATH on its
-second line. The file given on the first line (--# -resource=FILE)
-is then read w.r.t. this path. This is useful if the resource file has
-no path itself, which happens when it is gfc-only.
-
-
-
-25/2 (AR) The flag preproc of the i command (and thereby
-to gf itself) causes GF to apply a preprocessor to each sourcefile
-it reads.
-
-
-
-8/2 (AR) The command tb = tree_bank for creating and testing against
-multilingual treebanks. Example uses:
-
-
-10/1 (AR) Forbade variable binding inside negation and Kleene star
-patterns.
-
-
-
-7/1 (AR) Full set of regular expression patterns, with
-as-patterns to enable variable bindings to matched expressions:
-
-
-6/1 (AR) Concatenative string patterns to help morphology definitions...
-This can be seen as a step towards regular expression string patterns.
-The natural notation p1 + p2 will be considered later.
-Note. This was done on 7/1.
-
-
-
-5/1/2006 (BB) New grammar printers slf_sub and slf_sub_graphviz
-for creating SLF networks with sub-automata.
-
-
-
-21/12 (AR) It now works to parse escaped string literals from command
-line, and also string literals with spaces:
-
-
-20/12 (AR) Support for full disjunctive patterns (P|Q) i.e.
-not just on top level.
-
-
-
-14/12 (BB) The command si (speech_input) which creates
-a speech recognizer from a grammar for English and admits speech input
-of strings has been added. The command uses an
-ATK recognizer and
-creates a recognition
-network which accepts strings in the currently active grammar.
-In order to use the si command,
-you need to install the
-atkrec library
-and configure GF with ./configure --with-atk before compiling.
-You need to set two environment variables for the si command to
-work. ATK_HOME should contain the path to your copy of ATK
-and GF_ATK_CFG should contain the path to your GF ATK configuration
-file. A default version of this file can be found in
- GF/src/gf_atk.cfg.
-
-
-
-
-11/12 (AR) Parsing of float literals now possible in object language.
-Use the flag lexer=literals.
-
-
-
-6/12 (AR) Accept param and oper definitions in
-concrete modules. The definitions are just inlined in the
-current module and not inherited. The purpose is to support rapid
-prototyping of grammars.
-
-
-
-2/12 (AR) The built-in type Float added to abstract syntax (and
-resource). Values are stored as Haskell's Double precision
-floats. For the syntax of float literals, see BNFC document.
-NB: some bug still prevents parsing float literals in object
-languages. Bug fixed 11/12.
-
-
-
-1/12 (BB,AR) The command at = apply_transfer, which applies
-a transfer function to a term. This is used for noncompositional
-translation. Transfer functions are defined in a special transfer
-language (file suffix .tr), which is compiled into a
-run-time transfer core language (file suffix .trc).
-The compiler is included in GF/transfer. The following is
-a complete example of how to try out transfer:
-
-
-17/11 (AR) Made it possible for lexers to be nondeterministic.
-Now with a simple-minded implementation that the parser is sent
-each lexing result in turn. The option -cut is used for
-breaking after first lexing leading to successful parse. The only
-nondeterministic lexer right now is -lexer=subseqs, which
-first filters with -lexer=ignore (dropping words neither in
-the grammar nor literals) and then starts ignoring other words from
-longest to shortest subsequence. This is usable for parser tasks
-of keyword spotting type, but expensive (2^n) in long input.
-A smarter implementation is therefore desirable.
-
-
-
-14/11 (AR) Functions can be made unparsable (or "internal" as
-in BNFC). This is done by i -noparse=file, where
-the nonparsable functions are given in file using the
-line format --# noparse Funs. This can be used e.g. to
-rule out expensive parsing rules. It is used in
-lib/resource/abstract/LangVP.gf to get parse values
-structured with VP, which is obtained via transfer.
-So far only the default (= old) parser generator supports this.
-
-
-
-14/11 (AR) Removed the restrictions on what a lincat may look like.
-Now any record type that has a value in GFC (i.e. without any
-functions in it) can be used, e.g. {np : NP ; cn : Bool => CN}.
-To display linearization values, only l -record shows
-nice results.
-
-
-
-9/11 (AR) GF shell state can now have several abstract syntaxes with
-their associated concrete syntaxes. This allows e.g. parsing with
-resource while testing an application. One can also have a
-parse-transfer-lin chain from one abstract syntax to another.
-
-
-7/11 (BB) Running commands can now be interrupted with Ctrl-C, without
-killing the GF process. This feature is not supported on Windows.
-
-
-
-1/11 (AR) Yet another method for adding probabilities: append
- --# prob Double to the end of a line defining a function.
-This can be (1) a .cf rule (2) a fun rule, or
-(3) a lin rule. The probability is attached to the
-first identifier on the line.
-
-
-1/11 (BB) Added generation of weighted SRGS grammars. The weights
-are calculated from the function probabilities. The algorithm
-for calculating the weights is not yet very good.
-Use pg -printer=srgs_xml_prob.
-
-
-31/10 (BB) Added option for converting grammars to SRGS grammars in XML format.
-Use pg -printer=srgs_xml.
-
-
-
-31/10 (AR) Probabilistic grammars. Probabilities can be used to
-weight random generation (gr -prob) and to rank parse
-results (p -prob). They are read from a separate file
-(flag i -probs=File, format --# prob Fun Double)
-or from the top-level grammar file itself (option i -prob).
-To see the probabilities, use pg -printer=probs.
-
-
-12/10 (AR) Flag -atoms=Int to the command gt = generate_trees
-takes away all zero-argument functions except Int per category. In
-this way, it is possible to generate a corpus illustrating each
-syntactic structure even when the lexicon (which consists of
-zero-argument functions) is large.
-
-
-
-6/10 (AR) New commands dc = define_command and
-dt = define_tree to define macros in a GF session.
-See help for details and examples.
-
-
-
-5/10 (AR) Printing missing linearization rules:
-pm -printer=missing. Command g = grep,
-which works in a way similar to Unix grep.
-
-
-
-5/10 (PL) Printing graphs with function and category dependencies:
-pg -printer=functiongraph, pg -printer=typegraph.
-
-
-
-20/9 (AR) Added optimization by common subexpression elimination.
-It works on GFC modules and creates oper definitions for
-subterms that occur more than once in lin definitions. These
-oper definitions are automatically reinlined in functionalities
-that don't support opers in GFC. This conversion is done by
-module and the opers are not inherited. Moreover, the subterms
-can contain free variables which means that the opers are not
-always well typed. However, since all variables in GFC are type-specific
-(and local variables are lin-specific), this does not destroy
-subject reduction or cause illegal captures.
-
-
-18/9 (AR) Removed superfluous spaces from GFC printing. This shrinks
-the GFC size by 5-10%.
-
-
-
-15/9 (AR) Fixed some bugs in dependent-type type checking of abstract
-modules at compile time. The type checker is more severe now, which means
-that some old grammars may fail to compile - but this is usually the
-right result. However, the type checker of def judgements still
-needs work.
-
-
-
-14/9 (AR) Added printing of grammars to a format without parameters, in
-the spirit of Peano's "Latino sine flexione". The command pg -unpar
-does the trick, and the result can be saved in a gfcm file. The generated
-concrete syntax modules get the prefix UP_. The translation is briefly:
-
-
-14/9 (BB) Added finite state approximation of grammars.
-Internally the conversion is done cfg -> regular -> fa -> slf, so the
-different printers can be used to check the output of each stage.
-The new options are:
-
-
-4/9 (AR) Added the option pg -printer=stat to show
-statistics of gfc compilation result. To be extended with new information.
-The most important stats now are the top-40 sized definitions.
-
-
-
-
-
-1/7 (AR) Added the flag -o to the vt command
-to just write the .dot file without going to .ps
-(cf. 20/6).
-
-
-
-29/6 (AR) The printer used by Embedded Java GF Interpreter
-(pm -header) now produces
-working code from all optimized grammars - hence you need not select a
-weaker optimization just to use the interpreter. However, the
-optimization -optimize=share usually produces smaller object
-grammars because the "unoptimizer" just undoes all optimizations.
-(This is to be considered a temporary solution until the interpreter
-knows how to handle stronger optimizations.)
-
-
-
-27/6 (AR) The flag flags optimize=noexpand placed in a
-resource module prevents the optimization phase of the compiler when
-the .gfr file is created. This can prevent serious code
-explosion, but it will also make the processing of modules using the
-resource slower. A favourable example is lib/resource/finnish/ParadigmsFin.
-
-
-
-23/6 (HD,AR) The new editor GUI gfeditor by Hans-Joachim
-Daniels can now be used. It is based on Janna Khegai's jgf.
-New functionality includes HTML display (gfeditor -h) and
-programmable refinement tooltips.
-
-
-
-23/6 (AR) The flag unlexer=finnish can be used to bind
-Finnish suffixes (e.g. possessives) to preceding words. The GF source
-notation is e.g. "isä" ++ "&*" ++ "nsa" ++ "&*" ++ "ko",
-which unlexes to "isänsäkö". There is no corresponding lexer
-support yet.
-
-
-
-
-22/6 (PL,AR) The MCFG parser (p -mcfg) now works on all
-optimized grammars - hence you need not select a weaker optimization
-to use this parser. The same concerns the CFGM printer (pm -printer=cfgm).
-
-
-
-20/6 (AR) Added the command visualize_tree = vt, to
-display syntax trees graphically. Like vg, this command uses
-GraphViz and Ghostview. The foremost use is to pipe the parser to this
-command.
-
-
-
-17/6 (BB) There is now support for lists in GF abstract syntax.
-A list category is declared as:
-
-
-cat [C]{n} is equivalent to the declarations:
-
-
-A lincat declaration of the form:
-
-
-10/6 (AR) Preprocessor of .gfe files can now be performed as part of
-any grammar compilation. The flag -ex causes GF to look for
-the .gfe files and preprocess those that are younger
-than the corresponding .gf files. The files are first sorted
-and grouped by the resource, so that each resource only need be compiled once.
-
-
-
-10/6 (AR) Editor GUI can now be alternatively invoked by the shell
-command gf -edit (equivalent to jgf).
-
-
-
-10/6 (AR) Editor GUI command pc Int to pop Int
-items from the clip board.
-
-
-
-4/6 (AR) Sequence of commands in the Java editor GUI now possible.
-The commands are separated by ;; (notice the space on
-both sides of the two semicolons). Such a sequence can be sent
-from the "GF Command" pop-up field, but is mostly intended
-for external processes that communicate with GF.
-
-
-
-3/6 (AR) The format .gfe defined to support
-grammar writing by examples. Files of this format are first
-converted to .gf files by the command
-
-
-31/5 (AR) Default of p -rawtrees=k changed to 999999.
-
-
-
-31/5 (AR) Support for restricted inheritance. Syntax:
-
-
-29/5 (AR) Parser support for reading GFC files line per line.
-The category Line in GFC.cf can be used
-as entrypoint instead of Grammar to achieve this.
-
-
-
-28/5 (AR) Environment variables and path wild cards.
-
-
-
-26/5/2005 (BB) Notation for list categories.
-
-
-
-
-
diff --git a/doc/gf-modules.html b/doc/gf-modules.html
deleted file mode 100644
index 6292bd855..000000000
--- a/doc/gf-modules.html
+++ /dev/null
@@ -1,1183 +0,0 @@
-
-
-
-A GF grammar consists of a set of modules, which can be
-combined in different ways to build different grammars.
-There are several different types of modules:
-
-We will go through the module types in this order, which is also
-their order of "importance" from the most basic to
-the more advanced ones.
-
-This document presupposes knowledge of GF judgements and expressions, which can
-be gained from the GF tutorial. It aims
-to give a systematic description of the module system;
-some tutorial information is repeated to make the document
-self-contained.
-
- Predef.Error : Type ;
- Predef.error : Str -> Predef.Error ;
-
-Denotationally, Error is the empty type and thus a
-subtype of any other type: it can be used anywhere. But the
-error function is not canonical. Hence the compilation
-is interrupted when (error s) is translated to GFC, and
-the message s is emitted. An example use is given in
-english/ParadigmsEng.gf:
-
- regDuplV : Str -> V ;
- regDuplV fit =
- case last fit of {
- ("a" | "e" | "i" | "o" | "u" | "y") =>
- Predef.error (["final duplication makes no sense for"] ++ fit) ;
- t =>
- let fitt = fit + t in
- mkV fit (fit + "s") (fitt + "ed") (fitt + "ed") (fitt + "ing")
- } ;
-
-This function thus cannot be applied to a stem ending with a vowel,
-which is exactly what we want. In future, it may be good to add similar
-checks to all morphological paradigms in the resource.
-
-
-
-
-
-22/6 (AR) Release of GF version 2.6.
-
-
- # Svenska - Franska - Finska
- berg - montagne - vuori
- klättra - grimper / escalader - kiivetä / kiipeillä
-
-but can be extended to cover paradigm functions in addition to just
-words.
-
-
- {s : Str ; size : Predef.Ints 1 ; last : Predef.Ints 9}
-
-The size field has value 1 for integers greater than 9, and
-value 0 for other integers (which are never negative). This parameter can
-be used e.g. in calculating number agreement,
-
- Risala i = {s = i.s ++ table (Predef.Ints 1 * Predef.Ints 9) {
- <0,1> => "risalah" ;
- <0,2> => "risalatan" ;
- <0,_> | <1,0> => "rasail" ;
- _ => "risalah"
- } ! <i.size,i.last>
- } ;
-
-Notice that the table has to be typed explicitly for Ints k,
-because type inference would otherwise return Int and therefore
-fail to expand the table.
-
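The two inherent parameters of Int and the agreement table above can be modelled outside GF. The following is a minimal Python sketch, not GF code: the function names `int_params` and `risala` are hypothetical, and the only facts taken from the entry are the definition of `size` and `last` and the four branches of the table.

```python
def int_params(i):
    """Return (size, last) as described for the Int linearization type.

    size is 1 for integers greater than 9 and 0 otherwise;
    last is the final decimal digit. Negative input is rejected,
    since the entry states that these integers are never negative.
    """
    if i < 0:
        raise ValueError("Int literals are assumed to be non-negative")
    return (1 if i > 9 else 0, i % 10)


def risala(i):
    """Mirror the branches of the Risala table, tried in order."""
    size, last = int_params(i)
    if (size, last) == (0, 1):
        form = "risalah"
    elif (size, last) == (0, 2):
        form = "risalatan"
    elif size == 0 or (size, last) == (1, 0):
        form = "rasail"
    else:
        form = "risalah"
    # i.s ++ form: the numeral followed by the selected noun form
    return f"{i} {form}"
```

Because the branches are tried top to bottom, the pair (0, 1) never reaches the `<0,_>` case, which is the same first-match behaviour the GF table relies on.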
-
-
-
-
-
-
-21/3/2006 Release of GF 2.5.
-
-
- gt -cat=Cl | pt -transform=nodupatom
-
-This gives a corpus where words don't (usually) occur twice in the same clause.
-
-
-
-A minor novelty is that the --# -resource=FILE flag can now be
-relative to GF_LIB_PATH, both for grammars and treebanks.
-The flag --# -treebank=IDENT gives the language whose treebank
-entries are used, in case of a multilingual treebank.
-
-
- ut "He adds this to that" | l -multi -- use treebank lookup as parser in translation
- ut -assocs | grep "ComplV2" -- show all associations with ComplV2
-
-
-
-
-Notice that the treebanks in shell state are unilingual, and have strings as keys.
-Multilingual treebanks have trees as keys. In case 1, one unilingual treebank per
-language is built in the shell state.
-
-
-
- rf old.xml | tb -trees | tb -xml | wf new.xml
-
-Recall that only treebanks in the XML format can be read with the -trees
-and -c flags.
-
-
- gr -cat=S -number=100 | tb -xml | wf tb.xml -- random treebank into file
- rf tb.txt | tb -c -- read comparison treebank from file
-
-
-
-
-The last three apply to all types of patterns, the first two only to token strings.
-Example: plural formation in Swedish 2nd declension
-(pojke-pojkar, nyckel-nycklar, seger-segrar, bil-bilar):
-
- plural2 : Str -> Str = \w -> case w of {
- pojk + "e" => pojk + "ar" ;
- nyck + "e" + l@("l" | "r" | "n") => nyck + l + "ar" ;
- bil => bil + "ar"
- } ;
-
-Semantics: variables are always bound to the first match, in the sequence defined
-as the list Match p v as follows:
-
- Match (p1|p2) v = Match p1 v ++ Match p2 v
- Match (p1+p2) s = [Match p1 s1 ++ Match p2 s2 | i <- [0..length s], (s1,s2) = splitAt i s]
- Match p* s = Match "" s ++ Match p s ++ Match (p + p) s ++ ...
- Match c v = [[]] if c == v -- for constant patterns c
- Match x v = [[(x,v)]] -- for variable patterns x
- Match x@p v = [[(x,v)]] + M if M = Match p v /= []
- Match p v = [] otherwise -- failure
-
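The Match equations above can be tried out with a small Python model. This is a sketch under stated assumptions, not the GF implementation: patterns are encoded as hypothetical tagged tuples, the `++` in the sequence rule is read as combining every environment of the left part with every environment of the right part, the as-pattern rule is read as prepending the binding to each environment of the inner match, and the infinite unfolding of `p*` is cut off at `len(s)` repetitions.

```python
def match(p, v):
    """Return the list of binding environments for pattern p against string v.

    An environment is a list of (variable, value) pairs; [] means failure.
    """
    kind = p[0]
    if kind == "alt":                      # p1 | p2
        return match(p[1], v) + match(p[2], v)
    if kind == "seq":                      # p1 + p2, all split points in order
        envs = []
        for i in range(len(v) + 1):
            for e1 in match(p[1], v[:i]):
                for e2 in match(p[2], v[i:]):
                    envs.append(e1 + e2)
        return envs
    if kind == "star":                     # p*, unfolded 0..len(v) times
        envs = []
        rep = ("const", "")
        for _ in range(len(v) + 1):
            envs += match(rep, v)
            rep = ("seq", rep, p[1])
        return envs
    if kind == "const":                    # constant pattern c
        return [[]] if p[1] == v else []
    if kind == "var":                      # variable pattern x
        return [[(p[1], v)]]
    if kind == "as":                       # x@p
        return [[(p[1], v)] + e for e in match(p[2], v)]
    raise ValueError(f"unknown pattern kind: {kind}")


def plural2(w):
    """The plural2 example: branches tried in order, first match wins."""
    liquid = ("alt", ("const", "l"), ("alt", ("const", "r"), ("const", "n")))
    branches = [
        (("seq", ("var", "pojk"), ("const", "e")),
         lambda b: b["pojk"] + "ar"),
        (("seq", ("var", "nyck"), ("seq", ("const", "e"), ("as", "l", liquid))),
         lambda b: b["nyck"] + b["l"] + "ar"),
        (("var", "bil"),
         lambda b: b["bil"] + "ar"),
    ]
    for pattern, build in branches:
        envs = match(pattern, w)
        if envs:
            return build(dict(envs[0]))    # variables bound to the first match
    raise ValueError("no branch matched")
```

Taking `envs[0]` implements the rule that variables are bound to the first match in the sequence. The model is exponential in the worst case and is meant only to make the equations concrete.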
-Examples:
-
-
-
-
-22/12 Release of GF 2.4.
-
-
- gf examples/tram0/TramEng.gf
- > p -lexer=literals "I want to go to \"Gustaf Adolfs torg\" ;"
- QInput (GoTo (DestNamed "Gustaf Adolfs torg"))
-
-
-
- % cd GF/transfer
- % make -- compile the trc compiler
- % cd examples -- GF/transfer/examples
- % ../compile_to_core -i../lib numerals.tr
- % mv numerals.trc ../../examples/numerals
- % cd ../../examples/numerals -- GF/examples/numerals
- % gf
- > i decimal.gf
- > i BinaryDigits.gf
- > i numerals.trc
- > p -lang=Cncdecimal "123" | at num2bin | l
- 1 0 0 1 1 0 0 1 1 1 0
-
-Other relevant commands are:
-
-
-For more information on the commands, see help. Documentation on
-the transfer language: to appear.
-
-
-As a by-product, the probabilistic random generation algorithm is
-available for any context-free abstract syntax. Use the flag
-gr -cf. This algorithm is much faster than the
-old (more general) one, but it may sometimes loop.
-
-
-The optimization is triggered by the flag optimize=OPT_subs,
-where OPT is any of the other optimizations (see h -optimize).
-The most aggressive value of the flag is all_subs. In experiments,
-the size of a GFC module can shrink by 85% compared to plain all.
-
-
- (P => T)* = T*
- (t ! p)* = t*
- (table {p => t ; ...})* = t*
-
-In order for this to be maximally useful, the grammar should be written in such
-a way that the first value of every parameter type is the desired one. For
-instance, in Peano's case it would be the ablative for noun cases, the singular for
-numbers, and the 2nd person singular imperative for verb forms.
-
-
-
-
-
-
-1/7 Release of GF 2.3.
-
-
-cat [C]
-
-or
-
-cat [C]{n}
-
-where C is a category and n is a non-negative integer.
-cat [C] is equivalent to cat [C]{0}. List category
-syntax can be used wherever categories are used.
-
-
-cat ListC
-fun BaseC : C^n -> ListC
-fun ConsC : C -> ListC -> ListC
-
-
-where C^0 -> X means X, and C^m (where
-m > 0) means C -> C^(m-1).
-
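The expansion of a list category into ordinary declarations can be spelled out mechanically. The Python sketch below only generates the three declaration strings described in this entry; the function name `expand_list_cat` is hypothetical and the output is plain text, not something GF itself produces in this form.

```python
def expand_list_cat(c, n=0):
    """Return the declarations that cat [C]{n} stands for.

    C^0 -> X is X, and C^m -> X is C -> C^(m-1) -> X, so BaseC takes
    exactly n arguments of category C.
    """
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    base_args = "".join(f"{c} -> " for _ in range(n))
    return [
        f"cat List{c}",
        f"fun Base{c} : {base_args}List{c}",
        f"fun Cons{c} : {c} -> List{c} -> List{c}",
    ]
```

For instance, `expand_list_cat("NP", 2)` yields a `BaseNP` that takes two `NP` arguments, which is the usual choice for coordination lists that need at least two elements.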
-
-lincat [C] = T
-
-is equivalent to
-
-lincat ListC = T
-
-
-The linearizations of the list constructors are written
-just like they would be if the function declarations above
-had been made manually, e.g.:
-
-lin BaseC x_1 ... x_n = t
-lin ConsC x xs = t'
-
-
-
- gf -examples File.gfe
-
-See
-../lib/resource/doc/examples/QuestionsI.gfe
-for an example.
-
-
- M -- inherit everything from M, as before
- M [a,b,c] -- only inherit constants a,b,c
- M-[a,b,c] -- inherit everything except a,b,c
-
-Caution: there is no check yet for completeness and
-consistency, but restricted inheritance can create
-run-time failures.
-
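The three inheritance forms select constants as follows. This is a minimal Python sketch of the selection only; the function name `inherited` and its keyword arguments are hypothetical. In line with the caution above, it performs no completeness or consistency check.

```python
def inherited(constants, only=None, excluded=None):
    """Select the constants inherited from a module M.

    inherited(cs)                  models  M           (everything)
    inherited(cs, only=[...])      models  M [a,b,c]   (only these)
    inherited(cs, excluded=[...])  models  M-[a,b,c]   (all but these)
    """
    if only is not None and excluded is not None:
        raise ValueError("use either only or excluded, not both")
    if only is not None:
        return [c for c in constants if c in only]
    if excluded is not None:
        return [c for c in constants if c not in excluded]
    return list(constants)
```

Note that a name listed in `only` but absent from the module is silently ignored here, which is one way an unchecked restriction can lead to failures later.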
-
-
-The Module System of GF
-
-Aarne Ranta
-8/4/2005 - 5/7/2007
-
-
-
-
-
-
-
-
-
-
-
-abstract
-concrete
-resource
-interface
-instance
-incomplete concrete
-The principal module types
-
-
-Any GF grammar that is used in an application
-will probably contain at least one module
-of the abstract module type. Here is an example of
-such a module, defining a fragment of propositional logic.
-
- abstract Logic = {
- cat Prop ;
- fun Conj : Prop -> Prop -> Prop ;
- fun Disj : Prop -> Prop -> Prop ;
- fun Impl : Prop -> Prop -> Prop ;
- fun Falsum : Prop ;
- }
-
-
-The name of this module is Logic.
-
-An abstract module defines an abstract syntax, which
-is a language-independent representation of a fragment of language.
-It consists of two kinds of judgements:
-
cat judgements telling what categories there are
- (types of abstract syntax trees)
-fun judgements telling what functions there are
- (to build abstract syntax trees)
-
-There can also be def and data judgements in an
-abstract syntax.
-
-The GF grammar compiler expects to find the module Logic in a file named
-Logic.gf. When the compiler is run, it produces
-another file, named Logic.gfc. This file is in the
-format called canonical GF, which is the "machine language"
-of GF. Next time that the module Logic is needed in
-compiling a grammar, it can be read from the compiled (gfc)
-file instead of the source (gf) file, unless the source
-has been changed after the compilation.
-
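The choice between the source and the compiled file can be pictured as a time-stamp comparison. The Python sketch below is an illustration of that rule only: the function name `file_to_read` is hypothetical, and it ignores the dependency analysis that the real compiler also takes into account.

```python
import os


def file_to_read(module, directory="."):
    """Pick Module.gfc if it is at least as new as Module.gf, else Module.gf."""
    src = os.path.join(directory, module + ".gf")
    obj = os.path.join(directory, module + ".gfc")
    if os.path.exists(obj):
        # A compiled file without a source, or one not older than the
        # source, can be read directly.
        if not os.path.exists(src):
            return obj
        if os.path.getmtime(obj) >= os.path.getmtime(src):
            return obj
    return src
```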
-In order for a GF grammar to describe a concrete language, the abstract
-syntax must be completed with a concrete syntax of it.
-For this purpose, we use modules of type concrete: for instance,
-
- concrete LogicEng of Logic = {
- lincat Prop = {s : Str} ;
- lin Conj a b = {s = a.s ++ "and" ++ b.s} ;
- lin Disj a b = {s = a.s ++ "or" ++ b.s} ;
- lin Impl a b = {s = "if" ++ a.s ++ "then" ++ b.s} ;
- lin Falsum = {s = ["we have a contradiction"]} ;
- }
-
-
-The module LogicEng is a concrete syntax of the
-abstract syntax Logic. The GF grammar compiler checks that
-the concrete syntax is valid with respect to the abstract syntax
-it claims to implement. Validity requires that there be
-
lincat judgement for each cat judgement, telling what the
- linearization types of categories are
-lin judgement for each fun judgement, telling what the
- linearization functions corresponding to functions are
-
-Validity also requires that the linearization functions defined by
-lin judgements are type-correct with respect to the
-linearization types of the arguments and value of the function.
-
-There can also be lindef and printname judgements in a
-concrete syntax.
-
-When a concrete module is successfully compiled, a gfc
-file is produced in the same way as for abstract modules. The
-pair of an abstract and a corresponding concrete module
-is a top-level grammar, which can be used in the GF system to
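-perform various tasks. The most fundamental tasks are linearization (mapping an abstract syntax tree to a string) and parsing (mapping a string to its abstract syntax trees).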
-
-In the current grammar, infinitely many trees and strings are recognized, although
-no very interesting ones. For example, the tree
-
- Impl (Disj Falsum Falsum) Falsum
-
-has the linearization
-
- if we have a contradiction or we have a contradiction then we have a contradiction
-
-which in turn can be parsed uniquely as that tree.
-
- -
-When GF compiles the module LogicEng it also has to compile
-all modules that it depends on (in this case, just Logic).
-The compilation process starts with dependency analysis to find
-all these modules, recursively, starting from the explicitly imported one.
-The compiler then reads either gf or gfc files, in
-a dependency order. The decision on which files to read depends on
-time stamps and dependencies in a natural way, so that all and only
-those modules that have to be compiled are compiled. (This behaviour can
-be changed with flags, see below.)
-
-To use a top-level grammar in the GF system, one uses the import
-command (short name i). For instance,
-
-
- i LogicEng.gf
-
-It is also possible to specify the imported grammar(s) on the command
-line when invoking GF:
-
- gf LogicEng.gf
-
-Various compilation flags can be added to both ways of compiling a module:
-
-src forces compilation from source files
--v gives more verbose information on compilation
--s makes compilation silent (except if it fails with an error message)
-
-A complete list of flags can be obtained in GF by help i.
-
-Importing a grammar makes it visible in GF's internal state. To see
-what modules are available, use the command print_options (po).
-You can empty the state with the command empty (e); this is
-needed if you want to read in grammars with a different abstract syntax
-than the current one without exiting GF.
-
-Grammar modules can reside in different directories. They can then be found
-by means of a search path, which is a flag such as
-
- -path=.:api/toplevel:prelude
-
-given to the import command or the shell command invoking GF.
-(It can also be defined in the grammar file; see below.) The compiler
-writes every gfc file in the same directory as the corresponding
-gf file.
-
-The path is relative to the working directory pwd, so that
-all directories listed are primarily interpreted as subdirectories of
-pwd. Secondarily, they are searched relative to the value of the
-environment variable GF_LIB_PATH, which is by default set to
-/usr/local/share/GF.
-
-Parsing and linearization can be performed with the parse
-(p) and linearize (l) commands, respectively.
-For instance,
-
- > l Impl (Disj Falsum Falsum) Falsum
- if we have a contradiction or we have a contradiction then we have a contradiction
-
- > p -cat=Prop "we have a contradiction"
- Falsum
-
-
-Notice that the parse command needs the parsing category
-as a flag. This is necessary since a grammar can have several
-possible parsing categories ("entry points").
-
-One abstract syntax can have several concrete syntaxes.
-Here are two new ones for Logic:
-
- concrete LogicFre of Logic = {
- lincat Prop = {s : Str} ;
- lin Conj a b = {s = a.s ++ "et" ++ b.s} ;
- lin Disj a b = {s = a.s ++ "ou" ++ b.s} ;
- lin Impl a b = {s = "si" ++ a.s ++ "alors" ++ b.s} ;
- lin Falsum = {s = ["nous avons une contradiction"]} ;
- }
-
- concrete LogicSymb of Logic = {
- lincat Prop = {s : Str} ;
- lin Conj a b = {s = "(" ++ a.s ++ "&" ++ b.s ++ ")"} ;
- lin Disj a b = {s = "(" ++ a.s ++ "v" ++ b.s ++ ")"} ;
- lin Impl a b = {s = "(" ++ a.s ++ "->" ++ b.s ++ ")"} ;
- lin Falsum = {s = "_|_"} ;
- }
-
-
-The four modules Logic, LogicEng, LogicFre, and
-LogicSymb together form a multilingual grammar, in which
-it is possible to perform parsing and linearization with respect to any
-of the concrete syntaxes. As a combination of parsing and linearization,
-one can also perform translation from one language to another.
-(By language we mean the set of expressions generated by one
-concrete syntax.)
-
-Any combination of abstract syntax and corresponding concrete syntaxes
-is thus a multilingual grammar. With many languages and other enrichments
-(as described below), a multilingual grammar easily grows to the size of
-tens of modules. The grammar developer, having finished her job, can
-package the result in a multilingual canonical grammar, a file
-with the suffix .gfcm. For instance, to compile the set of grammars
-described by now, the following sequence of GF commands can be used:
-
- i LogicEng.gf
- i LogicFre.gf
- i LogicSymb.gf
- pm | wf logic.gfcm
-
-
-The "end user" of the grammar only needs the file logic.gfcm to
-access all the functionality of the multilingual grammar. It can be
-imported in the GF system in the same way as .gf files. But
-it can also be used in the
-Embedded Java Interpreter for GF
-to build Java programs of which the multilingual grammar functionalities
-(linearization, parsing, translation) form a part.
-
-In a multilingual grammar, the concrete syntax module names work as
-names of languages that can be selected for linearization and parsing:
-
- > l -lang=LogicFre Impl Falsum Falsum
- si nous avons une contradiction alors nous avons une contradiction
-
- > l -lang=LogicSymb Impl Falsum Falsum
- ( _|_ -> _|_ )
-
- > p -cat=Prop -lang=LogicSymb "( _|_ & _|_ )"
- Conj Falsum Falsum
-
-The option -multi gives linearization to all languages:
-
- > l -multi Impl Falsum Falsum
- if we have a contradiction then we have a contradiction
- si nous avons une contradiction alors nous avons une contradiction
- ( _|_ -> _|_ )
-
-Translation can be obtained by using a pipe from a parser
-to a linearizer:
-
- > p -cat=Prop -lang=LogicSymb "( _|_ & _|_ )" | l -lang=LogicEng
- we have a contradiction and we have a contradiction
-
-
-The concrete modules shown above would look much nicer if
-we used the main idea of functional programming: avoid repetitive
-code by using functions that capture repeated patterns of
-expressions. A collection of such functions can be a valuable
-resource for a programmer, reusable in many different
-top-level grammars. Thus we introduce the resource
-module type, with the first example
-
- resource Util = {
- oper SS : Type = {s : Str} ;
- oper ss : Str -> SS = \s -> {s = s} ;
- oper paren : Str -> Str = \s -> "(" ++ s ++ ")" ;
- oper infix : Str -> SS -> SS -> SS = \h,x,y ->
- ss (x.s ++ h ++ y.s) ;
- oper infixp : Str -> SS -> SS -> SS = \h,x,y ->
- ss (paren (infix h x y)) ;
- }
-
-
-Modules of resource type have two forms of judgement:
-
oper defining auxiliary operations
-param defining parameter types
-
-A resource can be used in a concrete (or another
-resource) by opening it. This means that
-all operations (and parameter types) defined in the resource
-module become usable in the module that opens it. For instance,
-we can rewrite the module LogicSymb much more concisely:
-
- concrete LogicSymb of Logic = open Util in {
- lincat Prop = SS ;
- lin Conj = infixp "&" ;
- lin Disj = infixp "v" ;
- lin Impl = infixp "->" ;
- lin Falsum = ss "_|_" ;
- }
-
-
-What happens when this variant of LogicSymb is
-compiled is that the oper-defined constants
-of Util are inlined in the
-right-hand-sides of the judgements of LogicSymb,
-and these expressions are partially evaluated, i.e.
-computed as far as possible. The generated gfc file
-will look just like the file generated for the first version
-of LogicSymb - at least, it will do the same job.
-
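The same refactoring can be applied to LogicEng. The following is only a sketch, not one of the original course files: it assumes that Util.gf is found on the search path and it uses nothing beyond the operations infix and ss defined in Util above. Impl is written out by hand because "if ... then ..." is not an infix pattern.

```gf
concrete LogicEng of Logic = open Util in {
  lincat Prop = SS ;
  -- binary connectives placed between their arguments, without parentheses
  lin Conj = infix "and" ;
  lin Disj = infix "or" ;
  -- not an infix pattern, so the record is built with ss directly
  lin Impl a b = ss ("if" ++ a.s ++ "then" ++ b.s) ;
  lin Falsum = ss ("we" ++ "have" ++ "a" ++ "contradiction") ;
}
```

After inlining and partial evaluation, this variant should do the same job as the first version of LogicEng.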
-Several resource modules can be opened
-at the same time. If the modules contain the same names, the
-conflict can be resolved by qualified opening and
-reference. For instance,
-
- concrete LogicSymb of Logic = open Util, Prelude in { ...
- } ;
-
-
-(where Prelude is a standard library of GF) brings
-into scope two definitions of the constant SS.
-To specify which one is used, you can write
-Util.SS or Prelude.SS instead of just SS.
-You can also introduce abbreviations to avoid long qualifiers, e.g.
-
- concrete LogicSymb of Logic = open (U=Util), (P=Prelude) in { ...
- } ;
-
-
-which means that you can write U.SS and P.SS.
-
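As an illustration of qualified reference, here is a sketch of the complete LogicSymb module written with the abbreviated qualifiers. It assumes that both Util and Prelude are on the search path; only the Util operations are actually used, and the qualifier P is opened merely to show the notation.

```gf
concrete LogicSymb of Logic = open (U=Util), (P=Prelude) in {
  -- U.SS picks the record type defined in Util, not the one in Prelude
  lincat Prop = U.SS ;
  lin Conj = U.infixp "&" ;
  lin Disj = U.infixp "v" ;
  lin Impl = U.infixp "->" ;
  lin Falsum = U.ss "_|_" ;
}
```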
-Judgements of param and oper forms may also be used
-in concrete modules, and they are then considered local
-to those modules, i.e. they are not exported.
-
-The compilation of a resource module differs
-from the compilation of abstract and
-concrete modules because oper operations
-do not in general have values in gfc. A gfc
-file is generated, but it contains only
-param judgements (also recall that opers
-are inlined in their top-level use sites, so it is not
-necessary to save them in the compiled grammar).
-However, since computing the operations over and over
-again can be time-consuming, and since type checking
-resource modules also takes time, a third kind
-of file is generated for resource modules: a .gfr
-file. This file is written in the GF source code notation,
-but it is type checked and type annotated, and opers
-are computed as far as possible.
-
-If you look at any gfc or gfr file generated
-by the GF compiler, you see that all names have been replaced by
-their qualified variants. This is an important first step (after parsing)
-the compiler does. As for the commands in the GF shell, some output
-qualified names and some not. The difference does not always result
-from firm principles.
-
-The typical use of a resource module is through open in a
-concrete module, which means that
-resource modules are not imported on their own.
-However, in the developing and testing phase of grammars, it
-can be useful to evaluate opers with different
-arguments. To prevent them from being thrown away after inlining, the
--retain option can be used:
-
- > i -retain Util.gf --
-The command compute_concrete (cc)
-can now be used for evaluating expressions that may contain
-operations defined in Util:
-
- > cc ss (paren "foo")
- {s = "(" ++ "foo" ++ ")"}
-
-
-To find out what opers are available for a given type,
-the command show_operations (so) can be used:
-
- > so SS
- Util.ss : Str -> SS ;
- Util.infix : Str -> SS -> SS -> SS ;
- Util.infixp : Str -> SS -> SS -> SS ;
-
-
-The most characteristic modularity of GF lies in the division of
-grammars into abstract, concrete, and
-resource modules. This permits writing multilingual
-grammars and sharing the maximum of code between different
-languages.
-
-In addition to this special kind of modularity, GF provides inheritance,
-which is familiar from other programming languages (in particular,
-object-oriented ones). Inheritance means that a module inherits all
-judgements from another module; we also say that it extends
-the other module. Inheritance is useful to divide big grammars into
-smaller units, and also to reuse the same units in different bigger
-grammars.
-
-
-The first example of inheritance is for abstract syntax. Let us
-extend the module Logic to Arithmetic:
-
- abstract Arithmetic = Logic ** {
- cat Nat ;
- fun Even : Nat -> Prop ;
- fun Odd : Nat -> Prop ;
- fun Zero : Nat ;
- fun Succ : Nat -> Nat ;
- }
-
-
-In parallel with the extension of the abstract syntax
-Logic to Arithmetic, we can extend
-the concrete syntax LogicEng to ArithmeticEng:
-
- concrete ArithmeticEng of Arithmetic = LogicEng ** open Util in {
- lincat Nat = SS ;
- lin Even x = ss (x.s ++ "is" ++ "even") ;
- lin Odd x = ss (x.s ++ "is" ++ "odd") ;
- lin Zero = ss "zero" ;
- lin Succ x = ss ("the" ++ "successor" ++ "of" ++ x.s) ;
- }
-
-
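A short shell session can be used to check the extended grammar. This is a sketch that assumes LogicEng.gf, ArithmeticEng.gf and Util.gf are all in the working directory; the trees and strings follow directly from the rules shown above.

```gf
> i ArithmeticEng.gf

> l Even (Succ Zero)
the successor of zero is even

> p -cat=Prop "zero is odd or we have a contradiction"
Disj (Odd Zero) Falsum
```

The second command shows that the inherited rule for Disj combines freely with the new propositions Even and Odd.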
-Another extension of Logic is Geometry,
-
- abstract Geometry = Logic ** {
- cat Point ;
- cat Line ;
- fun Incident : Point -> Line -> Prop ;
- }
-
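-The corresponding concrete syntax is left as an exercise.
-
-
-Inheritance can be multiple, which means that a module
-may extend many modules at the same time. Suppose, for instance,
-that we want to build a module for mathematics covering both
-arithmetic and geometry, and the underlying logic. We then write
-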
- --Inheritance can be multiple, which means that a module -may extend many modules at the same time. Suppose, for instance, -that we want to build a module for mathematics covering both -arithmetic and geometry, and the underlying logic. We then write -
-
- abstract Mathematics = Arithmetic, Geometry ** {
- } ;
-
-We could of course add some new judgements in this module, but
-it is not necessary to do so. If no new judgements are added, the
-module body can be omitted:
-
- abstract Mathematics = Arithmetic, Geometry ;
-
-
-The module Mathematics shows that it is possible
-to extend a module already built by extension. The correctness
-criterion for extensions is that the same name
-(cat, fun, oper, or param)
-may not be defined twice in the resulting union of names.
-That the names defined in Logic are "inherited twice"
-by Mathematics (via both Arithmetic and
-Geometry) is no violation of this rule; the usual
-problems of multiple inheritance do not arise, since
-the definitions of inherited constants cannot be changed.
-
-Inheritance can be restricted, which means that only some of
-the constants are inherited. There are two dual notations for this:
-
- A [f,g]
-
-meaning that only f and g are inherited from A, and
-
- A-[f,g]
-
-meaning that everything except f and g is inherited from A.
-
-Constants that are not inherited may be redefined in the inheriting module. -
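As a sketch of restricted inheritance combined with redefinition, the following hypothetical module (LogicEngShort and the string "absurdity" are invented for the example) inherits everything from LogicEng except Falsum, and then supplies its own linearization for it:

```gf
-- Hypothetical variant of LogicEng with a shorter rendering of Falsum.
concrete LogicEngShort of Logic = LogicEng-[Falsum] ** {
  -- Falsum was excluded from inheritance, so defining it here is not a name clash
  lin Falsum = {s = "absurdity"} ;
}
```

Without the restriction, the second definition of Falsum would violate the rule that a name may not be defined twice.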
- -
-Inherited judgements are not copied into the inheriting modules.
-Instead, an indirection is created for each inherited name,
-as can be seen by looking into the generated gfc (and
-gfr) files. Thus for instance the names
-
- Mathematics.Prop
- Arithmetic.Prop
- Geometry.Prop
- Logic.Prop
-
-
-all refer to the same category, declared in the module
-Logic.
-
-The command visualize_graph (vg) shows the
-dependency graph in the current GF shell state. The graph can
-also be saved in a file and used e.g. in documentation, by the
-command print_multi -graph (pm -graph).
-
-The vg command uses the free software packages Graphviz (command dot)
-and Ghostscript (command gv).
-
-Top-level grammars have a straightforward translation to
-resource modules. The translation concerns
-pairs of abstract-concrete judgements:
-
- cat C ;              ===>   oper C : Type = T ;
- lincat C = T ;
-
- fun f : A ;          ===>   oper f : A = t ;
- lin f = t ;
-
-
-Due to this translation, a concrete module
-can be opened in the same way as a
-resource module; the translation is done
-on the fly (it is computationally very cheap).
-
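To make the translation concrete, here is what opening LogicEng roughly amounts to, written out as an ordinary resource. This is a sketch: the module name LogicEngOpers is hypothetical, the types of Conj, Disj and Impl are assumed to be Prop -> Prop -> Prop as their linearization rules suggest, and the lock fields that GF actually adds are left out.

```gf
-- Each cat/lincat pair becomes a type-valued oper,
-- each fun/lin pair becomes an oper with the fun type and the lin body.
resource LogicEngOpers = {
  oper Prop : Type = {s : Str} ;
  oper Conj : Prop -> Prop -> Prop = \a,b -> {s = a.s ++ "and" ++ b.s} ;
  oper Disj : Prop -> Prop -> Prop = \a,b -> {s = a.s ++ "or" ++ b.s} ;
  oper Impl : Prop -> Prop -> Prop = \a,b -> {s = "if" ++ a.s ++ "then" ++ b.s} ;
  oper Falsum : Prop = {s = ["we have a contradiction"]} ;
}
```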
-Modular grammar engineering often means that some grammarians
-focus on the semantics of the domain whereas others take care
-of linguistic details. Thus a typical reuse opens a
-linguistically oriented resource grammar,
-
-
- abstract Resource = {
- cat S ; NP ; A ;
- fun PredA : NP -> A -> S ;
- }
- concrete ResourceEng of Resource = {
- lincat S = ... ;
- lin PredA = ... ;
- }
-
-The application grammar, instead of giving linearizations
-explicitly, just reduces them to categories and functions in the
-resource grammar:
-
-
- concrete ArithmeticEng of Arithmetic = LogicEng ** open ResourceEng in {
- lincat Nat = NP ;
- lin Even x = PredA x (regA "even") ;
- }
-
-If the resource grammar is only capable of generating grammatically
-correct expressions, then the grammaticality of the application
-grammar is also guaranteed: the type checker of GF is used as
-a grammar checker.
-To guarantee distinctions between categories that have
-the same linearization type, the actual translation used
-in GF adds to every linearization type and linearization
-a lock field,
-
-
- cat C ; ===> oper C : Type = T ** {lock_C : {}} ;
- lincat C = T ;
-
- fun f : C_1 ... C_n -> C ; ===> oper f : C_1 ... C_n -> C = \x_1,...,x_n ->
- lin f = t ; t x_1 ... x_n ** {lock_C = <>};
-
-
-(Notice that the latter translation is type-correct because of
-record subtyping, which means that t can ignore the
-lock fields of its arguments.) An application grammarian who
-only uses resource grammar categories and functions never
-needs to write these lock fields herself. Having to do so
-serves as a warning that the grammaticality guarantee given
-by the resource grammar no longer holds.
-
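Applied to part of ArithmeticEng, the lock field translation would give operations along the following lines. This is a sketch that instantiates the schema above by hand; the module name ArithmeticEngOpers is hypothetical and the code is not actual compiler output.

```gf
resource ArithmeticEngOpers = {
  -- the type is extended with a lock field named after the category
  oper Nat : Type = {s : Str} ** {lock_Nat : {}} ;
  -- every value of the category carries the matching empty lock
  oper Zero : Nat = {s = "zero"} ** {lock_Nat = <>} ;
  oper Succ : Nat -> Nat = \x ->
    {s = "the" ++ "successor" ++ "of" ++ x.s} ** {lock_Nat = <>} ;
}
```

A record of type {s : Str} that lacks the field lock_Nat is therefore not accepted where a Nat is expected, even though its linearization type is otherwise the same.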
-Note. The lock field mechanism is experimental, and may be changed
-to a stronger abstraction mechanism in the future. This may result in
-hand-written lock fields ceasing to work.
-
- -
-One difference between top-level grammars and resource
-modules is that the former systematically separate the
-declarations of categories and functions from their definitions.
-In the reuse translation creating an oper judgement,
-the declaration coming from the abstract module is put
-together with the definition coming from the concrete
-module.
-
-However, the separation of declarations and definitions is so
-useful a notion that GF also has specific module types that split
-resource modules into two parts. In this splitting,
-an interface module corresponds to an abstract syntax,
-in giving the declarations of operations (and parameter types).
-For instance, a generic markup interface would look as follows:
-
- interface Markup = open Util in {
- oper Boldface : Str -> Str ;
- oper Heading : Str -> Str ;
- oper markupSS : (Str -> Str) -> SS -> SS = \f,r ->
- ss (f r.s) ;
- }
-
-
-The definitions of the constants declared in an interface
-are given in an instance module (which is always of
-an interface, in the same way as a concrete is always
-of an abstract). The following instances
-define markup in HTML and LaTeX.
-
- instance MarkupHTML of Markup = open Util in {
- oper Boldface s = "<b>" ++ s ++ "</b>" ;
- oper Heading s = "<h2>" ++ s ++ "</h2>" ;
- }
-
- instance MarkupLatex of Markup = open Util in {
- oper Boldface s = "\\textbf{" ++ s ++ "}" ;
- oper Heading s = "\\section{" ++ s ++ "}" ;
- }
-
-
-Notice that both interfaces and instances may
-open resources (and also reused top-level grammars).
-An interface may moreover define some of the operations it
-declares; these definitions are inherited by all instances and cannot
-be changed in them. Inheritance by module extension
-is possible, as always, between modules of the same type.
-
-An interface or an instance
-can be opened in
-a concrete using the same syntax as when opening
-a resource. For an instance, the semantics
-is the same as when opening the definitions together with
-the type signatures - one can think of an interface
-and an instance of it together forming an ordinary
-resource. Opening an interface, however,
-is different: functions that are only declared without
-having a definition cannot be compiled (inlined); neither
-can functions whose definitions depend on undefined functions.
-
-A module that opens an interface is therefore
-incomplete, and has to be completed with an
-instance of the interface to become complete. To make
-this situation clear, GF requires any module that opens an
-interface to be marked as incomplete. Thus
-the module
-
- incomplete concrete DocMarkup of Doc = open Markup in {
- ...
- }
-
-
-uses the interface Markup to place markup in
-chosen places in its linearization rules, but the
-implementation of markup - whether in HTML or in LaTeX - is
-left unspecified. This is a powerful way of sharing
-the code of a whole module with just differences in
-the definitions of some constants.
-
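Since the body of DocMarkup is not shown, here is a separate, self-contained sketch of the same idea. Everything in it is hypothetical (the abstract syntax Note, its categories Page and Text, and the functions MkPage and Greeting); only Util and the interface Markup are taken from the examples above. Each module would go in its own file.

```gf
-- Note.gf: a page is a title followed by a body.
abstract Note = {
  cat Page ; Text ;
  fun MkPage : Text -> Text -> Page ;
  fun Greeting : Text ;
}

-- NoteMarkup.gf: uses Heading and Boldface without knowing their definitions.
incomplete concrete NoteMarkup of Note = open Util, Markup in {
  lincat Page = SS ; Text = SS ;
  lin MkPage title body = ss (Heading (title.s) ++ body.s) ;
  lin Greeting = markupSS Boldface (ss "hello") ;
}
```

Note that Util is opened explicitly, since opening Markup does not bring the names of Util into scope.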
-Another terminology for incomplete modules is
-parametrized modules or functors.
-The interface gives the list of parameters
-that the functor depends on.
-
-To complete an incomplete module, each interface
-that it opens has to be provided with an instance. The following
-syntax is used for this:
-
- concrete DocHTML of Doc = DocMarkup with (Markup = MarkupHTML) ; --
-Instantiation of Markup with MarkupLatex is
-another one-liner.
-
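For completeness, that one-liner could look as follows. The module name DocLatex is an assumption; the rest mirrors the DocHTML line.

```gf
-- DocLatex is a hypothetical name, chosen by analogy with DocHTML.
concrete DocLatex of Doc = DocMarkup with (Markup = MarkupLatex) ;
```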
-If more interfaces than one are instantiated, a comma-separated
-list of equations in parentheses is used, e.g.
-
- concrete MusicIta of Music = MusicI with
- (Syntax = SyntaxIta), (LexMusic = LexMusicIta) ;
-
-
-This example shows a common design pattern for building applications:
-the concrete syntax is a functor on the generic resource grammar library
-interface Syntax and a domain-specific lexicon interface, here
-LexMusic.
-
-All interfaces that are opened in the completed module
-must be instantiated.
-
-Notice that the completion of an incomplete module
-may at the same time extend modules of the same type (which need
-not be completions). It can also add new judgements in a module body,
-and restrict inheritance from the functor.
-
- concrete MusicIta of Music = MusicI - [f] with
- (Syntax = SyntaxIta), (LexMusic = LexMusicIta) ** {
-
- lin f = ...
-
- } ;
-
-
-
-
-Interfaces, instances, and parametric modules are purely a
-front-end feature of GF: these module types do not exist in
-the gfc and gfr formats. The compiler has
-nevertheless to keep track of their dependencies and modification
-times. Here is a summary of how they are compiled:
-
interface is compiled into a resource with an empty body
-instance is compiled into a resource in union with its
- interface
-incomplete module (concrete or resource) is compiled
- into a module of the same type with an empty body
-concrete or resource) is compiled
- into a module of the same type by compiling its functor so that, instead of
- each interface, its given instance is used
-
-This means that some generated code is duplicated, because those operations that
-do have complete definitions in an interface are copied to each of
-the instances.
-
-Syntax: -
-
-abstract A = (A1,...,An **)?
-{J1 ; ... ; Jm ; }
-
-where -
--[f,..,g]
- or Ai[f,..,g]
-cat, fun, def, data
--Semantic conditions: -
--Syntax: -
-
-incomplete? concrete C of A =
-(C1,...,Cn **)?
-(open O1,...,Ok in)?
-{J1 ; ... ; Jm ; }
-
-where -
--[f,..,g]
-(Q=R)
- - where R is a resource, instance, or concrete, and Q is any identifier -
-lincat, lin, lindef, printname; also the forms oper, param are
- allowed, but they cannot be inherited.
-
-If the modifier incomplete appears, then any R in
-an open specification may also be an interface or an abstract.
-
-Semantic conditions: -
-cat judgement in A
- must have a corresponding, unique
- lincat judgement in C
-fun judgement in A
- must have a corresponding, unique
- lin judgement in C
--Syntax: -
-
-resource R =
-(R1,...,Rn **)?
-(open O1,...,Ok in)?
-{J1 ; ... ; Jm ; }
-
-where -
--[f,..,g]
-(Q=R)
- - where R is a resource, instance, or concrete, and Q is any identifier -
-oper, param
--Semantic conditions: -
--Syntax: -
-
-interface R =
-(R1,...,Rn **)?
-(open O1,...,Ok in)?
-{J1 ; ... ; Jm ; }
-
-where -
--[f,..,g]
-(Q=R)
- - where R is a resource, instance, or concrete, and Q is any identifier -
-oper, param
--Semantic conditions: -
--Syntax: -
-
-instance R of I =
-(R1,...,Rn **)?
-(open O1,...,Ok in)?
-{J1 ; ... ; Jm ; }
-
-where -
--[f,..,g]
-
-(Q=R)
- - where R is a resource, instance, or concrete, and Q is any identifier -
-oper, param
--Semantic conditions: -
--Syntax: -
-
-concrete C of A =
-(C1,...,Cn **)?
-B
-with
-(I1 =J1), ...
-, (Ip =Jp)
-(-? [c1,...,cq ])?
-(**?
-(open O1,...,Ok in)?
-{J1 ; ... ; Jm ; })? ;
-
-where -
--[f,..,g]
-(Q=R)
- - where R is a resource, instance, or concrete, and Q is any identifier -
-lincat, lin, lindef, printname; also the forms oper, param are
- allowed, but they cannot be inherited.
-