diff --git a/doc/gf-history.html b/doc/gf-history.html index 513374233..57a425ca2 100644 --- a/doc/gf-history.html +++ b/doc/gf-history.html @@ -12,6 +12,10 @@ Changes in functionality since May 17, 2005, release of GF Version 2.2 +17/6 (AR) The FCFG parser is now the recommended method for parsing +heavy grammars such as the resource grammars. It does not yet support +literals and variable bindings. +
1/6 (AR) Added the FCFG parser written by Krasimir Angelov. Invoked by diff --git a/doc/gf-manual.html b/doc/gf-manual.html index 979585eb1..48fcff85e 100644 --- a/doc/gf-manual.html +++ b/doc/gf-manual.html @@ -10,11 +10,12 @@ Aarne Ranta, -December 1, 2005, for (forthcoming) GF Version 2.4 +June 17, 2006, for (forthcoming) GF Version 2.6
-Forth version: May 17, 2005, for GF Version 2.2.
+Fifth version: December 1, 2005, for GF Version 2.4
+Fourth version: May 17, 2005, for GF Version 2.2.
Third version: June 25, 2003, for GF Version 1.2.
Second version: June 17, 2002, for GF Version 1.0.
First version: April 19, 2002.
@@ -28,7 +29,7 @@ The GF grammar language is described in other documents.
There is a separate -GF Java GUI Manual. +Editor User Manual. @@ -329,8 +330,8 @@ input for a command, so the pipe breaks there. The following is a copy of the current HelpFile.
--- GF help file updated for GF 2.4, 1/12/2005.
--- *: Commands and options marked with * are not yet implemented.
+-- GF help file updated for GF 2.6, 17/6/2006.
+-- *: Commands and options marked with * are currently not implemented.
--
-- Each command has a long and a short name, options, and zero or more
-- arguments. Commands are sorted by functionality. The short name is
@@ -351,6 +352,7 @@ i, import: i File
.gfr precompiled GF resource
.gfcm multilingual canonical GF
.gfe example-based grammar files (only with the -ex option)
+ .gfwl multilingual word list (preprocessed to abs + cncs)
.ebnf Extended BNF format
.cf Context-free (BNF) format
.trc TransferCore format
@@ -358,7 +360,8 @@ i, import: i File
-old old: parse in GF<2.0 format (not necessary)
-v verbose: give lots of messages
-s silent: don't give error messages
- -src source: ignore precompiled gfc and gfr files
+ -src from source: ignore precompiled gfc and gfr files
+ -gfc from gfc: use compiled modules whenever they exist
-retain retain operations: read resource modules (needed in comm cc)
-nocf don't build context-free grammar (thus no parser)
-nocheckcirc don't eliminate circular rules from CF
@@ -367,6 +370,7 @@ i, import: i File
-o do emit code (default with new grammar format)
-ex preprocess .gfe files if needed
-prob read probabilities from top grammar file (format --# prob Fun Double)
+ -treebank read a treebank file to memory (xml format)
flags:
-abs set the name used for abstract syntax (with -old option)
-cnc set the name used for concrete syntax (with -old option)
@@ -375,12 +379,16 @@ i, import: i File
-optimize select an optimization to override file-defined flags
-conversion select parsing method (values strict|nondet)
-probs read probabilities from file (format (--# prob) Fun Double)
+ -preproc use a preprocessor on each source file
-noparse read nonparsable functions from file (format --# noparse Funs)
examples:
i English.gf -- ordinary import of Concrete
i -retain german/ParadigmsGer.gf -- import of Resource to test
+
+r, reload: r
+ Executes the previous import (i) command.
-* rl, remove_language: rl Language
+rl, remove_language: rl Language
Takes away the language from the state.
e, empty: e
@@ -432,6 +440,8 @@ pg, print_grammar: pg
flags:
-printer
-lang
+ -startcat -- The start category of the generated grammar.
+ Only supported by some grammar printers.
examples:
pg -printer=cf -- show the context-free skeleton
@@ -481,11 +491,11 @@ l, linearize: l PattList? Tree
HINT: see GF language specification for the syntax of Pattern and Term.
You can also copy and paste parsing results.
options:
- -table show parameters
-struct bracketed form
- -record record, i.e. explicit GF concrete syntax term
- -all show all forms and variants
- -multi linearize to all languages (the other options don't work)
+ -table show parameters (not compatible with -record, -all)
+ -record record, i.e. explicit GF concrete syntax term (not compatible with -table, -all)
+ -all show all forms and variants (not compatible with -record, -table)
+ -multi linearize to all languages (can be combined with the other options)
flags:
-lang linearize in this grammar
-number give this number of forms at most
@@ -498,15 +508,18 @@ p, parse: p String
grammar (overridden by the -lang flag), in the category S (overridden
by the -cat flag).
options for batch input:
- -lines parse each line of input separately, ignoring empty lines
- -all as -lines, but also parse empty lines
- -prob rank results by probability
- -cut stop after first lexing result leading to parser success
+ -lines parse each line of input separately, ignoring empty lines
+ -all as -lines, but also parse empty lines
+ -prob rank results by probability
+ -cut stop after first lexing result leading to parser success
+ -fail show strings whose parse fails prefixed by #FAIL
+ -ambiguous show strings that have more than one parse prefixed by #AMBIGUOUS
options for selecting parsing method:
(default)parse using an overgenerating CFG
-cfg parse using a much less overgenerating CFG
-mcfg parse using an even less overgenerating MCFG
- Note: the first time parsing with -cfg or -mcfg might take a long time
+ -fcfg parse using a faster variant of MCFG
+ Note: the first time parsing with -cfg, -mcfg, and -fcfg might take a long time
options that only work for the default parsing method:
-n non-strict: tolerates morphological errors
-ign ignore unknown words when parsing
@@ -531,6 +544,37 @@ at, apply_transfer: at (Module.Fun | Fun)
examples:
p -lang=Cncdecimal "123" | at num2bin | l -- convert dec to bin
+tb, tree_bank: tb
+ Generate a multilingual treebank from a list of trees (default) or compare
+ to an existing treebank.
+ options:
+ -c compare to existing xml-formatted treebank
+ -trees return the trees of the treebank
+ -all show all linearization alternatives (branches and variants)
+ -table show tables of linearizations with parameters
+ -record show linearization records
+ -xml wrap the treebank (or comparison results) with XML tags
+ -mem write the treebank in memory instead of a file TODO
+ examples:
+ gr -cat=S -number=100 | tb -xml | wf tb.xml -- random treebank into file
+ rf tb.xml | tb -c -- compare-test treebank from file
+ rf old.xml | tb -trees | tb -xml -- create new treebank from old
+
+ut, use_treebank: ut String
+ Look up a string in a treebank and return the resulting trees.
+ Use 'tb' to create a treebank and 'i -treebank' to read one from
+ a file.
+ options:
+ -assocs show all string-trees associations in the treebank
+ -strings show all strings in the treebank
+ -trees show all trees in the treebank
+ -raw return the lookup result as string, without typechecking it
+ flags:
+ -treebank use this treebank (instead of the latest introduced one)
+ examples:
+ ut "He adds this to that" | l -multi -- use treebank lookup as parser in translation
+ ut -assocs | grep "ComplV2" -- show all associations with ComplV2
+
tt, test_tokenizer: tt String
Show the token list sent to the parser when String is parsed.
HINT: can be useful when debugging the parser.
@@ -606,18 +650,22 @@ gt, generate_trees: gt Tree?
command completes the Tree with values to the metavariables in
the tree.
options:
- -metas also return trees that include metavariables
+ -metas also return trees that include metavariables
flags:
- -depth generate to this depth (default 3)
- -atoms take this number of atomic rules of each category (default unlimited)
- -alts take this number of alternatives at each branch (default unlimited)
- -cat generate in this category
- -lang use the abstract syntax of this grammar
- -number generate (at most) this number of trees
+ -depth generate to this depth (default 3)
+ -atoms take this number of atomic rules of each category (default unlimited)
+ -alts take this number of alternatives at each branch (default unlimited)
+ -cat generate in this category
+ -lang use the abstract syntax of this grammar
+ -number generate (at most) this number of trees
+ -noexpand don't expand these categories (comma-separated, e.g. -noexpand=V,CN)
+ -doexpand only expand these categories (comma-separated, e.g. -doexpand=V,CN)
examples:
- gt -depth=10 -cat=NP -- generate all NP's to depth 10
- gt (PredVP ? (NegVG ?)) -- generate all trees of this form
- gt -cat=S -tr | l -- gererate and linearize
+ gt -depth=10 -cat=NP -- generate all NP's to depth 10
+ gt (PredVP ? (NegVG ?)) -- generate all trees of this form
+ gt -cat=S -tr | l -- generate and linearize
+ gt -noexpand=NP | l -mark=metacat -- the only NP is meta, linearized "?0 +NP"
+ gt | l | p -lines -ambiguous | grep "#AMBIGUOUS" -- show ambiguous strings
ma, morphologically_analyse: ma String
Runs morphological analysis on each word in String and displays
@@ -782,12 +830,21 @@ sa, speak_aloud: sa String
h | sa -- listen to the list of commands
gr -cat=S | l | sa -- generate a random sentence and speak it aloud
+si, speech_input: si
+ Uses an ATK speech recognizer to get speech input.
+ flags:
+ -lang: The grammar to use with the speech recognizer.
+ -cat: The grammar category to get input in.
+ -language: Use acoustic model and dictionary for this language.
+ -number: The number of utterances to recognize.
+
h, help: h Command?
Displays the paragraph concerning the command from this help file.
Without the argument, shows the first lines of all paragraphs.
options
-all show the whole help file
-defs show user-defined commands and terms
+ -FLAG show the values of FLAG (works for grammar-independent flags)
examples:
h print_grammar -- show all information on the pg command
@@ -855,7 +912,6 @@ q, quit: q
Each of the flags can have the suffix _subs, which performs
common subexpression elimination after the main optimization.
Thus, -optimize=all_subs is the most aggressive one.
-
-optimize=share share common branches in tables
-optimize=parametrize first try parametrize then do share with the rest
-optimize=values represent tables as courses-of-values
@@ -893,8 +949,15 @@ q, quit: q
-printer=jsgf Java Speech Grammar Format
-printer=srgs_xml SRGS XML format
-printer=srgs_xml_prob SRGS XML format, with weights
+ -printer=srgs_xml_ms_sem SRGS XML format, with semantic tags for the
+ Microsoft Speech API.
+ -printer=vxml Generate a dialogue system in VoiceXML.
-printer=slf a finite automaton in the HTK SLF format
- -printer=slf_graphviz the same automaton as in SLF, but in Graphviz format
+ -printer=slf_graphviz the same automaton as slf, but in Graphviz format
+ -printer=slf_sub a finite automaton with sub-automata in the
+ HTK SLF format
+ -printer=slf_sub_graphviz the same automaton as slf_sub, but in
+ Graphviz format
-printer=fa_graphviz a finite automaton with labelled edges
-printer=regular a regular grammar in a simple BNF
-printer=unpar a gfc grammar with parameters eliminated
@@ -912,12 +975,14 @@ q, quit: q
-startcat, like -cat, but used in grammars (to avoid clash with keyword cat)
-transform, transformation performed on a syntax tree. The default is identity.
- -transform=identity no change
- -transform=compute compute by using definitions in the grammar
- -transform=typecheck return the term only if it is type-correct
- -transform=solve solve metavariables as derived refinements
- -transform=context solve metavariables by unique refinements as variables
- -transform=delete replace the term by metavariable
+ -transform=identity no change
+ -transform=compute compute by using definitions in the grammar
+ -transform=nodup return the term only if it has no constants duplicated
+ -transform=nodupatom return the term only if it has no atomic constants duplicated
+ -transform=typecheck return the term only if it is type-correct
+ -transform=solve solve metavariables as derived refinements
+ -transform=context solve metavariables by unique refinements as variables
+ -transform=delete replace the term by metavariable
-unlexer, untokenization transforming linearization output into a string.
The default is unwords.
@@ -929,7 +994,17 @@ q, quit: q
-unlexer=concat remove all spaces
-unlexer=bind like identity, but bind at "&+"
--- *: Commands and options marked with * are currently not implemented.
+-mark, marking of parts of tree in linearization. The default is none.
+ -mark=metacat append "+CAT" to every metavariable, showing its category
+ -mark=struct show tree structure with brackets
+ -mark=java show tree structure with XML tags (used in gfeditor)
+
+-coding, Some grammars are in UTF-8, some in isolatin-1.
+ If the letters ä (a-umlaut) and ö (o-umlaut) look strange, either
+ change your terminal to isolatin-1, or rewrite the grammar with
+ 'pg -utf8'.
+
+-- *: Commands and options marked with * are not currently implemented.
@@ -952,7 +1027,7 @@ a Fudget GUI, and a Java GUI. They all use the same abstract command language,
the difference being that the subshell has a string syntax for each command,
whereas the GUIs mostly use menus and buttons to issue commands.
There is a separate
-GF Java GUI Manual.
+Editor User Manual.
diff --git a/doc/index.html b/doc/index.html index 1ab85a454..d0be7f6e0 100644 --- a/doc/index.html +++ b/doc/index.html @@ -28,7 +28,7 @@ sections are still unwritten.
Old Grammarian's Tutorial -on writing GF grammars, with exercises. +on writing GF grammars, with exercises. GF v 1.2, before the module system. @@ -90,13 +90,14 @@ parser.
-Resource grammar library documentation +On-line resource grammar library documentation in progress for the forthcoming API v 1.0.
@@ -105,6 +106,11 @@ in progress for the forthcoming API v 1.0. Resource grammar writing HOWTO document in progress (forthcoming API v 1.0). +
+ +Old resource grammar library +document (v 0.9). + diff --git a/lib/resource-1.0/TODO b/lib/resource-1.0/TODO index 763bd5bdc..7ae9f9a7f 100644 --- a/lib/resource-1.0/TODO +++ b/lib/resource-1.0/TODO @@ -2,7 +2,7 @@ TODO in Resource 1.0 implementation 6/2/2006 -Eng: non-contracted negations +%Eng: non-contracted negations %Eng: auxiliaries in standard API diff --git a/lib/resource-1.0/doc/index.html b/lib/resource-1.0/doc/index.html index ac257ae7c..795bf10c3 100644 --- a/lib/resource-1.0/doc/index.html +++ b/lib/resource-1.0/doc/index.html @@ -7,41 +7,9 @@
The GF Resource Grammar Library defines the basic grammar of ten languages: @@ -49,11 +17,13 @@ Danish, English, Finnish, French, German, Italian, Norwegian, Russian, Spanish, Swedish.
+New: User manual of the resource library. +
+Notice. This document concerns the API v. 1.0, which has not yet been "officially" released. The release is planned for the end of June 2006.
-Inger Andersson and Therese Soderberg (Spanish morphology), @@ -88,14 +58,12 @@ Saara Myllyntausta, Wanjiku Ng'ang'a, Jordi Saludes.
-The GF Resource Grammar Library is open-source software licensed under GNU General Public License. See the file LICENSE for more details.
-Coverage, for each language: @@ -126,7 +94,6 @@ Presentation:
Go to the main directory, compile the grammars, and run a test. @@ -138,7 +105,8 @@ Go to the main directory, compile the grammars, and run a test.
This will take quite some time. An alternative is to use the
-precompiled grammar package from GF download page. This package
+precompiled grammar package compiled.tgz.
+This package
has the necessary gfc and gfr files directly under GF/lib.
@@ -159,7 +127,6 @@ Do for instanceFor more examples, see the Overview slides.
-The language independent ground API
This API is accessible by both
presentandalltenses. @@ -193,7 +160,6 @@ The documentation of the individual modules:
Grammar and Lexicon
-
@@ -269,9 +234,7 @@ gesture. Some functions for constructing demonstratives are provided.
The simplest way to get the library is to install the precompiled version @@ -293,7 +256,6 @@ library. Use one (or several) of the following packages instead: multimodal dialogue applications -
Typically, open one of
@@ -332,7 +294,6 @@ The mathematical API shares modules with
present. It is therefore not a good idea to use it in combination with
alltenses.
If you have done make in lib/resource-1.0, you will have
@@ -368,14 +329,12 @@ each session, but gets faster at later runs.
It is also feasible to parse in Scandinavian languages (Danish, Norwegian, Swedish).
--These applications are meand to serve as starting points for +These applications are meant to serve as starting points for new applications, showing how the libraries can be used in typical situations.
-The examples/bronzeage @@ -383,7 +342,6 @@ grammar set implements a language fragment based on the Swadesh list of 200 words. It is useful for things like language training.
-The examples/dialogue @@ -392,7 +350,6 @@ multimodal dialogue system. Its purpose is to serve as a prototype for applications in the TALK project.
-The examples/animal @@ -400,12 +357,8 @@ grammar set implements some queries about animals. Its purpose is to serve as a prototype for example-based grammar writing.
--This bugs should be fixed before the final release of v. 1.0. -
-Danish
@@ -431,6 +384,7 @@ French
@@ -446,6 +400,7 @@ Italian
@@ -460,21 +415,30 @@ Russian
Spanish
Swedish +
++GF Resource Grammar Library (pdf). +Printable user manual with API documentation. +
+Grammars as Software Libraries. Slides with background and motivation for the resource grammar library.
@@ -502,5 +466,5 @@ examples are frommultimodal/old, which is a reduced-size API.
-
+