<html>
|
||
<body bgcolor="#FFFFFF" text="#000000" >
|
||
<center>
|
||
<IMG SRC="gf-logo.gif">
|
||
|
||
|
||
<h1>Grammatical Framework User Manual</h1>
|
||
|
||
</center>
|
||
|
||
<a href="http://www.cs.chalmers.se/~aarne">
|
||
Aarne Ranta</a>,
|
||
June 17, 2006, for (forthcoming) GF Version 2.6
|
||
|
||
<p>
|
||
|
||
Fifth version: December 1, 2005, for GF Version 2.4.<br>
|
||
Fourth version: May 17, 2005, for GF Version 2.2.<br>
|
||
Third version: June 25, 2003, for GF Version 1.2.<br>
|
||
Second version: June 17, 2002, for GF Version 1.0.<br>
|
||
First version: April 19, 2002.
|
||
|
||
<p>
|
||
|
||
This document describes
|
||
the command language available for the user of GF.
|
||
The GF grammar language is described in other documents.
|
||
|
||
<p>
|
||
|
||
There is a separate
|
||
<a href="http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm">Editor User Manual</a>.
|
||
|
||
|
||
|
||
|
||
<h2>Levels of commands</h2>
|
||
|
||
<b>GF commands</b> appear on three levels:
|
||
|
||
<ol>
|
||
<li> <b>top-level shell commands</b>,
|
||
used for calling GF from Unix/Windows/Mac.
|
||
|
||
<li> <b>internal shell commands</b>,
|
||
available in the shell entered by the top-level shell command <tt>gf</tt>.
|
||
|
||
<li> <b>internal subshell commands</b>,
|
||
such as the editor commands,
|
||
entered by certain internal shell commands.
|
||
</ol>
|
||
|
||
By the <b>GF command language</b> we mean the internal shell
|
||
commands, which most of this document is about;
|
||
the sections describing the other levels are much shorter.
|
||
|
||
|
||
|
||
<h2>Top-level shell commands</h2>
|
||
|
||
The compiled GF program is invoked by a command that has the syntax
|
||
<pre>
|
||
gf Option* File*
|
||
</pre>
|
||
The files should contain GF
|
||
grammars, each of which is <b>imported</b> in the environment in which
|
||
GF starts, in the same way as if GF were first started and
|
||
the import command <tt>i</tt> then executed for each of the files.
|
||
The currently available options are:
|
||
<ul>
|
||
<li> <tt>-java</tt>, which directly enters an editor session designed
|
||
for communicating with an external Java GUI (see <tt>jgf</tt> below).
|
||
</ul>
|
||
|
||
<p>
|
||
|
||
Like any program in Unix, GF can be used in a pipeline or
|
||
a redirection. For instance,
|
||
<pre>
|
||
echo "h" | gf
|
||
</pre>
|
||
starts GF and executes the help command.
|
||
<pre>
|
||
gf &lt;script
|
||
</pre>
|
||
starts GF and executes the commands in the file <tt>script</tt>.
|
||
|
||
<p>
|
||
|
||
The Java GUI is started with the command
|
||
<pre>
|
||
jgf File+
|
||
</pre>
|
||
which executes a simple shell script. The effect is to start
|
||
GF, import each grammar in the files, and enter the
|
||
Editor subshell (see below), with which the GUI then communicates.
|
||
|
||
<p>
|
||
|
||
If a compiled version of GF is not available, GF can be started within
|
||
the Haskell interpreter GHCi, by the command
|
||
<pre>
|
||
make ghci
|
||
</pre>
|
||
in the GF source directory, followed by <tt>:l GF</tt> in GHCi.
|
||
Unfortunately, the standard binary of the light-weight Hugs interpreter
|
||
has insufficient code space for GF.
|
||
|
||
|
||
<h2>Batch mode</h2>
|
||
|
||
A simple protocol has been defined to run GF in batch mode, e.g. from another
|
||
program. The command line syntax is
|
||
<pre>
|
||
gf -batch (-s) (-[flag])*
|
||
</pre>
|
||
It reads standard input, which is typically directed from a
|
||
script file containing GF commands.
|
||
Every command read by GF, GF's reply, and
|
||
the whole run, are enclosed in XML tags:
|
||
<ul>
|
||
<li> tag <tt>gfcommand</tt> encloses a command in the script
|
||
<li> tag <tt>gfreply</tt> encloses GF's reply to a command
|
||
<li> tag <tt>gfbatch</tt> encloses the whole run of the script
|
||
</ul>
|
||
The DTD is the following:
|
||
<pre>
|
||
&lt;!ELEMENT gfbatch ((gfcommand, gfreply)*) &gt;
&lt;!ELEMENT gfcommand (#PCDATA) &gt;
&lt;!ELEMENT gfreply (#PCDATA) &gt;
|
||
</pre>
|
||
The optional <tt>-s</tt> (silence) flag turns off showing
|
||
commands and the XML structure of the run; it is moreover sent
|
||
as a global flag to the environment in which the run is
|
||
performed, together with the other flags appearing in the
|
||
command line.
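<p>

A program that calls GF in batch mode (without <tt>-s</tt>) could read the result along the following lines. This is only a sketch in Python, not part of GF: the function name <tt>parse_batch_output</tt> is made up for illustration, and the code assumes that the output is well-formed XML following the DTD above, which may not hold if a reply contains unescaped markup characters.

```python
import xml.etree.ElementTree as ET


def parse_batch_output(xml_text):
    """Return a list of (command, reply) pairs from a gfbatch document.

    Assumption: the text is well-formed XML matching the DTD, i.e. a
    gfbatch root element with alternating gfcommand and gfreply children.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "gfbatch":
        raise ValueError("expected root element gfbatch, got " + root.tag)
    children = list(root)
    if len(children) % 2 != 0:
        raise ValueError("gfcommand without a matching gfreply")
    pairs = []
    # Walk the children two at a time: (gfcommand, gfreply).
    for command, reply in zip(children[0::2], children[1::2]):
        if command.tag != "gfcommand" or reply.tag != "gfreply":
            raise ValueError("unexpected element order in gfbatch")
        pairs.append(((command.text or "").strip(), (reply.text or "").strip()))
    return pairs
```

Each pair holds the text of one command and the text of GF's reply to it, in the order in which the script was executed.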
|
||
|
||
<p>
|
||
|
||
Another version of the batch mode is the compiler. Thus
|
||
<pre>
|
||
gf -make -s file.gf
|
||
</pre>
|
||
silently compiles the file <tt>file.gf</tt> (as well as
|
||
all other files that it depends on).
|
||
All flags to the command <tt>i</tt> are recognized.
|
||
|
||
|
||
|
||
|
||
|
||
<h2>Library and grammar paths</h2>
|
||
|
||
The following environment variables and path wildcards are recognized:
|
||
<ul>
|
||
<li> <tt>GF_LIB_PATH</tt> gives the location of <tt>GF/lib</tt>
|
||
<li> <tt>GF_GRAMMAR_PATH</tt> gives a list of directories appended
|
||
to the explicitly given path
|
||
<li> <tt>DIR/*</tt> is expanded to the union of all subdirectories
|
||
of <tt>DIR</tt>
|
||
</ul>
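<p>

The path rules above can be sketched in Python as follows. The sketch is not GF's own implementation: the function names are made up, <tt>GF_GRAMMAR_PATH</tt> is assumed to be colon-separated like the value of the <tt>-path</tt> flag, and <tt>DIR/*</tt> is assumed to cover only the immediate subdirectories of <tt>DIR</tt>.

```python
import os


def expand_path_entry(entry):
    """Expand one search-path entry.

    An entry of the form DIR/* is replaced by the subdirectories of DIR
    (assumption: immediate subdirectories only, in sorted order);
    any other entry is returned unchanged.
    """
    if entry.endswith("/*"):
        base = entry[:-2]
        return sorted(
            os.path.join(base, name)
            for name in os.listdir(base)
            if os.path.isdir(os.path.join(base, name))
        )
    return [entry]


def search_path(explicit, environ=None):
    """Build the module search path from a colon-separated explicit path.

    The directories in GF_GRAMMAR_PATH (assumed colon-separated) are
    appended after the explicitly given ones, then wildcards are expanded.
    """
    environ = os.environ if environ is None else environ
    entries = [e for e in explicit.split(":") if e]
    entries += [e for e in environ.get("GF_GRAMMAR_PATH", "").split(":") if e]
    result = []
    for entry in entries:
        result.extend(expand_path_entry(entry))
    return result
```

The explicitly given directories come first, so a module found there would shadow one of the same name in <tt>GF_GRAMMAR_PATH</tt> under a first-match lookup; whether GF resolves clashes this way is not stated here.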
|
||
|
||
|
||
<h3>Command line syntax</h3>
|
||
|
||
The syntax of the individual commands is described in later sections.
|
||
The general structure of a command line is defined by the following
|
||
grammar:
|
||
<pre>
|
||
CommandLine ::= Pipeline (";;" Pipeline)*
|
||
Pipeline ::= Command
|
||
| Command Arg ("|" Command)*
|
||
Command ::= CommandId (Option | Flag)* Arg*
|
||
Arg ::= QuotedString | Tree | File | Lang | Int
|
||
</pre>
|
||
Several commands can be collected on one line, separated by a double
|
||
semicolon. The effect is that each of the commands is executed;
|
||
the same effect is achieved in a script by putting the commands on
|
||
consecutive lines. Thus
|
||
<pre>
|
||
i LangEng.gf ;; p -cat=AP "black or green" ;; q
|
||
</pre>
|
||
is equivalent to
|
||
<pre>
|
||
i LangEng.gf
|
||
p -cat=AP "black or green"
|
||
q
|
||
</pre>
|
||
The one-line variant is handy to use as an argument of the <tt>echo</tt>
|
||
command in Unix, to define simple shell scripts using GF.
|
||
|
||
<p>
|
||
|
||
A <b>pipeline</b> consists of a first command with an argument,
|
||
producing a result which is sent as argument to the next command.
|
||
For example,
|
||
<pre>
|
||
gr -cat=Phrase | l | sa
|
||
</pre>
|
||
generates a random Phrase, linearizes it, and speaks aloud the
|
||
resulting string. No result is seen in the output, but the
|
||
phrase is heard spoken.
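<p>

The way a command line falls apart into pipelines and commands can be illustrated with a small Python sketch. It is not GF's own parser: the function names are invented here, only double-quoted strings without escape sequences are handled, and options, flags, and arguments are left unanalysed inside each command string.

```python
def split_outside_quotes(text, separator):
    """Split text on separator, ignoring separators inside double quotes."""
    parts = []
    current = []
    in_quotes = False
    i = 0
    while i != len(text):
        ch = text[i]
        if ch == '"':
            in_quotes = not in_quotes
        if not in_quotes and text.startswith(separator, i):
            # End of one part: store it and skip over the separator.
            parts.append("".join(current).strip())
            current = []
            i += len(separator)
            continue
        current.append(ch)
        i += 1
    parts.append("".join(current).strip())
    return parts


def parse_command_line(line):
    """Return a list of pipelines, each a list of command strings."""
    return [
        split_outside_quotes(pipeline, "|")
        for pipeline in split_outside_quotes(line, ";;")
    ]
```

Applied to the one-line example above, the sketch yields three pipelines of one command each; applied to <tt>gr -cat=Phrase | l | sa</tt>, it yields one pipeline of three commands.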
|
||
|
||
<p>
|
||
|
||
The <b>trace</b> option <tt>-tr</tt> can be used to show intermediate
|
||
results in a pipeline:
|
||
<pre>
|
||
rf -tr bible.txt | p -lang=Eng -cat=Text | l -lang=Chi
|
||
</pre>
|
||
reads a string from the file <tt>bible.txt</tt> (displaying the result),
|
||
parses it as an English text (without displaying the parse tree),
|
||
and linearizes the tree into Chinese (displaying the result, as the
|
||
last command in a pipeline always does).
|
||
|
||
<p>
|
||
|
||
The Unix <b>Readline</b> facility makes arrow keys, file name completions,
|
||
etc., available in the GF shell, but only in the GHC-compiled variant.
|
||
For instance, the up-arrow goes backwards in the command history.
|
||
If Readline is not available,
|
||
a command line consisting of an integer <tt>n</tt>
|
||
repeats a command <tt>n</tt> lines back in the history.
|
||
For instance, 0 repeats the last command, 1 the second-last, etc.
|
||
This functionality usually doesn't work in Windows.
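<p>

The numbering used for repeating commands counts backwards from the most recent one, starting at zero. A minimal Python sketch of this lookup, with a made-up function name and a plain list standing in for the session history:

```python
def recall(history, n):
    """Return the command n lines back: 0 is the last command, 1 the one before."""
    if n not in range(len(history)):
        raise IndexError("no command %d lines back" % n)
    return history[-1 - n]
```

With a history of three commands, <tt>recall(history, 0)</tt> gives the third command and <tt>recall(history, 2)</tt> gives the first.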
|
||
|
||
<p>
|
||
|
||
From GF version 2.4: to <b>interrupt</b> the execution of a command,
|
||
you can type [Control]-c, and it no longer terminates the GF session.
|
||
This functionality doesn't work in Windows.
|
||
|
||
|
||
<h3>Options and flags</h3>
|
||
|
||
An <b>option</b> consists of a hyphen and an option identifier, e.g.
|
||
<pre>
|
||
-retain
|
||
</pre>
|
||
What options belong to what commands is explained below.
|
||
|
||
<p>
|
||
|
||
A <b>flag</b> consists of a hyphen, a flag identifier, an equality sign,
|
||
and a value identifier, e.g.
|
||
<pre>
|
||
-lang=Eng
|
||
</pre>
|
||
What flags belong to what commands is explained below.
|
||
In addition to command lines, flags can be set globally with the
|
||
<tt>sf</tt> command (see below), as well as
|
||
in grammars, using a <tt>flags</tt> directive, e.g.
|
||
<pre>
|
||
flags lexer=code ; startcat=Exp ;
|
||
</pre>
|
||
either first in a file or immediately after an <tt>include</tt> directive.
|
||
In case of conflicts arising from this, the descending order of priority is:
|
||
command line, grammar, global.
|
||
The global state is initialized by <b>default values</b> to
|
||
all available flags.
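<p>

The distinction between options and flags, and the priority order among the three sources of flag values, can be sketched in Python. This is an illustration only, not GF code: the function names are invented, and the command line is assumed to be already split into tokens.

```python
from collections import ChainMap


def classify(token):
    """Classify a command-line token as an option, a flag, or an argument."""
    if token.startswith("-") and "=" in token:
        name, value = token[1:].split("=", 1)
        return ("flag", name, value)
    if token.startswith("-"):
        return ("option", token[1:])
    return ("argument", token)


def effective_flags(command_line, grammar, global_state):
    """Combine three flag dictionaries; earlier ones take priority."""
    return ChainMap(command_line, grammar, global_state)
```

A lookup in the combined mapping returns the command-line value if there is one, otherwise the value set in the grammar, otherwise the global value.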
|
||
|
||
|
||
|
||
<h3>Environment</h3>
|
||
|
||
To understand the semantics of commands in a GF session,
|
||
one must know how they depend on and affect an
|
||
<b>environment</b>. The environment consists of
|
||
<ul>
|
||
<li> main abstract syntax (if any) - pointer to a compiled module
|
||
<li> main concrete syntax (if any) - pointer to a compiled module
|
||
<li> a list of other concrete syntaxes, for the same abstract
|
||
<li> a list of other abstract syntaxes
|
||
<li> a list of all concrete syntaxes for all abstract syntaxes
|
||
<li> a list of compiled modules
|
||
<li> a list of source modules
|
||
<li> global options
|
||
<li> a list of transfer modules
|
||
</ul>
|
||
Normally, the main concrete syntax is the last-imported one.
|
||
The name of this is the
|
||
value of the flag <tt>-lang</tt>, which can be reset by the
|
||
<tt>sf</tt> command.
|
||
<pre>
|
||
i StoneageEng.gf -- main concrete is StoneageEng
|
||
i StoneageSwe.gf -- main concrete is StoneageSwe
|
||
sf -lang=StoneageEng -- main concrete is StoneageEng
|
||
po -- show the current environment
|
||
cm LangSwe -- change main concrete and abstract
|
||
e -- empty the environment
|
||
</pre>
|
||
|
||
|
||
|
||
|
||
<h3>Command arguments</h3>
|
||
|
||
Unlike Unix, where command arguments and values are strings,
|
||
GF uses a primitive type system, distinguishing between
|
||
<ul>
|
||
<li> strings,
|
||
<li> (lists of) terms (= syntax trees),
|
||
<li> names of languages,
|
||
<li> names of files,
|
||
<li> integers,
|
||
<li> void values (e.g. the result of speaking aloud a string),
|
||
<li> error values.
|
||
</ul>
|
||
A pipeline is only meaningful among strings and terms, and
|
||
only if the
|
||
argument type of a command matches the value type of the
|
||
preceding one. For instance,
|
||
<pre>
|
||
p "hello world" | l -lang=Swe
|
||
</pre>
|
||
sends a list of terms (the parsing result) to the linearizer,
|
||
which expects terms, so that the types match. But
|
||
<pre>
|
||
p "hello world" | p -lang=Swe
|
||
</pre>
|
||
tries to parse arguments which are already terms, and this is a
|
||
type error. An error value is also displayed as a string
|
||
(an error message), but this string is never a meaningful
|
||
input for a command, so the pipe breaks there.
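<p>

The type matching in a pipeline can be pictured with the following Python sketch. It is not how GF is implemented: the table of signatures is a small, incomplete illustration based on the descriptions in this manual, and the names <tt>SIGNATURES</tt> and <tt>check_pipeline</tt> are invented here.

```python
# Argument and value types of a few commands, as described in this manual.
# The table is illustrative and far from complete.
SIGNATURES = {
    "rf": ("file", "string"),   # read a file, return its contents
    "p": ("string", "terms"),   # parse a string into a list of terms
    "l": ("terms", "string"),   # linearize terms into a string
    "sa": ("string", "void"),   # speak a string aloud, no value
}


def check_pipeline(commands):
    """Return the value type of a pipeline, or raise TypeError on a mismatch.

    The first command takes its argument from the command line, so only
    the later commands are checked against the preceding value type.
    """
    value_type = SIGNATURES[commands[0]][1]
    for name in commands[1:]:
        argument_type, result_type = SIGNATURES[name]
        if argument_type != value_type:
            raise TypeError(
                "%s expects %s but receives %s" % (name, argument_type, value_type)
            )
        value_type = result_type
    return value_type
```

Here <tt>check_pipeline(["p", "l"])</tt> succeeds with a string as the final value, whereas <tt>check_pipeline(["p", "p"])</tt> raises a type error, mirroring the two examples above.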
|
||
|
||
|
||
|
||
|
||
<h3>Descriptions of the individual commands</h3>
|
||
|
||
The following is a copy of the current <tt>HelpFile</tt>.
|
||
<pre>
|
||
-- GF help file updated for GF 2.6, 17/6/2006.
|
||
-- *: Commands and options marked with * are currently not implemented.
|
||
--
|
||
-- Each command has a long and a short name, options, and zero or more
|
||
-- arguments. Commands are sorted by functionality. The short name is
|
||
-- given first.
|
||
|
||
-- Type "h -all" for full help file, "h &lt;CommandName&gt;" for full help on a command.
|
||
|
||
-- commands that change the state
|
||
|
||
i, import: i File
|
||
Reads a grammar from File and compiles it into a GF runtime grammar.
|
||
Files "include"d in File are read recursively, skipping duplicates.
|
||
If a grammar with the same language name is already in the state,
|
||
it is overwritten - but only if compilation succeeds.
|
||
The grammar parser depends on the file name suffix:
|
||
.gf normal GF source
|
||
.gfc canonical GF
|
||
.gfr precompiled GF resource
|
||
.gfcm multilingual canonical GF
|
||
.gfe example-based grammar files (only with the -ex option)
|
||
.gfwl multilingual word list (preprocessed to abs + cncs)
|
||
.ebnf Extended BNF format
|
||
.cf Context-free (BNF) format
|
||
.trc TransferCore format
|
||
options:
|
||
-old old: parse in GF<2.0 format (not necessary)
|
||
-v verbose: give lots of messages
|
||
-s silent: don't give error messages
|
||
-src from source: ignore precompiled gfc and gfr files
|
||
-gfc from gfc: use compiled modules whenever they exist
|
||
-retain retain operations: read resource modules (needed by the cc command)
|
||
-nocf don't build context-free grammar (thus no parser)
|
||
-nocheckcirc don't eliminate circular rules from CF
|
||
-cflexer build an optimized parser with separate lexer trie
|
||
-noemit do not emit code (default with old grammar format)
|
||
-o do emit code (default with new grammar format)
|
||
-ex preprocess .gfe files if needed
|
||
-prob read probabilities from top grammar file (format --# prob Fun Double)
|
||
-treebank read a treebank file to memory (xml format)
|
||
flags:
|
||
-abs set the name used for abstract syntax (with -old option)
|
||
-cnc set the name used for concrete syntax (with -old option)
|
||
-res set the name used for resource (with -old option)
|
||
-path use the (colon-separated) search path to find modules
|
||
-optimize select an optimization to override file-defined flags
|
||
-conversion select parsing method (values strict|nondet)
|
||
-probs read probabilities from file (format (--# prob) Fun Double)
|
||
-preproc use a preprocessor on each source file
|
||
-noparse read nonparsable functions from file (format --# noparse Funs)
|
||
examples:
|
||
i English.gf -- ordinary import of Concrete
|
||
i -retain german/ParadigmsGer.gf -- import of Resource to test
|
||
|
||
r, reload: r
|
||
Executes the previous import (i) command.
|
||
|
||
rl, remove_language: rl Language
|
||
Takes away the language from the state.
|
||
|
||
e, empty: e
|
||
Takes away all languages and resets all global flags.
|
||
|
||
sf, set_flags: sf Flag*
|
||
The values of the Flags are set for Language. If no language
|
||
is specified, the flags are set globally.
|
||
examples:
|
||
sf -nocpu -- stop showing CPU time
|
||
sf -lang=Swe -- make Swe the default concrete
|
||
|
||
s, strip: s
|
||
Prune the state by removing source and resource modules.
|
||
|
||
dc, define_command Name Anything
|
||
Add a new defined command. The Name must start with '%'. Later,
|
||
if 'Name X' is used, it is replaced by Anything where #1 is replaced
|
||
by X.
|
||
Restrictions: Currently at most one argument is possible, and a defined
|
||
command cannot appear in a pipe.
|
||
To see what definitions are in scope, use help -defs.
|
||
examples:
|
||
dc %tnp p -cat=NP -lang=Eng #1 | l -lang=Swe -- translate NPs
|
||
%tnp "this man" -- parse and translate
|
||
|
||
dt, define_term Name Tree
|
||
Add a constant for a tree. The constant can later be called by
|
||
prefixing it with '$'.
|
||
Restriction: These terms are not yet usable as a subterm.
|
||
To see what definitions are in scope, use help -defs.
|
||
examples:
|
||
p -cat=NP "this man" | dt tm -- define tm as parse result
|
||
l -all $tm -- linearize tm in all forms
|
||
|
||
-- commands that give information about the state
|
||
|
||
pg, print_grammar: pg
|
||
Prints the current grammar (overridden by the -lang=X flag).
|
||
The -printer=X flag sets the format in which the grammar is
|
||
written.
|
||
N.B. since grammars are compiled when imported, this command
|
||
generally does not show the grammar in the same format as the
|
||
source. In particular, the -printer=latex is not supported.
|
||
Use the command tg -printer=latex File to print the source
|
||
grammar in LaTeX.
|
||
options:
|
||
-utf8 apply UTF8-encoding to the grammar
|
||
flags:
|
||
-printer
|
||
-lang
|
||
-startcat -- The start category of the generated grammar.
|
||
Only supported by some grammar printers.
|
||
examples:
|
||
pg -printer=cf -- show the context-free skeleton
|
||
|
||
pm, print_multigrammar: pm
|
||
Prints the current multilingual grammar in .gfcm form.
|
||
(Automatically executes the strip command (s) before doing this.)
|
||
options:
|
||
-utf8 apply UTF8 encoding to the tokens in the grammar
|
||
-utf8id apply UTF8 encoding to the identifiers in the grammar
|
||
examples:
|
||
pm | wf Letter.gfcm -- print the grammar into the file Letter.gfcm
|
||
pm -printer=graph | wf D.dot -- then do 'dot -Tps D.dot > D.ps'
|
||
|
||
vg, visualize_graph: vg
|
||
Show the dependency graph of multilingual grammar via dot and gv.
|
||
|
||
po, print_options: po
|
||
Print what modules there are in the state. Also
|
||
prints those flag values in the current state that differ from defaults.
|
||
|
||
pl, print_languages: pl
|
||
Prints the names of currently available languages.
|
||
|
||
pi, print_info: pi Ident
|
||
Prints information on the identifier.
|
||
|
||
-- commands that execute and show the session history
|
||
|
||
eh, execute_history: eh File
|
||
Executes commands in the file.
|
||
|
||
ph, print_history: ph
|
||
Prints the commands issued during the GF session.
|
||
The result is readable by the eh command.
|
||
examples:
|
||
ph | wf foo.hist -- save the history into a file
|
||
|
||
-- linearization, parsing, translation, and computation
|
||
|
||
l, linearize: l PattList? Tree
|
||
Shows all linearization forms of Tree by the current grammar
|
||
(which is overridden by the -lang flag).
|
||
The pattern list has the form [P, ... ,Q] where P,...,Q follow GF
|
||
syntax for patterns. All those forms are generated that match with the
|
||
pattern list. Lists that are too short are padded with variables at the end.
|
||
Only the -table flag is available if a pattern list is specified.
|
||
HINT: see GF language specification for the syntax of Pattern and Term.
|
||
You can also copy and paste parsing results.
|
||
options:
|
||
-struct bracketed form
|
||
-table show parameters (not compatible with -record, -all)
|
||
-record record, i.e. explicit GF concrete syntax term (not compatible with -table, -all)
|
||
-all show all forms and variants (not compatible with -record, -table)
|
||
-multi linearize to all languages (can be combined with the other options)
|
||
flags:
|
||
-lang linearize in this grammar
|
||
-number give this number of forms at most
|
||
-unlexer filter output through unlexer
|
||
examples:
|
||
l -lang=Swe -table -- show full inflection table in Swe
|
||
|
||
p, parse: p String
|
||
Shows all Trees returned for String by the current
|
||
grammar (overridden by the -lang flag), in the category S (overridden
|
||
by the -cat flag).
|
||
options for batch input:
|
||
-lines parse each line of input separately, ignoring empty lines
|
||
-all as -lines, but also parse empty lines
|
||
-prob rank results by probability
|
||
-cut stop after first lexing result leading to parser success
|
||
-fail show strings whose parse fails prefixed by #FAIL
|
||
-ambiguous show strings that have more than one parse prefixed by #AMBIGUOUS
|
||
options for selecting parsing method:
|
||
(default)parse using an overgenerating CFG
|
||
-cfg parse using a much less overgenerating CFG
|
||
-mcfg parse using an even less overgenerating MCFG
|
||
-fcfg parse using a faster variant of MCFG
|
||
Note: the first time parsing with -cfg, -mcfg, and -fcfg might take a long time
|
||
options that only work for the default parsing method:
|
||
-n non-strict: tolerates morphological errors
|
||
-ign ignore unknown words when parsing
|
||
-raw return context-free terms in raw form
|
||
-v verbose: give more information if parsing fails
|
||
flags:
|
||
-cat parse in this category
|
||
-lang parse in this grammar
|
||
-lexer filter input through this lexer
|
||
-parser use this parsing strategy
|
||
-number return this many results at most
|
||
examples:
|
||
p -cat=S -mcfg "jag är gammal" -- parse an S with the MCFG
|
||
rf examples.txt | p -lines -- parse each non-empty line of the file
|
||
|
||
at, apply_transfer: at (Module.Fun | Fun)
|
||
Transfer a term using Fun from Module, or the topmost transfer
|
||
module. Transfer modules are given in the .trc format. They are
|
||
shown by the 'po' command.
|
||
flags:
|
||
-lang typecheck the result in this lang instead of default lang
|
||
examples:
|
||
p -lang=Cncdecimal "123" | at num2bin | l -- convert dec to bin
|
||
|
||
tb, tree_bank: tb
|
||
Generate a multilingual treebank from a list of trees (default) or compare
|
||
to an existing treebank.
|
||
options:
|
||
-c compare to existing xml-formatted treebank
|
||
-trees return the trees of the treebank
|
||
-all show all linearization alternatives (branches and variants)
|
||
-table show tables of linearizations with parameters
|
||
-record show linearization records
|
||
-xml wrap the treebank (or comparison results) with XML tags
|
||
-mem write the treebank in memory instead of a file TODO
|
||
examples:
|
||
gr -cat=S -number=100 | tb -xml | wf tb.xml -- random treebank into file
|
||
rf tb.xml | tb -c -- compare-test treebank from file
|
||
rf old.xml | tb -trees | tb -xml -- create new treebank from old
|
||
|
||
ut, use_treebank: ut String
|
||
Lookup a string in a treebank and return the resulting trees.
|
||
Use 'tb' to create a treebank and 'i -treebank' to read one from
|
||
a file.
|
||
options:
|
||
-assocs show all string-trees associations in the treebank
|
||
-strings show all strings in the treebank
|
||
-trees show all trees in the treebank
|
||
-raw return the lookup result as string, without typechecking it
|
||
flags:
|
||
-treebank use this treebank (instead of the latest introduced one)
|
||
examples:
|
||
ut "He adds this to that" | l -multi -- use treebank lookup as parser in translation
|
||
ut -assocs | grep "ComplV2" -- show all associations with ComplV2
|
||
|
||
tt, test_tokenizer: tt String
|
||
Show the token list sent to the parser when String is parsed.
|
||
HINT: can be useful when debugging the parser.
|
||
flags:
|
||
-lexer use this lexer
|
||
examples:
|
||
tt -lexer=codelit "2*(x + 3)" -- a favourite lexer for program code
|
||
|
||
g, grep: g String1 String2
|
||
Grep the String1 in the String2. String2 is read line by line,
|
||
and only those lines that contain String1 are returned.
|
||
flags:
|
||
-v return those lines that do not contain String1.
|
||
examples:
|
||
pg -printer=cf | grep "mother" -- show cf rules with word mother
|
||
|
||
cc, compute_concrete: cc Term
|
||
Compute a term by concrete syntax definitions. Uses the topmost
|
||
resource module (the last in listing by command po) to resolve
|
||
constant names.
|
||
N.B. You need the flag -retain when importing the grammar, if you want
|
||
the oper definitions to be retained after compilation; otherwise this
|
||
command does not expand oper constants.
|
||
N.B.' The resulting Term is not a term in the sense of abstract syntax,
|
||
and hence not a valid input to a Tree-demanding command.
|
||
flags:
|
||
-res use another module than the topmost one
|
||
examples:
|
||
cc -res=ParadigmsFin (nLukko "hyppy") -- inflect "hyppy" with nLukko
|
||
|
||
so, show_operations: so Type
|
||
Show oper operations with the given value type. Uses the topmost
|
||
resource module to resolve constant names.
|
||
N.B. You need the flag -retain when importing the grammar, if you want
|
||
the oper definitions to be retained after compilation; otherwise this
|
||
command does not find any oper constants.
|
||
N.B.' The value type may not be defined in a supermodule of the
|
||
topmost resource. In that case, use appropriate qualified name.
|
||
flags:
|
||
-res use another module than the topmost one
|
||
examples:
|
||
so -res=ParadigmsFin ResourceFin.N -- show N-paradigms in ParadigmsFin
|
||
|
||
t, translate: t Lang Lang String
|
||
Parses String in Lang1 and linearizes the resulting Trees in Lang2.
|
||
flags:
|
||
-cat
|
||
-lexer
|
||
-parser
|
||
examples:
|
||
t Eng Swe -cat=S "every number is even or odd"
|
||
|
||
gr, generate_random: gr Tree?
|
||
Generates a random Tree of a given category. If a Tree
|
||
argument is given, the command completes the Tree with values to
|
||
the metavariables in the tree.
|
||
options:
|
||
-prob use probabilities (works for nondep types only)
|
||
-cf use a very fast method (works for nondep types only)
|
||
flags:
|
||
-cat generate in this category
|
||
-lang use the abstract syntax of this grammar
|
||
-number generate this number of trees (not impl. with Tree argument)
|
||
-depth use this number of search steps at most
|
||
examples:
|
||
gr -cat=Query -- generate in category Query
|
||
gr (PredVP ? (NegVG ?)) -- generate a random tree of this form
|
||
gr -cat=S -tr | l -- generate and linearize
|
||
|
||
gt, generate_trees: gt Tree?
|
||
Generates all trees up to a given depth. If the depth is large,
|
||
a small -alts is recommended. If a Tree argument is given, the
|
||
command completes the Tree with values to the metavariables in
|
||
the tree.
|
||
options:
|
||
-metas also return trees that include metavariables
|
||
flags:
|
||
-depth generate to this depth (default 3)
|
||
-atoms take this number of atomic rules of each category (default unlimited)
|
||
-alts take this number of alternatives at each branch (default unlimited)
|
||
-cat generate in this category
|
||
-lang use the abstract syntax of this grammar
|
||
-number generate (at most) this number of trees
|
||
-noexpand don't expand these categories (comma-separated, e.g. -noexpand=V,CN)
|
||
-doexpand only expand these categories (comma-separated, e.g. -doexpand=V,CN)
|
||
examples:
|
||
gt -depth=10 -cat=NP -- generate all NP's to depth 10
|
||
gt (PredVP ? (NegVG ?)) -- generate all trees of this form
|
||
gt -cat=S -tr | l -- generate and linearize
|
||
gt -noexpand=NP | l -mark=metacat -- the only NP is meta, linearized "?0 +NP"
|
||
gt | l | p -lines -ambiguous | grep "#AMBIGUOUS" -- show ambiguous strings
|
||
|
||
ma, morphologically_analyse: ma String
|
||
Runs morphological analysis on each word in String and displays
|
||
the results line by line.
|
||
options:
|
||
-short show analyses in bracketed words, instead of separate lines
|
||
flags:
|
||
-lang
|
||
examples:
|
||
rf Bible.txt | ma -short | wf Bible.tagged -- analyse the Bible
|
||
|
||
|
||
-- elementary generation of Strings and Trees
|
||
|
||
ps, put_string: ps String
|
||
Returns its argument String, like Unix echo.
|
||
HINT. The strength of ps comes from the possibility to receive the
|
||
argument from a pipeline, and altering it by the -filter flag.
|
||
flags:
|
||
-filter filter the result through this string processor
|
||
-length cut the string after this number of characters
|
||
examples:
|
||
gr -cat=Letter | l | ps -filter=text -- random letter as text
|
||
|
||
pt, put_tree: pt Tree
|
||
Returns its argument Tree, like a specialized Unix echo.
|
||
HINT. The strength of pt comes from the possibility to receive
|
||
the argument from a pipeline, and altering it by the -transform flag.
|
||
flags:
|
||
-transform transform the result by this term processor
|
||
-number generate this number of terms at most
|
||
examples:
|
||
p "zero is even" | pt -transform=solve -- solve ?'s in parse result
|
||
|
||
* st, show_tree: st Tree
|
||
Prints the tree as a string. Unlike pt, this command cannot be
|
||
used in a pipe to produce a tree, since its output is a string.
|
||
flags:
|
||
-printer show the tree in a special format (-printer=xml supported)
|
||
|
||
wt, wrap_tree: wt Fun
|
||
Wraps the tree as the sole argument of Fun.
|
||
flags:
|
||
-c compute the resulting new tree to normal form
|
||
|
||
vt, visualize_tree: vt Tree
|
||
Shows the abstract syntax tree via dot and gv (via temporary files
|
||
grphtmp.dot, grphtmp.ps).
|
||
flags:
|
||
-c show categories only (no functions)
|
||
-f show functions only (no categories)
|
||
-g show as graph (sharing uses of the same function)
|
||
-o just generate the .dot file
|
||
examples:
|
||
p "hello world" | vt -o | wf my.dot ;; ! open -a GraphViz my.dot
|
||
-- This writes the parse tree into my.dot and opens the .dot file
|
||
-- with another application without generating .ps.
|
||
|
||
-- subshells
|
||
|
||
es, editing_session: es
|
||
Opens an interactive editing session.
|
||
N.B. Exit from a Fudget session is to the Unix shell, not to GF.
|
||
options:
|
||
-f Fudget GUI (necessary for Unicode; only available in X Window System)
|
||
|
||
ts, translation_session: ts
|
||
Translates input lines from any of the current languages to all other ones.
|
||
To exit, type a full stop (.) alone on a line.
|
||
N.B. Exit from a Fudget session is to the Unix shell, not to GF.
|
||
HINT: Set -parser and -lexer locally in each grammar.
|
||
options:
|
||
-f Fudget GUI (necessary for Unicode; only available in X Window System)
|
||
-lang prepend translation results with language names
|
||
flags:
|
||
-cat the parser category
|
||
examples:
|
||
ts -cat=Numeral -lang -- translate numerals, show language names
|
||
|
||
tq, translation_quiz: tq Lang Lang
|
||
Random-generates translation exercises from Lang1 to Lang2,
|
||
keeping score of success.
|
||
To interrupt, type a full stop (.) alone on a line.
|
||
HINT: Set -parser and -lexer locally in each grammar.
|
||
flags:
|
||
-cat
|
||
examples:
|
||
tq -cat=NP TestResourceEng TestResourceSwe -- quiz for NPs
|
||
|
||
tl, translation_list: tl Lang Lang
|
||
Random-generates a list of ten translation exercises from Lang1
|
||
to Lang2. The number can be changed by a flag.
|
||
HINT: use wf to save the exercises in a file.
|
||
flags:
|
||
-cat
|
||
-number
|
||
examples:
|
||
tl -cat=NP TestResourceEng TestResourceSwe -- quiz list for NPs
|
||
|
||
mq, morphology_quiz: mq
|
||
Random-generates morphological exercises,
|
||
keeping score of success.
|
||
To interrupt, type a full stop (.) alone on a line.
|
||
HINT: use printname judgements in your grammar to
|
||
produce nice expressions for desired forms.
|
||
flags:
|
||
-cat
|
||
-lang
|
||
examples:
|
||
mq -cat=N -lang=TestResourceSwe -- quiz for Swedish nouns
|
||
|
||
ml, morphology_list: ml
|
||
Random-generates a list of ten morphological exercises,
|
||
keeping score of success. The number can be changed with a flag.
|
||
HINT: use wf to save the exercises in a file.
|
||
flags:
|
||
-cat
|
||
-lang
|
||
-number
|
||
examples:
|
||
ml -cat=N -lang=TestResourceSwe -- quiz list for Swedish nouns
|
||
|
||
|
||
-- IO related commands
|
||
|
||
rf, read_file: rf File
|
||
Returns the contents of File as a String; error if File does not exist.
|
||
|
||
wf, write_file: wf File String
|
||
Writes String into File; File is created if it does not exist.
|
||
N.B. the command overwrites File without a warning.
|
||
|
||
af, append_file: af File String
|
||
Writes String into the end of File; File is created if it does not exist.
|
||
|
||
* tg, transform_grammar: tg File
|
||
Reads File, parses as a grammar,
|
||
but instead of compiling further, prints it.
|
||
The environment is not changed. When parsing the grammar, the same file
|
||
name suffixes are supported as in the i command.
|
||
HINT: use this command to print the grammar in
|
||
another format (the -printer flag); pipe it to wf to save this format.
|
||
flags:
|
||
-printer (only -printer=latex supported currently)
|
||
|
||
* cl, convert_latex: cl File
|
||
Reads File, which is expected to be in LaTeX form.
|
||
Three environments are treated in special ways:
|
||
\begGF - \end{verbatim}, which contains GF judgements,
|
||
\begTGF - \end{verbatim}, which contains a GF expression (displayed)
|
||
\begInTGF - \end{verbatim}, which contains a GF expression (inlined).
|
||
Moreover, certain macros should be included in the file; you can
|
||
get those macros by applying 'tg -printer=latex foo.gf' to any grammar
|
||
foo.gf. Notice that the same File can be imported as a GF grammar,
|
||
consisting of all the judgements in \begGF environments.
|
||
HINT: pipe with 'wf Foo.tex' to generate a new Latex file.
|
||
|
||
sa, speak_aloud: sa String
Uses the Flite speech generator to produce speech for String.
Works for American English spelling.
examples:
h | sa -- listen to the list of commands
gr -cat=S | l | sa -- generate a random sentence and speak it aloud

si, speech_input: si
Uses an ATK speech recognizer to get speech input.
flags:
-lang: The grammar to use with the speech recognizer.
-cat: The grammar category to get input in.
-language: Use acoustic model and dictionary for this language.
-number: The number of utterances to recognize.

h, help: h Command?
Displays the paragraph concerning the command from this help file.
Without the argument, shows the first lines of all paragraphs.
options:
-all show the whole help file
-defs show user-defined commands and terms
-FLAG show the values of FLAG (works for grammar-independent flags)
examples:
h print_grammar -- show all information on the pg command

q, quit: q
Exits GF.
HINT: you can use 'ph | wf history' to save your session.

!, system_command: ! String
Issues a system command. No value is returned to GF.
example:
! ls

?, system_command: ? String
Issues a system command that receives its arguments from GF pipe
and returns a value to GF.
example:
h | ? 'wc -l' | p -cat=Num


-- Flags. The availability of flags is defined separately for each command.

-cat, category in which parsing is performed.
The default is S.

-depth, the search depth in e.g. random generation.
The default depends on application.

-filter, operation performed on a string. The default is identity.
-filter=identity no change
-filter=erase erase the text
-filter=take100 show the first 100 characters
-filter=length show the length of the string
-filter=text format as text (punctuation, capitalization)
-filter=code format as code (spacing, indentation)

-lang, grammar used when executing a grammar-dependent command.
The default is the last-imported grammar.

-language, voice used by Festival as its --language flag in the sa command.
The default is system-dependent.

-length, the maximum number of characters shown of a string.
The default is unlimited.

-lexer, tokenization transforming a string into lexical units for a parser.
The default is words.
-lexer=words tokens are separated by spaces or newlines
-lexer=literals like words, but GF integer and string literals recognized
-lexer=vars like words, but "x","x_...","$...$" as vars, "?..." as meta
-lexer=chars each character is a token
-lexer=code use Haskell's lex
-lexer=codevars like code, but treat unknown words as variables, ?? as meta
-lexer=text with conventions on punctuation and capital letters
-lexer=codelit like code, but treat unknown words as string literals
-lexer=textlit like text, but treat unknown words as string literals
-lexer=codeC use a C-like lexer
-lexer=ignore like literals, but ignore unknown words
-lexer=subseqs like ignore, but then try all subsequences from longest

-number, the maximum number of generated items in a list.
The default is unlimited.

-optimize, optimization on generated code.
The default is share for concrete, none for resource modules.
Each of the values can have the suffix _subs, which performs
common subexpression elimination after the main optimization.
Thus, -optimize=all_subs is the most aggressive one.
-optimize=share share common branches in tables
-optimize=parametrize first try parametrize then do share with the rest
-optimize=values represent tables as courses-of-values
-optimize=all first try parametrize then do values with the rest
-optimize=none no optimization

-parser, parsing strategy. The default is chart. If -cfg or -mcfg is
selected, only bottomup and topdown are recognized.
-parser=chart bottom-up chart parsing
-parser=bottomup a more up to date bottom-up strategy
-parser=topdown top-down strategy
-parser=old an old bottom-up chart parser

-printer, format in which the grammar is printed. The default is
gfc. Those marked with M are (only) available for pm, the rest
for pg.
-printer=gfc GFC grammar
-printer=gf GF grammar
-printer=old old GF grammar
-printer=cf context-free grammar, with profiles
-printer=bnf context-free grammar, without profiles
-printer=lbnf labelled context-free grammar for BNF Converter
-printer=plbnf grammar for BNF Converter, with precedence levels
*-printer=happy source file for Happy parser generator (use lbnf!)
-printer=srg speech recognition grammar
-printer=haskell abstract syntax in Haskell, with translation to/from GF
-printer=morpho full-form lexicon, long format
*-printer=latex LaTeX file (for the tg command)
-printer=fullform full-form lexicon, short format
*-printer=xml XML: DTD for the pg command, object for st
-printer=old old GF: file readable by GF 1.2
-printer=stat show some statistics of generated GFC
-printer=probs show probabilities of all functions
-printer=gsl Nuance GSL speech recognition grammar
-printer=jsgf Java Speech Grammar Format
-printer=srgs_xml SRGS XML format
-printer=srgs_xml_prob SRGS XML format, with weights
-printer=srgs_xml_ms_sem SRGS XML format, with semantic tags for the Microsoft Speech API
-printer=vxml generate a dialogue system in VoiceXML
-printer=slf a finite automaton in the HTK SLF format
-printer=slf_graphviz the same automaton as slf, but in Graphviz format
-printer=slf_sub a finite automaton with sub-automata in the HTK SLF format
-printer=slf_sub_graphviz the same automaton as slf_sub, but in Graphviz format
-printer=fa_graphviz a finite automaton with labelled edges
-printer=regular a regular grammar in a simple BNF
-printer=unpar a gfc grammar with parameters eliminated
-printer=functiongraph abstract syntax functions in 'dot' format
-printer=typegraph abstract syntax categories in 'dot' format
-printer=transfer Transfer language datatype (.tr file format)
-printer=gfcm M gfcm file (default for pm)
-printer=header M gfcm file with header (for GF embedded in Java)
-printer=graph M module dependency graph in 'dot' (graphviz) format
-printer=missing M the missing linearizations of each concrete
-printer=gfc-prolog M gfc in Prolog format (also pg)
-printer=mcfg-prolog M mcfg in Prolog format (also pg)
-printer=cfg-prolog M cfg in Prolog format (also pg)

-startcat, like -cat, but used in grammars (to avoid clash with keyword cat)

-transform, transformation performed on a syntax tree. The default is identity.
-transform=identity no change
-transform=compute compute by using definitions in the grammar
-transform=nodup return the term only if it has no constants duplicated
-transform=nodupatom return the term only if it has no atomic constants duplicated
-transform=typecheck return the term only if it is type-correct
-transform=solve solve metavariables as derived refinements
-transform=context solve metavariables by unique refinements as variables
-transform=delete replace the term by metavariable

-unlexer, untokenization transforming linearization output into a string.
The default is unwords.
-unlexer=unwords space-separated token list (like unwords)
-unlexer=text format as text: punctuation, capitals, paragraph &lt;p&gt;
-unlexer=code format as code (spacing, indentation)
-unlexer=textlit like text, but remove string literal quotes
-unlexer=codelit like code, but remove string literal quotes
-unlexer=concat remove all spaces
-unlexer=bind like identity, but bind at "&+"

-mark, marking of parts of tree in linearization. The default is none.
-mark=metacat append "+CAT" to every metavariable, showing its category
-mark=struct show tree structure with brackets
-mark=java show tree structure with XML tags (used in gfeditor)

-coding, Some grammars are in UTF-8, some in isolatin-1.
If the letters &auml; (a-umlaut) and &ouml; (o-umlaut) look strange, either
change your terminal to isolatin-1, or rewrite the grammar with
'pg -utf8'.

-- *: Commands and options marked with * are not currently implemented.
</pre>
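<p>
As a rough illustration of what some of the <tt>-lexer</tt> and <tt>-unlexer</tt>
values listed in the help text do, here is a minimal Python sketch. It is not GF's
implementation: the function names are invented for this example, and two details
are assumptions based only on the one-line descriptions: that <tt>chars</tt> keeps
whitespace characters as tokens, and that <tt>bind</tt> glues the tokens on
either side of <tt>"&amp;+"</tt> together without a space.

```python
def lex_words(s):
    """-lexer=words: tokens are separated by spaces or newlines."""
    return s.split()


def lex_chars(s):
    """-lexer=chars: each character is a token (assumption: whitespace too)."""
    return list(s)


def unlex_unwords(tokens):
    """-unlexer=unwords: space-separated token list."""
    return " ".join(tokens)


def unlex_concat(tokens):
    """-unlexer=concat: remove all spaces."""
    return "".join(tokens)


def unlex_bind(tokens):
    """-unlexer=bind (assumed reading): join the neighbours of "&+"."""
    out = []
    glue = False
    for tok in tokens:
        if tok == "&+":
            glue = True            # the next token attaches to the previous one
        elif glue and out:
            out[-1] += tok
            glue = False
        else:
            out.append(tok)
            glue = False
    return " ".join(out)


tokens = lex_words("the  cat\nsleeps")
print(tokens)
print(unlex_unwords(tokens))
print(unlex_concat(tokens))
print(unlex_bind(["foo", "&+", "bar", "baz"]))
```

<p>
The other values (<tt>text</tt>, <tt>code</tt>, <tt>literals</tt>, and so on) apply
further conventions that the help text only summarizes, so they are left out of
the sketch.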
<h2>Commands in subshells</h2>
<h3>The interactive editor</h3>
The command <tt>es</tt> (edit session) opens a subshell, where editing is
commenced by selecting a new category, which initializes a syntax tree
with a metavariable. Editing has its own <b>state</b>, expressed by a Tree
Zipper, where the <b>current subtree</b> is marked by a star <tt>*</tt>.
A subtree that is a <b>metavariable</b> (of form <tt>?n</tt>) is
a <b>subgoal</b>.
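<p>
To make the notion of an editing state concrete, the following Python sketch models
a syntax tree together with a pointer to the current subtree. It is only an
illustration under stated assumptions: it stores the focus as a path of child
indices, which is a simple stand-in for a zipper rather than the data structure GF
uses, and the names <tt>Tree</tt>, <tt>EditState</tt>, <tt>Pred</tt> and
<tt>John</tt> are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    label: str                                    # function name, or "?n" for a metavariable
    children: list = field(default_factory=list)

    def is_subgoal(self):
        return self.label.startswith("?")


@dataclass
class EditState:
    root: Tree
    path: tuple = ()                              # child indices from the root to the current subtree

    def current(self):
        node = self.root
        for i in self.path:
            node = node.children[i]
        return node

    def down(self, i):
        if i not in range(len(self.current().children)):
            raise IndexError("no such argument place")
        return EditState(self.root, self.path + (i,))

    def top(self):
        return EditState(self.root, ())

    def refine(self, new):
        """Return a new state whose current subtree is replaced by new."""
        def rebuild(node, path):
            if not path:
                return new
            kids = list(node.children)
            kids[path[0]] = rebuild(kids[path[0]], path[1:])
            return Tree(node.label, kids)
        return EditState(rebuild(self.root, self.path), self.path)

    def show(self):
        """Render the tree in prefix form, with * in front of the current subtree."""
        def go(node, path):                       # path is None off the focus branch
            star = "*" if path == () else ""
            args = [go(c, path[1:] if path and path[0] == i else None)
                    for i, c in enumerate(node.children)]
            text = " ".join([node.label] + args)
            return star + ("(" + text + ")" if args else text)
        return go(self.root, self.path)


state = EditState(Tree("Pred", [Tree("?1"), Tree("?2")])).down(0)
print(state.show())                               # (Pred *?1 ?2)
print(state.refine(Tree("John")).show())          # (Pred *John ?2)
```

<p>
Each operation returns a new state and leaves the old one intact, which is one
simple way to keep a history of earlier states.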
<p>
There are currently three interfaces to the editor: a line-based GF subshell,
a Fudget GUI, and a Java GUI. They all use the same abstract command language;
the difference is that the subshell has a string syntax for each command,
whereas the GUIs mostly use menus and buttons to issue commands.
There is a separate
<a href="http://www.cs.chalmers.se/~aarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm">Editor User Manual</a>.

<p>

The command syntax for the string-based editor is the following:

<p>

Start/finish editing:
<ul>
<li> <tt>n Cat</tt> start new goal of type Cat
<li> <tt>t Tree</tt> start editing with Tree
<li> <tt>q</tt> quit the editor
</ul>
Navigation (change current subtree):
<ul>
<li> <tt>&lt;&lt;</tt> go to previous metavariable
<li> <tt>&lt; Int</tt> go Int steps back in the tree
<li> <tt>'</tt> go to the top of the tree
<li> <tt>&gt; Int</tt> go Int steps ahead in the tree
<li> <tt>&gt;&gt;</tt> go to next metavariable
</ul>
Refinement and wrapping (of current subtree):
<ul>
<li> <tt>r (Fun | Var)</tt> refine with function Fun or variable Var
<li> <tt>w Fun Int</tt> wrap subterm by Fun into its argument place Int
<li> <tt>s Int</tt> select candidate nr. Int (result of ambiguous parsing)
<li> <tt>x Var1 Var2</tt> change (alpha convert) bound variable Var1 to Var2
<li> <tt>d</tt> delete subtree
<li> <tt>g Tree</tt> refine current subgoal with Tree
<li> <tt>p String</tt> parse String as refinement of current subgoal
<li> <tt>a</tt> aleatory: find random refinement
<li> <tt>u</tt> undo: go back in refinement history
<li> <tt>c Transform</tt> apply Transform (one of the -transform values) to subtree
<li> <tt>f Filter</tt> apply Filter (one of the -filter values) to linearization output
</ul>
Information and display:
<ul>
<li> <tt>m</tt> show refinement/wrapping menu
<li> <tt>v</tt> toggle the pretty-printer view (Tree or grammar)
<li> <tt>h</tt> show command help
</ul>

<h3>Translate, parse, and teach yourself sessions</h3>
The system expects a string, which it then tries to parse. A string consisting
of a dot (.) serves as the exit command. The graphical translation session has a
Quit button.
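<p>
The line-based session loop described here can be pictured with a short Python
sketch. This is an assumption-laden illustration, not GF code: the function
<tt>run_session</tt> and its <tt>parse</tt> argument are hypothetical, and
whitespace splitting stands in for a real parser.

```python
def run_session(lines, parse):
    """Parse each input string in turn; stop at a string consisting of a dot."""
    results = []
    for line in lines:
        text = line.strip()
        if text == ".":            # the exit command
            break
        results.append(parse(text))
    return results


# Whitespace splitting is used here only as a placeholder for parsing.
print(run_session(["the cat sleeps", ".", "never reached"], str.split))
```

<p>
Passing <tt>sys.stdin</tt> as <tt>lines</tt> would turn the same loop into an
interactive session.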
</body>
</html>