Commit Graph

144 Commits

Author SHA1 Message Date
krasimir
ce70720859 CFGtoPGF is now extended to support context-free grammars with primitive parameters 2016-03-22 10:28:15 +00:00
krasimir
c8ebe09315 initial support for BNFC syntax in context-free grammars for GF. Not all features are supported yet. Based on contribution from Gleb Lobanov 2016-03-21 13:27:44 +00:00
leiss
df2901c9c0 add lexer and unlexer for Ancient Greek accent normalization 2016-02-23 16:30:39 +00:00
hallgren
f2b057c078 GF shell, cc command: try to compute pre{...} tokens in token sequences
This is implemented as a simple post-processing step after partial evaluation
to try to compute pre{...} tokens in token sequences. Nothing is done to deal
with intervening free variants.

This was done in response to a query from René T on the gf-dev mailing list.
2015-12-02 16:41:18 +00:00
aarne
0a38e137b6 vd -conll2latex now converts conll to latex directly, without going through GF trees, as a service to the dependency parser community. 2015-11-23 10:43:03 +00:00
aarne
d6a505169a added -output=latex to visualize_dependencies. This generates more familiar-looking output than the default graphviz, which can moreover be pasted into LaTeX documents. Some more work is needed to make long sentences look nice and fit on a page; a constant word length is now used to simplify computing the coordinates. 2015-11-17 09:11:10 +00:00
aarne
ce6557f1f2 the visualization commands (aw,vd,vp,vt) can now show multiple trees. Previously they only showed one tree even if there were several, for instance after ambiguous parsing. The reason was that dot (graphviz) ignored all graphs but the first one. Now the graphs are put into separate files. The 'convert' command from the ImageMagick package is used to combine them into one pdf. If this is a problem, the old behaviour can be restored by the -number=1 option to the tree-generating command, which cuts away all trees but the first one and doesn't require ImageMagick. 2015-11-05 16:47:41 +00:00
aarne
eb49b6ab56 improved documentation of vp -showdep 2015-11-05 08:23:33 +00:00
aarne
0786dc6f42 dependency labels in parse trees now with the -deps flag, -file=labels_file for configuration. With -nocat option this shows reasonable dep trees, more familiar looking than the vd command. With -showfun flag, the tree gives a rather complete picture of the analysis of the sentence. 2015-11-04 20:36:47 +00:00
aarne
e39787ab88 prepared visualize_parse for showing dependency labels 2015-11-04 17:28:09 +00:00
hallgren
35be182824 Preliminary new shell feature: cc -trace.
You can now do things like 

	cc -trace mkV "debug"

to see a trace of all opers with their arguments and results during the
computation of mkV "debug".
2015-09-28 22:23:56 +00:00
hallgren
32f18b515e GF shell: write_file now writes one tree per line
This compensates for other changes that removed line breaks.
Maybe it should have a -lines option like ps and rf?
2015-09-03 20:42:38 +00:00
hallgren
5bfaf10de5 Comment out some dead code found with -fwarn-unused-binds
Also fixed some warnings and tightened some imports
2015-08-28 13:59:43 +00:00
hallgren
128236eab9 GF shell: change parse & linearize to obtain useful results from p|l and l|p in more cases
These changes are inspired by the gf -cshell implementation of these commands.

The output of the linearize command has been changed to remove superfluous
blank lines and commas, and deliver the result as a list of strings instead of
a single multi-line string. This makes it possible to use -all and pipe the
results to the parse command. This also means that with -treebank -all,
the language tag will be repeated for each result from the same language.

The parse command, when trying to parse with more than one language, would
"forget" other results after a failed parse, and thus not send all
successful parses through the pipe. For example, if English is not the first
language in the grammar,

    p "hello" | l

would output nothing, instead of translations of "hello" to all languages,
forcing the user to write

   p -lang=Eng "hello" | l

instead, to get the expected result. The cause of this behaviour was in the
function fromParse, which was rather messy, so I assume it is not intentional,
but the result of a programming mistake at some point.

The fromParse function has now been refactored from a big recursive function
into 

    fromParse opts = foldr (joinPiped . fromParse1 opts) void

where the helper function fromParse1 deals with a single parse result and
joinPiped combines multiple parse results.
2015-08-26 13:56:23 +00:00
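
The fold used in the fromParse refactoring above (commit 128236eab9) can be illustrated with a minimal, self-contained sketch. Only the names fromParse, fromParse1, joinPiped and void come from the commit message; the types Piped and ParseResult, the handling of failed parses, and the omission of the opts argument are assumptions made for illustration and do not reproduce the actual GF shell code.

```haskell
-- Hypothetical stand-in for the shell's piped value:
-- trees to pass down the pipe, plus text to display.
data Piped = Piped [String] String
  deriving (Eq, Show)

-- The empty result, used as the base case of the fold.
void :: Piped
void = Piped [] ""

-- Hypothetical parse result for one language:
-- either an error message or a list of trees.
data ParseResult = Failed String | Ok [String]

-- Deal with a single parse result. A failure contributes only
-- its message, so it cannot discard results from other languages.
fromParse1 :: ParseResult -> Piped
fromParse1 (Failed msg) = Piped [] (msg ++ "\n")
fromParse1 (Ok trees)   = Piped trees (unlines trees)

-- Combine two piped values by concatenating trees and text.
joinPiped :: Piped -> Piped -> Piped
joinPiped (Piped ts1 s1) (Piped ts2 s2) = Piped (ts1 ++ ts2) (s1 ++ s2)

-- Same shape as in the commit message, without the opts argument.
fromParse :: [ParseResult] -> Piped
fromParse = foldr (joinPiped . fromParse1) void
```

In this sketch, a failed parse in the first language no longer hides a successful parse in a later one: the trees from every successful result reach the next command in the pipe.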
hallgren
c25705519a GF shell bug fix: visualize_parse didn't accept the -lang flag
Even though the -lang flag was handled in the implementation, it was not
documented, and GF.Command.Interpreter rejects undocumented flags:

	option not interpreted: lang

This must be a fairly old bug, so it suggests that the vp command isn't used
much...
2015-08-21 12:20:40 +00:00
hallgren
e178615338 GF -cshell: implement visualize_parse
Supported options and flags: -lang -format -view
None of the rendering options available in the Haskell run-time are supported.
2015-08-21 12:14:47 +00:00
hallgren
026d6a73ad gf -cshell: implement visualize_tree
But the following options are not supported: -mk -nocats -nofuns
2015-08-21 10:14:46 +00:00
hallgren
786ef54d62 gf -cshell: implement a subset of print_grammar and abstract_info
pg supports only the -funs, -cats and -langs output modes.

ai IDENTIFIER shows info about a category or a function. ai can not type check
and refine metavariables in expressions.
2015-08-20 16:06:10 +00:00
hallgren
2ff7e829dc gf -cshell: linearize: implement options -all -list -treebank
Options -all and -list use PGF2.linearizeAll, which lists all variants, but
not all forms...
Also, there is no attempt to be compatible with the output from the Haskell
run-time shell, which produces superfluous blank lines (-all) or
commas (-list), and mixes tagged and untagged lines (-treebank -all).
2015-08-18 16:05:45 +00:00
hallgren
41075fb50a GF shell: restore the eh command to working order and document it
Also, when the command line parser fails, append the problematic command line
to the error message "command not parsed".
2015-08-18 13:13:31 +00:00
hallgren
87e64a804c GF Shell: refactoring for improved modularity and reusability:
+ Generalize the CommandInfo type by parameterizing it on the monad
  instead of just the environment.
+ Generalize the commands defined in
  GF.Command.{Commands,Commands2,CommonCommands,SourceCommands,HelpCommand}
  to work in any monad that supports the needed operations.
+ Liberate GF.Command.Interpreter from the IO monad.
  Also, move the current PGF from CommandEnv to GFEnv in
  GF.Interactive, making the command interpreter even more generic.
+ Use a state monad to maintain the state of the interpreter in
  GF.{Interactive,Interactive2}.
2015-08-13 10:49:50 +00:00
hallgren
d860a921e0 GF Shell: turn set_encoding into a common command
Implemented in GF.Command.CommonCommands instead of GF.Interactive &
GF.Interactive2.
2015-08-12 15:00:03 +00:00
hallgren
6fff2def39 GF shell: source commands (cc, sd, so, ss & dg) can now be used in pipes
These commands are now implemented as regular commands (i.e. using the
CommandInfo data type) in the new module GF.Command.SourceCommands.

The list of commands exported from GF.Command.Commands is now called pgfCommands
instead of allCommands.

The list allCommands of all commands is now assembled
from sourceCommands, pgfCommands, commonCommands and helpCommand in
GF.Interactive.
2015-08-12 11:05:08 +00:00
hallgren
063912c386 Move welcome message from GF.Interactive & GF.Interactive2 to GF.Command.Messages
...to avoid the duplication.
2015-08-12 11:01:45 +00:00
hallgren
e50f92c41d GF shell: make environment types abstract, comment out some dead code 2015-08-11 16:14:38 +00:00
hallgren
911310bb40 gf -cshell: improved help for the 'import' command 2015-08-10 16:39:31 +00:00
hallgren
10e7bacbfd Factor out common code from GF.Command.Commands and GF.Command.Commands2
Created module GF.Command.CommonCommands with ~250 lines of code for commands
that do not depend on the type of PGF in the environment, either because they
don't use the PGF or because they are just documented here and implemented
elsewhere.

TODO: further refactoring so that documentation and implementation of
*all* commands can be kept together.
2015-08-10 16:30:17 +00:00
hallgren
8d6e61a8df gf -cshell: preliminary support for the C run-time system in the GF shell
Some C run-time functionality is now available in the GF shell, by starting
GF with 'gf -cshell' or 'gf -crun'. Only limited functionality is available
when running the shell in these modes:

- You can only import .pgf files, not source files.
- The -retain flag can not be used and the commands that require it to work
  are not available.
- Only 18 of the 40 commands available in the usual shell have been
  implemented. The 'linearize' and 'parse' commands are the only ones
  that call the C run-time system, and they support only a limited set of
  options and flags. Use the 'help' command for details.
- A new command 'generate_all', that calls PGF2.generateAll, has been added.
  Unfortunately, using it causes a 'segmentation fault'.

This is implemented by adding two new modules: GF.Command.Commands2 and
GF.Interactive2. They are copied and modified versions of GF.Command.Commands
and GF.Interactive, respectively. Code for unimplemented commands and other
code that has not been adapted to the C run-time system has been left in
place, but commented out, pending further work.
2015-08-10 14:12:51 +00:00
hallgren
d38efbaa6a Refactor GF shell modules to improve modularity and reusability
+ Move type CommandInfo from GF.Command.Commands to a new module
  GF.Commands.CommandInfo and make it independent of the PGF type.
+ Make the module GF.Command.Interpreter independent of the PGF type and
  eliminate the import of GF.Command.Commands.
+ Move the implementation of the "help" command to its own module
  GF.Command.Help
2015-08-10 13:01:02 +00:00
krasimir
8c697b72a4 drop the dependency to FST 2015-04-20 11:56:13 +00:00
krasimir
0238579610 remove some more old code 2015-03-05 14:47:36 +00:00
hallgren
632aab83c3 GF shell: fixed problems with previous change of the -retain flag
Because the prompt included the name of the abstract syntax, the loading
of the PGF was forced even if -retain was used. Even worse,
if an error occurred while loading the PGF, it was repeated and caught
every time the prompt was printed, creating an infinite loop. The solution
is to not print the name of the abstract syntax when the grammar is
imported with -retain, which is the way things were before anyway.
2015-02-27 16:42:09 +00:00
hallgren
c707575bd7 Documentation improvements and cleanup relating to the IOE monad
Renamed appIOE to tryIOE (it is analogous to 'try' in the standard libraries).
Removed unused IOE operations & documented the remaining ones.
Removed/simplified superfluous uses of IOE operations.
2014-11-10 16:20:01 +00:00
hallgren
6ee67cd04f Various small changes for improved documentation 2014-10-22 15:45:52 +00:00
aarne
84bce336fd (un)lexmixed: added the other math environments than $ used in latex 2014-10-19 17:43:39 +00:00
aarne
6c2e0d5ce2 ps -lines preserves line-by-line structure when preprocessing files for parsing line by line 2014-10-17 15:50:03 +00:00
kr.angelov
584d589041 partial support for def rules in the C runtime
The def rules are now compiled to byte code by the compiler and then to
native code by the JIT compiler in the runtime. Not all constructions
are implemented yet. The partial implementation is now in the repository
but it is not activated by default since this requires changes in the
PGF format. I will enable it only after it is complete.
2014-08-11 10:59:10 +00:00
hallgren
7a91afc02a Convert from Text.PrettyPrint to GF.Text.Pretty
All compiler modules now use GF.Text.Pretty instead of Text.PrettyPrint
2014-07-28 11:58:00 +00:00
hallgren
30cda51516 Introducing GF.Text.Pretty for more concise pretty printers and GF.Infra.Location for modularity
GF.Text.Pretty provides the class Pretty and overloaded versions of the pretty
printing combinators in Text.PrettyPrint, allowing pretty printable values to
be used directly instead of first having to convert them to Doc with functions
like text, int, char and ppIdent. Some modules have been converted to use
GF.Text.Pretty, but not all. Precedences could be added to simplify the pretty
printers for terms and patterns.

GF.Infra.Location contains the types Location and L, factored out from
GF.Grammar.Grammar, and the class HasSourcePath. This allowed the import
of GF.Grammar.Grammar to be removed from GF.Infra.CheckM, making it more
like a pure library module.
2014-07-27 22:06:23 +00:00
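
The idea behind GF.Text.Pretty described above (commit 30cda51516), overloaded combinators that accept pretty-printable values directly, can be sketched as follows. This is an illustration only: the class Pretty is named in the commit message, but the method names pp and ppList, the instances, and the helper render are assumptions and may differ from the actual module.

```haskell
import qualified Text.PrettyPrint as PP

-- A class of values that can be turned into a Doc.
-- ppList plays the same role as showList in the Show class:
-- it lets String be handled without language extensions.
class Pretty a where
  pp :: a -> PP.Doc
  ppList :: [a] -> PP.Doc
  ppList = PP.hsep . map pp

instance Pretty PP.Doc where
  pp = id

instance Pretty Int where
  pp = PP.int

instance Pretty Char where
  pp = PP.char
  ppList = PP.text

instance Pretty a => Pretty [a] where
  pp = ppList

-- Overloaded version of (<+>): both arguments may be any
-- pretty-printable value, not only a Doc.
infixl 6 <+>
(<+>) :: (Pretty a, Pretty b) => a -> b -> PP.Doc
a <+> b = pp a PP.<+> pp b

-- Render any pretty-printable value to a String.
render :: Pretty a => a -> String
render = PP.render . pp
```

With these definitions one can write `"count" <+> (3 :: Int)` instead of `text "count" <+> int 3`; the type annotation is needed because numeric literals are not defaulted for a user-defined class.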
hallgren
d6252d1c16 PGF library: expose only PGF and PGF.Internal instead of all modules
PGF exports the public, stable API.
PGF.Internal exports additional things needed in the GF compiler & shell,
including the nonstandard version of Data.Binary.
2014-06-12 14:43:18 +00:00
kr.angelov
67f64cb233 now we compile context-free grammars directly to PGF without going via GF source code. This makes it quick and lightweight to compile big grammars such as the Berkeley grammar 2014-05-24 07:47:06 +00:00
kr.angelov
51a9ef72c7 refactor the compilation of CFG and EBNF grammars. Now they are parsed by using GF.Grammar.Parser just like the ordinary GF grammars. Furthermore now GF.Speech.CFG is moved to GF.Grammar.CFG. The new module is used by both the speech conversion utils and by the compiler for CFG grammars. The parser for CFG now consumes a lot less memory and can be used with grammars with more than 4 000 000 productions. 2014-03-21 21:25:05 +00:00
hallgren
ed3d30e3d1 Check file datestamp before creating PGF file when compiling grammars
When running a command like

	gf -make L_1.gf ... L_n.gf

gf now avoids recreating the target PGF file if it already exists and is
up-to-date. 

gf still reads all required .gfo files, so significant additional speed
improvements are still possible. This could be done by reading .gfo files
more lazily...
2014-01-09 17:30:24 +00:00
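
The datestamp check described above (commit ed3d30e3d1) can be sketched as a comparison of modification times. The commit message does not say which files gf compares against the target, so the function name upToDate and the choice of inputs are assumptions; this is not the compiler's actual code.

```haskell
import System.Directory (doesFileExist, getModificationTime)

-- True when the target file exists and is at least as new as
-- every input file, so that recreating it can be skipped.
-- Assumes that all input files exist.
upToDate :: FilePath -> [FilePath] -> IO Bool
upToDate target inputs = do
  exists <- doesFileExist target
  if not exists
    then return False
    else do
      targetTime <- getModificationTime target
      inputTimes <- mapM getModificationTime inputs
      return (all (<= targetTime) inputTimes)
```

A caller would run the expensive step only when `upToDate` returns False, for example when the target is missing or any input has been modified after it was written.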
hallgren
d6974a4065 GF shell: fix help text for generate_trees
Trees are not generated with increasing depth.
2013-12-06 13:45:12 +00:00
hallgren
ddac5f9e5a GF shell: improved system_pipe (aka "?") command
1. No temporary files are created.

2. The output of a system command is read lazily, making it feasible to 
   process large or even infinite output, e.g. the following works as
   expected:

	? "yes" | ? "head -5" | ps -lextext
2013-11-19 15:18:58 +00:00
hallgren
fa4c327463 Fix Issue 60: Weird output when executing system commands from the gf shell
The system_pipe (aka "?") command creates a temporary file _tmpi containing
the input of the system command. It *both* appends _tmpi as an extra argument
to the system command line *and* adds an input redirection "< _tmpi". (It
also uses an output redirection "> _tmpo" to capture the output of the
command.)

With this patch, the _tmpi argument is no longer appended to the command line.
This allows system_pipe to work with pure filters, such as the "tr" commands,
but it will no longer work with commands that require an input file name.
(It is possible to use write_file instead...)

TODO: it would also be fairly easy to eliminate the creation of the _tmpi and
_tmpo files altogether.
2013-11-12 18:07:38 +00:00
hallgren
47e04656fb Fix issue 61: GF shell cannot parse a system command ending with a space
Trailing spaces caused the command line parse to be ambiguous, and
ambiguous parses were rejected by function readCommandLine, causing
the cryptic error message "command not parsed".
2013-11-11 15:13:24 +00:00
hallgren
48660c219a Make PGF.Tree internal
The only use of PGF.Tree outside the PGF library was in GF.Command.Commands,
and it was eliminated by using PGF.Expr directly instead.
PGF.Paraphrase still uses PGF.Tree.
2013-11-06 14:29:17 +00:00
kr.angelov
2483dc7728 the content of ParseEngAbs3.probs is now merged with ParseEngAbs.probs. The latter is now retrained. Once the grammar is compiled with the .probs file, it doesn't need anything more to do robust parsing. The robustness itself is controlled by the flags 'heuristic_search_factor', 'meta_prob' and 'meta_token_prob' in ParseEngAbs.gf 2013-11-06 10:21:46 +00:00
aarne
e4c6ca41a7 added a -treebank option to the lc command 2013-11-05 20:42:22 +00:00