kr.angelov
|
70c68f0527
|
bugfix in the statistical parser
|
2012-11-13 09:48:23 +00:00 |
|
kr.angelov
|
08ee662944
|
two simple heuristics which speed up the statistical parser more than seven times.
|
2012-11-12 22:17:40 +00:00 |
|
kr.angelov
|
68170d5b08
|
a simple refactoring in the statistical parser
|
2012-11-12 21:48:22 +00:00 |
|
kr.angelov
|
a2771552d6
|
more counters in the profiler for the statistical parser
|
2012-11-12 15:36:21 +00:00 |
|
kr.angelov
|
46de62c452
|
now we store the state instead of the offset for every continuation in the chart for the statistical parser
|
2012-11-12 14:04:52 +00:00 |
|
kr.angelov
|
9967c3ad04
|
in the statistical parser: move the outside probability from the parse items to their continuation. this makes the value slot shared between many items
|
2012-11-12 13:43:43 +00:00 |
|
kr.angelov
|
9d23093492
|
small refactoring in the C runtime
|
2012-11-12 13:05:35 +00:00 |
|
kr.angelov
|
a50c7c24b8
|
use size_t consistently as the type for constituent indices in the C runtime
|
2012-11-12 12:51:27 +00:00 |
|
kr.angelov
|
52255664be
|
use prob_t instead of float in a few places
|
2012-10-29 08:52:56 +00:00 |
|
kr.angelov
|
0ad2405d69
|
forgot to add one #ifdef
|
2012-10-25 18:37:22 +00:00 |
|
kr.angelov
|
9721833680
|
a major refactoring in the robust parser: bottom-up filtering and garbage collection for the chart
|
2012-10-25 14:42:53 +00:00 |
|
kr.angelov
|
18fe8af964
|
now the meta probability for a category is explicitly specified in the statistical model instead of computed internally. this avoids rounding errors while computing the sum of a large number of small values.
|
2012-09-24 09:37:21 +00:00 |
|
kr.angelov
|
bb15542a85
|
in the robust parser we don't have to care about trees which yield empty strings. this makes the parser a lot faster
|
2012-09-24 09:30:20 +00:00 |
|
kr.angelov
|
44df7a33cf
|
the C runtime now has a type prob_t which is used only for probability values
|
2012-09-18 09:18:48 +00:00 |
|
kr.angelov
|
cd3cca4aa2
|
bugfix in the C parser
|
2012-09-06 14:52:19 +00:00 |
|
kr.angelov
|
3a352a953f
|
Use a separate tag for meta productions in the robust parser. This cleans up the code a lot
|
2012-06-13 05:49:30 +00:00 |
|
kr.angelov
|
7549a4876d
|
now there is a limit of 2000000 items in the chart of the robust parser. This prevents an explosion in memory size, but it will also prevent us from parsing some sentences.
|
2012-06-12 11:30:01 +00:00 |
|
kr.angelov
|
b765b0c054
|
now the robust parser is purely top-down and the meta rules compete on a fair basis with the grammar rules
|
2012-06-12 09:29:51 +00:00 |
|
kr.angelov
|
cab4602b62
|
the viterbi probability for the epsilon categories is now updated properly
|
2012-05-25 07:30:35 +00:00 |
|
kr.angelov
|
bd8046f23d
|
another attempt to port the robust parser to MacOS
|
2012-05-16 15:18:44 +00:00 |
|
kr.angelov
|
4aca965109
|
a new unbiased statistical parser. it is still far from perfect; use it at your own risk.
|
2012-05-08 12:13:28 +00:00 |
|
kr.angelov
|
ed6a53609b
|
yet another fix for parsing literals
|
2012-04-18 15:50:55 +00:00 |
|
kr.angelov
|
c6c54f8815
|
some fixes in the robust parser and a new API for literals
|
2012-04-12 06:55:25 +00:00 |
|
kr.angelov
|
99cc07ad67
|
simple cleanup in the robust parser
|
2012-04-02 19:01:18 +00:00 |
|
kr.angelov
|
2bf3f22fac
|
libpgf: a new implementation for literals which also allows custom literals. the same mechanism is now used for the metavariables
|
2012-03-12 14:25:51 +00:00 |
|
kr.angelov
|
c1b2246fa9
|
libpgf: added simple lexer
|
2012-03-09 09:14:44 +00:00 |
|
kr.angelov
|
1da464a4cc
|
libpgf: implementation for built in literal categories
|
2012-03-07 16:39:29 +00:00 |
|
kr.angelov
|
bf81c0f77f
|
libpgf: simple fix in the parser debugger
|
2012-03-07 12:23:07 +00:00 |
|
kr.angelov
|
e871330665
|
libpgf: two APIs - one for finding all parse results and another for finding the best parse result
|
2012-03-07 11:00:17 +00:00 |
|
kr.angelov
|
791a1a17b0
|
libpgf: now all concrete functions and categories are explicitly linked to their abstract counterparts
|
2012-03-05 12:59:31 +00:00 |
|
kr.angelov
|
fdf6dd7798
|
libpgf: preliminary version of the statistical ranking. we use a naive statistical model with random weight for the meta variables.
|
2012-03-02 19:25:01 +00:00 |
|
kr.angelov
|
aca0bd5ee5
|
libpgf: the first prototype for the robust parser
|
2012-02-29 14:43:08 +00:00 |
|
kr.angelov
|
e0bf3c0a07
|
libpgf: another fix in the parser debugger
|
2012-02-28 16:37:12 +00:00 |
|
kr.angelov
|
3aa26948de
|
libpgf: fix in the parser debugger
|
2012-02-28 13:12:38 +00:00 |
|
kr.angelov
|
9604f3623a
|
libpgf: pretty printing for expressions with metavariables
|
2012-02-27 13:50:35 +00:00 |
|
kr.angelov
|
695c776065
|
libpgf: fix in pgf_read_into_map
|
2012-02-24 15:15:07 +00:00 |
|
kr.angelov
|
0faffc6ffd
|
libpgf: simple fix in the grammar printer and the reader
|
2012-02-24 13:52:21 +00:00 |
|
kr.angelov
|
f1d2852c4d
|
libpgf: now we have both a complete bottom-up index for robust parsing and fast lexical lookup from the same index
|
2012-02-22 21:27:54 +00:00 |
|
kr.angelov
|
440b208144
|
libpgf: two small fixes in the parser debugger
|
2012-02-22 14:06:49 +00:00 |
|
kr.angelov
|
831de53573
|
libpgf: the map curr_lindefs must be allocated from a temporary pool
|
2012-02-22 08:49:08 +00:00 |
|
kr.angelov
|
dc4c3d3b28
|
libpgf: added index for fast lexicon lookup. Still not perfect
|
2012-02-21 21:17:50 +00:00 |
|
kr.angelov
|
30d3fc8b4d
|
libpgf: now the debugging mode for the parser is available only with a compilation option.
|
2012-02-18 19:30:16 +00:00 |
|
kr.angelov
|
0f7b3ed9f4
|
libpgf: remove the now redundant field extra_ccats in PgfConcr
|
2012-02-18 16:25:53 +00:00 |
|
kr.angelov
|
0147885e2f
|
libpgf: now the linearization index is created during the grammar loading which also makes the types PgfLzr and PgfParser redundant.
|
2012-02-18 16:22:40 +00:00 |
|
kr.angelov
|
75b724ab54
|
libpgf: simplify the loading of PgfCncCat
|
2012-02-17 14:26:08 +00:00 |
|
kr.angelov
|
1bb13787a7
|
libpgf: added printer.c
|
2012-02-17 14:11:29 +00:00 |
|
kr.angelov
|
92cbbe9173
|
libpgf: switch to using callbacks and lazy prediction in the parser. this reduces the parsing time from 11 sec down to 3 sec.
|
2012-01-26 12:32:26 +00:00 |
|
kr.angelov
|
0e05fc08d5
|
libpgf: use a temporary pool for allocating the arrays in the continuation map of the parser
|
2012-01-26 09:03:08 +00:00 |
|
kr.angelov
|
469d8cf804
|
libpgf: fix a warning in reader.c
|
2012-01-26 08:58:23 +00:00 |
|
kr.angelov
|
b62d57fd30
|
libpgf: a few fixes to make the loading of grammars with def rules possible
|
2012-01-24 14:47:11 +00:00 |
|