kr.angelov
|
561e478ed4
|
the statistical parser is now using two memory pools: one for parsing and one for the output trees. This means that the memory for parsing can be released as soon as the needed abstract trees are retrieved, while the trees themselves are retained in the separate output pool
|
2013-05-06 15:28:04 +00:00 |
|
kr.angelov
|
307e0854ed
|
fix the leftcorner filtering after the addition of word completion
|
2013-05-05 10:30:06 +00:00 |
|
kr.angelov
|
9cdd96363a
|
word completion in the C runtime. The runtime/python/test.py example is now using readline with word completion
|
2013-05-01 06:09:55 +00:00 |
|
kr.angelov
|
6cc44193b8
|
finally the statistical parser is able to return all possible abstract trees
|
2013-04-26 20:44:01 +00:00 |
|
kr.angelov
|
650e1cfa43
|
the calculation of lexical_prob in the statistical parser doesn't work properly. It should be fixed but for now I just disabled the optimization
|
2013-03-20 12:28:52 +00:00 |
|
kr.angelov
|
fec34e7622
|
replace #if with #ifdef when checking for the optional bottom up filtering in the C runtime
|
2013-03-20 10:47:47 +00:00 |
|
kr.angelov
|
1ddcfc219e
|
the bottom-up filtering in the C runtime is temporarily disabled. It takes too much memory and even makes it impossible to load the Finnish and the German parsing grammars.
|
2013-03-19 10:59:44 +00:00 |
|
kr.angelov
|
2893397fbb
|
bugfix in the statistical parser
|
2013-03-11 14:47:43 +00:00 |
|
kr.angelov
|
5a54596fe8
|
the parser in the C runtime should not crash if the start category is not defined
|
2013-02-19 12:08:48 +00:00 |
|
kr.angelov
|
55203110bb
|
now the beam size for the statistical parser can be configured by using the flag beam_size in the top-level concrete module
|
2013-02-12 10:53:13 +00:00 |
|
kr.angelov
|
1f77afcfce
|
the statistical parser now uses a baseline lexical estimation of the beam size
|
2013-02-12 09:41:32 +00:00 |
|
kr.angelov
|
56c8f91d19
|
remove the pgf2yaml tool which was both broken and redundant. The declarations for generic programming from data.c are removed as well
|
2013-02-11 13:51:12 +00:00 |
|
kr.angelov
|
f7eaa8a89a
|
bugfix for linearization of metavariables at the root of a tree
|
2012-12-19 10:03:05 +00:00 |
|
kr.angelov
|
5c9ee467a9
|
a major reimplementation of the linearizer in the C runtime
|
2012-12-19 09:07:05 +00:00 |
|
kr.angelov
|
60942c440a
|
bugfix: the outside probability of a PgfItemConts must always be initialized to zero
|
2012-12-13 11:11:45 +00:00 |
|
kr.angelov
|
3182e382dc
|
bugfix for robust parsing with multi-word units
|
2012-12-11 12:57:22 +00:00 |
|
kr.angelov
|
1863e4c3d6
|
added experimental script for chunking in the C runtime
|
2012-12-03 10:07:54 +00:00 |
|
kr.angelov
|
f8c302f9ef
|
remove the duplicated definition of PgfProductionIdx in parser.c
|
2012-11-19 14:16:31 +00:00 |
|
kr.angelov
|
71b7c09ffe
|
bugfix for the building of bottom-up filter in the C runtime
|
2012-11-16 13:27:15 +00:00 |
|
kr.angelov
|
a3ba1991f4
|
revised heuristic in the statistical parser
|
2012-11-14 12:34:22 +00:00 |
|
kr.angelov
|
70c68f0527
|
bugfix in the statistical parser
|
2012-11-13 09:48:23 +00:00 |
|
kr.angelov
|
08ee662944
|
two simple heuristics which speed up the statistical parser more than seven times.
|
2012-11-12 22:17:40 +00:00 |
|
kr.angelov
|
68170d5b08
|
a simple refactoring in the statistical parser
|
2012-11-12 21:48:22 +00:00 |
|
kr.angelov
|
a2771552d6
|
more counters in the profiler for the statistical parser
|
2012-11-12 15:36:21 +00:00 |
|
kr.angelov
|
46de62c452
|
now we store the state instead of the offset for every continuation in the chart for the statistical parser
|
2012-11-12 14:04:52 +00:00 |
|
kr.angelov
|
9967c3ad04
|
in the statistical parser: move the outside probability from the parse items to their continuation. this makes the value slot shared between many items
|
2012-11-12 13:43:43 +00:00 |
|
kr.angelov
|
9d23093492
|
small refactoring in the C runtime
|
2012-11-12 13:05:35 +00:00 |
|
kr.angelov
|
a50c7c24b8
|
use size_t consistently as the type for constituent indices in the C runtime
|
2012-11-12 12:51:27 +00:00 |
|
kr.angelov
|
0ad2405d69
|
forgot to add one #ifdef
|
2012-10-25 18:37:22 +00:00 |
|
kr.angelov
|
9721833680
|
a major refactoring in the robust parser: bottom-up filtering and garbage collection for the chart
|
2012-10-25 14:42:53 +00:00 |
|
kr.angelov
|
bb15542a85
|
in the robust parser we don't have to care about trees which yield empty strings. This makes the parser a lot faster
|
2012-09-24 09:30:20 +00:00 |
|
kr.angelov
|
44df7a33cf
|
the C runtime now has a type prob_t which is used only for probability values
|
2012-09-18 09:18:48 +00:00 |
|
kr.angelov
|
cd3cca4aa2
|
bugfix in the C parser
|
2012-09-06 14:52:19 +00:00 |
|
kr.angelov
|
3a352a953f
|
Use a separate tag for meta productions in the robust parser. This cleans up the code a lot
|
2012-06-13 05:49:30 +00:00 |
|
kr.angelov
|
7549a4876d
|
now there is a limit of 2000000 items in the chart of the robust parser. This keeps memory usage from exploding, but it will also prevent us from parsing some sentences.
|
2012-06-12 11:30:01 +00:00 |
|
kr.angelov
|
b765b0c054
|
now the robust parser is purely top-down and the meta rules compete on a fair basis with the grammar rules
|
2012-06-12 09:29:51 +00:00 |
|
kr.angelov
|
cab4602b62
|
the viterbi probability for the epsilon categories is now updated properly
|
2012-05-25 07:30:35 +00:00 |
|
kr.angelov
|
bd8046f23d
|
another attempt to port the robust parser to MacOS
|
2012-05-16 15:18:44 +00:00 |
|
kr.angelov
|
4aca965109
|
a new unbiased statistical parser. It is still far from perfect; use it at your own risk.
|
2012-05-08 12:13:28 +00:00 |
|
kr.angelov
|
c6c54f8815
|
some fixes in the robust parser and a new API for literals
|
2012-04-12 06:55:25 +00:00 |
|
kr.angelov
|
99cc07ad67
|
simple cleanup in the robust parser
|
2012-04-02 19:01:18 +00:00 |
|
kr.angelov
|
2bf3f22fac
|
libpgf: a new implementation for literals which also allows custom literals. the same mechanism is now used for the metavariables
|
2012-03-12 14:25:51 +00:00 |
|
kr.angelov
|
c1b2246fa9
|
libpgf: added simple lexer
|
2012-03-09 09:14:44 +00:00 |
|
kr.angelov
|
1da464a4cc
|
libpgf: implementation for built-in literal categories
|
2012-03-07 16:39:29 +00:00 |
|
kr.angelov
|
bf81c0f77f
|
libpgf: simple fix in the parser debugger
|
2012-03-07 12:23:07 +00:00 |
|
kr.angelov
|
e871330665
|
libpgf: two APIs - one for finding all parse results and another for finding the best parse result
|
2012-03-07 11:00:17 +00:00 |
|
kr.angelov
|
791a1a17b0
|
libpgf: now all concrete functions and categories are explicitly linked to their abstract counterparts
|
2012-03-05 12:59:31 +00:00 |
|
kr.angelov
|
fdf6dd7798
|
libpgf: preliminary version of the statistical ranking. We use a naive statistical model with random weight for the metavariables.
|
2012-03-02 19:25:01 +00:00 |
|
kr.angelov
|
aca0bd5ee5
|
libpgf: the first prototype for the robust parser
|
2012-02-29 14:43:08 +00:00 |
|
kr.angelov
|
e0bf3c0a07
|
libpgf: another fix in the parser debugger
|
2012-02-28 16:37:12 +00:00 |
|