Commit Graph

53 Commits

Author SHA1 Message Date
kr.angelov
9d23093492 small refactoring in the C runtime 2012-11-12 13:05:35 +00:00
kr.angelov
a50c7c24b8 use size_t consistently as the type for constituent indices in the C runtime 2012-11-12 12:51:27 +00:00
kr.angelov
52255664be use prob_t instead of float in a few places 2012-10-29 08:52:56 +00:00
kr.angelov
0ad2405d69 forgot to add one #ifdef 2012-10-25 18:37:22 +00:00
kr.angelov
9721833680 a major refactoring in the robust parser: bottom-up filtering and garbage collection for the chart 2012-10-25 14:42:53 +00:00
kr.angelov
18fe8af964 now the meta probability for a category is explicitly specified in the statistical model instead of computed internally. this avoids rounding errors while computing the sum of a large number of small values. 2012-09-24 09:37:21 +00:00
kr.angelov
bb15542a85 in the robust parser we don't have to care about trees which yield empty strings. this makes the parser a lot faster 2012-09-24 09:30:20 +00:00
kr.angelov
44df7a33cf the C runtime now has a type prob_t which is used only for probability values 2012-09-18 09:18:48 +00:00
kr.angelov
cd3cca4aa2 bugfix in the C parser 2012-09-06 14:52:19 +00:00
kr.angelov
3a352a953f Use a separate tag for meta productions in the robust parser. This cleans up the code a lot 2012-06-13 05:49:30 +00:00
kr.angelov
7549a4876d now there is a limit of 2000000 items in the chart of the robust parser. This prevents an explosion in memory size but it will also prevent us from parsing some sentences. 2012-06-12 11:30:01 +00:00
kr.angelov
b765b0c054 now the robust parser is purely top-down and the meta rules compete on a fair basis with the grammar rules 2012-06-12 09:29:51 +00:00
kr.angelov
cab4602b62 the viterbi probability for the epsilon categories is now updated properly 2012-05-25 07:30:35 +00:00
kr.angelov
bd8046f23d another attempt to port the robust parser to MacOS 2012-05-16 15:18:44 +00:00
kr.angelov
4aca965109 a new unbiased statistical parser. it is still far from perfect; use it at your own risk. 2012-05-08 12:13:28 +00:00
kr.angelov
ed6a53609b yet another fix for parsing literals 2012-04-18 15:50:55 +00:00
kr.angelov
c6c54f8815 some fixes in the robust parser and a new API for literals 2012-04-12 06:55:25 +00:00
kr.angelov
99cc07ad67 simple cleanup in the robust parser 2012-04-02 19:01:18 +00:00
kr.angelov
2bf3f22fac libpgf: a new implementation for literals which also allows custom literals. the same mechanism is now used for the metavariables 2012-03-12 14:25:51 +00:00
kr.angelov
c1b2246fa9 libpgf: added simple lexer 2012-03-09 09:14:44 +00:00
kr.angelov
1da464a4cc libpgf: implementation for built-in literal categories 2012-03-07 16:39:29 +00:00
kr.angelov
bf81c0f77f libpgf: simple fix in the parser debugger 2012-03-07 12:23:07 +00:00
kr.angelov
e871330665 libpgf: two APIs - one for finding all parse results and another for finding the best parse result 2012-03-07 11:00:17 +00:00
kr.angelov
791a1a17b0 libpgf: now all concrete functions and categories are explicitly linked to their abstract counterparts 2012-03-05 12:59:31 +00:00
kr.angelov
fdf6dd7798 libpgf: preliminary version for the statistical ranking. we use a naive statistical model with a random weight for the metavariables. 2012-03-02 19:25:01 +00:00
kr.angelov
aca0bd5ee5 libpgf: the first prototype for the robust parser 2012-02-29 14:43:08 +00:00
kr.angelov
e0bf3c0a07 libpgf: another fix in the parser debugger 2012-02-28 16:37:12 +00:00
kr.angelov
3aa26948de libpgf: fix in the parser debugger 2012-02-28 13:12:38 +00:00
kr.angelov
9604f3623a libpgf: pretty printing for expressions with metavariables 2012-02-27 13:50:35 +00:00
kr.angelov
695c776065 libpgf: fix in pgf_read_into_map 2012-02-24 15:15:07 +00:00
kr.angelov
0faffc6ffd libpgf: simple fix in the grammar printer and the reader 2012-02-24 13:52:21 +00:00
kr.angelov
f1d2852c4d libpgf: now we have both a complete bottom-up index for robust parsing and fast lexical lookup from the same index 2012-02-22 21:27:54 +00:00
kr.angelov
440b208144 libpgf: two small fixes in the parser debugger 2012-02-22 14:06:49 +00:00
kr.angelov
831de53573 libpgf: the map curr_lindefs must be allocated from a temporary pool 2012-02-22 08:49:08 +00:00
kr.angelov
dc4c3d3b28 libpgf: added index for fast lexicon lookup. Still not perfect 2012-02-21 21:17:50 +00:00
kr.angelov
30d3fc8b4d libpgf: now the debugging mode for the parser is available only with a compilation option. 2012-02-18 19:30:16 +00:00
kr.angelov
0f7b3ed9f4 libpgf: remove the now redundant field extra_ccats in PgfConcr 2012-02-18 16:25:53 +00:00
kr.angelov
0147885e2f libpgf: now the linearization index is created during the grammar loading which also makes the types PgfLzr and PgfParser redundant. 2012-02-18 16:22:40 +00:00
kr.angelov
75b724ab54 libpgf: simplify the loading of PgfCncCat 2012-02-17 14:26:08 +00:00
kr.angelov
1bb13787a7 libpgf: added printer.c 2012-02-17 14:11:29 +00:00
kr.angelov
92cbbe9173 libpgf: switch to using callbacks and lazy prediction in the parser. this reduces the parsing time from 11 sec down to 3 sec. 2012-01-26 12:32:26 +00:00
kr.angelov
0e05fc08d5 libpgf: use a temporary pool for allocating the arrays in the continuation map of the parser 2012-01-26 09:03:08 +00:00
kr.angelov
469d8cf804 libpgf: fix a warning in reader.c 2012-01-26 08:58:23 +00:00
kr.angelov
b62d57fd30 libpgf: a few fixes to make the loading of grammars with def rules possible 2012-01-24 14:47:11 +00:00
kr.angelov
5b96b55184 libpgf: extra_ccat is now redundant and was removed 2012-01-23 19:47:08 +00:00
kr.angelov
21dee01c9d libpgf: debugging framework for the parser 2012-01-23 15:49:29 +00:00
kr.angelov
f2cfa9888e libpgf: the concrete categories were allocated from the temporary pool 2012-01-23 13:43:17 +00:00
kr.angelov
e9014902ef libpgf: printing of literals and flags 2012-01-23 10:17:20 +00:00
kr.angelov
c5b4e5388a libpgf: move the lindefs field from PgfCncCat to PgfCCat. display the list in the grammar printout 2012-01-23 09:46:45 +00:00
kr.angelov
64a00dad48 added an API for printing the PGF in a human-readable format 2012-01-21 10:27:55 +00:00