Commit Graph

320 Commits

Author SHA1 Message Date
John J. Camilleri
785d6069e2 Fix lin2string and pass all unittests and Phrasebook 2021-03-08 09:53:10 +01:00
John J. Camilleri
575a746a3e Add LPGF function for catching errors. Manual fixes to Phrasebook treebank. 2021-03-05 12:05:25 +01:00
John J. Camilleri
30b016032d Also store Pre prefixes in token map. Introduce IntMapBuilder data structure.
Storing of prefixes uses show/read, which isn't a great solution but avoids having yet another token map.
2021-03-04 09:58:17 +01:00
John J. Camilleri
4082c006c3 Extract token strings and put them in map which linfuns refer to by index, to reduce LPGF sizes. 2021-03-04 00:16:12 +01:00
John J. Camilleri
4c09e4a340 Remove LF prefix from constructors. Pass all unit tests and Foods again, but improvements/cleanup still necessary. 2021-03-03 09:19:52 +01:00
John J. Camilleri
33e0e98aec Add 1-tree treebank for Phrasebook in a few languages 2021-02-28 00:34:46 +01:00
John J. Camilleri
4771d9c356 WIP params 2021-02-26 17:18:21 +01:00
John J. Camilleri
8324ad8801 Add pretty-printing of LPGF grammars, to help debugging 2021-02-26 10:13:33 +01:00
John J. Camilleri
b4a393ac09 Pass missing unit test 2021-02-21 14:22:46 +01:00
John J. Camilleri
9f3f4139b1 Grammar and languages to run in testsuite can be specified by command line options, see README 2021-02-19 11:14:55 +01:00
John J. Camilleri
29114ce606 Improve binary format, reducing Foods.lpgf from 300 to 73KB (4x smaller!) 2021-02-16 23:30:21 +01:00
John J. Camilleri
5be21dba1c Add and pass FoodsJpn 2021-02-16 22:49:37 +01:00
John J. Camilleri
312cfeb69d Add Afr, Amh, Cat, Cze, Dut, Ger foods grammars to testsuite 2021-02-16 22:33:26 +01:00
John J. Camilleri
4c06c3f825 Add case for when pre is not followed by anything 2021-02-16 21:01:01 +01:00
John J. Camilleri
398b294734 Use Data.Text instead of String. Rename Abstr to Abstract, Concr to Concrete. 2021-02-16 16:04:40 +01:00
John J. Camilleri
d394cacddf Add support for CAPIT and ALL_CAPIT 2021-02-16 15:17:54 +01:00
John J. Camilleri
21f14c2aa1 Add support for SOFT_SPACE 2021-02-16 14:57:33 +01:00
John J. Camilleri
4d1217b06d Add support for pre 2021-02-15 21:57:05 +01:00
John J. Camilleri
d563abb928 Minors 2021-02-13 00:59:15 +01:00
John J. Camilleri
98f6136ebd Add support for BIND 2021-02-13 00:14:35 +01:00
John J. Camilleri
8cfaa69b6e Handle record tables, pass FoodSwe in testsuite 2021-02-12 23:51:16 +01:00
John J. Camilleri
9c2d8eb0b2 Add FoodsChi, FoodsHeb to LPGF testsuite 2021-02-09 10:14:40 +01:00
John J. Camilleri
34f0fc0ba7 Fix bug in dynamic parameter handling, compile FoodsBul successfully 2021-02-03 15:41:27 +01:00
John J. Camilleri
132f693713 Minor cleanup 2021-02-03 09:44:15 +01:00
John J. Camilleri
c94bffe435 Generalise testsuite script to use treebank files, add FoodEng 2021-02-02 21:22:36 +01:00
John J. Camilleri
fe15aa0c00 Use canonical GF in LPGF compiler
Still contains some hardcoded values, missing cases.

I notice now that LPGF and Canonical GF are almost identical, so maybe we don't need a new LPGF format,
just a linearization-only runtime which works on canonical grammars.
The argument for keeping LGPF is that it would be optimized for size and speed.
2021-02-01 12:28:06 +01:00
John J. Camilleri
270e7f021f Add binary instances 2021-01-25 14:42:00 +01:00
John J. Camilleri
f24c50339b Strip down format. More early work on compiler. Add testsuite (doesn't work yet). 2021-01-25 12:10:30 +01:00
John J. Camilleri
cd5881d83a Early work on LPGF compiler 2021-01-22 15:17:36 +01:00
John J. Camilleri
93b81b9f13 Add first version of LPGF datatype, with linearization function and some hardcoded examples 2021-01-22 14:07:41 +01:00
krangelov
f3a8658cc1 Merge branch 'master' of https://github.com/GrammaticalFramework/gf-core 2020-10-02 19:55:24 +02:00
krangelov
bfb94d1e48 fix parsing with HOAS 2020-10-02 19:34:52 +02:00
Andreas Källberg
251845f83e First attempt at fixing incompabilities with newer cabal 2020-08-05 18:48:24 +02:00
aarneranta
8a052edca2 an attempt to solve record extension overloading bug, commented out for the moment 2020-07-06 18:01:59 +02:00
aarneranta
65c810f085 accepting gf-ud style abslabels in gf-core ; cnclabels TODO 2020-05-05 15:46:48 +02:00
krangelov
733fdac755 restore the sequence ordering after -optimize-pgf 2020-03-15 19:57:47 +01:00
aarneranta
6f2b1a83b7 fixed a vd bug that sometimes erased the root label 2019-11-13 11:40:37 +01:00
aarneranta
d3b501d35f fixed the problem with generating several roots in ud2gf. Now only the leftmost word becomes ROOT, the others become dep - which can be eliminated by cnclabels. This works fine for e.g. English prepositional and particle verbs. But it does not work if the 'main' word is not the leftmost one 2019-11-12 17:46:55 +01:00
Aarne Ranta
b3387e80e4 hiding morphological tags from Latex printing of dependency trees 2019-03-20 22:19:32 +01:00
Thomas Hallgren
fc5c2b5a22 PGF.Haskell.fromStr: fix double spaces caused by empty tokens 2019-01-23 02:45:23 +01:00
Prasanth Kolachina
0accd97691 add CoNLLU as output format for gf2ud: merging issue (#24) 2019-01-07 13:24:49 +01:00
Prasanth Kolachina
f8bd35543c Merge pull request #24 from odanoburu/gf2ud-comments
(gf2ud) add comments to CoNLL-U output
2019-01-07 13:18:45 +01:00
Krasimir Angelov
260c0d07e0 revert to printing the unique id in ppBracketedString 2018-12-20 10:54:04 +01:00
Krasimir Angelov
26dabeab9b save the original concrete category in BracketedString 2018-12-20 10:52:45 +01:00
odanoburu
f7c2fb8a7d (gf2ud) add comments to CoNLL-U output
when debbuging labels, I find it useful to have comments saying what's
the original sentence (lazy, I know) and the original tree (depending
on the treebank, the trees can be similar).

I know this is not the goal exactly, but UDv2 treebanks
(http://universaldependencies.org/format.html) should always have a
'text =' comment, and a 'sent_id =' comment (which would be easy to
implement too, but not that useful).
2018-12-19 12:13:31 -02:00
Aarne Ranta
54204d2d95 added the possibility to annotate features of syncat words, e.g. @"is" PresSg3 2018-12-18 18:44:02 +01:00
Aarne Ranta
9834b89a30 refactored cnc configfile parsing a bit 2018-12-18 18:30:40 +01:00
Aarne Ranta
77c0a8e100 Merge branch 'master' into master 2018-12-18 19:05:42 +02:00
Prasanth Kolachina
86233e9c28 morph. feat generation by AR 2018-12-18 16:53:35 +01:00
Aarne Ranta
40e7544a2b added morphological tags to UD tree output. Tags are give in CncConfiguration, e.g. @N Sg Pl. Default tag is Cat-offset, as defined for each Cat in pgf 2018-12-18 15:59:48 +01:00