mirror of
https://github.com/GrammaticalFramework/gf-rgl.git
synced 2026-05-27 08:58:55 -06:00
morphodict instructions in README
33
src/morphodict/README
Normal file
@@ -0,0 +1,33 @@
MkMorphoDict: Extracting a minimal morphological dictionary from an existing GF dictionary.

Aarne Ranta 2020-03-02

principles:

Functions are 1-to-1 with lemgrams, i.e. inflection tables, thus
- no sense distinctions
- no subcategorizations
- no variants

Function name = baseform_category, with these exceptions:
- variant inflection tables are numbered: lie_1_V, lie_2_V
- words containing characters that are not allowed in identifiers are quoted: 'bird\'s-eye_A'
- words that start with a non-letter get the prefix W_: W_'tween_Adv
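
The naming convention above can be sketched in a few lines of Python. This is only one reading of the three exceptions, not the code in MkMorphodict.hs: the helper names (is_ident_char, function_name), the assumption that identifier characters are letters, digits, underscore and apostrophe, and the order in which the rules are applied are all hypothetical.

```python
def is_ident_char(c):
    # Assumption: identifiers may contain letters, digits, "_" and "'".
    return c.isalnum() or c in "_'"


def function_name(baseform, category, variant=None):
    """Build a name of the form baseform_category (hypothetical helper)."""
    parts = [baseform]
    if variant is not None:
        # Variant inflection tables are distinguished by a number.
        parts.append(str(variant))
    parts.append(category)
    name = "_".join(parts)

    if not all(is_ident_char(c) for c in name):
        # Non-identifier characters: quote the name, escaping backslashes
        # and single quotes.
        escaped = name.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    if not name[0].isalpha():
        # Names starting with a non-letter get the W_ prefix.
        return "W_" + name
    return name


print(function_name("lie", "V", variant=1))
print(function_name("bird's-eye", "A"))
print(function_name("'tween", "Adv"))
```

Under these assumptions the three calls reproduce the examples given in the list. The real script may treat other cases differently, for instance non-ASCII letters.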

Example run, English:

  gf -make ../english/DictEng.gf
  runghc MkMorphodict.hs DictEngAbs.pgf MorphoDictEng

Result: 64923 -> 56599 functions, of which 21679 could be compounds

Swedish, using a dump of SALDO (not available in these sources):

  cd saldo/
  runghc SaldoGF.hs
  # combine abs.tmp with Saldo.header to obtain Saldo.gf
  # combine cnc.tmp with SaldoSwe.header to obtain SaldoSwe.gf
  gf -make SaldoSwe.gf
  cd ..
  runghc MkMorphodict.hs saldo/Saldo.pgf MorphoDictSwe
