Mirror of https://github.com/GrammaticalFramework/gf-rgl.git, synced 2026-05-27 08:58:55 -06:00
more in morphodict/README.md
@@ -27,7 +27,7 @@ They should be numbered as
Such distinctions are made in all cases where there are alternative inflections, even if there is no sense distinction:
- `learn_1_V` ("learn, learned, learned")
- `learn_2_V` ("learn, learnt, learnt")
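
As a minimal sketch, the two entries above might be written in GF source as follows. The module names `LearnSketchAbs` and `LearnSketchEng` are hypothetical, inheriting `Cat` and `CatEng` for the category definitions is an assumption of this sketch, and the three-argument `mkV` (infinitive, past, past participle) is taken from `ParadigmsEng`.

```
-- LearnSketchAbs.gf (hypothetical module name, for illustration only)
abstract LearnSketchAbs = Cat ** {
fun learn_1_V : V ;
fun learn_2_V : V ;
}

-- LearnSketchEng.gf (hypothetical module name, for illustration only)
concrete LearnSketchEng of LearnSketchAbs = CatEng ** open ParadigmsEng in {
lin learn_1_V = mkV "learn" "learned" "learned" ;  -- past forms in -ed
lin learn_2_V = mkV "learn" "learnt" "learnt" ;    -- past forms in -t
}
```

Each function then has exactly one inflection table, so the form *learnt* is produced only by `learn_2_V`.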

Hence,
- no `variants` should appear in the MorphoDict
@@ -38,6 +38,20 @@ The dictionary will also exclude *multiwords* consisting of several tokens.
Most of the time, even *compounds* written as single tokens should be excluded.
However, as the status of a compound is not always clear, and since they do not create spurious morphological analyses, they can be tolerated, in particular if extracted from legacy sources.

Since multiwords and compounds are excluded, `Paradigms` and `MakeStructural` should, for each language, provide API functions that make them easy to define, preferably of the form
```
mkC : Str -> C -> C
```
This form is not enough when a language needs separate functions for compounds formed by gluing and compounds formed by concatenation.
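
The difference between the two kinds of compound can be sketched in a small stand-alone resource module. Everything here is an assumption made for illustration: the module name `CompoundSketch`, the toy `Noun` type (not an RGL category), and the function names `glueN` and `concatN`.

```
-- CompoundSketch.gf (hypothetical module; Noun is a toy type, not an RGL category)
resource CompoundSketch = {
param Number = Sg | Pl ;
oper
  Noun : Type = {s : Number => Str} ;

  -- gluing compound: the modifier is attached to the head without a space
  glueN : Str -> Noun -> Noun = \modif, n ->
    {s = \\num => modif + (n.s ! num)} ;

  -- concatenation compound: the modifier is a separate token before the head
  concatN : Str -> Noun -> Noun = \modif, n ->
    {s = \\num => modif ++ (n.s ! num)} ;
}
```

Both functions have the shape `Str -> C -> C`, but they differ in the operator used: `+` glues the strings into one token, whereas `++` keeps them as two tokens. A language that uses both patterns would need both functions in its API.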

*Open question*: what to do with compound prepositions, which are common in e.g. English?
The above principles imply
```
according_to_Prep = mkPrep "according" to_Prep
```
defined *outside* `MorphoDictEng`, so that `mkPrep` comes from `ParadigmsEng` and `to_Prep` from `MorphoDictEng`.
This may seem to go against tradition, but it follows the general guidelines for morphological dictionaries.


## Relevant categories
@@ -85,6 +99,20 @@ However, if the word contains characters that are not legal in identifiers, the
If function names are formed by the API function `PGF.mkCId`, these conventions are automatically followed.


## Coding conventions

To enable easy visual and automatic inspection,
- write one entry per line, each prefixed by the `fun` or `lin` keyword
- sort the entries alphabetically
- use paradigms with enough arguments to make the characteristic forms explicit

To guarantee compatibility with the rest of the RGL and application grammars,
- paradigms used should be imported from `Paradigms` and `MakeStructural` rather than defined in `MorphoDict` itself
- importing *low-level modules* such as `Res` should be avoided
- `MorphoDict` should be self-contained, i.e. not inherit from other modules such as `Structural` or `Irreg`. It is, however, OK to `open` them in qualified mode for use in linearization definitions.
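
The conventions above can be sketched as a small abstract and concrete module pair. The module names `ConventionsSketchAbs` and `ConventionsSketchEng` are hypothetical, and inheriting only `Cat` and `CatEng` (for the category definitions) is an assumption of this sketch. The paradigms come from `ParadigmsEng`, while `IrregEng` and `StructuralEng` are opened in qualified mode rather than inherited.

```
-- ConventionsSketchAbs.gf (hypothetical module name)
abstract ConventionsSketchAbs = Cat ** {
fun go_V : V ;
fun mouse_N : N ;
fun to_Prep : Prep ;
}

-- ConventionsSketchEng.gf (hypothetical module name)
concrete ConventionsSketchEng of ConventionsSketchAbs = CatEng **
  open ParadigmsEng, (I = IrregEng), (S = StructuralEng) in {
lin go_V = I.go_V ;                 -- reused from IrregEng through the qualified open
lin mouse_N = mkN "mouse" "mice" ;  -- both characteristic forms given explicitly
lin to_Prep = S.to_Prep ;           -- reused from StructuralEng through the qualified open
}
```

With one `fun` or `lin` judgement per line and alphabetical order, the files can be checked and compared with ordinary line-based tools.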


## Bootstrapping with `MkMorphoDict`

THIS WAS AN EARLY EXPERIMENT, TO BE UPDATED
@@ -107,3 +135,12 @@ Swedish, using a dump of SALDO (not available in these sources)
runghc MkMorphodict.hs saldo/Saldo.pgf MorphoDictSwe
```

## Things to do

To support the construction of a `MorphoDict`, the following should be guaranteed in the RGL:
- in `Paradigms`, explicit smart paradigms with characteristic forms and inherent features for each category
- in `Paradigms`, API constants for all inherent features that are needed
- in `Paradigms`, compound-constructing functions for all categories that need them
- in `Extra`, the extra categories that one wants to include in that language