Merge pull request #2 from GrammaticalFramework/new-labs

Aarne Ranta
2025-03-28 13:15:32 +01:00
committed by GitHub
54 changed files with 184 additions and 2286 deletions


@@ -1,120 +1,64 @@
# Lab 1: Grammatical analysis
# Lab 1: Multilingual generation and translation
In this lab, you will implement the concrete syntax of a grammar for a language of your choice.
The abstract syntax is given in the directory [`grammar/abstract/`](grammar/abstract/) and an example concrete syntax for English can be found in [`grammar/english/`](grammar/english/).
This lab follows Chapters 1-4 in the course notes. Each part is started after the lecture on the corresponding chapter.
The assignments are submitted via Canvas.
## Part 1: setup and lexicon
1. Create a subfolder in [`grammar/`](grammar/) for your language of choice
2. Copy the contents of [`grammar/english/`](grammar/english/) to your new folder and apply the necessary renamings (i.e. replace all occurrences of `Eng` with the new language code)
3. Translate the words in the lexicon part of `MicroLangXxx`
4. Test your new concrete syntax by generating a few random trees in the GF interpreter. When you linearize them, you should see sentences in a mixture of English and your chosen language. To do this you can use the commands
- `i MicroLangXxx.gf` to [import](https://www.grammaticalframework.org/doc/gf-shell-reference.html#toc18) the grammar
- `gr | l` to [generate a random tree](https://www.grammaticalframework.org/doc/gf-shell-reference.html#toc15) and [linearize](https://www.grammaticalframework.org/doc/gf-shell-reference.html#toc19) it
## Chapter 1: explore the parallel UD treebank (PUD)
## Part 2: morphology
1. Design the morphological types of the major parts of speech (NOUN, ADJ, and VERB) in your selected language, i.e. identify their inflectional and inherent features using: a traditional grammar book or a Wikipedia article and/or data from [universaldependencies.org](https://universaldependencies.org/). In the latter case:
1. download a treebank for your language
2. use [deptreepy](https://github.com/aarneranta/deptreepy/) or [STUnD](https://harisont.github.io/STUnD/) to query the treebank and look up what morphological features actually occur in the data for each POS
2. Implement these in GF by defining parameters and writing a couple of paradigms. In this phase, you will work in the `MicroResXxx` module
3. Define the `lincat`s for `N`, `A`, `V` and `V2` in `MicroLangXxx`
4. Test your GF morphology. To do that, you can import the grammar with the `-retain` flag and use the [`compute_concrete`](https://www.grammaticalframework.org/doc/gf-shell-reference.html#toc8) command on the various lexical items. For example `cc star_N` returns the full inflectional table for the noun "star"
1. Go to [universaldependencies.org](https://universaldependencies.org/) and download Version 2.7+ treebanks
2. Look up the Parallel UD treebanks for those 21 languages that have it. They are named e.g. `UD_English-PUD/`
3. Select a language to compare with English.
4. Make statistics about the frequencies of POS tags and dependency labels in your language compared with English: find the top-20 tags/labels and their number of occurrences. What does this tell you about the language? (This can be done with shell or Python programming or, more easily, with the [deptreepy](https://github.com/aarneranta/deptreepy/) or [gf-ud](https://github.com/grammaticalFramework/gf-ud) tools. The latter is also available on the eduserv server.)
5. Convert the following four trees from CoNLL-U format to graphical trees by hand, on paper.
- a short English tree (5-10 words, of your choice) and its translation.
- a long English tree (>25 words) and its translation.
1. Draw word alignments for some non-trivial example in the PUD treebank, on paper.
Use the same trees as in the previous question.
What can you say about the syntactic differences between the languages?
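The frequency statistics asked for in step 4 can be sketched in a few lines of Python. This is only an illustration, not the way deptreepy or gf-ud compute them; the function name `tag_frequencies` and the inline two-token sample are made up for the example. It assumes standard CoNLL-U input, i.e. ten tab-separated fields per token line with the POS tag in field 4 and the dependency label in field 8.

```python
from collections import Counter

def tag_frequencies(conllu_text, top=20):
    """Count POS tags and dependency labels in CoNLL-U text.

    Returns two lists of (item, count) pairs, most frequent first.
    """
    pos_counts, label_counts = Counter(), Counter()
    for line in conllu_text.splitlines():
        # skip blank lines (sentence separators) and comment lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # skip malformed lines, multiword tokens (1-2) and empty nodes (1.1)
        if len(fields) != 10 or not fields[0].isdigit():
            continue
        pos_counts[fields[3]] += 1
        label_counts[fields[7]] += 1
    return pos_counts.most_common(top), label_counts.most_common(top)

# hypothetical sample; real input would be read from a PUD .conllu file
sample = (
    "# text = the world\n"
    "1\tthe\tthe\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tworld\tworld\tNOUN\t_\t_\t0\troot\t_\t_\n"
)
pos, labels = tag_frequencies(sample)
```

Running the same function on the English file and on the file for your language gives two top-20 lists that can be compared side by side.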
## Part 3: syntax
1. Define the linearization types of main phrasal categories - the remaining categories in `MicroLang`.
2. Define the rest of the linearization rules in `MicroLang`.
## Part 4: testing your grammar against the RGL
Since `MicroLang` is a proper part of the RGL, it can be easily implemented as an application grammar.
How to do this is shown in `grammar/functor/`, where the implementation consists of two files:
- `MicroLangFunctor.gf` which is a generic implementation working for all RGL languages,
- `MicroLangFunctorEng.gf` which is a *functor instantiation* for English, easily reproducible for other languages than `Eng`.
## Chapter 2: design the morphological types of the major parts of speech in your selected language
To use this for testing, you can take the following steps:
1. It is enough to cover NOUN, ADJ, and VERB.
2. Use a traditional grammar book or a Wikipedia article to identify the inflectional and inherent features.
3. Then use data from PUD to check which morphological features actually occur in the treebank for that language.
1. Build a functor instantiation for your language by copying `MicroLangFunctorEng.gf` and changing `Eng` in the file name and inside the file to your language code.
## Chapter 3: UD syntax analysis
2. Use GF to create a testfile by random generation:
```
$ echo "gr -number=1000 | l -tabtreebank" | gf english/MicroLangEng.gf functor/MicroLangFunctorEng.gf >test.tmp
```
In this lab, you will annotate a bilingual corpus with UD.
You can choose between starting with an English corpus and translating it into a language of your choice, or starting with a Swedish corpus and translating it into English.
3. Inspect the resulting file `test.tmp`.
But you can also use Unix `cut` to create separate files for the two versions of the grammar and `diff` to compare them:
```
$ cut -f2 test.tmp >test1.tmp
$ cut -f3 test.tmp >test2.tmp
$ diff test1.tmp test2.tmp
Your task is to:
52c52
< the hot fire teachs her
---
> the hot fire teaches her
69c69
< the man teachs the apples
---
> the man teaches the apples
122c122
```
As seen from the result in this case, our implementation has a wrong inflection of the verb "teach".
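The `cut` and `diff` comparison can also be sketched in Python. This is a minimal illustration under the assumption that each row of `test.tmp` is tab-separated, with your linearization in the second column and the functor's in the third (the columns selected by `cut -f2` and `cut -f3`); the function name `compare_columns` and the placeholder rows are invented for the example.

```python
def compare_columns(lines, left=1, right=2):
    """Return (line_number, left_text, right_text) for rows whose two columns differ."""
    mismatches = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(left, right):
            continue  # not a complete row, nothing to compare
        if fields[left] != fields[right]:
            mismatches.append((number, fields[left], fields[right]))
    return mismatches

# placeholder rows; real input would come from open("test.tmp")
rows = [
    "tree1\tthe man walks\tthe man walks",
    "tree2\tthe hot fire teachs her\tthe hot fire teaches her",
]
for number, ours, reference in compare_columns(rows):
    print(number, ours, "=>", reference)
```

Unlike `diff`, this keeps both versions of a differing sentence on one output line, which can be easier to scan when there are many mismatches.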
1. write a CoNLL file analysing your chosen corpus
2. translate it
3. write a CoNLL file analysing your translation
The Mini grammar can be tested in the same way, by building a reference implementation using the functor in `functor/`.
### Option 1: English data
The English text is given in the file [`comp-syntax-corpus-english.txt`](comp-syntax-corpus-english.txt) in this directory.
The corpus is a combination of different sources, including the Parallel UD treebank (PUD).
If you want to cheat - or just check your own answer - you can look for those sentences in the official PUD. You can also compare your analyses with those of an automatic parser, such as [UDPipe](https://lindat.mff.cuni.cz/services/udpipe/), which you can try directly from your browser. These automatic analyses must of course be taken with a grain of salt.
---
### Option 2: Swedish data
The Swedish text is given in the file [`comp-syntax-corpus-swedish.txt`](comp-syntax-corpus-swedish.txt) in this directory.
It consists of teacher-corrected sentences from the [Swedish Learner Language (SweLL) corpus](https://spraakbanken.gu.se/en/resources/swell-gold)[^1], which is currently being annotated in UD for the first time.
In this case, there is no "gold standard" to check your answers against, but by choosing this corpus you will directly contribute to an ongoing annotation effort.
Of course, you can still compare your solutions with [UDPipe](https://lindat.mff.cuni.cz/services/udpipe/)'s automatic analyses.
In both corpora, the first few sentences are POS-tagged, with each word having the form
`word:<POS>`
Hint: you can initialize the task by converting each word or word:<POS> to a simplified CoNLL line with a dummy head (0) and label (dep), with proper position number of course.
The UD annotation that you produce manually can be simplified CoNLL, with just the fields
`position word postag head label`
Make sure that each field is exactly one token, so that the whole line has exactly 5 tokens.
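The initialization described in the hint could look roughly like the following Python sketch. The function name `init_simplified_conll` is hypothetical, and two things are assumptions: that tokens are separated by whitespace, and that a tag may be written either as `word:<POS>` or as `word:POS`. Untagged words get `_` as a placeholder tag to be filled in by hand.

```python
import re

def init_simplified_conll(sentence):
    """Turn a (partly) POS-tagged sentence into simplified CoNLL lines
    with a dummy head (0) and a dummy label (dep)."""
    lines = []
    for position, token in enumerate(sentence.split(), start=1):
        # accept both word:<POS> and word:POS (an assumption about the corpus format)
        match = re.fullmatch(r"(.+):<?([A-Z]+)>?", token)
        if match:
            word, postag = match.groups()
        else:
            word, postag = token, "_"  # placeholder, to be replaced manually
        lines.append(f"{position} {word} {postag} 0 dep")
    return lines

for line in init_simplified_conll("the:<DET> world:<NOUN> turns"):
    print(line)
```

Each output line has exactly five tokens, so the only manual work left is to correct the heads and labels (and any placeholder tags).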
This input can be automatically expanded to full CoNLL by adding underscores for the lemma, morphology, and other missing fields, as well as tabs between the fields (if you didn't use tabs already).
`position word _ postag _ _ head label _ _`
Example:
`7 world NOUN 4 nmod`
expands to
`7 world _ NOUN _ _ 4 nmod _ _`
(Unfortunately, the tabs are not visible in the md output.)
The conversion to full CoNLL can be done using Python or `gf-ud reduced2conll` (available on eduserv) or with [this script](https://gist.github.com/harisont/612a87d20f729aa3411041f873367fa2).
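For the Python route, a minimal sketch of the expansion is shown below. It is not the linked script nor what `gf-ud reduced2conll` does internally; the function name `expand_to_conll` is made up, and the only assumption is the five-field input format described above.

```python
def expand_to_conll(line):
    """Expand 'position word postag head label' to a 10-field, tab-separated CoNLL line."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    position, word, postag, head, label = fields
    # lemma, language-specific tag, features, enhanced deps and misc are left as "_"
    return "\t".join([position, word, "_", postag, "_", "_", head, label, "_", "_"])

print(expand_to_conll("7 world NOUN 4 nmod"))
```

The explicit field count check catches lines where a word or label accidentally contains a space, which would otherwise silently shift the columns.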
Once you have full CoNLL, you can use [deptreepy](https://github.com/aarneranta/deptreepy/), [gf-ud](https://github.com/grammaticalFramework/gf-ud) or [the online CoNLL-U viewer](https://universaldependencies.org/conllu_viewer.html) to visualize it.
With deptreepy, you will need to issue the command
`cat my-file.conllu | python deptreepy.py visualize_conllu > my-file.html`
which creates an HTML file you can open in your web browser.
If you use the gf-ud tool, the command is
`cat my-file.conllu | ./gf-ud conll2pdf`
which generates a PDF. However, this does not support all foreign characters.
(It is possible that you won't be able to visualize the trees directly on eduserv.
Building gf-ud and running this command on your machine requires Haskell and the GF libraries, as well as LaTeX to show the pdf output.)
## (Chapter 4: phrase structure analysis)
> __NOTE:__ chapter 4 is __not__ required in the 2024 edition of the course.
> You are of course welcome to try out these exercises, but they will not be graded.
### Prerequisites: get `gf-ud` to work
There are multiple ways to use `gf-ud`:
- using the version that is installed on eduserv
- installing a pre-compiled executable, available for Mac and Ubuntu machines at http://www.grammaticalframework.org/~aarne/software/
- compiling the source code, available at https://github.com/GrammaticalFramework/gf-ud. `gf-ud` can be built:
- with `make` provided that you have the GHC Haskell compiler and the gf-core libraries (available at https://github.com/GrammaticalFramework/gf-core) installed
- with the Haskell Stack tool, by running `stack install`. This will install all the necessary dependencies automatically.
### Tasks
1. Construct (by hand) phrase structure trees for some of the sentences in the corpus used in Chapter 3, both for English and your chosen language.
2. Test the grammar at
https://github.com/GrammaticalFramework/gf-ud/blob/master/grammars/English.dbnf
on last week's corpus, both for English and your own language.
In practice, this means:
- running `gf-ud`'s `dbnf` command on (possibly POS-tagged) versions of the sentences in Chapter 3's corpus.
- comparing the CoNLL-U and parse trees obtained in this way with, respectively, your hand-drawn parse trees and the CoNLL-U trees from Chapter 3. Parse tree comparison can be qualitative, while CoNLL-U trees are to be compared quantitatively via `gf-ud eval`.
3. Modify the grammar to suit your language and test it on some of the UD treebanks by using `gf-ud eval`. Try to obtain a `udScore` above 0.60. You are welcome to explain the changes you make.
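To get a feeling for what a quantitative comparison of dependency trees measures, here is a small Python sketch of the standard unlabelled and labelled attachment scores (UAS and LAS). It is not a reimplementation of `gf-ud eval`, and the `udScore` reported by that tool may be defined differently; the function name and the toy trees are invented for the example.

```python
def attachment_scores(gold, predicted):
    """Compute (UAS, LAS) for two analyses of the same sentence.

    Each analysis is a list of (head, label) pairs, one per token.
    """
    if len(gold) != len(predicted):
        raise ValueError("trees must have the same number of tokens")
    if not gold:
        return 0.0, 0.0
    # UAS: share of tokens with the correct head
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAS: share of tokens with the correct head and the correct label
    las_hits = sum(g == p for g, p in zip(gold, predicted))
    return uas_hits / len(gold), las_hits / len(gold)

# toy four-token sentence: one wrong label, one wrong head
gold = [(2, "det"), (0, "root"), (2, "obj"), (2, "advmod")]
pred = [(2, "det"), (0, "root"), (2, "nsubj"), (3, "advmod")]
uas, las = attachment_scores(gold, pred)
print(uas, las)
```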
[^1]: to be precise, the sentences you will use have been extracted from [DaLAJ-GED-SuperLim 2.0](https://spraakbanken.gu.se/en/resources/dalaj-ged-superlim), a publicly available spinoff of the main SweLL corpus.
Submit `MicroLangXxx.gf` and `MicroResXxx.gf` on Canvas.


@@ -1,63 +0,0 @@
# UDPipe: quick instructions
## Download and install
The simplest way to use UDPipe is to install the binaries for UDPipe-1, which exist for several operating systems.
They can be downloaded from
https://github.com/ufal/udpipe/releases/download/v1.3.0/udpipe-1.3.0-bin.zip
When you have downloaded and unzipped this file, you will find the binary for your system in a subdirectory, for instance,
```
udpipe-1.3.0-bin/bin-macos/udpipe
```
is the binary for MacOS.
If you include this directory on your PATH, you can run the command `udpipe` from anywhere.
Running the parser for a language requires a model for that language.
Models can be accessed via
https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-3131
This page includes a long list of models and a command to install them all.
If you need only some of them, you can do, for instance,
```
$ wget https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11234/1-3131//english-lines-ud-2.5-191206.udpipe
```
## Running the parser
Assuming that you have the binary `udpipe` and the model `english-lines-ud-2.5-191206.udpipe` on your path, you can parse a single sentence with
```
$ echo "my hovercraft is full of eels" | udpipe --tokenize --tag --parse english-lines-ud-2.5-191206.udpipe
```
The result is a UD tree in the CoNLL-U notation,
```
# newdoc
# newpar
# sent_id = 1
# text = my hovercraft is full of eels
1 my I PRON P1SG-GEN Number=Sing|Person=1|Poss=Yes|PronType=Prs 2 nmod:poss _ _
2 hovercraft hovercraft NOUN SG-NOM Number=Sing 4 nsubj _ _
3 is be AUX PRES Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 4 cop _ _
4 full full ADJ POS Degree=Pos 0 root _ _
5 of of ADP _ _ 6 case _ _
6 eels eel NOUN PL-NOM Number=Plur 4 nmod _ SpacesAfter=\n
```
If you also have `gf-ud` and `pdflatex` on your path, you can pipe the result into `gf-ud conll2pdf` to see the graphical tree.
As `udpipe` reads standard input, you can also feed it the contents of a file, such as `lecture3-examples.txt`:
```
$ cat <myfile> | udpipe --tokenize --tag --parse <model>
```
Notice that sentences in that file must either end with a period or be separated by empty lines, because otherwise the whole file is parsed as one sentence.
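If your input file has one sentence per line without final periods, it can be brought into the required shape with a short Python helper. This is only a sketch: the function name `separate_sentences` is hypothetical, and it assumes that every non-empty line of the input is exactly one sentence.

```python
def separate_sentences(text):
    """Return the text with one sentence per paragraph,
    i.e. non-empty lines separated by empty lines."""
    sentences = [line.strip() for line in text.splitlines() if line.strip()]
    return "\n\n".join(sentences) + "\n"

print(separate_sentences("my hovercraft is full of eels\nthe man walks\n"), end="")
```

The result can then be written to a new file and piped into `udpipe` as shown above.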
## Training a new model
If you have a treebank in the CoNLL-U format, you can use it for training a new model, with
```
$ cat <myfile>.conllu | udpipe --tokenizer none --tagger none --train <myfile>.udpipe
```


@@ -1,87 +0,0 @@
# Lab 2: Multilingual generation and translation
This lab corresponds to Chapters 5 to 9 of the Notes, but follows them only loosely.
Therefore we will structure it according to the exercise sessions
rather than chapters.
The abstract syntax is given in the subdirectory grammars/abstract/
## After lecture 6
1. Design a morphology for the main lexical types (N, A, V) with parameters and a couple of paradigms.
2. Test it by implementing the lexicon in the MicroLang module. You need to define lincat N,A,V,V2 as well as the paradigms in MicroResource.
*To deliver*: the lexicon part of files MicroGrammarX.gf and MicroResourceX.gf for your language of choice X. Follow the structure of MicroGrammarEng and MicroResourceEng when preparing these.
## After lecture 7
1. Define the linearization types of main phrasal categories - the remaining categories in MicroLang.
2. Define the rest of the linearization rules in MicroLang.
*To deliver*: MicroLangX and MicroResourceX for your language of choice, with the lexicon part from Session 5 completed with the syntax part.
## After lecture 8
1. Try out the applications in `../python` and read its README carefully.
2. Add a concrete syntax for your language to one of the grammars
in `../python/`, either `Query` or `Draw`.
The simplest way to do this
is first to copy the `Eng` grammar and then to change the words; the
syntax may work well as it is. Even if the result is a bit unnatural in places,
it should still be natural in a broad sense.
3. Compile the grammar with `gf -make Query???.gf` so that your grammar
gets included (the same for `Draw`).
4. Generate phrases in GF by first importing your pgf file and then
issuing the command `gt | l -treebank`; fix your grammar if it looks
too bad.
5. Test the corresponding Python application with your language.
The Python code with embedded GF grammars will be explained in greater
detail in Lecture 9.
*To deliver*: your grammar module.
*Deadline*: 29 May 2024. Demo your grammars (both Micro and this one) at
the last lecture of the course!
## A method for testing your Micro grammar
Since MicroLang is a proper part of the RGL, it can be easily implemented as an application grammar.
How to do this is shown in `grammar/functor/`, where the implementation consists of two files:
- `MicroLangFunctor.gf` which is a generic implementation working for all RGL languages,
- `MicroLangFunctorEng.gf` which is a *functor instantiation* for English, easily reproducible for other languages than `Eng`.
To use this for testing, you can take the following steps:
1. Build a functor instantiation for your language by copying `MicroLangFunctorEng.gf` and changing `Eng` in the file name and inside the file to your language code.
2. Use GF to create a testfile by random generation:
```
$ echo "gr -number=1000 | l -tabtreebank" | gf english/MicroLangEng.gf functor/MicroLangFunctorEng.gf >test.tmp
```
3. Inspect the resulting file `test.tmp`.
But you can also use Unix `cut` to create separate files for the two versions of the grammar and `diff` to compare them:
```
$ cut -f2 test.tmp >test1.tmp
$ cut -f3 test.tmp >test2.tmp
$ diff test1.tmp test2.tmp
52c52
< the hot fire teachs her
---
> the hot fire teaches her
69c69
< the man teachs the apples
---
> the man teaches the apples
122c122
```
As seen from the result in this case, our implementation has a wrong inflection of the verb "teach".
The Mini grammar can be tested in the same way, by building a reference implementation using the functor in `functor/`.
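The wrong form "teachs" found above comes from a paradigm that always adds a plain `-s`. The regular English rule that the paradigm needs to encode can be stated as the following Python sketch. It only illustrates the rule: it is not the RGL's implementation, the function name is invented, and the actual fix belongs in the GF resource module of your grammar. Irregular verbs such as "go" or "have", and spelling changes such as "quiz" to "quizzes", are deliberately left out.

```python
def third_person_singular(verb):
    """Regular English present tense, third person singular."""
    if verb.endswith(("s", "x", "z", "ch", "sh")):
        return verb + "es"        # teach -> teaches, kiss -> kisses
    if len(verb) > 1 and verb.endswith("y") and verb[-2] not in "aeiou":
        return verb[:-1] + "ies"  # fly -> flies, but play -> plays
    return verb + "s"             # walk -> walks

for verb in ["teach", "walk", "fly", "play"]:
    print(verb, "->", third_person_singular(verb))
```

In GF, the same case distinction is typically written as pattern matching on the end of the string inside the verb paradigm.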


@@ -1,16 +0,0 @@
PredVPS nsubj head
ComplV2 head obj
AdvVP head advmod
DetCN det head
AdjCN amod head
PrepNP case head
Det DET
Prep ADP
V VERB
V2 VERB
A ADJ
N NOUN
Pron PRON
Adv ADV


@@ -1,121 +0,0 @@
abstract MiniGrammar = {
-- collected from GF/lib/src/abstract/*.gf
-- the functions marked ---s are shortcuts
-- the leading comments, e.g. "-- Common", indicate the standard RGL module
cat
-- Common
Utt ; -- sentence, question, word... e.g. "be quiet"
Pol ; -- polarity e.g. positive, negative
Temp ; -- temporal features e.g. present, anterior
-- Cat
Imp ; -- imperative e.g. "walk", "don't walk"
S ; -- declarative sentence e.g. "she lives here"
QS ; -- question sentence e.g. "does she live here"
Cl ; -- declarative clause, with all tenses e.g. "she looks at this"
QCl ; -- question clause e.g. "does she look at this"
VP ; -- verb phrase e.g. "lives here"
Comp ; -- complement of copula e.g. "in trouble"
AP ; -- adjectival phrase e.g. "very warm"
CN ; -- common noun (without determiner) e.g. "red house"
NP ; -- noun phrase (subject or object) e.g. "the red house"
IP ; -- interrogative phrase e.g. "who"
Pron ; -- personal pronoun e.g. "she"
Det ; -- determiner phrase e.g. "those"
Conj ; -- conjunction e.g. "and"
Prep ; -- preposition, or just case e.g. "in", dative
V ; -- one-place verb e.g. "sleep"
V2 ; -- two-place verb e.g. "love"
VS ; -- sentence-complement verb e.g. "know"
VV ; -- verb-phrase-complement verb e.g. "want"
A ; -- one-place adjective e.g. "warm"
N ; -- common noun e.g. "house"
PN ; -- proper name e.g. "Paris"
Adv ; -- adverbial phrase e.g. "in the house"
IAdv ; -- interrogative adverbial e.g. "where"
fun
-- Phrase
UttS : S -> Utt ; -- John walks
UttQS : QS -> Utt ; -- does John walk
UttNP : NP -> Utt ; -- John
UttAdv : Adv -> Utt ; -- in the house
UttIAdv : IAdv -> Utt ; -- why
UttImpSg : Pol -> Imp -> Utt ; -- (do not) walk
-- Sentence
UseCl : Temp -> Pol -> Cl -> S ; -- John has not walked
UseQCl : Temp -> Pol -> QCl -> QS ; -- has John walked
PredVP : NP -> VP -> Cl ; -- John walks / John does not walk
QuestCl : Cl -> QCl ; -- does John (not) walk
QuestVP : IP -> VP -> QCl ; -- who does (not) walk
ImpVP : VP -> Imp ; -- walk / do not walk
-- Verb
UseV : V -> VP ; -- sleep
ComplV2 : V2 -> NP -> VP ; -- love it ---s
ComplVS : VS -> S -> VP ; -- know that it is good
ComplVV : VV -> VP -> VP ; -- want to be good
UseComp : Comp -> VP ; -- be small
CompAP : AP -> Comp ; -- small
CompNP : NP -> Comp ; -- a man
CompAdv : Adv -> Comp ; -- in the house
AdvVP : VP -> Adv -> VP ; -- sleep here
-- Noun
DetCN : Det -> CN -> NP ; -- the man
UsePN : PN -> NP ; -- John
UsePron : Pron -> NP ; -- he
MassNP : CN -> NP ; -- milk
a_Det : Det ; -- indefinite singular ---s
aPl_Det : Det ; -- indefinite plural ---s
the_Det : Det ; -- definite singular ---s
thePl_Det : Det ; -- definite plural ---s
UseN : N -> CN ; -- house
AdjCN : AP -> CN -> CN ; -- big house
-- Adjective
PositA : A -> AP ; -- warm
-- Adverb
PrepNP : Prep -> NP -> Adv ; -- in the house
-- Conjunction
CoordS : Conj -> S -> S -> S ; -- he walks and she runs ---s
-- Tense
PPos : Pol ; -- I sleep [positive polarity]
PNeg : Pol ; -- I do not sleep [negative polarity]
TSim : Temp ; -- simultaneous: she sleeps ---s
TAnt : Temp ; -- anterior: she has slept ---s
-- Structural
and_Conj : Conj ;
or_Conj : Conj ;
every_Det : Det ;
in_Prep : Prep ;
on_Prep : Prep ;
with_Prep : Prep ;
i_Pron : Pron ;
youSg_Pron : Pron ;
he_Pron : Pron ;
she_Pron : Pron ;
we_Pron : Pron ;
youPl_Pron : Pron ;
they_Pron : Pron ;
whoSg_IP : IP ;
where_IAdv : IAdv ;
why_IAdv : IAdv ;
have_V2 : V2 ;
want_VV : VV ;
}


@@ -1,8 +0,0 @@
abstract MiniLang =
MiniGrammar,
MiniLexicon
** {
flags startcat = Utt ;
}


@@ -1,27 +0,0 @@
AdjCN amod head
AdvVP head advmod
ComplV2 head obj
ComplVS head ccomp
ComplVV head xcomp
CoordS cc head conj
DetCN det head
PredVP nsubj head
PrepNP case head
QuestVP nsubj head
UseCl empty empty head
UseQCl empty empty head
UttImpSg empty head
A ADJ
Adv ADV
Conj CONJ
Det DET
IAdv ADV
N NOUN
PN PROPN
Prep ADP
Pron PRON
V VERB
V2 VERB
VV VERB
VS VERB


@@ -1,92 +0,0 @@
abstract MiniLexicon = MiniGrammar ** {
fun
already_Adv : Adv ;
animal_N : N ;
apple_N : N ;
baby_N : N ;
bad_A : A ;
beer_N : N ;
big_A : A ;
bike_N : N ;
bird_N : N ;
black_A : A ;
blood_N : N ;
blue_A : A ;
boat_N : N ;
book_N : N ;
boy_N : N ;
bread_N : N ;
break_V2 : V2 ;
buy_V2 : V2 ;
car_N : N ;
cat_N : N ;
child_N : N ;
city_N : N ;
clean_A : A ;
clever_A : A ;
cloud_N : N ;
cold_A : A ;
come_V : V ;
computer_N : N ;
cow_N : N ;
dirty_A : A ;
dog_N : N ;
drink_V2 : V2 ;
eat_V2 : V2 ;
find_V2 : V2 ;
fire_N : N ;
fish_N : N ;
flower_N : N ;
friend_N : N ;
girl_N : N ;
good_A : A ;
go_V : V ;
grammar_N : N ;
green_A : A ;
heavy_A : A ;
horse_N : N ;
hot_A : A ;
house_N : N ;
john_PN : PN ;
jump_V : V ;
kill_V2 : V2 ;
know_VS : VS ;
language_N : N ;
live_V : V ;
love_V2 : V2 ;
man_N : N ;
milk_N : N ;
music_N : N ;
new_A : A ;
now_Adv : Adv ;
old_A : A ;
paris_PN : PN ;
play_V : V ;
read_V2 : V2 ;
ready_A : A ;
red_A : A ;
river_N : N ;
run_V : V ;
sea_N : N ;
see_V2 : V2 ;
ship_N : N ;
sleep_V : V ;
small_A : A ;
star_N : N ;
swim_V : V ;
teach_V2 : V2 ;
train_N : N ;
travel_V : V ;
tree_N : N ;
understand_V2 : V2 ;
wait_V2 : V2 ;
walk_V : V ;
warm_A : A ;
water_N : N ;
white_A : A ;
wine_N : N ;
woman_N : N ;
yellow_A : A ;
young_A : A ;
}


@@ -1,148 +0,0 @@
incomplete resource MiniSyntax =
open MiniGrammar
in {
oper
mkUtt = overload {
mkUtt : S -> Utt
= UttS ;
mkUtt : QS -> Utt
= UttQS ;
mkUtt : NP -> Utt
= UttNP ;
mkUtt : Adv -> Utt
= UttAdv ;
mkUtt : Pol -> Imp -> Utt
= UttImpSg ;
mkUtt : Imp -> Utt
= UttImpSg PPos
} ;
mkImp = overload {
mkImp : VP -> Imp
= ImpVP ;
} ;
mkS = overload {
mkS : Temp -> Pol -> Cl -> S
= UseCl ;
mkS : Pol -> Cl -> S
= UseCl TSim ;
mkS : Temp -> Cl -> S
= \t -> UseCl t PPos ;
mkS : Cl -> S
= UseCl TSim PPos ;
mkS : Conj -> S -> S -> S
= CoordS ;
} ;
mkQS = overload {
mkQS : Temp -> Pol -> QCl -> QS
= UseQCl ;
mkQS : Pol -> QCl -> QS
= UseQCl TSim ;
mkQS : Temp -> QCl -> QS
= \t -> UseQCl t PPos ;
mkQS : QCl -> QS
= UseQCl TSim PPos ;
} ;
positivePol : Pol
= PPos ;
negativePol : Pol
= PNeg ;
simultaneousAnt : Temp
= TSim ;
anteriorAnt : Temp
= TAnt ;
mkCl = overload {
mkCl : NP -> VP -> Cl
= PredVP ;
mkCl : NP -> V -> Cl
= \np,v -> PredVP np (UseV v) ;
mkCl : NP -> V2 -> NP -> Cl
= \np,v,obj -> PredVP np (ComplV2 v obj) ;
mkCl : NP -> VS -> S -> Cl
= \np,v,obj -> PredVP np (ComplVS v obj) ;
mkCl : NP -> VV -> VP -> Cl
= \np,v,obj -> PredVP np (ComplVV v obj) ;
mkCl : NP -> AP -> Cl
= \np,ap -> PredVP np (UseComp (CompAP ap)) ;
mkCl : NP -> A -> Cl
= \np,a -> PredVP np (UseComp (CompAP (PositA a))) ;
} ;
mkQCl = overload {
mkQCl : Cl -> QCl
= QuestCl ;
mkQCl : IP -> VP -> QCl
= QuestVP ;
} ;
mkVP = overload {
mkVP : V -> VP
= UseV ;
mkVP : V2 -> NP -> VP
= ComplV2 ;
mkVP : AP -> VP
= \ap -> UseComp (CompAP ap) ;
mkVP : A -> VP
= \a -> UseComp (CompAP (PositA a)) ;
mkVP : NP -> VP
= \np -> UseComp (CompNP np) ;
mkVP : Adv -> VP
= \adv -> UseComp (CompAdv adv) ;
mkVP : VP -> Adv -> VP
= AdvVP ;
} ;
mkNP = overload {
mkNP : Det -> CN -> NP
= DetCN ;
mkNP : Det -> N -> NP
= \det,n -> DetCN det (UseN n) ;
mkNP : Pron -> NP
= UsePron ;
mkNP : PN -> NP
= UsePN ;
mkNP : CN -> NP
= MassNP ;
mkNP : N -> NP
= \n -> MassNP (UseN n) ;
} ;
i_NP : NP
= UsePron i_Pron ;
you_NP : NP
= UsePron youSg_Pron ;
he_NP : NP
= UsePron he_Pron ;
she_NP : NP
= UsePron she_Pron ;
mkCN = overload {
mkCN : N -> CN
= UseN ;
mkCN : AP -> CN -> CN
= AdjCN ;
mkCN : A -> N -> CN
= \a,n -> AdjCN (PositA a) (UseN n) ;
mkCN : A -> CN -> CN
= \a,cn -> AdjCN (PositA a) cn ;
} ;
mkAP = overload {
mkAP : A -> AP
= PositA ;
} ;
mkAdv = overload {
mkAdv : Prep -> NP -> Adv
= PrepNP ;
} ;
}


@@ -1,6 +0,0 @@
--# -path=.:../english:../abstract
resource MiniSyntaxEng =
MiniGrammarEng ** --- inheriting everything from Grammar, not just Cat and Structural
MiniSyntax with
(MiniGrammar=MiniGrammarEng) ;


@@ -1,92 +0,0 @@
abstract Doctor = {
flags startcat = Phrase ;
cat
Phrase ; -- has she slept?
Fact ; -- she sleeps
Action ; -- sleep
Property ; -- be a doctor
Profession ; -- doctor
Person ; -- she
Place ; -- the hospital
Substance ; -- drugs
Illness ; -- fever
fun
presPosPhrase : Fact -> Phrase ; -- she sleeps
presNegPhrase : Fact -> Phrase ; -- she doesn't sleep
pastPosPhrase : Fact -> Phrase ; -- she has slept
pastNegPhrase : Fact -> Phrase ; -- she has not slept
presQuestionPhrase : Fact -> Phrase ; -- does she sleep
pastQuestionPhrase : Fact -> Phrase ; -- has she slept
impPosPhrase : Action -> Phrase ; -- eat
impNegPhrase : Action -> Phrase ; -- don't eat
actionFact : Person -> Action -> Fact ; -- she vaccinates you
propertyFact : Person -> Property -> Fact ; -- she is a doctor
isProfessionProperty : Profession -> Property ; -- be a doctor
isAtPlaceProperty : Place -> Property ; -- be at the hospital
haveIllnessProperty : Illness -> Property ; -- have a fever
needProfessionProperty : Profession -> Property ; -- need a doctor
theProfessionPerson : Profession -> Person ; -- the doctor
iMascPerson : Person ;
iFemPerson : Person ;
youMascPerson : Person ;
youFemPerson : Person ;
hePerson : Person ;
shePerson : Person ;
goToAction : Place -> Action ; -- go to the hospital
stayAtAction : Place -> Action ; -- stay at home
vaccinateAction : Person -> Action ; -- vaccinate you
examineAction : Person -> Action ; -- examine you
takeSubstanceAction : Substance -> Action ; -- take drugs
coughAction : Action ;
breatheAction : Action ;
vomitAction : Action ;
sleepAction : Action ;
undressAction : Action ;
dressAction : Action ;
eatAction : Action ;
drinkAction : Action ;
smokeAction : Action ;
measureTemperatureAction : Action ;
measureBloodPressureAction : Action ;
hospitalPlace : Place ;
homePlace : Place ;
schoolPlace : Place ;
workPlace : Place ;
doctorProfession : Profession ;
nurseProfession : Profession ;
interpreterProfession : Profession ;
bePregnantProperty : Property ;
beIllProperty : Property ;
beWellProperty : Property ;
beDeadProperty : Property ;
haveAllergiesProperty : Property ;
havePainsProperty : Property ;
haveChildrenProperty : Property ;
feverIllness : Illness ;
fluIllness : Illness ;
headacheIllness : Illness ;
diarrheaIllness : Illness ;
heartDiseaseIllness : Illness ;
lungDiseaseIllness : Illness ;
hypertensionIllness : Illness ;
alcoholSubstance : Substance ;
medicineSubstance : Substance ;
drugsSubstance : Substance ;
}


@@ -1,110 +0,0 @@
concrete DoctorEng of Doctor =
open
SyntaxEng,
ParadigmsEng,
Prelude
in {
-- application using standard RGL
lincat
Phrase = Utt ;
Fact = Cl ;
Action = VP ;
Property = VP ;
Profession = CN ;
Person = NP ;
Place = {at,to : Adv} ;
Substance = NP ;
Illness = NP ;
lin
presPosPhrase fact = mkUtt (mkS fact) ;
presNegPhrase fact = mkUtt (mkS negativePol fact) ;
pastPosPhrase fact = mkUtt (mkS anteriorAnt fact) ;
pastNegPhrase fact = mkUtt (mkS anteriorAnt negativePol fact) ;
presQuestionPhrase fact = mkUtt (mkQS (mkQCl fact)) ;
pastQuestionPhrase fact = mkUtt (mkQS anteriorAnt (mkQCl fact)) ;
impPosPhrase action = mkUtt (mkImp action) ;
impNegPhrase action = mkUtt negativePol (mkImp action) ;
actionFact person action = mkCl person action ;
propertyFact person property = mkCl person property ;
isProfessionProperty profession = mkVP (mkNP a_Det profession) ;
needProfessionProperty profession = mkVP need_V2 (mkNP a_Det profession) ;
isAtPlaceProperty place = mkVP place.at ;
haveIllnessProperty illness = mkVP have_V2 illness ;
theProfessionPerson profession = mkNP the_Det profession ;
iMascPerson = i_NP ;
iFemPerson = i_NP ;
youMascPerson = you_NP ;
youFemPerson = you_NP ;
hePerson = he_NP ;
shePerson = she_NP ;
goToAction place = mkVP (mkVP go_V) place.to ;
stayAtAction place = mkVP (mkVP stay_V) place.at ;
vaccinateAction person = mkVP vaccinate_V2 person ;
examineAction person = mkVP examine_V2 person ;
takeSubstanceAction substance = mkVP take_V2 substance ;
-- end of what could be a functor
--------------------------------
coughAction = mkVP (mkV "cough") ;
breatheAction = mkVP (mkV "breathe") ;
vomitAction = mkVP (mkV "vomit") ;
sleepAction = mkVP (mkV "sleep" "slept" "slept") ;
undressAction = mkVP (mkVP take_V2 (mkNP thePl_Det (mkN "clothe"))) (pAdv "off") ;
dressAction = mkVP (mkVP put_V2 (mkNP thePl_Det (mkN "clothe"))) (pAdv "on") ;
eatAction = mkVP (mkV "eat" "ate" "eaten") ;
drinkAction = mkVP (mkV "drink" "drank" "drunk") ;
smokeAction = mkVP (mkV "smoke") ;
measureTemperatureAction = mkVP (mkV2 (mkV "measure")) (mkNP the_Det (mkN "body temperature")) ;
measureBloodPressureAction = mkVP (mkV2 (mkV "measure")) (mkNP the_Det (mkN "blood pressure")) ;
hospitalPlace = {at = pAdv "at the hospital" ; to = pAdv "to the hospital"} ;
homePlace = {at = pAdv "at home" ; to = pAdv "home"} ;
schoolPlace = {at = pAdv "at school" ; to = pAdv "to school"} ;
workPlace = {at = pAdv "at work" ; to = pAdv "to work"} ;
doctorProfession = mkCN (mkN "doctor") ;
nurseProfession = mkCN (mkN "nurse") ;
interpreterProfession = mkCN (mkN "interpreter") ;
bePregnantProperty = mkVP (mkA "pregnant") ;
beIllProperty = mkVP (mkA "ill") ;
beWellProperty = mkVP (mkA "well") ;
beDeadProperty = mkVP (mkA "dead") ;
haveAllergiesProperty = mkVP have_V2 (mkNP aPl_Det (mkN "allergy")) ;
havePainsProperty = mkVP have_V2 (mkNP aPl_Det (mkN "pain")) ;
haveChildrenProperty = mkVP have_V2 (mkNP aPl_Det (mkN "child" "children")) ;
feverIllness = mkNP a_Det (mkN "fever") ;
fluIllness = mkNP a_Det (mkN "flu") ;
headacheIllness = mkNP a_Det (mkN "headache") ;
diarrheaIllness = mkNP a_Det (mkN "diarrhea") ;
heartDiseaseIllness = mkNP a_Det (mkN "heart disease") ;
lungDiseaseIllness = mkNP a_Det (mkN "lung disease") ;
hypertensionIllness = mkNP (mkN "hypertension") ;
alcoholSubstance = mkNP (mkN "alcohol") ;
medicineSubstance = mkNP a_Det (mkN "drug") ;
drugsSubstance = mkNP aPl_Det (mkN "drug") ;
oper
pAdv : Str -> Adv = ParadigmsEng.mkAdv ;
go_V = mkV "go" "went" "gone" ;
stay_V = mkV "stay" ;
need_V2 = mkV2 (mkV "need") ;
take_V2 = mkV2 (mkV "take" "took" "taken") ;
put_V2 = mkV2 (mkV "put" "put" "put") ;
vaccinate_V2 = mkV2 (mkV "vaccinate") ;
examine_V2 = mkV2 (mkV "examine") ;
}


@@ -1,117 +0,0 @@
--# -path=.:../abstract:../english:../api
-- model implementation using Mini RGL
concrete DoctorMiniEng of Doctor =
open
MiniSyntaxEng,
MiniParadigmsEng,
Prelude
in {
-- application using your own Mini* modules
lincat
Phrase = Utt ;
Fact = Cl ;
Action = VP ;
Property = VP ;
Profession = CN ;
Person = NP ;
Place = {at,to : Adv} ;
Substance = NP ;
Illness = NP ;
lin
presPosPhrase fact = mkUtt (mkS fact) ;
presNegPhrase fact = mkUtt (mkS negativePol fact) ;
pastPosPhrase fact = mkUtt (mkS anteriorAnt fact) ;
pastNegPhrase fact = mkUtt (mkS anteriorAnt negativePol fact) ;
-- presQuestionPhrase fact = mkUtt (mkQS (mkQCl fact)) ;
-- pastQuestionPhrase fact = mkUtt (mkQS anteriorAnt (mkQCl fact)) ;
presQuestionPhrase fact = let p : Utt = mkUtt (mkQS (mkQCl fact)) in p ** {s = p.s ++ SOFT_BIND ++ "?"} ;
pastQuestionPhrase fact = let p : Utt = mkUtt (mkQS anteriorAnt (mkQCl fact)) in p ** {s = p.s ++ SOFT_BIND ++ "?"} ;
impPosPhrase action = mkUtt (mkImp action) ;
impNegPhrase action = mkUtt negativePol (mkImp action) ;
actionFact person action = mkCl person action ;
propertyFact person property = mkCl person property ;
isProfessionProperty profession = mkVP (mkNP a_Det profession) ;
needProfessionProperty profession = mkVP need_V2 (mkNP a_Det profession) ;
isAtPlaceProperty place = mkVP place.at ;
haveIllnessProperty illness = mkVP have_V2 illness ;
theProfessionPerson profession = mkNP the_Det profession ;
iMascPerson = i_NP ;
iFemPerson = i_NP ;
youMascPerson = you_NP ;
youFemPerson = you_NP ;
hePerson = he_NP ;
shePerson = she_NP ;
goToAction place = mkVP (mkVP go_V) place.to ;
stayAtAction place = mkVP (mkVP stay_V) place.at ;
vaccinateAction person = mkVP vaccinate_V2 person ;
examineAction person = mkVP examine_V2 person ;
takeSubstanceAction substance = mkVP take_V2 substance ;
-- end of what could be a functor
--------------------------------
coughAction = mkVP (mkV "cough") ;
breatheAction = mkVP (mkV "breathe") ;
vomitAction = mkVP (mkV "vomit") ;
sleepAction = mkVP (mkV "sleep" "slept" "slept") ;
-- "clothe" is a pseudo-singular: the regular plural rule then yields "clothes"
undressAction = mkVP (mkVP take_V2 (mkNP thePl_Det (mkN "clothe"))) (pAdv "off") ;
dressAction = mkVP (mkVP put_V2 (mkNP thePl_Det (mkN "clothe"))) (pAdv "on") ;
eatAction = mkVP (mkV "eat" "ate" "eaten") ;
drinkAction = mkVP (mkV "drink" "drank" "drunk") ;
smokeAction = mkVP (mkV "smoke") ;
measureTemperatureAction = mkVP (mkV2 (mkV "measure")) (mkNP the_Det (mkN "body temperature")) ;
measureBloodPressureAction = mkVP (mkV2 (mkV "measure")) (mkNP the_Det (mkN "blood pressure")) ;
hospitalPlace = {at = pAdv "at the hospital" ; to = pAdv "to the hospital"} ;
homePlace = {at = pAdv "at home" ; to = pAdv "home"} ;
schoolPlace = {at = pAdv "at school" ; to = pAdv "to school"} ;
workPlace = {at = pAdv "at work" ; to = pAdv "to work"} ;
doctorProfession = mkCN (mkN "doctor") ;
nurseProfession = mkCN (mkN "nurse") ;
interpreterProfession = mkCN (mkN "interpreter") ;
bePregnantProperty = mkVP (mkA "pregnant") ;
beIllProperty = mkVP (mkA "ill") ;
beWellProperty = mkVP (mkA "well") ;
beDeadProperty = mkVP (mkA "dead") ;
haveAllergiesProperty = mkVP have_V2 (mkNP aPl_Det (mkN "allergy")) ;
havePainsProperty = mkVP have_V2 (mkNP aPl_Det (mkN "pain")) ;
haveChildrenProperty = mkVP have_V2 (mkNP aPl_Det (mkN "child" "children")) ;
feverIllness = mkNP a_Det (mkN "fever") ;
fluIllness = mkNP a_Det (mkN "flu") ;
headacheIllness = mkNP a_Det (mkN "headache") ;
diarrheaIllness = mkNP a_Det (mkN "diarrhea") ;
heartDiseaseIllness = mkNP a_Det (mkN "heart disease") ;
lungDiseaseIllness = mkNP a_Det (mkN "lung disease") ;
hypertensionIllness = mkNP (mkN "hypertension") ;
alcoholSubstance = mkNP (mkN "alcohol") ;
medicineSubstance = mkNP a_Det (mkN "drug") ;
drugsSubstance = mkNP aPl_Det (mkN "drug") ;
oper
pAdv : Str -> Adv = MiniParadigmsEng.mkAdv ;
go_V = mkV "go" "went" "gone" ;
stay_V = mkV "stay" ;
need_V2 = mkV2 (mkV "need") ;
take_V2 = mkV2 (mkV "take" "took" "taken") ;
put_V2 = mkV2 (mkV "put" "put" "put") ;
vaccinate_V2 = mkV2 (mkV "vaccinate") ;
examine_V2 = mkV2 (mkV "examine") ;
}


@@ -1 +0,0 @@
UseComp {"es","sont"} AUX cop head


@@ -1,234 +0,0 @@
--# -path=.:../abstract
concrete MiniGrammarEng of MiniGrammar = open MiniResEng, Prelude in {
lincat
Utt = {s : Str} ;
Pol = {s : Str ; isTrue : Bool} ; -- the s field is empty, but needed for parsing
Temp = {s : Str ; isPres : Bool} ;
S = {s : Str} ;
QS = {s : Str} ;
Cl = { -- word order is fixed in S and QS
subj : Str ; -- subject
verb : Bool => Bool => {fin,inf : Str} ; -- dep. on Pol,Temp, e.g. "does","sleep"
compl : Str -- after verb: complement, adverbs
} ;
QCl = Cl ** {isWh : Bool} ;
Imp = {s : Bool => Str} ;
VP = {verb : GVerb ; compl : Str} ;
Comp = {s : Str} ;
AP = Adjective ;
CN = Noun ;
NP = {s : Case => Str ; a : Agreement} ;
IP = {s : Case => Str ; a : Agreement} ;
Pron = {s : Case => Str ; a : Agreement} ;
Det = {s : Str ; n : Number} ;
Conj = {s : Str} ;
Prep = {s : Str} ;
V = Verb ;
V2 = Verb2 ;
VS = Verb ;
VV = Verb ; ---- only VV to VP
A = Adjective ;
N = Noun ;
PN = {s : Str} ;
Adv = {s : Str} ;
IAdv = {s : Str} ;
lin
UttS s = s ;
UttQS s = s ;
UttNP np = {s = np.s ! Acc} ; -- Acc: produce "me" rather than "I"
UttAdv adv = adv ;
UttIAdv iadv = iadv ;
UttImpSg pol imp = {s = pol.s ++ imp.s ! pol.isTrue} ;
UseCl temp pol cl =
let clt = cl.verb ! pol.isTrue ! temp.isPres -- isTrue regulates if "do" is used
in {
s = pol.s ++ temp.s ++ --- needed for parsing: a GF hack
cl.subj ++ -- she
clt.fin ++ -- does
negation pol.isTrue ++ -- not
clt.inf ++ -- drink
cl.compl -- beer
} ;
UseQCl temp pol qcl =
let
isWh = qcl.isWh ;
clt = qcl.verb ! andB isWh pol.isTrue ! temp.isPres ; -- no "do" in present positive Wh questions
verbsubj = case isWh of {
True => qcl.subj ++ clt.fin ; -- no inversion in Wh questions
False => clt.fin ++ qcl.subj
}
in {
s = pol.s ++ temp.s ++
verbsubj ++
negation pol.isTrue ++ -- not
clt.inf ++ -- drink
qcl.compl -- beer
} ;
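-- Worked example (an added sketch, not part of the original file): how UseQCl
-- orders the finite verb and the subject. It assumes that the abstract syntax
-- gives the functions the types suggested by the linearization rules in this
-- module. In the GF shell, after `i MiniLangEng.gf`, the expected results are:
--   l UseQCl TSim PPos (QuestCl (PredVP (UsePron she_Pron) (ComplV2 drink_V2 (MassNP (UseN beer_N)))))
--     does she drink beer       (isWh = False: inversion, "do" support)
--   l UseQCl TSim PPos (QuestVP whoSg_IP (ComplV2 drink_V2 (MassNP (UseN beer_N))))
--     who drinks beer           (isWh = True: no inversion, no "do")
--   l UseQCl TSim PNeg (QuestVP whoSg_IP (ComplV2 drink_V2 (MassNP (UseN beer_N))))
--     who does not drink beer   (andB True False = False, so "do" is used)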
PredVP np vp = {
subj = np.s ! Nom ;
compl = vp.compl ;
verb = \\plain,isPres => case <vp.verb.isAux, plain, isPres, np.a> of {
-- non-auxiliary verbs, negative/question present: "does (not) drink"
<False,False,True,Agr Sg Per3> => {fin = "does" ; inf = vp.verb.s ! VF Inf} ;
<False,False,True,_ > => {fin = "do" ; inf = vp.verb.s ! VF Inf} ;
-- non-auxiliary, plain present ; auxiliary, all present: "drinks", "is (not)"
<_,_, True, Agr Sg Per1> => {fin = vp.verb.s ! PresSg1 ; inf = []} ;
<_,_, True, Agr Sg Per3> => {fin = vp.verb.s ! VF PresSg3 ; inf = []} ;
<_,_, True, _> => {fin = vp.verb.s ! PresPl ; inf = []} ;
-- all verbs, past: "has (not) drunk", "has (not) been"
<_,_, False,Agr Sg Per3> => {fin = "has" ; inf = vp.verb.s ! VF PastPart} ;
<_,_, False,_ > => {fin = "have" ; inf = vp.verb.s ! VF PastPart}
-- the negation word "not" is put in place in UseCl, UseQCl
}
} ;
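-- Worked example (an added sketch, not part of the original file): the rows of
-- the PredVP table for a third person singular subject. It assumes that the
-- abstract syntax gives the functions the types suggested by the linearization
-- rules in this module. In the GF shell, after `i MiniLangEng.gf`:
--   l UseCl TSim PPos (PredVP (UsePron she_Pron) (ComplV2 drink_V2 (MassNP (UseN beer_N))))
--     she drinks beer           (fin = "drinks", inf = [])
--   l UseCl TSim PNeg (PredVP (UsePron she_Pron) (ComplV2 drink_V2 (MassNP (UseN beer_N))))
--     she does not drink beer   (fin = "does", inf = "drink")
--   l UseCl TAnt PNeg (PredVP (UsePron she_Pron) (ComplV2 drink_V2 (MassNP (UseN beer_N))))
--     she has not drunk beer    (fin = "has", inf = "drunk")
-- With the copula, isAux = True, so the "do" rows are skipped:
--   l UseCl TSim PNeg (PredVP (UsePron she_Pron) (UseComp (CompAP (PositA ready_A))))
--     she is not ready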
QuestCl cl = cl ** {isWh = False} ; -- since the parts are the same, we don't need to change anything
QuestVP ip vp = PredVP ip vp ** {isWh = True} ;
ImpVP vp = {
s = table {
True => vp.verb.s ! VF Inf ++ vp.compl ; -- in Eng, imperative = infinitive
False => "do not" ++ vp.verb.s ! VF Inf ++ vp.compl
}
} ;
UseV v = {
verb = verb2gverb v ; -- lift ordinary verbs to generalized verbs
compl = []
} ;
ComplV2 v2 np = {
verb = verb2gverb v2 ;
compl = v2.c ++ np.s ! Acc -- NP object in the accusative, preposition first
} ;
ComplVS vs s = {
verb = verb2gverb vs ;
compl = "that" ++ s.s ;
} ;
ComplVV vv vp = {
verb = verb2gverb vv ;
compl = "to" ++ vp.verb.s ! VF Inf ++ vp.compl ;
} ;
UseComp comp = {
verb = be_GVerb ; -- the verb is the copula "be"
compl = comp.s
} ;
CompAP ap = ap ;
CompNP np = {
s = np.s ! Nom -- NP complement is in the nominative
} ;
CompAdv adv = adv ;
AdvVP vp adv =
vp ** {compl = vp.compl ++ adv.s} ;
DetCN det cn = {
s = table {c => det.s ++ cn.s ! det.n} ;
a = Agr det.n Per3 -- this kind of NP is always third person
} ;
UsePN pn = {
s = \\_ => pn.s ;
a = Agr Sg Per3
} ;
UsePron p = p ; -- Pron is worst-case NP
MassNP cn = {
s = \\_ => cn.s ! Sg ;
a = Agr Sg Per3
} ;
a_Det = {s = pre {"a"|"e"|"i"|"o" => "an" ; _ => "a"} ; n = Sg} ; --- spelling-based heuristic: a/an can come out wrong, e.g. "u" is not covered
aPl_Det = {s = "" ; n = Pl} ;
the_Det = {s = "the" ; n = Sg} ;
thePl_Det = {s = "the" ; n = Pl} ;
UseN n = n ;
AdjCN ap cn = {
s = table {n => ap.s ++ cn.s ! n}
} ;
PositA a = a ;
PrepNP prep np = {s = prep.s ++ np.s ! Acc} ;
CoordS conj a b = {s = a.s ++ conj.s ++ b.s} ;
PPos = {s = [] ; isTrue = True} ;
PNeg = {s = [] ; isTrue = False} ;
TSim = {s = [] ; isPres = True} ;
TAnt = {s = [] ; isPres = False} ;
and_Conj = {s = "and"} ;
or_Conj = {s = "or"} ;
every_Det = {s = "every" ; n = Sg} ;
in_Prep = {s = "in"} ;
on_Prep = {s = "on"} ;
with_Prep = {s = "with"} ;
i_Pron = {
s = table {Nom => "I" ; Acc => "me"} ;
a = Agr Sg Per1
} ;
youSg_Pron = {
s = \\_ => "you" ;
a = Agr Sg Per2
} ;
he_Pron = {
s = table {Nom => "he" ; Acc => "him"} ;
a = Agr Sg Per3
} ;
she_Pron = {
s = table {Nom => "she" ; Acc => "her"} ;
a = Agr Sg Per3
} ;
we_Pron = {
s = table {Nom => "we" ; Acc => "us"} ;
a = Agr Pl Per1
} ;
youPl_Pron = {
s = \\_ => "you" ;
a = Agr Pl Per2
} ;
they_Pron = {
s = table {Nom => "they" ; Acc => "them"} ;
a = Agr Pl Per3
} ;
whoSg_IP = {
s = table {Nom => "who" ; Acc => "whom"} ;
a = Agr Sg Per3
} ;
where_IAdv = {s = "where"} ;
why_IAdv = {s = "why"} ;
have_V2 = mkVerb "have" "has" "had" "had" "having" ** {c = []} ;
want_VV = regVerb "want" ;
}


@@ -1,3 +0,0 @@
--# -path=.:../abstract
concrete MiniLangEng of MiniLang = MiniGrammarEng, MiniLexiconEng ;


@@ -1,6 +0,0 @@
UseCl, UseQCl, ImpVP {"not"} PART advmod head
UseComp {"is","are","am","was","were","been","be"} AUX cop head
PredVP, QuestVP {"has","had","have","do","does"} AUX aux head
ImpVP {"do"} AUX aux head
ComplVS {"that"} SCONJ mark ccomp
ComplVV {"to"} PART mark xcomp


@@ -1,94 +0,0 @@
concrete MiniLexiconEng of MiniLexicon = MiniGrammarEng **
open
MiniParadigmsEng
in {
lin already_Adv = mkAdv "already" ;
lin animal_N = mkN "animal" ;
lin apple_N = mkN "apple" ;
lin baby_N = mkN "baby" ;
lin bad_A = mkA "bad" ;
lin beer_N = mkN "beer" ;
lin big_A = mkA "big" ;
lin bike_N = mkN "bike" ;
lin bird_N = mkN "bird" ;
lin black_A = mkA "black" ;
lin blood_N = mkN "blood" ;
lin blue_A = mkA "blue" ;
lin boat_N = mkN "boat" ;
lin book_N = mkN "book" ;
lin boy_N = mkN "boy" ;
lin bread_N = mkN "bread" ;
lin break_V2 = mkV2 (mkV "break" "broke" "broken") ;
lin buy_V2 = mkV2 (mkV "buy" "bought" "bought") ;
lin car_N = mkN "car" ;
lin cat_N = mkN "cat" ;
lin child_N = mkN "child" "children" ;
lin city_N = mkN "city" ;
lin clean_A = mkA "clean" ;
lin clever_A = mkA "clever" ;
lin cloud_N = mkN "cloud" ;
lin cold_A = mkA "cold" ;
lin come_V = mkV "come" "came" "come" ;
lin computer_N = mkN "computer" ;
lin cow_N = mkN "cow" ;
lin dirty_A = mkA "dirty" ;
lin dog_N = mkN "dog" ;
lin drink_V2 = mkV2 (mkV "drink" "drank" "drunk") ;
lin eat_V2 = mkV2 (mkV "eat" "ate" "eaten") ;
lin find_V2 = mkV2 (mkV "find" "found" "found") ;
lin fire_N = mkN "fire" ;
lin fish_N = mkN "fish" "fish" ;
lin flower_N = mkN "flower" ;
lin friend_N = mkN "friend" ;
lin girl_N = mkN "girl" ;
lin good_A = mkA "good" ;
lin go_V = mkV "go" "went" "gone" ;
lin grammar_N = mkN "grammar" ;
lin green_A = mkA "green" ;
lin heavy_A = mkA "heavy" ;
lin horse_N = mkN "horse" ;
lin hot_A = mkA "hot" ;
lin house_N = mkN "house" ;
lin john_PN = mkPN "John" ;
lin jump_V = mkV "jump" ;
lin kill_V2 = mkV2 "kill" ;
lin know_VS = mkVS (mkV "know" "knew" "known") ;
lin language_N = mkN "language" ;
lin live_V = mkV "live" ;
lin love_V2 = mkV2 (mkV "love") ;
lin man_N = mkN "man" "men" ;
lin milk_N = mkN "milk" ;
lin music_N = mkN "music" ;
lin new_A = mkA "new" ;
lin now_Adv = mkAdv "now" ;
lin old_A = mkA "old" ;
lin paris_PN = mkPN "Paris" ;
lin play_V = mkV "play" ;
lin read_V2 = mkV2 (mkV "read" "read" "read") ;
lin ready_A = mkA "ready" ;
lin red_A = mkA "red" ;
lin river_N = mkN "river" ;
lin run_V = mkV "run" "ran" "run" ;
lin sea_N = mkN "sea" ;
lin see_V2 = mkV2 (mkV "see" "saw" "seen") ;
lin ship_N = mkN "ship" ;
lin sleep_V = mkV "sleep" "slept" "slept" ;
lin small_A = mkA "small" ;
lin star_N = mkN "star" ;
lin swim_V = mkV "swim" "swam" "swum" ;
lin teach_V2 = mkV2 (mkV "teach" "taught" "taught") ;
lin train_N = mkN "train" ;
lin travel_V = mkV "travel" ;
lin tree_N = mkN "tree" ;
lin understand_V2 = mkV2 (mkV "understand" "understood" "understood") ;
lin wait_V2 = mkV2 "wait" "for" ;
lin walk_V = mkV "walk" ;
lin warm_A = mkA "warm" ;
lin water_N = mkN "water" ;
lin white_A = mkA "white" ;
lin wine_N = mkN "wine" ;
lin woman_N = mkN "woman" "women" ;
lin yellow_A = mkA "yellow" ;
lin young_A = mkA "young" ;
}


@@ -1,49 +0,0 @@
resource MiniParadigmsEng = open
MiniGrammarEng,
MiniResEng
in {
oper
mkN = overload {
mkN : Str -> Noun -- predictable noun, e.g. car-cars, boy-boys, fly-flies, bush-bushes
= \n -> lin N (smartNoun n) ;
mkN : Str -> Str -> Noun -- irregular noun, e.g. man-men
= \sg,pl -> lin N (mkNoun sg pl) ;
} ;
mkPN : Str -> PN
= \s -> lin PN {s = s} ;
mkA : Str -> A
= \s -> lin A {s = s} ;
mkV = overload {
mkV : (inf : Str) -> V -- predictable verb, e.g. play-plays, cry-cries, wash-washes
= \s -> lin V (smartVerb s) ;
mkV : (inf,past,part : Str) -> V -- irregular verb, e.g. drink-drank-drunk
= \inf,past,part -> lin V (irregVerb inf past part) ;
} ;
mkV2 = overload {
mkV2 : Str -> V2 -- predictable verb with direct object, e.g. "wash"
= \s -> lin V2 (smartVerb s ** {c = []}) ;
mkV2 : Str -> Str -> V2 -- predictable verb with preposition, e.g. "wait - for"
= \s,p -> lin V2 (smartVerb s ** {c = p}) ;
mkV2 : V -> V2 -- any verb with direct object, e.g. "drink"
= \v -> lin V2 (v ** {c = []}) ;
mkV2 : V -> Str -> V2 -- any verb with preposition
= \v,p -> lin V2 (v ** {c = p}) ;
} ;
mkVS : V -> VS
= \v -> lin VS v ;
mkAdv : Str -> Adv
= \s -> lin Adv {s = s} ;
mkPrep : Str -> Prep
= \s -> lin Prep {s = s} ;
}


@@ -1,99 +0,0 @@
resource MiniResEng = open Prelude in {
param
Number = Sg | Pl ;
Case = Nom | Acc ;
Person = Per1 | Per2 | Per3 ;
Agreement = Agr Number Person ;
-- all forms of normal Eng verbs, although not yet used in MiniGrammar
VForm = Inf | PresSg3 | Past | PastPart | PresPart ;
oper
Noun : Type = {s : Number => Str} ;
mkNoun : Str -> Str -> Noun = \sg,pl -> {
s = table {Sg => sg ; Pl => pl}
} ;
regNoun : Str -> Noun = \sg -> mkNoun sg (sg + "s") ;
-- smart paradigm
smartNoun : Str -> Noun = \sg -> case sg of {
_ + ("ay"|"ey"|"oy"|"uy") => regNoun sg ;
x + "y" => mkNoun sg (x + "ies") ;
_ + ("ch"|"sh"|"s"|"x"|"o") => mkNoun sg (sg + "es") ; -- "x" added: box-boxes
_ => regNoun sg
} ;
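-- Worked example (an added sketch, not part of the original file): one noun per
-- branch of smartNoun. The name exampleNounPlurals is hypothetical. To check it,
-- run `i -retain MiniResEng.gf` and then `cc exampleNounPlurals` in the GF shell.
oper
exampleNounPlurals : Str =
(smartNoun "boy").s ! Pl ++ -- "boys": vowel + "y", regular
(smartNoun "baby").s ! Pl ++ -- "babies": consonant + "y"
(smartNoun "bush").s ! Pl ++ -- "bushes": sibilant ending
(smartNoun "car").s ! Pl ; -- "cars": default case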
Adjective : Type = {s : Str} ;
Verb : Type = {s : VForm => Str} ;
mkVerb : (inf,pres,past,pastpart,prespart : Str) -> Verb
= \inf,pres,past,pastpart,prespart -> {
s = table {
Inf => inf ;
PresSg3 => pres ;
Past => past ;
PastPart => pastpart ;
PresPart => prespart
}
} ;
regVerb : (inf : Str) -> Verb = \inf ->
mkVerb inf (inf + "s") (inf + "ed") (inf + "ed") (inf + "ing") ;
-- regular verbs with predictable variations
smartVerb : Str -> Verb = \inf -> case inf of {
pl + ("a"|"e"|"i"|"o"|"u") + "y" => regVerb inf ;
cr + "y" => mkVerb inf (cr + "ies") (cr + "ied") (cr + "ied") (inf + "ing") ;
lov + "e" => mkVerb inf (inf + "s") (lov + "ed") (lov + "ed") (lov + "ing") ;
kis + ("s"|"sh"|"ch"|"x"|"o") => mkVerb inf (inf + "es") (inf + "ed") (inf + "ed") (inf + "ing") ; -- "ch" added: watch-watches
_ => regVerb inf
} ;
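-- Worked example (an added sketch, not part of the original file): one verb per
-- branch of smartVerb. The example* names are hypothetical. They can be
-- inspected with `i -retain MiniResEng.gf` followed by e.g. `cc exampleCries`.
oper
examplePlayed : Str = (smartVerb "play").s ! Past ; -- "played": vowel + "y", regular
exampleCries : Str = (smartVerb "cry").s ! PresSg3 ; -- "cries": consonant + "y"
exampleLoving : Str = (smartVerb "love").s ! PresPart ; -- "loving": final "e" dropped
exampleKisses : Str = (smartVerb "kiss").s ! PresSg3 ; -- "kisses": sibilant ending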
-- normal irregular verbs e.g. drink,drank,drunk
irregVerb : (inf,past,pastpart : Str) -> Verb =
\inf,past,pastpart ->
let verb = smartVerb inf
in mkVerb inf (verb.s ! PresSg3) past pastpart (verb.s ! PresPart) ;
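-- Worked example (an added sketch, not part of the original file): irregVerb
-- takes only the past and the past participle from its arguments and derives
-- the other forms with smartVerb. The example* names are hypothetical.
oper
exampleDrinks : Str = (irregVerb "drink" "drank" "drunk").s ! PresSg3 ; -- "drinks", derived
exampleDrinking : Str = (irregVerb "drink" "drank" "drunk").s ! PresPart ; -- "drinking", derived
exampleDrunk : Str = (irregVerb "drink" "drank" "drunk").s ! PastPart ; -- "drunk", as given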
negation : Bool -> Str = \b -> case b of {True => [] ; False => "not"} ;
-- two-place verb with "case" as preposition; for transitive verbs, c=[]
Verb2 : Type = Verb ** {c : Str} ;
-- generalized verb, here just "be"
param
GVForm = VF VForm | PresSg1 | PresPl | PastPl ;
oper
GVerb : Type = {
s : GVForm => Str ;
isAux : Bool
} ;
be_GVerb : GVerb = {
s = table {
PresSg1 => "am" ;
PresPl => "are" ;
PastPl => "were" ;
VF vf => (mkVerb "be" "is" "was" "been" "being").s ! vf
} ;
isAux = True
} ;
-- in VP formation, all verbs are lifted to GVerb, but morphology doesn't need to know this
verb2gverb : Verb -> GVerb = \v -> {s =
table {
PresSg1 => v.s ! Inf ;
PresPl => v.s ! Inf ;
PastPl => v.s ! Past ;
VF vf => v.s ! vf
} ;
isAux = False
} ;
}


@@ -1,196 +0,0 @@
incomplete concrete MiniLangFunctor of MiniLang =
open
Grammar,
Syntax,
Lexicon
in {
-- A functor implementation of MiniLang, using Grammar and Lexicon whenever the function is
-- directly from there, Syntax otherwise.
-- Both Grammar and Lexicon are in a single file for simplicity.
-----------------------------------------------------
---------------- Grammar part -----------------------
-----------------------------------------------------
lincat
Utt = Grammar.Utt ;
Pol = Grammar.Pol ;
Temp = Grammar.Temp ;
Imp = Grammar.Imp ;
S = Grammar.S ;
QS = Grammar.QS ;
Cl = Grammar.Cl ;
QCl = Grammar.QCl ;
VP = Grammar.VP ;
Comp = Grammar.Comp ;
AP = Grammar.AP ;
CN = Grammar.CN ;
NP = Grammar.NP ;
IP = Grammar.IP ;
Pron = Grammar.Pron ;
Det = Grammar.Det ;
Conj = Grammar.Conj ;
Prep = Grammar.Prep ;
V = Grammar.V ;
V2 = Grammar.V2 ;
VS = Grammar.VS ;
VV = Grammar.VV ;
A = Grammar.A ;
N = Grammar.N ;
PN = Grammar.PN ;
Adv = Grammar.Adv ;
IAdv = Grammar.IAdv ;
lin
UttS = Grammar.UttS ;
UttQS = Grammar.UttQS ;
UttNP = Grammar.UttNP ;
UttAdv = Grammar.UttAdv ;
UttIAdv = Grammar.UttIAdv ;
UttImpSg pol imp = Syntax.mkUtt pol imp ;
UseCl = Grammar.UseCl ;
UseQCl = Grammar.UseQCl ;
PredVP = Grammar.PredVP ;
QuestCl = Grammar.QuestCl ;
QuestVP = Grammar.QuestVP ;
ImpVP = Grammar.ImpVP ;
UseV = Grammar.UseV ;
ComplV2 v2 np = Syntax.mkVP v2 np ;
ComplVS = Grammar.ComplVS ;
ComplVV = Grammar.ComplVV ;
UseComp = Grammar.UseComp ;
CompAP = Grammar.CompAP ;
CompNP = Grammar.CompNP ;
CompAdv = Grammar.CompAdv ;
AdvVP = Grammar.AdvVP ;
DetCN = Grammar.DetCN ;
UsePN = Grammar.UsePN ;
UsePron = Grammar.UsePron ;
MassNP = Grammar.MassNP ;
a_Det = Syntax.a_Det ;
aPl_Det = Syntax.aPl_Det ;
the_Det = Syntax.the_Det ;
thePl_Det = Syntax.thePl_Det ;
UseN = Grammar.UseN ;
AdjCN = Grammar.AdjCN ;
PositA = Grammar.PositA ;
PrepNP = Grammar.PrepNP ;
CoordS conj a b = Syntax.mkS conj a b ;
PPos = Grammar.PPos ;
PNeg = Grammar.PNeg ;
TSim = Syntax.mkTemp Syntax.presentTense Syntax.simultaneousAnt ;
TAnt = Syntax.mkTemp Syntax.presentTense Syntax.anteriorAnt ;
and_Conj = Grammar.and_Conj ;
or_Conj = Grammar.or_Conj ;
every_Det = Grammar.every_Det ;
in_Prep = Grammar.in_Prep ;
on_Prep = Grammar.on_Prep ;
with_Prep = Grammar.with_Prep ;
i_Pron = Grammar.i_Pron ;
youSg_Pron = Grammar.youSg_Pron ;
he_Pron = Grammar.he_Pron ;
she_Pron = Grammar.she_Pron ;
we_Pron = Grammar.we_Pron ;
youPl_Pron = Grammar.youPl_Pron ;
they_Pron = Grammar.they_Pron ;
whoSg_IP = Grammar.whoSg_IP ;
where_IAdv = Grammar.where_IAdv ;
why_IAdv = Grammar.why_IAdv ;
have_V2 = Grammar.have_V2 ;
want_VV = Grammar.want_VV ;
------------------------------------
-- Lexicon part --------------------
------------------------------------
already_Adv = Lexicon.already_Adv ;
animal_N = Lexicon.animal_N ;
apple_N = Lexicon.apple_N ;
baby_N = Lexicon.baby_N ;
bad_A = Lexicon.bad_A ;
beer_N = Lexicon.beer_N ;
big_A = Lexicon.big_A ;
bike_N = Lexicon.bike_N ;
bird_N = Lexicon.bird_N ;
black_A = Lexicon.black_A ;
blood_N = Lexicon.blood_N ;
blue_A = Lexicon.blue_A ;
boat_N = Lexicon.boat_N ;
book_N = Lexicon.book_N ;
boy_N = Lexicon.boy_N ;
bread_N = Lexicon.bread_N ;
break_V2 = Lexicon.break_V2 ;
buy_V2 = Lexicon.buy_V2 ;
car_N = Lexicon.car_N ;
cat_N = Lexicon.cat_N ;
child_N = Lexicon.child_N ;
city_N = Lexicon.city_N ;
clean_A = Lexicon.clean_A ;
clever_A = Lexicon.clever_A ;
cloud_N = Lexicon.cloud_N ;
cold_A = Lexicon.cold_A ;
come_V = Lexicon.come_V ;
computer_N = Lexicon.computer_N ;
cow_N = Lexicon.cow_N ;
dirty_A = Lexicon.dirty_A ;
dog_N = Lexicon.dog_N ;
drink_V2 = Lexicon.drink_V2 ;
eat_V2 = Lexicon.eat_V2 ;
find_V2 = Lexicon.find_V2 ;
fire_N = Lexicon.fire_N ;
fish_N = Lexicon.fish_N ;
flower_N = Lexicon.flower_N ;
friend_N = Lexicon.friend_N ;
girl_N = Lexicon.girl_N ;
good_A = Lexicon.good_A ;
go_V = Lexicon.go_V ;
grammar_N = Lexicon.grammar_N ;
green_A = Lexicon.green_A ;
heavy_A = Lexicon.heavy_A ;
horse_N = Lexicon.horse_N ;
hot_A = Lexicon.hot_A ;
house_N = Lexicon.house_N ;
john_PN = Lexicon.john_PN ;
jump_V = Lexicon.jump_V ;
kill_V2 = Lexicon.kill_V2 ;
know_VS = Lexicon.know_VS ;
language_N = Lexicon.language_N ;
live_V = Lexicon.live_V ;
love_V2 = Lexicon.love_V2 ;
man_N = Lexicon.man_N ;
milk_N = Lexicon.milk_N ;
music_N = Lexicon.music_N ;
new_A = Lexicon.new_A ;
now_Adv = Lexicon.now_Adv ;
old_A = Lexicon.old_A ;
paris_PN = Lexicon.paris_PN ;
play_V = Lexicon.play_V ;
read_V2 = Lexicon.read_V2 ;
ready_A = Lexicon.ready_A ;
red_A = Lexicon.red_A ;
river_N = Lexicon.river_N ;
run_V = Lexicon.run_V ;
sea_N = Lexicon.sea_N ;
see_V2 = Lexicon.see_V2 ;
ship_N = Lexicon.ship_N ;
sleep_V = Lexicon.sleep_V ;
small_A = Lexicon.small_A ;
star_N = Lexicon.star_N ;
swim_V = Lexicon.swim_V ;
teach_V2 = Lexicon.teach_V2 ;
train_N = Lexicon.train_N ;
travel_V = Lexicon.travel_V ;
tree_N = Lexicon.tree_N ;
understand_V2 = Lexicon.understand_V2 ;
wait_V2 = Lexicon.wait_V2 ;
walk_V = Lexicon.walk_V ;
warm_A = Lexicon.warm_A ;
water_N = Lexicon.water_N ;
white_A = Lexicon.white_A ;
wine_N = Lexicon.wine_N ;
woman_N = Lexicon.woman_N ;
yellow_A = Lexicon.yellow_A ;
young_A = Lexicon.young_A ;
}


@@ -1,8 +0,0 @@
--# -path=.:../abstract
concrete MiniLangFunctorEng of MiniLang = MiniLangFunctor with
(Grammar = GrammarEng),
(Syntax = SyntaxEng),
(Lexicon = LexiconEng)
;


@@ -1,8 +0,0 @@
--# -path=.:../abstract
concrete MiniLangFunctorSwe of MiniLang = MiniLangFunctor with
(Grammar = GrammarSwe),
(Syntax = SyntaxSwe),
(Lexicon = LexiconSwe)
;


@@ -1,15 +0,0 @@
--# -path=.:../abstract
concrete MicroLangEng of MicroLang =
open MicroResEng
in {
lincat
N = Noun ;
lin
animal_N = mkN "animal" ;
apple_N = mkN "apple" ;
baby_N = mkN "baby" ;
woman_N = mkN "woman" "women" ;
}


@@ -1,44 +0,0 @@
-- live-coded MicroResEng for Lab 2
resource MicroResEng = {
param
Number = Sg | Pl ;
oper
-- phonological patterns
sibilant : pattern Str
= #("s" | "x" | "ch" | "sh" | "z") ;
vowel : pattern Str
= #("a" | "e" | "i" | "o" | "u") ;
-- the type of nouns
Noun : Type = {s : Number => Str} ;
-- worst-case paradigm
mkNoun : (sg, pl : Str) -> Noun
= \sg, pl -> {s = table {Sg => sg ; Pl => pl}} ;
-- regular paradigm
regNoun : (sg : Str) -> Noun
= \sg -> mkNoun sg (sg + "s") ;
-- smart paradigm
smartNoun : (sg : Str) -> Noun
= \sg -> case sg of {
x + #vowel + "y" => regNoun sg ;
x + "y" => mkNoun sg (x + "ies") ;
x + #sibilant => mkNoun sg (sg + "es") ;
_ => regNoun sg
} ;
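-- Worked example (an added sketch, not part of the live-coded file): one noun
-- per branch of smartNoun above. The name examplePlurals is hypothetical. To
-- check it, run `i -retain MicroResEng.gf` and then `cc examplePlurals`.
oper
examplePlurals : Str =
(smartNoun "boy").s ! Pl ++ -- "boys": vowel + "y"
(smartNoun "baby").s ! Pl ++ -- "babies": consonant + "y"
(smartNoun "box").s ! Pl ++ -- "boxes": sibilant ending
(smartNoun "dog").s ! Pl ; -- "dogs": default case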
-- overloaded paradigm for lexicographers
mkN = overload {
mkN : (sg : Str) -> Noun = smartNoun ;
mkN : (sg, pl : Str) -> Noun = mkNoun ;
} ;
}


@@ -1,233 +0,0 @@
--# -path=.:../abstract
concrete MicroLangIta of MicroLang = open MicroResIta in {
-----------------------------------------------------
---------------- Grammar part -----------------------
-----------------------------------------------------
lincat
{-
Utt = {s : Str} ;
S = {s : Str} ;
VP = {verb : Verb ; compl : Str} ; ---s special case of Mini
Comp = {s : Str} ;
-}
AP = Adjective ;
CN = Noun ;
{-
NP = {s : Case => Str ; a : Agreement} ;
Pron = {s : Case => Str ; a : Agreement} ;
Det = {s : Str ; n : Number} ;
Prep = {s : Str} ;
V = Verb ;
V2 = Verb2 ;
-}
A = Adjective ;
N = Noun ;
Adv = {s : Str} ;
lin
{-
UttS s = s ;
UttNP np = {s = np.s ! Acc} ;
PredVPS np vp = {
s = np.s ! Nom ++ vp.verb.s ! agr2vform np.a ++ vp.compl
} ;
UseV v = {
verb = v ;
compl = [] ;
} ;
ComplV2 v2 np = {
verb = v2 ;
compl = v2.c ++ np.s ! Acc -- NP object in the accusative, preposition first
} ;
UseComp comp = {
verb = be_Verb ; -- the verb is the copula "be"
compl = comp.s
} ;
CompAP ap = ap ;
AdvVP vp adv =
vp ** {compl = vp.compl ++ adv.s} ;
DetCN det cn = {
s = \\c => det.s ++ cn.s ! det.n ;
a = Agr det.n ;
} ;
UsePron p = p ;
a_Det = {s = pre {"a"|"e"|"i"|"o" => "an" ; _ => "a"} ; n = Sg} ; --- a/an can get wrong
aPl_Det = {s = "" ; n = Pl} ;
the_Det = {s = "the" ; n = Sg} ;
thePl_Det = {s = "the" ; n = Pl} ;
-}
UseN n = n ;
AdjCN ap cn = {
s = table {n => cn.s ! n ++ ap.s ! cn.g ! n} ;
g = cn.g
} ;
PositA a = a ;
{-
PrepNP prep np = {s = prep.s ++ np.s ! Acc} ;
in_Prep = {s = "in"} ;
on_Prep = {s = "on"} ;
with_Prep = {s = "with"} ;
he_Pron = {
s = table {Nom => "he" ; Acc => "him"} ;
a = Agr Sg ;
} ;
she_Pron = {
s = table {Nom => "she" ; Acc => "her"} ;
a = Agr Sg ;
} ;
they_Pron = {
s = table {Nom => "they" ; Acc => "them"} ;
a = Agr Pl ;
} ;
-}
-----------------------------------------------------
---------------- Lexicon part -----------------------
-----------------------------------------------------
-- lin already_Adv = mkAdv "already" ;
lin animal_N = mkN "animale" ;
lin apple_N = mkN "mela" ;
lin baby_N = mkN "bambino" ;
lin bad_A = mkA "cattivo" ;
lin beer_N = mkN "birra" ;
lin big_A = mkA "grande" ;
lin bike_N = mkN "bicicletta" ;
{-
lin bird_N = mkN "bird" ;
lin black_A = mkA "black" ;
lin blood_N = mkN "blood" ;
lin blue_A = mkA "blue" ;
lin boat_N = mkN "boat" ;
lin book_N = mkN "book" ;
lin boy_N = mkN "boy" ;
lin bread_N = mkN "bread" ;
lin break_V2 = mkV2 (mkV "break" "broke" "broken") ;
lin buy_V2 = mkV2 (mkV "buy" "bought" "bought") ;
lin car_N = mkN "car" ;
lin cat_N = mkN "cat" ;
lin child_N = mkN "child" "children" ;
lin city_N = mkN "city" ;
lin clean_A = mkA "clean" ;
lin clever_A = mkA "clever" ;
lin cloud_N = mkN "cloud" ;
lin cold_A = mkA "cold" ;
lin come_V = mkV "come" "came" "come" ;
lin computer_N = mkN "computer" ;
lin cow_N = mkN "cow" ;
lin dirty_A = mkA "dirty" ;
lin dog_N = mkN "dog" ;
lin drink_V2 = mkV2 (mkV "drink" "drank" "drunk") ;
lin eat_V2 = mkV2 (mkV "eat" "ate" "eaten") ;
lin find_V2 = mkV2 (mkV "find" "found" "found") ;
lin fire_N = mkN "fire" ;
lin fish_N = mkN "fish" "fish" ;
lin flower_N = mkN "flower" ;
lin friend_N = mkN "friend" ;
lin girl_N = mkN "girl" ;
lin good_A = mkA "good" ;
lin go_V = mkV "go" "went" "gone" ;
lin grammar_N = mkN "grammar" ;
lin green_A = mkA "green" ;
lin heavy_A = mkA "heavy" ;
lin horse_N = mkN "horse" ;
lin hot_A = mkA "hot" ;
lin house_N = mkN "house" ;
-- lin john_PN = mkPN "John" ;
lin jump_V = mkV "jump" ;
lin kill_V2 = mkV2 "kill" ;
-- lin know_VS = mkVS (mkV "know" "knew" "known") ;
lin language_N = mkN "language" ;
lin live_V = mkV "live" ;
lin love_V2 = mkV2 (mkV "love") ;
lin man_N = mkN "man" "men" ;
lin milk_N = mkN "milk" ;
lin music_N = mkN "music" ;
lin new_A = mkA "new" ;
lin now_Adv = mkAdv "now" ;
lin old_A = mkA "old" ;
-- lin paris_PN = mkPN "Paris" ;
lin play_V = mkV "play" ;
lin read_V2 = mkV2 (mkV "read" "read" "read") ;
lin ready_A = mkA "ready" ;
lin red_A = mkA "red" ;
lin river_N = mkN "river" ;
lin run_V = mkV "run" "ran" "run" ;
lin sea_N = mkN "sea" ;
lin see_V2 = mkV2 (mkV "see" "saw" "seen") ;
lin ship_N = mkN "ship" ;
lin sleep_V = mkV "sleep" "slept" "slept" ;
lin small_A = mkA "small" ;
lin star_N = mkN "star" ;
lin swim_V = mkV "swim" "swam" "swum" ;
lin teach_V2 = mkV2 (mkV "teach" "taught" "taught") ;
lin train_N = mkN "train" ;
lin travel_V = mkV "travel" ;
lin tree_N = mkN "tree" ;
lin understand_V2 = mkV2 (mkV "understand" "understood" "understood") ;
lin wait_V2 = mkV2 "wait" "for" ;
lin walk_V = mkV "walk" ;
lin warm_A = mkA "warm" ;
lin water_N = mkN "water" ;
lin white_A = mkA "white" ;
lin wine_N = mkN "wine" ;
lin woman_N = mkN "woman" "women" ;
lin yellow_A = mkA "yellow" ;
lin young_A = mkA "young" ;
---------------------------
-- Paradigms part ---------
---------------------------
oper
mkN = overload {
mkN : Str -> Noun -- predictable noun, e.g. car-cars, boy-boys, fly-flies, bush-bushes
= \n -> lin N (smartNoun n) ;
mkN : Str -> Str -> Noun -- irregular noun, e.g. man-men
= \sg,pl -> lin N (mkNoun sg pl) ;
} ;
mkA : Str -> A
= \s -> lin A {s = s} ;
mkV = overload {
mkV : (inf : Str) -> V -- predictable verb, e.g. play-plays, cry-cries, wash-washes
= \s -> lin V (smartVerb s) ;
mkV : (inf,past,part : Str) -> V -- irregular verb, e.g. drink-drank-drunk
= \inf,past,part -> lin V (irregVerb inf past part) ;
} ;
mkV2 = overload {
mkV2 : Str -> V2 -- predictable verb with direct object, e.g. "wash"
= \s -> lin V2 (smartVerb s ** {c = []}) ;
mkV2 : Str -> Str -> V2 -- predictable verb with preposition, e.g. "wait - for"
= \s,p -> lin V2 (smartVerb s ** {c = p}) ;
mkV2 : V -> V2 -- any verb with direct object, e.g. "drink"
= \v -> lin V2 (v ** {c = []}) ;
mkV2 : V -> Str -> V2 -- any verb with preposition
= \v,p -> lin V2 (v ** {c = p}) ;
} ;
mkAdv : Str -> Adv
= \s -> lin Adv {s = s} ;
mkPrep : Str -> Prep
= \s -> lin Prep {s = s} ;
-}
}


@@ -1,62 +0,0 @@
resource MicroResIta = {
param
-- define types of morphological parameters
Number = Sg | Pl ;
Gender = Masc | Fem ;
oper
-- define types for parts of speech
-- they are record types with tables and inherent features
Noun : Type = {s : Number => Str ; g : Gender} ;
Adjective : Type = {s : Gender => Number => Str} ;
-- here is an example that is type-correct as a Noun
donna_N : Noun = {
s = table {Sg => "donna" ; Pl => "donne"} ;
g = Fem
} ;
-- define constructor function for Noun
mkNoun : Str -> Str -> Gender -> Noun = \sg, pl, g -> {
s = table {Sg => sg ; Pl => pl} ;
g = g
} ;
-- define a noun using this constructor
uomo_N : Noun = mkNoun "uomo" "uomini" Masc ;
-- define a smart paradigm
smartNoun : Str -> Noun = \s -> case s of {
x + "o" => mkNoun s (x + "i") Masc ;
x + "a" => mkNoun s (x + "e") Fem ;
x + "e" => mkNoun s (x + "i") Masc ;
_ => mkNoun s s Masc
} ;
-- the overloaded paradigm is what the lexicon will use
mkN = overload {
mkN : Str -> Noun = smartNoun ;
mkN : Str -> Str -> Gender -> Noun = mkNoun ;
mkN : Gender -> Noun -> Noun = \g, n -> n ** {g = g} ;
} ;
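-- Worked example (an added sketch, not part of the original file): what the
-- smart paradigm guesses, and how the gender-override instance of mkN can
-- correct it. The example* names are hypothetical. They can be inspected with
-- `i -retain MicroResIta.gf` followed by e.g. `cc exampleMele`.
oper
exampleMele : Str = (smartNoun "mela").s ! Pl ; -- "mele", gender guessed as Fem
exampleBambini : Str = (smartNoun "bambino").s ! Pl ; -- "bambini", gender guessed as Masc
-- nouns in "-e" are guessed as Masc, so a feminine one needs the override
exampleCanzone_N : Noun = mkN Fem (smartNoun "canzone") ; -- plural "canzoni", g = Fem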
-- adjectives:
mkAdjective : (msg,fsg,mpl,fpl : Str) -> Adjective = \msg,fsg,mpl,fpl -> {
s = table {
Masc => table {Sg => msg ; Pl => mpl} ;
Fem => table {Sg => fsg ; Pl => fpl}
}
} ;
smartAdjective : Str -> Adjective = \s -> case s of {
x + "o" => mkAdjective s (x + "a") (x + "i") (x + "e") ;
x + "e" => mkAdjective s s (x + "i") (x + "i") ;
_ => mkAdjective s s s s
} ;
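-- Worked example (an added sketch, not part of the original file): the two
-- productive branches of smartAdjective. The example* names are hypothetical.
oper
exampleCattive : Str = (smartAdjective "cattivo").s ! Fem ! Pl ; -- "cattive": four distinct forms
exampleGrandi : Str = (smartAdjective "grande").s ! Masc ! Pl ; -- "grandi": one form per number
exampleGrande : Str = (smartAdjective "grande").s ! Fem ! Sg ; -- "grande": same as the masculine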
mkA = overload {
mkA : Str -> Adjective = smartAdjective ;
mkA : (msg,fsg,mpl,fpl : Str) -> Adjective = mkAdjective ;
} ;
}

Binary file not shown.


@@ -1,57 +0,0 @@
--# -path=.:../abstract
concrete MicroLangEng of MicroLang = open MicroResEng in {
lincat
Utt = {s : Str} ;
S = {s : Str} ;
VP = {verb : MicroResEng.V ; compl : Str} ;
CN = MicroResEng.N ;
AP = MicroResEng.A ;
NP = MicroResEng.Pron ;
Pron = MicroResEng.Pron ;
N = MicroResEng.N ;
A = MicroResEng.A ;
V = MicroResEng.V ;
V2 = MicroResEng.V2 ;
lin
PredVPS np vp = {s = np.s ! Nom ++ selectVerb vp.verb np.n ++ vp.compl} ;
UseV v = {verb = v ; compl = []} ;
ComplV2 v np = {verb = v ; compl = np.s ! Acc} ;
AdjCN ap cn = {s = \\n => ap.s ++ cn.s ! n} ;
UsePron p = p ;
UseN n = n ;
PositA a = a ;
he_Pron = mkPron "he" "him" Sg ;
she_Pron = mkPron "she" "her" Sg ;
they_Pron = mkPron "they" "them" Pl ;
book_N = {s = table {Sg => "book" ; Pl => "books"}} ;
grammar_N = mkN "grammar" ;
woman_N = mkN "woman" "women" ;
child_N = mkN "child" "children" ;
boy_N = mkN "boy" ;
big_A = mkA "big" ;
good_A = mkA "good" ;
live_V = mkV "live" ;
love_V2 = mkV2 "love" ;
}


@@ -1,15 +0,0 @@
--# -path=.:../abstract
concrete MicroLangSwe of MicroLang = open MicroResSwe in {
lincat N = MicroResSwe.N ;
lin baby_N = decl2 "bebis" ;
lin dog_N = decl2 "hund" ;
lin man_N = worstN "man" "mannen" "män" "männen" Utr ;
lin car_N = decl2 "bil" ;
--lin city_N = mkN "stad" ;
lin boy_N = decl2 "pojke" ;
}


@@ -1,124 +0,0 @@
resource MicroResEng = open Prelude in {
param
Number = Sg | Pl ;
Case = Nom | Acc ;
Agreement = Agr Number ; ---s Person to be added
-- all forms of normal Eng verbs, although not yet used in MiniGrammar
VForm = Inf | PresSg3 | Past | PastPart | PresPart ;
oper
N : Type = {s : Number => Str} ;
worstN : Str -> Str -> N = \sg,pl -> {
s = table {Sg => sg ; Pl => pl}
} ;
regN : Str -> N = \sg -> worstN sg (sg + "s") ;
-- smart paradigm
smartN : Str -> N = \sg -> case sg of {
_ + ("ay"|"ey"|"oy"|"uy") => regN sg ;
x + "y" => worstN sg (x + "ies") ;
_ + ("ch"|"sh"|"s"|"o") => worstN sg (sg + "es") ;
_ => regN sg
} ;
A : Type = {s : Str} ;
V : Type = {s : VForm => Str} ;
mkVerb : (inf,pres,past,pastpart,prespart : Str) -> V
= \inf,pres,past,pastpart,prespart -> {
s = table {
Inf => inf ;
PresSg3 => pres ;
Past => past ;
PastPart => pastpart ;
PresPart => prespart
}
} ;
regV : (inf : Str) -> V = \inf ->
mkVerb inf (inf + "s") (inf + "ed") (inf + "ed") (inf + "ing") ;
-- regular verbs with predictable variations
smartV : Str -> V = \inf -> case inf of {
pl + ("a"|"e"|"i"|"o"|"u") + "y" => regV inf ;
cr + "y" => mkVerb inf (cr + "ies") (cr + "ied") (cr + "ied") (inf + "ing") ;
lov + "e" => mkVerb inf (inf + "s") (lov + "ed") (lov + "ed") (lov + "ing") ;
kis + ("s"|"sh"|"x"|"o") => mkVerb inf (inf + "es") (inf + "ed") (inf + "ed") (inf + "ing") ;
_ => regV inf
} ;
-- normal irregular verbs e.g. drink,drank,drunk
irregV : (inf,past,pastpart : Str) -> V =
\inf,past,pastpart ->
let verb = smartV inf
in mkVerb inf (verb.s ! PresSg3) past pastpart (verb.s ! PresPart) ;
-- two-place verb with "case" as preposition; for transitive verbs, c=[]
V2 : Type = V ** {c : Str} ;
be_V : V = mkVerb "are" "is" "was" "been" "being" ; ---s to be generalized
---s a very simplified verb agreement function for Micro
agr2vform : Agreement -> VForm = \a -> case a of {
Agr Sg => PresSg3 ;
Agr Pl => Inf
} ;
Pron : Type = {s : Case => Str ; n : Number} ;
mkPron : Str -> Str -> Number -> Pron = \nom,acc,n -> {s = table {Nom => nom ; Acc => acc} ; n = n} ;
selectVerb : V -> Number -> Str = \v,n -> case n of {
Sg => v.s ! PresSg3 ;
Pl => v.s ! Inf
} ;
---------------------------
-- Paradigms part ---------
---------------------------
oper
mkN = overload {
mkN : Str -> N -- predictable noun, e.g. car-cars, boy-boys, fly-flies, bush-bushes
= \n -> lin N (smartN n) ;
mkN : Str -> Str -> N -- irregular noun, e.g. man-men
= \sg,pl -> lin N (worstN sg pl) ;
} ;
mkA : Str -> A
= \s -> {s = s} ;
mkV = overload {
mkV : (inf : Str) -> V -- predictable verb, e.g. play-plays, cry-cries, wash-washes
= \s -> lin V (smartV s) ;
mkV : (inf,past,pastpart : Str) -> V -- irregular verb, e.g. drink-drank-drunk
= \inf,past,pastpart -> lin V (irregV inf past pastpart) ;
} ;
mkV2 = overload {
mkV2 : Str -> V2 -- predictable verb with direct object, e.g. "wash"
= \s -> lin V2 (smartV s ** {c = []}) ;
mkV2 : Str -> Str -> V2 -- predictable verb with preposition, e.g. "wait - for"
= \s,p -> lin V2 (smartV s ** {c = p}) ;
mkV2 : V -> V2 -- any verb with direct object, e.g. "drink"
= \v -> lin V2 (v ** {c = []}) ;
mkV2 : V -> Str -> V2 -- any verb with preposition
= \v,p -> lin V2 (v ** {c = p}) ;
} ;
-- mkAdv : Str -> Adv
-- = \s -> lin Adv {s = s} ;
-- mkPrep : Str -> Prep
-- = \s -> lin Prep {s = s} ;
}


@@ -1,36 +0,0 @@
resource MicroResSwe = open Prelude in {
param
Number = Sg | Pl ;
Species = Indef | Def ;
Gender = Utr | Neutr ;
oper
N : Type = {s : Number => Species => Str ; g : Gender} ;
worstN : Str -> Str -> Str -> Str -> Gender -> N
= \man,mannen,män,männen,g -> {
s = table {
Sg => table {Indef => man ; Def => mannen} ;
Pl => table {Indef => män ; Def => männen}
} ;
g = g
} ;
-- https://en.wikipedia.org/wiki/Swedish_grammar
decl1 : Str -> N
= \apa ->
let ap = init apa in
worstN apa (apa + "n") (ap + "or") (ap + "orna") Utr ;
decl2 : Str -> N
= \bil -> case bil of {
pojk + "e" => worstN bil (bil + "n") (pojk + "ar") (pojk + "arna") Utr ; -- pojke, pojken, pojkar, pojkarna
_ => worstN bil (bil + "en") (bil + "ar") (bil + "arna") Utr
} ;
}


@@ -1,8 +0,0 @@
gt -cat=N | l -list
gt -cat=A | l -list
gt -cat=V | l -list
gt -cat=V2 | l -list
gt -cat=Pron | l -list
gr -number=21 | l -treebank
pg -missing

116
lab3/README.md Normal file

@@ -0,0 +1,116 @@
# Lab 3: Universal Dependencies
This lab is divided into two parts.
In [part 1](#part-1-ud-annotation), you will create a small parallel UD treebank for English/Swedish and a language of your choice.
In [part 2](#part-2-ud-parsing), you will train a parsing model and evaluate it on your treebank.
## Part 1: UD annotation
The goal of this part of the lab is for you to become able to contribute to a UD annotation project. You will familiarize yourself with the CoNLL-U format and annotate your own parallel UD treebank.
### Step 1: familiarize yourself with the CoNLL-U format
Go to [universaldependencies.org](https://universaldependencies.org/) and download a treebank for a language of your choice.
Choose a short sentence (5-10 tokens) and a long one (>25 words) and convert each of them from CoNLL-U to a graphical tree by hand.
### Step 2: choose a corpus
Choose one of the two corpora provided in this folder:
- [`comp-syntax-corpus-english.txt`](comp-syntax-corpus-english.txt) is a combination of __English__ sentences from different sources, including [the Parallel UD treebank (PUD)](https://github.com/UniversalDependencies/UD_English-PUD/tree/master). If you want to cheat - or just check your answers - you can look for them in the official treebank. You can also compare your analyses with those of an automatic parser, such as [UDPipe](https://lindat.mff.cuni.cz/services/udpipe/), which you can try directly in your browser. These automatic analyses must of course be taken with a grain of salt.
- [`comp-syntax-corpus-swedish.txt`](comp-syntax-corpus-swedish.txt) consists of teacher-corrected sentences from the [__Swedish__ Learner Language (SweLL) corpus](https://spraakbanken.gu.se/en/resources/swell-gold), which is currently being annotated in UD for the first time.
In this case, there is no "gold standard" to check your answers against, but you can still compare your solutions with [UDPipe](https://lindat.mff.cuni.cz/services/udpipe/)'s automatic analyses.
In both corpora, the first few sentences are pre-tokenized and POS-tagged. Each token is in the form
`word:<UPOS>`.
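To reuse the pre-assigned tags when you start a CoNLL-U file, the tagged sentences can be split into form/tag pairs. The following is a minimal sketch, assuming that tokens are separated by whitespace and that the tag may be wrapped in angle brackets; `parse_tagged_sentence` is a hypothetical helper, not part of the lab material.

```python
def parse_tagged_sentence(line):
    """Split a pre-tagged sentence into (form, upos) pairs.

    Assumption: tokens are whitespace-separated and look like word:<UPOS>
    (angle brackets are stripped if present). The split happens at the
    LAST colon, so forms that contain a colon themselves stay intact.
    """
    pairs = []
    for token in line.split():
        form, sep, tag = token.rpartition(":")
        if not sep:
            # No colon at all: treat the token as untagged.
            pairs.append((token, "_"))
        else:
            pairs.append((form, tag.strip("<>")))
    return pairs


print(parse_tagged_sentence("the:<DET> world:<NOUN> changes:<VERB>"))
```

Untagged tokens get `_` as a placeholder, which is also the CoNLL-U convention for an unspecified field.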
### Step 3: annotate
For each sentence in the corpus, the annotation task consists of:
1. analyzing the sentence in UD
2. translating it to a language of your choice
3. analyzing your translation
The only required fields are `ID`, `FORM`, `UPOS`, `HEAD` and `DEPREL`.
In the end, you will submit two parallel CoNLL-U files, one containing the analyses of the source sentences and one for the analyses of the translations.
To produce the CoNLL-U files, you may work in your text editor (if you use Visual Studio Code, you can use the [vscode-conllu](https://marketplace.visualstudio.com/items?itemName=lgrobol.vscode-conllu) extension to get syntax highlighting) or use a dedicated annotation tool such as [Arborator](https://arborator.grew.fr/#/).
If you work in your text editor, it might be easier to first write a simplified CoNLL-U, with just the fields `ID`, `FORM`, `UPOS`, `HEAD` and `DEPREL`, separated by tabs, and then expand it to full CoNLL-U with [this script](https://gist.github.com/harisont/612a87d20f729aa3411041f873367fa2) (or similar).
Example:
`7 world NOUN 4 nmod`
expands to
`7 world _ NOUN _ _ 4 nmod _ _`
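The expansion shown above amounts to filling the five unused columns (`LEMMA`, `XPOS`, `FEATS`, `DEPS` and `MISC`) with underscores. A minimal sketch of that step, written independently of the linked script and assuming one tab-separated token per line:

```python
def expand_line(line):
    """Expand ID, FORM, UPOS, HEAD, DEPREL to the ten CoNLL-U columns."""
    if not line.strip() or line.startswith("#"):
        return line  # blank lines (sentence breaks) and comments pass through
    tok_id, form, upos, head, deprel = line.split("\t")
    # Column order: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
    return "\t".join([tok_id, form, "_", upos, "_", "_", head, deprel, "_", "_"])


print(expand_line("7\tworld\tNOUN\t4\tnmod"))
```

A line with more or fewer than five fields raises a `ValueError`, which is a convenient way to catch a missing tab early.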
We recommend that you annotate at least the first few sentences from scratch.
When you start feeling confident, you may pre-parse the remaining ones with UDPipe and manually correct the automatic annotation.
### Step 4: make sure your files match the CoNLL-U specification
Once you have full CoNLL-U, you can use [deptreepy](https://github.com/aarneranta/deptreepy/), [STUnD](https://harisont.github.io/STUnD/) or [the official online CoNLL-U viewer](https://universaldependencies.org/conllu_viewer.html) to visualize it.
With deptreepy, you will need to issue the command
`cat my-file.conllu | python deptreepy.py visualize_conllu > my-file.html`
which creates an HTML file you can open in your web browser.
If you can visualize your trees with any of these tools, it means that they are in valid CoNLL-U format.
If you want to check for more subtle errors, you can try to download and run [the official UD validator](https://github.com/UniversalDependencies/tools/blob/master/validate.py).
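Before running the full validator, a few structural properties can be checked with a short script. The sketch below is only a rough sanity check under simplifying assumptions (it ignores multiword-token ranges and empty nodes, and it does not validate tag or relation names); `check_sentence` is a hypothetical helper and no substitute for the official validator.

```python
def check_sentence(lines):
    """Return a list of problems found in the token lines of one sentence."""
    rows = [line.split("\t") for line in lines if line and not line.startswith("#")]
    bad = [row for row in rows if len(row) != 10]
    if bad:
        return [f"line starting with {row[0]!r}: expected 10 columns, found {len(row)}"
                for row in bad]
    # Only plain word lines (integer IDs) are checked further.
    words = [row for row in rows if row[0].isdigit()]
    ids = {row[0] for row in words}
    problems = []
    for row in words:
        head = row[6]
        if head != "0" and head not in ids:
            problems.append(f"token {row[0]}: HEAD {head} is not a token ID")
    roots = [row[0] for row in words if row[6] == "0"]
    if len(roots) != 1:
        problems.append(f"expected exactly one root, found {len(roots)}")
    return problems


sentence = [
    "1\tShe\t_\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\tsleeps\t_\tVERB\t_\t_\t0\troot\t_\t_",
]
print(check_sentence(sentence))
```

An empty list means none of these particular checks failed.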
Submit the two CoNLL-U files on Canvas.
## Part 2: UD parsing
In this part of the lab, you will train and evaluate a UD parsing + POS tagging model.
For better performance, you are strongly encouraged to use the MLTGPU server.
### Step 1: setting up MaChAmp
1. optional, but recommended: create a Python virtual environment with the command
```
python -m venv ENVNAME
```
and activate it with
`source ENVNAME/bin/activate` (Linux/macOS), or
`ENVNAME\Scripts\activate.bat` (Windows)
2. clone [the MaChAmp repository](https://github.com/machamp-nlp/machamp), move inside it and run
```
pip3 install -r requirements.txt
```
### Step 2: selecting the training and development data
Choose a UD treebank for one of the two languages you annotated in [part 1](#part-1-ud-annotation) and download it.
If you translated the corpus to a language that does not have a UD treebank, download a treebank for a related language (e.g. Italian if you annotated sentences in Sardinian).
If you are working on MLTGPU, you may choose a large treebank such as [Swedish-Talbanken](https://github.com/UniversalDependencies/UD_Swedish-Talbanken), which is already divided into a training, development and test split.
If you are working on your laptop and/or if your language does not have a lot of data available, you may want to use a smaller treebank, such as [Amharic-ATT](https://github.com/UniversalDependencies/UD_Amharic-ATT), which only comes with a test set.
In this case, split the test set into a training and a development portion (e.g. 80% of the sentences for training and 20% for development).
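One possible way to do the split is sketched below. It assumes that sentences are separated by blank lines, as in standard CoNLL-U, and that the file uses Unix line endings; the function names and the fixed random seed are illustrative choices, not requirements of the lab.

```python
import random


def read_sentences(text):
    """Split CoNLL-U text into sentence blocks (separated by blank lines)."""
    return [block for block in text.strip().split("\n\n") if block.strip()]


def split_treebank(text, train_share=0.8, seed=0):
    """Shuffle the sentences and return (train_text, dev_text)."""
    sentences = read_sentences(text)
    random.Random(seed).shuffle(sentences)  # fixed seed keeps the split reproducible
    cut = int(len(sentences) * train_share)
    train, dev = sentences[:cut], sentences[cut:]
    # Every CoNLL-U sentence, including the last one, ends with a blank line.
    return "\n\n".join(train) + "\n\n", "\n\n".join(dev) + "\n\n"


demo = "\n\n".join(
    f"# sent_id = {i}\n1\tword{i}\t_\tX\t_\t_\t0\troot\t_\t_" for i in range(10)
) + "\n\n"
train_text, dev_text = split_treebank(demo)
print(len(read_sentences(train_text)), len(read_sentences(dev_text)))
```

To use it on a real treebank, read the downloaded test file into a string, call `split_treebank`, and write the two results to separate `.conllu` files.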
### Step 3: training
Copy `compsyn.json` to `machamp/configs` and replace the training and development data paths with the paths to the files you selected/created in step 2.
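Editing the two paths by hand is enough, but the same change can be scripted. The sketch below assumes the configuration has the layout of the lab's config file (a top-level `compsyn` entry with `train_data_path` and `dev_data_path`); `set_data_paths` and the example file names are hypothetical.

```python
import json


def set_data_paths(config, train_path, dev_path, dataset="compsyn"):
    """Return a copy of the config with the two data paths filled in."""
    updated = json.loads(json.dumps(config))  # cheap deep copy of JSON-like data
    updated[dataset]["train_data_path"] = train_path
    updated[dataset]["dev_data_path"] = dev_path
    return updated


template = {
    "compsyn": {
        "train_data_path": "PATH-TO-YOUR-TRAIN-SPLIT",
        "dev_data_path": "PATH-TO-YOUR-DEV-SPLIT",
    }
}
# The paths below are placeholders for the files chosen in step 2.
config = set_data_paths(template, "data/train.conllu", "data/dev.conllu")
print(json.dumps(config, indent=2))
```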
You can now train your model by running
```
python3 train.py --dataset_configs configs/compsyn.json --device N
```
from the MaChAmp folder.
If you are working on MLTGPU, replace `N` with `0` (GPU). If you are using your laptop or EDUSERV, replace it with `-1`, which instructs MaChAmp to train the model on the CPU.
### Step 4: evaluation
Run your newly trained model with
```
python predict.py logs/compsyn/DATE/model.pt PATH-TO-YOUR-PART1-TREEBANK predictions/OUTPUT-FILE-NAME.conllu --device N
```
and use the `machamp/scripts/misc/conll18_ud_eval.py` script to evaluate the system output against your annotations. You can run it as
```
python conll18_ud_eval.py PATH-TO-YOUR-PART1-TREEBANK predictions/OUTPUT-FILE-NAME.conllu
```
On Canvas, submit the training logs, the predictions and the output of `conll18_ud_eval.py`, along with a short text summarizing your considerations on the performance of the parser.
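When interpreting the scores, it may help to see how the two main attachment metrics are defined. The sketch below computes unlabeled and labeled attachment scores (UAS and LAS) under the simplifying assumption that the gold and predicted files contain exactly the same tokens in the same order; the evaluation script handles cases this sketch does not, so use the script for the numbers you submit.

```python
def read_arcs(conllu_text):
    """Collect (HEAD, DEPREL) for every plain word line in CoNLL-U text."""
    arcs = []
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            arcs.append((cols[6], cols[7]))
    return arcs


def attachment_scores(gold_text, pred_text):
    """Return (UAS, LAS), assuming identical tokenization in both inputs."""
    gold, pred = read_arcs(gold_text), read_arcs(pred_text)
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and prediction must have the same, non-zero number of words")
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / len(gold)  # head only
    las = sum(g == p for g, p in zip(gold, pred)) / len(gold)        # head and label
    return uas, las


gold = "1\tShe\t_\tPRON\t_\t_\t2\tnsubj\t_\t_\n2\tsleeps\t_\tVERB\t_\t_\t0\troot\t_\t_\n"
pred = "1\tShe\t_\tPRON\t_\t_\t2\tobj\t_\t_\n2\tsleeps\t_\tVERB\t_\t_\t0\troot\t_\t_\n"
print(attachment_scores(gold, pred))
```

In this toy example both heads are correct but one label is wrong, so UAS is higher than LAS.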

17
lab3/machamp_config.json Normal file

@@ -0,0 +1,17 @@
{
"compsyn": {
"train_data_path": "PATH-TO-YOUR-TRAIN-SPLIT",
"dev_data_path": "PATH-TO-YOUR-DEV-SPLIT",
"word_idx": 1,
"tasks": {
"upos": {
"task_type": "seq",
"column_idx": 3
},
"dependency": {
"task_type": "dependency",
"column_idx": 6
}
}
}
}