8 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Aarne Ranta | f5015adb9a | most of the lecture 5 syntax | 2026-04-20 12:05:46 +02:00 |
| Aarne Ranta | 3449c442fa | started lecture05 by copies from 04 to be modified | 2026-04-20 08:43:35 +02:00 |
| aarneranta | 4f6b4bd531 | lecture 4 code | 2026-04-15 12:02:57 +02:00 |
| aarneranta | 649cdcf448 | lecture 3 live coding | 2026-04-14 15:03:44 +02:00 |
| Aarne Ranta | 14ece60235 | lecture 2 examples | 2026-04-01 12:04:05 +02:00 |
| aarneranta | c0bc9c85f2 | lecture 1 examples | 2026-03-30 17:28:42 +02:00 |
| aarneranta | 9d0f650881 | last year's lecture material moved to directory 2025 | 2026-03-30 07:43:08 +02:00 |
| Arianna Masciolini | 088f52a0f6 | 2026 modifications to lab 3 | 2026-03-29 21:15:06 +02:00 |
52 changed files with 715 additions and 50 deletions

View File

@@ -12,28 +12,27 @@ Go to [universaldependencies.org](https://universaldependencies.org/) and downlo
Choose a short (5-10 tokens) and a long (>25 words) sentence and convert them from CoNLL-U to graphical trees by hand. Choose a short (5-10 tokens) and a long (>25 words) sentence and convert them from CoNLL-U to graphical trees by hand.
### Step 2: choose a corpus ### Step 2: choose a corpus
Choose one of the two corpora provided in this folder: Choose a corpus of 25+ sentences.
- [`comp-syntax-corpus-english.txt`](comp-syntax-corpus-english.txt) is a combination of __English__ sentences from different sources, including [the Parallel UD treebank (PUD)](https://github.com/UniversalDependencies/UD_English-PUD/tree/master). If you want to cheat - or just check your answers - you can look for them in the official treebank. You can also compare your analyses with those of an automatic parser, such as [UDPipe](https://lindat.mff.cuni.cz/services/udpipe/), which you can try directly in your browser. These automatic analyses must of course be taken with a grain of salt If you want to start with __English__, you can use [`comp-syntax-corpus-english.txt`](comp-syntax-corpus-english.txt), a combination of sentences from different sources, including [the Parallel UD treebank (PUD)](https://github.com/UniversalDependencies/UD_English-PUD/tree/master). If you want to cheat - or just check your answers - you can look for them in the official treebank. You can also compare your analyses with those of an automatic parser, such as [UDPipe](https://lindat.mff.cuni.cz/services/udpipe/), which you can try directly in your browser. These automatic analyses must of course be taken with a grain of salt. Note that the first few sentences of this corpus are pre-tokenized and POS-tagged. Each token is in the form `word:<UPOS>`.
- [`comp-syntax-corpus-swedish.txt`](comp-syntax-corpus-swedish.txt) consists of teacher-corrected sentences from the [__Swedish__ Learner Language (SweLL) corpus](https://spraakbanken.gu.se/en/resources/swell-gold), which is currently being annotated in UD for the first time.
In this case, there is no "gold standard" to check your answers against, but you can still compare your solutions with [UDPipe](https://lindat.mff.cuni.cz/services/udpipe/)'s automatic analyses.
In both corpora, the first few sentences are pre-tokenized and POS-tagged. Each token is in the form
`word:<UPOS>`. If you want to work with __Swedish__ and might be interested in contributing to an [official UD treebank](https://github.com/universaldependencies/UD_Swedish-SweLL), ask Arianna for [a sample of the Swedish Learner Language corpus](https://spraakbanken.gu.se/en/resources/swell).
If you have other data in mind that you think would be interesting to annotate in UD (not necessarily in English or Swedish), don't hesitate to bring it up during a lab session!
### Step 3: annotate ### Step 3: annotate
For each sentence in the corpus, the annotation task consists in: For each sentence in the corpus, the annotation task consists in:
1. analyzing the sentence in UD 1. analyzing the sentence in UD
2. translating it to a language of your choice 2. translating it to a language of your choice (as long as one of the two versions is in English or Swedish)
3. analyzing your translation 3. analyzing your translation
The only required fields are `ID`, `FORM`, `UPOS`, `HEAD` and `DEPREL`. The only required fields are `ID`, `FORM`, `UPOS`, `HEAD` and `DEPREL`.
In the end, you will submit two parallel CoNLL-U files, one containing the analyses of the source sentences and one for the analyses of the translations. In the end, you will submit two parallel CoNLL-U files, one containing the analyses of the source sentences and one for the analyses of the translations.
To produce the CoNLL-U files, you may work in your text editor (if you use Visual Studio Code, you can use the [vscode-conllu](https://marketplace.visualstudio.com/items?itemName=lgrobol.vscode-conllu) to get syntax highlighting), use a spreadsheet program and then export to TSV, or use a dedicated graphical annotation tool such as [Arborator](https://arborator.grew.fr/#/). To produce the CoNLL-U files, you may work in your text editor (you can usually get syntax highlighting by changing the extension to `.tsv`), use a spreadsheet program and then export to TSV, or use a dedicated graphical annotation tool such as [Arborator](https://arborator.grew.fr/#/) (helpful but unstable!).
If you work in your text editor, it might be easier to first write a simplified CoNLL-U, with just the fields `ID`, `FORM`, `UPOS`, `HEAD` and `DEPREL`, separated by tabs, and then expand it to full CoNLL-U with [this script](https://gist.github.com/harisont/612a87d20f729aa3411041f873367fa2) (or similar). If you work in your text editor, it might be easier to first write a simplified CoNLL-U, with just the fields `ID`, `FORM`, `UPOS`, `HEAD` and `DEPREL`, separated by tabs, and then expand it to full CoNLL-U with [this script](https://gist.github.com/harisont/612a87d20f729aa3411041f873367fa2) (or similar).
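The expansion step described above can be sketched in a few lines of Python. This is not the linked gist, only a minimal illustration of the idea; the function name `expand_simplified_conllu` is hypothetical, and the sketch assumes that every token line has exactly the five tab-separated fields `ID`, `FORM`, `UPOS`, `HEAD` and `DEPREL`, with all other CoNLL-U columns left as `_`.

```python
def expand_simplified_conllu(text: str) -> str:
    """Expand 5-column lines (ID, FORM, UPOS, HEAD, DEPREL) to 10-column CoNLL-U.

    Assumption: comment lines start with '#' and blank lines separate sentences.
    """
    out = []
    for line in text.splitlines():
        if not line.strip():
            out.append("")  # sentence separator
            continue
        if line.startswith("#"):
            out.append(line)  # comments pass through unchanged
            continue
        fields = line.split("\t")
        if len(fields) != 5:
            raise ValueError(f"expected 5 tab-separated fields, got {len(fields)}: {line!r}")
        tok_id, form, upos, head, deprel = fields
        # CoNLL-U column order: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        out.append("\t".join([tok_id, form, "_", upos, "_", "_", head, deprel, "_", "_"]))
    return "\n".join(out) + "\n"
```

The unfilled columns are written as `_`, which is how CoNLL-U marks an unspecified value.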
@@ -54,7 +53,7 @@ To fully comply with the CoNLL-U standard, comment lines should consist of key-v
# comment = your comment here # comment = your comment here
``` ```
but for this assigment lines like but for this assignment lines like
``` ```
# your comment here # your comment here
@@ -63,24 +62,14 @@ but for this assigment lines like
are perfectly acceptable too. are perfectly acceptable too.
### Step 4: make sure your files match the CoNLL-U specification ### Step 4: make sure your files match the CoNLL-U specification
Once you have full CoNLL-U, you can use [deptreepy](https://github.com/aarneranta/deptreepy/), [STUnD](https://harisont.github.io/STUnD/) or [the official online CoNLL-U viewer](https://universaldependencies.org/conllu_viewer.html) to visualize it. Check your treebank with the official UD validator.
With deptreepy, you will need to issue the command
`cat my-file.conllu | python deptreepy.py visualize_conllu > my-file.html`
which creates an HTML file you can open in your web browser.
If you can visualize your trees with any of these tools, that's a very good sign that your file _more or less_ matches the CoNLL-U format!
As a last step, validate your treebank with the official UD validator.
To do that, clone or download the [UD tools repository](https://github.com/UniversalDependencies/tools), move inside the corresponding folder and run To do that, clone or download the [UD tools repository](https://github.com/UniversalDependencies/tools), move inside the corresponding folder and run
``` ```
python validate.py PATH-TO-YOUR-TREEBANK.conllu --lang=2-LETTER-LANGCODE-FOR-YOUR-LANGUAGE --level=1 python validate.py PATH-TO-YOUR-TREEBANK.conllu --lang=2-LETTER-LANGCODE-FOR-YOUR-LANGUAGE --level=2
``` ```
If you want to check for more subtle errors, you can [go up a few levels](https://harisont.github.io/gfaqs.html#ud-validator). Level 2 should be enough for part 2, but you can [go up a few levels](https://harisont.github.io/gfaqs.html#ud-validator) to check for more subtle errors.
Submit the two CoNLL-U files on Canvas. Submit the two CoNLL-U files on Canvas.
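Before running the official validator, a quick structural sanity check can catch the most common slips. The sketch below is only an illustration and no substitute for `validate.py`; the helper names `read_sentences` and `check_sentence` are hypothetical, and it assumes a treebank without multiword tokens or empty nodes (so every `ID` is a plain integer).

```python
def read_sentences(text):
    """Split CoNLL-U text into sentences, each a list of field lists (comments skipped)."""
    sentences, current = [], []
    for line in text.splitlines():
        if line.startswith("#"):
            continue
        if not line.strip():
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(line.split("\t"))
    if current:
        sentences.append(current)
    return sentences


def check_sentence(rows):
    """Return a list of human-readable problems found in one sentence."""
    problems = []
    n = len(rows)
    roots = 0
    for i, fields in enumerate(rows, start=1):
        if len(fields) != 10:
            problems.append(f"token {i}: expected 10 fields, found {len(fields)}")
            continue
        tok_id, head, deprel = fields[0], fields[6], fields[7]
        if tok_id != str(i):
            problems.append(f"token {i}: ID is {tok_id!r}")
        if not head.isdigit() or int(head) > n:
            problems.append(f"token {i}: HEAD {head!r} is not in 0..{n}")
        elif int(head) == i:
            problems.append(f"token {i}: token is its own head")
        elif head == "0":
            roots += 1
            if deprel != "root":
                problems.append(f"token {i}: HEAD is 0 but DEPREL is {deprel!r}")
    if roots != 1:
        problems.append(f"expected exactly one root, found {roots}")
    return problems
```

An empty list means the sentence passed these basic checks; the official validator still has the final word.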
@@ -91,7 +80,7 @@ If you want to install MaChAmp on your own computer, keep in mind that very old
For more information, see [here](https://github.com/machamp-nlp/machamp/issues/42). For more information, see [here](https://github.com/machamp-nlp/machamp/issues/42).
### Step 1: setting up MaChAmp ### Step 1: setting up MaChAmp
1. optional, but recommended: create a Python virtual environment with the command 1. create a Python virtual environment with the command
``` ```
python -m venv ENVNAME python -m venv ENVNAME
``` ```
@@ -124,7 +113,7 @@ python scripts/misc/cleanconl.py PATH-TO-A-DATASET-SPLIT
This replaces the contents of your input file with a "cleaned up" version of the same treebank. This replaces the contents of your input file with a "cleaned up" version of the same treebank.
### Step 3: training ### Step 3: training
Copy `compsyn.json` to `machamp/configs` and replace the traning and development data paths with the paths to the files you selected/created in step 2. Copy `compsyn.json` to `machamp/configs` and replace the training and development data paths with the paths to the files you selected/created in step 2.
You can now train your model by running You can now train your model by running
@@ -152,4 +141,4 @@ Then, use the `machamp/scripts/misc/conll18_ud_eval.py` script to evaluate the s
python scripts/misc/conll18_ud_eval.py PATH-TO-YOUR-PART1-TREEBANK predictions/OUTPUT-FILE-NAME.conllu python scripts/misc/conll18_ud_eval.py PATH-TO-YOUR-PART1-TREEBANK predictions/OUTPUT-FILE-NAME.conllu
``` ```
On Canvas, submit the training logs, the predictions and the output of `conll18_ud_eval.py`, along with a short text summarizing your considerations on the performance of the parser, based on the predictions themselves and on the output of the results of the evaluation. On Canvas, submit the training logs, the predictions and the output of `conll18_ud_eval.py`, along with a short text summarizing your considerations on the performance of the parser, based on the predictions themselves and on the automatic evaluation.
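When reading the evaluation output, it can help to recall what the two attachment scores measure: UAS is the share of tokens with the correct `HEAD`, and LAS the share with both the correct `HEAD` and `DEPREL`. The sketch below illustrates that definition only; it is not how `conll18_ud_eval.py` is implemented. The function name `attachment_scores` is hypothetical, and the sketch assumes the gold and predicted files have identical tokenization, which the official script does not require.

```python
def attachment_scores(gold, pred):
    """Compute (UAS, LAS) from two aligned lists of (HEAD, DEPREL) pairs."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty treebank")
    total = len(gold)
    # UAS: the head matches; LAS: head and relation label both match.
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / total
    las = sum(g == p for g, p in zip(gold, pred)) / total
    return uas, las
```

For example, if a parser attaches every token to the right head but mislabels one of three relations, UAS is 1.0 while LAS drops to 2/3.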

View File

@@ -6,7 +6,7 @@ The:<DET> study:<NOUN> of:<ADP> volcanoes:<NOUN> is:<AUX> called:<VERB> volcanol
It:<PRON> was:<AUX> conducted:<VERB> just:<ADV> off:<ADP> the:<DET> Mexican:<ADJ> coast:<NOUN> from:<ADP> April:<PROPN> to:<ADP> June:<PROPN> .:<PUNCT> It:<PRON> was:<AUX> conducted:<VERB> just:<ADV> off:<ADP> the:<DET> Mexican:<ADJ> coast:<NOUN> from:<ADP> April:<PROPN> to:<ADP> June:<PROPN> .:<PUNCT>
":<PUNCT> Her:<PRON> voice:<NOUN> literally:<ADV> went:<VERB> around:<ADP> the:<DET> world:<NOUN> ,:<PUNCT> ":<PUNCT> Leive:<PROPN> said:<VERB> .:<PUNCT> ":<PUNCT> Her:<PRON> voice:<NOUN> literally:<ADV> went:<VERB> around:<ADP> the:<DET> world:<NOUN> ,:<PUNCT> ":<PUNCT> Leive:<PROPN> said:<VERB> .:<PUNCT>
A:<DET> witness:<NOUN> told:<VERB> police:<NOUN> that:<SCONJ> the:<DET> victim:<NOUN> had:<AUX> attacked:<VERB> the:<DET> suspect:<NOUN> in:<ADP> April:<PROPN> .:<PUNCT> A:<DET> witness:<NOUN> told:<VERB> police:<NOUN> that:<SCONJ> the:<DET> victim:<NOUN> had:<AUX> attacked:<VERB> the:<DET> suspect:<NOUN> in:<ADP> April:<PROPN> .:<PUNCT>
It:<PRON> 's:<AUX> most:<ADV> obvious:<ADJ> when:<SSUBJ> a:<DET> celebrity:<NOUN> 's:<PART> name:<NOUN> is:<AUX> initially:<ADV> quite:<ADV> rare:<ADJ> .:<PUNCT> It:<PRON> 's:<AUX> most:<ADV> obvious:<ADJ> when:<SCONJ> a:<DET> celebrity:<NOUN> 's:<PART> name:<NOUN> is:<AUX> initially:<ADV> quite:<ADV> rare:<ADJ> .:<PUNCT>
This:<PRON> has:<AUX> not:<PART> stopped:<VERB> investors:<NOUN> flocking:<VERB> to:<PART> put:<VERB> their:<PRON> money:<NOUN> in:<ADP> the:<DET> funds:<NOUN> .:<PUNCT> This:<PRON> has:<AUX> not:<PART> stopped:<VERB> investors:<NOUN> flocking:<VERB> to:<PART> put:<VERB> their:<PRON> money:<NOUN> in:<ADP> the:<DET> funds:<NOUN> .:<PUNCT>
This:<DET> discordance:<NOUN> between:<ADP> economic:<ADJ> data:<NOUN> and:<CCONJ> political:<ADJ> rhetoric:<NOUN> is:<AUX> familiar:<ADJ> ,:<PUNCT> or:<CCONJ> should:<AUX> be:<AUX> .:<PUNCT> This:<DET> discordance:<NOUN> between:<ADP> economic:<ADJ> data:<NOUN> and:<CCONJ> political:<ADJ> rhetoric:<NOUN> is:<AUX> familiar:<ADJ> ,:<PUNCT> or:<CCONJ> should:<AUX> be:<AUX> .:<PUNCT>
The:<DET> feasibility:<NOUN> study:<NOUN> estimates:<VERB> that:<SCONJ> it:<PRON> would:<AUX> take:<VERB> passengers:<NOUN> about:<ADV> four:<NUM> minutes:<NOUN> to:<PART> cross:<VERB> the:<DET> Potomac:<PROPN> River:<PROPN> on:<ADP> the:<DET> gondola:<NOUN> .:<PUNCT> The:<DET> feasibility:<NOUN> study:<NOUN> estimates:<VERB> that:<SCONJ> it:<PRON> would:<AUX> take:<VERB> passengers:<NOUN> about:<ADV> four:<NUM> minutes:<NOUN> to:<PART> cross:<VERB> the:<DET> Potomac:<PROPN> River:<PROPN> on:<ADP> the:<DET> gondola:<NOUN> .:<PUNCT>
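The pre-tagged sentences in this corpus use the `word:<UPOS>` token format, which is easy to read back into (form, tag) pairs. Below is a minimal sketch, assuming tokens are separated by whitespace as in the English file; the function name `parse_tagged_sentence` is hypothetical. Splitting on the last `:<` keeps punctuation tokens such as `":<PUNCT>` intact.

```python
def parse_tagged_sentence(line):
    """Turn 'The:<DET> study:<NOUN>' into [('The', 'DET'), ('study', 'NOUN')]."""
    pairs = []
    for token in line.split():
        # Split on the LAST ':<' so that forms containing ':' survive.
        word, sep, tag = token.rpartition(":<")
        if not sep or not tag.endswith(">"):
            raise ValueError(f"not in word:<UPOS> form: {token!r}")
        pairs.append((word, tag[:-1]))
    return pairs
```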

View File

@@ -1,24 +0,0 @@
Jag:<PRON> tycker:<VERB> att:<SCONJ> du:<PRON> ska:<AUX> börja:<VERB> med:<ADP> en:<DET> språkkurs:<NOUN>.:<PUNCT>
Flerspråkighet:<NOUN> gynnar:<VERB> oss:<PRON> även:<ADV> på:<ADP> arbetsmarknaden:<NOUN>.:<PUNCT>
Språket:<NOUN> är:<AUX> lätt:<ADJ> och:<CCONJ> jag:<PRON> kan:<AUX> läsa:<VERB> utan:<ADP> något:<DET> problem:<NOUN>.:<PUNCT>
Man:<PRON> känner:<VERB> sig:<PRON> ensam:<ADJ> när:<SCONJ> man:<PRON> inte:<PART> kan:<AUX> prata:<VERB> språket:<NOUN> bra:<ADV>.:<PUNCT>
Det:<PRON> kan:<AUX> vara:<AUX> kroppsspråk:<NOUN> men:<CCONJ> främst:<ADV> sker:<VERB> det:<PRON> genom:<ADP> talet:<NOUN>.:<PUNCT>
Språket:<NOUN> är:<AUX> nyckeln:<NOUN> till:<ADP> alla:<DET> låsta:<ADJ> dörrar:<NOUN>,:<PUNCT> har:<AUX> vi:<PRON> hört:<VERB> flera:<ADJ> gånger:<NOUN>.:<PUNCT>
Att:<PART> kunna:<VERB> ett:<DET> språk:<NOUN> är:<AUX> en:<DET> av:<ADP> de:<DET> viktigaste:<ADJ> och:<CCONJ> värdefullaste:<ADJ> egenskaper:<NOUN> en:<DET> människa:<NOUN> kan:<AUX> ha:<VERB> så:<SCONJ> det:<PRON> är:<AUX> värt:<ADJ> mer:<ADV> än:<ADP> vad:<PRON> man:<PRON> tror:<VERB>.:<PUNCT>
Med:<ADP> andra:<ADJ> ord:<NOUN>,:<PUNCT> språket:<NOUN> är:<AUX> nyckeln:<NOUN> till:<ADP> alla:<DET> låsta:<ADJ> dörrar:<NOUN>,:<PUNCT> men:<CCONJ> det:<PRON> finns:<VERB> viktigare:<ADJ> saker:<NOUN> att:<PART> satsa:<VERB> på:<ADP> som:<PRON> jag:<PRON> kommer:<AUX> att:<PART> nämna:<VERB> längre:<ADV> ner:<ADV>.:<PUNCT>
Han:<PRON> kom:<VERB> till:<ADP> Sverige:<PROPN> för:<ADP> 4:<NUM> år:<NOUN> sedan:<ADV>,:<PUNCT> han:<PRON> kunde:<AUX> inte:<PART> tala:<VERB> svenska:<ADJ> språket:<NOUN>,:<PUNCT> ingen:<DET> engelska:<NOUN>,:<PUNCT> han:<PRON> kunde:<AUX> i:<ADP> princip:<NOUN> inte:<PART> kommunicera:<VERB> med:<ADP> någon:<PRON> här:<ADV>.:<PUNCT>
För:<ADP> det:<DET> första:<ADJ> hänger:<VERB> språket:<NOUN> ihop:<ADV> med:<ADP> tillhörighet:<NOUN>,:<PUNCT> särskilt:<ADV> för:<ADP> de:<DET> nya:<ADJ> invandrare:<NOUN> som:<PRON> har:<AUX> bestämt:<VERB> sig:<PRON> för:<ADP> att:<PART> flytta:<VERB> och:<CCONJ> bosätta:<VERB> sig:<PRON> i:<ADP> Sverige:<PROPN>.:<PUNCT>
Om:<SCONJ> alla:<PRON> hade:<AUX> talat:<VERB> samma:<DET> språk:<NOUN> hade:<AUX> det:<PRON> förmodligen:<ADV> inte:<PART> funnits:<VERB> något:<DET> utanförskap:<NOUN>,:<PUNCT> utan:<CCONJ> man:<PRON> hade:<AUX> fått:<VERB> en:<DET> typ:<NOUN> av:<ADP> gemenskap:<NOUN> där:<ADV> man:<PRON> delar:<VERB> samma:<DET> kultur:<NOUN>.:<PUNCT>
Att:<PART> lära:<VERB> sig:<PRON> ett:<DET> språk:<NOUN> är:<AUX> väldigt:<ADV> svårt:<ADJ>,:<PUNCT> speciellt:<ADV> för:<ADP> vuxna:<ADJ> människor:<NOUN>,:<PUNCT> och:<CCONJ> eftersom:<SCONJ> majoritetsspråket:<NOUN> blir:<VERB> en:<DET> viktig:<ADJ> del:<NOUN> i:<ADP> en:<DET> persons:<NOUN> liv:<NOUN> räcker:<VERB> det:<PRON> inte:<PART> att:<PART> tala:<VERB> det:<PRON> på:<ADP> söndagar:<NOUN> utan:<CCONJ> det:<PRON> måste:<AUX> läras:<VERB> in:<PART> som:<SCONJ> ett:<DET> modersmål:<NOUN>,:<PUNCT> vilket:<PRON> finansieras:<VERB> av:<ADP> oss:<PRON> skattebetalare:<NOUN>.:<PUNCT>
Avslutningsvis så vill jag förmedla att vi bör rädda världen innan språken.
Språket är ganska enkelt, och det är lätt att förstå vad romanen handlar om.
Det är även kostsamt för staten att se till att dessa minoritetsspråk lever kvar.
Låt mig säga att det är inte för sent att rädda de små språken, vi måste ta steget nu.
Att hålla dessa minoritetsspråk vid liv är både slöseri med tid och mycket ekonomiskt krävande.
Jag tackar alla lärare på Sfi som hjälper oss för att vi ska kunna bli bättre på svenska språket.
Språk skapades för flera tusen år sedan och vissa språk har tynat bort medan några nya har skapats.
Samhället behöver flerspråkiga och vägen till kommunikation och till att begripa andras kulturer är ett språk.
Om man kan fler språk har man fler möjligheter att använda sig av dem vilket leder till utveckling.
Därför tycker jag att vi bör införa ett förbud mot främmande språk i statliga myndigheter och föreningar.
Men jag anser först och främst att språket är som själen, det som ger oss livskraft, säregenhet och karaktär.
På Sveriges riksdags hemsida kan man läsa om hur Sverige bidrar med att skydda dessa språk med hjälp av statligt bidrag.

View File

@@ -0,0 +1,13 @@
S ::= NP VP ;
VP ::= V NP ;
NP ::= Det CN ;
CN ::= A CN ;
CN ::= N ;
NP ::= Pron ;
Det ::= "the" ;
A ::= "black" ;
N ::= "cat" ;
V ::= "sees" ;
Pron ::= "us" ;
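To see what this grammar accepts, here is a small top-down recognizer sketched in Python. It is an illustration only, not part of the course material; the names `RULES`, `ends` and `recognizes` are hypothetical. The approach relies on the grammar having no left recursion, which holds for the rules above.

```python
# The BNF rules above as a dictionary: nonterminal -> list of right-hand sides.
RULES = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["Det", "CN"], ["Pron"]],
    "CN": [["A", "CN"], ["N"]],
    "Det": [["the"]],
    "A": [["black"]],
    "N": [["cat"]],
    "V": [["sees"]],
    "Pron": [["us"]],
}


def ends(symbol, words, start):
    """Return the set of positions where `symbol` can end if it starts at `start`."""
    if symbol not in RULES:  # terminal: must match the next word
        return {start + 1} if start < len(words) and words[start] == symbol else set()
    result = set()
    for rhs in RULES[symbol]:
        positions = {start}
        for part in rhs:
            positions = {e for p in positions for e in ends(part, words, p)}
        result |= positions
    return result


def recognizes(sentence):
    """True if the whole sentence can be derived from S."""
    words = sentence.split()
    return len(words) in ends("S", words, 0)
```

With these rules, `recognizes("the black cat sees us")` holds, but so does `recognizes("us sees the cat")`: the grammar has no notion of case or agreement, so it overgenerates.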

View File

@@ -0,0 +1,19 @@
S ::= NP_nom_sg VP_sg ;
S ::= NP_nom_pl VP_pl ;
VP_sg ::= V_sg NP_acc ;
VP_pl ::= V_pl NP_acc ;
NP_nom_sg ::= Det CN ;
NP_acc ::= Det CN ;
CN ::= A CN ;
CN ::= N ;
NP_nom_pl ::= Pron_nom_pl ;
NP_acc ::= Pron_acc ;
Det ::= "the" ;
A ::= "black" ;
N ::= "cat" ;
V_sg ::= "sees" ;
V_pl ::= "see" ;
Pron_nom_pl ::= "we" ;
Pron_acc ::= "us" ;

View File

@@ -0,0 +1,21 @@
abstract Nano = {
cat
S ; NP ; VP ; CN ;
Det ; Pron ; A ; N ; V2 ;
fun
PredVPS : NP -> VP -> S ;
ComplV2 : V2 -> NP -> VP ;
DetCN : Det -> CN -> NP ;
AdjCN : A -> CN -> CN ;
UseCN : N -> CN ;
UsePron : Pron -> NP ;
the_Det : Det ;
black_A : A ;
cat_N : N ;
see_V2 : V2 ;
we_Pron : Pron ;
}

View File

@@ -0,0 +1,33 @@
concrete NanoEng of Nano = {
lincat
S = Str ;
NP, Pron = {s : Case => Str ; n : Number} ;
VP, V2 = Number => Str ;
CN, Det = Str ;
A, N = Str ;
lin
PredVPS np vp = np.s ! Nom ++ vp ! np.n ;
ComplV2 v2 np = table {n => v2 ! n ++ np.s ! Acc} ;
DetCN det cn =
{s = table {c => det ++ cn} ; n = Sg} ;
AdjCN a cn = a ++ cn ;
UseCN n = n ;
UsePron pron = pron ;
the_Det = "the" ;
black_A = "black" ;
cat_N = "cat" ;
see_V2 =
table {Sg => "sees" ; Pl => "see"} ;
we_Pron = {
s = table {Nom => "we" ; Acc => "us"} ;
n = Pl
} ;
param
Number = Sg | Pl ;
Case = Nom | Acc ;
}
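For readers new to GF, the tables and records in `NanoEng` can be mimicked with plain Python dictionaries. The sketch below is an analogy only and says nothing about how GF itself is implemented: a table such as `Number => Str` becomes a dict keyed by parameter value, a record becomes a dict keyed by field name, and `++` becomes concatenation with a space. The constants `NUMBERS` and `CASES` are helper names introduced here.

```python
NUMBERS = ("Sg", "Pl")   # param Number = Sg | Pl
CASES = ("Nom", "Acc")   # param Case = Nom | Acc

# Lexicon: plain strings, a table, and a record containing a table.
the_Det, black_A, cat_N = "the", "black", "cat"
see_V2 = {"Sg": "sees", "Pl": "see"}
we_Pron = {"s": {"Nom": "we", "Acc": "us"}, "n": "Pl"}


def UseCN(n):
    return n


def AdjCN(a, cn):
    return f"{a} {cn}"


def DetCN(det, cn):
    # Same string for every case; number is fixed to Sg, as in the GF rule.
    return {"s": {c: f"{det} {cn}" for c in CASES}, "n": "Sg"}


def UsePron(pron):
    return pron


def ComplV2(v2, np):
    # The object is always in the accusative; the verb form stays open.
    return {n: f"{v2[n]} {np['s']['Acc']}" for n in NUMBERS}


def PredVPS(np, vp):
    # The subject is nominative and selects the verb form by its number.
    return f"{np['s']['Nom']} {vp[np['n']]}"
```

Calling `PredVPS(UsePron(we_Pron), ComplV2(see_V2, DetCN(the_Det, UseCN(cat_N))))` gives `"we see the cat"`: the subject's number field picks the plural verb form, which the context-free version had to encode with separate categories.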

View File

@@ -0,0 +1,21 @@
concrete NanoIta of Nano = {
lincat
S, NP, VP, CN,
Det, Pron, A, N, V2 = Str ;
lin
PredVPS np vp = np ++ vp ;
ComplV2 v2 np = np ++ v2 ;
DetCN det cn = det ++ cn ;
AdjCN a cn = cn ++ a ;
UseCN n = n ;
UsePron pron = pron ;
the_Det = "il" ;
black_A = "nero" ;
cat_N = "gatto" ;
see_V2 = "vede" ;
we_Pron = "ci" ;
}

View File

@@ -0,0 +1,82 @@
resource MorphologyEng = {
-- to use:
-- i -retain MorphologyEng.gf
-- cc -table dog_N
param
Number = Sg | Pl ;
VerbForm = Inf | Pres3Sg | Past | PastPart | PresPart ;
oper
Noun : Type = {s : Number => Str} ;
-- constructor
mkNoun : (dog, dogs : Str) -> Noun
= \dog, dogs -> {
s = table {Sg => dog ; Pl => dogs}
} ;
regNoun : (dog : Str) -> Noun
= \dog -> mkNoun dog (dog + "s") ;
smartNoun : (noun : Str) -> Noun
= \noun -> case noun of {
b + ("a" | "e" | "o" | "u") + "y" => regNoun noun ;
bab + "y" => mkNoun noun (bab + "ies") ;
_ => regNoun noun
} ;
Verb : Type = {s : VerbForm => Str} ;
-- constructor; worst case paradigm
mkVerb : (sing, sings, sang, sung, singing : Str) -> Verb
= \sing, sings, sang, sung, singing -> {
s = table {
Inf => sing ;
Pres3Sg => sings ;
Past => sang ;
PastPart => sung ;
PresPart => singing
}
} ;
regVerb : (walk : Str) -> Verb
= \walk ->
mkVerb walk (walk + "s") (walk + "ed")
(walk + "ed") (walk + "ing") ;
smartVerb : (verb : Str) -> Verb
= \verb -> case verb of {
b + ("a" | "e" | "o" | "u") + "y" => regVerb verb ;
cr + "y" => mkVerb verb (cr + "ies")
(cr + "ied") (cr + "ied") (cr + "ying") ;
us + "e" => let used = us + "ed" in
mkVerb verb (verb + "s") used used (us + "ing") ;
wa + ("ch" | "sh" | "s" | "z" | "x") =>
mkVerb verb (verb + "es") (verb + "ed") (verb + "ed")
(verb + "ing") ;
_ => regVerb verb
} ;
irregVerb : (sing, sang, sung : Str) -> Verb
= \sing, sang, sung -> {s =
table {
Past => sang ;
PastPart => sung ;
x => (smartVerb sing).s ! x
}
} ;
-- lexicon
dog_N = mkNoun "dog" "dogs" ;
girl_N = mkNoun "girl" "girls" ;
house_N = regNoun "house" ;
}
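The pattern matching in `smartNoun` can be paraphrased in Python for comparison. This is a sketch of the same three cases, tried in the same order, with hypothetical snake_case names; it is not generated from the GF code.

```python
import re


def mk_noun(sg, pl):
    """Worst-case constructor: both forms given explicitly."""
    return {"Sg": sg, "Pl": pl}


def reg_noun(noun):
    """Regular plural: add -s."""
    return mk_noun(noun, noun + "s")


def smart_noun(noun):
    # Case 1: one of a/e/o/u followed by final y -> regular plural (boy, boys).
    if re.fullmatch(r".*[aeou]y", noun):
        return reg_noun(noun)
    # Case 2: any other final y -> drop y, add -ies (baby, babies).
    if noun.endswith("y"):
        return mk_noun(noun, noun[:-1] + "ies")
    # Default: regular plural.
    return reg_noun(noun)
```

As in the GF original, there is no rule for sibilant endings, so irregular or `-es` plurals still need the two-argument constructor.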

View File

@@ -0,0 +1,105 @@
--# -path=.:../abstract
concrete MicroLangMyeng of MicroLang = open MicroResMyeng, Prelude in {
lincat
V = Verb ;
V2 = Verb2 ;
A = Adjective ;
N = Noun ;
Adv = Adverb ;
-----------------------------------------------------
---------------- Lexicon part -----------------------
-----------------------------------------------------
lin already_Adv = mkAdv "already" ;
lin animal_N = mkN "animal" ;
lin apple_N = mkN "apple" ;
lin baby_N = mkN "baby" ;
lin bad_A = mkA "bad" ;
lin beer_N = mkN "beer" ;
lin big_A = mkA "big" ;
lin bike_N = mkN "bike" ;
lin bird_N = mkN "bird" ;
lin black_A = mkA "black" ;
lin blood_N = mkN "blood" ;
lin blue_A = mkA "blue" ;
lin boat_N = mkN "boat" ;
lin book_N = mkN "book" ;
lin boy_N = mkN "boy" ;
lin bread_N = mkN "bread" ;
lin break_V2 = mkV2 (mkV "break" "broke" "broken") ;
lin buy_V2 = mkV2 (mkV "buy" "bought" "bought") ;
lin car_N = mkN "car" ;
lin cat_N = mkN "cat" ;
lin child_N = mkN "child" "children" ;
lin city_N = mkN "city" ;
lin clean_A = mkA "clean" ;
lin clever_A = mkA "clever" ;
lin cloud_N = mkN "cloud" ;
lin cold_A = mkA "cold" ;
lin come_V = mkV "come" "came" "come" ;
lin computer_N = mkN "computer" ;
lin cow_N = mkN "cow" ;
lin dirty_A = mkA "dirty" ;
lin dog_N = mkN "dog" ;
lin drink_V2 = mkV2 (mkV "drink" "drank" "drunk") ;
lin eat_V2 = mkV2 (mkV "eat" "ate" "eaten") ;
lin find_V2 = mkV2 (mkV "find" "found" "found") ;
lin fire_N = mkN "fire" ;
lin fish_N = mkN "fish" "fish" ;
lin flower_N = mkN "flower" ;
lin friend_N = mkN "friend" ;
lin girl_N = mkN "girl" ;
lin good_A = mkA "good" ;
lin go_V = mkV "go" "goes" "went" "gone" "going" ;
lin grammar_N = mkN "grammar" ;
lin green_A = mkA "green" ;
lin heavy_A = mkA "heavy" ;
lin horse_N = mkN "horse" ;
lin hot_A = mkA "hot" ;
lin house_N = mkN "house" ;
-- lin john_PN = mkPN "John" ;
lin jump_V = mkV "jump" ;
lin kill_V2 = mkV2 "kill" ;
-- lin know_VS = mkVS (mkV "know" "knew" "known") ;
lin language_N = mkN "language" ;
lin live_V = mkV "live" ;
lin love_V2 = mkV2 (mkV "love") ;
lin man_N = mkN "man" "men" ;
lin milk_N = mkN "milk" ;
lin music_N = mkN "music" ;
lin new_A = mkA "new" ;
lin now_Adv = mkAdv "now" ;
lin old_A = mkA "old" ;
-- lin paris_PN = mkPN "Paris" ;
lin play_V = mkV "play" ;
lin read_V2 = mkV2 (mkV "read" "read" "read") ;
lin ready_A = mkA "ready" ;
lin red_A = mkA "red" ;
lin river_N = mkN "river" ;
lin run_V = mkV "run" "runs" "ran" "run" "running" ;
lin sea_N = mkN "sea" ;
lin see_V2 = mkV2 (mkV "see" "saw" "seen") ;
lin ship_N = mkN "ship" ;
lin sleep_V = mkV "sleep" "slept" "slept" ;
lin small_A = mkA "small" ;
lin star_N = mkN "star" ;
lin swim_V = mkV "swim" "swims" "swam" "swum" "swimming" ;
lin teach_V2 = mkV2 (mkV "teach" "taught" "taught") ;
lin train_N = mkN "train" ;
lin travel_V = mkV "travel" ;
lin tree_N = mkN "tree" ;
lin understand_V2 = mkV2 (mkV "understand" "understood" "understood") ;
lin wait_V2 = mkV2 "wait" "for" ;
lin walk_V = mkV "walk" ;
lin warm_A = mkA "warm" ;
lin water_N = mkN "water" ;
lin white_A = mkA "white" ;
lin wine_N = mkN "wine" ;
lin woman_N = mkN "woman" "women" ;
lin yellow_A = mkA "yellow" ;
lin young_A = mkA "young" ;
}

View File

@@ -0,0 +1,115 @@
resource MicroResMyeng = {
------------------------------
-- API: overloaded paradigms
oper
mkN = overload {
mkN : (baby : Str) -> Noun
= \baby -> smartNoun baby ;
mkN : (man, men : Str) -> Noun
= \man, men -> mkNoun man men ;
} ;
mkA : Str -> Adjective
= \adj -> {s = adj} ;
mkAdv : Str -> Adverb
= \adv -> {s = adv} ;
mkV = overload {
mkV : (try : Str) -> Verb
= \try -> smartVerb try ;
mkV : (go, went, gone : Str) -> Verb
= \go, went, gone -> irregVerb go went gone ;
mkV : (sing, sings, sang, sung, singing : Str) -> Verb
= \sing, sings, sang, sung, singing ->
mkVerb sing sings sang sung singing ;
} ;
mkV2 = overload {
mkV2 : (kill : Str) -> Verb2
= \kill -> mkV kill ** {prep = ""} ;
mkV2 : (wait, for : Str) -> Verb2
= \wait, for -> mkV wait ** {prep = for} ;
mkV2 : Verb -> Verb2
= \verb -> verb ** {prep = ""} ;
} ;
------------------------------
param
Number = Sg | Pl ;
VerbForm = Inf | Pres3Sg | Past | PastPart | PresPart ;
oper
Noun : Type = {s : Number => Str} ;
-- constructor
mkNoun : (dog, dogs : Str) -> Noun
= \dog, dogs -> {
s = table {Sg => dog ; Pl => dogs}
} ;
regNoun : (dog : Str) -> Noun
= \dog -> mkNoun dog (dog + "s") ;
smartNoun : (noun : Str) -> Noun
= \noun -> case noun of {
b + ("a" | "e" | "o" | "u") + "y" => regNoun noun ;
bab + "y" => mkNoun noun (bab + "ies") ;
_ => regNoun noun
} ;
Adjective : Type = {s : Str} ;
Adverb : Type = {s : Str} ;
Verb : Type = {s : VerbForm => Str} ;
-- constructor; worst case paradigm
mkVerb : (sing, sings, sang, sung, singing : Str) -> Verb
= \sing, sings, sang, sung, singing -> {
s = table {
Inf => sing ;
Pres3Sg => sings ;
Past => sang ;
PastPart => sung ;
PresPart => singing
}
} ;
regVerb : (walk : Str) -> Verb
= \walk ->
mkVerb walk (walk + "s") (walk + "ed")
(walk + "ed") (walk + "ing") ;
smartVerb : (verb : Str) -> Verb
= \verb -> case verb of {
b + ("a" | "e" | "o" | "u") + "y" => regVerb verb ;
cr + "y" => mkVerb verb (cr + "ies")
(cr + "ied") (cr + "ied") (cr + "ying") ;
refer + "ee" => let refereed = refer + "eed" in
mkVerb verb (verb + "s") refereed refereed (verb + "ing") ;
us + "e" => let used = us + "ed" in
mkVerb verb (verb + "s") used used (us + "ing") ;
wa + ("ch" | "sh" | "s" | "z" | "x") =>
mkVerb verb (verb + "es") (verb + "ed") (verb + "ed")
(verb + "ing") ;
_ => regVerb verb
} ;
irregVerb : (sing, sang, sung : Str) -> Verb
= \sing, sang, sung -> {s =
table {
Past => sang ;
PastPart => sung ;
x => (smartVerb sing).s ! x
}
} ;
Verb2 : Type = {s : VerbForm => Str ; prep : Str} ;
}
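The verb paradigms in this resource module follow the same idea: try the most specific pattern first and fall back to the regular paradigm. A Python paraphrase of `smartVerb` and `irregVerb`, written as a sketch with hypothetical snake_case names, makes the order of the cases explicit, including the `ee` case that must come before the plain `e` case.

```python
import re


def mk_verb(inf, pres3sg, past, pastpart, prespart):
    """Worst-case paradigm: all five forms given explicitly."""
    return {"Inf": inf, "Pres3Sg": pres3sg, "Past": past,
            "PastPart": pastpart, "PresPart": prespart}


def reg_verb(verb):
    return mk_verb(verb, verb + "s", verb + "ed", verb + "ed", verb + "ing")


def smart_verb(verb):
    if re.fullmatch(r".*[aeou]y", verb):        # play -> plays, played
        return reg_verb(verb)
    if verb.endswith("y"):                      # try -> tries, tried, trying
        stem = verb[:-1]
        return mk_verb(verb, stem + "ies", stem + "ied", stem + "ied", stem + "ying")
    if verb.endswith("ee"):                     # referee -> refereed, refereeing
        past = verb + "d"
        return mk_verb(verb, verb + "s", past, past, verb + "ing")
    if verb.endswith("e"):                      # use -> used, using
        stem = verb[:-1]
        return mk_verb(verb, verb + "s", stem + "ed", stem + "ed", stem + "ing")
    if verb.endswith(("ch", "sh", "s", "z", "x")):  # teach -> teaches
        return mk_verb(verb, verb + "es", verb + "ed", verb + "ed", verb + "ing")
    return reg_verb(verb)


def irreg_verb(inf, past, pastpart):
    """Override only the past forms; everything else comes from smart_verb."""
    forms = smart_verb(inf)
    forms["Past"] = past
    forms["PastPart"] = pastpart
    return forms
```

Without the `ee` case, `irreg_verb("see", "saw", "seen")` would inherit the present participle from the plain `e` rule and produce "seing" instead of "seeing".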

View File

@@ -0,0 +1,160 @@
--# -path=.:../abstract
concrete MicroLangMyeng of MicroLang = open MicroResMyeng, Prelude in {
lincat
Utt = {s : Str} ;
S = {s : Str} ;
NP = {s : Case => Str ; n : Number} ;
VP = {s : Number => Str} ;
CN = {s : Number => Str} ;
Comp = {s : Str} ;
AP = {s : Str} ;
Det = {s : Str ; n : Number} ;
Prep = {s : Str} ;
Pron = {s : Case => Str ; n : Number} ;
V = Verb ;
V2 = Verb2 ;
A = Adjective ;
N = Noun ;
Adv = Adverb ;
lin
UttS s = s ;
UttNP np = {s = np.s ! Nom} ;
PredVPS np vp = {s = np.s ! Nom ++ vp.s ! np.n} ;
-- Verb
UseV verb = {s = \\n => presentVerb verb n} ;
ComplV2 verb np = {s = \\n => presentVerb verb n ++ verb.prep ++ np.s ! Acc} ;
UseComp comp = {s = \\n => copula n ++ comp.s} ;
CompAP ap = ap ;
AdvVP vp adv = {s = \\n => vp.s ! n ++ adv.s} ;
-- Noun
DetCN det cn = {s = \\c => det.s ++ cn.s ! det.n ; n = det.n} ;
UsePron pron = pron ;
a_Det = {s = "a" ; n = Sg} ;
aPl_Det = {s = "" ; n = Pl} ;
the_Det = {s = "the" ; n = Sg} ;
thePl_Det = {s = "the" ; n = Pl} ;
UseN noun = noun ;
AdjCN ap cn = {s = \\n => ap.s ++ cn.s ! n} ;
-- Adjective
PositA adj = adj ;
{-
-- Adverb
PrepNP : Prep -> NP -> Adv ; -- in the house
-- Structural
in_Prep : Prep ;
on_Prep : Prep ;
with_Prep : Prep ;
-}
he_Pron = {s = table {Nom => "he" ; Acc => "him"} ; n = Sg} ;
she_Pron = {s = table {Nom => "she" ; Acc => "her"} ; n = Sg} ;
they_Pron = {s = table {Nom => "they" ; Acc => "them"} ; n = Pl} ;
-----------------------------------------------------
---------------- Lexicon part -----------------------
-----------------------------------------------------
lin already_Adv = mkAdv "already" ;
lin animal_N = mkN "animal" ;
lin apple_N = mkN "apple" ;
lin baby_N = mkN "baby" ;
lin bad_A = mkA "bad" ;
lin beer_N = mkN "beer" ;
lin big_A = mkA "big" ;
lin bike_N = mkN "bike" ;
lin bird_N = mkN "bird" ;
lin black_A = mkA "black" ;
lin blood_N = mkN "blood" ;
lin blue_A = mkA "blue" ;
lin boat_N = mkN "boat" ;
lin book_N = mkN "book" ;
lin boy_N = mkN "boy" ;
lin bread_N = mkN "bread" ;
lin break_V2 = mkV2 (mkV "break" "broke" "broken") ;
lin buy_V2 = mkV2 (mkV "buy" "bought" "bought") ;
lin car_N = mkN "car" ;
lin cat_N = mkN "cat" ;
lin child_N = mkN "child" "children" ;
lin city_N = mkN "city" ;
lin clean_A = mkA "clean" ;
lin clever_A = mkA "clever" ;
lin cloud_N = mkN "cloud" ;
lin cold_A = mkA "cold" ;
lin come_V = mkV "come" "came" "come" ;
lin computer_N = mkN "computer" ;
lin cow_N = mkN "cow" ;
lin dirty_A = mkA "dirty" ;
lin dog_N = mkN "dog" ;
lin drink_V2 = mkV2 (mkV "drink" "drank" "drunk") ;
lin eat_V2 = mkV2 (mkV "eat" "ate" "eaten") ;
lin find_V2 = mkV2 (mkV "find" "found" "found") ;
lin fire_N = mkN "fire" ;
lin fish_N = mkN "fish" "fish" ;
lin flower_N = mkN "flower" ;
lin friend_N = mkN "friend" ;
lin girl_N = mkN "girl" ;
lin good_A = mkA "good" ;
lin go_V = mkV "go" "goes" "went" "gone" "going" ;
lin grammar_N = mkN "grammar" ;
lin green_A = mkA "green" ;
lin heavy_A = mkA "heavy" ;
lin horse_N = mkN "horse" ;
lin hot_A = mkA "hot" ;
lin house_N = mkN "house" ;
-- lin john_PN = mkPN "John" ;
lin jump_V = mkV "jump" ;
lin kill_V2 = mkV2 "kill" ;
-- lin know_VS = mkVS (mkV "know" "knew" "known") ;
lin language_N = mkN "language" ;
lin live_V = mkV "live" ;
lin love_V2 = mkV2 (mkV "love") ;
lin man_N = mkN "man" "men" ;
lin milk_N = mkN "milk" ;
lin music_N = mkN "music" ;
lin new_A = mkA "new" ;
lin now_Adv = mkAdv "now" ;
lin old_A = mkA "old" ;
-- lin paris_PN = mkPN "Paris" ;
lin play_V = mkV "play" ;
lin read_V2 = mkV2 (mkV "read" "read" "read") ;
lin ready_A = mkA "ready" ;
lin red_A = mkA "red" ;
lin river_N = mkN "river" ;
lin run_V = mkV "run" "runs" "ran" "run" "running" ;
lin sea_N = mkN "sea" ;
lin see_V2 = mkV2 (mkV "see" "saw" "seen") ;
lin ship_N = mkN "ship" ;
lin sleep_V = mkV "sleep" "slept" "slept" ;
lin small_A = mkA "small" ;
lin star_N = mkN "star" ;
lin swim_V = mkV "swim" "swims" "swam" "swum" "swimming" ;
lin teach_V2 = mkV2 (mkV "teach" "taught" "taught") ;
lin train_N = mkN "train" ;
lin travel_V = mkV "travel" ;
lin tree_N = mkN "tree" ;
lin understand_V2 = mkV2 (mkV "understand" "understood" "understood") ;
lin wait_V2 = mkV2 "wait" "for" ;
lin walk_V = mkV "walk" ;
lin warm_A = mkA "warm" ;
lin water_N = mkN "water" ;
lin white_A = mkA "white" ;
lin wine_N = mkN "wine" ;
lin woman_N = mkN "woman" "women" ;
lin yellow_A = mkA "yellow" ;
lin young_A = mkA "young" ;
}

View File

@@ -0,0 +1,131 @@
resource MicroResMyeng = {
------------------------------
-- API: overloaded paradigms
oper
mkN = overload {
mkN : (baby : Str) -> Noun
= \baby -> smartNoun baby ;
mkN : (man, men : Str) -> Noun
= \man, men -> mkNoun man men ;
} ;
mkA : Str -> Adjective
= \adj -> {s = adj} ;
mkAdv : Str -> Adverb
= \adv -> {s = adv} ;
mkV = overload {
mkV : (try : Str) -> Verb
= \try -> smartVerb try ;
mkV : (go, went, gone : Str) -> Verb
= \go, went, gone -> irregVerb go went gone ;
mkV : (sing, sings, sang, sung, singing : Str) -> Verb
= \sing, sings, sang, sung, singing ->
mkVerb sing sings sang sung singing ;
} ;
mkV2 = overload {
mkV2 : (kill : Str) -> Verb2
= \kill -> mkV kill ** {prep = ""} ;
mkV2 : (wait, for : Str) -> Verb2
= \wait, for -> mkV wait ** {prep = for} ;
mkV2 : Verb -> Verb2
= \verb -> verb ** {prep = ""} ;
} ;
------------------------------
param
Number = Sg | Pl ;
VerbForm = Inf | Pres3Sg | Past | PastPart | PresPart ;
Case = Nom | Acc ;
oper
Noun : Type = {s : Number => Str} ;
-- constructor
mkNoun : (dog, dogs : Str) -> Noun
= \dog, dogs -> {
s = table {Sg => dog ; Pl => dogs}
} ;
regNoun : (dog : Str) -> Noun
= \dog -> mkNoun dog (dog + "s") ;
smartNoun : (noun : Str) -> Noun
= \noun -> case noun of {
b + ("a" | "e" | "o" | "u") + "y" => regNoun noun ;
bab + "y" => mkNoun noun (bab + "ies") ;
_ => regNoun noun
} ;
Adjective : Type = {s : Str} ;
Adverb : Type = {s : Str} ;
Verb : Type = {s : VerbForm => Str} ;
-- constructor; worst case paradigm
mkVerb : (sing, sings, sang, sung, singing : Str) -> Verb
= \sing, sings, sang, sung, singing -> {
s = table {
Inf => sing ;
Pres3Sg => sings ;
Past => sang ;
PastPart => sung ;
PresPart => singing
}
} ;
regVerb : (walk : Str) -> Verb
= \walk ->
mkVerb walk (walk + "s") (walk + "ed")
(walk + "ed") (walk + "ing") ;
smartVerb : (verb : Str) -> Verb
= \verb -> case verb of {
b + ("a" | "e" | "o" | "u") + "y" => regVerb verb ;
cr + "y" => mkVerb verb (cr + "ies")
(cr + "ied") (cr + "ied") (cr + "ying") ;
refer + "ee" => let refereed = refer + "eed" in
mkVerb verb (verb + "s") refereed refereed (verb + "ing") ;
us + "e" => let used = us + "ed" in
mkVerb verb (verb + "s") used used (us + "ing") ;
wa + ("ch" | "sh" | "s" | "z" | "x") =>
mkVerb verb (verb + "es") (verb + "ed") (verb + "ed")
(verb + "ing") ;
_ => regVerb verb
} ;
irregVerb : (sing, sang, sung : Str) -> Verb
= \sing, sang, sung -> {s =
table {
Past => sang ;
PastPart => sung ;
x => (smartVerb sing).s ! x
}
} ;
Verb2 : Type = {s : VerbForm => Str ; prep : Str} ;
-- auxiliary for syntax
presentVerb : Verb -> Number -> Str = \verb, n ->
case n of {
Sg => verb.s ! Pres3Sg ;
Pl => verb.s ! Inf
} ;
copula : Number -> Str = \n ->
case n of {
Sg => "is" ;
Pl => "are"
} ;
}