diff --git a/lectures/lecture-n-1/img/argmining.png b/lectures/lecture-n-1/img/argmining.png
new file mode 100644
index 0000000..9ed9b65
Binary files /dev/null and b/lectures/lecture-n-1/img/argmining.png differ
diff --git a/lectures/lecture-n-1/img/gfast.png b/lectures/lecture-n-1/img/gfast.png
new file mode 100644
index 0000000..7909973
Binary files /dev/null and b/lectures/lecture-n-1/img/gfast.png differ
diff --git a/lectures/lecture-n-1/img/machamp.png b/lectures/lecture-n-1/img/machamp.png
new file mode 100644
index 0000000..bbb7d34
Binary files /dev/null and b/lectures/lecture-n-1/img/machamp.png differ
diff --git a/lectures/lecture-n-1/img/sets.png b/lectures/lecture-n-1/img/sets.png
new file mode 100644
index 0000000..8306374
Binary files /dev/null and b/lectures/lecture-n-1/img/sets.png differ
diff --git a/lectures/lecture-n-1/img/ud.conllu b/lectures/lecture-n-1/img/ud.conllu
new file mode 100644
index 0000000..d44193f
--- /dev/null
+++ b/lectures/lecture-n-1/img/ud.conllu
@@ -0,0 +1,6 @@
+1 the the DET DT Definite=Def|PronType=Art 3 det _ TokenRange=0:3
+2 black black ADJ JJ Degree=Pos 3 amod _ TokenRange=4:9
+3 cat cat NOUN NN Number=Sing 4 nsubj _ TokenRange=10:13
+4 sees see VERB VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 0 root _ TokenRange=14:18
+5 us we PRON PRP Case=Acc|Number=Plur|Person=1|PronType=Prs 4 obj _ TokenRange=19:21
+6 now now ADV RB PronType=Dem 4 advmod _ SpaceAfter=No|TokenRange=22:25
\ No newline at end of file
diff --git a/lectures/lecture-n-1/img/ud.svg b/lectures/lecture-n-1/img/ud.svg
new file mode 100644
index 0000000..7fcd7e0
--- /dev/null
+++ b/lectures/lecture-n-1/img/ud.svg
@@ -0,0 +1,51 @@
+
diff --git a/lectures/lecture-n-1/sentence.txt b/lectures/lecture-n-1/sentence.txt
deleted file mode 100644
index e69de29..0000000
diff --git a/lectures/lecture-n-1/slides.md b/lectures/lecture-n-1/slides.md
index ab84462..c8afd58 100644
--- a/lectures/lecture-n-1/slides.md
+++ b/lectures/lecture-n-1/slides.md
@@ -1,6 +1,6 @@
---
-title: "Training and evaluating UD parsers"
-subtitle: "by popular demand"
+title: "Training and evaluating \\newline dependency parsers"
+subtitle: "(added to the course by popular demand)"
author: "Arianna Masciolini"
theme: "lucid"
logo: "gu.png"
@@ -8,7 +8,173 @@ date: "VT25"
institute: "LT2214 Computational Syntax"
---
-# Basics of dependency parsing
+## Today's topic
+\bigskip \bigskip
+
-## Today's focus
-
\ No newline at end of file
+# Parsing
+
+## A structured prediction task
+Sequence $\to$ structure, e.g.
+
+- natural language sentence $\to$ syntax tree
+- code $\to$ AST
+- argumentative essay $\to$ argumentative structure
+
+## Example (argmining)
+
+> Språkbanken has better fika than CLASP: every fika, someone bakes. Sure, CLASP has a better coffee machine. On the other hand, there are more important things than coffee. In fact, most people drink tea in the afternoon.
+
+## Example (argmining)
+
+
+\footnotesize From "A gentle introduction to argumentation mining" (Lindahl et al., 2022)
+
+# Syntactic parsing
+
+## From sentence to tree
+From Jurafsky & Martin, _Speech and Language Processing_, chapter 18 (January 2024 draft):
+
+> Syntactic parsing is the task of assigning a syntactic structure to a sentence
+
+- the structure is usually a _syntax tree_
+- two main classes of approaches:
+ - constituency parsing (e.g. GF)
+ - dependency parsing (e.g. UD)
+
+## Example (GF)
+```
+MicroLang> i MicroLangEng.gf
+linking ... OK
+
+Languages: MicroLangEng
+7 msec
+MicroLang> p "the black cat sees us now"
+PredVPS (DetCN the_Det (AdjCN (PositA black_A)
+(UseN cat_N))) (AdvVP (ComplV2 see_V2 (UsePron
+we_Pron)) now_Adv)
+```
+
+## Example (GF)
+```haskell
+PredVPS (
+ DetCN
+ the_Det
+ (AdjCN (PositA black_A) (UseN cat_N))
+ )
+ (AdvVP
+ (ComplV2 see_V2 (UsePron we_Pron))
+ now_Adv
+ )
+```
+
+## Example (GF)
+
+
+# Dependency parsing
+
+## Example (UD)
+
+
+\small
+```
+1 the _ DET _ _ 3 det _ _
+2 black _ ADJ _ _ 3 amod _ _
+3 cat _ NOUN _ _ 4 nsubj _ _
+4 sees _ VERB _ _ 0 root _ _
+5 us _ PRON _ _ 4 obj _ _
+6 now _ ADV _ _ 4 advmod _ _
+```
+
+## Two paradigms
+- __graph-based algorithms__: search the set of all candidate trees (or a subset of it) for the highest-scoring one
+- __transition-based algorithms__: incrementally build a tree by solving a sequence of classification problems
+
+## Graph-based approaches
+$$\hat{t} = \underset{t \in T(s)}{\mathrm{argmax}}\, \mathrm{score}(s,t)$$
+
+- $t$: candidate tree
+- $\hat{t}$: predicted tree
+- $s$: input sentence
+- $T(s)$: set of candidate trees for $s$
+
+## Complexity
+- choice of $T$ (upper bound: $n^{n-1}$ candidate trees, where $n$ is the number of words in $s$)
+- scoring function (in the __arc-factored model__, the score of a tree is the sum of the scores of its edges, each scored individually by a NN; this results in $O(n^3)$ complexity)
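+
+## Complexity (arc-factored score, sketch)
+Written out, the arc-factored assumption from the previous slide looks roughly as follows. The edge notation $(h,d)$, for head and dependent, is introduced here for illustration only:
+
+$$\mathrm{score}(s,t) = \sum_{(h,d) \in t} \mathrm{score}(s,h,d)$$
+
+- each of the $n$ words has $n$ possible heads (the other $n-1$ words or an artificial root), so the NN scores $O(n^2)$ edges rather than up to $n^{n-1}$ whole trees
+- the $\mathrm{argmax}$ is then computed by a decoding algorithm over these edge scores, e.g. Eisner's algorithm for projective trees, which runs in $O(n^3)$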
+
+## Transition-based approaches
+- trees are built through a sequence of steps, called _transitions_
+- training requires:
+ - a gold-standard treebank (as for graph-based approaches)
+ - an _oracle_, i.e. an algorithm that converts each tree into a gold-standard sequence of transitions
+- much more efficient: $O(n)$
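+
+## Transition-based approaches (worked example)
+A sketch of what an oracle could output for "the black cat sees us now". It assumes the _arc-standard_ transition system, which is one common choice; these slides do not commit to a specific system:
+
+\footnotesize
+```
+transition         stack                 buffer
+(start)            [ROOT]                [the black cat sees us now]
+SHIFT              [ROOT the]            [black cat sees us now]
+SHIFT              [ROOT the black]      [cat sees us now]
+SHIFT              [ROOT the black cat]  [sees us now]
+LEFT-ARC(amod)     [ROOT the cat]        [sees us now]
+LEFT-ARC(det)      [ROOT cat]            [sees us now]
+SHIFT              [ROOT cat sees]       [us now]
+LEFT-ARC(nsubj)    [ROOT sees]           [us now]
+SHIFT              [ROOT sees us]        [now]
+RIGHT-ARC(obj)     [ROOT sees]           [now]
+SHIFT              [ROOT sees now]       []
+RIGHT-ARC(advmod)  [ROOT sees]           []
+RIGHT-ARC(root)    [ROOT]                []
+```
+
+\normalsize
+12 transitions for 6 words: each word is shifted once and attached once, so $n$ words need $2n$ transitions, hence $O(n)$.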
+
+## Evaluation
+Two main metrics:
+
+- __UAS__ (Unlabelled Attachment Score): what fraction of nodes are attached to the correct dependency head?
+- __LAS__ (Labelled Attachment Score): what fraction of nodes are attached to the correct dependency head _with an arc labelled with the correct relation type_[^1]?
+
+[^1]: in UD: the `DEPREL` column
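+
+## Evaluation (worked example)
+As formulas, assuming gold and predicted trees share the same tokenization, with $n$ the number of nodes (notation introduced here for illustration):
+
+$$\mathrm{UAS} = \frac{\#\,\text{nodes with correct head}}{n}$$
+
+$$\mathrm{LAS} = \frac{\#\,\text{nodes with correct head and relation}}{n}$$
+
+Hypothetical example: a parser analyzes "the black cat sees us now" correctly, except that it labels the arc from _sees_ to _now_ as `obl` instead of `advmod`. Then
+
+$$\mathrm{UAS} = \frac{6}{6} = 100\% \qquad \mathrm{LAS} = \frac{5}{6} \approx 83.3\%$$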
+
+# Specifics of UD parsing
+
+## Not just parsing per se
+UD "parsers" typically do a lot more than just dependency parsing:
+
+- lemmatization (`LEMMA` column)
+- POS tagging (`UPOS` + `XPOS`)
+- morphological tagging (`FEATS`)
+- ...
+
+## Evaluation (UD-specific)
+Some more specific metrics:
+
+- CLAS (Content-word LAS): LAS limited to content words
+- MLAS (Morphology-Aware LAS): CLAS that also uses the `FEATS` column
+- BLEX (Bi-Lexical dependency score): CLAS that also uses the `LEMMA` column
+
+## Evaluation script output
+\small
+```
+Metric | Precision | Recall | F1 Score | AligndAcc
+-----------+-----------+-----------+-----------+-----------
+Tokens | 100.00 | 100.00 | 100.00 |
+Sentences | 100.00 | 100.00 | 100.00 |
+Words | 100.00 | 100.00 | 100.00 |
+UPOS | 98.36 | 98.36 | 98.36 | 98.36
+XPOS | 100.00 | 100.00 | 100.00 | 100.00
+UFeats | 100.00 | 100.00 | 100.00 | 100.00
+AllTags | 98.36 | 98.36 | 98.36 | 98.36
+Lemmas | 100.00 | 100.00 | 100.00 | 100.00
+UAS | 92.73 | 92.73 | 92.73 | 92.73
+LAS | 90.30 | 90.30 | 90.30 | 90.30
+CLAS | 88.50 | 88.34 | 88.42 | 88.34
+MLAS | 86.72 | 86.56 | 86.64 | 86.56
+BLEX | 88.50 | 88.34 | 88.42 | 88.34
+```
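+
+## Evaluation script output (reading the columns)
+Assuming the table on the previous slide comes from the CoNLL 2018 shared task evaluation script, the columns can be read roughly as follows. The symbols $c$, $n_{sys}$ and $n_{gold}$ are introduced here for illustration:
+
+$$P = \frac{c}{n_{sys}} \qquad R = \frac{c}{n_{gold}} \qquad F_1 = \frac{2PR}{P+R}$$
+
+- $c$: number of correct items; $n_{sys}$, $n_{gold}$: number of items in the system output and in the gold standard
+- check on the CLAS row: $F_1 = \frac{2 \cdot 88.50 \cdot 88.34}{88.50 + 88.34} \approx 88.42$
+- $P \neq R$ can happen for CLAS, MLAS and BLEX even with identical tokenization, presumably because which words count as content words depends on the relation labels, which differ between system and gold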
+
+## Three generations of parsers
+1. __MaltParser__ (Nivre et al., 2006): "classic" transition-based parser, data-driven but not NN-based
+2. __UDPipe__: neural transition-based parser; personal favorite
+ - version 1 (Straka et al., 2016): solid and fast software, available anywhere
+ - version 2 (Straka et al., 2018): much better performance, but slower and only available through an API
+3. __MaChAmp__ (van der Goot et al., 2021): transformer-based toolkit for multi-task learning, works on all CoNLL-like data, close to the SOTA, relatively easy to install and train
+
+## Your task (lab 3)
+
+
+1. annotate a small treebank for your language of choice (started)
+2. train a parser-tagger with MaChAmp on a reference UD treebank (tomorrow: installation)
+3. evaluate it on your treebank
+
+## Sources/further reading
+- chapters 18-19 of the January 2024 draft of _Speech and Language Processing_ (Jurafsky & Martin) (full text available [__here__](https://web.stanford.edu/~jurafsky/slp3/))
+- unit 3-2 of Johansson & Kuhlmann's course "Deep Learning for Natural Language Processing" (slides and videos available [__here__](https://liu-nlp.ai/dl4nlp/modules/module3/))
+- section 10.9.2 on parser evaluation from Aarne's course notes (on Canvas or [__here__](https://www.cse.chalmers.se/~aarne/grammarbook.pdf))
+
+## Papers describing the parsers
+- _MaltParser: A Data-Driven Parser-Generator for Dependency Parsing_ (Nivre et al., 2006) (PDF [__here__](http://lrec-conf.org/proceedings/lrec2006/pdf/162_pdf.pdf))
+- _UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing_ (Straka et al., 2016) (PDF [__here__](https://aclanthology.org/L16-1680.pdf))
+- _UDPipe 2.0 Prototype at CoNLL 2018 UD Shared Task_ (Straka et al., 2018) (PDF [__here__](https://aclanthology.org/K18-2020.pdf))
+- _Massive Choice, Ample Tasks (MACHAMP): A Toolkit for Multi-task Learning in NLP_ (van der Goot et al., 2021) (PDF [__here__](https://arxiv.org/pdf/2005.14672))
\ No newline at end of file
diff --git a/lectures/lecture-n-1/slides.pdf b/lectures/lecture-n-1/slides.pdf
index ff18785..bee586b 100644
Binary files a/lectures/lecture-n-1/slides.pdf and b/lectures/lecture-n-1/slides.pdf differ