mirror of
https://github.com/GrammaticalFramework/comp-syntax-gu-mlt.git
synced 2026-02-08 22:41:05 -07:00
UDpipe instructions
38
lab1/lecture3-examples.txt
Normal file
@@ -0,0 +1,38 @@
John prefers wine to beer.
John definitely prefers beer to wine today if he is hungry.
John walks.
John does not walk.
John has walked.
John is old.
John is a doctor.
John is here.
Probably the best beer in the world.
Why?

She will.
The black cat was seen.
There is an elephant in the room.
It is too cold in the room.
She gave me a hint.
She gave a hint to me.
John saw a man with a telescope.
She walks today.
genetically modified

Why does she walk?

Why does she not walk?

Every man in the city works in the city.
She is here.
She has been here.
The reason is that I am tired.
He can sing.
Does he sing?

Would he have sung?

He would not have been tired.
He called me a "bad loser".
63
lab1/using_udpipe.md
Normal file
@@ -0,0 +1,63 @@
# UDPipe: quick instructions

## Download and install

The simplest way to use UDPipe is to install the binaries for UDPipe-1, which exist for several operating systems.
They can be downloaded from

https://github.com/ufal/udpipe/releases/download/v1.3.0/udpipe-1.3.0-bin.zip

When you have downloaded and unzipped this file, you will find the binary for your system in a subdirectory. For instance,
```
udpipe-1.3.0-bin/bin-macos/udpipe
```
is the binary for macOS.
If you include this directory on your PATH, you can run the command `udpipe` from anywhere.
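
The PATH step can be sketched as follows. This is a minimal example that assumes the archive was unzipped in the current directory and that the macOS binary is the one you need; the variable name `UDPIPE_BIN_DIR` is only illustrative, and the subdirectory should be changed for other systems.

```shell
# Assumption: the zip was unpacked in the current directory and the
# macOS binary is wanted; change the subdirectory name for other systems.
UDPIPE_BIN_DIR="$PWD/udpipe-1.3.0-bin/bin-macos"

# Prepend the directory to PATH for this shell session only.
export PATH="$UDPIPE_BIN_DIR:$PATH"

# Report where the shell now finds udpipe, or say that it does not.
command -v udpipe || echo "udpipe not found on PATH"
```

The change lasts only for the current shell session. To keep it, the `export` line (with an absolute path) can be added to your shell startup file.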

Running the parser for a language requires a model for that language.
Models can be downloaded from

https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-3131

This page includes a long list of models and a command to download them all.
If you need only some of them, you can fetch them individually, for instance,
```
$ wget https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11234/1-3131//english-lines-ud-2.5-191206.udpipe
```

## Running the parser

Assuming that the binary `udpipe` is on your PATH and the model `english-lines-ud-2.5-191206.udpipe` is in your current directory, you can parse a single sentence with
```
$ echo "my hovercraft is full of eels" | udpipe --tokenize --tag --parse english-lines-ud-2.5-191206.udpipe
```
The result is a UD tree in the CoNLL-U notation:
```
# newdoc
# newpar
# sent_id = 1
# text = my hovercraft is full of eels
1	my	I	PRON	P1SG-GEN	Number=Sing|Person=1|Poss=Yes|PronType=Prs	2	nmod:poss	_	_
2	hovercraft	hovercraft	NOUN	SG-NOM	Number=Sing	4	nsubj	_	_
3	is	be	AUX	PRES	Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin	4	cop	_	_
4	full	full	ADJ	POS	Degree=Pos	0	root	_	_
5	of	of	ADP	_	_	6	case	_	_
6	eels	eel	NOUN	PL-NOM	Number=Plur	4	nmod	_	SpacesAfter=\n
```
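
Each token line in CoNLL-U has ten tab-separated columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS and MISC. As a rough sketch of how to read such output from the command line, the helper below (the name `conllu_deps` is made up for this example) prints each word together with its head and relation. The sample input is two of the token lines above, rebuilt with explicit tabs.

```shell
# Print ID, FORM, HEAD and DEPREL (columns 1, 2, 7, 8) for each token line.
# Comment lines starting with "#" and blank lines are skipped.
conllu_deps() {
  awk -F'\t' '!/^#/ && NF == 10 { print $1, $2, $7, $8 }' "$@"
}

# Two token lines from the output above, rebuilt with explicit tabs;
# printf reuses the 10-field format for each group of 10 arguments.
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  2 hovercraft hovercraft NOUN SG-NOM Number=Sing 4 nsubj _ _ \
  4 full full ADJ POS Degree=Pos 0 root _ _ \
  | conllu_deps
# prints:
# 2 hovercraft 4 nsubj
# 4 full 0 root
```

The same function can be placed at the end of a `udpipe --tokenize --tag --parse` pipeline to get a compact view of the tree.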
If you also have `gfud` and `pdflatex` on your path, you can pipe the result into `gfud conll2pdf` to see the graphical tree.

Because `udpipe` reads standard input, you can also feed it the contents of a file, such as `lecture3-examples.txt`:
```
$ cat <myfile> | udpipe --tokenize --tag --parse <model>
```
Note that the sentences in that file must either end with a period or be separated by empty lines, because otherwise the whole file is parsed as one sentence.
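
The rule above can be checked before parsing. The following is only a sketch: the function name `check_sentences` is invented here, and it implements the rule exactly as stated (a line is flagged when it neither ends with a period nor is followed by an empty line), without consulting the tokenizer itself.

```shell
# Flag lines that neither end with a period nor are followed by an
# empty line. Reads the named files, or standard input if none are given.
check_sentences() {
  awk '
    # prev holds the previous line; report it if it is non-empty, has no
    # final period, and is directly followed by another non-empty line.
    NR > 1 && prev != "" && prev !~ /\.$/ && $0 != "" {
      printf "line %d may merge with the next sentence: %s\n", NR - 1, prev
    }
    { prev = $0 }
  ' "$@"
}

# Small demonstration: the question on line 2 has no empty line after it.
printf 'John walks.\nWhy?\nShe will.\n' | check_sentences
# prints: line 2 may merge with the next sentence: Why?
```

Running `check_sentences lecture3-examples.txt` before piping the file into `udpipe` should print nothing if every sentence is properly delimited.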

## Training a new model

If you have a treebank in the CoNLL-U format, you can use it to train a new model with
```
$ cat <myfile>.conllu | udpipe --tokenizer none --tagger none --train <myfile>.udpipe
```
Here `--tokenizer none` and `--tagger none` skip training those components, so the resulting model contains only a parser.
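
Since training expects well-formed CoNLL-U, it may be worth checking the treebank first. The sketch below tests only one property of the format, namely that every token line has exactly ten tab-separated fields; the function name `check_conllu_columns` and the sample annotation are illustrative assumptions, not part of UDPipe.

```shell
# Report token lines that do not have exactly 10 tab-separated fields.
# Exit status is 1 if any such line is found, 0 otherwise.
check_conllu_columns() {
  awk -F'\t' '
    /^#/ || /^$/ { next }   # skip comment lines and sentence separators
    NF != 10 { printf "line %d has %d fields\n", NR, NF; bad = 1 }
    END { exit bad + 0 }
  ' "$@"
}

# Illustrative input (hypothetical annotation): the second token line is truncated.
printf '1\tJohn\tJohn\tPROPN\t_\t_\t2\tnsubj\t_\t_\n2\twalks\twalk\tVERB\t_\t_\t0\n' \
  | check_conllu_columns || echo "treebank needs fixing before training"
# prints:
# line 2 has 7 fields
# treebank needs fixing before training
```

A file name can be passed instead of piping, for example `check_conllu_columns <myfile>.conllu`.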