Lab 2: Multilingual text generation from Wikidata
This lab uses GF to generate texts from facts in the Wikidata database. You are given
- an abstract syntax and an English concrete syntax, in the subdirectory grammars/
- a JSON dump from Wikidata, in the subdirectory data/
- a Python file that connects Wikidata with GF, in the subdirectory scripts/
Your task is to create a concrete syntax for some other language by using the GF RGL, and to evaluate the texts it generates. The steps are the following:
- in scripts/, run
python3 find_labels.py da > ../grammars/LabelsDan.gf
- in grammars/, copy the beginning of LabelsEng.gf to LabelsDan.gf and change Eng to Dan
- in grammars/, copy NobelEng.gf to NobelDan.gf and make the necessary changes
- in grammars/, start GF and import NobelDan.gf to do some testing
- in grammars/, outside GF, run
gf -make NobelEng.gf NobelDan.gf
- (do this if possible, but see the workaround below) in scripts/, generate all texts with
python3 describe_nobel.py Dan
Replace da and Dan with your own language codes!
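The two copy steps mostly amount to renaming the language suffix on module names. A minimal Python sketch of that renaming follows. The helper name retarget and the sample header line are hypothetical, and the real headers in grammars/ may open different modules. The sketch only rewrites capitalized identifiers that end in the old suffix, so the result still needs checking by hand.

```python
import re


def retarget(gf_source: str, old: str = "Eng", new: str = "Dan") -> str:
    """Rename the language suffix on module-like names, e.g. SyntaxEng -> SyntaxDan.

    Hypothetical helper, not part of the lab's scripts/ directory.
    """
    # Match a capitalized identifier that ends in the old suffix, as a whole word.
    pattern = re.compile(r"\b([A-Z]\w*)" + re.escape(old) + r"\b")
    return pattern.sub(lambda m: m.group(1) + new, gf_source)


# Assumed example header; the actual first line of NobelEng.gf may differ.
header = "concrete NobelEng of Nobel = open SyntaxEng, ParadigmsEng in {"
print(retarget(header))
# prints: concrete NobelDan of Nobel = open SyntaxDan, ParadigmsDan in {
```

Linearization rules that contain English word forms still have to be rewritten manually; the substitution only covers names such as the module references.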
The last step above requires pip3 install pgf.
If you don't manage to install pgf, a quick way to test is to run the following in the GF shell:
import NobelEng.gf
rf -file="../data/trees.gft" -lines -tree | l
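For those who do have pgf installed, roughly the same check can be written in Python. This is a sketch, not the contents of describe_nobel.py. It assumes that gf -make wrote the compiled grammar to Nobel.pgf in grammars/ (named after the abstract syntax, assumed here to be Nobel), that the concrete syntax is called NobelDan, and that trees.gft holds one tree per line. The function names are hypothetical.

```python
def concrete_name(lang_code: str, abstract: str = "Nobel") -> str:
    """Build the concrete syntax name, e.g. 'Dan' -> 'NobelDan' (assumed naming)."""
    return abstract + lang_code


def parse_tree_lines(text: str) -> list:
    """Return the non-empty lines of a tree file, stripped of surrounding whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def linearize_trees(pgf_path: str, lang_code: str, trees_path: str) -> list:
    """Linearize every tree in trees_path with the chosen concrete syntax."""
    # Imported here so the helpers above remain usable without pgf installed.
    import pgf

    grammar = pgf.readPGF(pgf_path)
    concrete = grammar.languages[concrete_name(lang_code)]
    with open(trees_path, encoding="utf-8") as handle:
        trees = parse_tree_lines(handle.read())
    # readExpr parses the textual tree; linearize renders it as a string.
    return [concrete.linearize(pgf.readExpr(tree)) for tree in trees]
```

Run from scripts/, a call such as linearize_trees("../grammars/Nobel.pgf", "Dan", "../data/trees.gft") would return one generated text per tree, under the assumptions above.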
If you need gender agreement of names
(This note was added late and is therefore not required for the 2025 course.)
In some languages, the names of laureates require gender agreement. In that case, use the GF command
rf -file="../data/gendertrees.gft" -lines -tree | l
or, if it works for you, the Python command
python3 describe_nobel.py Dan gender
This requires you to define linearizations of the gender-specific functions MaleName and FemaleName so that the gender agreement is set properly.
The following works for many languages (MaleName is analogous, with masculine in place of feminine):
FemaleName s = mkNP (mkPN s.s feminine) ;