\documentclass[11pt,a4paper]{article}
|
|
\usepackage{amsfonts,graphicx}
|
|
\usepackage[pdfstartview=FitH,urlcolor=blue,colorlinks=true,bookmarks=true]{hyperref}
|
|
\pagestyle{plain} % do page numbering ('empty' turns off)
|
|
\frenchspacing % no additional spaces after periods
|
|
\setlength{\parskip}{8pt}\parindent=0pt % no paragraph indentation
|
|
|
|
\newcommand{\commOut}[1]{}
|
|
\newcommand{\subsubsubsection}[1]{\textit{#1}}
|
|
|
|
\title{The GF Resource Grammar Library}
|
|
\author{Author: Aarne Ranta}
|
|
\begin{document}
|
|
\date{Last update: Tue Jun 13 11:43:19 2006}
|
|
\maketitle
|
|
|
|
\tableofcontents
|
|
|
|
\clearpage
|
|
|
|
|
|
This document is about the
|
|
GF Resource Grammar Library. It presupposes knowledge of GF and its
|
|
module system, knowledge that can be acquired e.g. from the GF
|
|
tutorial. We start with an introduction to the library, and proceed to
|
|
details with the aim of covering all that one needs to know
|
|
in order to use the library.
|
|
How to write one's own resource grammar (i.e. implement the API for
a new language) is covered by a separate Resource-HOWTO document.
|
|
|
|
\section{Motivation}
|
|
The GF Resource Grammar Library contains grammar rules for
|
|
10 languages (some more are under construction). Its purpose
|
|
is to make these rules available for application programmers,
|
|
who can thereby concentrate on the semantic and stylistic
|
|
aspects of their grammars, without having to think about
|
|
grammaticality. A typical application grammarian
is a skilled programmer without knowledge of linguistics, but with
a good knowledge of the target languages. Such a combination of
skills is typical of a programmer who wants to localize a piece
|
|
of software to a new language.
|
|
|
|
To give an example, an application dealing with
|
|
music players may have a semantical category \texttt{Kind}, examples
|
|
of Kinds being Song and Artist. In German, for instance, Song
|
|
is linearized into the noun "Lied", but knowing this is not
|
|
enough to make the application work, because the noun must be
|
|
produced in both singular and plural, and in four different
|
|
cases. By using the resource grammar library, it is enough to
|
|
write
|
|
|
|
\begin{verbatim}
|
|
lin Song = reg2N "Lied" "Lieder" neuter
|
|
\end{verbatim}
|
|
and the eight forms are correctly generated. The resource grammar
|
|
library contains a complete set of inflectional paradigms (such as
|
|
reg2N here), enabling the definition of any lexical item.
|
|
|
|
The resource grammar library is not only about inflectional paradigms - it
|
|
also has syntax rules. The music player application
|
|
might also want to modify songs with properties, such as "American",
|
|
"old", "good". The German grammar for adjectival modifications is
|
|
particularly complex, because the adjectives have to agree in gender,
|
|
number, and case, and also depend on what determiner is used
|
|
("ein amerikanisches Lied" vs. "das amerikanische Lied"). All this
|
|
variation is taken care of by the resource grammar function
|
|
|
|
\begin{verbatim}
|
|
fun AdjCN : AP -> CN -> CN
|
|
\end{verbatim}
|
|
and the resource grammar implementation of the rule adding properties
|
|
to kinds is
|
|
|
|
\begin{verbatim}
|
|
lin PropKind kind prop = AdjCN prop kind
|
|
\end{verbatim}
|
|
given that
|
|
|
|
\begin{verbatim}
|
|
lincat Prop = AP
|
|
lincat Kind = CN
|
|
\end{verbatim}
|
|
The resource library API is divided into language-specific and language-independent
parts. To put it roughly,
|
|
|
|
\begin{itemize}
|
|
\item lexicon is language-specific
|
|
\item syntax is language-independent
|
|
\end{itemize}
|
|
|
|
Thus, to render the above example in French instead of German, we need to
|
|
pick a different linearization of Song,
|
|
|
|
\begin{verbatim}
|
|
lin Song = regGenN "chanson" feminine
|
|
\end{verbatim}
|
|
But to linearize PropKind, we can use the very same rule as in German.
|
|
The resource function AdjCN has different implementations in the two
|
|
languages, but the application programmer need not care about the difference.
|
|
|
|
\subsection{A complete example}
|
|
To summarize the example, and also give a template for a programmer to work on,
|
|
here is the complete implementation of a small system with songs and properties.
|
|
The abstract syntax defines a "domain ontology":
|
|
|
|
\begin{verbatim}
|
|
abstract Music = {
|
|
cat
|
|
Kind,
|
|
Property ;
|
|
fun
|
|
PropKind : Kind -> Property -> Kind ;
|
|
Song : Kind ;
|
|
American : Property ;
|
|
}
|
|
\end{verbatim}
|
|
The concrete syntax is defined independently of language, by opening
|
|
two interfaces: the resource Grammar and an application lexicon.
|
|
|
|
\begin{verbatim}
|
|
incomplete concrete MusicI of Music = open Grammar, MusicLex in {
|
|
lincat
|
|
Kind = CN ;
|
|
Property = AP ;
|
|
lin
|
|
PropKind k p = AdjCN p k ;
|
|
Song = UseN song_N ;
|
|
American = PositA american_A ;
|
|
}
|
|
\end{verbatim}
|
|
The application lexicon MusicLex has an abstract syntax that extends
|
|
the resource category system Cat.
|
|
|
|
\begin{verbatim}
|
|
abstract MusicLex = Cat ** {
|
|
fun
|
|
song_N : N ;
|
|
american_A : A ;
|
|
}
|
|
\end{verbatim}
|
|
Each language has its own concrete syntax, which opens the inflectional paradigms
|
|
module for that language:
|
|
|
|
\begin{verbatim}
|
|
concrete MusicLexGer of MusicLex = CatGer ** open ParadigmsGer in {
|
|
lin
|
|
song_N = reg2N "Lied" "Lieder" neuter ;
|
|
american_A = regA "amerikanisch" ;
|
|
}
|
|
|
|
concrete MusicLexFre of MusicLex = CatFre ** open ParadigmsFre in {
|
|
lin
|
|
song_N = regGenN "chanson" feminine ;
|
|
american_A = regA "américain" ;
|
|
}
|
|
\end{verbatim}
|
|
The top-level Music grammars are obtained by instantiating the two interfaces
|
|
of MusicI:
|
|
|
|
\begin{verbatim}
|
|
concrete MusicGer of Music = MusicI with
|
|
(Grammar = GrammarGer),
|
|
(MusicLex = MusicLexGer) ;
|
|
|
|
concrete MusicFre of Music = MusicI with
|
|
(Grammar = GrammarFre),
|
|
(MusicLex = MusicLexFre) ;
|
|
\end{verbatim}
|
|
To localize the system to a new language, all that is needed is two modules,
|
|
one implementing MusicLex and the other instantiating Music. The latter is
|
|
completely trivial, whereas the former one involves the choice of correct
|
|
vocabulary and inflectional paradigms. For instance, Finnish is added as follows:
|
|
|
|
\begin{verbatim}
|
|
concrete MusicLexFin of MusicLex = CatFin ** open ParadigmsFin in {
|
|
lin
|
|
song_N = regN "kappale" ;
|
|
american_A = regA "amerikkalainen" ;
|
|
}
|
|
|
|
concrete MusicFin of Music = MusicI with
|
|
(Grammar = GrammarFin),
|
|
(MusicLex = MusicLexFin) ;
|
|
\end{verbatim}
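
As a further illustration, a hypothetical Italian localization could follow
the same pattern. This is only a sketch: the module names CatIta, GrammarIta, and
ParadigmsIta are assumed by analogy with the German and French modules above, and
the word choices are ours. The paradigm regGenN is the one this document later
mentions for ParadigmsIta, and regA is one of the paradigms that every language
is said to provide.

\begin{verbatim}
  -- Sketch only: CatIta, GrammarIta, ParadigmsIta are assumed module names.
  concrete MusicLexIta of MusicLex = CatIta ** open ParadigmsIta in {
    lin
      song_N = regGenN "canzone" feminine ;  -- noun with explicitly given gender
      american_A = regA "americano" ;        -- regular adjective
  }

  concrete MusicIta of Music = MusicI with
    (Grammar = GrammarIta),
    (MusicLex = MusicLexIta) ;
\end{verbatim}

Only the lexicon module contains language-specific decisions; the instantiation
of MusicI has the same shape as for the other languages.
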
|
|
More work is of course needed if the language-independent linearizations in
|
|
MusicI are not satisfactory for some language. The resource grammar guarantees
|
|
that the linearizations are possible in all languages, in the sense of being grammatical,
but they might of course be inadequate for stylistic reasons. Assume,
for the sake of argument, that adjectival modification does not sound good in
English, but that a relative clause would be preferable. One can then start as
|
|
before,
|
|
|
|
\begin{verbatim}
|
|
concrete MusicLexEng of MusicLex = CatEng ** open ParadigmsEng in {
|
|
lin
|
|
song_N = regN "song" ;
|
|
american_A = regA "American" ;
|
|
}
|
|
|
|
concrete MusicEng0 of Music = MusicI with
|
|
(Grammar = GrammarEng),
|
|
(MusicLex = MusicLexEng) ;
|
|
\end{verbatim}
|
|
The module MusicEng0 would not be used on the top level, however, but
|
|
another module would be built on top of it, with a restricted import from
|
|
MusicEng0. MusicEng inherits everything from MusicEng0 except PropKind, and
|
|
gives its own definition of this function:
|
|
|
|
\begin{verbatim}
|
|
concrete MusicEng of Music = MusicEng0 - [PropKind] ** open GrammarEng in {
|
|
lin
|
|
PropKind k p =
|
|
RelCN k (UseRCl TPres ASimul PPos (RelVP IdRP (UseComp (CompAP p)))) ;
|
|
}
|
|
\end{verbatim}
|
|
|
|
\subsection{Parsing with resource grammars?}
|
|
The intended use of the resource grammar is as a library for writing
|
|
application grammars. It is not designed for e.g. parsing newspaper text. There
|
|
are several reasons why this is not so practical:
|
|
|
|
\begin{itemize}
|
|
\item Efficiency: the resource grammar uses complex data structures, in
|
|
particular, discontinuous constituents, which make parsing slow and the
|
|
parser size huge.
|
|
\item Completeness: the resource grammar does not necessarily cover all rules
|
|
of the language - only enough of them to be able to express everything
|
|
in one way or another.
|
|
\item Lexicon: the resource grammar has a very small lexicon, only meant for test
|
|
purposes.
|
|
\item Semantics: the resource grammar has very little semantic control, and may
|
|
accept strange input or deliver strange interpretations.
|
|
\item Ambiguity: parsing in the resource grammar may return lots of results, many
of which are implausible.
|
|
\end{itemize}
|
|
|
|
All of these problems should be solved in application grammars.
|
|
The task of resource grammars is just to take care of low-level linguistic
|
|
details such as inflection, agreement, and word order.
|
|
|
|
For the same reasons, resource grammars are not adequate for translation.
|
|
That the syntax API is implemented for different languages of course makes
|
|
it possible to translate via it - but there is no guarantee of translation
|
|
equivalence. Of course, the use of parametrized implementations such as MusicI
|
|
above only extends to those cases where the syntax API does give translation
|
|
equivalence - but this must be seen as a limiting case, and real applications
|
|
will often use only restricted inheritance of MusicI.
|
|
|
|
\section{To find rules in the resource grammar library}
|
|
\subsection{Inflection paradigms}
|
|
Inflection paradigms are defined separately for each language L
|
|
in the module ParadigmsL. To test them, the command cc (= compute\_concrete)
|
|
can be used:
|
|
|
|
\begin{verbatim}
|
|
> i -retain german/ParadigmsGer.gf
|
|
|
|
> cc regN "Schlange"
|
|
{
|
|
s : Number => Case => Str = table Number {
|
|
Sg => table Case {
|
|
Nom => "Schlange" ;
|
|
Acc => "Schlange" ;
|
|
Dat => "Schlange" ;
|
|
Gen => "Schlange"
|
|
} ;
|
|
Pl => table Case {
|
|
Nom => "Schlangen" ;
|
|
Acc => "Schlangen" ;
|
|
Dat => "Schlangen" ;
|
|
Gen => "Schlangen"
|
|
}
|
|
} ;
|
|
g : Gender = Fem
|
|
}
|
|
\end{verbatim}
|
|
For the sake of convenience, every language implements these four paradigms:
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
regN : Str -> N ; -- regular nouns
|
|
regA : Str -> A ; -- regular adjectives
|
|
regV : Str -> V ; -- regular verbs
|
|
dirV : V -> V2 ; -- direct transitive verbs
|
|
\end{verbatim}
|
|
It is often possible to initialize a lexicon by just using these functions,
|
|
and later revise it by using the more involved paradigms. For instance, in
|
|
German we cannot use regN "Lied" for Song, because the result would be a
masculine noun with the plural form "Liede". The individual Paradigms modules
|
|
tell what cases are covered by the regular heuristics.
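
A first-draft lexicon built only from the four common paradigms might look as
follows. This is a minimal sketch: the module names DraftLex and DraftLexEng and
the chosen words are hypothetical, and the paradigm names are those listed above.

\begin{verbatim}
  -- Hypothetical abstract lexicon, extending the resource category system.
  abstract DraftLex = Cat ** {
    fun
      song_N  : N ;
      old_A   : A ;
      sleep_V : V ;
      play_V2 : V2 ;
  }

  -- First draft for English, using only the four common paradigms.
  concrete DraftLexEng of DraftLex = CatEng ** open ParadigmsEng in {
    lin
      song_N  = regN "song" ;
      old_A   = regA "old" ;
      sleep_V = regV "sleep" ;        -- irregular in English: to be revised
      play_V2 = dirV (regV "play") ;  -- regular verb used as transitive
  }
\end{verbatim}

Entries that the regular heuristics get wrong, such as sleep\_V here, can later
be replaced one at a time by calls to the more involved paradigms without
touching the rest of the lexicon.
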
|
|
|
|
As a limiting case, one could even initialize the lexicon for a new language
|
|
by copying the English (or some other already existing) lexicon. This will
|
|
produce language with correct grammar but content words directly borrowed from
|
|
English.
|
|
|
|
\subsection{Syntax rules}
|
|
Syntax rules should be looked for in the abstract modules defining the
|
|
API. There are around 10 such modules, each defining constructors for
|
|
a group of one or more related categories. For instance, the module
|
|
Noun defines how to construct common nouns, noun phrases, and determiners.
|
|
Thus the proper place to find out how nouns are modified with adjectives
|
|
is Noun, because the result of the construction is again a common noun.
|
|
|
|
Browsing the libraries is helped by the gfdoc-generated HTML pages.
|
|
However, this is still not easy, and the most efficient way is
|
|
probably to use the parser.
|
|
Even though parsing is not an intended end-user application
|
|
of resource grammars, it is a useful technique for application grammarians
|
|
to browse the library. To find out what resource function does some
|
|
particular job, you can just parse a string that exemplifies this job. For
|
|
instance, to find out how sentences are built using transitive verbs, write
|
|
|
|
\begin{verbatim}
|
|
> i english/LangEng.gf
|
|
|
|
> p -cat=Cl -fcfg "she loves him"
|
|
|
|
PredVP (UsePron she_Pron) (ComplV2 love_V2 (UsePron he_Pron))
|
|
\end{verbatim}
|
|
Parsing with the English resource grammar has an acceptable speed, but
with most languages it takes too many resources even to build the
|
|
parser. However, examples parsed in one language can always be linearized into
|
|
other languages:
|
|
|
|
\begin{verbatim}
|
|
> i italian/LangIta.gf
|
|
|
|
> l PredVP (UsePron she_Pron) (ComplV2 love_V2 (UsePron he_Pron))
|
|
|
|
lo ama
|
|
\end{verbatim}
|
|
Therefore, one can use the English parser to write an Italian grammar, and also
|
|
to write a language-independent (incomplete) grammar. One can also parse strings
|
|
that are bizarre in English but the intended way of expression in another language.
|
|
For instance, the phrase for "I am hungry" in Italian is literally "I have hunger".
|
|
This can be built by parsing "I have beer" in LangEng and then writing
|
|
|
|
\begin{verbatim}
|
|
lin IamHungry =
|
|
let beer_N = regGenN "fame" feminine
|
|
in
|
|
PredVP (UsePron i_Pron) (ComplV2 have_V2
|
|
(DetCN (DetSg MassDet NoOrd) (UseN beer_N))) ;
|
|
\end{verbatim}
|
|
which uses ParadigmsIta.regGenN.
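
The parsing step behind this rule can be sketched as the following session, which
reuses the import and parse commands shown earlier in this section. No output is
reproduced here, since the parser may return several trees; the tree used in the
rule above should be among them.

\begin{verbatim}
  > i english/LangEng.gf

  > p -cat=Cl -fcfg "I have beer"
\end{verbatim}

The local binding of beer\_N in the rule then replaces the English lexical item
by the Italian noun, while the syntactic structure is kept as it was parsed.
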
|
|
|
|
\subsection{Example-based grammar writing}
|
|
The technique of parsing with the resource grammar can be used in GF source files,
|
|
endowed with the suffix .gfe ("GF examples"). The suffix tells GF to preprocess
|
|
the file by replacing all expressions of the form
|
|
|
|
\begin{verbatim}
|
|
in Module.Cat "example string"
|
|
\end{verbatim}
|
|
by the syntax trees obtained by parsing "example string" in Cat in Module.
|
|
For instance,
|
|
|
|
\begin{verbatim}
|
|
lin IamHungry =
|
|
let beer_N = regGenN "fame" feminine
|
|
in
|
|
(in LangEng.Cl "I have beer") ;
|
|
\end{verbatim}
|
|
will result in the rule displayed in the previous section. The normal binding rules
|
|
of functional programming (and GF) guarantee that local bindings of identifiers
|
|
take precedence over constants of the same forms. Thus it is also possible to
|
|
linearize functions taking arguments in this way:
|
|
|
|
\begin{verbatim}
|
|
lin
|
|
PropKind car_N old_A = in LangEng.CN "old car" ;
|
|
\end{verbatim}
|
|
However, the technique of example-based grammar writing has some limitations:
|
|
|
|
\begin{itemize}
|
|
\item Ambiguity. If a string has several parses, the first one is returned, and
|
|
it may not be the intended one. The other parses are shown in a comment, from
which they can be picked manually.
|
|
\item Lexicality. The arguments of a function must be atomic identifiers, and are thus
|
|
not available for categories that have no lexical items. For instance, the PropKind
|
|
rule above gives the result
|
|
\begin{verbatim}
|
|
lin
|
|
PropKind car_N old_A = AdjCN (PositA old_A) (UseN car_N) ;
|
|
\end{verbatim}
|
|
However, it is possible to write a special lexicon that gives atomic rules for
|
|
all those categories that can be used as arguments, for instance,
|
|
\begin{verbatim}
|
|
fun
|
|
cat_CN : CN ;
|
|
old_AP : AP ;
|
|
\end{verbatim}
|
|
and then use this lexicon instead of the standard one included in Lang.
|
|
\end{itemize}
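
Such a special lexicon could be sketched as follows. The module names ArgLex and
ArgLexEng and the identifiers car\_CN and old\_AP are hypothetical; the sketch
assumes that the English resource modules are named CatEng, GrammarEng, and
ParadigmsEng, as in the Music example, and it only uses constructors that appear
elsewhere in this document.

\begin{verbatim}
  -- Hypothetical lexicon with atomic constants in phrasal categories.
  abstract ArgLex = Cat ** {
    fun
      car_CN : CN ;
      old_AP : AP ;
  }

  -- The phrasal constants are defined from ordinary lexical items.
  concrete ArgLexEng of ArgLex = CatEng ** open GrammarEng, ParadigmsEng in {
    lin
      car_CN = UseN (regN "car") ;
      old_AP = PositA (regA "old") ;
  }
\end{verbatim}

With these constants available to the parser, the arguments of a rule such as
PropKind can be bound directly as a CN and an AP, so that no UseN or PositA
wrappers need to appear in the parsed tree.
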
|
|
|
|
\subsection{Special-purpose APIs}
|
|
To give an analogy with a well-known typesetting program, GF can be compared
|
|
with TeX and the resource grammar library with LaTeX. As TeX frees the author
|
|
from thinking about low-level problems of page layout, so GF frees the grammarian
|
|
from writing parsing and generation algorithms. But quite a lot of knowledge of
|
|
\textit{how} to write grammars is still needed, and the resource grammar library helps
|
|
GF grammarians in a way similar to how the LaTeX macro package helps TeX authors.
|
|
|
|
But even LaTeX is often too detailed and low-level, and users are encouraged to
|
|
develop their own macro packages. The same applies to GF resource grammars:
|
|
the application grammarian might not need all the choices that the resource
|
|
provides, but would prefer less writing and higher-level programming.
|
|
To this end, application grammarians may want to write their own views on the
|
|
resource grammar. An example of this is already provided, in mathematical/Predication.
|
|
Instead of the NP-VP structure, it permits clause construction directly from
|
|
verbs and adjectives and their arguments:
|
|
|
|
\begin{verbatim}
|
|
predV : V -> NP -> Cl ; -- "x converges"
|
|
predV2 : V2 -> NP -> NP -> Cl ; -- "x intersects y"
|
|
predV3 : V3 -> NP -> NP -> NP -> Cl ; -- "x intersects y at z"
|
|
predVColl : V -> NP -> NP -> Cl ; -- "x and y intersect"
|
|
predA : A -> NP -> Cl ; -- "x is even"
|
|
predA2 : A2 -> NP -> NP -> Cl ; -- "x is divisible by y"
|
|
\end{verbatim}
|
|
The implementation of this module is the functor PredicationI:
|
|
|
|
\begin{verbatim}
|
|
predV v x = PredVP x (UseV v) ;
|
|
predV2 v x y = PredVP x (ComplV2 v y) ;
|
|
predV3 v x y z = PredVP x (ComplV3 v y z) ;
|
|
predVColl v x y = PredVP (ConjNP and_Conj (BaseNP x y)) (UseV v) ;
|
|
predA a x = PredVP x (UseComp (CompAP (PositA a))) ;
|
|
predA2 a x y = PredVP x (UseComp (CompAP (ComplA2 a y))) ;
|
|
\end{verbatim}
|
|
Of course, Predication can be opened together with Grammar, but using
|
|
the resulting grammar for parsing can be frustrating, since having both
|
|
ways of building clauses simultaneously available will produce spurious
|
|
ambiguities. Using Predication without Verb for parsing is a better idea,
|
|
since parsing is also made more efficient without the VP category.
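
To illustrate how an application might use this API, here is a minimal sketch of
a functor for a small arithmetic domain. The modules Arith, ArithI, and ArithLex
and the constants even\_A and divisible\_A2 are hypothetical; we also assume that
Predication can be opened in a functor alongside Grammar, as suggested above.

\begin{verbatim}
  -- Hypothetical domain ontology.
  abstract Arith = {
    cat
      Prop ; Term ;
    fun
      Even      : Term -> Prop ;
      Divisible : Term -> Term -> Prop ;
  }

  -- ArithLex is assumed to declare even_A : A and divisible_A2 : A2.
  incomplete concrete ArithI of Arith = open Grammar, Predication, ArithLex in {
    lincat
      Prop = Cl ;
      Term = NP ;
    lin
      Even x        = predA even_A x ;            -- "x is even"
      Divisible x y = predA2 divisible_A2 x y ;   -- "x is divisible by y"
  }
\end{verbatim}

By the definitions given above, predA even\_A x unfolds to
PredVP x (UseComp (CompAP (PositA even\_A))), so the application rule is simply
a shorter way of writing the same resource tree.
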
|
|
|
|
The use of special-purpose APIs is to some extent to be seen as an alternative
|
|
to grammar writing by parsing, and its importance may decrease as parsing
|
|
with the resource grammars gets more efficient.
|
|
|
|
\section{Overview of syntactic structures}
|
|
\subsection{Texts, phrases, and utterances}
|
|
The outermost linguistic structure is Text. Texts are composed
|
|
from Phrases followed by punctuation marks - one of ".", "?" or
|
|
"!" (with their proper variants in Spanish and Arabic). Here is an
|
|
example of a Text.
|
|
|
|
\begin{verbatim}
|
|
John walks. Why? He doesn't want to sleep!
|
|
\end{verbatim}
|
|
Phrases are mostly built from Utterances, which in turn are
|
|
declarative sentences, questions, or imperatives - but there
|
|
are also "one-word utterances" consisting of noun phrases
|
|
or other subsentential phrases. Some Phrases are atomic,
|
|
for instance "yes" and "no". Here are some examples of Phrases.
|
|
|
|
\begin{verbatim}
|
|
yes
|
|
come on, John
|
|
but John walks
|
|
give me the stick please
|
|
don't you know that he is sleeping
|
|
a glass of wine
|
|
a glass of wine please
|
|
\end{verbatim}
|
|
There is no connection between the punctuation marks and the
|
|
types of utterances. This reflects the fact that the punctuation
|
|
mark in a real text is selected as a function of the speech act
|
|
rather than the grammatical form of an utterance. The following
|
|
text is thus well-formed.
|
|
|
|
\begin{verbatim}
|
|
John walks. John walks? John walks!
|
|
\end{verbatim}
|
|
What is the difference between Phrase and Utterance? Just technical:
|
|
a Phrase is an Utterance with an optional leading conjunction ("but")
|
|
and an optional trailing vocative ("John", "please").
|
|
|
|
\subsection{Sentences and clauses}
|
|
The richest of the categories below Utterance is S, Sentence. A Sentence
|
|
is formed from a Clause, by fixing its Tense, Anteriority, and Polarity.
|
|
The difference between Sentence and Clause is thus also rather technical.
|
|
For example, each of the following strings has a distinct syntax tree
|
|
in the category Sentence:
|
|
|
|
\begin{verbatim}
|
|
John walks
|
|
John doesn't walk
|
|
John walked
|
|
John didn't walk
|
|
John has walked
|
|
John hasn't walked
|
|
John will walk
|
|
John won't walk
|
|
...
|
|
\end{verbatim}
|
|
whereas in the category Clause all of them are just different forms of
|
|
the same tree.
|
|
|
|
The following syntax tree of the Text "John walks." gives an overview
|
|
of the structural levels.
|
|
|
|
\begin{verbatim}
|
|
Node Constructor Value type Other constructors
|
|
-----------------------------------------------------------
|
|
1. TFullStop Text TQuestMark
|
|
2. (PhrUtt Phr
|
|
3. NoPConj PConj but_PConj
|
|
4. (UttS Utt UttQS
|
|
5. (UseCl S UseQCl
|
|
6. TPres Tense TPast
|
|
7. ASimul Anter AAnter
|
|
8. PPos Pol PNeg
|
|
9. (PredVP Cl
|
|
10. (UsePN NP UsePron, DetCN
|
|
11. john_PN) PN mary_PN
|
|
12. (UseV VP ComplV2, ComplV3
|
|
13. walk_V)))) V sleep_V
|
|
14. NoVoc) Voc please_Voc
|
|
15. TEmpty Text
|
|
\end{verbatim}
|
|
Here are some examples of the results of changing constructors.
|
|
|
|
\begin{verbatim}
|
|
1. TFullStop -> TQuestMark John walks?
|
|
3. NoPConj -> but_PConj But John walks.
|
|
6. TPres -> TPast John walked.
|
|
7. ASimul -> AAnter John has walked.
|
|
8. PPos -> PNeg John doesn't walk.
|
|
11. john_PN -> mary_PN Mary walks.
|
|
13. walk_V -> sleep_V John sleeps.
|
|
14. NoVoc -> please_Voc John walks please.
|
|
\end{verbatim}
|
|
Of course, not all constructors can be changed so freely, because the
|
|
resulting tree would not remain well-typed. Here are some changes involving
|
|
many constructors:
|
|
|
|
\begin{verbatim}
|
|
4- 5. UttS (UseCl ...) ->
|
|
UttQS (UseQCl (... QuestCl ...)) Does John walk?
|
|
10-11. UsePN john_PN ->
|
|
UsePron we_Pron We walk.
|
|
12-13. UseV walk_V ->
|
|
ComplV2 love_V2 this_NP John loves this.
|
|
\end{verbatim}
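
For reference, the numbered table above can be read off as a single GF term.
The first tree below is the Text "John walks." written out in full; the second
applies the change on lines 10-11, which according to the list above yields
"We walk.". The terms are assembled by hand from the table and have not been
checked against a particular version of the library.

\begin{verbatim}
  TFullStop (PhrUtt NoPConj (UttS (UseCl TPres ASimul PPos
    (PredVP (UsePN john_PN) (UseV walk_V)))) NoVoc) TEmpty

  TFullStop (PhrUtt NoPConj (UttS (UseCl TPres ASimul PPos
    (PredVP (UsePron we_Pron) (UseV walk_V)))) NoVoc) TEmpty
\end{verbatim}

The two terms differ only in the subterm of type NP, which is why this
substitution keeps the tree well-typed.
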
|
|
|
|
\subsection{Parts of sentences}
|
|
The linguistic phenomena most discussed in both traditional grammars and modern
|
|
syntax belong to the level of Clauses, that is, lines 9-13, and occasionally
|
|
to Sentences, lines 5-13. At this level, the major categories are
|
|
NP (Noun Phrase) and VP (Verb Phrase). A Clause typically consists of just an
|
|
NP and a VP. The internal structure of both NP and VP can be very complex,
|
|
and these categories are mutually recursive: not only can a VP contain an NP,
|
|
|
|
\begin{verbatim}
|
|
[VP loves [NP Mary]]
|
|
\end{verbatim}
|
|
but an NP can also contain a VP
|
|
|
|
\begin{verbatim}
|
|
[NP every man [RS who [VP walks]]]
|
|
\end{verbatim}
|
|
(a labelled bracketing like this is of course just a rough approximation of
|
|
a GF syntax tree, but still a useful device of exposition).
|
|
|
|
Most of the resource modules thus define functions that are used inside
|
|
NPs and VPs. Here is a brief overview:
|
|
|
|
Noun: How to construct NPs. The three main mechanisms
|
|
for constructing NPs are
|
|
|
|
\begin{itemize}
|
|
\item from proper names: John
|
|
\item from pronouns: we
|
|
\item from common nouns by determiners: this man
|
|
\end{itemize}
|
|
|
|
The Noun module also defines the construction of common nouns. The most frequent ways are
|
|
|
|
\begin{itemize}
|
|
\item lexical noun items: man
|
|
\item adjectival modification: old man
|
|
\item relative clause modification: man who sleeps
|
|
\item application of relational nouns: successor of the number
|
|
\end{itemize}
|
|
|
|
Verb: How to construct VPs. The main mechanism is verbs with their arguments, for instance,
|
|
|
|
\begin{itemize}
|
|
\item one-place verbs: walks
|
|
\item two-place verbs: loves Mary
|
|
\item three-place verbs: gives her a kiss
|
|
\item sentence-complement verbs: says that it is cold
|
|
\item VP-complement verbs: wants to give her a kiss
|
|
\end{itemize}
|
|
|
|
A special verb is the copula, "be" in English but not even realized
|
|
by a verb in all languages.
|
|
A copula can take different kinds of complement:
|
|
|
|
\begin{itemize}
|
|
\item an adjectival phrase: (John is) old
|
|
\item an adverb: (John is) here
|
|
\item a noun phrase: (John is) a man
|
|
\end{itemize}
|
|
|
|
Adjective: How to construct APs. The main ways are
|
|
|
|
\begin{itemize}
|
|
\item positive forms of adjectives: old
|
|
\item comparative forms with object of comparison: older than John
|
|
\end{itemize}
|
|
|
|
Adverb: How to construct Advs. The main ways are
|
|
|
|
\begin{itemize}
|
|
\item from adjectives: slowly
|
|
\end{itemize}
|
|
|
|
\subsection{Modules and their names}
|
|
The resource modules are named after the kind of phrases that are constructed in them,
|
|
and they can be roughly classified by the "level" or "size" of expressions that are
|
|
formed in them:
|
|
|
|
\begin{itemize}
|
|
\item Larger than sentence: Text, Phrase
|
|
\item Same level as sentence: Sentence, Question, Relative
|
|
\item Parts of sentence: Adjective, Adverb, Noun, Verb
|
|
\item Cross-cut: Conjunction
|
|
\end{itemize}
|
|
|
|
Because of mutual recursion such as in embedded sentences, this classification is
|
|
not a complete order. However, no mutual dependence is needed between the
|
|
modules in a formal sense - they can all be compiled separately. This is due
|
|
to the module Cat, which defines the type system common to the other modules.
|
|
For instance, the types NP and VP are defined in Cat, and the module Verb only
|
|
needs to know what is given in Cat, not what is given in Noun. To implement
|
|
a rule such as
|
|
|
|
\begin{verbatim}
|
|
Verb.ComplV2 : V2 -> NP -> VP
|
|
\end{verbatim}
|
|
it is enough to know the linearization type of NP (as well as those of V2 and VP, all
|
|
given in Cat). It is not necessary to know what
|
|
ways there are to build NPs (given in Noun), since all these ways must
|
|
conform to the linearization type defined in Cat. Thus the format of
|
|
category-specific modules is as follows:
|
|
|
|
\begin{verbatim}
|
|
abstract Adjective = Cat ** {...}
|
|
abstract Noun = Cat ** {...}
|
|
abstract Verb = Cat ** {...}
|
|
\end{verbatim}
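
As a sketch of what the elided bodies may contain, here are a few function
declarations whose types can be read off from their uses elsewhere in this
document. The selection is illustrative rather than the actual file contents,
and the category name Pron is an assumption based on constants such as
she\_Pron.

\begin{verbatim}
  abstract Noun = Cat ** {
    fun
      UsePN   : PN -> NP ;          -- John
      UsePron : Pron -> NP ;        -- we
      AdjCN   : AP -> CN -> CN ;    -- old man
  }

  abstract Verb = Cat ** {
    fun
      UseV    : V -> VP ;           -- walks
      ComplV2 : V2 -> NP -> VP ;    -- loves Mary
  }
\end{verbatim}

Both modules mention NP, but neither imports the other: the only shared
dependency is Cat, which is what allows them to be compiled separately.
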
|
|
|
|
\subsection{Top-level grammar and lexicon}
|
|
The module Grammar collects all the category-specific modules into
|
|
a complete grammar:
|
|
|
|
\begin{verbatim}
|
|
abstract Grammar =
|
|
Adjective, Noun, Verb, ..., Structural, Idiom
|
|
\end{verbatim}
|
|
The module Structural is a lexicon of structural words (function words),
|
|
such as determiners.
|
|
The module Idiom is a collection of idiomatic structures whose
|
|
implementation is very language-dependent. An example is existential
|
|
structures ("there is", "es gibt", "il y a", etc).
|
|
|
|
The module Lang combines Grammar with a Lexicon of ca. 350 content words:
|
|
|
|
\begin{verbatim}
|
|
abstract Lang = Grammar, Lexicon
|
|
\end{verbatim}
|
|
Using Lang instead of Grammar as a library may give the advantage of providing
|
|
for free some words needed in an application. But its main purpose is to
|
|
help testing the resource library. It does not seem possible to maintain
|
|
a general-purpose multilingual lexicon, and this is the form that the module
|
|
Lexicon has.
|
|
|
|
\subsection{Language-specific syntactic structures}
|
|
The API collected in Grammar has been designed to be implementable for
|
|
all languages in the resource package. It does contain some rules that
|
|
are strange or superfluous in some languages; for instance, the distinction
|
|
between definite and indefinite articles does not apply to Finnish and Russian.
|
|
But such rules are still easy to implement: they only create some superfluous
|
|
ambiguity in the languages in question.
|
|
|
|
But the library makes no claim that all languages should have exactly the same
|
|
abstract syntax. The common API is therefore extended by language-dependent
|
|
rules. The top level of each language looks as follows (with English as example):
|
|
|
|
\begin{verbatim}
|
|
abstract English = Grammar, ExtraEngAbs, DictEngAbs
|
|
\end{verbatim}
|
|
where ExtraEngAbs is a collection of syntactic structures specific to English,
|
|
and DictEngAbs is an English dictionary (at the moment, it consists of IrregEngAbs,
|
|
the irregular verbs of English). Each of these language-specific grammars has
|
|
the potential to grow into a full-scale grammar of the language. These grammars
|
|
can also be used as libraries, but the possibility of using functors is lost.
|
|
|
|
To give a better overview of language-specific structures, modules like ExtraEngAbs
|
|
are built from a language-independent module ExtraAbs by restricted inheritance:
|
|
|
|
\begin{verbatim}
|
|
abstract ExtraEngAbs = Extra [f,g,...]
|
|
\end{verbatim}
|
|
Thus any category and function in Extra may be shared by a subset of all
|
|
languages. One can see this set-up as a matrix, which tells what Extra structures
|
|
are implemented in what languages. For the common API in Grammar, the matrix
|
|
is filled with 1's (everything is implemented in every language).
|
|
|
|
Language-specific extensions and the use of restricted
inheritance are recent additions to the resource grammar library, and
have only been exploited on a very small scale so far.
|
|
|
|
\section{API Documentation}
|
|
\subsection{Top-level modules}
|
|
|
|
\subsubsection{Grammar}
|
|
This grammar is a collection of the different grammar modules.
|
|
To test the resource, import \htmladdnormallink{Lang}{Lang.html}, which also contains
|
|
a lexicon.
|
|
|
|
\begin{verbatim}
|
|
abstract Grammar =
|
|
Noun,
|
|
Verb,
|
|
Adjective,
|
|
Adverb,
|
|
Numeral,
|
|
Sentence,
|
|
Question,
|
|
Relative,
|
|
Conjunction,
|
|
Phrase,
|
|
Text,
|
|
Structural,
|
|
Idiom
|
|
** {} ;
|
|
\end{verbatim}
|
|
|
|
\commOut{Produced by
|
|
gfdoc - a rudimentary GF document generator.
|
|
(c) Aarne Ranta (\htmladdnormallink{aarne@cs.chalmers.se}{mailto:aarne@cs.chalmers.se}) 2002 under GNU GPL.}
|
|
|
|
|
|
\subsubsection{Grammar with lexicon}
|
|
This grammar is just a collection of the different modules,
|
|
and the one that can be imported when one wants to test the
|
|
grammar. A module without a lexicon is \htmladdnormallink{Grammar}{Grammar.html},
|
|
which may be more suitable to open in applications.
|
|
|
|
\begin{verbatim}
|
|
abstract Lang =
|
|
Grammar,
|
|
Lexicon
|
|
** {} ;
|
|
\end{verbatim}
|
|
|
|
\subsection{Type system}
|
|
\commOut{Produced by
|
|
gfdoc - a rudimentary GF document generator.
|
|
(c) Aarne Ranta (\htmladdnormallink{aarne@cs.chalmers.se}{mailto:aarne@cs.chalmers.se}) 2002 under GNU GPL.}
|
|
|
|
|
|
\subsubsection{The category system}
|
|
The category system is central to the library in the sense
|
|
that the other modules (\texttt{Adjective}, \texttt{Adverb}, \texttt{Noun}, \texttt{Verb} etc)
|
|
communicate through it. This means that e.g. a function using
|
|
\texttt{NP}s in \texttt{Verb} need not know how \texttt{NP}s are constructed in \texttt{Noun}:
|
|
it is enough that both \texttt{Verb} and \texttt{Noun} use the same type \texttt{NP},
|
|
which is given here in \texttt{Cat}.
|
|
|
|
Some categories are inherited from \htmladdnormallink{Common}{Common.html}.
|
|
The reason they are defined there is that they have the same
|
|
implementation in all languages in the resource (typically,
|
|
just a string). These categories are
|
|
\texttt{AdA, AdN, AdV, Adv, Ant, CAdv, IAdv, PConj, Phr},
|
|
\texttt{Pol, SC, Tense, Text, Utt, Voc}.
|
|
|
|
Moreover, the list categories \texttt{ListAdv, ListAP, ListNP, ListS}
|
|
are defined in \texttt{Conjunction} and only used locally there.
|
|
|
|
\begin{verbatim}
|
|
abstract Cat = Common ** {
|
|
|
|
cat
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Sentences and clauses}
|
|
Constructed in \htmladdnormallink{Sentence}{Sentence.html}, and also in
|
|
\htmladdnormallink{Idiom}{Idiom.html}.
|
|
|
|
\begin{verbatim}
|
|
S ; -- declarative sentence e.g. "she lived here"
|
|
QS ; -- question e.g. "where did she live"
|
|
RS ; -- relative e.g. "in which she lived"
|
|
Cl ; -- declarative clause, with all tenses e.g. "she looks at this"
|
|
Slash ; -- clause missing NP (S/NP in GPSG) e.g. "she looks at"
|
|
Imp ; -- imperative e.g. "look at this"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Questions and interrogatives}
|
|
Constructed in \htmladdnormallink{Question}{Question.html}.
|
|
|
|
\begin{verbatim}
|
|
QCl ; -- question clause, with all tenses e.g. "why does she walk"
|
|
IP ; -- interrogative pronoun e.g. "who"
|
|
IComp ; -- interrogative complement of copula e.g. "where"
|
|
IDet ; -- interrogative determiner e.g. "which"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Relative clauses and pronouns}
|
|
Constructed in \htmladdnormallink{Relative}{Relative.html}.
|
|
|
|
\begin{verbatim}
|
|
RCl ; -- relative clause, with all tenses e.g. "in which she lives"
|
|
RP ; -- relative pronoun e.g. "in which"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verb phrases}
|
|
Constructed in \htmladdnormallink{Verb}{Verb.html}.
|
|
|
|
\begin{verbatim}
|
|
VP ; -- verb phrase e.g. "is very warm"
|
|
Comp ; -- complement of copula, such as AP e.g. "very warm"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectival phrases}
|
|
Constructed in \htmladdnormallink{Adjective}{Adjective.html}.
|
|
|
|
\begin{verbatim}
|
|
AP ; -- adjectival phrase e.g. "very warm"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Nouns and noun phrases}
|
|
Constructed in \htmladdnormallink{Noun}{Noun.html}.
|
|
Many atomic noun phrases e.g. \textit{everybody}
|
|
are constructed in \htmladdnormallink{Structural}{Structural.html}.
|
|
The determiner structure is
|
|
|
|
\begin{verbatim}
|
|
Predet (QuantSg | QuantPl Num) Ord
|
|
\end{verbatim}
|
|
as defined in \htmladdnormallink{Noun}{Noun.html}.
|
|
|
|
\begin{verbatim}
|
|
CN ; -- common noun (without determiner) e.g. "red house"
|
|
NP ; -- noun phrase (subject or object) e.g. "the red house"
|
|
Pron ; -- personal pronoun e.g. "she"
|
|
Det ; -- determiner phrase e.g. "all the seven"
|
|
Predet; -- predeterminer (prefixed Quant) e.g. "all"
|
|
QuantSg;-- quantifier ('nucleus' of sing. Det) e.g. "every"
|
|
QuantPl;-- quantifier ('nucleus' of plur. Det) e.g. "many"
|
|
Quant ; -- quantifier with both sg and pl e.g. "this/these"
|
|
Num ; -- cardinal number (used with QuantPl) e.g. "seven"
|
|
Ord ; -- ordinal number (used in Det) e.g. "seventh"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Numerals}
|
|
Constructed in \htmladdnormallink{Numeral}{Numeral.html}.
|
|
|
|
\begin{verbatim}
|
|
Numeral;-- cardinal or ordinal, e.g. "five/fifth"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Structural words}
|
|
Constructed in \htmladdnormallink{Structural}{Structural.html}.
|
|
|
|
\begin{verbatim}
|
|
Conj ; -- conjunction, e.g. "and"
|
|
DConj ; -- distributed conj. e.g. "both - and"
|
|
Subj ; -- subjunction, e.g. "if"
|
|
Prep ; -- preposition, or just case e.g. "in"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Words of open classes}
|
|
These are constructed in \htmladdnormallink{Lexicon}{Lexicon.html} and in
|
|
additional lexicon modules.
|
|
|
|
\begin{verbatim}
|
|
V ; -- one-place verb e.g. "sleep"
|
|
V2 ; -- two-place verb e.g. "love"
|
|
V3 ; -- three-place verb e.g. "show"
|
|
VV ; -- verb-phrase-complement verb e.g. "want"
|
|
VS ; -- sentence-complement verb e.g. "claim"
|
|
VQ ; -- question-complement verb e.g. "ask"
|
|
VA ; -- adjective-complement verb e.g. "look"
|
|
V2A ; -- verb with NP and AP complement e.g. "paint"
|
|
|
|
A ; -- one-place adjective e.g. "warm"
|
|
A2 ; -- two-place adjective e.g. "divisible"
|
|
|
|
N ; -- common noun e.g. "house"
|
|
N2 ; -- relational noun e.g. "son"
|
|
N3 ; -- three-place relational noun e.g. "connection"
|
|
PN ; -- proper name e.g. "Paris"
|
|
|
|
}
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Infrastructure with common implementations}
|
|
This module defines the categories that uniformly have the linearization
|
|
\texttt{\{s : Str\}} in all languages.
|
|
Moreover, this module defines the abstract parameters of tense, polarity, and
|
|
anteriority, which are used in \htmladdnormallink{Sentence}{Sentence.html} to generate different
|
|
forms of sentences. Together they give 4 tenses x 2 polarities x 2 anteriorities = 16 sentence forms.
|
|
These tenses are defined for all languages in the library. More tenses
|
|
can be defined in the language extensions, e.g. the \textit{passé simple} of
|
|
Romance languages.
|
|
|
|
\begin{verbatim}
|
|
abstract Common = {
|
|
|
|
cat
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Top-level units}
|
|
Constructed in \htmladdnormallink{Text}{Text.html}: \texttt{Text}.
|
|
|
|
\begin{verbatim}
|
|
Text ; -- text consisting of several phrases e.g. "He is here. Why?"
|
|
\end{verbatim}
|
|
|
|
Constructed in \htmladdnormallink{Phrase}{Phrase.html}:
|
|
|
|
\begin{verbatim}
|
|
Phr ; -- phrase in a text e.g. "but be quiet please"
|
|
Utt ; -- sentence, question, word... e.g. "be quiet"
|
|
Voc ; -- vocative or "please" e.g. "my darling"
|
|
PConj ; -- phrase-beginning conj. e.g. "therefore"
|
|
\end{verbatim}
|
|
|
|
Constructed in \htmladdnormallink{Sentence}{Sentence.html}:
|
|
|
|
\begin{verbatim}
|
|
SC ; -- embedded sentence or question e.g. "that it rains"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adverbs}
|
|
Constructed in \htmladdnormallink{Adverb}{Adverb.html}.
|
|
Many adverbs are constructed in \htmladdnormallink{Structural}{Structural.html}.
|
|
|
|
\begin{verbatim}
|
|
Adv ; -- verb-phrase-modifying adverb, e.g. "in the house"
|
|
AdV ; -- adverb directly attached to verb e.g. "always"
|
|
AdA ; -- adjective-modifying adverb, e.g. "very"
|
|
AdN ; -- numeral-modifying adverb, e.g. "more than"
|
|
IAdv ; -- interrogative adverb e.g. "why"
|
|
CAdv ; -- comparative adverb e.g. "more"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Tense, polarity, and anteriority}
|
|
\begin{verbatim}
|
|
Tense ; -- tense: present, past, future, conditional
|
|
Pol ; -- polarity: positive, negative
|
|
Ant ; -- anteriority: simultaneous, anterior
|
|
|
|
fun
|
|
PPos, PNeg : Pol ; -- I sleep/don't sleep
|
|
|
|
TPres : Tense ;
|
|
ASimul : Ant ;
|
|
TPast, TFut, TCond : Tense ; -- I slept/will sleep/would sleep --# notpresent
|
|
AAnter : Ant ; -- I have slept --# notpresent
|
|
|
|
}
|
|
\end{verbatim}
|
|
|
|
\subsection{Phrase category modules}
|
|
|
|
|
|
|
|
\subsubsection{Adjectives and adjectival phrases}
|
|
\begin{verbatim}
|
|
abstract Adjective = Cat ** {
|
|
|
|
fun
|
|
\end{verbatim}
|
|
|
|
The principal ways of forming an adjectival phrase are
|
|
positive, comparative, relational, reflexive-relational, and
|
|
elliptic-relational.
|
|
(The superlative use is covered in \htmladdnormallink{Noun}{Noun.html}\texttt{.OrdSuperl}.)
|
|
|
|
\begin{verbatim}
|
|
PositA : A -> AP ; -- warm
|
|
ComparA : A -> NP -> AP ; -- warmer than Spain
|
|
ComplA2 : A2 -> NP -> AP ; -- divisible by 2
|
|
ReflA2 : A2 -> AP ; -- divisible by itself
|
|
UseA2 : A2 -> A ; -- divisible
|
|
\end{verbatim}
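As a rough sketch of how these rules are applied, the following trees use
only the functions above together with \texttt{UsePN} from \texttt{Noun}.
The constants \texttt{warm\_A} and \texttt{paris\_PN} are assumed to be
available from \texttt{Lexicon}, and the comments give the intended English
readings rather than verified output.

\begin{verbatim}
  -- warm_A, paris_PN: assumed Lexicon entries
  PositA warm_A                        -- warm
  ComparA warm_A (UsePN paris_PN)      -- warmer than Paris
\end{verbatim}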
|
|
|
|
Sentence and question complements are defined for all adjectival
phrases, although the semantics is only clear for some adjectives.
|
|
|
|
\begin{verbatim}
|
|
SentAP : AP -> SC -> AP ; -- great that she won, uncertain if she did
|
|
\end{verbatim}
|
|
|
|
An adjectival phrase can be modified by an \textbf{adadjective}, such as \textit{very}.
|
|
|
|
\begin{verbatim}
|
|
AdAP : AdA -> AP -> AP ; -- very uncertain
|
|
\end{verbatim}
|
|
|
|
The formation of adverbs from adjectives (e.g. \textit{quickly}) is covered
|
|
by \htmladdnormallink{Adverb}{Adverb.html}.
|
|
|
|
\begin{verbatim}
|
|
}
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Adverbs and adverbial phrases}
|
|
\begin{verbatim}
|
|
abstract Adverb = Cat ** {
|
|
|
|
fun
|
|
\end{verbatim}
|
|
|
|
The two main ways of forming adverbs are from adjectives and by
|
|
prepositions from noun phrases.
|
|
|
|
\begin{verbatim}
|
|
PositAdvAdj : A -> Adv ; -- quickly
|
|
PrepNP : Prep -> NP -> Adv ; -- in the house
|
|
\end{verbatim}
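A prepositional adverb can be sketched as follows. The noun phrase is built
with \texttt{DetCN}, \texttt{DetSg}, \texttt{SgQuant}, \texttt{DefArt},
\texttt{NoOrd} and \texttt{UseN} from \texttt{Noun}, and \texttt{in\_Prep}
comes from \texttt{Structural}. The noun \texttt{house\_N} is assumed to be a
\texttt{Lexicon} entry, and the comment is the intended reading only.

\begin{verbatim}
  -- house_N: assumed Lexicon entry
  PrepNP in_Prep
    (DetCN (DetSg (SgQuant DefArt) NoOrd) (UseN house_N))
                                       -- in the house
\end{verbatim}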
|
|
|
|
Comparative adverbs have a noun phrase or a sentence as object of
|
|
comparison.
|
|
|
|
\begin{verbatim}
|
|
ComparAdvAdj : CAdv -> A -> NP -> Adv ; -- more quickly than John
|
|
ComparAdvAdjS : CAdv -> A -> S -> Adv ; -- more quickly than he runs
|
|
\end{verbatim}
|
|
|
|
Adverbs can be modified by 'adadjectives', just like adjectives.
|
|
|
|
\begin{verbatim}
|
|
AdAdv : AdA -> Adv -> Adv ; -- very quickly
|
|
\end{verbatim}
|
|
|
|
Subordinate clauses can function as adverbs.
|
|
|
|
\begin{verbatim}
|
|
SubjS : Subj -> S -> Adv ; -- when he arrives
|
|
AdvSC : SC -> Adv ; -- that he arrives ---- REMOVE?
|
|
\end{verbatim}
|
|
|
|
Comparison adverbs also work as numeral adverbs.
|
|
|
|
\begin{verbatim}
|
|
AdnCAdv : CAdv -> AdN ; -- more (than five)
|
|
|
|
}
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Coordination}
|
|
Coordination is defined for many different categories; here is
|
|
a sample. The rules apply to \textbf{lists} of two or more elements,
|
|
and define two general patterns:
|
|
|
|
\begin{itemize}
|
|
\item ordinary conjunction: X,...,X and X
|
|
\item distributed conjunction: both X,...,X and X
|
|
\end{itemize}
|
|
|
|
\textbf{Note}. This module uses right-recursive lists. If backward
|
|
compatibility with API 0.9 is needed, use
|
|
\htmladdnormallink{SeqConjunction}{SeqConjunction.html}.
|
|
|
|
\begin{verbatim}
|
|
abstract Conjunction = Cat ** {
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Rules}
|
|
\begin{verbatim}
|
|
fun
|
|
ConjS : Conj -> [S] -> S ; -- "John walks and Mary runs"
|
|
ConjAP : Conj -> [AP] -> AP ; -- "even and prime"
|
|
ConjNP : Conj -> [NP] -> NP ; -- "John or Mary"
|
|
ConjAdv : Conj -> [Adv] -> Adv ; -- "quickly or slowly"
|
|
|
|
DConjS : DConj -> [S] -> S ; -- "either John walks or Mary runs"
|
|
DConjAP : DConj -> [AP] -> AP ; -- "both even and prime"
|
|
DConjNP : DConj -> [NP] -> NP ; -- "either John or Mary"
|
|
DConjAdv : DConj -> [Adv] -> Adv; -- "both badly and slowly"
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Categories}
|
|
These categories are only used in this module.
|
|
|
|
\begin{verbatim}
|
|
cat
|
|
[S]{2} ;
|
|
[Adv]{2} ;
|
|
[NP]{2} ;
|
|
[AP]{2} ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{List constructors}
|
|
The list constructors are derived from the list notation and therefore
|
|
not given explicitly. But here are their type signatures:
|
|
|
|
\begin{verbatim}
|
|
-- BaseC : C -> C -> [C] ; -- for C = S, AP, NP, Adv
|
|
-- ConsC : C -> [C] -> [C] ;
|
|
}
|
|
\end{verbatim}
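To illustrate the right-recursive lists, here is a sketch of a two-element
and a three-element coordination of adjectival phrases. \texttt{BaseAP} and
\texttt{ConsAP} are the instances of \texttt{BaseC} and \texttt{ConsC} for
\texttt{C = AP}. The adjectives \texttt{big\_A}, \texttt{red\_A} and
\texttt{warm\_A} are assumed \texttt{Lexicon} entries, and the comments show
the intended readings.

\begin{verbatim}
  -- big_A, red_A, warm_A: assumed Lexicon entries
  ConjAP and_Conj
    (BaseAP (PositA big_A) (PositA warm_A))
                                       -- big and warm
  DConjAP both7and_DConj
    (ConsAP (PositA big_A)
      (BaseAP (PositA red_A) (PositA warm_A)))
                                       -- both big, red and warm
\end{verbatim}

A list always ends in a \texttt{BaseC} with two elements, which is why
coordination of fewer than two elements cannot be expressed.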
|
|
|
|
|
|
|
|
|
|
\subsubsection{Idiomatic expressions}
|
|
\begin{verbatim}
|
|
abstract Idiom = Cat ** {
|
|
\end{verbatim}
|
|
|
|
This module defines constructions that are formed in fixed ways,
|
|
often different even in closely related languages.
|
|
|
|
\begin{verbatim}
|
|
fun
|
|
ImpersCl : VP -> Cl ; -- it rains
|
|
GenericCl : VP -> Cl ; -- one sleeps
|
|
|
|
CleftNP : NP -> RS -> Cl ; -- it is you who did it
|
|
CleftAdv : Adv -> S -> Cl ; -- it is yesterday she arrived
|
|
|
|
ExistNP : NP -> Cl ; -- there is a house
|
|
ExistIP : IP -> QCl ; -- which houses are there
|
|
|
|
ProgrVP : VP -> VP ; -- be sleeping
|
|
|
|
ImpPl1 : VP -> Utt ; -- let's go
|
|
|
|
}
|
|
\end{verbatim}
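A few of these constructions can be sketched as trees. The verbs
\texttt{sleep\_V} and \texttt{go\_V} and the noun \texttt{house\_N} are
assumed \texttt{Lexicon} entries; the noun phrase is built with functions
from \texttt{Noun}, and the comments repeat the intended readings given above.

\begin{verbatim}
  -- sleep_V, go_V, house_N: assumed Lexicon entries
  GenericCl (UseV sleep_V)             -- one sleeps
  ExistNP
    (DetCN (DetSg (SgQuant IndefArt) NoOrd) (UseN house_N))
                                       -- there is a house
  ProgrVP (UseV sleep_V)               -- be sleeping
  ImpPl1 (UseV go_V)                   -- let's go
\end{verbatim}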
|
|
|
|
|
|
|
|
|
|
\subsubsection{The construction of nouns, noun phrases, and determiners}
|
|
\begin{verbatim}
|
|
abstract Noun = Cat ** {
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Noun phrases}
|
|
The three main types of noun phrases are
|
|
|
|
\begin{itemize}
|
|
\item common nouns with determiners
|
|
\item proper names
|
|
\item pronouns
|
|
\end{itemize}
|
|
|
|
\begin{verbatim}
|
|
fun
|
|
DetCN : Det -> CN -> NP ; -- the man
|
|
UsePN : PN -> NP ; -- John
|
|
UsePron : Pron -> NP ; -- he
|
|
\end{verbatim}
|
|
|
|
Pronouns are defined in the module \htmladdnormallink{Structural}{Structural.html}.
|
|
A noun phrase already formed can be modified by a \texttt{Predet}erminer.
|
|
|
|
\begin{verbatim}
|
|
PredetNP : Predet -> NP -> NP; -- only the man
|
|
\end{verbatim}
|
|
|
|
A noun phrase can also be postmodified by the past participle of a
|
|
verb or by an adverb.
|
|
|
|
\begin{verbatim}
|
|
PPartNP : NP -> V2 -> NP ; -- the number squared
|
|
AdvNP : NP -> Adv -> NP ; -- Paris at midnight
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Determiners}
|
|
The determiner has a fine-grained structure, in which a 'nucleus'
|
|
quantifier and two optional parts can be discerned.
|
|
The cardinal numeral is only available for plural determiners.
|
|
(This is modified from CLE by further dividing their \texttt{Num} into
|
|
cardinal and ordinal.)
|
|
|
|
\begin{verbatim}
|
|
DetSg : QuantSg -> Ord -> Det ; -- this best man
|
|
DetPl : QuantPl -> Num -> Ord -> Det ; -- these five best men
|
|
\end{verbatim}
|
|
|
|
Quantifiers that have both forms can be used in both ways.
|
|
|
|
\begin{verbatim}
|
|
SgQuant : Quant -> QuantSg ; -- this
|
|
PlQuant : Quant -> QuantPl ; -- these
|
|
\end{verbatim}
|
|
|
|
Pronouns have possessive forms. Genitives of other kinds
|
|
of noun phrases are not given here, since they are not possible
|
|
in e.g. Romance languages.
|
|
|
|
\begin{verbatim}
|
|
PossPron : Pron -> Quant ; -- my (house)
|
|
\end{verbatim}
|
|
|
|
All parts of the determiner can be empty, except \texttt{Quant}, which is
|
|
the \textit{kernel} of a determiner.
|
|
|
|
\begin{verbatim}
|
|
NoNum : Num ;
|
|
NoOrd : Ord ;
|
|
\end{verbatim}
|
|
|
|
\texttt{Num} consists of either digits or numeral words.
|
|
|
|
\begin{verbatim}
|
|
NumInt : Int -> Num ; -- 51
|
|
NumNumeral : Numeral -> Num ; -- fifty-one
|
|
\end{verbatim}
|
|
|
|
The construction of numerals is defined in \htmladdnormallink{Numeral}{Numeral.html}.
|
|
\texttt{Num} can be modified by certain adverbs.
|
|
|
|
\begin{verbatim}
|
|
AdNum : AdN -> Num -> Num ; -- almost 51
|
|
\end{verbatim}
|
|
|
|
\texttt{Ord} consists of either digits or numeral words.
|
|
|
|
\begin{verbatim}
|
|
OrdInt : Int -> Ord ; -- 51st
|
|
OrdNumeral : Numeral -> Ord ; -- fifty-first
|
|
\end{verbatim}
|
|
|
|
Superlative forms of adjectives behave syntactically in the same way as
|
|
ordinals.
|
|
|
|
\begin{verbatim}
|
|
OrdSuperl : A -> Ord ; -- largest
|
|
\end{verbatim}
|
|
|
|
Definite and indefinite constructions are sometimes realized as
|
|
neatly distinct words (Spanish \textit{un, unos ; el, los}) but also without
|
|
any particular word (Finnish; Swedish definites).
|
|
|
|
\begin{verbatim}
|
|
DefArt : Quant ; -- the (house), the (houses)
|
|
IndefArt : Quant ; -- a (house), (houses)
|
|
\end{verbatim}
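The determiner rules introduced so far can be combined as in the following
sketch. \texttt{UseN} is defined later in this module, \texttt{this\_Quant}
and \texttt{i\_Pron} come from \texttt{Structural}, and the lexical items
\texttt{house\_N}, \texttt{man\_N} and \texttt{good\_A} are assumed
\texttt{Lexicon} entries. The comments are intended readings, not verified
output.

\begin{verbatim}
  -- house_N, man_N, good_A: assumed Lexicon entries
  DetCN (DetSg (SgQuant DefArt) NoOrd) (UseN house_N)
                                       -- the house
  DetCN (DetSg (SgQuant (PossPron i_Pron)) NoOrd) (UseN house_N)
                                       -- my house
  DetCN (DetPl (PlQuant this_Quant) (NumInt 5) (OrdSuperl good_A))
        (UseN man_N)                   -- these 5 best men
\end{verbatim}

Note that the optional parts are never omitted from the tree: an absent
ordinal or cardinal is written explicitly as \texttt{NoOrd} or \texttt{NoNum}.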
|
|
|
|
Nouns can be used without an article as mass nouns. The resource does
|
|
not distinguish mass nouns from other common nouns, which can result
|
|
in semantically odd expressions.
|
|
|
|
\begin{verbatim}
|
|
MassDet : QuantSg ; -- (beer)
|
|
\end{verbatim}
|
|
|
|
Other determiners are defined in \htmladdnormallink{Structural}{Structural.html}.
|
|
|
|
\subsubsubsection{Common nouns}
|
|
Simple nouns can be used as nouns outright.
|
|
|
|
\begin{verbatim}
|
|
UseN : N -> CN ; -- house
|
|
\end{verbatim}
|
|
|
|
Relational nouns take one or two arguments.
|
|
|
|
\begin{verbatim}
|
|
ComplN2 : N2 -> NP -> CN ; -- son of the king
|
|
ComplN3 : N3 -> NP -> N2 ; -- flight from Moscow (to Paris)
|
|
\end{verbatim}
|
|
|
|
Relational nouns can also be used without their arguments.
|
|
The semantics is typically derivative of the relational meaning.
|
|
|
|
\begin{verbatim}
|
|
UseN2 : N2 -> CN ; -- son
|
|
UseN3 : N3 -> CN ; -- flight
|
|
\end{verbatim}
|
|
|
|
Nouns can be modified by adjectives, relative clauses, and adverbs
|
|
(the last rule will give rise to many 'PP attachment' ambiguities
|
|
when used in connection with verb phrases).
|
|
|
|
\begin{verbatim}
|
|
AdjCN : AP -> CN -> CN ; -- big house
|
|
RelCN : CN -> RS -> CN ; -- house that John owns
|
|
AdvCN : CN -> Adv -> CN ; -- house on the hill
|
|
\end{verbatim}
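As a sketch, the three kinds of modification look as follows. The relative
sentence is built with \texttt{UseRCl} from \texttt{Sentence} and
\texttt{RelVP}, \texttt{IdRP} from \texttt{Relative}; \texttt{PrepNP} is from
\texttt{Adverb}. The lexical items \texttt{big\_A}, \texttt{house\_N},
\texttt{man\_N}, \texttt{sleep\_V} and \texttt{paris\_PN} are assumed
\texttt{Lexicon} entries, and the comments are intended readings.

\begin{verbatim}
  -- big_A, house_N, man_N, sleep_V, paris_PN: assumed Lexicon entries
  AdjCN (PositA big_A) (UseN house_N)  -- big house
  AdvCN (UseN house_N)
    (PrepNP in_Prep (UsePN paris_PN))  -- house in Paris
  RelCN (UseN man_N)
    (UseRCl TPres ASimul PPos
      (RelVP IdRP (UseV sleep_V)))     -- man that sleeps
\end{verbatim}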
|
|
|
|
Nouns can also be modified by embedded sentences and questions.
|
|
For some nouns this makes little sense, but we leave this for applications
|
|
to decide. Embedded sentences (\texttt{SC}) are constructed in \htmladdnormallink{Sentence}{Sentence.html}.
|
|
|
|
\begin{verbatim}
|
|
SentCN : CN -> SC -> CN ; -- fact that John smokes, question if he does
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Apposition}
|
|
This is certainly overgenerating.
|
|
|
|
\begin{verbatim}
|
|
ApposCN : CN -> NP -> CN ; -- number x, numbers x and y
|
|
|
|
} ;
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Numerals}
|
|
This grammar defines numerals from 1 to 999999.
|
|
The implementations are adapted from the
|
|
\htmladdnormallink{numerals library}{http://www.cs.chalmers.se/~aarne/GF/examples/numerals/}
|
|
which defines numerals for 88 languages.
|
|
The resource grammar implementations add to this inflection (if needed)
|
|
and ordinal numbers.
|
|
\textbf{Note}. Number 1 as defined
|
|
in the category \texttt{Numeral} here should not be used in the formation of
|
|
noun phrases, and should therefore be removed. Instead, one should use
|
|
\htmladdnormallink{Structural}{Structural.html}\texttt{.one\_Quant}. This makes the grammar simpler
|
|
because we can assume that numbers form plural noun phrases.
|
|
|
|
\begin{verbatim}
|
|
abstract Numeral = Cat ** {
|
|
|
|
cat
|
|
Digit ; -- 2..9
|
|
Sub10 ; -- 1..9
|
|
Sub100 ; -- 1..99
|
|
Sub1000 ; -- 1..999
|
|
Sub1000000 ; -- 1..999999
|
|
|
|
fun
|
|
num : Sub1000000 -> Numeral ;
|
|
|
|
n2, n3, n4, n5, n6, n7, n8, n9 : Digit ;
|
|
|
|
pot01 : Sub10 ; -- 1
|
|
pot0 : Digit -> Sub10 ; -- d * 1
|
|
pot110 : Sub100 ; -- 10
|
|
pot111 : Sub100 ; -- 11
|
|
pot1to19 : Digit -> Sub100 ; -- 10 + d
|
|
pot0as1 : Sub10 -> Sub100 ; -- coercion of 1..9
|
|
pot1 : Digit -> Sub100 ; -- d * 10
|
|
pot1plus : Digit -> Sub10 -> Sub100 ; -- d * 10 + n
|
|
pot1as2 : Sub100 -> Sub1000 ; -- coercion of 1..99
|
|
pot2 : Sub10 -> Sub1000 ; -- m * 100
|
|
pot2plus : Sub10 -> Sub100 -> Sub1000 ; -- m * 100 + n
|
|
pot2as3 : Sub1000 -> Sub1000000 ; -- coercion of 1..999
|
|
pot3 : Sub1000 -> Sub1000000 ; -- m * 1000
|
|
pot3plus : Sub1000 -> Sub1000 -> Sub1000000 ; -- m * 1000 + n
|
|
|
|
}
|
|
\end{verbatim}
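As a worked illustration of how these constructors compose, the numbers 51
and 3407 can be built as follows. This is a sketch derived only from the
signatures and arithmetic comments above; no names are introduced beyond
those declared in the module.

\begin{verbatim}
  -- 51 = 5 * 10 + 1
  num (pot2as3 (pot1as2 (pot1plus n5 pot01)))

  -- 3407 = 3 * 1000 + (4 * 100 + 7)
  num (pot3plus (pot1as2 (pot0as1 (pot0 n3)))
                (pot2plus (pot0 n4) (pot0as1 (pot0 n7))))
\end{verbatim}

The coercions \texttt{pot0as1}, \texttt{pot1as2} and \texttt{pot2as3} lift a
smaller number into the next range, so every tree reaches
\texttt{Sub1000000} before \texttt{num} is applied.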
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
\subsubsection{Phrases and utterances}
|
|
\begin{verbatim}
|
|
abstract Phrase = Cat ** {
|
|
\end{verbatim}
|
|
|
|
When a phrase is built from an utterance it can be prefixed
|
|
with a phrasal conjunction (such as \textit{but}, \textit{therefore})
|
|
and suffixed with a vocative (typically a noun phrase).
|
|
|
|
\begin{verbatim}
|
|
fun
|
|
PhrUtt : PConj -> Utt -> Voc -> Phr ; -- But go home my friend.
|
|
\end{verbatim}
|
|
|
|
Utterances are formed from sentences, questions, and imperatives.
|
|
|
|
\begin{verbatim}
|
|
UttS : S -> Utt ; -- John walks
|
|
UttQS : QS -> Utt ; -- is it good
|
|
UttImpSg : Pol -> Imp -> Utt; -- (don't) help yourself
|
|
UttImpPl : Pol -> Imp -> Utt; -- (don't) help yourselves
|
|
\end{verbatim}
|
|
|
|
There are also 'one-word utterances'. A typical use of them is
|
|
as answers to questions.
|
|
\textbf{Note}. This list is incomplete. More categories could be covered.
|
|
Moreover, in many languages e.g. noun phrases in different cases
|
|
can be used.
|
|
|
|
\begin{verbatim}
|
|
UttIP : IP -> Utt ; -- who
|
|
UttIAdv : IAdv -> Utt ; -- why
|
|
UttNP : NP -> Utt ; -- this man
|
|
UttAdv : Adv -> Utt ; -- here
|
|
UttVP : VP -> Utt ; -- to sleep
|
|
\end{verbatim}
|
|
|
|
The phrasal conjunction is optional. A sentence conjunction
|
|
can also be used to prefix an utterance.
|
|
|
|
\begin{verbatim}
|
|
NoPConj : PConj ;
|
|
PConjConj : Conj -> PConj ; -- and
|
|
\end{verbatim}
|
|
|
|
The vocative is optional. Any noun phrase can be made into vocative,
|
|
which may be overgenerating (e.g. \textit{I}).
|
|
|
|
\begin{verbatim}
|
|
NoVoc : Voc ;
|
|
VocNP : NP -> Voc ; -- my friend
|
|
|
|
}
|
|
\end{verbatim}
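The following sketch shows a phrase with and without the optional parts.
\texttt{ImpVP} is from \texttt{Sentence}, \texttt{UseV} from \texttt{Verb},
and \texttt{but\_PConj} and \texttt{why\_IAdv} from \texttt{Structural}. The
items \texttt{go\_V} and \texttt{john\_PN} are assumed \texttt{Lexicon}
entries, and the comments are intended readings.

\begin{verbatim}
  -- go_V, john_PN: assumed Lexicon entries
  PhrUtt but_PConj
    (UttImpSg PPos (ImpVP (UseV go_V)))
    (VocNP (UsePN john_PN))            -- but go, John
  PhrUtt NoPConj (UttIAdv why_IAdv) NoVoc
                                       -- why
\end{verbatim}

Since \texttt{PhrUtt} always takes three arguments, an absent conjunction or
vocative is written explicitly as \texttt{NoPConj} or \texttt{NoVoc}.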
|
|
|
|
|
|
|
|
|
|
\subsubsection{Questions and interrogative pronouns}
|
|
\begin{verbatim}
|
|
abstract Question = Cat ** {
|
|
\end{verbatim}
|
|
|
|
A question can be formed from a clause ('yes-no question') or
|
|
with an interrogative.
|
|
|
|
\begin{verbatim}
|
|
fun
|
|
QuestCl : Cl -> QCl ; -- does John walk
|
|
QuestVP : IP -> VP -> QCl ; -- who walks
|
|
QuestSlash : IP -> Slash -> QCl ; -- who does John love
|
|
QuestIAdv : IAdv -> Cl -> QCl ; -- why does John walk
|
|
QuestIComp : IComp -> NP -> QCl ; -- where is John
|
|
\end{verbatim}
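As a sketch, the example readings above correspond to trees such as the
following. \texttt{PredVP} and \texttt{SlashV2} are from \texttt{Sentence},
and \texttt{whoSg\_IP} and \texttt{why\_IAdv} from \texttt{Structural}. The
items \texttt{john\_PN}, \texttt{walk\_V} and \texttt{love\_V2} are assumed
\texttt{Lexicon} entries.

\begin{verbatim}
  -- john_PN, walk_V, love_V2: assumed Lexicon entries
  QuestCl (PredVP (UsePN john_PN) (UseV walk_V))
                                       -- does John walk
  QuestVP whoSg_IP (UseV walk_V)       -- who walks
  QuestSlash whoSg_IP
    (SlashV2 (UsePN john_PN) love_V2)  -- who does John love
  QuestIAdv why_IAdv
    (PredVP (UsePN john_PN) (UseV walk_V))
                                       -- why does John walk
\end{verbatim}

Each result is a \texttt{QCl}, which still varies over tense and polarity;
it becomes a \texttt{QS} by applying \texttt{UseQCl} from \texttt{Sentence}.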
|
|
|
|
Interrogative pronouns can be formed with interrogative
|
|
determiners.
|
|
|
|
\begin{verbatim}
|
|
IDetCN : IDet -> Num -> Ord -> CN -> IP; -- which five best songs
|
|
AdvIP : IP -> Adv -> IP ; -- who in Europe
|
|
|
|
PrepIP : Prep -> IP -> IAdv ; -- with whom
|
|
|
|
CompIAdv : IAdv -> IComp ; -- where
|
|
\end{verbatim}
|
|
|
|
More \texttt{IP}, \texttt{IDet}, and \texttt{IAdv} are defined in
|
|
\htmladdnormallink{Structural}{Structural.html}.
|
|
|
|
\begin{verbatim}
|
|
}
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Relative clauses and pronouns}
|
|
\begin{verbatim}
|
|
abstract Relative = Cat ** {
|
|
|
|
fun
|
|
\end{verbatim}
|
|
|
|
The simplest way to form a relative clause is from a clause, using
a pronoun similar to \textit{such that}.
|
|
|
|
\begin{verbatim}
|
|
RelCl : Cl -> RCl ; -- such that John loves her
|
|
\end{verbatim}
|
|
|
|
The more proper ways are from a verb phrase (formed in \htmladdnormallink{Verb}{Verb.html})
|
|
or a sentence with a missing noun phrase (formed in \htmladdnormallink{Sentence}{Sentence.html}).
|
|
|
|
\begin{verbatim}
|
|
RelVP : RP -> VP -> RCl ; -- who loves John
|
|
RelSlash : RP -> Slash -> RCl ; -- whom John loves
|
|
\end{verbatim}
|
|
|
|
Relative pronouns are formed from an 'identity element' by prefixing
|
|
or suffixing (depending on language) prepositional phrases.
|
|
|
|
\begin{verbatim}
|
|
IdRP : RP ; -- which
|
|
FunRP : Prep -> NP -> RP -> RP ; -- all the roots of which
|
|
|
|
}
|
|
\end{verbatim}
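The two 'proper' constructions can be sketched as follows, using
\texttt{ComplV2} from \texttt{Verb} and \texttt{SlashV2} from
\texttt{Sentence}. The items \texttt{john\_PN} and \texttt{love\_V2} are
assumed \texttt{Lexicon} entries, and the comments are intended readings.

\begin{verbatim}
  -- john_PN, love_V2: assumed Lexicon entries
  RelVP IdRP (ComplV2 love_V2 (UsePN john_PN))
                                       -- who loves John
  RelSlash IdRP (SlashV2 (UsePN john_PN) love_V2)
                                       -- whom John loves
\end{verbatim}

In the first tree the relative pronoun is the subject of the verb phrase,
in the second it fills the missing object of the \texttt{Slash} clause.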
|
|
|
|
|
|
|
|
|
|
\subsubsection{Sentences, clauses, imperatives, and sentential complements}
|
|
\begin{verbatim}
|
|
abstract Sentence = Cat ** {
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Clauses}
|
|
The \texttt{NP VP} predication rule forms a clause whose linearization
|
|
gives a table of all tense variants, positive and negative.
|
|
Clauses are converted to \texttt{S} (with fixed tense) by \texttt{UseCl}, given at the end of this module.
|
|
|
|
\begin{verbatim}
|
|
fun
|
|
PredVP : NP -> VP -> Cl ; -- John walks
|
|
\end{verbatim}
|
|
|
|
Using an embedded sentence as a subject is treated separately.
|
|
This can be overgenerating. E.g. \textit{whether you go} as subject
|
|
is only meaningful for some verb phrases.
|
|
|
|
\begin{verbatim}
|
|
PredSCVP : SC -> VP -> Cl ; -- that you go makes me happy
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Clauses missing object noun phrases}
|
|
This category is a variant of the 'slash category' \texttt{S/NP} of
|
|
GPSG and categorial grammars, which in turn replaces
|
|
movement transformations in the formation of questions
|
|
and relative clauses. Except \texttt{SlashV2}, the construction
|
|
rules can be seen as special cases of function composition, in
|
|
the style of CCG.
|
|
\textbf{Note}. The set is not complete and lacks e.g. verbs with more than 2 places.
|
|
|
|
\begin{verbatim}
|
|
SlashV2 : NP -> V2 -> Slash ; -- (whom) he sees
|
|
SlashVVV2 : NP -> VV -> V2 -> Slash; -- (whom) he wants to see
|
|
AdvSlash : Slash -> Adv -> Slash ; -- (whom) he sees tomorrow
|
|
SlashPrep : Cl -> Prep -> Slash ; -- (with whom) he walks
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Imperatives}
|
|
An imperative is straightforwardly formed from a verb phrase.
|
|
It has variation over positive and negative, singular and plural.
|
|
To fix these parameters, see \htmladdnormallink{Phrase}{Phrase.html}.
|
|
|
|
\begin{verbatim}
|
|
ImpVP : VP -> Imp ; -- go
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Embedded sentences}
|
|
Sentences, questions, and infinitival phrases can be used as
|
|
subjects and (adverbial) complements.
|
|
|
|
\begin{verbatim}
|
|
EmbedS : S -> SC ; -- that you go
|
|
EmbedQS : QS -> SC ; -- whether you go
|
|
EmbedVP : VP -> SC ; -- to go
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Sentences}
|
|
These are the 4 x 2 x 2 = 16 forms generated by different
|
|
combinations of tense, polarity, and
|
|
anteriority, which are defined in \htmladdnormallink{Common}{Common.html}.
|
|
|
|
\begin{verbatim}
|
|
fun
|
|
UseCl : Tense -> Ant -> Pol -> Cl -> S ;
|
|
UseQCl : Tense -> Ant -> Pol -> QCl -> QS ;
|
|
UseRCl : Tense -> Ant -> Pol -> RCl -> RS ;
|
|
|
|
}
|
|
\end{verbatim}
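As a sketch of how the parameters are fixed, the following trees turn one
and the same clause into two different sentences. \texttt{UsePron} is from
\texttt{Noun}, \texttt{UseV} from \texttt{Verb} and \texttt{he\_Pron} from
\texttt{Structural}; \texttt{sleep\_V} is assumed to be a \texttt{Lexicon}
entry. The comments are the intended English readings.

\begin{verbatim}
  -- sleep_V: assumed Lexicon entry
  UseCl TPast ASimul PNeg
    (PredVP (UsePron he_Pron) (UseV sleep_V))
                                       -- he didn't sleep
  UseCl TCond AAnter PPos
    (PredVP (UsePron he_Pron) (UseV sleep_V))
                                       -- he would have slept
\end{verbatim}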
|
|
|
|
Examples for English \texttt{S}/\texttt{Cl}:
|
|
|
|
\begin{tabular}{lllll}
Tense & Anteriority & Polarity & Order & Example \\
\hline
Pres & Simul & Pos & ODir & he sleeps \\
Pres & Simul & Neg & ODir & he doesn't sleep \\
Pres & Anter & Pos & ODir & he has slept \\
Pres & Anter & Neg & ODir & he hasn't slept \\
Past & Simul & Pos & ODir & he slept \\
Past & Simul & Neg & ODir & he didn't sleep \\
Past & Anter & Pos & ODir & he had slept \\
Past & Anter & Neg & ODir & he hadn't slept \\
Fut & Simul & Pos & ODir & he will sleep \\
Fut & Simul & Neg & ODir & he won't sleep \\
Fut & Anter & Pos & ODir & he will have slept \\
Fut & Anter & Neg & ODir & he won't have slept \\
Cond & Simul & Pos & ODir & he would sleep \\
Cond & Simul & Neg & ODir & he wouldn't sleep \\
Cond & Anter & Pos & ODir & he would have slept \\
Cond & Anter & Neg & ODir & he wouldn't have slept \\
\end{tabular}
|
|
|
|
|
|
|
|
\subsubsection{Structural Words}
|
|
|
|
Here we have some words belonging to closed classes and appearing
|
|
in all languages we have considered.
|
|
Sometimes they are not really meaningful, e.g. \texttt{we\_Pron} in Spanish
|
|
should be replaced by masculine and feminine variants.
|
|
|
|
\begin{verbatim}
|
|
abstract Structural = Cat ** {
|
|
|
|
fun
|
|
\end{verbatim}
|
|
|
|
This is an alphabetical list of structural words.
|
|
|
|
\begin{verbatim}
|
|
above_Prep : Prep ;
|
|
after_Prep : Prep ;
|
|
all_Predet : Predet ;
|
|
almost_AdA : AdA ;
|
|
almost_AdN : AdN ;
|
|
although_Subj : Subj ;
|
|
always_AdV : AdV ;
|
|
and_Conj : Conj ;
|
|
because_Subj : Subj ;
|
|
before_Prep : Prep ;
|
|
behind_Prep : Prep ;
|
|
between_Prep : Prep ;
|
|
both7and_DConj : DConj ;
|
|
but_PConj : PConj ;
|
|
by8agent_Prep : Prep ;
|
|
by8means_Prep : Prep ;
|
|
can8know_VV : VV ;
|
|
can_VV : VV ;
|
|
during_Prep : Prep ;
|
|
either7or_DConj : DConj ;
|
|
every_Det : Det ;
|
|
everybody_NP : NP ;
|
|
everything_NP : NP ;
|
|
everywhere_Adv : Adv ;
|
|
first_Ord : Ord ;
|
|
few_Det : Det ;
|
|
from_Prep : Prep ;
|
|
he_Pron : Pron ;
|
|
here_Adv : Adv ;
|
|
here7to_Adv : Adv ;
|
|
here7from_Adv : Adv ;
|
|
how_IAdv : IAdv ;
|
|
how8many_IDet : IDet ;
|
|
i_Pron : Pron ;
|
|
if_Subj : Subj ;
|
|
in8front_Prep : Prep ;
|
|
in_Prep : Prep ;
|
|
it_Pron : Pron ;
|
|
less_CAdv : CAdv ;
|
|
many_Det : Det ;
|
|
more_CAdv : CAdv ;
|
|
most_Predet : Predet ;
|
|
much_Det : Det ;
|
|
must_VV : VV ;
|
|
no_Phr : Phr ;
|
|
on_Prep : Prep ;
|
|
one_Quant : QuantSg ;
|
|
only_Predet : Predet ;
|
|
or_Conj : Conj ;
|
|
otherwise_PConj : PConj ;
|
|
part_Prep : Prep ;
|
|
please_Voc : Voc ;
|
|
possess_Prep : Prep ;
|
|
quite_Adv : AdA ;
|
|
she_Pron : Pron ;
|
|
so_AdA : AdA ;
|
|
someSg_Det : Det ;
|
|
somePl_Det : Det ;
|
|
somebody_NP : NP ;
|
|
something_NP : NP ;
|
|
somewhere_Adv : Adv ;
|
|
that_Quant : Quant ;
|
|
that_NP : NP ;
|
|
there_Adv : Adv ;
|
|
there7to_Adv : Adv ;
|
|
there7from_Adv : Adv ;
|
|
therefore_PConj : PConj ;
|
|
these_NP : NP ;
|
|
they_Pron : Pron ;
|
|
this_Quant : Quant ;
|
|
this_NP : NP ;
|
|
those_NP : NP ;
|
|
through_Prep : Prep ;
|
|
to_Prep : Prep ;
|
|
too_AdA : AdA ;
|
|
under_Prep : Prep ;
|
|
very_AdA : AdA ;
|
|
want_VV : VV ;
|
|
we_Pron : Pron ;
|
|
whatPl_IP : IP ;
|
|
whatSg_IP : IP ;
|
|
when_IAdv : IAdv ;
|
|
when_Subj : Subj ;
|
|
where_IAdv : IAdv ;
|
|
whichPl_IDet : IDet ;
|
|
whichSg_IDet : IDet ;
|
|
whoPl_IP : IP ;
|
|
whoSg_IP : IP ;
|
|
why_IAdv : IAdv ;
|
|
with_Prep : Prep ;
|
|
without_Prep : Prep ;
|
|
yes_Phr : Phr ;
|
|
youSg_Pron : Pron ;
|
|
youPl_Pron : Pron ;
|
|
youPol_Pron : Pron ;
|
|
|
|
}
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Texts}
|
|
\begin{verbatim}
|
|
abstract Text = Common ** {
|
|
|
|
fun
|
|
TEmpty : Text ;
|
|
TFullStop : Phr -> Text -> Text ;
|
|
TQuestMark : Phr -> Text -> Text ;
|
|
TExclMark : Phr -> Text -> Text ;
|
|
|
|
}
|
|
\end{verbatim}
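As an illustration, the example text \textit{He is here. Why?} can be
sketched as the tree below. It uses only functions declared in this
document: \texttt{PhrUtt}, \texttt{UttS} and \texttt{UttIAdv} from
\texttt{Phrase}, \texttt{UseCl} and \texttt{PredVP} from \texttt{Sentence},
\texttt{UseComp} and \texttt{CompAdv} from \texttt{Verb}, \texttt{UsePron}
from \texttt{Noun}, and words from \texttt{Structural}. The tree is therefore
only well-typed in a grammar that combines these modules, such as
\texttt{Lang}; the comment is the intended reading.

\begin{verbatim}
  TFullStop
    (PhrUtt NoPConj
      (UttS (UseCl TPres ASimul PPos
        (PredVP (UsePron he_Pron) (UseComp (CompAdv here_Adv)))))
      NoVoc)
    (TQuestMark
      (PhrUtt NoPConj (UttIAdv why_IAdv) NoVoc)
      TEmpty)
  -- intended reading: He is here. Why?
\end{verbatim}

A text is thus a right-recursive list of phrases, each tagged with its
punctuation mark and terminated by \texttt{TEmpty}.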
|
|
|
|
|
|
|
|
|
|
\subsubsection{The construction of verb phrases}
|
|
\begin{verbatim}
|
|
abstract Verb = Cat ** {
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Complementization rules}
|
|
Verb phrases are constructed from verbs by providing their
|
|
complements. There is one rule for each verb category.
|
|
|
|
\begin{verbatim}
|
|
fun
|
|
UseV : V -> VP ; -- sleep
|
|
ComplV2 : V2 -> NP -> VP ; -- use it
|
|
ComplV3 : V3 -> NP -> NP -> VP ; -- send a message to her
|
|
|
|
ComplVV : VV -> VP -> VP ; -- want to run
|
|
ComplVS : VS -> S -> VP ; -- know that she runs
|
|
ComplVQ : VQ -> QS -> VP ; -- ask if she runs
|
|
|
|
ComplVA : VA -> AP -> VP ; -- look red
|
|
ComplV2A : V2A -> NP -> AP -> VP ; -- paint the house red
|
|
\end{verbatim}
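A few of the complementization rules can be sketched as follows.
\texttt{want\_VV} is from \texttt{Structural} and \texttt{UsePN} from
\texttt{Noun}; \texttt{sleep\_V}, \texttt{love\_V2} and \texttt{john\_PN} are
assumed \texttt{Lexicon} entries. The comments are intended readings.

\begin{verbatim}
  -- sleep_V, love_V2, john_PN: assumed Lexicon entries
  UseV sleep_V                         -- sleep
  ComplV2 love_V2 (UsePN john_PN)      -- love John
  ComplVV want_VV (UseV sleep_V)       -- want to sleep
\end{verbatim}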
|
|
|
|
\subsubsubsection{Other ways of forming verb phrases}
|
|
Verb phrases can also be constructed reflexively and from
|
|
copula-preceded complements.
|
|
|
|
\begin{verbatim}
|
|
ReflV2 : V2 -> VP ; -- use itself
|
|
UseComp : Comp -> VP ; -- be warm
|
|
\end{verbatim}
|
|
|
|
Passivization of two-place verbs is another way to use
|
|
them. In many languages, the result is a participle that
|
|
is used as complement to a copula (\textit{is used}), but other
|
|
auxiliary verbs are possible (Ger. \textit{wird angewendet}, It.
|
|
\textit{viene usato}), as well as special verb forms (Fin. \textit{käytetään},
|
|
Swe. \textit{används}).
|
|
|
|
\textbf{Note}. The rule can overgenerate, since the \texttt{V2} need not
take a direct object.
|
|
|
|
\begin{verbatim}
|
|
PassV2 : V2 -> VP ; -- be used
|
|
\end{verbatim}
|
|
|
|
Adverbs can be added to verb phrases. Many languages make
a distinction between adverbs that are attached at the end
and those placed next to (or before) the verb.
|
|
|
|
\begin{verbatim}
|
|
AdvVP : VP -> Adv -> VP ; -- sleep here
|
|
AdVVP : AdV -> VP -> VP ; -- always sleep
|
|
\end{verbatim}
|
|
|
|
\textbf{Agents of passives} are constructed as adverbs with the
preposition \htmladdnormallink{Structural}{Structural.html}\texttt{.by8agent\_Prep}.
|
|
|
|
\subsubsubsection{Complements to copula}
|
|
Adjectival phrases, noun phrases, and adverbs can be used.
|
|
|
|
\begin{verbatim}
|
|
CompAP : AP -> Comp ; -- (be) small
|
|
CompNP : NP -> Comp ; -- (be) a soldier
|
|
CompAdv : Adv -> Comp ; -- (be) here
|
|
\end{verbatim}
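
A short sketch of how a copula complement becomes a verb phrase, using \texttt{this\_NP} from \texttt{Structural}. The adverb \texttt{here\_Adv} is assumed to be available in the lexicon; any other \texttt{Adv} would do.

\begin{verbatim}
  UseComp (CompNP this_NP)                    -- copula + noun phrase
  UseComp (CompAdv here_Adv)                  -- copula + adverb
  AdvVP (UseComp (CompNP this_NP)) here_Adv   -- the first VP with an adverb
\end{verbatim}

The \texttt{Comp} category keeps the copula out of the complement itself, which is why the examples in the comments above write \textit{(be)} in parentheses.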
|
|
|
|
\subsubsubsection{Coercions}
|
|
Verbs can change subcategorization patterns in systematic ways,
|
|
but this is very much language-dependent. The following two
|
|
work in all the languages we cover.
|
|
|
|
\begin{verbatim}
|
|
UseVQ : VQ -> V2 ; -- ask (a question)
|
|
UseVS : VS -> V2 ; -- know (a secret)
|
|
|
|
}
|
|
\end{verbatim}
|
|
|
|
\subsection{Inflectional paradigms}
|
|
|
|
Last update: Tue Jun 13 11:43:19 2006
|
|
|
|
|
|
|
|
\# -path=.:../scandinavian:../common:../abstract:../../prelude
|
|
|
|
|
|
\subsubsection{Danish Lexical Paradigms}
|
|
Aarne Ranta 2003
|
|
|
|
This is an API to the user of the resource grammar
|
|
for adding lexical items. It gives functions for forming
|
|
expressions of open categories: nouns, adjectives, verbs.
|
|
|
|
Closed categories (determiners, pronouns, conjunctions) are
|
|
accessed through the resource syntax API, \texttt{Structural.gf}.
|
|
|
|
The main difference with \texttt{MorphoDan.gf} is that the types
|
|
referred to are compiled resource grammar types. We have moreover
|
|
had the design principle of always having existing forms, rather
|
|
than stems, as string arguments of the paradigms.
|
|
|
|
The structure of functions for each word class \texttt{C} is the following:
|
|
first we give a handful of patterns that aim to cover all
|
|
regular cases. Then we give a worst-case function \texttt{mkC}, which serves as an
|
|
escape to construct the most irregular words of type \texttt{C}.
|
|
However, this function should only seldom be needed: we have a
separate module \texttt{IrregDan}, which covers irregularly inflected
verbs.
|
|
|
|
\begin{verbatim}
|
|
resource ParadigmsDan =
|
|
open
|
|
(Predef=Predef),
|
|
Prelude,
|
|
CommonScand,
|
|
ResDan,
|
|
MorphoDan,
|
|
CatDan in {
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Parameters}
|
|
To abstract over gender names, we define the following identifiers.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
Gender : Type ;
|
|
|
|
utrum : Gender ;
|
|
neutrum : Gender ;
|
|
\end{verbatim}
|
|
|
|
To abstract over number names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Number : Type ;
|
|
|
|
singular : Number ;
|
|
plural : Number ;
|
|
\end{verbatim}
|
|
|
|
To abstract over case names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Case : Type ;
|
|
|
|
nominative : Case ;
|
|
genitive : Case ;
|
|
\end{verbatim}
|
|
|
|
Prepositions used in many-argument functions are just strings.
|
|
|
|
\begin{verbatim}
|
|
Preposition : Type = Str ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Nouns}
|
|
Worst case: give all four forms. The gender is computed from the
|
|
last letter of the second form (if \textit{n}, then \texttt{utrum}, otherwise \texttt{neutrum}).
|
|
|
|
\begin{verbatim}
|
|
mkN : (dreng,drengen,drenger,drengene : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
The regular function takes the singular indefinite form
|
|
and computes the other forms and the gender by a heuristic.
|
|
The heuristic is that all nouns are \texttt{utrum} with the
|
|
plural ending \textit{er}/\textit{r}.
|
|
|
|
\begin{verbatim}
|
|
regN : Str -> N ;
|
|
\end{verbatim}
|
|
|
|
Giving gender manually makes the heuristic more reliable.
|
|
|
|
\begin{verbatim}
|
|
regGenN : Str -> Gender -> N ;
|
|
\end{verbatim}
|
|
|
|
This function takes the singular indefinite and definite forms; the
|
|
gender is computed from the definite form.
|
|
|
|
\begin{verbatim}
|
|
mk2N : (bil,bilen : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
This function takes the singular indefinite and definite forms and the plural
indefinite form.
|
|
|
|
\begin{verbatim}
|
|
mk3N : (bil,bilen,biler : Str) -> N ;
|
|
\end{verbatim}
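
The noun paradigms above might be used in a small lexicon as follows. This is a sketch only: the module name \texttt{LexSketchDan} and the chosen words are assumptions made for illustration, and the library is assumed to be on the search path.

\begin{verbatim}
  resource LexSketchDan = open ParadigmsDan, CatDan in {
    oper
      -- one form: utrum and a regular plural are guessed
      bil_N  : N = regN "bil" ;
      -- one form plus explicit gender
      hus_N  : N = regGenN "hus" neutrum ;
      -- two forms: gender read off the definite form
      bord_N : N = mk2N "bord" "bordet" ;
      -- three forms: the plural indefinite is given as well
      by_N   : N = mk3N "by" "byen" "byer" ;
      -- worst case: all four forms
      barn_N : N = mkN "barn" "barnet" "børn" "børnene" ;
  }
\end{verbatim}

A reasonable habit is to start with \texttt{regN} and move to a function with more arguments only when the generated forms turn out to be wrong.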
|
|
|
|
\subsubsubsection{Compound nouns}
|
|
All the functions above work equally well to form compound nouns,
such as \textit{fodbold}.
|
|
|
|
\subsubsubsection{Relational nouns}
|
|
Relational nouns (\textit{daughter of x}) need a preposition.
|
|
|
|
\begin{verbatim}
|
|
mkN2 : N -> Preposition -> N2 ;
|
|
\end{verbatim}
|
|
|
|
The most common preposition is \textit{av}, and the following is a
|
|
shortcut for regular, \texttt{nonhuman} relational nouns with \textit{av}.
|
|
|
|
\begin{verbatim}
|
|
regN2 : Str -> Gender -> N2 ;
|
|
\end{verbatim}
|
|
|
|
Use the function \texttt{mkPreposition} or see the section on prepositions below to
|
|
form other prepositions.
|
|
|
|
Three-place relational nouns (\textit{the connection from x to y}) need two prepositions.
|
|
|
|
\begin{verbatim}
|
|
mkN3 : N -> Preposition -> Preposition -> N3 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Relational common noun phrases}
|
|
In some cases, you may want to make a complex \texttt{CN} into a
|
|
relational noun (e.g. \textit{the old town hall of}). However, \texttt{N2} and
|
|
\texttt{N3} are purely lexical categories. But you can use the \texttt{AdvCN}
|
|
and \texttt{PrepNP} constructions to build phrases like this.
|
|
|
|
\subsubsubsection{Proper names and noun phrases}
|
|
Proper names, with a regular genitive, are formed as follows:
|
|
|
|
\begin{verbatim}
|
|
regPN : Str -> Gender -> PN ; -- John, John's
|
|
\end{verbatim}
|
|
|
|
Sometimes you can reuse a common noun as a proper name, e.g. \textit{Bank}.
|
|
|
|
\begin{verbatim}
|
|
nounPN : N -> PN ;
|
|
\end{verbatim}
|
|
|
|
To form a noun phrase that can also be plural and have an irregular
|
|
genitive, you can use the worst-case function.
|
|
|
|
\begin{verbatim}
|
|
mkNP : Str -> Str -> Number -> Gender -> NP ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectives}
|
|
Non-comparison one-place adjectives need three forms:
|
|
|
|
\begin{verbatim}
|
|
mkA : (galen,galet,galne : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
For regular adjectives, the other forms are derived.
|
|
|
|
\begin{verbatim}
|
|
regA : Str -> A ;
|
|
\end{verbatim}
|
|
|
|
In most cases, two forms are enough.
|
|
|
|
\begin{verbatim}
|
|
mk2A : (stor,stort : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place adjectives}
|
|
Two-place adjectives need a preposition for their second argument.
|
|
|
|
\begin{verbatim}
|
|
mkA2 : A -> Preposition -> A2 ;
|
|
\end{verbatim}
|
|
|
|
Comparison adjectives may need as many as five forms.
|
|
|
|
\begin{verbatim}
|
|
mkADeg : (stor,stort,store,storre,storst : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
The regular pattern works for many adjectives, e.g. those ending
|
|
with \textit{ig}.
|
|
|
|
\begin{verbatim}
|
|
regADeg : Str -> A ;
|
|
\end{verbatim}
|
|
|
|
Just the comparison forms can be irregular.
|
|
|
|
\begin{verbatim}
|
|
irregADeg : (tung,tyngre,tyngst : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
Sometimes just the positive forms are irregular.
|
|
|
|
\begin{verbatim}
|
|
mk3ADeg : (galen,galet,galna : Str) -> A ;
|
|
mk2ADeg : (bred,bredt : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
If comparison is formed by \textit{mer}, \textit{mest}, as in general for
long adjectives, the following pattern is used:
|
|
|
|
\begin{verbatim}
|
|
compoundA : A -> A ; -- -/mer/mest norsk
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adverbs}
|
|
Adverbs are not inflected. Most lexical ones have position
|
|
after the verb. Some can be preverbal (e.g. \textit{always}).
|
|
|
|
\begin{verbatim}
|
|
mkAdv : Str -> Adv ;
|
|
mkAdV : Str -> AdV ;
|
|
\end{verbatim}
|
|
|
|
Adverbs modifying adjectives and sentences can also be formed.
|
|
|
|
\begin{verbatim}
|
|
mkAdA : Str -> AdA ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Prepositions}
|
|
A preposition is just a string.
|
|
|
|
\begin{verbatim}
|
|
mkPreposition : Str -> Preposition ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verbs}
|
|
The worst case needs six forms.
|
|
|
|
\begin{verbatim}
|
|
mkV : (spise,spiser,spises,spiste,spist,spis : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
The 'regular verb' function is the first conjugation.
|
|
|
|
\begin{verbatim}
|
|
regV : (snakke : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
The almost regular verb function needs the infinitive and the preteritum.
|
|
|
|
\begin{verbatim}
|
|
mk2V : (leve,levde : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
There is an extensive list of irregular verbs in the module \texttt{IrregDan}.
|
|
In practice, it is enough to give three forms, as in school books.
|
|
|
|
\begin{verbatim}
|
|
irregV : (drikke, drakk, drukket : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verbs with \textit{være} as auxiliary}
|
|
By default, the auxiliary is \textit{have}. This function changes it to \textit{være}.
|
|
|
|
\begin{verbatim}
|
|
vaereV : V -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verbs with a particle}
|
|
The particle, such as in \textit{switch on}, is given as a string.
|
|
|
|
\begin{verbatim}
|
|
partV : V -> Str -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Deponent verbs}
|
|
Some words are used in passive forms only, e.g. \textit{hoppas}, some as
|
|
reflexive e.g. \textit{ångra sig}.
|
|
|
|
\begin{verbatim}
|
|
depV : V -> V ;
|
|
reflV : V -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place verbs}
|
|
Two-place verbs need a preposition, except in the special case of a direct object
(transitive verbs). Notice that a particle comes from the \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
mkV2 : V -> Preposition -> V2 ;
|
|
|
|
dirV2 : V -> V2 ;
|
|
\end{verbatim}
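
The verb paradigms and the two-place verb functions can be combined as in the following sketch. The module name \texttt{VerbSketchDan} and the example words are assumptions for illustration; only functions declared in this section are used.

\begin{verbatim}
  resource VerbSketchDan = open ParadigmsDan, CatDan in {
    oper
      -- first conjugation, one form
      snakke_V : V = regV "snakke" ;
      -- three forms, with "være" as auxiliary
      komme_V  : V = vaereV (irregV "komme" "kom" "kommet") ;
      -- transitive verb with a direct object
      spise_V2 : V2 = dirV2 (mk2V "spise" "spiste") ;
      -- two-place verb with a preposition
      snakke_med_V2 : V2 = mkV2 snakke_V (mkPreposition "med") ;
  }
\end{verbatim}

Since \texttt{mkV2} and \texttt{dirV2} take a ready-made \texttt{V}, the inflection and the complement pattern are specified independently of each other.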
|
|
|
|
\subsubsubsection{Three-place verbs}
|
|
Three-place (ditransitive) verbs need two prepositions, of which
|
|
the first one or both can be absent.
|
|
|
|
\begin{verbatim}
|
|
mkV3 : V -> Str -> Str -> V3 ; -- speak, with, about
|
|
dirV3 : V -> Str -> V3 ; -- give,_,to
|
|
dirdirV3 : V -> V3 ; -- give,_,_
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Other complement patterns}
|
|
Verbs and adjectives can take complements such as sentences,
|
|
questions, verb phrases, and adjectives.
|
|
|
|
\begin{verbatim}
|
|
mkV0 : V -> V0 ;
|
|
mkVS : V -> VS ;
|
|
mkV2S : V -> Str -> V2S ;
|
|
mkVV : V -> VV ;
|
|
mkV2V : V -> Str -> Str -> V2V ;
|
|
mkVA : V -> VA ;
|
|
mkV2A : V -> Str -> V2A ;
|
|
mkVQ : V -> VQ ;
|
|
mkV2Q : V -> Str -> V2Q ;
|
|
|
|
mkAS : A -> AS ;
|
|
mkA2S : A -> Str -> A2S ;
|
|
mkAV : A -> AV ;
|
|
mkA2V : A -> Str -> A2V ;
|
|
\end{verbatim}
|
|
|
|
Notice: categories \texttt{V2S, V2V, V2A, V2Q} are in v 1.0 treated
|
|
just as synonyms of \texttt{V2}, and the second argument is given
|
|
as an adverb. Likewise \texttt{AS, A2S, AV, A2V} are just \texttt{A}.
|
|
\texttt{V0} is just \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
V0, V2S, V2V, V2A, V2Q : Type ;
|
|
AS, A2S, AV, A2V : Type ;
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
|
|
\# -path=.:../abstract:../../prelude:../common
|
|
|
|
|
|
\subsubsection{English Lexical Paradigms}
|
|
Aarne Ranta 2003--2005
|
|
|
|
This is an API to the user of the resource grammar
|
|
for adding lexical items. It gives functions for forming
|
|
expressions of open categories: nouns, adjectives, verbs.
|
|
|
|
Closed categories (determiners, pronouns, conjunctions) are
|
|
accessed through the resource syntax API, \texttt{Structural.gf}.
|
|
|
|
The main difference with \texttt{MorphoEng.gf} is that the types
|
|
referred to are compiled resource grammar types. We have moreover
|
|
had the design principle of always having existing forms, rather
|
|
than stems, as string arguments of the paradigms.
|
|
|
|
The structure of functions for each word class \texttt{C} is the following:
|
|
first we give a handful of patterns that aim to cover all
|
|
regular cases. Then we give a worst-case function \texttt{mkC}, which serves as an
|
|
escape to construct the most irregular words of type \texttt{C}.
|
|
However, this function should only seldom be needed: we have a
|
|
separate module \texttt{IrregularEng}, which covers all irregularly inflected
|
|
words.
|
|
|
|
The following modules are presupposed:
|
|
|
|
\begin{verbatim}
|
|
resource ParadigmsEng = open
|
|
(Predef=Predef),
|
|
Prelude,
|
|
MorphoEng,
|
|
CatEng
|
|
in {
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Parameters}
|
|
To abstract over gender names, we define the following identifiers.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
Gender : Type ;
|
|
|
|
human : Gender ;
|
|
nonhuman : Gender ;
|
|
masculine : Gender ;
|
|
\end{verbatim}
|
|
|
|
To abstract over number names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Number : Type ;
|
|
|
|
singular : Number ;
|
|
plural : Number ;
|
|
\end{verbatim}
|
|
|
|
To abstract over case names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Case : Type ;
|
|
|
|
nominative : Case ;
|
|
genitive : Case ;
|
|
\end{verbatim}
|
|
|
|
Prepositions are used in many-argument functions for rection.
|
|
|
|
\begin{verbatim}
|
|
Preposition : Type ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Nouns}
|
|
Worst case: give all four forms.
|
|
|
|
\begin{verbatim}
|
|
mkN : (man,men,man's,men's : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
The regular function captures the variants for nouns ending with
\textit{s}, \textit{sh}, \textit{x}, \textit{z}, or \textit{y}: \textit{kiss - kisses}, \textit{flash - flashes};
\textit{fly - flies} (but \textit{toy - toys}).
|
|
|
|
\begin{verbatim}
|
|
regN : Str -> N ;
|
|
\end{verbatim}
|
|
|
|
In practice the worst case is just: give singular and plural nominative.
|
|
|
|
\begin{verbatim}
|
|
mk2N : (man,men : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
All nouns created by the previous functions are marked as
|
|
\texttt{nonhuman}. If you want a \texttt{human} noun, wrap it with the following
|
|
function:
|
|
|
|
\begin{verbatim}
|
|
genderN : Gender -> N -> N ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Compound nouns}
|
|
A compound noun is an uninflected string attached to an inflected noun,
|
|
such as \textit{baby boom}, \textit{chief executive officer}.
|
|
|
|
\begin{verbatim}
|
|
compoundN : Str -> N -> N ;
|
|
\end{verbatim}
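
The English noun functions introduced so far might be used as follows. This is a sketch: the module name \texttt{NounSketchEng} and the chosen words are assumptions for illustration.

\begin{verbatim}
  resource NounSketchEng = open ParadigmsEng, CatEng in {
    oper
      flash_N   : N = regN "flash" ;   -- sibilant ending
      fly_N     : N = regN "fly" ;     -- final y after a consonant
      man_N     : N = genderN masculine (mk2N "man" "men") ;
      teacher_N : N = genderN human (regN "teacher") ;
      boom_N    : N = compoundN "baby" (regN "boom") ;
  }
\end{verbatim}

Because the basic functions always produce \texttt{nonhuman} nouns, the gender is set afterwards with \texttt{genderN} rather than passed as an extra argument.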
|
|
|
|
\subsubsubsection{Relational nouns}
|
|
Relational nouns (\textit{daughter of x}) need a preposition.
|
|
|
|
\begin{verbatim}
|
|
mkN2 : N -> Preposition -> N2 ;
|
|
\end{verbatim}
|
|
|
|
The most common preposition is \textit{of}, and the following is a
|
|
shortcut for regular relational nouns with \textit{of}.
|
|
|
|
\begin{verbatim}
|
|
regN2 : Str -> N2 ;
|
|
\end{verbatim}
|
|
|
|
Use the function \texttt{mkPreposition} or see the section on prepositions below to
|
|
form other prepositions.
|
|
|
|
Three-place relational nouns (\textit{the connection from x to y}) need two prepositions.
|
|
|
|
\begin{verbatim}
|
|
mkN3 : N -> Preposition -> Preposition -> N3 ;
|
|
\end{verbatim}
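
A sketch of relational noun entries, using only the functions of this section together with \texttt{mkPreposition}. The module name \texttt{RelSketchEng} and the words are assumptions for illustration.

\begin{verbatim}
  resource RelSketchEng = open ParadigmsEng, CatEng in {
    oper
      -- shortcut: regular noun with "of"
      daughter_N2 : N2 = regN2 "daughter" ;
      -- the same pattern spelled out
      mother_N2   : N2 = mkN2 (regN "mother") (mkPreposition "of") ;
      -- three-place noun with two prepositions
      connection_N3 : N3 =
        mkN3 (regN "connection")
             (mkPreposition "from") (mkPreposition "to") ;
  }
\end{verbatim}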
|
|
|
|
\subsubsubsection{Relational common noun phrases}
|
|
In some cases, you may want to make a complex \texttt{CN} into a
|
|
relational noun (e.g. \textit{the old town hall of}).
|
|
|
|
\begin{verbatim}
|
|
cnN2 : CN -> Preposition -> N2 ;
|
|
cnN3 : CN -> Preposition -> Preposition -> N3 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Proper names and noun phrases}
|
|
Proper names, with a regular genitive, are formed as follows:
|
|
|
|
\begin{verbatim}
|
|
regPN : Str -> Gender -> PN ; -- John, John's
|
|
\end{verbatim}
|
|
|
|
Sometimes you can reuse a common noun as a proper name, e.g. \textit{Bank}.
|
|
|
|
\begin{verbatim}
|
|
nounPN : N -> PN ;
|
|
\end{verbatim}
|
|
|
|
To form a noun phrase that can also be plural and have an irregular
|
|
genitive, you can use the worst-case function.
|
|
|
|
\begin{verbatim}
|
|
mkNP : Str -> Str -> Number -> Gender -> NP ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectives}
|
|
Non-comparison one-place adjectives need two forms: one for
the adjectival and one for the adverbial form (\textit{free - freely}).
|
|
|
|
\begin{verbatim}
|
|
mkA : (free,freely : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
For regular adjectives, the adverbial form is derived. This holds
|
|
even for cases with the variation \textit{happy - happily}.
|
|
|
|
\begin{verbatim}
|
|
regA : Str -> A ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place adjectives}
|
|
Two-place adjectives need a preposition for their second argument.
|
|
|
|
\begin{verbatim}
|
|
mkA2 : A -> Preposition -> A2 ;
|
|
\end{verbatim}
|
|
|
|
Comparison adjectives may need two more forms.
|
|
|
|
\begin{verbatim}
|
|
ADeg : Type ;
|
|
|
|
mkADeg : (good,better,best,well : Str) -> ADeg ;
|
|
\end{verbatim}
|
|
|
|
The regular pattern recognizes two common variations:
\textit{-e} (\textit{rude - ruder - rudest}) and
\textit{-y} (\textit{happy - happier - happiest - happily}).
|
|
|
|
\begin{verbatim}
|
|
regADeg : Str -> ADeg ; -- long, longer, longest
|
|
\end{verbatim}
|
|
|
|
However, the duplication of the final consonant is not predicted;
a separate pattern is used instead:
|
|
|
|
\begin{verbatim}
|
|
duplADeg : Str -> ADeg ; -- fat, fatter, fattest
|
|
\end{verbatim}
|
|
|
|
If comparison is formed by \textit{more}, \textit{most}, as in general for
long adjectives, the following pattern is used:
|
|
|
|
\begin{verbatim}
|
|
compoundADeg : A -> ADeg ; -- -/more/most ridiculous
|
|
\end{verbatim}
|
|
|
|
From a given \texttt{ADeg}, it is possible to get back to \texttt{A}.
|
|
|
|
\begin{verbatim}
|
|
adegA : ADeg -> A ;
|
|
\end{verbatim}
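
The comparison adjective patterns can be summarized in one sketch. The module name \texttt{AdjSketchEng} is an assumption; the words are those used as examples in the text above.

\begin{verbatim}
  resource AdjSketchEng = open ParadigmsEng, CatEng in {
    oper
      long_ADeg : ADeg = regADeg "long" ;
      fat_ADeg  : ADeg = duplADeg "fat" ;   -- doubled consonant
      good_ADeg : ADeg = mkADeg "good" "better" "best" "well" ;
      -- comparison with more/most
      ridiculous_ADeg : ADeg = compoundADeg (regA "ridiculous") ;
      -- back to a plain adjective
      fat_A : A = adegA fat_ADeg ;
  }
\end{verbatim}

Note that \texttt{compoundADeg} starts from an \texttt{A}, whereas the other functions start from strings.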
|
|
|
|
\subsubsubsection{Adverbs}
|
|
Adverbs are not inflected. Most lexical ones have position
|
|
after the verb. Some can be preverbal (e.g. \textit{always}).
|
|
|
|
\begin{verbatim}
|
|
mkAdv : Str -> Adv ;
|
|
mkAdV : Str -> AdV ;
|
|
\end{verbatim}
|
|
|
|
Adverbs modifying adjectives and sentences can also be formed.
|
|
|
|
\begin{verbatim}
|
|
mkAdA : Str -> AdA ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Prepositions}
|
|
A preposition as used for rection in the lexicon, as well as to
|
|
build \texttt{PP}s in the resource API, just requires a string.
|
|
|
|
\begin{verbatim}
|
|
mkPreposition : Str -> Preposition ;
|
|
mkPrep : Str -> Prep ;
|
|
\end{verbatim}
|
|
|
|
(These two functions are synonyms.)
|
|
|
|
\subsubsubsection{Verbs}
|
|
Except for \textit{be}, the worst case needs five forms: the infinitive and
|
|
the third person singular present, the past indicative, and the
|
|
past and present participles.
|
|
|
|
\begin{verbatim}
|
|
mkV : (go, goes, went, gone, going : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
The regular verb function recognizes the special cases where the last
|
|
character is \textit{y} (\textit{cry - cries} but \textit{buy - buys}) or \textit{s}, \textit{sh}, \textit{x}, \textit{z}
|
|
(\textit{fix - fixes}, etc).
|
|
|
|
\begin{verbatim}
|
|
regV : Str -> V ;
|
|
\end{verbatim}
|
|
|
|
The following variant duplicates the last letter in the forms like
|
|
\textit{rip - ripped - ripping}.
|
|
|
|
\begin{verbatim}
|
|
regDuplV : Str -> V ;
|
|
\end{verbatim}
|
|
|
|
There is an extensive list of irregular verbs in the module \texttt{IrregularEng}.
|
|
In practice, it is enough to give three forms,
|
|
e.g. \textit{drink - drank - drunk}, with a variant indicating consonant
|
|
duplication in the present participle.
|
|
|
|
\begin{verbatim}
|
|
irregV : (drink, drank, drunk : Str) -> V ;
|
|
irregDuplV : (get, got, gotten : Str) -> V ;
|
|
\end{verbatim}
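
The verb functions above, from the most regular to the worst case, might be used as in this sketch. The module name \texttt{VerbSketchEng} is an assumption; the words are taken from the examples in the text.

\begin{verbatim}
  resource VerbSketchEng = open ParadigmsEng, CatEng in {
    oper
      fix_V   : V = regV "fix" ;       -- sibilant ending
      cry_V   : V = regV "cry" ;       -- final y after a consonant
      rip_V   : V = regDuplV "rip" ;   -- doubled consonant
      drink_V : V = irregV "drink" "drank" "drunk" ;
      get_V   : V = irregDuplV "get" "got" "gotten" ;
      go_V    : V = mkV "go" "goes" "went" "gone" "going" ;
  }
\end{verbatim}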
|
|
|
|
\subsubsubsection{Verbs with a particle}
|
|
The particle, such as in \textit{switch on}, is given as a string.
|
|
|
|
\begin{verbatim}
|
|
partV : V -> Str -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Reflexive verbs}
|
|
By default, verbs are not reflexive; this function makes them reflexive.
|
|
|
|
\begin{verbatim}
|
|
reflV : V -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place verbs}
|
|
Two-place verbs need a preposition, except in the special case of a direct object
(transitive verbs). Notice that a particle comes from the \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
mkV2 : V -> Preposition -> V2 ;
|
|
|
|
dirV2 : V -> V2 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Three-place verbs}
|
|
Three-place (ditransitive) verbs need two prepositions, of which
|
|
the first one or both can be absent.
|
|
|
|
\begin{verbatim}
|
|
mkV3 : V -> Preposition -> Preposition -> V3 ; -- speak, with, about
|
|
dirV3 : V -> Preposition -> V3 ; -- give,_,to
|
|
dirdirV3 : V -> V3 ; -- give,_,_
|
|
\end{verbatim}
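
Two- and three-place verbs are built on top of a \texttt{V}, as the following sketch shows. The module name \texttt{ValencySketchEng} is an assumption; the complement patterns follow the comments in the signatures above.

\begin{verbatim}
  resource ValencySketchEng = open ParadigmsEng, CatEng in {
    oper
      use_V2       : V2 = dirV2 (regV "use") ;
      look_at_V2   : V2 = mkV2 (regV "look") (mkPreposition "at") ;
      -- the particle is part of the V, not of the V2
      switch_on_V2 : V2 = dirV2 (partV (regV "switch") "on") ;
      speak_V3 : V3 =
        mkV3 (irregV "speak" "spoke" "spoken")
             (mkPreposition "with") (mkPreposition "about") ;
      give_V3  : V3 =
        dirV3 (irregV "give" "gave" "given") (mkPreposition "to") ;
  }
\end{verbatim}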
|
|
|
|
\subsubsubsection{Other complement patterns}
|
|
Verbs and adjectives can take complements such as sentences,
|
|
questions, verb phrases, and adjectives.
|
|
|
|
\begin{verbatim}
|
|
mkV0 : V -> V0 ;
|
|
mkVS : V -> VS ;
|
|
mkV2S : V -> Str -> V2S ;
|
|
mkVV : V -> VV ;
|
|
mkV2V : V -> Str -> Str -> V2V ;
|
|
mkVA : V -> VA ;
|
|
mkV2A : V -> Str -> V2A ;
|
|
mkVQ : V -> VQ ;
|
|
mkV2Q : V -> Str -> V2Q ;
|
|
|
|
mkAS : A -> AS ;
|
|
mkA2S : A -> Str -> A2S ;
|
|
mkAV : A -> AV ;
|
|
mkA2V : A -> Str -> A2V ;
|
|
\end{verbatim}
|
|
|
|
Notice: categories \texttt{V2S, V2V, V2A, V2Q} are in v 1.0 treated
|
|
just as synonyms of \texttt{V2}, and the second argument is given
|
|
as an adverb. Likewise \texttt{AS, A2S, AV, A2V} are just \texttt{A}.
|
|
\texttt{V0} is just \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
V0, V2S, V2V, V2A, V2Q : Type ;
|
|
AS, A2S, AV, A2V : Type ;
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
|
|
\# -path=.:../abstract:../common:../../prelude
|
|
|
|
|
|
\subsubsection{Finnish Lexical Paradigms}
|
|
Aarne Ranta 2003--2005
|
|
|
|
This is an API to the user of the resource grammar
|
|
for adding lexical items. It gives functions for forming
|
|
expressions of open categories: nouns, adjectives, verbs.
|
|
|
|
Closed categories (determiners, pronouns, conjunctions) are
|
|
accessed through the resource syntax API, \texttt{Structural.gf}.
|
|
|
|
The main difference with \texttt{MorphoFin.gf} is that the types
|
|
referred to are compiled resource grammar types. We have moreover
|
|
had the design principle of always having existing forms, rather
|
|
than stems, as string arguments of the paradigms.
|
|
|
|
The structure of functions for each word class \texttt{C} is the following:
|
|
first we give a handful of patterns that aim to cover all
|
|
regular cases. Then we give a worst-case function \texttt{mkC}, which serves as an
|
|
escape to construct the most irregular words of type \texttt{C}.
|
|
However, this function should only seldom be needed: we have a
|
|
separate module \texttt{IrregularFin}, which covers all irregularly inflected
|
|
words.
|
|
|
|
The following modules are presupposed:
|
|
|
|
\begin{verbatim}
|
|
resource ParadigmsFin = open
|
|
(Predef=Predef),
|
|
Prelude,
|
|
MorphoFin,
|
|
CatFin
|
|
in {
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
|
flags optimize=noexpand ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Parameters}
|
|
To abstract over number and (some) case names,
we define the following identifiers. The application programmer
should always use these constants instead of their definitions
in \texttt{TypesFin}.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
Number : Type ;
|
|
|
|
singular : Number ;
|
|
plural : Number ;
|
|
|
|
Case : Type ;
|
|
nominative : Case ;
|
|
genitive : Case ;
|
|
partitive : Case ;
|
|
translative : Case ;
|
|
inessive : Case ;
|
|
elative : Case ;
|
|
illative : Case ;
|
|
adessive : Case ;
|
|
ablative : Case ;
|
|
allative : Case ;
|
|
\end{verbatim}
|
|
|
|
The following type is used for defining \textbf{rection}, i.e. complements
|
|
of many-place verbs and adjectives. A complement can be defined by
|
|
just a case, or a pre/postposition and a case.
|
|
|
|
\begin{verbatim}
|
|
prePrep : Case -> Str -> Prep ; -- ilman, partitive
|
|
postPrep : Case -> Str -> Prep ; -- takana, genitive
|
|
postGenPrep : Str -> Prep ; -- takana
|
|
casePrep : Case -> Prep ; -- adessive
|
|
\end{verbatim}
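
The four rection functions might be used as in this sketch, which follows the comments in the signatures above. The module name \texttt{PrepSketchFin} and the constant names are assumptions for illustration.

\begin{verbatim}
  resource PrepSketchFin = open ParadigmsFin, CatFin in {
    oper
      -- preposition governing the partitive
      ilman_Prep   : Prep = prePrep partitive "ilman" ;
      -- postposition governing the genitive, in two equivalent spellings
      takana_Prep  : Prep = postPrep genitive "takana" ;
      takana2_Prep : Prep = postGenPrep "takana" ;
      -- a bare case with no adposition
      adessive_Prep : Prep = casePrep adessive ;
  }
\end{verbatim}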
|
|
|
|
\subsubsubsection{Nouns}
|
|
The worst case gives ten forms.
|
|
In practice just a couple of forms are needed, to define the different
|
|
stems, vowel alternation, and vowel harmony.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
mkN : (talo, talon, talona, taloa, taloon,
|
|
taloina,taloissa,talojen,taloja,taloihin : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
The regular noun heuristic takes just one form (singular
|
|
nominative) and analyses it to pick the correct paradigm.
|
|
It does automatic grade alternation, and is hence not usable
|
|
for words like \textit{auto} (whose genitive would become \textit{audon}).
|
|
|
|
\begin{verbatim}
|
|
regN : (talo : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
If \texttt{regN} does not give the correct result, one can try giving
two or three forms as follows. Examples of the use of these
functions are given in \texttt{BasicFin}. Most notably, \texttt{reg2N} is used
for nouns like \textit{kivi - kiviä}, which would otherwise become like
\textit{rivi - rivejä}. \texttt{reg3N} is used e.g. for
\textit{sydän - sydämen - sydämiä}, which would otherwise become
\textit{sydän - sytämen}.
|
|
|
|
\begin{verbatim}
|
|
reg2N : (savi,savia : Str) -> N ;
|
|
reg3N : (vesi,veden,vesiä : Str) -> N ;
|
|
\end{verbatim}
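
The escalation from one to three forms can be sketched with the words mentioned above. The module name \texttt{NounSketchFin} is an assumption for illustration.

\begin{verbatim}
  resource NounSketchFin = open ParadigmsFin, CatFin in {
    oper
      -- one form is enough
      talo_N  : N = regN "talo" ;
      -- plural partitive added
      kivi_N  : N = reg2N "kivi" "kiviä" ;
      -- singular genitive added as well
      sydan_N : N = reg3N "sydän" "sydämen" "sydämiä" ;
  }
\end{verbatim}

A practical way to choose between the three is to linearize the resulting noun to a table and check the forms against a dictionary.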
|
|
|
|
Some nouns have an unexpected singular partitive, e.g. \textit{meri}, \textit{lumi}.
|
|
|
|
\begin{verbatim}
|
|
sgpartN : (meri : N) -> (merta : Str) -> N ;
|
|
nMeri : (meri : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
The rest of the noun paradigms are mostly covered by the three
|
|
heuristics.
|
|
|
|
Nouns with partitive \textit{a}/\textit{ä} are a large group.
To determine grade and vowel alternation, three forms are usually needed:
|
|
singular nominative and genitive, and plural partitive.
|
|
Examples: \textit{talo}, \textit{kukko}, \textit{huippu}, \textit{koira}, \textit{kukka}, \textit{syylä}, \textit{särki}...
|
|
|
|
\begin{verbatim}
|
|
nKukko : (kukko,kukon,kukkoja : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
A special case are nouns with no alternations:
|
|
the vowel harmony is inferred from the last letter,
|
|
which must be one of \textit{o}, \textit{u}, \textit{ö}, \textit{y}.
|
|
|
|
\begin{verbatim}
|
|
nTalo : (talo : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
Another special case are nouns where the last two consonants
|
|
undergo regular weak-grade alternation:
|
|
\textit{kukko - kukon}, \textit{rutto - ruton}, \textit{hyppy - hypyn}, \textit{sampo - sammon},
|
|
\textit{kunto - kunnon}, \textit{sisältö - sisällön}.
|
|
|
|
\begin{verbatim}
|
|
nLukko : (lukko : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
Further examples: \textit{arpi - arven}, \textit{sappi - sapen}, \textit{kampi - kammen}; \textit{sylki - syljen}.
|
|
|
|
\begin{verbatim}
|
|
nArpi : (arpi : Str) -> N ;
|
|
nSylki : (sylki : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
Foreign words ending in consonants are actually similar to words like
\textit{malli}/\textit{mallin}/\textit{malleja}, with the exception that the \textit{i} is not attached
to the singular nominative. Examples: \textit{linux}, \textit{savett}, \textit{screen}.
The singular partitive form is used to get the vowel harmony. (N.B. words of
more than one syllable ending in \textit{n} would have variant plural genitive and
partitive forms, like \textit{sultanien}/\textit{sultaneiden}, which are not covered.)
|
|
|
|
\begin{verbatim}
|
|
nLinux : (linuxia : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
Nouns of at least 3 syllables ending with \textit{a} or \textit{ä}, like \textit{peruna}, \textit{tavara},
|
|
\textit{rytinä}.
|
|
|
|
\begin{verbatim}
|
|
nPeruna : (peruna : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
The following paradigm covers both nouns ending in an aspirated \textit{e}, such as
|
|
\textit{rae}, \textit{perhe}, \textit{savuke}, and also many ones ending in a consonant
|
|
(\textit{rengas}, \textit{kätkyt}). The singular nominative and essive are given.
|
|
|
|
\begin{verbatim}
|
|
nRae : (rae, rakeena : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
The following covers nouns with partitive \textit{ta}/\textit{tä}, such as
\textit{susi}, \textit{vesi}, \textit{pieni}. To get all stems and the vowel harmony, it takes
the singular nominative, genitive, and partitive.
|
|
|
|
\begin{verbatim}
|
|
nSusi : (susi,suden,sutta : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
Nouns ending with a long vowel, such as \textit{puu}, \textit{pää}, \textit{pii}, \textit{leikkuu},
|
|
are inflected according to the following.
|
|
|
|
\begin{verbatim}
|
|
nPuu : (puu : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
One-syllable diphthong nouns, such as \textit{suo}, \textit{tie}, \textit{työ}, are inflected by
|
|
the following.
|
|
|
|
\begin{verbatim}
|
|
nSuo : (suo : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
Many adjectives but also nouns have the nominative ending \textit{nen} which in other
|
|
cases becomes \textit{s}: \textit{nainen}, \textit{ihminen}, \textit{keltainen}.
|
|
To capture the vowel harmony, we use the partitive form as the argument.
|
|
|
|
\begin{verbatim}
|
|
nNainen : (naista : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
The following covers some nouns ending with a consonant, e.g.
|
|
\textit{tilaus}, \textit{kaulin}, \textit{paimen}, \textit{laidun}.
|
|
|
|
\begin{verbatim}
|
|
nTilaus : (tilaus,tilauksena : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
Special case:
|
|
|
|
\begin{verbatim}
|
|
nKulaus : (kulaus : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
The following covers nouns like \textit{nauris} and adjectives like \textit{kallis}, \textit{tyyris}.
|
|
The partitive form is taken to get the vowel harmony.
|
|
|
|
\begin{verbatim}
|
|
nNauris : (naurista : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
Separately-written compound nouns, like \textit{sambal oelek}, \textit{Urho Kekkonen},
|
|
have only their last part inflected.
|
|
|
|
\begin{verbatim}
|
|
compN : Str -> N -> N ;
|
|
\end{verbatim}
|
|
|
|
Nouns used as functions need a case, of which by far the commonest is
|
|
the genitive.
|
|
|
|
\begin{verbatim}
|
|
mkN2 : N -> Prep -> N2 ;
|
|
genN2 : N -> N2 ;
|
|
|
|
mkN3 : N -> Prep -> Prep -> N3 ;
|
|
\end{verbatim}
|
|
|
|
Proper names can be formed by using declensions for nouns.
|
|
The plural forms are filtered away by the compiler.
|
|
|
|
\begin{verbatim}
|
|
mkPN : N -> PN ;
|
|
mkNP : N -> Number -> NP ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectives}
|
|
Non-comparison one-place adjectives are just like nouns.
|
|
|
|
\begin{verbatim}
|
|
mkA : N -> A ;
|
|
\end{verbatim}
|
|
|
|
Two-place adjectives need a case for the second argument.
|
|
|
|
\begin{verbatim}
|
|
mkA2 : A -> Prep -> A2 ;
|
|
\end{verbatim}
|
|
|
|
Comparison adjectives have three forms. The comparative and the superlative
|
|
are always inflected in the same way, so the nominative of them is actually
|
|
enough (except for the superlative \textit{paras} of \textit{hyvä}).
|
|
|
|
\begin{verbatim}
|
|
mkADeg : (kiva : N) -> (kivempaa,kivinta : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
The regular adjectives are based on \texttt{regN} in the positive.
|
|
|
|
\begin{verbatim}
|
|
regA : (punainen : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verbs}
|
|
The grammar does not cover the potential mood and some nominal
|
|
forms. One way to see the coverage is to linearize a verb to
|
|
a table.
|
|
The worst case needs twelve forms, as shown in the following.
|
|
|
|
\begin{verbatim}
|
|
mkV : (tulla,tulee,tulen,tulevat,tulkaa,tullaan,
|
|
tuli,tulin,tulisi,tullut,tultu,tullun : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
The following heuristics cover more and more verbs.
|
|
|
|
\begin{verbatim}
|
|
regV : (soutaa : Str) -> V ;
|
|
reg2V : (soutaa,souti : Str) -> V ;
|
|
reg3V : (soutaa,soudan,souti : Str) -> V ;
|
|
\end{verbatim}
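The three heuristics can be sketched on one and the same verb, each call giving the paradigm more information. This assumes the signatures listed above; the module names \texttt{ParadigmsFin} and \texttt{CatFin} and all oper names are hypothetical.

\begin{verbatim}
  -- Hypothetical lexicon module; ParadigmsFin and CatFin are assumed names.
  resource ExampleVerbsFin = open ParadigmsFin, CatFin in {
    oper
      soutaa1_V : V = regV  "soutaa" ;                   -- infinitive only
      soutaa2_V : V = reg2V "soutaa" "souti" ;           -- plus past tense
      soutaa3_V : V = reg3V "soutaa" "soudan" "souti" ;  -- plus 1st person sg
  }
\end{verbatim}

If the one-argument version already produces the right table, the longer variants are not needed.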
|
|
|
|
The subject case of verbs is by default nominative. This function can change it.
|
|
|
|
\begin{verbatim}
|
|
subjcaseV : V -> Case -> V ;
|
|
\end{verbatim}
|
|
|
|
The rest of the paradigms are special cases mostly covered by the heuristics.
|
|
A simple special case is the one with just one stem and without grade alternation.
|
|
|
|
\begin{verbatim}
|
|
vValua : (valua : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
With two forms, the following function covers a variety of verbs, such as
|
|
\textit{ottaa}, \textit{käyttää}, \textit{löytää}, \textit{huoltaa}, \textit{hiihtää}, \textit{siirtää}.
|
|
|
|
\begin{verbatim}
|
|
vKattaa : (kattaa, katan : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
When grade alternation is not present, just a one-form special case is needed
|
|
(\textit{poistaa}, \textit{ryystää}).
|
|
|
|
\begin{verbatim}
|
|
vOstaa : (ostaa : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
The following covers
|
|
\textit{juosta}, \textit{piestä}, \textit{nousta}, \textit{rangaista}, \textit{kävellä}, \textit{surra}, \textit{panna}.
|
|
|
|
\begin{verbatim}
|
|
vNousta : (nousta, nousen : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
This is for one-syllable diphthong verbs like \textit{juoda}, \textit{syödä}.
|
|
|
|
\begin{verbatim}
|
|
vTuoda : (tuoda : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
All the patterns above have \texttt{nominative} as subject case.
|
|
If another case is wanted, use the following.
|
|
|
|
\begin{verbatim}
|
|
caseV : Case -> V -> V ;
|
|
\end{verbatim}
|
|
|
|
The verb \textit{be} (\textit{olla}) is special.
|
|
|
|
\begin{verbatim}
|
|
vOlla : V ;
|
|
\end{verbatim}
|
|
|
|
Two-place verbs need a case, and can have a pre- or postposition.
|
|
|
|
\begin{verbatim}
|
|
mkV2 : V -> Prep -> V2 ;
|
|
\end{verbatim}
|
|
|
|
If the complement needs just a case, the following special function can be used.
|
|
|
|
\begin{verbatim}
|
|
caseV2 : V -> Case -> V2 ;
|
|
\end{verbatim}
|
|
|
|
Verbs with a direct (accusative) object
|
|
are special, since their complement case is finally decided in syntax.
|
|
But this is taken care of by \texttt{ClauseFin}.
|
|
|
|
\begin{verbatim}
|
|
dirV2 : V -> V2 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Three-place verbs}
|
|
Three-place (ditransitive) verbs need two prepositions, of which
|
|
the first one or both can be absent.
|
|
|
|
\begin{verbatim}
|
|
mkV3 : V -> Prep -> Prep -> V3 ; -- speak, with, about
|
|
dirV3 : V -> Case -> V3 ; -- give,_,to
|
|
dirdirV3 : V -> V3 ; -- acc, allat
|
|
\end{verbatim}
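A small sketch of the direct-object constructors, assuming the signatures above. The module names \texttt{ParadigmsFin} and \texttt{CatFin} and the oper names are hypothetical, and the choice of verbs is only illustrative.

\begin{verbatim}
  -- Hypothetical lexicon module; ParadigmsFin and CatFin are assumed names.
  resource ExampleObjVerbsFin = open ParadigmsFin, CatFin in {
    oper
      ostaa_V2 : V2 = dirV2 (vOstaa "ostaa") ;    -- direct object
      antaa_V3 : V3 = dirdirV3 (regV "antaa") ;   -- acc, allat
  }
\end{verbatim}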
|
|
|
|
\subsubsubsection{Other complement patterns}
|
|
Verbs and adjectives can take complements such as sentences,
|
|
questions, verb phrases, and adjectives.
|
|
|
|
\begin{verbatim}
|
|
mkV0 : V -> V0 ;
|
|
mkVS : V -> VS ;
|
|
mkV2S : V -> Prep -> V2S ;
|
|
mkVV : V -> VV ;
|
|
mkV2V : V -> Prep -> V2V ;
|
|
mkVA : V -> Prep -> VA ;
|
|
mkV2A : V -> Prep -> Prep -> V2A ;
|
|
mkVQ : V -> VQ ;
|
|
mkV2Q : V -> Prep -> V2Q ;
|
|
|
|
mkAS : A -> AS ;
|
|
mkA2S : A -> Prep -> A2S ;
|
|
mkAV : A -> AV ;
|
|
mkA2V : A -> Prep -> A2V ;
|
|
\end{verbatim}
|
|
|
|
Notice: categories \texttt{V2S, V2V, V2Q} are in v 1.0 treated
|
|
just as synonyms of \texttt{V2}, and the second argument is given
|
|
as an adverb. Likewise \texttt{AS, A2S, AV, A2V} are just \texttt{A}.
|
|
\texttt{V0} is just \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
V0, V2S, V2V, V2Q : Type ;
|
|
AS, A2S, AV, A2V : Type ;
|
|
\end{verbatim}
|
|
|
|
The definitions should not bother the user of the API. So they are
|
|
hidden from the document.
|
|
|
|
|
|
\commOut{Produced by
|
|
gfdoc - a rudimentary GF document generator.
|
|
(c) Aarne Ranta (\htmladdnormallink{aarne@cs.chalmers.se}{mailto:aarne@cs.chalmers.se}) 2002 under GNU GPL.}
|
|
|
|
|
|
|
|
\# -path=.:../romance:../common:../abstract:../../prelude
|
|
|
|
|
|
\subsubsection{French Lexical Paradigms}
|
|
Aarne Ranta 2003
|
|
|
|
This is an API to the user of the resource grammar
|
|
for adding lexical items. It gives functions for forming
|
|
expressions of open categories: nouns, adjectives, verbs.
|
|
|
|
Closed categories (determiners, pronouns, conjunctions) are
|
|
accessed through the resource syntax API, \texttt{Structural.gf}.
|
|
|
|
The main difference with \texttt{MorphoFre.gf} is that the types
|
|
referred to are compiled resource grammar types. We have moreover
|
|
had the design principle of always having existing forms, rather
|
|
than stems, as string arguments of the paradigms.
|
|
|
|
The structure of functions for each word class \texttt{C} is the following:
|
|
first we give a handful of patterns that aim to cover all
|
|
regular cases. Then we give a worst-case function \texttt{mkC}, which serves as an
|
|
escape to construct the most irregular words of type \texttt{C}.
|
|
However, this function should only seldom be needed: irregular verbs are
given in the separate module \texttt{VerbsFre}.
|
|
|
|
\begin{verbatim}
|
|
resource ParadigmsFre =
|
|
open
|
|
(Predef=Predef),
|
|
Prelude,
|
|
CommonRomance,
|
|
ResFre,
|
|
MorphoFre,
|
|
CatFre in {
|
|
|
|
flags optimize=all ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Parameters}
|
|
To abstract over gender names, we define the following identifiers.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
Gender : Type ;
|
|
|
|
masculine : Gender ;
|
|
feminine : Gender ;
|
|
\end{verbatim}
|
|
|
|
To abstract over number names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Number : Type ;
|
|
|
|
singular : Number ;
|
|
plural : Number ;
|
|
\end{verbatim}
|
|
|
|
Prepositions used in many-argument functions are either strings
|
|
(including the 'accusative' empty string) or strings that
|
|
amalgamate with the following word (the 'genitive' \textit{de} and the
|
|
'dative' \textit{à}).
|
|
|
|
\begin{verbatim}
|
|
Preposition : Type ;
|
|
|
|
accusative : Preposition ;
|
|
genitive : Preposition ;
|
|
dative : Preposition ;
|
|
|
|
mkPreposition : Str -> Preposition ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Nouns}
|
|
Worst case: give both forms (singular and plural) and the gender.
|
|
|
|
\begin{verbatim}
|
|
mkN : (oeil,yeux : Str) -> Gender -> N ;
|
|
\end{verbatim}
|
|
|
|
The regular function takes the singular form,
|
|
and computes the plural and the gender by a heuristic. The plural
|
|
heuristic currently
|
|
covers the cases \textit{pas-pas}, \textit{prix-prix}, \textit{nez-nez},
|
|
\textit{bijou-bijoux}, \textit{cheveu-cheveux}, \textit{plateau-plateaux}, \textit{cheval-chevaux}.
|
|
The gender heuristic is less reliable: it treats as feminine all
|
|
nouns ending with \textit{e} and \textit{ion}, all others as masculine.
|
|
If in doubt, use the \texttt{cc} command to test!
|
|
|
|
\begin{verbatim}
|
|
regN : Str -> N ;
|
|
\end{verbatim}
|
|
|
|
Adding gender information widens the scope of the foregoing function.
|
|
|
|
\begin{verbatim}
|
|
regGenN : Str -> Gender -> N ;
|
|
\end{verbatim}
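The three noun constructors can be sketched side by side as follows. The module name \texttt{ExampleNounsFre} and the oper names are hypothetical; the comments restate what the heuristics described above would guess.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleNounsFre = open ParadigmsFre, CatFre in {
    oper
      cheval_N : N = regN "cheval" ;                -- cheval-chevaux, masculine
      maison_N : N = regGenN "maison" feminine ;    -- would be guessed masculine
      livre_N  : N = regGenN "livre" masculine ;    -- final e would be guessed feminine
      oeil_N   : N = mkN "oeil" "yeux" masculine ;  -- irregular plural
  }
\end{verbatim}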
|
|
|
|
\subsubsubsection{Compound nouns}
|
|
Some compound nouns have a first part that is inflected as a noun and
a second part that is not inflected, e.g. \textit{numéro de téléphone}.
|
|
They could be formed in syntax, but we give a shortcut here since
|
|
they are frequent in lexica.
|
|
|
|
\begin{verbatim}
|
|
compN : N -> Str -> N ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Relational nouns}
|
|
Relational nouns (\textit{fille de x}) need a case and a preposition.
|
|
|
|
\begin{verbatim}
|
|
mkN2 : N -> Preposition -> N2 ;
|
|
\end{verbatim}
|
|
|
|
The most common cases are the genitive \textit{de} and the dative \textit{à},
|
|
with the empty preposition.
|
|
|
|
\begin{verbatim}
|
|
deN2 : N -> N2 ;
|
|
aN2 : N -> N2 ;
|
|
\end{verbatim}
|
|
|
|
Three-place relational nouns (\textit{la connexion de x à y}) need two prepositions.
|
|
|
|
\begin{verbatim}
|
|
mkN3 : N -> Preposition -> Preposition -> N3 ;
|
|
\end{verbatim}
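A sketch of compound and relational nouns built with the functions above. The module and oper names are hypothetical, and the claim that \texttt{deN2} behaves like \texttt{mkN2} with \texttt{genitive} is an assumption suggested by the text, not something stated in it.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleRelNounsFre = open ParadigmsFre, CatFre in {
    oper
      numero_N     : N  = compN (regN "numéro") "de téléphone" ;
      fille_N2     : N2 = deN2 (regN "fille") ;               -- fille de x
      filleAlt_N2  : N2 = mkN2 (regN "fille") genitive ;      -- presumably the same
      connexion_N3 : N3 = mkN3 (regGenN "connexion" feminine) genitive dative ;
  }
\end{verbatim}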
|
|
|
|
\subsubsubsection{Relational common noun phrases}
|
|
In some cases, you may want to make a complex \texttt{CN} into a
|
|
relational noun (e.g. \textit{the old town hall of}). However, \texttt{N2} and
|
|
\texttt{N3} are purely lexical categories. But you can use the \texttt{AdvCN}
|
|
and \texttt{PrepNP} constructions to build phrases like this.
|
|
|
|
\subsubsubsection{Proper names and noun phrases}
|
|
Proper names need a string and a gender.
|
|
|
|
\begin{verbatim}
|
|
mkPN : Str -> Gender -> PN ; -- Jean
|
|
\end{verbatim}
|
|
|
|
To form a noun phrase that can also be plural,
|
|
you can use the worst-case function.
|
|
|
|
\begin{verbatim}
|
|
mkNP : Str -> Gender -> Number -> NP ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectives}
|
|
Non-comparison one-place adjectives need four forms in the worst
|
|
case (masc and fem singular, masc plural, adverbial).
|
|
|
|
\begin{verbatim}
|
|
mkA : (banal,banale,banaux,banalement : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
For regular adjectives, all other forms are derived from the
|
|
masculine singular. The heuristic takes into account certain
|
|
deviant endings: \textit{banal- -banaux}, \textit{chinois- -chinois},
|
|
\textit{heureux-heureuse-heureux}, \textit{italien-italienne}, \textit{jeune-jeune},
|
|
\textit{amer-amère}, \textit{carré- - -carrément}, \textit{joli- - -joliment}.
|
|
|
|
\begin{verbatim}
|
|
regA : Str -> A ;
|
|
\end{verbatim}
|
|
|
|
These functions create postfix adjectives. To switch
|
|
them to prefix ones (i.e. ones placed before the noun in
|
|
modification, as in \textit{petite maison}), the following function is
|
|
provided.
|
|
|
|
\begin{verbatim}
|
|
prefA : A -> A ;
|
|
\end{verbatim}
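The adjective constructors can be sketched as follows, using the example words from the text. The module name \texttt{ExampleAdjFre} and the oper names are hypothetical.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleAdjFre = open ParadigmsFre, CatFre in {
    oper
      heureux_A : A = regA "heureux" ;
      banal_A   : A = mkA "banal" "banale" "banaux" "banalement" ;
      petit_A   : A = prefA (regA "petit") ;    -- placed before the noun
  }
\end{verbatim}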
|
|
|
|
\subsubsubsection{Two-place adjectives}
|
|
Two-place adjectives need a preposition for their second argument.
|
|
|
|
\begin{verbatim}
|
|
mkA2 : A -> Preposition -> A2 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Comparison adjectives}
|
|
Comparison adjectives are in the worst case built from two
adjectives: the positive (\textit{bon}) and the comparative (\textit{meilleur}).
|
|
|
|
\begin{verbatim}
|
|
mkADeg : A -> A -> A ;
|
|
\end{verbatim}
|
|
|
|
If comparison is formed by \textit{plus}, as usual in French,
|
|
the following pattern is used:
|
|
|
|
\begin{verbatim}
|
|
compADeg : A -> A ;
|
|
\end{verbatim}
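A sketch of the two comparison patterns, assuming the signatures above; the module and oper names are hypothetical.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleCompAdjFre = open ParadigmsFre, CatFre in {
    oper
      grand_A : A = compADeg (regA "grand") ;                -- comparison with plus
      bon_A   : A = mkADeg (regA "bon") (regA "meilleur") ;  -- suppletive comparative
  }
\end{verbatim}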
|
|
|
|
For prefixed adjectives, the following function is
|
|
provided.
|
|
|
|
\begin{verbatim}
|
|
prefA : A -> A ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adverbs}
|
|
Adverbs are not inflected. Most lexical ones have position
|
|
after the verb.
|
|
|
|
\begin{verbatim}
|
|
mkAdv : Str -> Adv ;
|
|
\end{verbatim}
|
|
|
|
Some appear next to the verb (e.g. \textit{toujours}).
|
|
|
|
\begin{verbatim}
|
|
mkAdV : Str -> AdV ;
|
|
\end{verbatim}
|
|
|
|
Adverbs modifying adjectives and sentences can also be formed.
|
|
|
|
\begin{verbatim}
|
|
mkAdA : Str -> AdA ;
|
|
\end{verbatim}
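The three adverb constructors differ only in the category they return, as this sketch shows. The module and oper names are hypothetical, and the choice of words is illustrative.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleAdvFre = open ParadigmsFre, CatFre in {
    oper
      demain_Adv   : Adv = mkAdv "demain" ;     -- after the verb
      toujours_AdV : AdV = mkAdV "toujours" ;   -- next to the verb
      trop_AdA     : AdA = mkAdA "trop" ;       -- modifies an adjective
  }
\end{verbatim}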
|
|
|
|
\subsubsubsection{Verbs}
|
|
Irregular verbs are given in the module \texttt{VerbsFre}.
|
|
If a verb should be missing in that list, the module
|
|
\texttt{BeschFre} gives all the patterns of the \textit{Bescherelle} book.
|
|
|
|
Regular verbs are ones with the infinitive \textit{er} or \textit{ir}, the
|
|
latter with plural present indicative forms as \textit{finissons}.
|
|
The regular verb function recognizes these endings, as well as the
first-conjugation variations among
\textit{aimer, céder, placer, peser, jeter, manger, assiéger, payer}.
|
|
|
|
\begin{verbatim}
|
|
regV : Str -> V ;
|
|
\end{verbatim}
|
|
|
|
Sometimes, however, it is not predictable which variant of the \textit{er}
|
|
conjugation is to be selected. Then it is better to use the function
|
|
that gives the third person singular present indicative and future
|
|
((\textit{il}) \textit{jette}, \textit{jettera}) as second and third arguments.
|
|
|
|
\begin{verbatim}
|
|
reg3V : (jeter,jette,jettera : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
The function \texttt{regV} gives all verbs the compound auxiliary \textit{avoir}.
|
|
To change it to \textit{être}, use the following function. Reflexive implies \textit{être}.
|
|
|
|
\begin{verbatim}
|
|
etreV : V -> V ;
|
|
reflV : V -> V ;
|
|
\end{verbatim}
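The verb constructors above can be combined as in the following sketch. The module name \texttt{ExampleVerbsFre} and the oper names are hypothetical; the verbs are chosen only to illustrate each function.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleVerbsFre = open ParadigmsFre, CatFre in {
    oper
      aimer_V   : V = regV "aimer" ;
      finir_V   : V = regV "finir" ;
      jeter_V   : V = reg3V "jeter" "jette" "jettera" ;
      arriver_V : V = etreV (regV "arriver") ;   -- compound tenses with être
      laver_V   : V = reflV (regV "laver") ;     -- se laver
  }
\end{verbatim}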
|
|
|
|
\subsubsubsection{Two-place verbs}
|
|
Two-place verbs need a preposition, except the special case with direct object
|
|
(transitive verbs). Notice that a particle comes from the \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
mkV2 : V -> Preposition -> V2 ;
|
|
|
|
dirV2 : V -> V2 ;
|
|
\end{verbatim}
|
|
|
|
You can reuse a \texttt{V2} verb in \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
v2V : V2 -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Three-place verbs}
|
|
Three-place (ditransitive) verbs need two prepositions, of which
|
|
the first one or both can be absent.
|
|
|
|
\begin{verbatim}
|
|
mkV3 : V -> Preposition -> Preposition -> V3 ; -- parler, à, de
|
|
dirV3 : V -> Preposition -> V3 ; -- donner,_,à
|
|
dirdirV3 : V -> V3 ; -- donner,_,_
|
|
\end{verbatim}
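A sketch of two- and three-place verbs with the prepositions defined in the Parameters section. The module and oper names are hypothetical, and the subcategorization of each verb is given only as an illustration.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleVerbArgsFre = open ParadigmsFre, CatFre in {
    oper
      aimer_V2  : V2 = dirV2 (regV "aimer") ;
      penser_V2 : V2 = mkV2 (regV "penser") dative ;                 -- penser à
      jouer_V2  : V2 = mkV2 (regV "jouer") (mkPreposition "avec") ;  -- jouer avec
      parler_V3 : V3 = mkV3 (regV "parler") dative genitive ;        -- parler, à, de
      donner_V3 : V3 = dirV3 (regV "donner") dative ;                -- donner,_,à
  }
\end{verbatim}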
|
|
|
|
\subsubsubsection{Other complement patterns}
|
|
Verbs and adjectives can take complements such as sentences,
|
|
questions, verb phrases, and adjectives.
|
|
|
|
\begin{verbatim}
|
|
mkV0 : V -> V0 ;
|
|
mkVS : V -> VS ;
|
|
mkV2S : V -> Preposition -> V2S ;
|
|
mkVV : V -> VV ; -- plain infinitive: "je veux parler"
|
|
deVV : V -> VV ; -- "j'essaie de parler"
|
|
aVV : V -> VV ; -- "j'arrive à parler"
|
|
mkV2V : V -> Preposition -> Preposition -> V2V ;
|
|
mkVA : V -> VA ;
|
|
mkV2A : V -> Preposition -> Preposition -> V2A ;
|
|
mkVQ : V -> VQ ;
|
|
mkV2Q : V -> Preposition -> V2Q ;
|
|
|
|
mkAS : A -> AS ;
|
|
mkA2S : A -> Preposition -> A2S ;
|
|
mkAV : A -> Preposition -> AV ;
|
|
mkA2V : A -> Preposition -> Preposition -> A2V ;
|
|
\end{verbatim}
|
|
|
|
Notice: categories \texttt{V2S, V2V, V2Q} are in v 1.0 treated
|
|
just as synonyms of \texttt{V2}, and the second argument is given
|
|
as an adverb. Likewise \texttt{AS, A2S, AV, A2V} are just \texttt{A}.
|
|
\texttt{V0} is just \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
V0, V2S, V2V, V2Q : Type ;
|
|
AS, A2S, AV, A2V : Type ;
|
|
\end{verbatim}
|
|
|
|
\commOut{Produced by
|
|
gfdoc - a rudimentary GF document generator.
|
|
(c) Aarne Ranta (\htmladdnormallink{aarne@cs.chalmers.se}{mailto:aarne@cs.chalmers.se}) 2002 under GNU GPL.}
|
|
|
|
|
|
|
|
\# -path=.:../common:../abstract:../../prelude
|
|
|
|
|
|
\subsubsection{German Lexical Paradigms}
|
|
Aarne Ranta \& Harald Hammarström 2003--2006
|
|
|
|
This is an API to the user of the resource grammar
|
|
for adding lexical items. It gives functions for forming
|
|
expressions of open categories: nouns, adjectives, verbs.
|
|
|
|
Closed categories (determiners, pronouns, conjunctions) are
|
|
accessed through the resource syntax API, \texttt{Structural.gf}.
|
|
|
|
The main difference with \texttt{MorphoGer.gf} is that the types
|
|
referred to are compiled resource grammar types. We have moreover
|
|
had the design principle of always having existing forms, rather
|
|
than stems, as string arguments of the paradigms.
|
|
|
|
The structure of functions for each word class \texttt{C} is the following:
|
|
first we give a handful of patterns that aim to cover all
|
|
regular cases. Then we give a worst-case function \texttt{mkC}, which serves as an
|
|
escape to construct the most irregular words of type \texttt{C}.
|
|
However, this function should only seldom be needed: we have a
|
|
separate module \texttt{IrregularGer}, which covers all irregularly inflected
|
|
words.
|
|
|
|
\begin{verbatim}
|
|
resource ParadigmsGer = open
|
|
(Predef=Predef),
|
|
Prelude,
|
|
MorphoGer,
|
|
CatGer
|
|
in {
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Parameters}
|
|
To abstract over gender names, we define the following identifiers.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
Gender : Type ;
|
|
|
|
masculine : Gender ;
|
|
feminine : Gender ;
|
|
neuter : Gender ;
|
|
\end{verbatim}
|
|
|
|
To abstract over case names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Case : Type ;
|
|
|
|
nominative : Case ;
|
|
accusative : Case ;
|
|
dative : Case ;
|
|
genitive : Case ;
|
|
\end{verbatim}
|
|
|
|
To abstract over number names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Number : Type ;
|
|
|
|
singular : Number ;
|
|
plural : Number ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Nouns}
|
|
Worst case: give all four singular forms, two plural forms (others + dative),
|
|
and the gender.
|
|
|
|
\begin{verbatim}
|
|
mkN : (x1,_,_,_,_,x6 : Str) -> Gender -> N ;
|
|
-- mann, mann, manne, mannes, männer, männern
|
|
\end{verbatim}
|
|
|
|
The regular heuristic recognizes some suffixes, from which it
|
|
guesses the gender and the declension: \textit{e, ung, ion} give the
|
|
feminine with plural ending \textit{-n, -en}, and the rest are masculines
|
|
with the plural \textit{-e} (without Umlaut).
|
|
|
|
\begin{verbatim}
|
|
regN : Str -> N ;
|
|
\end{verbatim}
|
|
|
|
The 'almost regular' case is much like the information given in an ordinary
|
|
dictionary. It takes the singular and plural nominative and the
|
|
gender, and infers the other forms from these.
|
|
|
|
\begin{verbatim}
|
|
reg2N : (x1,x2 : Str) -> Gender -> N ;
|
|
\end{verbatim}
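The three levels of noun constructors can be sketched as follows. The module name \texttt{ExampleNounsGer} and the oper names are hypothetical; the worst-case call uses the forms from the comment above, capitalized.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleNounsGer = open ParadigmsGer, CatGer in {
    oper
      zeitung_N : N = regN "Zeitung" ;               -- ung: feminine, plural -en
      kind_N    : N = reg2N "Kind" "Kinder" neuter ; -- dictionary-style entry
      mann_N    : N = mkN "Mann" "Mann" "Manne" "Mannes"
                          "Männer" "Männern" masculine ;
  }
\end{verbatim}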
|
|
|
|
Relational nouns need a preposition. The most common is \textit{von} with
|
|
the dative. Some prepositions are constructed in \htmladdnormallink{StructuralGer}{StructuralGer.html}.
|
|
|
|
\begin{verbatim}
|
|
mkN2 : N -> Prep -> N2 ;
|
|
vonN2 : N -> N2 ;
|
|
\end{verbatim}
|
|
|
|
Use the function \texttt{mkPrep} or see the section on prepositions below to
|
|
form other prepositions.
|
|
|
|
Three-place relational nouns (\textit{die Verbindung von x nach y}) need two prepositions.
|
|
|
|
\begin{verbatim}
|
|
mkN3 : N -> Prep -> Prep -> N3 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Proper names and noun phrases}
|
|
Proper names are formed as follows: \texttt{mkPN} takes the nominative and
the genitive, and \texttt{regPN} forms a regular genitive.
The regular genitive is \textit{s}, omitted after \textit{s}.
|
|
|
|
\begin{verbatim}
|
|
mkPN : (karolus, karoli : Str) -> PN ; -- karolus, karoli
|
|
regPN : (Johann : Str) -> PN ; -- Johann, Johanns ; Johannes, Johannes
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectives}
|
|
Adjectives need three forms, one for each degree.
|
|
|
|
\begin{verbatim}
|
|
mkA : (x1,_,x3 : Str) -> A ; -- gut,besser,beste
|
|
\end{verbatim}
|
|
|
|
The regular adjective formation works for most cases, and includes
|
|
variations such as \textit{teuer - teurer}, \textit{böse - böser}.
|
|
|
|
\begin{verbatim}
|
|
regA : Str -> A ;
|
|
\end{verbatim}
|
|
|
|
Invariable adjectives are a special case.
|
|
|
|
\begin{verbatim}
|
|
invarA : Str -> A ; -- prima
|
|
\end{verbatim}
|
|
|
|
Two-place adjectives are formed by adding a preposition to an adjective.
|
|
|
|
\begin{verbatim}
|
|
mkA2 : A -> Prep -> A2 ;
|
|
\end{verbatim}
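A sketch of the adjective constructors, assuming the signatures above and the function \texttt{mkPrep} from the section on prepositions. The module and oper names are hypothetical.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleAdjGer = open ParadigmsGer, CatGer in {
    oper
      teuer_A  : A  = regA "teuer" ;
      gut_A    : A  = mkA "gut" "besser" "beste" ;
      prima_A  : A  = invarA "prima" ;
      stolz_A2 : A2 = mkA2 (regA "stolz") (mkPrep "auf" accusative) ;
  }
\end{verbatim}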
|
|
|
|
\subsubsubsection{Adverbs}
|
|
Adverbs are just strings.
|
|
|
|
\begin{verbatim}
|
|
mkAdv : Str -> Adv ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Prepositions}
|
|
A preposition is formed from a string and a case.
|
|
|
|
\begin{verbatim}
|
|
mkPrep : Str -> Case -> Prep ;
|
|
\end{verbatim}
|
|
|
|
Often just a case with the empty string is enough.
|
|
|
|
\begin{verbatim}
|
|
accPrep : Prep ;
|
|
datPrep : Prep ;
|
|
genPrep : Prep ;
|
|
\end{verbatim}
|
|
|
|
A couple of common prepositions (always with the dative).
|
|
|
|
\begin{verbatim}
|
|
von_Prep : Prep ;
|
|
zu_Prep : Prep ;
|
|
\end{verbatim}
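Prepositions formed in this way are the arguments expected by the relational noun functions, as this sketch shows. The module and oper names are hypothetical; the case assigned to each preposition is supplied by the lexicon writer.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExamplePrepGer = open ParadigmsGer, CatGer in {
    oper
      mit_Prep      : Prep = mkPrep "mit" dative ;
      nach_Prep     : Prep = mkPrep "nach" dative ;
      bruder_N2     : N2   = vonN2 (reg2N "Bruder" "Brüder" masculine) ;
      verbindung_N3 : N3   = mkN3 (regN "Verbindung") von_Prep nach_Prep ;
  }
\end{verbatim}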
|
|
|
|
\subsubsubsection{Verbs}
|
|
The worst-case constructor needs six forms:
|
|
|
|
\begin{itemize}
|
|
\item Infinitive,
|
|
\item 3p sg pres. indicative,
|
|
\item 2p sg imperative,
|
|
\item 1/3p sg imperfect indicative,
|
|
\item 1/3p sg imperfect subjunctive (because this uncommon form can have umlaut)
|
|
\item the perfect participle
|
|
\end{itemize}
|
|
|
|
\begin{verbatim}
|
|
mkV : (x1,_,_,_,_,x6 : Str) -> V ; -- geben, gibt, gib, gab, gäbe, gegeben
|
|
\end{verbatim}
|
|
|
|
Weak verbs are sometimes called regular verbs.
|
|
|
|
\begin{verbatim}
|
|
regV : Str -> V ; -- führen
|
|
\end{verbatim}
|
|
|
|
Irregular verbs use Ablaut and, in the worst cases, also Umlaut.
|
|
|
|
\begin{verbatim}
|
|
irregV : (x1,_,_,_,x5 : Str) -> V ; -- sehen, sieht, sah, sähe, gesehen
|
|
\end{verbatim}
|
|
|
|
To remove the past participle prefix \textit{ge}, e.g. for the verbs
|
|
prefixed by \textit{be-, ver-}.
|
|
|
|
\begin{verbatim}
|
|
no_geV : V -> V ;
|
|
\end{verbatim}
|
|
|
|
To add a movable (separable) prefix, e.g. \textit{auf(fassen)}.
|
|
|
|
\begin{verbatim}
|
|
prefixV : Str -> V -> V ;
|
|
\end{verbatim}
|
|
|
|
To change the auxiliary from \textit{haben} (default) to \textit{sein} and
|
|
vice-versa.
|
|
|
|
\begin{verbatim}
|
|
seinV : V -> V ;
|
|
habenV : V -> V ;
|
|
\end{verbatim}
|
|
|
|
Reflexive verbs can take reflexive pronouns of different cases.
|
|
|
|
\begin{verbatim}
|
|
reflV : V -> Case -> V ;
|
|
\end{verbatim}
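The verb constructors and modifiers can be combined as in the following sketch. The module name \texttt{ExampleVerbsGer} and the oper names are hypothetical, and the verbs other than \textit{führen}, \textit{sehen}, \textit{geben} and \textit{auffassen} are illustrative choices.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleVerbsGer = open ParadigmsGer, CatGer in {
    oper
      fuehren_V   : V = regV "führen" ;
      sehen_V     : V = irregV "sehen" "sieht" "sah" "sähe" "gesehen" ;
      geben_V     : V = mkV "geben" "gibt" "gib" "gab" "gäbe" "gegeben" ;
      verkaufen_V : V = no_geV (regV "verkaufen") ;      -- no ge in the participle
      auffassen_V : V = prefixV "auf" (regV "fassen") ;  -- separable prefix
      gehen_V     : V = seinV (irregV "gehen" "geht" "ging" "ginge" "gegangen") ;
      freuen_V    : V = reflV (regV "freuen") accusative ;
  }
\end{verbatim}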
|
|
|
|
\subsubsubsection{Two-place verbs}
|
|
Two-place verbs need a preposition, except the special case with direct object
|
|
(accusative, transitive verbs). There is also a case for dative objects.
|
|
|
|
\begin{verbatim}
|
|
mkV2 : V -> Prep -> V2 ;
|
|
|
|
dirV2 : V -> V2 ;
|
|
datV2 : V -> V2 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Three-place verbs}
|
|
Three-place (ditransitive) verbs need two prepositions, of which
|
|
the first one or both can be absent.
|
|
|
|
\begin{verbatim}
|
|
mkV3 : V -> Prep -> Prep -> V3 ; -- speak, with, about
|
|
dirV3 : V -> Prep -> V3 ; -- give,_,to
|
|
accdatV3 : V -> V3 ; -- give,_,_
|
|
\end{verbatim}
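A sketch of two- and three-place verbs, assuming the signatures above. The module and oper names are hypothetical; the verb forms are written out in place so that the module does not depend on any other lexicon.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleVerbArgsGer = open ParadigmsGer, CatGer in {
    oper
      sehen_V2  : V2 = dirV2 (irregV "sehen" "sieht" "sah" "sähe" "gesehen") ;
      helfen_V2 : V2 = datV2 (irregV "helfen" "hilft" "half" "hülfe" "geholfen") ;
      warten_V2 : V2 = mkV2 (regV "warten") (mkPrep "auf" accusative) ;
      geben_V3  : V3 = accdatV3 (mkV "geben" "gibt" "gib" "gab" "gäbe" "gegeben") ;
  }
\end{verbatim}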
|
|
|
|
\subsubsubsection{Other complement patterns}
|
|
Verbs and adjectives can take complements such as sentences,
|
|
questions, verb phrases, and adjectives.
|
|
|
|
\begin{verbatim}
|
|
mkV0 : V -> V0 ;
|
|
mkVS : V -> VS ;
|
|
mkV2S : V -> Prep -> V2S ;
|
|
mkVV : V -> VV ;
|
|
mkV2V : V -> Prep -> V2V ;
|
|
mkVA : V -> VA ;
|
|
mkV2A : V -> Prep -> V2A ;
|
|
mkVQ : V -> VQ ;
|
|
mkV2Q : V -> Prep -> V2Q ;
|
|
|
|
mkAS : A -> AS ;
|
|
mkA2S : A -> Prep -> A2S ;
|
|
mkAV : A -> AV ;
|
|
mkA2V : A -> Prep -> A2V ;
|
|
\end{verbatim}
|
|
|
|
Notice: categories \texttt{V2S, V2V, V2A, V2Q} are in v 1.0 treated
|
|
just as synonyms of \texttt{V2}, and the second argument is given
|
|
as an adverb. Likewise \texttt{AS, A2S, AV, A2V} are just \texttt{A}.
|
|
\texttt{V0} is just \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
V0, V2S, V2V, V2A, V2Q : Type ;
|
|
AS, A2S, AV, A2V : Type ;
|
|
\end{verbatim}
|
|
|
|
\commOut{Produced by
|
|
gfdoc - a rudimentary GF document generator.
|
|
(c) Aarne Ranta (\htmladdnormallink{aarne@cs.chalmers.se}{mailto:aarne@cs.chalmers.se}) 2002 under GNU GPL.}
|
|
|
|
|
|
|
|
\# -path=.:../romance:../common:../abstract:../../prelude
|
|
|
|
|
|
\subsubsection{Italian Lexical Paradigms}
|
|
Aarne Ranta 2003
|
|
|
|
This is an API to the user of the resource grammar
|
|
for adding lexical items. It gives functions for forming
|
|
expressions of open categories: nouns, adjectives, verbs.
|
|
|
|
Closed categories (determiners, pronouns, conjunctions) are
|
|
accessed through the resource syntax API, \texttt{Structural.gf}.
|
|
|
|
The main difference with \texttt{MorphoIta.gf} is that the types
|
|
referred to are compiled resource grammar types. We have moreover
|
|
had the design principle of always having existing forms, rather
|
|
than stems, as string arguments of the paradigms.
|
|
|
|
The structure of functions for each word class \texttt{C} is the following:
|
|
first we give a handful of patterns that aim to cover all
|
|
regular cases. Then we give a worst-case function \texttt{mkC}, which serves as an
|
|
escape to construct the most irregular words of type \texttt{C}.
|
|
However, this function should only seldom be needed: the module
\texttt{BeschIta} gives the conjugation patterns of the \textit{Bescherelle}
book, including the irregular verbs.
|
|
|
|
\begin{verbatim}
|
|
resource ParadigmsIta =
|
|
open
|
|
(Predef=Predef),
|
|
Prelude,
|
|
CommonRomance,
|
|
ResIta,
|
|
MorphoIta,
|
|
BeschIta,
|
|
CatIta in {
|
|
|
|
flags optimize=all ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Parameters}
|
|
To abstract over gender names, we define the following identifiers.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
Gender : Type ;
|
|
|
|
masculine : Gender ;
|
|
feminine : Gender ;
|
|
\end{verbatim}
|
|
|
|
To abstract over number names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Number : Type ;
|
|
|
|
singular : Number ;
|
|
plural : Number ;
|
|
\end{verbatim}
|
|
|
|
Prepositions used in many-argument functions are either strings
|
|
(including the 'accusative' empty string) or strings that
|
|
amalgamate with the following word (the 'genitive' \textit{di} and the
'dative' \textit{a}).
|
|
|
|
\begin{verbatim}
|
|
Preposition : Type ;
|
|
|
|
accusative : Preposition ;
|
|
genitive : Preposition ;
|
|
dative : Preposition ;
|
|
|
|
mkPreposition : Str -> Preposition ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Nouns}
|
|
Worst case: give both forms (singular and plural) and the gender.
|
|
|
|
\begin{verbatim}
|
|
  mkN  : (uomo,uomini : Str) -> Gender -> N ;
|
|
\end{verbatim}
|
|
|
|
The regular function takes the singular form,
|
|
and computes the plural and the gender by a heuristic.
|
|
The heuristic says that the gender is feminine for nouns
|
|
ending with \textit{a}, and masculine for all other words.
|
|
|
|
\begin{verbatim}
|
|
regN : Str -> N ;
|
|
\end{verbatim}
|
|
|
|
To force a different gender, use one of the following functions.
|
|
|
|
\begin{verbatim}
|
|
mascN : N -> N ;
|
|
femN : N -> N ;
|
|
\end{verbatim}
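The noun constructors and the gender overrides can be sketched as follows. The module name \texttt{ExampleNounsIta} and the oper names are hypothetical; the comments restate what the gender heuristic described above would guess.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleNounsIta = open ParadigmsIta, CatIta in {
    oper
      casa_N  : N = regN "casa" ;                    -- ends with a: feminine
      libro_N : N = regN "libro" ;                   -- masculine by default
      mano_N  : N = femN (regN "mano") ;             -- feminine despite the ending
      uomo_N  : N = mkN "uomo" "uomini" masculine ;  -- irregular plural
  }
\end{verbatim}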
|
|
|
|
\subsubsubsection{Compound nouns}
|
|
Some compound nouns have a first part that is inflected as a noun and
a second part that is not inflected, e.g. \textit{numero di telefono}.
|
|
They could be formed in syntax, but we give a shortcut here since
|
|
they are frequent in lexica.
|
|
|
|
\begin{verbatim}
|
|
compN : N -> Str -> N ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Relational nouns}
|
|
Relational nouns (\textit{figlio di x}) need a case and a preposition.
|
|
|
|
\begin{verbatim}
|
|
mkN2 : N -> Preposition -> N2 ;
|
|
\end{verbatim}
|
|
|
|
The most common cases are the genitive \textit{di} and the dative \textit{a},
|
|
with the empty preposition.
|
|
|
|
\begin{verbatim}
|
|
diN2 : N -> N2 ;
|
|
aN2 : N -> N2 ;
|
|
\end{verbatim}
|
|
|
|
Three-place relational nouns (\textit{la connessione di x a y}) need two prepositions.
|
|
|
|
\begin{verbatim}
|
|
mkN3 : N -> Preposition -> Preposition -> N3 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Relational common noun phrases}
|
|
In some cases, you may want to make a complex \texttt{CN} into a
|
|
relational noun (e.g. \textit{the old town hall of}). However, \texttt{N2} and
|
|
\texttt{N3} are purely lexical categories. But you can use the \texttt{AdvCN}
|
|
and \texttt{PrepNP} constructions to build phrases like this.
|
|
|
|
\subsubsubsection{Proper names and noun phrases}
|
|
Proper names need a string and a gender.
|
|
|
|
\begin{verbatim}
|
|
mkPN : Str -> Gender -> PN ; -- Jean
|
|
\end{verbatim}
|
|
|
|
To form a noun phrase that can also be plural,
|
|
you can use the worst-case function.
|
|
|
|
\begin{verbatim}
|
|
mkNP : Str -> Gender -> Number -> NP ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectives}
|
|
Non-comparison one-place adjectives need five forms in the worst
case (masc and fem singular, masc and fem plural, adverbial).
|
|
|
|
\begin{verbatim}
|
|
mkA : (solo,sola,soli,sole, solamente : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
For regular adjectives, all other forms are derived from the
|
|
masculine singular.
|
|
|
|
\begin{verbatim}
|
|
regA : Str -> A ;
|
|
\end{verbatim}
|
|
|
|
These functions create postfix adjectives. To switch
|
|
them to prefix ones (i.e. ones placed before the noun in
|
|
modification, as in \textit{petite maison}), the following function is
|
|
provided.
|
|
|
|
\begin{verbatim}
|
|
prefA : A -> A ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place adjectives}
|
|
Two-place adjectives need a preposition for their second argument.
|
|
|
|
\begin{verbatim}
|
|
mkA2 : A -> Preposition -> A2 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Comparison adjectives}
|
|
Comparison adjectives are in the worst case built from two
|
|
adjectives: the positive (\textit{buono}), and the comparative (\textit{migliore}).
|
|
|
|
\begin{verbatim}
|
|
mkADeg : A -> A -> A ;
|
|
\end{verbatim}
|
|
|
|
If comparison is formed by \textit{più}, as usual in Italian,
|
|
the following pattern is used:
|
|
|
|
\begin{verbatim}
|
|
compADeg : A -> A ;
|
|
\end{verbatim}
|
|
|
|
The regular pattern is the same as \texttt{regA} for plain adjectives,
|
|
with comparison by \textit{più}.
|
|
|
|
\begin{verbatim}
|
|
regADeg : Str -> A ;
|
|
\end{verbatim}
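A sketch of the adjective constructors, with and without comparison, assuming the signatures above. The module and oper names are hypothetical.

\begin{verbatim}
  -- Hypothetical lexicon module using the signatures listed above.
  resource ExampleAdjIta = open ParadigmsIta, CatIta in {
    oper
      solo_A    : A = mkA "solo" "sola" "soli" "sole" "solamente" ;
      nero_A    : A = regA "nero" ;
      piccolo_A : A = prefA (regA "piccolo") ;                   -- before the noun
      buono_A   : A = mkADeg (regA "buono") (regA "migliore") ;  -- suppletive
      caro_A    : A = regADeg "caro" ;                           -- comparison with più
  }
\end{verbatim}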
|
|
|
|
\subsubsubsection{Adverbs}
|
|
Adverbs are not inflected. Most lexical ones have position
|
|
after the verb.
|
|
|
|
\begin{verbatim}
|
|
mkAdv : Str -> Adv ;
|
|
\end{verbatim}
|
|
|
|
Some appear next to the verb (e.g. \textit{sempre}).
|
|
|
|
\begin{verbatim}
|
|
mkAdV : Str -> AdV ;
|
|
\end{verbatim}
|
|
|
|
Adverbs modifying adjectives and sentences can also be formed.
|
|
|
|
\begin{verbatim}
|
|
mkAdA : Str -> AdA ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verbs}
|
|
Regular verbs are ones with the infinitive \textit{er} or \textit{ir}, the
|
|
latter with plural present indicative forms as \textit{finissons}.
|
|
The regular verb function recognizes these endings, as well as the
first-conjugation variations among
\textit{aimer, céder, placer, peser, jeter, manger, assiéger, payer}.
|
|
|
|
\begin{verbatim}
|
|
regV : Str -> V ;
|
|
\end{verbatim}
|
|
|
|
The module \texttt{BeschIta} gives all the patterns of the \textit{Bescherelle}
|
|
book. To use them in the category \texttt{V}, wrap them with the function \texttt{verboV}.
|
|
|
|
\begin{verbatim}
|
|
verboV : Verbo -> V ;
|
|
\end{verbatim}
|
|
|
|
The function \texttt{regV} gives all verbs the compound auxiliary \textit{avere}.
|
|
To change it to \textit{essere}, use the following function.
|
|
Reflexive implies \textit{essere}.
|
|
|
|
\begin{verbatim}
|
|
essereV : V -> V ;
|
|
reflV : V -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place verbs}
|
|
Two-place verbs need a preposition, except the special case with direct object
|
|
(transitive verbs). Notice that a particle comes from the \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
mkV2 : V -> Preposition -> V2 ;
|
|
|
|
dirV2 : V -> V2 ;
|
|
\end{verbatim}
|
|
|
|
You can reuse a \texttt{V2} verb in \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
v2V : V2 -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Three-place verbs}
|
|
Three-place (ditransitive) verbs need two prepositions, of which
|
|
the first one or both can be absent.
|
|
|
|
\begin{verbatim}
|
|
mkV3 : V -> Preposition -> Preposition -> V3 ; -- parler, à, de
|
|
dirV3 : V -> Preposition -> V3 ; -- donner,_,à
|
|
dirdirV3 : V -> V3 ; -- donner,_,_
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Other complement patterns}
|
|
Verbs and adjectives can take complements such as sentences,
|
|
questions, verb phrases, and adjectives.
|
|
|
|
\begin{verbatim}
|
|
mkV0 : V -> V0 ;
|
|
mkVS : V -> VS ;
|
|
mkV2S : V -> Preposition -> V2S ;
|
|
mkVV : V -> VV ; -- plain infinitive: "je veux parler"
|
|
deVV : V -> VV ; -- "j'essaie de parler"
|
|
aVV : V -> VV ; -- "j'arrive à parler"
|
|
mkV2V : V -> Preposition -> Preposition -> V2V ;
|
|
mkVA : V -> VA ;
|
|
mkV2A : V -> Preposition -> Preposition -> V2A ;
|
|
mkVQ : V -> VQ ;
|
|
mkV2Q : V -> Preposition -> V2Q ;
|
|
|
|
mkAS : A -> AS ;
|
|
mkA2S : A -> Preposition -> A2S ;
|
|
mkAV : A -> Preposition -> AV ;
|
|
mkA2V : A -> Preposition -> Preposition -> A2V ;
|
|
\end{verbatim}
|
|
|
|
Notice: categories \texttt{V2S, V2V, V2Q} are in v 1.0 treated
|
|
just as synonyms of \texttt{V2}, and the second argument is given
|
|
as an adverb. Likewise \texttt{AS, A2S, AV, A2V} are just \texttt{A}.
|
|
\texttt{V0} is just \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
V0, V2S, V2V, V2Q : Type ;
|
|
AS, A2S, AV, A2V : Type ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{The definitions of the paradigms}
|
|
The definitions should not bother the user of the API. So they are
|
|
hidden from the document.
|
|
|
|
|
|
\commOut{Produced by
|
|
gfdoc - a rudimentary GF document generator.
|
|
(c) Aarne Ranta (\htmladdnormallink{aarne@cs.chalmers.se}{mailto:aarne@cs.chalmers.se}) 2002 under GNU GPL.}
|
|
|
|
|
|
|
|
\# -path=.:../scandinavian:../common:../abstract:../../prelude
|
|
|
|
|
|
\subsubsection{Norwegian Lexical Paradigms}
|
|
Aarne Ranta 2003
|
|
|
|
This is an API to the user of the resource grammar
|
|
for adding lexical items. It gives functions for forming
|
|
expressions of open categories: nouns, adjectives, verbs.
|
|
|
|
Closed categories (determiners, pronouns, conjunctions) are
|
|
accessed through the resource syntax API, \texttt{Structural.gf}.
|
|
|
|
The main difference with \texttt{MorphoNor.gf} is that the types
|
|
referred to are compiled resource grammar types. We have moreover
|
|
had the design principle of always having existing forms, rather
|
|
than stems, as string arguments of the paradigms.
|
|
|
|
The structure of functions for each word class \texttt{C} is the following:
|
|
first we give a handful of patterns that aim to cover all
|
|
regular cases. Then we give a worst-case function \texttt{mkC}, which serves as an
|
|
escape to construct the most irregular words of type \texttt{C}.
|
|
However, this function should only seldom be needed: we have a
separate module \texttt{IrregNor}, which gives an extensive list of
irregular verbs.
|
|
|
|
\begin{verbatim}
|
|
resource ParadigmsNor =
|
|
open
|
|
(Predef=Predef),
|
|
Prelude,
|
|
CommonScand,
|
|
ResNor,
|
|
MorphoNor,
|
|
CatNor in {
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Parameters}
|
|
To abstract over gender names, we define the following identifiers.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
Gender : Type ;
|
|
|
|
masculine : Gender ;
|
|
feminine : Gender ;
|
|
neutrum : Gender ;
|
|
\end{verbatim}
|
|
|
|
To abstract over number names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Number : Type ;
|
|
|
|
singular : Number ;
|
|
plural : Number ;
|
|
\end{verbatim}
|
|
|
|
To abstract over case names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Case : Type ;
|
|
|
|
nominative : Case ;
|
|
genitive : Case ;
|
|
\end{verbatim}
|
|
|
|
Prepositions used in many-argument functions are just strings.
|
|
|
|
\begin{verbatim}
|
|
Preposition : Type = Str ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Nouns}
|
|
Worst case: give all four forms. The gender is computed from the
|
|
last letter of the second form (if \textit{n}, then \texttt{utrum}, otherwise \texttt{neutrum}).
|
|
|
|
\begin{verbatim}
|
|
mkN : (dreng,drengen,drenger,drengene : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
The regular function takes the singular indefinite form
and computes the other forms and the gender by a heuristic.
The heuristic is that nouns ending in \textit{e} are feminine like \textit{kvinne},
and all others are masculine like \textit{bil}.
If in doubt, use the \texttt{cc} command to test!
|
|
|
|
\begin{verbatim}
|
|
regN : Str -> N ;
|
|
\end{verbatim}
|
|
|
|
Giving gender manually makes the heuristic more reliable.
|
|
|
|
\begin{verbatim}
|
|
regGenN : Str -> Gender -> N ;
|
|
\end{verbatim}
|
|
|
|
This function takes the singular indefinite and definite forms; the
|
|
gender is computed from the definite form.
|
|
|
|
\begin{verbatim}
|
|
mk2N : (bil,bilen : Str) -> N ;
|
|
\end{verbatim}
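The noun paradigms above could be used in a small lexicon along the following lines. This is a sketch only: the module name \texttt{LexNounsNor} and the constant names are hypothetical, and the words other than \textit{bil}, \textit{kvinne} and \textit{dreng} are chosen here for illustration.

\begin{verbatim}
  resource LexNounsNor = open ParadigmsNor, CatNor in {
    oper
      bil_N    : N = regN "bil" ;             -- heuristic: masculine
      kvinne_N : N = regN "kvinne" ;          -- heuristic: feminine (ends in e)
      hus_N    : N = regGenN "hus" neutrum ;  -- gender given explicitly
      hus2_N   : N = mk2N "hus" "huset" ;     -- gender computed from the definite form
      dreng_N  : N = mkN "dreng" "drengen" "drenger" "drengene" ;  -- worst case
  }
\end{verbatim}

Each constant can then be inspected with the \texttt{cc} command to check that the heuristics produced the intended forms.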
|
|
|
|
\subsubsubsection{Compound nouns}
|
|
All the functions above work just as well to form compound nouns,
such as \textit{fotball}.
|
|
|
|
\subsubsubsection{Relational nouns}
|
|
Relational nouns (\textit{daughter of x}) need a preposition.
|
|
|
|
\begin{verbatim}
|
|
mkN2 : N -> Preposition -> N2 ;
|
|
\end{verbatim}
|
|
|
|
The most common preposition is \textit{av}, and the following is a
|
|
shortcut for regular, \texttt{nonhuman} relational nouns with \textit{av}.
|
|
|
|
\begin{verbatim}
|
|
regN2 : Str -> Gender -> N2 ;
|
|
\end{verbatim}
|
|
|
|
Use the function \texttt{mkPreposition} or see the section on prepositions below to
|
|
form other prepositions.
|
|
|
|
Three-place relational nouns (\textit{the connection from x to y}) need two prepositions.
|
|
|
|
\begin{verbatim}
|
|
mkN3 : N -> Preposition -> Preposition -> N3 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Relational common noun phrases}
|
|
In some cases, you may want to make a complex \texttt{CN} into a
|
|
relational noun (e.g. \textit{the old town hall of}). However, \texttt{N2} and
|
|
\texttt{N3} are purely lexical categories. But you can use the \texttt{AdvCN}
|
|
and \texttt{PrepNP} constructions to build phrases like this.
|
|
|
|
\subsubsubsection{Proper names and noun phrases}
|
|
Proper names, with a regular genitive, are formed as follows:
|
|
|
|
\begin{verbatim}
|
|
regPN : Str -> Gender -> PN ; -- John, John's
|
|
\end{verbatim}
|
|
|
|
Sometimes you can reuse a common noun as a proper name, e.g. \textit{Bank}.
|
|
|
|
\begin{verbatim}
|
|
nounPN : N -> PN ;
|
|
\end{verbatim}
|
|
|
|
To form a noun phrase that can also be plural and have an irregular
|
|
genitive, you can use the worst-case function.
|
|
|
|
\begin{verbatim}
|
|
mkNP : Str -> Str -> Number -> Gender -> NP ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectives}
|
|
Non-comparison one-place adjectives need three forms:
|
|
|
|
\begin{verbatim}
|
|
mkA : (galen,galet,galne : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
For regular adjectives, the other forms are derived.
|
|
|
|
\begin{verbatim}
|
|
regA : Str -> A ;
|
|
\end{verbatim}
|
|
|
|
In most cases, two forms are enough.
|
|
|
|
\begin{verbatim}
|
|
mk2A : (stor,stort : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place adjectives}
|
|
Two-place adjectives need a preposition for their second argument.
|
|
|
|
\begin{verbatim}
|
|
mkA2 : A -> Preposition -> A2 ;
|
|
\end{verbatim}
|
|
|
|
Comparison adjectives may need as many as five forms.
|
|
|
|
\begin{verbatim}
|
|
mkADeg : (stor,stort,store,storre,storst : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
The regular pattern works for many adjectives, e.g. those ending
|
|
with \textit{ig}.
|
|
|
|
\begin{verbatim}
|
|
regADeg : Str -> A ;
|
|
\end{verbatim}
|
|
|
|
Just the comparison forms can be irregular.
|
|
|
|
\begin{verbatim}
|
|
irregADeg : (tung,tyngre,tyngst : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
Sometimes just the positive forms are irregular.
|
|
|
|
\begin{verbatim}
|
|
mk3ADeg : (galen,galet,galna : Str) -> A ;
|
|
mk2ADeg : (bred,bredt : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
If comparison is formed by \textit{mer}, \textit{mest}, as in general for
long adjectives, the following pattern is used:
|
|
|
|
\begin{verbatim}
|
|
compoundA : A -> A ; -- -/mer/mest norsk
|
|
\end{verbatim}
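A hedged sketch of how the adjective paradigms might be combined in a lexicon follows. The module name \texttt{LexAdjNor}, the constant names, and the two-place example \textit{gift med} are assumptions made here for illustration; the other words are taken from the signatures above.

\begin{verbatim}
  resource LexAdjNor = open ParadigmsNor, CatNor in {
    oper
      stor_A  : A  = mk2A "stor" "stort" ;                -- two positive forms
      tung_A  : A  = irregADeg "tung" "tyngre" "tyngst" ; -- irregular comparison
      norsk_A : A  = compoundA (regA "norsk") ;           -- compared with mer, mest
      gift_A2 : A2 = mkA2 (regA "gift") "med" ;           -- preposition as a string
  }
\end{verbatim}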
|
|
|
|
\subsubsubsection{Adverbs}
|
|
Adverbs are not inflected. Most lexical ones have position
|
|
after the verb. Some can be preverbal (e.g. \textit{always}).
|
|
|
|
\begin{verbatim}
|
|
mkAdv : Str -> Adv ;
|
|
mkAdV : Str -> AdV ;
|
|
\end{verbatim}
|
|
|
|
Adverbs modifying adjectives and sentences can also be formed.
|
|
|
|
\begin{verbatim}
|
|
mkAdA : Str -> AdA ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Prepositions}
|
|
A preposition is just a string.
|
|
|
|
\begin{verbatim}
|
|
mkPreposition : Str -> Preposition ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verbs}
|
|
The worst case needs six forms.
|
|
|
|
\begin{verbatim}
|
|
mkV : (spise,spiser,spises,spiste,spist,spis : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
The 'regular verb' function is the first conjugation.
|
|
|
|
\begin{verbatim}
|
|
regV : (snakke : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
The almost regular verb function needs the infinitive and the preteritum.
|
|
|
|
\begin{verbatim}
|
|
mk2V : (leve,levde : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
There is an extensive list of irregular verbs in the module \texttt{IrregNor}.
|
|
In practice, it is enough to give three forms, as in school books.
|
|
|
|
\begin{verbatim}
|
|
irregV : (drikke, drakk, drukket : Str) -> V ;
|
|
\end{verbatim}
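The three verb functions most often needed can be illustrated with the example words from the signatures above. The module name \texttt{LexVerbsNor} and the constant names are hypothetical.

\begin{verbatim}
  resource LexVerbsNor = open ParadigmsNor, CatNor in {
    oper
      snakke_V : V = regV "snakke" ;                      -- first conjugation
      leve_V   : V = mk2V "leve" "levde" ;                -- infinitive and preteritum
      drikke_V : V = irregV "drikke" "drakk" "drukket" ;  -- three principal forms
  }
\end{verbatim}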
|
|
|
|
\subsubsubsection{Verbs with \textit{være} as auxiliary}
|
|
By default, the auxiliary is \textit{have}. This function changes it to \textit{være}.
|
|
|
|
\begin{verbatim}
|
|
vaereV : V -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verbs with a particle.}
|
|
The particle, such as in \textit{switch on}, is given as a string.
|
|
|
|
\begin{verbatim}
|
|
partV : V -> Str -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Deponent verbs.}
|
|
Some words are used in passive forms only, e.g. \textit{hoppas}, some as
|
|
reflexive e.g. \textit{ångra sig}.
|
|
|
|
\begin{verbatim}
|
|
depV : V -> V ;
|
|
reflV : V -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place verbs}
|
|
Two-place verbs need a preposition, except in the special case of a direct object
(transitive verbs). Notice that a particle comes from the \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
mkV2 : V -> Preposition -> V2 ;
|
|
|
|
dirV2 : V -> V2 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Three-place verbs}
|
|
Three-place (ditransitive) verbs need two prepositions, of which
|
|
the first one or both can be absent.
|
|
|
|
\begin{verbatim}
|
|
mkV3 : V -> Str -> Str -> V3 ; -- speak, with, about
|
|
dirV3 : V -> Str -> V3 ; -- give,_,to
|
|
dirdirV3 : V -> V3 ; -- give,_,_
|
|
\end{verbatim}
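As a sketch of how valency is added on top of a \texttt{V}, the following entries use the two- and three-place functions above. The module name \texttt{LexValencyNor} and the constant names are hypothetical, and the prepositions \textit{med} and \textit{om} are our rendering of the \textit{speak, with, about} comment.

\begin{verbatim}
  resource LexValencyNor = open ParadigmsNor, CatNor in {
    oper
      drikke_V2 : V2 = dirV2 (irregV "drikke" "drakk" "drukket") ; -- direct object
      snakke_V2 : V2 = mkV2 (regV "snakke") "med" ;                -- one preposition
      snakke_V3 : V3 = mkV3 (regV "snakke") "med" "om" ;           -- two prepositions
  }
\end{verbatim}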
|
|
|
|
\subsubsubsection{Other complement patterns}
|
|
Verbs and adjectives can take complements such as sentences,
|
|
questions, verb phrases, and adjectives.
|
|
|
|
\begin{verbatim}
|
|
mkV0 : V -> V0 ;
|
|
mkVS : V -> VS ;
|
|
mkV2S : V -> Str -> V2S ;
|
|
mkVV : V -> VV ;
|
|
mkV2V : V -> Str -> Str -> V2V ;
|
|
mkVA : V -> VA ;
|
|
mkV2A : V -> Str -> V2A ;
|
|
mkVQ : V -> VQ ;
|
|
mkV2Q : V -> Str -> V2Q ;
|
|
|
|
mkAS : A -> AS ;
|
|
mkA2S : A -> Str -> A2S ;
|
|
mkAV : A -> AV ;
|
|
mkA2V : A -> Str -> A2V ;
|
|
\end{verbatim}
|
|
|
|
Notice: categories \texttt{V2S, V2V, V2A, V2Q} are in v 1.0 treated
|
|
just as synonyms of \texttt{V2}, and the second argument is given
|
|
as an adverb. Likewise \texttt{AS, A2S, AV, A2V} are just \texttt{A}.
|
|
\texttt{V0} is just \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
V0, V2S, V2V, V2A, V2Q : Type ;
|
|
AS, A2S, AV, A2V : Type ;
|
|
\end{verbatim}
|
|
|
|
\commOut{Produced by
|
|
gfdoc - a rudimentary GF document generator.
|
|
(c) Aarne Ranta (\htmladdnormallink{aarne@cs.chalmers.se}{mailto:aarne@cs.chalmers.se}) 2002 under GNU GPL.}
|
|
|
|
|
|
|
|
\# -path=.:../abstract:../../prelude:../common
|
|
|
|
|
|
\subsubsection{Russian Lexical Paradigms}
|
|
Janna Khegai 2003--2005
|
|
|
|
This is an API to the user of the resource grammar
|
|
for adding lexical items. It gives functions for forming
|
|
expressions of open categories: nouns, adjectives, verbs.
|
|
|
|
Closed categories (determiners, pronouns, conjunctions) are
|
|
accessed through the resource syntax API, \texttt{Structural.gf}.
|
|
|
|
The main difference with \texttt{MorphoRus.gf} is that the types
|
|
referred to are compiled resource grammar types. We have moreover
|
|
had the design principle of always having existing forms, rather
|
|
than stems, as string arguments of the paradigms.
|
|
|
|
The structure of functions for each word class \texttt{C} is the following:
|
|
first we give a handful of patterns that aim to cover all
|
|
regular cases. Then we give a worst-case function \texttt{mkC}, which serves as an
|
|
escape to construct the most irregular words of type \texttt{C}.
|
|
However, this function should only seldom be needed: we have a
|
|
separate module \texttt{IrregularEng}, which covers all irregularly inflected
|
|
words.
|
|
|
|
The following modules are presupposed:
|
|
|
|
\begin{verbatim}
|
|
resource ParadigmsRus = open
|
|
(Predef=Predef),
|
|
Prelude,
|
|
MorphoRus,
|
|
CatRus,
|
|
NounRus
|
|
in {
|
|
|
|
flags coding=utf8 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Parameters}
|
|
To abstract over gender names, we define the following identifiers.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
Gender : Type ;
|
|
masculine : Gender ;
|
|
feminine : Gender ;
|
|
neuter : Gender ;
|
|
\end{verbatim}
|
|
|
|
To abstract over case names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Case : Type ;
|
|
|
|
nominative : Case ;
|
|
genitive : Case ;
|
|
dative : Case ;
|
|
accusative : Case ;
|
|
instructive : Case ;
|
|
prepositional : Case ;
|
|
\end{verbatim}
|
|
|
|
In some textbooks written in English, the accusative case
is put in second place. However, we follow the case order
standard for Russian textbooks.

To abstract over number names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Number : Type ;
|
|
|
|
singular : Number ;
|
|
plural : Number ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Nouns}
|
|
Best case: indeclinable nouns: \textit{кофе}, \textit{пальто}, \textit{ВУЗ}.
|
|
|
|
\begin{verbatim}
|
|
Animacy: Type ;
|
|
|
|
animate: Animacy;
|
|
inanimate: Animacy;
|
|
|
|
mkIndeclinableNoun: Str -> Gender -> Animacy -> N ;
|
|
\end{verbatim}
|
|
|
|
Worst case: give six singular forms
(Nominative, Genitive, Dative, Accusative, Instructive and Prepositional),
the corresponding six plural forms, and the gender.
Maybe the number of forms needed can be reduced,
but this requires a separate investigation.
The Animacy parameter (determining whether the Accusative form is equal
to the Nominative or the Genitive one) is actually of no help,
since there are a lot of exceptions and the gain is just one form less.
|
|
|
|
\begin{verbatim}
|
|
mkN : (_,_,_,_,_,_,_,_,_,_,_,_ : Str) -> Gender -> Animacy -> N ;
|
|
|
|
  -- мужчина, мужчины, мужчине, мужчину, мужчиной, мужчине
  -- мужчины, мужчин, мужчинам, мужчин, мужчинами, мужчинах
|
|
\end{verbatim}
|
|
|
|
The regular function captures the variants for some common noun
endings listed below:
|
|
|
|
\begin{verbatim}
|
|
regN : Str -> N ;
|
|
\end{verbatim}
|
|
|
|
Here are some common patterns. The list is far from complete.
|
|
Feminine patterns.
|
|
|
|
\begin{verbatim}
|
|
  nMashina   : Str -> N ;    -- feminine, inanimate, ending with "-а", Inst -"машин-ой"
  nEdinica   : Str -> N ;    -- feminine, inanimate, ending with "-а", Inst -"единиц-ей"
  nZhenchina : Str -> N ;    -- feminine, animate, ending with "-a"
  nNoga      : Str -> N ;    -- feminine, inanimate, ending with "г_к_х-a"
  nMalyariya : Str -> N ;    -- feminine, inanimate, ending with "-ия"
  nTetya     : Str -> N ;    -- feminine, animate, ending with "-я"
  nBol       : Str -> N ;    -- feminine, inanimate, ending with "-ь"(soft sign)
|
|
\end{verbatim}
|
|
|
|
Neuter patterns.
|
|
|
|
\begin{verbatim}
|
|
nObezbolivauchee : Str -> N ; -- neutral, inanimate, ending with "-ee"
|
|
nProizvedenie : Str -> N ; -- neutral, inanimate, ending with "-e"
|
|
nChislo : Str -> N ; -- neutral, inanimate, ending with "-o"
|
|
  nZhivotnoe : Str -> N ;       -- neutral, ending with "-ое"
|
|
\end{verbatim}
|
|
|
|
Masculine patterns.
|
|
Ending with consonant:
|
|
|
|
\begin{verbatim}
|
|
nPepel : Str -> N ; -- masculine, inanimate, ending with "-ел"- "пеп-ла"
|
|
|
|
  nBrat    : Str -> N ;   -- animate, брат-ья
  nStul    : Str -> N ;   -- same as above, but inanimate
  nMalush  : Str -> N ;   -- малышей
  nPotolok : Str -> N ;   -- потол-ок - потол-ка
|
|
|
|
-- the next four differ in plural nominative and/or accusative form(s) :
|
|
nBank: Str -> N ; -- банк-и (Nom=Acc)
|
|
nStomatolog : Str -> N ; -- same as above, but animate
|
|
  nAdres   : Str -> N ;   -- адрес-а (Nom=Acc)
  nTelefon : Str -> N ;   -- телефон-ы (Nom=Acc)

  nNol     : Str -> N ;   -- masculine, inanimate, ending with "-ь" (soft sign)
  nUroven  : Str -> N ;   -- masculine, inanimate, ending with "-ень"
|
|
\end{verbatim}
|
|
|
|
Nouns used as functions need a preposition. The most common is with Genitive.
|
|
|
|
\begin{verbatim}
|
|
mkFun : N -> Prep -> N2 ;
|
|
mkN2 : N -> N2 ;
|
|
mkN3 : N -> Prep -> Prep -> N3 ;
|
|
\end{verbatim}
|
|
|
|
Proper names.
|
|
|
|
\begin{verbatim}
|
|
  mkPN : Str -> Gender -> Animacy -> PN ; -- "Иван", "Маша"
|
|
nounPN : N -> PN ;
|
|
\end{verbatim}
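The simplest of the noun functions can be illustrated with the indeclinable nouns mentioned above and a proper name. This is a sketch under assumptions: the module name \texttt{LexNounsRus} and the constant names are hypothetical, and the genders are supplied here by hand.

\begin{verbatim}
  resource LexNounsRus = open ParadigmsRus, CatRus in {
    flags coding=utf8 ;
    oper
      -- indeclinable nouns: one string, plus gender and animacy
      kofe_N   : N  = mkIndeclinableNoun "кофе" masculine inanimate ;
      paljto_N : N  = mkIndeclinableNoun "пальто" neuter inanimate ;
      -- proper name
      ivan_PN  : PN = mkPN "Иван" masculine animate ;
  }
\end{verbatim}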
|
|
|
|
On the top level, it is maybe \texttt{CN} that is used rather than \texttt{N}, and
|
|
\texttt{NP} rather than \texttt{PN}.
|
|
|
|
\begin{verbatim}
|
|
mkCN : N -> CN ;
|
|
mkNP : Str -> Gender -> Animacy -> NP ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectives}
|
|
Non-comparison (only positive degree) one-place adjectives need 28 (4 by 7)
forms in the worst case: one form for each of the four gender/number
columns (Masculine, Feminine, Neutral, Plural) in each of the seven case rows
(Nominative, Genitive, Dative, Accusative Inanimate, Accusative Animate,
Instructive, Prepositional).
Notice that the 4 short forms, which exist for some adjectives, are not included
in the current description; otherwise there would be 32 forms for the
positive degree.

\texttt{mkA : ( : Str) -> A ;}

The regular function captures the variants for some common adjective
endings listed below:
|
|
|
|
\begin{verbatim}
|
|
regA : Str -> Str -> A ;
|
|
\end{verbatim}
|
|
|
|
Invariable adjectives are a special case.
|
|
|
|
\begin{verbatim}
|
|
adjInvar : Str -> A ; -- khaki, mini, hindi, netto
|
|
\end{verbatim}
|
|
|
|
Some regular patterns depending on the ending.
|
|
|
|
\begin{verbatim}
|
|
  AStaruyj     : Str -> Str -> A ;        -- ending with "-ый"
  AMalenkij    : Str -> Str -> A ;        -- ending with "-ий", Gen - "маленьк-ого"
  AKhoroshij   : Str -> Str -> A ;        -- ending with "-ий", Gen - "хорош-его"
  AMolodoj     : Str -> Str -> A ;        -- ending with "-ой",
                                          -- plural - "молод-ые"
  AKakoj_Nibud : Str -> Str -> Str -> A ; -- ending with "-ой",
                                          -- plural - "как-ие"
|
|
\end{verbatim}
|
|
|
|
Two-place adjectives need a preposition and a case as extra arguments.
|
|
|
|
\begin{verbatim}
|
|
mkA2 : A -> Str -> Case -> A2 ; -- "делим на"
|
|
\end{verbatim}
|
|
|
|
Comparison adjectives need a positive adjective
(28 forms without short forms).
Taking only one comparative form (non-syntactic) and
only one superlative form (syntactic), we can produce the
comparison adjective with only one extra argument:
the non-syntactic comparative form.
Syntactic forms are based on the positive forms.

\texttt{mkADeg : A -> Str -> ADeg ;}

On the top level, there are adjectival phrases. The most common case is
just to use a one-place adjective.

\texttt{ap : A -> IsPostfixAdj -> AP ;}
|
|
|
|
\subsubsubsection{Adverbs}
|
|
Adverbs are not inflected. Most lexical ones have position
|
|
after the verb. Some can be preverbal (e.g. \textit{always}).
|
|
|
|
\begin{verbatim}
|
|
mkAdv : Str -> Adv ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verbs}
|
|
In our lexicon description (\textit{Verbum}) there are 62 forms:
2 (Voice) by \{ 1 (infinitive) + [2(number) by 3 (person)](imperative) +
[ [2(Number) by 3(Person)](present) + [2(Number) by 3(Person)](future) +
4(GenNum)(past) ](indicative)+ 4 (GenNum) (subjunctive) \}.
Participles (Present and Past) and Gerund forms are not included,
since they function more like Adjectives and Adverbs, respectively,
than like verbs. Aspect is regarded as an inherent parameter of a verb.
Notice that some forms are never used for some verbs. Actually,
the majority of verbs do not have many of the forms.
|
|
|
|
\begin{verbatim}
|
|
Voice: Type;
|
|
Aspect: Type;
|
|
\end{verbatim}
|
|
|
|
\texttt{Tense : Type ;}
|
|
|
|
\begin{verbatim}
|
|
Bool: Type;
|
|
Conjugation: Type ;
|
|
|
|
  first:   Conjugation; -- "гуля-Ешь, гуля-Ем"
  firstE:  Conjugation; -- Verbs with vowel "ё": "даёшь" (give), "пьёшь" (drink)
  second:  Conjugation; -- "вид-Ишь, вид-Им"
  mixed:   Conjugation; -- "хоч-Ешь - хот-Им"
|
|
dolzhen: Conjugation; -- irregular
|
|
|
|
true: Bool;
|
|
false: Bool;
|
|
|
|
active: Voice ;
|
|
passive: Voice ;
|
|
imperfective: Aspect;
|
|
perfective: Aspect ;
|
|
\end{verbatim}
|
|
|
|
\texttt{present : Tense ;}

\texttt{past : Tense ;}

The worst case needs 6 forms of the present tense in indicative mood
(\textit{я бегу}, \textit{ты бежишь}, \textit{он бежит}, \textit{мы бежим}, \textit{вы бежите}, \textit{они бегут}),
a past form (singular, masculine: \textit{я бежал}), an imperative form
(singular, second person: \textit{беги}), and an infinitive (\textit{бежать}).
Inherent aspect should also be specified.
|
|
|
|
\begin{verbatim}
|
|
mkVerbum : Aspect -> (_,_,_,_,_,_,_,_,_ : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
Common conjugation patterns are the two conjugations:
first (verbs ending with \textit{-ать/-ять}) and second (\textit{-ить/-еть}).
Instead of the 6 present forms of the worst case, we only need
a present stem and one ending (singular, first person):
\textit{я люб-лю}, \textit{я жд-у}, etc. To determine where the border
between stem and ending lies, it is sufficient to compare the
first person form with the second person form:
\textit{я люб-лю}, \textit{ты люб-ишь}. The stems should be the same.
So the definition for the verb \textit{любить} looks like:

\texttt{regV Imperfective Second "люб" "лю" "любил" "люби" "любить" ;}
|
|
|
|
\begin{verbatim}
|
|
regV :Aspect -> Conjugation -> (_,_,_,_,_ : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
For writing an application grammar one usually doesn't need
the whole inflection table, since each verb is used in
a particular context that determines some of the parameters
(Tense and Voice, while Aspect is fixed from the beginning) for a certain usage.
The \texttt{V} type has these parameters fixed.
We can extract the \texttt{V} from the lexicon.

\texttt{mkV : Verbum -> Voice -> V ;}

\texttt{mkPresentV : Verbum -> Voice -> V ;}

Two-place verbs, and the special case with direct object. Notice that
a particle can be included in a \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
  mkV2  : V -> Str -> Case -> V2 ;                -- "войти в дом"; "в", accusative
  mkV3  : V -> Str -> Str -> Case -> Case -> V3 ; -- "сложить письмо в конверт"
  dirV2 : V -> V2 ;                               -- "видеть", "любить"
|
|
tvDirDir : V -> V3 ;
|
|
\end{verbatim}
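As a sketch of how a verb entry and its transitive use might fit together, the following takes the five forms of \textit{любить} (present stem, first person ending, past, imperative, infinitive) and then adds a direct object. The module name \texttt{LexVerbsRus} and the constant names are hypothetical, and the lowercase constants \texttt{imperfective} and \texttt{second} are used on the assumption that they are the ones exported by the signatures above.

\begin{verbatim}
  resource LexVerbsRus = open ParadigmsRus, CatRus in {
    flags coding=utf8 ;
    oper
      -- aspect, conjugation, then the five string forms
      ljubitj_V  : V  = regV imperfective second
                          "люб" "лю" "любил" "люби" "любить" ;
      -- transitive use with a direct object
      ljubitj_V2 : V2 = dirV2 ljubitj_V ;
  }
\end{verbatim}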
|
|
|
|
\commOut{Produced by
|
|
gfdoc - a rudimentary GF document generator.
|
|
(c) Aarne Ranta (\htmladdnormallink{aarne@cs.chalmers.se}{mailto:aarne@cs.chalmers.se}) 2002 under GNU GPL.}
|
|
|
|
|
|
|
|
\# -path=.:../romance:../common:../abstract:../../prelude
|
|
|
|
|
|
\subsubsection{Spanish Lexical Paradigms}
|
|
Aarne Ranta 2003
|
|
|
|
This is an API to the user of the resource grammar
|
|
for adding lexical items. It gives functions for forming
|
|
expressions of open categories: nouns, adjectives, verbs.
|
|
|
|
Closed categories (determiners, pronouns, conjunctions) are
|
|
accessed through the resource syntax API, \texttt{Structural.gf}.
|
|
|
|
The main difference with \texttt{MorphoSpa.gf} is that the types
|
|
referred to are compiled resource grammar types. We have moreover
|
|
had the design principle of always having existing forms, rather
|
|
than stems, as string arguments of the paradigms.
|
|
|
|
The structure of functions for each word class \texttt{C} is the following:
|
|
first we give a handful of patterns that aim to cover all
|
|
regular cases. Then we give a worst-case function \texttt{mkC}, which serves as an
|
|
escape to construct the most irregular words of type \texttt{C}.
|
|
|
|
\begin{verbatim}
|
|
resource ParadigmsSpa =
|
|
open
|
|
(Predef=Predef),
|
|
Prelude,
|
|
CommonRomance,
|
|
ResSpa,
|
|
MorphoSpa,
|
|
BeschSpa,
|
|
CatSpa in {
|
|
|
|
flags optimize=all ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Parameters}
|
|
To abstract over gender names, we define the following identifiers.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
Gender : Type ;
|
|
|
|
masculine : Gender ;
|
|
feminine : Gender ;
|
|
\end{verbatim}
|
|
|
|
To abstract over number names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Number : Type ;
|
|
|
|
singular : Number ;
|
|
plural : Number ;
|
|
\end{verbatim}
|
|
|
|
Prepositions used in many-argument functions are either strings
(including the 'accusative' empty string) or strings that
amalgamate with the following word (the 'genitive' \textit{de} and the
'dative' \textit{a}).
|
|
|
|
\begin{verbatim}
|
|
Preposition : Type ;
|
|
|
|
accusative : Preposition ;
|
|
genitive : Preposition ;
|
|
dative : Preposition ;
|
|
|
|
mkPreposition : Str -> Preposition ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Nouns}
|
|
Worst case: two forms (singular + plural),
|
|
and the gender.
|
|
|
|
\begin{verbatim}
|
|
mkN : (_,_ : Str) -> Gender -> N ; -- uomo, uomini, masculine
|
|
\end{verbatim}
|
|
|
|
The regular function takes the singular form
and computes the plural and the gender by a heuristic.
|
|
The heuristic says that the gender is feminine for nouns
|
|
ending with \textit{a} or \textit{z}, and masculine for all other words.
|
|
Nouns ending with \textit{a}, \textit{o}, \textit{e} have the plural with \textit{s},
|
|
those ending with \textit{z} have \textit{ces} in plural; all other nouns
|
|
have \textit{es} as plural ending. The accent is not dealt with.
|
|
|
|
\begin{verbatim}
|
|
regN : Str -> N ;
|
|
\end{verbatim}
|
|
|
|
To force a different gender, use one of the following functions.
|
|
|
|
\begin{verbatim}
|
|
mascN : N -> N ;
|
|
femN : N -> N ;
|
|
\end{verbatim}
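The heuristic and its overrides can be sketched with a few entries. The module name \texttt{LexNounsSpa}, the constant names, and the example words are assumptions chosen here to exercise the rules described above.

\begin{verbatim}
  resource LexNounsSpa = open ParadigmsSpa, CatSpa in {
    oper
      casa_N   : N = regN "casa" ;          -- ends in a: feminine, plural with s
      luz_N    : N = regN "luz" ;           -- ends in z: feminine, plural with ces
      mapa_N   : N = mascN (regN "mapa") ;  -- ends in a, but forced to masculine
      hombre_N : N = mkN "hombre" "hombres" masculine ;  -- worst case
  }
\end{verbatim}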
|
|
|
|
\subsubsubsection{Compound nouns}
|
|
Some nouns are ones where the first part is inflected as a noun but
the second part is not inflected, e.g. \textit{número de teléfono}.
They could be formed in syntax, but we give a shortcut here since
they are frequent in lexica.
|
|
They could be formed in syntax, but we give a shortcut here since
|
|
they are frequent in lexica.
|
|
|
|
\begin{verbatim}
|
|
compN : N -> Str -> N ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Relational nouns}
|
|
Relational nouns (\textit{hija de x}) need a case and a preposition.
|
|
|
|
\begin{verbatim}
|
|
mkN2 : N -> Preposition -> N2 ;
|
|
\end{verbatim}
|
|
|
|
The most common cases are the genitive \textit{de} and the dative \textit{a},
|
|
with the empty preposition.
|
|
|
|
\begin{verbatim}
|
|
deN2 : N -> N2 ;
|
|
aN2 : N -> N2 ;
|
|
\end{verbatim}
|
|
|
|
Three-place relational nouns (\textit{la conexión de x a y}) need two prepositions.
|
|
|
|
\begin{verbatim}
|
|
mkN3 : N -> Preposition -> Preposition -> N3 ;
|
|
\end{verbatim}
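A hedged sketch of compound and relational nouns follows. The module name \texttt{LexRelNounsSpa} and the constant names are hypothetical, and the plural of \textit{conexión} is given explicitly because the regular function does not deal with accents.

\begin{verbatim}
  resource LexRelNounsSpa = open ParadigmsSpa, CatSpa in {
    oper
      -- compound: only the first part is inflected
      numero_N    : N  = compN (regN "número") "de teléfono" ;
      -- relational noun with the genitive de
      hija_N2     : N2 = deN2 (regN "hija") ;
      -- three-place relational noun with de and a
      conexion_N3 : N3 = mkN3 (mkN "conexión" "conexiones" feminine)
                              genitive dative ;
  }
\end{verbatim}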
|
|
|
|
\subsubsubsection{Relational common noun phrases}
|
|
In some cases, you may want to make a complex \texttt{CN} into a
|
|
relational noun (e.g. \textit{the old town hall of}). However, \texttt{N2} and
|
|
\texttt{N3} are purely lexical categories. But you can use the \texttt{AdvCN}
|
|
and \texttt{PrepNP} constructions to build phrases like this.
|
|
|
|
\subsubsubsection{Proper names and noun phrases}
|
|
Proper names need a string and a gender.
|
|
|
|
\begin{verbatim}
|
|
mkPN : Str -> Gender -> PN ; -- Jean
|
|
\end{verbatim}
|
|
|
|
To form a noun phrase that can also be plural,
|
|
you can use the worst-case function.
|
|
|
|
\begin{verbatim}
|
|
mkNP : Str -> Gender -> Number -> NP ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectives}
|
|
Non-comparison one-place adjectives need five forms in the worst
case (masc and fem singular, masc and fem plural, adverbial).
|
|
|
|
\begin{verbatim}
|
|
mkA : (solo,sola,soli,sole, solamente : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
For regular adjectives, all other forms are derived from the
|
|
masculine singular. The types of adjectives that are recognized are
|
|
\textit{alto}, \textit{fuerte}, \textit{util}.
|
|
|
|
\begin{verbatim}
|
|
regA : Str -> A ;
|
|
\end{verbatim}
|
|
|
|
These functions create postfix adjectives. To switch
|
|
them to prefix ones (i.e. ones placed before the noun in
|
|
modification, as in \textit{petite maison}), the following function is
|
|
provided.
|
|
|
|
\begin{verbatim}
|
|
prefA : A -> A ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place adjectives}
|
|
Two-place adjectives need a preposition for their second argument.
|
|
|
|
\begin{verbatim}
|
|
mkA2 : A -> Preposition -> A2 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Comparison adjectives}
|
|
Comparison adjectives are in the worst case built from two
adjectives: the positive (\textit{bueno}), and the comparative (\textit{mejor}).
|
|
|
|
\begin{verbatim}
|
|
mkADeg : A -> A -> A ;
|
|
\end{verbatim}
|
|
|
|
If comparison is formed by \textit{más}, as usual in Spanish,
|
|
the following pattern is used:
|
|
|
|
\begin{verbatim}
|
|
compADeg : A -> A ;
|
|
\end{verbatim}
|
|
|
|
The regular pattern is the same as \texttt{regA} for plain adjectives,
with comparison by \textit{más}.
|
|
|
|
\begin{verbatim}
|
|
regADeg : Str -> A ;
|
|
\end{verbatim}
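The three ways of building comparison adjectives can be sketched side by side. The module name \texttt{LexAdjSpa} and the constant names are hypothetical; the words come from the adjective types and examples named above.

\begin{verbatim}
  resource LexAdjSpa = open ParadigmsSpa, CatSpa in {
    oper
      alto_A   : A = regADeg "alto" ;             -- regular, periphrastic comparison
      fuerte_A : A = compADeg (regA "fuerte") ;   -- same result, from an existing A
      bueno_A  : A = mkADeg (regA "bueno") (regA "mejor") ; -- positive + comparative
  }
\end{verbatim}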
|
|
|
|
\subsubsubsection{Adverbs}
|
|
Adverbs are not inflected. Most lexical ones have position
|
|
after the verb.
|
|
|
|
\begin{verbatim}
|
|
mkAdv : Str -> Adv ;
|
|
\end{verbatim}
|
|
|
|
Some appear next to the verb (e.g. \textit{siempre}).
|
|
|
|
\begin{verbatim}
|
|
mkAdV : Str -> AdV ;
|
|
\end{verbatim}
|
|
|
|
Adverbs modifying adjectives and sentences can also be formed.
|
|
|
|
\begin{verbatim}
|
|
mkAdA : Str -> AdA ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verbs}
|
|
Regular verbs are ones inflected like \textit{cortar}, \textit{deber}, or \textit{vivir}.
In the first conjugation (\textit{ar}), the regular verb function recognizes
the variations corresponding to the patterns
\textit{actuar, cazar, guiar, pagar, sacar}. The module \texttt{BeschSpa} gives
the complete set of \textit{Bescherelle} conjugations.
|
|
|
|
\begin{verbatim}
|
|
regV : Str -> V ;
|
|
\end{verbatim}
|
|
|
|
The module \texttt{BeschSpa} gives all the patterns of the \textit{Bescherelle}
|
|
book. To use them in the category \texttt{V}, wrap them with the following function:
|
|
|
|
\begin{verbatim}
|
|
verboV : Verbum -> V ;
|
|
\end{verbatim}
|
|
|
|
To form reflexive verbs:
|
|
|
|
\begin{verbatim}
|
|
reflV : V -> V ;
|
|
\end{verbatim}
|
|
|
|
Verbs with a deviant passive participle: just give the participle
|
|
in masculine singular form as second argument.
|
|
|
|
\begin{verbatim}
|
|
special_ppV : V -> Str -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place verbs}
|
|
Two-place verbs need a preposition, except in the special case of a direct object
(transitive verbs). Notice that a particle comes from the \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
mkV2 : V -> Preposition -> V2 ;
|
|
|
|
dirV2 : V -> V2 ;
|
|
\end{verbatim}
|
|
|
|
You can reuse a \texttt{V2} verb in \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
v2V : V2 -> V ;
|
|
\end{verbatim}
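
A small sketch of how \texttt{dirV2} and \texttt{v2V} might be used together.
The module and constant names are hypothetical, and \texttt{regV} is assumed
to take the infinitive.

\begin{verbatim}
  resource ExampleV2Spa = open CatSpa, ParadigmsSpa in {
    oper
      -- transitive verb with a direct object
      cortar_V2 : V2 = dirV2 (regV "cortar") ;

      -- the same verb reused as a one-place verb
      cortar_V : V = v2V cortar_V2 ;
  }
\end{verbatim}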
|
|
|
|
\subsubsubsection{Three-place verbs}
|
|
Three-place (ditransitive) verbs need two prepositions, of which
|
|
the first one or both can be absent.
|
|
|
|
\begin{verbatim}
|
|
mkV3 : V -> Preposition -> Preposition -> V3 ; -- hablar, a, de
dirV3 : V -> Preposition -> V3 ; -- dar,_,a
dirdirV3 : V -> V3 ; -- dar,_,_
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Other complement patterns}
|
|
Verbs and adjectives can take complements such as sentences,
|
|
questions, verb phrases, and adjectives.
|
|
|
|
\begin{verbatim}
|
|
mkV0 : V -> V0 ;
|
|
mkVS : V -> VS ;
|
|
mkV2S : V -> Preposition -> V2S ;
|
|
mkVV : V -> VV ; -- plain infinitive: "quiero hablar"
deVV : V -> VV ; -- "trato de hablar"
aVV : V -> VV ; -- "empiezo a hablar"
|
|
mkV2V : V -> Preposition -> Preposition -> V2V ;
|
|
mkVA : V -> VA ;
|
|
mkV2A : V -> Preposition -> Preposition -> V2A ;
|
|
mkVQ : V -> VQ ;
|
|
mkV2Q : V -> Preposition -> V2Q ;
|
|
|
|
mkAS : A -> AS ;
|
|
mkA2S : A -> Preposition -> A2S ;
|
|
mkAV : A -> Preposition -> AV ;
|
|
mkA2V : A -> Preposition -> Preposition -> A2V ;
|
|
\end{verbatim}
|
|
|
|
Notice: categories \texttt{V2S, V2V, V2Q} are in v 1.0 treated
|
|
just as synonyms of \texttt{V2}, and the second argument is given
|
|
as an adverb. Likewise \texttt{AS, A2S, AV, A2V} are just \texttt{A}.
|
|
\texttt{V0} is just \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
V0, V2S, V2V, V2Q : Type ;
|
|
AS, A2S, AV, A2V : Type ;
|
|
\end{verbatim}
|
|
|
|
\commOut{Produced by
|
|
gfdoc - a rudimentary GF document generator.
|
|
(c) Aarne Ranta (\htmladdnormallink{aarne@cs.chalmers.se}{mailto:aarne@cs.chalmers.se}) 2002 under GNU GPL.}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Swedish Lexical Paradigms}
|
|
Aarne Ranta 2003
|
|
|
|
This is an API for the user of the resource grammar
who wants to add lexical items. It gives functions for forming
expressions of open categories: nouns, adjectives, verbs.
|
|
|
|
Closed categories (determiners, pronouns, conjunctions) are
|
|
accessed through the resource syntax API, \texttt{Structural.gf}.
|
|
|
|
The main difference from \texttt{MorphoSwe.gf} is that the types
referred to are compiled resource grammar types. Moreover, the design
principle has been to always take existing word forms, rather
than stems, as string arguments of the paradigms.
|
|
|
|
The structure of functions for each word class \texttt{C} is the following:
|
|
first we give a handful of patterns that aim to cover all
|
|
regular cases. Then we give a worst-case function \texttt{mkC}, which serves as an
|
|
escape to construct the most irregular words of type \texttt{C}.
|
|
However, this function should only seldom be needed: we have a
separate module \texttt{IrregularSwe}, which covers all irregularly inflected
words.
|
|
|
|
\begin{verbatim}
|
|
resource ParadigmsSwe =
|
|
open
|
|
(Predef=Predef),
|
|
Prelude,
|
|
CommonScand,
|
|
ResSwe,
|
|
MorphoSwe,
|
|
CatSwe in {
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Parameters}
|
|
To abstract over gender names, we define the following identifiers.
|
|
|
|
\begin{verbatim}
|
|
oper
|
|
Gender : Type ;
|
|
|
|
utrum : Gender ;
|
|
neutrum : Gender ;
|
|
\end{verbatim}
|
|
|
|
To abstract over number names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Number : Type ;
|
|
|
|
singular : Number ;
|
|
plural : Number ;
|
|
\end{verbatim}
|
|
|
|
To abstract over case names, we define the following.
|
|
|
|
\begin{verbatim}
|
|
Case : Type ;
|
|
|
|
nominative : Case ;
|
|
genitive : Case ;
|
|
\end{verbatim}
|
|
|
|
Prepositions used in many-argument functions are just strings.
|
|
|
|
\begin{verbatim}
|
|
Preposition : Type = Str ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Nouns}
|
|
Worst case: give all four forms. The gender is computed from the
|
|
last letter of the second form (if \textit{n}, then \texttt{utrum}, otherwise \texttt{neutrum}).
|
|
|
|
\begin{verbatim}
|
|
mkN : (apa,apan,apor,aporna : Str) -> N ;
|
|
\end{verbatim}
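
The following sketch shows the worst-case function applied to one noun of
each gender. The module and constant names are hypothetical. The comments
restate the gender rule given above.

\begin{verbatim}
  resource ExampleWorstCaseSwe = open CatSwe, ParadigmsSwe in {
    oper
      -- second form "apan" ends in n: utrum
      apa_N : N = mkN "apa" "apan" "apor" "aporna" ;

      -- second form "riket" does not end in n: neutrum
      rike_N : N = mkN "rike" "riket" "riken" "rikena" ;
  }
\end{verbatim}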
|
|
|
|
The regular function takes the singular indefinite form and computes the other
|
|
forms and the gender by a heuristic. The heuristic is currently
|
|
to treat all words ending with \textit{a} like \textit{flicka}, with \textit{e} like \textit{rike},
|
|
and otherwise like \textit{bil}.
|
|
If in doubt, use the \texttt{cc} command to test!
|
|
|
|
\begin{verbatim}
|
|
regN : Str -> N ;
|
|
\end{verbatim}
|
|
|
|
Adding the gender manually greatly improves the correctness of \texttt{regN}.
|
|
|
|
\begin{verbatim}
|
|
regGenN : Str -> Gender -> N ;
|
|
\end{verbatim}
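
A possible test session in the GF shell is sketched below. It assumes that
the \texttt{import} command accepts the \texttt{-retain} flag, so that
\texttt{oper} definitions stay available to \texttt{cc}, and that the search
path is set so that \texttt{ParadigmsSwe.gf} can be found. By the heuristic
described above, a neuter noun such as \textit{hus} would be treated like
\textit{bil}, so it is a natural candidate for \texttt{regGenN}.

\begin{verbatim}
  > i -retain ParadigmsSwe.gf
  > cc regN "flicka"
  > cc regN "rike"
  > cc regN "bil"
  > cc regGenN "hus" neutrum
\end{verbatim}

Each \texttt{cc} command prints the computed inflection table, which can then
be checked by hand against the expected forms.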
|
|
|
|
In practice the worst case is often just: give singular and plural indefinite.
|
|
|
|
\begin{verbatim}
|
|
mk2N : (nyckel,nycklar : Str) -> N ;
|
|
\end{verbatim}
|
|
|
|
This heuristic takes just the plural definite form and infers the others.
|
|
It does not work if there are changes in the stem.
|
|
|
|
\begin{verbatim}
|
|
mk1N : (bilarna : Str) -> N ;
|
|
\end{verbatim}
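
As a sketch, the two heuristics might be used like this (module and constant
names are hypothetical). The first noun has a stem change and therefore needs
two forms; the second does not.

\begin{verbatim}
  resource ExampleNounsSwe = open CatSwe, ParadigmsSwe in {
    oper
      -- stem change: give singular and plural indefinite
      nyckel_N : N = mk2N "nyckel" "nycklar" ;

      -- no stem change: the plural definite form is enough
      bil_N : N = mk1N "bilarna" ;
  }
\end{verbatim}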
|
|
|
|
\subsubsubsection{Compound nouns}
|
|
All the functions above work equally well to form compound nouns,
such as \textit{fotboll}.
|
|
|
|
\subsubsubsection{Relational nouns}
|
|
Relational nouns (\textit{daughter of x}) need a preposition.
|
|
|
|
\begin{verbatim}
|
|
mkN2 : N -> Preposition -> N2 ;
|
|
\end{verbatim}
|
|
|
|
The most common preposition is \textit{av}, and the following is a
|
|
shortcut for regular, \texttt{nonhuman} relational nouns with \textit{av}.
|
|
|
|
\begin{verbatim}
|
|
regN2 : Str -> Gender -> N2 ;
|
|
\end{verbatim}
|
|
|
|
Use the function \texttt{mkPreposition} or see the section on prepositions below to
|
|
form other prepositions.
|
|
|
|
Three-place relational nouns (\textit{the connection from x to y}) need two prepositions.
|
|
|
|
\begin{verbatim}
|
|
mkN3 : N -> Preposition -> Preposition -> N3 ;
|
|
\end{verbatim}
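
The relational noun functions could be applied as in the sketch below. The
module and constant names are hypothetical, and the prepositions are written
as plain strings, which relies on \texttt{Preposition} being defined as
\texttt{Str} in this module.

\begin{verbatim}
  resource ExampleRelNounsSwe = open CatSwe, ParadigmsSwe in {
    oper
      -- kung av x
      kung_N2 : N2 = mkN2 (regN "kung") "av" ;

      -- summa av x, using the shortcut with "av"
      summa_N2 : N2 = regN2 "summa" utrum ;

      -- resa från x till y
      resa_N3 : N3 = mkN3 (regN "resa") "från" "till" ;
  }
\end{verbatim}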
|
|
|
|
\subsubsubsection{Relational common noun phrases}
|
|
In some cases, you may want to make a complex \texttt{CN} into a
|
|
relational noun (e.g. \textit{the old town hall of}). However, \texttt{N2} and
|
|
\texttt{N3} are purely lexical categories. But you can use the \texttt{AdvCN}
|
|
and \texttt{PrepNP} constructions to build phrases like this.
|
|
|
|
\subsubsubsection{Proper names and noun phrases}
|
|
Proper names, with a regular genitive, are formed as follows:
|
|
|
|
\begin{verbatim}
|
|
regPN : Str -> Gender -> PN ; -- John, John's
|
|
\end{verbatim}
|
|
|
|
Sometimes you can reuse a common noun as a proper name, e.g. \textit{Bank}.
|
|
|
|
\begin{verbatim}
|
|
nounPN : N -> PN ;
|
|
\end{verbatim}
|
|
|
|
To form a noun phrase that can also be plural and have an irregular
|
|
genitive, you can use the worst-case function.
|
|
|
|
\begin{verbatim}
|
|
mkNP : Str -> Str -> Number -> Gender -> NP ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Adjectives}
|
|
Adjectives may need as many as seven forms.
|
|
|
|
\begin{verbatim}
|
|
mkA : (liten, litet, lilla, sma, mindre, minst, minsta : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
The regular pattern works for many adjectives, e.g. those ending
|
|
with \textit{ig}.
|
|
|
|
\begin{verbatim}
|
|
regA : Str -> A ;
|
|
\end{verbatim}
|
|
|
|
Just the comparison forms can be irregular.
|
|
|
|
\begin{verbatim}
|
|
irregA : (tung,tyngre,tyngst : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
Sometimes just the positive forms are irregular.
|
|
|
|
\begin{verbatim}
|
|
mk3A : (galen,galet,galna : Str) -> A ;
|
|
mk2A : (bred,brett : Str) -> A ;
|
|
\end{verbatim}
|
|
|
|
Comparison forms may be compound (\textit{mera svensk} - \textit{mest svensk}).
|
|
|
|
\begin{verbatim}
|
|
compoundA : A -> A ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Two-place adjectives}
|
|
Two-place adjectives need a preposition for their second argument.
|
|
|
|
\begin{verbatim}
|
|
mkA2 : A -> Preposition -> A2 ;
|
|
\end{verbatim}
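
The adjective functions above might be used as follows. This is a sketch
with hypothetical module and constant names; the example words other than
those in the signatures above are chosen here for illustration only.

\begin{verbatim}
  resource ExampleAdjSwe = open CatSwe, ParadigmsSwe in {
    oper
      -- regular pattern, e.g. adjectives ending with "ig"
      viktig_A : A = regA "viktig" ;

      -- irregular comparison forms only
      tung_A : A = irregA "tung" "tyngre" "tyngst" ;

      -- irregular positive forms only
      galen_A : A = mk3A "galen" "galet" "galna" ;
      bred_A  : A = mk2A "bred" "brett" ;

      -- compound comparison: mera svensk, mest svensk
      svensk_A : A = compoundA (regA "svensk") ;

      -- two-place adjective: avundsjuk på x
      avundsjuk_A2 : A2 = mkA2 (regA "avundsjuk") "på" ;
  }
\end{verbatim}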
|
|
|
|
\subsubsubsection{Adverbs}
|
|
Adverbs are not inflected. Most lexical ones have position
|
|
after the verb. Some can be preverbal (e.g. \textit{always}).
|
|
|
|
\begin{verbatim}
|
|
mkAdv : Str -> Adv ;
|
|
mkAdV : Str -> AdV ;
|
|
\end{verbatim}
|
|
|
|
Adverbs modifying adjectives and sentences can also be formed.
|
|
|
|
\begin{verbatim}
|
|
mkAdA : Str -> AdA ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Prepositions}
|
|
A preposition is just a string.
|
|
|
|
\begin{verbatim}
|
|
mkPreposition : Str -> Preposition ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Verbs}
|
|
The worst case needs six forms.
|
|
|
|
\begin{verbatim}
|
|
mkV : (supa,super,sup,söp,supit,supen : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
The 'regular verb' function is inspired by Lexin. It uses the
|
|
present tense indicative form. The value is the first conjugation if the
|
|
argument ends with \textit{ar} (\textit{tala} - \textit{talar} - \textit{talade} - \textit{talat}),
|
|
the second with \textit{er} (\textit{leka} - \textit{leker} - \textit{lekte} - \textit{lekt}, with the
|
|
variations like \textit{gräva}, \textit{vända}, \textit{tyda}, \textit{hyra}), and
|
|
the third in other cases (\textit{bo} - \textit{bor} - \textit{bodde} - \textit{bott}).
|
|
|
|
\begin{verbatim}
|
|
regV : (talar : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
The almost regular verb function needs the infinitive and the preteritum.
|
|
It is not really more powerful than the new implementation of
|
|
\texttt{regV} based on the indicative form.
|
|
|
|
\begin{verbatim}
|
|
mk2V : (leka,lekte : Str) -> V ;
|
|
\end{verbatim}
|
|
|
|
There is an extensive list of irregular verbs in the module \texttt{IrregularSwe}.
|
|
In practice, it is enough to give three forms, as in school books.
|
|
|
|
\begin{verbatim}
|
|
irregV : (dricka, drack, druckit : Str) -> V ;
|
|
\end{verbatim}
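
The sketch below applies each of the verb functions to the example words
used in the descriptions above. The module and constant names are
hypothetical.

\begin{verbatim}
  resource ExampleVerbsSwe = open CatSwe, ParadigmsSwe in {
    oper
      -- regV takes the present tense indicative form
      tala_V : V = regV "talar" ;   -- first conjugation
      leka_V : V = regV "leker" ;   -- second conjugation
      bo_V   : V = regV "bor" ;     -- third conjugation

      -- infinitive and preteritum
      leka2_V : V = mk2V "leka" "lekte" ;

      -- three forms, as in school books
      dricka_V : V = irregV "dricka" "drack" "druckit" ;

      -- worst case: all forms given explicitly
      supa_V : V = mkV "supa" "super" "sup" "söp" "supit" "supen" ;
  }
\end{verbatim}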
|
|
|
|
\subsubsubsection{Verbs with a particle}
|
|
The particle, such as in \textit{passa på}, is given as a string.
|
|
|
|
\begin{verbatim}
|
|
partV : V -> Str -> V ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Deponent verbs}
|
|
Some words are used in passive forms only, e.g. \textit{hoppas}, some as
|
|
reflexive e.g. \textit{ångra sig}.
|
|
|
|
\begin{verbatim}
|
|
depV : V -> V ;
|
|
reflV : V -> V ;
|
|
\end{verbatim}
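
A sketch of particle, deponent, and reflexive verbs, with hypothetical
module and constant names. It assumes that \texttt{depV} and \texttt{reflV}
are applied to the ordinary active verb, here built with \texttt{regV} from
the present tense form.

\begin{verbatim}
  resource ExampleVerbKindsSwe = open CatSwe, ParadigmsSwe in {
    oper
      -- passa på
      passa_paa_V : V = partV (regV "passar") "på" ;

      -- hoppas (passive forms only)
      hoppas_V : V = depV (regV "hoppar") ;

      -- ångra sig
      angra_sig_V : V = reflV (regV "ångrar") ;
  }
\end{verbatim}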
|
|
|
|
\subsubsubsection{Two-place verbs}
|
|
Two-place verbs need a preposition, except in the special case of a direct object
(transitive verbs). Notice that a particle comes from the \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
mkV2 : V -> Preposition -> V2 ;
|
|
|
|
dirV2 : V -> V2 ;
|
|
\end{verbatim}
|
|
|
|
\subsubsubsection{Three-place verbs}
|
|
Three-place (ditransitive) verbs need two prepositions, of which
|
|
the first one or both can be absent.
|
|
|
|
\begin{verbatim}
|
|
mkV3 : V -> Preposition -> Preposition -> V3 ; -- tala med om
|
|
dirV3 : V -> Preposition -> V3 ; -- ge _ till
|
|
dirdirV3 : V -> V3 ; -- ge _ _
|
|
\end{verbatim}
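
Two-place and three-place verbs might be built as in this sketch. The module
and constant names are hypothetical, and the prepositions are written as
plain strings, which relies on \texttt{Preposition} being defined as
\texttt{Str} in this module.

\begin{verbatim}
  resource ExampleVerbArgsSwe = open CatSwe, ParadigmsSwe in {
    oper
      -- direct object: dricka x
      dricka_V2 : V2 = dirV2 (irregV "dricka" "drack" "druckit") ;

      -- prepositional object: vänta på x
      vanta_V2 : V2 = mkV2 (regV "väntar") "på" ;

      -- two prepositions: tala med x om y
      tala_V3 : V3 = mkV3 (regV "talar") "med" "om" ;

      -- first object direct: skicka x till y
      skicka_V3 : V3 = dirV3 (regV "skickar") "till" ;
  }
\end{verbatim}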
|
|
|
|
\subsubsubsection{Other complement patterns}
|
|
Verbs and adjectives can take complements such as sentences,
|
|
questions, verb phrases, and adjectives.
|
|
|
|
\begin{verbatim}
|
|
mkV0 : V -> V0 ;
|
|
mkVS : V -> VS ;
|
|
mkV2S : V -> Str -> V2S ;
|
|
mkVV : V -> VV ;
|
|
mkV2V : V -> Str -> Str -> V2V ;
|
|
mkVA : V -> VA ;
|
|
mkV2A : V -> Str -> V2A ;
|
|
mkVQ : V -> VQ ;
|
|
mkV2Q : V -> Str -> V2Q ;
|
|
|
|
mkAS : A -> AS ;
|
|
mkA2S : A -> Str -> A2S ;
|
|
mkAV : A -> AV ;
|
|
mkA2V : A -> Str -> A2V ;
|
|
\end{verbatim}
|
|
|
|
Notice: categories \texttt{V2S, V2V, V2A, V2Q} are in v 1.0 treated
|
|
just as synonyms of \texttt{V2}, and the second argument is given
|
|
as an adverb. Likewise \texttt{AS, A2S, AV, A2V} are just \texttt{A}.
|
|
\texttt{V0} is just \texttt{V}.
|
|
|
|
\begin{verbatim}
|
|
V0, V2S, V2V, V2A, V2Q : Type ;
|
|
AS, A2S, AV, A2V : Type ;
|
|
\end{verbatim}
|
|
|
|
\end{document}
|