<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META NAME="generator" CONTENT="http://txt2tags.sf.net">
<TITLE> Russian Lexical Paradigms</TITLE>
</HEAD><BODY BGCOLOR="white" TEXT="black">
<P ALIGN="center"><CENTER><H1> Russian Lexical Paradigms</H1>
<FONT SIZE="4">
<I>Last update: 2007-07-06 10:39:50 CEST</I><BR>
</FONT></CENTER>

<P></P>
<HR NOSHADE SIZE=1>
<P></P>
    <UL>
    <LI><A HREF="#toc1">Parameters</A>
    <LI><A HREF="#toc2">Nouns</A>
    <LI><A HREF="#toc3">Adjectives</A>
    <LI><A HREF="#toc4">Adverbs</A>
    <LI><A HREF="#toc5">Verbs</A>
      <UL>
      <LI><A HREF="#toc6">Two-place verbs</A>
      <LI><A HREF="#toc7">Three-place verbs</A>
      </UL>
    </UL>

<P></P>
<HR NOSHADE SIZE=1>
<P></P>
<P>
Produced by
gfdoc - a rudimentary GF document generator.
(c) Aarne Ranta (<A HREF="mailto:aarne@cs.chalmers.se">aarne@cs.chalmers.se</A>) 2002 under GNU GPL.
</P>
<P>
Janna Khegai 2003--2006
</P>
<P>
This is an API for the user of the resource grammar
for adding lexical items. It gives functions for forming
expressions of open categories: nouns, adjectives, verbs.
</P>
<P>
Closed categories (determiners, pronouns, conjunctions) are
accessed through the resource syntax API, <CODE>Structural.gf</CODE>.
</P>
<P>
The main difference from <CODE>MorphoRus.gf</CODE> is that the types
referred to are compiled resource grammar types. Moreover, we have
followed the design principle of always taking existing forms, rather
than stems, as string arguments of the paradigms.
</P>
<P>
The structure of functions for each word class <CODE>C</CODE> is the following:
first we give a handful of patterns that aim to cover all
regular cases. Then we give a worst-case function <CODE>mkC</CODE>, which serves as an
escape to construct the most irregular words of type <CODE>C</CODE>.
</P>
<P>
The following modules are presupposed:
</P>
<PRE>
    resource ParadigmsRus = open
      (Predef=Predef),
      Prelude,
      MorphoRus,
      CatRus,
      NounRus
      in {

    flags  coding=utf8 ;
</PRE>
<P></P>
<A NAME="toc1"></A>
<H2>Parameters</H2>
<P>
To abstract over gender names, we define the following identifiers.
</P>
<PRE>
    oper
      Gender : Type ;
      masculine : Gender ;
      feminine  : Gender ;
      neuter    : Gender ;
</PRE>
<P></P>
<P>
To abstract over case names, we define the following.
</P>
<PRE>
      Case : Type ;

      nominative    : Case ;
      genitive      : Case ;
      dative        : Case ;
      accusative    : Case ;
      instructive   : Case ;
      prepositional : Case ;
</PRE>
<P></P>
<P>
Some English-language textbooks place the accusative case second.
However, we follow the case order that is standard in Russian textbooks.
To abstract over number and animacy names, we define the following.
</P>
<PRE>
      Number : Type ;

      singular : Number ;
      plural   : Number ;

      Animacy: Type ;

      animate: Animacy;
      inanimate: Animacy;
</PRE>
<P></P>
<A NAME="toc2"></A>
<H2>Nouns</H2>
<P>
Best case: indeclinable nouns: <I>кофе</I>, <I>пальто</I>, <I>ВУЗ</I>.
</P>
<PRE>
      mkN : overload {
</PRE>
<P></P>
<P>
The regular function captures the variants for some common noun
endings below:
</P>
<PRE>
        mkN : Str -&gt; N ;
</PRE>
<P></P>
<P>
This function is for indeclinable nouns.
</P>
<PRE>
        mkN : Str -&gt; Gender -&gt; Animacy -&gt; N ;
</PRE>
<P></P>
<P>
Worst case: give six singular forms
(Nominative, Genitive, Dative, Accusative, Instructive and Prepositional),
the corresponding six plural forms, the gender and the animacy.
Maybe the number of forms needed can be reduced,
but this requires a separate investigation.
The animacy parameter (determining whether the Accusative form is equal
to the Nominative or the Genitive one) does not actually help to reduce it,
since there are many exceptions and the gain is just one form fewer.
</P>
<PRE>
        mkN : (nomSg,_,_,_,_,_,_,_,_,_,_,prepPl : Str) -&gt; Gender -&gt; Animacy -&gt; N ;
</PRE>
<P></P>
<P>
Example forms (singular, then plural):
мужчина, мужчины, мужчине, мужчину, мужчиной, мужчине;
мужчины, мужчин, мужчинам, мужчин, мужчинами, мужчинах.
</P>
<PRE>
      } ;
</PRE>
<P></P>
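<P>
A minimal sketch of how the <CODE>mkN</CODE> overloads above might be used.
The module name <CODE>ExampleNounsRus</CODE> and the <CODE>oper</CODE> names are
hypothetical; the forms of <I>мужчина</I> are the ones listed above.
</P>
<PRE>
    resource ExampleNounsRus = open CatRus, ParadigmsRus in {

    flags  coding=utf8 ;

    oper
      -- indeclinable noun: one form plus gender and animacy
      palto_N : N = mkN "пальто" neuter inanimate ;

      -- worst case: six singular forms, six plural forms, gender, animacy
      muzhchina_N : N = mkN
        "мужчина" "мужчины" "мужчине"  "мужчину" "мужчиной"  "мужчине"
        "мужчины" "мужчин"  "мужчинам" "мужчин"  "мужчинами" "мужчинах"
        masculine animate ;
    }
</PRE>
<P></P>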
<P>
Here are some common patterns. The list is far from complete.
Feminine patterns.
</P>
<PRE>
      nMashina   : Str -&gt; N ;    -- feminine, inanimate, ending with "-а", Inst -"машин-ой"
      nEdinica   : Str -&gt; N ;    -- feminine, inanimate, ending with "-а", Inst -"единиц-ей"
      nZhenchina : Str -&gt; N ;    -- feminine, animate, ending with "-a"
      nNoga      : Str -&gt; N ;    -- feminine, inanimate, ending with "г_к_х-a"
      nMalyariya : Str -&gt; N ;    -- feminine, inanimate, ending with "-ия"
      nTetya     : Str -&gt; N ;    -- feminine, animate, ending with "-я"
      nBol       : Str -&gt; N ;    -- feminine, inanimate, ending with "-ь"(soft sign)
</PRE>
<P></P>
<P>
Neuter patterns.
</P>
<PRE>
      nObezbolivauchee : Str -&gt; N ;   -- neuter, inanimate, ending with "-ee"
      nProizvedenie : Str -&gt; N ;   -- neuter, inanimate, ending with "-e"
      nChislo : Str -&gt; N ;   -- neuter, inanimate, ending with "-o"
      nZhivotnoe : Str -&gt; N ;    -- neuter, animate, ending with "-ое"
</PRE>
<P></P>
<P>
Masculine patterns.
Ending with consonant:
</P>
<PRE>
      nPepel : Str -&gt; N ;    -- masculine, inanimate, ending with "-ел"- "пеп-ла"

      nBrat: Str -&gt; N ;   -- animate, брат-ья
      nStul: Str -&gt; N ;    -- same as above, but inanimate
      nMalush : Str -&gt; N ; -- малышей
      nPotolok : Str -&gt; N ; -- потол-ок - потол-ка

     -- the next four differ in plural nominative and/or accusative form(s) :
      nBank: Str -&gt; N ;    -- банк-и (Nom=Acc)
      nStomatolog : Str -&gt; N ;  -- same as above, but animate
      nAdres     : Str -&gt; N ;     -- адрес-а (Nom=Acc)
      nTelefon   : Str -&gt; N ;     -- телефон-ы (Nom=Acc)

      nNol       : Str -&gt; N ;    -- masculine, inanimate, ending with "-ь" (soft sign)
      nUroven    : Str -&gt; N ;    -- masculine, inanimate, ending with "-ень"
</PRE>
<P></P>
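<P>
The pattern functions can be used in the same way. The sketch below is
restricted to masculine nouns ending in a consonant, where the string passed
is simply the nominative singular; the module and <CODE>oper</CODE> names are
hypothetical, and the exact string expected by the other patterns is not
shown here.
</P>
<PRE>
    resource ExamplePatternsRus = open CatRus, ParadigmsRus in {

    flags  coding=utf8 ;

    oper
      bank_N       : N = nBank "банк" ;              -- plural банк-и
      stomatolog_N : N = nStomatolog "стоматолог" ;  -- like nBank, but animate
      adres_N      : N = nAdres "адрес" ;            -- plural адрес-а
      telefon_N    : N = nTelefon "телефон" ;        -- plural телефон-ы
    }
</PRE>
<P></P>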
<P>
Nouns used as functions need a preposition. The most common is with Genitive.
</P>
<PRE>
      mkFun  : N -&gt; Prep -&gt; N2 ;
      mkN2 : N -&gt; N2 ;
      mkN3 : N -&gt; Prep -&gt; Prep -&gt; N3 ;
</PRE>
<P></P>
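<P>
As an illustration, a relational noun without an explicit preposition could
be built with <CODE>mkN2</CODE> roughly as follows. The module and
<CODE>oper</CODE> names are hypothetical. <CODE>mkFun</CODE> and
<CODE>mkN3</CODE> are left out because they take <CODE>Prep</CODE> arguments,
and no constructor for <CODE>Prep</CODE> appears in this document.
</P>
<PRE>
    resource ExampleN2Rus = open CatRus, ParadigmsRus in {

    flags  coding=utf8 ;

    oper
      -- "брат" (brother) used as a one-argument function noun
      brat_N2 : N2 = mkN2 (nBrat "брат") ;
    }
</PRE>
<P></P>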
<P>
Proper names.
</P>
<PRE>
      mkPN : overload {
        mkPN : Str -&gt; PN ;
        mkPN : Str -&gt; Gender -&gt; Animacy -&gt; PN ;          -- "Иван", "Маша"
        mkPN : N -&gt; PN ;
      } ;
</PRE>
<P></P>
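<P>
A short sketch of the <CODE>mkPN</CODE> overloads, using the names from the
comment above and a noun pattern from the masculine list. The module and
<CODE>oper</CODE> names are hypothetical.
</P>
<PRE>
    resource ExampleNamesRus = open CatRus, ParadigmsRus in {

    flags  coding=utf8 ;

    oper
      ivan_PN  : PN = mkPN "Иван" masculine animate ;
      masha_PN : PN = mkPN "Маша" feminine animate ;

      -- a proper name built from an existing noun
      bank_PN  : PN = mkPN (nBank "банк") ;
    }
</PRE>
<P></P>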
<A NAME="toc3"></A>
<H2>Adjectives</H2>
<P>
Non-comparison (positive degree only) one-place adjectives need 28 (4 by 7)
forms in the worst case:
(Masculine | Feminine | Neuter | Plural) by
(Nominative | Genitive | Dative | Accusative Inanimate | Accusative Animate |
Instructive | Prepositional).
Note that the 4 short forms, which exist for some adjectives, are not included
in the current description; otherwise there would be 32 forms for the
positive degree.
</P>
<P>
The regular function captures the variants for some common adjective
endings below. The first string argument is the masculine singular form,
the second is the comparative.
An invariable adjective is a special case, with only one string needed.
</P>
<PRE>
       mkA : overload {
         mkA : Str -&gt; A ;          -- khaki, mini, hindi, netto
         mkA : Str -&gt; Str -&gt; A ;
       } ;
</PRE>
<P></P>
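<P>
A minimal sketch of the two <CODE>mkA</CODE> overloads, assuming the argument
conventions described above (masculine singular form, then comparative).
The module name, the <CODE>oper</CODE> names and the example words are
illustrative.
</P>
<PRE>
    resource ExampleAdjectivesRus = open CatRus, ParadigmsRus in {

    flags  coding=utf8 ;

    oper
      -- invariable adjective: a single string
      khaki_A : A = mkA "хаки" ;

      -- regular adjective: masculine singular form and comparative
      novyj_A : A = mkA "новый" "новее" ;
    }
</PRE>
<P></P>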
<P>
Some regular patterns depending on the ending.
</P>
<PRE>
       AStaruyj : Str -&gt; Str -&gt; A ;            -- ending with "-ый"
       AMalenkij : Str -&gt; Str -&gt; A ;           -- ending with "-ий", Gen - "маленьк-ого"
       AKhoroshij : Str -&gt; Str -&gt; A ;          -- ending with "-ий", Gen - "хорош-его"
       AMolodoj : Str -&gt; Str -&gt; A ;            -- ending with "-ой",
                                               -- plural - "молод-ые"
       AKakoj_Nibud : Str -&gt; Str -&gt; Str -&gt; A ; -- ending with "-ой",
                                               -- plural - "как-ие"
</PRE>
<P></P>
<P>
Two-place adjectives need a preposition and a case as extra arguments.
</P>
<PRE>
       mkA2 : A -&gt; Str -&gt; Case -&gt; A2 ;  -- "делим на"
</PRE>
<P></P>
<P>
Comparison adjectives need a positive adjective
(28 forms, not counting short forms).
By taking only one comparative form (non-syntactic) and
only one superlative form (syntactic), we can produce the
comparison adjective with just one extra argument:
the non-syntactic comparative form.
Syntactic forms are built from the positive forms.
<A NAME="toc4"></A>
<H2>Adverbs</H2>
<P>
Adverbs are not inflected.
</P>
<PRE>
      mkAdv : Str -&gt; Adv ;
</PRE>
<P></P>
<A NAME="toc5"></A>
<H2>Verbs</H2>
<P>
In our lexicon description (<I>Verbum</I>) there are 62 forms:
2 (Voice) by { 1 (infinitive) + [2(number) by 3 (person)](imperative) +
[ [2(Number) by 3(Person)](present) + [2(Number) by 3(Person)](future) +
4(GenNum)(past) ](indicative)+ 4 (GenNum) (subjunctive) }
Participles (present and past) and gerund forms are not included,
since they function more like adjectives and adverbs, respectively,
than like verbs. Aspect is regarded as an inherent parameter of a verb.
Note that some forms are never used for some verbs.
</P>
<PRE>
    Voice: Type;
    Aspect: Type;
    Bool: Type;
    Conjugation: Type ;

    first:   Conjugation; -- "гуля-Ешь, гуля-Ем"
    firstE:  Conjugation; -- Verbs with vowel "ё": "даёшь" (give), "пьёшь" (drink)
    second:  Conjugation; -- "вид-Ишь, вид-Им"
    mixed:   Conjugation; -- "хоч-Ешь - хот-Им"
    dolzhen: Conjugation; -- irregular

    true:  Bool;
    false: Bool;

    active: Voice ;
    passive: Voice ;
    imperfective: Aspect;
    perfective: Aspect ;
</PRE>
<P></P>
<P>
There are two common conjugation patterns:
first, for verbs ending in <I>-ать/-ять</I>, and second, for verbs ending in <I>-ить/-еть</I>.
Instead of the 6 present forms of the worst case, we only need
a present stem and one ending (singular, first person):
<I>я люб-лю</I>, <I>я жд-у</I>, etc. To determine where the border
between stem and ending lies, it is sufficient to compare the
first person form with the second person form:
<I>я люб-лю</I>, <I>ты люб-ишь</I>. The stems should be the same.
So the definition of the verb <I>любить</I> looks like:
<CODE>mkV imperfective second "люб" "лю" "любил" "люби" "любить"</CODE>.
</P>
<P>
There is no one-argument case.
</P>
<PRE>
      mkV : overload {
        mkV : Aspect -&gt; Conjugation -&gt; (stemPrsSgP1,endPrsSgP1,pastSgP1,imp,inf : Str) -&gt; V ;
</PRE>
<P></P>
<P>
The worst case needs 6 forms of the present tense in the indicative mood
(<I>я бегу</I>, <I>ты бежишь</I>, <I>он бежит</I>, <I>мы бежим</I>, <I>вы бежите</I>, <I>они бегут</I>),
a past form (singular, masculine: <I>я бежал</I>), an imperative form
(singular, second person: <I>беги</I>), and an infinitive (<I>бежать</I>).
The inherent aspect should also be specified.
</P>
<PRE>
       mkV : Aspect -&gt; (presSgP1,presSgP2,presSgP3,presPlP1,presPlP2,presPlP3,pastSgMasc,imp,inf: Str) -&gt; V ;

      } ;
</PRE>
<P></P>
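<P>
The two <CODE>mkV</CODE> overloads might be used as sketched below. The
regular entry follows the <I>гуля-Ешь</I> example of the first conjugation,
and the worst-case entry uses the forms of <I>бежать</I> listed above. The
module and <CODE>oper</CODE> names are hypothetical.
</P>
<PRE>
    resource ExampleVerbsRus = open CatRus, ParadigmsRus in {

    flags  coding=utf8 ;

    oper
      -- regular: aspect, conjugation, present stem, first person singular
      -- ending, past, imperative, infinitive
      gulyat_V : V = mkV imperfective first
        "гуля" "ю" "гулял" "гуляй" "гулять" ;

      -- worst case: aspect, six present forms, past, imperative, infinitive
      bezhat_V : V = mkV imperfective
        "бегу" "бежишь" "бежит" "бежим" "бежите" "бегут"
        "бежал" "беги" "бежать" ;
    }
</PRE>
<P></P>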
<A NAME="toc6"></A>
<H3>Two-place verbs</H3>
<P>
Two-place verbs, and the special case with direct object. Notice that
a particle can be included in a <CODE>V</CODE>.
</P>
<PRE>
      mkV2 : overload {
        mkV2 : V -&gt; V2 ;                    -- "видеть", "любить"
        mkV2 : V   -&gt; Str -&gt; Case -&gt; V2 ;   -- "войти в дом"; "в", accusative
      } ;
</PRE>
<P></P>
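<P>
A sketch of both <CODE>mkV2</CODE> overloads, based on the examples in the
comments above. The module and <CODE>oper</CODE> names are hypothetical, and
the forms given for <I>войти</I> (treated here as perfective,
<CODE>firstE</CODE>) are an assumption about how this verb would be entered.
</P>
<PRE>
    resource ExampleV2Rus = open CatRus, ParadigmsRus in {

    flags  coding=utf8 ;

    oper
      lubit_V : V = mkV imperfective second
        "люб" "лю" "любил" "люби" "любить" ;
      vojti_V : V = mkV perfective firstE
        "войд" "у" "вошёл" "войди" "войти" ;

      -- direct object, no preposition
      lubit_V2 : V2 = mkV2 lubit_V ;

      -- "войти в дом": preposition "в" with the accusative
      vojti_V2 : V2 = mkV2 vojti_V "в" accusative ;
    }
</PRE>
<P></P>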
<A NAME="toc7"></A>
<H3>Three-place verbs</H3>
<PRE>
       tvDirDir : V -&gt; V3 ;
       mkV3     : V -&gt; Str -&gt; Str -&gt; Case -&gt; Case -&gt; V3 ; -- "сложить письмо в конверт"
</PRE>
<P></P>
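<P>
For illustration, the example <I>сложить письмо в конверт</I> could be
entered roughly as follows. The module and <CODE>oper</CODE> names are
hypothetical. Two things here are assumptions rather than facts stated in
this document: the verb forms chosen, and the use of an empty string for a
complement that takes no preposition.
</P>
<PRE>
    resource ExampleV3Rus = open CatRus, ParadigmsRus in {

    flags  coding=utf8 ;

    oper
      slozhit_V : V = mkV perfective second
        "слож" "у" "сложил" "сложи" "сложить" ;

      -- first complement: no preposition, accusative ("письмо")
      -- second complement: "в" with the accusative ("в конверт")
      slozhit_V3 : V3 = mkV3 slozhit_V "" "в" accusative accusative ;
    }
</PRE>
<P></P>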
<P>
The definitions need not concern the user of the API, so they are
hidden from this document.
</P>

<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
<!-- cmdline: txt2tags -thtml -\-toc russian/ParadigmsRus.txt -->
</BODY></HTML>