mirror of
https://github.com/GrammaticalFramework/gf-core.git
synced 2026-04-09 04:59:31 -06:00
started grammar description text
5
lib/doc/languages/Makefile
Normal file
@@ -0,0 +1,5 @@
all: english

english:
	txt2tags -thtml --toc gf-english.txt
183
lib/doc/languages/gf-english.txt
Normal file
@@ -0,0 +1,183 @@
English: A Digital Grammar
Aarne Ranta
%%date


%!postproc(tex) : "#BECE" "begin{center}"
%!postproc(html) : "#BECE" "<center>"
%!postproc(tex) : "#ENCE" "end{center}"
%!postproc(html) : "#ENCE" "</center>"


**Digital grammars** are grammars usable by computers, so that they can mechanically perform
tasks like interpreting, producing, and translating languages. The **GF Resource Grammar Library**
(RGL) is a set of digital grammars which, at the time of writing, covers 28 languages. These grammars
are written in GF, **Grammatical Framework**, which is a programming language designed for
writing digital grammars.

The grammars in the RGL have been written by linguists, computer scientists, and
programmers who know the languages thoroughly, both in practice and in theory. Almost 50 people from
around the world have contributed to this work, and ongoing projects are expected to give us many new
languages soon.

The leading idea of the RGL is that different languages share large parts of their grammars, despite
their observed differences. One important thing they share is the set of **categories**, that is, the
types of words and expressions. For instance, every language in the RGL has a category of **nouns**, but
what exactly a noun is varies from language to language. Thus English nouns have four forms
(singular and plural, nominative and genitive, as in //house, houses, house's, houses'//),
whereas French nouns have just two forms (singular and plural, //maison, maisons//, "house"), but they also
have a piece of information that English nouns don't have, namely gender (masculine or feminine).
Chinese nouns have just one form (房子 //fangzi// "house"), which is used for both singular and plural, but in
addition, a little bit like the French gender, they have a **classifier** (间 //jian// for the word
"house"). German nouns have 8 forms and a gender, Finnish nouns have 26 forms, and so on.
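
The variation just described can be pictured as one record per noun: a table of forms plus some fixed extra information. The following Python sketch is only an illustration of that picture, not the way the RGL is written in GF; the class ``Noun`` and its field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Noun:
    """Illustrative noun entry (hypothetical, not RGL code)."""
    # Maps a tuple of inflectional feature values to a word form.
    forms: dict
    # Information that the noun carries without changing its form.
    inherent: dict = field(default_factory=dict)


# English: four forms (number x case), no inherent features.
house_eng = Noun(forms={
    ("singular", "nominative"): "house",
    ("plural", "nominative"): "houses",
    ("singular", "genitive"): "house's",
    ("plural", "genitive"): "houses'",
})

# French: two forms (number) and an inherent gender.
house_fre = Noun(
    forms={("singular",): "maison", ("plural",): "maisons"},
    inherent={"gender": "feminine"},
)

# Chinese: one form and an inherent classifier.
house_chi = Noun(forms={(): "房子"}, inherent={"classifier": "间"})

for language, noun in [("English", house_eng),
                       ("French", house_fre),
                       ("Chinese", house_chi)]:
    print(language, len(noun.forms), noun.inherent)
```

The category is the same in all three cases; only the shape of the form table and the inherent information differ.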



+Lexical categories+

Categories of words are called **lexical categories**.
The language-specific variation in lexical categories is due to **morphology**, that is, the different forms that
one and the same word can have in different contexts. If we look at the 28 languages in the RGL, we can
see that the classification of words is common to all the languages, and the
differences are in morphology. In this chapter, we will explain all lexical categories and give an overview
of their morphological aspects. Details of the morphology of each language are given in the language-specific documents.


++Main parts of speech: content words++

The most important categories of words are given in the following table. More precisely, we will give the
categories of **content words**, which, so to say, describe things and events in the real world.
Content words are distinguished from **structural words**, whose purpose is to combine words into syntactic
structures. Each category of content words may have thousands of words, and new words can be introduced
continuously; therefore, these categories are also called **open categories**. In contrast, structural
words are very few (maybe some dozens), and new ones are very seldom added.

Each category has a GF name, that is, a short symbolic name, which is the name actually used in the GF program code.
In the text we usually use the text names, but will sometimes find the GF names handy to use as well, since they
give us a short and precise way to state grammatical rules.


===Table: categories of content words===

|| GF name | text name | example | inflectional features | inherent features ||
| ``N`` | noun | //house// | number, case | gender, classifier
| ``PN`` | proper name | //Paris// | case | gender
| ``A`` | adjective | //blue// | gender, number, case, degree | position
| ``V`` | verb | //sleep// | number, person, tense, aspect, mood | subject case
| ``Adv`` | adverb | //here// | (none) | adverb type (place, time, manner)


In addition to the names and examples, the table lists the **inflectional features** and **inherent features**
typical of each category. Inflectional features are those that create different forms of words. For instance,
French nouns have forms for number (singular and plural) - or, as one often says,
French nouns are //inflected for number//. In contrast to number, the gender does not give rise to different forms
of French nouns: //maison// ("house") //is// feminine, inherently, and there is no masculine form of //maison//.
(Of course, there are some nouns that do have masculine and feminine forms, such as //étudiant, étudiante//
"male/female student", but this only applies to a minority of French nouns and shouldn't be taken as an
indication of an inflectional gender.)
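
One way to keep the two kinds of features apart is to think of inflectional features as arguments supplied by the context, and of inherent features as values stored with the word. The sketch below illustrates this in Python; ``FrenchNoun`` and ``inflect`` are hypothetical names chosen for this example and are not part of the RGL.

```python
from typing import NamedTuple


class FrenchNoun(NamedTuple):
    """Illustrative French noun entry (hypothetical, not RGL code)."""
    singular: str
    plural: str
    gender: str  # inherent: stored once, never chosen by the context


def inflect(noun: FrenchNoun, number: str) -> str:
    """Select a form by the inflectional feature that the context supplies."""
    if number == "singular":
        return noun.singular
    if number == "plural":
        return noun.plural
    raise ValueError(f"unknown number: {number}")


maison = FrenchNoun("maison", "maisons", gender="feminine")

print(inflect(maison, "singular"))  # maison
print(inflect(maison, "plural"))    # maisons
print(maison.gender)                # feminine
```

Note that ``inflect`` has a parameter for number but none for gender: there is nothing to select, since the gender belongs to the noun itself.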


++Syntactic implications++

The features given in the table are rough indications of what one can expect in different languages. Thus,
for instance, some languages have no gender at all, and hence their nouns and adjectives won't have
genders either. But the table is a rather good generalization from the 28 languages of the RGL: we can
safely say that, if a language //does// have gender, then nouns have an inherent gender and adjectives have
a variable gender. This is not a coincidence but has to do with **syntax**, that is, the combination of words
into complex expressions. Thus, for instance, nouns are combined with adjectives that modify them, so that
#BECE
//blue// + //house// = //blue house//
#ENCE
Now, adjectives have to be combinable with all nouns, independently of the gender of the noun: there are no
separate classes of masculine and feminine adjectives (again, with some apparent exceptions, such as //pregnant//,
but even these adjectives have at least grammatically correct metaphoric uses with nouns of other genders).
This means that we must be able to pick the gender of the adjective in agreement with the gender of the noun
that it modifies, which means that the gender of adjectives must be inflectional. Thus in French the adjective
for "blue" is //bleu//, with the feminine form //bleue//, and works as follows:
#BECE
//bleu// + //maison// = //maison bleue// ("blue house", feminine)

//bleu// + //livre// = //livre bleu// ("blue book", masculine)
#ENCE
French also provides examples of adjectives with different **positions**: //bleu// is put after the noun
it modifies, whereas //vieux// ("old") is put before the noun: //vieux livre// ("old book").
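
The interplay of inherent gender, inflectional gender, and position can be written out as a small function. This is a minimal sketch in Python under simplifying assumptions: it covers the singular only, and the names ``Noun``, ``Adjective``, and ``modify`` are hypothetical and do not come from the RGL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Noun:
    form: str     # singular form only, to keep the sketch small
    gender: str   # inherent feature


@dataclass(frozen=True)
class Adjective:
    masculine: str  # gender is inflectional: one form per gender
    feminine: str
    position: str   # inherent feature: "pre" or "post"


def modify(adjective: Adjective, noun: Noun) -> str:
    """Combine an adjective with a noun, agreeing in gender."""
    if noun.gender == "feminine":
        adjective_form = adjective.feminine
    else:
        adjective_form = adjective.masculine
    if adjective.position == "pre":
        return f"{adjective_form} {noun.form}"
    return f"{noun.form} {adjective_form}"


bleu = Adjective("bleu", "bleue", position="post")
vieux = Adjective("vieux", "vieille", position="pre")
maison = Noun("maison", "feminine")
livre = Noun("livre", "masculine")

print(modify(bleu, maison))   # maison bleue
print(modify(bleu, livre))    # livre bleu
print(modify(vieux, livre))   # vieux livre
```

The noun passes its gender to the adjective, and the adjective decides its own position; neither word needs to know anything else about the other.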

We will return to syntax later. At this point, it is sufficient to say that the morphological features of
words are not there just for nothing, but they play an important role in how words are combined in syntax.
In particular, they determine to a great extent how **agreement** works, that is, how the features of
words depend on each other in combinations.


++Subcategorization++

In addition to the features needed for inflection and agreement, the lexicon must give information about //what//
combinations are possible with each word. For most nouns and adjectives, this is simple: a noun can be modified
by an adjective, for instance, and there is a uniform syntax rule for this. However, there are some nouns and adjectives
that are trickier, because they don't correspond to simple things but to **relations**. For instance, //brother// is
a **relational noun**, since its primary usage is not alone but in phrases like //brother of this man//.
In the same way, //similar// is a **relational adjective**, since its primary use is in phrases like
//similar to this//. The additional term attached to such a word is called its **complement**; thus //this// is
the complement in //similar to this//.
The categories of words that take complements are called **subcategories**. They are morphologically similar to
the main categories, but need extra information for the usage of complements.

The RGL has categories for relational nouns and adjectives, and nouns also have a variant with two complements
(e.g. //distance from Paris to Munich//).
From the logical point of view, the arguments of such a word occupy **places**, and the number of places
is one plus the number of complements. Hence, for instance, ``N2`` is a **two-place noun**, and
in a phrase like
#BECE
//John is a brother of Mary//,
#ENCE
//John// occupies the "first place" and //Mary// occupies the "second place". This terminology is ultimately
borrowed from logic, where this phrase is represented as the application of a **two-place predicate**,
#BECE
//brother//(//John//,//Mary//).
#ENCE
Ordinary nouns (``N``) have one place, and could therefore in principle be called ``N1``.
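
The counting rule and the logical notation can be checked with a few lines of Python. The table ``COMPLEMENTS`` and the functions ``places`` and ``predicate`` are hypothetical helpers written for this illustration only; the category names are the GF names used in the text.

```python
# Number of complements per category, as described in the text.
COMPLEMENTS = {"N": 0, "N2": 1, "N3": 2, "A": 0, "A2": 1}


def places(category: str) -> int:
    """The number of places is one plus the number of complements."""
    return 1 + COMPLEMENTS[category]


def predicate(head: str, *arguments: str) -> str:
    """Write a word and its arguments in logical notation."""
    return f"{head}({','.join(arguments)})"


print(places("N"))    # 1
print(places("N2"))   # 2
print(places("N3"))   # 3
print(predicate("brother", "John", "Mary"))  # brother(John,Mary)
```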

The following table shows the categories of relational nouns and adjectives in the RGL. The inflectional and
inherent features are the same as for one-place nouns and adjectives, but for each complement, the lexicon
must tell what preposition, if any, is needed to attach that complement. For instance, the preposition for
//similar// is //to//, whereas the preposition for //different// is //from//. In languages with richer case
systems (such as German, Latin, and Finnish), the complement information also determines the case (genitive,
dative, ablative, and so on).


===Table: subcategories of nouns and adjectives===

|| GF name | text name | example | inherent complement features ||
| ``N2`` | two-place noun | //brother// (//of someone//) | case or preposition
| ``N3`` | three-place noun | //distance// (//from some place to some place//) | case or preposition
| ``A2`` | two-place adjective | //similar// (//to something//) | case or preposition
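
To make the role of the complement information concrete, here is a minimal Python sketch for English, where the lexicon entry stores the preposition together with the word. The class ``Relational`` and the function ``with_complement`` are hypothetical names for this example; case selection, as needed for German, Latin, or Finnish, is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relational:
    """Illustrative entry for a two-place noun or adjective."""
    form: str
    preposition: str  # inherent complement feature; "" if none is needed


def with_complement(word: Relational, complement: str) -> str:
    """Attach the complement by using the preposition stored in the entry."""
    parts = [word.form, word.preposition, complement]
    return " ".join(part for part in parts if part)


similar = Relational("similar", "to")
different = Relational("different", "from")
brother = Relational("brother", "of")

print(with_complement(similar, "this"))        # similar to this
print(with_complement(different, "that"))      # different from that
print(with_complement(brother, "this man"))    # brother of this man
```

The syntax rule is the same for all three words; the differences are confined to the lexicon entries.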


Verbs show a particularly rich variation in subcategorization. The most familiar distinction is the one between
**intransitive** and **transitive** verbs: intransitive verbs need only a **subject** (like //she// in //she sleeps//),
whereas transitive verbs also need an **object** (like //him// in //she loves him//). Our category ``V`` obviously includes
intransitive verbs. But there is no category for transitive verbs in the RGL. Instead, we have a more general category of
**two-place verbs**, which includes transitive verbs but also verbs that need a preposition (such as //at// in
//she looks at him//). Just like for relational nouns and adjectives, the complement of a two-place verb has variations
in cases and prepositions.
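
The idea that transitive verbs are just a special case of two-place verbs can be sketched as follows. This is an English-only illustration in Python with a single verb form; the class ``V2`` here is a hypothetical stand-in that borrows the GF category name and is not the RGL definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class V2:
    """Illustrative two-place verb: a verb form plus complement information."""
    third_singular: str
    preposition: str = ""  # empty for ordinary transitive verbs


def clause(subject: str, verb: V2, complement: str) -> str:
    """Build a simple present-tense clause with a third person singular subject."""
    parts = [subject, verb.third_singular, verb.preposition, complement]
    return " ".join(part for part in parts if part)


love = V2("loves")                      # transitive: no preposition
look = V2("looks", preposition="at")    # needs a preposition

print(clause("she", love, "him"))   # she loves him
print(clause("she", look, "him"))   # she looks at him
```

A separate category of transitive verbs would add nothing here: it would be the same record with the preposition always empty.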

The following table shows the subcategories of verbs in the RGL. The list is long but it may still be incomplete. For
example, there are no four-place verbs (//she paid him one million pounds for the house//). Such constructions can
be built, as we will see later, by using for instance a ``V3`` verb with an additional adverb. But we can envisage
future additions of more subcategories for verbs.


===Table: subcategories of verbs===

|| GF name | text name | example | inherent complement features ||
| ``V2`` | two-place verb | //love// (//someone//) | case or preposition
| ``V3`` | three-place verb | //give// (//something to someone//) | two cases or prepositions
| ``VV`` | verb-complement verb | //try// (//to do something//) | infinitive form
| ``VS`` | sentence-complement verb | //know// (//that something happens//) | sentence mood
| ``VQ`` | question-complement verb | //ask// (//what happens//) | question mood
| ``VA`` | adjective-complement verb | //become// (//something, e.g. old//) | adjective case
| ``V2V`` | two-place verb-complement verb | //force// (//someone to do something//) | infinitive form, control type
| ``V2S`` | two-place sentence-complement verb | //tell// (//someone that something happens//) | object case, sentence mood
| ``V2Q`` | two-place question-complement verb | //ask// (//someone what happens//) | object case, question mood
| ``V2A`` | two-place adjective-complement verb | //paint// (//something in some colour, e.g. blue//) | object and adjective case