#LyX 2.0 created this file. For more info see http://www.lyx.org/
\lyxformat 413
\begin_document
\begin_header
\textclass article
\begin_preamble
\usepackage{times}
\end_preamble
\use_default_options true
\maintain_unincluded_children false
\language english
\language_package default
\inputencoding auto
\fontencoding global
\font_roman default
\font_sans default
\font_typewriter default
\font_default_family default
\use_non_tex_fonts false
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100

\graphics default
\default_output_format default
\output_sync 0
\bibtex_command default
\index_command default
\paperfontsize 11
\spacing single
\use_hyperref true
\pdf_bookmarks true
\pdf_bookmarksnumbered false
\pdf_bookmarksopen false
\pdf_bookmarksopenlevel 1
\pdf_breaklinks false
\pdf_pdfborder false
\pdf_colorlinks true
\pdf_backref false
\pdf_pdfusetitle true
\papersize a4paper
\use_geometry false
\use_amsmath 1
\use_esint 1
\use_mhchem 1
\use_mathdots 1
\cite_engine natbib_authoryear
\use_bibtopic false
\use_indices false
\paperorientation portrait
\suppress_date false
\use_refstyle 1
\index Index
\shortcut idx
\color #008000
\end_index
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\paragraph_indentation default
\quotes_language english
\papercolumns 1
\papersides 1
\paperpagestyle default
\tracking_changes false
\output_changes false
\html_math_output 0
\html_css_as_file 0
\html_be_strict false
\end_header

\begin_body

\begin_layout Title
A Bilingual Treebank for the FraCaS Test Suite
\begin_inset Newline newline
\end_inset

CLT Project Report
\end_layout

\begin_layout Author
Peter Ljunglöf and Magdalena Siverbo
\begin_inset Newline newline
\end_inset

Centre for Language Technology
\begin_inset Newline newline
\end_inset

University of Gothenburg
\begin_inset Newline newline
\end_inset

E-mail:
\begin_inset Flex URL
status open

\begin_layout Plain Layout

peter.ljunglof@gu.se
\end_layout

\end_inset


\end_layout

\begin_layout Date
31st October, 2011
\end_layout

\begin_layout Abstract
\noindent
We have created a bilingual treebank for 99% of the sentences in the FraCaS
 test suite.
 The treebank is built together with an associated bilingual English-Swedish
 lexicon written in the Grammatical Framework Resource Grammar.
 The original FraCaS sentences are English, and we have tested the multilinguali
ty of the Resource Grammar by analysing the grammaticality and naturalness
 of the Swedish translations.
 86% of the sentences are grammatically and semantically correct and sound
 natural.
 About 10% can probably be fixed by adding new lexical items or grammatical
 rules, and only a small number are considered difficult to fix.
\end_layout

\begin_layout Standard
\begin_inset ERT
status open

\begin_layout Plain Layout


\backslash
thispagestyle{empty}
\end_layout

\end_inset


\end_layout

\begin_layout Section
Introduction
\end_layout

\begin_layout Standard
In this project we have created a bilingual treebank for the FraCaS test
 suite
\begin_inset CommandInset citation
LatexCommand citep
key "CooperCrouchEijck1996:Using-the-Framework"

\end_inset

, using the Grammatical Framework Resource Grammar Library
\begin_inset CommandInset citation
LatexCommand citep
key "Ranta2009:The-GF-Resource-Grammar-Library,Ranta2009:Grammatical-Framework:-A-Multilingual,Ranta2011:Grammatical-Framework:-Programming"

\end_inset

.
 The project consisted of two parts that were partly interwoven.
 The first aim was to construct a treebank, which involved creating a lexicon
 and a limited grammar specific for the FraCaS test suite, parsing the sentences
 and selecting the most representative trees.
 The second aim was to build a FraCaS corpus in Swedish, using the treebank
 constructed in the first part of the project.
 This involved translating the English lexicon and grammar into Swedish
 equivalents, generating Swedish sentences for all the trees in the treebank
 and evaluating the results.
\end_layout

\begin_layout Standard
\begin_inset Newpage pagebreak
\end_inset


\end_layout

\begin_layout Subsection
The FraCaS Corpus
\end_layout

\begin_layout Standard
The FraCaS textual inference problem set
\begin_inset CommandInset citation
LatexCommand citep
key "CooperCrouchEijck1996:Using-the-Framework"

\end_inset

 was built in the mid-1990s by the FraCaS project, a large collaboration
 aimed at developing resources and theories for computational semantics.
 This test set was later modified and converted to XML by Bill MacCartney:

\end_layout

\begin_layout Standard
\noindent
\align center

\family sans
\begin_inset CommandInset href
LatexCommand href
target "http://www-nlp.stanford.edu/~wcmac/downloads/fracas.xml"

\end_inset


\end_layout

\begin_layout Standard
It is the latter, modified version that has been used in this project.
 The corpus consists of 346 problems, each containing one or more statements
 and one yes/no question (except for four problems, where there is no question).
 The total number of sentences in the corpus is 1220, but since some of
 them are repeated in several problems, there are in total 874 unique sentences.
\end_layout
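
\begin_layout Standard
As a rough illustration of how such counts can be derived, the following
 Python sketch counts problems, sentence occurrences and unique sentences
 in a FraCaS-style XML document.
 The element names (problem, p, q) and the small inline sample are assumptions
 made for illustration only; the layout of the real file may differ in detail.
\end_layout

```python
import xml.etree.ElementTree as ET

# Hypothetical two-problem sample; the element names are assumed, not verified
SAMPLE = """
<fracas-problems>
  <problem id="1">
    <p idx="1">John owns a red car.</p>
    <p idx="2">Bill owns a fast one.</p>
    <q>Does Bill own a fast red car?</q>
  </problem>
  <problem id="2">
    <p idx="1">John owns a red car.</p>
    <q>Does John own a car?</q>
  </problem>
</fracas-problems>
"""

def corpus_counts(xml_text):
    """Return (problems, sentence occurrences, unique sentences)."""
    root = ET.fromstring(xml_text)
    problems = root.findall("problem")
    sentences = []
    for problem in problems:
        for node in problem:
            # Premises (p) and questions (q) both count as sentences
            if node.tag in ("p", "q") and node.text:
                # Normalise whitespace so repeated sentences compare equal
                sentences.append(" ".join(node.text.split()))
    return len(problems), len(sentences), len(set(sentences))

print(corpus_counts(SAMPLE))
```

\begin_layout Standard
Repeated premises are counted once in the last figure, which is how the
 number of unique sentences can be lower than the total number of sentences.
\end_layout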

\begin_layout Standard
The FraCaS problems contain relatively simple sentences, and the premise
 and hypothesis sentences are usually syntactically similar.
 Despite this simplicity, the problems are intended to reflect a broad variety
 of semantic and inferential phenomena.
 For this reason, the FraCaS corpus has been used as a benchmark for evaluating
 different computational semantics systems
\begin_inset CommandInset citation
LatexCommand citep
key "MacCartneyManning2008:Modeling-semantic-containment"

\end_inset

.
\end_layout

\begin_layout Standard
The FraCaS corpus only contains made-up sentences, which are intended to
 be grammatically correct.
 Therefore we took the opportunity to correct some obvious minor mistakes,
 such as
\emph on

\begin_inset Quotes eld
\end_inset

a executive
\begin_inset Quotes erd
\end_inset


\emph default
,

\emph on

\begin_inset Quotes eld
\end_inset

does
\family typewriter
[\SpecialChar \ldots{}
]
\family default
 has
\begin_inset Quotes erd
\end_inset


\emph default
,
\emph on

\begin_inset Quotes eld
\end_inset

did
\family typewriter
[\SpecialChar \ldots{}
]
\family default
 delivered
\begin_inset Quotes erd
\end_inset


\emph default
, and
\emph on

\begin_inset Quotes eld
\end_inset

Jones's
\begin_inset Quotes erd
\end_inset


\emph default
.
 In total 7 sentences were corrected.
\end_layout

\begin_layout Standard
\begin_inset Note Note
status collapsed

\begin_layout Subsubsection
from MacCartney's thesis:
\end_layout

\begin_layout Plain Layout
The FraCaS test suite
\begin_inset CommandInset citation
LatexCommand cite
key "CooperCrouchEijck1996:Using-the-Framework"

\end_inset

(Cooper et al.
 1996) of NLI problems was one product of the FraCaS Consortium, a large
 collaboration in the mid-1990s aimed at developing a range of resources
 related to computational semantics.
 The FraCaS problems contain comparatively simple sentences, and the premise
 and hypothesis sentences are usually quite similar, so that just a few
 edits suffice to transform p into h.
 Despite this simplicity, the problems are designed to reflect a broad diversity
 of semantic and inferential phenomena.
 For this reason, the FraCaS test suite has proven to be invaluable as a
 developmental test bed for the NatLog system and as a yardstick for evaluating
 its effectiveness.
 Indeed, the test suite was created with just such an application as its
 primary goal.
 As the authors write:
\end_layout

\begin_layout Quote
In light of the view expressed elsewhere in this and other FraCaS deliverables
 ...
 that inferential ability is not only a central manifestation of semantic
 competence but is in fact centrally constitutive of it, it shouldn’t be
 a surprise that we regard inferencing tasks as the best way of testing
 an NLP system’s semantic capacity.
\end_layout

\begin_layout Subsubsection
from MacCartney & Manning (2007):
\end_layout

\begin_layout Plain Layout
The FraCaS test suite (Cooper et al., 1996) was developed as part of a
 collaborative research effort in computational semantics.
 It contains 346 inference problems reminiscent of a textbook on formal
 semantics.
 In the authors’ view, “inferencing tasks [are] the best way of testing
 an NLP system’s semantic capacity.”
\end_layout

\begin_layout Plain Layout
The problems are divided into nine sections, each focused on a category
 of semantic phenomena, such as quantifiers or anaphora (see table 2).
 Each problem consists of one or more premise sentences, followed by
 a one-sentence question.
 For this project, the questions were converted into declarative hypotheses.
\end_layout

\begin_layout Plain Layout
Each problem also has an answer, which (usually) takes one of three values:
 yes (the hypothesis can be inferred from the premise(s)), no (the negation
 of the hypothesis can be inferred), or unk (neither the hypothesis nor
 its negation can be inferred).
\end_layout

\begin_layout Subsubsection
from Mac&Mann (2008):
\end_layout

\begin_layout Plain Layout
The FraCaS test suite (Cooper et al., 1996) contains 346 NLI problems,
 divided into nine sections, each focused on a specific category of semantic
 phenomena (listed in table 3).
 Each problem consists of one or more premise sentences, a question sentence,
 and one of three answers: yes, no, or unknown.
\end_layout

\end_inset


\end_layout

\begin_layout Subsubsection
Examples from the FraCaS Corpus
\end_layout

\begin_layout Standard
The FraCaS problems are divided into 9 broad categories which cover many
 aspects of semantic inference.
 The categories are called
\emph on
quantifiers
\emph default
,
\emph on
plurals
\emph default
,
\emph on
anaphora
\emph default
,
\emph on
ellipsis
\emph default
,
\emph on
adjectives
\emph default
,
\emph on
comparatives
\emph default
,
\emph on
temporal reference
\emph default
,
\emph on
verbs
\emph default
, and
\emph on
attitudes
\emph default
, and they are also sub-categorised and sub-sub-categorised in a hierarchy
 of semantic phenomena.
 Each problem starts with one or more premises, and a question that can
 be answered with yes, no or unknown.
 Here are two similar examples with different semantic inferences from the

\emph on
anaphora
\emph default
 category:
\end_layout

\begin_layout Labeling
\labelwidthstring (999)
(135) P: Every customer who owns a computer has a service contract for it.
\begin_inset Newline newline
\end_inset

P: MFI is a customer that owns several computers.
\begin_inset Newline newline
\end_inset

Q: Does MFI have a service contract for all its computers?
\begin_inset Newline newline
\end_inset

A: Yes.
\end_layout

\begin_layout Labeling
\labelwidthstring (999)
(136) P: Every executive who had a laptop computer brought it to take notes
 at the meeting.
\begin_inset Newline newline
\end_inset

P: Smith is an executive who owns five different laptop computers.
\begin_inset Newline newline
\end_inset

Q: Did Smith take five laptop computers to the meeting?
\begin_inset Newline newline
\end_inset

A: Unknown.
\end_layout

\begin_layout Standard
Some of the problems are equivalent to each other, but with different answers
 depending on ambiguity.
 This happens for the following problem from the
\emph on
ellipsis
\emph default
 category:
\end_layout

\begin_layout Labeling
\labelwidthstring (160--161)
(160--161) P: John owns a red car.
\begin_inset Newline newline
\end_inset

P: Bill owns a fast one.
\begin_inset Newline newline
\end_inset

Q: Does Bill own a fast red car?
\begin_inset Newline newline
\end_inset

A: Yes or unknown, depending on the reading of
\begin_inset Quotes eld
\end_inset

one
\begin_inset Quotes erd
\end_inset

.
\end_layout

\begin_layout Subsection
Grammatical Framework
\end_layout

\begin_layout Standard
Grammatical Framework (GF)
\begin_inset CommandInset citation
LatexCommand citep
key "Ranta2009:Grammatical-Framework:-A-Multilingual,Ranta2011:Grammatical-Framework:-Programming"

\end_inset

 is a grammar formalism based on type theory.
 The main feature is the separation of abstract and concrete syntax.
 The abstract syntax of a grammar defines a set of abstract syntactic structures
, called abstract terms or trees; and the concrete syntax defines a relation
 between abstract structures and concrete structures.
 The concrete syntax is expressive enough to describe language-specific
 linguistic features such as word order, gender and case inflection, and
 discontinuous phrases.
 This makes it very suitable for writing multilingual grammars, where the
 abstract syntax is lifted to a more language universal level.
\end_layout

\begin_layout Subsubsection
Simple GF Example
\end_layout

\begin_layout Standard
As an example to show the possibilities of GF, we define adjectives as noun-modi
fying functions in the spirit of categorial grammar:
\end_layout

\begin_layout Description
(Abstract)
\begin_inset Formula $\mathit{green:CN\rightarrow CN}$
\end_inset


\end_layout

\begin_layout Standard
This means that
\emph on
green
\emph default
 is a grammatical construction that creates common nouns (CN) from common
 nouns (CN).
 This does not say anything about the word order, which is instead defined
 in the linearisation rules in the concrete syntax.
 In English, the adjective comes before the noun:
\end_layout

\begin_layout Description

\series bold
(English)
\series default

\begin_inset Formula $\mathit{green\; n="\! green"\,+\negmedspace\negmedspace+\:\: n}$
\end_inset


\end_layout

\begin_layout Standard
Whereas in French the adjective comes after:
\end_layout

\begin_layout Description
(French)
\begin_inset Formula $\mathit{green\; n=n\:+\negmedspace\negmedspace+\:\:"\! vert"}$
\end_inset


\end_layout

\begin_layout Standard
But since French adjectives are inflected by number and gender, this is
 only correct for singular masculine nouns.
 That is why GF concrete syntax has support for inflection tables, inherent
 attributes and discontinuous constituents, which makes the formalism as
 expressive as Multiple Context-Free Grammars
\begin_inset CommandInset citation
LatexCommand citep
key "Ljunglof2004:Expressivity-and-Complexity-of-GF"

\end_inset

.
 A slightly more correct French variant of the adjective
\emph on
green
\emph default
 would then be:
\end_layout

\begin_layout Description

\series bold
(French)
\series default

\begin_inset Formula $\mathit{green\; n=\mathbf{table}\left\{ \begin{array}{l}
Sg\:\Rightarrow\: n\,!\, Sg\:+\negmedspace\negmedspace+\:\:"\! vert"\\
Pl\:\Rightarrow\: n\,!\, Pl\:+\negmedspace\negmedspace+\:\:"\! verts"
\end{array}\right\} }$
\end_inset


\end_layout

\begin_layout Standard
This still does not handle feminine nouns, although that is of course possible.
 Even better is to make use of the GF Resource Grammar, where all these
 inflection paradigms are already defined.
\end_layout
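
\begin_layout Standard
The separation between abstract and concrete syntax described above can
 be mimicked in a few lines of Python.
 This is only a sketch and not how GF itself is implemented: the class names
 (Noun, Green), the functions (lin_eng, lin_fre, gender_fre) and the toy
 lexicons are all hypothetical.
 It shows one abstract tree being linearised with different word order in
 English and French, with the French adjective agreeing in number and in
 the inherent gender of the noun.
\end_layout

```python
from dataclasses import dataclass

# Abstract syntax: a noun constant and the modifier green : CN -> CN
@dataclass(frozen=True)
class Noun:
    name: str  # hypothetical lexical identifier, e.g. "car"

@dataclass(frozen=True)
class Green:
    arg: object  # the common noun (Noun or Green) being modified

# Toy lexicons, assumed for illustration: inflection tables indexed by number,
# and for French also an inherent gender for each noun
ENG_NOUNS = {
    "car": {"Sg": "car", "Pl": "cars"},
    "book": {"Sg": "book", "Pl": "books"},
}
FRE_NOUNS = {
    "car": ({"Sg": "voiture", "Pl": "voitures"}, "Fem"),
    "book": ({"Sg": "livre", "Pl": "livres"}, "Masc"),
}
FRE_GREEN = {
    ("Masc", "Sg"): "vert", ("Masc", "Pl"): "verts",
    ("Fem", "Sg"): "verte", ("Fem", "Pl"): "vertes",
}

def lin_eng(tree, number):
    """English concrete syntax: the adjective precedes the noun."""
    if isinstance(tree, Noun):
        return ENG_NOUNS[tree.name][number]
    return "green " + lin_eng(tree.arg, number)

def gender_fre(tree):
    """Gender is inherent in the head noun and passes up through modifiers."""
    if isinstance(tree, Noun):
        return FRE_NOUNS[tree.name][1]
    return gender_fre(tree.arg)

def lin_fre(tree, number):
    """French concrete syntax: the adjective follows the noun and agrees with it."""
    if isinstance(tree, Noun):
        return FRE_NOUNS[tree.name][0][number]
    adjective = FRE_GREEN[(gender_fre(tree), number)]
    return lin_fre(tree.arg, number) + " " + adjective

tree = Green(Noun("car"))
print(lin_eng(tree, "Pl"))
print(lin_fre(tree, "Pl"))
```

\begin_layout Standard
The same tree object is passed to both linearisation functions; only the
 concrete rules and the lexicon differ between the two languages.
\end_layout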

\begin_layout Subsubsection
The GF Resource Grammar
\end_layout

\begin_layout Standard
GF has a rich module system which facilitates grammar writing as an engineering
 task, by reusing common grammars.
 The abstract syntax of one grammar can be used as a concrete syntax of
 another grammar.
 This makes it possible to implement grammar resources to be used in several
 different application domains.
 These points are currently exploited in the GF Resource Grammar Library

\begin_inset CommandInset citation
LatexCommand citep
key "Ranta2009:The-GF-Resource-Grammar-Library,Ranta2011:Grammatical-Framework:-Programming"

\end_inset

, which is a multilingual GF grammar with a common abstract syntax for 20
 languages, including Finnish, Persian, Russian and Urdu.
 The main purpose of the Grammar Library is as a resource for writing domain-spe
cific grammars.
\end_layout

\begin_layout Standard
Now we can define the French and English linearisations for the adjective
 functions using the resource grammar, which then takes care of all kinds
 of inflection:
\end_layout

\begin_layout Description
(French)
\begin_inset Formula $\mathit{green\; n=AdjCN\:(PositA\:(mkA\;"\! vert"))\: n}$
\end_inset


\end_layout

\begin_layout Description
(English)
\begin_inset Formula $\mathit{green\; n=AdjCN\:(PositA\:(mkA\;"\! green"))\: n}$
\end_inset


\end_layout

\begin_layout Standard
Here
\emph on
AdjCN
\emph default
 is a function that modifies a common noun with an adjective phrase,
\emph on
PositA
\emph default
 uses the positive form of an adjective, and
\emph on
mkA
\emph default
 creates all possible inflections of a regular adjective.
 Note that the structures of the English and French linearisations are the
 same, except for the lexical entries, and this can be exploited in GF by
 creating a language-independent concrete syntax.
 The FraCaS treebank is language-independent in this sense, since the tree
 for each sentence is the same for both English and Swedish.
\end_layout

\begin_layout Section
The English Treebank
\end_layout

\begin_layout Subsection
The FraCaS Grammar
\end_layout

\begin_layout Standard
To be able to construct a GF treebank we need a grammar and a lexicon that
 can describe every sentence in the corpus.
 We have used the GF Resource Grammar as underlying grammar, and added lexical
 items that capture the FraCaS domain.
 On top of the resource grammar we have added a few new grammatical construction
s, as well as functions for handling elliptic phrases.

\end_layout

\begin_layout Standard
In total, we used 107 grammatical functions out of the 189 that are defined
 in the resource grammar.
 In addition we added four new grammatical constructions that were lacking,
 and 7 different elliptic phrases.
\end_layout

\begin_layout Standard
\begin_inset Note Note
status collapsed

\begin_layout Plain Layout
In order to construct the treebank for FraCaS, two modules were written,
 one lexicon module and one grammar module.
\end_layout

\begin_layout Subsubsection
Lexicon module
\end_layout

\begin_layout Plain Layout
The FraCaS lexicon module consists of an abstract and a concrete part.
\end_layout

\begin_layout Description
FraCaSLex Abstract lexicon for the FraCaS test suite
\end_layout

\begin_layout Description
FraCaSLexEng Concrete lexicon for the FraCaS test suite
\end_layout

\begin_layout Plain Layout
The lexicon was built using the functions mkN, mkA, mkV etc, mainly from
 the Paradigms module.
\end_layout

\begin_layout Subsubsection
Grammar module
\end_layout

\begin_layout Plain Layout
The FraCaS grammar module consists of an abstract and a concrete part.
\end_layout

\begin_layout Description
FraCaS Abstract grammar for the FraCaS test suite
\end_layout

\begin_layout Description
FraCaSEng Concrete grammar for the FraCaS test suite
\end_layout

\begin_layout Plain Layout
Initially, the whole Grammar module from the resource grammar was imported,
 but in the end only parts of the Grammar module (namely Noun, Verb, Adjective,
 Adverb, Numeral and Tense) were imported, while other parts were opened
 and necessary functions used in the FraCaS module.
 A few functions were added, mainly on clause and sentence level, in order
 to simplify the tree structures.
\end_layout

\end_inset


\end_layout

\begin_layout Subsubsection
Lexicon
\end_layout

\begin_layout Standard
The lexicon has in total 531 entries, some of which are structural words
 already defined in the resource grammar.
 Some of the lexical items denote different meanings of the same word.
 Examples of this include the word
\emph on

\begin_inset Quotes eld
\end_inset

than
\begin_inset Quotes erd
\end_inset


\emph default
 which can function as a preposition and as a subjunction, the verb
\emph on

\begin_inset Quotes eld
\end_inset

go
\begin_inset Quotes erd
\end_inset


\emph default
 which can mean
\emph on

\begin_inset Quotes eld
\end_inset

travel
\begin_inset Quotes erd
\end_inset


\emph default
 or
\emph on

\begin_inset Quotes eld
\end_inset

walk
\begin_inset Quotes erd
\end_inset


\emph default
, and the conjunction
\emph on

\begin_inset Quotes eld
\end_inset

and
\begin_inset Quotes erd
\end_inset


\emph default
 which can be a phrase-initial conjunction or an ordinary conjunction.
 Other entries denote different valencies of the same meaning.
 This is most common for verbs, such as the transitive verb
\emph on

\begin_inset Quotes eld
\end_inset

finish
\begin_inset Quotes erd
\end_inset


\emph default
 which can take a noun phrase or a verb phrase argument, and the verb
\emph on

\begin_inset Quotes eld
\end_inset

know
\begin_inset Quotes erd
\end_inset


\emph default
 which can take either a question or a sentence as argument.
\end_layout

\begin_layout Standard
The lexicon entries are divided into 63 adjectives, 77 adverbials, 20 conjunctio
ns/subjunctions, 34 determiners, 142 nouns, 19 numerals, 40 proper nouns,
 15 prepositions, 12 pronouns, and 109 verbs.
 Out of these, 55 adverbials and 28 nouns/proper nouns are multi-word expression
s.
\end_layout
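
\begin_layout Standard
As a quick consistency check, the category counts above can be summed in
 a short Python sketch.
 The dictionary names are chosen here for illustration; the figures are
 copied from the text.
\end_layout

```python
# Entry counts per category, copied from the paragraph above
LEXICON = {
    "adjectives": 63,
    "adverbials": 77,
    "conjunctions/subjunctions": 20,
    "determiners": 34,
    "nouns": 142,
    "numerals": 19,
    "proper nouns": 40,
    "prepositions": 15,
    "pronouns": 12,
    "verbs": 109,
}

# Multi-word expressions among those entries
MULTI_WORD = {"adverbials": 55, "nouns/proper nouns": 28}

total_entries = sum(LEXICON.values())
total_multi_word = sum(MULTI_WORD.values())
print(total_entries, total_multi_word)
```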

\begin_layout Subsubsection
Multi-word Lexical Items
\begin_inset CommandInset label
LatexCommand label
name "sub:Multi-word-Lexical-Items"

\end_inset


\end_layout

\begin_layout Standard
83 of the lexical items denote multi-word phrases.
 They were mainly divided into two types:
\end_layout

\begin_layout Standard
\begin_inset Note Note
status collapsed

\begin_layout Itemize
P: Modified proper nouns (A + PN) could not be parsed.
\begin_inset Newline newline
\end_inset

S: “southern Europe” was defined as PN in FraCaSLex.
\end_layout

\begin_layout Itemize
P: Compounds constructed from a proper noun and a noun (PN + N) , and hyphenated
 nouns (N-N) could not be parsed.
\begin_inset Newline newline
\end_inset

S: “Labour MP”, “APCOM manager”, “stock-market” etc.
 were defined as N in FraCaSLex.
\end_layout

\begin_layout Itemize
(SKIP) P: Certain indefinite pronouns were not recognized as they did not
 exist in the resource grammar.
\begin_inset Newline newline
\end_inset

S: “all”, “anyone”, “everyone”, “no one” and “someone” were defined as NP
 in FraCaSLex.
\end_layout

\end_inset


\begin_inset Note Note
status collapsed

\begin_layout Paragraph
Quantifiers
\end_layout

\begin_layout Itemize
P: Numbers written without spaces between the digits were not recognized.
\begin_inset Newline newline
\end_inset

S: “10”, “99”, “100”, “2500” etc.
 defined as Det in FraCaSLex.
\end_layout

\begin_layout Itemize
P: Certain longer numerical expressions could not be parsed.
\begin_inset Newline newline
\end_inset

S: “one or more”, “the other 99” and “two out of ten” were defined as Det
 in FraCaSLex.
\end_layout

\begin_layout Itemize
P: Certain quantifiers were not recognized as they did not exist in the
 resource grammar.
\begin_inset Newline newline
\end_inset

S: “a few”, “both”, “either”, “most of the”, “several” etc.
 were defined as Det in FraCaSLex.
\end_layout

\begin_layout Paragraph
Conjunctions
\end_layout

\begin_layout Itemize
P: Sentences starting with a conjunction could not be parsed.
\begin_inset Newline newline
\end_inset

S: The functions SentencePAnd and SentencePBut were added in FraCaS.
\end_layout

\begin_layout Itemize
P: Conjunctions preceded by comma or semicolon could not be parsed.
\begin_inset Newline newline
\end_inset

S: “, and” and “; and” were defined as Conj in FraCaSLex.
\end_layout

\end_inset


\end_layout

\begin_layout Description
Compounds Compound noun phrases such as
\emph on

\begin_inset Quotes eld
\end_inset

southern Europe
\begin_inset Quotes erd
\end_inset


\emph default
 (adjective + proper noun),
\emph on

\begin_inset Quotes eld
\end_inset

APCOM manager
\begin_inset Quotes erd
\end_inset


\emph default
 (proper noun + noun) and
\emph on

\begin_inset Quotes eld
\end_inset

university student
\begin_inset Quotes erd
\end_inset


\emph default
 (noun + noun) were problematic,
 partly because the Resource Grammar currently cannot handle all kinds of
 compounding, but mostly because many of the corresponding Swedish phrases
 are single compound words.
 In total there were 28 multi-word compounds, divided between nouns, proper
 nouns and adjectives.
\end_layout

\begin_layout Description
Time
\begin_inset space ~
\end_inset

and
\begin_inset space ~
\end_inset

Date
\begin_inset space ~
\end_inset

Expressions Time and date expressions were problematic for different reasons.
 First, although a generic multilingual time and date resource grammar is
 in the making, it is not finished yet.
 Second, different languages use different syntactic constructions for times
 and dates.
 The use of prepositions in particular differs a lot:
\emph on

\begin_inset Quotes eld
\end_inset

in 1990
\begin_inset Quotes erd
\end_inset


\emph default
,
\emph on

\begin_inset Quotes eld
\end_inset

in February
\begin_inset Quotes erd
\end_inset


\emph default
 and
\emph on

\begin_inset Quotes eld
\end_inset

in two years
\begin_inset Quotes erd
\end_inset


\emph default
 are translated into Swedish as
\emph on

\begin_inset Quotes eld
\end_inset

1990
\begin_inset Quotes erd
\end_inset


\emph default
,
\emph on

\begin_inset Quotes eld
\end_inset

i februari
\begin_inset Quotes erd
\end_inset


\emph default
 and
\emph on

\begin_inset Quotes eld
\end_inset

om två år
\begin_inset Quotes erd
\end_inset


\emph default
, respectively.
 For these reasons, we have defined all time and date expressions as multi-word
 adverbials.
 In total we defined 55 different time and date phrases.
\end_layout

\begin_layout Subsubsection
Grammar Additions
\end_layout

\begin_layout Standard
Three different grammatical constructions were added to the grammar.
 They consist of natural extensions to and slight modifications of existing
 functions.
 The intention is that they will be added to the resource grammar in the
 near future.
 Examples include the idiom
\emph on

\begin_inset Quotes eld
\end_inset

so do I
\begin_inset Quotes erd
\end_inset


\emph default
 /
\emph on

\begin_inset Quotes eld
\end_inset

so did she
\begin_inset Quotes erd
\end_inset


\emph default
, and question adverbials such as
\emph on

\begin_inset Quotes eld
\end_inset

if Smith signed the contract, did Jones sign the contract?
\begin_inset Quotes erd
\end_inset


\emph default
.
\end_layout

\begin_layout Subsubsection
Elliptic Phrases
\end_layout

\begin_layout Standard
The resource grammar cannot handle all kinds of conjunctions and elliptical
 phrases.
 In the FraCaS corpus there are 35 sentences with more advanced elliptical
 constructions.
 Examples include
\emph on

\begin_inset Quotes eld
\end_inset

Bill did
\family typewriter
[\SpecialChar \ldots{}
]
\family default
 too
\begin_inset Quotes erd
\end_inset


\emph default
, and
\emph on

\begin_inset Quotes eld
\end_inset

Smith saw Jones sign the contract and
\family typewriter
[\SpecialChar \ldots{}
]
\family default
 his secretary make a copy
\begin_inset Quotes erd
\end_inset


\emph default
.
 Our solution was to introduce empty phrases, one for each grammatical category.
 E.g., in the first example, the ellipsis is an empty verb phrase, and the
 longer example contains an empty ditransitive verb.
\end_layout

\begin_layout Subsection
Coverage
\end_layout

\begin_layout Standard
Of the 874 unique sentences, 812 could be parsed directly with the Resource
 Grammar and the implemented lexicon, as shown in table
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:coverage"

\end_inset

.
 With the three additional grammatical constructions 14 more sentences were
 parsed.
 The addition of elliptical phrases increased the number of sentences by
 another 34.
 Of the 14 remaining sentences, we could parse 6 more by doing some minor
 reformulations, such as moving a comma or adding a preposition.

\end_layout
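
\begin_layout Standard
The cumulative coverage figures follow directly from the counts given above.
 The Python sketch below reproduces the arithmetic; the function and variable
 names are chosen here for illustration.
\end_layout

```python
TOTAL = 874  # unique sentences in the corpus

# (step, number of additional sentences parsed at that step), as reported above
STEPS = [
    ("Accepted by the RG", 812),
    ("with grammar extensions", 14),
    ("with elliptic phrases", 34),
    ("with slight reformulation", 6),
]

def cumulative_coverage(total, steps):
    """Return (step, cumulative count, percentage of total) for each step."""
    rows, covered = [], 0
    for label, added in steps:
        covered += added
        rows.append((label, covered, round(100 * covered / total, 1)))
    return rows

rows = cumulative_coverage(TOTAL, STEPS)
for label, covered, pct in rows:
    print(f"{label:28s}{covered:5d}{pct:7.1f}%")

# Sentences that remain unparsed after all steps
unparsed = TOTAL - rows[-1][1]
print("Unable to parse:", unparsed)
```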

\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status open

\begin_layout Plain Layout
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="7" columns="3">
<features tabularvalignment="middle">
<column alignment="left" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Total
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
% of sentences
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Unique sentences
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
874
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
100%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Accepted by the RG
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
812
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
92.9%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
- with grammar extensions
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
826
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
94.5%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
- with elliptic phrases
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
860
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
98.4%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
- with slight reformulation of sentence
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
866
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
99.1%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Unable to parse
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
8
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
0.9%
\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
The coverage of the English FraCaS grammar
\begin_inset CommandInset label
LatexCommand label
name "tab:coverage"

\end_inset


\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
\begin_inset Note Note
status collapsed

\begin_layout Plain Layout
Grammatical extensions: RelNP_nocomma, SoDoI, ExtAdvQS, ConjQS.
\end_layout

\begin_layout Plain Layout
Note that these statistics are very strict in the sense that punctuation (in
 particular commas) is included and has to be handled by the grammar.
\end_layout

\begin_layout Plain Layout
After having taken measures to solve the problems described in section 2.2,
 the parsing rate was 84.6%.
 Some of these sentences could be parsed but returned no representative
 trees, which gave a lower percentage of correctly parsed sentences (83.2%).
 There were various reasons why certain sentences could not be parsed, with
 varying degrees of severity.
 The table below shows the results after changing the corpus by giving substitutions
 for problematic sentences on each of these levels.
 In each cell, the first number is the number of sentences out of 1220, followed
 by the corresponding percentage.
\end_layout

\begin_layout Plain Layout
These are explanations for the different levels:
\end_layout

\begin_layout Enumerate
the original corpus with no changes.
\end_layout

\begin_layout Enumerate
substitution for simple spelling or grammar mistakes, such as double punctuation
 or incorrect verb forms.
 The change also involved using only uncontracted negation, for the sake
 of conformity and simplicity.
 There were only a few sentences of these types, so changing them did not
 make a major difference to the results.
\end_layout

\begin_layout Enumerate
rewriting of certain constructions that could not be handled by the parser.
 These were constructions like “the people [...] all voted...”, changed to “all
 the people [...] voted...”.
\end_layout

\begin_layout Enumerate
filling of gaps in gap constructions, e.g.
 adding “spoken to Mary” to “Bill has”, rendering “Bill has spoken to Mary”.
\end_layout

\begin_layout Plain Layout
\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="3">
<features tabularvalignment="middle">
<column alignment="left" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
FraCaS version
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Parsed
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Correctly parsed
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1.
 original
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1032 84.6%
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1015 83.2%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
2.
 mistakes corrected; uncontracted negation
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1037 85.0%
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1020 83.6%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
3.
 reconstructions
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1040 85.2%
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1026 84.1%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
4.
 gap filling
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1045 85.7%
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1043 85.5%
\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\begin_layout Plain Layout
As we can see, the changes made to the corpus did not cause any major increase
 in the percentage of parsed sentences, and only a slightly larger increase
 in the percentage of correctly parsed sentences.
 More radical changes would be needed for a more substantial increase.
 In the following section, we will look into what those changes would concern.
\end_layout

\end_inset


\begin_inset Note Note
status collapsed

\begin_layout Plain Layout
The following are a few examples of tree structures resulting from parsing
 FraCaS sentences using this grammar.
\end_layout

\begin_layout Description
Positive
\begin_inset space ~
\end_inset

declarative:
\begin_inset Quotes eld
\end_inset

No delegate finished the report.
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_deeper
\begin_layout Plain Layout
Sentence (DeclPos TPast ASimul (PredVP (DetCN (DetQuant no_Quant NumSg)
 (UseN delegate_N)) (ComplSlash (SlashV2a finish_V2) (DetCN (DetQuant DefArt
 NumSg) (UseN report_N)))))
\end_layout

\end_deeper
\begin_layout Description
Negative
\begin_inset space ~
\end_inset

declarative:
\begin_inset Quotes eld
\end_inset

Bill did not speak to Mary on Monday.
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_deeper
\begin_layout Plain Layout
Sentence (DeclNeg TPast ASimul (PredVP (UsePN bill_PN) (AdvVP (ComplSlash
 (SlashV2a speak_to_V2) (UsePN mary_PN)) on_monday_Adv)))
\end_layout

\end_deeper
\begin_layout Description
Question:
\begin_inset Quotes eld
\end_inset

Did a Swede win a Nobel prize?
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_deeper
\begin_layout Plain Layout
Sentence (Question TPast ASimul (PredVP (DetCN (DetQuant IndefArt NumSg)
 (UseN swede_N)) (ComplSlash (SlashV2a win_V2) (DetCN (DetQuant IndefArt
 NumSg) (UseN nobel_prize_N)))))
\end_layout

\end_deeper
\begin_layout Description
Clause
\begin_inset space ~
\end_inset

conjunction:
\begin_inset Quotes eld
\end_inset

Smith took a machine on Tuesday, and Jones took a machine on Wednesday.
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_deeper
\begin_layout Plain Layout
Sentence (DeclConj comma_and_Conj TPast ASimul (PredVP (UsePN smith_PN)
 (AdvVP (ComplSlash (SlashV2a take_V2) (DetCN (DetQuant IndefArt NumSg)
 (UseN machine_N))) on_tuesday_Adv)) (PredVP (UsePN jones_PN) (AdvVP (ComplSlash
 (SlashV2a take_V2) (DetCN (DetQuant IndefArt NumSg) (UseN machine_N)))
 on_wednesday_Adv)))
\end_layout

\end_deeper
\begin_layout Description
Sentence-initial
\begin_inset space ~
\end_inset

conjunction:
\begin_inset Quotes eld
\end_inset

But only one woman.
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_deeper
\begin_layout Plain Layout
SentencePBut (UttNP (PredetNP only_Predet (DetCN (DetQuant IndefArt (NumCard
 (NumNumeral (num (pot2as3 (pot1as2 (pot0as1 pot01))))))) (UseN woman_N))))
\end_layout

\end_deeper
\begin_layout Description
Noun
\begin_inset space ~
\end_inset

phrase
\begin_inset space ~
\end_inset

conjunction:
\begin_inset Quotes eld
\end_inset

John and his colleagues went to a meeting.
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_deeper
\begin_layout Plain Layout
Sentence (DeclPos TPast ASimul (PredVP (ConjNP2 and_Conj (UsePN john_PN)
 (DetCN (DetQuant (PossPron he_Pron) NumPl) (UseN colleague_N))) (AdvVP
 (UseV go8walk_V) (PrepNP to_Prep (DetCN (DetQuant IndefArt NumSg) (UseN
 meeting_N))))))
\end_layout

\end_deeper
\end_inset


\begin_inset Note Note
status collapsed

\begin_layout Plain Layout
Three of the sentences that are encoded as synonyms have attachment ambiguities
 that can be encoded in the grammar.
 This means that they have different trees in different problems (169.1.p/170.1.p,
 175.1.p/176.1.p, 244.1.p/245.1.p).
 We do not count them in these statistics.
\end_layout

\end_inset


\end_layout

\begin_layout Subsection
Syntactical Ambiguity
\end_layout

\begin_layout Standard
All trees in the FraCaS treebank are implemented in the GF grammar described
 above.
 This grammar can be used by itself for parsing and analysing similar sentences.
 It is useful to know how ambiguous the grammar is, so we have parsed the
 866 sentences that are covered by the grammar and counted the number of
 trees for each sentence.
 Table
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:ambiguity"

\end_inset

 shows that the grammar is moderately ambiguous: almost 70% of the
 sentences have fewer than 10 different parse trees, and over 90% have fewer
 than 100 trees.
 The median number of parse trees per sentence is 5, and the largest number
 of trees for a sentence is 33,048.
 The most ambiguous sentence is:
\emph on

\begin_inset Quotes eld
\end_inset

Since APCOM bought its present office building it has been paying mortgage
 interest on it for more than 10 years.
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_layout Standard
Note that the number of parse trees is misleading for the 34 sentences
 with elliptic phrases, since ellipsis is linearised as
\emph on

\begin_inset Quotes eld
\end_inset


\family typewriter
[\SpecialChar \ldots{}
]
\family default

\begin_inset Quotes erd
\end_inset


\emph default
 in the FraCaS grammar.
 If we had made the elliptic phrases invisible, the number of parse trees
 would have increased dramatically.
\end_layout

\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status open

\begin_layout Plain Layout
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="3">
<features tabularvalignment="middle">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
No.
 parse trees
\end_layout

\end_inset
</cell>
<cell multicolumn="1" alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
No.
 sentences
\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1 -- 9
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
598
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
69.1%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
10 -- 99
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
203
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
23.4%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
100 -- 999
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
49
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
5.7%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset Formula $\geq$
\end_inset

 1000
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
16
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1.8%
\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
Ambiguity of the FraCaS treebank
\begin_inset CommandInset label
LatexCommand label
name "tab:ambiguity"

\end_inset


\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
\begin_inset Note Note
status collapsed

\begin_layout Subsection
Problems remaining
\end_layout

\begin_layout Plain Layout
Some problems could not be solved, due to their complexity and/or the time
 limitations of the project.
 Remaining problems are listed below, categorised according to their nature.
 Examples from the FraCaS corpus are given with the relevant parts italicized.
 For each type of problem, the number of affected sentences is given in
 brackets (out of the 177 sentences that were not correctly parsed).
 A few sentences had more than one problem, but were only counted in one
 category.
\end_layout

\begin_layout Paragraph
Adverbials (46)
\end_layout

\begin_layout Plain Layout
Certain kinds and uses of adverbials were problematic.
\end_layout

\begin_layout Itemize
Verb phrase adverbials (1)
\end_layout

\begin_deeper
\begin_layout Plain Layout
“Every executive who had a laptop computer brought it to take notes at the
 meeting.”
\end_layout

\end_deeper
\begin_layout Itemize
Noun phrase adverbials (3)
\end_layout

\begin_deeper
\begin_layout Plain Layout
“It lasted 2 days.”
\end_layout

\begin_layout Plain Layout
“Smith had been travelling the day before she arrived in Katmandu.”
\end_layout

\end_deeper
\begin_layout Itemize
Sentence-initial adverbials (34)
\end_layout

\begin_deeper
\begin_layout Plain Layout
“Since 1992 ITEL has been in Birmingham.”
\end_layout

\begin_layout Plain Layout
“Yesterday APCOM signed the contract.”
\end_layout

\begin_layout Plain Layout
“Then she took a taxi to the station.”
\end_layout

\begin_layout Plain Layout
“Two years from now Smith will have been to Florence at least four times.”
\end_layout

\end_deeper
\begin_layout Itemize
To this group also belong sentence-initial subordinate clauses.
 (Subordinate clauses following the main clause are treated as adverbials,
 so it is only natural to treat subordinate clauses preceding the main clause
 as adverbials too.)
\end_layout

\begin_deeper
\begin_layout Plain Layout
“If Smith and Anderson did not sign the contract, Jones signed the contract.”
\end_layout

\begin_layout Plain Layout
“When Smith arrived in Katmandu she had been travelling for three days.”
\end_layout

\begin_layout Plain Layout
“Before APCOM bought its present office building, it had been paying mortgage
 interest [...].”
\end_layout

\end_deeper
\begin_layout Itemize
Adverbials with copula (8)
\end_layout

\begin_deeper
\begin_layout Plain Layout
“It is now 1996.”
\end_layout

\begin_layout Plain Layout
“Today is Saturday, July 14th.”
\end_layout

\end_deeper
\begin_layout Paragraph
Verb phrase conjunctions (5)
\end_layout

\begin_layout Plain Layout
The grammar could handle conjunction on the noun phrase and clause level,
 but not verb phrase conjunctions.
\end_layout

\begin_layout Plain Layout
“ICM is one of the companies and owns 150 computers.”
\end_layout

\begin_layout Plain Layout
“She took a taxi to the station and caught the first train to Luxembourg.”
\end_layout

\begin_layout Plain Layout
“Jones graduated in March and has been employed ever since.”
\end_layout

\begin_layout Paragraph
Auxiliary verbs (17)
\end_layout

\begin_layout Plain Layout
Auxiliary verbs used independently could not be parsed.
\end_layout

\begin_layout Plain Layout
“John wanted to buy a car, and he did.”
\end_layout

\begin_layout Plain Layout
“Bill spoke to everyone that John did.”
\end_layout

\begin_layout Plain Layout
“She finished before he did.”
\end_layout

\begin_layout Paragraph
Complex comparisons (23)
\end_layout

\begin_layout Plain Layout
Simple comparatives worked well, but not comparatives embedded in a noun
 phrase or other complex comparisons.
\end_layout

\begin_layout Plain Layout
“John is a fatter politician than Bill.”
\end_layout

\begin_layout Plain Layout
“ITEL won more orders than APCOM lost.”
\end_layout

\begin_layout Plain Layout
“ITEL sold 3000 more computers than APCOM.”
\end_layout

\begin_layout Plain Layout
“APCOM has a more important customer than ITEL.”
\end_layout

\begin_layout Plain Layout
“Mary's story lasted as long as Jones's updating the program.”
\end_layout

\begin_layout Paragraph
Relative clauses (11)
\end_layout

\begin_layout Plain Layout
Some relative clauses could not be parsed, or were not parsed correctly.
\end_layout

\begin_layout Itemize
Relative clauses using present participle (1)
\end_layout

\begin_deeper
\begin_layout Plain Layout
“No one gambling seriously stops until he is broke.”
\end_layout

\end_deeper
\begin_layout Itemize
Relative clauses modifying a pronoun (8)
\end_layout

\begin_deeper
\begin_layout Plain Layout
“No one who starts gambling seriously stops until he is broke.”
\end_layout

\begin_layout Plain Layout
“Everyone who starts gambling seriously continues until he is broke.”
\end_layout

\begin_layout Plain Layout
“Nobody who is asleep ever knows that he is asleep.”
\end_layout

\end_deeper
\begin_layout Itemize
Relative clauses with object gap (2)
\end_layout

\begin_deeper
\begin_layout Plain Layout
“There is a representative that Smith wrote to every week.”
\end_layout

\end_deeper
\begin_layout Paragraph
Complement infinitive clauses (17)
\end_layout

\begin_layout Plain Layout
The verb “see” as in “see someone do something”, defined as V2V, does not
 work.
 It requires an infinitive marker, which should not be present in this case.
\end_layout

\begin_layout Plain Layout
“Smith saw Jones sign the contract.”
\end_layout

\begin_layout Plain Layout
“Smith saw Jones' heart beat.”
\end_layout

\begin_layout Paragraph
Other (58)
\end_layout

\begin_layout Plain Layout
Apart from the problems in the categories above, there are other problems
 that are harder to classify.
 Some of these could have been solved, had time permitted, while others
 are of a more intricate type.
 Each problem is exemplified by one sentence from the FraCaS corpus.
\end_layout

\begin_layout Plain Layout
“Mary represents her own company.” (15)
\end_layout

\begin_layout Plain Layout
“APCOM sold exactly 2500 computers.” (1)
\end_layout

\begin_layout Plain Layout
“Smith spent two hours writing the report.” (12)
\end_layout

\begin_layout Plain Layout
“No representative took less than half a day to read the report.” (1)
\end_layout

\begin_layout Plain Layout
“The conference was over on July 8th, 1994.” (2)
\end_layout

\begin_layout Plain Layout
“Bill owns a blue one.” (6)
\end_layout

\begin_layout Plain Layout
“That is, there was one lawyer who signed all the reports.” (1)
\end_layout

\begin_layout Plain Layout
“Bill is going to speak to Mary.” (1)
\end_layout

\begin_layout Plain Layout
“It is the case that Jones is not and will never be allowed to write his
 memoirs.” (4)
\end_layout

\begin_layout Plain Layout
“It took the representatives more than a week to read the report.” (2)
\end_layout

\begin_layout Plain Layout
“Smith represents his company and so does Jones.” (13)
\end_layout

\begin_layout Subsection
Tree selection
\end_layout

\begin_layout Plain Layout
After the whole corpus had been parsed, one tree had to be selected for each
 sentence as its most adequate representation.
 Most of the time there was a clear choice, while at other times, two trees
 were kept since it was not clear which one was the most suitable representation
 of the sentence.
 This was especially common for sentences using a copula with an indefinite
 noun phrase as complement.
 In these cases, both the tree with the indefinite article represented and
 the one without were kept.
\end_layout

\end_inset


\end_layout

\begin_layout Section
The Swedish Corpus
\end_layout

\begin_layout Standard
\begin_inset Note Note
status collapsed

\begin_layout Subsection
Modules
\end_layout

\begin_layout Plain Layout
In order to build the Swedish version of the FraCaS corpus, two modules
 were written, one lexicon module and one grammar module.
\end_layout

\begin_layout Subsubsection
Lexicon module
\end_layout

\begin_layout Plain Layout
FraCaSLexSwe is the Swedish concrete lexicon.
 It was built in a very similar way to the English counterpart, using the
 functions mkN, mkA, mkV etc, mainly from the Paradigms module.
\end_layout

\begin_layout Subsubsection
Grammar module
\end_layout

\begin_layout Plain Layout
FraCaSSwe is the Swedish concrete grammar.
 Just as for the English counterpart, parts of the Grammar module (namely
 Noun, Verb, Adjective, Adverb, Numeral and Tense) were imported, while
 other parts were opened and necessary functions used in FraCaSSwe.
\end_layout

\end_inset


\begin_inset Note Note
status collapsed

\begin_layout Plain Layout
Some of the FraCaS sentences depend on lexical ambiguity that cannot be
 expressed adequately in Swedish.
\end_layout

\end_inset


\end_layout

\begin_layout Standard
A long-term goal of this project is that the treebank should be truly multilingual
 for all the languages in the GF resource grammar.
 Of course this is not possible in the general case, since some of the sentences
 cannot even be translated without changing their semantic content.
 But at least we can try to create a multilingual treebank of as many sentences
 as possible.
\end_layout

\begin_layout Standard
As a first step, we have created Swedish translations of the sentences by
 writing a new Swedish lexicon.
 Then we evaluated the translations and iteratively made changes to the
 trees to make the translations better.
 Note that since we use exactly the same syntax trees for the Swedish and
 English sentences, we had to make sure that the English sentences were
 not changed when we modified the trees.

\end_layout

\begin_layout Standard
This means that the Swedish corpus was not created by manually translating
 the English sentences; instead, we translated the lexicon and let the Swedish
 Resource Grammar take care of the syntactic translation.
 Currently, out of the 866 sentences in the treebank, 748 are translated
 into grammatically correct and comprehensible Swedish sentences.
\end_layout

\begin_layout Subsection
The Swedish Lexicon
\end_layout

\begin_layout Standard
\begin_inset Note Note
status collapsed

\begin_layout Plain Layout
When creating the Swedish lexicon
\end_layout

\begin_layout Plain Layout
As was the case for the parsing part of the project, certain problems were
 also discovered in the process of generating into Swedish.
 Often these problems had to be solved by going back to the English lexicon
 and making changes so that more suitable, often more general, trees would
 be constructed.
 This is where the two project parts were interwoven.
\end_layout

\begin_layout Plain Layout
Some of the problems could be solved and some remain.
 The solutions are presented in this section, while remaining problems are
 listed in the next section on statistics (3.3).
\end_layout

\begin_layout Plain Layout
The problems encountered have been divided into categories as seen below.
 The explanations follow P (Problem) and S (Solution).
 FraCaSLex here refers to both the abstract lexicon and the two concrete
 lexicons (FraCaSLexEng and FraCaSLexSwe).
 In the same way, FraCaS refers to both the abstract grammar and the two
 concrete grammars (FraCaSEng and FraCaSSwe).
\end_layout

\end_inset


\end_layout

\begin_layout Standard
When we created the Swedish lexicon, we often had to go back to the English
 lexicon and make changes so that more suitable trees could be constructed.
 Sometimes we merged several lexical entries into one multi-word entry,
 and sometimes we split one entry into different meanings.
 Most of the changes were of the following types:
\end_layout

\begin_layout Description
Compounds Many compound noun phrases, such as
\emph on
“company car”
\emph default
,
\emph on
“mortgage interest”
\emph default
 and
\emph on

\begin_inset Quotes eld
\end_inset

APCOM manager
\begin_inset Quotes erd
\end_inset


\emph default
, are single words in Swedish (
\emph on

\begin_inset Quotes eld
\end_inset

tjänstebil
\begin_inset Quotes erd
\end_inset


\emph default
,
\emph on

\begin_inset Quotes eld
\end_inset

hypoteksränta
\begin_inset Quotes erd
\end_inset


\emph default
 and
\emph on

\begin_inset Quotes eld
\end_inset

APCOM-direktör
\begin_inset Quotes erd
\end_inset


\emph default
, respectively).
 We solved this by defining them as multi-word nouns, as described in section

\begin_inset CommandInset ref
LatexCommand ref
reference "sub:Multi-word-Lexical-Items"

\end_inset

.
\end_layout

\begin_layout Description
Lexical
\begin_inset space ~
\end_inset

ambiguity Several words in English are translated into different Swedish
 words, depending on the context.
 Such words were split into different lexical entries.
 The adjective
\emph on
“poor”
\emph default
, for example, was handled by creating two different functions, one with
 the meaning
\emph on

\begin_inset Quotes eld
\end_inset

not good
\begin_inset Quotes erd
\end_inset


\emph default
 (Swedish
\emph on

\begin_inset Quotes eld
\end_inset

dålig
\begin_inset Quotes erd
\end_inset


\emph default
), and one with the meaning
\emph on

\begin_inset Quotes eld
\end_inset

not rich
\begin_inset Quotes erd
\end_inset


\emph default
 (Swedish
\emph on

\begin_inset Quotes eld
\end_inset

fattig
\begin_inset Quotes erd
\end_inset


\emph default
).
\end_layout

\begin_layout Description
Prepositions Prepositions are often translated differently in different
 contexts.
 For example,
\emph on

\begin_inset Quotes eld
\end_inset

inhabitant of
\begin_inset Quotes erd
\end_inset


\emph default
 is translated to
\emph on

\begin_inset Quotes eld
\end_inset

invånare i
\begin_inset Quotes erd
\end_inset


\emph default
 if the argument is a country or a town, but to
\emph on

\begin_inset Quotes eld
\end_inset

invånare på
\begin_inset Quotes erd
\end_inset


\emph default
 if the argument is an island.
 This was solved, either by creating different lexical entries, or by making
 the preposition a part of the main verb.
\end_layout

\begin_layout Description
Adverbials Most of the multi-word adverbials are time and date expressions.
 The reason for this is that many time and date expressions are expressed
 very differently in different languages.
 For example, the English preposition
\emph on

\begin_inset Quotes eld
\end_inset

in
\begin_inset Quotes erd
\end_inset


\emph default
 is translated differently for different time and date expressions:
\emph on

\begin_inset Quotes eld
\end_inset

in March
\begin_inset Quotes erd
\end_inset


\emph default
 becomes
\emph on

\begin_inset Quotes eld
\end_inset

i mars
\begin_inset Quotes erd
\end_inset


\emph default
 and
\emph on

\begin_inset Quotes eld
\end_inset

in a month
\begin_inset Quotes erd
\end_inset


\emph default
 translates to
\emph on

\begin_inset Quotes eld
\end_inset

om en månad
\begin_inset Quotes erd
\end_inset


\emph default
, whereas
\emph on
“in 1994”
\emph default
 is best formulated as the bare word
\emph on

\begin_inset Quotes eld
\end_inset

1994
\begin_inset Quotes erd
\end_inset


\emph default
 in Swedish.
 As already explained, we defined all time and date expressions as multi-word
 adverbials.
\end_layout

\begin_layout Subsection
Coverage
\end_layout

\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status open

\begin_layout Plain Layout
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="9" columns="3">
<features tabularvalignment="middle">
<column alignment="left" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Total
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
% of sentences
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Sentences in treebank
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
866
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
100%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Correct Swedish translation
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
748
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
86.4%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Problematic sentences
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
118
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
13.6%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset space ~
\end_inset


\begin_inset space ~
\end_inset

-- idioms
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
31
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
3.6%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset space ~
\end_inset


\begin_inset space ~
\end_inset

-- agreement
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
24
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
2.8%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset space ~
\end_inset


\begin_inset space ~
\end_inset

-- future tense
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
12
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
1.4%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset space ~
\end_inset


\begin_inset space ~
\end_inset

-- elliptical
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
19
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
2.2%
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset space ~
\end_inset


\begin_inset space ~
\end_inset

-- incomprehensible
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
32
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
3.7%
\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
The coverage of the Swedish FraCaS grammar
\begin_inset CommandInset label
LatexCommand label
name "tab:swedish-coverage"

\end_inset


\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
Table
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:swedish-coverage"

\end_inset

 gives an overview of the coverage of the Swedish lexicon and grammar.
 Of the 866 unique sentences in the treebank, we consider 748 to have good
 Swedish translations.
 The remaining 118 sentences had some problems which we divided into five
 different classes -- idioms, agreement, future tense, elliptical phrases,
 and more difficult errors.
 Table
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:swedish-problems"

\end_inset

 gives examples of some of the problems we encountered, and the next section
 briefly describes each class.
\end_layout

\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status open

\begin_layout Plain Layout
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="19" columns="4">
<features tabularvalignment="middle">
<column alignment="center" valignment="middle" width="25col%">
<column alignment="center" valignment="middle" width="25col%">
<column alignment="center" valignment="middle" width="25col%">
<column alignment="center" valignment="middle" width="25col%">
<row>
<cell alignment="center" valignment="middle" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
English original
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Direct translation
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Better idiom
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Literally in English
\end_layout

\end_inset
</cell>
</row>
<row>
<cell multicolumn="1" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
idioms
\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X is likely to Y
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X
\series bold
är trolig
\series default
 att Y
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
\emph on
det är troligt
\series default
 att X Y
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
it is likely that X Y
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
members of the committee
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
\emph on
medlemmar av
\series default
 kommittén
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
kommitté
\series bold
medlemmar
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
committee-members
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X is asleep
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X
\series bold
är sovande
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X
\series bold
sover
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X sleeps
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
the previous one
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
den förra
\series bold
en
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
den förra
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
the previous
\end_layout

\end_inset
</cell>
</row>
<row>
<cell multicolumn="1" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
agreement
\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X has the right to Y
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X har
\series bold
rätten
\series default
 att Y
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X har
\series bold
rätt
\series default
 att Y
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X has right to Y
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
traffic increased
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
\emph on
trafik
\series default
 ökade
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
\emph on
trafiken
\series default
 ökade
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
the traffic increased
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
one of the tenors
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
\emph on
ett
\series default
av tenorerna
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
\emph on
en
\series default
 av tenorerna
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
---
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
everyone continues until he is broke
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
alla fortsätter tills
\series bold
han
\series default
är pank
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
alla fortsätter tills
\series bold
de
\series default
är panka
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
all continue until they are broke
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
clients at the demonstration
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
\emph on
klienter
\series default
på presentationen
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
\emph on
klienterna
\series default
 på presentationen
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
the clients at the demonstration
\end_layout

\end_inset
</cell>
</row>
<row>
<cell multicolumn="1" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
future tense
\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X will make a poor stock market trader
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X
\series bold
ska
\series default
 bli en dålig aktiehandlare
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X
\series bold
kommer att
\series default
 bli en dålig aktiehandlare
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
---
\end_layout

\end_inset
</cell>
</row>
<row>
<cell multicolumn="1" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
elliptical phrases
\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X wanted to buy a car, and he did
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X ville köpa en bil, och han gjorde
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X ville köpa en bil, och han gjorde
\series bold
det
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X wanted to buy a car, and he did it
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X did too
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X gjorde också
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X gjorde
\series bold
det
\series default
också
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X did it too
\end_layout

\end_inset
</cell>
</row>
<row>
<cell multicolumn="1" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
more difficult
\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X took less than half a day to Y
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X tog mindre än en halv dag att Y
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\emph on
X tog mindre än en halv dag
\series bold
 på sig för
\series default
 att Y
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="middle" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
---
\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
Examples of encountered problems with the Swedish translation
\begin_inset CommandInset label
LatexCommand label
name "tab:swedish-problems"

\end_inset


\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Subsubsection
Types of translation problems
\end_layout

\begin_layout Description
Idioms We encountered 10 problematic idioms in 31 sentences, where the direct
 translation of a phrase is not the most natural one, and a different syntactic
 construction should be used instead.

\end_layout

\begin_layout Description
Agreement There were 7 different noun phrase agreement problems in 24 of
 the sentences, where the Swedish translation would be more natural if we
 could change the number, definiteness or gender of the noun phrase.

\end_layout

\begin_layout Description
Future
\begin_inset space ~
\end_inset

tense Swedish future tense takes two different forms, either
\emph on

\begin_inset Quotes eld
\end_inset

ska
\begin_inset Quotes erd
\end_inset


\emph default
 or
\emph on

\begin_inset Quotes eld
\end_inset

kommer att
\begin_inset Quotes erd
\end_inset


\emph default
.
 The resource grammar defaults to
\emph on

\begin_inset Quotes eld
\end_inset

ska
\begin_inset Quotes erd
\end_inset


\emph default
, but
\emph on

\begin_inset Quotes eld
\end_inset

kommer att
\begin_inset Quotes erd
\end_inset


\emph default
 is the more natural translation for all 12 FraCaS sentences that use the
 future tense.
 One example is
\emph on

\begin_inset Quotes eld
\end_inset

Bill will talk to Mary
\begin_inset Quotes erd
\end_inset


\emph default
, which should be translated to
\emph on

\begin_inset Quotes eld
\end_inset

Bill kommer att prata med Mary
\begin_inset Quotes erd
\end_inset


\emph default
.
\end_layout

\begin_layout Description
Elliptical
\begin_inset space ~
\end_inset

phrases 19 sentences have problems with elliptical phrases in Swedish.
 15 of them involve the auxiliary verb
\emph on

\begin_inset Quotes eld
\end_inset

do/does/did
\begin_inset Quotes erd
\end_inset


\emph default
, which sounds very awkward when it is translated to the Swedish verb
\emph on

\begin_inset Quotes eld
\end_inset

gör/gjorde
\begin_inset Quotes erd
\end_inset


\emph default
.
 E.g.,
\emph on

\begin_inset Quotes eld
\end_inset

Bill did too
\begin_inset Quotes erd
\end_inset


\emph default
 is translated as
\emph on

\begin_inset Quotes eld
\end_inset

Bill gjorde också
\begin_inset Quotes erd
\end_inset


\emph default
.
 In Swedish we also need an object
\emph on

\begin_inset Quotes eld
\end_inset

det
\begin_inset Quotes erd
\end_inset


\emph default
 (lit.

\emph on

\begin_inset Quotes eld
\end_inset

it
\begin_inset Quotes erd
\end_inset


\emph default
), so a better translation is
\emph on

\begin_inset Quotes eld
\end_inset

Bill gjorde det också
\begin_inset Quotes erd
\end_inset


\emph default
 (lit.

\emph on

\begin_inset Quotes eld
\end_inset

Bill did it too
\begin_inset Quotes erd
\end_inset


\emph default
).
 The remaining four problematic elliptical sentences are more difficult
 to analyse.
\end_layout

\begin_layout Description
Serious 32 of the sentences had more serious problems in Swedish.
 Some of them did not translate at all, since one of the grammatical constructio
ns had not been implemented for Swedish yet.
 Others translated, but with a very strange word order or inflection, since
 the corresponding grammatical construction did not function as expected.

\end_layout

\begin_layout Standard
All in all, we believe that more than two thirds of the 118 problematic Swedish
 sentences can be corrected without too much trouble.

\end_layout

\begin_layout Standard
\begin_inset Note Note
status collapsed

\begin_layout Paragraph
Idioms
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

in business
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

i affärsverksamhet
\begin_inset Quotes erd
\end_inset

? (3)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

Bill is likely to [..]
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

är sannolik/trolig att
\begin_inset Quotes erd
\end_inset

? [better:
\begin_inset Quotes eld
\end_inset

det är troligt att Bill [..]
\begin_inset Quotes erd
\end_inset

] (2)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

Mary is female
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

Mary är kvinnlig
\begin_inset Quotes erd
\end_inset

? [better:
\begin_inset Quotes eld
\end_inset

Mary är kvinna
\begin_inset Quotes erd
\end_inset

] (2)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

members of the committee
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

medlemmar av kommittén
\begin_inset Quotes erd
\end_inset

 [better:
\begin_inset Quotes eld
\end_inset

kommittémedlem
\begin_inset Quotes erd
\end_inset

] (2)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

had his paper accepted
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

hade sin uppsats godkänd
\begin_inset Quotes erd
\end_inset

 [better:
\begin_inset Quotes eld
\end_inset

fick sin uppsats godkänd
\begin_inset Quotes erd
\end_inset

] (3)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

made a loss
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

gjorde en förlust
\begin_inset Quotes erd
\end_inset

 [better:
\begin_inset Quotes eld
\end_inset

gick med förlust
\begin_inset Quotes erd
\end_inset

] (4)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

a chain of businesses
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

en kedja av affärsverksamheter
\begin_inset Quotes erd
\end_inset

 [better:
\begin_inset Quotes eld
\end_inset

en affärskedja
\begin_inset Quotes erd
\end_inset

] (7)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

be sleeping
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

är sovande
\begin_inset Quotes erd
\end_inset

 [better:
\begin_inset Quotes eld
\end_inset

sover
\begin_inset Quotes erd
\end_inset

] (4)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

no one stops until
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

everyone continues until
\begin_inset Quotes erd
\end_inset

 => [
\begin_inset Quotes eld
\end_inset

ingen slutar förrän
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

alla fortsätter tills
\begin_inset Quotes erd
\end_inset

]
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

a blue one
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

en blå en
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

en blå
\begin_inset Quotes erd
\end_inset

 (3)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

the previous one
\begin_inset Quotes erd
\end_inset

 => ?? /
\begin_inset Quotes eld
\end_inset

den förra
\begin_inset Quotes erd
\end_inset

 (1)
\end_layout

\begin_layout Plain Layout

\series bold
OK
\series default
:
\begin_inset Quotes eld
\end_inset

comes cheap
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

fås billigt
\begin_inset Quotes erd
\end_inset

? [better:
\begin_inset Quotes eld
\end_inset

är billig
\begin_inset Quotes erd
\end_inset

] (3)
\end_layout

\begin_layout Plain Layout

\series bold
OK
\series default
: (group_N2)
\begin_inset Quotes eld
\end_inset

a group of people
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

en grupp av människor
\begin_inset Quotes erd
\end_inset

 [
\begin_inset Quotes eld
\end_inset

en grupp människor
\begin_inset Quotes erd
\end_inset

] (2)
\end_layout

\begin_layout Paragraph
OK: Passive form
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

was blamed
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

blev beskyllda
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

beskylldes
\begin_inset Quotes erd
\end_inset

 (3)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

was used
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

blev använd
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

användes
\begin_inset Quotes erd
\end_inset

 (2)
\end_layout

\begin_layout Paragraph
Agreement
\end_layout

\begin_layout Plain Layout
16 of these contained variations of the definite noun phrase
\begin_inset Quotes eld
\end_inset


\emph on
the right
\begin_inset Quotes erd
\end_inset


\emph default
 (used in the context
\emph on

\begin_inset Quotes eld
\end_inset


\emph default
X
\emph on
 has the right to live in
\emph default
Y
\emph on

\begin_inset Quotes erd
\end_inset


\emph default
), which is translated to
\begin_inset Quotes eld
\end_inset


\emph on
rätten
\begin_inset Quotes erd
\end_inset


\emph default
.
 But in Swedish it sounds more natural to say
\emph on

\begin_inset Quotes eld
\end_inset

rätt
\begin_inset Quotes erd
\end_inset


\emph default
 (lit.

\emph on

\begin_inset Quotes eld
\end_inset

right
\begin_inset Quotes erd
\end_inset


\emph default
), at least in this context.
 In other cases, English indefinite noun phrases are better translated to
 definite form, such as
\emph on

\begin_inset Quotes eld
\end_inset

traffic
\begin_inset Quotes erd
\end_inset


\emph default
 which should translate to
\emph on

\begin_inset Quotes eld
\end_inset

trafiken
\begin_inset Quotes erd
\end_inset


\emph default
 (lit.

\emph on

\begin_inset Quotes eld
\end_inset

the traffic
\begin_inset Quotes erd
\end_inset


\emph default
).
 Swedish also has two grammatical genders, which causes problems in phrases
 such as
\emph on

\begin_inset Quotes eld
\end_inset

one of the tenors
\begin_inset Quotes erd
\end_inset


\emph default
 where the gender of
\emph on

\begin_inset Quotes eld
\end_inset

one
\begin_inset Quotes erd
\end_inset


\emph default
 should depend on the gender of
\emph on

\begin_inset Quotes eld
\end_inset

tenor
\begin_inset Quotes erd
\end_inset


\emph default
.
 Problems with number were mostly due to the singular pronoun
\emph on

\begin_inset Quotes eld
\end_inset

everyone
\begin_inset Quotes erd
\end_inset


\emph default
 which was translated to the plural pronoun
\emph on

\begin_inset Quotes eld
\end_inset

alla
\begin_inset Quotes erd
\end_inset


\emph default
.
\end_layout

\begin_layout Paragraph
Agreement examples
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

one of the tenors
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

ett av tenorerna
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

en av tenorerna
\begin_inset Quotes erd
\end_inset

 (1)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

everyone continues until he is broke
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

alla fortsätter tills han är pank
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

\SpecialChar \ldots{}
 tills de är panka
\begin_inset Quotes erd
\end_inset

 (1)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

clients at the demonstration
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

klienter på presentationen
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

klienterna \SpecialChar \ldots{}

\begin_inset Quotes erd
\end_inset

 (2)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

traffic increased
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

trafik ökade
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

trafiken ökade
\begin_inset Quotes erd
\end_inset

 (1)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

is the chairman of ITEL
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

är ordföranden för ITEL
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

ordförande
\begin_inset Quotes erd
\end_inset

 (1)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

every customer who owns a computer has a service contract for it
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

varje kund som äger en dator har ett servicekontrakt för det
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

\SpecialChar \ldots{}
 för den
\begin_inset Quotes erd
\end_inset

 (2)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

the right to \SpecialChar \ldots{}

\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

rätten att \SpecialChar \ldots{}

\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

rätt att \SpecialChar \ldots{}

\begin_inset Quotes erd
\end_inset

 (16)
\end_layout

\begin_layout Paragraph
OK: (remove ProgrVP in Swedish) Progressive
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

Smith was writing a report
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

Smith höll på att skriva en rapport
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

skrev en rapport
\begin_inset Quotes erd
\end_inset

 (24)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

APCOM has been paying mortgage
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

APCOM har hållit på att betala hypoteksränta
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

betalat
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_layout Paragraph
Reflexive pronouns
\end_layout

\begin_layout Plain Layout

\series bold
OK
\series default
: (add refl_Pron)
\begin_inset Quotes eld
\end_inset

his/her/their
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

hans/hennes/deras
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

sin
\begin_inset Quotes erd
\end_inset

/
\begin_inset Quotes eld
\end_inset

sitt
\begin_inset Quotes erd
\end_inset

/
\begin_inset Quotes eld
\end_inset

sina
\begin_inset Quotes erd
\end_inset

 (~30)
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

himself
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

sig
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

sig själv
\begin_inset Quotes erd
\end_inset

 (but not always) (1)
\end_layout

\begin_layout Paragraph
Incomprehensible
\end_layout

\begin_layout Plain Layout
prepositions/subjunctions: 2
\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

twice as many than \SpecialChar \ldots{}

\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

dubbelt så många än \SpecialChar \ldots{}

\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

som
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Quotes eld
\end_inset

Bill suggested to Frank's boss that \SpecialChar \ldots{}
, and Carl to Alan's wife
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

Bill föreslog för Franks chef att \SpecialChar \ldots{}
, och Carl till Alans fru
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

för Alans fru
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_layout Plain Layout

\series bold
OK
\series default
: (arrive_in_V2)
\begin_inset Quotes eld
\end_inset

arrived in Katmandu
\begin_inset Quotes erd
\end_inset

 =>
\begin_inset Quotes eld
\end_inset

anlände i Katmandu
\begin_inset Quotes erd
\end_inset

 /
\begin_inset Quotes eld
\end_inset

till
\begin_inset Quotes erd
\end_inset

 (2)
\end_layout

\begin_layout Plain Layout
Incomprehensible/difficult to fix: 6
\end_layout

\begin_layout Plain Layout
No linearisation: 24
\end_layout

\begin_layout Plain Layout
\begin_inset Note Note
status collapsed

\begin_layout Subsection
Statistics
\end_layout

\begin_layout Plain Layout
Out of 1220 original sentences, 1043 could eventually be parsed correctly,
 and their tree representations were used to generate the equivalent Swedish
 sentences.
 Also, the changes listed in section 3.2 were performed, resulting in better
 linearizations.
 The generated Swedish sentences were checked for accuracy and divided into
 a few different groups.
 The number of sentences in each group is given in the left-most column.
 Descriptions and examples for each group are given on the right and can
 be viewed as a list of remaining problems to be solved.
\end_layout

\begin_layout Plain Layout
\begin_inset Tabular
<lyxtabular version="3" rows="6" columns="3">
<features tabularvalignment="middle">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<column alignment="center" valignment="top" width="0">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
unique sentences
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
874
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
(same as before)
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
599
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
(differ)
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
89
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
(not present before)
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
150
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
no linearisation
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
36
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\begin_layout Paragraph
Number Type Description Result Desired result
\end_layout

\begin_layout Itemize
811 correct & natural
\end_layout

\begin_layout Itemize
120 considered correct but could be more natural
\end_layout

\begin_deeper
\begin_layout Itemize
“each” / “every”: “varje europé” “alla européer”
\end_layout

\begin_layout Itemize
\begin_inset Note Note
status open

\begin_layout Plain Layout
proper inclusion -- indefinite article: “Mary är en student” “Mary är student”
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
\begin_inset Note Note
status open

\begin_layout Plain Layout
infinitive marker desired: “John sade Bill hade skadat sig” “John sade att
 Bill hade skadat sig”
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
\begin_inset Note Note
status open

\begin_layout Plain Layout
infinitive marker not desired: “lyckades att vinna” “lyckades vinna”
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
passive constructions: “blev använd” “användes”
\end_layout

\begin_layout Itemize
gender of pronoun referring to previous sentence: “Bill äger ett också”
 (referring to “bil”) “Bill äger en också”
\end_layout

\begin_layout Itemize
definite form: “ordföranden för” “ordförande för”
\end_layout

\begin_layout Itemize
meaning of “female”: “Mary är kvinnlig” “Mary är kvinna”
\end_layout

\end_deeper
\begin_layout Itemize
28 requiring changes in the FraCaS lexicon
\end_layout

\begin_deeper
\begin_layout Itemize
“of” constructions:
\end_layout

\begin_deeper
\begin_layout Itemize
“medlemmar av kommittén” “medlemmar i kommittén”
\end_layout

\begin_layout Itemize
“kedja av affärsverksamhet” “affärskedja”
\end_layout

\begin_layout Itemize
“grupp av människor” “grupp människor”
\end_layout

\begin_layout Itemize
\begin_inset Note Note
status open

\begin_layout Plain Layout
“alla av dem” “alla” / “allihop”
\end_layout

\end_inset


\end_layout

\end_deeper
\begin_layout Itemize
\begin_inset Note Note
status open

\begin_layout Plain Layout
translation of “should”: “föreslog [...] att de borde” “föreslog [...] att de
 skulle”
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
translation of “make a loss”: “gjorde en förlust” “gick med förlust”
\end_layout

\begin_layout Itemize
translation of “have been to”: “har varit till” “har varit i”
\end_layout

\begin_layout Itemize
translation of “be asleep”: “har varit sovande” “har sovit”
\end_layout

\end_deeper
\begin_layout Itemize
30 requiring changes in the English and/or Swedish general grammar(s)
\end_layout

\begin_deeper
\begin_layout Itemize
gender: “ett av de ledande tenorerna” “en av de ledande tenorerna”
\end_layout

\begin_layout Itemize
translation of “come cheap”: “fås billigt” “vara billig (att anlita)”
\end_layout

\begin_layout Itemize
\begin_inset Note Note
status open

\begin_layout Plain Layout
“both” with adjective -- definite article: “båda ledande tenorerna” “båda
 de ledande tenorerna”
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
“will” -- difference in modality: “ska bli” “kommer att bli” (sometimes)
\end_layout

\begin_layout Itemize
AdV position of “also”: “hon gav också dem en faktura” “hon gav dem också
 en faktura”
\end_layout

\begin_layout Itemize
translation of “awarded himself”: “tilldelade sig” “tilldelade sig själv”
\end_layout

\begin_layout Itemize
\begin_inset Note Note
status open

\begin_layout Plain Layout
translation of “used to be”: “brukade att vara” e.g.
 “var tidigare”
\end_layout

\end_inset


\end_layout

\end_deeper
\begin_layout Itemize
54 difficult to correct
\end_layout

\begin_deeper
\begin_layout Itemize
\begin_inset Note Note
status open

\begin_layout Plain Layout
“were blamed for” (non-human subject): “blev anklagade för” [difficult to
 find Swedish equivalent]
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
reflexive possessive: “skrev hans första roman” “skrev sin första roman”
\end_layout

\begin_layout Itemize
progressive aspect: “höll på att” (sometimes meaning “nearly”) [difficult
 to find Swedish equivalent]
\end_layout

\begin_layout Itemize
singular / plural: “alla italienska män vill vara en framstående tenor”
 “alla italienska män vill vara framstående tenorer”
\end_layout

\begin_layout Itemize
“be likely to”: “Smith är sannolik att bli” “det är sannolikt att Smith
 blir”
\end_layout

\begin_layout Itemize
\begin_inset Note Note
status open

\begin_layout Plain Layout
“some”: “snabbare än någon ITEL-dator” “snabbare än någon viss ITEL-dator”
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
“lose one's temper”: “Smith förlorade hans humör” “Smith tappade humöret”
\end_layout

\begin_layout Itemize
“have something accepted”: “John hade hans uppsats godkänd” “John fick sin
 uppsats godkänd”
\end_layout

\end_deeper
\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Section
Discussion
\end_layout

\begin_layout Standard
The FraCaS treebank was a small project financed by the Centre for Language
 Technology (CLT) at the University of Gothenburg.
 The project took less than three person-months to create a treebank for
 the FraCaS test suite, together with a bilingual GF grammar for the trees.
 The coverage of the English grammar is 95--99%, depending on whether elliptical
 phrases are included.
 The Swedish grammar is not yet as developed and covers 86% of the FraCaS
 sentences.
\end_layout

\begin_layout Standard
The treebank is released under an open-source license, and can be downloaded
 as a part of the Gothenburg CLT Toolkit:
\end_layout

\begin_layout Standard
\noindent
\align center

\family sans
\begin_inset CommandInset href
LatexCommand href
target "http://www.clt.gu.se/clt-toolkit"

\end_inset


\end_layout

\begin_layout Subsection
Implications for the FraCaS Test Suite
\end_layout

\begin_layout Standard
From a corpus perspective, the FraCaS test suite is not very interesting.
 It is a small corpus (fewer than 1000 sentences) of unnatural, made-up sentences.
 Furthermore, it uses fairly standard syntax and is monolingual.
\end_layout

\begin_layout Standard
However, the main value of FraCaS is as a resource for testing semantic
 inference algorithms
\begin_inset CommandInset citation
LatexCommand citep
key "MacCartneyManning2007:Natural-logic-for-textual,MacCartneyManning2008:Modeling-semantic-containment"

\end_inset

.
 This project adds syntactic structures to the test sentences, which we
 hope will be beneficial, since the semantics of a sentence depends closely
 on its syntax.

\end_layout

\begin_layout Standard
Furthermore, we have added a new language to the test suite, although the
 result is not yet perfect.
 Since we are using the multilingual GF resource grammar, more languages
 should be relatively easy to add.

\end_layout

\begin_layout Subsection
Implications for GF
\end_layout

\begin_layout Standard
The making of this treebank has been a stress test, both for GF and for
 the resource grammar.
 The main work in this project was carried out by an experienced computational
 linguist who had never used GF before.
 This means that the project has been a test of how easy it is to learn
 and start using GF and its resource grammar.
 Furthermore, the project was a test of the coverage of the existing grammatical
 constructions in the resource grammar.

\end_layout

\begin_layout Subsection
Future Work
\end_layout

\begin_layout Standard
Several problems remain, and several interesting extensions of the FraCaS
 treebank are possible; the following are some examples:
\end_layout

\begin_layout Itemize
The first and most important task is to get most of the remaining Swedish
 sentences to work, by factoring idioms and other constructions out of the
 treebank and putting them in the grammars instead.

\end_layout

\begin_layout Itemize
Elliptical phrases should be given a proper treatment by implementing more
 coordination constructions in the resource grammar.

\end_layout

\begin_layout Itemize
We would like to add new languages from the resource grammar to the multilingual
 FraCaS grammar.
 Hopefully this will also benefit the existing two languages, by requiring
 us to abstract away from language-specific details, thus making the grammar
 more abstract.
\end_layout

\begin_layout Itemize
A long-term goal would be to make the treebank and the associated grammar
 more
\begin_inset Quotes eld
\end_inset

semantic
\begin_inset Quotes erd
\end_inset

 by factoring out even more syntactic constructions and putting them in a
 semantic resource grammar.
 That it is possible to formulate classic Montague semantics in GF has already
 been shown
\begin_inset CommandInset citation
LatexCommand citep
key "Ranta2001:Computational-Semantics"

\end_inset

, but here we need to handle many more semantic and pragmatic phenomena.
\end_layout

\begin_layout Standard
\begin_inset Note Note
status open

\begin_layout Subsection
Related work
\end_layout

\begin_layout Plain Layout
Converting the Penn Treebank to GF, Swedish Talbanken to GF
\end_layout

\end_inset


\end_layout

\begin_layout Standard
\begin_inset CommandInset bibtex
LatexCommand bibtex
bibfiles "FraCaSBank"
options "apalike"

\end_inset


\end_layout

\end_body
\end_document