mirror of
https://github.com/GrammaticalFramework/gf-core.git
synced 2026-05-16 06:32:51 -06:00
multimodal document revised
@@ -1,4 +1,4 @@
|
|||||||
Multimodal Resource Grammars
|
Demonstrative Expressions and Multimodal Grammars
|
||||||
Author: Aarne Ranta <aarne (at) cs.chalmers.se>
|
Author: Aarne Ranta <aarne (at) cs.chalmers.se>
|
||||||
Last update: %%date(%c)
|
Last update: %%date(%c)
|
||||||
|
|
||||||
@@ -12,11 +12,19 @@ Last update: %%date(%c)
|
|||||||
%!target:html
|
%!target:html
|
||||||
|
|
||||||
|
|
||||||
==Plan==
|
==Abstract==
|
||||||
|
|
||||||
After an introduction to **demonstratives**
|
This document shows a method to write grammars
|
||||||
and **integrated multimodality**,
|
in which spoken utterances are accompanied by
|
||||||
we will show how multimodal grammars can be written in GF
|
pointing gestures. Computer applications of such
|
||||||
|
grammars are **multimodal dialogue systems**, in
|
||||||
|
which the pointing gestures are performed by
|
||||||
|
mouse clicks and movements.
|
||||||
|
|
||||||
|
After an introduction to the notions of
|
||||||
|
**demonstratives** and **integrated multimodality**,
|
||||||
|
we will show by a concrete example
|
||||||
|
how multimodal grammars can be written in GF
|
||||||
and how they can be used in dialogue systems.
|
and how they can be used in dialogue systems.
|
||||||
The explanation is given in three stages:
|
The explanation is given in three stages:
|
||||||
|
|
||||||
@@ -25,7 +33,7 @@ The explanation is given in three stages:
|
|||||||
+ How to use a multimodal resource grammar.
|
+ How to use a multimodal resource grammar.
|
||||||
|
|
||||||
|
|
||||||
==Multimodal expressions==
|
==Multimodal grammars==
|
||||||
|
|
||||||
**Demonstrative expressions** are an old idea. Such
|
**Demonstrative expressions** are an old idea. Such
|
||||||
expressions get their meaning from the context.
|
expressions get their meaning from the context.
|
||||||
@@ -37,8 +45,8 @@ expressions get their meaning from the context.
|
|||||||
In particular, as in these examples, the meaning
|
In particular, as in these examples, the meaning
|
||||||
can be obtained from accompanying pointing gestures.
|
can be obtained from accompanying pointing gestures.
|
||||||
|
|
||||||
Thus the meaning-bearing unit if neither the words and the
|
Thus the meaning-bearing unit is neither the words nor the
|
||||||
gesture alone, but their combination. Demonstratives
|
gestures alone, but their combination. Demonstratives
|
||||||
thus provide an example of **integrated multimodality**,
|
thus provide an example of **integrated multimodality**,
|
||||||
as opposed to parallel multimodality. In parallel
|
as opposed to parallel multimodality. In parallel
|
||||||
multimodality, speech and other modes of communication
|
multimodality, speech and other modes of communication
|
||||||
@@ -83,7 +91,7 @@ of **linearization types**. A linearization type is the type of
|
|||||||
the **concrete syntax objects** assigned to semantic values.
|
the **concrete syntax objects** assigned to semantic values.
|
||||||
What a GF grammar defines is a relation
|
What a GF grammar defines is a relation
|
||||||
```
|
```
|
||||||
abstract syntax trees --- concrete syntax objects
|
abstract syntax trees <---> concrete syntax objects
|
||||||
```
|
```
|
||||||
When modelling context-free grammar in GF,
|
When modelling context-free grammar in GF,
|
||||||
the concrete syntax objects are just strings.
|
the concrete syntax objects are just strings.
|
||||||
@@ -111,7 +119,7 @@ A simple example of a multimodal GF grammar is the one called
|
|||||||
the Tram Demo grammar. It was written by Björn Bringert within
|
the Tram Demo grammar. It was written by Björn Bringert within
|
||||||
the TALK project as a part of a dialogue system that
|
the TALK project as a part of a dialogue system that
|
||||||
deals with queries about tram timetables. The system interprets
|
deals with queries about tram timetables. The system interprets
|
||||||
a speech input in combination with clicks on a digital map.
|
a speech input in combination with mouse clicks on a digital map.
|
||||||
|
|
||||||
The abstract syntax of (a minimal fragment of) the Tram Demo
|
The abstract syntax of (a minimal fragment of) the Tram Demo
|
||||||
grammar is
|
grammar is
|
||||||
@@ -120,8 +128,8 @@ cat
|
|||||||
Input, Dep, Dest, Click ;
|
Input, Dep, Dest, Click ;
|
||||||
fun
|
fun
|
||||||
GoFromTo : Dep -> Dest -> Input ; -- "I want to go from x to y"
|
GoFromTo : Dep -> Dest -> Input ; -- "I want to go from x to y"
|
||||||
DepClick : Click -> Dep ; -- "from here" with click
|
DepHere : Click -> Dep ; -- "from here" with click
|
||||||
DestClick : Click -> Dest ; -- "to here" with click
|
DestHere : Click -> Dest ; -- "to here" with click
|
||||||
|
|
||||||
CCoord : Int -> Int -> Click ; -- click coordinates
|
CCoord : Int -> Int -> Click ; -- click coordinates
|
||||||
```
|
```
|
||||||
@@ -133,8 +141,8 @@ lincat
|
|||||||
|
|
||||||
lin
|
lin
|
||||||
GoFromTo x y = {s = ["I want to go"] ++ x.s ++ y.s ; p = x.p ++ y.p} ;
|
GoFromTo x y = {s = ["I want to go"] ++ x.s ++ y.s ; p = x.p ++ y.p} ;
|
||||||
DepClick c = {s = ["from here"] ; p = c.p} ;
|
DepHere c = {s = ["from here"] ; p = c.p} ;
|
||||||
DestClick c = {s = ["to here"] ; p = c.p} ;
|
DestHere c = {s = ["to here"] ; p = c.p} ;
|
||||||
|
|
||||||
CCoord x y = {p = "(" ++ x.s ++ "," ++ y.s ++ ")"} ;
|
CCoord x y = {p = "(" ++ x.s ++ "," ++ y.s ++ ")"} ;
|
||||||
```
|
```
|
||||||
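As a worked illustration of the rules above (computed by hand, not copied from a GF session), consider a tree with two clicks. The coordinate values are made up, and the sketch assumes that ``Int`` literals linearize to their digit strings in the field ``s``. The spoken words and the click sequence end up in separate fields, ``s`` and ``p``:

```
-- abstract syntax tree (the coordinates are hypothetical)
GoFromTo (DepHere (CCoord 10 20)) (DestHere (CCoord 30 40))

-- linearization record obtained by unfolding the lin rules
{s = "I want to go" ++ "from here" ++ "to here" ;
 p = "(" ++ "10" ++ "," ++ "20" ++ ")" ++ "(" ++ "30" ++ "," ++ "40" ++ ")"}
```

Parsing goes the other way: the speech string and the click string together determine the tree, which is what makes the multimodality integrated rather than parallel.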
@@ -185,7 +193,7 @@ we split verb phrases (``VP``) into a finite and infinitive part.
|
|||||||
lin Quest np vp = {s = vp.fin ++ np.s ++ vp.inf} ;
|
lin Quest np vp = {s = vp.fin ++ np.s ++ vp.inf} ;
|
||||||
```
|
```
|
||||||
|
|
||||||
==From grammars to dialogue systems==
|
===From grammars to dialogue systems===
|
||||||
|
|
||||||
The general recipe for using GF when building dialogue systems
|
The general recipe for using GF when building dialogue systems
|
||||||
is to write a grammar with the following components:
|
is to write a grammar with the following components:
|
||||||
@@ -218,57 +226,65 @@ manager by Prolog representations of abstract syntax.
|
|||||||
|
|
||||||
==Adding multimodality to a unimodal grammar==
|
==Adding multimodality to a unimodal grammar==
|
||||||
|
|
||||||
This section gives a recipe for converting a unimodal grammar to
|
This section gives a recipe for making any unimodal grammar
|
||||||
multimodal, by adding pointing gestures to expressions. The recipe
|
multimodal, by adding pointing gestures to chosen expressions. The recipe
|
||||||
guarantees that the resulting grammar remains semantically well-formed,
|
guarantees that the resulting grammar remains semantically well-formed,
|
||||||
i.e. type correct.
|
i.e. type correct.
|
||||||
|
|
||||||
|
|
||||||
===The multimodal conversion===
|
===The multimodal conversion===
|
||||||
|
|
||||||
The **multimodal conversion** of a grammar consists of three
|
The **multimodal conversion** of a grammar consists of seven
|
||||||
steps involving a decision, and four derivative steps:
|
steps, of which the first is always the same, the second
|
||||||
|
involves a decision, and the rest are derivative:
|
||||||
|
|
||||||
+ (Decision) Decide which categories are demonstrative. This means that their
|
+ Add the category ``Point`` with a standard linearization type.
|
||||||
expressions can (but need not) contain pointing gestures.
|
```
|
||||||
+ (Decision) Define constructors that are truly demonstrative, i.e. take
|
cat Point ;
|
||||||
a pointing gesture as an argument. These constructors have the form
|
lincat Point = {point : Str} ;
|
||||||
|
```
|
||||||
|
+ (Decision) Decide which constructors are demonstrative, i.e. take
|
||||||
|
a pointing gesture as an argument. Add a ``Point`` as their last argument.
|
||||||
|
The new type signatures for such constructors //d// have the form
|
||||||
```
|
```
|
||||||
fun d : ... -> Point -> D
|
fun d : ... -> Point -> D
|
||||||
```
|
```
|
||||||
In the simplest case, such a //d// is an already existing
|
+ (Derivative) Add a ``point`` field to the linearization type //L// of any
|
||||||
constructor, to which a ``Point`` argument it added. But it is also
|
demonstrative category //D//, i.e. a category that has at least one demonstrative
|
||||||
possible to add new constructors.
|
constructor:
|
||||||
+ (Derivative) Add an extra ``point`` field to the linearization type //L// of any
|
|
||||||
demonstrative category //D//:
|
|
||||||
```
|
```
|
||||||
lincat D = L ** {point : Str} ;
|
lincat D = L ** {point : Str} ;
|
||||||
```
|
```
|
||||||
+ (Derivative) Add an extra ``point`` field to the linearization //t// of any
|
+ (Derivative) If some other category //C// has a constructor //d// that takes
|
||||||
constructor //d// that has been made demonstrative:
|
|
||||||
```
|
|
||||||
lin d x1 ... xn p = t x1 ... xn ** p ;
|
|
||||||
```
|
|
||||||
+ (Decision) Define the linearization rules of those demonstrative constructors
|
|
||||||
that are new.
|
|
||||||
+ (Derivative) If some other category //C// has a constructor //f// that takes
|
|
||||||
demonstratives as arguments, make it demonstrative by adding a //point// field
|
demonstratives as arguments, make it demonstrative by adding a //point// field
|
||||||
to its linearization type.
|
to its linearization type.
|
||||||
|
+ (Derivative) Store the ``point`` field in the linearization //t// of any
|
||||||
|
constructor //d// that has been made demonstrative:
|
||||||
|
```
|
||||||
|
lin d x1 ... xn p = t x1 ... xn ** {point = p.point} ;
|
||||||
|
```
|
||||||
+ (Derivative) For each constructor //f// that takes demonstratives //D_1,...,D_n//
|
+ (Derivative) For each constructor //f// that takes demonstratives //D_1,...,D_n//
|
||||||
as arguments, collect the //point// fields of the arguments in the //point//
|
as arguments, collect the //point// fields of the arguments in the //point//
|
||||||
field of the value:
|
field of the value:
|
||||||
```
|
```
|
||||||
lin f x_1 ... x_m = t x_1 ... x_m ** {point = x_d1.point ++ ... ++ x_dn.point} ;
|
lin f x_1 ... x_m =
|
||||||
|
t x_1 ... x_m ** {point = x_d1.point ++ ... ++ x_dn.point} ;
|
||||||
```
|
```
|
||||||
Make sure that the pointings ``x_d1.point ... x_dn.point`` are concatenated
|
Make sure that the pointings ``x_d1.point ... x_dn.point`` are concatenated
|
||||||
in the same order as the arguments appear in the //linearization// //t//,
|
in the same order as the arguments appear in the //linearization// //t//,
|
||||||
which is not necessarily the same as the abstract argument order.
|
which is not necessarily the same as the abstract argument order.
|
||||||
|
+ (Derivative) To preserve type correctness, add an empty
|
||||||
|
``point`` field to the linearization //t// of any
|
||||||
|
constructor //c// of a demonstrative category:
|
||||||
|
```
|
||||||
|
lin c x1 ... xn = t x1 ... xn ** {point = []} ;
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
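The ordering caveat in the recipe can be made concrete with a variant of ``GoFromTo`` whose concrete syntax mentions the destination before the departure. This is only a sketch: the switched word order is an assumption made for illustration and is not taken from the Tram Demo grammar. The ``point`` field follows the order used in ``s``, not the order of the abstract arguments:

```
-- abstract argument order: departure x first, destination y second
fun GoFromTo : Dep -> Dest -> Input ;

-- hypothetical concrete syntax that utters the destination first
lin GoFromTo x y = {
  s     = ["I want to go"] ++ y.s ++ x.s ;
  point = y.point ++ x.point   -- same order as the arguments in s
  } ;
```

Writing ``point = x.point ++ y.point`` here would still type check, but the first click would then be paired with the wrong demonstrative.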
===An example of the conversion===
|
===An example of the conversion===
|
||||||
|
|
||||||
Start with a Tram Demo grammar with no demonstratives, but just
|
Start with a Tram Demo grammar with no demonstratives, but just
|
||||||
tram stop names and the indexical //here// (referring to the user's
|
tram stop names and the indexical //here// (interpreted as e.g. the user's
|
||||||
standing place).
|
standing place).
|
||||||
```
|
```
|
||||||
cat
|
cat
|
||||||
@@ -296,45 +312,48 @@ lin
|
|||||||
|
|
||||||
Almedal = {s = "Almedal"} ;
|
Almedal = {s = "Almedal"} ;
|
||||||
```
|
```
|
||||||
We now decide that the categories ``Dep`` and ``Dest`` are demonstrative.
|
Let us follow the steps of the recipe.
|
||||||
This means, derivatively, that ``Input`` is also demonstrative.
|
|
||||||
But ``Name`` remains unimodal.
|
+ We add the category ``Point`` and its linearization type.
|
||||||
|
+ We decide that ``DepHere`` and ``DestHere`` involve a pointing gesture.
|
||||||
|
+ We add ``point`` to the linearization types of ``Dep`` and ``Dest``.
|
||||||
|
+ Therefore, also add ``point`` to ``Input``. (But ``Name`` remains unimodal.)
|
||||||
|
+ Add ``p.point`` to the linearizations of ``DepHere`` and ``DestHere``.
|
||||||
|
+ Concatenate the points of the arguments of ``GoFromTo``.
|
||||||
|
+ Add an empty ``point`` to ``DepName`` and ``DestName``.
|
||||||
|
|
||||||
We also decide that ``DepHere`` and ``DestHere`` involve a pointing gesture.
|
|
||||||
This has consequences for ``GoFromTo`` but not for the other constructors.
|
|
||||||
However, even here we have to add an empty pointing sequence if required by the
|
|
||||||
linearization type.
|
|
||||||
|
|
||||||
In the resulting grammar, one category is added and
|
In the resulting grammar, one category is added and
|
||||||
two functions are changed in the abstract syntax:
|
two functions are changed in the abstract syntax (annotated by the step numbers):
|
||||||
```
|
```
|
||||||
cat
|
cat
|
||||||
Point ;
|
Point ; -- 1
|
||||||
fun
|
fun
|
||||||
DepHere : Point -> Dep ;
|
DepHere : Point -> Dep ; -- 2
|
||||||
DestHere : Point -> Dest ;
|
DestHere : Point -> Dest ; -- 2
|
||||||
|
|
||||||
```
|
```
|
||||||
The concrete syntax in its entirety looks as follows:
|
The concrete syntax in its entirety looks as follows:
|
||||||
```
|
```
|
||||||
lincat
|
lincat
|
||||||
Input, Dep, Dest = {s : Str ; point : Str} ;
|
Dep, Dest = {s : Str ; point : Str} ; -- 3
|
||||||
|
Input = {s : Str ; point : Str} ; -- 4
|
||||||
Name = {s : Str} ;
|
Name = {s : Str} ;
|
||||||
Point = {point : Str} ;
|
Point = {point : Str} ; -- 1
|
||||||
lin
|
lin
|
||||||
GoFromTo x y = {s = ["I want to go"] ++ x.s ++ y.s ;
|
GoFromTo x y = {s = ["I want to go"] ++ x.s ++ y.s ; -- 6
|
||||||
point = x.point ++ y.point
|
point = x.point ++ y.point
|
||||||
} ;
|
} ;
|
||||||
DepHere p = {s = ["from here"] ;
|
DepHere p = {s = ["from here"] ; -- 5
|
||||||
point = p.point
|
point = p.point
|
||||||
} ;
|
} ;
|
||||||
DestHere p = {s = ["to here"] :
|
DestHere p = {s = ["to here"] ; -- 5
|
||||||
point = p.point
|
point = p.point
|
||||||
} ;
|
} ;
|
||||||
DepName n = {s = ["from"] ++ n.s ;
|
DepName n = {s = ["from"] ++ n.s ; -- 7
|
||||||
point = []
|
point = []
|
||||||
} ;
|
} ;
|
||||||
DestName n = {s = ["to"] ++ n.s ;
|
DestName n = {s = ["to"] ++ n.s ; -- 7
|
||||||
point = []
|
point = []
|
||||||
} ;
|
} ;
|
||||||
Almedal = {s = "Almedal"} ;
|
Almedal = {s = "Almedal"} ;
|
||||||
@@ -345,6 +364,9 @@ What we need in addition, to use the grammar in applications, are
|
|||||||
+ Top-level categories, like ``Query`` and ``Speech`` in the original.
|
+ Top-level categories, like ``Query`` and ``Speech`` in the original.
|
||||||
|
|
||||||
|
|
||||||
|
But their proper place is probably in another grammar module, so that
|
||||||
|
the core Tram Demo grammar can be used in different systems e.g.
|
||||||
|
encoding clicks in different ways.
|
||||||
|
|
||||||
|
|
||||||
===Multimodal conversion combinators===
|
===Multimodal conversion combinators===
|
||||||
@@ -386,7 +408,8 @@ lincat
|
|||||||
Name = SS ;
|
Name = SS ;
|
||||||
|
|
||||||
lin
|
lin
|
||||||
GoFromTo x y = {s = ["I want to go"] ++ x.s ++ y.s} ** concatPoint x y ;
|
GoFromTo x y = {s = ["I want to go"] ++ x.s ++ y.s} **
|
||||||
|
concatPoint x y ;
|
||||||
DepHere = mkDem SS {s = ["from here"]} ;
|
DepHere = mkDem SS {s = ["from here"]} ;
|
||||||
DestHere = mkDem SS {s = ["to here"]} ;
|
DestHere = mkDem SS {s = ["to here"]} ;
|
||||||
DepName n = nonDem SS {s = ["from"] ++ n.s} ;
|
DepName n = nonDem SS {s = ["from"] ++ n.s} ;
|
||||||
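The definitions of the combinators are not visible in this part of the diff. One way they could be written is sketched below. The module name ``DemSketch`` and the helper ``noPoint`` are hypothetical, the types are inferred from how ``mkDem``, ``nonDem`` and ``concatPoint`` are used above, and the code has not been checked against a current GF compiler:

```
resource DemSketch = {
  oper
    -- the record type carrying the pointing sequence
    Point : Type = {point : Str} ;

    -- a demonstrative version of any linearization type t
    Dem : Type -> Type = \t -> t ** Point ;

    -- the empty pointing sequence (hypothetical helper)
    noPoint : Point = {point = []} ;

    -- attach a pointing gesture to a unimodal record
    mkDem : (t : Type) -> t -> Point -> Dem t = \_,x,p -> x ** p ;

    -- make a unimodal record demonstrative with no gesture
    nonDem : (t : Type) -> t -> Dem t = \t,x -> mkDem t x noPoint ;

    -- collect the gestures of two arguments, in the given order
    concatPoint : Point -> Point -> Point = \x,y ->
      {point = x.point ++ y.point} ;
}
```

Under these assumptions ``mkDem SS {s = ["from here"]}`` has the type ``Point -> Dem SS``, which is why ``DepHere`` and ``DestHere`` can be defined without naming their argument.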
@@ -406,19 +429,19 @@ concise. Notice the use of partial application in ``DepHere`` and
|
|||||||
|
|
||||||
The main advantage of using GF when building dialogue systems is
|
The main advantage of using GF when building dialogue systems is
|
||||||
that various components of the system
|
that various components of the system
|
||||||
can be automatically generated GF grammars.
|
can be automatically generated from GF grammars.
|
||||||
Writing grammars, however, can still be a considerable
|
Writing these grammars, however, can still be a considerable
|
||||||
task. A case in point are multilingual systems:
|
task. A case in point are multilingual systems:
|
||||||
how to localize e.g. a system built in a car to
|
how to localize e.g. a system built in a car to
|
||||||
the languages of all those customers to whom the
|
the languages of all those customers to whom the
|
||||||
car is sold? This problem has been the main focus of
|
car is sold? This problem has been the main focus of
|
||||||
GF for some years, and the solution on which work has been
|
GF for some years, and the solution on which most work has been
|
||||||
done is the development of **resource grammar libraries**.
|
done is the development of **resource grammar libraries**.
|
||||||
These libraries work in the same way as program libraries
|
These libraries work in the same way as program libraries
|
||||||
in software engineering, enabling a division of labour
|
in software engineering, enabling a division of labour
|
||||||
between, in the present case, linguists and domain experts.
|
between linguists and domain experts.
|
||||||
|
|
||||||
One of the challenges in the resource grammars of different
|
One of the goals in the resource grammars of different
|
||||||
languages has been to provide a **language-independent API**,
|
languages has been to provide a **language-independent API**,
|
||||||
which makes the same resource grammar functions available for
|
which makes the same resource grammar functions available for
|
||||||
different languages. For instance, the categories
|
different languages. For instance, the categories
|
||||||
@@ -441,15 +464,16 @@ multimodality is heavily dependent on similar things. What can we
|
|||||||
do to make multimodal grammars easier to write (for different languages)?
|
do to make multimodal grammars easier to write (for different languages)?
|
||||||
There are two orthogonal answers:
|
There are two orthogonal answers:
|
||||||
|
|
||||||
+ Use resource grammars and before and then apply the multimodal
|
+ Use resource grammars to write a unimodal dialogue grammar and
|
||||||
|
then apply the multimodal
|
||||||
conversion to manually chosen parts.
|
conversion to manually chosen parts.
|
||||||
+ Use **multimodal resource grammars** to derive multimodal
|
+ Use **multimodal resource grammars** to derive multimodal
|
||||||
dialogue system grammars automatically.
|
dialogue system grammars directly.
|
||||||
|
|
||||||
|
|
||||||
The multimodal resource grammar library has been obtained from
|
The multimodal resource grammar library has been obtained from
|
||||||
the unimodal one by applying, manually, an idea similar to the
|
the unimodal one by applying the multimodal conversion manually.
|
||||||
multimodal conversion. In addition, the API has been simplified
|
In addition, the API has been simplified
|
||||||
by leaving out structures needed in written technical documents
|
by leaving out structures needed in written technical documents
|
||||||
(the original application area of GF) but not in spoken dialogue.
|
(the original application area of GF) but not in spoken dialogue.
|
||||||
|
|
||||||
@@ -646,7 +670,7 @@ the ``Multimodal`` API has been implemented:
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
==A problem: switched order==
|
===The order problem===
|
||||||
|
|
||||||
It was pointed out in the section on the multimodal conversion that
|
It was pointed out in the section on the multimodal conversion that
|
||||||
the concrete word order may be different from the abstract one,
|
the concrete word order may be different from the abstract one,
|
||||||
@@ -667,7 +691,7 @@ ignore the word order problem, if it is correctly dealt with in
|
|||||||
the resource.
|
the resource.
|
||||||
|
|
||||||
|
|
||||||
==A recipe for using a resource library==
|
===A recipe for using a resource library===
|
||||||
|
|
||||||
In the beginning, we believed resource grammars are all that
|
In the beginning, we believed resource grammars are all that
|
||||||
an application grammarian needs to write a concrete syntax.
|
an application grammarian needs to write a concrete syntax.
|
||||||
@@ -676,8 +700,8 @@ the grammar development in this way: selecting functions from
|
|||||||
a resource API requires more abstract thinking than just
|
a resource API requires more abstract thinking than just
|
||||||
writing things (maybe even in a context-free grammar notation,
|
writing things (maybe even in a context-free grammar notation,
|
||||||
also supported by GF). This experience has led to the following
|
also supported by GF). This experience has led to the following
|
||||||
steps for grammar development, which at the same time give
|
steps for grammar development, which, while permitting
|
||||||
the work a quick start and in the end used increased abstraction
|
a quick start of the work, towards the end increase abstraction
|
||||||
to localize the grammar in different languages.
|
to localize the grammar in different languages.
|
||||||
|
|
||||||
+ Encode domain ontology in an abstract syntax, ``Domain``.
|
+ Encode domain ontology in an abstract syntax, ``Domain``.
|
||||||
|
|||||||