The T-expressions in the START system are built using the pattern <subject relation object> at every level of embedding and thus mimic the hierarchical organization of English sentences and parallel the representational characteristics of natural language. A language-based knowledge representation system has many advantages: it is very expressive and easy to use; it provides a uniform symbolic representation for parsing and generation; and it makes it possible to automatically create large knowledge bases from natural language texts.
However, a representation mimicking the hierarchical organization of natural language syntax has one undesirable consequence: sentences differing in their surface syntax but close in meaning are not considered similar by the system. Thus, given sentence (10) as input, START will create T-expressions (11), whereas a near paraphrase, sentence (12), will generate T-expressions (13):
(10) Bill surprised Hillary with his answer.
(11)
<<Bill surprise Hillary> with answer>
<answer related-to Bill>
(12) Bill's answer surprised Hillary.
(13)
<answer surprise Hillary>
<answer related-to Bill>
Speakers of English know (at least implicitly) that in
sentence (10), the subject (Bill) brings about
the emotional reaction (surprise) by means of some property
expressed in the with phrase. Sentence (12)
describes the same emotional reaction as in (10)
despite different syntactic realizations of some of the arguments;
namely, in (12), the property and its possessor are
collapsed into a single noun phrase. It seems natural that this kind
of knowledge be available to a natural language system. However,
START, as described so far, does not consider T-expressions (11) and (13), which are associated
with these sentences, to be similar.
The difference in the T-expressions becomes particularly problematic
when START is asked a question. Suppose the input text includes the
surprise sentence (10) that is stored in the
knowledge base using T-expressions (11). Now
suppose the user asks the following question:
(14) Whose answer surprised Hillary?
Although a speaker of English could easily answer this
question after being told sentence (10),
START would not be able to answer it because T-expressions (15)
produced by question (14) would not match
T-expressions (11) in the knowledge base.
(15)
<answer surprise Hillary>
<answer related-to whom>
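The failed match can be reproduced with a small structural matcher. This is only a sketch: the tuple encoding, the treatment of wh-words as variables, and the names match and answer are assumptions made for illustration, not a description of START's matcher.

```python
WH_WORDS = {"who", "whom", "what"}  # assumption: wh-words behave as variables

def match(pattern, fact, bindings):
    """Match one query element against one stored element; None on failure."""
    if isinstance(pattern, str) and pattern in WH_WORDS:
        if pattern in bindings and bindings[pattern] != fact:
            return None
        return {**bindings, pattern: fact}
    if (isinstance(pattern, tuple) and isinstance(fact, tuple)
            and len(pattern) == len(fact)):
        for p, f in zip(pattern, fact):
            bindings = match(p, f, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == fact else None

def answer(query, kb, bindings=None):
    """Return bindings that satisfy every query T-expression, or None."""
    bindings = {} if bindings is None else bindings
    if not query:
        return bindings
    for fact in kb:
        extended = match(query[0], fact, bindings)
        if extended is not None:
            result = answer(query[1:], kb, extended)
            if result is not None:
                return result
    return None

# T-expressions (15) for question (14), and the two candidate knowledge bases.
Q15 = [("answer", "surprise", "Hillary"), ("answer", "related-to", "whom")]
KB_FROM_10 = [(("Bill", "surprise", "Hillary"), "with", "answer"),
              ("answer", "related-to", "Bill")]
KB_FROM_12 = [("answer", "surprise", "Hillary"),
              ("answer", "related-to", "Bill")]
```

With these definitions, answer(Q15, KB_FROM_10) finds no match, whereas answer(Q15, KB_FROM_12) binds whom to Bill.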
To be able to handle such questions, the START system should be made
aware of the interactions between the syntactic and semantic
properties of verbs. Interactions similar to the one
just described pervade the English language and, therefore, cannot be
ignored in the construction of a natural language system.
The surprise example illustrates that START needs information
that allows it to deduce the relationship between alternate
realizations of the arguments of verbs. In this instance, we want
START to know that whenever A surprised B with C, then it is
also true that A's C surprised B. We do this by introducing
rules that make explicit the relationship between alternate
realizations of the arguments of verbs. We call such rules
S-rules. Here is the S-rule that solves the problem caused by the
verb surprise:
(16)
Surprise S-rule
If <<subject surprise object1> with object2>
Then <object2 surprise object1>
S-rules are implemented as a rule-based system. Each S-rule
is made up of two parts, an antecedent (the If-clause) and a
consequent (the Then-clause). Each clause consists of a set of
templates for T-expressions, where the template elements are filled by
variables or constants. The Surprise S-rule will apply only to
T-expressions that involve the verb surprise and that match the
structure of its antecedent: a surprise clause embedded as the
subject of a with relation.
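The antecedent/consequent structure of rule (16) can be sketched as data plus a small unifier. The dictionary layout, the VARIABLES set, and the function names below are hypothetical choices for this illustration; the source does not specify how S-rules are stored.

```python
VARIABLES = {"subject", "object1", "object2"}  # template slots used in rule (16)

SURPRISE_RULE = {
    "if": [(("subject", "surprise", "object1"), "with", "object2")],
    "then": [("object2", "surprise", "object1")],
}

def unify(template, texp, env):
    """Extend env so that template equals texp, or return None."""
    if isinstance(template, str):
        if template in VARIABLES:
            if template in env:
                return env if env[template] == texp else None
            return {**env, template: texp}
        return env if template == texp else None  # constant such as "surprise"
    if not isinstance(texp, tuple) or len(template) != len(texp):
        return None
    for t, x in zip(template, texp):
        env = unify(t, x, env)
        if env is None:
            return None
    return env

def instantiate(template, env):
    """Replace variables in a consequent template by their bindings."""
    if isinstance(template, str):
        return env.get(template, template)
    return tuple(instantiate(t, env) for t in template)

def match_all(templates, texps, env):
    """Yield every environment that satisfies all antecedent templates."""
    if not templates:
        yield env
        return
    for texp in texps:
        new_env = unify(templates[0], texp, env)
        if new_env is not None:
            yield from match_all(templates[1:], texps, new_env)

def apply_rule(rule, texps):
    """Return the consequent T-expressions licensed by the given input."""
    return [instantiate(t, env)
            for env in match_all(rule["if"], texps, {})
            for t in rule["then"]]

SURPRISE_INPUT = [(("Bill", "surprise", "Hillary"), "with", "answer"),
                  ("answer", "related-to", "Bill")]
OTHER_INPUT = [(("Bill", "please", "Hillary"), "with", "answer")]
```

Here apply_rule(SURPRISE_RULE, SURPRISE_INPUT) yields <answer surprise Hillary>, while OTHER_INPUT yields nothing because the constant surprise does not match.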
S-rules operate in two modes: forward and backward.
When triggered by certain conditions, S-rules in the forward mode
allow the system to intercept T-expressions produced by the
understanding module, transform or augment them in a way specified by
the rule, and then incorporate the result into the knowledge base.
For instance, if the Surprise S-rule is used in the forward mode,
as soon as its antecedent matches the first of T-expressions (17) produced by the understanding module, it creates the
new T-expression <answer surprise Hillary>, shown in (18) together with the unchanged
possession fact, and adds it to the knowledge base:
(17)
<<Bill surprise Hillary> with answer>
<answer related-to Bill>
(18)
<answer surprise Hillary>
<answer related-to Bill>
Now question (14) can be answered since
T-expressions (15) associated with this
question match against T-expressions (18).
The generating module of START responds:
(19) Bill's answer surprised Hillary.
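The forward-mode behaviour just illustrated can be sketched as follows. For brevity the Surprise S-rule is hard-coded as a Python function; the KnowledgeBase class and its tell method are hypothetical names. The sketch also assumes the "augment" variant, in which the original T-expressions are kept alongside the derived one.

```python
def surprise_forward(texp):
    """Rule (16), hard-coded: <<s surprise o1> with o2> => <o2 surprise o1>."""
    if (isinstance(texp, tuple) and len(texp) == 3 and texp[1] == "with"
            and isinstance(texp[0], tuple) and len(texp[0]) == 3
            and texp[0][1] == "surprise"):
        (_, _, obj1), _, obj2 = texp
        return [(obj2, "surprise", obj1)]
    return []

class KnowledgeBase:
    """Toy store that runs forward rules on every incoming T-expression."""

    def __init__(self, forward_rules):
        self.facts = set()
        self.forward_rules = forward_rules

    def tell(self, texps):
        """Store parsed T-expressions plus whatever the forward rules derive."""
        agenda = list(texps)
        while agenda:
            texp = agenda.pop()
            if texp in self.facts:
                continue  # already known; avoids reprocessing
            self.facts.add(texp)
            for rule in self.forward_rules:
                agenda.extend(rule(texp))

kb = KnowledgeBase([surprise_forward])
kb.tell([(("Bill", "surprise", "Hillary"), "with", "answer"),
         ("answer", "related-to", "Bill")])
```

After the call to tell, kb.facts contains <answer surprise Hillary> even though no sentence stated it directly, so the T-expressions of question (14) can be matched without further inference.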
All additional facts produced by the forward S-rules are
instantly entered in the knowledge base. The forward mode is
especially useful when the information processed by START is put into
action by another computer system because in such a situation START
ought to provide the interfacing system with as much data as possible.
In contrast, the backward mode is employed when the user queries the
knowledge base. Often for reasons of computational efficiency, it is
advantageous not to incorporate all inferred knowledge into the
knowledge base immediately. S-rules in the backward mode trigger
when a request comes in which cannot be answered directly, initiating
a search in the knowledge base to determine if the answer can be
deduced from the available information. For example, the Surprise S-rule
used in the backward mode does not trigger when
sentence (10) is read and T-expressions (11) are produced by START; it triggers only when
question (14) is asked.
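A comparable sketch of the backward mode, again with the Surprise S-rule hard-coded. The ANY wildcard, the function names, and the single-step search are simplifying assumptions of this illustration; the source does not describe the search procedure in detail.

```python
ANY = None  # wildcard in goals; an assumption of this sketch

def matches(goal, fact):
    """Structural match in which ANY matches any element."""
    if goal is ANY:
        return True
    if isinstance(goal, tuple):
        return (isinstance(fact, tuple) and len(goal) == len(fact)
                and all(matches(g, f) for g, f in zip(goal, fact)))
    return goal == fact

def surprise_backward(goal):
    """Rule (16) read right to left: to establish <o2 surprise o1>,
    look for a stored <<anyone surprise o1> with o2>."""
    if isinstance(goal, tuple) and len(goal) == 3 and goal[1] == "surprise":
        obj2, _, obj1 = goal
        return [((ANY, "surprise", obj1), "with", obj2)]
    return []

def holds(goal, facts, backward_rules):
    """True if goal is stored, or follows by one backward rule application."""
    if any(matches(goal, fact) for fact in facts):
        return True
    return any(matches(subgoal, fact)
               for rule in backward_rules
               for subgoal in rule(goal)
               for fact in facts)

# Only what was parsed from sentence (10) is stored; nothing is added later.
facts = {(("Bill", "surprise", "Hillary"), "with", "answer"),
         ("answer", "related-to", "Bill")}
goal = ("answer", "surprise", "Hillary")
```

With no backward rules, holds(goal, facts, []) fails; with [surprise_backward] it succeeds, and the set of stored facts is left unchanged.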
Boris Katz
Thu Feb 27 15:34:49 EST 1997