9. Some Examples of Interaction of Linguistic and Extra-linguistic Knowledge in Interpretation.

Drawing a border between linguistic and extra-linguistic (domain-specific) knowledge is a rather hard task. Consider the following problem text: "Две машины выехали из двух городов... Каково будет расстояние между ними через два часа?" ("Two cars departed from two cities... What will be the distance between them in two hours?"). How do we grasp that the words "между ними" ("between them") refer to the cars rather than to the cities? One may argue that the time pointer in the question plays a specific role: it would be redundant if we were referring to cities, i.e., to immovable objects. This argument (which is not easily formalized within the proposed model) leads to the conclusion that, when establishing an anaphoric link, we must always avoid variants that defy interpretation in the given domain. But the argument cannot be applied to language in general, or else we would never be able to say absurdities. Could the choice of the pronoun's antecedent be influenced by topic-focus relations rather than by the notion of a constant distance between cities? Does the word "машины" ("cars") in the first sentence belong to the topic, while "городов" ("cities") belongs neither to the topic nor to the focus? Or are both explanations valid? Moreover, the statement about constant distance may itself be treated (with a certain stretch) as linguistic in nature: what if the word "расстояние" ("distance") has two different "meanings", the distance between immovable objects and the distance between objects at least one of which can move? In that case a time pointer is acceptable only with the second meaning.
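The domain-based argument above can be illustrated with a minimal sketch. It is not the model proposed in this paper: the `Referent` class, the `movable` flag and the function `distance_antecedents` are hypothetical names introduced only for illustration. The sketch assumes that a time pointer rules out the "distance between immovable objects" reading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Referent:
    """A candidate antecedent for the pronoun (hypothetical structure)."""
    noun: str
    movable: bool


def distance_antecedents(candidates, has_time_pointer):
    """Return the candidates compatible with the word "distance".

    Assumption: with a time pointer ("in two hours") only the reading
    "distance between objects that can move" is acceptable, so
    immovable candidates are discarded. Without a time pointer,
    nothing is filtered out.
    """
    if not has_time_pointer:
        return list(candidates)
    return [c for c in candidates if c.movable]


cars = Referent("машины", movable=True)
cities = Referent("городов", movable=False)

# "... расстояние между ними через два часа?" contains a time pointer.
print(distance_antecedents([cars, cities], has_time_pointer=True))
```

Such a filter encodes only the first explanation; it says nothing about topic-focus relations, which would need a separate source of evidence.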

One more example: "Поезду осталось пройти 50 км." ("The train had 50 km left to go"). Solving problems that contain a condition like this usually requires a model of composite motion. The model requires three motions to be considered: the one that should be performed (the "whole" motion) and the one already performed (these two are indirect referents), and the remaining one (the direct referent). Motions mentioned in other sentences may be merged with one of these three. E.g., the preceding sentence could say "поезд прошел..." ("the train passed...") or "поезд должен был пройти..." ("the train had to pass..."). The only feature guiding the merging process is modality (or tense). Thus the merging of motions is guided here by a linguistic feature. And the question remains: is the appearance of three motions due to the word "осталось" ("remained") a purely linguistic phenomenon?
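A minimal sketch of the composite motion model may make the merging step more concrete. All names here (`CompositeMotion`, `SLOT_BY_MODALITY`, the modality labels) are hypothetical, and the figure of 70 km for the preceding sentence is an assumed value, since the example text leaves it elided. The only relation used is that the whole motion equals the performed motion plus the remaining one.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical modality labels mapped to the three slots of the model.
SLOT_BY_MODALITY = {
    "planned": "whole",        # "должен был пройти" ("had to pass")
    "performed": "done",       # "прошел" ("passed")
    "remaining": "remaining",  # "осталось пройти" ("remained to pass")
}


@dataclass
class CompositeMotion:
    whole: Optional[float] = None      # distance to be covered, km
    done: Optional[float] = None       # distance already covered, km
    remaining: Optional[float] = None  # distance still to be covered, km

    def merge(self, modality: str, distance_km: float) -> None:
        """Merge a motion from another sentence, guided only by modality."""
        slot = SLOT_BY_MODALITY[modality]
        current = getattr(self, slot)
        if current is not None and current != distance_km:
            raise ValueError(f"conflicting values for the {slot} motion")
        setattr(self, slot, distance_km)

    def solve(self) -> None:
        """Fill the single missing slot from whole = done + remaining."""
        if self.whole is None and None not in (self.done, self.remaining):
            self.whole = self.done + self.remaining
        elif self.done is None and None not in (self.whole, self.remaining):
            self.done = self.whole - self.remaining
        elif self.remaining is None and None not in (self.whole, self.done):
            self.remaining = self.whole - self.done


motion = CompositeMotion()
motion.merge("remaining", 50)  # "Поезду осталось пройти 50 км."
motion.merge("performed", 70)  # assumed value for "поезд прошел..."
motion.solve()
print(motion)
```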

A more complex example: "Поезд задержался на 15 минут, поэтому, чтобы прибыть на станцию по расписанию, он увеличил скорость на..." ("The train was delayed by 15 minutes, and therefore, in order to arrive at the station on schedule, it increased its speed by..."). To use this text for problem solving, the word "задержаться" ("to be delayed") must be explained as follows: "to begin an action later than it was planned". Then the words "поезд задержался на 15 минут" ("the train was delayed by 15 minutes") produce a net with the following semantic objects: 1. the beginning of an action (real), 2. the beginning of the action (planned), 3. the time interval between them, i.e., the 15 minutes. Interpreting the subsequent text must then lead to merging this unmentioned action with a motion from some point to the station (also in the two modalities); the "planned" modality must be merged with the pointer "according to the schedule". Can methods like this be treated as legitimate explanations of the meaning of linguistic units, or are they merely tricks?
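The net described above can be sketched as a small data structure. This is an illustration under stated assumptions, not the representation actually used in the model: `Action`, `DelayNet` and the three functions are hypothetical names, and the planned travel time of 120 minutes is an assumed figure. The arithmetic follows from requiring the real and planned arrival times to coincide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Action:
    """An action whose identity may not yet be known from the text."""
    description: Optional[str] = None


@dataclass
class DelayNet:
    """Net for "to begin an action later than it was planned".

    It holds the action (with a real and a planned beginning) and the
    time interval between the two beginnings.
    """
    action: Action
    interval_min: float
    modalities: tuple = ("real", "planned")


def interpret_delay(minutes: float) -> DelayNet:
    """Build the net for "задержался на N минут"; the action is unnamed."""
    return DelayNet(action=Action(), interval_min=minutes)


def merge_with_motion(net: DelayNet, description: str) -> DelayNet:
    """Merge the unmentioned action with a motion found in later text."""
    net.action.description = description
    return net


def real_duration_to_keep_schedule(net: DelayNet, planned_duration_min: float) -> float:
    """Travel time needed to arrive on schedule.

    planned arrival = planned start + planned duration
    real arrival    = planned start + interval + real duration
    Equal arrivals give: real duration = planned duration - interval.
    """
    return planned_duration_min - net.interval_min


net = merge_with_motion(interpret_delay(15), "motion to the station")
print(real_duration_to_keep_schedule(net, planned_duration_min=120))
```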

We must also note that the abundance of cases that make these texts a source of interesting linguistic observations is somewhat artificial in nature. Underlying it are specific "rules of the game" that oblige the author of such texts to pack complex numerical correlations into a rather compact verbal wrapper.

Here is one more example from a simple domain, queries to a database of staff information: "find a department where the older the employee is, the more he/she earns." Mapping the text to the database structure is not a difficult task. For the comparative construction one must use a model that extracts the set of objects being compared and the two compared parameters (age and salary). This model is easily connected with a program that tests the age/salary correlation and also, if needed, with a translation of the query into the language of predicate calculus ("for any two employees of the department, if the age of the first exceeds the age of the second..."). This example shows the inexpediency of using an intermediary representation between natural language and a database. Clearly, using the initial formulation (more precisely, the comparative construction model) leads to a more efficient computer program than the one resulting from the predicate formula. Indeed, a program can sort the set by one of the fields and test the monotonicity of the values of the second field in the sorted sequence. Extracting this possibility from the predicate calculus representation is rather difficult.
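The contrast between the two routes can be sketched as follows. The record layout (dictionaries with `dept`, `age` and `salary` keys), the function names and the sample data are assumptions made for illustration. The first function follows the predicate formula literally and compares every pair of employees; the second sorts by age and tests that salary grows along the sorted sequence, treating employees of equal age as unconstrained with respect to each other.

```python
from itertools import groupby, permutations
from operator import itemgetter


def monotone_pairwise(employees):
    """Literal reading of the formula: check all ordered pairs, O(n^2)."""
    for a, b in permutations(employees, 2):
        if a["age"] > b["age"] and not a["salary"] > b["salary"]:
            return False
    return True


def monotone_sorted(employees):
    """Sort by age, then test monotonicity of salary, O(n log n)."""
    ordered = sorted(employees, key=itemgetter("age"))
    highest_so_far = None  # highest salary among strictly younger employees
    for _, group in groupby(ordered, key=itemgetter("age")):
        salaries = [e["salary"] for e in group]
        if highest_so_far is not None and min(salaries) <= highest_so_far:
            return False
        if highest_so_far is None or max(salaries) > highest_so_far:
            highest_so_far = max(salaries)
    return True


def departments_where_older_earn_more(staff):
    """Answer the query for every department in the staff table."""
    by_dept = {}
    for employee in staff:
        by_dept.setdefault(employee["dept"], []).append(employee)
    return sorted(d for d, members in by_dept.items() if monotone_sorted(members))


# Assumed sample data, for illustration only.
STAFF = [
    {"dept": "A", "age": 30, "salary": 100},
    {"dept": "A", "age": 40, "salary": 150},
    {"dept": "A", "age": 40, "salary": 140},
    {"dept": "B", "age": 25, "salary": 120},
    {"dept": "B", "age": 35, "salary": 110},
]
print(departments_where_older_earn_more(STAFF))
```

Both functions return the same answer on any input; the difference lies in the amount of work, which is the point made above about the predicate formula hiding the sorting strategy.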