A Computational Approach to the Comparative Construction
- Jessie Pinkham
MSR-TR-96-16
The field of computational linguistics has as one of its main goals the analysis of naturally occurring, unrestricted text input, based in part on the belief that linguists’ armchair examples provide a poor sample of linguistic reality. This paper examines the results of searching the Brown corpus for examples of comparative constructions, and outlines the analysis we propose to implement in the parsing module of the Microsoft Natural Language Understanding System. Comparatives are ideally suited to corpus searches because they can be easily identified through their key words (as, than, more, less), and because they are complex enough that artificially created examples are a poor substitute for naturally occurring cases.
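The key-word search described above can be illustrated with a minimal sketch. It assumes the corpus is already available as a list of sentence strings; the function name `find_comparative_candidates`, the whole-word regular-expression matching, and the sample sentences are illustrative assumptions and not the procedure actually used on the Brown corpus.

```python
import re

# Key words named in the text. The matching strategy (whole-word,
# case-insensitive) is an assumption made for this sketch.
COMPARATIVE_KEYWORDS = ("as", "than", "more", "less")

_KEYWORD_RE = re.compile(
    r"\b(" + "|".join(COMPARATIVE_KEYWORDS) + r")\b", re.IGNORECASE
)


def find_comparative_candidates(sentences):
    """Return (sentence, matched_keywords) pairs for sentences with a key word.

    This only flags candidates: a word such as "as" also occurs in
    non-comparative uses, so the hits still need manual or parser-based review.
    """
    candidates = []
    for sentence in sentences:
        matches = [m.lower() for m in _KEYWORD_RE.findall(sentence)]
        if matches:
            candidates.append((sentence, matches))
    return candidates


# Hypothetical sample sentences, not drawn from the Brown corpus.
sample = [
    "She is taller than her brother.",
    "The committee met on Tuesday.",
    "He ran as fast as he could.",
]
results = find_comparative_candidates(sample)
for sentence, keywords in results:
    print(keywords, sentence)
```

A filter of this kind over-generates, which is consistent with the goal of building an inventory first and classifying the retrieved occurrences afterwards.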
The paper will review the comparative construction as outlined in McCawley (1988), which clearly presents most of the salient facts that have been discussed in the literature. When dealing with comparatives, considerable effort is usually spent on constructing semantically appropriate underlying forms, and on describing the syntactic rules needed to derive surface forms from those underlying constructs. I will not go into this in much detail here, primarily because, under a computational approach, it is more fruitful to look for interpretive approaches to mapping the meaning of comparatives. My first goal will be to inventory the occurrences of the construction that are found in the Brown corpus, and to devise a classification system that more accurately reflects the problem from a computational perspective. My second objective will be to present and extend the analysis of comparatives proposed in Fauconnier (1985).
The central concept in the pragmatic/semantic perspective proposed in Fauconnier (1985) is that comparative constructions can span two mental spaces, i.e. “constructs distinct from linguistic structures, but built up in any discourse according to guidelines provided by the linguistic expressions. In the model, mental spaces will be represented as structured, incrementable sets – that is, sets with elements (a, b, c, …) and relations holding between them (R1ab, R2a, R3cbf), such that new elements can be added to them and new relations established between their elements” (p. 16). Once we accept the view that the comparative construction spans two mental spaces, the interpretation of “elliptical” cases becomes a pragmatics/discourse issue rather than one of ellipsis of a complex underlying structure.
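The quoted definition of a mental space as a structured, incrementable set can be made concrete with a small data-structure sketch. The class name `MentalSpace`, its methods, and the encoding of relations as tuples are assumptions introduced for illustration; they are not taken from Fauconnier (1985) or from the parsing module discussed here.

```python
from dataclasses import dataclass, field


@dataclass
class MentalSpace:
    """A structured, incrementable set: elements plus relations over them."""

    name: str
    elements: set = field(default_factory=set)
    # Each relation is stored as a tuple: (relation_name, arg1, arg2, ...).
    relations: set = field(default_factory=set)

    def add_element(self, element):
        """Increment the space with a new element."""
        self.elements.add(element)

    def add_relation(self, relation_name, *args):
        """Establish a relation between elements already in the space."""
        missing = [a for a in args if a not in self.elements]
        if missing:
            raise ValueError(f"unknown elements in {self.name}: {missing}")
        self.relations.add((relation_name, *args))


# Mirror the notation in the quotation: elements a, b, c, ... and
# relations R1ab, R2a, R3cbf.
space = MentalSpace("M")
for element in ("a", "b", "c", "f"):
    space.add_element(element)
space.add_relation("R1", "a", "b")
space.add_relation("R2", "a")
space.add_relation("R3", "c", "b", "f")
print(sorted(space.relations))
```

Under this encoding, a comparative that spans two mental spaces would relate an element of one `MentalSpace` instance to an element of another. How that cross-space link should be represented is left open in this sketch.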