Yorick A. Wilks
Brian M. Slator
A Bradford Book:
The ACL-MIT Press Series in Natural Language Processing.
July 1995 -- 288 pp. -- $32.50
"The use of computers to understand words continues to be an area of burgeoning research. Electric Words is the first general survey of and introduction to the entire range of work in lexical linguistics and corpora -- the study of such on-line resources as dictionaries and other texts -- in the broader fields of natural language processing and artificial intelligence.
The authors integrate and synthesize the goals and methods of computational lexicons in relation to AI's sister disciplines of philosophy, linguistics, and psychology. One of the underlying messages of the book is that current research should be guided by both computational and theoretical tools and not only by statistical techniques -- that matters have gone far beyond counting to encompass the difficult province of meaning itself and how it can be formally expressed."
(Quoted from The MIT Press Computer Science and Artificial Intelligence Spring 1995 Catalog)
"Dictionaries and computation are two subjects not often brought together in the same book, or even the same proposition until recently. The aim of this book is to explore the growing relations between them and, in particular, to investigate whether what is found in traditional dictionaries can be of service to those concerned with getting computers to process and, some would say, understand natural languages like English, Japanese, Spanish and so on."
"Artificial Intelligence (AI) will be a subject never far from the concerns of this book. Most of its adherents have claimed that coded knowledge of the real world was required if machines were to understand language in any serious sense. The problem was always how to get it: there were no suitable sources and no time to hand-craft it in. [...] Dictionaries, like encyclopedias, are now of some immediate interest to anyone who shares the knowledge-directed goals of AI."
"[...] Although machine translation, by methods now considered superficial by most who work in AI, processed large volumes of text by the mid-sixties, those AI researchers who aimed for something deeper, for a modeling of what most people would call "understanding" a text, wrote programs that processed only a few sentences, a fact often disguised in their theses, papers and books. Reputations have been made from the processing of little more than a single sentence."
"Those earliest efforts [at computational lexicography] involved little more than word frequency counting, but they were very important. They were not, as they were often treated, the most boring and unintellectual parts of computational linguistics. Matters have now gone far beyond mere counting, and have begun to intrude into the difficult province of meaning itself, and the question of how it can be formally expressed. This is perhaps the most interesting area for the authors of this book and we shall give it full coverage [...]"
"The issue here can be understood by analogy with the studies [...] of legal decisions taken in the courts. [This] activity was not thought quite right by many philosophers of ethics, for whom their subject was a wholly a priori one, having nothing to gain from empirical evidence. [It was] argued that, whereas ethical philosophers argue endlessly in the abstract about moral matters, courts, especially appeal courts, where written justifications of metaphysical and ethical issues are regularly given, actually take concrete decisions with far reaching effect. Philosophers argue about the principles of causation, for example, but judges take reasoned decisions about what caused what in particular cases."
"One can argue that lexicographers, too, take concrete decisions about meaning and its expression (indeed, thousands and thousands of them, under commercial time constraints!) while philosophers and formal semanticists argue about principles with little concrete effect."
"[...] Judges, rightly or wrongly, believe they are eliciting and expressing the common meaning of concepts and so do lexicographers. If that is so then, in the case of lexicography, it may be a serious contribution to a theory of meaning to attempt to formalize and refine what the lexicographer's craft actually contains, quite independently of any practical computational or technological benefit."
"Johnson defined lexicographers in his own dictionary as "harmless drudges" and, until recently, computational lexicographers were thought of rather similarly by their colleagues in Artificial Intelligence (AI) and the more fashionable parts of computational linguistics: syntax, semantics, pragmatics and so on. That perception has changed in recent years, partly because of the renewed emphasis on scale in computational linguistics itself, and partly because of a return to empiricism in the field and the realization that, even though dictionaries are only fallible human products, they may be the place to look for large-scale facts about meaning."
"Meaning is a large and traditionally difficult topic but, since dictionaries claim to capture the meanings of words, we cannot ignore it. Moreover, much computational linguistics, and especially within the AI tradition, has claimed to be computing meanings in one way or another."
"One might say that lexicography is a "craft theory of meaning." [...] The function of this chapter is to look quickly at the systematic study of meaning, so as to see which parts have been taken up in craft and computational lexicography, and which parts rejected. We shall be as brief as possible, and will, at every stage, try to keep in mind that we are discussing meaning only in the context of symbolic expressions and manipulations that a lexicographer or computer might carry out."
"[...] It may seem churlish to criticize Searle, since he is so plainly right about the inadequacy of truth conditions as providing a basis for a theory of meaning. But the consequences of what he says are very important for our overall goal of understanding lexical sense meaning, so let us pursue the case a little, remembering that Searle's views on this issue have been influential (and are closely related to Lewis' notion of convention in meaning specification 1972) on contemporary lexicographers."
"[...] The first substantial claims concerning the use of first order logic as a general representational language for human knowledge are in McCarthy and Hayes (1969)."
"That work established a school in AI for which first order predicate calculus was the natural representational device for language and knowledge, and it functioned as a representational language, largely divorced from the claims noted above about the formal semantics of that representational language. A later representational development in AI was the advocacy of a more sophisticated representational system, from a logical point of view: not just first order calculus but the associated Tarskian model theory as well. This was forcefully argued by McDermott for some time (1976) though he later recanted (1987)."
"Quine investigated the possibility of an approach to identity of meaning, or sentence synonymy, via behavioral criteria like those in the scenarios of primitive anthropological linguistics of the American structuralist type. That methodology actually proceeded by a method of word substitution to build up alternative utterances, but Quine had declared that taboo, so he pursued the possibility of a purely behavioral definition of the similarity of sentences. It will surprise no one to know that he found that inadequate: the thesis of the "indeterminacy of translation" holds that for any apparently synonymous utterances there will always be a potential infinity of other assumptions (which can be equated to the possible beliefs of the utterer one is seeking to understand with the "translation") which will fit the evidence. Hence there can be no determinate translation."
"All this is true but quite vacuous: it interferes with the reality of translation, human or mechanical, no more than Hume's radical skepticism about causation interfered with physics [...]"
"In replying to Schank, Chomsky rejected all functional views of language and insisted on a language function, organ or (in Fodor's 1983 term) a module, and found it necessary to write explicitly that "language is not a task-oriented device" (Fodor 1980). This is an extraordinary remark, devoid of any general support in that paper or elsewhere in Chomsky's work, and all the more strange coming from one who has recently adopted the manner of speaking of the "language organ" and its similarity to other organs of the human body. For to speak of organs and their development, let alone of genetic endowment, as Chomsky also does, is to speak of their function."
"[...] The linguistic debate over whether or not "kill" can be represented in a system of primitives as CAUSE-to-DIE or CAUSE-to-BECOME-NOT-ALIVE (Morgan 1969; Chomsky 1972) has shown that there is no agreement there over whether or how such proposals can lead to any conceivable observations of real sentences that will settle the matter."
"The continuing appeal to the above pairs not being fully equivalent (Pulman 1983a), in the sense of biconditional entailments (true in all conceivable circumstances) has led to endless silliness: from Sampson's 1975 claim that words are "indivisible", so that no explanations of meaning can be given, let alone analytic definitions, and even to Fodor's use of the non-equivalence to found a theory of mental language rich with innate but indefinable concepts like "telephone" (1975)!"
"[...] The counter claim made in this chapter is [...] that every semantic primitive can appear as a surface word in a natural language. This does not require that the same "word" as it appears in the primitive and surface form must, in any definable sense "have the same meaning". It is simply a claim that the link between the two cannot be broken, even though it is often essential that the two usages do differ, as in the linguistic equation of "kill" with CAUSE to BECOME NOT ALIVE. The latter can only be a representation of "kill" if CAUSE is taken to mean "reasonably immediately cause". For, if it can cover causation at any distance in time, then the non-equivalence of the two forms is obvious, and your birth could certainly be said to kill you."
"Dictionaries at least until now are passive entities: they wait almost all their useful lives on shelves waiting to be all too briefly consulted. And then, often, they fail for one reason or another when pressed into service. Sometimes dictionaries fail because they say too little. Definition by synonym can create this failure, as in the (possibly apocryphal) "furze" and "gorse" example where, so the story goes, a dictionary defined "furze" with its synonym, "gorse". The puzzled reader who looked to "gorse" found it too was defined by synonym, "furze". [...]"
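The "furze"/"gorse" failure described above is a circular definition, which is easy to detect mechanically once a dictionary is machine-readable. The following is a minimal sketch, assuming a toy dictionary in which each headword is defined by exactly one synonym; the data and the function name `find_definition_cycle` are hypothetical and are not taken from the book.

```python
# Hypothetical toy dictionary: each headword maps to the single synonym
# used as its entire definition (illustrative data, not from a real dictionary).
SYNONYM_DEFS = {
    "furze": "gorse",
    "gorse": "furze",
    "whin": "gorse",
}


def find_definition_cycle(headword, defs):
    """Follow synonym-only definitions starting at headword.

    Returns the list of headwords forming the loop that is reached
    (first and last element identical), or None if the chain ends at
    a word that has no synonym-only definition.
    """
    path = []
    current = headword
    while current in defs:
        if current in path:
            # We have come back to a word already visited: report the loop.
            return path[path.index(current):] + [current]
        path.append(current)
        current = defs[current]
    return None


if __name__ == "__main__":
    print(find_definition_cycle("furze", SYNONYM_DEFS))
```

A reader starting at "furze" is led to "gorse" and straight back again, which is the loop the function reports.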
"Sometimes dictionaries fail because they say too much. One example of this is when dictionary makers, in their zeal to outdo the competition, seemingly create new sense distinctions for the sake of it. Lexicographers are judged, in part, according to the new words they discover and to the refinements they can make to existing words. This is undoubtedly good for the science of lexicography, but it does not guarantee that dictionaries will be better, and certainly does not make them easier to use. [...]"
"Compounding the too much/little failures of dictionaries is a problem of representational ergonomics. The sublanguage of dictionary definitions is inconsistent, and the vocabulary of that sublanguage is ill-defined. This is of little consequence when dictionaries are used as they are intended: as a resource for individuals who already understand the sublanguage of dictionary definitions. But when a dictionary is to become the foundation of a computational resource, our over-arching aim in this book, a well-defined system for the semantics of definition is crucial."
"Johnson claimed that, in writing his dictionary, he had become one of "those who toil at the lower employments of life". Until quite recently that same description was thought by many in computational linguistics to apply to their colleagues working in what we now know as computational lexicology. It seemed to have neither the grand claims of syntax, nor the mystery of semantics and pragmatics, nor even the dull but precise satisfactions of phonetics and morphology. Yet now the value and interest of that early lexicological work is clear."
"At that moment in SDC, then, in 1967, the three major ingredients of what is now a global style of work were actually all copresent: empirical computation on dictionaries locating defining primitives, linguistic work on lexicons and sense-extension, and AI-procedural work on lexical sense-extension from texts. But the three components did not in fact interact at all within that building: the time was not then ripe."
"It is a trivial fact that dictionaries, in spite of their special status as source books on the language in which they are (normally) written are, nevertheless, themselves only texts with all the fallibility that implies. Johnson finally withdrew from his own great work with what he termed "frigid tranquillity", [...] the earliest computational work was done on dictionaries like Merriam-Webster, large dictionaries compiled by what we might call traditional methods. More recently, there has been a group of new MRDs, largely from Britain and largely created for the TEFL (the teaching of English as a foreign language) market. They have been made subject to a much greater degree of internal formalization than previous dictionaries, and those formats have been computer checked. Unsurprisingly, they have become the basis for much computational work using MRDs. The principal dictionaries have been the Longman Dictionary of Contemporary English (LDOCE) (Procter 1978), the Collins COBUILD dictionary, and the Oxford Advanced Learner's Dictionary (OALD) (Hornby 1974), and we shall discuss the structure and production of the first two in some detail."
"From a practical point of view, the first citizen of natural language understanding is the word. Over the years, there have been countless individual efforts devoted to collecting small populations of these, which has resulted in a great diversity of practice. From the words, every system requires its own collection of lexical facts, and each collection is informed by its own (competing) theory of language and meaning."
"It is, by now, a well-documented fact (and it has been a popular trend in the recent literature to emphasize it), that there has been renewed emphasis on scale in computational linguistics, and a corresponding disillusionment with the so-called "toy systems" of linguistic and AI models. Recent years have also seen rising interest in the computational analysis of machine-readable dictionaries as a lexical resource for various purposes. These two trends are neither accidental nor independent of each other: an obvious place to look for large-scale linguistic information is in existing dictionaries."
"[...] We devote this chapter to one aspect of that work: constructing semantic networks from MRDs."
"Many researchers believe that in order to process language effectively, it is necessary to build a knowledge base which includes hierarchical information. It is not difficult to argue that the knowledge base should "know" facts like a poodle IS-A dog, a dog IS-A mammal, a mammal IS-A animal. Most researchers would further argue that in the knowledge base, poodles will have all the properties of dogs, and dogs will have all the properties of mammals, etc. and, although there are differing opinions about whether this knowledge is inferred at the time of processing or inferred earlier and stored in the knowledge base, this is nonetheless crucial information which must be available for language processing."
"Dictionaries provide a rich source from which we can extract this kind of information automatically on a large scale. [...]"
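The hierarchical knowledge described in the excerpts above (a poodle IS-A dog, a dog IS-A mammal, with properties passed down the chain) can be illustrated with a small sketch. It assumes a single-parent hierarchy and computes inherited properties at query time, which is only one of the two options the excerpt mentions. The property values and the function names `ancestors` and `inherited_properties` are hypothetical illustrations, not the authors' representation.

```python
# IS-A links from the example in the excerpt: child concept -> parent concept.
IS_A = {"poodle": "dog", "dog": "mammal", "mammal": "animal"}

# Hypothetical properties attached directly to each concept (illustrative only).
PROPERTIES = {
    "animal": {"alive"},
    "mammal": {"warm-blooded"},
    "dog": {"barks"},
    "poodle": {"curly-coated"},
}


def ancestors(concept, is_a):
    """Return the chain of IS-A parents of concept, nearest first."""
    chain = []
    seen = {concept}
    while concept in is_a:
        concept = is_a[concept]
        if concept in seen:
            # Guard against accidental loops in extracted data.
            break
        seen.add(concept)
        chain.append(concept)
    return chain


def inherited_properties(concept, is_a, properties):
    """Collect the concept's own properties plus those of all its ancestors."""
    result = set(properties.get(concept, set()))
    for parent in ancestors(concept, is_a):
        result |= properties.get(parent, set())
    return result


if __name__ == "__main__":
    print(ancestors("poodle", IS_A))
    print(sorted(inherited_properties("poodle", IS_A, PROPERTIES)))
```

Storing the inherited sets in advance instead would trade memory for lookup speed; the sketch leaves that choice open, as the excerpt does.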
"Lexical ambiguity pervades language and lexical ambiguity is pervasive in most forms of text, including dictionary definitions themselves. The words used in dictionary definitions of words, and their senses, are themselves lexically ambiguous. Ambiguity and arbitrariness are observable within a single dictionary when the sense-distinctions made for the definition of a word cannot be made to match with the uses of that word in the definitions of other words in the same dictionary."
"[...] One interesting point about bilingual dictionaries is their asymmetry. A word in one language often will not correspond to a single word in another, but rather to several words or a phrase. Similarly, a phrase in one language might best resolve to a single word in another, or to several different synonymous words. This creates difficulties in translation, and one of the themes of the work discussed in this chapter is how to deal with these problems. [...]"
"Machine translation (MT) is the oldest problem in natural language processing (NLP). The earliest systems implemented a doomed and straightforward word-for-word strategy. This failed, among other reasons, because so many words in a given source language have mappings to several words or phrases in a given target language. [...]"
"Lexicons for NLP systems traditionally have been hand-crafted, a process which consumes a large part of development time and effort in any research project. Many researchers believe that because of the effort needed to construct system lexicons, most NLP systems have only been able to process a trivial amount of text, and it is this "lexical bottleneck" that one aims to overcome with automatic or semi-automatic methods for constructing lexicons. The section below describes several systems which have made use of the information in a machine readable dictionary for the construction of their lexicons. We conclude with a description of a general tool for constructing lexicons in any formalism."
"In this chapter we describe two quite different kinds of future developments in lexically related areas:
1) New developments in the construction of MRDs from corpus and other sources, using partial computational methods, of the kind described in this book, and being in some cases dictionaries intended for CL purposes, in addition to the normal printed (or CD ROM) forms. [...]
2) New developments of an organizational type: a brief survey of cooperative efforts in the research, development, survey and distribution of lexical materials, all designed in their different ways to speed up the construction of lexical systems as the basis of NLP. [...]"
"The trends described briefly in the last chapter suggest that MRD-related work may be at a turning point. No one doubts the value of lexicons in NLP work except the most unrepentant "statistics alone" (Brown et al.) theorist. Nor does anyone question that lexicons should be derived by the most automatic possible means. [...]"
"What is occasionally overlooked [...] are the fundamental assumptions driving MRD work in the first place. These assumptions being, not coincidentally, the major themes of this book: sufficiency, extricability, and bootstrapping."
"Do MRDs contain sufficient information to do all we need done to texts? After many years of hard work, amid considerable progress, the jury is still out. Clearly, no dictionary will supply all the information required for every linguistic theory, but that was never a serious requirement. Nobody knows what "pragmatic force" is, to choose a random example, and one would not expect MRDs to provide lexical entries coded for that. The sufficiency question is a different one: Do MRDs provide enough to make MTDs that are useful, robust, and extensible?"
"The extricability question is very much related -- given there is information in there, can we get at it? And will it come along quietly, or will it be more trouble than it's worth? Experience tells us guarded optimism is appropriate here. If anything, this book is a monument to extricability, and its pages are filled with success stories of various sorts."
"The final component of our triad, bootstrapping, is the most interesting question, and the one that will take us into the 21st century. The NMSU lexical research programme has always held bootstrapping as a central tenet, and the most promising work in our world-wide survey has that flavor. The key to maintaining a lexical knowledge base is extensibility: the will to keep up with the languages of the world as they change day by day. Corpora keep us current, and the world is going towards corpora based studies of various sorts. Moving in that direction from an MRD foundation, as many seem to be, is a bootstrapping strategy in the finest tradition."