Knowledge-Based Systems

Knowledge-based systems (KBS) is a subfield of artificial intelligence concerned with creating programs that embody the reasoning expertise of human experts. In simplest terms, the overall intent is a form of intellectual cloning: find persons with a reasoning skill that is important and rare (e.g., an expert medical diagnostician, chess player, chemist), talk to them to determine what specialized knowledge they have and how they reason, then embody that knowledge and reasoning in a program.

The undertaking is distinguished from AI in general in several ways. First, there is no claim of breadth or generality; these systems are narrowly focused on specific domains of knowledge and cannot venture beyond them. Second, the systems are often motivated by a combination of science and application on real-world tasks; success is defined at least in part by accomplishing a useful level of performance on that task.

Third, and most significant, the systems are knowledge based in a technical sense: they base their performance on the accumulation of a substantial body of task-specific knowledge. AI has examined other notions of how intelligence might arise. GPS (Ernst 1969), for example, was inspired by the observation that people can make some progress on almost any problem we give them, and it depended for its power on a small set of very general problem-solving methods. It was in that sense methods based.

Knowledge-based systems, by contrast, work because of what they know and in this respect are similar to human experts. If we ask why experts are good at what they know, the answer will contain a variety of factors, but to some significant degree it is that they simply know more. They do not think faster, nor in fundamentally different ways (though practice effects may produce shortcuts); their expertise arises because they have substantially more knowledge about the task.

Human EXPERTISE is also apparently sharply domain-specific. Becoming a chess grandmaster does not also make one an expert physician; the skill of the master chemist does not extend to automobile repair. So it is with these programs.

The systems are at times referred to as expert systems and the terms are informally used interchangeably. "Expert system" is however better thought of as referring to the level of aspiration for the system. If it can perform as well as an expert, then it can play that role; this has been done, but is relatively rare. More commonly the systems perform as intelligent assistants, making recommendations for a human to review. Speaking of them as expert systems thus sets too narrow a perspective and restricts the utility of the technology.

Although knowledge-based systems have been built with a variety of representation technologies, two common architectural characteristics are the distinction between inference engine and knowledge base, and the use of declarative-style representations. The knowledge base is the system's repository of task-specific information; the inference engine is a (typically simple) interpreter whose job is to retrieve relevant knowledge from the knowledge base and apply it to the problem at hand. A declarative representation is one that aims to express knowledge in a form relatively independent of the way in which it is going to be used; predicate calculus is perhaps the premier example.

The separation of inference engine and knowledge base, along with the declarative character of the knowledge, facilitates the construction and maintenance of the knowledge base. Developing a knowledge-based system becomes a task of debugging knowledge rather than code; the question is, what should the program know, rather than what should it do.
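
The separation described above can be illustrated with a minimal sketch. It assumes a toy rule format (a set of premises paired with a conclusion) and a simple forward-chaining loop; the rule contents and the names KNOWLEDGE_BASE and forward_chain are invented for illustration and are not drawn from any of the systems discussed here.

```python
# Knowledge base: declarative if/then rules stored purely as data.
# Each rule is (set of premises, conclusion). The rules are hypothetical.
KNOWLEDGE_BASE = [
    ({"has_fever", "has_cough"}, "possible_respiratory_infection"),
    ({"possible_respiratory_infection", "chest_xray_abnormal"}, "suspect_pneumonia"),
]


def forward_chain(rules, facts):
    """Inference engine: fire any rule whose premises all hold,
    repeating until no new conclusion can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain(
    KNOWLEDGE_BASE, {"has_fever", "has_cough", "chest_xray_abnormal"}
)
print(sorted(derived))
```

The interpreter contains no task knowledge at all. Changing what the program concludes means editing the list of rules, not the loop that applies them, which is the sense in which development becomes a matter of debugging knowledge rather than code.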

Three systems are frequently cited as foundational in this area: SIN (Moses 1967) and its descendant MACSYMA (Moses 1971), DENDRAL (Feigenbaum et al. 1971), and MYCIN (Davis 1977). SIN's task domain was symbolic integration. Although its representation was more procedural than would later become the norm, the program was one of the first embodiments of the hypothesis that problem solving power could be based on knowledge. It stands in stark contrast to SAINT (Slagle 1963), the first of the symbolic integration programs, which was intended by its author to work by tree search, that is, methodically exploring the space of possible problem transformations. SIN, on the other hand, claimed that its goal was the avoidance of search, and that search was to be avoided by knowing what to do. It sought to bring to bear all of the cleverness that good integrators used, and attempted to do so using its most powerful techniques first. Only if these failed would it eventually fall back on search.

DENDRAL's task was analytic chemistry: determining the structure of a chemical compound from a variety of physical data about the compound, particularly its mass spectrum (the way in which the compound fragments when subjected to ionic bombardment). DENDRAL worked by generate-and-test, generating possible structures and testing them (in simulation) to see whether they would produce the mass spectrum observed. The combinatorics of the problem quickly become unmanageable: even relatively modest-sized compounds have tens of millions of isomers (different ways in which the same set of atoms can be assembled). Hence in order to work at all, DENDRAL's generator had to be informed. By working with the expert chemists, DENDRAL's authors were able to determine what clues chemists found in the spectra that permitted them to focus their search on particular subclasses of molecules. Hence DENDRAL's key knowledge was about the "fingerprints" that different classes of molecules would leave in a mass spectrum; without this it would have floundered among the millions of possible structures.
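
The effect of an informed generator can be shown on a deliberately tiny stand-in problem. The sketch below is not DENDRAL's chemistry: it assumes a "molecule" is simply an ordered chain of fragments with distinct, hypothetical masses, and that the "spectrum" is the set of masses of every prefix of the chain. The single piece of knowledge given to the informed generator (the lightest peak must be the first fragment) plays the role of a fingerprint read from the data.

```python
from itertools import permutations

FRAGMENTS = (15, 14, 28, 16, 17)  # hypothetical, distinct fragment masses


def simulate_spectrum(chain):
    """Toy simulation: the peaks are the cumulative masses of each prefix."""
    total, peaks = 0, set()
    for mass in chain:
        total += mass
        peaks.add(total)
    return frozenset(peaks)


def blind_generator(fragments):
    """Enumerate every ordering, using no knowledge of the observed data."""
    yield from permutations(fragments)


def informed_generator(fragments, observed):
    """Use a clue from the data: the lightest peak is the first fragment."""
    first = min(observed)
    if first not in fragments:
        return
    rest = list(fragments)
    rest.remove(first)
    for tail in permutations(rest):
        yield (first,) + tail


def generate_and_test(candidates, observed):
    """Test each candidate by simulation; report matches and candidates tried."""
    tried, matches = 0, []
    for candidate in candidates:
        tried += 1
        if simulate_spectrum(candidate) == observed:
            matches.append(candidate)
    return matches, tried


hidden = (14, 28, 15, 17, 16)          # the structure to be recovered
observed = simulate_spectrum(hidden)   # what the instrument would report

blind_matches, blind_tried = generate_and_test(blind_generator(FRAGMENTS), observed)
informed_matches, informed_tried = generate_and_test(
    informed_generator(FRAGMENTS, observed), observed
)
print(blind_tried, informed_tried, informed_matches)
```

Both runs recover the same structure, but the informed generator proposes only the orderings consistent with the clue. With five fragments the saving is modest; the point is that each such constraint removes a whole subclass of candidates before any of them is tested.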

MYCIN's task was diagnosis and therapy selection for a variety of infectious diseases. It was the first system to have all of the hallmarks that came to be associated with knowledge-based systems and as such came to be regarded as a prototypical example. Its knowledge was expressed as a set of some 450 relatively independent if/then rules; its inference engine used a simple backward-chaining control structure; and it was capable of explaining its recommendations (by showing the user a recap of the rules that had been used).
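
A backward-chaining interpreter with a rule recap can be sketched in a few lines. The rules below are invented for illustration and are not MYCIN's actual rules; the sketch also assumes the rule set is acyclic and omits the handling of uncertainty, so it shows only the control structure and the explanation mechanism. The names RULES, prove, and explain are hypothetical.

```python
# Hypothetical rules: name -> (premises, conclusion).
RULES = {
    "R1": (("stain_gram_negative", "shape_rod", "grows_without_oxygen"),
           "organism_class_a"),
    "R2": (("organism_class_a", "patient_not_allergic"),
           "recommend_therapy_x"),
}


def prove(goal, facts, rules):
    """Backward chaining: to establish a goal, find a rule that concludes it
    and recursively establish that rule's premises.
    Returns the list of rule names used, or None if the goal cannot be shown."""
    if goal in facts:
        return []
    for name, (premises, conclusion) in rules.items():
        if conclusion != goal:
            continue
        used = []
        for premise in premises:
            sub = prove(premise, facts, rules)
            if sub is None:
                break  # this rule fails; try the next rule for the same goal
            used.extend(sub)
        else:
            return used + [name]
    return None


def explain(used, rules):
    """Recap the rules that supported the conclusion, in order of use."""
    return [
        f"{name}: IF {' AND '.join(rules[name][0])} THEN {rules[name][1]}"
        for name in used
    ]


facts = {"stain_gram_negative", "shape_rod", "grows_without_oxygen",
         "patient_not_allergic"}
used = prove("recommend_therapy_x", facts, RULES)
for line in explain(used, RULES):
    print(line)
```

Because each rule is a self-contained statement, the explanation is simply a replay of the rules that fired, which is what makes this style of recap cheap to provide.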

The late 1970s and early 1980s saw the construction of a wide variety of knowledge-based systems for tasks as diverse as diagnosis, configuration, design, and tutoring. The decade of the 1980s also saw an enormous growth in industrial interest in the technology. One of the most successful and widely known industrial systems was XCON (for expert configurer), a program used by Digital Equipment Corporation (DEC) to handle the wide variation in the ways a DEC VAX computer could be configured. The system's task was to ensure that an order had all the required components and no superfluous ones. This was a knowledge-intensive task: factors to be considered included the physical layout of components in the computer cabinet, the electrical requirements of each component, the need to establish interrupt priorities on the bus, and others. Tests of XCON demonstrated that its error rate on the task was below 3 percent, which compared quite favorably to human error rates in the range of 15 percent (Barker and O'Connor 1989). Digital has claimed that over the decade of the 1980s XCON and a variety of other knowledge-based systems saved it over one billion dollars. Other commercial knowledge-based systems of pragmatic consequence were constructed by American Express, Manufacturers Hanover, DuPont, Schlumberger, and others.

The mid-1980s also saw the development of BAYESIAN NETWORKS (Pearl 1986), a form of KNOWLEDGE REPRESENTATION grounded in probability theory, that has recently seen wide use in developing a number of successful expert systems (see, e.g., Heckerman, Wellman, and Mamdani 1995).
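
The smallest possible network of this kind has two nodes, a cause and an observation that depends on it. The sketch below assumes such a network (Disease influencing Test) with invented probabilities, and computes the belief in the cause after the observation by direct application of Bayes' rule; practical networks have many nodes and rely on propagation algorithms rather than this hand calculation.

```python
# Hypothetical numbers for a two-node network: Disease -> Test.
P_DISEASE = 0.01             # prior probability of the disease
P_POS_GIVEN_DISEASE = 0.90   # probability of a positive test if diseased
P_POS_GIVEN_HEALTHY = 0.05   # probability of a positive test if healthy


def posterior_given_positive(prior, p_pos_if_true, p_pos_if_false):
    """Bayes' rule: P(disease | positive test)."""
    joint_true = prior * p_pos_if_true
    joint_false = (1.0 - prior) * p_pos_if_false
    return joint_true / (joint_true + joint_false)


posterior = posterior_given_positive(
    P_DISEASE, P_POS_GIVEN_DISEASE, P_POS_GIVEN_HEALTHY
)
print(round(posterior, 3))
```

Here the knowledge takes the form of the network's structure and its conditional probabilities rather than if/then rules, but it remains explicit, task-specific, and separate from the procedure that reasons with it.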

Work in knowledge-based systems is rooted in observations about the nature of human intelligence, viz., the observation that human expertise of the sort involved in explicit reasoning is typically domain-specific and dependent on a large store of task-specific knowledge. Use of simple if/then rules (production rules) is drawn directly from the early work of Newell and Simon (1972) that used production rules to model human PROBLEM SOLVING.

The conception of knowledge-based systems as attempts to clone human reasoning also means that these systems often produce detailed models of someone's mental conception of a problem and the knowledge needed to solve it. Where other AI technologies (e.g., predicate calculus) are more focused on finding a way to achieve intelligence without necessarily modeling human reasoning patterns, knowledge-based systems seek explicitly to capture what people know and how they use it. One consequence is that the effort of constructing a system often produces as a side effect a more complete and explicit model of the expert's conception of the task than had previously been available. The system's knowledge base thus has independent value, apart from the program, as an expression of one expert's mental model of the task and the relevant knowledge.

MYCIN and other programs also provided one of the early and clear illustrations of Newell's concept of the knowledge level (Newell 1982), that is, the level of abstraction of a system (whether human or machine) at which one can talk coherently about what it knows, quite apart from the details of how the knowledge is represented and used.

These systems also offered some of the earliest evidence that knowledge could obviate the need for search, with DENDRAL in particular offering a compelling case study of the power of domain-specific knowledge to avoid search in a space that can quickly grow to hundreds of millions of choices.

-- Randall Davis

References

Barker, V., and D. O'Connor. (1989). Expert systems for configuration at Digital: XCON and beyond. Communications of the ACM March: 298-318.

Davis, R. (1977). Production rules as a representation for a knowledge-based consultation program. Artificial Intelligence 8:15-45.

Ernst, G. W. (1969). GPS: A Case Study in Generality and Problem Solving. New York: Academic Press.

Feigenbaum, E. A., B. G. Buchanan, and J. Lederberg. (1971). On generality and problem solving: A case study using the DENDRAL program. In B. Meltzer and D. Michie, Eds., Machine Intelligence 6, pp. 165-189.

Heckerman, D., M. Wellman, and A. Mamdani, Eds. (1995). Real world applications of Bayesian networks. Communications of the ACM, vol. 38, no. 3.

Moses, J. (1967). Symbolic Integration. Ph.D. diss., Massachusetts Institute of Technology.

Moses, J. (1971). Symbolic integration: The stormy decade. Communications of the ACM 14:548-560.

Newell, A. (1982). The knowledge level. Artificial Intelligence 18(1):87-127.

Newell, A., and H. A. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.

Pearl, J. (1986). Fusion, propagation, and structuring in belief networks. Artificial Intelligence 29(3):241-288.

Slagle, J. (1963). A heuristic program that solves symbolic integration problems in freshman calculus. In E. A. Feigenbaum and J. Feldman, Eds., Computers and Thought, pp. 191-206.
