Uncertainty

Almost all information is subject to uncertainty. Uncertainty may arise from inaccurate or incomplete information (e.g., how large are the current U.S. petroleum reserves?), from linguistic imprecision (what exactly do we mean by "petroleum reserves"?), and from disagreement between information sources. We may even be uncertain about our degree of uncertainty. Because uncertainty is, in this sense, the dual of information, representing uncertainty is intrinsic to representing information.

Many schemes have been developed to formalize the notion of uncertainty and to mechanize reasoning under uncertainty within KNOWLEDGE-BASED SYSTEMS. Probability is, by far, the best-known and most widely used formalism. However, apparent limitations and difficulties in applying probability have spawned a rich variety of alternatives. These include heuristic approximations to probability used in rule-based expert systems, such as certainty factors (Clancey and Shortliffe 1984); fuzzy set theory and FUZZY LOGIC (Zadeh 1984); interval representations, such as upper probabilities and Dempster-Shafer belief functions (Shafer 1976); NONMONOTONIC LOGICS and default reasoning (Ginsberg 1987); and qualitative versions of probability, such as the Spohn calculus, or kappa-calculus. There has been controversy about the assumptions, appropriateness, and practicality of these various schemes, particularly about their use in representing and reasoning about uncertainty in databases and knowledge-based systems. It is useful to consider a variety of criteria, both theoretical and pragmatic, in comparing these schemes.

The first criterion concerns epistemology: What kinds of uncertainty does each scheme represent? Like most quantitative representations of uncertainty, probability expresses degree of belief that a proposition is true, or that an event will happen, by a cardinal number between 0 and 1. Fuzzy set theory and fuzzy logic also represent degrees of belief or truth by a number between 0 and 1. Upper probabilities and Dempster-Shafer belief functions represent degrees of belief by a range of numbers between 0 and 1, allowing the expression of ambiguity or ignorance as the extent of the range. Qualitative representations of belief, such as nonmonotonic logic (Ginsberg 1987) and the kappa-calculus, often represent degrees of belief on an ordinal scale.
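To make the interval idea concrete, the following minimal sketch (in Python; the two-element weather frame and the mass assignments are invented purely for illustration) computes a Dempster-Shafer belief interval [Bel(A), Pl(A)], with the mass left on the whole frame expressing ignorance:

```python
# A sketch of a Dempster-Shafer belief interval. The frame of
# discernment {rain, dry} and the mass assignments are invented.
def belief(mass, hypothesis):
    """Bel(A): total mass committed to subsets of A."""
    return sum(m for s, m in mass.items() if s <= hypothesis)

def plausibility(mass, hypothesis):
    """Pl(A): total mass on sets consistent with A (nonempty intersection)."""
    return sum(m for s, m in mass.items() if s & hypothesis)

# 0.3 of the mass is assigned to the whole frame: unresolved ignorance.
mass = {frozenset({"rain"}): 0.5,
        frozenset({"dry"}): 0.2,
        frozenset({"rain", "dry"}): 0.3}

A = frozenset({"rain"})
print(belief(mass, A), plausibility(mass, A))  # 0.5 0.8 -> interval [0.5, 0.8]
```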

In the frequentist view, the probability of an event is the frequency of the event occurring in a large number of similar trials. For example, the probability of heads for a bent coin is the frequency of heads from a large number of tosses of that coin. Unfortunately, for most events, it is unclear what population of similar trials one should use. When estimating the probability that a chemical is carcinogenic, should you compare it with all known chemicals, only those tested for carcinogenicity, or only those chemicals with a similar molecular structure? In practical reasoning, therefore, the personalist (also known as Bayesian or subjective) interpretation is often more useful: the probability of an event is a person's degree of belief in that event, given all the information currently known to that person. Different people may reasonably hold different probabilities, depending on what information they have.
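The contrast can be made concrete with the bent coin. A frequentist estimate is simply the observed frequency, while a personalist estimate combines a prior belief with the same data; the sketch below assumes ten illustrative tosses and a uniform Beta(1, 1) prior (both assumptions, not values from the text):

```python
# The bent coin two ways; data and prior are illustrative assumptions.
heads, tosses = 7, 10

freq_estimate = heads / tosses                 # frequentist: 0.70

alpha, beta = 1, 1                             # uniform Beta(1, 1) prior
bayes_estimate = (alpha + heads) / (alpha + beta + tosses)  # ~0.67
print(freq_estimate, bayes_estimate)
```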

It is important to be able to represent uncertain relationships between propositions as well as degrees of belief. A conditional probability distribution for proposition a given b, P(a | b), expresses the belief in a given the state of b. Belief networks, also known as BAYESIAN NETWORKS (Pearl 1988), provide an intuitive graphical way to represent qualitative uncertain knowledge about conditional dependence and independence among propositions.
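As a hedged illustration (a two-node rain/wet-grass network with invented numbers, not an example from the sources cited), the network factorization P(r, w) = P(r) P(w | r) lets one compute marginal and posterior beliefs directly:

```python
# A two-node belief network Rain -> WetGrass with invented numbers.
P_rain = {True: 0.2, False: 0.8}
P_wet_given_rain = {True: 0.9, False: 0.1}     # P(wet | rain state)

# Marginalization: P(wet) = sum over r of P(r) P(wet | r)
P_wet = sum(P_rain[r] * P_wet_given_rain[r] for r in (True, False))

# Bayes' rule: P(rain | wet) = P(rain) P(wet | rain) / P(wet)
P_rain_given_wet = P_rain[True] * P_wet_given_rain[True] / P_wet
print(P_wet, P_rain_given_wet)                 # 0.26, ~0.69
```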

A common source of uncertainty is linguistic imprecision. We find it useful to say that a river is "wide," without explaining exactly what we mean. Even from the personalist view, an event or quantity must be well specified for a meaningful probability distribution to be assessable. It is therefore important to eliminate linguistic imprecision by developing unambiguous definitions of quantities as a first step to encoding uncertain knowledge as probabilities. Instead of asking for the probability that a river is "wide," one might ask for the probability that it is wider than, say, 50 meters. Without such precision, vagueness about meaning is confounded with uncertainty about value.

Fuzzy set theorists argue that, because imprecision is intrinsic in human language, we should represent this imprecision explicitly in any formalism. For example, the linguistic variable "wide river" may be represented by a fuzzy set with a membership function that associates degrees of membership with different widths. A river 5 meters wide has membership 0 in "wide river"; a river 100 meters wide has membership 1; and intermediate widths have intermediate degrees of membership.
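A minimal sketch of such a membership function, assuming a simple linear ramp between the two widths given in the text (the ramp shape itself is an assumption; many shapes are used in practice):

```python
# Membership function for "wide river": 0 at 5 m, 1 at 100 m, with an
# assumed linear ramp in between.
def wide_river(width_m: float) -> float:
    if width_m <= 5:
        return 0.0
    if width_m >= 100:
        return 1.0
    return (width_m - 5) / (100 - 5)

print(wide_river(5), wide_river(50), wide_river(100))  # 0.0 ~0.47 1.0
```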

A second criterion for comparison of schemes is descriptive validity: Does the scheme provide a good model of human reasoning under uncertainty? There has been extensive experimental research by behavioral decision theorists on human judgment under uncertainty, which documents systematic divergences between human judgment and the norms of probabilistic inference (Kahneman, Slovic, and Tversky 1982; see PROBABILISTIC REASONING, TVERSKY, and JUDGMENT HEURISTICS). People use mental heuristics that often provide a qualitative approximation to probabilistic reasoning, as in explaining away (Wellman and Henrion 1993), but they also exhibit systematic biases. There has been little experimental research on the descriptive validity of nonprobabilistic schemes, although there is little reason to expect that any simple mathematical scheme will fare well as a descriptive model.
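Explaining away itself is worth seeing numerically. The following hedged sketch uses the standard burglary-alarm example of a network with two independent causes (the conditional probabilities are invented): observing the alarm raises belief in burglary, but additionally learning of an earthquake "explains the alarm away" and lowers that belief again.

```python
# Explaining away in a burglary-alarm network. Burglary and Earthquake
# are independent causes of Alarm; all probabilities are invented.
from itertools import product

P_B, P_E = 0.01, 0.02
P_A = {(True, True): 0.95, (True, False): 0.90,
       (False, True): 0.30, (False, False): 0.01}   # P(alarm | B, E)

def joint(b, e, a):
    p_a = P_A[(b, e)]
    return ((P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
            * (p_a if a else 1 - p_a))

# P(burglary | alarm): sum out the earthquake
p_ba = sum(joint(True, e, True) for e in (True, False))
p_a = sum(joint(b, e, True) for b, e in product((True, False), repeat=2))
print(p_ba / p_a)                        # ~0.37

# P(burglary | alarm, earthquake): the earthquake explains the alarm away
print(joint(True, True, True) /
      sum(joint(b, True, True) for b in (True, False)))  # ~0.03
```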

Poor descriptive validity is a deficiency in a psychological theory, but not necessarily in a scheme for automated reasoning. Indeed, if an automated scheme exactly duplicated human commonsense reasoning, there would be less justification for building it. If we believe that a formal scheme is based on axioms of rationality, as is claimed for probabilistic reasoning, it may be preferable to our flawed human reasoning for complicated problems in defined domains. That is why we rely on statistical analysis rather than informal reasoning in science.

A third criterion is ease of knowledge engineering: Is it practical to express human knowledge in this formalism? Knowledge engineers use a variety of tools for eliciting numerical probabilities (Morgan and Henrion 1990) and for structuring complex uncertain beliefs using belief networks (Pearl 1988) and influence diagrams (Howard and Matheson 1984). Fuzzy logic provides a variety of ways of eliciting fuzzy variables. Nonmonotonic logic and other qualitative representations appear to make it easy for people to express uncertain knowledge, although there have been few practical large-scale systems.

A fourth criterion is ease of data mining: Is it practical to extract and represent knowledge from data using this scheme? Increasingly, knowledge-based systems are supplementing or replacing knowledge encoded from human experts with knowledge extracted from large databases using a wide variety of statistical techniques. The probabilistic basis of statistics, and Bayesian techniques for combining judgmental and data-based knowledge, give probabilistic techniques a major advantage on this criterion.

A fifth criterion concerns the tractability of computation with large knowledge bases. Modular rule-based schemes, such as certainty factors and fuzzy logic rules, are efficient, with linear computation time. Exact probabilistic inference in belief networks is NP-hard, that is, potentially intractable for very large belief networks. However, effective approximate methods exist, and probabilistic inference is practical for many real knowledge bases. Higher-order representations, such as Dempster-Shafer belief functions and interval probabilities, are intrinsically more complex computationally than conventional probabilistic representations.
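As a hedged illustration of one simple approximate method, rejection sampling estimates a posterior by simulating the network forward and keeping only samples consistent with the evidence (the sketch below reuses the invented burglary-alarm numbers from the explaining-away example):

```python
# Rejection sampling for P(burglary | alarm); all numbers are invented.
import random

P_B, P_E = 0.01, 0.02
P_A = {(True, True): 0.95, (True, False): 0.90,
       (False, True): 0.30, (False, False): 0.01}   # P(alarm | B, E)

def sample():
    b = random.random() < P_B
    e = random.random() < P_E
    a = random.random() < P_A[(b, e)]
    return b, a

hits = kept = 0
for _ in range(200_000):
    b, a = sample()
    if a:                     # keep only samples matching the evidence
        kept += 1
        hits += b
print(hits / kept)            # converges to the exact answer, ~0.37
```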

A sixth criterion is the relation of the scheme to making decisions. In practice, uncertain inference becomes valuable primarily when it is used as the basis for important decisions. Subjective probability is embedded in decision theory and UTILITY THEORY, developed by VON NEUMANN and Morgenstern to provide a theory for RATIONAL DECISION MAKING. Utility theory provides a way to express attitudes toward risk, especially risk aversion. There have been several attempts to develop fuzzy utilities; however, decision theories for nonprobabilistic representations are less developed, and this is an important area for research.
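A minimal sketch of the expected-utility calculation, assuming an illustrative logarithmic utility (a common textbook choice for modeling risk aversion, not one prescribed by the sources cited): a risk-averse agent prefers a sure $40 to a 50/50 gamble between $100 and $1, even though the gamble has the higher expected value.

```python
# Expected utility with risk aversion. The logarithmic utility is an
# illustrative assumption; any concave utility gives the same effect.
import math

def expected_utility(lottery, u):
    return sum(p * u(x) for p, x in lottery)

gamble = [(0.5, 100.0), (0.5, 1.0)]    # expected value 50.5
sure_thing = [(1.0, 40.0)]

print(expected_utility(gamble, math.log))      # ~2.30
print(expected_utility(sure_thing, math.log))  # ~3.69 -> preferred
```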

Humans or machines can rarely be certain about anything, and so practical reasoning requires some scheme for representing uncertainty. We have a rich array of formalisms. Recent developments in knowledge engineering and in inference methods for probabilistic representations, notably Bayesian belief networks and influence diagrams, have resolved many of the apparent difficulties with probability and have led to a resurgence of research on probabilistic methods, with many real-world applications (Henrion, Breese, and Horvitz 1991). Fuzzy logic has also had notable success in applications to approximate control systems. However, each method has its merits and may be suitable for particular applications.

-- Max Henrion

References

Clancey, W. J., and E. H. Shortliffe, Eds. (1984). Readings in Medical Artificial Intelligence: The First Decade. Reading, MA: Addison-Wesley.

Ginsberg, M. L., Ed. (1987). Readings in Nonmonotonic Reasoning. Los Altos, CA: Morgan Kaufmann.

Henrion, M., J. S. Breese, and E. J. Horvitz. (1991). Decision analysis and expert systems. AI Magazine 12(4):64-91.

Horvitz, E. J., J. S. Breese, and M. Henrion. (1988). Decision theory in expert systems and artificial intelligence. International Journal of Approximate Reasoning 2:247-302.

Howard, R. A., and J. E. Matheson. (1981). Influence diagrams. In Howard and Matheson (1984), pp. 719-762.

Howard, R. A., and J. E. Matheson, Eds. (1984). Readings in the Principles and Applications of Decision Analysis. Menlo Park, CA: Strategic Decisions Group.

Kahneman, D., P. Slovic, and A. Tversky. (1982). Judgment under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.

Morgan, M. G., and M. Henrion. (1990). Uncertainty: A Guide to the Treatment of Uncertainty in Quantitative Policy and Risk Analysis. New York: Cambridge University Press.

Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann.

Raiffa, H. (1968). Decision Analysis: Introductory Lectures on Choice Under Uncertainty. Reading, MA: Addison-Wesley.

Shafer, G. (1976). A Mathematical Theory of Evidence. Princeton, NJ: Princeton University Press.

Wellman, M., and M. Henrion. (1993). Explaining "explaining away". IEEE Transactions on Pattern Analysis and Machine Intelligence 15(3):287-291.

Zadeh, L. A. (1984). The role of fuzzy logic in the management of uncertainty in expert systems. Fuzzy Sets and Systems 11:199-227.

Further Readings

Heckerman, D. E., and E. J. Horvitz. (1988). The myth of modularity. In Lemmer and Kanal (1988), pp. 23-34.

Henrion, M. (1987). Uncertainty in artificial intelligence: Is probability epistemologically and heuristically adequate? In J. Mumpower et al., Eds., Expert Judgment and Expert Systems. NATO ASI Series F, vol. 35. Berlin: Springer, pp. 105-130.

Howard, R. A. (1988). Uncertainty about probability: A decision analysis perspective. Risk Analysis 8(1):91-98.

Kanal, L. N., and J. Lemmer, Eds. (1986). Uncertainty in Artificial Intelligence. Vol. 4, Machine Intelligence and Pattern Recognition. Amsterdam: Elsevier.

Kaufmann, A. (1975). Introduction to the Theory of Fuzzy Subsets. New York: Academic Press.

Lemmer, J. F., and L. N. Kanal, Eds. (1988). Uncertainty in Artificial Intelligence 2. Vol. 5, Machine Intelligence and Pattern Recognition. Amsterdam: Elsevier.

Shachter, R. D., and D. Heckerman. (1987). Thinking backwards for knowledge acquisition. AI Magazine 8:55-62.

Shortliffe, E. H., and B. G. Buchanan. (1975). A model of inexact reasoning in medicine. Math. Biosciences 23:351-379.

Szolovits, P., and S. G. Pauker. (1984). Categorical and probabilistic reasoning in medical diagnosis. In Clancey and Shortliffe (1984), pp. 210-240.

Wallsten, T. S., D. V. Budescu, A. Rapoport, R. Zwick, and B. Forsyth. (1986). Measuring the vague meanings of probability terms. Journal of Experimental Psychology: General 115(4):348-365.