Cognitive Architecture

Cognitive architecture refers to the design and organization of the mind. Theories of cognitive architecture strive to provide an exhaustive survey of cognitive systems, a description of the functions and capacities of each, and a blueprint to integrate the systems. Such theories are designed around a small set of principles of operation. Theories of cognitive architecture can be contrasted with other kinds of cognitive theories in providing a set of principles for constructing cognitive models, rather than a set of hypotheses to be empirically tested.

Theories of cognitive architecture can be roughly divided according to two legacies: those motivated by the digital computer and those based on an associative architecture. The currency of the first kind of architecture is information in the form of symbols; the currency of the second kind is activation that flows through a network of associative links. The most common digital computer architecture is called the VON NEUMANN architecture in recognition of the contributions of the mathematician John von Neumann to its development. The key idea, the stored-program technique, allows program and data to be stored together. The von Neumann architecture consists of a central processing unit, a memory unit, and input and output units. Information is input, stored, and transformed algorithmically to derive an output. The critical role played by this framework in the development of modern technology helped make the COMPUTATIONAL THEORY OF MIND seem viable. The framework has spawned three classes of theories of cognitive architecture, each encompassing several generations. The three classes are not mutually exclusive; they should be understood as taking different perspectives on cognitive organization that result in different performance models.
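
The stored-program idea can be illustrated with a toy machine in which instructions and data occupy the same memory. What follows is a minimal sketch under stated assumptions, not a model of any real processor: the instruction names (LOAD, ADD, STORE, HALT), the memory layout, and the function `run` are all hypothetical and chosen only for illustration.

```python
def run(memory):
    """Execute a toy stored-program machine (hypothetical instruction set).

    `memory` is a single list holding both instructions (tuples) and
    data (integers); sharing one store is the stored-program idea.
    """
    acc = 0   # accumulator inside the "central processing unit"
    pc = 0    # program counter: address of the next instruction
    while True:
        op, arg = memory[pc]      # fetch the instruction from memory
        pc += 1
        if op == "LOAD":          # copy a memory cell into the accumulator
            acc = memory[arg]
        elif op == "ADD":         # add a memory cell to the accumulator
            acc += memory[arg]
        elif op == "STORE":       # write the accumulator back to memory
            memory[arg] = acc
        elif op == "HALT":        # stop and hand back the memory contents
            return memory
        else:
            raise ValueError(f"unknown instruction: {op}")


# Cells 0-3 hold the program; cells 4-6 hold data in the same list.
program = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STORE", 6),
    ("HALT", 0),
    2,   # cell 4: first input
    3,   # cell 5: second input
    0,   # cell 6: output
]
result = run(list(program))
```

After the run, cell 6 of `result` holds the sum of cells 4 and 5. Information is input, stored, and transformed step by step to derive an output, which is the sense in which the text calls the processing algorithmic.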

The first class, and the original architecture of this type, is the PRODUCTION SYSTEM. In this view, the mind consists of a working memory, a large set of production rules, and a set of precedence rules determining the order of firing of production rules. A production rule is a condition-action pair specifying actions to perform if certain conditions are met. The first general theory of this type was proposed by NEWELL, Simon, and Shaw (1958) and was called the General Problem Solver (GPS). The idea was that a production system incorporating a few simple heuristics could solve difficult problems in the same way that humans did. A descendant of this approach, SOAR (Newell 1990), elaborates the production system architecture by adding mechanisms for making decisions, for recursive application of operators to a hierarchy of goals and subgoals, and for learning of productions. The architecture has been applied to help understand a range of human performance, from simple stimulus-response tasks to typing, syllogistic reasoning, and more.
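
The three components named above (working memory, production rules, precedence rules) can be sketched in a few lines. This is an illustrative toy, not GPS or SOAR: the function `run_production_system`, the tea-making rules, and the use of list order as the precedence rule are all assumptions introduced here.

```python
def run_production_system(working_memory, rules, max_cycles=100):
    """Repeatedly fire the first rule whose condition holds.

    `rules` is an ordered list of (name, condition, action) triples.
    List order stands in for the precedence rules: when several
    conditions match, the earliest rule in the list fires.
    """
    wm = set(working_memory)
    trace = []                      # names of the rules that fired, in order
    for _ in range(max_cycles):
        for name, condition, action in rules:
            if condition(wm):       # the condition tests working memory
                action(wm)          # the action modifies working memory
                trace.append(name)
                break
        else:
            break                   # no rule matched: the system halts
    return wm, trace


# Hypothetical condition-action pairs for a trivial task.
rules = [
    ("boil-water",
     lambda wm: "goal: tea" in wm and "hot water" not in wm,
     lambda wm: wm.add("hot water")),
    ("steep",
     lambda wm: "hot water" in wm and "tea" not in wm,
     lambda wm: wm.add("tea")),
    ("finish",
     lambda wm: "tea" in wm and "goal: tea" in wm,
     lambda wm: wm.discard("goal: tea")),
]

final_memory, trace = run_production_system({"goal: tea"}, rules)
```

Each cycle matches conditions against working memory, fires one rule, and starts over. Control is not written anywhere as a program; it emerges from which conditions happen to be satisfied at each cycle.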

A second class of von Neumann-inspired cognitive architecture is the information processing theory. Unlike production systems, which posit a particular language of symbolic transformation, information processing theories posit a sequence of processing stages from input through encoding, memory storage and retrieval, to output. All such theories assume the critical components of a von Neumann architecture: a central executive to control the flow of information, one or more memories to retain information, sensory devices to input information, and an output device. The critical issues for such theories concern the nature and time course of processing at each stage. An early example of such a theory is Broadbent's (1958) model of ATTENTION, the imprint of which can be found on the "modal" information processing theory, whose central distinction is between short-term and long-term memory (e.g., Atkinson and Shiffrin 1968), and on later models of WORKING MEMORY.

The digital computer also inspired a class of cognitive architecture that emphasizes veridical representation of the structure of human knowledge. The computer model distinguishes program from data, and so the computer modeler has the option of putting most of the structure to be represented in the computer program or putting it in the data that the program operates on. Representational models do the latter; they use fairly sophisticated data structures to model organized knowledge. Theories of this type posit two memory stores: a working memory and a memory for structured data. Various kinds of structured data formats have been proposed, including frames (Minsky 1975), SCHEMATA (Rumelhart and Ortony 1977), and scripts (Schank and Abelson 1977), each specializing in the representation of different aspects of the world (objects, events, and action sequences, respectively). What the formats have in common is that they (i) represent "default" relations that normally hold, though not always; (ii) have variables, so that they can represent relations between abstract classes and not merely individuals; (iii) can embed one another (hierarchical organization); and (iv) are able to represent the world at multiple levels of abstraction.
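
Several of the shared properties listed above can be shown with a small data structure. The sketch below is hypothetical and does not reproduce the formalisms of Minsky, Rumelhart and Ortony, or Schank and Abelson; the class `Frame`, its `get` method, and the example slots are assumptions made for illustration.

```python
class Frame:
    """A hypothetical frame: named slots plus an optional parent frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent        # a more abstract frame, if any
        self.slots = dict(slots)    # slot name -> filler

    def get(self, slot):
        """Return the slot's filler, falling back on inherited defaults."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent    # climb to the next level of abstraction
        raise KeyError(slot)


# Defaults that normally hold, stated at an abstract level.
bird = Frame("bird", can_fly=True, covering="feathers")
# A more specific frame overrides one default and inherits the other.
penguin = Frame("penguin", parent=bird, can_fly=False)
# An individual that fills no slots of its own.
opus = Frame("Opus", parent=penguin)
# Embedding: the filler of a slot may itself be a frame.
nest = Frame("nest", occupant=opus, location="shore")
```

Here `opus.get("can_fly")` is answered by the penguin frame, while `opus.get("covering")` falls through to the default stored on the bird frame. Slots play the role of variables: the same `occupant` slot could be filled by any other frame.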

The second type of cognitive architecture is associative. In contrast with models of the von Neumann type, which assume that processing involves serial, rule-governed operations on symbolic representations, associative models assume that processing is done by a large number of parallel operators and conforms to principles of similarity and contiguity. For example, an associative model of memory explains how remembering part of an event can cue retrieval of the rest of the event by claiming that an association between the two parts was constructed when the event was first encoded. Activation from the representation of the first part of the event flows to the representation of the second part through an associative connection. More generally, the first part of the event cues associative retrieval of the entire event (and thus the second part) by virtue of being similar to the entire event.
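
The cueing account above can be made concrete with a toy memory in which encoding an event strengthens links among its parts, and retrieval spreads activation from the cue along those links. This is a minimal sketch, not a published model: the class `AssociativeMemory`, the unit increment in link strength, and the single step of spreading are assumptions chosen for brevity.

```python
from collections import defaultdict
from itertools import combinations


class AssociativeMemory:
    """Hypothetical associative store built on co-occurrence links."""

    def __init__(self):
        # strength[a][b] is the strength of the link between a and b
        self.strength = defaultdict(lambda: defaultdict(float))

    def encode(self, event):
        """Strengthen the link between every pair of co-occurring parts."""
        for a, b in combinations(sorted(set(event)), 2):
            self.strength[a][b] += 1.0
            self.strength[b][a] += 1.0

    def retrieve(self, cue):
        """Spread activation one step from the cue; rank what it reaches."""
        cue = set(cue)
        activation = defaultdict(float)
        for element in cue:
            for neighbor, weight in self.strength[element].items():
                if neighbor not in cue:
                    activation[neighbor] += weight
        # Highest activation first; ties broken alphabetically.
        return sorted(activation, key=lambda k: (-activation[k], k))


memory = AssociativeMemory()
memory.encode(["beach", "sunset", "picnic"])
memory.encode(["beach", "sunburn"])
memory.encode(["office", "meeting"])

recalled = memory.retrieve(["beach", "sunset"])
```

The cue "beach, sunset" activates "picnic" through two links and "sunburn" through one, so the part of the event that overlaps most with the cue is retrieved first. Nothing here applies a rule to a symbol; retrieval depends only on link strengths laid down by contiguity.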

Associative models have a long history stretching back to Aristotle, who construed MEMORY and some reasoning processes in terms of associations between elementary sense images. More recent associative models are more promiscuous: different models assume associations between different kinds of entities, either concepts themselves or some more primitive set of elements out of which concepts are assumed to be constructed.

Such modern conceptions of associative cognitive architecture have two antecedents located in the history of cognitive science and two more immediate precursors. The first historical source is the foundational work on associative computation begun by MCCULLOCH and PITTS (1943) demonstrating the enormous computational power of populations of neurons and the ability of such systems to learn using simple algorithms. The second source is the application of associative models based on neurophysiology to psychology. An influential synthesis of these efforts was Hebb's (1949) book, The Organization of Behavior. HEBB attempted to account for psychological phenomena using a theory of neural connections (cell assemblies) that could be neurophysiologically motivated, in part by appeal to large-scale cortical organization. Thus, brain architecture became a source of inspiration for cognitive architecture. Hebb's conception was especially successful as an account of perceptual learning. The two remaining antecedents involve technical achievements that led to a renewed focus on associative models in the 1980s. Earlier efforts to build associative devices resulted in machines that were severely limited in the kinds of distinctions they were able to make (they could only distinguish linearly separable patterns). This limitation was overcome by the introduction of a learning algorithm called "backpropagation of error" (Rumelhart, Hinton, and Williams 1986). The second critical technical achievement was a set of proofs, due in large part to Hopfield (e.g., 1982), that provided new ways of interpreting associative computation and brought new tools to bear on the study of associative networks. These proofs demonstrated that certain kinds of associative networks could be interpreted as optimizing mathematical functions. This insight gave theorists a tool to translate a problem that a person might face into an associative network. This greatly simplified the process of constructing associative models of cognitive tasks.
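
The claim that certain associative networks can be read as optimizing a mathematical function can be illustrated with a small network of the kind associated with Hopfield. The sketch below follows a common textbook formulation (units taking the values +1 and -1, symmetric Hebbian weights, one unit updated at a time) and is not taken from the cited paper; the function names `train`, `energy`, and `recall` and the two stored patterns are assumptions made for illustration.

```python
def train(patterns):
    """Hebbian weights: w[i][j] sums s_i * s_j over the stored patterns."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def energy(w, s):
    """E(s) = -1/2 * sum over i, j of w[i][j] * s[i] * s[j]."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j]
                      for i in range(n) for j in range(n))


def recall(w, state, max_sweeps=10):
    """Update units one at a time until a full sweep changes nothing."""
    s = list(state)
    energies = [energy(w, s)]               # energy before any update
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            h = sum(w[i][j] * s[j] for j in range(len(s)))  # net input
            new = 1 if h >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
            energies.append(energy(w, s))   # energy after each unit update
        if not changed:
            break
    return s, energies


# Two stored patterns (chosen to be orthogonal) and a corrupted cue.
pattern_a = [1, 1, 1, 1, -1, -1, -1, -1]
pattern_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([pattern_a, pattern_b])

cue = [-1, 1, 1, 1, -1, -1, -1, -1]         # pattern_a with its first unit flipped
recalled, energies = recall(weights, cue)
```

With symmetric weights and no self-connections, each single-unit update can only lower the energy or leave it unchanged, so the network settles into a local minimum. Storing a pattern amounts to placing a minimum at it, and recall from a partial or corrupted cue amounts to descending to the nearest one. Translating a task into such a network then means choosing weights whose energy function is lowest at acceptable solutions.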

These achievements inspired renewed interest in associative architectures. In 1986, Rumelhart and McClelland published a pair of books on parallel, distributed processing that described a set of models of different cognitive systems (e.g., memory, perception, and language) based on common associative principles. The work lent credence to the claim that an integrated associative architecture could be developed.

Although a broad division between von Neumann and associative architectures helps to organize the various conceptions of mental organization that have been offered, it does an injustice to hybrid architectural proposals; that is, proposals that include von Neumann-style as well as associative components. Such alliances of processing systems seem necessary on both theoretical and empirical grounds (Sloman 1996). Only von Neumann components seem capable of manipulating variables in a way that matches human competence (see BINDING PROBLEM), yet associative components seem better able to capture the context-specificity of human judgment and performance as well as people's ability to deal with and integrate many pieces of information simultaneously. One important hybrid theory is ACT* (Anderson 1983). ACT* posits three memories: a production, a declarative, and a working memory, as well as processes that interrelate them. The architecture includes both a production system and an associative network. In this sense, ACT* is an early attempt to build an architecture that takes advantage of both von Neumann and associative principles. But integrating these very different attitudes in a principled and productive way is an ongoing challenge.

-- Steven Sloman

References


Anderson, J. R. (1983). The Architecture of Cognition. Cambridge, MA: Harvard University Press.

Atkinson, R. C., and R. M. Shiffrin. (1968). Human memory: a proposed system and its control processes. In K. W. Spence and J. T. Spence, Eds., The Psychology of Learning and Motivation: Advances in Research and Theory, vol. 2. New York: Academic Press, pp. 89-195.

Broadbent, D. E. (1958). Perception and Communication. London: Pergamon Press.

Hebb, D. O. (1949). The Organization of Behavior. New York: Wiley.

Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, USA 79:2554-2558.

McCulloch, W. S., and W. Pitts. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5:115-133.

Minsky, M. (1975). A framework for representing knowledge. In P. Winston, Ed., The Psychology of Computer Vision. New York: McGraw-Hill.

Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.

Newell, A., H. A. Simon, and J. C. Shaw. (1958). Elements of a theory of human problem solving. Psychological Review 65:151-166.

Rumelhart, D. E., and A. Ortony. (1977). The representation of knowledge in memory. In R. C. Anderson, R. J. Spiro, and W. E. Montague, Eds., Schooling and the Acquisition of Knowledge. Hillsdale, NJ: Erlbaum.

Rumelhart, D. E., G. E. Hinton, and R. J. Williams. (1986). Learning internal representations by error propagation. In D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Eds., Parallel Distributed Processing, vol. 1. Cambridge, MA: MIT Press.

Schank, R. C., and R. Abelson. (1977). Scripts, Plans, Goals, and Understanding. Hillsdale, NJ: Erlbaum.

Sloman, S. A. (1996). The empirical case for two systems of reasoning. Psychological Bulletin 119:3-22.

Further Readings

Newell, A., and H. A. Simon. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.

Hinton, G. E., and J. A. Anderson. (1989). Parallel Models of Associative Memory. Hillsdale, NJ: Erlbaum.

Pinker, S., and J. Mehler, Eds. (1988). Connections and Symbols. Cambridge, MA: MIT Press.

Rumelhart, D. E., J. L. McClelland, and the PDP Research Group, Eds. (1986). Parallel Distributed Processing. Cambridge, MA: MIT Press.

Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences 11:1-23.