Connectionist cognitive modeling is an approach to understanding the mechanisms of human cognition through the use of simulated networks of simple, neuronlike processing units. Connectionist models are most often applied to what might be called natural cognitive tasks. These tasks include perceiving the world of objects and events and interpreting it for the purpose of organized behavior; retrieving contextually appropriate information from memory; perceiving and understanding language; and what might be called intuitive or implicit reasoning, in which an inference is derived or a solution to a problem is discovered without the explicit application of a predefined ALGORITHM. Because connectionist models capture cognition at a microstructural level, a more succinct characterization of a cognitive process -- especially one that is temporally extended or involves explicit, verbal reasoning -- can sometimes be given through the use of a more symbolic modeling framework. However, many connectionists hold that a connectionist microstructure underlies all aspects of human cognition, and a connectionist approach may well be necessary to capture the supreme achievements of human reasoning and problem solving, to the extent that such achievements arise from sudden insight, implicit reasoning, and/or imagining, as opposed to algorithmic derivation. See CONNECTIONISM, PHILOSOPHICAL ISSUES for further discussion, and PROBLEM SOLVING and COGNITIVE MODELING, SYMBOLIC for other perspectives.
Connectionist models -- often called parallel distributed processing models or NEURAL NETWORKS -- begin with the assumption that natural cognition takes place through the interactions of large numbers of simple processing units. Inspiration for this approach comes from the fact that the brain appears to consist of vast numbers of such units -- neurons. While connectionists often seek to capture putative principles of neural COMPUTATION in their models, the units in an actual connectionist simulation model should not generally be thought of as corresponding to individual neurons, because there are far fewer units in most simulations than neurons in the relevant brain regions, and because some of the properties of the units used may not be exactly neuron-like.
In connectionist systems, an active mental representation, such as a percept, is a pattern of activation over the set of processing units in the model. Processing takes place via the propagation of activation among the units through weighted connections. The "knowledge" that governs processing consists of the values of the connection weights, and learning occurs through the gradual adaptation of the connection weights, which occurs as a result of activity in the network, sometimes taken together with "error" signals, either in the form of a success or failure signal (cf. REINFORCEMENT LEARNING) or an explicit computation of the mismatch between obtained results and some "teaching" signal (cf. error-correction learning, back propagation).
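The basic processing and learning cycle described above can be sketched in a few lines of Python (a minimal illustration rather than any particular published model; the logistic squashing function, the learning rate, and the tiny network size are arbitrary choices):

```python
import math

def logistic(x):
    """Squash a net input into an activation between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def propagate(inputs, weights):
    """Propagate activation: each receiving unit takes the weighted sum
    of the sending units' activations and squashes it."""
    return [logistic(sum(w * a for w, a in zip(row, inputs))) for row in weights]

def delta_rule(inputs, weights, targets, lr=0.5):
    """Error-correction learning: adjust each connection weight in
    proportion to the mismatch between the "teaching" signal and the
    activation actually obtained."""
    outputs = propagate(inputs, weights)
    for i, row in enumerate(weights):
        error = targets[i] - outputs[i]
        for j in range(len(row)):
            row[j] += lr * error * inputs[j]
    return weights
```

Repeated application of `delta_rule` to the same input-target pair gradually drives the obtained activation toward the teaching signal; back propagation extends the same error-correction idea to networks with intermediate layers.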
Perception is a highly context-dependent process. Individual stimulus elements may be highly ambiguous, but when considered in light of other elements present in the input, together with knowledge of patterns of co-occurrence of elements, there may be a single best interpretation. Such is the case with the famous Dalmatian dog figure, shown in figure 1. An early connectionist model that captured the joint role of stimulus and context information in perception was the interactive activation model (McClelland and Rumelhart 1981). This model contained units for familiar words, for letters in each position within the words, and for features of letters in each position (fig. 2). Mutually consistent units had mutually excitatory connections (e.g., the unit for T in the first position had mutually excitatory connections with the units for words beginning with T, such as TIME, TAPE, etc.). Mutually inconsistent units had mutually inhibitory connections (e.g., there can be only one letter per position, so the units for all of the letters in a given position are mutually inhibitory). Simulations of perception as occurring through the excitatory and inhibitory interactions among these units have led to a detailed account of a large body of psychological evidence on the role of context in letter perception. The interactive activation model further addresses the fact that perceptual enhancement also occurs for novel, wordlike stimuli such as MAVE. The presentation of an item like MAVE produces partial activation of a number of word units (such as SAVE, GAVE, MAKE, MOVE, etc.). Each of these provides a small amount of feedback support to the units for the letters it contains, with the outcome that the letters in items like MAVE receive almost as much feedback as letters in actual words. Stochastic versions of the interactive activation model overcome empirical shortcomings of the original version (McClelland 1991).
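A toy version of these interactions can be written down directly (the three-word vocabulary, the single letter position, and all parameters here are invented for illustration; the actual 1981 model had feature, letter, and word layers with different update equations):

```python
# Hypothetical mini-lexicon: each word is represented only by its first
# letter, so "TIME" and "TAPE" support (and are supported by) the
# first-position T unit, while "CART" goes with C.
WORDS = {"TIME": "T", "TAPE": "T", "CART": "C"}

def settle(letter_evidence, steps=30, rate=0.1):
    """Iterate excitatory (consistent) and inhibitory (inconsistent)
    interactions between first-letter units and word units, starting
    from bottom-up evidence for each letter; activations are clipped
    to the range [0, 1]."""
    letters = {l: 0.0 for l in set(WORDS.values())}
    words = {w: 0.0 for w in WORDS}
    for _ in range(steps):
        new_letters = {}
        for l, a in letters.items():
            top_down = sum(words[w] for w, first in WORDS.items() if first == l)
            rivals = sum(a2 for l2, a2 in letters.items() if l2 != l)
            net = letter_evidence.get(l, 0.0) + top_down - rivals
            new_letters[l] = min(1.0, max(0.0, a + rate * net))
        new_words = {}
        for w, first in WORDS.items():
            rivals = sum(a2 for w2, a2 in words.items() if w2 != w)
            new_words[w] = min(1.0, max(0.0, words[w] + rate * (letters[first] - rivals)))
        letters, words = new_letters, new_words
    return letters, words
```

Given slightly stronger evidence for T than for C, feedback from the two T-words amplifies the difference and the network settles on the T interpretation; the same top-down feedback is what lends support to the letters of a novel item like MAVE.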
Other connectionist models have investigated issues in the perception of spoken language (McClelland and Elman 1986), in VISUAL OBJECT RECOGNITION, AI (Hummel and Biederman 1992), and in the interaction of perceptual and attentional processes (Phaf, van der Heijden, and Hudson 1990; Mozer 1987; Mozer and Behrmann 1990).
Memory and Learning
A fundamental assumption of connectionist models of MEMORY is that memory is inherently a constructive process, taking place through the interactions of simple processing units, just as in the case of perception. One can think of recall as a process of constructing a pattern of activation that is taken by the recaller to reflect not the present input to the senses but some pattern previously experienced. Central to this view is the idea that recall is prone to a variety of influences that often help us fill in missing details but are not always bound to fill in correct information. An early model of memory retrieval (McClelland 1981) showed how multiple items in memory can become partially activated, thereby filling in missing information, when memory is probed. The partial activation is based on the similarity of each item in memory to the probe and to the information initially retrieved in response to the probe. Similarity-based generalization appears to be ubiquitous, and it can often lead to correct inference, but this is far from guaranteed; indeed, subsequent work by Nystrom and McClelland (1992) showed how connectionist networks can produce blend errors in recall. Connectionist models have also been applied productively to aspects of concept learning (Gluck and Bower 1988), prototype formation (Knapp and Anderson 1984; McClelland and Rumelhart 1985), and the acquisition of conceptual representations (Rumelhart and Todd 1993). A crucial aspect of this latter work is the demonstration that connectionist models trained with back propagation can learn what basis to use for representations of concepts, so that similarity-based generalization can be based on deep or structural rather than superficial aspects of similarity (Hinton 1989; see McClelland 1994 for discussion).
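The idea that a probe partially activates stored items in proportion to their similarity, and that the activated items then fill in missing information, can be illustrated schematically (the binary feature vectors and the simple overlap measure are invented for this sketch; the cited models use distributed representations and richer dynamics):

```python
# Three hypothetical stored items, each a vector of binary features.
MEMORY = [
    [1, 1, 0, 1],  # item A
    [1, 0, 1, 1],  # item B
    [0, 0, 1, 0],  # item C
]

def retrieve(probe):
    """Probe memory with a partial pattern (None marks an unknown
    feature). Each stored item is activated in proportion to its
    overlap with the known features, and unknown features are filled
    in as an activation-weighted blend across items."""
    known = [i for i, v in enumerate(probe) if v is not None]
    acts = [sum(1 for i in known if item[i] == probe[i]) / len(known)
            for item in MEMORY]
    total = sum(acts) or 1.0
    filled = list(probe)
    for i, v in enumerate(probe):
        if v is None:
            filled[i] = sum(a * item[i] for a, item in zip(acts, MEMORY)) / total
    return filled, acts
```

Because the fill-in is a blend over all partially active items, a probe that matches several items at once can yield a composite corresponding to no single stored item, which is the kind of blend error in recall discussed above.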
Connectionist models also address the distinction between explicit and implicit memory. Implicit memory refers to an aftereffect of experience with an item in a task that does not require explicit reference to the prior occurrence of the item. These effects often occur without any recollection of having previously seen the item. Connectionist models account for such findings in terms of adjustments of the strengths of the connections among the units in networks responsible for processing the stimuli (McClelland and Rumelhart 1985; Becker et al. 1997). Explicit memory for recent events and experiences may be profoundly impaired in individuals who show normal implicit learning (Squire 1992), suggesting that a special brain system may be required for the formation of new explicit memories. A number of connectionist models have been proposed in an effort to explain how and why these effects occur (Murre 1997; Alvarez and Squire 1994; McClelland, McNaughton, and O'Reilly 1995).
Language and Reading
Connectionist models have suggested a clear alternative to the notion that knowledge of language must be represented as a system of explicit (though inaccessible) rules, and have presented mechanisms of morphological inflection, spelling-sound conversion, and sentence processing and comprehension that account for important aspects of the psychological phenomena of language that have been ignored by traditional accounts. Key among the phenomena not captured by traditional, rule-based approaches have been the existence of quasi-regular structure, and the sensitivity of language behavior to varying degrees of frequency and consistency. While all approaches acknowledge the existence of exceptions, traditional approaches have failed to take account of the fact that the exceptions are far from a random list of completely arbitrary items. Exceptions to the regular past tense of English, for example, come in clusters that share phonological characteristics (e.g., weep-wept, sleep-slept, sweep-swept, creep-crept) and quite frequently have elements in common with the "regular" past tense (/d/ or /t/, like their "regular" counterparts). An early connectionist model of Rumelhart and McClelland (1986) showed that a network model that learned connection weights to generate the past tense of a word from its present tense could capture a number of aspects of the acquisition of the past tense. Critiques of aspects of this model (Pinker and Prince 1988; Lachter and Bever 1988) raised a number of objections, but subsequent modeling work (MacWhinney and Leinbach 1991; Plunkett and Marchman 1993) has addressed many of the criticisms. Debate still revolves around the need to assume that explicit, inaccessible rules arise at some point in the course of normal development (Pinker 1991). A similar debate has arisen in the domain of word reading (see Coltheart et al. 1993; Plaut et al. 1996).
Connectionist approaches have also been used to account for aspects of language comprehension and production. Connectionists suggest that language processing is a constraint-satisfaction process sensitive to semantic and contextual factors as well as syntactic constraints (Rumelhart 1977; McClelland 1987). Considerable evidence (Taraban and McClelland 1988; MacDonald, Pearlmutter, and Seidenberg 1994; Tanenhaus et al. 1995) now supports the constraint-satisfaction position, and a model that takes joint effects of content and sentence structure into account has been implemented (St. John and McClelland 1990). In production, evidence supporting a constraint-satisfaction approach to the generation of the sounds of a word has led to interactive connectionist models of word production (Dell 1986; Dell et al. forthcoming). Another important direction of connectionist work in the area of language centers on the learning of the grammatical structure of sentences by a class of connectionist networks known as simple recurrent nets (Elman 1990). Such networks can learn to become sensitive to the long-distance dependencies characteristic of sentences with embedded clauses, suggesting that there may be no need to posit explicit, inaccessible rules to account for human knowledge of syntax (Elman 1991; Servan-Schreiber, Cleeremans, and McClelland 1991; Rohde and Plaut forthcoming). However, existing models have been trained on very small "languages," and successes with larger language corpora, as well as demonstrations of sensitivity to additional aspects of syntax, are needed.
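The core mechanism of the simple recurrent net, a hidden layer driven jointly by the current input and a copy of its own previous state, can be sketched as follows (forward pass only, with hand-set illustrative weights; Elman's networks learned their weights through back propagation):

```python
import math

def srn_step(inp, context, W_in, W_ctx):
    """One time step of a simple recurrent network: the hidden state is
    a squashed combination of the current input and the previous hidden
    state (the "context" layer)."""
    n_hidden = len(W_in)
    return [math.tanh(sum(w * x for w, x in zip(W_in[i], inp)) +
                      sum(w * c for w, c in zip(W_ctx[i], context)))
            for i in range(n_hidden)]

def run_sequence(seq, W_in, W_ctx, n_hidden):
    """Feed a sequence through the network one element at a time; each
    step's hidden state becomes the next step's context."""
    context = [0.0] * n_hidden
    for inp in seq:
        context = srn_step(inp, context, W_in, W_ctx)
    return context
```

Because the context layer carries a decaying trace of earlier inputs, two sequences that end identically but began differently leave the network in different states, which is what makes sensitivity to long-distance dependencies possible once suitable weights are learned.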
Reasoning and Problem Solving
While connectionist models have had considerable success in many areas of cognition, their promise for addressing higher-level aspects of cognition, such as reasoning and problem solving, remains to be fully realized. A number of papers point toward the prospects of connectionist models in these areas (Rumelhart et al. 1986; Rumelhart 1989) without full implementations, perhaps in part because higher-level cognition often has a temporally extended character that is not easily captured in a single settling of a network to an attractor state. Though there have been promising developments in the use of RECURRENT NETWORKS to model temporally extended aspects of cognition, many researchers have opted for "hybrid" models. These models often rely on external, more traditional modeling frameworks to assign units and connections so that appropriate constraint-satisfaction processes can then be carried out in the connectionist component. This approach has been used in the domain of analogical reasoning (Holyoak and Thagard 1989). A slightly different approach, suggested by Rumelhart (1989), assumes that concepts are represented by distributed patterns of activity that capture both their superficial and their deeper conceptual and relational features. Discovering an analogy then consists of activating the conceptual and relational features of the source concept; the network may then settle to an attractor state consisting of an analog in another domain that shares these same deep features but differs in superficial details.
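The notion of settling to an attractor state can be made concrete with a minimal Hopfield-style network (a generic sketch rather than a model from the papers cited; two arbitrary patterns are stored in the weights by a Hebbian rule, and updating the units from a degraded cue recovers the nearest stored pattern):

```python
# Two arbitrary stored patterns over six +1/-1 units.
PATTERNS = [
    [1, 1, 1, -1, -1, -1],
    [-1, -1, 1, 1, 1, -1],
]
N = len(PATTERNS[0])

# Hebbian storage: each weight sums the co-activity of its two units
# across the stored patterns (no self-connections).
W = [[0.0 if i == j else sum(p[i] * p[j] for p in PATTERNS) / N
      for j in range(N)] for i in range(N)]

def settle(state, sweeps=5):
    """Repeatedly update each unit to agree with the sign of its net
    input; the state stops changing once an attractor is reached."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(N):
            net = sum(W[i][j] * state[j] for j in range(N))
            state[i] = 1 if net >= 0 else -1
    return state
```

Starting from a cue that is pattern 0 with one unit flipped, the updates restore the stored pattern; in the analogy proposal sketched above, the attractor reached from the deep features of a source concept would be an analog sharing those features.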
Many researchers in this area view the "binding problem" (the assignment of arbitrary content to a slot in a structural description) as a fundamental problem to be solved in the implementation of connectionist models of reasoning, and several solutions have been proposed (Smolensky, Legendre, and Miyata forthcoming; Shastri and Ajjanagadde 1993; Hummel and Holyoak 1997). However, networks can learn to create their own slots so that they can carry out natural inferences in familiar content areas (St. John 1992). Whether learning mechanisms can yield a general enough implementation to capture people's ability to reason in unfamiliar domains remains to be determined.
Alvarez, P., and L. R. Squire. (1994). Memory consolidation and the medial temporal lobe: a simple network model. Proceedings of the National Academy of Sciences, USA 91:7041-7045.
Becker, S., M. Moscovitch, M. Behrmann, and S. Joordens. (1997). Long-term semantic priming: a computational account and empirical evidence. Journal of Experimental Psychology: Learning, Memory, and Cognition 23:1059-1082.
Coltheart, M., B. Curtis, P. Atkins, and M. Haller. (1993). Models of reading aloud: dual-route and parallel-distributed-processing approaches. Psychological Review 100:589-608.
Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review 93:283-321.
Dell, G. S., M. F. Schwartz, N. Martin, E. M. Saffran, and D. A. Gagnon. (forthcoming). Lexical access in normal and aphasic speakers.
Elman, J. L. (1990). Finding structure in time. Cognitive Science 14:179-211.
Elman, J. L. (1991). Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning 7:194-220.
Gluck, M. A., and G. H. Bower. (1988). Evaluating an adaptive network model of human learning. Journal of Memory and Language 27:166-195.
Hinton, G. E. (1989). Learning distributed representations of concepts. In R. G. M. Morris, Ed., Parallel Distributed Processing: Implications for Psychology and Neurobiology. Oxford, England: Clarendon Press, pp. 46-61.
Holyoak, K. J., and P. Thagard. (1989). Analogical mapping by constraint satisfaction. Cognitive Science 13:295-356.
Hummel, J. E., and I. Biederman. (1992). Dynamic binding in a neural network for shape recognition. Psychological Review 99:480-517.
Hummel, J. E., and K. J. Holyoak. (1997). Distributed representations of structure: a theory of analogical access and mapping. Psychological Review 104:427-466.
Knapp, A., and J. A. Anderson. (1984). A signal averaging model for concept formation. Journal of Experimental Psychology: Learning, Memory, and Cognition 10:617-637.
Lachter, J., and T. G. Bever. (1988). The relation between linguistic structure and theories of language learning: a constructive critique of some connectionist learning models. Cognition 28:195-247.
MacDonald, M. C., N. J. Pearlmutter, and M. S. Seidenberg. (1994). The lexical nature of syntactic ambiguity resolution. Psychological Review 101:676-703.
MacWhinney, B., and J. Leinbach. (1991). Implementations are not conceptualizations: revising the verb learning model. Cognition 40:121-153.
Marr, D. (1982). Vision. San Francisco: W. H. Freeman.
McClelland, J. L. (1981). Retrieving general and specific information from stored knowledge of specifics. In Proceedings of the Third Annual Conference of the Cognitive Science Society. Berkeley, CA, pp. 170-172.
McClelland, J. L. (1987). The case for interactionism in language processing. In M. Coltheart, Ed., Attention and Performance XII: The Psychology of Reading. London: Erlbaum, pp. 1-36.
McClelland, J. L. (1991). Stochastic interactive activation and the effect of context on perception. Cognitive Psychology 23:1-44.
McClelland, J. L. (1994). The interaction of nature and nurture in development: a parallel distributed processing perspective. In P. Bertelson, P. Eelen, and G. D'Ydewalle, Eds., International Perspectives on Psychological Science, vol. 1: Leading Themes. Hillsdale, NJ: Erlbaum, pp. 57-88.
McClelland, J. L., and J. L. Elman. (1986). The TRACE model of speech perception. Cognitive Psychology 18:1-86.
McClelland, J. L., B. L. McNaughton, and R. C. O'Reilly. (1995). Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychological Review 102:419-457.
McClelland, J. L., and D. E. Rumelhart. (1981). An interactive activation model of context effects in letter perception: Part 1: an account of basic findings. Psychological Review 88:375-407.
McClelland, J. L., and D. E. Rumelhart. (1985). Distributed memory and the representation of general and specific information. Journal of Experimental Psychology: General 114:159-188.
Mozer, M. C. (1987). Early parallel processing in reading: a connectionist approach. In M. Coltheart, Ed., Attention and Performance XII: The Psychology of Reading. London: Erlbaum, pp. 83-104.
Mozer, M. C., and M. Behrmann. (1990). On the interaction of selective attention and lexical knowledge: a connectionist account of neglect dyslexia. Journal of Cognitive Neuroscience 2:96-123.
Murre, J. M. (1997). Implicit and explicit memory in amnesia: some explanations and predictions of the tracelink model. Memory 5:213-232.
Nystrom, L. E., and J. L. McClelland. (1992). Trace synthesis in cued recall. Journal of Memory and Language 31:591-614.
Phaf, R. H., A. H. C. van der Heijden, and P. T. W. Hudson. (1990). SLAM: A connectionist model for attention in visual selection tasks. Cognitive Psychology 22:273-341.
Pinker, S. (1991). Rules of language. Science 253:530.
Pinker, S., and A. Prince. (1988). On language and connectionism: analysis of a parallel distributed processing model of language acquisition. Cognition 28:73-193.
Plaut, D. C., J. L. McClelland, M. S. Seidenberg, and K. E. Patterson. (1996). Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychological Review 103:56-115.
Plunkett, K., and V. A. Marchman. (1993). From rote learning to system building: acquiring verb morphology in children and connectionist nets. Cognition 48:21-69.
Rohde, D., and D. C. Plaut. (forthcoming). Simple Recurrent Networks and Natural Language: How Important is Starting Small? Pittsburgh, PA: Center for the Neural Basis of Cognition, Carnegie Mellon and the University of Pittsburgh.
Rumelhart, D. E. (1977). Toward an interactive model of reading. In S. Dornic, Ed., Attention and Performance VI. Hillsdale, NJ: Erlbaum.
Rumelhart, D. E. (1989). Toward a microstructural account of human reasoning. In S. Vosniadou and A. Ortony, Eds., Similarity and Analogical Reasoning. New York: Cambridge University Press, pp. 298-312.
Rumelhart, D. E., and J. L. McClelland. (1986). On learning the past tenses of English verbs. In J. L. McClelland, D. E. Rumelhart, and the PDP Research Group, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 2. Cambridge, MA: MIT Press, pp. 216-271.
Rumelhart, D. E., P. Smolensky, J. L. McClelland, and G. E. Hinton. (1986). Schemata and sequential thought processes in PDP models. In J. L. McClelland, D. E. Rumelhart, and the PDP Research Group, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 2. Cambridge, MA: MIT Press, pp. 7-57.
Rumelhart, D. E., and P. M. Todd. (1993). Learning and connectionist representations. In D. E. Meyer and S. Kornblum, Eds., Attention and Performance XIV: Synergies in Experimental Psychology, Artificial Intelligence and Cognitive Neuroscience. Cambridge, MA: MIT Press, pp. 3-30.
Servan-Schreiber, D., A. Cleeremans, and J. L. McClelland. (1991). Graded state machines: the representation of temporal contingencies in simple recurrent networks. Machine Learning 7:161-193.
Shastri, L., and V. Ajjanagadde. (1993). From simple associations to systematic reasoning: a connectionist representation of rules, variables and dynamic bindings using temporal synchrony. Behavioral and Brain Sciences 16:417-494.
Smolensky, P., G. Legendre, and Y. Miyata. (forthcoming). Principles for an Integrated Connectionist/Symbolic Theory of Higher Cognition. Hillsdale, NJ: Erlbaum. (Also available as Technical Report CU-CS-600-92. Boulder: Computer Science Department and Institute of Cognitive Science, University of Colorado at Boulder.)
Squire, L. R. (1992). Memory and the hippocampus: a synthesis from findings with rats, monkeys and humans. Psychological Review 99:195-231.
St. John, M. F. (1992). The story gestalt: a model of knowledge-intensive processes in text comprehension. Cognitive Science 16:271-306.
St. John, M. F., and J. L. McClelland. (1990). Learning and applying contextual constraints in sentence comprehension. Artificial Intelligence 46:217-257.
Tanenhaus, M. K., M. J. Spivey-Knowlton, K. M. Eberhard, and J. C. Sedivy. (1995). Integration of visual and linguistic information in spoken language comprehension. Science 268:1632-1634.
Taraban, R., and J. L. McClelland. (1988). Constituent attachment and thematic role assignment in sentence processing: influences of content-based expectations. Journal of Memory and Language 27:597-632.
Copyright © 1999 Massachusetts Institute of Technology