Feature Detectors

The existence of feature detectors is based on evidence obtained by recording from single neurons in the visual pathways (Barlow 1953; Lettvin et al. 1959; Hubel and Wiesel 1962; Waterman and Wiersma 1963; see also SINGLE-NEURON RECORDING). It was found that responses from many types of NEURON do not correlate well with the straightforward physical parameters of the stimulus, but instead require some specific pattern of excitation, often a spatio-temporal pattern that involves movement. The rabbit retina provides some well-documented examples, though they were not the first to be described. Directionally selective ganglion cells respond to movements of the image in one direction, but respond poorly to the reverse motion, however bright or contrasty the stimulus; there are two classes of these ganglion cells, distinguished by the fast or slow speed of movement that each prefers, and within each class there are groups responding to different directions of motion (Barlow, Hill, and Levick 1964). Another class, found mainly in the central zone of the RETINA, are local edge detectors that respond only to an edge moving very slowly over precisely the right position in the visual field (Levick 1967). These units are often highly specific in their stimulus requirements, and it can be a difficult task to find out what causes such a unit to fire reliably; yet once the appropriate trigger feature has been properly defined it will work every time. Another class, the fast movement detectors, respond only to very rapid image movements, and yet another, the uniformity detectors, fire continuously at a high rate except when patterned stimulation is delivered to the part of the retina they are connected to.
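
How such directional selectivity could arise in a simple circuit can be illustrated with a toy delay-and-compare unit, sketched below in Python. This is a hedged illustration only: the receptor names, the delay, and the subtraction scheme are invented for the example and are not a model of the actual rabbit retinal circuitry described by Barlow, Hill, and Levick.

```python
import numpy as np

def direction_selective_unit(a, b, delay=1):
    """Toy delay-and-compare motion unit (illustrative only, not the real
    retinal circuit): correlate receptor a's delayed signal with receptor b's
    current signal, minus the mirror-image term, so a spot moving from a to b
    gives a positive response and the reverse motion a negative one."""
    a_delayed = np.concatenate([np.zeros(delay), a[:-delay]])
    b_delayed = np.concatenate([np.zeros(delay), b[:-delay]])
    return float(np.sum(a_delayed * b - b_delayed * a))

t = np.zeros(10)
a, b = t.copy(), t.copy()
a[3], b[4] = 1.0, 1.0                   # a spot passes receptor a, then receptor b
print(direction_selective_unit(a, b))   # positive: preferred direction
print(direction_selective_unit(b, a))   # negative: null direction
```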

All classes have a restricted retinal region where the appropriate feature has to be positioned, and this is described as the unit's receptive field, even though the operations being performed on the input are very different from simple linear summation of excitatory and inhibitory influences, which the term receptive field is sometimes thought to imply. Units often show considerable invariance of response for changes of the luminance, contrast, and even polarity of the light stimulus, while maintaining selectivity for their particular patterned spatio-temporal feature. There is also evidence for feature detectors in auditory (Evans 1974; Suga 1994; see also DISTINCTIVE FEATURES) and tactile pathways.
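
A toy Python sketch of the kind of invariance described (the window size and luminance values are invented for illustration, and this is not a model of the rabbit's local edge detectors): a unit that rectifies the luminance step across its receptive field responds to an edge of either polarity and ignores overall luminance, while remaining silent to a blank field.

```python
import numpy as np

def edge_unit(window):
    """Toy edge unit (illustrative only): compares mean luminance on the two
    halves of a small window and full-wave rectifies the difference, so it
    responds to an edge of either polarity, is unaffected by a uniform shift
    in overall luminance, and gives nothing to a blank field."""
    half = len(window) // 2
    return abs(window[half:].mean() - window[:half].mean())

print(edge_unit(np.array([0.2, 0.2, 0.8, 0.8])))        # dark-to-light edge
print(edge_unit(np.array([0.8, 0.8, 0.2, 0.2])))        # light-to-dark edge: same response
print(edge_unit(np.array([0.5, 0.5, 0.5, 0.5])))        # uniform field: no response
print(edge_unit(np.array([0.2, 0.2, 0.8, 0.8]) + 0.1))  # brighter overall: unchanged
```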

Feature detection in the retina makes it clear that complicated logical operations can be achieved in simple neural circuits, and that these processes need to be described in computational, rather than linear, terms. For a time there was some rivalry between feature creatures, who espoused this logical, computational view of the operation of visual neurons, and frequency freaks, who were devoted to the use of sine-wave spatial stimuli and Fourier interpretations. The latter had genuine successes that yielded valid new insights (Braddick, Campbell, and Atkinson 1978), but their approach works best for systems that operate nearly linearly. Object recognition is certainly not a linear process, and the importance of feature detectors lies in the insight they give into how the brain achieves this very difficult task. But first it is worth glancing back at the history of feature detection before single-neuron feature detectors were discovered.

Sherrington found that to elicit the scratch reflex -- the rhythmical scratching movements made by a dog's hind leg -- a tactile stimulus had to be applied to a particular region of the flank, and it was most effective if it was applied to several neighboring cutaneous regions in succession. This must require a tactile feature detector not unlike some of those discovered in visual pathways.

Some years later the ethologists Lorenz (1961) and Tinbergen (1953) popularized the notion of innate releasers: these are special sensory stimuli that trigger specific behavioral responses when delivered under the appropriate circumstances. One example is the red spot on a herring gull's bill, which has been shown to elicit pecking from the chick and thereby prompt the parent to regurgitate food for it. Another example is the stimulus for eliciting the rather stereotyped feeding behavior shown by many vertebrates: a small moving object first alerts the animal, then causes it to orient itself toward the stimulus, next to approach it, and finally to snap at it. It was suggested early on that the retinal ganglion cells in the frog that respond to small moving objects might act as such bug detectors (Barlow 1953; Lettvin et al. 1959).

Such feature detectors must be related to the specific requirements of particular species in particular ecological niches, but feature detection may have a more general role in perception and classification. A clue to their significance in object recognition may be found in the early attempts by computer scientists to recognize alphanumeric characters (Grimsdale et al. 1959; Selfridge and Neisser 1960; Kamentsky and Liu 1963). It was found that fixed templates, one for each letter, perform very badly because the representation of the same character varies in different fonts. Performance could be much improved by detecting the features (bars, loops, intersections etc.) that make up the characters, for latitude could then be allowed in the positioning of these relative to each other. This is the germ of an idea that seems to provide a good qualitative explanation for the feature detectors found at successive levels in the visual system: operations that restrict response or increase selectivity for one aspect of the stimulus are combined with operations that generalize or relax selectivity for another aspect. In the preceding example the components of letters vary less from font to font than the overall pattern of the letters, so the initial feature detectors can be rather selective. But having found that certain features are present, the system can be less demanding about how they are positioned relative to each other, and this achieves some degree of font-invariant character recognition.
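
A minimal Python sketch of this idea (the feature types, coordinates, and tolerance below are hypothetical, and this is not a reconstruction of any of the cited character-recognition systems): each letter is described by the features it must contain and their rough relative positions, and the positional tolerance is what buys a degree of font invariance.

```python
from math import hypot

# Features found in an input image by earlier feature detectors: (type, x, y).
detected = [("vertical_bar", 10, 12), ("loop", 11, 5)]

# A letter description: required feature types and their nominal positions
# relative to the first required feature.
LETTER_B = [("vertical_bar", 0, 0), ("loop", 0, -7)]

def matches(detected, description, tolerance=4.0):
    """True if every required feature is present near its expected position;
    the tolerance lets one description fit differently proportioned fonts."""
    if not detected:
        return False
    ax, ay = detected[0][1], detected[0][2]   # crude anchor: first detected feature
    for ftype, dx, dy in description:
        if not any(f == ftype and hypot(x - (ax + dx), y - (ay + dy)) <= tolerance
                   for f, x, y in detected):
            return False
    return True

print(matches(detected, LETTER_B))  # True: bar and loop found in roughly the expected arrangement
```

Being strict about which features are present while lax about exactly where they sit is the same combination of selectivity and generalization discussed above.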

In the primate visual system some retinal ganglion cells are excited by a single foveal cone, so they are very selective for position. Several of these connect, through the lateral geniculate nucleus, to a single neuron in primary visual cortex, but the groups that so connect are arranged along lines: the cortical neuron thus maintains selectivity for position orthogonal to the line, but relaxes selectivity and summates along the line (Hubel and Wiesel 1962). This makes each unit selectively responsive to lines of a particular orientation, and such units may be combined at later stages to generalize in various ways.
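
The wiring scheme can be sketched in a few lines of Python (a toy with invented subunit positions and threshold, not a model of actual cortical circuitry): summing position-selective subunits whose centres lie along a line yields a unit that responds to a bar aligned with the line but not to one crossing it.

```python
import numpy as np

def oriented_unit(image, centers, threshold=3.0):
    """Toy orientation-selective unit: sums the responses of position-selective
    subunits whose centres lie along a line, so a bar covering the line drives
    it strongly, while a bar crossing the line excites only one subunit and
    stays below threshold."""
    total = sum(image[y, x] for (y, x) in centers)
    return total if total >= threshold else 0.0

img = np.zeros((7, 7))
img[:, 3] = 1.0                       # a vertical bar in column 3
centers = [(2, 3), (3, 3), (4, 3)]    # subunit centres along a vertical line
print(oriented_unit(img, centers))    # 3.0: bar aligned with the line

img2 = np.zeros((7, 7))
img2[3, :] = 1.0                      # a horizontal bar through row 3
print(oriented_unit(img2, centers))   # 0.0: only one subunit is activated
```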

The best-described examples of units that generalize for position are provided by the cortical neurons of area MT or V5 that specialize in the analysis of image motion: these collect together information from neurons in cortical area V1 that come from a patch several degrees in diameter in the visual field (Newsome et al. 1990; Raiguel et al. 1995), but all the neurons converging on one MT neuron signal movements of similar direction and velocity. Thus all the information about motion with a particular direction and velocity occurring in a patch of the visual field is pooled onto a single MT neuron, and such neurons have been shown to be as sensitive to weak motion cues as the intact, behaving animal (Newsome, Britten, and Movshon 1989). Possibly the whole sensory cortex should be viewed as an immense bank of tuned filters, each collecting the information that enables it to detect with high sensitivity the occurrence of a patterned feature having characteristics lying within a specific range (Barlow and Tripathy 1997). The existence of this enormous array of near-optimal detectors, all matched to stimuli of different characteristics, would explain why the mammalian visual system can perform detection and discrimination tasks with a sensitivity and speed that computer vision finds hard to emulate.
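
The sensitivity benefit of this kind of pooling can be illustrated with a toy Python calculation (the number of local signals and the noise level are invented, and this is not a model of MT physiology): averaging many noisy local motion signals that share a direction preference makes a weak coherent signal stand out that no single local detector could report reliably.

```python
import numpy as np

rng = np.random.default_rng(0)

# 200 local motion signals from one patch, each a noisy measurement that is
# only weakly biased in the rightward direction.
n_local = 200
weak_rightward_signal = 0.1
local_signals = weak_rightward_signal + rng.normal(0.0, 1.0, n_local)

# A single local detector is unreliable...
print("one local signal:", local_signals[0])
# ...but pooling averages out independent noise (roughly as 1/sqrt(n)),
# so the weak coherent motion emerges in the pooled response.
print("pooled response:", local_signals.mean())
```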

Another aspect of the problem is currently arousing interest: Why do we have detectors for some features, but not others? What property of a spatio-temporal pattern makes it desirable as a feature? A suggestion (Barlow 1972; Field 1994) currently receiving some support (Bell and Sejnowski 1995; Olshausen and Field 1996) is that the feature detectors we possess are able to create a rather complete representation of the current sensory scene using the principle of sparse coding; this means that at any one time only a small selection of all the units is active, yet this small number firing in combination suffices to represent the scene effectively. The types of feature that will achieve this double criterion, sparsity with completeness, can be described as suspicious coincidences: they are local patterns in the image that would be expected, from the probabilities of their constituent elements, to occur rarely, but in fact occur more commonly.
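
Sparsity with completeness can be illustrated with a short Python sketch (a generic greedy matching-pursuit with invented patch and dictionary sizes, not the learning procedure of the cited papers): a patch is represented by only a handful of active units drawn from a larger dictionary of candidate features, yet those few, firing in combination, reconstruct the patch.

```python
import numpy as np

rng = np.random.default_rng(1)

patch_dim, n_features = 16, 64                    # overcomplete dictionary
dictionary = rng.normal(size=(patch_dim, n_features))
dictionary /= np.linalg.norm(dictionary, axis=0)  # unit-norm candidate features

# A patch genuinely composed of two dictionary features.
patch = 1.5 * dictionary[:, 3] - 0.8 * dictionary[:, 40]

# Greedy matching pursuit: allow only a few units to become active.
code = np.zeros(n_features)
residual = patch.copy()
for _ in range(3):
    responses = dictionary.T @ residual           # how well each feature explains the residual
    i = np.argmax(np.abs(responses))
    code[i] += responses[i]
    residual -= responses[i] * dictionary[:, i]

print("active units:", np.nonzero(np.abs(code) > 1e-6)[0])
print("reconstruction error:", np.linalg.norm(patch - dictionary @ code))
```

Only two or three of the sixty-four units end up active, yet together they account for nearly all of the patch: sparse, but still (nearly) complete.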

Sparse coding goes some way toward preventing accidental conjunctions of attributes, which is the basis for the so-called BINDING PROBLEM. Although sparsely coded features are not mutually exclusive, they nonetheless occur infrequently: hence accidental conjunctions of them will occur only very rarely, possibly no more often than genuine conjunctions actually occur.
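
To make the arithmetic behind this concrete (the probabilities are purely illustrative): if two sparsely coded features each occur in a given region with small probability, and they were statistically independent, their chance conjunction would be rarer still, for example

\[ P(\text{accidental conjunction}) = p_1 \, p_2 = 0.01 \times 0.01 = 10^{-4}, \]

far below the rate at which either feature occurs on its own.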

Like the basis functions that are used for image compression, those suitable for sparse coding achieve their result through being adapted to the statistical properties of natural images. This adaptation must be done primarily through evolutionary selection molding the detectors' pattern-selective mechanisms, though it is known that these are also modified by experience during the critical period of development of the visual system (Hubel and Wiesel 1970; Movshon and Van Sluyters 1981), and perhaps also through short-term processes of contingent adaptation (Barlow 1990). Feature detectors that exploit statistical properties of natural images in this way could provide a representation that is optimally up-to-date, minimizes the effects of delays in afferent and efferent pathways, and perhaps also achieves some degree of prediction (see also CEREBRAL CORTEX).

Although we are far from being able to give a complete account of the physiological mechanisms that underlie even the simplest examples of object recognition, the existence of feature detecting neurons, and these theories about their functional role, provide grounds for optimism.

See also

-- Horace Barlow

References

Barlow, H. B. (1953). Summation and inhibition in the frog's retina. Journal of Physiology, London 119:69-88.

Barlow, H. B. (1972). Single units and sensation: a neuron doctrine for perceptual psychology? Perception 1:371-394.

Barlow, H. B. (1990). A theory about the functional role and synaptic mechanism of visual after-effects. In C. B. Blakemore, Ed., Vision: Coding and Efficiency. Cambridge: Cambridge University Press.

Barlow, H. B., R. M. Hill, and W. R. Levick. (1964). Retinal ganglion cells responding selectively to direction and speed of motion in the rabbit. Journal of Physiology, London 173:377-407.

Barlow, H. B., and S. P. Tripathy. (1997). Correspondence noise and signal pooling as factors determining the detectability of coherent visual motion. Journal of Neuroscience 17:7954-7966.

Bell, A. J., and T. J. Sejnowski. (1995). An information maximisation approach to blind separation and blind deconvolution. Neural Computation 7:1129-1159.

Braddick, O. J., F. W. Campbell, and J. Atkinson. (1978). Channels in vision: basic aspects. In R. Held, H. W. Leibowitz, and H. L. Teuber, Eds., Handbook of Sensory Physiology. New York: Springer, pp. 1-38.

Evans, E. F. (1974). Feature- and call-specific neurons in auditory pathways. In F. O. Schmitt and F. G. Worden, Eds., The Neurosciences: Third Study Program. Cambridge, MA: MIT Press.

Field, D. J. (1994). What is the goal of sensory coding? Neural Computation 6:559-601.

Grimsdale, R. L., F. H. Sumner, C. J. Tunis, and T. Kilburn. (1959). A system for the automatic recognition of patterns. Proceedings of the Institute of Electrical Engineers, B 106:210-221.

Hubel, D. H., and T. N. Wiesel. (1962). Receptive fields, binocular interaction, and functional architecture in the cat's visual cortex. Journal of Physiology, London 160:106-154.

Hubel, D. H., and T. N. Wiesel. (1970). The period of susceptibility to the physiological effects of unilateral eye closure in kittens. Journal of Physiology, London 206:419-436.

Kamentsky, L. A., and C. N. Liu. (1963). Computer-automated design of multifont print recognition logic. IBM Journal of Research and Development 7:2-13.

Lettvin, J. Y., H. R. Maturana, W. S. McCulloch, and W. H. Pitts. (1959). What the frog's eye tells the frog's brain. Proceedings of the Institute of Radio Engineers 47:1940-1951.

Levick, W. R. (1967). Receptive fields and trigger features of ganglion cells in the visual streak of the rabbit's retina. Journal of Physiology, London 188:285-307.

Lorenz, K. (1961). King Solomon's Ring. Trans. M. K. Wilson. Cambridge: Cambridge University Press.

Movshon, J. A., and R. C. Van Sluyters. (1981). Visual neural development. Annual Review of Psychology 32:477-522.

Newsome, W. T., K. H. Britten, and J. A. Movshon. (1989). Neuronal correlates of a perceptual decision. Nature 341:52-54.

Newsome, W. T., K. H. Britten, C. D. Salzman, and J. A. Movshon. (1990). Neuronal mechanisms of motion perception. Cold Spring Harbor Symposia on Quantitative Biology 55:697-705.

Olshausen, B. A., and D. J. Field. (1996). Emergence of simple-cell receptive-field properties by learning a sparse code for natural images. Nature 381:607-609.

Raiguel, S., M. M. Van Hulle, D.-K. Xiao, V. L. Marcar, and G. A. Orban. (1995). Shape and spatial distribution of receptive fields and antagonistic motion surrounds in the middle temporal area (V5) of the macaque. European Journal of Neuroscience 7:2064-2082.

Selfridge, O., and U. Neisser. (1960). Pattern recognition by machine. Scientific American 203(2):60-68.

Suga, N. (1994). Multi-function theory for cortical processing of auditory information: implications of single-unit and lesion data for future research. Journal of Comparative Physiology A 175:135-144.

Tinbergen, N. (1953). The Herring Gull's World. London: Collins.

Waterman, T. H., and C. A. G. Wiersma. (1963). Electrical responses in decapod crustacean visual systems. Journal of Cellular and Comparative Physiology 61:1-16.

Further Readings

Ballard, D. H. (1997). An Introduction to Natural Computation. Cambridge, MA: MIT Press.

Barlow, H. B. (1995). The neuron doctrine in perception. In M. Gazzaniga, Ed., The Cognitive Neurosciences. Cambridge, MA: MIT Press, pp. 415-435.