Audition

Audition refers to the perceptual experience associated with stimulation of the sense of hearing. For humans, the sense of hearing is stimulated by acoustical energy, in the form of sound waves, that enters the outer ear (pinna and external auditory meatus) and sets into vibration the eardrum and the attached bones (ossicles) of the middle ear, which transfer the mechanical energy to the inner ear, the cochlea. The auditory system can also be stimulated by bone conduction (Tonndorf 1972) when the sound source causes the bones of the skull to vibrate (e.g., one's own voice may be heard by bone conduction). Mechanical energy is transduced into neural impulses within the cochlea through the stimulation of the sensory hair cells, which synapse on the eighth cranial, or auditory, nerve. In addition to the ascending, or afferent, auditory pathway from the cochlea to the cortex, there is a descending, or efferent, pathway from the brain to the cochlea, although the functional significance of the efferent pathway is not well understood at present (Brugge 1992). Immediately following stimulation, the auditory system may become less sensitive due to adaptation or fatigue (Ward 1973), and prolonged high-intensity stimulation can damage the sensory process (noise-induced hearing loss; for a series of review articles, see J. Acoust. Soc. Am. 1991, vol. 90: 124-227).

The auditory system is organized tonotopically such that the frequency of a stimulating sound is mapped onto a location along the basilar membrane within the cochlea, providing a place code (cf. auditory physiology). For example, low-frequency tones lead to maximal displacement of the apical portion of the basilar membrane and high-frequency tones lead to maximal displacement of the basal portion of the basilar membrane. In addition, cells exhibit frequency selectivity throughout the auditory pathway (e.g., Pickles 1988). This tonotopic organization provides a basis for spectral analysis of sounds. Temporal aspects of the stimulus (waveform fine structure or envelope) are preserved in the pattern of activity of auditory nerve fibers (Kiang et al., 1965), providing a basis for the coding of synchronized activity both across frequency and across the two ears. The dual presence of place and timing cues is pervasive in models of auditory perception.
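The place code described above can be made concrete with a small numerical sketch. The mapping below uses a commonly cited empirical place-frequency fit for the human cochlea (often called the Greenwood function). Its constants are assumptions brought in from outside this article, and the function names are purely illustrative. Position is expressed as a fraction of basilar-membrane length, with 0 at the apex and 1 at the base.

```python
import math

# Assumed constants of an empirical place-frequency fit for the human cochlea.
# They are not given in this article and serve only as an illustration.
A_HZ = 165.4
ALPHA = 2.1
K = 0.88


def place_to_frequency(x):
    """Characteristic frequency (Hz) at relative position x (0 = apex, 1 = base)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 (apex) and 1 (base)")
    return A_HZ * (10.0 ** (ALPHA * x) - K)


def frequency_to_place(frequency_hz):
    """Relative position (0 = apex, 1 = base) that responds best to frequency_hz."""
    return math.log10(frequency_hz / A_HZ + K) / ALPHA


for f in (100.0, 1000.0, 10000.0):
    print(f"{f:8.0f} Hz -> relative position {frequency_to_place(f):.2f}")
```

Under these assumed constants, low frequencies map to positions near the apex and high frequencies to positions near the base, in line with the tonotopic organization described in the text.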

The percept associated with a particular sound might be described in a variety of ways, but descriptions in terms of pitch, loudness, timbre, and perceived spatial location are probably the most common (Blauert 1983; Yost 1994; Moore 1997). Pitch is most closely associated with sound frequency, or the fundamental frequency for complex periodic sounds; loudness is most closely associated with sound intensity; and timbre is most closely associated with the distribution of acoustic energy across frequency (i.e., the shape of the power spectrum). The perceived location of a sound in space (direction and distance) is based primarily on the comparison of the sound arriving at the two ears (binaural hearing) and the acoustical filtering associated with the presence of the head and pinnae. Each of these perceptual classifications also depends on other factors, particularly when complex, time-varying sounds are being considered.

The frequency range of human hearing extends from a few cycles per second (Hertz, abbreviated Hz) to about 20,000 Hz, although the upper limit of hearing decreases markedly with age (e.g., Weiss 1963; Stelmachowicz et al. 1989). The intensity range of human hearing extends over many orders of magnitude depending on frequency; at 2-4 kHz, the range may be greater than twelve orders of magnitude (120 decibels, abbreviated dB).
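The correspondence between twelve orders of magnitude and 120 dB follows from the definition of the decibel as ten times the base-10 logarithm of an intensity ratio. A minimal sketch of that arithmetic is given below; the function names are illustrative only.

```python
import math


def intensity_ratio_to_db(ratio):
    """Level difference in dB corresponding to a ratio of two sound intensities."""
    return 10.0 * math.log10(ratio)


def db_to_intensity_ratio(level_db):
    """Intensity ratio corresponding to a level difference in dB."""
    return 10.0 ** (level_db / 10.0)


def db_to_pressure_ratio(level_db):
    """Sound-pressure ratio for a level difference in dB (intensity ~ pressure squared)."""
    return 10.0 ** (level_db / 20.0)


print(intensity_ratio_to_db(1e12))   # twelve orders of magnitude in intensity
print(db_to_pressure_ratio(120.0))   # the same range expressed as a pressure ratio
```

Because intensity is proportional to the square of sound pressure, the same 120 dB range corresponds to six orders of magnitude in pressure.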

Despite the wide dynamic range of human hearing, the auditory system is remarkably acute: the just-discriminable difference (JND) in frequency is as small as 0.2 percent (e.g., Wier, Jesteadt, and Green 1977) and in intensity is approximately one dB (e.g., Jesteadt, Wier, and Green 1977). Sensitivity to differences in sounds arriving at the two ears is perhaps even more remarkable: time delays as small as a few microseconds may be discerned (Klumpp and Eady 1956). Although behavioral estimates of the JND for intensity, frequency, etc., provide invaluable information regarding the basic properties of the human auditory system, it is important to keep in mind that estimates of JNDs depend on both sensory and nonsensory factors such as memory and attention (e.g., Harris 1952; Durlach and Braida 1969; Berliner and Durlach 1972; Howard et al. 1984).
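To give a sense of scale for these thresholds, the sketch below converts them into physical quantities. The 1000 Hz reference tone, the 10 microsecond interaural delay, and the speed of sound (343 m/s, roughly air at room temperature) are illustrative assumptions rather than values taken from the studies cited, and actual JNDs vary with frequency, level, and listening conditions.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at about 20 degrees C


def frequency_jnd_hz(frequency_hz, relative_jnd=0.002):
    """Frequency difference (Hz) for a relative JND (0.002 = 0.2 percent)."""
    return frequency_hz * relative_jnd


def intensity_change_percent(jnd_db=1.0):
    """Percentage increase in intensity that corresponds to a level change in dB."""
    return (10.0 ** (jnd_db / 10.0) - 1.0) * 100.0


def path_difference_mm(interaural_delay_s):
    """Difference in path length (mm) that produces the given interaural delay."""
    return SPEED_OF_SOUND_M_PER_S * interaural_delay_s * 1000.0


print(f"0.2 percent of 1000 Hz: {frequency_jnd_hz(1000.0):.1f} Hz")
print(f"1 dB in intensity:      {intensity_change_percent(1.0):.1f} percent")
print(f"10 microsecond delay:   {path_difference_mm(10e-6):.2f} mm of path")
```

Under these assumptions, a 0.2 percent JND at 1000 Hz is a difference of 2 Hz, a 1 dB step is an intensity change of roughly one quarter, and a delay of 10 microseconds corresponds to a path-length difference of only a few millimeters.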

The interference one sound causes in the reception of another sound is called masking. Masking has a peripheral component resulting from interfering/overlapping patterns of excitation in the auditory nerve (e.g., Greenwood 1961), and a central component due to uncertainty, sometimes called "informational masking" (Watson 1987; see also auditory attention). In a classic experiment, Fletcher (1940) studied the masking of a tone by noise in order to evaluate the frequency selectivity of the human auditory system. To account for the obtained data, Fletcher proposed a "critical band" that likened the ear to a bandpass filter (or, to encompass the entire frequency range, a set of contiguous, overlapping bandpass filters). This proposed "auditory filter" is a theoretical construct that reflects frequency selectivity present in the auditory system, and, in one form or another, auditory filters comprise a first stage in models of the spectrotemporal (across frequency and time) analysis performed by the auditory system (e.g., Patterson and Moore 1986).
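The logic of the critical-band idea can be sketched under strong simplifying assumptions: a rectangular auditory filter centered on the tone, a flat-spectrum noise masker also centered on the tone, and masking determined only by the noise power that passes through the filter. The filter bandwidth and noise spectrum level used below are hypothetical values chosen for illustration, not measurements.

```python
import math


def masker_level_in_filter_db(spectrum_level_db, noise_bandwidth_hz, filter_bandwidth_hz):
    """Noise level (dB) passed by an idealized rectangular filter centered on the tone.

    Only the part of the noise that falls inside the filter contributes, so the
    effective bandwidth is the smaller of the noise and filter bandwidths.
    """
    effective_bandwidth_hz = min(noise_bandwidth_hz, filter_bandwidth_hz)
    return spectrum_level_db + 10.0 * math.log10(effective_bandwidth_hz)


FILTER_BANDWIDTH_HZ = 160.0  # hypothetical filter width, for illustration only
SPECTRUM_LEVEL_DB = 40.0     # hypothetical noise spectrum level (dB per Hz)

for noise_bandwidth_hz in (50.0, 100.0, 200.0, 400.0, 800.0):
    level_db = masker_level_in_filter_db(
        SPECTRUM_LEVEL_DB, noise_bandwidth_hz, FILTER_BANDWIDTH_HZ
    )
    print(f"noise bandwidth {noise_bandwidth_hz:5.0f} Hz -> effective masker level {level_db:5.1f} dB")
```

In this idealization, the effective masker level rises by about 3 dB per doubling of noise bandwidth until the noise is as wide as the filter, and then stops growing. Realistic auditory filters are not rectangular, so the transition is more gradual in practice.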

The separation of sound into multiple frequency channels is not sufficient to provide a solution to the problem of sound segregation. Sound waves from different sources simply add, meaning that the frequencies shared by two or more sounds are processed en masse at the periphery. In order to form distinct images, the energy at a single frequency must be appropriately parsed. The computations used to achieve sound segregation depend on the coherence/incoherence of sound onsets, the shared/unshared spatial location of the sound sources, differences in the harmonic structure of the sounds and other cues in the physical stimulus. Yost has proposed that the spectrotemporal and spatial-location analysis performed by the auditory system serves the purpose of sound source determination (Yost 1991) and allows the subsequent organization of sound images into an internal map of the acoustic environment (Bregman 1990).
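The point that sounds from different sources simply add can be illustrated numerically. In the sketch below, two hypothetical sources each contain a 1000 Hz component (with different amplitudes and phases) along with one component that is unique to that source. All frequencies, amplitudes, phases, and the sampling parameters are arbitrary choices for illustration.

```python
import cmath
import math

SAMPLE_RATE_HZ = 16000
NUM_SAMPLES = 1600  # 0.1 s, a whole number of periods for every frequency used below


def synthesize(components):
    """Sum of cosines; components is a list of (frequency_hz, amplitude, phase_rad)."""
    return [
        sum(a * math.cos(2.0 * math.pi * f * n / SAMPLE_RATE_HZ + p) for f, a, p in components)
        for n in range(NUM_SAMPLES)
    ]


def amplitude_at(signal, frequency_hz):
    """Amplitude of the component at frequency_hz, estimated by projection onto a complex tone."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * frequency_hz * n / SAMPLE_RATE_HZ)
        for n, x in enumerate(signal)
    )
    return 2.0 * abs(acc) / len(signal)


# Two hypothetical sources that share energy at 1000 Hz.
source_a = synthesize([(1000.0, 1.0, 0.0), (1500.0, 0.5, 0.0)])
source_b = synthesize([(1000.0, 0.5, math.pi / 2.0), (2200.0, 0.8, 0.0)])

# The waveform reaching the ear is the sample-by-sample sum of the two sources.
mixture = [a + b for a, b in zip(source_a, source_b)]

for f in (1000.0, 1500.0, 2200.0):
    print(f"{f:6.0f} Hz: A={amplitude_at(source_a, f):.3f}  "
          f"B={amplitude_at(source_b, f):.3f}  mixture={amplitude_at(mixture, f):.3f}")
```

At 1000 Hz the mixture contains a single component whose amplitude depends on both sources and their relative phase. The spectrum of the mixture alone therefore does not reveal how much of that energy belongs to each source, which is why additional cues such as onset coherence, spatial location, and harmonic structure are needed.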

Approximately 28 million people in the United States suffer from hearing loss, and a recent census indicated that deafness and other hearing impairments ranked sixth among chronic conditions reported (National Center for Health Statistics 1993). Among those aged sixty-five and older, deafness and other hearing impairments ranked third among chronic conditions. The assessment of function and nonmedical remediation of hearing loss is typically performed by an audiologist, whereas the diagnosis and treatment of ear disease is performed by an otologist.

See also

AUDITORY PLASTICITY; PHONOLOGY, ACQUISITION OF; PSYCHOPHYSICS; SIGN LANGUAGE AND THE BRAIN; SPEECH PERCEPTION

-- Virginia M. Richards and Gerald D. Kidd, Jr.

References

Berliner, J. E., and N. I. Durlach. (1972). Intensity perception IV. Resolution in roving-level discrimination. J. Acoust. Soc. Am. 53:1270-1287.

Blauert, J. (1983). Spatial Hearing. Cambridge, MA: MIT Press.

Bregman, A. S. (1990). Auditory Scene Analysis. Cambridge, MA: MIT Press.

Brugge, J. F. (1992). An overview of central auditory processing. In A. N. Popper and R. R. Fay, Eds., The Mammalian Auditory Pathway: Neurophysiology. New York: Springer.

Durlach, N. I., and L. D. Braida. (1969). Intensity perception I. Preliminary theory of intensity resolution. J. Acoust. Soc. Am. 46:372-383.

Fletcher, H. (1940). Auditory patterns. Rev. Mod. Phys. 12:47-65.

Greenwood, D. D. (1961). Auditory masking and the critical band. J. Acoust. Soc. Am. 33:484-502.

Harris, J. D. (1952). The decline of pitch discrimination with time. J. Exp. Psych. 43:96-99.

Howard, J. H., A. J. O'Toole, R. Parasuraman, and K. B. Bennett. (1984). Pattern-directed attention in uncertain-frequency detection. Percept. Psychophys. 35:256-264.

Jesteadt, W., C. C. Wier, and D. M. Green. (1977). Intensity discrimination as a function of frequency and sensation level. J. Acoust. Soc. Am. 61:169-177.

Kiang, N. Y-S., T. Watanabe, E. C. Thomas, and L. F. Clark. (1965). Discharge Patterns of Single Fibers in the Cat's Auditory Nerve. Cambridge, MA: MIT Press.

Klumpp, R., and H. Eady. (1956). Some measurements of interaural time difference thresholds. J. Acoust. Soc. Am. 28:859-864.

Moore, B. C. J. (1997). An Introduction to the Psychology of Hearing. Fourth edition. London: Academic Press.

National Center for Health Statistics (1993). Vital statistics: prevalence of selected chronic conditions: United States 1986-1988, Series 10. Data from National Health survey #182, USHHS, PHS.

Patterson, R. A., and B. C. J. Moore. (1986). Auditory filters and excitation patterns as representations of frequency resolution. In B. C. J. Moore, Ed., Frequency Selectivity in Hearing. New York: Academic Press.

Pickles, J. O. (1988). An Introduction to the Physiology of Hearing. Second edition. London: Academic Press.

Stelmachowicz, P. G., K. A. Beauchaine, A. Kalberer, and W. Jesteadt. (1989). Normative thresholds in the 8- to 20-kHz range as a function of age. J. Acoust. Soc. Am. 86:1384-1391.

Tonndorf, J. (1972). Bone conduction. In J. V. Tobias, Ed., Foundations of Modern Auditory Theory II. New York: Academic Press.

Ward, W. D. (1973). Adaptation and Fatigue. In J. Jerger, Ed., Modern Developments in Audiology. New York: Academic Press.

Watson, C. S. (1987). Uncertainty, informational masking, and the capacity of immediate auditory memory. In W. A. Yost and C. S. Watson, Eds., Auditory Processing of Complex Sounds. Hillsdale, NJ: Erlbaum.

Weiss, A. D. (1963). Auditory perception in relation to age. In J. E. Birren, R. N. Butler, S. W. Greenhouse, L. Sokoloff, and M. Tarrow, Eds., Human Aging: a Biological and Behavioral Study. Bethesda: NIMH.

Wier, C. C., W. Jesteadt, and D. M. Green. (1977). Frequency discrimination as a function of frequency and sensation level. J. Acoust. Soc. Am. 61:178-184.

Yost, W. A. (1991). Auditory image perception and analysis. Hear. Res. 56:8-18.

Yost, W. A. (1994). Fundamentals of Hearing: An Introduction. San Diego: Academic Press.

Further Readings

Gilkey, R. A., and T. R. Anderson. (1997). Binaural and Spatial Hearing in Real and Virtual Environments. Hillsdale, NJ: Erlbaum.

Green, D. M. (1988). Profile Analysis: Auditory Intensity Discrimination. Oxford: Oxford Science Publications.

Hamernik, R. P., D. Henderson, and R. Salvi. (1982). New Perspectives on Noise-Induced Hearing Loss. New York: Raven.

Hartmann, W. M. (1997). Signals, Sound and Sensation. Woodbury, NY: AIP Press.

NIH (1995). NIH Consensus Development Conferences on Cochlear Implants in Adults and Children. Bethesda: NIH.