Pictorial Art and Vision

Pictorial art attempts to capture the three-dimensional structure of a scene -- some chosen view of particular objects, people, or a landscape. The artist's goal is to convey a message about the world around us, but we can also find in art a message about the workings of the brain. Many look to art for examples of pictorial depth cues -- perspective, occlusion, TEXTURE gradients, and so on -- as these are the only cues available for depth in pictures. DEPTH PERCEPTION based on binocular disparity, vergence, and accommodation is inappropriate for the depths depicted, and head movements no longer provide new views of the scene. However, pictorial cues are abundant in real scenes -- that is why they work in pictures -- and there is no obvious benefit in studying their effectiveness in art as opposed to their effectiveness in natural scenes.

And yet pictorial art can tell us a great deal about vision and the brain if we pay attention to the ways in which paintings differ from the scenes they depict. First of all, we learn that artists get away with a great deal -- impossible colors, inconsistent shading and shadows, inaccurate perspective, the use of lines to stand for sharp discontinuities in depth or brightness. These representational "errors" do not prevent human observers from perceiving robust three-dimensional forms. Art that captures the three-dimensional structure of the world without merely recreating or copying it offers a revealing glimpse of the short cuts and economies of the inner codes of vision. The nonveridicality of representation in art is so commonplace that we seldom question the reason why it works.

Figure 1 (a) An early example of outline drawing from France. (b) As you view this image from different angles, the changes in the distance from face to hand and in the shape of the head are subtle. A 3-D computer model of this scene would require large-scale relative motions and 3-D shape changes to maintain the 2-D view seen with changing viewpoints. (c) Impossible lighting, highlights, or shadows (note the overlapping cast shadows at the bottom) are difficult to spot in paintings, implying that human observers use a simplistic local model of light and shade.

A line drawing of a building or an elephant can convey its 3-D structure very convincingly, but remember that there are no lines in the real world corresponding to the lines used in the drawings. The surface occlusions, folds, or creases that are represented by lines in drawings are revealed by changes in, say, brightness or texture in the real world, and these changes have one value extending on one side and a different value on the other. This is not a line. It is not obvious why lines should work at all. The effectiveness of line drawings is not based simply on learned convention, passed on through our culture. This point has been controversial (Kennedy 1975; see Deregowski 1989, and its following comments), but most recent evidence suggests that line drawings are universally interpreted in the same way -- infants (Yonas and Arterberry 1994), stone-age tribesmen (Kennedy and Ross 1975), and even monkeys (Itakura 1994) appear to be capable of interpreting line drawings as we do. Nor is it the case that the lines in line drawings just trace the brightness discontinuities in the image, because this type of representation is rendered meaningless by the inclusion of cast shadow and pigment contours. By a quirk of design or an economy of encoding, lines may be directly activating the internal code for object structure, but only object contours can be present in the drawing for this shortcut to work. The shortcut, discovered and exploited by artists, hints at the simplicity of the internal code that underlies the vision of 3-D structures. This code is both simpler than the 2½-D sketch of David MARR and sparser than the compact, reversible codes (Olshausen and Field 1996) that may reflect the workings of early areas of VISUAL CORTEX. Both artists and brains have found out which are the key contours necessary to represent the essential structure of an object. By studying the nature of lines used in line drawings, scientists too may eventually join this group.
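The difference between an edge in a scene and a line in a drawing can be made concrete with a one-dimensional luminance profile. The sketch below is only an illustration: the luminance values, the threshold, and the function names are assumptions chosen for the example, not measurements or a model of visual processing. It shows that an edge is a single step between two levels, whereas a drawn line is a pair of opposite steps with the same level on both sides.

```python
def first_differences(profile):
    """Change in luminance between each pair of neighbouring samples."""
    return [b - a for a, b in zip(profile, profile[1:])]


def transitions(profile, threshold=0.1):
    """Signed luminance jumps larger than `threshold`, left to right."""
    return [round(d, 3) for d in first_differences(profile) if abs(d) > threshold]


# Hypothetical profiles: luminance sampled along a row of an image.
edge_in_scene = [0.2] * 5 + [0.8] * 5              # dark surface, then light surface
line_in_drawing = [0.9] * 4 + [0.1] + [0.9] * 5    # paper, ink stroke, paper

print(transitions(edge_in_scene))    # one jump: the two sides differ
print(transitions(line_in_drawing))  # two opposite jumps: both sides are alike
```

The edge yields one transition and the line yields two of opposite sign, which is why a line cannot be a literal copy of the luminance change it stands for.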

Another fact, as commonplace and as informative as the effectiveness of lines, is that pictures are flat and yet provide consistent, apparently 3-D interpretations from a wide range of viewpoints. This is not only convenient for the artist, but also prime evidence that our impressions of a 3-D world are not supported by true, 3-D internal representations. If we had real 3-D vision, the scene depicted in a flat picture would have to distort grotesquely in 3-D space as we moved about the picture. To the contrary, however, objects in pictures seem reassuringly the same as we change our vantage point (with some interesting exceptions; see Gregory 1994). We probably do not experience these distortions because the visual system does not generate a true 3-D representation of the object. The representation has some qualities of three dimensions but it is far from Euclidean. It may follow some other geometry, affine or nonmetric in nature (Todd and Reichel 1989; Busey, Brady, and Cutting 1990). The effectiveness of flat images is of course a boon to artists, who do not have to worry about special vantage points, and to filmmakers, who can have theaters with more than one seat in them. It is also of great importance for understanding the internal representations of objects and space.
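The claim that a truly 3-D reading of a flat picture would have to distort as the viewer moves can be checked with a few lines of projective geometry. What follows is a minimal sketch under stated assumptions, not a model taken from the work cited above: the picture plane is z = 0, the eye is a single point in front of it (negative z), and, because a picture does not fix depth along a line of sight, each depicted point is assumed to keep its original depth behind the picture. All function names are hypothetical.

```python
def project(point, eye):
    """Picture-plane (z = 0) position of a 3-D point as seen from `eye`."""
    (x, y, z), (ex, ey, ez) = point, eye
    t = -ez / (z - ez)  # ray parameter where the eye-to-point ray crosses z = 0
    return (ex + t * (x - ex), ey + t * (y - ey))


def back_project(pixel, eye, depth):
    """3-D point at z = depth on the ray from `eye` through a picture point."""
    (px, py), (ex, ey, ez) = pixel, eye
    s = (depth - ez) / (0.0 - ez)  # ray parameter where the ray reaches z = depth
    return (ex + s * (px - ex), ey + s * (py - ey), depth)


def implied_scene(points, painted_from, viewed_from):
    """Scene a geometrically exact observer at `viewed_from` would infer.

    Assumption: each point keeps its original depth (its z value).
    """
    pixels = [project(p, painted_from) for p in points]
    return [back_project(px, viewed_from, p[2]) for px, p in zip(pixels, points)]


# An edge receding straight back from the picture plane.
edge = [(1.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
painter = (0.0, 0.0, -2.0)

print(implied_scene(edge, painter, painter))           # the original edge
print(implied_scene(edge, painter, (1.0, 0.0, -2.0)))  # the edge, now slanted
```

With these numbers, stepping one unit to the right turns the edge from (1, 0, 0)-(1, 0, 2) into (1, 0, 0)-(0, 0, 2): the far end slides sideways by an amount proportional to its depth, so the implied scene shears with every change of vantage point. Observers report no such shearing, which is the point of the argument.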

Finally, consider the enormous range of discrepancies between light and shade in the world and their renditions in art. When light and shade were introduced into art about 2,200 years ago, it was through the use of local techniques such as lightening a surface fold to make it come forward (a Greek technique described by Pliny the Elder; see Gombrich 1976 for a beautiful reinterpretation of this ancient presentation of painting techniques). These local techniques of shading, shadows, and highlights were applied with little thought to making them all consistent with a given light source -- and yet they all work very well. Even 500 years ago, when the geometry of perspective was well understood, the geometry of light was still ignored. The resulting errors in light and shadow would be caught immediately by any analysis based on physical optics, but pass unnoticed by human observers. Modern artists, with a full understanding of the physics of light and shade available to them, often still choose inconsistent lighting, either because consistency never matters much or perhaps because the result looks better.

Evidently, we as observers do not reconstruct a light source in order to recover the depth from shading and shadow; we do not act as optical geometers in the way that computer graphics programs can. We do not notice inconsistencies across different portions of a painting but recover depth cues locally. The message here is that in the real world, the information is rich and redundant, so we do not have to analyze the image much beyond a local region to resolve any ambiguities. When faced with the sparser cues of pictorial art, we do not adopt a larger region of analysis -- the local cues are meaningful, albeit inconsistent with cues in other areas of the painting. To the advantage of the artist, the inconsistencies go unnoticed. And again, like many aspects of art, this discrepancy between the art and the scene it depicts informs us about the brain within us as much as about the world around us.
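The kind of global check that an analysis based on physical optics would apply, and that observers apparently do not, can be sketched as follows. This is an illustration under simplifying assumptions, not a method from the article: the light source is taken to be distant, positions are ground-plane coordinates seen from above (so that all cast shadows from one source should be parallel), and the function names and the 5-degree tolerance are hypothetical.

```python
import math


def implied_light_azimuth(base, shadow_tip):
    """Azimuth in degrees from which light must arrive to cast this shadow."""
    dx = shadow_tip[0] - base[0]
    dy = shadow_tip[1] - base[1]
    # The light lies on the opposite side of the object from its shadow.
    return math.degrees(math.atan2(-dy, -dx)) % 360.0


def angular_difference(a, b):
    """Smallest angle in degrees between two azimuths."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def globally_consistent(shadows, tolerance_deg=5.0):
    """True if every shadow is compatible with one distant light source."""
    azimuths = [implied_light_azimuth(base, tip) for base, tip in shadows]
    if len(azimuths) < 2:
        return True  # a single shadow always fits some light source
    return all(angular_difference(azimuths[0], a) <= tolerance_deg
               for a in azimuths[1:])


# Hypothetical paintings, each a list of (object base, shadow tip) pairs.
one_source = [((0, 0), (1, 1)), ((5, 0), (6, 1))]
two_sources = [((0, 0), (1, 1)), ((5, 0), (4, 1))]

print(globally_consistent(one_source))
print(globally_consistent(two_sources))
print(all(globally_consistent([s]) for s in two_sources))
```

The second painting fails the global test, yet each of its shadows passes when examined alone. Taken one region at a time, nothing is wrong, which is one way to picture how inconsistent lighting could go unnoticed.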

-- Patrick Cavanagh

References


Busey, T. A., N. P. Brady, and J. E. Cutting. (1990). Compensation is unnecessary for the perception of faces in slanted pictures. Perception and Psychophysics 48:1-11.

Deregowski, J. B. (1989). Real space and represented space: Cross-cultural perspectives. Behavioral and Brain Sciences 12:51-119.

Gombrich, E. H. (1976). The Heritage of Apelles. Oxford: Phaidon Press.

Gregory, R. (1994). Experiments for a desert island. Perception 23:1389-1394.

Itakura, S. (1994). Recognition of line-drawing representations by a chimpanzee (Pan troglodytes). Journal of General Psychology 121:189-197.

Kennedy, J. M. (1975). Drawings were discovered, not invented. New Scientist 67:523-527.

Kennedy, J. M., and A. S. Ross. (1975). Outline picture perception by the Songe of Papua. Perception 4:391-406.

Olshausen, B. A., and D. J. Field. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381:607-609.

Todd, J. T., and F. D. Reichel. (1989). Ordinal structure in the visual perception and cognition of smoothly curved surfaces. Psychological Review 96:643-657.

Yonas, A., and M. E. Arterberry. (1994). Infants perceive spatial structure specified by line junctions. Perception 23:1427-1435.

Further Readings

Gombrich, E. H. (1960). Art and Illusion. Princeton: Princeton University Press.

Gregory, R., J. Harris, P. Heard, and D. Rose. (1995). The Artful Eye. Oxford: Oxford University Press.

Kennedy, J. M. (1974). The Psychology of Picture Perception. San Francisco: Jossey-Bass Inc.

Maffei, L., and A. Fiorentini. (1995). Arte e Cervello. Bologna: Zanichelli Editore.

Willats, J. (1997). Art and Representation. Princeton: Princeton University Press.