Machine Translation

Machine translation (MT), which celebrated its fiftieth anniversary in 1997, uses computers to translate texts written in one human language, such as Spanish, into another human language, such as Ukrainian. In the ideal situation, sometimes abbreviated as FAHQMT (for Fully Automated High Quality Machine Translation), the computer program produces fully automatic, high-quality translations of text. Programs that assist human translators are called "machine-aided translators" (MATs).

MT is the intellectual precursor to the field of COMPUTATIONAL LINGUISTICS (also called NATURAL LANGUAGE PROCESSING), and shares interests with computer science (artificial intelligence), linguistics, and occasionally anthropology. Machine translation dates back to the work of Warren Weaver (1955), who suggested applying ideas from cryptography, as employed during World War II, and information theory, as outlined in 1948 by Claude Shannon, to language processing. Not surprisingly, the first large-scale MT project was funded by the U.S. government to translate Russian Air Force manuals into English. After an initial decade of naive optimism, the ALPAC (for Automatic Language Processing Advisory Committee) report (Pierce et al. 1966), issued by a government-sponsored study panel, put a damper on research in the United States for many years. Research and commercial development continued largely in Europe and after 1970 also in Japan. Today, over fifty companies worldwide produce and sell translation by computer, whether as translation services to outsiders, as in-house translation bureaus, or as providers of on-line multilingual chat rooms. Some 250 of the world's most widely spoken languages have been translated, at least in pilot systems. By some estimates, expenditures for MT in 1989 exceeded $20 million worldwide, involving 200-300 million pages per year (Wilks 1992).

Translation is not easy -- even humans find it hard to translate complex texts such as novels. Current technology produces output whose quality level varies from perfect (for very circumscribed domains with just a few hundred words) to hardly readable (for unrestricted domains requiring lexicons of a quarter million words or more). Research groups continue to investigate unsolved issues. Recent large government-sponsored research collaborations include CICC in Japan (Tsujii 1990), Eurotra in Europe (Johnson, King, and des Tombes 1985), and the DARPA MT effort in the United States (White et al. 1992-1994). (For reviews of the history, theory, and applications of MT, see Hutchins and Somers 1992; Nirenburg et al. 1992; and Hovy 1993; useful collections of papers can be found in AMTA 1996, 1994; CL 1985.)

Before producing their output, all MT systems perform some analysis of the input text. The degree of analysis largely determines what type of translation is being performed, and what the average output quality is. Generally, the more refined or "deeper" the analysis, the better the output quality. Three major levels of analysis are traditionally recognized:

  1. Direct replacement. The simplest systems perform very little analysis of the input, essentially replacing source language (input) words with equivalent target language (output) words, inflected as necessary for tense, number, and so on. When the source and target languages are fairly similar in structure and word use, as between Italian, Spanish, and French, this approach can produce surprisingly understandable results. However, as soon as the word order starts to differ (say, if the verb appears at the end of the sentence, as in Japanese), then some syntactic analysis is required. Modern research using this approach has focused on the semiautomated construction of large word and phrase correspondence "tables," extracting this information from human translations as examples (Nirenburg, Beale, and Domashnev 1994) or using statistics (Brown et al. 1990, 1993).
  2. Transfer. In order to produce grammatically appropriate translations, syntactic transfer systems include so-called parsers, transfer modules, and realizers or generators (see NATURAL LANGUAGE GENERATION). A parser is a computer program that accepts a natural language sentence as input and produces a parse tree of that sentence as output. A transfer module applies transfer rules to convert the source parse tree into a tree conforming to the requirements of the target language grammar -- for example, by shifting the verb from the end of the sentence (Japanese) to the second position (English). A realizer then converts the target tree back into a linear string of words in the target language, inflecting them as required. (For more details, see Kinoshita, Phillips, and Tsujii 1992; Somers et al. 1988; and Nagao 1987.)

    Unfortunately, syntactic analysis is often not enough. Effective translation may require the system to "understand" the actual meaning of the sentence. For example, "I am small" is expressed in many languages using the verb "to be," but "I am hungry" is often expressed using the verb "to have," as in "I have hunger." For the translation system to handle such cases (and their more complex variants), it needs to have information about the meaning of size, hunger, and so on (see SEMANTICS). Often such information is represented in case frames, small collections of attributes and their values. The translation system then requires an additional analysis module, usually called the "semantic analyzer," additional (semantic) transfer rules, and additional rules for the realizer. The semantic analyzer produces a case frame from the syntax tree, and the transfer module converts the case frame derived from the source language sentence into the case frame required for the target language.

  3. Interlinguas. Transfer systems are common, but they require a great deal of effort to build, because a distinct set of transfer rules must be created for each language pair in each direction: for N languages, one needs about N² sets of rules. The proposed solution is to create a single intermediate representation scheme, an interlingua, that captures the language-neutral meaning of any sentence in any language. Then only 2N sets of mappings are required -- from each language into the interlingua and back out again.

    This idea appeals to many. Despite numerous attempts, however, it has never yet been achieved on a large scale; all interlingual MT systems to date have been at the level of demonstrations (a few hundred lexical items) or prototypes (a few thousand). A great deal has been written about interlinguas, but no clear methodology exists for determining exactly how one should build a true language-neutral meaning representation, if such a thing is possible at all (Nirenburg et al. 1992; Dorr 1994; Hovy and Nirenburg 1992).
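The contrast between direct replacement and syntactic transfer described above can be made concrete with a toy sketch. Everything in it is an illustrative assumption rather than part of any system cited here: the three-word romanized Japanese lexicon, the treatment of the particle "wa" as a plain subject marker, and the function names. It omits inflection and articles, so the output is deliberately crude.

```python
# Toy Japanese (romanized) -> English lexicon; entries are illustrative only.
LEXICON = {"watashi": "I", "ringo": "apple", "taberu": "eat"}
# Particles mark grammatical roles (a simplification: "wa" marks the topic).
PARTICLES = {"wa": "subject", "o": "object"}
# Constituent order required by the target language (English).
TARGET_ORDER = ("subject", "verb", "object")


def direct_replace(sentence):
    """Level 1: word-for-word substitution with no structural analysis."""
    words = [w for w in sentence.split() if w not in PARTICLES]
    return " ".join(LEXICON.get(w, w) for w in words)


def parse(sentence):
    """Very shallow 'parser': a word followed by a particle takes that
    particle's role; the final word is taken to be the verb."""
    tokens = sentence.split()
    tree = {"verb": tokens[-1]}
    for word, following in zip(tokens, tokens[1:]):
        if following in PARTICLES:
            tree[PARTICLES[following]] = word
    return tree


def transfer(tree):
    """Transfer rule: substitute target words and rearrange the
    constituents from verb-final order into the target order."""
    return [(role, LEXICON.get(tree[role], tree[role]))
            for role in TARGET_ORDER if role in tree]


def realize(constituents):
    """Flatten the target structure into a string of words.
    A real realizer would also inflect the words."""
    return " ".join(word for _, word in constituents)


source = "watashi wa ringo o taberu"
print(direct_replace(source))             # I apple eat
print(realize(transfer(parse(source))))   # I eat apple
```

Direct replacement keeps the source word order, whereas the transfer pipeline moves the verb from the end of the sentence to second position, as in the Japanese-to-English example given under level 2.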
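The economic argument for an interlingua rests on simple counting, which the short sketch below works through. It assumes one transfer rule set per ordered language pair, which gives exactly N(N - 1), the quantity approximated in the text as N², and one mapping into the interlingua plus one mapping out of it per language. The function names are hypothetical.

```python
def transfer_rule_sets(n):
    """Rule sets for a transfer design: each of n languages
    into each of the other n - 1 languages."""
    return n * (n - 1)


def interlingua_mappings(n):
    """Mappings for an interlingua design: one into the
    interlingua and one back out, for each language."""
    return 2 * n


for n in (2, 3, 5, 10, 20):
    print(n, transfer_rule_sets(n), interlingua_mappings(n))
# 2 2 4
# 3 6 6
# 5 20 10
# 10 90 20
# 20 380 40
```

Under these assumptions the two designs break even at three languages, and the advantage of the interlingua grows rapidly after that.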

Machine translation applications fall into three types, two traditional and one more recent:

  1. Assimilation. People interested in tracking the multilingual world use MT systems for assimilation -- to produce (rough) translations of many externally created documents, from which they then select the ones of interest (and then possibly submit them for more refined, human translation). Typical users are commercial and government staff who monitor developments in areas of interest. The desired output quality need not be very high, but the MT system should cover a large domain, and it should be fast.
  2. Dissemination. People wishing to disseminate their own documents to the world in various languages use MT systems to produce the translations. Typical users are manufacturers such as Caterpillar, Honda, and Fujitsu. In this case, the desired output quality should be as high as possible, but the system need cover only the application domain, and speed is not generally a consideration.
  3. Interaction. People wanting to converse with others in foreign countries via E-mail or chat rooms use MT systems on-line to translate their messages. Typical users are chat room participants and business travelers setting up meetings and reserving hotel rooms. CompuServe currently supports MT for some of its highly popular chat rooms at the cost of one cent per word. The desired output quality should be as high as possible, given the requirements of system speed and relatively broad coverage.

A great deal of effort has been devoted to the evaluation of MT systems (see White et al. 1992-1994; AMTA 1992; Nomura 1992; Church and Hovy 1993; King and Falkedal 1990; Kay 1980; and Van Slype 1979). No single measure can capture all the aspects of a translation system. From the ultimate user's point of view, the major dimensions will probably be cost, output quality, range of coverage, and degree of automation, but numerous more specific evaluation metrics have also been developed. These range from system-internal aspects such as number of grammar rules and treatment of multisentence phenomena to user-related aspects such as the ability to extend the lexicon and the quality of the system's interface.

-- Eduard Hovy

References


AMTA (Association for Machine Translation in the Americas). (1992). MT Evaluation: Basis for Future Directions. San Diego, CA.

AMTA. (1994). Proceedings of the Conference of the AMTA. Columbia, MD.

AMTA. (1996). Proceedings of the Conference of the AMTA. Montreal, CAN.

Brown, P. F., J. Cocke, S. A. Della Pietra, V. J. Della Pietra, F. Jelinek, J. D. Lafferty, R. L. Mercer, and P. S. Roossin. (1990). A statistical approach to machine translation. Computational Linguistics 16(2):79-85.

Brown, P. F., S. Della Pietra, V. Della Pietra, and R. Mercer. (1993). The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics 19(2):263-311.

Church, K. W., and E. H. Hovy. (1993). Good applications for crummy machine translation. Machine Translation 8:239-258.

CL (Computational Linguistics). (1985). Special issues on machine translation. Vol. 11, nos. 2-3.

Dorr, B. J. (1994). Machine translation divergences: A formal description and proposed solution. Computational Linguistics 20(4):597-634.

Hovy, E. H. (1993). How MT works. Byte. January: 167-176. Special feature on machine translation.

Hovy, E. H., and S. Nirenburg. (1992). Approximating an interlingua in a principled way. In Proceedings of the DARPA Speech and Natural Language Workshop, New York: Arden House.

Hutchins, W. J., and H. Somers. (1992). An Introduction to Machine Translation. San Diego: Academic Press.

Johnson, R., M. King, and L. des Tombes. (1985). Eurotra: A multilingual system under development. Computational Linguistics 11(2-3):155-169.

Kay, M. (1980). The proper place of men and machines in language translation. Xerox PARC Research Report CSL-80-11. Palo Alto, CA: Xerox PARC.

King, M., and K. Falkedal. (1990). Using test suites in evaluation of machine translation systems. In Proceedings of the Eighteenth COLING Conference, vol. 2, pp. 435-447.

Kinoshita, S., J. Phillips, and J. Tsujii. (1992). Interactions between structural changes in machine translation. In Proceedings of the Twentieth COLING Conference, pp. 679-685.

Nagao, M. (1987). Role of structural transformation in a machine translation system. In S. Nirenburg, Ed., Machine Translation: Theoretical and Methodological Issues. Cambridge: Cambridge University Press, pp. 262-277.

Nirenburg, S., S. Beale, and C. Domashnev. (1994). A full-text experiment in example-based machine translation. In Proceedings of the International Conference on New Methods in Language Processing, pp. 95-103.

Nirenburg, S., J. C. Carbonell, M. Tomita, and K. Goodman. (1992). Machine Translation: A Knowledge-Based Approach. San Mateo, CA: Kaufmann.

Nomura, H. (1992). JEIDA Methodology and Criteria on Machine Translation Evaluation (JEIDA Report). Tokyo: Japan Electronic Industry Development Association.

Pierce, J. R., J. B. Carroll, E. P. Hamp, D. G. Hays, C. F. Hockett, A. G. Oettinger, and A. Perlis. (1966). Computers in Translation and Linguistics (ALPAC Report). National Academy of Sciences/National Research Council Publication 1416. Washington, DC: NAS Press.

Somers, H., H. Hirakawa, S. Miike, and S. Amano. (1988). The treatment of complex English nominalizations in machine translation. Computers and Translation (now Machine Translation) 3(1):3-22.

Tsujii, Y. (1990). Multi-language translation system using interlingua for Asian languages. In Proceedings of International Conference Organized by IPSJ for its Thirtieth Anniversary.

Van Slype, G. (1979). Critical Study of Methods for Evaluating the Quality of Machine Translation. Prepared for the European Commission Directorate on General Scientific and Technical Information and Information Management. Report BR 19142. Brussels: Bureau Marcel van Dijk.

Weaver, W. (1955). Translation. In W. N. Locke and A. D. Booth, Eds., Machine Translation of Languages. Cambridge, MA: MIT Press.

White, J., and T. O'Connell. (1992-1994). ARPA Workshops on Machine Translation. Series of four workshops on comparative evaluation. McLean, VA: Litton PRC Inc.

Wilks, Y. (1992). MT contrasts between the U.S. and Europe. In J. Carbonell, E. Rich, D. Johnson, M. Tomita, M. Vasconcellos, and Y. Wilks, Eds., JTEC Panel Report. Commissioned by DARPA and Japanese Technology Evaluation Center.

Further Readings

TMI. (1995). Proceedings of the Conference on Theoretical and Methodological Issues in Machine Translation. Leuven, Belgium.

TMI. (1997). Proceedings of the Conference on Theoretical and Methodological Issues in Machine Translation. Santa Fe, NM.

Whorf, B. L. (1956). Language, Thought, and Reality: Selected Writings of Benjamin Lee Whorf. J. B. Carroll, Ed. Cambridge, MA: MIT Press.