Statistical Machine Translation - Word-based Translation

Word-based Translation

In word-based translation, the fundamental unit of translation is a word in some natural language. Typically, a sentence and its translation contain different numbers of words, because of compound words, morphology and idioms. The ratio of the lengths of sequences of translated words is called fertility, which tells how many foreign words each native word produces. Information theory necessarily assumes that each word covers the same concept; in practice this is not really true. For example, the English word "corner" can be translated into Spanish as either "rincón" or "esquina", depending on whether it means an internal or an external angle.
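A minimal sketch of the fertility idea, reading fertility as the number of target words linked to each source word in a word alignment. The function name, the sentence pair and the alignment links are hypothetical and chosen only for illustration.

```python
from collections import Counter

def fertilities(source_words, alignment):
    """Return (word, fertility) for each source word.

    alignment is a list of unique (source_index, target_index) links;
    a source word with no link has fertility 0.
    """
    counts = Counter(src for src, _ in alignment)
    return [(word, counts.get(i, 0)) for i, word in enumerate(source_words)]

# Hypothetical example: English "not" is linked to both "ne" and "pas".
source = ["I", "do", "not", "know"]
target = ["je", "ne", "sais", "pas"]
links = [(0, 0), (2, 1), (2, 3), (3, 2)]  # "do" is left unaligned

result = fertilities(source, links)
print(result)
```

Here "do" produces no French word and "not" produces two, which is the kind of length mismatch the paragraph above describes.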

Simple word-based translation can't translate between languages with different fertility. Word-based translation systems can relatively simply be made to cope with high fertility, so that they can map a single word to multiple words, but not the other way around. For example, if we were translating from English to French, each word in English could produce any number of French words, sometimes none at all. But there is no way to group two English words so that they produce a single French word.
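The asymmetry can be seen in a toy word-by-word translator. This is only a sketch: the lexicon entries and the function name are hypothetical, and reordering and probabilities are ignored.

```python
# Hypothetical toy lexicon: every key is a single English word, and every
# value is a list of zero or more French words.
LEXICON = {
    "I": ["je"],
    "do": [],              # produces nothing (fertility 0)
    "not": ["ne", "pas"],  # produces two words (fertility 2)
    "know": ["sais"],
}

def translate_word_by_word(words, lexicon):
    """Translate each source word independently and concatenate the results."""
    output = []
    for word in words:
        # Unknown words are passed through unchanged.
        output.extend(lexicon.get(word, [word]))
    return output

translated = translate_word_by_word(["I", "do", "not", "know"], LEXICON)
print(translated)
```

Because each lookup key is a single word, a two-word English unit such as "of the" cannot be mapped to the single French word "du" as a whole; the translator only ever sees "of" and "the" separately.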

An example of a word-based translation system is the freely available GIZA++ package (GPLed), which includes training programs for the IBM models, the HMM model and Model 6.
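To give a feel for what training such a model involves, here is a minimal sketch of expectation-maximization for IBM Model 1, the simplest of the IBM models. It is not GIZA++ code and does not reflect its interface; the function name, the `NULL` token string and the three-sentence corpus are assumptions made for illustration.

```python
from collections import defaultdict

def train_ibm_model1(bitext, iterations=10):
    """Estimate lexical translation probabilities t[(f, e)] = t(f | e) with EM.

    bitext is a list of (english_words, foreign_words) sentence pairs.
    """
    foreign_vocab = {f for _, fs in bitext for f in fs}
    uniform = 1.0 / len(foreign_vocab)
    t = defaultdict(lambda: uniform)  # start from a uniform table

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of e being linked at all
        for es, fs in bitext:
            es_null = ["NULL"] + es  # NULL lets a foreign word align to nothing
            for f in fs:
                # E-step: share one unit of count for f among all candidate e.
                norm = sum(t[(f, e)] for e in es_null)
                for e in es_null:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

# Hypothetical toy corpus (English, German).
bitext = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
t = train_ibm_model1(bitext)
```

On this corpus "das" co-occurs with "the" in two sentences, so the estimate of t(das | the) ends up larger than t(Haus | the). Model 1 ignores word position and fertility entirely; the higher IBM models add those components.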

Word-based translation is not widely used today; phrase-based systems are more common. Most phrase-based systems still use GIZA++ to align the corpus. The alignments are used to extract phrases or deduce syntax rules, and matching words in bi-text is still a problem actively discussed in the community. Because of the predominance of GIZA++, there are now several distributed implementations of it online.
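As an illustration of how word alignments can feed phrase extraction, the following is a simplified sketch of consistent phrase-pair extraction. It assumes the alignment is given as (source index, target index) links, uses only the minimal target span for each source span, and does not extend phrases over unaligned target words as fuller implementations do. The function name and the example data are hypothetical.

```python
def extract_phrases(src, tgt, alignment, max_len=4):
    """Return the set of (source phrase, target phrase) pairs that are
    consistent with the alignment: no word inside the pair is linked to
    a word outside it."""
    links = set(alignment)
    pairs = set()
    for s_start in range(len(src)):
        for s_end in range(s_start, min(s_start + max_len, len(src))):
            # Target positions linked to any word in the source span.
            tgt_pos = [j for (i, j) in links if s_start <= i <= s_end]
            if not tgt_pos:
                continue  # a fully unaligned source span yields no pair
            t_start, t_end = min(tgt_pos), max(tgt_pos)
            if t_end - t_start + 1 > max_len:
                continue
            # Reject the pair if a target word in the span is linked to a
            # source word outside the source span.
            if any(t_start <= j <= t_end and not (s_start <= i <= s_end)
                   for (i, j) in links):
                continue
            pairs.add((" ".join(src[s_start:s_end + 1]),
                       " ".join(tgt[t_start:t_end + 1])))
    return pairs

# Hypothetical aligned sentence pair.
src = ["I", "do", "not", "know"]
tgt = ["je", "ne", "sais", "pas"]
links = [(0, 0), (2, 1), (2, 3), (3, 2)]
pairs = extract_phrases(src, tgt, links)
```

With these links, "not" alone cannot be extracted, because its translation "ne ... pas" surrounds "sais", which belongs to "know". The pair ("not know", "ne sais pas") is consistent, which shows how a phrase pair can capture a correspondence that single-word mappings cannot.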
