Volume 6 - Issue 10
Uyghur-Chinese statistical machine translation by incorporating morphological information
Abstract
This paper presents a method for machine translation from Uyghur, an agglutinative language with highly productive inflectional and derivational morphology, to Chinese, by incorporating morphological information into a statistical machine translation model. The basic idea is that agglutinated suffixes must be handled carefully to produce correct translations, because they play important roles in the Uyghur language. Experimental results show that morphological decomposition of the Uyghur source is beneficial, especially for smaller training corpora. The BLEU score improves from 13.61 to 25.26 when the input data is tokenized, compared with the case without tokenization.
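To make the idea of source-side morphological decomposition concrete, the following is a minimal sketch, not the authors' analyzer. It assumes a tiny hypothetical suffix list and Latin-transliterated example words chosen for illustration; the function names `segment` and `tokenize` and the `min_stem` parameter are likewise assumptions. A real Uyghur analyzer would also have to handle vowel harmony and stem-internal sound changes, which this greedy suffix stripper ignores.

```python
# Toy suffix inventory (an assumption for illustration, not the paper's resource).
SUFFIXES = ["lar", "ler", "da", "de", "ni"]


def segment(word, suffixes=SUFFIXES, min_stem=3):
    """Greedily strip known suffixes from the end of a word.

    Returns the stem followed by the stripped suffixes in surface order,
    each suffix marked with a leading '+'.
    """
    morphemes = []
    stem = word
    stripped = True
    while stripped:
        stripped = False
        # Try longer suffixes first so a long suffix wins over a shorter one.
        for suf in sorted(suffixes, key=len, reverse=True):
            if stem.endswith(suf) and len(stem) - len(suf) >= min_stem:
                morphemes.insert(0, "+" + suf)
                stem = stem[: -len(suf)]
                stripped = True
                break
    return [stem] + morphemes


def tokenize(sentence):
    """Replace every word in a sentence with its space-separated morphemes."""
    return " ".join(m for w in sentence.split() for m in segment(w))


if __name__ == "__main__":
    print(tokenize("kitablarni mekteplerde"))
```

Feeding the translation model a stem and separate suffix tokens instead of one long surface form reduces the number of distinct word types it has to learn, which is one plausible reason decomposition helps most when the training corpus is small.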
Paper Details
PaperID: 78149400450
Authors: Aisha, B., Sun, M.
Volume: Volume 6
Issues: Issue 10
Keywords: Chinese, Machine translation, Morphological information, Uyghur
Year: 2010
Month: October
Pages: 3137 - 3146