A Statistical Approach to English and Sinhala Translation

J. U. Liyanapathirana, A. R. Weerasinghe


Statistical Machine Translation (SMT) is not a new term to the natural language research community. Applications of SMT range from simple model generation to optimizing model quality through parameter tuning. However, most of this research has been conducted for translation between European languages. The use of SMT for Asian languages is rather low, mainly because resources for them are scarce and their structures differ, which has left them less studied and explored.

This research tries to overcome this scarcity by building an English-Sinhala translator using SMT. Several obstacles must be faced in this approach. Only a limited amount of Sinhala resources is available, and hence the process of data collection is tedious. The other issue is that improving the translation quality needs more effort, since the structures of the two languages are different by nature.