An Information Theoretic basis for the Sinhala Language

H.L.Premaratne, D.N.Ranasinghe

ABSTRACT

Following the seminal work of Shannon, the approximation of natural languages such as English, Mandarin and Russian, among several others, by mathematical models of information sources has been investigated in the past. This paper presents a similar preliminary study for the Sinhala language which, to the best of our knowledge, is the first of its kind for Sinhala. The applicability of Zipf’s law to the Sinhala language has been investigated, and the word entropy, symbol entropy and redundancy have been estimated. The results confirm that the Sinhala language is in close agreement with most other natural languages that have been investigated so far. The results will pave the way for researchers and linguists to enhance their own approaches to the learning and coding of language on a firm information-theoretic basis.
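
For readers unfamiliar with the quantities named above, the following is a minimal sketch of how they are commonly computed from a corpus. It uses the standard zero-order (unigram) definitions: entropy H = -sum p log2 p, redundancy 1 - H / log2 N for N distinct items, and a least-squares slope of log frequency against log rank as a rough Zipf check. The function names and the toy text are hypothetical, and nothing here should be read as the authors' actual estimation procedure, which the abstract does not describe.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (bits per item) of the empirical distribution in counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(counts):
    """1 - H / H_max, with H_max = log2(number of distinct items)."""
    n = len(counts)
    if n < 2:
        raise ValueError("redundancy needs at least two distinct items")
    return 1.0 - entropy(counts) / math.log2(n)


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 is the classic form of Zipf's law.
    """
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Hypothetical toy text; a real study would use a large Sinhala corpus
    # and a symbol inventory suited to the Sinhala script.
    text = "the cat saw the dog and the dog saw the cat"
    words = Counter(text.split())
    symbols = Counter(text.replace(" ", ""))

    print("word entropy (bits/word):", entropy(words))
    print("symbol entropy (bits/symbol):", entropy(symbols))
    print("symbol redundancy:", redundancy(symbols))
    print("Zipf slope (words):", zipf_slope(words))
```

Unigram estimates of this kind give only an upper bound on the true entropy of a language, since they ignore dependencies between successive symbols or words.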

Citation Info:

In Conference Proceedings - 6th International Information Technology Conference on From Research to Reality, Infotel Lanka Society, Colombo, Sri Lanka, 29 Nov - 01 Dec 2004, pp. 11-16, ISBN 955-8974-01-3.