Estimating the minimum entropy of Chinese and Japanese languages Article

Ren, F., Yen, K. (2005). Estimating the minimum entropy of Chinese and Japanese languages. International Journal of Information Technology & Decision Making, 4(4), 679-689.

cited authors

  • Ren, F; Yen, K

abstract

  • The minimum entropy of natural languages has long been a subject of research. For English, great progress has been made, but few results for other languages have been reported in the literature. Based on two hypotheses on the conservation of information quantity, we propose a method for estimating the minimum entropy of characters in natural languages. Given a large translation corpus, the method estimates the minimum entropy without calculating probabilities. Moreover, as the translation corpus grows, the fluctuation of the ratio between character counts in any two languages becomes negligible. In this paper, we apply the method to two languages with large character sets: Japanese and Chinese. © World Scientific Publishing Company.
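
The abstract does not give the estimator itself, so the following is only a minimal sketch of one way a character-ratio approach could work. It assumes that "conservation of information quantity" means a text and its translation carry the same total information, so that entropy per character times character count is equal on both sides. The function names, the choice to ignore whitespace, and the reference entropy passed in are all illustrative assumptions, not taken from the paper.

```python
def count_chars(text):
    """Count non-whitespace characters (an assumed counting convention)."""
    return sum(1 for ch in text if not ch.isspace())


def corpus_char_ratio(pairs):
    """Ratio of reference to target character counts over a parallel corpus.

    `pairs` is an iterable of (reference_text, target_text) translations.
    """
    ref_total = 0
    tgt_total = 0
    for ref_text, tgt_text in pairs:
        ref_total += count_chars(ref_text)
        tgt_total += count_chars(tgt_text)
    if ref_total == 0 or tgt_total == 0:
        raise ValueError("both sides of the corpus must contain characters")
    return ref_total / tgt_total


def estimate_entropy(ref_entropy_bits, char_ratio):
    """Estimate target entropy per character under the assumed conservation
    relation H_ref * N_ref = H_tgt * N_tgt, i.e. H_tgt = H_ref * N_ref / N_tgt.
    """
    if ref_entropy_bits <= 0 or char_ratio <= 0:
        raise ValueError("entropy and ratio must be positive")
    return ref_entropy_bits * char_ratio
```

Under these assumptions, no character probabilities are needed: only a known per-character entropy for a reference language and the character counts of the two sides of the corpus. The ratio would be expected to stabilize as more translation pairs are added, which is the behavior the abstract describes.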

publication date

  • January 1, 2005

start page

  • 679

end page

  • 689

volume

  • 4

issue

  • 4