As a natural language processing tool, statistical language modeling has proven capable of processing large-scale real-world text.
Compiling statistics over large-scale corpora in order to discover linguistic phenomena and build statistical language models is a common method in linguistics and computational linguistics research.
This paper applies different methods of estimating the probabilities of sparse events to the computation of entropy in large-scale modern Chinese text.
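The sentence above does not say which estimation methods the paper compares. As a minimal sketch of why the choice matters, the following assumes a character unigram model and contrasts a plain maximum-likelihood estimate with additive (Laplace) smoothing. The function names, the `vocab_size` parameter, and the `alpha` parameter are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def entropy_mle(text):
    """Unigram entropy in bits per character using maximum-likelihood estimates.

    Characters that never occur in `text` get probability zero, so rare
    (sparse) events are ignored and the entropy tends to be underestimated.
    """
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_additive(text, vocab_size, alpha=1.0):
    """Unigram entropy in bits per character using additive smoothing.

    Every one of the `vocab_size` character types (assumed known in advance)
    receives a pseudo-count `alpha`, so unseen types keep a small non-zero
    probability and contribute to the entropy.
    """
    counts = Counter(text)
    if vocab_size < len(counts):
        raise ValueError("vocab_size is smaller than the number of observed types")
    total = sum(counts.values())
    denom = total + alpha * vocab_size

    # Contribution of the character types that were actually observed.
    h = -sum(
        ((c + alpha) / denom) * math.log2((c + alpha) / denom)
        for c in counts.values()
    )

    # Contribution of the unseen types, which all share the same probability.
    unseen_types = vocab_size - len(counts)
    if unseen_types > 0 and alpha > 0:
        p_unseen = alpha / denom
        h -= unseen_types * p_unseen * math.log2(p_unseen)
    return h


if __name__ == "__main__":
    sample = "aabb"
    print(entropy_mle(sample))
    print(entropy_additive(sample, vocab_size=4))
```

On a toy string with two observed character types and an assumed vocabulary of four, the smoothed estimate comes out higher than the maximum-likelihood one, because probability mass is reserved for the two unseen types. With a real Chinese corpus the vocabulary runs to thousands of characters, so the treatment of rare and unseen characters can shift the entropy estimate noticeably.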