
Thai word segmentation for visualization of Thai Web sites

Thanadechteemapat, W. and Fung, C.C. (2011) Thai word segmentation for visualization of Thai Web sites. In: International Conference on Machine Learning and Cybernetics, ICMLC 2011, 10-13 July, Guilin, China, pp. 1544-1549.

Link to Published Version: http://dx.doi.org/10.1109/ICMLC.2011.6016978
*Subscription may be required

Abstract

Information overload is a problem in the Information Age, and information visualization is one approach to providing an overview of the content of a web site. A tag cloud is one way to represent information as an image of a group of words. However, tag cloud generation has limitations, one of which stems from the characteristics of the language. To extract tags or words for a tag cloud, word segmentation is required. This paper proposes a Thai word segmentation approach for the visualization of Thai Web sites. The proposed technique is based on the longest matching technique together with a refined corpus. The segmentation results are compatible with the results from previous BEST contests in Thailand.
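
As an illustration only, the longest matching idea named in the abstract can be sketched as a greedy dictionary lookup: at each position, take the longest dictionary word that starts there. Thai is written without spaces between words, which is why a step like this is needed before tags can be counted. This is not the authors' implementation; the function name `longest_match_segment`, the single-character fallback for unknown text, and the tiny word list are all assumptions made for the example, and the paper's refined corpus is not reproduced here.

```python
from collections import Counter


def longest_match_segment(text, dictionary):
    """Greedy longest-matching segmentation (illustrative sketch).

    At each position, emit the longest dictionary word starting there.
    If no word matches, emit a single character (an assumed fallback).
    """
    # Longest dictionary entry bounds how far ahead we need to look.
    max_word_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try candidate substrings from longest to shortest.
        for j in range(min(len(text), i + max_word_len), i, -1):
            if text[i:j] in dictionary:
                match = text[i:j]
                break
        if match is None:
            match = text[i]  # unknown character: pass it through as-is
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical toy dictionary; a real system would load a large corpus.
toy_dictionary = {"ภาษา", "ไทย", "ภาษาไทย", "คำ"}
thai_tokens = longest_match_segment("ภาษาไทยคำ", toy_dictionary)

# Word frequencies like these could drive the font sizes in a tag cloud.
tag_counts = Counter(thai_tokens)
```

In this toy run, the compound "ภาษาไทย" is preferred over the shorter "ภาษา" because the longest candidate is tried first. Greedy matching of this kind depends heavily on the quality of the word list, which is consistent with the abstract's emphasis on a refined corpus.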

Publication Type: Conference Paper
Murdoch Affiliation: School of Information Technology
Publisher: IEEE
Copyright: © 2011 IEEE
URI: http://researchrepository.murdoch.edu.au/id/eprint/6016