Thai word segmentation for visualization of Thai Web sites
Thanadechteemapat, W. and Fung, C.C. (2011) Thai word segmentation for visualization of Thai Web sites. In: International Conference on Machine Learning and Cybernetics, ICMLC 2011, 10 - 13 July, Guilin, China.
Information overload is a problem in the Information Age, and information visualization is one approach to providing an overview of the content of a web site. A tag cloud is one way to represent information as an image of a group of words. However, tag cloud generation has limitations, one of which stems from the characteristics of the language. To extract tags or words for a tag cloud, word segmentation is required. This paper proposes a Thai word segmentation approach for the visualization of Thai Web sites. The proposed technique is based on longest matching together with a refined corpus. The segmentation results are compatible with the results from previous BEST contests in Thailand.
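Thai is written without spaces between words, which is why a tag cloud needs a segmentation step first. The sketch below illustrates generic greedy longest matching against a word list. It is not the authors' implementation: the function name `longest_match_segment`, the toy dictionary, and the single-character fallback for unknown text are assumptions made for illustration, and the paper's refined corpus is not reproduced here.

```python
def longest_match_segment(text, dictionary):
    """Greedy left-to-right longest matching over a word list (illustrative)."""
    # Longest dictionary entry bounds how far ahead we need to look.
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, shrinking until a dictionary hit.
        for length in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + length]
            if candidate in dictionary:
                break
        else:
            # Assumption: with no match, emit one character as an unknown token.
            candidate = text[pos]
        tokens.append(candidate)
        pos += len(candidate)
    return tokens


# Hypothetical toy dictionary; a real system would load a full corpus.
toy_dictionary = {"ฉัน", "ไป", "โรง", "เรียน", "โรงเรียน"}
print(longest_match_segment("ฉันไปโรงเรียน", toy_dictionary))
# → ['ฉัน', 'ไป', 'โรงเรียน']
```

Note that the longer entry "โรงเรียน" is preferred over the shorter "โรง" followed by "เรียน". The resulting tokens could then be counted to weight words in a tag cloud.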
Publication Type: Conference Paper
Murdoch Affiliation: School of Information Technology
Copyright: © 2011 IEEE