Murdoch University Research Repository

Thai word segmentation for visualization of Thai Web sites

Thanadechteemapat, W. and Fung, C.C. (ORCID: 0000-0001-5182-3558) (2011) Thai word segmentation for visualization of Thai Web sites. In: International Conference on Machine Learning and Cybernetics, ICMLC 2011, 10-13 July, Guilin, China, pp. 1544-1549.

Information overload is a problem in the Information Age, and information visualization is one approach to providing an overview of the content of a web site. A tag cloud is one way to represent information, as an image of a group of words. However, tag cloud generation has limitations, one of which arises from the characteristics of the language: in order to extract tags or words for a tag cloud, word segmentation is required. This paper proposes a Thai word segmentation approach for the visualization of Thai Web sites. The proposed technique is based on the longest matching technique together with a refined corpus. The results of the Thai word segmentation are compatible with the results from previous BEST contests in Thailand.
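As a rough illustration of the longest matching idea mentioned in the abstract, the sketch below scans the text from left to right and, at each position, takes the longest dictionary entry that matches. It is not the authors' implementation: the function names `longest_match_segment` and `tag_frequencies`, the four-word toy dictionary, and the single-character fallback for unknown text are all assumptions made for this example, and the refined corpus described in the paper is not reproduced here.

```python
from collections import Counter


def longest_match_segment(text, dictionary):
    """Greedy left-to-right longest matching (hypothetical helper).

    At each position the longest dictionary entry starting there is taken.
    A character that starts no dictionary entry is emitted as its own token.
    """
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink until a dictionary hit.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                match = text[i:j]
                break
        if match is None:
            match = text[i]  # assumption: unknown text falls back to one character
        tokens.append(match)
        i += len(match)
    return tokens


def tag_frequencies(tokens):
    """Count tokens so the most frequent ones can be drawn larger in a tag cloud."""
    return Counter(tokens).most_common()


# Toy dictionary, invented for this example only.
toy_dictionary = {"ตา", "กลม", "ตาก", "ลม"}
print(longest_match_segment("ตากลม", toy_dictionary))  # ['ตาก', 'ลม']
```

The example string can also be read as "ตา" followed by "กลม"; the greedy rule picks "ตาก" first because it is the longer match at position 0. Ambiguities of this kind are one reason the quality of the dictionary matters for a longest matching segmenter.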

Item Type: Conference Paper
Murdoch Affiliation(s): School of Information Technology
Publisher: IEEE
Copyright: © 2011 IEEE