Thai word segmentation for visualization of Thai Web sites

Thanadechteemapat, W. and Fung, C.C. (2011) Thai word segmentation for visualization of Thai Web sites. In: International Conference on Machine Learning and Cybernetics, ICMLC 2011, 10 - 13 July, Guilin, China.

    Link to Published Version: http://dx.doi.org/10.1109/ICMLC.2011.6016978
    *Subscription may be required

    Abstract

    Information overload is a problem of the Information Age, and information visualization is one approach to providing an overview of the content of a Web site. A tag cloud is one way to represent information as an image of a group of words. However, tag cloud generation has limitations, one of which stems from the characteristics of the language. To extract tags or words for a tag cloud, word segmentation is required. This paper proposes a Thai word segmentation approach for the visualization of Thai Web sites. The proposed technique is based on longest matching together with a refined corpus. The segmentation results are compatible with the results from previous BEST contests in Thailand.
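
The abstract names longest matching as the basis of the segmentation step. Because Thai is written without spaces between words, a dictionary-driven segmenter of this kind scans the text left to right and, at each position, takes the longest dictionary entry that matches. The following is a minimal sketch of that general idea, not the authors' implementation: the function name `longest_match_segment`, the toy dictionary, and the fallback of emitting an unmatched character as its own token are all assumptions made for illustration, and the paper's refined corpus is not reproduced here.

```python
from collections import Counter


def longest_match_segment(text, dictionary):
    """Greedy left-to-right longest matching (illustrative sketch).

    `dictionary` is a set of known words. Characters that start no
    dictionary word are emitted as single-character tokens (an assumption
    of this sketch, not something stated in the abstract).
    """
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first and shrink until a word is found.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                match = text[i:j]
                break
        if match is None:
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical toy dictionary; a real system would load a large corpus.
toy_dictionary = {"ฉัน", "กิน", "ข้าว", "ข้า"}

tokens = longest_match_segment("ฉันกินข้าว", toy_dictionary)
print(tokens)  # ['ฉัน', 'กิน', 'ข้าว']

# Token frequencies could then serve as weights for a tag cloud.
print(Counter(tokens).most_common())
```

Greedy longest matching can choose a wrong boundary when a long entry overlaps the start of the next word, so the quality of the dictionary matters; this is presumably where a refined corpus helps.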

    Publication Type: Conference Paper
    Murdoch Affiliation: School of Information Technology
    Publisher: IEEE
    Copyright: © 2011 IEEE
    URI: http://researchrepository.murdoch.edu.au/id/eprint/6016