Murdoch University Research Repository

Text dimensionality reduction for document clustering using hybrid memetic feature selection

Al-Jadir, I., Wong, K.W., Fung, C.C. (ORCID: 0000-0001-5182-3558) and Xie, H. (2017) Text dimensionality reduction for document clustering using hybrid memetic feature selection. Lecture Notes in Computer Science, 10607, pp. 281-289.

Link to Published Version: https://doi.org/10.1007/978-3-319-69456-6_23
*Subscription may be required

Abstract

This paper proposes a document clustering method built on a hybrid feature selection method. The hybrid feature selection method integrates a genetic-based wrapper method with a ranking filter and is named Memetic Algorithm-Feature Selection (MA-FS). MA-FS is combined with the K-means and Spherical K-means (SK-means) clustering methods to perform document clustering. For comparison, another unsupervised feature selection method, Feature Selection Genetic Text Clustering (FSGATC), is used. Two real-world criminal report document sets and two popular benchmark datasets, Reuters and 20newsgroup, were used in the comparisons. F-Micro, F-Macro and Average Distance of Document to Cluster (ADDC) measures were used for evaluation. The test results showed that MA-FS outperformed FSGATC. It also outperformed clustering on the entire feature space (ALL).
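
The abstract does not define ADDC, so the following is only a minimal sketch of one common formulation: the mean distance from each document to its cluster centroid, averaged over clusters, computed on a selected subset of features. The function names, the `selected` parameter, and the choice of cosine distance are assumptions made for illustration, not details taken from the paper.

```python
import math

def normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two vectors (assumed distance)."""
    a, b = normalize(a), normalize(b)
    return 1.0 - sum(x * y for x, y in zip(a, b))

def centroid(docs):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(docs)
    return [sum(col) / n for col in zip(*docs)]

def addc(docs, labels, selected):
    """Hypothetical ADDC: per-cluster mean document-to-centroid distance,
    averaged over clusters, using only the feature indices in `selected`."""
    reduced = [[d[j] for j in selected] for d in docs]
    clusters = {}
    for vec, lab in zip(reduced, labels):
        clusters.setdefault(lab, []).append(vec)
    per_cluster = []
    for members in clusters.values():
        c = centroid(members)
        mean_dist = sum(cosine_distance(m, c) for m in members) / len(members)
        per_cluster.append(mean_dist)
    return sum(per_cluster) / len(per_cluster)

# Toy term-weight vectors: the third feature acts as noise for these clusters.
docs = [[1, 0, 5], [2, 0, 1], [0, 1, 3], [0, 2, 9]]
labels = [0, 0, 1, 1]
addc_all = addc(docs, labels, [0, 1, 2])   # entire feature space
addc_subset = addc(docs, labels, [0, 1])   # a candidate feature subset
```

Under this formulation a lower value means more compact clusters, so a wrapper-style feature selector could use it to compare candidate feature subsets against the full feature space.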

Item Type: Journal Article
Murdoch Affiliation: School of Engineering and Information Technology
Publisher: Springer Verlag
Copyright: © 2017 Springer International Publishing AG
URI: http://researchrepository.murdoch.edu.au/id/eprint/39790