Murdoch University Research Repository

Using misclassification analysis for data cleaning

Jeatrakul, P., Wong, K.W. and Fung, C.C. (ORCID: 0000-0001-5182-3558) (2009) Using misclassification analysis for data cleaning. In: International Workshop on Advanced Computational Intelligence and Intelligent Informatics, IWACIII 2009, 7 November, Tokyo, Japan.

Data cleaning is a pre-processing technique used in most data mining problems. Its purpose is to remove noise, inconsistent data and errors in order to obtain a better, more representative data set from which to develop a reliable prediction model. In most prediction models, unclean data can sometimes affect prediction accuracy. In this paper, we investigate classification problems and apply a misclassification analysis technique for data cleaning. To demonstrate the concept, we use an artificial neural network (ANN) as the core computational intelligence technique. We evaluate the proposed data cleaning technique on three benchmark data sets obtained from the University of California Irvine (UCI) machine learning repository. All three are binary classification problems: German credit data, BUPA liver disorders, and Johns Hopkins Ionosphere. The experimental results show that the proposed cleaning technique could be a good alternative that provides some confidence when constructing a classification model.
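
The abstract does not spell out the cleaning procedure, so the following is only a minimal sketch of one plausible reading of misclassification-based cleaning: train a classifier on the full training set, treat the training instances it misclassifies as suspected noise, remove them, and retrain. Everything here is an assumption made for illustration: a single sigmoid unit stands in for the paper's ANN, the data are a synthetic one-feature toy set rather than the UCI benchmarks, and the function names `train`, `predict` and `clean` are hypothetical.

```python
import math

def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme inputs.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, epochs=2000, lr=0.5):
    """Fit one sigmoid unit by full-batch gradient descent (stand-in for an ANN)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, xi):
    """Return the predicted class (0 or 1) for one feature vector."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else 0

def clean(X, y):
    """Drop training rows that the trained model misclassifies.

    Returns (X_clean, y_clean, removed_indices).
    """
    model = train(X, y)
    removed = [i for i, (xi, yi) in enumerate(zip(X, y))
               if predict(model, xi) != yi]
    kept = [i for i in range(len(X)) if i not in removed]
    return [X[i] for i in kept], [y[i] for i in kept], removed

# Synthetic toy data (not from the paper): class 0 near 0, class 1 near 3.
X = [[i / 10] for i in range(10)] + [[3 + i / 10] for i in range(10)]
y = [0] * 10 + [1] * 10
y[5], y[15] = 1, 0  # inject two deliberately wrong labels as "noise"

X_clean, y_clean, removed = clean(X, y)
final_model = train(X_clean, y_clean)  # retrain on the cleaned set
print("suspected noisy rows:", removed)
```

On this toy set the two rows with flipped labels should be the ones flagged. On real data the same rule would also discard correctly labelled but hard instances, which is one reason to treat the result as a check on the data rather than a guarantee.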

Item Type: Conference Paper
Murdoch Affiliation(s): School of Information Technology