Murdoch University Research Repository

A deep-ensemble-level-based interpretable Takagi-Sugeno-Kang fuzzy classifier for imbalanced data

Wang, G. (ORCID: 0000-0002-5258-0532), Zhou, T., Choi, K-S and Lu, J. (2020) A deep-ensemble-level-based interpretable Takagi-Sugeno-Kang fuzzy classifier for imbalanced data. IEEE Transactions on Cybernetics. Early Access.

Link to Published Version: https://doi.org/10.1109/TCYB.2020.3016972
*Subscription may be required

Abstract

Existing research reveals that the misclassification rate for imbalanced data depends heavily on problematic areas that arise from small disjuncts, class overlap, and borderline and rare data samples. In this study, by stacking zero-order Takagi-Sugeno-Kang (TSK) fuzzy subclassifiers on the minority class and its problematic areas in a deep ensemble, a novel deep-ensemble-level-based TSK fuzzy classifier (IDE-TSK-FC) for imbalanced data classification tasks is presented to achieve both promising classification performance and the high interpretability of zero-order TSK fuzzy classifiers. At the same time, following the stacked generalization principle, the proposed classifier lifts oversampling from the data level to the deep ensemble level, with a guarantee of enhanced generalization capability for class imbalance learning. In the structure of IDE-TSK-FC, the first interpretable zero-order TSK fuzzy subclassifier is built on the original training dataset. After that, several successive zero-order TSK fuzzy subclassifiers are stacked layer by layer on the newly identified problematic areas from the original training dataset plus the corresponding interpretable predictions obtained by the averaging strategy on all previous layers. At each layer, IDE-TSK-FC simply uses the classical K-nearest neighbor algorithm to identify the problematic area, which consists of the minority samples and their K surrounding majority neighbors. After randomly neglecting certain input features and randomly selecting the five Gaussian membership functions for all the chosen input features and the augmented feature in the premise of each fuzzy rule, each subclassifier can be quickly obtained by using the least learning machine to determine the consequent part of each fuzzy rule. The experimental results on both public datasets and a real-world healthcare dataset demonstrate IDE-TSK-FC's superiority in class imbalance learning.
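As an illustration of the problematic-area step described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes Euclidean distance and assumes that neighbors are searched only among majority-class samples; the abstract specifies neither. The function name `problematic_area` and its parameters are hypothetical.

```python
import math


def problematic_area(X, y, minority_label, k):
    """Return sorted indices of the minority samples together with,
    for each minority sample, its k nearest majority-class neighbors.

    Assumptions (not stated in the abstract): Euclidean distance, and
    neighbors are drawn only from samples outside the minority class.
    """
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]

    selected = set(minority)
    for i in minority:
        # Rank majority samples by distance to this minority sample
        # and keep the k closest ones.
        nearest = sorted(majority, key=lambda j: math.dist(X[i], X[j]))[:k]
        selected.update(nearest)
    return sorted(selected)


# Toy data: two minority samples (label 1) near the origin,
# four majority samples (label 0) at varying distances.
X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (5.0, 5.0), (0.0, 0.2), (6.0, 5.0)]
y = [1, 0, 0, 0, 1, 0]

area = problematic_area(X, y, minority_label=1, k=2)
print(area)
```

In this toy example the two distant majority samples are left out of the selected region, so a subclassifier trained on it would concentrate on the minority class and its immediate surroundings.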

Item Type: Journal Article
Murdoch Affiliation: Information Technology, Mathematics and Statistics
Publisher: IEEE
Copyright: © 2020 IEEE
URI: http://researchrepository.murdoch.edu.au/id/eprint/57819