Ensemble classifier generation using class-pure cluster balancing
Clustering-based ensembles of classifiers have shown significant improvements in classification accuracy in many real-world applications. Most existing clustering-based ensemble approaches generate and use a predefined number of data clusters. However, datasets differ in spatial structure, which depends on a number of characteristics, for example the class labels. Therefore, using a predefined set of hyperparameters to generate a clustering-based ensemble classifier is not an effective methodology. In this paper, we propose a methodology to overcome this limitation by generating dataset-dependent strong and balanced data clusters per class. This ensures that any spatial information inherent in the dataset can be exploited to train an ensemble classifier that can surpass the classification accuracy plateau. An ensemble classifier framework is proposed that benefits from this methodology and trains base classifiers on the generated strong and balanced data clusters. We have evaluated the proposed approach on 8 benchmark datasets from the UCI repository. Detailed experiments and results are presented in the paper, and the results show that varying the number of clusters per class does have an impact on the overall classification accuracy of the ensemble. © Springer Nature Switzerland AG 2019.
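The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea under stated assumptions: each class is clustered separately (giving class-pure clusters), each cluster is padded with an equal number of the nearest samples from other classes (one possible reading of "balanced"), and one base classifier is trained per padded cluster. All function names, the balancing rule, the plain k-means step, and the nearest-centroid base classifier are hypothetical stand-ins, not the authors' method.

```python
import math
from collections import Counter

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with a deterministic init; returns non-empty clusters."""
    k = min(k, len(points))
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centers[j]))
            clusters[nearest].append(p)
        centers = [centroid(c) if c else centers[j] for j, c in enumerate(clusters)]
    return [c for c in clusters if c]

def balanced_subsets(X, y, clusters_per_class):
    """Assumed balancing rule: cluster each class on its own, then pad every
    class-pure cluster with as many nearest other-class samples as it has members."""
    subsets = []
    for label in sorted(set(y)):
        own = [x for x, t in zip(X, y) if t == label]
        rest = [(x, t) for x, t in zip(X, y) if t != label]
        for cluster in kmeans(own, clusters_per_class):
            c = centroid(cluster)
            padding = sorted(rest, key=lambda xt: dist(xt[0], c))[:len(cluster)]
            subsets.append([(x, label) for x in cluster] + padding)
    return subsets

def train_base(subset):
    """Stand-in base classifier: one centroid per class present in the subset."""
    by_label = {}
    for x, t in subset:
        by_label.setdefault(t, []).append(x)
    return {t: centroid(pts) for t, pts in by_label.items()}

def fit(X, y, clusters_per_class=2):
    """Train one base classifier per balanced subset."""
    return [train_base(s) for s in balanced_subsets(X, y, clusters_per_class)]

def predict(ensemble, x):
    """Majority vote over the nearest-centroid decision of each base classifier."""
    votes = Counter(min(m, key=lambda t: dist(x, m[t])) for m in ensemble)
    return votes.most_common(1)[0][0]

# Toy data (made up for illustration): two blobs per class.
X = [(0, 0), (0, 1), (1, 0), (0, 5), (0, 6), (1, 5),
     (5, 0), (5, 1), (6, 0), (5, 5), (5, 6), (6, 5)]
y = [0] * 6 + [1] * 6
ensemble = fit(X, y, clusters_per_class=2)
```

In this sketch, `clusters_per_class` is the hyperparameter the abstract reports as affecting ensemble accuracy; a dataset-dependent procedure would choose it per class rather than fix it in advance.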