Improved ensemble classification for evolving data streams
journal contribution
posted on 2024-04-22, 00:49, authored by H Tian, L Wang, Hong Shen, AWC Liew
A major challenge for evolving data stream classification is feature evolution, where the features of stream instances change dynamically as the stream progresses. Existing classification methods consider feature evolution either for fixed-size data or only to a limited degree with presumed dependence on history, making them unable to work effectively on evolving data streams of unbounded size and arbitrary feature evolution. In particular, for evolving data streams containing multi-label instances, classification under feature evolution faces significant challenges. In this article, we present efficient ensemble methods for classifying both single-label and multi-label evolving data streams through effective model coupling. For single-label classification, we present an improved unsupervised classification algorithm that applies multi-cluster feature selection (MCFS), which was originally proposed for static data classification, in the DXMiner framework to handle each window of instances in a dynamic stream. Our method generates an optimal feature subset and achieves high classification accuracy. We further improve the time complexity of the feature selection process in MCFS by applying the Ball-tree search technique. For multi-label classification, we propose an effective fixed-size ensemble classifier based on multi-label KNN, which by itself works only for static multi-label data classification, by incorporating a weight adaptation strategy among the classifiers in the ensemble to dynamically update the model and cope with arbitrary feature evolution of stream instances as the stream progresses. Extensive experimental results on real-life data streams show that our algorithms outperform existing methods for single-label and multi-label classification in both classification accuracy and efficiency.
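The abstract does not spell out the weight adaptation rule, so the following is only a minimal sketch of the general idea of a fixed-size, weight-adapted ensemble over stream chunks, not the authors' algorithm. Everything named here is hypothetical: `FrequentLabelsStub` is a trivial stand-in for a base learner such as multi-label KNN, the weight of each member is assumed to be its Hamming accuracy on the latest labeled chunk, a new member is assumed to start with weight 1.0, and the lowest-weight member is assumed to be evicted when the ensemble exceeds its size limit.

```python
from collections import Counter


class FrequentLabelsStub:
    """Hypothetical stand-in for a base learner (e.g. multi-label KNN).

    It ignores the features and predicts every label that appears in at
    least half of the label sets in its training chunk.
    """

    def __init__(self, label_sets):
        counts = Counter(label for s in label_sets for label in s)
        self.labels = {l for l, c in counts.items() if 2 * c >= len(label_sets)}

    def predict(self, x):
        return set(self.labels)


class AdaptiveEnsemble:
    """Fixed-size ensemble whose member weights track recent accuracy."""

    def __init__(self, all_labels, max_size=3):
        self.all_labels = set(all_labels)
        self.max_size = max_size
        self.members = []  # each entry is [model, weight]

    def _hamming_accuracy(self, predicted, true):
        # Fraction of labels on which the prediction agrees with the truth.
        return 1 - len(predicted ^ true) / len(self.all_labels)

    def predict(self, x):
        total = sum(weight for _, weight in self.members)
        if total == 0:
            return set()
        score = Counter()
        for model, weight in self.members:
            for label in model.predict(x):
                score[label] += weight
        # Keep a label if members holding at least half the weight vote for it.
        return {label for label, s in score.items() if 2 * s >= total}

    def update(self, xs, label_sets):
        if not label_sets:
            return
        # Assumed rule: re-weight each member by its mean Hamming accuracy
        # on the newest labeled chunk.
        for member in self.members:
            accs = [self._hamming_accuracy(member[0].predict(x), y)
                    for x, y in zip(xs, label_sets)]
            member[1] = sum(accs) / len(accs)
        # Train a new member on the chunk; its initial weight is an assumption.
        self.members.append([FrequentLabelsStub(label_sets), 1.0])
        # Keep the ensemble at a fixed size by evicting the weakest member.
        if len(self.members) > self.max_size:
            worst = min(range(len(self.members)),
                        key=lambda i: self.members[i][1])
            del self.members[worst]


ens = AdaptiveEnsemble({"a", "b", "c"}, max_size=2)
ens.update([0, 1], [{"a"}, {"a"}])
ens.update([2, 3], [{"b"}, {"b"}])
ens.update([4, 5], [{"b", "c"}, {"b", "c"}])
print(sorted(ens.predict(6)))  # ['b', 'c']
```

In this toy run the member trained on the oldest chunk ends with weight 0 after the third chunk and is evicted, which illustrates how re-weighting lets the ensemble follow a drifting label distribution without growing in size.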
Volume
37
Issue
1
Start Page
38
End Page
50
Number of Pages
13
eISSN
1941-1294
ISSN
1541-1672
Publisher
Institute of Electrical and Electronics Engineers (IEEE)