Removing bias from diverse data clusters for ensemble classification

Fletcher, Samuel; Verma, Brijesh

conference contribution

posted on 2018-05-22, 00:00 authored by Samuel Fletcher, Brijesh Verma

Diversity plays an important role in successful ensemble classification. One way to diversify the base-classifiers in an ensemble classifier is to diversify the data they are trained on. Sampling techniques such as bagging have been used for this task in the past; however, we argue that because they maintain the global distribution, they do not engender diversity. We instead make a principled argument for the use of k-Means clustering to create diversity. When creating multiple clusterings with multiple k values, there is a risk of different clusterings discovering the same clusters, which would then train the same base-classifiers. This would bias the ensemble voting process. We propose a new approach that uses the Jaccard Index to detect and remove similar clusters before training the base-classifiers, reducing classification error by removing repeated votes. We demonstrate the effectiveness of our proposed approach by comparing it to three state-of-the-art ensemble algorithms on eight UCI datasets.
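
The cluster-filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the greedy keep-first strategy, and the 0.9 similarity threshold are assumptions made for illustration, since this record does not state the criterion used in the paper. Each cluster is represented as a set of sample indices, as might be produced by running k-Means with several k values.

```python
from typing import FrozenSet, Iterable, List


def jaccard_index(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Size of the intersection divided by size of the union of two index sets."""
    union = a | b
    if not union:
        return 1.0  # two empty clusters are treated as identical
    return len(a & b) / len(union)


def remove_similar_clusters(
    clusterings: Iterable[Iterable[Iterable[int]]],
    threshold: float = 0.9,  # assumed cut-off, not taken from the paper
) -> List[FrozenSet[int]]:
    """Pool clusters from several clusterings, keeping one of each near-duplicate."""
    kept: List[FrozenSet[int]] = []
    for clustering in clusterings:
        for cluster in clustering:
            candidate = frozenset(cluster)
            # Keep the candidate only if it is dissimilar to every cluster kept so far.
            if all(jaccard_index(candidate, k) < threshold for k in kept):
                kept.append(candidate)
    return kept


# Toy example: two clusterings of samples 0..5, for k = 2 and k = 3.
clusterings = [
    [{0, 1, 2}, {3, 4, 5}],    # k = 2
    [{0, 1, 2}, {3, 4}, {5}],  # k = 3 rediscovers the cluster {0, 1, 2}
]
unique_clusters = remove_similar_clusters(clusterings, threshold=0.9)
print(len(unique_clusters))
```

In this toy input the repeated cluster {0, 1, 2} is kept only once, so a base-classifier trained per remaining cluster would not cast the same vote twice.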

Funding

Category 1 - Australian Competitive Grants (this includes ARC, NHMRC)

History

Editor

Liu D; Xie S; Li Y; El-Alfy EM

Volume

LNCS 10637

Start Page

140

End Page

149

Number of Pages

10

Start Date

2017-11-14

Finish Date

2017-11-18

eISSN

1611-3349

ISSN

0302-9743

ISBN-13

9783319700922

Location

Guangzhou, China

Publisher

Springer

Place of Publication

Cham, Switzerland

Publisher DOI

https://doi.org/10.1007/978-3-319-70093-9_15

Full Text URL

https://link.springer.com/chapter/10.1007/978-3-319-70093-9_15

Peer Reviewed

Yes

Open Access

No

Author Research Institute

Centre for Intelligent Systems

Era Eligible

Yes

Name of Conference

24th International Conference on Neural Information Processing (ICONIP 2017)

Keywords

Neural Networks; Ensemble Classification; Clustering; Diversity; Computer Vision; Neural, Evolutionary and Fuzzy Computation; Pattern Recognition and Data Mining

Licence

CQUniversity General 1.0
