Ensemble classifiers are approaches that train multiple classifiers and fuse their
decisions to produce a final decision. The training process in an Ensemble
classifier aims to produce base classifiers that are accurate but that also differ
from each other in terms of the errors they make on identical patterns.
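As a minimal illustration of these two ideas, the sketch below fuses the outputs of
three base classifiers and measures how much they differ. Majority voting and the
mean pairwise disagreement measure are assumptions chosen for illustration, as are
the function names and the toy predictions; they are not necessarily the fusion rule
or the diversity measure used in this thesis.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Fuse per-classifier label lists into one label per pattern."""
    # predictions[i][j] is the label classifier i assigns to pattern j.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def mean_disagreement(predictions):
    """Average fraction of patterns on which a pair of classifiers disagree."""
    pairs = list(combinations(predictions, 2))
    total = sum(
        sum(a != b for a, b in zip(p, q)) / len(p) for p, q in pairs
    )
    return total / len(pairs)


# Hypothetical outputs of three base classifiers on four patterns.
# Each classifier makes exactly one error, on a different pattern.
truth = [0, 1, 1, 0]
preds = [
    [0, 1, 1, 1],  # wrong on the fourth pattern
    [0, 1, 0, 0],  # wrong on the third pattern
    [1, 1, 1, 0],  # wrong on the first pattern
]

fused = majority_vote(preds)
diversity = mean_disagreement(preds)
```

In this toy case every base classifier is wrong on one pattern, yet the fused
decision matches `truth` on all four, because the errors fall on different patterns.
With an odd number of voters and two classes no ties arise; a real implementation
would need an explicit tie-breaking rule.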
Although Ensemble classifiers are very useful and can be applied to many real-world
applications for classifying unseen data patterns into one of the known or
unknown classes, they face many problems, such as finding the appropriate number
of layers, clusters or even base classifiers that can produce the best level of
diversity and accuracy. Very little research has been conducted in this area. In
addition, automatic approaches for finding these parameters are lacking.
This thesis presents a Multi-Layered Ensemble Classifier and Evolutionary
Algorithm Based Optimization (EABO) approaches to identify the optimal number
of layers and clusters in Multi-Layered Neural Ensemble Classifiers. The thesis
focuses on the following research issues. First, it proposes a Multi-Layered Neural
Ensemble Classifier and a Single-Objective Evolutionary Algorithm Based
Optimization approach to optimize the Ensemble parameters. It investigates an
approach for finding the impact of parameters such as attributes, instances and
classes on clusters, accuracy and diversity. Second, it investigates the relationship
among clusters, layers, diversity and size of dataset using the Neural Ensemble
Classifiers. The aim is to determine whether there is any relationship between the
number of clusters in the Ensemble Classifier and the data size. Finally, this
research presents and investigates a Multi-Objective Evolutionary Algorithm Based
Optimization (MOEABO) approach for optimizing the Multi-Layered Neural
Ensemble Classifiers.
The proposed approaches are evaluated on benchmark data sets from the University of
California at Irvine (UCI) machine learning repository. The results show that the
proposed EABO approach performs better than the Bagging and Boosting Ensemble
classifiers. They also show that the maximum number of clusters is affected by
reducing the number of instances in the data set, and they show the impact of
variability in the data on the level of diversity and accuracy that can be achieved.
The findings provide a better understanding of the relationships between
clusters in the Ensemble and the level of accuracy and diversity. Finally, the results
show that the proposed MOEABO approach achieves better performance than
Bagging, Boosting, EABO and NULCOEC. A comparative analysis of the results
obtained with the proposed approaches and with recently published approaches in the
literature is presented in this thesis.