Computational intelligence for the diagnosis of cancer and heart disease
Throughout history, human beings have been affected by fatal diseases. Of these, heart disease and cancer have drawn broad attention from medical researchers. This research aimed to develop a computational intelligence based approach for the diagnosis of heart disease and cancer. As existing research in these fields has shown, early detection and general awareness of the risk factors appear to be essential in treating these diseases. This research develops an approach that could influence future diagnostic systems in traditional medical practice by streamlining processes that are normally time-consuming and costly, thus contributing to early diagnosis and accessibility.
More specifically, this research developed different classification and association rule mining techniques to allow for the early detection of heart disease and cancer and to determine the risk factors for heart disease and for different types of cancer, with a specific focus on breast cancer. The research developed a novel database by extracting data from the existing literature on computational intelligence applied in medical research. To underpin this research further, different algorithms were compared for their performance, something that, as far as the researcher is aware, has not been done in the past. These algorithms were also applied to a dataset built from patients' stress test records collected from a regional hospital in Queensland, Australia, further placing the research in an Australian context. Another aspect that has not been given full attention before is imbalanced data, although it is an everyday reality of medical data in the data mining field. This prompted an investigation of classification on imbalanced data, with the aim of improving prediction performance.
Overall, the study contributes to knowledge about heart disease and cancer, and also demonstrates the effectiveness of computational intelligence techniques in making diagnosis more accurate. Thus, the findings of this study make a significant contribution to the practice of cancer and heart disease diagnosis. The contributions to knowledge in this medical research include:
- The development of a comparative view of classification algorithms used for heart disease detection. Existing studies have investigated only a limited number of algorithms, and a comparative study is still lacking. This thesis investigates which algorithms are best suited to this area and discusses the relative strengths and suitability of the different techniques for heart disease diagnosis. The experimental results suggest that some of the algorithms tested show particular potential as classification algorithms for heart disease diagnosis. In the comparison between different feature selection processes for heart disease diagnosis, the proposed MFS and MFS+CFS emerged as promising techniques. This research further presented a rule extraction experiment on heart disease data using different rule mining algorithms (Apriori, Predictive Apriori and Tertius) for sick, healthy, male and female cases, and highlighted the efficiency of the Apriori algorithm. In the experiment on sick and healthy persons, taking confidence as an indicator, females have a greater chance of being free from heart disease than males. Likewise, considering high-confidence rules independently of gender factors, the results indicate that women seem to be less at risk of developing heart disease.
- A novel database for use in mining risk and prevention factors for different cancers. Researchers disagree regarding the risk and prevention factors for cancer. This study, through an extensive survey of the related literature, contributes a new database that takes into account the differing viewpoints of allied researchers and practitioners, allowing further research on different types of cancer. From this research, the significant risk and prevention factors (in other words, the risk avoidance and protection factors to guide a healthy lifestyle) for breast cancer, lung cancer, prostate cancer, bowel cancer, ovarian cancer and brain cancer have been identified.
- Because of the high mortality rate of breast cancer, special focus is given to its early identification from microarray and image data. Part of the contribution derives from the identification of a number of high-performing algorithms. In addition, the influence of the imbalanced data technique SMOTE (Synthetic Minority Over-sampling Technique) was highlighted, demonstrating the effectiveness of classifiers and SMOTE in this respect.
- Identification of the best techniques for heart disease identification on real stress test data, including a new approach for rule mining when the dataset is imbalanced. The research showed that SMO (sequential minimal optimization) combined with MFS performed better than the other classifiers in the experiment on the majority of measures. This could imply that SMO in combination with MFS is a promising technique for stress test data analysis.
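The rule mining comparison in the list above uses confidence as its indicator. As a minimal sketch of how that measure is usually computed for a rule "antecedent implies consequent", the Python below counts matching records in a small table. The records are invented toy data, not results from the thesis, and the helper names `support` and `confidence` are hypothetical, chosen only for illustration.

```python
from typing import FrozenSet, Iterable, List

Record = FrozenSet[str]


def support(records: List[Record], items: Iterable[str]) -> float:
    """Fraction of records that contain every item in `items`."""
    wanted = frozenset(items)
    if not records:
        return 0.0
    return sum(1 for r in records if wanted <= r) / len(records)


def confidence(records: List[Record],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Share of records matching the antecedent that also match the consequent."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    covered = sum(1 for r in records if a <= r)
    if covered == 0:
        return 0.0  # the rule never fires, so report zero rather than divide by zero
    matched = sum(1 for r in records if both <= r)
    return matched / covered


# Invented toy records (assumption: one gender item and one status item per person).
toy_records: List[Record] = [
    frozenset({"female", "healthy"}),
    frozenset({"female", "healthy"}),
    frozenset({"female", "healthy"}),
    frozenset({"female", "sick"}),
    frozenset({"male", "healthy"}),
    frozenset({"male", "healthy"}),
    frozenset({"male", "sick"}),
    frozenset({"male", "sick"}),
]

conf_female = confidence(toy_records, {"female"}, {"healthy"})
conf_male = confidence(toy_records, {"male"}, {"healthy"})
print(f"female -> healthy: {conf_female:.2f}")
print(f"male -> healthy: {conf_male:.2f}")
```

On this toy table the rule "female implies healthy" has higher confidence than "male implies healthy", which is the kind of comparison the gender finding rests on. A rule miner such as Apriori would additionally prune candidate rules by minimum support before ranking them by confidence.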
The benefits of this research could apply to different stakeholder communities such as CAD designers and the data mining community, the medical field, in particular general practitioners, and the general public. The research highlighted that the Computational Intelligence (CI) community has favoured more complex algorithms, consequently paying less attention to simpler yet equally efficient algorithms that could have an impact on time and cost. It could be argued that the collection of data from the existing literature is another area that has been somewhat neglected. The wealth of research articles provides an abundance of data waiting to be tapped. The literature review in this research, and the database built from it, drew on a total of 1,460 journal articles, conference papers, articles, books and book chapters. One of the findings was the commonality of factors across different research studies, which thus far had not been taken into consideration, opening a new angle of analysis within data mining in the medical field. While it would be impossible to claim that this research holds the key to all answers in early disease diagnostics, it does highlight the need for more cost- and time-effective procedures in the medical field using Computational Intelligence. It further emphasises the vital role CI plays in timely medical diagnosis, as time is likely to be the most crucial factor when facing heart disease and different cancers, especially breast cancer.
History
Start Page
1
End Page
372
Number of Pages
372
Publisher
Central Queensland University
Place of Publication
Rockhampton, Queensland
Open Access
- Yes
Era Eligible
- No
Supervisor
Professor Kevin S. Tickle ; Dr. Phoebe Chen
Thesis Type
- Doctoral Thesis
Thesis Format
- By publication