Pattern discovery from biological data
Chapter posted on 2017-12-06, 00:00, authored by Jesmin Nahar, Kevin Tickle, A B M Shawkat Ali
Extracting useful information from structured and unstructured biological data is crucial in the health industry. For example, medical practitioners need to identify breast cancer patients at an early stage, estimate the survival time of a heart disease patient, or recognize uncommon disease characteristics that appear suddenly. The volume of biological data available in databases is currently exploding, but information extraction and true open access to data require time to resolve issues such as ethical clearance. The emergence of novel IT technologies allows health practitioners to carry out comprehensive analyses of medical images, genomes, transcriptomes, and proteomes in health and disease. The information extracted with such technologies may soon dramatically change the pace of medical research and have a considerable impact on patient care. This research reviews the existing technologies used in heart and cancer research and then proposes some possible solutions to overcome their limitations. In summary, the primary objective of this research is to investigate how existing modern machine learning techniques (with their strengths and limitations) are being used in the identification of heartbeat-related disease and the early detection of cancer in patients. After an extensive literature review, the following objectives were chosen: to develop a new approach to find the association between heartbeat and diseases such as high blood pressure and stroke; to propose an improved feature selection method for analyzing large image and microarray databases with machine learning algorithms in cancer research; to find an automatic distance function selection method for clustering tasks; to discover the most significant risk factors for specific cancers; and to determine the preventive factors for specific cancers that are aligned with the most significant risk factors.
Therefore, we propose a research plan to attain these objectives within this chapter. The proposed solutions are: new heartbeat identification techniques that show promising associations between heartbeat patterns and diseases; sensitivity-based feature selection methods applied to early cancer patient classification; meta-learning approaches adopted in clustering algorithms to select a distance function automatically; and the Apriori algorithm applied to discover the significant risk and preventive factors for specific cancers. We expect this research to make a significant contribution to medical professionals by enabling more accurate diagnosis and better patient care. It will also contribute to other areas such as biomedical modeling, medical image analysis, and early disease warning.
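To make the last of these solutions more concrete, the following is a minimal sketch of Apriori frequent-itemset mining in plain Python. It is not the chapter's implementation: the function name `apriori`, the `min_support` threshold, and the toy records (with placeholder labels `factor_a`, `factor_b`, `factor_c`, `outcome`) are all assumptions made up for illustration and do not represent real patient data.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Hypothetical helper for illustration only.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every individual item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Support = fraction of transactions that contain the candidate.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Made-up toy records; each set lists the attributes present in one record.
records = [
    {"factor_a", "factor_b", "outcome"},
    {"factor_a", "outcome"},
    {"factor_a", "factor_b"},
    {"factor_c", "factor_b"},
    {"factor_a", "factor_b", "outcome"},
]

frequent = apriori(records, min_support=0.6)
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

In a risk-factor setting, itemsets that frequently co-occur with an outcome label would be candidates for association rules; turning them into rules with confidence or lift thresholds is a further step not shown here.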