Towards Breast Cancer Survivability Prediction Models in Thai Hospital Information Systems

Thongkam, Jaree (2009) Towards Breast Cancer Survivability Prediction Models in Thai Hospital Information Systems. PhD thesis, Victoria University.


Finding suitable ways to develop models for predicting unknown data classes is a challenging task in data mining and machine learning. Improving the quality of data sets and combining AdaBoost with a weak learner are important contributions to the development of these prediction models. The objectives of this thesis are to build accurate, stable and effective breast cancer survivability prediction models using breast cancer data obtained from Srinagarind Hospital in Thailand. To achieve these objectives, five approaches were proposed: 1) k-means and RELIEF to improve the accuracy and stability of prediction models generated by AdaBoost algorithms; 2) C-Support Vector Classification Filtering (C-SVCF) to identify and eliminate outliers; 3) a combination of C-SVCF and oversampling to handle both outliers and imbalanced data; 4) a hybrid of AdaBoost and Random Forests to build stronger prediction models; and 5) C4.5 to form breast cancer survivability decision trees and rules. To illustrate the capability, performance and effectiveness of these approaches, extensive experimental studies were conducted using WEKA version 3.5.6, the AdaBoost MATLAB Toolbox, LIBSVM and the C4.5 program.
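
As a rough illustration of the oversampling idea mentioned in approach 3, the sketch below balances a data set by duplicating randomly chosen minority-class records. It is a minimal sketch under stated assumptions, not the procedure used in the thesis: the abstract does not say which oversampling variant was applied, and the function name `random_oversample`, the attribute values and the class labels are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class.

    Assumption: plain random oversampling with replacement. The thesis
    may use a different oversampling variant.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    # Start from a copy of the original data so nothing is discarded.
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy records (made-up attribute values, not thesis data).
rows = [[52, 1], [61, 2], [47, 1], [70, 3], [58, 2]]
labels = ["alive", "alive", "alive", "alive", "dead"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In a pipeline like the one the abstract outlines, a step of this kind would presumably be applied to the training split only, after outlier filtering, so that duplicated records do not leak into the evaluation data.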

Item type Thesis (PhD thesis)
Subjects Historical > FOR Classification > 1117 Public Health and Health Services
Historical > Faculty/School/Research Centre/Department > School of Engineering and Science
Keywords data mining, outliers, data space, filtering, over-sampling, Thailand, health information systems