Towards Breast Cancer Survivability Prediction Models in Thai Hospital Information Systems
Thongkam, Jaree (2009) Towards Breast Cancer Survivability Prediction Models in Thai Hospital Information Systems. PhD thesis, Victoria University.
Abstract
Finding suitable ways to develop models for predicting unknown data classes is a challenging task in data mining and machine learning. Improving the quality of data sets and combining AdaBoost with a weak learner are important contributions to the development of these prediction models. The objectives of this thesis are to build accurate, stable and effective breast cancer survivability prediction models using breast cancer data obtained from the Srinagarind Hospital in Thailand. To achieve these objectives, five approaches were proposed: 1) k-means and RELIEF to improve the accuracy and stability of prediction models generated from AdaBoost algorithms; 2) C-Support Vector Classification Filtering (C-SVCF) to identify and eliminate outliers; 3) a combination of C-SVCF and oversampling approaches to handle both outliers and imbalanced data problems; 4) a hybrid of AdaBoost and Random Forests to build stronger prediction models; and 5) C4.5 to form breast cancer survivability decision trees and rules. To illustrate the capability, performance and effectiveness of these approaches, extensive experimental studies have been conducted using WEKA version 3.5.6, the AdaBoost MATLAB Toolbox, LIBSVM and the C4.5 program.
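As a rough illustration of the oversampling idea mentioned in approach 3, the sketch below balances a two-class data set by duplicating randomly chosen minority-class records. The abstract does not say which oversampling variant the thesis uses, so plain random duplication is an assumption here, and the function name `random_oversample`, the toy records and the class labels are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (assumed variant)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Candidate rows to duplicate for this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: four "alive" records and two "dead" records.
rows = [[50, 1], [61, 2], [47, 1], [58, 3], [72, 4], [66, 3]]
labels = ["alive", "alive", "alive", "alive", "dead", "dead"]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

In a pipeline like the one the abstract describes, a step of this kind would be applied to the training data only, after outlier filtering, so that duplicated records do not leak into evaluation.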
Item type | Thesis (PhD thesis) |
URI | https://vuir.vu.edu.au/id/eprint/29496 |
Subjects | Historical > FOR Classification > 1117 Public Health and Health Services; Historical > Faculty/School/Research Centre/Department > School of Engineering and Science |
Keywords | data mining, outliers, data space, filtering, over-sampling, Thailand, health information systems |