Toward Breast Cancer Survivability Prediction Models Through Improving Training Space

Full text for this resource is not available from the Research Repository.

Thongkam, Jaree, Xu, Guandong, Zhang, Yanchun and Huang, Fuchun (2009) Toward Breast Cancer Survivability Prediction Models Through Improving Training Space. Expert Systems with Applications, 36 (10). pp. 12200-12209. ISSN 0957-4174

Abstract

Because of the difficulties posed by outliers and skewed data, the prediction of breast cancer survivability has presented many challenges in the fields of data mining and pattern recognition, especially in medical research. To address these problems, we propose a hybrid approach that generates higher-quality data sets for building improved breast cancer survival prediction models. The approach comprises two main steps: (1) an outlier filtering step based on C-Support Vector Classification (CSVC) that identifies and eliminates outlier instances; and (2) an over-sampling step that uses over-sampling with replacement to increase the number of instances in the minority class. To assess the capability and effectiveness of the proposed approach, several measures were used, including basic performance measures (e.g., accuracy, sensitivity, and specificity), the Area Under the receiver operating characteristic Curve (AUC), and the F-measure. In addition, 10-fold cross-validation was used to reduce the bias and variance of the results of the breast cancer survivability prediction models. The results indicate that the proposed approach improves the performance of breast cancer survivability prediction models by up to 28.34%, owing to the improved training data space.
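
As a rough illustration of step (2) and of the basic performance measures named in the abstract, the following is a minimal sketch and not the authors' implementation. The function names, the fixed random seed, and the choice to grow every class to the size of the largest class are all assumptions; the abstract does not state the target class ratio. The CSVC outlier-filtering step is not shown because it requires an SVM library.

```python
import random
from collections import Counter


def oversample_with_replacement(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows (sampling with
    replacement) until every class matches the largest class.

    Hypothetical helper: balancing to equal class sizes is an assumption,
    not something stated in the abstract.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw the missing number of rows from this class, with replacement.
        extra = rng.choices(pool, k=target - n)
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def basic_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and F-measure for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }
```

If this sketch is combined with cross-validation, a common precaution is to over-sample only the training portion of each fold, so that duplicated rows do not appear in both the training and test data; whether the paper does this is not stated in the abstract.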

Item type Article
URI https://vuir.vu.edu.au/id/eprint/4634
DOI 10.1016/j.eswa.2009.04.067
Official URL http://www.sciencedirect.com/science/article/pii/S...
Subjects Historical > Faculty/School/Research Centre/Department > School of Engineering and Science
Historical > FOR Classification > 0806 Information Systems
Keywords ResPubID18527, data mining, outliers, over-sampling, breast cancer survivability prediction models
Citations in Scopus 42