Learning naive Bayes classifiers from positive and unlabelled examples with uncertainty

He, Jiazhen, Zhang, Yang, Li, Xue and Shi, Peng (2012) Learning naive Bayes classifiers from positive and unlabelled examples with uncertainty. International Journal of Systems Science, 43 (10). pp. 1805-1825. ISSN 0020-7721

Abstract

Traditional classification algorithms require a large number of labelled examples from all the predefined classes, which are generally difficult and time-consuming to obtain. Furthermore, data uncertainty is prevalent in many real-world applications, such as sensor networks, market analysis and medical diagnosis. In this article, we explore the issue of classification on uncertain data when only positive and unlabelled examples are available. We propose an algorithm to build a naive Bayes classifier from positive and unlabelled examples with uncertainty. However, the algorithm requires the prior probability of the positive class, and it is generally difficult for the user to provide this parameter in practice. Two approaches are proposed to avoid this user-specified parameter. One approach is to use a validation set to search for an appropriate value for this parameter, and the other is to estimate it directly. Our extensive experiments show that the two approaches can generally achieve satisfactory classification performance on uncertain data. In addition, by exploiting the uncertainty in the dataset, our algorithm can potentially achieve better classification performance than traditional naive Bayes, which ignores uncertainty when handling uncertain data.
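
To make the role of the positive-class prior concrete, the following is a minimal sketch of one common way to fit a naive Bayes classifier from positive and unlabelled examples. It is not the algorithm from the article: it ignores data uncertainty entirely, handles only categorical features, and assumes that the unlabelled set is a random sample from the overall population, so that P(x | unlabelled) = p · P(x | positive) + (1 − p) · P(x | negative). The names `value_probs`, `fit_pu_naive_bayes`, `predict_positive_proba`, and the parameters `prior_pos`, `alpha` and `eps` are hypothetical and chosen for illustration only.

```python
from collections import Counter
import math


def value_probs(rows, j, values, alpha=1.0):
    """Laplace-smoothed distribution of feature j over the given values."""
    counts = Counter(row[j] for row in rows)
    total = len(rows) + alpha * len(values)
    return {v: (counts[v] + alpha) / total for v in values}


def fit_pu_naive_bayes(positive, unlabelled, prior_pos, eps=1e-6):
    """Estimate per-feature P(x_j | +) and P(x_j | -) without negative labels.

    Assumption (not from the article): the unlabelled set follows the overall
    data distribution, so the negative conditional is recovered by subtracting
    the positive part from the unlabelled mixture.
    """
    model = []
    for j in range(len(positive[0])):
        values = sorted({r[j] for r in positive} | {r[j] for r in unlabelled})
        p_pos = value_probs(positive, j, values)
        p_unl = value_probs(unlabelled, j, values)
        # Mixture identity solved for the negative class; clip to stay positive.
        raw = {
            v: max((p_unl[v] - prior_pos * p_pos[v]) / (1.0 - prior_pos), eps)
            for v in values
        }
        norm = sum(raw.values())
        p_neg = {v: raw[v] / norm for v in values}
        model.append((p_pos, p_neg))
    return model


def predict_positive_proba(model, prior_pos, row, eps=1e-6):
    """Posterior probability of the positive class for one example."""
    log_pos = math.log(prior_pos)
    log_neg = math.log(1.0 - prior_pos)
    for (p_pos, p_neg), v in zip(model, row):
        log_pos += math.log(p_pos.get(v, eps))  # unseen values fall back to eps
        log_neg += math.log(p_neg.get(v, eps))
    top = max(log_pos, log_neg)  # subtract the max for numerical stability
    e_pos = math.exp(log_pos - top)
    e_neg = math.exp(log_neg - top)
    return e_pos / (e_pos + e_neg)
```

In this sketch `prior_pos` must be supplied by the caller, which mirrors the difficulty the abstract describes. A validation-set search would amount to calling `fit_pu_naive_bayes` for several candidate values of `prior_pos` and keeping the one that scores best on held-out data.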

Item type	Article
URI	https://vuir.vu.edu.au/id/eprint/22120
DOI	10.1080/00207721.2011.627475
Subjects	Historical > FOR Classification > 0801 Artificial Intelligence and Image Processing; Historical > FOR Classification > 0806 Information Systems; Current > Division/Research > College of Science and Engineering
Keywords	ResPubID25838, uncertain data, naive Bayes, data mining, data uncertainty, positive unlabelled learning, PU learning
Citations in Scopus	17