Learning naive Bayes classifiers from positive and unlabelled examples with uncertainty

Full text for this resource is not available from the Research Repository.

He, Jiazhen, Zhang, Yang, Li, Xue and Shi, Peng (2012) Learning naive Bayes classifiers from positive and unlabelled examples with uncertainty. International Journal of Systems Science, 43 (10). pp. 1805-1825. ISSN 0020-7721

Abstract

Traditional classification algorithms require a large number of labelled examples from all the predefined classes, which are generally difficult and time-consuming to obtain. Furthermore, data uncertainty is prevalent in many real-world applications, such as sensor networks, market analysis and medical diagnosis. In this article, we explore the issue of classification on uncertain data when only positive and unlabelled examples are available. We propose an algorithm to build a naive Bayes classifier from positive and unlabelled examples with uncertainty. However, the algorithm requires the prior probability of the positive class, and it is generally difficult for the user to provide this parameter in practice. Two approaches are proposed to avoid this user-specified parameter. One approach is to use a validation set to search for an appropriate value for this parameter, and the other is to estimate it directly. Our extensive experiments show that both approaches can generally achieve satisfactory classification performance on uncertain data. In addition, our algorithm, which exploits the uncertainty in the dataset, can potentially achieve better classification performance than traditional naive Bayes, which ignores uncertainty when handling uncertain data.
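
The abstract does not give the algorithm's details, and the full text is not available here, so the following is only a minimal sketch of how a naive Bayes classifier might be learned from positive and unlabelled examples with uncertain attributes. It is not the authors' method. It assumes categorical features, that each uncertain attribute value is given as a probability vector over the possible values, and that the unlabelled set is a mixture of positives and negatives, so that P(v | unlabelled) = p * P(v | positive) + (1 - p) * P(v | negative), where p is the user-supplied prior of the positive class. All function names, the smoothing constant `alpha`, and the `floor` parameter are hypothetical.

```python
import numpy as np

def expected_freq(X, alpha=1.0):
    """Smoothed P(value | set) from uncertain examples.

    X has shape (n_examples, n_features, n_values); X[i, j] is a
    probability vector over the values of feature j (assumed input format).
    """
    counts = X.sum(axis=0)  # expected counts, shape (n_features, n_values)
    totals = counts.sum(axis=1, keepdims=True)
    return (counts + alpha) / (totals + alpha * X.shape[2])

def fit_pu_nb(X_pos, X_unl, prior_pos, floor=1e-6):
    """Estimate per-feature value distributions for both classes.

    prior_pos must lie strictly between 0 and 1.
    """
    p_pos = expected_freq(X_pos)
    p_unl = expected_freq(X_unl)
    # Solve the mixture identity for the negative class.
    p_neg = (p_unl - prior_pos * p_pos) / (1.0 - prior_pos)
    # Estimates can fall below zero; clip and renormalise (an assumption).
    p_neg = np.clip(p_neg, floor, None)
    p_neg /= p_neg.sum(axis=1, keepdims=True)
    return p_pos, p_neg

def predict_pos(X, p_pos, p_neg, prior_pos):
    """Return a boolean array: True where the positive class scores higher."""
    # Expected likelihood of each uncertain attribute under each class.
    like_pos = np.einsum("ijv,jv->ij", X, p_pos)
    like_neg = np.einsum("ijv,jv->ij", X, p_neg)
    log_pos = np.log(prior_pos) + np.log(like_pos).sum(axis=1)
    log_neg = np.log(1.0 - prior_pos) + np.log(like_neg).sum(axis=1)
    return log_pos > log_neg

# Toy data: two binary features, each attribute a distribution over {0, 1}.
pos_like = np.array([[0.9, 0.1], [0.9, 0.1]])
neg_like = np.array([[0.1, 0.9], [0.1, 0.9]])
X_pos = np.stack([pos_like] * 10)
X_unl = np.stack([pos_like] * 5 + [neg_like] * 5)

p_pos, p_neg = fit_pu_nb(X_pos, X_unl, prior_pos=0.5)
labels = predict_pos(np.stack([pos_like, neg_like]), p_pos, p_neg, prior_pos=0.5)
```

In this sketch the prior is passed in directly. The validation-set search mentioned in the abstract could, under the same assumptions, be approximated by calling `fit_pu_nb` for several candidate priors and keeping the one that scores best on held-out data.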

Item type: Article
URI: https://vuir.vu.edu.au/id/eprint/22120
DOI: https://doi.org/10.1080/00207721.2011.627475
Subjects: Historical > FOR Classification > 0801 Artificial Intelligence and Image Processing
Historical > FOR Classification > 0806 Information Systems
Current > Division/Research > College of Science and Engineering
Keywords: ResPubID25838, uncertain data, naive Bayes, data mining, data uncertainty, positive unlabelled learning, PU learning
Citations in Scopus: 15