Learning very fast decision tree from uncertain data streams with positive and unlabeled samples

Full text for this resource is not available from the Research Repository.

Liang, Chunquan, Zhang, Yanchun, Shi, Peng and Hu, Zhengguo (2012) Learning very fast decision tree from uncertain data streams with positive and unlabeled samples. Information Sciences, 213. pp. 50-67. ISSN 0020-0255 (print), 1872-6291 (online)


Most data stream classification algorithms require a large amount of precisely labeled input data. However, in many data stream applications, streaming data contains inherent uncertainty, and labeled samples are difficult to collect, while unlabeled data are abundant. In this paper, we focus on classifying uncertain data streams with only positive and unlabeled samples available. Based on the concept-adapting very fast decision tree (CVFDT) algorithm, we propose an algorithm named puuCVFDT (CVFDT for positive and unlabeled uncertain data). Experimental results on both synthetic and real-life datasets demonstrate the strong ability and efficiency of puuCVFDT to handle concept drift with uncertainty under the positive and unlabeled learning scenario. Even when 90% of the samples in the stream are unlabeled, the classification performance of the proposed algorithm is still comparable to that of CVFDT, which is learned from fully labeled data without uncertainty.

Item type: Article
URI: https://vuir.vu.edu.au/id/eprint/22129
DOI: 10.1016/j.ins.2012.05.023
Official URL: http://www.sciencedirect.com/science/article/pii/S...
Subjects: Historical > FOR Classification > 0804 Data Format
    Historical > FOR Classification > 0807 Library and Information Studies
    Current > Division/Research > College of Science and Engineering
Keywords: ResPubID25830, positive unlabeled learning, uncertain attributes, algorithms
Citations in Scopus: 37