Bi-Directional Feature Fixation-Based Particle Swarm Optimization for Large-Scale Feature Selection

Yang, Jia-Quang ORCID: 0000-0002-8004-6695, Yang, Qi-Te ORCID: 0000-0001-5430-7073, Du, Ke-Jing, Chen, Chun-Hua ORCID: 0000-0002-4087-5309, Wang, Hua ORCID: 0000-0002-8465-0996, Jeon, Sang-Woon ORCID: 0000-0002-0199-2254, Zhang, Jun ORCID: 0000-0003-4148-4294 and Zhan, Zhi-Hui ORCID: 0000-0003-0862-0514 (2022) Bi-Directional Feature Fixation-Based Particle Swarm Optimization for Large-Scale Feature Selection. IEEE Transactions on Big Data, 9 (3). pp. 1004-1017. ISSN 2332-7790

Abstract

Feature selection, which aims to improve classification accuracy and reduce the size of the selected feature subset, is an important but challenging optimization problem in data mining. Particle swarm optimization (PSO) has shown promising performance on feature selection problems, but it still struggles with large-scale feature selection in Big Data environments because of the large search space. Hence, this article proposes a bi-directional feature fixation (BDFF) framework for PSO, which provides a novel way to reduce the search space in large-scale feature selection. BDFF uses two opposite search directions to guide particles to adequately search for feature subsets of different sizes. Based on these two search directions, BDFF can fix the selection states of some features and then focus on the others when updating particles, thus narrowing the large search space. In addition, a self-adaptive strategy is designed to help the swarm concentrate on the more promising search direction in different stages of evolution and to balance exploration and exploitation. Experimental results on 12 widely used public datasets show that BDFF can improve the performance of PSO on large-scale feature selection and obtain smaller feature subsets with higher classification accuracy.
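
The idea of fixing the selection states of some features and updating only the others can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the names `masked_update`, `search_space_size`, and `fixed_states`, the choice of which features to fix, and the random bit-flip move (used here in place of a real PSO velocity update) are all assumptions made for illustration. It only shows that pinning k of n binary selection states shrinks the search space from 2^n to 2^(n-k) candidate subsets.

```python
import random


def masked_update(position, fixed_states, flip_prob, rng):
    """Return a new bit vector in which fixed features are pinned.

    position     : list of 0/1 selection states, one per feature
    fixed_states : dict mapping feature index -> pinned state (0 or 1);
                   hypothetical stand-in for whatever rule decides fixation
    flip_prob    : probability of flipping each free (non-fixed) bit;
                   a placeholder move, not a PSO velocity update
    """
    new_position = []
    for i, bit in enumerate(position):
        if i in fixed_states:
            # Fixed feature: its selection state is not searched any more.
            new_position.append(fixed_states[i])
        elif rng.random() < flip_prob:
            # Free feature: may change its selection state.
            new_position.append(1 - bit)
        else:
            new_position.append(bit)
    return new_position


def search_space_size(n_features, n_fixed):
    """Number of candidate subsets left once n_fixed states are pinned."""
    return 2 ** (n_features - n_fixed)


rng = random.Random(0)
position = [rng.randint(0, 1) for _ in range(10)]

# Assumed example: two features pinned as unselected, one as selected.
fixed_states = {0: 0, 1: 0, 7: 1}
new_position = masked_update(position, fixed_states, 0.5, rng)

full_space = search_space_size(10, 0)
reduced_space = search_space_size(10, len(fixed_states))
```

With 10 features and 3 fixed states, the sketch reduces the number of candidate subsets from 1024 to 128; on datasets with thousands of features the same exponent reduction is what makes fixation attractive.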

Item type: Article
URI: https://vuir.vu.edu.au/id/eprint/47123
DOI: 10.1109/TBDATA.2022.3232761
Official URL: https://ieeexplore.ieee.org/document/10002858
Subjects: Current > FOR (2020) Classification > 4602 Artificial intelligence; Current > Division/Research > Institute for Sustainable Industries and Liveable Cities
Keywords: feature selection, optimisation, data preprocessing, particle swarm optimisation