Feature selection is an important data preprocessing method in data mining and machine learning, yet it faces the challenge of "curse of dimensionality" when dealing with high-dimensional data. In this paper, a self-adaptive level-based learning artificial bee colony (SLLABC) algorithm is proposed for high-dimensional feature selection problem. The SLLABC algorithm includes three new mechanisms: (1) A novel level-based learning mechanism is introduced to accelerate the convergence of the basic artificial bee colony algorithm, which divides the population into several levels and the individuals on each level learn from the individuals on higher levels, especially, the individuals on the highest level learn from each other. (2) A self-adaptive method is proposed to keep the balance between exploration and exploitation abilities, which takes the diversity of population into account to determine the number of levels. The lower the diversity is, the fewer the levels are divided. (3) A new update mechanism is proposed to reduce the number of selected features. In this mechanism, if the error rate of an offspring is higher than or is equal to that of its parent but selects more features, then the offspring is discarded and the parent is retained, otherwise, the offspring replaces its parent. Further, we discuss and analyze the contribution of these novelties to the diversity of population and the performance of classification. Finally, the results, compared with 8 state-of-the-art algorithms on 12 high-dimensional datasets, confirm the competitive performance of the proposed SLLABC on both classification accuracy and the size of the feature subset.
Quantitative Structure–Activity Relationship (QSAR) aims to correlate molecular structure properties with corresponding bioactivity. Chance correlations and multicollinearity are two major problems often encountered when generating QSAR models. Feature selection can significantly improve the accuracy and interpretability of QSAR by removing redundant or irrelevant molecular descriptors. An artificial bee colony algorithm (ABC) that mimics the foraging behaviors of honey bee colony was originally proposed for continuous optimization problems. It has been applied to feature selection for classification but seldom for regression analysis and prediction. In this paper, a binary ABC algorithm is used to select features (molecular descriptors) in QSAR. Furthermore, we propose an improved ABC-based algorithm for feature selection in QSAR, namely ABC-PLS-1. Crossover and mutation operators are introduced to employed bee and onlooker bee phase to modify several dimensions of each solution, which not only saves the process of converting continuous values into discrete values, but also reduces the computational resources. In addition, a novel greedy selection strategy which selects the feature subsets with higher accuracy and fewer features helps the algorithm to converge fast. Three QSAR datasets are used for the evaluation of the proposed algorithm. Experimental results show that ABC-PLS-1 outperforms PSO-PLS, WS-PSO-PLS, and BFDE-PLS in accuracy, root mean square error, and the number of selected features. Moreover, we also study whether to implement scout bee phase when tracking regression problems and drawing such an interesting conclusion that the scout bee phase is redundant when dealing with the feature selection in low-dimensional and medium-dimensional regression problems.
Accurately segmenting an insect from its original ecological image is the core technology restricting the accuracy and efficiency of automatic recognition. However, the performance of existing segmentation methods is unsatisfactory in insect images shot in wild backgrounds on account of challenges: various sizes, similar colors or textures to the surroundings, transparent body parts and vague outlines. These challenges of image segmentation are accentuated when dealing with camouflaged insects. Here, we developed an insect image segmentation method based on deep learning termed the progressive refinement network (PRNet), especially for camouflaged insects. Unlike existing insect segmentation methods, PRNet captures the possible scale and location of insects by extracting the contextual information of the image, and fuses comprehensive features to suppress distractors, thereby clearly segmenting insect outlines. Experimental results based on 1900 camouflaged insect images demonstrated that PRNet could effectively segment the camouflaged insects and achieved superior detection performance, with a mean absolute error of 3.2%, pixel-matching degree of 89.7%, structural similarity of 83.6%, and precision and recall error of 72%, which achieved improvements of 8.1%, 25.9%, 19.5%, and 35.8%, respectively, when compared to the recent salient object detection methods. As a foundational technology for insect detection, PRNet provides new opportunities for understanding insect camouflage, and also has the potential to lead to a step progress in the accuracy of the intelligent identification of general insects, and even being an ultimate insect detector.
Feature selection is an important data preprocessing method in data mining and machine learning, yet it faces the challenge of “curse of dimensionality” when dealing with high-dimensional data. In this paper, a self-adaptive level-based learning artificial bee colony (SLLABC) algorithm is proposed for high-dimensional feature selection problem. The SLLABC algorithm includes three new mechanisms: (1) A novel level-based learning mechanism is introduced to accelerate the convergence of the basic artificial bee colony algorithm, which divides the population into several levels and the individuals on each level learn from the individuals on higher levels, especially, the individuals on the highest level learn from each other. (2) A self-adaptive method is proposed to keep the balance between exploration and exploitation abilities, which takes the diversity of population into account to determine the number of levels. The lower the diversity is, the fewer the levels are divided. (3) A new update mechanism is proposed to reduce the number of selected features. In this mechanism, if the error rate of an offspring is higher than or is equal to that of its parent but selects more features, then the offspring is discarded and the parent is retained, otherwise, the offspring replaces its parent. Further, we discuss and analyze the contribution of these novelties to the diversity of population and the performance of classification. Finally, the results, compared with 8 state-of-the-art algorithms on 12 high-dimensional datasets, confirm the competitive performance of the proposed SLLABC on both classification accuracy and the size of the feature subset.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.