IGICA: A Hybrid Feature Selection Approach in Text Categorization

Mojaveriyan, Mohammad; Ebrahimpour-Komleh, Hossein; Mousavirad, Seyed Jalaleddin

doi:10.5815/ijisa.2016.03.05

Cited by 35 publications

(17 citation statements)

References 14 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…The selected feature subsets using the wrapper-based methods were evaluated using the J48 classifier. Finally, researchers in [60] and [61] implemented the IG as a filter-based method to select the top-ranking features. To do so, two-hundred (200) feature subsets are selected by IG to be fed into the Gray Wolf Optimizer (GWO) to select the optimal subset of feature, and top-ranking features to be fed into the Imperialist Competitive Algorithm (ICA) utilizing NB and KNN as classification methods, respectively.…”

Section: A: Adapted Metaheuristic Methodsmentioning

confidence: 99%

Wrapper and Hybrid Feature Selection Methods Using Metaheuristic Algorithms for English Text Classification: A Systematic Review

et al. 2022

View full text Add to dashboard Cite

Feature selection (FS) constitutes a series of processes used to decide which relevant features/attributes to include and which irrelevant features to exclude for predictive modeling. It is a crucial task that aids machine learning classifiers in reducing error rates, computation time, overfitting, and improving classification accuracy. It has demonstrated its efficacy in myriads of domains, ranging from its use for text classification (TC), text mining, and image recognition. While there are many traditional FS methods, recent research efforts have been devoted to applying metaheuristic algorithms as FS techniques for the TC task. However, there are few literature reviews concerning TC. Therefore, a comprehensive overview was systematically studied by exploring available studies of different metaheuristic algorithms used for FS to improve TC. This paper will contribute to the body of existing knowledge by answering four research questions (RQs): 1) What are the different approaches of FS that apply metaheuristic algorithms to improve TC? 2) Does applying metaheuristic algorithms for TC lead to better accuracy than the typical FS methods? 3) How effective are the modified, hybridized metaheuristic algorithms for text FS problems?, and 4) What are the gaps in the current studies and their future directions? These RQs led to a study of recent works on metaheuristic-based FS methods, their contributions, and limitations. Hence, a final list of thirty-seven (37) related articles was extracted and investigated to align with our RQs to generate new knowledge in the domain of study. Most of the conducted papers focused on addressing the TC in tandem with metaheuristic algorithms based on the wrapper and hybrid FS approaches. Future research should focus on using a hybrid-based FS approach as it intuitively handles complex optimization problems and potentiality provide new research opportunities in this rapidly developing field.

show abstract

Section: A: Adapted Metaheuristic Methodsmentioning

confidence: 99%

Wrapper and Hybrid Feature Selection Methods Using Metaheuristic Algorithms for English Text Classification: A Systematic Review

et al. 2022

View full text Add to dashboard Cite

show abstract

“…This algorithm was introduced in 2007,[ 24 ] and it has been used so far to solve many problems in the area of optimization. [ 25 26 27 28 29 30 31 32 33 34 35 36 37 ] Like other evolutionary algorithms, this algorithm is composed of the initial set of possible solutions, which of them is called a country. The ICA gradually improves these initial solutions (countries) and finally provides the desired answer to the optimization problem (the desired country).…”

Section: Methodsmentioning

confidence: 99%

Biomarker Discovery by Imperialist Competitive Algorithm in Mass Spectrometry Data for Ovarian Cancer Prediction

Pirhadi

Maghooli

Moteghaed

et al. 2021

Journal of Medical Signals &Amp; Sensors

View full text Add to dashboard Cite

Background: Mass spectrometry is a method for identifying proteins and could be used for distinguishing between proteins in healthy and nonhealthy samples. This study was conducted using mass spectrometry data of ovarian cancer with high resolution. Usually, diagnostic and monitoring tests are done according to sensitivity and specificity rates; thus, the aim of this study is to compare mass spectrometry of healthy and cancerous samples in order to find a set of biomarkers or indicators with a reasonable sensitivity and specificity rates. Methods: Therefore, combination methods were used for choosing the optimum feature set as t-test, entropy, Bhattacharya, and an imperialist competitive algorithm with K-nearest neighbors classifier. The resulting feature from each method was feed to the C5 decision tree with 10-fold cross-validation to classify data. Results: The most important variables using this method were identified and a set of rules were extracted. Similar to most frequent features, repetitive patterns were not obtained; the generalized rule induction method was used to identify the repetitive patterns. Conclusion: Finally, the resulting features were introduced as biomarkers and compared with other studies. It was found that the resulting features were very similar to other studies. In the case of the classifier, higher sensitivity and specificity rates with a lower number of features were achieved when compared with other studies.

show abstract

“…According to the literature [4] " e topic model-based corpus construction and computer-aided creation research," the topic model based on LDA uses the method of reference word recommendation to study and analyze the word characteristics in poetry, vocabulary semantic analysis, and style feature analysis. Literature [5] uses the vector space model (VSM) to represent the text of poems and proposes two types of classification models of classical poems, bold and graceful and graceful.…”

Section: Related Workmentioning

confidence: 99%

Analysis of Poetry Style Based on Text Classification Algorithm

Can

2022

Scientific Programming

View full text Add to dashboard Cite

In order to realize the effect of intelligent poetry style analysis, this paper applies the text classification algorithm to the poetry style analysis, combines the knowledge representation algorithm to perform text classification and recognition, improves the algorithm, and applies it to the poetry style analysis model. Moreover, this paper combines intelligent algorithms to construct a poetry style analysis system, constructs the system’s functional modules, preprocesses the poetry documents in the corpus, and maps them to the vector space that can be directly processed by the computer. In addition, after constructing the system model, this paper verifies the poetry style analysis system based on the text classification algorithm through simulation experiments. From the research results, the effect of the poetry style analysis method based on the text classification algorithm proposed in this paper is very good, which meets the actual needs of poetry style analysis.

show abstract

IGICA: A Hybrid Feature Selection Approach in Text Categorization

Cited by 35 publications

References 14 publications

Wrapper and Hybrid Feature Selection Methods Using Metaheuristic Algorithms for English Text Classification: A Systematic Review

Wrapper and Hybrid Feature Selection Methods Using Metaheuristic Algorithms for English Text Classification: A Systematic Review

Biomarker Discovery by Imperialist Competitive Algorithm in Mass Spectrometry Data for Ovarian Cancer Prediction

Analysis of Poetry Style Based on Text Classification Algorithm

Contact Info

Product

Resources

About