Graph Based Feature Selection for Reduction of Dimensionality in Next-Generation RNA Sequencing Datasets

Gakii, Consolata; Mireji, Paul O.; Rimiru, Richard

doi:10.3390/a15010021

Cited by 9 publications

(3 citation statements)

References 53 publications


“…To determine the same, different graph approaches have been proposed for feature selection [25][26][27][28][29][30][31][32][33][34][35][36][37]. Das et al [30] used Feature Association Map to present a graph-based hybrid feature selection method.…”

Section: Introduction (mentioning)

confidence: 99%

Augmentation of Densest Subgraph Finding Unsupervised Feature Selection Using Shared Nearest Neighbor Clustering

et al. 2023


Determining the optimal feature set is a challenging problem, especially in an unsupervised domain. To mitigate the same, this paper presents a new unsupervised feature selection method, termed as densest feature graph augmentation with disjoint feature clusters. The proposed method works in two phases. The first phase focuses on finding the maximally non-redundant feature subset and disjoint features are added to the feature set in the second phase. To experimentally validate, the efficiency of the proposed method has been compared against five existing unsupervised feature selection methods on five UCI datasets in terms of three performance criteria, namely clustering accuracy, normalized mutual information, and classification accuracy. The experimental analyses have shown that the proposed method outperforms the considered methods.
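The idea of picking a non-redundant feature subset from a feature graph can be illustrated with a small sketch. This is not the densest-subgraph procedure of the abstract above; it is a simple greedy stand-in, and the function names, the Pearson-correlation edge rule, and the `threshold` parameter are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # Constant features have no defined correlation; treat them as uncorrelated.
    return cov / (sx * sy) if sx and sy else 0.0


def select_non_redundant(features, threshold=0.9):
    """Greedy sketch (hypothetical helper): features are graph nodes, and an
    edge joins two features whose |correlation| >= threshold. Features are
    visited from fewest to most redundant neighbours, and a feature is kept
    only if no already-kept feature is adjacent to it."""
    names = list(features)
    adjacent = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if abs(pearson(features[a], features[b])) >= threshold:
            adjacent[a].add(b)
            adjacent[b].add(a)

    selected, blocked = [], set()
    for name in sorted(names, key=lambda n: len(adjacent[n])):
        if name not in blocked:
            selected.append(name)
            blocked |= adjacent[name]
    return selected


# Toy data: g2 is an exact multiple of g1, so only one of the two is kept.
toy = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],
    "g3": [5, 1, 4, 2, 3],
}
print(select_non_redundant(toy, threshold=0.9))
```

The greedy order is a design choice: visiting weakly connected features first tends to keep more features overall, but it carries no optimality guarantee.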


“…The hybrid method gave better results than the individual algorithms. Gakii et al [32] proposed comparison methods using three algorithms for feature selection included in the PCA, RFE and graph-based feature selection. The results proved that the graph-based feature selection enhanced the performance of sequential minimal optimization and multilayer perceptron classifiers.…”

Section: Introduction (mentioning)

confidence: 99%

Effective hybrid feature selection using different bootstrap enhances cancers classification performance

2022


Background: Machine learning can be used to predict the onset of different human cancers. High-dimensional data pose several problems, including an excessive number of genes, over-fitting, long fitting time, and reduced classification accuracy. Recursive Feature Elimination (RFE) is a wrapper method that selects the subset of features yielding the best accuracy. Despite its high performance, RFE has two disadvantages: computation time and over-fitting. Random forest for selection (RFS) has proved effective at selecting informative features and mitigating over-fitting.

Method: This paper proposes a method, positions first bootstrap step (PFBS) random forest selection recursive feature elimination (RFS-RFE), abbreviated PFBS-RFS-RFE, to enhance cancer classification performance. It applies a bootstrap at one of three positions: the outer first bootstrap step (OFBS), the inner first bootstrap step (IFBS), and the outer/inner first bootstrap step (O/IFBS). In the first position, OFBS applies bootstrap resampling with replacement before the selection step; RFS is then run with bootstrap = false, i.e., the whole dataset is used to build each tree, and the important features are combined with RFE to select the most relevant subset of features. In the second position, IFBS applies bootstrap resampling with replacement during RFS, and the important features are again combined with RFE. In the third position, O/IFBS combines the first and second positions. RFE uses logistic regression (LR) as its estimator. The proposed methods are paired with four classifiers to address the feature selection problems and improve the performance of RFE, and five datasets of different sizes are used to assess the performance of PFBS-RFS-RFE.

Results: O/IFBS-RFS-RFE achieved the best performance compared with previous work, bringing the accuracy, variance, and ROC area to 99.994%, 0.0000004, and 1.000 for the RNA gene dataset and to 100.000%, 0.0, and 1.000 for the dermatology erythemato-squamous diseases dataset.

Conclusion: High-dimensional datasets and the RFE algorithm pose many difficulties for cancer classification performance. PFBS-RFS-RFE is proposed to address these difficulties using different bootstrap positions. The important features extracted by RFS are used with RFE to obtain the effective features.
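The outer bootstrap step mentioned in the abstract above amounts to resampling the dataset with replacement before feature selection. A minimal sketch of that resampling step alone follows; it does not implement the random forest selection or RFE stages, and the function name `bootstrap_sample` and its return shape are assumptions, not the authors' code.

```python
import random


def bootstrap_sample(rows, labels, seed=None):
    """Draw len(rows) samples with replacement (hypothetical helper).

    Returns the resampled rows, the matching labels, and the indices of
    the original rows that were never drawn (the out-of-bag rows).
    """
    rng = random.Random(seed)  # local generator so the seed is reproducible
    n = len(rows)
    picked = [rng.randrange(n) for _ in range(n)]
    in_bag = set(picked)
    sample_rows = [rows[i] for i in picked]
    sample_labels = [labels[i] for i in picked]
    out_of_bag = [i for i in range(n) if i not in in_bag]
    return sample_rows, sample_labels, out_of_bag


# Toy expression matrix: four samples, two genes each, with class labels.
X = [[0.1, 2.0], [0.4, 1.5], [0.9, 0.3], [1.2, 0.1]]
y = ["normal", "normal", "tumour", "tumour"]
Xb, yb, oob = bootstrap_sample(X, y, seed=0)
print(len(Xb), oob)
```

Any feature selector could then be fitted on `Xb` and `yb`, with the out-of-bag rows available as a held-out check.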


“…For example in domains like the various omics (e.g. genomics), biomedical imaging, and biomedical signal processing, biological molecule sequencing, we can see various applications of deep learning, such as gene expression regulation, protein structure prediction, cancer diagnosis and prognosis, drug discovery, and medical image analysis RNNs ( [72][73][74][75][76]. An example is visible in the synergy between AI and the CRISPR technologies applied to vaccine design, therapeutic treatment improvement and RNA guide activities [74,75,77].…”

Section: AI in Bioinformatics (mentioning)

confidence: 99%

Revealing function, interactions, and localization of peroxisomal proteins using deep learning-based approaches

Anteghi


Computational approaches are practical when investigating putative peroxisomal proteins and for sub-peroxisomal protein localisation in unknown protein sequences. Nowadays, advancements in computational methods and Machine Learning (ML) can be used to hasten the discovery of novel peroxisomal proteins and can be combined with more established computational methodologies. In this chapter, we explain and list some of the most used tools and methodologies for novel peroxisomal protein detection and localisation.
