Citation: Wang P, Xiao X (2014) NRPred-FS: A Feature Selection based Two-level Predictor for Nuclear Receptors. J Proteomics Bioinform S9: 002.

The aforementioned methods each have their own merits and did play a role in stimulating the development of this area, but they all share the following main shortcomings. (1) The datasets constructed to train the predictors were derived from an old version of NucleaRDB, which has since been substantially updated. (2) Various feature extraction

Abstract

Motivation: Nuclear receptors (NRs) play a role in all developmental and physiological processes and are important drug targets in a wide variety of disease and healthy states. In past years, many machine learning methods have been introduced to identify NRs and their subfamilies with high throughput and at low cost. However, these predictors were all developed on old datasets from NucleaRDB; moreover, none of them employs a feature selection technique, so their performance is limited.

Result: In this study, a feature selection based two-level predictor, called NRPred-FS, is developed. It identifies whether a query protein is a nuclear receptor based on its sequence information alone; if it is, the prediction automatically continues to assign it to one of the following eight subfamilies: (1) Thyroid hormone like (NR1), (2) HNF4-like (NR2), (3) Estrogen like (NR3), (4) Nerve growth factor IB-like (NR4), (5) Fushi tarazu-F1 like (NR5), (6) Germ cell nuclear factor like (NR6), (7) knirps like (NR0A), and (8) DAX like (NR0B). The nuclear receptor sequences are encoded as sequence-derived feature vectors that incorporate various physicochemical and statistical features. Furthermore, the feature set is optimized by a forward feature selection algorithm to reduce the feature dimension and to obtain higher classification accuracy.
As a demonstration, the method was rigorously tested on a benchmark dataset derived from the latest versions of NucleaRDB and UniProt. The overall prediction accuracies under leave-one-out cross-validation were about 97% and 93% at the first and second levels, respectively. For the convenience of users, the predictor, NRPred-FS, is freely accessible at http://www.jci-bioinfo.cn/NRPred-FS. Hopefully it will be a useful tool for identifying NRs and their subfamilies.
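To make the idea of forward feature selection scored by leave-one-out cross-validation concrete, the following is a minimal sketch and not the NRPred-FS implementation. The abstract does not name the classifier or the stopping rule, so the 1-nearest-neighbour classifier, the "stop when accuracy no longer improves" rule, the function names `loo_accuracy` and `forward_select`, and the toy data are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def loo_accuracy(X: Sequence[Sequence[float]], y: Sequence[int],
                 features: List[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    (an assumed stand-in classifier) using only the given feature indices."""
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = float("inf"), None
        for j in range(len(X)):
            if j == i:
                continue  # the held-out sample is never its own neighbour
            dist = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += best_label == y[i]
    return correct / len(X)


def forward_select(X: Sequence[Sequence[float]], y: Sequence[int],
                   max_features: Optional[int] = None) -> Tuple[List[int], float]:
    """Greedy forward selection: repeatedly add the single feature that most
    improves leave-one-out accuracy; stop when no candidate improves it."""
    n_features = len(X[0])
    limit = max_features or n_features
    selected: List[int] = []
    best_score = 0.0
    while len(selected) < limit:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in candidates]
        # Ties are broken in favour of the lower feature index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break  # assumed stopping rule: no improvement, stop adding
        selected.append(feature)
        best_score = score
    return selected, best_score


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X_toy = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
         [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y_toy = [0, 0, 0, 1, 1, 1]

selected, score = forward_select(X_toy, y_toy)
print(selected, score)
```

On the toy data the procedure keeps only the informative feature, since adding the noisy one lowers the leave-one-out accuracy. In a two-level setting such as the one described above, the same selection could in principle be run once for the NR versus non-NR task and once for the subfamily task; whether the authors did so is not stated in this section.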
Journal of Proteomics & Bioinformatics, ISSN: 0974-276X