MicroRNAs (miRNAs) are short RNA species derived from hairpin-forming miRNA precursors (pre-miRNA) and acting as key posttranscriptional regulators. Most computational tools labeled as miRNA predictors are in fact pre-miRNA predictors and provide no information about the putative miRNA location within the pre-miRNA. Sequence and structural features that determine the location of the miRNA, and the extent to which these properties vary from species to species, are poorly understood. We have developed miRdup, a computational predictor for the identification of the most likely miRNA location within a given pre-miRNA or the validation of a candidate miRNA. MiRdup is based on a random forest classifier trained with experimentally validated miRNAs from miRBase, with features that characterize the miRNA–miRNA* duplex. Because we observed that miRNAs have sequence and structural properties that differ between species, mostly in terms of duplex stability, we trained various clade-specific miRdup models and obtained increased accuracy. MiRdup self-trains on the most recent version of miRBase and is easy to use. Combined with existing pre-miRNA predictors, it will be valuable for both de novo mapping of miRNAs and filtering of large sets of candidate miRNAs obtained from transcriptome sequencing projects. MiRdup is open source under the GPLv3 and available at http://www.cs.mcgill.ca/~blanchem/mirdup/.
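To make the idea of locating a miRNA within a pre-miRNA concrete, the following is a minimal sketch, not miRdup's actual implementation. It enumerates candidate windows in a hairpin and ranks them with a caller-supplied scoring function that stands in for a trained classifier. The function names, the three features (length, GC fraction, paired fraction) and the default window lengths are all assumptions made for illustration; the abstract does not list miRdup's real features.

```python
from typing import Callable, Dict, List, Tuple


def duplex_features(seq: str, structure: str, start: int, end: int) -> Dict[str, float]:
    """Illustrative (hypothetical) features for the candidate seq[start:end].

    `structure` is a dot-bracket string of the same length as `seq`;
    indices are 0-based and `end` is exclusive.
    """
    sub_seq = seq[start:end].upper()
    sub_struct = structure[start:end]
    length = end - start
    return {
        "length": float(length),
        "gc_fraction": sum(base in "GC" for base in sub_seq) / length,
        "paired_fraction": sum(ch in "()" for ch in sub_struct) / length,
    }


def rank_candidates(
    seq: str,
    structure: str,
    score: Callable[[Dict[str, float]], float],
    min_len: int = 19,  # assumed lower bound on miRNA length
    max_len: int = 24,  # assumed upper bound on miRNA length
) -> List[Tuple[float, int, int]]:
    """Score every window of allowed length and return (score, start, end),
    sorted from highest to lowest score."""
    candidates = []
    for length in range(min_len, max_len + 1):
        for start in range(0, len(seq) - length + 1):
            end = start + length
            feats = duplex_features(seq, structure, start, end)
            candidates.append((score(feats), start, end))
    return sorted(candidates, reverse=True)
```

In practice, `score` would be the predicted probability from a trained model (for example a random forest) rather than a hand-written function; passing it in as a parameter keeps the window enumeration independent of the classifier.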
Background: Urban malaria can be a serious public health problem in Africa. Human-landing catches of mosquitoes, a standard entomological method for assessing human exposure to malaria vector bites, can lack sensitivity in areas where exposure is low. A simple and highly sensitive tool could serve as a complementary indicator for evaluating malaria exposure in such epidemiological contexts. The human antibody response to the specific Anopheles gSG6-P1 salivary peptide has been described as a suitable biomarker for reliably assessing the level of human exposure to Anopheles bites. The aim of this study was to use this biomarker to evaluate human exposure to Anopheles mosquito bites in urban settings of Dakar (Senegal), one of the largest cities in West Africa, where Anopheles biting rates and malaria transmission are presumed to be low.
Methods: A cross-sectional study of 1,010 individuals from 505 households, children (n = 505) and adults (n = 505), living in 16 districts of downtown Dakar and its suburbs was performed from October to December 2008. IgG responses to the gSG6-P1 peptide were assessed and compared with entomological data obtained in or near the same district.
Results: Considerable individual variations in anti-gSG6-P1 IgG levels were observed between and within districts. In spite of this individual heterogeneity, the median level of specific IgG and the percentage of immune responders differed significantly between districts. A positive and significant association was observed between the level of exposure to Anopheles gambiae bites, estimated by classical entomological methods, and the median IgG levels or the percentage of immune responders, which measure contact between human populations and Anopheles mosquitoes. Interestingly, immunological parameters seemed to discriminate the level of exposure to Anopheles bites between different exposure groups of districts better than entomological estimates did.
Conclusions: Specific human IgG responses to the gSG6-P1 peptide biomarker represent, at the population and individual levels, a credible new alternative tool for accurately assessing the heterogeneity of exposure to Anopheles bites and of malaria risk in low-transmission urban areas. The development of such a biomarker tool would be particularly relevant for mapping and monitoring malaria risk and for measuring the effectiveness of vector control strategies in these specific settings.
Background: Advances in cloning and sequencing technology are yielding a massive number of viral genomes. The classification and annotation of these genomes constitute important assets in the discovery of genomic variability, taxonomic characteristics and disease mechanisms. Existing classification methods are often designed for specific, well-studied families of viruses. Viral comparative genomic studies could therefore benefit from more generic, fast and accurate tools for classifying and typing newly sequenced strains of diverse virus families.
Results: Here, we introduce a virus classification platform, CASTOR, based on machine learning methods. CASTOR is inspired by a well-known technique in molecular biology: restriction fragment length polymorphism (RFLP). It simulates, in silico, the restriction digestion of genomic material by different enzymes into fragments. It uses two metrics to construct feature vectors for machine learning algorithms in the classification step. We benchmark CASTOR for the classification of distinct datasets of human papillomaviruses (HPV), hepatitis B viruses (HBV) and human immunodeficiency viruses type 1 (HIV-1). Results reveal true positive rates of 99%, 99% and 98% for HPV Alpha species, HBV genotyping and HIV-1 M subtyping, respectively. Furthermore, CASTOR shows competitive performance compared to well-known HIV-1 specific classifiers (REGA and COMET) on whole genomes and pol fragments.
Conclusion: The performance, genericity and robustness of CASTOR could enable novel and accurate large-scale virus studies. The CASTOR web platform provides open-access, collaborative and reproducible machine learning classifiers. CASTOR can be accessed at http://castor.bioinfo.uqam.ca.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1602-3) contains supplementary material, which is available to authorized users.
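The in silico digestion step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not CASTOR's code: the genome is treated as a linear sequence, only three common enzymes are included, and because the abstract does not name its two metrics, fragment count and mean fragment length are used here purely as stand-ins. The names `ENZYMES`, `digest` and `feature_vector` are hypothetical.

```python
from typing import Dict, List, Tuple

# Recognition site and cut offset (number of bases into the site after which
# the top strand is cut). The choice of these three enzymes is illustrative.
ENZYMES: Dict[str, Tuple[str, int]] = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI": ("GGATCC", 1),    # G^GATCC
}


def digest(seq: str, site: str, cut_offset: int) -> List[str]:
    """Cut a linear sequence at every occurrence of `site` and return the fragments."""
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments


def feature_vector(seq: str) -> List[float]:
    """Per enzyme, emit two stand-in metrics: fragment count and mean fragment length."""
    features: List[float] = []
    for site, cut_offset in ENZYMES.values():
        fragments = digest(seq, site, cut_offset)
        lengths = [len(fragment) for fragment in fragments]
        features.append(float(len(fragments)))
        features.append(sum(lengths) / len(lengths))
    return features
```

Vectors built this way have a fixed length (two values per enzyme) regardless of genome size, which is what allows them to be passed directly to a standard classifier.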