MicroRNAs play critical roles in many physiological processes. Their dysregulation is also closely related to the development and progression of various human diseases, including cancer. Identifying new microRNAs associated with diseases therefore contributes to a better understanding of pathogenic mechanisms. MicroRNAs also represent a tremendous opportunity in biotechnology for early diagnosis. To date, several in silico methods have been developed to predict microRNA-disease associations. However, these methods have various limitations. In this study, we investigate the hypothesis that information attached to miRNAs and diseases can be revealed by distributional semantics. Our basic approach is to represent distributional information on miRNAs and diseases in a high-dimensional vector space and to define associations between miRNAs and diseases in terms of their vector similarity. Cross-validation performed on a dataset of known miRNA-disease associations demonstrates the excellent performance of our method. Moreover, a case study focused on breast cancer confirms the ability of our method to discover new disease-miRNA associations and to identify putative false associations reported in databases.
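The idea of scoring associations by vector similarity can be illustrated with a minimal sketch. The abstract does not say which similarity measure or vector construction is used, so cosine similarity, the function names, and the toy vectors below are all assumptions made for illustration only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_mirnas(disease_vec, mirna_vecs):
    """Return (name, score) pairs sorted from most to least similar to the disease vector."""
    scored = [(name, cosine_similarity(disease_vec, vec))
              for name, vec in mirna_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy vectors; real distributional vectors would be high-dimensional.
mirna_vecs = {
    "mir-A": [1.0, 0.0, 1.0],
    "mir-B": [0.0, 1.0, 0.0],
    "mir-C": [1.0, 1.0, 0.0],
}
disease_vec = [1.0, 0.0, 1.0]

ranking = rank_mirnas(disease_vec, mirna_vecs)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

In such a scheme, the highest-ranked miRNAs not yet recorded for a disease would be candidate new associations, and recorded associations with very low scores would be candidates for review.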
MicroRNAs, small non-coding elements involved in gene regulation, are promising biomarkers for various diseases such as cancers. They have considerable potential as biotechnological tools for early diagnosis and gene therapy. However, experimental verification of microRNA-disease associations is time-consuming and costly, which makes computational modeling a suitable alternative. Previously, we designed MiRAI, a predictive method based on distributional semantics, to identify new associations between microRNA molecules and human diseases. Our preliminary results showed very good prediction scores compared to other available methods. However, MiRAI's performance depends on numerous parameters that cannot be tuned manually. In this study, a parallel evolutionary algorithm is proposed for finding an optimal configuration of our predictive method. The automatically parametrized version of MiRAI achieved excellent performance. It highlighted new miRNA-disease associations, especially the potential implication of mir-188 and mir-795 in various diseases. In addition, our method allowed us to detect several putative false associations contained in the reference database.
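Automatic parameter tuning by evolutionary search can be sketched as follows. This is not the algorithm used for MiRAI, whose details are not given in the abstract: it is a simple sequential, elitist evolutionary search with Gaussian mutation, and the function names, parameter bounds, and toy fitness function are hypothetical. In a real setting the fitness would be a prediction score obtained by cross-validation, and the evaluations could be distributed across workers.

```python
import random

def evolve(fitness, bounds, pop_size=20, generations=30, sigma=0.1, seed=0):
    """Elitist evolutionary search maximizing `fitness` over box-bounded real parameters."""
    rng = random.Random(seed)

    def clip(x, lo, hi):
        return max(lo, min(hi, x))

    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        # Each parent produces one child by Gaussian mutation of every parameter.
        children = [
            [clip(g + rng.gauss(0, sigma * (hi - lo)), lo, hi)
             for g, (lo, hi) in zip(ind, bounds)]
            for ind in pop
        ]
        # Keep the best pop_size individuals among parents and children.
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return pop[0]

def toy_fitness(params):
    """Hypothetical stand-in for a prediction score; its maximum is at (0.3, 0.7)."""
    x, y = params
    return -((x - 0.3) ** 2 + (y - 0.7) ** 2)

bounds = [(0.0, 1.0), (0.0, 1.0)]
best = evolve(toy_fitness, bounds)
print(best, toy_fitness(best))
```

Because selection here keeps parents alongside children, the best configuration found so far is never lost between generations.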
In Vibrio cholerae, the etiological agent of cholera, most of the virulence genes are located in two pathogenicity islands, named TCP (Toxin-Co-regulated Pilus) and CTX (Cholera ToXins). For each V. cholerae pathogenicity gene, we retrieved every primer published since 1990 and every known allele in order to perform a complete in silico survey and assess the quality of the PCR primers used for amplification of these genes. Primers with a melting temperature in the range 55–60°C against any target sequence were considered valid. Our survey clearly revealed that two-thirds of the published primers are not able to properly detect every genetic variant of the target genes. Moreover, the quality of primers did not improve with time. Nor is their lifetime, i.e. the number of times they were cited in the literature, a reliable criterion for selecting valid primers. We were able to improve some primers or design new primers for the few cases where no valid primer was found. In conclusion, many published primers should be avoided or improved for use in molecular detection tests, in order to improve specificity and coverage. This study suggests that bioinformatic analyses are important to validate the choice of primers.
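The 55–60°C validity window can be illustrated with a rough sketch. The abstract does not state how melting temperatures were computed, so the Wallace rule of thumb (2°C per A/T, 4°C per G/C, intended for short oligonucleotides) is used here purely as an assumption; it ignores mismatches with the target, salt, and primer concentration. The primer sequences are invented for the example.

```python
def wallace_tm(primer):
    """Approximate melting temperature in °C using the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def in_valid_range(primer, low=55, high=60):
    """True if the estimated melting temperature lies within [low, high] °C."""
    return low <= wallace_tm(primer) <= high

# Hypothetical primers, not taken from the study.
primers = ["ATGCATGCATGCATGCATGC", "ATATATATAT"]
for p in primers:
    print(p, wallace_tm(p), in_valid_range(p))
```

A survey like the one described would need a thermodynamic model that accounts for primer-target mismatches, since validity is judged against each allele rather than against the primer sequence alone.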
Summary: Pathogenic agents can be very hard to detect, and usually they do not cause illness for several hours or days. To improve the speed and the accuracy of detection tests and satisfy the needs of early diagnosis, molecular biology methods such as PCR are now used. However, selecting a proper target gene and designing good primers is often not easy. We present a dedicated website, http://patho‐genes.org, where we provide every sequence, functional annotation, published primer and relevant article for every annotated gene of major pathogenic bacterial species listed as key agents that could be used in a bioterrorism attack. Each published primer was analysed to determine its melting temperature, its specificity and its coverage (i.e. its sensitivity against every allele of its target gene). The data generated have been organized as one data sheet per gene, available through multiple browser panels and query systems.
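Coverage in the sense used above, the sensitivity of a primer against every allele of its target gene, can be approximated with a small sketch. The website's actual procedure is not described here, so this assumes the simplest possible criterion: an exact forward-strand match of the primer within the allele sequence. The function name and sequences are hypothetical.

```python
def coverage(primer, alleles):
    """Fraction of allele sequences containing an exact forward-strand match of the primer."""
    if not alleles:
        return 0.0
    p = primer.upper()
    hits = sum(1 for seq in alleles if p in seq.upper())
    return hits / len(alleles)

# Hypothetical alleles of one gene; the second carries a point mutation in the primer site.
alleles = ["TTACGTTT", "TTACCTTT"]
print(coverage("ACGT", alleles))
```

A more realistic check would also consider the reverse complement for reverse primers and tolerate mismatches that do not prevent annealing.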