Circulating nucleic acids are found in free form in body fluids and may serve as minimally invasive tools for cancer diagnosis and prognosis. Only a few studies have investigated the potential application of circulating mRNAs and microRNAs (miRNAs) in prostate cancer (PCa). The Cancer Genome Atlas (TCGA) database was used for an in silico analysis to identify circulating mRNA and miRNA as potential markers of PCa. A total of 2,267 genes and 49 miRNAs were differentially expressed between normal and tumor samples. The prediction analyses of target genes and integrative analysis of mRNA and miRNA expression revealed eleven genes and eight miRNAs which were validated by RT-qPCR in plasma samples from 102 untreated PCa patients and 50 cancer-free individuals. Two genes, OR51E2 and SIM2, and two miRNAs, miR-200c and miR-200b, showed significant association with PCa. Expression levels of these transcripts distinguished PCa patients from controls (67% sensitivity and 75% specificity). PCa patients and controls with prostate-specific antigen (PSA) ≤ 4.0 ng/mL were discriminated based on OR51E2 and SIM2 expression levels. The miR-200c expression showed association with Gleason score and miR-200b, with bone metastasis, bilateral tumor, and PSA > 10.0 ng/mL. The combination of circulating mRNA and miRNA was useful for the diagnosis and prognosis of PCa.
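For readers less familiar with the diagnostic metrics quoted above, the following minimal sketch shows how sensitivity and specificity are computed from a confusion matrix. The function name and the counts are hypothetical and purely illustrative; they are not the study's data and were chosen only so that the arithmetic is easy to follow.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: patients correctly called positive
    fn: patients missed by the test
    tn: controls correctly called negative
    fp: controls wrongly called positive
    """
    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of controls correctly excluded
    return sensitivity, specificity


# Hypothetical counts (NOT taken from the study): 12 patients, 12 controls.
sens, spec = sensitivity_specificity(tp=8, fn=4, tn=9, fp=3)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity depends only on the patient group and specificity only on the control group, so neither value changes with the ratio of patients to controls in the cohort.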
Our feature selection analysis considered 5468 features, of which only 16 were needed to robustly identify lncRNAs with the REPTree algorithm. This reduced set was the basis for building the model, which was trained with lncRNA and mRNA data from five plant species (thale cress, cucumber, soybean, poplar and Asian rice). After an extensive comparison with other tools widely used in plants (CPC, CPC2, CPAT and PLncPRO), we found that RNAplonc produced more reliable lncRNA predictions from plant transcripts, with the best result in 87.5% of eight tests in eight species from the GreeNC database and four independent studies in monocotyledonous (Brachypodium) and eudicotyledonous (Populus and Gossypium) species.
As a consequence of the various genomic sequencing projects, an increasing volume of biological sequence data is being produced. Although machine learning algorithms have been successfully applied to a large number of genomic sequence-related problems, the results are strongly affected by the type and number of features extracted. This effect has motivated new algorithms and pipeline proposals, mainly for feature extraction, where extracting significant discriminatory information from a set of biological sequences is challenging. Considering this, our work proposes a new study of feature extraction approaches based on mathematical features (numerical mapping with Fourier, entropy and complex networks). As a case study, we analyze long non-coding RNA sequences. We separated this work into three studies. First, we assessed our proposal on the most frequently addressed problem in our review, namely distinguishing lncRNA from mRNA; second, we validated the mathematical features on different classification problems that predict the class of lncRNA, e.g., circular RNA sequences; third, we analyzed its robustness in scenarios with imbalanced data. The experimental results demonstrated three main contributions: first, an in-depth study of several mathematical features; second, a new feature extraction pipeline; and third, its high performance and robustness for distinct RNA sequence classification tasks. Availability: https://github.com/Bonidia/FeatureExtraction_BiologicalSequences
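To make the idea of mathematical features more concrete, here is a minimal sketch of two of the feature families named above: Shannon entropy over k-mer frequencies, and a Fourier power spectrum over a binary numerical mapping. It is not the authors' pipeline (see the repository linked above for that); the function names, the binary indicator mapping, the default `ACGT` alphabet, and the summary statistics are all assumptions made for illustration.

```python
import cmath
import math
from collections import Counter


def kmer_entropy(seq, k=2):
    """Shannon entropy (in bits) of the k-mer frequency distribution."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def binary_mapping(seq, base):
    """Indicator signal: 1.0 where the sequence equals `base`, else 0.0."""
    return [1.0 if ch == base else 0.0 for ch in seq]


def power_at(signal, f):
    """Squared magnitude of the discrete Fourier coefficient at frequency f."""
    n = len(signal)
    coeff = sum(x * cmath.exp(-2j * math.pi * f * t / n)
                for t, x in enumerate(signal))
    return abs(coeff) ** 2


def fourier_features(seq, alphabet="ACGT"):
    """Summarise the power spectrum summed over one indicator signal per base.

    Only frequencies 1..n//2 are used: f=0 just counts bases, and the
    spectrum of a real-valued signal is symmetric around n/2.
    """
    n = len(seq)
    signals = [binary_mapping(seq, b) for b in alphabet]
    freqs = range(1, n // 2 + 1)
    spectrum = [sum(power_at(s, f) for s in signals) for f in freqs]
    peak = max(spectrum)
    return {
        "mean_power": sum(spectrum) / len(spectrum),
        "max_power": peak,
        "peak_index": spectrum.index(peak) + 1,  # frequency index of the peak
    }


example = "ACG" * 4  # toy sequence with an exact period of 3
features = {"entropy_k2": kmer_entropy(example, k=2), **fourier_features(example)}
print(features)
```

For the toy sequence, the period of 3 over 12 positions places the spectral peak at frequency index 12 / 3 = 4. A feature vector built this way could then be passed to any standard classifier; for RNA sequences written with U, the `alphabet` argument would need to be `"ACGU"`.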
LTR-retrotransposons are the most abundant repeat sequences in plant genomes and play an important role in evolution and biodiversity. Their characterization is of great importance to understand their dynamics. However, the identification and classification of these elements remains a challenge today. Moreover, current software can be relatively slow (from hours to days), sometimes involves a lot of manual work and does not reach satisfactory levels of precision and sensitivity. Here we present Inpactor2, an accurate and fast application that creates LTR-retrotransposon reference libraries in a very short time. Inpactor2 takes an assembled genome as input and follows a hybrid approach (deep learning and structure-based) to detect elements, filter partial sequences and finally classify intact sequences into superfamilies and, as very few tools do, into lineages. This tool takes advantage of multi-core and GPU architectures to decrease execution times. Using the rice genome, Inpactor2 showed a run time of 5 minutes (faster than other tools) and has the best accuracy and F1-Score of the tools tested here, also having the second best accuracy and specificity only surpassed by EDTA, but achieving 28% higher sensitivity. For large genomes, Inpactor2 is up to seven times faster than other available bioinformatics tools.
Background: Small non-coding regulatory RNAs control cellular functions at the transcriptional and post-transcriptional levels. Oral squamous cell carcinoma is among the leading cancers in the world and the presence of cervical lymph node metastases is currently its strongest prognostic factor. In this work we aimed at finding small RNAs expressed in oral squamous cell carcinoma that could be associated with the presence of lymph node metastasis.
Methods: Small RNA libraries from metastatic and non-metastatic oral squamous cell carcinomas were sequenced for the identification and quantification of known small RNAs. Selected markers were validated in plasma samples. Additionally, we used in silico analysis to investigate possible new molecules, not previously described, involved in the metastatic process.
Results: Global expression patterns were not associated with cervical metastases. MiR-21, miR-203 and miR-205 were highly expressed throughout samples, in agreement with their role in epithelial cell biology, but disagreeing with studies correlating these molecules with cancer invasion. Eighteen microRNAs, but no other small RNA class, varied consistently between metastatic and non-metastatic samples. Nine of these microRNAs had been previously detected in human plasma, eight of which presented consistent results between tissue and plasma samples. MiR-31 and miR-130b, known to inhibit several steps in the metastatic process, were over-expressed in non-metastatic samples and the expression of miR-130b was confirmed in plasma of patients showing no metastasis. MiR-181 and miR-296 were detected in metastatic tumors and the expression of miR-296 was confirmed in plasma of patients presenting metastasis. A novel microRNA-like molecule was also associated with non-metastatic samples, potentially targeting cell-signaling mechanisms.
Conclusions: We corroborate literature data on the role of small RNAs in cancer metastasis and suggest the detection of microRNAs as a tool that may assist in the evaluation of oral squamous cell carcinoma metastatic potential.
Electronic supplementary material: The online version of this article (doi:10.1186/s12920-015-0102-4) contains supplementary material, which is available to authorized users.