Background: Long non-coding RNAs (lncRNAs) represent a novel class of non-coding RNAs having a crucial role in many biological processes. The identification of long non-coding homologs among different species is essential to investigate such roles in model organisms as homologous genes tend to retain similar molecular and biological functions. Alignment-based metrics are able to effectively capture the conservation of transcribed coding sequences and then the homology of protein coding genes. However, unlike protein coding genes the poor sequence conservation of long non-coding genes makes the identification of their homologs a challenging task. Results: In this study we compare alignment-based and alignment-free string similarity metrics and look at promoter regions as a possible source of conserved information. We show that promoter regions encode relevant information for the conservation of long non-coding genes across species and that such information is better captured by alignment-free metrics. We perform a genome wide test of this hypothesis in human, mouse, and zebrafish. Conclusions: The obtained results persuaded us to postulate the new hypothesis that, unlike protein coding genes, long non-coding genes tend to preserve their regulatory machinery rather than their transcribed sequence. All datasets, scripts, and the prediction tools adopted in this study are available at https://github.com/bioinformatics-sannio/lncrna-homologs. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2441-6) contains supplementary material, which is available to authorized users.
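To make the idea of an alignment-free string similarity metric concrete, the following minimal sketch compares two sequences by the cosine similarity of their k-mer count vectors. It is one common alignment-free measure chosen here for illustration; the abstract does not say which metrics the authors used, and the function names and the default k = 3 are assumptions.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count every overlapping k-mer in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_cosine_similarity(seq_a, seq_b, k=3):
    """Cosine similarity between the k-mer count vectors of two sequences.

    No alignment is computed: only the k-mer composition matters, so
    rearranged but compositionally similar regions can still score high.
    Returns 0.0 when either sequence is shorter than k.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    dot = sum(counts_a[kmer] * counts_b[kmer]
              for kmer in counts_a.keys() & counts_b.keys())
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Because the score depends only on k-mer composition, it does not require the two sequences to share a conserved linear order, which is the property that distinguishes this family of metrics from alignment-based ones.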
Breast cancer (BC) is a heterogeneous disease characterized by different biopathological features, differential response to therapy, and substantial variability in long-term survival. BC heterogeneity reflects the genetic and epigenetic alterations that affect the behavior of transformed cells. Estrogen receptor alpha positive (ERα+) BC is the most common subtype and is generally associated with a better prognosis and improved long-term survival when compared to ERα-negative tumors. This is mainly due to the efficacy of endocrine therapy, which, by interfering with estrogen biosynthesis and action, blocks ER-mediated cell proliferation and tumor spread. Acquired resistance to endocrine therapy, however, represents a great challenge in the clinical management of ERα+ BC, causing tumor growth and recurrence irrespective of estrogen blockade. Improving overall survival in such cases requires new and effective anticancer drugs, allowing adjuvant treatments able to overcome resistance to first-line endocrine therapy. To date, several studies have applied loss-of-function genome-wide screenings to identify key (hub) “fitness” genes that are essential for BC progression and represent candidate drug targets to overcome lack of response, or acquired resistance, to current therapies. Here, we review the biological significance of the essential genes and related functional pathways affected in ERα+ BC, most of which are strictly interconnected with each other and represent potentially effective targets for novel molecular therapies.
Background: The unveiling of long non-coding RNAs as important gene regulators in many biological contexts has increased the demand for efficient and robust computational methods to identify novel long non-coding RNAs from transcripts assembled with high throughput RNA-seq data. Several classes of sequence-based features have been proposed to distinguish between coding and non-coding transcripts. Among them, open reading frame, conservation scores, nucleotide arrangements, and RNA secondary structure have been used successfully in the literature to recognize intergenic long non-coding RNAs, a particular subclass of non-coding RNAs. Results: In this paper we perform a systematic assessment of a wide collection of features extracted from sequence data. We use most of the features proposed in the literature, and we include, as a novel set of features, the occurrence of repeats contained in transposable elements. The aim is to detect signatures (groups of features) able to distinguish long non-coding transcripts from other classes, both protein-coding and non-coding. We evaluate different feature selection algorithms, test for signature stability, and evaluate the prediction ability of a signature with a machine learning algorithm. The study reveals different signatures in human, mouse, and zebrafish, highlighting that some features are shared among species, while others tend to be species-specific. Including novel signatures, such as those identified here, in a machine learning algorithm improves prediction performance over coding potential tools and similar supervised approaches: the area under the precision-recall curve increases by 1 to 24%, depending on the species and on the signature. Conclusions: Understanding which features are best suited for the prediction of long non-coding RNAs allows for the development of more effective automatic annotation pipelines, especially relevant for poorly annotated genomes such as zebrafish. We provide a web tool that recognizes novel long non-coding RNAs with the obtained signatures from fasta and gtf formats. The tool is available at the following URL: http://www.bioinformatics-sannio.org/software/. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1594-z) contains supplementary material, which is available to authorized users.
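The performance measure quoted in this abstract, area under the precision and recall curve, can be illustrated with a short sketch. The version below computes average precision, a commonly used step-wise estimate of that area; whether the authors used this estimator or an interpolated one is not stated, and the function name and the toy labels and scores in the usage note are hypothetical.

```python
def average_precision(labels, scores):
    """Average precision: a step-wise estimate of the area under the
    precision-recall curve.

    labels: iterable of 0/1 (1 = positive class, e.g. lncRNA).
    scores: classifier scores, higher meaning more likely positive.
    Tied scores are ranked in an arbitrary but deterministic order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_positive = sum(1 for _, label in ranked if label)
    if n_positive == 0:
        raise ValueError("at least one positive label is required")

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision measured at the rank where recall increases.
            precision_sum += hits / rank
    return precision_sum / n_positive
```

For example, `average_precision([1, 0, 1], [0.9, 0.8, 0.7])` averages the precision at the two ranks where a positive is retrieved, (1/1 + 2/3) / 2. Unlike ROC-based measures, this quantity is sensitive to class imbalance, which is typical when long non-coding transcripts are a minority of the assembled transcripts.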
Parkinson’s disease (PD) is the second most common neurodegenerative disorder. The number of PD cases is expected to double by 2030, placing a heavy burden on the healthcare system. The disease is characterized by the progressive loss of dopaminergic neurons in the substantia nigra of the midbrain, which leads to striatal dopamine deficiency and, subsequently, to motor dysfunction. Transcriptome analysis across the various RNA classes plays a crucial role in the study of this neurodegenerative disease. The aim of this study was to evaluate the transcriptome in a cohort of subjects with PD compared with a control cohort. In particular, we focused on mRNAs and long non-coding RNAs (lncRNAs), using the Illumina NextSeq 550 DX System. Differential expression analysis revealed 716 transcripts with padj ≤ 0.05; among these, 630 were mRNA (protein-coding), lncRNA, and MT_tRNA. Ingenuity pathway analysis (IPA, Qiagen) was used to perform the functional and pathway analysis. The most statistically significant pathways were: IL-15 signaling, B cell receptor signaling, systemic lupus erythematosus in B cell signaling pathway, communication between innate and adaptive immune cells, and melatonin degradation II. Our findings further reinforce the important roles of mitochondria and lncRNA in PD and, in parallel, further support the concept of inverse comorbidity between PD and some cancers.
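The threshold padj ≤ 0.05 refers to p-values adjusted for multiple testing. The abstract does not state which correction was applied; the sketch below assumes the Benjamini-Hochberg procedure, a common choice in differential expression analysis. The function name and the example p-values in the usage note are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Each raw p-value is scaled by m / rank (rank taken in ascending
    order of p), then a running minimum from the largest p-value
    downwards enforces monotonicity of the adjusted values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant_indices(pvalues, alpha=0.05):
    """Indices of tests whose adjusted p-value is at most alpha."""
    adjusted = benjamini_hochberg(pvalues)
    return [i for i, p in enumerate(adjusted) if p <= alpha]
```

With raw p-values `[0.01, 0.04, 0.03, 0.5]`, only the first test passes at alpha = 0.05 after adjustment, even though three of the four raw values are below 0.05. This is why a transcript count reported at padj ≤ 0.05 is typically smaller than the count at an unadjusted p ≤ 0.05.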