Our genomes contain the blueprint of what makes us human and many indications as to why we develop disease. Until the last 10 years, most studies had focussed on protein-coding genes, more specifically DNA sequences coding for proteins. However, these sequences represent less than 5% of our genomes. The other 95% is referred to as the 'dark matter' of our genomes, our understanding of which is extremely limited. Part of this 'dark matter' includes regions that give rise to RNAs that do not code for proteins. A subset of these non-coding RNAs are long non-coding RNAs (lncRNAs), which are now beginning to be dissected, revealing their importance to human health. To improve our understanding and treatment of disease, it is vital that we understand the molecular and cellular functions of lncRNAs, and how their misregulation can contribute to disease. It is not yet clear what proportion of lncRNAs is actually functional; conservation during evolution is being used to assess the biological importance of lncRNAs. Here, we present key themes within the field of lncRNAs, emphasising the importance of their roles in both the nucleus and the cytoplasm of cells, as well as patterns in their modes of action. We discuss their potential functions in development and disease using examples where we have the greatest understanding. Finally, we emphasise why lncRNAs can serve as biomarkers and discuss their emerging potential for therapy.
No conflicts of interest were declared.
What are lncRNAs?
LncRNAs are RNAs of >200 nucleotides (nt) in length that are not thought to code for proteins. Although our appreciation and understanding of lncRNA function and importance have exploded in the last decade, the first lncRNAs were discovered in the 1990s: BC200, H19 [1], and Xist [2]. In the post-genomic era, extensive and deep RNA-Seq has revealed the existence of huge numbers of novel RNA transcripts, including lncRNAs. Many of these novel transcripts are low in abundance and so were not previously identified. Several consortia have been responsible for sequencing RNA from a variety of tissues, cell types, organisms, and disease states, and we now have a much more precise view of which RNA transcripts are expressed, and when and where (GENCODE [3], GTEx [4], FANTOM [5]).