Mammals and other complex organisms transcribe an abundance of long non-coding RNAs (lncRNAs) that fulfill a wide variety of regulatory roles in many biological processes. These roles, which include acting as scaffolds and as guides for protein-coding genes, depend mainly on the structure and expression level of the lncRNAs. In this review, we focus on current methods for analyzing lncRNA structure and expression, which provide basic but necessary information for in-depth, large-scale analysis of lncRNA functions.
The ENCODE project, which has published 30 papers to date, including a few that extensively characterize lncRNAs, has revealed that 76% of the human genome is transcribed to produce a range of lncRNAs [1]. The landscape of lncRNAs in mammals was unveiled by rapid progress in deep sequencing technology [2] and in computational methods for identifying lncRNAs [3,4]. These lncRNAs participate in a wide variety of biological processes, such as imprinting control, cell differentiation, and immune responses, by regulating the expression, activity, and localization of protein-coding genes [5,6]. However, the functions and mechanisms of most lncRNAs are still unknown. Here, we draw on our own work and related work to systematically review current methods for studying lncRNA function through structure and expression profiles.
RNA secondary structure prediction
Secondary structures of RNAs are the basis of their tertiary structures, so we first briefly review methods for their prediction. Such methods predict standard Watson-Crick base pairs (AU and CG) and non-standard base pairs in an RNA sequence. Many methods exist for RNA secondary structure prediction, and they are based on different principles. Here we focus on two commonly used types: the minimum free energy method [7][8][9][10][11] and the multiple sequence alignment method [12][13][14]. The minimum free energy methods are now the most widely used methods of RNA secondary structure predic-