2018
DOI: 10.1038/s41588-018-0207-8

Functional classification of long non-coding RNAs by k-mer content

Abstract: The functions of most long non-coding RNAs (lncRNAs) are unknown. In contrast to proteins, lncRNAs with similar functions often lack linear sequence homology; thus, the identification of function in one lncRNA rarely informs the identification of function in others. We developed a sequence comparison method to deconstruct linear sequence relationships in lncRNAs and evaluate similarity based on the abundance of short motifs called k-mers. We found that lncRNAs of related function often had similar k-mer profil…
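To make the idea of comparing sequences by k-mer abundance rather than linear alignment concrete, here is a minimal sketch. It is not the authors' implementation: the function names `kmer_profile` and `pearson`, the per-kilobase length normalization, and the choice of Pearson correlation as the similarity measure are all assumptions made for illustration.

```python
from itertools import product
from math import sqrt


def kmer_profile(seq, k):
    """Return k-mer counts per kilobase in a fixed lexicographic k-mer order.

    Hypothetical helper: the per-kilobase scaling is an assumed normalization.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA alphabets alike
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other symbols
            counts[word] += 1
    scale = 1000.0 / max(len(seq), 1)
    return [counts[kmer] * scale for kmer in kmers]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Two toy sequences with the same motifs in a different linear order.
seq_a = "ACGT" * 10 + "GGCC" * 10
seq_b = "GGCC" * 10 + "ACGT" * 10
similarity = pearson(kmer_profile(seq_a, 2), kmer_profile(seq_b, 2))
```

In this toy case the two sequences would align poorly end to end, yet their 2-mer profiles are nearly identical, which is the kind of non-linear relationship the abstract describes.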

Cited by 214 publications (258 citation statements)
References 55 publications
“…Thus, the enrichment of a particular protein-binding motif in an individual Xist or Rsx repeat domain does not provide direct evidence that the protein binds to the domain. Nevertheless, these results are consistent with the notions that lncRNA k-mer content encodes information about protein-binding potential (Kirk et al, 2018), and that the various repeats in Xist and Rsx encode function through the concerted recruitment of multiple RNA-binding proteins.…”
Section: Multiple Protein-binding Motifs Are Enriched To Extreme Leve…
Citation type: supporting
confidence: 86%
“…In our previous work, we found that SEEKR performed best when the length of the lncRNA or lncRNA fragment being studied was similar to 4^k, i.e. the total number of possible k-mers at k-mer length k. In tests of Xist-like repressive activity, we found that comparisons of lncRNAs using k-mer lengths of k≥7 underperformed relative to comparisons using smaller k-mer lengths, owing to the fact that most annotated lncRNAs are much less than 4^7 (16384) nucleotides long, and k-mer profiles of individual lncRNAs at k≥7 (≥16384 possible k-mers) are dominated by "0" values (Kirk et al, 2018). Based on this observation, and because Repeats A and B, two essential repetitive regions within Xist (Almeida et al, 2017; Hoki et al, 2009; Pintacuda et al, 2017; Royce-Tolland et al, 2010; Wutz et al, 2002), are each about 4^4 (256) nucleotides in length, we reasoned that k-mer profiles at k=4 (4^4=256 possible k-mers) would provide a reasonable estimate of sequence complexity for the repeats without being dominated by "0" values.…”
Section: Non-linear Similarity Between Repeat Domains In Xist and Rsx
Citation type: mentioning
confidence: 90%
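The arithmetic behind this statement can be checked with a short sketch. A sequence of length L contains at most L - k + 1 k-mer windows, so at least 1 - (L - k + 1) / 4^k of the 4^k possible k-mers must have a count of zero. The helper names below (`possible_kmers`, `min_zero_fraction`, `zero_fraction`) are hypothetical and are not part of SEEKR.

```python
def possible_kmers(k):
    """Number of distinct k-mers over the alphabet A, C, G, T."""
    return 4 ** k


def min_zero_fraction(length, k):
    """Lower bound on the fraction of k-mers with zero count in a sequence.

    A sequence of the given length has at most (length - k + 1) windows,
    so no more than that many distinct k-mers can be observed.
    """
    windows = max(length - k + 1, 0)
    observed_at_most = min(windows, possible_kmers(k))
    return 1.0 - observed_at_most / possible_kmers(k)


def zero_fraction(seq, k):
    """Actual fraction of the 4^k possible k-mers that never occur in seq."""
    seq = seq.upper().replace("U", "T")
    observed = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    observed = {w for w in observed if set(w) <= set("ACGT")}
    return 1.0 - len(observed) / possible_kmers(k)


# A 256-nt fragment (about the size quoted for Repeats A and B):
bound_k4 = min_zero_fraction(256, 4)  # 253 windows vs 256 possible 4-mers
bound_k7 = min_zero_fraction(256, 7)  # 250 windows vs 16384 possible 7-mers
```

For a 256-nucleotide fragment the bound is about 1% at k=4 but above 98% at k=7, which illustrates why profiles at large k are dominated by zeros for short sequences. The bound is a best case; repetitive sequences, as measured by `zero_fraction`, leave even more k-mers unobserved.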
“…Plasmid construction pEGFP-C1 F-tractin-EGFP was a gift from Dyche Mullins (Addgene plasmid # 58473; http://n2t.net/addgene:58473; RRID:Addgene_58473). PiggyBac plasmids PB-rtTA and PB-miRE-tre-Puro were kindly provided by Mauro Calabrese (The University of North Carolina at Chapel Hill); PB-rtTA encodes reverse tetracycline-controlled transactivator (rtTA) and G418 resistant gene under UbC promoter 44 , and PB-miRE-tre-Hygro encodes a protein of interest and a hygromycin resistant gene under tetracycline-dependent and EF1 promoters, respectively. pFtractin-Halo was first generated by cloning Halo-Tag cDNA (forward primer, aggggggctagcgctcgccaccatggcagaaatcggtactggctttc; reverse primer, cgaagcttgagctcgagatctagtcgactgaattcgcgttatcgc) between AgeI and BglII site of pEGFP-C1 F-tractin-EGFP using Gibson assembly (New England Biolabs, MA).…”
Section: Volume Imaging Of Vimentin Actin and Lysosomes
Citation type: mentioning
confidence: 99%