2017
DOI: 10.1016/j.omtn.2017.04.008

2L-piRNA: A Two-Layer Ensemble Classifier for Identifying Piwi-Interacting RNAs and Their Function

Abstract: Involved with important cellular or gene functions and implicated with many kinds of cancers, piRNAs, or piwi-interacting RNAs, are of small non-coding RNA with around 19–33 nt in length. Given a small non-coding RNA molecule, can we predict whether it is of piRNA according to its sequence information alone? Furthermore, there are two types of piRNA: one has the function of instructing target mRNA deadenylation, and the other does not. Can we discriminate one from the other? With the avalanche of RNA sequences…

Citations: cited by 231 publications (109 citation statements)
References: 139 publications (186 reference statements)

“…To develop a useful sequence-based statistical predictor for a biological system as reported in a series of recent publications [74][75][76][77][78][79][80][81][82][83], the Chou's 5-step rule should be observed [84]: (1) How to construct or select a valid dataset to train and test the predictor? (2) How to formulate the biological sequence samples with an effective mathematical expression that can truly reflect their intrinsic correlation with the target to be predicted?…”
Section: Methods (mentioning)
confidence: 99%
“…Jun is activated through phosphorylation at Ser 63 and Ser 73 by JNK [92,93]. A high level of Jun has been observed in various types of cancer including non-small cell lung cancer, oral squamous cell carcinoma, breast cancer and colorectal cancer [94][95][96][97][98].…”
Section: Transcription Associated Genes (mentioning)
confidence: 99%
“…Visual comparison of the three Sv-IGFBP_N' complexes (Figure 5a) clearly demonstrates the binding interface of the N' insulin-binding domain (supported by HADDOCK2.2 simulations: Figure S3), with all highlighted interacting residues predicted by both PDBsum and PRODIGY (shown in Figure 5b; additional residues predicted by PRODIGY presented in Table S1). Of all the predicted interacting residues presented in Figure 5b, we have highlighted those amino acids of IGFBP_N' that show conserved interaction contacts with all three ligands (*), namely: the negatively charged Asp(D) 71 and Asp(D) 94; supported by the polar Gln(Q) 67 (where proton acceptor properties enable it to form two hydrogen bonds, stabilizing the overall negative charge); the neutrally charged Ser(S) 72 and Thr(T) 93; and Gly(G) 70, Gly(G) 91, and Gly(G) 92. In addition to these eight consistent contacts of IGFBP_N', PRODIGY predicts a further nine (Table S1).…”
Section: Complex Formation (mentioning)
confidence: 99%
“…Ever since the concept of pseudo amino acid composition or Chou's PseAAC [55][56][57][58] was proposed, it has been widely used in many biomedicine and drug development areas [59,60] as well as nearly all the areas of computational proteomics (see, e.g., [39,43,45,61–73] and a long list of references cited in two review papers [74,75]). Encouraged by the successes of using PseAAC to deal with protein/peptide sequences, its idea and approach have been extended to deal with DNA/RNA sequences [76][77][78][79][80][81][82] in computational genomics via PseKNC (Pseudo K-tuple Nucleotide Composition) [83,84]. Recently, a very powerful web-server called "Pse-in-One" [85] and its updated version "Pse-in-One 2.0" [86] were developed, by which users can generate any pseudo components for both protein/peptide and DNA/RNA sequences as they wish or define.…”
Section: Proteins Sample Formulation (mentioning)
confidence: 99%
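
As a rough illustration of the sequence formulation mentioned in the statement above, the sketch below computes a plain k-tuple nucleotide composition (normalized k-mer frequencies) for an RNA sequence. It is a simplification and not PseKNC itself, which also appends pseudo components derived from physicochemical correlations; it is also not the feature set of 2L-piRNA or Pse-in-One. The function name `kmer_composition`, its parameters, and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_composition(seq, k=2, alphabet="ACGU"):
    """Return normalized k-tuple frequencies as a dict keyed by k-mer.

    Assumption: DNA-style input is mapped to RNA (T -> U), and windows
    containing characters outside `alphabet` are ignored.
    """
    seq = seq.upper().replace("T", "U")
    # Fixed, ordered feature space: all 4**k possible k-mers.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalize only over windows that are valid k-mers.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


if __name__ == "__main__":
    # Hypothetical 22-nt small RNA, used only to show the call.
    features = kmer_composition("UGAGGUAGUAGGUUGUAUAGUU", k=2)
    print(len(features), "features")
```

The resulting fixed-length vector (16 values for k = 2, 64 for k = 3) is the kind of input a downstream classifier could consume, whatever the sequence length.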
“…incorrectly predicted to be of the i-th location. The metrics of Equation (19) have been widely used to examine the quality of predictors in genome/proteome analysis (see, e.g., [46,47,76–80,107–109]) and computational biomedicine (see, e.g., [82,110–112]). Given in Table 3 are the corresponding results obtained by pLoc-mGpos for each of the four subcellular locations.…”
Section: Comparison With the State-of-the-art Predictor (mentioning)
confidence: 99%