2022
DOI: 10.3390/ijms23063044

Identification of D Modification Sites Using a Random Forest Model Based on Nucleotide Chemical Properties

Abstract: Dihydrouridine (D) is an abundant post-transcriptional modification present in transfer RNA from eukaryotes, bacteria, and archaea. D has contributed to treatments for cancerous diseases. Therefore, the precise detection of D modification sites can enable further understanding of its functional roles. Traditional experimental techniques to identify D are laborious and time-consuming. In addition, there are few computational tools for such analysis. In this study, we utilized eleven sequence-derived feature ext…

Cited by 8 publications (7 citation statements)
References 58 publications (91 reference statements)
“…This section summarizes the 11 remaining encoders, namely nucleic acid composition (NAC) 62 , enhanced nucleic acid composition (ENAC) 63 , accumulated nucleotide frequency (ANF) 64 , dinucleotide composition (DNC) 65 , trinucleotide composition (TNC) 66 , nucleotide chemical property (NCP) 67 , binary 68 , electron ionic interaction potential (EIIP) 57 , series correlation pseudo dinucleotide composition (SCPseDNC) 69 , pseudo dinucleotide composition (PSEDNC) 70 , 71 , and pseudo k-tuple composition (PSEKNC) 72 .…”
Section: Methods
Type: mentioning
confidence: 99%
“…Similarly, dinucleotide composition (DNC) 65 and trinucleotide composition (TNC) 66 use pairs or triplets of nucleotides (k = 2 or k = 3) to compute normalized occurrence frequencies rather than taking into account individual nucleotides. Enhanced nucleic acid composition (ENAC) 63 transforms raw sequences into statistical vectors by counting the number of different k-mers in a fixed sliding window. First, a dictionary of unique k-mers is created, and then for each unique k-mer its count within each window is computed.…”
Section: Methods
Type: mentioning
confidence: 99%
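
The k-mer counting described in the statement above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by the cited papers: the function names `kmer_composition` and `enac`, the RNA alphabet `ACGU`, the default window size of 5, and the choice to normalize each window's counts by the number of k-mers in that window are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; use "ACGT" for DNA sequences


def kmer_composition(seq: str, k: int) -> list[float]:
    """Normalized frequency of every possible k-mer.

    k=1 corresponds to NAC, k=2 to DNC, and k=3 to TNC.
    """
    total = len(seq) - k + 1  # number of overlapping k-mers in seq
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    # Dictionary of all possible k-mers, in a fixed lexicographic order.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(total))
    # k-mers containing symbols outside ALPHABET are counted in `total`
    # but have no slot in the output vector.
    return [counts[m] / total for m in kmers]


def enac(seq: str, window: int = 5, k: int = 1) -> list[float]:
    """ENAC-style features: per-window k-mer frequencies, concatenated."""
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    features: list[float] = []
    for start in range(len(seq) - window + 1):
        features.extend(kmer_composition(seq[start:start + window], k))
    return features


if __name__ == "__main__":
    seq = "GGAUCC"  # toy sequence, not taken from any dataset
    print(len(kmer_composition(seq, 2)))  # 4**2 possible dinucleotides
    print(enac(seq, window=5, k=1))       # 2 windows x 4 nucleotides
```

With a fixed k-mer order, every sequence of the same length maps to a vector of the same length, which is what allows the features to be fed to a classifier such as a random forest.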
“…Similarly, dinucleotide composition (DNC) 46 and trinucleotide composition (TNC) 47 use pairs or triplets of nucleotides (k = 2 or k = 3) to compute normalized occurrence frequencies rather than taking into account individual nucleotides. Enhanced nucleic acid composition (ENAC) 44 transforms raw sequences into statistical vectors by counting the number of different k-mers in a fixed sliding window. First, a dictionary of unique k-mers is created, and then for each unique k-mer its count within each window is computed.…”
Section: Methods
Type: mentioning
confidence: 99%
“… 25 A number of computational methods have been developed for predicting epigenetic modifications of RNA. 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 Among them, iRNAD is the first approach for D-site prediction from multiple species, which used a support vector machine to distinguish D and non-D sites. 34 Later, iRNAD_XGBoost used XGBoost-selected multiple features to construct a model for D detection.…”
Section: Introduction
Type: mentioning
confidence: 99%
“… 34 Later, iRNAD_XGBoost used XGBoost-selected multiple features to construct a model for D detection. 33 However, to the best of our knowledge, all existing D-site prediction tools 32 , 33 , 34 , 35 were trained on tRNAs, and it is not clear whether they can be applied to predict D sites on mRNAs. Although recent studies have unveiled the widely occurring nature and transcriptome-wide distribution of D (or the D epitranscriptome), 14 there are still no prediction tools constructed for mRNA D sites using mRNA D datasets.…”
Section: Introduction
Type: mentioning
confidence: 99%