2020
DOI: 10.1038/s41598-020-77824-2

XG-ac4C: identification of N4-acetylcytidine (ac4C) in mRNA using eXtreme gradient boosting with electron-ion interaction pseudopotentials

Abstract: N4-acetylcytidine (ac4C) is a post-transcriptional modification in mRNA which plays a major role in the stability and regulation of mRNA translation. The working mechanism of ac4C modification in mRNA is still unclear and traditional laboratory experiments are time-consuming and expensive. Therefore, we propose an XG-ac4C machine learning model based on the eXtreme Gradient Boost classifier for the identification of ac4C sites. The XG-ac4C model uses a combination of electron-ion interaction pseudopotentials a…

Cited by 45 publications (33 citation statements)
References 38 publications
“…To the best of our knowledge, only two computation methods, PACES (Zhao et al., 2019) and XG-ac4C (Alam et al., 2020), have been developed so far for the prediction of ac4C from sequences, and again, both were based on strong supervision. Although both approaches achieved positive prediction performance, they require a very specific sequence pattern and consider only sequences that have at least five continuous CXX repeats, which may limit the applicable scope of these methods.…”
Section: Results (mentioning)
confidence: 99%
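The sequence constraint described in this statement can be illustrated with a short check. This is a minimal sketch, not the filter used by PACES or XG-ac4C: it assumes "five continuous CXX repeats" means five or more back-to-back triplets that each start with C, with X standing for any nucleotide. The names `CXX_RUN` and `has_cxx_run` are hypothetical.

```python
import re

# Assumption: a "CXX repeat" is a triplet starting with C followed by any two
# nucleotides, and "continuous" means the triplets are directly adjacent.
CXX_RUN = re.compile(r"(?:C[ACGU]{2}){5,}")


def has_cxx_run(seq: str) -> bool:
    """Return True if seq contains at least five adjacent CXX triplets."""
    # Normalise case and treat DNA-style T as RNA U before matching.
    rna = seq.upper().replace("T", "U")
    return CXX_RUN.search(rna) is not None


if __name__ == "__main__":
    print(has_cxx_run("CAACGGCUUCAGCCC"))  # five adjacent CXX triplets
    print(has_cxx_run("CAACGGCUUCAG"))     # only four triplets
```

Under this reading, any sequence without such a run would be outside the scope of both predictors, which is the limitation the citing authors point to.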
“…Electron–Ion Interaction Pseudopotentials (EIIP) represent the energy of delocalized electrons in nucleotides. They have been used as a composition measure that has been effective in several bioinformatics algorithms [34], [35]. Using the EIIP technique, every nucleotide in an RNA sequence is encoded using the distribution of free electron energies.…”
Section: Feature Extraction (mentioning)
confidence: 99%
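The per-nucleotide encoding described in this statement can be sketched as a simple lookup. The values below are the EIIP values commonly quoted for DNA nucleotides; reusing the thymine value for uracil is an assumption made here for RNA input, and `EIIP` and `eiip_encode` are hypothetical names rather than code from the cited papers.

```python
# Commonly quoted EIIP values for nucleotides.
# Assumption: U is given the same value as T so RNA sequences can be encoded.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335, "U": 0.1335}


def eiip_encode(seq: str) -> list[float]:
    """Replace each nucleotide with its EIIP value.

    Raises KeyError for symbols outside A, C, G, T, U (for example N).
    """
    return [EIIP[base] for base in seq.upper()]


if __name__ == "__main__":
    print(eiip_encode("ACGU"))
```

The result is a numeric vector with one entry per position, which a classifier can consume directly or after further aggregation.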
“…The convolutional layer is used to automatically extract important features from an encoded DNA sequence. We apply the nucleotide chemical properties (NCP) and nucleotide density (ND) methods to encode the input DNA sequences [25,31,32]. Moreover, we use the batch normalization and dropout layers to control overfitting.…”
Section: Introduction (mentioning)
confidence: 99%
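The NCP and ND encodings named in this statement can be sketched together. This is an illustration under stated assumptions, not the citing paper's implementation: NCP uses the widely used three-bit assignment (ring structure, functional group, hydrogen bonding), and ND at position i is the number of times that position's nucleotide has appeared so far divided by i. The names `NCP` and `ncp_nd_encode` are hypothetical.

```python
# Widely used NCP assignment: (purine ring, amino group, weak hydrogen bond).
# Assumption: U shares T's code so the same table covers DNA and RNA.
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "T": (0, 0, 1),
    "U": (0, 0, 1),
}


def ncp_nd_encode(seq: str) -> list[tuple[int, int, int, float]]:
    """Encode each position as three NCP bits plus its nucleotide density."""
    counts: dict[str, int] = {}
    rows = []
    for i, base in enumerate(seq.upper(), start=1):
        counts[base] = counts.get(base, 0) + 1
        # Density: occurrences of this nucleotide in the prefix of length i.
        density = counts[base] / i
        rows.append((*NCP[base], density))
    return rows


if __name__ == "__main__":
    for row in ncp_nd_encode("ACA"):
        print(row)
```

Each sequence of length L becomes an L x 4 matrix, a shape that suits the convolutional layer mentioned in the statement.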