2021
DOI: 10.1101/2021.09.20.461055
Preprint

Detection of m6A from direct RNA sequencing using a Multiple Instance Learning framework

Abstract: RNA modifications such as m6A methylation form an additional layer of complexity in the transcriptome. Nanopore direct RNA sequencing captures this information in the raw current signal for each RNA molecule, enabling the detection of RNA modifications using supervised machine learning. However, experimental approaches provide only site-level training data, whereas the modification status for each single RNA molecule is missing. Here we present m6Anet, a neural network-based method that leverages the Multiple …

Citations: cited by 19 publications (28 citation statements)
References: 59 publications (99 reference statements)
“…Although some positions are consistent between two experiments (e.g., 27,764), others vary (e.g., 28,616). The analysis of the output of m6Anet program shows that the positions putatively methylated have a significantly higher probability (>25%) than the putatively non-methylated (<5%), which is consistent with the model training and benchmarking [33]. Therefore, positions that, in at least one experiment, have a ≥50% probability are indicative that a substantial proportion of modified bases, over unmodified, are present in the sample.…”
Section: Discussion (supporting)
confidence: 71%
“…Therefore, positions that, in at least one experiment, have a ≥50% probability are indicative that a substantial proportion of modified bases, over unmodified, are present in the sample. The m6A detection method employed here uses a model that takes into account the mixture of modified and unmodified RNAs and outputs the m6A-modification probability at any given site for all the DRACH 5-mers represented in the neural network training data and this might, at least in part, explain the variance among experiments as observed with different cell lines [33]. As a future perspective, the m6Anet model can be trained for the SARS-CoV-2 genome, which might adjust the probability values for m6A prediction in these specific RNAs.…”
Section: Discussion (mentioning)
confidence: 99%
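The selection rule described in the statements above (a position counts as likely modified if its predicted probability is at least 50% in at least one experiment) can be sketched as follows. This is only an illustrative sketch: the function name, the input layout, and the example values are hypothetical and are not taken from m6Anet or from the citing paper.

```python
from typing import Dict, List


def sites_passing_threshold(
    probs_by_experiment: Dict[str, Dict[int, float]],
    threshold: float = 0.5,
) -> List[int]:
    """Return positions with probability >= threshold in at least one experiment."""
    passing = set()
    for site_probs in probs_by_experiment.values():
        for position, prob in site_probs.items():
            if prob >= threshold:
                passing.add(position)
    return sorted(passing)


# Hypothetical positions and probabilities, for illustration only.
example = {
    "experiment_a": {100: 0.81, 200: 0.12, 300: 0.03},
    "experiment_b": {100: 0.77, 200: 0.56, 300: 0.04},
}
print(sites_passing_threshold(example))
```

Position 200 passes here because it reaches the threshold in one of the two hypothetical experiments, which matches the "at least one experiment" wording of the quoted rule.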
“…These m6A-labeled data sets can be obtained through wet lab assays such as m6ACE-seq and CLIP-seq [42,61] or artificially methylating adenosines with methyltransferases [43]. Such supervised methods include MINES [42], EpiNano [43], Nanom6A [44], and m6anet [45]. As m6A is preferentially found in DRACH/RRACH motifs [12], MINES uses this aspect by modeling m6A sites using a random forest classifier for each of the DRACH motifs using CLIP-seq-identified m6A sites as positive samples [42].…”
Section: Supervised Learning Methods For Detecting Specific RNA Modifications (mentioning)
confidence: 99%
“…EpiNano trains a support vector machine that predicts candidate m6A sites from basecalling errors that are presumably caused by the presence of m6A [43] while Nanom6A trains an eXtreme Gradient Boosting (XGBoost) model with the raw signal features [44]. Predicting the probability of a site being modified with m6A, m6anet employs a multiple instance learning framework and takes the entire differentially labeled reads into account [45].…”
Section: Supervised Learning Methods For Detecting Specific RNA Modifications (mentioning)
confidence: 99%
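To make the multiple instance learning idea concrete: each site is a "bag" of reads, only the site-level label is known, and per-read probabilities have to be pooled into one site-level probability. The text quoted here does not say which pooling m6Anet uses, so the sketch below assumes noisy-OR pooling, one common choice in multiple instance learning. The function name and inputs are hypothetical.

```python
from typing import Sequence


def noisy_or_pool(read_probs: Sequence[float]) -> float:
    """Pool per-read modification probabilities into a site-level probability.

    Assumes noisy-OR pooling: the site is considered modified if at least
    one read is modified, treating reads as independent.
    """
    if not read_probs:
        raise ValueError("a site needs at least one read")
    prob_no_read_modified = 1.0
    for p in read_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("read probabilities must lie in [0, 1]")
        prob_no_read_modified *= 1.0 - p
    return 1.0 - prob_no_read_modified


# Hypothetical per-read probabilities for a single site.
site_probability = noisy_or_pool([0.1, 0.2])
```

Under this assumption the site-level probability can only grow as more reads are added, so in practice the number of reads per site would have to be controlled or sampled consistently.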