2018
DOI: 10.1093/bioinformatics/bty228

Convolutional neural networks for classification of alignments of non-coding RNA sequences

Abstract: Motivation: The convolutional neural network (CNN) has been applied to the classification problem of DNA sequences, with the additional purpose of motif discovery. The training of CNNs with distributed representations of four nucleotides has successfully derived position weight matrices on the learned kernels that corresponded to sequence motifs such as protein-binding sites. Results: We propose a novel application of CNNs to classification of pairwise alignments of sequences for accurate clustering of sequences an…

Citations: cited by 74 publications (55 citation statements)
References: 24 publications
“…particularly adept at recognizing motifs and long-range interactions in nucleotide sequence data [10, 18-20, 40-44]. We trained a CNN on a one-hot sequence input, an LSTM on a one-hot sequence input, and a CNN on a two-dimensional, one-hot complementarity map representation input (see "Methods" for complete descriptions of all models).…”
Section: Results (mentioning)
confidence: 99%
“…CNN is an essential model of deep learning, and suitable for identifying sequence profiles, due to its excellent feature extraction capability on high-dimensional data (Kelley et al, 2016;Zeng et al, 2016). The input vector of CNN is primarily based on sequence-derived features, such as the frequency of k-mer occurrence applied in this study and one-hot vector strategy (Aoki and Sakakibara, 2018;Fiannaca et al, 2015;Ghandi et al, 2014;Lee et al, 2011;Nguyen et al, 2016). One apparent advantage of the one-hot vector is to reserve specific position information of each individual nucleotide in sequences.…”
Section: Discussion (mentioning)
confidence: 99%
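The contrast drawn in the statement above, between k-mer frequency features and one-hot vectors, can be illustrated with a minimal sketch. It is not taken from any of the cited papers: the function names `one_hot` and `kmer_frequencies`, the `ACGU` alphabet, and the default `k=2` are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; use "ACGT" for DNA


def one_hot(seq):
    """One row per position; a single 1 marks which nucleotide occurs there.

    Characters outside ALPHABET produce an all-zero row.
    """
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq]


def kmer_frequencies(seq, k=2):
    """Relative frequency of each of the 4**k k-mers, in lexicographic order."""
    n_windows = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    total = max(n_windows, 1)  # avoid division by zero for short sequences
    return [counts["".join(kmer)] / total for kmer in product(ALPHABET, repeat=k)]


# Two sequences with the same composition but a different order:
# the 1-mer frequency vectors coincide, the one-hot encodings do not.
same_composition = kmer_frequencies("AACC", k=1) == kmer_frequencies("CCAA", k=1)
same_positions = one_hot("AACC") == one_hot("CCAA")
```

Here `same_composition` is `True` and `same_positions` is `False`, which is the sense in which a one-hot encoding keeps the position of each nucleotide while a frequency vector discards it. The frequency vector has a fixed length of `4**k` regardless of sequence length, whereas the one-hot encoding grows with the sequence.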
“…One particular deep learning model--Convolutional Neural Network (CNN)--have achieved outstanding performance in image classification, speech recognition, and natural language processing (Krizhevsky et al, 2012; Schmidhuber, 2015). CNN model has also been successfully applied in prediction of unknown sequences profiles or motifs and functional activity discovery, without pre-defining sequence features such as prediction of sequence specificities of DNA- and RNA-binding proteins (Alipanahi et al, 2015), effects of noncoding variants (Zhou and Troyanskaya, 2015), and classification of alignments of noncoding RNA sequences (Alipanahi et al, 2015; Aoki and Sakakibara, 2018; Schmidhuber, 2015; Zeng et al, 2016; Zhou and Troyanskaya, 2015).…”
Section: Introduction (mentioning)
confidence: 99%
“…In recent years, Convolutional Neural Network (CNN) has been widely used to solve biological problems [22, 27, 28]. The structure of the CNN is shown in Figure 1. It contains a convolutional layer with 200 filters in which the kernel size is 6.…”
Section: Methods (mentioning)
confidence: 99%
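To make the layer dimensions in the statement above concrete, the following is a rough sketch of what a one-dimensional convolution with 200 filters and kernel size 6 computes. The statement does not specify the input encoding, padding, or stride, so the 4-channel one-hot input, the absence of padding, the stride of 1, and the random weights are all assumptions; `conv1d_valid` and `one_hot` are hypothetical helper names, and a real model would use a deep learning framework rather than plain Python lists.

```python
import random

ALPHABET = "ACGU"  # assumed alphabet for the illustrative one-hot input


def one_hot(seq):
    """One row per position, one column per nucleotide in ALPHABET."""
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq]


def conv1d_valid(x, filters):
    """Slide every filter along x with stride 1 and no padding.

    x:       L rows, each with C channel values
    filters: F filters, each with K rows of C weights
    returns: L - K + 1 rows, each with F activations
    """
    k = len(filters[0])
    out = []
    for start in range(len(x) - k + 1):
        window = x[start:start + k]
        out.append([
            sum(w * v
                for w_row, x_row in zip(f, window)
                for w, v in zip(w_row, x_row))
            for f in filters
        ])
    return out


NUM_FILTERS, KERNEL_SIZE, CHANNELS = 200, 6, 4  # 200 and 6 come from the quote
rng = random.Random(0)  # fixed seed so the sketch is reproducible
filters = [
    [[rng.gauss(0, 0.1) for _ in range(CHANNELS)] for _ in range(KERNEL_SIZE)]
    for _ in range(NUM_FILTERS)
]

# A length-20 sequence yields 20 - 6 + 1 = 15 positions, each with 200 activations.
feature_map = conv1d_valid(one_hot("ACGUACGUACGUACGUACGU"), filters)
```

Under these assumptions each filter spans six consecutive nucleotides, which is why learned kernels in such models are often read as short motif detectors: a filter whose weights equal the one-hot encoding of a 6-mer responds most strongly where that 6-mer occurs.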