2018
DOI: 10.1186/s12864-018-4459-6

Prediction of enhancer-promoter interactions via natural language processing

Abstract: Background: Precise identification of three-dimensional genome organization, especially enhancer-promoter interactions (EPIs), is important to deciphering gene regulation, cell differentiation and disease mechanisms. Currently, it is a challenging task to distinguish true interactions from other nearby non-interacting ones since the power of traditional experimental methods is limited due to low resolution or low throughput. Results: We propose a novel computational framework EP2vec to assay three-dimensional genom…

Cited by 82 publications (64 citation statements)
References: 40 publications
“…And it is very meaningful to explain biological meaning in the process of training of CNN with visualization. Recently, many studies on biological computing prediction classifications (Li et al 2017a; Liu et al 2017c; Zeng et al 2018) involving CNNs have used convolution kernels of the first layer to extract informative motifs from massive sequence data sets. These followed on from the heuristic work from Deepbind (Alipanahi et al 2015) that generated a position weight matrix (PWM) by aligning all matched sequence segments and calculating the frequency for each kernel.…”
Section: Learned and Analyzed Motifs From CNN Kernel (mentioning)
confidence: 99%
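
The frequency step described in the statement above can be illustrated with a minimal sketch. It assumes the segments matched by one kernel have already been collected and aligned to a common length, and that they contain only A, C, G and T. The function name `position_frequency_matrix` and the example segments are hypothetical; this is not the DeepBind implementation.

```python
from collections import Counter

ALPHABET = "ACGT"


def position_frequency_matrix(segments):
    """Return per-position nucleotide frequencies for aligned segments.

    Assumes every segment has the same length and uses only A, C, G, T.
    """
    if not segments:
        raise ValueError("need at least one segment")
    width = len(segments[0])
    if any(len(seg) != width for seg in segments):
        raise ValueError("segments must share one length")

    matrix = []
    for pos in range(width):
        # Count which base each segment carries at this aligned position.
        counts = Counter(seg[pos] for seg in segments)
        matrix.append({base: counts[base] / len(segments) for base in ALPHABET})
    return matrix


# Hypothetical segments that matched a single first-layer kernel.
segments = ["ACGT", "ACGA", "ACTT", "GCGT"]
pfm = position_frequency_matrix(segments)
```

Each entry of `pfm` is one motif column. Turning these frequencies into log-odds weights against a background distribution is a separate step that the sketch leaves out.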
“…Deep learning becomes advantageous in this scenario as it identifies complex patterns via supervised and unsupervised learning from large datasets (Najafabadi et al, 2015) and can be applied for further insights into GWAS data. However, whilst deep learning enables the consideration of millions of parameters, its application to date has mostly flourished in image classification and natural language processing (Zeng et al, 2018; Aung et al, 2019; Hampe et al, 2019), requiring an investment in its development and benchmarking with traditional models for developing GWAS application. A deep neural network (ExPecto) applied by Zhou et al (2018) prioritized causal variants for immune-related diseases using sequence-based features.…”
Section: Machine Learning Models (mentioning)
confidence: 99%
“…Depending on the type of the input data, computational methods can mainly be divided into two categories: DNA sequence-based methods and epigenomic data-based methods. For DNA sequence-based methods, PEP [19] and EP2vec [20] took advantage of natural language processing to learn the feature representation of DNA sequences, and SPEID [21] used convolutional neural network to learn the feature representation of DNA sequences. Recently, Zhuang et al [22] introduced a novel method to improve the prediction performance of EPIs by using the existing labeled data to pretrain a convolutional neural network (CNN), then adopting the training data from the cell line of interest to continue to train the CNN.…”
Section: Introduction (mentioning)
confidence: 99%
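
To treat a DNA sequence with natural language processing tools, it first has to be turned into something word-like. A common choice, assumed here and not stated in the excerpt above, is to split the sequence into overlapping k-mers and treat each k-mer as a word. The function name `kmer_tokens` and the values of `k` and `stride` are illustrative only.

```python
def kmer_tokens(sequence, k=6, stride=1):
    """Split a DNA sequence into k-mer 'words' using a sliding window.

    k and stride are illustrative defaults, not values taken from the text.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    sequence = sequence.upper()
    # The window starts at 0, stride, 2*stride, ... while a full k-mer still fits.
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


# A short hypothetical sequence, tokenized into 3-mers.
tokens = kmer_tokens("acgtac", k=3, stride=1)
```

The resulting token list is a "sentence" that an embedding model can consume. The choice of embedding model is outside the scope of this sketch.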