2022
DOI: 10.1093/bioinformatics/btac572

MMGraph: a multiple motif predictor based on graph neural network and coexisting probability for ATAC-seq data

Abstract: Motivation: Transcription factor binding site (TFBS) prediction is a crucial step in revealing the functions of transcription factors (TFs) from high-throughput sequencing data. Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) provides insight into TFBSs and nucleosome positioning by probing open chromatin, which, compared to traditional technologies, can reveal multiple TFBSs simultaneously. The existing tools based on convolutional neural networks (CNNs) only find the fixed leng…

Citations: Cited by 8 publications (14 citation statements)
References: References 6 publications
“…To the best of our knowledge, just four other tools exist intended to generate de novo motifs from ATAC-seq data, namely BindVAE 15 and MMGraph 16 (both using machine learning in combination with k-mers), CEMIG 17 (which utilizes De Bruijin graphs created on k-mers), and the RSAT peak-motifs pipeline 18 (a pipeline intended for ChIP-seq, which is also applicable to ATAC-seq data). However, BindVAE, CEMIG and RSAT solely operate on the sequences of complete ATAC-seq peaks, therefore these tools lack precision compared to a FP based tool.…”
Section: Results (mentioning)
confidence: 99%
“…To the best of our knowledge, just four other tools exist intended to generate de novo motifs from ATAC-seq data, namely BindVAE 11 and MMGraph 12 (both using machine learning in combination with k-mers), CEMIG 13 (which utilizes De Bruijin graphs created on k-mers), and the RSAT peak-motifs pipeline 14 (a pipeline intended for ChIP-seq, which is also applicable to ATAC-seq data). However, BindVAE, CEMIG and RSAT solely operate on the sequences of complete ATAC-seq peaks, therefore these tools lack precision compared to a FP based tool.…”
Section: Results (mentioning)
confidence: 99%
“…For each dataset, we regard all its peaks as positive sequences with label '1'. We generate negative sequences with label '0' by randomly shuffling all bases within a positive sequence 22. The negative sequences do not contain TFBSs but possess the same GC contents as the positive ones.…”
Section: Data Acquisition (mentioning)
confidence: 99%
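
The labelling scheme quoted above (peaks as positives with label 1, base-shuffled copies as negatives with label 0) can be sketched in a few lines of Python. This is an illustration only, not the citing authors' code: the function names, the fixed seed, and the two example sequences are made up for the sketch.

```python
import random


def shuffled_negative(seq, rng):
    """Return a copy of seq with all of its bases randomly permuted."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical stand-ins for peak sequences; real input would come from peak files.
positives = ["ACGTGGCCATTA", "TTGACGCGTAAC"]

rng = random.Random(0)  # fixed seed only so the sketch is reproducible
negatives = [shuffled_negative(seq, rng) for seq in positives]

# Label positives 1 and their shuffled counterparts 0, as in the quoted statement.
dataset = [(seq, 1) for seq in positives] + [(seq, 0) for seq in negatives]
```

Because a shuffle only reorders bases, each negative has exactly the same base composition, and therefore the same GC content, as the positive it was derived from. A random permutation can still recreate a short motif by chance, and it does not preserve dinucleotide frequencies, so the sketch should be read as the simplest form of the idea.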
“…The MEME-ChIP framework has adopted the STREME algorithm, which was recently published. The comparison of performance for sequence classification is evaluated by the Area Under Receiver Operating Characteristic curve (AUC), Accuracy (ACC), Matthews Correlation Coefficient (MCC), and area under the Precision-Recall Curve (PRC) 2,22. We downloaded 20 ChIP-exo datasets of E. coli from the proChIPdb database 19, which are composed of a wide array of peak numbers (25-1,630).…”
Section: Benchmarking Motif Discovery on ChIP-exo Datasets (mentioning)
confidence: 99%
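
Three of the metrics named in the statement above (ACC, MCC, and AUC) have standard definitions that fit in a short pure-Python sketch. The labels, scores, and the 0.5 decision threshold below are invented for illustration and are not taken from any of the cited papers; the precision-recall area is left out to keep the example short.

```python
import math


def confusion(y_true, y_pred):
    """Count true positives, true negatives, false positives, false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """ACC: fraction of predictions that match the true label."""
    tp, tn, _, _ = confusion(y_true, y_pred)
    return (tp + tn) / len(y_true)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and classifier scores, thresholded at an assumed cutoff of 0.5.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred), mcc(y_true, y_pred), roc_auc(y_true, scores))
```

ACC and MCC depend on the chosen threshold, whereas AUC is computed from the raw scores and summarises ranking quality across all thresholds, which is one reason benchmarks often report them together.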