2023
DOI: 10.1101/2023.05.26.542440
Preprint

CEMIG: Prediction of the cis-regulatory motif using the De Bruijn graph from ATAC-seq

Abstract: Sequence motif discovery algorithms identify novel DNA patterns with significant biological roles, such as transcription factor (TF) binding site motifs. Chromatin accessibility data, accumulated through assay for transposase-accessible chromatin with sequencing (ATAC-seq), has enriched resources for motif discovery. However, computational efforts in ATAC-seq data analysis mainly target TF binding activity footprinting rather than motif prediction. Here, we introduce CEMIG, an algorithm predicting and characte…

Cited by 4 publications (2 citation statements)
References: 54 publications
“…To the best of our knowledge, just four other tools exist intended to generate de novo motifs from ATAC-seq data, namely BindVAE 11 and MMGraph 12 (both using machine learning in combination with k-mers), CEMIG 13 (which utilizes De Bruijin graphs created on k-mers), and the RSAT peak-motifs pipeline 14 (a pipeline intended for ChIP-seq, which is also applicable to ATAC-seq data). However, BindVAE, CEMIG and RSAT solely operate on the sequences of complete ATAC-seq peaks, therefore these tools lack precision compared to a FP based tool.…”
Section: Results (mentioning)
confidence: 99%
“…Inspired by CEMIG [31], we employed dHICA to discern cell-specific HM peak sites in K562 and GM12878 cells. For each HM, we delineated K562-specific, GM12878-specific and shared peaks (detailed in supplementary Text S2).…”
Section: Performance Evaluation Across Different Cell Lines Tissues A... (mentioning)
confidence: 99%