2021
DOI: 10.1038/s41467-021-21765-5

Deep learning-based enhancement of epigenomics data with AtacWorks

Abstract: ATAC-seq is a widely-applied assay used to measure genome-wide chromatin accessibility; however, its ability to detect active regulatory regions can depend on the depth of sequencing coverage and the signal-to-noise ratio. Here we introduce AtacWorks, a deep learning toolkit to denoise sequencing coverage and identify regulatory peaks at base-pair resolution from low cell count, low-coverage, or low-quality ATAC-seq data. Models trained by AtacWorks can detect peaks from cell types not seen in the training dat…

Cited by 41 publications (21 citation statements)
References: 43 publications
“…ATAC-seq data sets from cryopreserved tissue were slightly lower quality compared to data sets obtained from fresh tissue, a known issue when processing frozen samples and tissues specifically [31]. Processing with the AtacWorks algorithm [36] resulted in improved signal quality, similar to what we observed with processed data from fresh tissue (Figure 5B).…”
Section: Results; citation type: supporting
confidence: 74%
“…We observed enrichment of open chromatin at known oocyte-specific and granulosa cell-specific markers, GDF-9 and FOXL-2, respectively, that appeared consistent between donors (Figure 4C), indicative of isolation of the cell types of interest. To improve the signal over background in our data sets, we used a deep learning toolkit called AtacWorks [36] to denoise our sequencing data (Figure 4D).…”
Section: Results; citation type: mentioning
confidence: 99%
“…Methods such as TF-MoDISco could be applied to scBasset ISM scores for de novo motif discovery (Shrikumar et al, 2018; Avsec et al, 2021). All approaches to scATAC analysis depend on accurate peak calls, and predictive modeling frameworks have been proposed to help identify highly specific regulatory elements (Lal et al, 2021). We expect a neural network model would further improve scATAC peak calling by taking into account sequence information (and accounting for Tn5 transposition bias).…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…Our work demonstrates the feasibility of training such complex models (thousands of free parameters) on limited data sets (hundreds rather than thousands of samples), and we have tested that it can handle other data sets of much larger size (tens of thousands of samples, data not shown). It stands in contrast to other available implementations of neural networks for regulatory genomics, which are targeted to modeling epigenomic (39, 63, 64) and genome-wide TF-DNA binding (36, 38) data, or do not explicitly model the dependence of sequence function on cellular descriptors such as TF levels (65). This feature allows CoNSEPT to make predictions for varying cellular conditions.…”
Section: Discussion; citation type: mentioning
confidence: 99%