2022
DOI: 10.1093/bioinformatics/btac390

hapCon: estimating contamination of ancient genomes by copying from reference haplotypes

Abstract: Motivation: Human ancient DNA (aDNA) studies have surged in recent years, revolutionizing the study of the human past. Typically, aDNA is preserved poorly, making such data prone to contamination from other human DNA. Therefore, it is important to rule out substantial contamination before proceeding to downstream analysis. As most aDNA samples can only be sequenced to low coverages (<1x average depth), computational methods that can robustly estimate contamination in the low coverage re…

Cited by 15 publications (12 citation statements)
References: 43 publications
“…The nuclear DNA contamination was estimated with several methods. We applied ANGSD 0.934 [70] and hapCon [71] for libraries from male individuals, and applied contamLD [72] and a newly developed method that analyses contamination in ROH for female and male libraries (see Supplementary Information, section 2 for a detailed description). The mtDNA contamination was estimated by Schmutzi (--notusepredC --uselength) [69] for all the libraries.…”
Section: Methods (mentioning)
confidence: 99%
“…We estimated individual library contamination using a recently developed method that requires only 0.02× whole-genome coverage per sample (Huang and Ringbauer 2021). This method models and quantifies mismatches in haploid X Chromosomes as contamination and is therefore restricted to individuals who are molecularly sexed as male, which in our case corresponded to eight out of 10 individuals.…”
Section: Results (mentioning)
confidence: 99%
“…Duplicated sequences were removed using SAMtools’ rmdup command. Terminal deamination was assessed using mapDamage (v2.0) (Jónsson et al 2013); contamination was assessed on the haploid X Chromosome of males using hapCon with a threshold of 0.02× or 2000 SNPs (Huang and Ringbauer 2021); and molecular sex was determined by looking at the fraction of sequences aligning to the Y Chromosome compared with the total fraction aligning to both sex chromosomes (Skoglund et al 2013). To randomly subsample FASTQ files for reanalysis of merged libraries, seqtk ( ) was used.…”
Section: Methods (mentioning)
confidence: 99%
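
The molecular sexing step described in the statement above (the fraction of sequences aligning to the Y Chromosome out of all sequences aligning to either sex chromosome) can be sketched as follows. This is a minimal illustration, not the cited authors' code: the function name `y_fraction_sex`, the normal-approximation confidence interval, and the default cut-offs `female_max` and `male_min` are assumptions made for this example and should be checked against the cited paper before any real use.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class SexCall:
    ry: float   # fraction of sex-chromosome reads that align to Y
    se: float   # binomial standard error of that fraction
    call: str   # "XX", "XY", or "undetermined"


def y_fraction_sex(n_x: int, n_y: int,
                   female_max: float = 0.016,
                   male_min: float = 0.075) -> SexCall:
    """Classify molecular sex from read counts on X and Y.

    female_max and male_min are illustrative cut-offs (assumptions for
    this sketch); pass values taken from the method you actually follow.
    """
    total = n_x + n_y
    if total == 0:
        raise ValueError("no reads aligned to the sex chromosomes")
    ry = n_y / total
    se = sqrt(ry * (1.0 - ry) / total)
    # Approximate 95% interval under a normal approximation (assumption).
    lower, upper = ry - 1.96 * se, ry + 1.96 * se
    if upper < female_max:
        call = "XX"
    elif lower > male_min:
        call = "XY"
    else:
        call = "undetermined"
    return SexCall(ry=ry, se=se, call=call)
```

Only libraries called "XY" would then be passed on to an X-chromosome-based contamination estimate, since that approach relies on the X Chromosome being haploid.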
“…To account for the contamination present in the nuclear DNA, we estimated the contamination with the software hapCon (93), which is based on detecting polymorphic sites on the X chromosome of male individuals. The estimation of contamination was performed only in three individuals meeting the inclusion criteria (XY assignment and > 0.02x on the X chromosome).…”
Section: Methods (mentioning)
confidence: 99%