Capturing conserved genomic elements to shed light on deep evolutionary history is becoming the new gold standard for phylogenomic research. Ultraconserved elements (UCEs) are shared among distantly related organisms, allowing the capture of unprecedented amounts of genomic data from non-model taxa.
An underappreciated consequence of hybrid enrichment methods is the potential to introduce undetected DNA sequences from organisms outside the lineage of interest, facilitated by the high degree of conservation of the target regions. In this in silico study, we quantify ultraconserved loci using a data set of 400 published genomes. We utilized six newly designed UCE bait sets, tailored to various arthropod groups, and screened for shared conserved elements in all 242 currently published arthropod genomes. Additionally, we included a diverse set of other potentially contaminating organisms, such as various species of fungi and bacteria.
Our results show that specific UCE bait sets can capture genomic elements from vastly divergent lineages, including human DNA. Nonetheless, our in silico modeling demonstrates that sufficiently strict bioinformatic processing parameters effectively filter out unintentionally targeted DNA from taxa other than the focus group. Lastly, we characterize all of the 100 most widely shared UCE loci as highly conserved exonic regions.
We give practical recommendations to address contamination in data sets generated through targeted-enrichment.