Integration sites assay (11). B Clonal integration sites were identified ≥2 times in a single location or at least once in both locations. C Wilcoxon signed rank test to determine if differences in clonal detection are within sampling error.
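The clonality rule in footnote B can be written as a short check. The sketch below is only an illustration, not the authors' pipeline. It assumes exactly two sampling locations and that each detection is recorded as a (site, location) pair. The function name `clonal_sites`, the site identifiers, and the location labels are all hypothetical.

```python
from collections import Counter


def clonal_sites(detections, min_repeats=2):
    """Return the set of sites that meet the footnote B rule.

    detections: iterable of (site, location) pairs, one per detection event.
    A site counts as clonal if it was detected at least `min_repeats` times
    in a single location, or at least once in each of two locations.
    """
    # Count how often each site was detected in each location.
    per_location = Counter(detections)

    counts_by_site = {}
    for (site, location), count in per_location.items():
        counts_by_site.setdefault(site, {})[location] = count

    clonal = set()
    for site, counts in counts_by_site.items():
        repeated_in_one = any(c >= min_repeats for c in counts.values())
        seen_in_both = len(counts) >= 2  # assumes only two locations exist
        if repeated_in_one or seen_in_both:
            clonal.add(site)
    return clonal


# Hypothetical example data: site labels and locations are made up.
example = [
    ("chr1:1000+", "location_1"),
    ("chr1:1000+", "location_1"),  # seen twice in one location
    ("chr2:500-", "location_1"),
    ("chr2:500-", "location_2"),   # seen once in each location
    ("chr3:42+", "location_2"),    # seen once only: not clonal
]
result = clonal_sites(example)
```

With this example data, the first two sites qualify and the third does not, because a single detection in one location is not evidence of clonal expansion under this rule.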
HIV persists during antiretroviral therapy (ART) as integrated proviruses in cells descended from a small fraction of the CD4+ T cells infected prior to the initiation of ART. To better understand what controls HIV persistence and the distribution of integration sites (IS), we compared about 15,000 and 54,000 IS from individuals pre-ART and on ART, respectively, with approximately 395,000 IS from PBMC infected in vitro. The distribution of IS in vivo is quite similar to the distribution in PBMC, but modified by selection against proviruses in expressed genes, by selection for proviruses integrated into one of 7 specific genes, and by clonal expansion. Clones in which a provirus integrated in an oncogene contributed to cell survival comprised only a small fraction of the clones persisting on ART. Mechanisms that do not involve the provirus, or its location in the host genome, are more important in determining which clones expand and persist.
Background: All retroviruses, including human immunodeficiency virus (HIV), must integrate a DNA copy of their genomes into the genome of the infected host cell to replicate. Although integrated retroviral DNA, known as a provirus, can be found at many sites in the host genome, integration is not random. The adaptation of linker-mediated PCR (LM-PCR) protocols for high-throughput integration site mapping, using randomly sheared genomic DNA and Illumina paired-end sequencing, has dramatically increased the number of mapped integration sites. Analysis of samples from human donors has shown that there is clonal expansion of HIV-infected cells and that clonal expansion makes an important contribution to HIV persistence. However, analysis of HIV integration sites in samples taken from patients requires extensive PCR amplification and high-throughput sequencing, which makes the methodology prone to certain specific artifacts.
Results: To address the problems with artifacts, we use a comprehensive approach involving experimental procedures linked to a bioinformatics analysis pipeline. Using this combined approach, we are able to reduce the number of PCR/sequencing artifacts that arise and identify the ones that remain. Our streamlined workflow combines random cleavage of the DNA in the samples, end repair, and linker ligation in a single step. We provide guidance on primer and linker design that reduces some of the common artifacts. We also discuss how to identify and remove some of the common artifacts, including the products of PCR mispriming and PCR recombination, that have appeared in some published studies. Our improved bioinformatics pipeline rapidly parses the sequencing data and identifies bona fide integration sites in clonally expanded cells, producing an Excel-formatted report that can be used for additional data processing.
Conclusions: We provide a detailed protocol that reduces the prevalence of artifacts that arise in the analysis of retroviral integration site data generated from in vivo samples and a bioinformatics pipeline that is able to remove the artifacts that remain.