Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering

Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, Max

doi:10.1089/aid.2014.0211

Cited by 20 publications

(22 citation statements)

References 68 publications


“…A greater extent of clustering for longer HIV sequences in this study corroborates the results of our recent study (93), which used a set of nearly full-length HIV-1C sequences from the LANL HIV Database (http://www.hiv.lanl.gov/). Longer HIV sequences are more informative for HIV cluster analysis due to a larger number of informative sites (93). The technique of long-range HIV genotyping allows the use of amplicon 1 and amplicon 2 sequences either separately or in concatenation for a powerful cluster analysis.…”

Section: Discussion (supporting)

confidence: 90%
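The statement above attributes the benefit of longer sequences to a larger number of informative sites. As a minimal sketch of what is being counted, the snippet below tallies variable and parsimony-informative columns in a toy alignment. The function name `site_counts`, the toy sequences, and the choice to ignore gaps and ambiguity codes are assumptions made for illustration; they are not taken from the cited papers.

```python
from collections import Counter


def site_counts(alignment):
    """Return (variable, parsimony_informative) column counts.

    `alignment` is a list of equal-length nucleotide strings.
    Assumption: only A, C, G and T are counted; gaps and ambiguity
    codes are ignored when classifying a column.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    variable = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(b for b in (c.upper() for c in column) if b in valid)
        # Variable site: more than one nucleotide observed in the column.
        if len(counts) > 1:
            variable += 1
        # Parsimony-informative site: at least two different nucleotides,
        # each present in at least two sequences.
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return variable, informative


# Hypothetical toy alignment (not real HIV data).
toy = ["ACGTACGT", "ACGTACGA", "ACCTACGA", "ACCTTCGA"]
short = [seq[:4] for seq in toy]  # the same sequences, truncated

print("short:", site_counts(short))
print("long: ", site_counts(toy))
```

In this toy case the longer alignment contains more variable columns than its truncated version, which is the kind of comparison the statement refers to; real analyses would use dedicated alignment and phylogenetics software.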

Long-Range HIV Genotyping Using Viral RNA and Proviral DNA for Analysis of HIV Drug Resistance and HIV Clustering

et al. 2015

Self Cite


HIV genotyping is a critical tool for antiviral drug resistance testing that has revolutionized HIV care and advanced HIV-related research. Routine antiretroviral (ARV) drug resistance testing is useful in choosing an optimal treatment regimen and monitoring its efficiency in clinical practice (1-12). HIV genotyping has been used successfully in research on HIV transmission clusters and HIV transmission dynamics (13-35).

Initial broadly used ARV regimens included combinations of nucleoside reverse transcriptase (RT) inhibitors (NRTIs) and nonnucleoside reverse transcriptase inhibitors (NNRTIs). To monitor the emergence of drug resistance mutations associated with NRTIs and NNRTIs, HIV genotyping targeted viral sequences spanning an approximately 1,000- to 1,300-bp region of the HIV-1 genome encoding viral protease and partial RT, using viral RNA as a template for amplification. While the RNA-based approach works well in antiretroviral therapy (ART)-naive individuals, it is less successful if levels of viral replication are low, such as in individuals on ART. The sequence length of traditional RNA-based HIV genotyping for drug resistance is relatively short and does not cover the HIV-1 region encoding viral integrase or the viral envelope, hindering analysis of drug resistance mutations associated with integrase strand transfer inhibitors or entry inhibitors. The global scale-up of ARV treatment and successful introduction of integrase strand transfer inhibitors and entry inhibitors into clinical trials and clinical practice necessitate modification of traditional methods of HIV genotyping.

Two commercial genotyping assays, ViroSeq HIV-1 from Abbott Molecular and TruGene HIV-1 from Siemens Molecular Diagnostics, have been widely used for analysis of HIV-1-associated drug resistance. Both genotyping kits were extensively tested and validated (36-45). While the ViroSeq HIV-1 kit is still on the market, Siemens discontinued selling and supporting the TruGene HIV-1 kit in 2014. The ViroSeq HIV-1 kit covers the entire protease-coding region and the RT region encoding the first 320 amino acids. The TruGene HIV-1 sequences span the protease (amino acids 4 to 99)- and RT (amino acids 40 to 240)-coding regions. The CDC supplies WHO-designated and CDC-supported President's Emergency Plan for AIDS Relief (PEPFAR) Genotyping Laboratories with the ATCC HIV-1 Drug Resistance Genotyping kit (46) for drug resistance testing. Many experienced genotyping laboratories have developed their own in-house amplification and sequencing protocols (11, 47-56), including identification of minor viral variants that are normally missed by commercial genotyping kits (57-61). All of these approaches generally include smaller and more restricted regions for testing HIV-1 drug resistance.

Recently, the protocol developed by Gall et al. (62)



“…This is consistent with our recent studies on sampling density (Novitsky et al, 2014) and importance of virus sequence length (Novitsky et al, 2015) in HIV cluster analysis. Two additional acute HIV sub-epidemics were found among clusters with 5+ members and bootstrap support between 0.70 and 0.80, although both of these clusters had low internode certainty.…”

Section: Discussion (supporting)

confidence: 93%

“…It is possible that bootstrapped ML inference of the short-range sequence set selected HIV lineages that represent only small sub-chains of much larger transmission chains in the population. Recently we demonstrated that viral sequence length plays an important role in HIV cluster analysis (Novitsky et al, 2015). It is likely that using long-range sequences could refine clustering and reveal more extensive clustering.…”

Section: Discussion (mentioning)

confidence: 99%

Phylodynamic analysis of HIV sub-epidemics in Mochudi, Botswana

et al. 2015

Self Cite


Southern Africa continues to be the epicenter of the HIV/AIDS epidemic. This HIV-1 subtype C epidemic has a predominantly heterosexual mode of virus transmission and high (>15%) HIV prevalence among adults. The epidemiological dynamics of the HIV-1C epidemic in southern Africa are still poorly understood. Here, we aim at a better understanding of HIV transmission dynamics by analyzing HIV-1 subtype C sequences from Mochudi, a peri-urban village in Botswana. HIV-1C env gene sequences (gp120 V1C5) were obtained through enhanced household-based HIV testing and counseling in Mochudi. More than 1,200 sequences were generated and phylogenetically distinct sub-epidemics within Mochudi identified. The Bayesian birth-death skyline plot was used to estimate the effective reproductive number, R, and the timing of virus transmission, to classify sub-epidemics as “acute” (those with recent viral transmissions) or “historic” (those without recent viral transmissions). We identified two of the 15 sub-epidemics as “acute.” The median estimates of R among the clusters ranged from 0.72 to 1.77. The majority of HIV lineages, 11 out of 15 clusters with 5+ members, appear to have been introduced to Mochudi between 1996 and 2002. The median peak duration of viral transmissions was 7.1 years (range 2.9–9.7 years). The median life span of identified HIV sub-epidemics, i.e. the time between the inferred epidemic origin and its most recent sample, was 13.1 years (range 10.2–22.1 years). Most viral transmissions within the sub-epidemics occurred between 1997 and 2007. The time period during which infected people are infectious appears to have decreased since the introduction of the national ART program in Botswana. 
Real-time HIV genotyping and breaking down local HIV epidemics into phylogenetically distinct sub-epidemics may help to reveal the structure and dynamics of HIV transmission networks in communities, and aid in the design of targeted interventions for members of the acute sub-epidemics that likely fuel local HIV/AIDS epidemics.


“…Although we did not perform a bootstrapping analysis of the reconstructed trees, previous analyses have further demonstrated that support for groupings in the tree is increased when longer sequences are used, and clustering found in full-length datasets can be missed when using sub-genomic regions (14-16). Given the difficulty in generating and/or handling full genome datasets, our results demonstrate that gag-pol provides a dependable approximation; however it should be kept in mind that, at this point and considering we analysed a simulated dataset, the good performance of gag-pol could be more attributable to these genes’ combined length than to their particular characteristics.…”

Section: Discussion (mentioning)

confidence: 95%

Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic

Yebra, Hodcroft, Ragonnet‐Cronin et al. 2016, Sci Rep


HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows to use other genes. We aimed to determine what gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree’s using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences.
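The sub-sampling design described in this abstract (six gene combinations, four sampling depths, 100 replicates each) can be laid out as a simple grid. The sketch below only enumerates the runs and draws random subsets of sequence identifiers; tree building and topology comparison are left out. The identifiers, the seed, the helper names, and the choice to draw each replicate independently are assumptions for illustration and may differ from what the authors did.

```python
import random

# Values taken from the abstract.
GENE_SETS = ["gag-pol-env", "gag-pol", "gag", "pol", "env", "partial pol"]
DEPTHS = [1.00, 0.60, 0.20, 0.05]
N_REPLICATES = 100
N_SEQUENCES = 4662


def subsample_ids(seq_ids, depth, rng):
    """Draw a random subset containing roughly `depth` of the sequences."""
    k = max(1, round(len(seq_ids) * depth))
    return rng.sample(seq_ids, k)


def design(seq_ids, seed=0):
    """Yield (gene_set, depth, replicate, sampled_ids) for every run.

    Assumption: each replicate is an independent random draw.
    """
    rng = random.Random(seed)
    for genes in GENE_SETS:
        for depth in DEPTHS:
            for rep in range(N_REPLICATES):
                yield genes, depth, rep, subsample_ids(seq_ids, depth, rng)


# Hypothetical identifiers standing in for the simulated sequences.
ids = [f"seq{i}" for i in range(N_SEQUENCES)]

n_runs = 0
sizes = {}
for genes, depth, rep, sample in design(ids):
    n_runs += 1
    sizes[depth] = len(sample)

print("total runs:", n_runs)
print("sequences per depth:", sizes)
```

Each yielded run would then feed a tree-building step and a comparison against the known transmission tree, which is where sequence length and sampling depth show their effect on accuracy.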

