Retroviruses are a large and diverse family of RNA viruses that synthesize a DNA copy of their RNA genome after infection of the host cell. Integration of this viral DNA into host DNA is an essential step in the replication cycle of HIV-1 and other retroviruses (reviewed in Refs. 1-3). The integrated viral DNA is transcribed to make the RNA genome of progeny virions and the template for translation of viral proteins. Following assembly, virions bud from the cell surface and subsequently infect previously uninfected cells, thus completing the replication cycle.
An infecting retrovirus introduces a large nucleoprotein complex into the cytoplasm of the host cell. This complex, which is derived from the core of the infecting virion, contains two copies of the viral RNA together with a number of viral proteins, including reverse transcriptase and integrase. Reverse transcription of the viral RNA occurs within the complex to make a double-stranded DNA copy of the viral genome, the viral DNA substrate for integration. The viral DNA remains associated with both viral and cellular proteins in a nucleoprotein complex termed the preintegration complex. One constituent of the preintegration complex is the viral integrase protein, the key player in the integration of the viral DNA into the host genome. The other components of the preintegration complex that are transported to the nucleus along with the viral DNA and integrase, and their possible functions, have not been firmly established and are not discussed here. The critical DNA cutting and joining events that integrate the viral DNA are carried out by the integrase protein itself. Here we review our current knowledge of the molecular mechanism of this reaction and discuss some of the key issues that are yet to be understood.
The Mechanism of DNA Integration
Biochemical studies have elucidated the basic chemical mechanism of integration, even though the organization of the active complex of integrase with its DNA substrates remains to be determined. We will focus on HIV integrase, but the key properties of this enzyme appear to be shared among the entire retroviral integrase family. In the first step of the integration process, two nucleotides are removed from each 3′-end of the viral DNA, a reaction termed 3′-end processing. Cleavage occurs to the 3′-side of a CA dinucleotide that is conserved among retroviruses, retrotransposons, and many DNA transposons, both in prokaryotes and eukaryotes. This reaction exposes the terminal 3′-hydroxyl group that is to be joined to target DNA (Fig. 1B). In the second step, DNA strand transfer, a pair of processed viral DNA ends is inserted into the target DNA (Fig. 1C). In the case of HIV, the sites of integration on the two target DNA strands are separated by 5 base pairs. Repair of this integration intermediate (Fig. 1D) results in a direct duplication of 5 base pairs flanking the integrated viral DNA (not shown). The repair step requires removal of the two unpaired nucleotides at the 5′-ends of the viral DNA, filling in the single ...