Adeno-associated virus type 2 (AAV2) preferentially integrates its genome into the AAVS1 locus on human chromosome 19. Preferential integration requires the AAV2 Rep68 or Rep78 protein (Rep68/78), a Rep68/78 binding site (RBS), and a nicking site within AAVS1 and may also require an RBS within the virus genome. To obtain further information that might help to elucidate the mechanism and preferred substrate configurations of preferential integration, we amplified junctions between AAV2 DNA and AAVS1 from AAV2-infected HeLaJW cells and cells with defective Artemis or xeroderma pigmentosum group A genes. We sequenced 61 distinct junctions. The integration junction sequences show the three classical types of nonhomologous end-joining joints: microhomology at junctions (57%), insertion of sequences that are not normally contiguous with either the AAV2 or the AAVS1 sequences at the junction (31%), and direct joining (11%). These junctions were spread over 750 bases and were all downstream of the Rep68/78 nicking site within AAVS1. Two-thirds of the junctions map to 350 bases of AAVS1 that are rich in polypyrimidine tracts on the nicked strand. The majority of AAV2 breakpoints were within the inverted terminal repeat (ITR) sequences, which contain RBSs. We never detected a complete ITR at a junction. Residual ITRs at junctions never contained more than one RBS, suggesting that the hairpin form, rather than the linear ITR, is the more frequent integration substrate. Our data are consistent with a model in which a cellular protein other than Artemis cleaves AAV2 hairpins to produce free ends for integration.

Adeno-associated virus type 2 (AAV2) is a naturally defective human parvovirus that usually requires a helper virus, such as an adenovirus or a herpesvirus, for productive infection (41).
In the absence of helper virus, AAV2 can establish a latent infection by episomal persistence (54) or by integrating its DNA into the host chromosomes, preferentially within a 4-kb region of human chromosome 19 designated AAVS1 (11). This is not site-specific integration in the classic sense, but approximately 70% of integration events occur somewhere within this locus (Fig. 1A) (20, 26, 51). The mechanism is unknown, but it does require the Rep68 or Rep78 (Rep68/78) protein encoded by AAV2 and a 33-bp region of AAVS1 (Fig. 1A) that includes a Rep68/78 binding site (RBS) and a nicking site for Rep68/78 that resembles the terminal resolution site (trs) within the 145-base inverted terminal repeats (ITRs) of the AAV2 genome (Fig. 2) (29, 57, 64, 67). The trs got its name because of its role in AAV2 replication. AAV2 has a linear, single-stranded DNA genome (4,679 bases; GenBank accession no. AF043303), and the ITRs are essential for replication. The ITRs are palindromic and fold into T-shaped hairpin structures, one of which provides a 3′ end that primes second-strand synthesis by a cellular DNA polymerase (Fig. 2) (10, 65). The hairpin is then nicked by Rep68/78 at trs (Fig. 2) and unwound, and the end is replicated. As a result of this mo...