We determined the complete nucleotide sequence of the gypsy element present at the forked locus of Drosophila melanogaster in the f allele. The gypsy element shares more homology with vertebrate retroviruses than with the copia element of D. melanogaster or the Ty element of Saccharomyces cerevisiae, both in overall organization and at the DNA sequence level. This transposable element is 7,469 base pairs long and encodes three putative protein products. The long terminal repeats are 482 nucleotides long and contain transcription initiation and termination signals; sequences homologous to the polypurine tract and tRNA primer binding site of retroviruses are located adjacent to the long terminal repeats. The central region of the element contains three different open reading frames. The second one encodes a putative protein which shows extensive amino acid homology to retroviral proteins, including gag-specific protease, reverse transcriptase, and DNA endonuclease.

The Drosophila melanogaster gypsy transposable element is associated with spontaneous mutations whose phenotype can be reversed by mutations at unlinked suppressor loci (9). This element is transcribed in a temporally specific fashion, giving rise to a major 6.5-kilobase RNA which accumulates at highest levels in 2- to 3-day-old pupae (11). We recently proposed (11, 12) that the mutational activity of the gypsy element on suppressible genes is a direct consequence of the transcriptional properties of this element. The mutagenic effect of the transposable element on these loci is due to transcriptional interference on the genes located nearby.

To understand the molecular basis of this phenomenon, we investigated the DNA structure of the gypsy element. Gypsy is a member of a class of structurally similar transposable elements which contain long terminal repeats (LTRs) (1, 5, 9). Other members of this family are the copia-like elements of D.
melanogaster (14), the Ty elements of Saccharomyces cerevisiae (13), and vertebrate retrovirus proviruses (20). In addition to the conservation in the organization of the different transcription signals between the LTRs of retroviruses and copia-like elements, the latter also contain nucleotide sequences homologous to the tRNA primer binding site and purine-rich sequences, both necessary for the initiation of DNA synthesis in a retrovirus system (6, 16, 22). The organization of the protein-coding regions of these different elements nevertheless varies. A typical vertebrate retrovirus consists of three genes, termed gag, pol, and env, required for viral infection and replication (see reference 20 for a review). The gag region encodes a polyprotein which is cleaved, giving rise to several small proteins that are found in the core of the virus particle (3) and whose exact functions are not yet understood. The pol gene is expressed as a gag-pol polyprotein which is the precursor of the mature form of reverse transcriptase. The N-terminal region of this protein product contains the DNA polymerase and RNase H activities of reverse transcnp...