The complete nucleotide sequence (8024 nucleotides) of the circular double-stranded DNA of cauliflower mosaic virus has been established. The DNA molecule is known to possess three discrete single-stranded discontinuities, often referred to as "gaps," two in one strand and one in the other. The sequence data indicate that gap 1, the single discontinuity in the alpha strand, corresponds to the absence of no more than one or two nucleotides with respect to the complementary beta strand. The two discontinuities in the beta strand, however, are not authentic gaps, since no nucleotides are missing, but are instead regions of sequence overlap: a short sequence (19 residues for gap 2, at least 2 residues for gap 3) at one terminus of each discontinuity, probably the 5' terminus, is displaced from the double helix by an identical sequence at the other boundary of the discontinuity. Analysis of the distribution of nonsense codons in the DNA sequence is consistent with other evidence that only the alpha strand is transcribed. The coding region extends around the circular molecule from 4 map units of gap 1, the map origin, to map position 91, and consists of six long open reading frames. Our findings suggest, but do not prove, that the DNA sequence of the open reading frames is colinear with viral protein sequences. The cistron for the viral coat protein, which is probably synthesized in the form of a precursor, has been situated in coding region IV on the basis of its unusual amino acid composition.
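The nonsense-codon analysis mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `open_stretches`, the `min_codons` threshold, and the treatment of the sequence as a linear string read in one direction are all assumptions made for illustration. It simply reports, for each of the three reading frames of one strand, the stretches that contain no TAA, TAG, or TGA codon.

```python
# Hypothetical helper for illustration only; not taken from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def open_stretches(seq, min_codons=1):
    """Return (frame, start, end) for each stop-free run of codons.

    Coordinates are 0-based on seq, with end exclusive. The sequence is
    treated as linear (assumption: a circular genome would be linearized
    at some origin first), and only the given strand is scanned.
    """
    seq = seq.upper()
    results = []
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                # A stop codon closes the current run.
                if (pos - run_start) // 3 >= min_codons:
                    results.append((frame, run_start, pos))
                run_start = pos + 3
            pos += 3
        # Keep the run that reaches the end of the sequence.
        if (pos - run_start) // 3 >= min_codons:
            results.append((frame, run_start, pos))
    return results

runs = open_stretches("ATGAAATAAGGG", min_codons=2)
```

A strand whose three frames are densely interrupted by stop codons is unlikely to encode long proteins, whereas long stop-free stretches are candidate coding regions. Scanning the complementary strand would require taking the reverse complement first, which this sketch leaves out.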