The determination of the total 5,224 base-pair DNA sequence of the virus SV40 has enabled us to locate precisely the known genes on the genome. At least 15.2% of the genome is presumably not translated into polypeptides. Particular points of interest revealed by the complete sequence are the initiation of the early t and T antigens at the same position and the fact that the T antigen is coded by two non-contiguous regions of the genome; the T antigen mRNA is spliced in the coding region. In the late region the gene for the major protein VP1 overlaps those for proteins VP2 and VP3 over 122 nucleotides but is read in a different frame. The almost complete amino acid sequences of the two early proteins as well as those of the late proteins have been deduced from the nucleotide sequence. The mRNAs for the latter three proteins are presumably spliced out of a common primary RNA transcript. The use of degenerate codons is decidedly non-random, but is similar for the early and late regions. Codons of the type NUC, NCG and CGN are absent or very rare.
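The two sequence-level observations above, that overlapping genes are read in different frames and that codons of the type NUC, NCG and CGN are rare, can be illustrated with a minimal sketch. The function names (`codons`, `matches`, `pattern_counts`) and the short `toy` sequence are assumptions made up for illustration; the sequence is not SV40 data, and this is not the authors' analysis procedure.

```python
from collections import Counter


def codons(seq, frame=0):
    """Split a sequence into complete triplets, starting at offset `frame` (0, 1 or 2)."""
    rna = seq.upper().replace("T", "U")  # use RNA letters, as in the patterns NUC, NCG, CGN
    return [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]


def matches(codon, pattern):
    """Return True if `codon` fits a pattern such as 'NCG', where N stands for any base."""
    return all(p == "N" or p == c for p, c in zip(pattern, codon))


def pattern_counts(seq, patterns=("NUC", "NCG", "CGN"), frame=0):
    """Count how many codons in the chosen reading frame fit each pattern."""
    usage = Counter(codons(seq, frame))
    return {p: sum(n for c, n in usage.items() if matches(c, p)) for p in patterns}


# Hypothetical toy sequence, invented for this example only.
toy = "ATGGCTCCGAAATTCCGTTAA"

print(codons(toy, frame=0))          # triplets in one reading frame
print(codons(toy, frame=1))          # the same nucleotides read one base later
print(pattern_counts(toy, frame=0))  # codon-pattern tally for frame 0
```

Shifting `frame` by one changes every triplet, which is why a stretch of nucleotides shared by two genes can encode two unrelated amino acid sequences.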
The restriction fragment Hind-K represents 4.2% of the genome of Simian virus 40 (SV40) and is located near the middle of the late region. Its nucleotide sequence is reported here. It was mainly established by analysis of transcription products, synthesized by means of Escherichia coli RNA polymerase and nucleoside triphosphates, one of which was (α-³²P)-labeled. Strand assignment was possible by hybridization of asymmetric, labeled transcripts of total SV40 DNA to filter-bound Hind-K fragment. Further information and unambiguous confirmation of the sequence was obtained by the use of direct DNA-sequencing methods. For this purpose the fragment was labeled at the 5' ends by means of polynucleotide kinase and [γ-³²P]ATP and redigested with a suitable restriction enzyme. The separated products were then either partially digested with snake venom diesterase for analysis by the 'wandering spot' method or partially degraded with the base-specific reagents dimethylsulphate or hydrazine for direct sequence analysis on gel. The Hind-K sequence is 219 base pairs long. The message strand is particularly rich in adenosine (39%) and purines. The nucleotide sequence can unambiguously be translated into an amino acid sequence and the N-terminal codon of the viral protein VP1 gene could be identified. The amino-terminal part of VP1 is rich in proline and lysine. The nucleotide sequence of Hind-K codes also for the carboxyl-terminal part of the viral protein VP2 and VP3 genes, which partly overlap the VP1 gene.
In order to understand the organization of the genetic information in the oncogenic animal virus Simian virus 40 (SV40) in molecular detail, we have sequenced various parts of this genome. The viral DNA is a circular, double-stranded, supercoiled molecule containing approximately 5200 base pairs. It is cleaved by the HindII+III restriction endonucleases into 13 fragments which were ordered [1,2]