SUMMARY

A DNA complementary to the 3'-terminal 1168 nucleotides of the genome of the N strain of soybean mosaic virus (SMV) has been cloned and sequenced. cDNA sequence and coat protein analyses indicate that the SMV coat protein-coding region is at the 3' end of the genome, and that the coat protein is processed from a larger protein. The coat protein-coding sequence is predicted to be 795 nucleotides in length, encoding a protein of 265 amino acids with a calculated Mr of 29 857. The 3' untranslated region is 259 nucleotides in length and is followed by a polyadenylate tract. The SMV coat protein-coding region, along with a small amount of upstream sequence, has been expressed in Escherichia coli as a β-galactosidase fusion protein. The fusion protein was smaller than predicted, suggesting processing in E. coli. The coat protein-coding region has also been expressed in Agrobacterium tumefaciens and transgenic tobacco callus as an unfused protein under the control of the cauliflower mosaic virus 35S promoter. The coat protein produced in transgenic tobacco callus had an electrophoretic mobility identical to that of SMV coat protein and constituted approximately 0.05% (w/w) of the total extracted protein.
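The lengths quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and the upstream-sequence estimate assumes that the 1168-nucleotide figure excludes the polyadenylate tract. Whether the stop codon is counted within the 259-nucleotide untranslated region is not stated, so it is exposed as a parameter.

```python
# Figures quoted in the summary.
CDNA_LENGTH_NT = 1168         # cloned 3'-terminal sequence
COAT_PROTEIN_CDS_NT = 795     # predicted coat protein-coding sequence
COAT_PROTEIN_RESIDUES = 265   # predicted amino acids
UTR_3PRIME_NT = 259           # 3' untranslated region
CALCULATED_MR = 29857         # calculated Mr of the coat protein


def codon_count(coding_nt: int) -> int:
    """Return the number of codons in a coding sequence of the given length."""
    if coding_nt % 3 != 0:
        raise ValueError("coding length is not a multiple of 3")
    return coding_nt // 3


def upstream_nt(cdna_nt: int, cds_nt: int, utr_nt: int, stop_codon_nt: int = 0) -> int:
    """Estimate the nucleotides lying 5' of the coat protein-coding region.

    Assumption (not stated in the summary): cdna_nt excludes the poly(A) tract.
    Pass stop_codon_nt=3 if the stop codon is counted separately from the UTR.
    """
    return cdna_nt - cds_nt - utr_nt - stop_codon_nt


def mean_residue_mass(mr: float, residues: int) -> float:
    """Return the average mass per amino acid residue."""
    return mr / residues


if __name__ == "__main__":
    print("codons:", codon_count(COAT_PROTEIN_CDS_NT))
    print("upstream nt:", upstream_nt(CDNA_LENGTH_NT, COAT_PROTEIN_CDS_NT, UTR_3PRIME_NT))
    print("mean residue mass:", round(mean_residue_mass(CALCULATED_MR, COAT_PROTEIN_RESIDUES), 2))
```

Under these assumptions, the 795-nucleotide coding sequence corresponds exactly to 265 codons, and roughly 110 to 115 nucleotides of the clone lie upstream of the coat protein-coding region, which is consistent with the coat protein being processed from a larger protein.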