The amino acid sequence of the capsid (C) protein was deduced from the nucleotide sequence of the C gene. This part of the viral 42S RNA genome was transcribed into double-stranded cDNA. The cDNA was cloned in the Escherichia coli χ1776-pBR322 host-vector system, and the base sequence was then determined with the technique described by Maxam and Gilbert. The amino acid sequence of the C protein shows a clustering of basic amino acids and prolines within the first 110 amino acids.

from the virus. This specificity can probably be explained by the formation of bonds between the C protein in the nucleocapsid and the spanning segments of the spike glycoproteins (9).

A more detailed understanding of SFV structure and assembly at the molecular level is difficult without knowledge of the amino acid sequences of the structural proteins. We report here the primary structure of the C protein.

Semliki Forest virus (SFV) is a simple membrane virus of the alphavirus group. It has been used extensively as a model system to study the structure and assembly of cellular membranes (1). The virus particle consists of an icosahedral nucleocapsid surrounded by a membrane. The nucleocapsid is a complex of about 240 capsid proteins (C protein, Mr = 30,000) (2, 3) and an RNA molecule (42S), the viral genome (4). The membrane consists of a lipid bilayer with about 240 external glycoprotein spikes (5). Each spike contains three different glycopolypeptides: E1 (Mr = 49,000), E2 (Mr = 52,000), and E3 (Mr = 10,000) (6, 7). The E2 polypeptide spans the membrane; about 30 of its amino acid residues are present on the internal side of the viral membrane (8, 9).

The virus enters the host cell by adsorptive endocytosis (10). Inside the lysosomes of the cell, the low pH probably triggers a fusion of the viral membrane with the lysosomal membrane (10, 11).
This allows the nucleocapsid to enter the cell cytoplasm, where the viral genome is uncoated so that it can act as an mRNA for synthesis of polymerase molecules. The viral RNA polymerase synthesizes new 42S RNA molecules and smaller 26S RNA molecules. The latter molecule is homologous to the 3' end of the viral genome (12) and functions as an mRNA for the SFV structural proteins, which are translated from a single initiation site (13). The C protein is made first. As soon as it is completed, it is cleaved from the growing polypeptide chain, and the ribosomes continue to read off the membrane proteins in the order E3, E2, and E1 (8, 14). The membrane proteins are cotranslationally translocated across the membrane of the endoplasmic reticulum and transported to the plasma membrane (15, 16).

The assembly of the nucleocapsid in the cell cytoplasm is not understood. Newly synthesized capsid proteins are known to be associated with the large subunit of the ribosome before they complex with the 42S RNA into nucleocapsids (17). The final step in SFV assembly, budding, takes place at the cell surface (18). The nucleocapsid binds to the cytoplasmic aspect of the plasma membrane, which folds around the nu...