The gene encoding the spike glycoprotein of the human coronavirus HCV 229E has been cloned and sequenced. This analysis predicts an S polypeptide of 1173 amino acids with an M_r of 128600. The polypeptide has 30 potential N-glycosylation sites. A number of structural features typical of coronavirus S proteins can be recognized, including a signal sequence, a membrane anchor, heptad repeat structures and a carboxy-terminal cysteine cluster. A detailed, computer-aided comparison with the S proteins of infectious bronchitis virus, feline infectious peritonitis virus, transmissible gastroenteritis virus and murine hepatitis virus, strain JHM, is presented. We have also performed a Northern blot analysis of viral RNAs in HCV 229E-infected cells using synthetic oligonucleotides. On the basis of this analysis, and by analogy to the replication strategy of other coronaviruses, we propose a model for the organization and expression of the HCV 229E genome.