The rfa locus of Escherichia coli K-12 includes a block of about 10 closely spaced genes, transcribed in the same direction, which are involved in synthesis and modification of the hexose region of the lipopolysaccharide core. We have sequenced the first three genes in this block. The function of the first of these genes is unknown, but we have designated it rfaQ on the basis of its location and similarity to other rfa genes. Complementation of Salmonella typhimurium rfa mutants with E. coli rfa restriction fragments indicated that the second and third genes in the block were rfaG and rfaP. The deduced sizes of the RfaQ, RfaG, and RfaP proteins are 36,298, 42,284, and 30,872 Da, respectively, and the proteins are basic and lack extensive hydrophobic domains. RfaQ shares regions of homology with proteins RfaC and RfaF, which are involved in synthesis of the heptose region of the core. Proteins RfaB, RfaG, and RfaK share a region of homology, which suggests that they belong to a second family of Rfa proteins which are thought to be hexose transferases.

This study deals with rfaG, the gene for addition of the first glucose residue to the lipopolysaccharide (LPS) core, and with rfaP, a gene involved in attachment of phosphate-containing substituents to the inner core. Figure 1 shows a simplified structure of the inner core region of Salmonella typhimurium and what are thought to be the primary sites of action of the relevant rfa genes (8). It can be assumed that this part of the core structure is very similar or identical in Escherichia coli K-12, since cloned DNA fragments from E. coli K-12 efficiently complement mutations in rfaC, rfaF, and rfaG in S. typhimurium (3, 12).

Two E. coli K-12 recombinant plasmids from the Clarke-Carbon collection, pLC10-7 and pLC17-24, were identified by Creeger and Rothfield as containing inserts which would complement rfaG mutations in S. typhimurium (3).
The inserts of these plasmids overlap by about 4 kb, and the region of overlap includes two HindIII sites near the pyrE end of the rfa locus at 81 min on the E. coli map (1). In vitro transcription and translation of restriction fragments derived from the insert of pLC10-7 showed that these HindIII sites lie at the beginning of a series of contiguous genes which are transcribed counterclockwise from pyrE. The in vitro products of the first three of these genes, reading away from pyrE, had apparent molecular masses of 42, 39, and 35 kDa. Tn10 insertions in the second gene resulted in a glucose-deficient LPS (chemotype Rd1) phenotype, suggesting that this is the rfaG gene, while an insertion in the third gene resulted in a deep rough phenotype and LPS of the RcP⁻ chemotype, which includes the first glucose residue. These properties suggest that this gene is rfaP (1).

In this report, we describe the sequence and properties of these three genes and show, by complementation of Salmonella mutations, that two of them are indeed rfaG and rfaP. The other is a novel gene, which we designated rfaQ. The relationship of these genes to the physical map of ...