We have characterized sequences of genomic DNA 5' to the coding region of the rat malic enzyme gene. This sequence possesses neither TATA nor CCAAT sequences in their usual positions but is rich in GC residues. Sequences similar to those found in the regulatory regions of other genes are discussed. Deletion analyses have revealed that sequences +1 to -41 are sufficient to initiate expression, although inclusion of information up to -177 is necessary for maximal promoter activity.
Rat liver malic enzyme (ME) (EC 1.1.1.40) plays an important role in lipogenesis. We have previously reported the cloning (22) and sequence of ME mRNA (21) and shown its regulation by thyroid hormone and a high-carbohydrate diet (4-6). To investigate regulation of the ME gene, we have identified and sequenced its 5'-flanking region.
The sequence shown in Fig. 1 extends from the EcoRI site through the transcription start site and includes 109 base pairs (bp) of the first exon with a portion of the first intron. The structural organization of the ME promoter differs from that found in tissue-specific promoters, which are usually rapidly regulated and often switched off. It more closely resembles the promoter regions of several eucaryotic constitutive or "housekeeping" genes (23-28, 30, 31), the simian virus 40 late promoter (9), and some recently characterized proto-oncogenes (16, 17). A TATA box (2), usually located 20 to 30 bp upstream from cap sites, lies instead at -622 (Fig. 1). The sequence CCGAT, between -144 and -140, resembles the canonical CCAAT consensus sequence often found about 80 bp upstream from transcription start sites in many eucaryotic promoters (7). The most striking motif is the hexanucleotide CCGCCC, which occurs nine times. Six of these GC boxes are located upstream from the major cap site, between -376 and -10. One GC box is present in an untranslated region, while the last two GC boxes are found within the first intron.
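A minimal sketch of how occurrences of a motif such as CCGCCC can be located and expressed in promoter coordinates is given here for illustration. It is not the procedure used in this study. The function names, the toy sequence, and the chosen cap-site index are all hypothetical, and the sketch assumes that a motif is reported by the position of its 5'-most base, that only the strand as written is scanned, and that the cap site is numbered +1 with no position 0.

```python
def find_motif(seq, motif="CCGCCC"):
    """Return the 0-based start index of every occurrence of motif in seq.

    Overlapping occurrences are reported. Only the strand as written is
    scanned; the reverse complement (GGGCGG) is not considered.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def to_promoter_coord(index, cap_index):
    """Convert a 0-based string index to a promoter coordinate.

    The base at cap_index is +1, the base before it is -1, and there is
    no position 0 (an assumption about the numbering convention).
    """
    offset = index - cap_index
    return offset + 1 if offset >= 0 else offset


# Hypothetical toy sequence, NOT the malic enzyme 5'-flanking sequence.
toy = "AACCGCCCTTTTCCGCCCGGAT"
cap_index = 20  # hypothetical: treat the A at index 20 as the cap site (+1)

positions = [to_promoter_coord(i, cap_index) for i in find_motif(toy)]
print(positions)
```

Applied to a real 5'-flanking sequence with a known cap site, the same two functions would give a list of motif positions that can be compared with those annotated in a sequence figure.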
Noteworthy is the number of nucleotides (65, 63, and 61) separating successive GC motifs at -46, -111, -174, and -235. This nearly equal spacing, also seen in the 5'-flanking region of the adenosine deaminase gene (31), might play a role in the chromatin assembly required for expression. The 6-bp sequence of the GC motif is the same sequence that is repeated six times within the simian virus 40 promoter (9) and has been shown to be a core element in the decanucleotide sequence exhibiting high-affinity binding for Sp1 (1, 8 ... with one deletion) to the consensus sequence of the 3' region of a 48-bp repeated element in the mouse dihydrofolate reductase gene (25).
Farther downstream, from -124 to -119, is the hexanucleotide sequence 5'-CGCTTC-3'. Three copies of this sequence occur within the boundaries of the Moloney murine sarcoma virus long terminal repeat distal transcription signal (11), immediately downstream from the CCAAT box. The position of this element relative to a CAT-like box (-144) found in the ME gene 5'-flanking region mirrors the position identified in the viral long terminal repeat. We also note t...