σ28 RNA polymerase is an alternative RNA polymerase that has been proposed to have a role in late developmental gene regulation in Chlamydia, but only a single target gene has been identified. To discover additional σ28-dependent genes in the Chlamydia trachomatis genome, we applied bioinformatic methods using a probability weight matrix based on known σ28
Genome sequencing has indicated that all Chlamydia species encode two alternative sigma factors, suggesting a role for alternative forms of RNA polymerase in chlamydial gene regulation. We have demonstrated that one of these alternative RNA polymerases, σ28 RNA polymerase, transcribes hctB (24), a gene whose transcript is detectable only at late time points in the chlamydial developmental cycle (6, 16). hctB encodes Hc2, one of two histone-like proteins in Chlamydia that have been shown to be responsible for the condensation of DNA during conversion of the metabolically active form of chlamydiae, known as a reticulate body, to the infectious extracellular form, the elementary body (8). To date, hctB is the only σ28-regulated gene that has been identified in Chlamydia, and it is not known whether the role of σ28 RNA polymerase is confined to the regulation of late gene expression in the developmental cycle.
To identify additional σ28-regulated genes in Chlamydia, we have combined the use of bioinformatics, to predict σ28-regulated promoters in the chlamydial genome, with testing of promoter activity in a chlamydial σ28 in vitro transcription assay. We used two in silico approaches, identifying candidate promoters on the basis of sequences that either resemble the consensus bacterial σ28 promoter (9, 10) or are predicted to be highly transcribed by σ28 RNA polymerase based on functional studies (25). Using information from both approaches, we have developed a computer algorithm to identify candidate σ28 promoters in the chlamydial genome and have shown that five promoters are transcribed by chlamydial σ28 RNA polymerase. This method can be applied to other bacterial genomes, and we have also identified five new σ28-regulated genes in Escherichia coli.
MATERIALS AND METHODS
Development of a program for extracting sequences. We developed a program called SequenceExtractor to extract user-defined DNA sequences from a genome. The program requires two input files: a genome sequence file and a file containing the start and stop coordinates for each gene within the DNA sequence being examined. We applied this program to extract two files in FASTA format from each of the genomes of C. trachomatis serovar D, E. coli K-12, and Salmonella enterica serovar Typhimurium using sequences obtained from TIGR (http://www.tigr.org). For each organism, the first output file contained 200 bp of sequence upstream for each gene ("200 bp upstream"). The second output file was more restrictive and contained up to 200 bp of upstream sequence for each gene, provided that these sequences were in the intergenic region and not within the coding region of the nearest upstream gene ("200...
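The upstream extraction described for SequenceExtractor can be illustrated with a minimal Python sketch. This is not the SequenceExtractor code itself. The names Gene, upstream_sequences, and to_fasta are hypothetical, and the sketch assumes 1-based inclusive coordinates with start <= end, an explicit strand field for each gene, a linear (non-wrapping) genome string, and that genes with no available upstream sequence are simply omitted from the output.

```python
from typing import Dict, List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record; 1-based inclusive coordinates, start <= end."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-" (assumed to be available in the coordinate file)


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_sequences(genome: str, genes: List[Gene], length: int = 200,
                       intergenic_only: bool = False) -> Dict[str, str]:
    """Extract up to `length` bp upstream of each gene.

    With intergenic_only=True the window is trimmed so that it never
    enters the coding region of another gene (the restrictive output).
    The genome is treated as linear; wrap-around is not handled.
    """
    result: Dict[str, str] = {}
    for gene in genes:
        if gene.strand == "+":
            # Upstream lies at lower coordinates than the gene start.
            lo, hi = max(1, gene.start - length), gene.start - 1
            if intergenic_only:
                ends = [g.end for g in genes
                        if g is not gene and g.start < gene.start]
                if ends:
                    lo = max(lo, max(ends) + 1)
        else:
            # Upstream of a minus-strand gene lies at higher coordinates.
            lo, hi = gene.end + 1, min(len(genome), gene.end + length)
            if intergenic_only:
                starts = [g.start for g in genes
                          if g is not gene and g.end > gene.end]
                if starts:
                    hi = min(hi, min(starts) - 1)
        if lo > hi:
            continue  # no upstream sequence available for this gene
        seq = genome[lo - 1:hi]  # convert 1-based inclusive to a Python slice
        result[gene.name] = seq if gene.strand == "+" else reverse_complement(seq)
    return result


def to_fasta(seqs: Dict[str, str]) -> str:
    """Format a name -> sequence mapping as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in seqs.items())
```

Calling upstream_sequences once with intergenic_only=False and once with intergenic_only=True would give the two kinds of output described above. Minus-strand sequences are reverse complemented so that every extracted sequence reads 5' to 3' toward its gene, which is a design choice of this sketch rather than a documented behavior of the original program.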