Helicobacter pylori colonizes the stomach of half of the world's population, causing a wide spectrum of disease ranging from asymptomatic gastritis to ulcers to gastric cancer. Although the basis for these diverse clinical outcomes is not understood, more severe disease is associated with strains harboring a pathogenicity island. To characterize the genetic diversity of more and less virulent strains, we examined the genomic content of 15 H. pylori clinical isolates by using a whole genome H. pylori DNA microarray. We found that a full 22% of H. pylori genes are dispensable in one or more strains, thus defining a minimal functional core of 1,281 H. pylori genes. While the core genes encode most metabolic and cellular processes, the strain-specific genes include genes unique to H. pylori, restriction modification genes, transposases, and genes encoding cell surface proteins, which may aid the bacteria under specific circumstances during their long-term infection of genetically diverse hosts. We observed distinct patterns of strain-specific gene distribution along the chromosome, which may result from different mechanisms of gene acquisition and loss. Among the strain-specific genes, we have found a class of candidate virulence genes identified by their coinheritance with the pathogenicity island.
Helicobacter pylori is a highly host-adapted bacterial pathogen that establishes a chronic infection in the human stomach and has no known animal or environmental reservoirs (1). Epidemiological and serological studies have revealed that H. pylori strains containing the CagA protein are associated with more severe disease (2) and harbor a 40-kb pathogenicity island (PAI) (3, 4). The PAI encodes a bacterial type IV secretory system that secretes and translocates the CagA protein into host cells (5-8), where it is phosphorylated by a host-cell kinase and causes morphological changes (7). The PAI also induces IL-8 production by host cells independent of the CagA protein (9-11). Efforts to classify H. pylori strains further by DNA fingerprinting uncovered extensive diversity (12, 13). The sequencing of two H. pylori genomes from independent strains, both containing the PAI, revealed that much of this diversity is silent at the amino acid level and thus at the functional gene level (14, 15). Here we used an H. pylori DNA microarray to examine the genomic composition of H. pylori clinical isolates containing and lacking the PAI at the level of individual genes, both to characterize the extent of genetic diversity between strains and to search for new candidate virulence determinants.
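The gene-level comparison described above can be pictured as a presence/absence table with one row per gene and one column per strain. The following is a minimal sketch of that idea under stated assumptions, not the analysis used in this study: the gene names, the strain calls, the helper functions `partition_genes` and `coinherited_with`, and the exact-match criterion for coinheritance are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def partition_genes(presence: Dict[str, List[bool]]) -> Tuple[List[str], List[str]]:
    """Split genes into core (present in every strain) and
    strain-specific (absent from at least one strain)."""
    core = sorted(g for g, calls in presence.items() if all(calls))
    strain_specific = sorted(g for g, calls in presence.items() if not all(calls))
    return core, strain_specific


def coinherited_with(
    presence: Dict[str, List[bool]],
    marker_calls: List[bool],
    candidates: List[str],
) -> List[str]:
    """Return candidate genes whose presence/absence pattern across strains
    is identical to that of a marker (here, a hypothetical PAI call).
    Exact matching is an illustrative assumption, not the paper's criterion."""
    return sorted(g for g in candidates if presence[g] == marker_calls)


# Hypothetical toy data: four genes scored in four strains.
presence = {
    "geneA": [True, True, True, True],    # present everywhere: core
    "geneB": [True, False, True, False],  # tracks the marker
    "geneC": [True, True, False, True],   # strain-specific, unrelated pattern
    "geneD": [True, False, True, False],  # tracks the marker
}
pai_calls = [True, False, True, False]    # hypothetical PAI presence per strain

core, strain_specific = partition_genes(presence)
fraction_dispensable = len(strain_specific) / len(presence)
linked = coinherited_with(presence, pai_calls, strain_specific)

print("core:", core)
print("strain-specific:", strain_specific)
print("fraction dispensable:", fraction_dispensable)
print("coinherited with marker:", linked)
```

In this toy table a gene counts as core only if every strain carries it, so adding strains can only shrink the core set; the size of any such core therefore depends on how many isolates are examined.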
Materials and Methods
PCR Primer Design. The elements of our microarray consisted of large (mean size, 817 base pairs; 10th percentile, 130 base pairs; 90th percentile, 1,967 base pairs) DNA fragments corresponding to unique segments of individual open reading frames (ORFs). These fragments were generated by PCRs using gene-specific primers. We aimed to include in our array the superset of ORFs from both published genomes. When an ORF was present in bo...