Summary: Burkholderia mallei is a host-adapted pathogen and a category B biothreat agent. Although the B. mallei VirAG two-component regulatory system is required for virulence in hamsters, the virulence genes it regulates are unknown. Here we show with expression profiling that overexpression of virAG resulted in transcriptional activation of ~60 genes, including some involved in capsule production, actin-based intracellular motility, and type VI secretion (T6S). The 15 genes encoding the major sugar component of the homopolymeric capsule were up-expressed >2.5-fold, but capsule was still produced in the absence of virAG. Actin tail formation required virAG as well as bimB, bimC and bimE, three previously uncharacterized genes that were activated 4- to 15-fold when VirAG was overproduced. Surprisingly, actin polymerization was found to be dispensable for virulence in hamsters. In contrast, genes encoding a T6S system were up-expressed as much as 30-fold, and mutations in this T6S gene cluster resulted in strains that were avirulent in hamsters. SDS-PAGE and mass spectrometry demonstrated that BMAA0742 was secreted by the T6S system when virAG was overexpressed. Purified His-tagged BMAA0742 was recognized by glanders antiserum from a horse, a human and mice, indicating that this Hcp-family protein is produced in vivo during infection.
We compare and contrast genome-wide compositional biases and distributions of short oligonucleotides across 15 diverse prokaryotes that have substantial genomic sequence collections. These include seven complete genomes (Escherichia coli, Haemophilus influenzae, Mycoplasma genitalium, Mycoplasma pneumoniae, Synechocystis sp. strain PCC6803, Methanococcus jannaschii, and Pyrobaculum aerophilum). A key observation concerns the constancy of the dinucleotide relative abundance profiles over multiple 50-kb disjoint contigs within the same genome. (The profile is $\rho^*_{XY} = f^*_{XY}/(f^*_X f^*_Y)$ for all XY, where $f^*_X$ denotes the frequency of the nucleotide X and $f^*_{XY}$ denotes the frequency of the dinucleotide XY, both computed from the sequence concatenated with its inverted complementary sequence.) On the basis of this constancy, we refer to the collection $\{\rho^*_{XY}\}$ as the genome signature. We establish that the differences between $\{\rho^*_{XY}\}$ vectors of 50-kb sample contigs of different genomes virtually always exceed the differences between those of the same genomes. Various di- and tetranucleotide biases are identified. In particular, we find that the dinucleotide CpG (CG) is underrepresented in many thermophiles (e.g., M. jannaschii, Sulfolobus sp., and M. thermoautotrophicum) but overrepresented in halobacteria. TA is broadly underrepresented in prokaryotes and eukaryotes, but normal counts appear in Sulfolobus and P. aerophilum sequences. More than for any other bacterial genome, palindromic tetranucleotides are underrepresented in H. influenzae. The M. jannaschii sequence is unprecedented in its extreme underrepresentation of CTAG tetranucleotides and in the anomalous distribution of CTAG sites around the genome. Comparative analysis of numbers of long tetranucleotide microsatellites distinguishes H. influenzae. Dinucleotide relative abundance differences between bacterial sequences are compared. 
For example, in these assessments of differences, the cyanobacteria Synechocystis, Synechococcus, and Anabaena do not form a coherent group and are as far from each other as general gram-negative sequences are from general gram-positive sequences. The difference of M. jannaschii from low-G+C gram-positive proteobacteria is one-half of the difference from gram-negative proteobacteria. Interpretations and hypotheses center on the role of the genome signature in highlighting similarities and dissimilarities across different classes of prokaryotic species, possible mechanisms underlying the genome signature, the form and level of genome compositional flux, the use of the genome signature as a chronometer of molecular phylogeny, and implications with respect to the three putative eubacterial, archaeal, and eukaryote domains of life and to the origin and early evolution of eukaryotes.
In this report, we describe measures of genomic similarities that do not depend on prior alignment of homologous sequences and apply them to sufficiently large samples of prokaryotic genomic sequences. The approach departs from almost all other metho...
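The dinucleotide relative abundance profile defined in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, both strands are counted separately instead of being physically concatenated (which avoids one artificial dinucleotide at the junction), and the distance function (mean absolute difference over the 16 dinucleotides) is an assumption, since the abstract does not state how differences between signatures are measured.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the inverted complementary strand of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def genome_signature(seq):
    """Return {XY: f*_XY / (f*_X * f*_Y)} for all 16 dinucleotides.

    Frequencies are pooled over the sequence and its reverse complement.
    Characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    mono = {b: 0 for b in "ACGT"}
    di = {"".join(p): 0 for p in product("ACGT", repeat=2)}
    for strand in (seq, reverse_complement(seq)):
        for base in strand:
            if base in mono:
                mono[base] += 1
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair in di:
                di[pair] += 1
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    signature = {}
    for pair, count in di.items():
        fx = mono[pair[0]] / n_mono
        fy = mono[pair[1]] / n_mono
        # Undefined when a nucleotide is absent from the sample.
        signature[pair] = (count / n_di) / (fx * fy) if fx and fy else float("nan")
    return signature


def signature_distance(sig_a, sig_b):
    """Assumed distance: mean absolute difference over the 16 dinucleotides."""
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / len(sig_a)
```

Because both strands are pooled, a dinucleotide and its reverse complement (for example AC and GT) always receive the same value, so the 16 entries contain only 10 distinct quantities. In practice the function would be applied to 50-kb contigs, as in the abstract, and the distances within and between genomes compared.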
We review concepts and methods for comparative analysis of complete genomes including assessments of genomic compositional contrasts based on dinucleotide and tetranucleotide relative abundance values, identifications of rare and frequent oligonucleotides, evaluations and interpretations of codon biases in several large prokaryotic genomes, and characterizations of compositional asymmetry between the two DNA strands in certain bacterial genomes. The discussion also covers means for identifying alien (e.g. laterally transferred) genes and detecting potential specialization islands in bacterial genomes.
We present a comparative proteome analysis of the five complete eukaryotic genomes (human, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Arabidopsis thaliana), focusing on individual and multiple amino acid runs, charge and hydrophobic runs. We found that human proteins with multiple long runs are often associated with diseases; these include long glutamine runs that induce neurological disorders, various cancers, categories of leukemias (mostly involving chromosomal translocations), and an abundance of Ca2+ and K+ channel proteins. Many human proteins with multiple runs function in development and/or transcription regulation and are Drosophila homeotic homologs. A large number of these proteins are expressed in the nervous system. More than 80% of Drosophila proteins with multiple runs seem to function in transcription regulation. The most frequent amino acid runs in Drosophila sequences occur for glutamine, alanine, and serine, whereas human sequences highlight glutamate, proline, and leucine. The most frequent runs in yeast are of serine, glutamine, and acidic residues. Compared with the other eukaryotic proteomes, amino acid runs are significantly more abundant in the fly. This finding might be interpreted in terms of innate differences in DNA-replication processes, repair mechanisms, DNA-modification systems, and mutational biases. There are striking differences in amino acid runs for glutamine, asparagine, and leucine among the five proteomes. Several human inherited neurodegenerative diseases are triplet-repeat diseases associated with proteins containing long runs of glutamine (long CAG codon iterations; for reviews, see refs. 1 and 2). Disease severity seems to be correlated with the extent of iterations of the CAG codon above a threshold (3). Strikingly, many of the triplet-repeat disease proteins contain multiple long runs of amino acids other than glutamine. 
Listing all runs of lengths of at least five residues (and using the standard one-letter amino acid code), the huntingtin protein contains Q23, P11, P10, E5, E6; atrophin-1 (dentatorubral pallidoluysian atrophy, DRPLA) contains Q20, S7, S10, P6, H5; the androgen receptor protein (Kennedy's disease) contains Q26, Q6, Q5, P8, A5, G24; and the brain-voltage-dependent calcium channel protein CCAA (spinocerebellar ataxia 6) contains H10 and Q11.
Consequences of hyperexpansion of DNA-triplet repeats might include altered rates of transcription or translation, mRNA instability, and aberrant DNA-hairpin structures (4, 5). Protein aggregation attributed to attachment of glutamine-rich proteins to unrelated molecules may lead to inappropriate multimerization or to formation of "polar zippers," in which a long stretch of glutamine residues links strands by hydrogen bonds (6-8).
The foregoing examples motivate our comparative analysis of eukaryotic proteomes focusing on proteins containing multiple amino acid runs. The complete genomes investigated are those of the Human Genome Project tentative dr...
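The run listing described above (all runs of at least five identical residues, written as residue letter plus length) can be reproduced with a short scan. The sketch below is illustrative only: the function names are hypothetical, and the example sequence in the test is a made-up toy string, not an actual protein.

```python
from itertools import groupby


def amino_acid_runs(protein, min_len=5):
    """Return (residue, length, start) for each run of identical residues
    with length >= min_len, in order of occurrence (start is 0-based)."""
    runs = []
    position = 0
    for residue, group in groupby(protein):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((residue, length, position))
        position += length
    return runs


def format_runs(runs):
    """Format runs as residue letter plus run length, comma separated."""
    return ", ".join(f"{residue}{length}" for residue, length, _ in runs)
```

A protein with "multiple runs" in the sense of the abstract would then be one for which `amino_acid_runs` returns more than one entry; the exact counting criteria used by the authors are not given in this excerpt.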
Our approach in predicting gene expression levels relates to codon usage differences among gene classes. In prokaryotic genomes, genes that deviate strongly in codon usage from the average gene but are sufficiently similar in codon usage to ribosomal protein genes, to translation and transcription processing factors, and to chaperone-degradation proteins are predicted highly expressed (PHX). By these criteria, PHX genes in most prokaryotic genomes include those encoding ribosomal proteins, translation and transcription processing factors, and chaperone proteins and genes of principal energy metabolism. In particular, for the fast-growing species Escherichia coli, Vibrio cholerae, Bacillus subtilis, and Haemophilus influenzae, major glycolysis and tricarboxylic acid cycle genes are PHX. In Synechocystis, prime genes of photosynthesis are PHX, and in methanogens, PHX genes include those essential for methanogenesis. Overall, the three protein families (ribosomal proteins, protein synthesis factors, and chaperone complexes) are needed at many stages of the life cycle, and apparently bacteria have evolved codon usage to maintain appropriate growth, stability, and plasticity. A new interpretation of the capacity of Deinococcus radiodurans to resist high doses of ionizing radiation is based on an excess of PHX chaperone-degradation genes and detoxification genes. Expression levels of selected classes of genes, including those for flagella, electron transport, detoxification, histidine kinases, and others, are analyzed. Flagellar PHX genes are conspicuous among spirochete genomes. PHX genes are positively correlated with strong Shine-Dalgarno signal sequences. Specific regulatory proteins, e.g., two-component sensor proteins, are rarely PHX. Genes involved in pathways for the synthesis of vitamins record low predicted expression levels. Several distinctive PHX genes of the available complete prokaryotic genomes are highlighted. 
Relationships of PHX genes with stoichiometry, multifunctionality, and operon structures are discussed. Our methodology may be used as a complement to experimental expression analysis.
Gene expression and protein abundances in prokaryotes are regulated at several levels: (i) initiation of transcription, promoter strength, promoter configuration, and transcription factors; (ii) transcription termination, mRNA stability, and turnover rates; (iii) codon usage; (iv) translation initiation and elongation; and (v) protein folding, degradation, and cellular localization. An accounting of high gene expression in prokaryotic genomes generally focuses on at least one of three criteria: (i) The gene possesses a potent promoter sequence sometimes associated with bent DNA and/or specific binding factors. However, the characterization of regulatory cis elements underlying gene transcription is largely an unresolved problem. (ii) The gene possesses a strong Shine-Dalgarno (SD) ribosome binding sequence, but recognition of SD sequences is not discriminating (10, 14, 34, 51, 53) (see also below). (iii) The gene ex...
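The PHX criterion rests on comparing the codon usage of one gene (or gene class) with that of another. The excerpt does not give the formula, so the following Python sketch is an assumption throughout: it measures the difference as a sum, over amino acids, of the absolute differences in synonymous codon frequencies, weighted by the amino acid frequencies of the first gene set. The function names and the standard genetic code table are introduced here for illustration, and no expression thresholds are implied.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_counts(genes):
    """Count sense codons over in-frame coding sequences (stops and
    codons with ambiguous bases are skipped)."""
    counts = Counter()
    for gene in genes:
        gene = gene.upper()
        for i in range(0, len(gene) - 2, 3):
            codon = gene[i:i + 3]
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1
    return counts


def codon_bias(genes_f, genes_g):
    """Assumed measure of the codon usage difference of gene set F relative
    to gene set G. Amino acids missing from either set are skipped."""
    cf, cg = codon_counts(genes_f), codon_counts(genes_g)
    total_f = sum(cf.values())
    bias = 0.0
    for aa in set(AMINO) - {"*"}:
        codons = [c for c, a in CODON_TABLE.items() if a == aa]
        nf = sum(cf[c] for c in codons)
        ng = sum(cg[c] for c in codons)
        if nf == 0 or ng == 0:
            continue
        # Compare synonymous codon frequencies within this amino acid.
        diff = sum(abs(cf[c] / nf - cg[c] / ng) for c in codons)
        bias += (nf / total_f) * diff
    return bias
```

Under this sketch, a gene would be a PHX candidate when `codon_bias([gene], all_genes)` is large while `codon_bias([gene], ribosomal_genes)` and the corresponding values for the other reference classes are small; `all_genes` and `ribosomal_genes` are hypothetical inputs, and the cutoffs used by the authors are not stated in this excerpt.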