Assays to accurately estimate relative fitness of bacteria growing in multistrain communities can advance our understanding of how selection shapes diversity within a lineage. Here, we present a variant of the "evolve and resequence" approach both to estimate relative fitness and to identify genetic variants responsible for fitness variation of symbiotic bacteria in free-living and host environments. We demonstrate the utility of this approach by characterizing selection by two plant hosts and in two free-living environments (sterilized soil and liquid media) acting on synthetic communities of the facultatively symbiotic bacterium We find () selection that hosts exert on rhizobial communities depends on competition among strains, () selection is stronger inside hosts than in either free-living environment, and () a positive host-dependent relationship between relative strain fitness in multistrain communities and host benefits provided by strains in single-strain experiments. The greatest changes in allele frequencies in response to plant hosts are in genes associated with motility, regulation of nitrogen fixation, and host/rhizobia signaling. The approach we present provides a powerful complement to experimental evolution and forward genetic screens for characterizing selection in bacterial populations, identifying gene function, and surveying the functional importance of naturally occurring genomic variation.
BackgroundSmall peptides encoded as one- or two-exon genes in plants have recently been shown to affect multiple aspects of plant development, reproduction and defense responses. However, popular similarity search tools and gene prediction techniques generally fail to identify most members belonging to this class of genes. This is largely due to the high sequence divergence among family members and the limited availability of experimentally verified small peptides to use as training sets for homology search and ab initio prediction. Consequently, there is an urgent need for both experimental and computational studies in order to further advance the accurate prediction of small peptides.ResultsWe present here a homology-based gene prediction program to accurately predict small peptides at the genome level. Given a high-quality profile alignment, SPADA identifies and annotates nearly all family members in tested genomes with better performance than all general-purpose gene prediction programs surveyed. We find numerous mis-annotations in the current Arabidopsis thaliana and Medicago truncatula genome databases using SPADA, most of which have RNA-Seq expression support. We also show that SPADA works well on other classes of small secreted peptides in plants (e.g., self-incompatibility protein homologues) as well as non-secreted peptides outside the plant kingdom (e.g., the alpha-amanitin toxin gene family in the mushroom, Amanita bisporigera).ConclusionsSPADA is a free software tool that accurately identifies and predicts the gene structure for short peptides with one or two exons. SPADA is able to incorporate information from profile alignments into the model prediction process and makes use of it to score different candidate models. SPADA achieves high sensitivity and specificity in predicting small plant peptides such as the cysteine-rich peptide families. A systematic application of SPADA to other classes of small peptides by research communities will greatly improve the genome annotation of different protein families in public genome databases.
Background: Previous studies exploring sequence variation in the model legume, Medicago truncatula, relied on mapping short reads to a single reference. However, read-mapping approaches are inadequate to examine large, diverse gene families or to probe variation in repeat-rich or highly divergent genome regions. De novo sequencing and assembly of M. truncatula genomes enables near-comprehensive discovery of structural variants (SVs), analysis of rapidly evolving gene families, and ultimately, construction of a pan-genome. Results: Genome-wide synteny based on 15 de novo M. truncatula assemblies effectively detected different types of SVs indicating that as much as 22% of the genome is involved in large structural changes, altogether affecting 28% of gene models. A total of 63 million base pairs (Mbp) of novel sequence was discovered, expanding the reference genome space for Medicago by 16%. Pan-genome analysis revealed that 42% (180 Mbp) of genomic sequences is missing in one or more accession, while examination of de novo annotated genes identified 67% (50,700) of all ortholog groups as dispensable -estimates comparable to recent studies in rice, maize and soybean. Rapidly evolving gene families typically associated with biotic interactions and stress response were found to be enriched in the accession-specific gene pool. The nucleotide-binding site leucine-rich repeat (NBS-LRR) family, in particular, harbors the highest level of nucleotide diversity, large effect single nucleotide change, protein diversity, and presence/ absence variation. However, the leucine-rich repeat (LRR) and heat shock gene families are disproportionately affected by large effect single nucleotide changes and even higher levels of copy number variation.
Genome-wide association analyses are a powerful approach for identifying gene function. These analyses are becoming commonplace in studies of humans, domesticated animals, and crop plants but have rarely been conducted in bacteria. We applied association analyses to 20 traits measured in Ensifer meliloti, an agriculturally and ecologically important bacterium because it fixes nitrogen when in symbiosis with leguminous plants. We identified candidate alleles and gene presence-absence variants underlying variation in symbiosis traits, antibiotic resistance, and use of various carbon sources; some of these candidates are in genes previously known to affect these traits whereas others were in genes that have not been well characterized. Our results point to the potential power of association analyses in bacteria, but also to the need to carefully evaluate the potential for false associations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.