Fossil records indicate that life appeared in marine environments ∼3.5 billion years (Gyr) ago and transitioned to terrestrial ecosystems nearly 2.5 Gyr ago. Sequence analysis suggests that “hydrobacteria” and “terrabacteria” might have diverged as early as 3 Gyr ago. Bacteria of the genus Azospirillum are associated with roots of terrestrial plants; however, virtually all their close relatives are aquatic. We obtained genome sequences of two Azospirillum species and analyzed their gene origins. While most Azospirillum house-keeping genes have orthologs in their close aquatic relatives, this lineage has obtained nearly half of its genome from terrestrial organisms. The majority of genes encoding functions critical for association with plants are among the horizontally transferred genes. Our results show that the transition of some aquatic bacteria to terrestrial habitats occurred much later than the suggested initial divergence of hydro- and terrabacterial clades. The birth of the genus Azospirillum approximately coincided with the emergence of vascular plants on land.
Shotgun proteomics experiments require the collection of thousands of tandem mass spectra; these sets of data will continue to grow as new instruments become available that can scan at even higher rates. Such data contain substantial amounts of redundancy, with spectra from a particular peptide being acquired many times during a single LC-MS/MS experiment. In this article, we present MS2Grouper, an algorithm that detects spectral duplication, assesses groups of related spectra, and replaces these groups with synthetic representative spectra. Errors in detecting spectral similarity are corrected using a paraclique criterion: spectra are only assessed as groups if they are part of a clique of at least three completely interrelated spectra or are subsequently added to such cliques by being similar to all but one of the clique members. A greedy algorithm constructs a representative spectrum for each group by iteratively removing the tallest peaks from the spectral collection and matching them to peaks in the other spectra. This strategy is shown to be effective in reducing spectral counts by up to 20% in LC-MS/MS datasets from protein standard mixtures and proteomes, reducing database search times without a concomitant reduction in identified peptides. (J Am Soc Mass Spectrom 2005, 16, 1250-1261)
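The paraclique criterion and the greedy construction of a representative spectrum described above can be illustrated with a minimal Python sketch. It is not the MS2Grouper implementation. The function names, the m/z tolerance `tol`, the `min_fraction` support threshold, and the choice to average matched peaks are all assumptions made for illustration; the abstract does not specify these details. Spectral similarity is taken as given, in the form of a dictionary mapping each spectrum to the set of spectra judged similar to it.

```python
from itertools import combinations


def find_seed_clique(similar):
    """Return one clique of at least three mutually similar spectra, or None.

    `similar` maps each spectrum id to the set of ids judged similar to it
    (assumed symmetric). Brute-force triangle search, for illustration only.
    """
    nodes = sorted(similar)
    for a, b, c in combinations(nodes, 3):
        if b in similar[a] and c in similar[a] and c in similar[b]:
            clique = {a, b, c}
            # Greedily extend with spectra similar to every current member.
            for n in nodes:
                if n not in clique and clique <= similar[n]:
                    clique.add(n)
            return clique
    return None


def grow_paraclique(similar, clique):
    """Add spectra that are similar to all but at most one clique member."""
    core = frozenset(clique)
    group = set(core)
    for node in sorted(set(similar) - core):
        misses = sum(1 for member in core if member not in similar[node])
        if misses <= 1:
            group.add(node)
    return group


def representative_spectrum(spectra, tol=0.5, min_fraction=0.5):
    """Greedy merge: take the tallest remaining peak, match it across spectra.

    `spectra` is a list of peak lists [(mz, intensity), ...].
    `tol` and `min_fraction` are hypothetical parameters, not from the paper.
    """
    pool = [(inten, mz, i) for i, spec in enumerate(spectra) for mz, inten in spec]
    pool.sort(reverse=True)  # tallest peaks first
    used = [False] * len(pool)
    rep = []
    for k, (inten, mz, src) in enumerate(pool):
        if used[k]:
            continue
        used[k] = True
        matched = {src: (mz, inten)}
        # Match at most one unused peak per other spectrum within the tolerance.
        for j in range(k + 1, len(pool)):
            inten_j, mz_j, src_j = pool[j]
            if not used[j] and src_j not in matched and abs(mz_j - mz) <= tol:
                matched[src_j] = (mz_j, inten_j)
                used[j] = True
        # Keep the peak only if enough spectra in the group support it.
        if len(matched) >= min_fraction * len(spectra):
            mean_mz = sum(m for m, _ in matched.values()) / len(matched)
            mean_int = sum(i for _, i in matched.values()) / len(matched)
            rep.append((mean_mz, mean_int))
    return sorted(rep)
```

In this sketch, a spectrum that fails the similarity test against exactly one clique member is still grouped, which is how the paraclique rule tolerates an isolated error in similarity detection. A peak seen in only a minority of the grouped spectra is dropped from the representative.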