The study of microorganisms, which pervade every part of this planet, has faced many challenges over time, such as the discovery of unknown organisms and the understanding of how they interact with their environment. The aim of this review is to take the reader along the timeline and the major milestones that led to modern metagenomics. This new and thriving area is likely to contribute to solving a wide range of problems. The transition from classical microbiology to modern metagenomic studies has required the development of new branches of knowledge and specialization. Here, we review how the availability of high-throughput sequencing technologies has transformed microbiology and bioinformatics, and how to tackle the inherent computational challenges that arise from the DNA sequencing revolution. New computational methods are constantly being developed to collect, process, and extract useful biological information from a variety of samples and complex datasets, but metagenomics requires the integration of several of these methods. Despite the level of specialization that bioinformatics demands, it is important that life scientists understand it well enough to design experiments correctly, which in turn allows them to reveal the information contained in a metagenome.
The MGnify platform (https://www.ebi.ac.uk/metagenomics) facilitates the assembly, analysis and archiving of microbiome-derived nucleic acid sequences. The platform provides access to taxonomic assignments and functional annotations for nearly half a million analyses covering metabarcoding, metatranscriptomic, and metagenomic datasets, which are derived from a wide range of environments. Over the past 3 years, MGnify has grown not only in the number of datasets it contains but also in the breadth of analyses it provides, such as the analysis of long-read sequences. The MGnify protein database now exceeds 2.4 billion non-redundant sequences predicted from metagenomic assemblies. This collection is now organised into a relational database, a major improvement that makes it possible to understand the genomic context of a protein by navigating back to the source assembly and sample metadata. To extend beyond the functional annotations already provided in MGnify, we have applied deep learning-based annotation methods. The technology underlying MGnify's Application Programming Interface (API) and website has been upgraded, and we have enabled downstream analysis of MGnify data through the introduction of a coupled Jupyter Lab environment.
Metagenomics research has recently thrived due to improvements in DNA sequencing technologies, driving the emergence of new analysis tools and the growth of taxonomic databases. However, there is no all-purpose strategy that can guarantee the best result for a given project, and there are several combinations of software, parameters and databases that can be tested. Therefore, we performed an impartial comparison, using statistical measures of classification, of eight bioinformatic tools and four taxonomic databases, defining a benchmark framework to evaluate each tool in a standardized context. Using in silico simulated data for 16S rRNA amplicons and whole metagenome shotgun data, we compared the results from different software and database combinations to detect biases related to algorithms or database annotation. Using our benchmark framework, researchers can define cut-off values to evaluate the expected error rate and coverage for their results, regardless of the score used by each software. A quick guide to selecting the best tool, together with all datasets and scripts needed to reproduce our results and benchmark any new method, is available at https://github.com/Ales-ibt/Metagenomic-benchmark. Finally, we stress the importance of gold standards, database curation and manual inspection of taxonomic profiling results for a better and more accurate description of microbial diversity.
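The idea of choosing a score cut-off from simulated data can be illustrated with a minimal Python sketch. It is not taken from the authors' repository: the function names, the toy `assignments` list, and the definitions used here (error rate as the fraction of retained reads that are misclassified, coverage as the fraction of all reads retained) are assumptions made for illustration only.

```python
def error_and_coverage(assignments, cutoff):
    """Return (error_rate, coverage) for reads whose score is >= cutoff.

    assignments: list of (score, is_correct) tuples, one per simulated read,
    where is_correct says whether the predicted taxon matches the known one.
    These definitions are illustrative assumptions, not the paper's own.
    """
    if not assignments:
        return 0.0, 0.0
    kept = [ok for score, ok in assignments if score >= cutoff]
    coverage = len(kept) / len(assignments)
    error_rate = sum(1 for ok in kept if not ok) / len(kept) if kept else 0.0
    return error_rate, coverage


def choose_cutoff(assignments, max_error):
    """Return (cutoff, error_rate, coverage) for the lowest observed score
    whose error rate does not exceed max_error, or None if there is none."""
    for cutoff in sorted({score for score, _ in assignments}):
        error_rate, coverage = error_and_coverage(assignments, cutoff)
        if error_rate <= max_error:
            return cutoff, error_rate, coverage
    return None


# Hypothetical toy data: (classifier score, was the assignment correct?)
assignments = [
    (0.95, True), (0.90, True), (0.80, False),
    (0.70, True), (0.60, False), (0.50, False),
]

if __name__ == "__main__":
    print(choose_cutoff(assignments, max_error=0.25))
```

Because the cut-off is derived from the relationship between score and correctness on data with known composition, the same procedure can be applied to any tool, whatever scale its score uses.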
Marine sediments are among the most complex microbial habitats. Their bacterial communities play an important role in several biogeochemical cycles in the marine ecosystem. In particular, the Gulf of Mexico has a ubiquitous concentration of hydrocarbons in its sediments, representing a very interesting niche to explore. Additionally, the Mexican government has opened its oil industry, offering several exploration and production blocks in shallow and deep water in the southwestern Gulf of Mexico (swGoM), for which no study results are publicly available. Given the higher risk of large-scale oil spills, the design of contingency plans and mitigation activities before oil exploitation is of growing concern. Therefore, a bacterial taxonomic baseline profile is crucial to understanding the impact of any eventual oil spill. Here, we show a genus-level taxonomic profile to elucidate the bacterial baseline, pointing out richness and relative abundance, as well as relationships with 79 abiotic parameters, in an area encompassing ∼150,000 km², including a region where the exploitation of new oil wells has already been authorized. Our results describe for the first time the bacterial landscape of the swGoM, establishing a bacterial baseline “core” of 450 genera for marine sediments in this region. We can also differentiate bacterial populations from shallow and deep zones of the swGoM based on their community structure. Shallow sediments have been chronically exposed to aromatic hydrocarbons, unlike deep zones. Our results reveal that the bacterial community structure is particularly enriched with hydrocarbon-degrading bacteria in the shallow zone, where a greater aromatic hydrocarbon concentration was determined. Differences in the bacterial communities in the swGoM were also observed through a comprehensive comparative analysis relative to various marine sediment sequencing projects, including sites sampled after the Deepwater Horizon oil spill.
This study in the swGoM provides clues to the bacterial population adaptation to the ubiquitous presence of hydrocarbons and reveals organisms such as Thioprofundum bacteria with potential applications in ecological surveillance. This resource will allow us to differentiate between natural conditions and alterations generated by oil extraction activities, which, in turn, enables us to assess the environmental impact of such activities.
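The quantities named in this abstract (richness, relative abundance, and a baseline "core" of genera) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the sample and genus names are placeholders, the counts are invented toy values, and defining the core as the genera detected in a chosen fraction of samples (`min_prevalence`) is an assumption, since the abstract does not state the criterion used.

```python
from collections import Counter


def relative_abundance(counts):
    """Convert per-genus read counts into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {genus: n / total for genus, n in counts.items()}


def richness(counts):
    """Number of genera observed (count > 0) in one sample."""
    return sum(1 for n in counts.values() if n > 0)


def core_genera(samples, min_prevalence=1.0):
    """Genera detected in at least min_prevalence (0..1) of the samples.

    samples: dict mapping sample name -> {genus: read count}.
    The prevalence-based definition is an assumption for illustration.
    """
    if not samples:
        return set()
    seen = Counter(
        genus
        for counts in samples.values()
        for genus, n in counts.items()
        if n > 0
    )
    return {g for g, k in seen.items() if k / len(samples) >= min_prevalence}


# Hypothetical toy count table with placeholder names
samples = {
    "shallow_1": {"genus_A": 6, "genus_B": 3, "genus_C": 1},
    "deep_1": {"genus_A": 2, "genus_B": 0, "genus_C": 2},
}

if __name__ == "__main__":
    for name, counts in samples.items():
        print(name, richness(counts), relative_abundance(counts))
    print(sorted(core_genera(samples)))
```

Lowering `min_prevalence` below 1.0 relaxes the core to genera found in most, rather than all, samples, which may be preferable when sequencing depth varies between sites.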