The temporal and spatial coarseness of megafaunal fossil records complicates attempts to to disentangle the relative impacts of climate change, ecosystem restructuring, and human activities associated with the Late Quaternary extinctions. Advances in the extraction and identification of ancient DNA that was shed into the environment and preserved for millennia in sediment now provides a way to augment discontinuous palaeontological assemblages. Here, we present a 30,000-year sedimentary ancient DNA (sedaDNA) record derived from loessal permafrost silts in the Klondike region of Yukon, Canada. We observe a substantial turnover in ecosystem composition between 13,500 and 10,000 calendar years ago with the rise of woody shrubs and the disappearance of the mammoth-steppe (steppe-tundra) ecosystem. We also identify a lingering signal of Equus sp. (North American horse) and Mammuthus primigenius (woolly mammoth) at multiple sites persisting thousands of years after their supposed extinction from the fossil record.
Background Machine learning (ML) encompasses a large set of algorithms that aim at discovering
Background A critical goal in biology is to relate the phenotype to the genotype, that is, to find the genetic determinants of various traits. However, while simple monofactorial determinants are relatively easy to identify, the underpinnings of complex phenotypes are harder to predict. While traditional approaches rely on genome-wide association studies based on Single Nucleotide Polymorphism data, the ability of machine learning algorithms to find these determinants in whole proteome data is still not well known. Results To better understand the applicability of machine learning in this case, we implemented two such algorithms, adaptive boosting (AB) and repeated random forest (RRF), and developed a chunking layer that facilitates the analysis of whole proteome data. We first assessed the performance of these algorithms and tuned them on an influenza data set, for which the determinants of three complex phenotypes (infectivity, transmissibility, and pathogenicity) are known based on experimental evidence. This allowed us to show that chunking improves runtimes by an order of magnitude. Based on simulations, we showed that chunking also increases sensitivity of the predictions, reaching 100% with as few as 20 sequences in a small proteome as in the influenza case (5k sites), but may require at least 30 sequences to reach 90% on larger alignments (500k sites). While RRF has less specificity than random forest, it was never <50%, and RRF sensitivity was significantly higher at smaller chunk sizes. We then used these algorithms to predict the determinants of three types of drug resistance (to Ciprofloxacin, Ceftazidime, and Gentamicin) in a bacterium, Pseudomonas aeruginosa . While both algorithms performed well in the case of the influenza data, results were more nuanced in the bacterial case, with RRF making more sensible predictions, with smaller errors rates, than AB. Conclusions Altogether, we demonstrated that ML algorithms can be used to identify genetic determinants in small proteomes (viruses), even when trained on small numbers of individuals. We further showed that our RRF algorithm may deserve more scrutiny, which should be facilitated by the decreasing costs of both sequencing and phenotyping of large cohorts of individuals. Electronic supplementary material The online version of this article (10.1186/s12864-019-5820-0) contains supplementary material, which is available to authorized users.
Escherichia coli – one of the most characterized bacteria and a major public health concern – remains invisible across the temporal landscape. Here, we present the meticulous reconstruction of the first ancient E. coli genome from a 16th century gallstone from an Italian mummy with chronic cholecystitis. We isolated ancient DNA and reconstructed the ancient E. coli genome. It consisted of one chromosome of 4446 genes and two putative plasmids with 52 genes. The E. coli strain belonged to the phylogroup A and an exceptionally rare sequence type 4995. The type VI secretion system component genes appears to be horizontally acquired from Klebsiella aerogenes, however we could not identify any pathovar specific genes nor any acquired antibiotic resistances. A sepsis mouse assay showed that a closely related contemporary E. coli strain was avirulent. Our reconstruction of this ancient E. coli helps paint a more complete picture of the burden of opportunistic infections of the past.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.