Machine learning, a collection of data-analytical techniques aimed at building predictive models from multi-dimensional datasets, is becoming integral to modern biological research. By enabling one to generate models that learn from large datasets and make predictions on likely outcomes, machine learning can be used to study complex cellular systems such as biological networks. Here, we provide a primer on machine learning for life scientists, including an introduction to deep learning. We discuss opportunities and challenges at the intersection of machine learning and network biology, which could impact disease biology, drug discovery, microbiome research, and synthetic biology.
Background: Recent clusters of outbreaks of mosquito-borne diseases (Rift Valley fever and chikungunya) in Africa and parts of the Indian Ocean islands illustrate how interannual climate variability influences the changing risk patterns of disease outbreaks. Although Rift Valley fever outbreaks have been known to follow periods of above-normal rainfall, the timing of the outbreak events has largely been unknown. Similarly, there is inadequate knowledge on climate drivers of chikungunya outbreaks. We analyze a variety of climate and satellite-derived vegetation measurements to explain the coupling between patterns of climate variability and disease outbreaks of Rift Valley fever and chikungunya.
Methods and Findings: We derived a teleconnections map by correlating long-term monthly global precipitation data with the NINO3.4 sea surface temperature (SST) anomaly index. This map identifies regional hot-spots where rainfall variability may have an influence on the ecology of vector borne disease. Among the regions are Eastern and Southern Africa where outbreaks of chikungunya and Rift Valley fever occurred 2004–2009. Chikungunya and Rift Valley fever case locations were mapped to corresponding climate data anomalies to understand associations between specific anomaly patterns in ecological and climate variables and disease outbreak patterns through space and time. From these maps we explored associations among Rift Valley fever disease occurrence locations and cumulative rainfall and vegetation index anomalies. We illustrated the time lag between the driving climate conditions and the timing of the first case of Rift Valley fever. Results showed that reported outbreaks of Rift Valley fever occurred after ∼3–4 months of sustained above-normal rainfall and associated green-up in vegetation, conditions ideal for Rift Valley fever mosquito vectors.
For chikungunya we explored associations among surface air temperature, precipitation anomalies, and chikungunya outbreak locations. We found that chikungunya outbreaks occurred under conditions of anomalously high temperatures and drought over Eastern Africa. However, in Southeast Asia, chikungunya outbreaks were negatively correlated (p<0.05) with drought conditions, but positively correlated with warmer-than-normal temperatures and rainfall.
Conclusions/Significance: Extremes in climate conditions forced by the El Niño/Southern Oscillation (ENSO) lead to severe droughts or floods, ideal ecological conditions for disease vectors to emerge, and may result in epizootics and epidemics of Rift Valley fever and chikungunya. However, the immune status of livestock (Rift Valley fever) and human (chikungunya) populations is a factor that is largely unknown but very likely plays a role in the spatial-temporal patterns of these disease outbreaks. As the frequency and severity of extremes in climate increase, the potential for globalization of vectors and disease is likely to accelerate. Understanding the underlying patterns of global and regional climate variability and their impacts on ecolo...
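The teleconnections map described in the abstract above correlates gridded monthly precipitation with the NINO3.4 SST anomaly index. A minimal sketch of that idea follows, assuming a plain Pearson correlation computed independently at each grid cell. The function name `teleconnection_map`, the `(time, lat, lon)` array layout, and the removal of only the long-term mean (rather than a monthly climatology) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def teleconnection_map(precip, nino34):
    """Pearson correlation of each grid cell's precipitation with an ENSO index.

    precip : array of shape (time, lat, lon), monthly precipitation (assumed layout)
    nino34 : array of shape (time,), NINO3.4 SST anomaly index
    Returns an array of shape (lat, lon); cells with zero variance are NaN.
    """
    precip = np.asarray(precip, dtype=float)
    nino34 = np.asarray(nino34, dtype=float)
    if precip.shape[0] != nino34.shape[0]:
        raise ValueError("precip and nino34 must cover the same time steps")

    # Anomalies relative to the long-term mean (assumption: no seasonal-cycle removal).
    p_anom = precip - precip.mean(axis=0)
    n_anom = nino34 - nino34.mean()

    # Sum over time of the anomaly products, giving one value per grid cell.
    cov = np.tensordot(n_anom, p_anom, axes=(0, 0))
    denom = np.sqrt((n_anom ** 2).sum() * (p_anom ** 2).sum(axis=0))

    # Guard against division by zero where a cell or the index never varies.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(denom > 0, cov / denom, np.nan)
```

Cells with strongly positive or negative values would correspond to the regional hot-spots the abstract mentions. A real analysis would likely also deseasonalize the precipitation series and test the correlations for significance.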
While synthetic biology has revolutionized our approaches to medicine, agriculture, and energy, the design of completely novel biological circuit components beyond naturally-derived templates remains challenging due to poorly understood design rules. Toehold switches, which are programmable nucleic acid sensors, face an analogous design bottleneck; our limited understanding of how sequence impacts functionality often necessitates expensive, time-consuming screens to identify effective switches. Here, we introduce Sequence-based Toehold Optimization and Redesign Model (STORM) and Nucleic-Acid Speech (NuSpeak), two orthogonal and synergistic deep learning architectures to characterize and optimize toeholds. Applying techniques from computer vision and natural language processing, we ‘un-box’ our models using convolutional filters, attention maps, and in silico mutagenesis. Through transfer-learning, we redesign sub-optimal toehold sensors, even with sparse training data, experimentally validating their improved performance. This work provides sequence-to-function deep learning frameworks for toehold selection and design, augmenting our ability to construct potent biological circuit components and precision diagnostics.
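In silico mutagenesis, one of the interpretation techniques named in the abstract above, can be sketched generically: substitute each position of a sequence with every alternative nucleotide and record how a model's prediction changes. The sketch below is an illustration under stated assumptions, not the STORM or NuSpeak implementation. The `predict` callable is a hypothetical stand-in for a trained model, `gc_fraction` is a toy scorer used only to make the example runnable, and the default DNA alphabet is an assumption.

```python
from typing import Callable, Dict, Tuple


def in_silico_mutagenesis(
    seq: str,
    predict: Callable[[str], float],
    alphabet: str = "ACGT",  # assumption: DNA alphabet; pass "ACGU" for RNA
) -> Dict[Tuple[int, str], float]:
    """Score every single-nucleotide substitution relative to the input sequence.

    Returns a mapping (position, substituted base) -> predicted score minus
    the score of the unmutated sequence.
    """
    baseline = predict(seq)
    effects: Dict[Tuple[int, str], float] = {}
    for i, ref in enumerate(seq):
        for alt in alphabet:
            if alt == ref:
                continue  # skip the unmutated base
            mutant = seq[:i] + alt + seq[i + 1:]
            effects[(i, alt)] = predict(mutant) - baseline
    return effects


def gc_fraction(seq: str) -> float:
    """Toy scorer (hypothetical): fraction of G/C bases, standing in for a model."""
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    effects = in_silico_mutagenesis("ACGT", gc_fraction)
    # Rank substitutions by predicted gain, largest first.
    for (pos, alt), delta in sorted(effects.items(), key=lambda kv: -kv[1]):
        print(pos, alt, round(delta, 3))
```

With a real sequence-to-function model in place of `gc_fraction`, positions whose substitutions produce large score changes would be the ones the model treats as most important.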