We describe the design and use of the Stanford CoreNLP toolkit, an extensible pipeline that provides core natural language analysis. This toolkit is quite widely used, both in the research NLP community and among commercial and government users of open source NLP technology. We suggest that this follows from a simple, approachable design, straightforward interfaces, the inclusion of robust and good-quality analysis components, and the fact that it does not require a large amount of associated baggage.
We used ecologic niche modeling of outbreaks and sporadic cases of filovirus-associated hemorrhagic fever (HF) to provide a large-scale perspective on the geographic and ecologic distributions of Ebola and Marburg viruses. We predicted that filoviruses would occur across the Afrotropics: Ebola HF in the humid rain forests of central and western Africa, and Marburg HF in the drier and more open areas of central and eastern Africa. Most of the predicted geographic extent of Ebola HF has been observed; Marburg HF has the potential to occur farther south and east. Ecologic conditions appropriate for Ebola HF are also present in Southeast Asia and the Philippines, where Ebola Reston is hypothesized to be distributed. This first large-scale ecologic analysis provides a framework for a more informed search for taxa that could constitute the natural reservoir for this virus family.
Universal dependencies (UD) is a framework for morphosyntactic annotation of human language, which to date has been used to create treebanks for more than 100 languages. In this article, we outline the linguistic theory of the UD framework, which draws on a long tradition of typologically oriented grammatical theories. Grammatical relations between words are centrally used to explain how predicate–argument structures are encoded morphosyntactically in different languages while morphological features and part-of-speech classes give the properties of words. We argue that this theory is a good basis for cross-linguistically consistent annotation of typologically diverse languages in a way that supports computational natural language understanding as well as broader linguistic studies.
Accurately surveying shark populations is critical to monitoring precipitous ongoing declines in shark abundance and interpreting the effects that these reductions are having on ecosystems. To evaluate the effectiveness of existing survey tools, we used field trials and computer simulations to critically examine the operation of four common methods for counting coastal sharks: stationary point counts, belt transects, video surveys, and mark and recapture abundance estimators. Empirical and theoretical results suggest that (1) survey method selection has a strong impact on the estimates of shark density that are produced, (2) standardizations by survey duration are needed to properly interpret and compare survey outputs, (3) increasing survey size does not necessarily increase survey precision, and (4) methods that yield the highest density estimates are not always the most accurate. These findings challenge some of the assumptions traditionally associated with surveying mobile marine animals. Of the methods we trialed, 8 × 50 m belt transects and a 20-m-radius point count produced the most accurate estimates of shark density. These findings can help to improve the ways we monitor, manage, and understand the ecology of globally imperiled coastal shark populations.
Background: Current multi-petaflop supercomputers are powerful systems, but present challenges when faced with problems requiring large machine learning workflows. Complex algorithms running at system scale, often with different patterns that require disparate software packages and complex data flows, cause difficulties in assembling and managing large experiments on these machines.
Results: This paper presents a workflow system that makes progress on scaling machine learning ensembles, specifically, in this first release, ensembles of deep neural networks that address problems in cancer research across the atomistic, molecular, and population scales. The initial release of the application framework, which we call CANDLE/Supervisor, addresses the problem of hyperparameter exploration of deep neural networks.
Conclusions: Initial results running CANDLE on DOE systems at ORNL, ANL, and NERSC (Titan, Theta, and Cori, respectively) demonstrate both scaling and multi-platform execution.