Discourse structures have a central role in several computational tasks, such as
question answering or dialogue generation. In particular, the framework of
Rhetorical Structure Theory (RST) offers a sound formalism for hierarchical text
organization. In this article, we present HILDA, an implemented discourse parser based
on RST and Support Vector Machine (SVM) classification. SVM classifiers are trained and
applied to discourse segmentation and relation labeling. By combining labeling with a
greedy bottom-up tree-building approach, we are able to create accurate discourse trees
in linear time. Importantly, our parser can parse entire texts, whereas the
publicly available parser SPADE (Soricut and Marcu 2003) is limited to sentence-level
analysis. HILDA outperforms other discourse parsers for tree structure construction and
discourse relation labeling. For the discourse parsing task, our system reaches 78.3% of
the performance level of human annotators. Compared to a state-of-the-art rule-based
discourse parser, our system achieves a performance increase of 11.6%.
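The greedy bottom-up tree building mentioned above can be sketched roughly as follows. This is a minimal illustration, not HILDA's implementation: `Node`, `build_tree`, `score_pair`, and `label_pair` are hypothetical names, and the two callables stand in for the trained SVM classifiers. The naive loop below rescores every adjacent pair at each step, so it does not by itself achieve the linear running time reported in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """A span of text in a discourse tree (hypothetical structure)."""
    text: str
    label: Optional[str] = None      # relation label for internal nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(
    segments: List[str],
    score_pair: Callable[[Node, Node], float],
    label_pair: Callable[[Node, Node], str],
) -> Node:
    """Greedily merge the best-scoring adjacent pair until one tree remains.

    score_pair and label_pair are placeholders for trained classifiers.
    """
    if not segments:
        raise ValueError("at least one segment is required")
    nodes = [Node(text=s) for s in segments]
    while len(nodes) > 1:
        # Score every pair of adjacent spans.
        scores = [score_pair(nodes[i], nodes[i + 1]) for i in range(len(nodes) - 1)]
        # Pick the most likely pair (first one wins on ties).
        best = max(range(len(scores)), key=scores.__getitem__)
        left, right = nodes[best], nodes[best + 1]
        merged = Node(
            text=left.text + " " + right.text,
            label=label_pair(left, right),
            left=left,
            right=right,
        )
        # Replace the two spans with their parent.
        nodes[best:best + 2] = [merged]
    return nodes[0]


# Toy usage: prefer merging short spans, always label "Elaboration".
toy_root = build_tree(
    ["a", "b", "cccc"],
    score_pair=lambda l, r: -(len(l.text) + len(r.text)),
    label_pair=lambda l, r: "Elaboration",
)
```

In the toy run, the two shortest segments are merged first, and the resulting span is then joined with the remaining segment to form the root.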
Despite significant progress in the development of human action detection datasets and algorithms, no current dataset is representative of real-world aerial view scenarios. We present Okutama-Action, a new video dataset for aerial view concurrent human action detection. It consists of 43 minute-long fully-annotated sequences with 12 action classes. Okutama-Action features many challenges missing in current datasets, including dynamic transition of actions, significant changes in scale and aspect ratio, abrupt camera movement, as well as multi-labeled actors. As a result, our dataset is more challenging than existing ones, and will help push the field forward to enable real-world applications.
Background
The goal of this research is to create a system that can use the available relevant information about the factors responsible for the spread of dengue to predict the occurrence of dengue within a geographical region, so that public health experts can prepare for, manage, and control the epidemic. Our study presents new geospatial insights into our understanding and management of health, disease, and health-care systems.
Methods
We present a machine learning-based methodology capable of forecasting dengue in each of the fifty districts of Thailand by leveraging data from multiple data sources. Using a set of prediction variables, we show that prediction accuracy increases with an optimal combination of predictors, which includes meteorological data, clinical data, lag variables of disease surveillance, socioeconomic data, and data encoding the spatial dependence of dengue transmission. We use Generalized Additive Models (GAMs) to fit the relationships between the predictors (with a lag of one month) and the clinical data on dengue hemorrhagic fever (DHF), using the data from 2008 to 2012. Using the data from 2013 to 2015 and a comparative set of prediction models, we evaluate the predictive ability of the fitted models according to RMSE and SRMSE, as well as adjusted R-squared, deviance explained, and change in AIC.
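As a rough illustration of the error measures named above, the sketch below computes RMSE and one possible form of SRMSE. The abstract does not state how SRMSE is normalized; dividing RMSE by the population standard deviation of the observed values is an assumption made here for illustration, and the function names are hypothetical.

```python
import math
from statistics import pstdev
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def srmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """RMSE divided by the standard deviation of the observations.

    ASSUMPTION: this normalization is one common convention, not
    necessarily the one used in the study.
    """
    spread = pstdev(observed)
    if spread == 0:
        raise ValueError("observed values have zero variance")
    return rmse(observed, predicted) / spread


# Toy monthly case counts (made up for illustration only).
toy_observed = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [1.0, 2.0, 3.0, 6.0]
toy_rmse = rmse(toy_observed, toy_predicted)
toy_srmse = srmse(toy_observed, toy_predicted)
```

Under this convention, a standardized value below 1 means the typical forecast error is smaller than the natural spread of the observations.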
Results
The model allows for combining different predictors to make forecasts with a lead time of one month and also describes the statistical significance of the variables used to characterize the forecast. The discriminating ability of the final model was evaluated against a Bangkok-specific constant threshold and the WHO moving threshold of the epidemic in terms of specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV).
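The four measures listed above follow from a confusion matrix of forecast versus observed outbreak flags. The sketch below assumes a single constant threshold and treats a month as an outbreak when its case count exceeds that threshold. The function name, the strict inequality, and the toy numbers are assumptions for illustration, not details from the study.

```python
from typing import Dict, Sequence


def outbreak_metrics(
    observed_cases: Sequence[float],
    forecast_cases: Sequence[float],
    threshold: float,
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV for threshold-based outbreak flags."""
    if len(observed_cases) != len(forecast_cases):
        raise ValueError("inputs must have equal length")
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed_cases, forecast_cases):
        actual = obs > threshold      # an outbreak really occurred
        flagged = pred > threshold    # the forecast signalled an outbreak
        if actual and flagged:
            tp += 1
        elif actual:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios are reported as NaN rather than raising.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy monthly counts (made up) with a hypothetical threshold of 40 cases.
toy_metrics = outbreak_metrics(
    observed_cases=[10, 50, 60, 5, 70],
    forecast_cases=[12, 55, 20, 45, 80],
    threshold=40,
)
```

A moving threshold could be handled the same way by passing a per-month threshold sequence instead of a constant, though that variant is not shown here.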
Conclusions
The out-of-sample validation showed poorer results than the in-sample validation; however, it demonstrated the ability to detect outbreaks up to one month ahead. We also determine that, for predicting dengue outbreaks within a district, the influence of dengue incidence and socioeconomic data from the surrounding districts is statistically significant. This validates the influence of the movement patterns of people and the spatial heterogeneity of human activities on the spread of the epidemic.