In this paper, experiments on automatic extraction of keywords from abstracts using a supervised machine learning algorithm are discussed. The main point of this paper is that by adding linguistic knowledge to the representation (such as syntactic features), rather than relying only on statistics (such as term frequency and n-grams), a better result is obtained as measured by keywords previously assigned by professional indexers. In more detail, extracting NP-chunks gives better precision than n-grams, and by adding the POS tag(s) assigned to the term as a feature, a dramatic improvement of the results is obtained, independent of the term selection approach applied.
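To make the statistical baseline concrete, the following is a minimal sketch of n-gram candidate selection with term-frequency counts. It is not the paper's implementation: the function name `ngram_candidates`, the tiny stopword list, the punctuation-based segmentation, and the default `max_n=3` are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the list used in the paper).
STOPWORDS = {"a", "an", "and", "by", "in", "is", "of", "on", "the", "to"}


def ngram_candidates(text, max_n=3):
    """Count word n-grams (n <= max_n) that neither start nor end with a stopword."""
    counts = Counter()
    # Split on punctuation so that n-grams do not cross clause boundaries.
    for segment in re.split(r"[.,;:!?()]", text.lower()):
        tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", segment)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return counts


sample = "Extraction of keywords. Keywords improve text categorization."
candidates = ngram_candidates(sample)
```

In a supervised setting, each candidate and its frequency would become one feature row for the classifier; an NP-chunk or POS-tag approach would replace or extend this selection step with output from a tagger, which is omitted here.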
In the field of syndromic surveillance, various sources are exploited for outbreak detection, monitoring and prediction. This paper describes a study on queries submitted to a medical web site, with influenza as a case study. The hypothesis of the work was that queries on influenza and influenza-like illness would provide a basis for the estimation of the timing of the peak and the intensity of the yearly influenza outbreaks that would be as good as the existing laboratory and sentinel surveillance. We calculated the occurrence of various queries related to influenza from search logs submitted to a Swedish medical web site for two influenza seasons. These figures were subsequently used to generate two models, one to estimate the number of laboratory-verified influenza cases and one to estimate the proportion of patients with influenza-like illness reported by selected General Practitioners in Sweden. We applied an approach designed for highly correlated data, partial least squares regression. In our work, we found that certain web queries on influenza follow the same pattern as that obtained by the two other surveillance systems for influenza epidemics, and that they have equal power for the estimation of the influenza burden in society. Web queries give unique access to ill individuals who are not (yet) seeking care. This paper shows the potential of web queries as an accurate, cheap and labour-extensive source for syndromic surveillance.
This paper presents a study on whether and how automatically extracted keywords can be used to improve text categorization. In summary, we show that a higher performance (as measured by micro-averaged F-measure on a standard text categorization collection) is achieved when the full-text representation is combined with the automatically extracted keywords. The combination is obtained by giving higher weights to words in the full-texts that are also extracted as keywords. We also present results for experiments in which the keywords are the only input to the categorizer, either represented as unigrams or intact. Of these two experiments, the unigrams have the best performance, although neither performs as well as headlines only.
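The combination step described above can be sketched as follows. This is an illustration only: the abstract does not state the actual weighting scheme, so the multiplicative `boost` factor, its default value of 2.0, and the function name `boosted_term_weights` are hypothetical.

```python
from collections import Counter


def boosted_term_weights(tokens, keywords, boost=2.0):
    """Term-frequency weights, multiplied by `boost` for words that occur in an extracted keyword."""
    # Break multi-word keywords into their component words (the unigram view).
    keyword_words = {w for kw in keywords for w in kw.lower().split()}
    tf = Counter(t.lower() for t in tokens)
    return {t: c * (boost if t in keyword_words else 1.0) for t, c in tf.items()}


doc_tokens = "the keyword extraction improves text categorization of text".split()
weights = boosted_term_weights(doc_tokens, keywords=["text categorization"])
```

The resulting dictionary could then be fed to any bag-of-words categorizer in place of raw term frequencies.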
Background: The rapidly increasing dissemination of carbapenem-resistant Enterobacteriaceae (CRE) in both humans and animals poses a global threat to public health. However, the transmission of CRE between humans and animals has not yet been well studied. Objectives: We investigated the prevalence, risk factors, and drivers of CRE transmission between humans and their backyard animals in rural China. Methods: We conducted a comprehensive sampling strategy in 12 villages in Shandong, China. Using the household [residents and their backyard animals (farm and companion animals)] as a single surveillance unit, we assessed the prevalence of CRE at the household level and examined the factors associated with CRE carriage through a detailed questionnaire. Genetic relationships among human- and animal-derived CRE were assessed using whole-genome sequencing–based molecular methods. Results: A total of 88 New Delhi metallo-β-lactamases–type carbapenem-resistant Escherichia coli (NDM-EC), including 17 from humans, 44 from pigs, 12 from chickens, 1 from cattle, and 2 from dogs, were isolated from 65 of the 746 households examined. The remaining 12 NDM-EC were from flies in the immediate backyard environment. The NDM-EC colonization in households was significantly associated with a) the number of species of backyard animals raised/kept in the same household, and b) the use of human and/or animal feces as fertilizer. Discriminant analysis of principal components (DAPC) revealed that a large proportion of the core genomes of the NDM-EC belonged to strains from hosts other than their own, and several human isolates shared closely related core single-nucleotide polymorphisms and blaNDM genetic contexts with isolates from backyard animals. Conclusions: To our knowledge, we are the first to report evidence of direct transmission of NDM-EC between humans and animals. Given the rise of NDM-EC in community and hospital infections, combating NDM-EC transmission in backyard farm systems is needed.
https://doi.org/10.1289/EHP5251