Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18 2018
DOI: 10.1145/3178876.3186050

Multi-Task Learning Improves Disease Models from Web Search

Abstract: We investigate the utility of multi-task learning to disease surveillance using Web search data. Our motivation is two-fold. Firstly, we assess whether concurrently training models for various geographies - inside a country or across different countries - can improve accuracy. We also test the ability of such models to assist health systems that are producing sporadic disease surveillance reports that reduce the quantity of available training data. We explore both linear and nonlinear models, specifically a mult…

Cited by 44 publications (33 citation statements)
References 50 publications
“…queries with frequencies that are correlated to ILI rates, but have no link to flu, such as 'college basketball' or 'spring break'. To remove these unrelated queries we applied a semantic filter based on word embedding representations, similar to the one proposed in [38,72,73]. Word embeddings were trained on the English Wikipedia corpus using the fastText method [12].…”
Section: Step 1 - Learning a Regression Function In The Source Domain
mentioning
confidence: 99%
“…4 is given by [4,23]. A query's embedding is defined as the average of the embeddings of its tokens, an effective practice for short texts [8,42,66,72]. We denote with $v^S_i$, $v^T_j$, both $\in \mathbb{R}^{1 \times d}$, the embeddings of a source query (from $Q_S$) and of a target query from $P_T$, respectively.…”
mentioning
confidence: 99%
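The averaging step described in this statement can be illustrated with a minimal sketch. It assumes whitespace tokenization and a plain dictionary of token vectors; the helper name `query_embedding` and the toy vectors are hypothetical and are not taken from the citing paper, which uses trained word embeddings.

```python
import numpy as np


def query_embedding(query, token_vectors):
    """Return the mean of the token embeddings of `query` as a (1, d) array.

    `token_vectors` is a hypothetical dict mapping token -> 1-D numpy array.
    Tokens without a vector are skipped (an assumption of this sketch).
    """
    tokens = query.lower().split()  # assumed tokenization: split on whitespace
    vectors = [token_vectors[t] for t in tokens if t in token_vectors]
    if not vectors:
        raise ValueError("no token of the query has an embedding")
    # Average over tokens, then reshape to a row vector in R^{1 x d}.
    return np.mean(vectors, axis=0).reshape(1, -1)


# Toy 3-dimensional vectors, made up for illustration only.
toy_vectors = {
    "flu": np.array([1.0, 0.0, 2.0]),
    "symptoms": np.array([3.0, 2.0, 0.0]),
}
v = query_embedding("flu symptoms", toy_vectors)
```

Here `v` has shape `(1, d)` with `d = 3`, matching the row-vector convention in the quoted notation.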
“…The former is a Web interface offering a global, longitudinal view of the interest for one or more queries, provided as a normalized number between 0 and 100. The latter is a service offered to researchers, with access to the interest of search queries related to health topics (see also [23,46]). Here, the interest is provided as a frequency, i.e., the number of times the queries were searched during a time period normalized by the total number of searches in the same period.…”
Section: Related Work
mentioning
confidence: 99%
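The frequency definition in this statement (query count divided by the total number of searches in the same period) amounts to a simple ratio. The sketch below assumes per-period counts are available as plain lists; the function name `query_frequency` and all numbers are hypothetical and do not reflect any real service's API or data.

```python
def query_frequency(query_count, total_searches):
    """Fraction of all searches in a period that match the query."""
    if total_searches <= 0:
        raise ValueError("total_searches must be positive")
    return query_count / total_searches


# Made-up weekly counts for one query and the total search volume per week.
weekly_counts = [120, 300, 90]
weekly_totals = [1_000_000, 1_200_000, 900_000]

frequencies = [
    query_frequency(count, total)
    for count, total in zip(weekly_counts, weekly_totals)
]
```

Normalizing by total volume keeps the series comparable across periods with different overall search activity, which a raw count would not.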
“…Our work focuses on a methodology aimed at achieving this for influenza activity surveillance. Influenza has a large seasonal burden across the United States, infecting up to 35 million people and causing between 12000 and 56000 deaths per year [3]. Limiting the spread of outbreaks and reducing morbidity in those already infected are crucial steps for mitigating the impact of influenza.…”
mentioning
confidence: 99%
“…Many utilize innovative web-based data sources such as Internet search frequencies and electronic health records [5]. Some have also taken into account historically-observed spatial and temporal synchronicities in flu activity [11,12] to improve the accuracy of existing flu surveillance tools [13,14]. Because influenza transmission occurs locally and is spread from person to person, the timing of outbreaks and resulting infection rate curves can significantly differ from state to state.…”
mentioning
confidence: 99%