2016
DOI: 10.1371/journal.pcbi.1004876

Towards Identifying and Reducing the Bias of Disease Information Extracted from Search Engine Data

Abstract: The estimation of disease prevalence in online search engine data (e.g., Google Flu Trends (GFT)) has received a considerable amount of scholarly and public attention in recent years. While the utility of search engine data for disease surveillance has been demonstrated, the scientific community still seeks ways to identify and reduce biases that are embedded in search engine data. The primary goal of this study is to explore new ways of improving the accuracy of disease prevalence estimations by combining tra…

Cited by 22 publications (21 citation statements)
References 39 publications
“…The success of GFT motivated several studies aiming to assess current flu activity based on secondary data such as Internet search queries and electronic health records [29–35]. Several studies have been conducted on HFMD prediction using Baidu search queries [36–38]. In these research works, Baidu search queries are incorporated into forecasting methods, and the HFMD prediction is either at provincial or national level.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…It has been pointed out that failing to separate these sources entails dangers, both in the context of population dynamics in general (Nadeem et al, 2016) and of infectious disease dynamics in particular (see, e.g., Fujiwara, 2009; Gibbons et al, 2014). Moreover, measurement models have been receiving more attention motivated by the access to massive datasets related to infectious diseases in recent times (Huang et al, 2016). Here, we focus on two variations on the models of Section 3.2 for noisy parameters in measurement models that seem to have been so far unexplored: dependent environmental noise and diagnosis error.…”
Section: Discussion
Type: mentioning
confidence: 99%
“…Interestingly, these words use generic nouns but not ILI symptoms. Huang et al [22] used the same method to choose key words for the prediction of HFMD epidemics from Baidu queries, but extended the number of key words to 11. Hulth et al [14] detected an influenza outbreak in Sweden by counting 20 types of web queries that contained 'influenza' or symptoms of ILI (in Swedish).…”
Section: Query Abstract
Type: mentioning
confidence: 99%
“…Previous predictive studies, which adopted methods such as dynamic modeling [3], autologistic regression [4], gray system GM (1,1) [5,6], and neural networks [7], did not report errors in the predicted value; therefore the predictive value of these different models cannot be compared using MAPE. Huang et al [22] used MAPE to describe the difference of fitted results and the real HFMD data. But what we care about is the divergence between the predictive value and the actual case number.…”
Section: Stage
Type: mentioning
confidence: 99%