Disease surveillance systems are a cornerstone of public health tracking and prevention. This review addresses the use, promise, perils, and ethics of social media– and Internet-based data collection for public health surveillance. It highlights untapped opportunities for integrating digital surveillance into public health practice, as well as current applications that could be improved through better integration, validation, and clearer rules on ethical conduct. Promising developments include hybrid systems that couple traditional surveillance data with data from search queries, social media posts, and crowdsourcing. In the future, it will be important to identify opportunities for public-private partnerships, train public health experts in data science, reduce biases in digital data (gathered from Internet use, wearable devices, etc.), and address privacy. We are on the verge of an unprecedented opportunity to track, predict, and prevent disease burdens in populations worldwide using digital data.
Hand hygiene is thought to be more effective against gastrointestinal illness than it is against respiratory illness, but no clear consensus has been reached on this point. Minimal hand-hygiene interventions seem to be effective at reducing the incidence of employee illness. Along with reducing infections among employees, hand-hygiene programs in the workplace may provide additional benefits to employers by reducing the number of employee health insurance claims and improving employee morale. Future research should use objective measures of hand hygiene and illness, and explore economic impacts on employers more fully.
Background: Modern causal inference methods allow machine learning to be used to weaken parametric modeling assumptions. However, the use of machine learning may result in complications for inference. Doubly robust cross-fit estimators have been proposed to yield better statistical properties. Methods: We conducted a simulation study to assess the performance of several different estimators for the average causal effect. The data generating mechanisms for the simulated treatment and outcome included log-transforms, polynomial terms, and discontinuities. We compared singly robust estimators (g-computation, inverse probability weighting) and doubly robust estimators (augmented inverse probability weighting, targeted maximum likelihood estimation). We estimated nuisance functions with parametric models and ensemble machine learning separately. We further assessed doubly robust cross-fit estimators. Results: With correctly specified parametric models, all of the estimators were unbiased and confidence intervals achieved nominal coverage. When used with machine learning, the doubly robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage. Conclusions: Due to the difficulty of properly specifying parametric models in high-dimensional data, doubly robust estimators with ensemble learning and cross-fitting may be the preferred approach for estimation of the average causal effect in most epidemiologic studies. However, these approaches may require larger sample sizes to avoid finite-sample issues.
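To make the cross-fitting idea in this abstract concrete, the following is a minimal sketch of an augmented inverse probability weighting (AIPW) estimator with sample splitting. It is not the authors' implementation: the simulated data, the function names (`simulate`, `fit_nuisance`, `aipw_crossfit`), the single binary confounder, and the use of stratified means in place of ensemble machine learning are all assumptions made for illustration. It also uses a single random partition, whereas published cross-fit procedures typically repeat the split and summarize across repetitions.

```python
import random
from statistics import mean, stdev


def simulate(n, seed=0):
    """Hypothetical data: binary confounder W, treatment A, outcome Y.

    The true average causal effect is 2.0 by construction.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < (0.7 if w else 0.3) else 0
        y = 2.0 * a + 1.5 * w + rng.gauss(0.0, 1.0)
        rows.append((w, a, y))
    return rows


def fit_nuisance(train):
    """Estimate nuisance functions by stratified means (stand-in for ML).

    Returns P(A=1 | W=w) and E[Y | A=a, W=w] as dictionaries.
    """
    overall_p = mean(a for _, a, _ in train)
    overall_y = mean(y for _, _, y in train)
    prop, out = {}, {}
    for w in (0, 1):
        a_vals = [a for w_i, a, _ in train if w_i == w]
        prop[w] = mean(a_vals) if a_vals else overall_p
        for a in (0, 1):
            ys = [y for w_i, a_i, y in train if w_i == w and a_i == a]
            out[(a, w)] = mean(ys) if ys else overall_y
    return prop, out


def aipw_crossfit(rows, k=5, seed=0):
    """Cross-fit AIPW: nuisance models are fit on k-1 folds and
    evaluated only on the held-out fold."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [rows[i] for i in idx if i not in held]
        prop, out = fit_nuisance(train)
        for i in held_out:
            w, a, y = rows[i]
            p = min(max(prop[w], 0.01), 0.99)  # bound propensity away from 0 and 1
            m1, m0 = out[(1, w)], out[(0, w)]
            # AIPW score: outcome-model contrast plus weighted residual correction
            scores.append((m1 - m0) + a * (y - m1) / p - (1 - a) * (y - m0) / (1 - p))
    est = mean(scores)
    se = stdev(scores) / len(scores) ** 0.5  # influence-function-based standard error
    return est, se


rows = simulate(4000, seed=1)
est, se = aipw_crossfit(rows, k=5, seed=1)
print(f"ACE estimate: {est:.2f} (95% CI {est - 1.96 * se:.2f}, {est + 1.96 * se:.2f})")
```

In this toy setting the stratified means are correctly specified, so cross-fitting adds little. Its value shows when flexible learners are used for the nuisance functions, because evaluating each score on data the learner did not see limits the overfitting bias that the abstract describes.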
Despite repeated calls by scholars to critically engage with the concepts of race and ethnicity in US epidemiologic research, the incorporation of these social constructs in scholarship may be suboptimal. This study characterizes the conceptualization, operationalization, and utilization of race and ethnicity in US research published in leading journals whose publications shape discourse and norms around race, ethnicity, and health within the field of epidemiology. We systematically reviewed randomly selected articles from prominent epidemiology journals across five periods: 1995-99, 2000-04, 2005-09, 2010-14, and 2015-18. All original human-subjects research conducted in the US was eligible for review. Information on definitions, measurement, coding, and use in analysis was extracted. We reviewed 1050 articles, including 414 (39%) in analyses. Four studies explicitly defined race and/or ethnicity. Authors rarely made clear delineations between race and ethnicity, often adopting an ethno-racial construct. In the majority of studies across time periods, authors did not state how race and/or ethnicity was measured. Top coding schemes included “Black, White” (race), “Hispanic, Non-Hispanic” (ethnicity), and “Black, White, Hispanic” (ethno-racial). Most often, race and ethnicity were deemed “not of interest” in analyses (e.g., used as a control variable). Broadly, disciplinary practices remained largely the same from 1995 to 2018 and are in need of improvement.