Big data refers to large, complex, potentially linkable data from diverse sources, ranging from the genome and social media, to individual health information and the contributions of citizen science monitoring, to large-scale long-term oceanographic and climate modeling and its processing in innovative and integrated “data mashups.” Over the past few decades, thanks to the rapid expansion of computer technology, there has been a growing appreciation for the potential of big data in environment and human health research. The promise of big data mashups in environment and human health includes the ability to truly explore and understand the “wicked environment and health problems” of the 21st century, from tracking the global spread of the Zika and Ebola virus epidemics to modeling future climate change impacts and adaptation at the city or national level. Other opportunities include the possibility of identifying environment and health hot spots (i.e., locations where people and/or places are at particular risk), where innovative interventions can be designed and evaluated to prevent or adapt to climate and other environmental change over the long term with potential (co-) benefits for health; and of locating and filling gaps in existing knowledge of relevant linkages between environmental change and human health. There is the potential for the increasing control of personal data (both access to and generation of these data), benefits to health and the environment (e.g., from smart homes and cities), and opportunities to contribute via citizen science research and share information locally and globally. At the same time, there are challenges inherent in big data and data mashups, particularly in the environment and human health arena. Environment and health represent very diverse scientific areas with different research cultures, ethos, languages, and expertise. 
Equally diverse are the types of data involved (including time and spatial scales, and different types of modeled data), often with no standardization of the data to allow easy linkage beyond time and space variables, as data types are mostly shaped by the needs of the communities where they originated and have been used. Furthermore, these “secondary data” (i.e., data re-used in research) were often not originally collected for research purposes, a particularly relevant distinction in the context of routine health data re-use. In addition, the ways in which the research communities in health and environmental sciences approach data analysis and synthesis, as well as statistical and mathematical modeling, differ widely. There is a lack of trained personnel who can span these interdisciplinary divides or who have the necessary expertise in the techniques that make adequate bridging possible, such as software development, big data management and storage, and data analysis. Moreover, health data pose unique challenges due to the need to maintain confidentiality and data privacy for the individuals or groups being studied, to evaluate the implications of shared information for the communities affected by research and big data, and to resolve the long-standing issues of intellectual property and data ownership occurring throughout the environment and health fields. As with other areas of big data, the new “digital data divide” is growing, where some researchers and research groups, or corporations and governments, have access to data and computing resources while others do not, even as citizen participation in research initiatives is increasing. 
Finally, with the exception of some business-related activities, funding, especially with the aim of encouraging the sustainability and accessibility of big data resources (from personnel to hardware), is currently inadequate; there is widespread disagreement over which business models can support long-term maintenance of data infrastructures, and those that exist now are often unable to deal with the complexity and resource-intensive nature of maintaining and updating these tools. Nevertheless, researchers, policy makers, funders, governments, the media, and members of the general public are increasingly recognizing the innovation and creativity potential of big data in environment and health and many other areas. This can be seen in how the relatively new and powerful movement of Open Data is being crystallized into science policy and funding guidelines. Some of the challenges and opportunities, as well as some salient examples, of the potential of big data and big data mashup applications to environment and human health research are discussed.
Background: Children incur lead toxicity even at low blood-lead concentrations (BLCs), and testing in England is opportunistic. We described the epidemiology of cases notified to a passive laboratory-based surveillance system (SS), the Lead Poisoning in Children (LPIC) SS, to inform opportunities to prevent lead exposure in children in England.
Methods: Surveillance population: children <16 years of age and resident in England during the reporting period September 2014–17. Case definition: children with BLC ≥0.48 μmol/l (10 μg/dl). We extracted case demographic/location data and linked them with laboratory, area-level population and socio-economic status (SES) data. We described case BLCs and calculated age-, gender- and SES-specific notification rates, and age-sex standardised regional notification rates.
Results: Between 2014 and 2017 there were 86 newly notified cases, giving an annual average notification rate of 2.76 per million children aged 0–15 years. Regionally, rates varied from 0.36 to 9.89 per million. Rates were highest in the most deprived quintile (5.38 per million), males (3.75 per million) and children aged 1–4 years (5.89 per million).
Conclusions: Males, children aged 1–4 years, and children in deprived areas may be at higher risk, and could be targeted for primary prevention. Varied regional notification rates suggest differences in clinician awareness of lead exposure and risk factors; guidelines standardising the indications for BLC-testing may assist secondary prevention.
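The arithmetic behind the case definition and the headline rate can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis code. The unit conversion uses the standard atomic weight of lead (207.2 g/mol). The population denominator is a hypothetical round figure chosen for illustration; the abstract does not report the denominator used, and the three-year span is read from the stated reporting period.

```python
# Illustrative sketch only; not the surveillance system's actual code.

LEAD_MOLAR_MASS_G_PER_MOL = 207.2  # standard atomic weight of lead


def umol_per_l_to_ug_per_dl(blc_umol_per_l: float) -> float:
    """Convert a blood-lead concentration from umol/l to ug/dl."""
    # umol/l * g/mol gives ug/l; dividing by 10 converts per-litre to per-decilitre.
    return blc_umol_per_l * LEAD_MOLAR_MASS_G_PER_MOL / 10.0


def annual_rate_per_million(cases: int, years: float, population: float) -> float:
    """Average annual notification rate per million population."""
    return cases / years / population * 1_000_000


# Case-definition threshold from the abstract: 0.48 umol/l, quoted as 10 ug/dl.
threshold_ug_per_dl = umol_per_l_to_ug_per_dl(0.48)

# ASSUMPTION: hypothetical denominator for children aged 0-15 in England;
# this number is not given in the source.
HYPOTHETICAL_POPULATION = 10_400_000

# 86 newly notified cases over the three-year reporting period.
rate = annual_rate_per_million(cases=86, years=3, population=HYPOTHETICAL_POPULATION)

print(f"Threshold: {threshold_ug_per_dl:.2f} ug/dl")
print(f"Annual rate: {rate:.2f} per million")
```

With this assumed denominator the result lands close to the reported 2.76 per million; the age-, gender- and SES-specific rates would follow the same formula with stratum-specific case counts and populations.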
Creating dedicated data and analysis resources, such as the one described here, will become an increasingly vital step in improving understanding of the complex interconnections between the environment and human health and wellbeing, whilst still ensuring appropriate confidentiality safeguards. The issues raised in this paper can inform the future development of similar tools by other researchers working in this field.