Contamination of drinking water by nitrate is a growing problem in many agricultural areas of the country. Ingested nitrate can lead to the endogenous formation of N-nitroso compounds, potent carcinogens. We developed a predictive model for nitrate concentrations in private wells in Iowa. Using 34,084 measurements of nitrate in private wells, we trained and tested random forest models to predict log nitrate levels by systematically assessing the predictive performance of 179 variables in 36 thematic groups (well depth, distance to sinkholes, location, land use, soil characteristics, nitrogen inputs, meteorology, and other factors). The final model contained 66 variables in 17 groups. Some of the most important variables were well depth, slope length within 1 km of the well, year of sample, and distance to nearest animal feeding operation. The correlation between observed and estimated nitrate concentrations was excellent in the training set (r-square=0.77) and was acceptable in the testing set (r-square=0.38). The random forest model had substantially better predictive performance than a traditional linear regression model or a regression tree. Our model will be used to investigate the association between nitrate levels in drinking water and cancer risk in the Iowa participants of the Agricultural Health Study cohort.
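As a minimal sketch of how an r-square like those reported above could be computed, the snippet below takes the squared Pearson correlation between observed and predicted log nitrate values. The abstract does not say which r-square definition was used, so this choice is an assumption, and the function name `r_squared` and the sample concentrations are hypothetical, not study data.

```python
import math

def r_squared(observed, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("r-square is undefined for a constant sequence")
    return cov * cov / (var_o * var_p)

# Hypothetical nitrate concentrations in mg/L (illustrative only, not study data).
observed_mg_l = [0.5, 1.2, 3.4, 8.0, 15.0]
predicted_mg_l = [0.7, 1.0, 2.5, 6.0, 20.0]

# The model described above predicts log nitrate, so compare on the log scale.
log_obs = [math.log(v) for v in observed_mg_l]
log_pred = [math.log(v) for v in predicted_mg_l]
r2_log = r_squared(log_obs, log_pred)
print(f"r-square on the log scale: {r2_log:.2f}")
```

Computing the same statistic separately on training and testing data is what exposes a gap like the one reported (0.77 versus 0.38).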
Summary: 1. White-nose syndrome (WNS) is an emerging disease of hibernating North American bats that is caused by the cold-growing fungus Geomyces destructans. Since it was first observed in the winter of 2007, WNS has led to unprecedented mortality in several species of bats and may threaten more than 15 additional hibernating bat species if it continues across the continent. Although the exact means by which fungal infection causes mortality are undetermined, available evidence suggests a strong role of winter environmental conditions in disease mortality. 2. By 2010, the fungus G. destructans was detected in new areas of North America far from the area where it was first observed, as well as in eight European bat species in different countries, yet mortality was not observed in many of these new areas of North America or in any part of Europe. This could be because of differences in the fungus, in rates of disease progression, and/or in life-history or physiological traits of the affected bat species between different regions. Infection of bats by G. destructans without associated mortality might also suggest that certain environmental conditions have to co-occur with fungal infection to cause mortality. 3. We tested the environmental conditions hypothesis using Maxent to map and model landscape surface conditions associated with WNS mortality. This approach was unique in that we modelled possible requisite environmental conditions for disease mortality and not simply the presence of the causative agent. 4. The top predictors of WNS mortality were land use/land cover types, mean air temperature of wettest quarter, elevation, frequency of precipitation and annual temperature range. Model results suggest that WNS mortality is most likely to occur in landscapes that are higher in elevation and topographically heterogeneous, drier and colder during winter, and more seasonally variable than surrounding landscapes. 5. Synthesis and applications. This study mapped the most likely environmental surface conditions associated with bat mortality owing to WNS in the north-eastern United States; maps can be used for selection of priority monitoring sites. Our results provide a starting point from which to investigate and predict the potential spread and population impacts of this catastrophic emerging disease.
Unregulated private wells in the United States are susceptible to many groundwater contaminants. Ingestion of nitrate, the most common anthropogenic private well contaminant in the United States, can lead to the endogenous formation of N-nitroso compounds, which are known human carcinogens. In this study, we expand upon previous efforts to model private well groundwater nitrate concentration in North Carolina by developing multiple machine learning models and testing against out-of-sample prediction. Our purpose was to develop exposure estimates in unmonitored areas for use in the Agricultural Health Study (AHS) cohort. Using approximately 22,000 private well nitrate measurements in North Carolina, we trained and tested continuous models including a censored maximum likelihood-based linear model, random forest, gradient boosted machine, support vector machine, neural networks, and kriging. Continuous nitrate models had low predictive performance (R² < 0.33), so multiple random forest classification models were also trained and tested. The final classification approach predicted < 1 mg/L, 1–5 mg/L, and ≥ 5 mg/L using a random forest model with 58 variables and maximizing Cohen’s kappa statistic. The final model had an overall accuracy of 0.75, with high specificity for the higher two categories and high sensitivity for the lowest category. The results will be used for the categorical prediction of private well nitrate for AHS cohort participants who reside in North Carolina.
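To make the classification metric concrete, here is a minimal sketch of binning nitrate values into the three categories named above and computing Cohen's kappa between measured and predicted categories. It assumes the middle category covers 1 to just under 5 mg/L; the helper names `nitrate_category` and `cohens_kappa` and the sample values are hypothetical and are not the authors' code or data.

```python
from collections import Counter

def nitrate_category(mg_per_l):
    """Bin a nitrate concentration (mg/L) into <1, 1-5, or >=5."""
    if mg_per_l < 1:
        return "<1"
    if mg_per_l < 5:
        return "1-5"
    return ">=5"

def cohens_kappa(observed, predicted):
    """Cohen's kappa: agreement between two label lists, corrected for chance."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("need two non-empty label lists of equal length")
    # Observed agreement: fraction of samples with matching labels.
    agree = sum(o == p for o, p in zip(observed, predicted)) / n
    # Chance agreement: product of the marginal label frequencies, summed.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    expected = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (agree - expected) / (1 - expected)

# Hypothetical measured and predicted concentrations in mg/L (not study data).
measured = [0.2, 0.8, 2.0, 4.9, 5.0, 12.0]
modelled = [0.4, 1.5, 3.0, 6.0, 7.0, 9.0]
kappa = cohens_kappa(
    [nitrate_category(v) for v in measured],
    [nitrate_category(v) for v in modelled],
)
print(f"Cohen's kappa: {kappa:.2f}")
```

Unlike raw accuracy, kappa discounts agreement that would occur by chance, which matters when one category (here, low nitrate) dominates the data.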
Background: Environmental exposure assessments often require a study participant’s residential location, but the positional accuracy of geocoding varies by method and the rural status of an address. We evaluated geocoding error in the Agricultural Health Study (AHS), a cohort of pesticide applicators and their spouses in Iowa and North Carolina, U.S.A. Methods: For 5,064 AHS addresses in Iowa, we compared rooftop coordinates as a gold standard to two alternate locations: 1) E911 locations (intersection of the private and public road), and 2) geocodes generated by matching addresses to a commercial street database (NAVTEQ) or placed manually. Positional error (distance in meters (m) from the rooftop) was assessed overall and separately for addresses inside (non-rural) or outside town boundaries (rural). We estimated the sensitivity and specificity of proximity-based exposures (crops, animal feeding operations (AFOs)) and the attenuation in odds ratios (ORs) for a hypothetical nested case–control study. We also evaluated geocoding errors within two AHS subcohorts in Iowa and North Carolina by comparing them to GPS points taken at residences. Results: Nearly two-thirds of the addresses represented rural locations. Compared to the rooftop gold standard, E911 locations were more accurate overall than address-matched geocodes (median error 39 and 90 m, respectively). Rural addresses generally had greater error than non-rural addresses, although errors were smaller for E911 locations. For highly prevalent crops within 500 m (>97% of homes), sensitivity was >95% using both data sources; however, lower specificities with address-matched geocodes (more common for rural addresses) led to substantial attenuation of ORs (e.g., corn <500 m ORobs = 1.47 vs. ORtrue = 2.0). Error in the address-matched geocodes resulted in even greater ORobs attenuation for AFO exposures. Errors for North Carolina addresses were generally smaller than those in Iowa. Conclusions: Geocoding error can be minimized when known coordinates are available to test alternative data and methods. Our assessment suggests that where E911 locations are available, they offer an improvement upon address-matched geocodes for rural addresses. Exposure misclassification resulting from positional error is dependent on the geographic database, geocoding method, and the prevalence of exposure. Electronic supplementary material: The online version of this article (doi:10.1186/1476-072X-13-37) contains supplementary material, which is available to authorized users.
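The odds-ratio attenuation described in the geocoding abstract can be illustrated with the standard calculation for nondifferential exposure misclassification. This is a sketch under stated assumptions: the function names are hypothetical, and the inputs below (control exposure prevalence 0.97, sensitivity 0.95, specificity 0.50) are illustrative values, not the ones the authors used to obtain ORobs = 1.47.

```python
def observed_prevalence(true_prev, sensitivity, specificity):
    """Apparent exposure prevalence after misclassification."""
    # True positives correctly classified plus false positives among unexposed.
    return sensitivity * true_prev + (1 - specificity) * (1 - true_prev)

def attenuated_odds_ratio(true_or, control_prev, sensitivity, specificity):
    """Observed OR when exposure is misclassified equally in cases and controls."""
    if not 0 < control_prev < 1 or true_or <= 0:
        raise ValueError("control_prev must be in (0, 1) and true_or positive")
    # True exposure prevalence among cases implied by the true odds ratio.
    case_odds = true_or * control_prev / (1 - control_prev)
    case_prev = case_odds / (1 + case_odds)
    # Apply the same sensitivity and specificity to both groups.
    obs_case = observed_prevalence(case_prev, sensitivity, specificity)
    obs_ctrl = observed_prevalence(control_prev, sensitivity, specificity)
    if not (0 < obs_case < 1 and 0 < obs_ctrl < 1):
        raise ValueError("observed prevalence must lie strictly between 0 and 1")
    return (obs_case / (1 - obs_case)) / (obs_ctrl / (1 - obs_ctrl))

# Illustrative inputs only (assumptions, not values reported in the study).
or_obs = attenuated_odds_ratio(
    true_or=2.0, control_prev=0.97, sensitivity=0.95, specificity=0.50
)
print(f"Observed OR: {or_obs:.2f}")
```

When exposure is very common, the few truly unexposed homes carry most of the information, so even modest losses in specificity pull the observed OR strongly toward 1, consistent with the pattern reported above.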