With the increased accessibility of large-scale open data, public health studies can take advantage of integrative spatial big data to raise their spatial resolution to the community or neighborhood level. A critical input for such studies is the large set of patient addresses, which are private and highly sensitive. Geocoding such massive sets of private addresses poses major challenges for public health researchers. Many geocoders provide only Web APIs, which require sending private addresses over the Internet; this is not feasible for sensitive data. Commercial geocoders charge high licensing fees and often limit daily usage, which becomes a major hurdle for researchers. Scalability is another major challenge for large-scale address datasets. In this paper, we present EaserGeocoder, a novel open source geocoder for effectively geocoding massive address datasets. EaserGeocoder takes an integrative approach, using multiple references based on open address data sources contributed by governments or communities. It takes a machine learning approach to automatically find the best answer among the candidates produced by the multiple references. The system achieves high scalability through parallel processing. Our comparative studies demonstrate that EaserGeocoder outperforms open source geocoders and is comparable to commercial ones in terms of both accuracy and error. It provides a cost-effective and feasible solution for large-scale public health studies.
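The pipeline described above (several references each proposing a candidate, a selection step picking the best one, and parallel processing over a batch) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from EaserGeocoder itself: the names `Candidate`, `make_reference`, `select_best`, `geocode`, and `geocode_batch` are hypothetical, each reference is a toy in-memory lookup table, the `match_score` field is an invented confidence value, and picking the highest score is only a stand-in for the learned selection model, whose features and form are not specified here. Because all lookups are local, no address leaves the machine.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Candidate:
    source: str         # name of the reference that produced this candidate
    lat: float
    lon: float
    match_score: float  # hypothetical match confidence in [0, 1]


# A reference is any callable that maps an address to a candidate or None.
Reference = Callable[[str], Optional[Candidate]]


def make_reference(name: str,
                   table: Dict[str, Tuple[float, float]],
                   score: float) -> Reference:
    """Build a toy reference backed by an in-memory lookup table (assumption)."""
    def lookup(address: str) -> Optional[Candidate]:
        hit = table.get(address.strip().lower())
        if hit is None:
            return None
        return Candidate(name, hit[0], hit[1], score)
    return lookup


def select_best(candidates: List[Candidate]) -> Optional[Candidate]:
    """Stand-in for the learned selector: pick the highest match_score."""
    return max(candidates, key=lambda c: c.match_score, default=None)


def geocode(address: str, references: List[Reference]) -> Optional[Candidate]:
    """Query every reference, then choose one answer among the candidates."""
    candidates = [c for ref in references if (c := ref(address)) is not None]
    return select_best(candidates)


def geocode_batch(addresses: List[str],
                  references: List[Reference],
                  workers: int = 4) -> List[Optional[Candidate]]:
    """Geocode addresses concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda a: geocode(a, references), addresses))
```

Calling `geocode_batch` with a list of addresses and two or more references returns one `Candidate` (or `None` when no reference matches) per input address. A thread pool is used here only to keep the sketch short; a process pool or a distributed framework would be a more natural fit for CPU-bound matching at scale.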