Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2019
DOI: 10.1145/3292500.3330784

Predicting Economic Development using Geolocated Wikipedia Articles

Abstract: Progress on the UN Sustainable Development Goals (SDGs) is hampered by a persistent lack of data regarding key social, environmental, and economic indicators, particularly in developing countries. For example, data on poverty, the first of seventeen SDGs, is both spatially sparse and infrequently collected in Sub-Saharan Africa due to the high cost of surveys. Here we propose a novel method for estimating socioeconomic indicators using open-source, geolocated textual information from Wikipedia articles. We dem…

Citations: cited by 59 publications (45 citation statements)
References: 24 publications
“…Therefore, during the last few years, researchers have begun to investigate the potential of non-traditional data and new computational methods to estimate vulnerabilities and socioeconomic characteristics when primary data is not available. In these studies, mobile phone data [2], satellite imagery [3], a combination of both [4,5], geolocated Wikipedia articles [6] or Tweets [7], and social media advertising data [8], have been used in combination with state-of-the-art machine learning methods to provide reliable estimates of poverty at different spatial resolutions for several Sub-Saharan African countries as well as Southern and Southeastern Asian ones.…”
mentioning
confidence: 99%
“…Researchers can apply advanced data mining and machine learning techniques to improve their analyses of existing data. There are some examples where Big Data have contributed to improve macroeconomic forecasts and indicators, for example, Google data [128,129], economic news [125], web data [130,131] and individual bank card transaction data [132]. Others have exploited innovative methodologies or techniques Big Data has to improve practical functionality of economic modeling, for instance, early warnings of economic crisis using artificial neural networks [133] or trade nowcasting [134].…”
Section: Big Data and Decent Work And Economic Growth
mentioning
confidence: 99%
“…Approaches include stacking the inputs as additional channels of a single network, or multi-branch architectures where data modalities are processed separately to extract features which are then concatenated before a final prediction layer. Examples of this approach include models that combine multiple sources of satellite information [30] or models that combine imagery with data from weather sensors [48], cell phones [29], Wikipedia [49], social media [50], street-level imagery [51] or Open Street Map [52] to predict development-related outcomes.…”
Section: Shallow Models Based On Hand-crafted Features
mentioning
confidence: 99%
“…Second, although small samples make generalization tenuous, studies that made predictions at more aggregate spatial scales, and studies that combined satellite information with data from other sources, tended to outperform village-level satellite-only models. These data fusion approaches have become increasingly common, with researchers demonstrating how combining imagery with data from cell phones [29], Wikipedia [49], social media [50], or Open Street Map [52] can improve predictions.…”
Section: Economic Livelihoods
mentioning
confidence: 99%
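
One of the citation statements above contrasts two ways of combining data modalities: stacking inputs as additional channels of a single network, and multi-branch architectures whose per-modality features are concatenated before a final prediction layer. The following is a minimal sketch of both ideas in NumPy. It is not the model from the cited paper or from any citing paper; the function names, the dimensions, and the random weights are all hypothetical and chosen only for illustration.

```python
import numpy as np

def branch(x, w):
    """One modality-specific branch: a linear map followed by ReLU (illustrative only)."""
    return np.maximum(x @ w, 0.0)

def late_fusion_predict(image_x, text_x, params):
    """Multi-branch fusion: extract features per modality, concatenate, then predict."""
    img_feat = branch(image_x, params["w_img"])            # (n, 8) image features
    txt_feat = branch(text_x, params["w_txt"])             # (n, 8) text features
    fused = np.concatenate([img_feat, txt_feat], axis=1)   # (n, 16) joint features
    return fused @ params["w_out"]                         # (n, 1) final prediction layer

def early_fusion_input(stack_a, stack_b):
    """Channel stacking: join two rasters along the last (channel) axis."""
    return np.concatenate([stack_a, stack_b], axis=-1)

# Hypothetical sizes: 4 locations, 16 image features, 10 text features.
rng = np.random.default_rng(0)
params = {
    "w_img": rng.normal(size=(16, 8)),
    "w_txt": rng.normal(size=(10, 8)),
    "w_out": rng.normal(size=(16, 1)),
}
image_x = rng.normal(size=(4, 16))
text_x = rng.normal(size=(4, 10))
preds = late_fusion_predict(image_x, text_x, params)

# Two hypothetical rasters with 3 and 2 channels on an 8x8 grid.
stacked = early_fusion_input(np.zeros((4, 8, 8, 3)), np.zeros((4, 8, 8, 2)))
```

In this sketch, channel stacking requires the inputs to share a spatial grid, whereas the multi-branch form lets each modality keep its own shape until the concatenation step.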