2019
DOI: 10.1073/pnas.1903064116

Predicting neighborhoods’ socioeconomic attributes using restaurant data

Abstract: Accessing high-resolution, timely socioeconomic data such as data on population, employment, and enterprise activity at the neighborhood level is critical for social scientists and policy makers to design and implement location-based policies. However, in many developing countries or cities, reliable local-scale socioeconomic data remain scarce. Here, we show an easily accessible and timely updated location attribute (restaurant) can be used to accurately predict a range of socioeconomic attributes of urban neig…

Cited by 94 publications (54 citation statements)
References 33 publications
“…This dataset is all publicly available and downloaded from Meituan-Dianping website. Dong et al [31] use the same Dianping dataset to accurately predict socioeconomic attributes of various neighborhoods in several Chinese cities including Beijing. The attributes are daytime and nighttime population, number of firms, and consumption level in those areas.…”
Section: Consumption Data
mentioning
confidence: 99%
“…Before defining ROC AUC and PR AUC, we introduce the false positive rate, $r_{fp}$, and true positive rate, $r_{tp}$,
$$r_{fp} = \frac{FP}{TN+FP} \tag{9}$$
$$r_{tp} = \frac{TP}{TP+FN},$$
…”
Section: Methods
mentioning
confidence: 99%
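
The two rates defined in the Methods statement above can be illustrated with a minimal sketch. The labels and predictions below are made-up toy values, and the function names (`confusion_counts`, `false_positive_rate`, `true_positive_rate`) are hypothetical helpers, not taken from the citing paper. The sketch assumes binary 0/1 labels with at least one positive and one negative example, so neither denominator is zero.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def false_positive_rate(fp, tn):
    """r_fp = FP / (TN + FP): share of actual negatives flagged as positive."""
    return fp / (tn + fp)


def true_positive_rate(tp, fn):
    """r_tp = TP / (TP + FN): share of actual positives correctly flagged."""
    return tp / (tp + fn)


# Toy data (illustrative only): 4 actual positives, 4 actual negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
r_fp = false_positive_rate(fp, tn)
r_tp = true_positive_rate(tp, fn)
print(f"TP={tp} FP={fp} TN={tn} FN={fn} r_fp={r_fp} r_tp={r_tp}")
```

Sweeping a decision threshold and recording the resulting (r_fp, r_tp) pairs is what traces out a ROC curve; this sketch evaluates a single threshold only.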
“…The relationship between the fitted city GDP values using the multiple linear regression (MLR) models of transportation features and the actual city GDP in three provinces (i.e., Liaoning, Jiangsu, and Shaanxi) are summarized in Fig , S4, and S5, the residuals center on zero and are not correlated with any predictors, which indicate that these models' predictions have a relatively constant variance and show homoscedasticity and normality. In addition, by applying two regularized regression methods: the Ridge regression 46 and the least absolute shrinkage and selection operator (LASSO) regression 47,48 , the smallest RMSE values were 54.2 (Liaoning), 121.0 (Jiangsu), and 30.66 (Shaanxi) billion CNY, respectively. Moreover, these two methods only select a few predictors while reducing the coefficients of other highly correlated predictors to zero.…”
Section: Estimation Of GDP From Traffic Flows
mentioning
confidence: 99%
“…By shrinking some coefficients, the Ridge regression and the LASSO regression are able to control the multicollinearity in the model. It tends to pick one predictor from a few very correlated predictors and set the coefficients of the others to zero 47,48 . Therefore, the regularization techniques are very helpful when there are many intercorrelated features and feature selection is necessary 64 .…”
Section: Ln(gdp
mentioning
confidence: 99%
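
The effect of regularization on correlated predictors, as described in the statement above, can be illustrated with a toy sketch. Everything here is an assumption for illustration: the six data points are invented, the penalty strength `lam = 0.1` is arbitrary, and `lasso_cd` and `ridge_2d` are hypothetical pure-Python helpers (coordinate descent for the L1 penalty, a closed-form 2x2 solve for the L2 penalty), not the citing authors' implementation. On this data the L1 penalty sets the coefficient of one of the two correlated predictors exactly to zero, while the L2 penalty shrinks both and keeps both nonzero.

```python
def center(v):
    """Subtract the mean so that no intercept term is needed."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; return exactly 0.0 inside [-lam, lam]."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_cd(cols, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n = len(y)
    b = [0.0] * len(cols)
    for _ in range(n_iter):
        for j, xj in enumerate(cols):
            # Partial residual: remove the fit of every predictor except j.
            r = [y[i] - sum(b[k] * cols[k][i]
                            for k in range(len(cols)) if k != j)
                 for i in range(n)]
            b[j] = soft_threshold(dot(xj, r) / n, lam) / (dot(xj, xj) / n)
    return b


def ridge_2d(cols, y, lam):
    """Closed-form solve of (X^T X / n + lam * I) b = X^T y / n, two predictors."""
    n = len(y)
    x1, x2 = cols
    a11, a12, a22 = dot(x1, x1) / n + lam, dot(x1, x2) / n, dot(x2, x2) / n + lam
    g1, g2 = dot(x1, y) / n, dot(x2, y) / n
    det = a11 * a22 - a12 * a12
    return [(g1 * a22 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det]


# Toy data (illustrative only): x2 is a noisy copy of x1, and y = 2 * x1.
x1 = center([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = center([1.1, 1.9, 3.2, 3.8, 5.1, 5.9])
y = [2.0 * a for a in x1]
lam = 0.1  # arbitrary penalty strength chosen for the demonstration

lasso_coefs = lasso_cd([x1, x2], y, lam)
ridge_coefs = ridge_2d([x1, x2], y, lam)
print("LASSO:", lasso_coefs)
print("Ridge:", ridge_coefs)
```

In practice a library implementation with cross-validated penalty selection would normally be used; the hand-rolled version only keeps the example dependency-free.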