When cities are understood as complex systems, sustainable urban planning depends on reliable high-resolution data, for example on the building stock, to scale up region-wide retrofit policies. For some cities and regions, these data exist in detailed 3D models based on real-world measurements. However, such models remain expensive to build and maintain, which is a significant challenge, especially for the small and medium-sized cities that are home to the majority of the European population. New methods are needed to estimate relevant building stock characteristics reliably and cost-effectively. Here, we present a machine-learning method for predicting building heights that relies only on open-access geospatial data on urban form, such as building footprints and street networks. The method makes it possible to predict building heights for regions where no dedicated 3D models currently exist. We train our model using building data from four European countries (France, Italy, the Netherlands, and Germany) and find that the morphology of the urban fabric surrounding a given building is highly predictive of the height of the building. A test on the German state of Brandenburg shows that our model predicts building heights with an average error well below the typical floor height (about 2.5 m), without having access to training data from Germany. Furthermore, we show that even a small amount of local height data collected by citizens substantially improves the prediction accuracy. Our results illustrate the possibility of predicting missing data on urban infrastructure; they also underline the value of open government data and volunteered geographic information for scientific applications, such as contextual but scalable strategies to mitigate climate change.
Capturing complex dependence structures between outcome variables (e.g., study endpoints) is of high relevance in contemporary biomedical data problems and medical research. Distributional copula regression provides a flexible tool to model the joint distribution of multiple outcome variables by disentangling the marginal response distributions and their dependence structure. In a regression setup, each parameter of the copula model, that is, the marginal distribution parameters and the copula dependence parameters, can be related to covariates via structured additive predictors. We propose a framework to fit distributional copula regression models via model-based boosting, a modern estimation technique that offers useful features such as an intrinsic variable selection mechanism, parameter shrinkage, and the capability to fit regression models in high-dimensional data settings, that is, situations with more covariates than observations. Thus, model-based boosting not only complements existing Bayesian and maximum-likelihood based estimation frameworks for this model class but also offers unique intrinsic mechanisms that can be helpful in many applied problems. The performance of our boosting algorithm for copula regression models with continuous margins is evaluated in simulation studies that cover low- and high-dimensional data settings and situations with and without dependence between the responses. Moreover, distributional copula boosting is used to jointly analyze and predict the length and the weight of newborns conditional on sonographic measurements of the fetus before delivery together with other clinical variables.
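The intrinsic variable selection and shrinkage mentioned above can be illustrated with a minimal sketch of component-wise L2 boosting. This is not the authors' copula boosting algorithm, which would update several distribution parameters under a likelihood-based loss; it assumes a single mean parameter, squared-error loss, and simple linear base-learners without intercept on roughly centered covariates. The function name `componentwise_l2_boost` and its parameters `n_steps` and `nu` are hypothetical names chosen for illustration.

```python
def componentwise_l2_boost(X, y, n_steps=100, nu=0.1):
    """Illustrative component-wise L2 boosting (assumed setup, not the paper's method).

    X: list of rows, each a list of p covariate values (assumed roughly centered).
    y: list of responses.
    Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    intercept = sum(y) / n            # offset: mean of the response
    coef = [0.0] * p                  # all covariates start unselected
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    fitted = [intercept] * n

    for _ in range(n_steps):
        # Negative gradient of the squared-error loss is the residual vector.
        resid = [y[i] - fitted[i] for i in range(n)]

        # Fit every base-learner to the residuals and keep the best one.
        best_j, best_beta, best_gain = None, 0.0, -1.0
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            xr = sum(X[i][j] * resid[i] for i in range(n))
            beta = xr / col_ss[j]     # least-squares slope for covariate j
            gain = beta * xr          # reduction in residual sum of squares
            if gain > best_gain:
                best_j, best_beta, best_gain = j, beta, gain
        if best_j is None:
            break

        # Update only the selected covariate, damped by the step length nu.
        coef[best_j] += nu * best_beta
        for i in range(n):
            fitted[i] += nu * best_beta * X[i][best_j]

    return intercept, coef


# Toy data: three mutually orthogonal covariates, response driven by the first only.
X_demo = [[1.0, 1.0, 1.0], [-1.0, 1.0, -1.0], [1.0, -1.0, -1.0], [-1.0, -1.0, 1.0]]
y_demo = [2.0, -2.0, 2.0, -2.0]
print(componentwise_l2_boost(X_demo, y_demo))
```

Because only one covariate is updated per iteration, covariates that are never selected keep a coefficient of exactly zero, and stopping early leaves the selected coefficients shrunken toward zero. The same loop also runs when there are more covariates than observations, since each base-learner is fitted separately.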