We create a hedonic price model for house prices for six geographical submarkets in the Netherlands. Our model is based on a recent data-mining technique called boosting. Boosting is an ensemble technique that combines multiple models, in our case decision trees, into a combined prediction. Boosting can capture complex nonlinear relationships and interaction effects between input variables. We report mean relative errors and mean absolute errors for all regions and compare our models with a standard linear regression approach. Our model improves prediction performance by up to 39% compared with linear regression and by up to 20% compared with a log-linear regression model. Next, we interpret the boosted models: we determine the most influential characteristics and graphically depict the relationship between the most important input variables and the house price. We find the size of the house to be the most important input for all but one region, and find some interesting nonlinear relationships between inputs and price. Finally, we construct hedonic price indices and compare these with the mean and median index, and find that these indices differ notably in the urban regions of Amsterdam and Rotterdam.

... and Rubinfeld (1978), for example, use a hedonic model to find a relationship between air pollution and house prices.

The third way in which the hedonic model can be useful is in creating a hedonic price index. A hedonic price index uses a hedonic model to correct for quality differences over time. Ordinary indices may give a deceptive view, because the average or median product in year t may be a better (or worse) product than in year t − 1. An average house in the 1930s, for instance, can in no way be compared with an average house sold in the year 2004, since these have different characteristics; nonetheless, this is what a regular price index does. Hedonic indices for housing are, for instance, constructed by Wallace (1996) and Clapp (2004). Also, two leading price indices in the UK, namely the Halifax house price index and the Nationwide house price index, use a hedonic technique developed by Nellis (1984, 1985).

Traditional hedonic models have the advantage that they are easy to interpret and estimate, but they often suffer from misspecification: the assumptions made on functional form do not allow a good representation of reality, i.e. the hedonic model does not fit the data well. Several researchers have used semi- and non-parametric methods to estimate a hedonic price model and compared these models with the traditional parametric hedonic models. Usually, these new models outperformed the parametric models in terms of prediction performance. Also, artificial neural nets, a popular machine-learning technique, are frequently used for the estimation of the hedonic function (e.g. Daniels and Kamp, 1999; Kershaw and Rossini, 1999; Limsombunchai et al., 2004). An artificial neural net is a very flexible model, which in theory is a universal function approximator.

A recent successful machine-lear...
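The idea of boosting decision trees for a hedonic model, and of comparing it with linear regression by mean relative error, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the data are synthetic (a hypothetical price function with a U-shaped age effect), the trees are single-split stumps, the loss is squared error, and all function names are invented here.

```python
import numpy as np


def fit_stump(X, residual, n_thresholds=19):
    """Return (feature, threshold, left_value, right_value) minimising squared error."""
    best = None
    quantiles = np.linspace(0.05, 0.95, n_thresholds)
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], quantiles):  # candidate split points
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue  # skip splits that leave one side empty
            lv, rv = residual[left].mean(), residual[~left].mean()
            sse = ((residual[left] - lv) ** 2).sum() + ((residual[~left] - rv) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]  # assumes at least one feature is not constant


def predict_stump(stump, X):
    j, t, lv, rv = stump
    return np.where(X[:, j] <= t, lv, rv)


def fit_boosting(X, y, n_rounds=200, learning_rate=0.1):
    """Least-squares boosting: each stump is fitted to the current residuals."""
    base = y.mean()
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y - pred)
        pred = pred + learning_rate * predict_stump(stump, X)
        stumps.append(stump)
    return base, learning_rate, stumps


def predict_boosting(model, X):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, X) for s in stumps)


def mean_relative_error(y_true, y_pred):
    return np.mean(np.abs(y_pred - y_true) / y_true)


def make_data(n, rng):
    """Synthetic houses (hypothetical): price is linear in size, U-shaped in age."""
    size = rng.uniform(40, 200, n)
    age = rng.uniform(0, 80, n)
    price = 150_000 + 1_000 * size + 60 * (age - 40) ** 2 + rng.normal(0, 5_000, n)
    return np.column_stack([size, age]), price


def demo(seed=0):
    rng = np.random.default_rng(seed)
    X_train, y_train = make_data(200, rng)
    X_test, y_test = make_data(200, rng)

    # Baseline: ordinary least squares with an intercept.
    A_train = np.column_stack([np.ones(len(y_train)), X_train])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    linear_pred = np.column_stack([np.ones(len(y_test)), X_test]) @ coef

    boosted_pred = predict_boosting(fit_boosting(X_train, y_train), X_test)
    return (mean_relative_error(y_test, linear_pred),
            mean_relative_error(y_test, boosted_pred))


mre_linear, mre_boosted = demo()
print(f"linear MRE:  {mre_linear:.3f}")
print(f"boosted MRE: {mre_boosted:.3f}")
```

On this synthetic example the linear baseline cannot represent the U-shaped age effect, while the sum of many small stumps can approximate it. The size of the gap depends entirely on the assumed price function and says nothing about the improvements reported for the Dutch submarkets.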
In content- and knowledge-based recommender systems, a measure of (dis)similarity between items is often used. Frequently, this measure is based on the attributes of the items. However, which attributes are important for the users of the system remains an important question to answer. In this paper, we present an approach to determine attribute weights in a dissimilarity measure using clickstream data of an e-commerce website. We count how many times each product is sold and, based on these counts, estimate a Poisson regression model. Estimates of this model are then used to determine the attribute weights in the dissimilarity measure. We show an application of this approach on a product catalog of MP3 players provided by Compare Group, owner of the Dutch price comparison site http://www.vergelijk.nl, and show how the dissimilarity measure can be used to improve 2D product catalog visualizations.
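The pipeline described above (count sales per product, fit a Poisson regression on product attributes, turn the estimates into attribute weights) can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the attributes are assumed to be standardised numeric columns, the weights are taken as normalised absolute coefficients, and the dissimilarity is a weighted Euclidean distance. The abstract does not specify how estimates are mapped to weights or which dissimilarity measure is used, so those choices and all function names are hypothetical.

```python
import numpy as np


def fit_poisson(X, y, n_iter=50, tol=1e-8):
    """Poisson regression with log link, fitted by iteratively reweighted least squares."""
    A = np.column_stack([np.ones(len(y)), X])  # add intercept column
    beta = np.zeros(A.shape[1])
    beta[0] = np.log(y.mean())                 # start at the mean sales count
    for _ in range(n_iter):
        eta = np.clip(A @ beta, -30, 30)       # guard against overflow in exp
        mu = np.exp(eta)
        z = eta + (y - mu) / mu                # working response
        new_beta = np.linalg.solve(A.T @ (A * mu[:, None]), A.T @ (mu * z))
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


def attribute_weights(beta):
    """Assumed rule: weight = absolute coefficient, normalised to sum to one."""
    magnitude = np.abs(beta[1:])               # drop the intercept
    return magnitude / magnitude.sum()


def dissimilarity(a, b, weights):
    """Weighted Euclidean distance between two attribute vectors (an assumption)."""
    return float(np.sqrt(np.sum(weights * (a - b) ** 2)))


# Synthetic catalog: 500 products, 3 standardised attributes (hypothetical).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
true_beta = np.array([0.8, 0.0, 0.3])          # attribute 0 drives sales most
sales = rng.poisson(np.exp(1.0 + X @ true_beta))

beta_hat = fit_poisson(X, sales)
weights = attribute_weights(beta_hat)
print("estimated coefficients:", np.round(beta_hat, 2))
print("attribute weights:", np.round(weights, 2))
print("dissimilarity of products 0 and 1:", round(dissimilarity(X[0], X[1], weights), 3))
```

With weights of this kind, differences on attributes that are strongly related to sales contribute more to the dissimilarity, which in turn changes how far apart products are placed in a 2D catalog map.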
Traditionally, recommender systems present recommendations in lists to the user. In content- and knowledge-based recommendation systems, these lists are often sorted on some notion of similarity with a query, ideal product specification, or sample product. However, a lot of information is lost in this way, since two products that are equally similar to the query can differ from it on completely different sets of product characteristics. When using a two-dimensional, that is, a map-based, representation of the recommendations, it is possible to retain this information. In the map we can then position recommendations that are similar to each other in the same area of the map.

Both in science and industry, an increasing number of two-dimensional graphical interfaces have been introduced over the last years. However, some of them lack a sound scientific foundation, while other approaches are not applicable in a recommendation setting. In our chapter, we describe a framework which has a solid scientific foundation (using state-of-the-art statistical models) and is specifically designed to work with e-commerce product catalogs. The basis of the framework is the Product Catalog Map interface, based on multidimensional scaling. Also, we show another type of interface based on nonlinear principal components analysis, which provides an easy way to constrain the space based on specific characteristic values. Then, we discuss some advanced issues. Firstly, we discuss how the product catalog interface can be adapted to better fit the users' notion of importance of attributes using clickstream analysis. Secondly, we show a user interface that combines recommendation by proposing with the map-based approach. Finally, we show how these methods can be applied to a real e-commerce product catalog of MP3 players.

Free keywords: map-based interface, multidimensional scaling, nonlinear principal components analysis, recommender systems, dissimilarity measure

Availability: The ERIM Report Series is distributed through the following platforms: