A fully automatic procedure for the construction of histograms is proposed. It consists of constructing both a regular and an irregular histogram and then choosing between the two. For the regular histogram, only the number of bins has to be chosen. Irregular histograms can be constructed using a dynamic programming algorithm if the number of bins is known. To choose the number of bins, two different penalties motivated by recent work in model selection are proposed. A complete description of the algorithm and a proper tuning of the penalties are given. Finally, different versions of the procedure are compared to other existing proposals for a wide range of densities and sample sizes. In the simulations, the squared Hellinger risk of the procedure that chooses between regular and irregular histograms is always at most twice as large as the risk of the best of the other methods. The procedure is implemented in an R package.
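The dynamic programming step for a fixed number of bins can be illustrated with a minimal Python sketch. It is not the implementation from the R package. It assumes that candidate breakpoints lie on a regular grid over the data range and that the criterion is the histogram log-likelihood; the abstract specifies neither. The function name `irregular_histogram` and the parameter `grid_size` are hypothetical, and the penalties used to choose the number of bins are not included.

```python
import bisect
import math


def irregular_histogram(data, n_bins, grid_size=50):
    """Return (breakpoints, log_likelihood) of the maximum-likelihood
    histogram with exactly n_bins bins.

    Assumption: breakpoints are restricted to a regular grid of
    grid_size cells spanning [min(data), max(data)].
    """
    xs = sorted(data)
    n = len(xs)
    lo, hi = xs[0], xs[-1]
    if hi <= lo:
        raise ValueError("data must contain at least two distinct values")
    if not 1 <= n_bins <= grid_size:
        raise ValueError("need 1 <= n_bins <= grid_size")

    edges = [lo + (hi - lo) * k / grid_size for k in range(grid_size + 1)]
    edges[-1] = hi
    # cum[k] = number of observations strictly below edges[k];
    # the last entry is forced to n so the maximum is counted.
    cum = [bisect.bisect_left(xs, e) for e in edges]
    cum[-1] = n

    def bin_loglik(i, j):
        """Log-likelihood contribution of one bin from edges[i] to edges[j]."""
        count = cum[j] - cum[i]
        if count == 0:
            return 0.0
        width = edges[j] - edges[i]
        return count * math.log(count / (n * width))

    neg_inf = float("-inf")
    # best[d][j]: best log-likelihood when edges[0..j] is split into d bins.
    best = [[neg_inf] * (grid_size + 1) for _ in range(n_bins + 1)]
    prev = [[-1] * (grid_size + 1) for _ in range(n_bins + 1)]
    best[0][0] = 0.0
    for d in range(1, n_bins + 1):
        for j in range(d, grid_size + 1):
            for i in range(d - 1, j):
                cand = best[d - 1][i] + bin_loglik(i, j)
                if cand > best[d][j]:
                    best[d][j] = cand
                    prev[d][j] = i

    # Backtrack from the right end to recover the chosen edge indices.
    idx = [grid_size]
    j = grid_size
    for d in range(n_bins, 0, -1):
        j = prev[d][j]
        idx.append(j)
    idx.reverse()
    return [edges[k] for k in idx], best[n_bins][grid_size]


# Toy data: 80 points packed into [0, 0.2) and 20 spread over (0.2, 1].
sample = [0.199 * k / 79 for k in range(80)] + [0.201 + 0.799 * k / 19 for k in range(20)]
breaks, loglik = irregular_histogram(sample, n_bins=2, grid_size=10)
print(breaks, loglik)
```

The triple loop costs on the order of `n_bins * grid_size**2` evaluations, which is why the number of bins is treated as known at this stage. Under these assumptions, selecting the number of bins would amount to running the recursion for each candidate value and comparing penalized log-likelihoods.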
We consider data consisting of photon counts of diffracted X-ray radiation as a function of the angle of diffraction. The problem is to determine the positions, powers and shapes of the relevant peaks. An additional difficulty is that the power of the peaks is to be measured from a baseline which itself must be identified. Most methods of denoising data of this kind do not explicitly take into account the modality of the final estimate. The residual-based procedure we propose uses the so-called taut string method, which minimizes the number of peaks subject to a tube constraint on the integrated data. The baseline is identified by combining the result of the taut string with an estimate of the first derivative of the baseline obtained using a weighted smoothing spline. Finally, each individual peak is expressed as a finite sum of kernels chosen from a parametric family.
This article describes the benchden package, which implements a set of 28 example densities for nonparametric density estimation in R. In addition to the usual functions that evaluate the density, distribution and quantile functions or generate random variates, a function designed specifically for larger simulation studies has been added. After describing the set of densities and the usage of the package, a small toy example of a simulation study conducted using the benchden package is given.
In the age of digitalization, customer and consumer data have become a valuable source of information for companies. However, to obtain these data, companies depend on people's willingness to share (WTS) their private data with them. By means of a large-scale online experiment with more than 20,000 participants, we investigated the extent to which people's WTS private data is affected by contextual factors. We complement and extend previous research by (i) simultaneously addressing several contextual factors that companies can largely control themselves, (ii) comparing their relative impacts on WTS, and (iii) explicitly examining interactions between these contextual factors in addition to their specific univariate effects. Concretely, we investigate contextual factors such as the type of data requested, the purpose for which the data are used, the industry sector to which the company belongs, the type of compensation offered for the shared data, and the degree to which the data allow for personal identification. Our data suggest that all these factors significantly affect people's WTS, and that there are also multiple significant interaction effects between them. For instance, we found that a better intuitive match between a company's core business and the type of data requested results in a higher proportion of people who are willing to share those data with that company. Hence, companies may benefit from tuning their requests for consumer or customer data according to the specific context in which they operate.