With the coming data deluge from synoptic surveys, there is a growing need for frameworks that can quickly and automatically produce calibrated classification probabilities for newly-observed variables based on a small number of time-series measurements. In this paper, we introduce a methodology for variable-star classification, drawing from modern machine-learning techniques. We describe how to homogenize the information gleaned from light curves by selection and computation of real-numbered metrics (features), detail methods to robustly estimate periodic light-curve features, introduce tree-ensemble methods for accurate variable-star classification, and show how to rigorously evaluate the classification results using cross validation. On a 25-class data set of 1542 well-studied variable stars, we achieve a 22.8% overall classification error using the random forest classifier; this represents a 24% improvement over the best previous classifier on these data. This methodology is effective for identifying samples of specific science classes: for pulsational variables used in Milky Way tomography we obtain a discovery efficiency of 98.2% and for eclipsing systems we find an efficiency of 99.1%, both at 95% purity. We show that the random forest (RF) classifier is superior to other machine-learned methods in terms of accuracy, speed, and relative immunity to features with no useful class information; the RF classifier can also be used to estimate the importance of each feature in classification. Additionally, we present the first astronomical use of hierarchical classification methods to incorporate a known class taxonomy in the classifier, which further reduces the catastrophic error rate to 7.8%. Excluding low-amplitude sources, our overall error rate improves to 14%, with a catastrophic error rate of 3.5%.
5 High-precision photometry missions (Kepler, MOST, CoRoT, etc.) are already challenging the theoretical understanding of the origin of variability and the connection of some specific sources to established classes of variables.
6 General Catalog of Variable Stars, http://www.sai.msu.su/groups/cluster/gcvs/gcvs/
7 Not discussed herein are the challenges associated with discovery of variability. See Shin et al. (2009) for a review.
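The abstract above quotes discovery efficiencies "at 95% purity". As a rough illustration of what such a figure means, the sketch below ranks sources by a classifier score, finds the largest selection whose purity (fraction of selected sources that truly belong to the class) still meets a target, and reports the efficiency (fraction of true class members recovered) at that cut. The function name `efficiency_at_purity` and the toy scores are hypothetical; this is not the authors' procedure, only a minimal sketch of the two definitions.

```python
from typing import Sequence, Tuple


def efficiency_at_purity(
    scores: Sequence[float],
    is_member: Sequence[bool],
    target_purity: float,
) -> Tuple[float, float]:
    """Return (efficiency, threshold) for the largest score-ranked selection
    whose purity is at least target_purity.

    purity     = true members selected / all sources selected
    efficiency = true members selected / all true members
    Returns (0.0, inf) if no selection reaches the target purity.
    """
    total_members = sum(is_member)
    if total_members == 0:
        raise ValueError("no true members of the class")

    # Rank sources from most to least confident.
    ranked = sorted(zip(scores, is_member), key=lambda p: p[0], reverse=True)

    best_efficiency, best_threshold = 0.0, float("inf")
    selected = 0
    true_positives = 0
    for i, (score, member) in enumerate(ranked):
        selected += 1
        true_positives += int(member)
        # A threshold cannot split tied scores, so only evaluate after a tie group.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        if true_positives / selected >= target_purity:
            best_efficiency = true_positives / total_members
            best_threshold = score
    return best_efficiency, best_threshold


# Toy data (made up for illustration): six sources, four true members.
toy_scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4]
toy_labels = [True, True, False, True, True, False]
strict = efficiency_at_purity(toy_scores, toy_labels, 0.9)
loose = efficiency_at_purity(toy_scores, toy_labels, 0.75)
```

With the toy data, demanding 90% purity keeps only the two highest-scoring sources, while relaxing the target to 75% admits one contaminant and recovers all four members.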
The pace of online shopping revenue growth means it is important for retailers and manufacturers to understand how consumers behave online compared with their behaviour in brick-and-mortar stores. We conducted a study in which the detailed behaviour of 40 shoppers was screen recorded while they each undertook an online shopping 'trip'. The shopping trip comprised purchasing a basket of 12 commonly bought grocery categories at one of two major retailers. The shoppers were all inexperienced in online grocery shopping. Results show that online grocery shopping is fast, even for these consumers who were new to it: half of the online shoppers spent less than 10 seconds purchasing from a category. This result is very similar to that of past studies in physical stores. Indeed, half of all the 12-item shopping trips took less than 10 minutes. Also, most purchases were made from the first category page displayed in the retailer's online store. Shoppers also consistently used the default display options chosen by the retailers but used a combination of navigational tools to find their products. We conclude that online shoppers do not behave differently from those offline in terms of time spent or effort expended. Online shopping, in the grocery context at least, seems to primarily reflect a desire for time efficiency on the part of the shopper. In that regard, online shopping seems very similar to in-store shopping. The study begins the job of documenting shopper behaviour in this new channel and provides practical knowledge for retailers and manufacturers.