Summary: The need for new methods to deal with big data is a common theme in most scientific fields, although its definition tends to vary with the context. Statistical ideas are an essential part of this, and as a partial response, a thematic program on statistical inference, learning and models in big data was held in 2015 in Canada, under the general direction of the Canadian Statistical Sciences Institute, with major funding from, and most activities located at, the Fields Institute for Research in Mathematical Sciences. This paper gives an overview of the topics covered, describing challenges and strategies that seem common to many different areas of application and including some examples of applications to make these challenges and strategies more concrete.
Background: In 2018, Health Canada, the federal department responsible for public health, proposed regulations requiring a “High in” front-of-package label (FOPL) on foods that exceed predetermined thresholds for sodium, sugars, or saturated fat. This study evaluated the efficacy of the proposed FOPL as a quick and easy tool for making food choices that support reduction in the intakes of these nutrients. Methods: Consumers (n = 625) of varying health literacy (HL) levels were assigned to control (current labeling with no FOPL) or one of four FOPL designs. They completed six shopping tasks, designed to control for internal motivations. Efficacy was measured by correct product selection and response time (seconds) to make food choices, using repeated measures statistical modeling and adjusting for HL, task type, and task order. Eye-tracking and structured interviews were used to gather additional insights about participants’ choices. Results: Overall, FOPL was significantly more effective than current labeling at helping consumers of varying HL levels to identify foods high in nutrients of concern and make healthier food choices. All FOPL were equally effective. Conclusions: “High in” FOPL can be effective at helping Canadians of varying HL levels make more informed food choices in relation to sugars, sodium, and saturated fat.
In this paper, we consider an estimation problem of the regression coefficients in multiple regression models with several unknown change-points. Under some realistic assumptions, we propose a class of estimators which includes as special cases shrinkage estimators (SEs) as well as the unrestricted estimator (UE) and the restricted estimator (RE). We also derive a more general condition for the SEs to dominate the UE. To this end, we generalize some identities for the evaluation of the bias and risk functions of shrinkage-type estimators. As an illustrative example, our method is applied to the "gross domestic product" data set of 10 countries, including the USA, Canada, the UK, France and Germany. The simulation results corroborate our theoretical findings.
In this paper, we consider an estimation problem of the matrix of the regression coefficients in multivariate regression models with unknown change‐points. More precisely, we consider the case where the target parameter satisfies an uncertain linear restriction. Under general conditions, we propose a class of estimators that includes as special cases shrinkage estimators (SEs) and both the unrestricted and restricted estimators. We also derive a more general condition for the SEs to dominate the unrestricted estimator. To this end, we extend some results underlying the multidimensional version of the mixingale central limit theorem as well as some important identities for deriving the risk function of SEs. Finally, we present some simulation studies that corroborate the theoretical findings.