In the U.S. Forest Service's Forest Inventory and Analysis (FIA) program, as in other natural resource surveys, many auxiliary variables are available for use in model-assisted inference about finite population parameters. Some of this auxiliary information may be extraneous, and therefore model selection is appropriate to improve the efficiency of the survey regression estimators of finite population totals. A model-assisted survey regression estimator using the lasso is presented and extended to the adaptive lasso. For a sequence of finite populations and probability sampling designs, asymptotic properties of the lasso survey regression estimator are derived, including design consistency and central limit theory for the estimator and design consistency of a variance estimator. To estimate multiple finite population quantities with the method, lasso survey regression weights are developed, using both a model calibration approach and a ridge regression approximation. The gains in efficiency of the lasso estimator over the full regression estimator are demonstrated through a simulation study estimating tree canopy cover for a region in Utah.
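The general shape of a lasso-assisted survey regression estimator can be written out as follows. This is only a sketch in assumed notation, none of which appears in the abstract: $U$ is the finite population, $s$ the sample, $\pi_i$ the first-order inclusion probabilities, $\mathbf{x}_i$ the auxiliary vector, and $\lambda \ge 0$ the penalty parameter. Details such as the treatment of the intercept are not specified here.

```latex
% Design-weighted lasso coefficients (assumed notation)
\hat{\boldsymbol{\beta}}_{s}
  = \arg\min_{\boldsymbol{\beta}}
    \sum_{i \in s} \frac{\left(y_i - \mathbf{x}_i^{\top}\boldsymbol{\beta}\right)^{2}}{\pi_i}
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert

% Model-assisted estimator of the population total t_y = \sum_{i \in U} y_i
\hat{t}_{y,\mathrm{lasso}}
  = \sum_{i \in U} \mathbf{x}_i^{\top}\hat{\boldsymbol{\beta}}_{s}
    + \sum_{i \in s} \frac{y_i - \mathbf{x}_i^{\top}\hat{\boldsymbol{\beta}}_{s}}{\pi_i}
```

The first sum predicts every population unit from the auxiliary data; the second is a design-weighted correction based on the sample residuals. Setting $\lambda = 0$ recovers the usual (full) survey regression estimator.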
National forest inventories in many countries combine expensive ground plot data with remotely sensed information to improve precision in estimators of forest parameters. A simple post-stratified estimator is often the tool of choice because it has known statistical properties, is easy to implement, and is intuitive to the many users of inventory data. Because of the increased availability of remotely sensed data with improved spatial, temporal, and thematic resolutions, there is a need to equip the inventory community with a more diverse array of statistical estimators. Focusing on generalized regression estimators, we step the reader through seven estimators: Horvitz-Thompson, ratio, post-stratification, regression, lasso, ridge, and elastic net. Using forest inventory data from Daggett County in Utah, USA as an example, we illustrate how to construct, as well as compare the relative performance of, these estimators. Augmented by simulations, we also show how the standard variance estimator suffers from greater negative bias than the bootstrap variance estimator, especially as the size of the assisting model grows. Each estimator is made readily accessible through the new R package mase. We conclude with guidelines in the form of a decision tree on when to use which estimator in forest inventory applications.
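As a rough illustration of the two simplest estimators in that list, the following Python sketch computes a Horvitz-Thompson total and a ratio-estimator total. It does not reproduce the mase package; the function names and the toy data are hypothetical, and the inclusion probabilities are assumed known.

```python
def horvitz_thompson_total(y, pi):
    """Horvitz-Thompson estimate of a population total.

    y  : sampled values of the study variable
    pi : first-order inclusion probabilities of the sampled units
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    # Each sampled unit is weighted by the inverse of its inclusion probability.
    return sum(yi / p for yi, p in zip(y, pi))


def ratio_total(y, x, pi, x_total):
    """Ratio estimate of the total of y, using a known population total of x."""
    t_y = horvitz_thompson_total(y, pi)
    t_x = horvitz_thompson_total(x, pi)
    if t_x == 0:
        raise ValueError("estimated auxiliary total is zero")
    # Scale the known auxiliary total by the estimated ratio t_y / t_x.
    return x_total * t_y / t_x


# Hypothetical toy sample: three units, each sampled with probability 0.5.
y = [10.0, 20.0, 30.0]
x = [1.0, 2.0, 3.0]
pi = [0.5, 0.5, 0.5]

ht = horvitz_thompson_total(y, pi)
ratio = ratio_total(y, x, pi, x_total=15.0)
print(ht, ratio)
```

The ratio estimator differs from the Horvitz-Thompson estimate only when the estimated auxiliary total disagrees with the known one, which is where the auxiliary information contributes.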
Undergraduate research experiences (UREs), whether within the context of a mentor-mentee experience or a classroom framework, represent an excellent opportunity to expose students to the independent scholarship model. The high impact of undergraduate research has received recent attention in the context of STEM disciplines. Reflecting a 2017 survey of statistics faculty, this article examines the perceived benefits of UREs, as well as barriers to the incorporation of UREs, specifically within the field of statistics. Viewpoints of students, faculty mentors, and institutions are investigated. Further, the article offers several strategies for leveraging characteristics unique to the field of statistics to overcome barriers and thereby provide greater opportunity for undergraduate statistics students to gain research experience.
Despite having desirable properties, model‐assisted estimators are rarely used in anything but their simplest form to produce official statistics. This is because the more complicated models are often ill suited to the available auxiliary data. Under a model‐assisted framework, we propose a regression tree estimator for a finite‐population total. Regression tree models are adept at handling the type of auxiliary data usually available in the sampling frame and provide a model that is easy to explain and justify. The estimator can be viewed as a post‐stratification estimator where the post‐strata are automatically selected by the recursive partitioning algorithm of the regression tree. We establish consistency of the regression tree estimator and a variance estimator, along with asymptotic normality of the regression tree estimator. We compare the performance of our estimator to other survey estimators using the United States Bureau of Labor Statistics Occupational Employment Statistics Survey data.
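The post-stratification view can be sketched in Python as below. This is a minimal illustration, not the authors' implementation: the post-strata labels are supplied by hand as a stand-in for the leaves a fitted regression tree would produce, and the function name, the labels, and the population counts are all hypothetical.

```python
from collections import defaultdict


def post_stratified_total(y, pi, strata, stratum_sizes):
    """Post-stratified estimate of a population total.

    y             : sampled values of the study variable
    pi            : first-order inclusion probabilities
    strata        : post-stratum label of each sampled unit (e.g. a tree leaf)
    stratum_sizes : dict mapping each label to its known population count N_h
    """
    weighted_sum = defaultdict(float)    # sum of y_i / pi_i within each stratum
    weighted_count = defaultdict(float)  # sum of 1 / pi_i within each stratum
    for yi, p, h in zip(y, pi, strata):
        weighted_sum[h] += yi / p
        weighted_count[h] += 1.0 / p

    total = 0.0
    for h, n_pop in stratum_sizes.items():
        if weighted_count.get(h, 0.0) == 0.0:
            raise ValueError(f"no sampled units in post-stratum {h!r}")
        # N_h times the design-weighted sample mean of stratum h.
        total += n_pop * weighted_sum[h] / weighted_count[h]
    return total


# Hypothetical toy sample: two post-strata, labels standing in for tree leaves.
y = [2.0, 4.0, 10.0, 14.0]
pi = [0.5, 0.5, 0.5, 0.5]
leaves = ["a", "a", "b", "b"]
sizes = {"a": 6, "b": 4}

estimate = post_stratified_total(y, pi, leaves, sizes)
print(estimate)
```

In the regression tree setting the labels would come from recursive partitioning on the auxiliary data rather than being fixed in advance; the estimator applied within the resulting groups has the same form.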
The total of a study variable in a finite population may be estimated using data from a complex survey via Horvitz-Thompson estimation. If additional auxiliary information is available, then efficiency is often improved via model-assisted survey regression estimation. Semiparametric models based on penalised spline regression are particularly attractive in this context, as they lead to natural extensions of classical survey regression estimators. Existing theory for the model-assisted penalised spline regression estimator does not account for the setting in which the number of knots is large relative to sample size. This gap is addressed by considering survey design asymptotics for the model-assisted penalised spline survey regression estimator, as the finite population size, sample size, and number of knots all increase to infinity. Conditions on the sequence of designs are developed under which the estimator is consistent for the finite population total and its variance is consistently estimated.