It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid "post-selection inference" by reducing the problem to one of simultaneous inference and hence suitably widening conventional confidence and retention intervals. Simultaneity is required for all linear functions that arise as coefficient estimates in all submodels. By purchasing "simultaneity insurance" for all possible submodels, the resulting post-selection inference is rendered universally valid under all possible model selection procedures. This inference is therefore generally conservative for particular selection procedures, but it is always less conservative than full Scheffé protection. Importantly, it does not depend on the truth of the selected submodel, and hence it produces valid inference even in wrong models. We describe the structure of the simultaneous inference problem and give some asymptotic results. Comment: Published at http://dx.doi.org/10.1214/12-AOS1077 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
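To make the idea of simultaneity over all submodels concrete, here is a minimal Monte Carlo sketch rather than the authors' procedure. It assumes a small design matrix, a known error standard deviation of 1 (so z-statistics stand in for t-statistics), and brute-force enumeration of every submodel. The function names `posi_directions` and `simulate_constants` are hypothetical. For each submodel M and each predictor j in M, the coefficient estimate is a linear function of the response, given by the part of column j orthogonal to the other columns in M. The simulated quantile of the largest absolute z-statistic over all such pairs plays the role of the widened interval multiplier, and it is compared with a single-coefficient quantile and a Scheffé-type quantile.

```python
import itertools
import numpy as np


def posi_directions(X):
    """Return one unit vector per (submodel M, predictor j in M).

    Each vector is column j of X adjusted for the other columns in M,
    scaled to unit length; its inner product with unit-variance noise is
    the z-statistic of coefficient j in submodel M under a zero mean.
    """
    n, p = X.shape
    dirs = []
    for size in range(1, p + 1):
        for M in itertools.combinations(range(p), size):
            for j in M:
                others = [k for k in M if k != j]
                xj = X[:, j]
                if others:
                    # Residual of column j after least-squares regression
                    # on the remaining columns of the submodel.
                    coef, *_ = np.linalg.lstsq(X[:, others], xj, rcond=None)
                    xj = xj - X[:, others] @ coef
                dirs.append(xj / np.linalg.norm(xj))
    return np.array(dirs)


def simulate_constants(X, alpha=0.05, draws=20000, seed=0):
    """Simulate (1 - alpha) quantiles under the assumption sigma = 1 (known)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    L = posi_directions(X)
    Q, _ = np.linalg.qr(X)  # orthonormal basis of the column space of X
    eps = rng.standard_normal((draws, n))
    max_abs_z = np.abs(eps @ L.T).max(axis=1)       # all submodels at once
    scheffe_stat = np.linalg.norm(eps @ Q, axis=1)  # all linear contrasts
    single = np.abs(eps @ L[0])                     # one fixed coefficient
    q = 1.0 - alpha
    return {
        "single": float(np.quantile(single, q)),
        "posi": float(np.quantile(max_abs_z, q)),
        "scheffe": float(np.quantile(scheffe_stat, q)),
    }


if __name__ == "__main__":
    X_demo = np.random.default_rng(1).standard_normal((20, 4))
    print(simulate_constants(X_demo, draws=5000))
```

Because every direction is a unit vector in the column space of the design, the simulated multiplier for all submodels can never exceed the Scheffé-type one and never falls below the single-coefficient one, which mirrors the ordering described in the abstract. The enumeration visits p * 2^(p-1) pairs, so this brute-force approach is only practical for small p.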
To explore the genetic contribution to autistic spectrum disorders (ASDs), we have studied genomic copy-number variation in a large cohort of families with a single affected child and at least one unaffected sibling. We confirm a major contribution from de novo deletions and duplications but also find evidence of a role for inherited "ultrarare" duplications. Our results show that, relative to males, females have greater resistance to autism from genetic causes, which raises the question of the fate of female carriers. By analysis of the proportion and number of recurrent loci, we set a lower bound for distinct target loci at several hundred. We find many new candidate regions, adding substantially to the list of potential gene targets, and confirm several loci previously observed. The functions of the genes in the regions of de novo variation point to a great diversity of genetic causes but also suggest functional convergence.
We investigate parallel analysis (PA), a selection rule for the number-of-factors problem, from the point of view of permutation assessment. The idea of applying permutation test ideas to PA leads to a quasi-inferential, non-parametric version of PA which accounts not only for finite-sample bias but also for sampling variability. We give evidence, however, that quasi-inferential PA based on normal random variates (as opposed to data permutations) is surprisingly independent of distributional assumptions, and therefore enjoys certain non-parametric properties as well. This is a justification for providing tables for quasi-inferential PA. Based on permutation theory, we compare PA of principal components with PA of principal factor analysis and show that PA of principal factors may tend to select too many factors. We also apply parallel analysis to so-called resistant correlations and give evidence that this yields a slightly more conservative factor selection method. Finally, we apply PA to loadings and show how this provides benchmark values for loadings which are sensitive to the number of variables, number of subjects, and order of factors. These values therefore improve on conventional fixed thresholds such as 0.5 or 0.8 which are used irrespective of the size of the data.
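The permutation version of PA described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code. It assumes principal components of the correlation matrix, independent permutation of each column to break the dependence between variables, and an upper quantile of the permuted eigenvalues (95% here, an assumed default) as the threshold, so that sampling variability is taken into account rather than only the mean. The function name `permutation_parallel_analysis` and its parameters are hypothetical.

```python
import numpy as np


def permutation_parallel_analysis(data, n_perm=200, quantile=0.95, seed=0):
    """Count leading eigenvalues that exceed their permutation threshold.

    data: array of shape (n_subjects, n_variables).
    Returns (n_components, observed_eigenvalues, thresholds).
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape

    def sorted_eigenvalues(x):
        # Eigenvalues of the correlation matrix, largest first.
        return np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]

    observed = sorted_eigenvalues(data)

    null = np.empty((n_perm, p))
    for b in range(n_perm):
        # Permuting each column separately keeps every marginal distribution
        # intact while destroying the correlations between variables.
        permuted = np.column_stack(
            [rng.permutation(data[:, j]) for j in range(p)]
        )
        null[b] = sorted_eigenvalues(permuted)

    # One threshold per eigenvalue rank, taken across permutations.
    thresholds = np.quantile(null, quantile, axis=0)

    # Retain components until the first eigenvalue that fails its threshold.
    exceeds = observed > thresholds
    n_components = p if exceeds.all() else int(np.argmin(exceeds))
    return n_components, observed, thresholds


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    factors = rng.standard_normal((300, 2))
    loadings = np.zeros((2, 8))
    loadings[0, :4] = 0.8  # variables 0-3 load on the first factor
    loadings[1, 4:] = 0.8  # variables 4-7 load on the second factor
    sample = factors @ loadings + 0.6 * rng.standard_normal((300, 8))
    k, obs, thr = permutation_parallel_analysis(sample, n_perm=100)
    print(k, np.round(obs, 2), np.round(thr, 2))
```

Replacing the permuted columns with independent normal variates of the same shape would give the normal-variate variant mentioned in the abstract. Using the mean of the permuted eigenvalues instead of a quantile would correct for finite-sample bias only.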
Recurrent copy number variations (CNVs) of human 16p11.2 have been associated with a variety of developmental/neurocognitive syndromes. In particular, deletion of 16p11.2 is found in patients with autism, developmental delay, and obesity. Patients with deletions or duplications have a wide range of clinical features, and siblings carrying the same deletion often have diverse symptoms. To study the consequence of 16p11.2 CNVs in a systematic manner, we used chromosome engineering to generate mice harboring deletion of the chromosomal region corresponding to 16p11.2, as well as mice harboring the reciprocal duplication. These 16p11.2 CNV models have dosage-dependent changes in gene expression, viability, brain architecture, and behavior. For each phenotype, the consequence of the deletion is more severe than that of the duplication. Of particular note is that half of the 16p11.2 deletion mice die postnatally; those that survive to adulthood are healthy and fertile, but have alterations in the hypothalamus and exhibit a "behavior trap" phenotype, a specific behavior characteristic of rodents with lateral hypothalamic and nigrostriatal lesions. These findings indicate that 16p11.2 CNVs cause brain and behavioral anomalies, providing insight into human neurodevelopmental disorders.
Home-cage | stereotypic behavior | structural variation | brain MRI
Accumulating evidence suggests the importance of copy number variations (CNVs) in the etiology of neuropsychiatric disorders, including autism (1), schizophrenia (2-4), developmental delay (5), and other complex traits (6). The 16p11.2 region is particularly intriguing. Whereas deletion of 16p11.2 has been associated with autism (7-9), duplication of 16p11.2 has been associated with autism (9, 10) as well as schizophrenia (11). 16p11.2 CNVs have also been reported in patients with developmental delay, mental retardation, repetitive behaviors (12-16), and a highly penetrant form of obesity (17).
A reciprocal effect of 16p11.2 dosage on head size has been noted, as deletions are associated with large head size or macrocephaly, whereas duplications are associated with microcephaly (16). These studies reveal the variability of symptoms in patients carrying the same 16p11.2 CNV, an extreme example being a family with three affected members with symptoms so heterogeneous that they were barely overlapping (18).
Mouse models allow direct assessment of CNVs while reducing variability caused by genetic and environmental factors. We and others have previously used chromosome engineering (19) to model genetic alterations found in complex human diseases including cancer (20) and genomic disorders (21-24), allowing identification of the causative gene and elucidation of the mechanism involved (20, 25-27). Here we used a similar approach to generate mouse models with deletion and duplication corresponding to those found in patients with 16p11.2 CNVs. Because of the evidence for clinical heterogeneity, we screened these models for multiple changes in brain anatomy and behavior by usin...