Recently, gene set analysis (GSA) has been extended from gene expression data to single-nucleotide polymorphism (SNP) data in genome-wide association studies (GWAS). Demonstrations of GSA on SNP data have used two popular statistics from gene expression data analysis: gene set enrichment analysis (GSEA) and Fisher's exact test (FET). However, GSEA and FET have shown a lack of power and robustness in the analysis of gene expression data. The purpose of this work is to investigate whether the same issues also arise in the analysis of SNP data. Ultimately, we conclude that GSEA and FET are not optimal for the analysis of SNP data when compared with the SUMSTAT method. In an analysis of real SNP data from the Framingham Heart Study, SUMSTAT finds many more gene sets to be significant than the other methods. In an analysis of simulated data, SUMSTAT demonstrates high power and better control of the type I error rate. GSA is a promising approach to the analysis of SNP data in GWAS, and use of the SUMSTAT statistic instead of GSEA or FET may increase power and robustness.
Acer (the maple genus) is one of the most diverse tree genera in the Northern Hemisphere, with about 152 species, most of which are in eastern Asia. There are roughly a dozen species in Europe/western Asia and a dozen in North America. Several phylogenetic studies of Acer have been conducted since 1998, but none has provided a satisfactory resolution of the basal relationships among sections of Acer. Here we report the first well‐resolved phylogeny of Acer based on DNA sequences of over 500 nuclear loci generated using the anchored hybrid enrichment method, and we explore the implications of this robust phylogeny for Acer systematics and biogeography. Our phylogenetic results support the most recent taxonomic treatment of Acer by de Jong with some modifications: section Pentaphylla may be expanded to include section Trifoliata, and A. yangbiense may be included in section Lithocarpa. Sections Spicata, Negundo, Arguta, and Palmata form a clade sister to the rest of the genus, where sections Glabra and Parviflora comprise the first clade, followed by section Macrantha, sections Ginnala, Lithocarpa, Indivisa, sections Platanoidea and Macrophylla, section Rubra, section Acer, and section Pentaphylla. Monotypic sections Glabra and Macrophylla in North America are sister to the Japanese section Parviflora and Eurasian section Platanoidea, respectively. Ancestral area inferences using statistical dispersal and vicariance analysis (S‐DIVA) and dispersal and extinction cladogenesis (DEC) methods suggest that Asia might be the most likely ancestral area of Acer, as proposed by Wolfe and Tanai. Molecular dating using Bayesian evolutionary analysis by sampling trees (BEAST) indicates that section diversification in Acer might have been largely completed in the late Eocene and that the intercontinental disjunctions of Acer between eastern Asia and eastern North America formed mostly in the Miocene.
A number of rare variant statistical methods have been proposed for analysis of the impending wave of next-generation sequencing data. To date, there are few direct comparisons of these methods on real sequence data. Furthermore, there is a strong need for practical advice on the proper analytic strategies for rare variant analysis. We compare four recently proposed rare variant methods (combined multivariate and collapsing, weighted sum, proportion regression, and cumulative minor allele test) on simulated phenotype and next-generation sequencing data as part of Genetic Analysis Workshop 17. Overall, we find that all analyzed methods have serious practical limitations in identifying causal genes. Specifically, no method has more than a 5% true discovery rate (percentage of truly causal genes among all those identified as significantly associated with the phenotype). Further exploration shows that all methods suffer from inflated false-positive error rates (chance that a noncausal gene will be identified as associated with the phenotype) because of population stratification and gametic phase disequilibrium between noncausal SNPs and causal SNPs. Furthermore, observed true-positive rates (chance that a truly causal gene will be identified as significantly associated with the phenotype) for each of the four methods were very low (<19%). The combination of larger than anticipated false-positive rates, low true-positive rates, and only about 1% of all genes being causal yields poor discriminatory ability for all four methods. Gametic phase disequilibrium and population stratification are important areas for further research in the analysis of rare variant data.
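The way these three quantities combine can be illustrated with a short calculation. The sketch below is not taken from the paper: the function name `true_discovery_rate` is hypothetical, the 1% causal fraction and the 19% true-positive rate are the rough bounds quoted in the abstract, and the false-positive rates of 5% and 10% are assumed values chosen only for illustration.

```python
def true_discovery_rate(causal_fraction, true_positive_rate, false_positive_rate):
    """Expected share of significant genes that are truly causal.

    causal_fraction     : proportion of all genes that are causal
    true_positive_rate  : chance a causal gene is called significant
    false_positive_rate : chance a noncausal gene is called significant
    """
    true_hits = causal_fraction * true_positive_rate
    false_hits = (1.0 - causal_fraction) * false_positive_rate
    total_hits = true_hits + false_hits
    if total_hits == 0:
        return 0.0  # nothing was called significant
    return true_hits / total_hits


# Abstract's rough figures: about 1% of genes causal, true-positive rate below 19%.
# The false-positive rates below are ASSUMED for illustration, not reported values.
tdr_nominal = true_discovery_rate(0.01, 0.19, 0.05)
tdr_inflated = true_discovery_rate(0.01, 0.19, 0.10)
print(f"TDR at an assumed 5% false-positive rate:  {tdr_nominal:.3f}")
print(f"TDR at an assumed 10% false-positive rate: {tdr_inflated:.3f}")
```

Under these assumptions, noncausal genes outnumber causal genes so heavily that even a modest false-positive rate produces far more false hits than true ones, which is consistent with the low true discovery rates described above.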
In this paper we prove two multiset analogs of classical results. We prove a multiset analog of Lovász's version of the Kruskal-Katona Theorem and an analog of the Bollobás-Thomason threshold result. As a corollary we obtain the existence of pebbling thresholds for arbitrary graph sequences. In addition, we improve both the lower and upper bounds for the 'random pebbling' threshold of the sequence of paths.
1991 AMS Subject Classification: 05D05, 05C35, 05A20