To overcome the inaccuracy of single-model forecasts, a new optimal weight combination model is established to improve the accuracy of precipitation forecasting. It combines three forecast submodels, based on the rank set pair analysis (R-SPA) model, the radial basis function (RBF) model, and the autoregressive (AR) model, with a weight optimization model based on an improved real-coded genetic algorithm (IRGA). The new model for forecasting precipitation time series is tested using the annual precipitation data of Beijing, China, from 1978 to 2008. In the new combination model, the optimal weights are obtained with the genetic algorithm. Compared with the R-SPA, RBF, and AR models, the new model improves the forecast accuracy of precipitation in terms of the error sum of squares, by 22.6%, 47.4%, and 40.6%, respectively. This new forecast method is an extension of the combination prediction method.
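As a minimal sketch of the weight-combination idea, the snippet below searches for non-negative weights that sum to 1 and minimize the error sum of squares of the combined forecast. Several things here are assumptions made for illustration: the weight constraints, the function names, and the numbers, which are made up and are not the Beijing data. The IRGA is also swapped for a plain exhaustive grid search, which is only practical because there are three submodels.

```python
from itertools import product


def sse(observed, forecast):
    """Error sum of squares between observations and a forecast series."""
    return sum((o - f) ** 2 for o, f in zip(observed, forecast))


def combine(forecasts, weights):
    """Weighted sum of the submodel forecasts, one value per time step."""
    n_steps = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(n_steps)]


def best_weights(observed, forecasts, steps=100):
    """Grid search over three non-negative weights that sum to 1.

    Stand-in for the genetic algorithm: it scans the weight simplex
    on a regular grid and keeps the weights with the smallest SSE.
    """
    best = None
    for i, j in product(range(steps + 1), repeat=2):
        if i + j > steps:
            continue
        w = (i / steps, j / steps, (steps - i - j) / steps)
        err = sse(observed, combine(forecasts, w))
        if best is None or err < best[1]:
            best = (w, err)
    return best


# Hypothetical values for illustration only.
observed = [500.0, 620.0, 450.0, 580.0, 530.0]
forecasts = [
    [520.0, 600.0, 470.0, 560.0, 540.0],  # stand-in for the R-SPA submodel
    [480.0, 650.0, 430.0, 600.0, 510.0],  # stand-in for the RBF submodel
    [510.0, 590.0, 480.0, 570.0, 550.0],  # stand-in for the AR submodel
]

weights, combined_sse = best_weights(observed, forecasts)
single_sse = [sse(observed, f) for f in forecasts]
print("weights:", weights)
print("combined SSE:", combined_sse, "single-model SSE:", single_sse)
```

Because the grid includes the three corner points where one submodel receives all the weight, the combined SSE found this way can never exceed the SSE of the best single submodel on the fitting data.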
Motivation
Large-scale genome-wide association studies (GWAS) have identified a wide range of genetic variants related to a host of complex traits and disorders. Despite their success, the individual single-nucleotide polymorphism (SNP) analysis approach adopted in most current GWAS is limited in two ways: it is usually biologically too simplistic to elucidate a comprehensive genetic architecture of phenotypes, and it is statistically underpowered because of the heavy multiple-testing correction burden. Multiple-SNP analyses (e.g. gene-based or region-based SNP-set analysis), on the other hand, are usually more powerful for examining the joint effects of a set of SNPs on the phenotype of interest. However, current multiple-SNP approaches can only draw an overall conclusion at the SNP-set level and do not directly indicate which SNPs in the SNP-set are driving the overall genotype–phenotype association.
Results
In this article, we propose a new permutation-assisted tuning procedure in lasso (plasso) to identify phenotype-associated SNPs in a joint multiple-SNP regression model in GWAS. The tuning parameter of lasso determines the amount of shrinkage and is essential to the performance of variable selection. In the proposed plasso procedure, we first generate permutations as pseudo-SNPs that are not associated with the phenotype. Then, the lasso tuning parameter is chosen so as to separate true signal SNPs from non-informative pseudo-SNPs. Simulations demonstrate the superior performance of plasso over existing methods, and an application of plasso to a real GWAS dataset yields additional insights into the genetic control of complex traits.
Availability and implementation
R code implementing the proposed methodology is available at https://github.com/xyz5074/plasso.
Supplementary information
Supplementary data are available at Bioinformatics online.
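The permutation-assisted tuning step described in the Results section above can be illustrated with a small sketch. This is one possible reading of the idea, not the authors' procedure; their R code at the URL above is the authoritative implementation. Assumptions made here: pseudo-SNPs are built by applying a single row permutation to all SNP columns, the tuning parameter is taken as the smallest value on a geometric grid at which no pseudo-SNP has a non-zero coefficient, and the lasso is fitted by a simple coordinate descent. All function names and the toy data are hypothetical.

```python
import random


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(cols, y, lam, sweeps=50):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1.

    `cols` is a list of centered feature columns, `y` a centered response.
    """
    n = len(y)
    beta = [0.0] * len(cols)
    resid = list(y)
    norms = [sum(x * x for x in c) / n for c in cols]
    for _ in range(sweeps):
        for j, c in enumerate(cols):
            if norms[j] == 0.0:
                continue
            rho = sum(x * r for x, r in zip(c, resid)) / n + norms[j] * beta[j]
            new = soft_threshold(rho, lam) / norms[j]
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - x * delta for r, x in zip(resid, c)]
                beta[j] = new
    return beta


def permutation_tuned_lasso(snps, y, n_lambdas=15, seed=0):
    """Pick the smallest grid lambda at which no pseudo-SNP is selected."""
    rng = random.Random(seed)
    n = len(y)
    yc = center(y)
    real = [center(c) for c in snps]
    order = list(range(n))
    rng.shuffle(order)
    # Assumption: one shared row permutation breaks the link to y
    # while keeping the correlation among the pseudo-SNPs.
    pseudo = [[c[i] for i in order] for c in real]
    cols = real + pseudo
    p = len(real)
    lam_max = max(abs(sum(x * v for x, v in zip(c, yc))) / n for c in cols)
    chosen, selected = lam_max, []
    for k in range(1, n_lambdas + 1):
        lam = lam_max * 0.8 ** k
        beta = lasso(cols, yc, lam)
        if any(b != 0.0 for b in beta[p:]):
            break  # a pseudo-SNP entered: keep the previous lambda
        chosen = lam
        selected = [j for j in range(p) if beta[j] != 0.0]
    return chosen, selected


# Toy data (hypothetical): four SNPs coded 0/1/2, only SNP 0 affects y.
data_rng = random.Random(1)
n = 100
snps = [[data_rng.choice([0, 1, 2]) for _ in range(n)] for _ in range(4)]
y = [2.0 * snps[0][i] + data_rng.gauss(0.0, 0.1) for i in range(n)]

lam, selected = permutation_tuned_lasso(snps, y)
print("chosen lambda:", lam, "selected SNP indices:", selected)
```

The design choice in this sketch is to treat the pseudo-SNPs as a built-in negative control: any tuning value small enough to admit a pseudo-SNP is regarded as too permissive, so the search stops just before that point.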
This study proposes a time-varying effect model that can be used to characterize gender-specific trajectories of health behaviors and to conduct hypothesis testing for gender differences. The motivating examples demonstrate that the proposed model is applicable not only to multi-wave longitudinal studies but also to short-term studies that involve intensive data collection. The simulation study shows that the accuracy of estimation of trajectory functions improves as the sample size and the number of time points increase. In terms of the performance of the hypothesis testing, the type I error rates are close to their corresponding significance levels under all combinations of sample size and number of time points. Furthermore, the power increases as the alternative hypothesis deviates further from the null hypothesis, and this increase is faster when the sample size and the number of time points are larger.