Background
Although the past few years have seen rapid development of novel statistical methods for association studies of qualitative traits using next-generation sequencing (NGS) data, only a few statistics have been proposed for testing the association of rare variants with quantitative traits. Quantitative trait locus (QTL) analysis of rare variants remains challenging. Moving from low-dimensional data to high-dimensional genomic data demands a shift in statistical methods from multivariate data analysis to functional data analysis.
Methods
In this paper, we propose a functional linear model (FLM) as a general principle for developing novel and powerful QTL analysis methods designed for resequencing data. Using simulations, we estimate the type I error rates and evaluate the power of the FLM and eight other existing statistical methods, including when variant effects have both positive and negative signs.
Results
Because the FLM retains all of the genetic information in the data, exploits the merits of both variant-by-variant and collective analysis, and overcomes their limitations, it has much higher power than the other existing statistics in all the scenarios considered. To further evaluate its performance, the FLM is applied to association analysis of six quantitative traits in the Dallas Heart Study, and to RNA-seq eQTL analysis using genetic variation from the low-coverage resequencing data of the 1000 Genomes Project. Real data analysis shows that the FLM yields much smaller P-values for significantly associated variants than the other existing methods.
Conclusions
The FLM is expected to open a new route for QTL analysis.