What do genes whose expression level is partially controlled by eQTL have in common, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These questions are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. In addition, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations and for inferring knowledge about as yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found to be correlated with gene clusters, and hence with related biological functions, as a result of a spatial analysis of the network topology.
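As an illustration of the partial-correlation idea behind such a co-expression network, the following minimal sketch computes partial correlations from the inverse of the sample covariance matrix. It is not the authors' pipeline: the function name `partial_correlation` and the simulated data are hypothetical, and the plain (pseudo-)inverse used here assumes many more samples than genes, whereas real expression data would usually call for a regularized estimator.

```python
import numpy as np


def partial_correlation(data):
    """Partial correlation between every pair of columns, given all the others.

    data: array of shape (n_samples, n_variables), e.g. samples x genes.
    Assumes the sample covariance is well conditioned (n_samples >> n_variables).
    """
    cov = np.cov(data, rowvar=False)
    precision = np.linalg.pinv(cov)  # inverse covariance (precision) matrix
    d = np.sqrt(np.diag(precision))
    # Standard identity: pcor_ij = -P_ij / sqrt(P_ii * P_jj)
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Hypothetical example: x and y are both driven by z, so they are strongly
# correlated marginally but close to uncorrelated once z is accounted for.
rng = np.random.default_rng(0)
z = rng.normal(size=5000)
x = z + 0.5 * rng.normal(size=5000)
y = z + 0.5 * rng.normal(size=5000)
demo = np.column_stack([x, y, z])

marginal_xy = np.corrcoef(x, y)[0, 1]
partial_xy = partial_correlation(demo)[0, 1]
```

An edge of the network would then be kept only where the partial correlation is judged non-zero, which removes links that are explained entirely by a third gene.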
We address the problem of prediction in the spatial autoregressive (SAR) model for areal data, which is classically used in spatial econometrics. In kriging theory, prediction using Best Linear Unbiased Predictors (BLUP) is at the heart of the geostatistical literature. From the methodological point of view, we explore the limits of extending the BLUP formulas to SAR models for out-of-sample prediction at several sites simultaneously. We propose a more tractable "almost best" alternative and clarify the relationship between the BLUP and a proper EM-algorithm predictor. From an empirical perspective, we present data-based simulations to compare the efficiency of the classical formulas with the best and almost best predictions.
JEL classification: C21, C53
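For orientation, the out-of-sample BLUP in a SAR model can be written in its textbook form as below. This is a sketch under stated assumptions, not necessarily the notation or exact formulas of the paper: the weight matrix $W$ is defined over all sites, the parameters $\rho$, $\beta$ and $\sigma^2$ are treated as known, and the subscripts $S$ and $O$ (an assumed notation) denote in-sample and out-of-sample sites.

```latex
\begin{align*}
Y &= \rho W Y + X\beta + \varepsilon,
  \qquad \mathrm{E}(\varepsilon) = 0, \quad \mathrm{Var}(\varepsilon) = \sigma^2 I, \\
\mu &= (I - \rho W)^{-1} X \beta,
  \qquad Q = \Sigma^{-1} = \sigma^{-2} (I - \rho W)^{\top} (I - \rho W), \\
\hat{Y}_O &= \mu_O + \Sigma_{OS} \Sigma_{SS}^{-1} (Y_S - \mu_S)
           = \mu_O - Q_{OO}^{-1} Q_{OS} (Y_S - \mu_S).
\end{align*}
```

The second expression for $\hat{Y}_O$ follows from the block identity $Q_{OO}\Sigma_{OS} + Q_{OS}\Sigma_{SS} = 0$ and only requires inverting the block of the precision matrix associated with the out-of-sample sites.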
In an election, the vote shares by party on a given subdivision of a territory form a vector with positive components adding up to 1, called a composition. Using a conventional multiple linear regression model to explain this vector by some factors is not appropriate for at least two reasons: the constraint on the sum of the components, and the fact that the assumption of statistical independence across territorial units is questionable due to potential spatial autocorrelation. We develop a simultaneous spatial autoregressive model for compositional data which allows for both spatial correlation and correlations across equations. We propose an estimation method based on two-stage and three-stage least squares. We illustrate the method with simulations and with a data set from the 2015 French departmental election.
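To make the sum-to-one constraint concrete, the sketch below maps a vector of vote shares to unconstrained coordinates with a log-ratio transformation and back. The abstract does not say which transformation the authors use; the additive log-ratio (alr) shown here is an assumption chosen for simplicity, and the function names and the example shares are hypothetical.

```python
import math


def alr(shares):
    """Additive log-ratio transform, using the last component as reference.

    Maps D strictly positive shares summing to 1 to D - 1 unconstrained reals.
    """
    if any(s <= 0 for s in shares):
        raise ValueError("all shares must be strictly positive")
    ref = shares[-1]
    return [math.log(s / ref) for s in shares[:-1]]


def alr_inverse(coords):
    """Map D - 1 real coordinates back to a composition of D shares."""
    expd = [math.exp(c) for c in coords]
    total = 1.0 + sum(expd)
    return [e / total for e in expd] + [1.0 / total]


# Hypothetical vote shares of three parties in one territorial unit.
shares = [0.5, 0.3, 0.2]
coords = alr(shares)             # two unconstrained coordinates
recovered = alr_inverse(coords)  # back on the simplex
```

A regression model can then be fitted on the unconstrained coordinates, with one equation per coordinate, and predictions mapped back to shares that are positive and sum to 1 by construction.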