The multiplicity problem has become increasingly important in genetic studies as the capacity for high-throughput genotyping has increased. The control of False Discovery Rate (FDR) (Benjamini and Hochberg. [1995] J. R. Stat. Soc. Ser. B 57:289-300) has been adopted to address the problems of false positive control and low power inherent in high-volume genome-wide linkage and association studies. In many genetic studies, there is often a natural stratification of the m hypotheses to be tested. Given the FDR framework and the presence of such stratification, we investigate the performance of a stratified false discovery control approach (i.e. control or estimate FDR separately for each stratum) and compare it to the aggregated method (i.e. consider all hypotheses in a single stratum). Under the fixed rejection region framework (i.e. reject all hypotheses with unadjusted p-values less than a pre-specified level and then estimate FDR), we demonstrate that the aggregated FDR is a weighted average of the stratum-specific FDRs. Under the fixed FDR framework (i.e. reject as many hypotheses as possible and meanwhile control FDR at a pre-specified level), we specify a condition necessary for the expected total number of true positives under the stratified FDR method to be equal to or greater than that obtained from the aggregated FDR method. Application to a recent Genome-Wide Association (GWA) study by Maraganore et al. ([2005] Am. J. Hum. Genet. 77:685-693) illustrates the potential advantages of control or estimation of FDR by stratum. Our analyses also show that controlling FDR at a low rate, e.g. 5% or 10%, may not be feasible for some GWA studies.
Starting with the seminal paper of Haario, Saksman and Tamminen (Haario et al. (2001)), a substantial amount of work has been done to validate adaptive Markov chain Monte Carlo algorithms. In this paper we focus on two practical aspects of adaptive Metropolis samplers. First, we draw attention to the deficient performance of standard adaptation when the target distribution is multi-modal. We propose a parallel chain adaptation strategy that incorporates multiple Markov chains which are run in parallel. Second, we note that 1 the current adaptive MCMC paradigm implicitly assumes that the adaptation is uniformly efficient on all regions of the state space. However, in many practical instances, different "optimal" kernels are needed in different regions of the state space. We propose here a regional adaptation algorithm in which we account for possible errors made in defining the adaptation regions. This corresponds to the more realistic case in which one does not know exactly the optimal regions for adaptation. The methods focus on the random walk Metropolis sampling algorithm but their scope is much wider. We provide theoretical justification for the two adaptive approaches using the existent theory build for adaptive Markov chain Monte Carlo. We illustrate the performance of the methods using simulations and analyze a mixture model for real data using an algorithm that combines the two approaches.
SummaryThis paper considers inference methods for case-control logistic regression in longitudinal setups. The motivation is provided by an analysis of plains bison spatial location as a function of habitat heterogeneity. The sampling is done according to a longitudinal matched case-control design in which, at certain time points, exactly one case, the actual location of an animal, is matched to a number of controls, the alternative locations that could have been reached. We develop inference methods for the conditional logistic regression model in this setup, which can be formulated within a generalized estimating equation (GEE) framework. This permits the use of statistical techniques developed for GEE-based inference, such as robust variance estimators and model selection criteria adapted for non-independent data. The performance of the methods is investigated in a simulation study and illustrated with the bison data analysis.
Summary:The study of dependence between random variables is a mainstay in Statistics. In many cases the strength of dependence between two or more random variables varies according to the values of a measured covariate. We propose inference for this type of variation using a conditional copula model where the copula function belongs to a parametric copula family and the copula parameter varies with the covariate. In order to estimate the functional relationship between the copula parameter and the covariate, we propose a nonparametric approach based on local likelihood. Of importance is also the choice of the copula family that best represents a given set of data. The proposed framework naturally leads to a novel copula selection method based on cross-validated prediction errors. We derive the asymptotic bias and variance of the resulting local polynomial estimator, and outline how to construct pointwise confidence intervals. The finite sample performance of our method is investigated using simulation studies and is illustrated using a subset of the Matched Multiple Birth data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.