Summary. We introduce a new estimator, the simultaneous multiscale change point estimator SMUCE, for the change point problem in exponential family regression. An unknown step function is estimated by minimizing the number of change points over the acceptance region of a multiscale test at a level α. The probability of overestimating the true number of change points K is controlled by the asymptotic null distribution of the multiscale test statistic. Further, we derive exponential bounds for the probability of underestimating K . By balancing these quantities, α will be chosen such that the probability of correctly estimating K is maximized. All results are even non-asymptotic for the normal case. On the basis of these bounds, we construct (asymptotically) honest confidence sets for the unknown step function and its change points. At the same time, we obtain exponential bounds for estimating the change point locations which for example yield the minimax rate O.n 1 / up to a log-term. Finally, the simultaneous multiscale change point estimator achieves the optimal detection rate of vanishing signals as n ! 1, even for an unbounded number of change points. We illustrate how dynamic programming techniques can be employed for efficient computation of estimators and confidence regions. The performance of the multiscale approach proposed is illustrated by simulations and in two cutting edge applications from genetic engineering and photoemission spectroscopy.
We propose, a heterogeneous simultaneous multiscale change point estimator called 'H-SMUCE' for the detection of multiple change points of the signal in a heterogeneous Gaussian regression model. A piecewise constant function is estimated by minimizing the number of change points over the acceptance region of a multiscale test which locally adapts to changes in the variance. The multiscale test is a combination of local likelihood ratio tests which are properly calibrated by scale-dependent critical values to keep a global nominal level α, even for finite samples. We show that H-SMUCE controls the error of overestimation and underestimation of the number of change points. For this, new deviation bounds for F -type statistics are derived. Moreover, we obtain confidence sets for the whole signal. All results are non-asymptotic and uniform over a large class of heterogeneous change point models. H-SMUCE is fast to compute, achieves the optimal detection rate and estimates the number of change points at almost optimal accuracy for vanishing signals, while still being robust. We compare H-SMUCE with several state of the art methods in simulations and analyse current recordings of a transmembrane protein in the bacterial outer membrane with pronounced heterogeneity for its states. An R-package is available on line.
Fast multiple change-point segmentation methods, which additionally provide faithful statistical statements on the number, locations and sizes of the segments, have recently received great attention. In this paper, we propose a multiscale segmentation method, FDRSeg, which controls the false discovery rate (FDR) in the sense that the number of false jumps is bounded linearly by the number of true jumps. In this way, it adapts the detection power to the number of true jumps. We prove a non-asymptotic upper bound for its FDR in a Gaussian setting, which allows to calibrate the only parameter of FDRSeg properly. Change-point locations, as well as the signal, are shown to be estimated in a uniform sense at optimal minimax convergence rates up to a log-factor. The latter is w.r.t. L p -risk, p ≥ 1, over classes of step functions with bounded jump sizes and either bounded, or possibly increasing, number of change-points. FDRSeg can be efficiently computed by an accelerated dynamic program; its computational complexity is shown to be linear in the number of observations when there are many change-points. The performance of the proposed method is examined by comparisons with some state of the art methods on both simulated and real datasets. An R-package is available online.
Based on a combination of jump segmentation and statistical multiresolution analysis for dependent data, a new approach called J-SMURF to idealize ion channel recordings has been developed. It is model-free in the sense that no a-priori assumptions about the channel’s characteristics have to be made; it thus complements existing methods which assume a model for the channel's dynamics, like hidden Markov models. The method accounts for the effect of an analog filter being applied before the data analysis, which results in colored noise, by adapting existing muliresolution statistics to this situation. J-SMURF’s ability to denoise the signal without missing events even when the signal-to-noise ratio is low is demonstrated on simulations as well as on ion current traces obtained from gramicidin A channels reconstituted into solvent-free planar membranes. When analyzing a newly synthesized acylated system of a fatty acid modified gramicidin channel, we are able to give statistical evidence for unknown gating characteristics such as subgating.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.