Abstract. We prove that a wide class of Markov models of neighbor-dependent substitution processes on the integer line is solvable. This class contains some models of nucleotidic substitutions recently introduced and studied empirically by molecular biologists. We show that the polynucleotidic frequencies at equilibrium solve some finite-size linear systems. This provides, for the first time to our knowledge, explicit and algebraic formulas for the stationary frequencies of nondegenerate neighbor-dependent models of DNA substitutions. Furthermore, we show that the dynamics of these stochastic processes and their distribution at equilibrium exhibit some stringent, rather unexpected, independence properties. For example, nucleotidic sites at distance at least three evolve independently, and all the sites, when encoded only as purines and pyrimidines, evolve independently.
The variability of the products of polymerase chain reactions, due to mutations and to incomplete replications, can have important clinical consequences. Sun (1995) and Weiss and von Haeseler (1995) modeled these errors by a branching process and introduced estimators of the mutation rate and of the efficiency of the reaction based, for example, on the empirical distribution of the mutations of a random sequence. This distribution involves a noncanonical branching Markov chain which, although easy to describe, is not analytically tractable except in the infinite-population limit. The infinite-population limit was solved by these authors for infinite targets and by Wang et al. (2000) for finite targets. In this paper, we provide bounds on the difference between the finite-target finite-population case and its finite-target infinite-population approximation. The bounds are explicit functions of the efficiency of the reaction, the mutation rate per site and per cycle, the size of the target, the number of cycles, and the size of the initial population. They concern every moment and, perhaps more surprisingly, the histogram itself of the distributions. The bounds for the moments exhibit a phase transition at the value 1 - 1/N = 3/4 of the mutation rate per site and per cycle, where N = 4 is the number of letters in the encoding alphabet of DNA and RNA. Of course, in biological contexts, the mutation rates are much smaller than 3/4.
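To make the parameters listed above concrete, the following is a minimal simulation sketch of a finite-target, finite-population PCR branching process. It is not the authors' construction. It assumes that at each cycle every molecule is copied independently with probability equal to the efficiency, that the parent molecule is left unchanged, and that each site of the copy mutates independently to one of the three other letters chosen uniformly. The function names `simulate_pcr` and `mutation_histogram` and all parameter names are hypothetical.

```python
import random
from collections import Counter

ALPHABET_SIZE = 4  # N = 4 letters in the encoding alphabet of DNA and RNA


def simulate_pcr(initial_population, target_size, cycles,
                 efficiency, mutation_rate, seed=None):
    """Simulate PCR as a branching process (illustrative assumptions only).

    Every molecule starts as the original sequence, encoded as all zeros.
    At each cycle, each molecule is copied with probability `efficiency`;
    each site of the copy mutates with probability `mutation_rate` to a
    uniformly chosen different letter. Parents are kept unchanged.
    """
    rng = random.Random(seed)
    population = [[0] * target_size for _ in range(initial_population)]
    for _ in range(cycles):
        offspring = []
        for molecule in population:
            if rng.random() < efficiency:
                copy = list(molecule)
                for site in range(target_size):
                    if rng.random() < mutation_rate:
                        # A shift by 1, 2 or 3 modulo 4 always yields a different letter.
                        shift = rng.randrange(1, ALPHABET_SIZE)
                        copy[site] = (copy[site] + shift) % ALPHABET_SIZE
                offspring.append(copy)
        population.extend(offspring)
    return population


def mutation_histogram(population):
    """Count molecules by number of sites differing from the original (letter 0)."""
    return Counter(sum(1 for letter in molecule if letter != 0)
                   for molecule in population)


if __name__ == "__main__":
    final = simulate_pcr(initial_population=10, target_size=100, cycles=10,
                         efficiency=0.8, mutation_rate=0.001, seed=1)
    print(len(final), sorted(mutation_histogram(final).items()))
```

Under these assumptions, a mutation rate of 3/4 makes each copied letter uniform over the four letters, which is one way to read the threshold 1 - 1/N mentioned above. The histogram returned by `mutation_histogram` is the empirical counterpart of the distribution of the mutations of a random sequence.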