Tree shape statistics provide valuable quantitative insights into evolutionary mechanisms underpinning phylogenetic trees, a commonly used graph representation of evolutionary relationships among taxonomic units ranging from viruses to species. We study two subtree counting statistics, the number of cherries and the number of pitchforks, for random phylogenetic trees generated by two widely used null tree models: the proportional to distinguishable arrangements (PDA) and the Yule-Harding-Kingman (YHK) models. By developing limit theorems for a version of extended Pólya urn models in which negative entries are permitted for their replacement matrices, we deduce the strong laws of large numbers and the central limit theorems for the joint distributions of these two counting statistics for the PDA and the YHK models. Our results indicate that the limiting behaviour of these two statistics, when appropriately scaled using the number of leaves in the underlying trees, is independent of the initial tree used in the tree generating process.
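As an illustration of the urn viewpoint for the YHK model, the following is a minimal sketch, not the authors' construction. It assumes the standard YHK growth step (a uniformly chosen leaf is replaced by a cherry) and tracks only two leaf types, leaves inside a cherry and leaves outside one. The function name `simulate_yhk_cherries` and the two-colour replacement matrix are illustrative assumptions; pitchforks are left out to keep the example short. The matrix has a negative entry because splitting a non-cherry leaf removes that leaf from its own class.

```python
import random


def simulate_yhk_cherries(n_leaves, seed=0):
    """Count cherries in a YHK tree with n_leaves leaves via a two-colour urn.

    Hypothetical helper for illustration only.
    Colour 0: leaves that belong to a cherry.
    Colour 1: leaves that do not belong to a cherry.
    """
    rng = random.Random(seed)
    # Start from the two-leaf tree, which is a single cherry.
    urn = [2, 0]
    # Row i gives the change in the urn when a leaf of colour i is split.
    # Splitting a cherry leaf: the new pair forms a cherry and the old
    # sibling leaves the cherry class, so the net change is (0, +1).
    # Splitting a non-cherry leaf: it is replaced by a new cherry, (+2, -1).
    replacement = [[0, 1], [2, -1]]
    while urn[0] + urn[1] < n_leaves:
        total = urn[0] + urn[1]
        # Pick a leaf uniformly at random; its colour decides the update.
        picked = 0 if rng.random() < urn[0] / total else 1
        urn[0] += replacement[picked][0]
        urn[1] += replacement[picked][1]
    # Each cherry contributes exactly two cherry leaves.
    return urn[0] // 2


if __name__ == "__main__":
    n = 20000
    cherries = simulate_yhk_cherries(n, seed=1)
    print(cherries / n)  # scaled cherry count for one simulated tree
```

Running the sketch for increasing `n_leaves` and different seeds gives a rough empirical view of the scaled statistic that the strong law and central limit theorem describe.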
In this paper we consider a new type of urn scheme, where the selection probabilities are proportional to a weight function, which is linear but decreasing in the proportion of existing colours. We refer to it as the de-preferential urn scheme. We establish the almost-sure limit of the random configuration for any balanced replacement matrix R. In particular, we show that the limiting configuration is uniform on the set of colours if and only if R is a doubly stochastic matrix. We further establish the almost-sure limit of the vector of colour counts and prove central limit theorems for the random configuration as well as for the colour counts.
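A de-preferential urn of this kind could be simulated along the following lines. This is a sketch under stated assumptions rather than the paper's exact scheme: the weight function is taken to be `theta - x` applied to the current proportion `x` of each colour, with `theta >= 1` (the abstract says only that the weight is linear and decreasing), and the function name `simulate_depreferential_urn` and the example matrix are hypothetical.

```python
import random


def simulate_depreferential_urn(R, initial, steps, theta=1.0, seed=0):
    """Simulate an urn whose selection weights decrease with colour proportion.

    Hypothetical helper for illustration only.
    R       : balanced replacement matrix (every row has the same sum).
    initial : initial colour counts (not all zero).
    theta   : assumed intercept of the linear weight w(x) = theta - x.
    Returns the final vector of colour proportions.
    """
    rng = random.Random(seed)
    k = len(initial)
    counts = [float(c) for c in initial]
    for _ in range(steps):
        total = sum(counts)
        proportions = [c / total for c in counts]
        # Assumed weight function: rarer colours are more likely to be picked.
        weights = [theta - x for x in proportions]
        j = rng.choices(range(k), weights=weights)[0]
        # Add row j of the replacement matrix to the configuration.
        for i in range(k):
            counts[i] += R[j][i]
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    # A doubly stochastic example matrix (rows and columns each sum to 1).
    R = [
        [0.5, 0.3, 0.2],
        [0.2, 0.5, 0.3],
        [0.3, 0.2, 0.5],
    ]
    print(simulate_depreferential_urn(R, [1, 0, 0], steps=20000, seed=2))
```

Swapping in a balanced matrix that is not doubly stochastic is a simple way to explore the "if and only if" statement above empirically.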
Our main results are quantitative bounds in the multivariate normal approximation of centred subgraph counts in random graphs generated by a general graphon and independent vertex labels. We are interested in these statistics because they are key to understanding fluctuations of regular subgraph counts, a cornerstone of dense graph limit theory. We also identify the resulting limiting Gaussian stochastic measures by means of the theory of generalised U-statistics and Gaussian Hilbert spaces, which we think is a suitable framework to describe and understand higher-order fluctuations in dense random graph models. With this article, we believe we answer the question "What is the central limit theorem of dense graph limit theory?" We complement the theory with some statistical applications to illustrate the use of centred subgraph counts in network modelling.