We show that, when the (limiting) request distribution has a heavy tail (e.g., generalized Zipf's law), $P[R = n] \sim c/n^{\alpha}$ as $n \to \infty$, $\alpha > 1$, then the limiting stationary search cost distribution $P[C > n]$, or, equivalently, the least-recently used (LRU) caching fault probability, satisfies $\lim_{n \to \infty} P[C > n]/P[R > n] = (1 - 1/\alpha)\,[\Gamma(1 - 1/\alpha)]^{\alpha} \nearrow e^{\gamma}$ as $\alpha \to \infty$, where $\Gamma$ is the Gamma function and $\gamma$ ($= 0.5772\ldots$) is Euler's constant. When the request distribution has a light tail, $P[R = n] \sim c \exp(-\lambda n^{\beta})$ as $n \to \infty$ ($c, \lambda, \beta > 0$), then $\lim_{n \to \infty} P[C_f > n]/P[R > n] = e^{\gamma}$, independently of $c, \lambda, \beta$, where $C_f$ is a fluid approximation of $C$. We experimentally demonstrate that the derived asymptotic formulas yield accurate results for lists of finite sizes. This should be contrasted with the exponential computational complexity of Burville and Kingman's exact expression for finite lists. The results also imply that the fault probability of LRU caching is asymptotically at most a factor $e^{\gamma} \approx 1.78$ greater than for the optimal static arrangement.
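As a rough numerical illustration of the bound quoted above, the following sketch evaluates the heavy-tail constant under the assumption that it has the closed form $(1 - 1/\alpha)[\Gamma(1 - 1/\alpha)]^{\alpha}$, and compares it with $e^{\gamma}$. The function name `heavy_tail_ratio` and the hard-coded value of Euler's constant are choices made for this example, not part of the paper.

```python
import math

# Euler's constant; the math module does not export it, so it is hard-coded
# here (an assumption of this sketch, accurate to double precision).
EULER_GAMMA = 0.5772156649015329


def heavy_tail_ratio(alpha: float) -> float:
    """Assumed limit of P[C > n] / P[R > n] for a request tail index alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1")
    x = 1.0 - 1.0 / alpha
    # Evaluate Gamma(x)**alpha in log space so that precision is kept when
    # alpha is large and Gamma(x) is close to 1.
    return x * math.exp(alpha * math.lgamma(x))


if __name__ == "__main__":
    for alpha in (1.1, 1.5, 2.0, 3.0, 10.0, 1e3, 1e6):
        print(f"alpha = {alpha:>9g}   ratio = {heavy_tail_ratio(alpha):.5f}")
    print(f"e^gamma = {math.exp(EULER_GAMMA):.5f}")
```

For $\alpha = 2$ the expression reduces to $\tfrac{1}{2}\Gamma(1/2)^2 = \pi/2$, which gives a quick sanity check on the implementation; the values then climb toward $e^{\gamma}$ as $\alpha$ grows.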
In this paper we consider the stochastic analysis of information ranking algorithms of large interconnected data sets, e.g. Google's PageRank algorithm for ranking pages on the World Wide Web. The stochastic formulation of the problem results in an equation of the form $R \stackrel{d}{=} Q + \sum_{i=1}^{N} C_i R_i$, where $N$, $Q$, $\{R_i\}_{i \ge 1}$, and $\{C, C_i\}_{i \ge 1}$ are independent nonnegative random variables, the $\{C, C_i\}_{i \ge 1}$ are identically distributed, and the $\{R_i\}_{i \ge 1}$ are independent copies of $R$; '$\stackrel{d}{=}$' stands for equality in distribution. We study the asymptotic properties of the distribution of $R$ that, in the context of PageRank, represents the frequencies of highly ranked pages. The preceding equation is interesting in its own right since it belongs to a more general class of weighted branching processes that have been found to be useful in the analysis of many other algorithms. Our first main result shows that if $E[N]\,E[C^{\alpha}] = 1$, $\alpha > 0$, and $Q$, $N$ satisfy additional moment conditions, then $R$ has a power law distribution of index $\alpha$. This result is obtained using a new approach based on an extension of Goldie's (1991) implicit renewal theorem. Furthermore, when $N$ is regularly varying of index $\alpha > 1$, $E[N]\,E[C^{\alpha}] < 1$, and $Q$, $C$ have higher moments than $\alpha$, then the distributions of $R$ and $N$ are tail equivalent. The latter result is derived via a novel sample path large deviation method for recursive random sums. Similarly, we characterize the situation when the distribution of $R$ is determined by the tail of $Q$. The preceding approaches may be of independent interest, as they can be used for analyzing other functionals on trees. We also briefly discuss the engineering implications of our results.
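To make the condition $E[N]\,E[C^{\alpha}] = 1$ concrete, here is a minimal sketch that solves it numerically for $\alpha$. It assumes a discrete weight $C$ taking values in $(0, 1)$ and $E[N] > 1$, so that the left-hand side is strictly decreasing in $\alpha$ and bisection applies. The function name `power_law_index` and its parameters are hypothetical, and the code only solves the equation: it does not check the additional moment conditions the result requires.

```python
from typing import Sequence


def power_law_index(mean_n: float,
                    values: Sequence[float],
                    probs: Sequence[float],
                    hi: float = 64.0,
                    tol: float = 1e-12) -> float:
    """Solve E[N] * E[C**alpha] = 1 for alpha > 0 by bisection.

    Assumptions of this sketch: C is discrete with support `values` inside
    (0, 1) and probabilities `probs`, and mean_n = E[N] > 1.
    """
    if mean_n <= 1.0:
        raise ValueError("E[N] must exceed 1 for a positive root to exist")
    if not all(0.0 < c < 1.0 for c in values):
        raise ValueError("this sketch assumes weights strictly inside (0, 1)")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")

    def g(alpha: float) -> float:
        # g is strictly decreasing in alpha because every weight is below 1.
        return mean_n * sum(p * c ** alpha for c, p in zip(values, probs)) - 1.0

    lo = 0.0  # g(0) = E[N] - 1 > 0
    if g(hi) > 0.0:
        raise ValueError("no root below the upper bracket `hi`")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Illustrative inputs only: E[N] = 4 and C = 1/2 give alpha = 2.
    print(power_law_index(4.0, [0.5], [1.0]))
```

With a constant weight $C = c$ the root is simply $\alpha = \log E[N] / \log(1/c)$, which is a convenient check for the bisection.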
Consider distributional fixed point equations of the form $R \stackrel{D}{=} f(C_i, R_i, 1 \le i \le N)$, where $f(\cdot)$ is a possibly random real-valued function, $N \in \{0, 1, 2, 3, \ldots\} \cup \{\infty\}$, $\{C_i\}_{i=1}^{N}$ are real-valued random weights and $\{R_i\}_{i \ge 1}$ are iid copies of $R$, independent of $(N, C_1, \ldots, C_N)$; $\stackrel{D}{=}$ represents equality in distribution. Fixed point equations of this type are of utmost importance for solving many applied probability problems, ranging from the average case analysis of algorithms to statistical physics. We develop an Implicit Renewal Theorem that enables the characterization of the power tail behavior of the solutions $R$ to many equations of multiplicative nature that fall into this category. This result extends the prior work in [16], which assumed nonnegative weights $\{C_i\}$, to general real-valued weights. We illustrate the developed theorem by deriving the power tail asymptotics of the solution $R$ to the
Consider a generic data unit of random size $L$ that needs to be transmitted over a channel of unit capacity. The channel dynamics is modeled as an on-off process $\{(A_i, U_i)\}_{i \ge 1}$ with alternating independent periods when the channel is available, $A_i$, and unavailable, $U_i$, respectively. During each period of time that the channel becomes available, say $A_i$, we attempt to transmit the data unit. If $L \le A_i$, the transmission is considered successful; otherwise, we wait for the next period $A_{i+1}$ when the channel is available and attempt to retransmit the data from the beginning. We study the asymptotic properties of the total transmission time $T$ and number of retransmissions $N$ until the data is successfully transmitted. In recent studies [1], [2], it was proved that the waiting time $T$ follows a power law when the distributions of $L$ and $A_1$ are of an exponential type, e.g., Gamma distribution. In this paper, we show that the distributions of $N$ and $T$ follow power laws with exponent $\alpha$ as long as $\log P[L > x] \approx \alpha \log P[A_1 > x]$ for large $x$. Hence, it may appear surprising that we obtain power law distributions irrespective of how heavy or light the distributions of $L$ and $A_1$ may be. In particular, both $L$ and $A_1$ can decay faster than any exponential, which we term superexponential. For example, if $L$ and $A_1$ are Gaussian with variances $\sigma_L^2$ and $\sigma_A^2$, respectively, then $N$ and $T$ have power law distributions with exponent $\alpha = \sigma_A^2/\sigma_L^2$; note that, if $\sigma_A^2 < \sigma_L^2$, the transmission time has an infinite mean and, thus, the system is unstable. The preceding model, as recognized in [1], describes a variety of situations where failures require jobs to restart from the beginning. Here, we identify that this model also provides a new mechanism for explaining the frequently observed power law phenomenon in data networks.
Specifically, we argue that it may imply power laws at both the application and the data link layers, where variable-sized documents and (IP) packets are transmitted, respectively. We discuss the engineering ramifications of our observations, especially in the context of wireless ad hoc and sensor networks where channel failures are frequent. Furthermore, our results provide an easily computable benchmark for measuring the matching between the data and channel characteristics that permits/prevents satisfactory transmission.
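The retransmission mechanism described above can be checked by hand in the simplest exponential case. The sketch below assumes $L \sim \mathrm{Exp}(\mu)$ and i.i.d. $A_i \sim \mathrm{Exp}(\nu)$, so that $\log P[L > x] = \alpha \log P[A_1 > x]$ holds exactly with $\alpha = \mu/\nu$, and it takes $N$ to be the number of attempts, so that $P[N > n]$ is the probability that the first $n$ attempts fail. Both choices, and the function names, are assumptions of this example rather than the paper's setup. Conditioning on $L$ and integrating gives $P[N > n] = \Gamma(\alpha + 1)\,\Gamma(n + 1)/\Gamma(n + \alpha + 1) \sim \Gamma(\alpha + 1)\, n^{-\alpha}$.

```python
import math


def retransmission_tail(n: int, alpha: float) -> float:
    """P[N > n] for L ~ Exp(mu), A_i ~ Exp(nu) i.i.d., with alpha = mu / nu.

    Given L, each attempt fails with probability 1 - exp(-nu * L), hence
    P[N > n] = E[(1 - exp(-nu * L))**n] = alpha * B(alpha, n + 1)
             = Gamma(alpha + 1) * Gamma(n + 1) / Gamma(n + alpha + 1).
    """
    if n < 0 or alpha <= 0.0:
        raise ValueError("need n >= 0 and alpha > 0")
    # lgamma avoids overflow of Gamma(n + 1) for large n.
    log_tail = (math.lgamma(alpha + 1.0) + math.lgamma(n + 1.0)
                - math.lgamma(n + alpha + 1.0))
    return math.exp(log_tail)


def loglog_slope(alpha: float, n1: int = 10_000, n2: int = 20_000) -> float:
    """Slope of log P[N > n] against log n between n1 and n2."""
    rise = (math.log(retransmission_tail(n2, alpha))
            - math.log(retransmission_tail(n1, alpha)))
    return rise / (math.log(n2) - math.log(n1))


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0):
        print(f"alpha = {alpha}: log-log slope = {loglog_slope(alpha):.4f}")
```

Although both $L$ and $A_i$ have exponentially decaying tails here, the log-log slope of $P[N > n]$ settles near $-\alpha$, which is the power law behavior the abstract describes; for $\alpha \le 1$ the mean number of attempts is infinite.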