PageRank is one of the principal criteria according to which Google ranks Web pages. PageRank can be interpreted as the frequency with which a random surfer visits a Web page, and thus it reflects the popularity of that page. Google computes the PageRank using the power iteration method, which requires about one week of intensive computations. In the present work we propose and analyze Monte Carlo type methods for the PageRank computation. The probabilistic Monte Carlo methods have several advantages over the deterministic power iteration method: they provide good estimates of the PageRank of relatively important pages already after one iteration; they admit a natural parallel implementation; and they allow continuous updating of the PageRank as the structure of the Web changes.
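As a concrete illustration of the simplest such estimator, here is a minimal sketch (not the exact algorithms analyzed in this work): simulate independent random-surfer runs that terminate with probability 1 - c at each step, and estimate a page's PageRank by the fraction of runs ending at it. The toy graph, the "end-point" estimator variant, and the damping factor c = 0.85 are illustrative assumptions.

```python
import random
from collections import Counter

def mc_pagerank(graph, n_runs=10_000, c=0.85):
    """Monte Carlo PageRank sketch: simulate random-surfer runs that
    continue with probability c at each step, and estimate PageRank by
    the fraction of runs ending at each page ("end-point" estimator).
    `graph` maps each page to a list of out-neighbors."""
    nodes = list(graph)
    ends = Counter()
    for _ in range(n_runs):
        page = random.choice(nodes)          # start at a uniform page
        while random.random() < c:           # keep surfing w.p. c
            out = graph[page]
            # a dangling page jumps to a uniformly chosen page
            page = random.choice(out) if out else random.choice(nodes)
        ends[page] += 1
    return {v: ends[v] / n_runs for v in nodes}

# toy four-page web
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(mc_pagerank(web))
```

Even with a modest number of runs, the estimate for the highest-ranked pages stabilizes quickly, which is the intuition behind the one-iteration claim above.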
We study the parametric perturbation of Markov chains with denumerable state spaces. We consider both regular and singular perturbations. By the latter we mean that the transition probabilities of a Markov chain with several ergodic classes are perturbed so that (rare) transitions among the different ergodic classes of the unperturbed chain become possible. Singularly perturbed Markov chains have been studied in the literature under more restrictive assumptions such as strong recurrence ergodicity or Doeblin conditions. We relax these conditions so that our results can be applied to queueing models (where the conditions mentioned above typically fail to hold). Assuming ν-geometric ergodicity, we are able to express the steady-state distribution of the perturbed Markov chain explicitly as a Taylor series in the perturbation parameter. We apply our results to quasi-birth-and-death processes and queueing models.
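The flavor of the Taylor-series result can be checked numerically on a finite toy chain (the paper itself treats denumerable state spaces, which this sketch does not capture): two ergodic classes coupled by rare transitions of order ε, whose perturbed steady-state distribution approaches its limit linearly in ε. The chain and the perturbation direction below are illustrative assumptions.

```python
import numpy as np

def stationary(P):
    """Stationary distribution of a finite stochastic matrix P:
    solve pi P = pi together with the normalization sum(pi) = 1."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Unperturbed chain with two ergodic classes {0,1} and {2,3}.
P0 = np.array([[0.5, 0.5, 0.0, 0.0],
               [0.4, 0.6, 0.0, 0.0],
               [0.0, 0.0, 0.7, 0.3],
               [0.0, 0.0, 0.2, 0.8]])
# Singular perturbation: rare transitions between the classes
# (rows of D sum to 0, so P0 + eps*D stays stochastic for small eps).
D = np.array([[-1., 0., 1., 0.],
              [0., -1., 0., 1.],
              [1., 0., -1., 0.],
              [0., 1., 0., -1.]])

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, np.round(stationary(P0 + eps * D), 4))
# The printed distributions converge as eps -> 0, and the deviation from
# the limit shrinks linearly in eps, consistent with an expansion
# pi(eps) = pi(0) + eps * pi^(1) + O(eps^2).
```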
This work proposes and studies the properties of a hybrid sampling scheme that mixes independent uniform node sampling and random walk (RW)-based crawling. We show that our sampling method combines the strengths of both uniform and RW sampling while minimizing their drawbacks. In particular, our method increases the spectral gap of the random walk and hence accelerates convergence to the stationary distribution. The proposed method resembles PageRank but, unlike PageRank, preserves time-reversibility. Applying our hybrid RW to the problem of estimating degree distributions of graphs shows promising results.
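A minimal sketch of the hybrid idea, assuming its simplest variant: at every step, with probability alpha the crawler jumps to an independently and uniformly sampled node, and otherwise takes an ordinary random-walk step. The restart probability alpha and the visit-count bookkeeping are illustrative choices, not the exact scheme analyzed here.

```python
import random

def hybrid_rw_sample(graph, n_steps=100_000, alpha=0.1):
    """Hybrid sampling sketch: mix uniform node sampling (restarts)
    with random-walk crawling. `graph` maps each node to its list of
    neighbors; returns the visit count of every node."""
    nodes = list(graph)
    v = random.choice(nodes)
    visits = {u: 0 for u in nodes}
    for _ in range(n_steps):
        if random.random() < alpha or not graph[v]:
            v = random.choice(nodes)       # uniform restart
        else:
            v = random.choice(graph[v])    # ordinary RW step
        visits[v] += 1
    return visits
```

The visit counts can then be reweighted (for example, by node degree) to correct for the walk's sampling bias when estimating degree distributions.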
PageRank is one of the principal criteria according to which Google ranks Web pages. PageRank can be interpreted as the frequency with which a random surfer visits a Web page, and thus it reflects the popularity of a Web page. We study the effect of newly created links on Google PageRank. We discuss to what extent a page can control its PageRank. Using asymptotic analysis, we provide simple conditions that show whether or not new links increase the PageRank of a Web page and its neighbors. Furthermore, we show that there exists an optimal (although impractical) linking strategy. We conclude that a Web page benefits from links within its Web community, whereas irrelevant links penalize a Web page and its community.
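The qualitative conclusion can be reproduced on a hypothetical four-page graph with standard power-iteration PageRank (a toy computation, not the asymptotic analysis of the paper): when a page inside a small community adds a link to an irrelevant outside page, its own PageRank decreases.

```python
import numpy as np

def pagerank(adj, c=0.85, n_iter=100):
    """Power-iteration PageRank for a dict-of-lists graph; dangling
    pages are treated as linking uniformly to all pages."""
    nodes = sorted(adj)
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    pr = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        new = np.full(n, (1 - c) / n)
        for v in nodes:
            out = adj[v]
            if out:
                share = c * pr[idx[v]] / len(out)
                for u in out:
                    new[idx[u]] += share
            else:
                new += c * pr[idx[v]] / n   # dangling page
        pr = new
    return dict(zip(nodes, pr))

# a small "community" {a, b, c} plus an irrelevant outside page d
web = {"a": ["b"], "b": ["a", "c"], "c": ["a"], "d": []}
print(pagerank(web)["a"])     # before the new link
web["a"] = ["b", "d"]         # a adds a link to the outside page
print(pagerank(web)["a"])     # a's PageRank drops: probability leaks
                              # out of the community
```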
Internet measurements show that a small number of large TCP flows are responsible for the largest amount of data transferred, whereas most TCP sessions consist of only a few packets. Several authors have invoked this property to suggest the use of scheduling algorithms that favor short jobs, such as LAS (Least Attained Service), to differentiate between short and long TCP flows. We propose a packet-level, stateless, threshold-based scheduling mechanism for TCP flows, RuN2C. We describe an implementation of this mechanism which has the advantage of being TCP compatible and progressively deployable. We compare the behavior of RuN2C with LAS-based mechanisms through analytical models and simulations. As an analytical model, we use a two-level priority Processor Sharing discipline, PS+PS. In the PS+PS system, a connection is classified as high or low priority depending on the amount of service it has obtained. We show that PS+PS reduces the mean response time in comparison with standard Processor Sharing when the hazard rate of the file size distribution is decreasing. By simulations we study the impact of RuN2C on extreme values of response times and on the mean number of connections in the system. Both simulations and analytical results show that RuN2C has a very beneficial effect on the delay of short flows, while treating large flows as the current TCP implementation does. In contrast, we find that LAS-based mechanisms can lead to pathological behavior in extreme cases.
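A toy sketch of the two-level threshold idea, as a drastically simplified stand-in for both RuN2C and the PS+PS model: every flow's first packets (up to a threshold of attained service) are served with high priority, and the remainder only when the high-priority class is empty. The batch arrival at time 0, the per-packet time unit, and the threshold value are all illustrative assumptions.

```python
def two_level_priority(flows, threshold=4):
    """Toy packet-level, two-level priority scheduler: one packet is
    served per time unit; flows whose attained service is below
    `threshold` packets form the high-priority class and are served
    (roughly round-robin) first. `flows` maps a flow id to its size in
    packets; all flows are present from time 0. Returns completion
    times."""
    attained = {f: 0 for f in flows}
    done = {}
    t = 0
    while len(done) < len(flows):
        t += 1
        active = [f for f in flows if f not in done]
        high = [f for f in active if attained[f] < threshold]
        pool = high or active            # low priority only if no high
        f = pool[t % len(pool)]
        attained[f] += 1
        if attained[f] == flows[f]:
            done[f] = t
    return done

# two short flows and one long flow: the short ones finish quickly,
# while the long flow gets the server to itself afterwards
print(two_level_priority({"s1": 3, "s2": 2, "long": 50}, threshold=4))
```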