This survey reviews the research related to PageRank computing. Components of a PageRank vector serve as authority weights for web pages independent of their textual content, solely based on the hyperlink structure of the web. PageRank is typically used as a web search ranking component. This defines the importance of the model and the data structures that underly PageRank processing. Computing even a single PageRank is a difficult computational task. Computing many PageRanks is a much more complex challenge. Recently, significant effort has been invested in building sets of personalized PageRank vectors. PageRank is also used in many diverse applications other than ranking.We are interested in the theoretical foundations of the PageRank formulation, in the acceleration of PageRank computing, in the effects of particular aspects of web graph structure on the optimal organization of computations, and in PageRank stability. We also review alternative models that lead to authority indices similar to PageRank and the role of such indices in applications other than web search. We also discuss linkbased search personalization and outline some aspects of PageRank infrastructure from associated measures of convergence to link preprocessing.
Abstract. We introduce a novel bookmark-coloring algorithm (BCA) that computes authority weights over the web pages utilizing the web hyperlink structure. The computed vector (BCV) is similar to the PageRank vector defined for a page-specific teleportation. Meanwhile, BCA is very fast, and BCV is sparse. BCA also has important algebraic properties. If several BCVs corresponding to a set of pages (called hub) are known, they can be leveraged in computing arbitrary BCV via a straightforward algebraic process and hub BCVs can be efficiently computed and encoded.
We describe a real-time bidding algorithm for performancebased display ad allocation. A central issue in performance display advertising is matching campaigns to ad impressions, which can be formulated as a constrained optimization problem that maximizes revenue subject to constraints such as budget limits and inventory availability. The current practice is to solve the optimization problem offline at a tractable level of impression granularity (e.g., the page level), and to serve ads online based on the precomputed static delivery scheme. Although this offline approach takes a global view to achieve optimality, it fails to scale to ad allocation at the individual impression level. Therefore, we propose a realtime bidding algorithm that enables fine-grained impression valuation (e.g., targeting users with real-time conversion data), and adjusts value-based bids according to real-time constraint snapshots (e.g., budget consumption levels). Theoretically, we show that under a linear programming (LP) primal-dual formulation, the simple real-time bidding algorithm is indeed an online solver to the original primal problem by taking the optimal solution to the dual problem as input. In other words, the online algorithm guarantees the offline optimality given the same level of knowledge an offline optimization would have. Empirically, we develop and experiment with two real-time bid adjustment approaches to adapting to the non-stationary nature of the marketplace: one adjusts bids against real-time constraint satisfaction levels using control-theoretic methods, and the other adjusts bids also based on the statistically modeled historical bidding landscape. Finally, we show experimental results with real-world ad delivery data that support our theoretical conclusions.
-In addition to classic clustering algorithms, many different approaches to clustering are emerging for objects of special nature. In this article we deal with the grouping of rows and columns of a matrix with non-negative entries. Two rows (or columns) are considered similar if corresponding cross-distributions are close. This grouping is a dual clustering of two sets of elements, row and column indices. The introduced approach is based on the minimization of reduction of mutual information contained in a matrix that represents the relationship between two sets of elements. Our clustering approach contains many parallels with K-Means clustering due to certain common algebraic properties. The obtained results have many applications, including grouping of Web visit data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.