An author's profile on Google Scholar consists of indexed articles and associated data, such as the number of citations and the H-index. The author is allowed to merge articles; this may affect the H-index. We analyze the (parameterized) computational complexity of maximizing the H-index using article merges. Herein, to model realistic manipulation scenarios, we define a compatibility graph whose edges correspond to plausible merges. Moreover, we consider several different measures for computing the citation count of a merged article. For the measure used by Google Scholar, we give an algorithm that maximizes the H-index in linear time if the compatibility graph has constant-size connected components. In contrast, if we allow to merge arbitrary articles (that is, for compatibility graphs that are cliques), then already increasing the H-index by one is NP-hard. Experiments on Google Scholar profiles of AI researchers show that the H-index can be manipulated substantially only if one merges articles with highly dissimilar titles.
Finding the origin of short phrases propagating through the web has been formalized by Leskovec et al. [ACM SIGKDD 2009] as DAG PARTITIONING: given an arc-weighted directed acyclic graph on n vertices and m arcs, delete arcs with total weight at most k such that each resulting weakly-connected component contains exactly one sink-a vertex without outgoing arcs. DAG PARTITIONING is NP-hard.We show an algorithm to solve DAG PARTITIONING in O(2 k · (n + m)) time, that is, in linear time for fixed k. We complement it with linear-time executable data reduction rules. Our experiments show that, in combination, they can optimally solve DAG PARTITIONING on simulated citation networks within five minutes for k ≤ 190 and m being 10 7 and larger. We use our obtained optimal solutions to evaluate the solution quality of Leskovec et al.'s heuristic.We show that Leskovec et al.'s heuristic works optimally on trees and generalize this result by showing that DAG PARTITIONING is solvable in 2 O(t 2 ) · n time if a width-t tree decomposition of the input graph is given. Thus, we improve an algorithm and answer an open question of Alamdari and Mehrabian [WAW 2012].We complement our algorithms by lower bounds on the running time of exact algorithms and on the effectivity of data reduction.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.