Predicting the helpfulness of product reviews is a key component of many e-commerce tasks such as review ranking and recommendation. However, previous work mixed review helpfulness prediction with those outer-layer tasks and relied on non-text features, which leads to less transferable models. This paper approaches the problem from a new angle by hypothesizing that helpfulness is an internal property of text. Using review text alone, we isolate review helpfulness prediction from its outer-layer tasks, employ two interpretable semantic features, and use human scoring of helpfulness as ground truth. Experimental results show that the two semantic features can accurately predict helpfulness scores and greatly improve performance over previously used features. Cross-category tests further show that models trained with semantic features generalize more easily to reviews of different product categories. The models we built are also highly interpretable and align well with human annotations.
Local community detection aims to find a set of densely connected nodes containing given query nodes. Most existing local community detection methods are designed for a single network. However, a single network can be noisy and incomplete. In real-world applications, multiple networks are more informative: they contain multiple types of nodes and multiple types of node proximities, and complementary information from different networks helps to improve detection accuracy. In this paper, we propose a novel RWM (Random Walk in Multiple networks) model to find relevant local communities in all networks for a given query node set from one network. RWM sends a random walker into each network to obtain the local proximity with respect to the query nodes (i.e., node visiting probabilities). Walkers with similar visiting probabilities reinforce each other. They restrict the probability propagation to the neighborhood of the query nodes, so that relevant subgraphs are identified in each network and irrelevant parts are disregarded. We provide rigorous theoretical foundations for RWM and develop two speed-up strategies with performance guarantees. Comprehensive experiments on synthetic and real-world datasets evaluate the effectiveness and efficiency of RWM.
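To make the notion of "node visiting probabilities" concrete, the following is a minimal sketch of a random walk with restart on a single network. It is a standard technique and not the RWM model itself: it omits the cross-network reinforcement between walkers. The function name, the restart probability of 0.15, and the toy graph are assumptions chosen for illustration and do not come from the paper.

```python
def random_walk_with_restart(adj, query, restart=0.15, iters=200, tol=1e-12):
    """Approximate visiting probabilities of a walker that restarts at query nodes.

    adj     : dict mapping each node to a list of its neighbours
    query   : set of query nodes (the walker restarts uniformly among them)
    restart : probability of jumping back to a query node (assumed value)
    """
    nodes = list(adj)
    # Restart distribution: uniform over the query nodes, zero elsewhere.
    e = {v: (1.0 / len(query) if v in query else 0.0) for v in nodes}
    p = dict(e)  # the walker starts at the query nodes
    for _ in range(iters):
        nxt = {v: restart * e[v] for v in nodes}
        for u in nodes:
            nbrs = adj[u]
            if not nbrs:
                # Dangling node: send its remaining mass back to the query nodes.
                for v in nodes:
                    nxt[v] += (1.0 - restart) * p[u] * e[v]
                continue
            share = (1.0 - restart) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy graph (hypothetical): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
scores = random_walk_with_restart(adj, query={0})
# Take the three highest-scoring nodes as a crude local community.
community = set(sorted(scores, key=scores.get, reverse=True)[:3])
print(community)
```

In this sketch the probability mass stays concentrated around the query node, so ranking nodes by score recovers the triangle that contains it. Coupling several such walkers across networks, as the abstract describes, would need an additional rule for how walkers influence one another, which is not specified here.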