User-generated texts such as reviews and social media posts are valuable sources of information. Online reviews help users decide whether to buy a product, see a movie, or make some other choice. The rating attached to a review is therefore one of the signals users rely on when deciding whether to read and trust it. This paper analyzes the text of reviews to evaluate and predict their ratings. We also study how lexical features generated from the text, as well as sentiment words, affect the accuracy of rating prediction. Our analysis shows that words with a high information gain score are more effective features than words with a high TF-IDF value. In addition, we explore the best number of features for predicting the ratings of the reviews.
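As an illustration of the information gain criterion mentioned in this abstract, the following is a minimal sketch of scoring words by how much knowing their presence reduces uncertainty about the rating. It is not the authors' implementation: the function names (`entropy`, `information_gain`, `top_k_words`) and the four toy reviews are assumptions made up for this example, and a real pipeline would add tokenization, stop-word handling, and a classifier.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, word):
    """Reduction in rating entropy from knowing whether `word` occurs in a review.

    `docs` is a list of sets of words, one set per review.
    """
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    conditional = (
        (len(with_word) / n) * entropy(with_word)
        + (len(without_word) / n) * entropy(without_word)
    )
    return entropy(labels) - conditional


def top_k_words(docs, labels, k):
    """Return the k words with the highest information gain (ties broken alphabetically)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda w: (-information_gain(docs, labels, w), w))
    return ranked[:k]


# Toy data, invented for illustration only.
reviews = [
    "great movie loved it",
    "terrible movie hated it",
    "great acting loved the plot",
    "terrible plot hated the acting",
]
ratings = [5, 1, 5, 1]
docs = [set(r.split()) for r in reviews]

print(top_k_words(docs, ratings, 4))
```

In this toy set, a word such as "movie" appears equally often in high-rated and low-rated reviews, so its information gain is zero even though it is frequent, while "great" and "terrible" separate the two ratings perfectly. The number of words kept, `k`, corresponds to the feature count whose best value the abstract says is explored.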
The MapReduce framework is the de facto standard in Hadoop. Given data locality in data centers, the load balancing problem for map tasks is a special case of the affinity scheduling problem. There is a large body of work on affinity scheduling that proposes heuristic algorithms, such as Delay Scheduling and Quincy, which try to increase data locality in data centers. However, not enough attention has been paid to theoretical guarantees on the throughput and delay optimality of such algorithms. In this work, we present and compare different algorithms and discuss their shortcomings and strengths. To the best of our knowledge, most data centers use static load balancing algorithms, which are inefficient and result in wasted resources and unnecessary delays for users.