With the growth of social media, news outreach to the masses through these platforms has become evident, and their influence over other information sources has increased significantly. Social media services such as Twitter collect enormous amounts of information and allow media companies to publish news-related information as tweets. Almost every major news company has a Twitter account on which it publishes news in the form of tweets. Beyond these official accounts, social media platforms carry an enormous amount of news-related content of their own. To deliver accurate and useful information to users, we have to filter out noise and segregate the content based on similarity and on the respective value of each item. Even after noise is filtered out, information overload persists in the data, so the information must be ranked by a set of chosen factors in order to be prioritized. In our proposed work, news items are filtered and ranked based on three factors. First, media focus (MF), which indicates the temporal prevalence of a particular topic in the news media. Second, user attention (UA), which indicates how the masses are responding to the topic. Last, user interaction (UI), which indicates how users are forming views on the topic. Our proposed work introduces an unsupervised machine learning framework that identifies news topics prevalent in both social media and the news media, and then ranks them by their degrees of MF, UA, and UI.
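The final ranking step can be illustrated with a minimal sketch. The abstract does not state how MF, UA, and UI are computed or combined, so everything here is an assumption made for illustration: the `Topic` record, the min-max normalization, the equal default weights, and the weighted sum are hypothetical choices, not the framework's actual method.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    """Hypothetical record holding precomputed factor values for one topic."""
    name: str
    mf: float  # media focus: prevalence of the topic in news media
    ua: float  # user attention: how strongly users respond to the topic
    ui: float  # user interaction: how much users interact around the topic


def min_max(values):
    """Scale values to [0, 1]; returns all zeros when every value is equal."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_topics(topics, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return topic names ordered by a weighted sum of normalized MF, UA, UI.

    The weighted sum is an assumed combination rule, used only to show
    how three factors could yield a single ordering.
    """
    if not topics:
        return []
    mf = min_max([t.mf for t in topics])
    ua = min_max([t.ua for t in topics])
    ui = min_max([t.ui for t in topics])
    w_mf, w_ua, w_ui = weights
    scored = [
        (w_mf * m + w_ua * a + w_ui * i, t.name)
        for t, m, a, i in zip(topics, mf, ua, ui)
    ]
    # Highest combined score first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: -pair[0])
    return [name for _, name in scored]
```

Normalizing each factor before combining keeps a factor measured on a large scale (for example, raw tweet counts) from dominating the others; changing `weights` shifts the emphasis among the three factors.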
In today's world, most private and public sector organizations deal with massive amounts of raw data, which contain information and knowledge in their hidden layers. In addition, the format, scale, variety, and velocity of the generated data make it more difficult to apply algorithms efficiently. This complexity necessitates the use of sophisticated methods, strategies, and algorithms to solve the challenges of managing raw data. Big data query optimization (BDQO) requires businesses to define, diagnose, forecast, prescribe, and recognize hidden growth opportunities, and guides them toward achieving market value. BDQO uses advanced analytical methods to extract information from a rapidly growing volume of data, reducing the difficulty of the decision-making process. Hadoop, Apache Hive, NoSQL, MapReduce, and HPCC are among the technologies used in big data applications to manage large data sets. Because big data platforms provide scalability, consuming data for query processing becomes less costly. However, small businesses often cannot query large databases efficiently: joining tables with millions of tuples could take hours. Parallelism, which addresses the problem by using more processors, is a potential solution. Unfortunately, small businesses operating on a shoestring budget cannot afford the additional hardware. There are many techniques for tackling the problem. The technologies used in the big data query optimization process are discussed in depth in this paper.
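The idea of speeding up a large join through parallelism can be sketched as a partitioned hash join. This is a generic textbook technique chosen for illustration, not a method taken from the paper; the function names, the tuple-based row format, and the use of a thread pool are all assumptions. In CPython, threads only show the structure of the approach, since a real system would spread the partitions across processes or cluster nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(rows, key_index, n_parts):
    """Split rows into n_parts buckets by hashing the join key."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[key_index]) % n_parts].append(row)
    return parts


def hash_join(left, right, left_key, right_key):
    """Join two lists of tuples on equal keys using an in-memory hash index."""
    index = defaultdict(list)
    for l_row in left:
        index[l_row[left_key]].append(l_row)
    # Probe the index with each right row; concatenate matching tuples.
    return [l_row + r_row
            for r_row in right
            for l_row in index.get(r_row[right_key], [])]


def parallel_hash_join(left, right, left_key=0, right_key=0, n_parts=4):
    """Partition both inputs on the join key, then join each pair of
    partitions independently in a worker pool."""
    left_parts = partition(left, left_key, n_parts)
    right_parts = partition(right, right_key, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        results = pool.map(
            hash_join,
            left_parts,
            right_parts,
            [left_key] * n_parts,
            [right_key] * n_parts,
        )
        return [row for part in results for row in part]
```

Because rows with equal keys always hash to the same partition, each pair of partitions can be joined without looking at any other partition, which is what lets the work be divided among processors.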