There is a surge in the use of location activity in social media, in particular to broadcast the change of physical whereabouts. We are interested in analyzing the temporal characteristics of check-ins data from the user's perspective and also at the aggregate level for detecting patterns. In this paper we conduct a large study using check-in data from Facebook to analyze different temporal characteristics in four venue categories (restaurants, movies, shopping, and get-away). We present the results of such study and outline application areas where the conjunction of location and temporal-aware data can help in new search scenarios.
Over the past couple of years, social networks such as Twitter and Facebook have become the primary source for consuming information on the Internet. One of the main differentiators of this content from traditional information sources available on the Web is the fact that these social networks surface individuals' perspectives. When social media users post and share updates with friends and followers, some of those short fragments of text contain a link and a personal comment about the web page, image or video. We are interested in mining the text around those links for a better understanding of what people are saying about the object they are referring to. Capturing the salient keywords from the crowd is rich metadata that we can use to augment a web page. This metadata can be used for many applications like ranking signals, query augmentation, indexing, and for organizing and categorizing content. In this paper, we present a technique called social signatures that given a link to a web page, pulls the most important keywords from the social chatter around it. That is, a high level representation of the web page from a social media perspective. Our findings indicate that the content of social signatures differs compared to those from a web page and therefore provides new insights. This difference is more prominent as the number of link shares increase. To showcase our work, we present the results of processing a dataset that contains around 1 Billion unique URLs shared in Twitter and Facebook over a two month period. We also provide data points that shed some light on the dynamics of content sharing in social media.
The physical world is rife with cues that allow us to distinguish between safe and unsafe situations. By contrast, the Internet offers a much more ambiguous environment; hence many users are unable to distinguish a scam from a legitimate Web page. To help address this problem, we explore how to train classifiers that can automatically identify malicious Web pages based on clues from their textual content, structural tags, page links, visual appearance, and URLs. Using a contemporary labeled data feed from a large Web mail provider, we extract such features and demonstrate how they can be used to improve classification accuracy over previous, more constrained approaches. In particular, by analyzing the full content of individual Web pages, we more than halve the error rate obtained by a comparably trained classifier that only extracts features from URLs. By training classifiers on different sets of features, we are further able to assess the strength of clues provided by these different sources of information.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.