Online human interactions take place within a dynamic hierarchy, where social influence is determined by qualities such as status, eloquence, trustworthiness, authority and persuasiveness. In this work, we consider topic-based Twitter interaction networks, and address the task of identifying influential players. Our motivation is the strong desire of many commercial entities to increase their social media presence by engaging positively with pivotal bloggers and tweeters. After discussing some of the issues involved in extracting useful interaction data from a Twitter feed, we define the concept of an active node subnetwork sequence. This provides a time-dependent, topic-based summary of relevant Twitter activity. For these types of transient interactions, it has been argued that the flow of information, and hence the influence of a node, is highly dependent on the timing of the links. Some nodes with relatively small bandwidth may turn out to be key players because of their prescience and their ability to instigate follow-on network activity. To simulate a commercial application, we build an active node subnetwork sequence based on keywords in the area of travel and holidays. We then compare a range of network centrality measures, including a recently proposed version that accounts for the arrow of time, with respect to their ability to rank important nodes in this dynamic setting. The centrality rankings use only connectivity information (who tweeted whom, when), without requiring further information about the account type or message content. However, if we post-process the results by examining account details, we find that the time-respecting, dynamic approach, which looks at the follow-on flow of information, is less likely to be ‘misled’ by accounts that appear to generate large numbers of automatic tweets with the aim of pushing out web links.
We then benchmark these algorithmically derived rankings against independent feedback from five social media experts, given access to the full tweet content, who judge Twitter accounts as part of their professional duties. We find that the dynamic centrality measures add value to the expert view, and can be hard to distinguish from an expert in terms of who they place in the top ten. These algorithms, which involve sparse matrix linear system solves with sparsity driven by the underlying network structure, can be applied to very large-scale networks. We also test an extension of the dynamic centrality measure that allows us to monitor the change in ranking, as a function of time, of the Twitter accounts that were eventually deemed influential.
Abstract. Online human interactions take place within a dynamic hierarchy, where social influence is determined by qualities such as status, eloquence, trustworthiness, authority and persuasiveness. In this work, we consider topic-based Twitter interaction networks, and address the task of identifying influential players. Our motivation is the strong desire of many commercial entities to increase their social media presence by engaging positively with pivotal bloggers and tweeters. After discussing some of the issues involved in extracting useful interaction data from a Twitter feed, we define the concept of an active node subnetwork sequence. This provides a time-dependent, topic-based summary of relevant Twitter activity. For these types of transient interactions, it has been argued that the flow of information, and hence the influence of a node, is highly dependent on the timing of the links. Some nodes with relatively small bandwidth may turn out to be key players because of their prescience and their ability to instigate follow-on network activity. To simulate a commercial application, we build an active node subnetwork sequence based on keywords in the area of travel and holidays. We then compare a range of network centrality measures, including a recently proposed version that accounts for the arrow of time, with respect to their ability to rank important nodes in this dynamic setting. The centrality rankings use only connectivity information (who tweeted whom, when), but if we post-process the results by examining account details, we find that the time-respecting, dynamic approach, which looks at the follow-on flow of information, is less likely to be 'misled' by accounts that appear to generate large numbers of automatic tweets with the aim of pushing out web links. We then benchmark these algorithmically derived rankings against independent feedback from five social media experts who judge Twitter accounts as part of their professional duties.
We find that the dynamic centrality measures add value to the expert view, and indeed can be hard to distinguish from an expert in terms of who they place in the top ten. We also highlight areas where the algorithmic approach can be refined and improved.
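The abstract does not spell out the time-respecting centrality formula, so the following is only a minimal sketch under a stated assumption: that the measure takes the form of a product of matrix resolvents, Q = (I - aA[0])^{-1} (I - aA[1])^{-1} ... (I - aA[M])^{-1}, where A[k] records who tweeted whom in time window k. Row sums of Q then score how well a node broadcasts, and column sums score how well it receives. The function names, the parameter `a`, and the fixed-point solver are illustrative choices, not taken from the paper; a production version would more likely use a sparse direct or iterative solver from a numerical library.

```python
def solve_resolvent(edges, rhs, a, tol=1e-12, max_iter=1000):
    """Solve (I - a*A) x = rhs by the fixed-point iteration x <- rhs + a*A*x.

    `edges` is a list of (i, j) pairs meaning A[i][j] = 1 (i tweeted j).
    Only the nonzero entries are touched, so the cost per sweep scales with
    the number of links. Converges when a * spectral_radius(A) < 1.
    """
    x = list(rhs)
    for _ in range(max_iter):
        new = list(rhs)
        for i, j in edges:
            new[i] += a * x[j]
        if max(abs(u - v) for u, v in zip(new, x)) < tol:
            return new
        x = new
    raise RuntimeError("iteration did not converge; try a smaller a")


def broadcast_centrality(n, snapshots, a):
    """Row sums of Q, computed as Q @ ones with the latest window applied first."""
    v = [1.0] * n
    for edges in reversed(snapshots):
        v = solve_resolvent(edges, v, a)
    return v


def receive_centrality(n, snapshots, a):
    """Column sums of Q, computed as Q.T @ ones with the earliest window first."""
    v = [1.0] * n
    for edges in snapshots:
        transposed = [(j, i) for i, j in edges]
        v = solve_resolvent(transposed, v, a)
    return v


if __name__ == "__main__":
    # Hypothetical toy data: node 0 tweets node 1, then node 1 tweets node 2.
    forward = [[(0, 1)], [(1, 2)]]
    # Same links in the opposite time order: no message can travel 0 -> 2.
    backward = [[(1, 2)], [(0, 1)]]
    print(broadcast_centrality(3, forward, 0.5))
    print(broadcast_centrality(3, backward, 0.5))
```

In the toy data, node 0 scores higher as a broadcaster when its tweet comes before node 1's follow-on tweet than when the same two links occur in the opposite order, which is the sense in which such a measure respects the arrow of time.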
Within the online media universe there are many underlying communities. These may be defined, for example, through politics, location, health, occupation, extracurricular interests or retail habits. Government departments, charities and commercial organisations can benefit greatly from insights about the structure of these communities; the move to customer-centered practices requires knowledge of the customer base. Motivated by this issue, we address the fundamental question of whether a subnetwork looks like a collection of individuals who have effectively been picked at random from the whole, or instead forms a distinctive community with a new, discernible structure. In the former case, to spread a message to the intended user base it may be best to use traditional broadcast media (TV, billboard), whereas in the latter case a more targeted approach could be more effective. In this work, we therefore formalize a concept of testing for substructure and apply it to social interaction data. First, we develop a statistical test to determine whether a given subnetwork (induced subgraph) is likely to have been generated by sampling nodes from the full network uniformly at random. This tackles an interesting inverse alternative to the more widely studied “forward” problem. We then apply the test to a Twitter reciprocated mentions network where a range of brand-name-based subnetworks are created via tweet content. We correlate the computed results against the independent views of sixteen digital marketing professionals. We conclude that there is great potential for social-media-based analytics to quantify, compare and interpret online brand allegiances systematically, in real time and at large scale.
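The abstract does not describe the test statistic, so the sketch below is not the authors' test. It is a generic Monte Carlo illustration of the same question, assuming an undirected network (as a reciprocated mentions network would be) and using the number of edges inside the induced subgraph as the statistic. The function names, the sample count, and the toy graph are all hypothetical.

```python
import random


def induced_edge_count(edges, nodes):
    """Count the edges whose two endpoints both lie in `nodes`."""
    node_set = set(nodes)
    return sum(1 for u, v in edges if u in node_set and v in node_set)


def random_subnetwork_pvalue(n_nodes, edges, subset, n_samples=2000, seed=0):
    """Estimate how often a uniformly random node sample of the same size
    induces at least as many edges as the observed subset.

    A small value suggests the subset is denser than random sampling would
    produce. The +1 terms give the usual conservative Monte Carlo estimate.
    """
    rng = random.Random(seed)
    observed = induced_edge_count(edges, subset)
    k = len(set(subset))
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(range(n_nodes), k)
        if induced_edge_count(edges, sample) >= observed:
            hits += 1
    return (hits + 1) / (n_samples + 1)


if __name__ == "__main__":
    # Hypothetical toy network: a 5-node clique plus a sparse chain of 25 nodes.
    clique = [(i, j) for i in range(5) for j in range(i + 1, 5)]
    chain = [(i, i + 1) for i in range(5, 29)]
    edges = clique + chain
    print(random_subnetwork_pvalue(30, edges, [0, 1, 2, 3, 4]))    # tightly knit
    print(random_subnetwork_pvalue(30, edges, [5, 7, 9, 11, 13]))  # no internal links
```

Edge count is only one possible statistic; degree sequence, clustering, or component structure inside the subgraph could be substituted without changing the sampling loop.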
This work seeks to introduce improvements to the traditional variable selection procedures employed in the development of geodemographic classifications. It presents a proposal for shifting from a traditional approach for generating general-purpose, one-size-fits-all geodemographic classifications to application-specific classifications. This proposal addresses the recent scepticism towards the utility of general-purpose applications by employing supervised machine learning techniques in order to identify contextually relevant input variables from which to develop geodemographic classifications with increased discriminatory power. A framework introducing such techniques in the variable selection phase of geodemographic classification development is presented via a practical use case that is focused on generating a geodemographic classification with an increased capacity for discriminating the propensity for library use in the UK city of Leeds. Two local classifications are generated for the city: one a general-purpose classification, and the other an application-specific classification incorporating supervised feature selection methods in the selection of input variables. The discriminatory power of each classification is evaluated and compared, with the result successfully demonstrating the capacity for the application-specific approach to generate a more contextually relevant result, and thus to underpin increasingly targeted public policy decision making, particularly in the context of urban planning.
This version is available at https://strathprints.strath.ac.uk/42457/ Strathprints is designed to allow users to access the research output of the University of Strathclyde. Unless otherwise explicitly stated on the manuscript, Copyright © and Moral Rights for the papers on this site are retained by the individual authors and/or other copyright owners. Please check the manuscript for details of any other licences that may have been applied. You may not engage in further distribution of the material for any profitmaking activities or any commercial gain. You may freely distribute both the url (https://strathprints.strath.ac.uk/) and the content of this paper for research or private study, educational, or not-for-profit purposes without prior permission or charge. Any correspondence concerning this service should be sent to the Strathprints administrator: strathprints@strath.ac.uk. The Strathprints institutional repository (https://strathprints.strath.ac.uk) is a digital archive of University of Strathclyde research outputs. It has been developed to disseminate open access research outputs, expose data about those outputs, and enable the management and persistent access to Strathclyde's intellectual output.